DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API
       [not found] <20221221102934.13822-1-jiaweiw@nvidia.com/>
@ 2023-02-03  5:07 ` Jiawei Wang
  2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
                     ` (5 more replies)
  0 siblings, 6 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03  5:07 UTC (permalink / raw)
  To: viacheslavo, orika, thomas; +Cc: dev, rasland

For the multiple hardware ports connect to a single DPDK port (mhpsdp),
currently, there is no information to indicate the packet belongs to
which hardware port.

Introduce a new phy affinity item in rte flow API, and
the phy affinity value reflects the physical phy affinity of the
received packets.

Add the tx_phy_affinity setting in Tx queue API, the affinity value
reflects packets be sent to which hardware port.

Add the nb_phy_ports into device info and value greater than 0 mean
that the number of physical ports connect to the DPDK port.

While uses the phy affinity as a matching item in the flow, and sets the
same affinity on the tx queue, then the packet can be sent from the same
hardware port with received.

RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/

v3:
* Update exception rule
* Update the commit log
* Add the description for PHY affinity and numbering definition
* Add the number of physical ports into device info
* Change the patch order 

v2: Update based on the comments

Jiawei Wang (2):
  ethdev: introduce the PHY affinity field in Tx queue API
  ethdev: add PHY affinity match item

 app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 |  29 ++++++
 app/test-pmd/config.c                       |   1 +
 devtools/libabigail.abignore                |   5 +
 doc/guides/prog_guide/rte_flow.rst          |   8 ++
 doc/guides/rel_notes/release_23_03.rst      |   6 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  17 ++++
 lib/ethdev/rte_ethdev.h                     |  13 ++-
 lib/ethdev/rte_flow.c                       |   1 +
 lib/ethdev/rte_flow.h                       |  33 +++++++
 10 files changed, 212 insertions(+), 1 deletion(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
@ 2023-02-03  5:07   ` Jiawei Wang
  2023-02-03  5:07   ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03  5:07 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
	Ferruh Yigit, Andrew Rybchenko
  Cc: dev, rasland

For the multiple hardware ports connect to a single DPDK port (mhpsdp),
the previous patch introduces the new rte flow item to match the
phy affinity of the received packets.

Add the tx_phy_affinity setting in Tx queue API, the affinity
value reflects packets be sent to which hardware port.
Value 0 is no affinity and traffic will be routed between different
physical ports.

Add the nb_phy_ports into device info and value greater than 0 mean
that the number of physical ports connect to the DPDK port.

Add the new tx_phy_affinity field into the padding hole of rte_eth_txconf
structure, the size of rte_eth_txconf keeps the same. Adds a suppress
type for structure change in the ABI check file.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) phy_affinity (value)

For example, there're two hardware ports 0 and 1 connected to
a single DPDK port (port id 0), and phy_affinity 1 stood for
hardware port 0 and phy_affinity 2 stood for hardware port 1,
used the below command to config tx phy affinity for per Tx Queue:
        port config 0 txq 0 phy_affinity 1
        port config 0 txq 1 phy_affinity 1
        port config 0 txq 2 phy_affinity 2
        port config 0 txq 3 phy_affinity 2

These commands config the TxQ index 0 and TxQ index 1 with phy affinity 1,
uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
hardware port 0, and similar with hardware port 1 if sending packets
with TxQ 2 or TxQ 3.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
 app/test-pmd/config.c                       |   1 +
 devtools/libabigail.abignore                |   5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
 lib/ethdev/rte_ethdev.h                     |  13 ++-
 5 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b32dc8bfd4..3450b1be36 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -764,6 +764,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) phy_affinity (value)\n"
+			"    Set the physical affinity value "
+			"on a specific Tx queue\n\n"
 		);
 	}
 
@@ -12621,6 +12625,101 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq phy_affinity value *** */
+struct cmd_config_tx_phy_affinity {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t phy_affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_phy_affinity_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_phy_affinity *res = parsed_result;
+	struct rte_eth_dev_info dev_info;
+	struct rte_port *port;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	port = &ports[res->portid];
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
+	if (ret != 0)
+		return;
+
+	if (dev_info.nb_phy_ports == 0) {
+		printf("Number of physical ports is 0 which is invalid for PHY Affinity\n");
+		return;
+	}
+	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
+	if (dev_info.nb_phy_ports < res->value) {
+		printf("The PHY affinity value %u is Invalid, exceeds the "
+		       "number of physical ports\n", res->value);
+		return;
+	}
+	port->txq[res->qid].conf.tx_phy_affinity = res->value;
+
+	cmd_reconfig_device_queue(res->portid, 0, 1);
+}
+
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 phy_affinity, "phy_affinity");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
+	.f = cmd_config_tx_phy_affinity_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_phy_affinity_port,
+		(void *)&cmd_config_tx_phy_affinity_config,
+		(void *)&cmd_config_tx_phy_affinity_portid,
+		(void *)&cmd_config_tx_phy_affinity_txq,
+		(void *)&cmd_config_tx_phy_affinity_qid,
+		(void *)&cmd_config_tx_phy_affinity_hwport,
+		(void *)&cmd_config_tx_phy_affinity_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12851,6 +12950,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_capability,
 	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
 	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
+	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
 	NULL,
 };
 
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index acccb6b035..b83fb17cfa 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
 		printf("unknown\n");
 		break;
 	}
+	printf("Current number of physical ports: %u\n", dev_info.nb_phy_ports);
 }
 
 void
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 7a93de3ba1..0f4b5ec74b 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -34,3 +34,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Ignore fields inserted in middle padding of rte_eth_txconf
+[suppress_type]
+        name = rte_eth_txconf
+        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 0037506a79..856fb55005 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx physical affinity
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configure a per queue physical affinity value only on a specific Tx queue::
+
+   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
+
+* ``phy_affinity``: reflects packet can be sent to which hardware port.
+                    uses it on multiple hardware ports connect to
+                    a single DPDK port (mhpsdp).
+
+This command should be run when the port is stopped, or else it will fail.
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..ecfa2c6781 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1138,6 +1138,16 @@ struct rte_eth_txconf {
 				      less free descriptors than this value. */
 
 	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
+	/**
+	 * Physical affinity to be set.
+	 * Value 0 is no affinity and traffic could be routed between different
+	 * physical ports, if 0 is disabled then try to match on phy_affinity 0 will
+	 * result in an error.
+	 *
+	 * Value starts from 1 means for specific phy affinity and uses 1 for
+	 * the first physical port.
+	 */
+	uint8_t tx_phy_affinity;
 	/**
 	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
 	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
@@ -1777,7 +1787,8 @@ struct rte_eth_dev_info {
 	struct rte_eth_switch_info switch_info;
 	/** Supported error handling mode. */
 	enum rte_eth_err_handle_mode err_handle_mode;
-
+	uint8_t nb_phy_ports;
+	/** Number of physical ports to connect with single DPDK port. */
 	uint64_t reserved_64s[2]; /**< Reserved for future fields */
 	void *reserved_ptrs[2];   /**< Reserved for future fields */
 };
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 2/2] ethdev: add PHY affinity match item
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
  2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
@ 2023-02-03  5:07   ` Jiawei Wang
  2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03  5:07 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
	Ferruh Yigit, Andrew Rybchenko
  Cc: dev, rasland

For the multiple hardware ports connect to a single DPDK port (mhpsdp),
currently, there is no information to indicate the packet belongs to
which hardware port.

Introduce a new phy affinity item in RTE flow API, and the phy affinity
value reflects the physical port of the received packets.

While uses the phy affinity as a matching item in the flow, and sets the
same phy_affinity value on the Tx queue, then the packet can be sent from
the same hardware port with received.
If tx_phy_affinity 0 is disabled then try to match on phy_affinity 0
will result in an error.

Add the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern phy_affinity affinity is 1 /
	end actions queue index 0 / end

The above command means that creates a flow on a single DPDK port and
matches the packet from the first physical port (assume the phy affinity 1
stands for the first port) and redirects these packets into RxQ 0.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c                 | 29 ++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 +++++
 doc/guides/rel_notes/release_23_03.rst      |  6 ++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 33 +++++++++++++++++++++
 6 files changed, 81 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..a6d4615038 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -465,6 +465,8 @@ enum index {
 	ITEM_METER,
 	ITEM_METER_COLOR,
 	ITEM_METER_COLOR_NAME,
+	ITEM_PHY_AFFINITY,
+	ITEM_PHY_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1355,6 +1357,7 @@ static const enum index next_item[] = {
 	ITEM_L2TPV2,
 	ITEM_PPP,
 	ITEM_METER,
+	ITEM_PHY_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1821,6 +1824,12 @@ static const enum index item_meter[] = {
 	ZERO,
 };
 
+static const enum index item_phy_affinity[] = {
+	ITEM_PHY_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -6443,6 +6452,23 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_PHY_AFFINITY] = {
+		.name = "phy_affinity",
+		.help = "match on the physical affinity of the"
+			" received packet.",
+		.priv = PRIV_ITEM(PHY_AFFINITY,
+				  sizeof(struct rte_flow_item_phy_affinity)),
+		.next = NEXT(item_phy_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_PHY_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "physical affinity value",
+		.next = NEXT(item_phy_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_phy_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -10981,6 +11007,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_METER_COLOR:
 		mask = &rte_flow_item_meter_color_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_PHY_AFFINITY:
+		mask = &rte_flow_item_phy_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..3b4e8923dc 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1544,6 +1544,14 @@ Matches Color Marker set by a Meter.
 
 - ``color``: Metering color marker.
 
+Item: ``PHY_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the physical affinity of the received packet, the physical port
+in the group of physical ports connected to a single DPDK port.
+
+- ``affinity``: Physical affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index c15f6fbb9f..029b57316c 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -69,6 +69,12 @@ New Features
     ``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
     required for eth_rx, eth_tx, crypto and timer eventdev adapters.
 
+* **Added rte_flow support for matching PHY affinity fields.**
+
+  For the multiple hardware ports connect to a single DPDK port (mhpsdp),
+  Added ``phy_affinity`` item in rte_flow to support physical affinity of
+  the packets.
+
 
 Removed Items
 -------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 856fb55005..e9f20607a2 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3725,6 +3725,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``phy_affinity``: match physical affinity.
+
+  - ``affinity {value}``: physical affinity value.
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 7d0c24366c..0c2d3b679b 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -157,6 +157,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(L2TPV2, sizeof(struct rte_flow_item_l2tpv2)),
 	MK_FLOW_ITEM(PPP, sizeof(struct rte_flow_item_ppp)),
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
+	MK_FLOW_ITEM(PHY_AFFINITY, sizeof(struct rte_flow_item_phy_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index b60987db4b..785a56a405 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -624,6 +624,13 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meter_color.
 	 */
 	RTE_FLOW_ITEM_TYPE_METER_COLOR,
+
+	/**
+	 * Matches on the physical affinity of the received packet.
+	 *
+	 * @see struct rte_flow_item_phy_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_PHY_AFFINITY,
 };
 
 /**
@@ -2103,6 +2110,32 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_PHY_AFFINITY
+ *
+ * For the multiple hardware ports connect to a single DPDK port (mhpsdp),
+ * use this item to match the physical affinity of the packets.
+ */
+struct rte_flow_item_phy_affinity {
+	uint8_t affinity;
+	/**
+	 * Affinity defines the phy affinity value to match the physical
+	 * port of received packet. The value starts from 1 and uses 1 to
+	 * match the first physical port.
+	 */
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_PHY_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_phy_affinity
+rte_flow_item_phy_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
  2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
  2023-02-03  5:07   ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
@ 2023-02-03 13:33   ` Jiawei Wang
  2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
  2023-02-03 13:33     ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
  2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
                     ` (2 subsequent siblings)
  5 siblings, 2 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03 13:33 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko; +Cc: dev, rasland

When multiple physical ports are connected to a single DPDK port,
(example: kernel bonding, DPDK bonding, failsafe, etc.),
we want to know which physical port is used for Rx and Tx.

This patch maps a DPDK Tx queue with a physical port,
by adding tx_phy_affinity setting in Tx queue.
The affinity number is the physical port ID where packets will be
sent.
Value 0 means no affinity and traffic could be routed to any
connected physical ports, this is the default current behavior.

The number of physical ports is reported with rte_eth_dev_info_get().

This patch allows to map a Rx queue with a physical port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_PHY_AFFINITY.

While uses the phy affinity as a matching item in the flow rule, and
sets the same phy_affinity value on the Tx queue, then the packet
can be sent from the same physical port as the receiving one.
The physical affinity numbering starts from 1, then trying to
match on phy_affinity 0 will result in an error.

RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/

v4:
* Rebase the latest code
* Update new field description
* Update release release note
* Reword the commit log to make clear

v3:
* Update exception rule
* Update the commit log
* Add the description for PHY affinity and numbering definition
* Add the number of physical ports into device info
* Change the patch order 

v2: Update based on the comments

Jiawei Wang (2):
  ethdev: introduce the PHY affinity field in Tx queue API
  ethdev: add PHY affinity match item

 app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 |  28 ++++++
 app/test-pmd/config.c                       |   1 +
 devtools/libabigail.abignore                |   5 +
 doc/guides/prog_guide/rte_flow.rst          |   8 ++
 doc/guides/rel_notes/release_23_03.rst      |   5 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  17 ++++
 lib/ethdev/rte_ethdev.h                     |  10 ++
 lib/ethdev/rte_flow.c                       |   1 +
 lib/ethdev/rte_flow.h                       |  35 +++++++
 10 files changed, 210 insertions(+)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
@ 2023-02-03 13:33     ` Jiawei Wang
  2023-02-06 15:29       ` Jiawei(Jonny) Wang
                         ` (2 more replies)
  2023-02-03 13:33     ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
  1 sibling, 3 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03 13:33 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit
  Cc: dev, rasland

When multiple physical ports are connected to a single DPDK port,
(example: kernel bonding, DPDK bonding, failsafe, etc.),
we want to know which physical port is used for Rx and Tx.

This patch maps a DPDK Tx queue with a physical port,
by adding tx_phy_affinity setting in Tx queue.
The affinity number is the physical port ID where packets will be
sent.
Value 0 means no affinity and traffic could be routed to any
connected physical ports, this is the default current behavior.

The number of physical ports is reported with rte_eth_dev_info_get().

The new tx_phy_affinity field is added into the padding hole of
rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
An ABI check rule needs to be added to avoid false warning.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) phy_affinity (value)

For example, there're two physical ports connected to
a single DPDK port (port id 0), and phy_affinity 1 stood for
the first physical port and phy_affinity 2 stood for the second
physical port.
Use the below commands to config tx phy affinity for per Tx Queue:
        port config 0 txq 0 phy_affinity 1
        port config 0 txq 1 phy_affinity 1
        port config 0 txq 2 phy_affinity 2
        port config 0 txq 3 phy_affinity 2

These commands config the Tx Queue index 0 and Tx Queue index 1 with
phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
these packets will be sent from the first physical port, and similar
with the second physical port if sending packets with Tx Queue 2
or Tx Queue 3.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
 app/test-pmd/config.c                       |   1 +
 devtools/libabigail.abignore                |   5 +
 doc/guides/rel_notes/release_23_03.rst      |   4 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
 lib/ethdev/rte_ethdev.h                     |  10 ++
 6 files changed, 133 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index cb8c174020..f771fcf8ac 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) phy_affinity (value)\n"
+			"    Set the physical affinity value "
+			"on a specific Tx queue\n\n"
 		);
 	}
 
@@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq phy_affinity value *** */
+struct cmd_config_tx_phy_affinity {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t phy_affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_phy_affinity_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_phy_affinity *res = parsed_result;
+	struct rte_eth_dev_info dev_info;
+	struct rte_port *port;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	port = &ports[res->portid];
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
+	if (ret != 0)
+		return;
+
+	if (dev_info.nb_phy_ports == 0) {
+		printf("Number of physical ports is 0 which is invalid for PHY Affinity\n");
+		return;
+	}
+	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
+	if (dev_info.nb_phy_ports < res->value) {
+		printf("The PHY affinity value %u is Invalid, exceeds the "
+		       "number of physical ports\n", res->value);
+		return;
+	}
+	port->txq[res->qid].conf.tx_phy_affinity = res->value;
+
+	cmd_reconfig_device_queue(res->portid, 0, 1);
+}
+
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
+				 phy_affinity, "phy_affinity");
+cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
+	.f = cmd_config_tx_phy_affinity_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_phy_affinity_port,
+		(void *)&cmd_config_tx_phy_affinity_config,
+		(void *)&cmd_config_tx_phy_affinity_portid,
+		(void *)&cmd_config_tx_phy_affinity_txq,
+		(void *)&cmd_config_tx_phy_affinity_qid,
+		(void *)&cmd_config_tx_phy_affinity_hwport,
+		(void *)&cmd_config_tx_phy_affinity_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
 	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
+	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
 	NULL,
 };
 
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index acccb6b035..b83fb17cfa 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
 		printf("unknown\n");
 		break;
 	}
+	printf("Current number of physical ports: %u\n", dev_info.nb_phy_ports);
 }
 
 void
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 7a93de3ba1..ac7d3fb2da 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -34,3 +34,8 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ; Temporary exceptions till next major ABI version ;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+; Ignore fields inserted in padding hole of rte_eth_txconf
+[suppress_type]
+        name = rte_eth_txconf
+        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 73f5d94e14..e99bd2dcb6 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,6 +55,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added affinity for multiple physical ports connected to a single DPDK port.**
+
+  * Added Tx affinity in queue setup to map a physical port.
+
 * **Updated AMD axgbe driver.**
 
   * Added multi-process support.
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 79a1fa9cb7..5c716f7679 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx physical affinity
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Configure a per queue physical affinity value only on a specific Tx queue::
+
+   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
+
+* ``phy_affinity``: physical port to use for sending,
+                    when multiple physical ports are connected to
+                    a single DPDK port.
+
+This command should be run when the port is stopped, otherwise it fails.
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..2fd971b7b5 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
 				      less free descriptors than this value. */
 
 	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
+	/**
+	 * Affinity with one of the multiple physical ports connected to the DPDK port.
+	 * Value 0 means no affinity and traffic could be routed to any connected
+	 * physical port.
+	 * The first physical port is number 1 and so on.
+	 * Number of physical ports is reported by nb_phy_ports in rte_eth_dev_info.
+	 */
+	uint8_t tx_phy_affinity;
 	/**
 	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
 	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
@@ -1744,6 +1752,8 @@ struct rte_eth_dev_info {
 	/** Device redirection table size, the total number of entries. */
 	uint16_t reta_size;
 	uint8_t hash_key_size; /**< Hash key size in bytes */
+	/** Number of physical ports connected with DPDK port. */
+	uint8_t nb_phy_ports;
 	/** Bit mask of RSS offloads, the bit offset also means flow type */
 	uint64_t flow_type_rss_offloads;
 	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration */
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 2/2] ethdev: add PHY affinity match item
  2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
  2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
@ 2023-02-03 13:33     ` Jiawei Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-03 13:33 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit
  Cc: dev, rasland

When multiple physical ports are connected to a single DPDK port,
(example: kernel bonding, DPDK bonding, failsafe, etc.),
we want to know which physical port is used for Rx and Tx.

This patch allows to map a Rx queue with a physical port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_PHY_AFFINITY.

While uses the phy affinity as a matching item in the flow rule, and
sets the same phy_affinity value on the Tx queue, then the packet
can be sent from the same physical port as the receiving one.
The physical affinity numbering starts from 1, then trying to
match on phy_affinity 0 will result in an error.

Add the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern phy_affinity affinity is 1 /
	end actions queue index 0 / end

The above command means that creates a flow on a single DPDK port and
matches the packet from the first physical port and redirects
these packets into Rx queue 0.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c                 | 28 +++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 +++++
 doc/guides/rel_notes/release_23_03.rst      |  1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 +++++++++++++++++++++
 6 files changed, 77 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..5e9e617674 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -465,6 +465,8 @@ enum index {
 	ITEM_METER,
 	ITEM_METER_COLOR,
 	ITEM_METER_COLOR_NAME,
+	ITEM_PHY_AFFINITY,
+	ITEM_PHY_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1355,6 +1357,7 @@ static const enum index next_item[] = {
 	ITEM_L2TPV2,
 	ITEM_PPP,
 	ITEM_METER,
+	ITEM_PHY_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1821,6 +1824,12 @@ static const enum index item_meter[] = {
 	ZERO,
 };
 
+static const enum index item_phy_affinity[] = {
+	ITEM_PHY_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -6443,6 +6452,22 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_PHY_AFFINITY] = {
+		.name = "phy_affinity",
+		.help = "match on the physical port receiving the packets",
+		.priv = PRIV_ITEM(PHY_AFFINITY,
+				  sizeof(struct rte_flow_item_phy_affinity)),
+		.next = NEXT(item_phy_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_PHY_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "physical affinity value",
+		.next = NEXT(item_phy_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_phy_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -10981,6 +11006,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_METER_COLOR:
 		mask = &rte_flow_item_meter_color_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_PHY_AFFINITY:
+		mask = &rte_flow_item_phy_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..fa43b9bb66 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1544,6 +1544,14 @@ Matches Color Marker set by a Meter.
 
 - ``color``: Metering color marker.
 
+Item: ``PHY_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the physical port of the received packet.
+In case of multiple physical ports, the affinity numbering starts from 1.
+
+- ``affinity``: Physical affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index e99bd2dcb6..320b7f2efc 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -58,6 +58,7 @@ New Features
 * **Added affinity for multiple physical ports connected to a single DPDK port.**
 
   * Added Tx affinity in queue setup to map a physical port.
+  * Added Rx affinity flow matching of a physical port.
 
 * **Updated AMD axgbe driver.**
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 5c716f7679..6079ca1ed6 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3754,6 +3754,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``phy_affinity``: match physical port.
+
+  - ``affinity {value}``: physical port (starts from 1).
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 7d0c24366c..0c2d3b679b 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -157,6 +157,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(L2TPV2, sizeof(struct rte_flow_item_l2tpv2)),
 	MK_FLOW_ITEM(PPP, sizeof(struct rte_flow_item_ppp)),
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
+	MK_FLOW_ITEM(PHY_AFFINITY, sizeof(struct rte_flow_item_phy_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index b60987db4b..da32f9e383 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -624,6 +624,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_meter_color.
 	 */
 	RTE_FLOW_ITEM_TYPE_METER_COLOR,
+
+	/**
+	 * Matches on the physical port of the received packet.
+	 * Used in case multiple physical ports are connected to the DPDK port.
+	 * First port is number 1.
+	 *
+	 * @see struct rte_flow_item_phy_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_PHY_AFFINITY,
 };
 
 /**
@@ -2103,6 +2112,32 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_PHY_AFFINITY
+ *
+ * For multiple physical ports connected to a single DPDK port,
+ * match the physical port receiving the packets.
+ */
+struct rte_flow_item_phy_affinity {
+	/**
+	 * Physical port receiving the packets.
+	 * Numbering starts from 1.
+	 * Number of physical ports is reported by nb_phy_ports in rte_eth_dev_info.
+	 */
+	uint8_t affinity;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_PHY_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_phy_affinity
+rte_flow_item_phy_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
@ 2023-02-06 15:29       ` Jiawei(Jonny) Wang
  2023-02-07  9:40       ` Ori Kam
  2023-02-09 19:44       ` Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-06 15:29 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang, Ferruh Yigit
  Cc: dev, Raslan Darawsheh

Hi,

@Andrew, @Thomas, @Ori, 

Could you lease help to review the patch?

Thanks.

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@nvidia.com>
> Sent: Friday, February 3, 2023 9:34 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
> API
> 
> When multiple physical ports are connected to a single DPDK port,
> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to know
> which physical port is used for Rx and Tx.
> 
> This patch maps a DPDK Tx queue with a physical port, by adding
> tx_phy_affinity setting in Tx queue.
> The affinity number is the physical port ID where packets will be sent.
> Value 0 means no affinity and traffic could be routed to any connected
> physical ports, this is the default current behavior.
> 
> The number of physical ports is reported with rte_eth_dev_info_get().
> 
> The new tx_phy_affinity field is added into the padding hole of rte_eth_txconf
> structure, the size of rte_eth_txconf keeps the same.
> An ABI check rule needs to be added to avoid false warning.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> 
> For example, there're two physical ports connected to a single DPDK port
> (port id 0), and phy_affinity 1 stood for the first physical port and phy_affinity
> 2 stood for the second physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 phy_affinity 1
>         port config 0 txq 1 phy_affinity 1
>         port config 0 txq 2 phy_affinity 2
>         port config 0 txq 3 phy_affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with phy
> affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these packets will be
> sent from the first physical port, and similar with the second physical port if
> sending packets with Tx Queue 2 or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---

snip

> 2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
  2023-02-06 15:29       ` Jiawei(Jonny) Wang
@ 2023-02-07  9:40       ` Ori Kam
  2023-02-09 19:44       ` Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Ori Kam @ 2023-02-07  9:40 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang, Ferruh Yigit
  Cc: dev, Raslan Darawsheh

Hi Jiawei,


> -----Original Message-----
> From: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Sent: Friday, 3 February 2023 15:34
> 
> When multiple physical ports are connected to a single DPDK port,
> (example: kernel bonding, DPDK bonding, failsafe, etc.),
> we want to know which physical port is used for Rx and Tx.
> 
> This patch maps a DPDK Tx queue with a physical port,
> by adding tx_phy_affinity setting in Tx queue.
> The affinity number is the physical port ID where packets will be
> sent.
> Value 0 means no affinity and traffic could be routed to any
> connected physical ports, this is the default current behavior.
> 
> The number of physical ports is reported with rte_eth_dev_info_get().
> 
> The new tx_phy_affinity field is added into the padding hole of
> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> An ABI check rule needs to be added to avoid false warning.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and phy_affinity 1 stood for
> the first physical port and phy_affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 phy_affinity 1
>         port config 0 txq 1 phy_affinity 1
>         port config 0 txq 2 phy_affinity 2
>         port config 0 txq 3 phy_affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
>  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
>  app/test-pmd/config.c                       |   1 +
>  devtools/libabigail.abignore                |   5 +
>  doc/guides/rel_notes/release_23_03.rst      |   4 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
>  lib/ethdev/rte_ethdev.h                     |  10 ++
>  6 files changed, 133 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index cb8c174020..f771fcf8ac 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void
> *parsed_result,
> 
>  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
> +
> +			"port config (port_id) txq (queue_id) phy_affinity
> (value)\n"
> +			"    Set the physical affinity value "
> +			"on a specific Tx queue\n\n"
>  		);
>  	}
> 
> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t
> cmd_show_port_flow_transfer_proxy = {
>  	}
>  };
> 
> +/* *** configure port txq phy_affinity value *** */
> +struct cmd_config_tx_phy_affinity {
> +	cmdline_fixed_string_t port;
> +	cmdline_fixed_string_t config;
> +	portid_t portid;
> +	cmdline_fixed_string_t txq;
> +	uint16_t qid;
> +	cmdline_fixed_string_t phy_affinity;
> +	uint8_t value;
> +};
> +
> +static void
> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
> +				  __rte_unused struct cmdline *cl,
> +				  __rte_unused void *data)
> +{
> +	struct cmd_config_tx_phy_affinity *res = parsed_result;
> +	struct rte_eth_dev_info dev_info;
> +	struct rte_port *port;
> +	int ret;
> +
> +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
> +		return;
> +
> +	if (res->portid == (portid_t)RTE_PORT_ALL) {
> +		printf("Invalid port id\n");
> +		return;
> +	}
> +
> +	port = &ports[res->portid];
> +
> +	if (strcmp(res->txq, "txq")) {
> +		printf("Unknown parameter\n");
> +		return;
> +	}
> +	if (tx_queue_id_is_invalid(res->qid))
> +		return;
> +
> +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
> +	if (ret != 0)
> +		return;
> +
> +	if (dev_info.nb_phy_ports == 0) {
> +		printf("Number of physical ports is 0 which is invalid for PHY
> Affinity\n");
> +		return;
> +	}
> +	printf("The number of physical ports is %u\n",
> dev_info.nb_phy_ports);
> +	if (dev_info.nb_phy_ports < res->value) {
> +		printf("The PHY affinity value %u is Invalid, exceeds the "
> +		       "number of physical ports\n", res->value);
> +		return;
> +	}
> +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
> +
> +	cmd_reconfig_device_queue(res->portid, 0, 1);
> +}
> +
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 port, "port");
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 config, "config");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 portid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 txq, "txq");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      qid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 phy_affinity, "phy_affinity");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      value, RTE_UINT8);
> +
> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
> +	.f = cmd_config_tx_phy_affinity_parsed,
> +	.data = (void *)0,
> +	.help_str = "port config <port_id> txq <queue_id> phy_affinity
> <value>",
> +	.tokens = {
> +		(void *)&cmd_config_tx_phy_affinity_port,
> +		(void *)&cmd_config_tx_phy_affinity_config,
> +		(void *)&cmd_config_tx_phy_affinity_portid,
> +		(void *)&cmd_config_tx_phy_affinity_txq,
> +		(void *)&cmd_config_tx_phy_affinity_qid,
> +		(void *)&cmd_config_tx_phy_affinity_hwport,
> +		(void *)&cmd_config_tx_phy_affinity_value,
> +		NULL,
> +	},
> +};
> +
>  /*
> ****************************************************************
> **************** */
> 
>  /* list of instructions */
> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
> +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>  	NULL,
>  };
> 
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index acccb6b035..b83fb17cfa 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>  		printf("unknown\n");
>  		break;
>  	}
> +	printf("Current number of physical ports: %u\n",
> dev_info.nb_phy_ports);
>  }
> 
>  void
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 7a93de3ba1..ac7d3fb2da 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -34,3 +34,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +
> +; Ignore fields inserted in padding hole of rte_eth_txconf
> +[suppress_type]
> +        name = rte_eth_txconf
> +        has_data_member_inserted_between = {offset_of(tx_deferred_start),
> offset_of(offloads)}
> diff --git a/doc/guides/rel_notes/release_23_03.rst
> b/doc/guides/rel_notes/release_23_03.rst
> index 73f5d94e14..e99bd2dcb6 100644
> --- a/doc/guides/rel_notes/release_23_03.rst
> +++ b/doc/guides/rel_notes/release_23_03.rst
> @@ -55,6 +55,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Added affinity for multiple physical ports connected to a single DPDK
> port.**
> +
> +  * Added Tx affinity in queue setup to map a physical port.
> +
>  * **Updated AMD axgbe driver.**
> 
>    * Added multi-process support.
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 79a1fa9cb7..5c716f7679 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only
> on a specific Tx queue::
> 
>  This command should be run when the port is stopped, or else it will fail.
> 
> +config per queue Tx physical affinity
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Configure a per queue physical affinity value only on a specific Tx queue::
> +
> +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
> +
> +* ``phy_affinity``: physical port to use for sending,
> +                    when multiple physical ports are connected to
> +                    a single DPDK port.
> +
> +This command should be run when the port is stopped, otherwise it fails.
> +
>  Config VXLAN Encap outer layers
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c129ca1eaf..2fd971b7b5 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>  				      less free descriptors than this value. */
> 
>  	uint8_t tx_deferred_start; /**< Do not start queue with
> rte_eth_dev_start(). */
> +	/**
> +	 * Affinity with one of the multiple physical ports connected to the
> DPDK port.
> +	 * Value 0 means no affinity and traffic could be routed to any
> connected
> +	 * physical port.
> +	 * The first physical port is number 1 and so on.
> +	 * Number of physical ports is reported by nb_phy_ports in
> rte_eth_dev_info.
> +	 */
> +	uint8_t tx_phy_affinity;
>  	/**
>  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_*
> flags.
>  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
> @@ -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>  	/** Device redirection table size, the total number of entries. */
>  	uint16_t reta_size;
>  	uint8_t hash_key_size; /**< Hash key size in bytes */
> +	/** Number of physical ports connected with DPDK port. */
> +	uint8_t nb_phy_ports;
>  	/** Bit mask of RSS offloads, the bit offset also means flow type */
>  	uint64_t flow_type_rss_offloads;
>  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration
> */
> --
> 2.18.1

Acked-by: Ori Kam <orika@nvidia.com>
Thanks,
Ori

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
  2023-02-06 15:29       ` Jiawei(Jonny) Wang
  2023-02-07  9:40       ` Ori Kam
@ 2023-02-09 19:44       ` Ferruh Yigit
  2023-02-10 14:06         ` Jiawei(Jonny) Wang
  2023-02-14  9:38         ` Jiawei(Jonny) Wang
  2 siblings, 2 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-09 19:44 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, andrew.rybchenko,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> When multiple physical ports are connected to a single DPDK port,
> (example: kernel bonding, DPDK bonding, failsafe, etc.),
> we want to know which physical port is used for Rx and Tx.
> 

I assume "kernel bonding" is out of context, but this patch concerns
DPDK bonding, failsafe or softnic. (I will refer them as virtual bonding
device.)

To use specific queues of the virtual bonding device may interfere with
the logic of these devices, like bonding modes or RSS of the underlying
devices. I can see feature focuses on a very specific use case, but not
sure if all possible side effects taken into consideration.


And although the feature is only relavent to virtual bondiong device,
core ethdev structures are updated for this. Most use cases won't need
these, so is there a way to reduce the scope of the changes to virtual
bonding devices?


There are a few very core ethdev APIs, like:
rte_eth_dev_configure()
rte_eth_tx_queue_setup()
rte_eth_rx_queue_setup()
rte_eth_dev_start()
rte_eth_dev_info_get()

Almost every user of ehtdev uses these APIs, since these are so
fundemental I am for being a little more conservative on these APIs.

Every eccentric features are targetting these APIs first because they
are common and extending them gives an easy solution, but in long run
making these APIs more complex, harder to maintain and harder for PMDs
to support them correctly. So I am for not updating them unless it is a
generic use case.


Also as we talked about PMDs supporting them, I assume your coming PMD
patch will be implementing 'tx_phy_affinity' config option only for mlx
drivers. What will happen for other NICs? Will they silently ignore the
config option from user? So this is a problem for the DPDK application
portabiltiy.



As far as I understand target is application controlling which
sub-device is used under the virtual bonding device, can you pleaes give
more information why this is required, perhaps it can help to provide a
better/different solution.
Like adding the ability to use both bonding device and sub-device for
data path, this way application can use whichever it wants. (this is
just first solution I come with, I am not suggesting as replacement
solution, but if you can describe the problem more I am sure other
people can come with better solutions.)

And isn't this against the applicatio transparent to underneath device
being bonding device or actual device?


> This patch maps a DPDK Tx queue with a physical port,
> by adding tx_phy_affinity setting in Tx queue.
> The affinity number is the physical port ID where packets will be
> sent.
> Value 0 means no affinity and traffic could be routed to any
> connected physical ports, this is the default current behavior.
> 
> The number of physical ports is reported with rte_eth_dev_info_get().
> 
> The new tx_phy_affinity field is added into the padding hole of
> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> An ABI check rule needs to be added to avoid false warning.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and phy_affinity 1 stood for
> the first physical port and phy_affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 phy_affinity 1
>         port config 0 txq 1 phy_affinity 1
>         port config 0 txq 2 phy_affinity 2
>         port config 0 txq 3 phy_affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
>  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
>  app/test-pmd/config.c                       |   1 +
>  devtools/libabigail.abignore                |   5 +
>  doc/guides/rel_notes/release_23_03.rst      |   4 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
>  lib/ethdev/rte_ethdev.h                     |  10 ++
>  6 files changed, 133 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index cb8c174020..f771fcf8ac 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
>  
>  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
> +
> +			"port config (port_id) txq (queue_id) phy_affinity (value)\n"
> +			"    Set the physical affinity value "
> +			"on a specific Tx queue\n\n"
>  		);
>  	}
>  
> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
>  	}
>  };
>  
> +/* *** configure port txq phy_affinity value *** */
> +struct cmd_config_tx_phy_affinity {
> +	cmdline_fixed_string_t port;
> +	cmdline_fixed_string_t config;
> +	portid_t portid;
> +	cmdline_fixed_string_t txq;
> +	uint16_t qid;
> +	cmdline_fixed_string_t phy_affinity;
> +	uint8_t value;
> +};
> +
> +static void
> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
> +				  __rte_unused struct cmdline *cl,
> +				  __rte_unused void *data)
> +{
> +	struct cmd_config_tx_phy_affinity *res = parsed_result;
> +	struct rte_eth_dev_info dev_info;
> +	struct rte_port *port;
> +	int ret;
> +
> +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
> +		return;
> +
> +	if (res->portid == (portid_t)RTE_PORT_ALL) {
> +		printf("Invalid port id\n");
> +		return;
> +	}
> +
> +	port = &ports[res->portid];
> +
> +	if (strcmp(res->txq, "txq")) {
> +		printf("Unknown parameter\n");
> +		return;
> +	}
> +	if (tx_queue_id_is_invalid(res->qid))
> +		return;
> +
> +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
> +	if (ret != 0)
> +		return;
> +
> +	if (dev_info.nb_phy_ports == 0) {
> +		printf("Number of physical ports is 0 which is invalid for PHY Affinity\n");
> +		return;
> +	}
> +	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
> +	if (dev_info.nb_phy_ports < res->value) {
> +		printf("The PHY affinity value %u is Invalid, exceeds the "
> +		       "number of physical ports\n", res->value);
> +		return;
> +	}
> +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
> +
> +	cmd_reconfig_device_queue(res->portid, 0, 1);
> +}
> +
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 port, "port");
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 config, "config");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 portid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 txq, "txq");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      qid, RTE_UINT16);
> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +				 phy_affinity, "phy_affinity");
> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> +			      value, RTE_UINT8);
> +
> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
> +	.f = cmd_config_tx_phy_affinity_parsed,
> +	.data = (void *)0,
> +	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
> +	.tokens = {
> +		(void *)&cmd_config_tx_phy_affinity_port,
> +		(void *)&cmd_config_tx_phy_affinity_config,
> +		(void *)&cmd_config_tx_phy_affinity_portid,
> +		(void *)&cmd_config_tx_phy_affinity_txq,
> +		(void *)&cmd_config_tx_phy_affinity_qid,
> +		(void *)&cmd_config_tx_phy_affinity_hwport,
> +		(void *)&cmd_config_tx_phy_affinity_value,
> +		NULL,
> +	},
> +};
> +
>  /* ******************************************************************************** */
>  
>  /* list of instructions */
> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
> +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>  	NULL,
>  };
>  
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index acccb6b035..b83fb17cfa 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>  		printf("unknown\n");
>  		break;
>  	}
> +	printf("Current number of physical ports: %u\n", dev_info.nb_phy_ports);
>  }
>  
>  void
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 7a93de3ba1..ac7d3fb2da 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -34,3 +34,8 @@
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ; Temporary exceptions till next major ABI version ;
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> +
> +; Ignore fields inserted in padding hole of rte_eth_txconf
> +[suppress_type]
> +        name = rte_eth_txconf
> +        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}
> diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
> index 73f5d94e14..e99bd2dcb6 100644
> --- a/doc/guides/rel_notes/release_23_03.rst
> +++ b/doc/guides/rel_notes/release_23_03.rst
> @@ -55,6 +55,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>  
> +* **Added affinity for multiple physical ports connected to a single DPDK port.**
> +
> +  * Added Tx affinity in queue setup to map a physical port.
> +
>  * **Updated AMD axgbe driver.**
>  
>    * Added multi-process support.
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 79a1fa9cb7..5c716f7679 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
>  
>  This command should be run when the port is stopped, or else it will fail.
>  
> +config per queue Tx physical affinity
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Configure a per queue physical affinity value only on a specific Tx queue::
> +
> +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
> +
> +* ``phy_affinity``: physical port to use for sending,
> +                    when multiple physical ports are connected to
> +                    a single DPDK port.
> +
> +This command should be run when the port is stopped, otherwise it fails.
> +
>  Config VXLAN Encap outer layers
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c129ca1eaf..2fd971b7b5 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>  				      less free descriptors than this value. */
>  
>  	uint8_t tx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
> +	/**
> +	 * Affinity with one of the multiple physical ports connected to the DPDK port.
> +	 * Value 0 means no affinity and traffic could be routed to any connected
> +	 * physical port.
> +	 * The first physical port is number 1 and so on.
> +	 * Number of physical ports is reported by nb_phy_ports in rte_eth_dev_info.
> +	 */
> +	uint8_t tx_phy_affinity;
>  	/**
>  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_* flags.
>  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa
> @@ -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>  	/** Device redirection table size, the total number of entries. */
>  	uint16_t reta_size;
>  	uint8_t hash_key_size; /**< Hash key size in bytes */
> +	/** Number of physical ports connected with DPDK port. */
> +	uint8_t nb_phy_ports;
>  	/** Bit mask of RSS offloads, the bit offset also means flow type */
>  	uint64_t flow_type_rss_offloads;
>  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration */


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-09 19:44       ` Ferruh Yigit
@ 2023-02-10 14:06         ` Jiawei(Jonny) Wang
  2023-02-14  9:38         ` Jiawei(Jonny) Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-10 14:06 UTC (permalink / raw)
  To: Ferruh Yigit, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

Hi,

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 10, 2023 3:45 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
> API
> 
> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> > When multiple physical ports are connected to a single DPDK port,
> > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
> > know which physical port is used for Rx and Tx.
> >
> 
> I assume "kernel bonding" is out of context, but this patch concerns DPDK
> bonding, failsafe or softnic. (I will refer them as virtual bonding
> device.)
> 

''kernel bonding'' can be thought as Linux bonding.

> To use specific queues of the virtual bonding device may interfere with the
> logic of these devices, like bonding modes or RSS of the underlying devices. I
> can see feature focuses on a very specific use case, but not sure if all possible
> side effects taken into consideration.
> 
> 
> And although the feature is only relavent to virtual bondiong device, core
> ethdev structures are updated for this. Most use cases won't need these, so is
> there a way to reduce the scope of the changes to virtual bonding devices?
> 
> 
> There are a few very core ethdev APIs, like:
> rte_eth_dev_configure()
> rte_eth_tx_queue_setup()
> rte_eth_rx_queue_setup()
> rte_eth_dev_start()
> rte_eth_dev_info_get()
> 
> Almost every user of ehtdev uses these APIs, since these are so fundemental I
> am for being a little more conservative on these APIs.
> 
> Every eccentric features are targetting these APIs first because they are
> common and extending them gives an easy solution, but in long run making
> these APIs more complex, harder to maintain and harder for PMDs to support
> them correctly. So I am for not updating them unless it is a generic use case.
> 
> 
> Also as we talked about PMDs supporting them, I assume your coming PMD
> patch will be implementing 'tx_phy_affinity' config option only for mlx drivers.
> What will happen for other NICs? Will they silently ignore the config option
> from user? So this is a problem for the DPDK application portabiltiy.
> 

Yes, the PMD patch is for net/mlx5 only, the 'tx_phy_affinity' can be used for HW to
choose an mapping queue with physical port.

Other NICs ignore this new configuration for now, or we should add checking in queue setup?

> 
> 
> As far as I understand target is application controlling which sub-device is used
> under the virtual bonding device, can you pleaes give more information why
> this is required, perhaps it can help to provide a better/different solution.
> Like adding the ability to use both bonding device and sub-device for data path,
> this way application can use whichever it wants. (this is just first solution I
> come with, I am not suggesting as replacement solution, but if you can describe
> the problem more I am sure other people can come with better solutions.)
> 

For example: 
There're two physical ports (assume device interface: eth2, eth3), and bonded these two
Devices into one interface (assume bond0).
DPDK application probed/attached the bond0 only (dpdk port id:0),  while sending traffic from dpdk port,
We want to know the packet be sent into which physical port (eth2 or eth3).

With the new configuration, the queue could be configured with underlay device,
Then DPDK application could send the traffic into correct queue as desired.

Add all devices into DPDK, means that need to create multiple RX/TX Queue resources on it.


> And isn't this against the applicatio transparent to underneath device being
> bonding device or actual device?
> 
> 
> > This patch maps a DPDK Tx queue with a physical port, by adding
> > tx_phy_affinity setting in Tx queue.
> > The affinity number is the physical port ID where packets will be
> > sent.
> > Value 0 means no affinity and traffic could be routed to any connected
> > physical ports, this is the default current behavior.
> >
> > The number of physical ports is reported with rte_eth_dev_info_get().
> >
> > The new tx_phy_affinity field is added into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > An ABI check rule needs to be added to avoid false warning.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and phy_affinity 1 stood for the first physical port
> > and phy_affinity 2 stood for the second physical port.
> > Use the below commands to config tx phy affinity for per Tx Queue:
> >         port config 0 txq 0 phy_affinity 1
> >         port config 0 txq 1 phy_affinity 1
> >         port config 0 txq 2 phy_affinity 2
> >         port config 0 txq 3 phy_affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> > packets will be sent from the first physical port, and similar with
> > the second physical port if sending packets with Tx Queue 2 or Tx
> > Queue 3.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> > ---
snip

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-09 19:44       ` Ferruh Yigit
  2023-02-10 14:06         ` Jiawei(Jonny) Wang
@ 2023-02-14  9:38         ` Jiawei(Jonny) Wang
  2023-02-14 10:01           ` Ferruh Yigit
  1 sibling, 1 reply; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-14  9:38 UTC (permalink / raw)
  To: Ferruh Yigit, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

Hi,

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 10, 2023 3:45 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
> API
> 
> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> > When multiple physical ports are connected to a single DPDK port,
> > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
> > know which physical port is used for Rx and Tx.
> >
> 
> I assume "kernel bonding" is out of context, but this patch concerns DPDK
> bonding, failsafe or softnic. (I will refer them as virtual bonding
> device.)
> 
> To use specific queues of the virtual bonding device may interfere with the
> logic of these devices, like bonding modes or RSS of the underlying devices. I
> can see feature focuses on a very specific use case, but not sure if all possible
> side effects taken into consideration.
> 
> 
> And although the feature is only relavent to virtual bondiong device, core
> ethdev structures are updated for this. Most use cases won't need these, so is
> there a way to reduce the scope of the changes to virtual bonding devices?
> 
> 
> There are a few very core ethdev APIs, like:
> rte_eth_dev_configure()
> rte_eth_tx_queue_setup()
> rte_eth_rx_queue_setup()
> rte_eth_dev_start()
> rte_eth_dev_info_get()
> 
> Almost every user of ehtdev uses these APIs, since these are so fundemental I
> am for being a little more conservative on these APIs.
> 
> Every eccentric features are targetting these APIs first because they are
> common and extending them gives an easy solution, but in long run making
> these APIs more complex, harder to maintain and harder for PMDs to support
> them correctly. So I am for not updating them unless it is a generic use case.
> 
> 
> Also as we talked about PMDs supporting them, I assume your coming PMD
> patch will be implementing 'tx_phy_affinity' config option only for mlx drivers.
> What will happen for other NICs? Will they silently ignore the config option
> from user? So this is a problem for the DPDK application portabiltiy.
> 
> 
> 
> As far as I understand target is application controlling which sub-device is used
> under the virtual bonding device, can you pleaes give more information why
> this is required, perhaps it can help to provide a better/different solution.
> Like adding the ability to use both bonding device and sub-device for data path,
> this way application can use whichever it wants. (this is just first solution I
> come with, I am not suggesting as replacement solution, but if you can describe
> the problem more I am sure other people can come with better solutions.)
> 
> And isn't this against the applicatio transparent to underneath device being
> bonding device or actual device?
> 
> 

OK, I will send the new version with separate functions in ethdev layer, 
to support the Map a Tx queue to port and get the number of ports.
And these functions work with device ops callback, other NICs will reported
The unsupported the ops callback is NULL.

> > This patch maps a DPDK Tx queue with a physical port, by adding
> > tx_phy_affinity setting in Tx queue.
> > The affinity number is the physical port ID where packets will be
> > sent.
> > Value 0 means no affinity and traffic could be routed to any connected
> > physical ports, this is the default current behavior.
> >
> > The number of physical ports is reported with rte_eth_dev_info_get().
> >
> > The new tx_phy_affinity field is added into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > An ABI check rule needs to be added to avoid false warning.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and phy_affinity 1 stood for the first physical port
> > and phy_affinity 2 stood for the second physical port.
> > Use the below commands to config tx phy affinity for per Tx Queue:
> >         port config 0 txq 0 phy_affinity 1
> >         port config 0 txq 1 phy_affinity 1
> >         port config 0 txq 2 phy_affinity 2
> >         port config 0 txq 3 phy_affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> > packets will be sent from the first physical port, and similar with
> > the second physical port if sending packets with Tx Queue 2 or Tx
> > Queue 3.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> > ---
> >  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
> >  app/test-pmd/config.c                       |   1 +
> >  devtools/libabigail.abignore                |   5 +
> >  doc/guides/rel_notes/release_23_03.rst      |   4 +
> >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
> >  lib/ethdev/rte_ethdev.h                     |  10 ++
> >  6 files changed, 133 insertions(+)
> >
> > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > cb8c174020..f771fcf8ac 100644
> > --- a/app/test-pmd/cmdline.c
> > +++ b/app/test-pmd/cmdline.c
> > @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void
> > *parsed_result,
> >
> >  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
> >  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
> > +
> > +			"port config (port_id) txq (queue_id) phy_affinity
> (value)\n"
> > +			"    Set the physical affinity value "
> > +			"on a specific Tx queue\n\n"
> >  		);
> >  	}
> >
> > @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t
> cmd_show_port_flow_transfer_proxy = {
> >  	}
> >  };
> >
> > +/* *** configure port txq phy_affinity value *** */ struct
> > +cmd_config_tx_phy_affinity {
> > +	cmdline_fixed_string_t port;
> > +	cmdline_fixed_string_t config;
> > +	portid_t portid;
> > +	cmdline_fixed_string_t txq;
> > +	uint16_t qid;
> > +	cmdline_fixed_string_t phy_affinity;
> > +	uint8_t value;
> > +};
> > +
> > +static void
> > +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
> > +				  __rte_unused struct cmdline *cl,
> > +				  __rte_unused void *data)
> > +{
> > +	struct cmd_config_tx_phy_affinity *res = parsed_result;
> > +	struct rte_eth_dev_info dev_info;
> > +	struct rte_port *port;
> > +	int ret;
> > +
> > +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
> > +		return;
> > +
> > +	if (res->portid == (portid_t)RTE_PORT_ALL) {
> > +		printf("Invalid port id\n");
> > +		return;
> > +	}
> > +
> > +	port = &ports[res->portid];
> > +
> > +	if (strcmp(res->txq, "txq")) {
> > +		printf("Unknown parameter\n");
> > +		return;
> > +	}
> > +	if (tx_queue_id_is_invalid(res->qid))
> > +		return;
> > +
> > +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
> > +	if (ret != 0)
> > +		return;
> > +
> > +	if (dev_info.nb_phy_ports == 0) {
> > +		printf("Number of physical ports is 0 which is invalid for PHY
> Affinity\n");
> > +		return;
> > +	}
> > +	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
> > +	if (dev_info.nb_phy_ports < res->value) {
> > +		printf("The PHY affinity value %u is Invalid, exceeds the "
> > +		       "number of physical ports\n", res->value);
> > +		return;
> > +	}
> > +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
> > +
> > +	cmd_reconfig_device_queue(res->portid, 0, 1); }
> > +
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
> > +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +				 port, "port");
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
> > +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +				 config, "config");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
> > +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +				 portid, RTE_UINT16);
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
> > +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +				 txq, "txq");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
> > +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +			      qid, RTE_UINT16);
> > +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
> > +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +				 phy_affinity, "phy_affinity");
> > +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
> > +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
> > +			      value, RTE_UINT8);
> > +
> > +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
> > +	.f = cmd_config_tx_phy_affinity_parsed,
> > +	.data = (void *)0,
> > +	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
> > +	.tokens = {
> > +		(void *)&cmd_config_tx_phy_affinity_port,
> > +		(void *)&cmd_config_tx_phy_affinity_config,
> > +		(void *)&cmd_config_tx_phy_affinity_portid,
> > +		(void *)&cmd_config_tx_phy_affinity_txq,
> > +		(void *)&cmd_config_tx_phy_affinity_qid,
> > +		(void *)&cmd_config_tx_phy_affinity_hwport,
> > +		(void *)&cmd_config_tx_phy_affinity_value,
> > +		NULL,
> > +	},
> > +};
> > +
> >  /*
> >
> ****************************************************************
> ******
> > ********** */
> >
> >  /* list of instructions */
> > @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
> >  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
> >  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
> >  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
> > +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
> >  	NULL,
> >  };
> >
> > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index
> > acccb6b035..b83fb17cfa 100644
> > --- a/app/test-pmd/config.c
> > +++ b/app/test-pmd/config.c
> > @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
> >  		printf("unknown\n");
> >  		break;
> >  	}
> > +	printf("Current number of physical ports: %u\n",
> > +dev_info.nb_phy_ports);
> >  }
> >
> >  void
> > diff --git a/devtools/libabigail.abignore
> > b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644
> > --- a/devtools/libabigail.abignore
> > +++ b/devtools/libabigail.abignore
> > @@ -34,3 +34,8 @@
> >  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >  ; Temporary exceptions till next major ABI version ;
> > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> > +
> > +; Ignore fields inserted in padding hole of rte_eth_txconf
> > +[suppress_type]
> > +        name = rte_eth_txconf
> > +        has_data_member_inserted_between =
> > +{offset_of(tx_deferred_start), offset_of(offloads)}
> > diff --git a/doc/guides/rel_notes/release_23_03.rst
> > b/doc/guides/rel_notes/release_23_03.rst
> > index 73f5d94e14..e99bd2dcb6 100644
> > --- a/doc/guides/rel_notes/release_23_03.rst
> > +++ b/doc/guides/rel_notes/release_23_03.rst
> > @@ -55,6 +55,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added affinity for multiple physical ports connected to a single
> > +DPDK port.**
> > +
> > +  * Added Tx affinity in queue setup to map a physical port.
> > +
> >  * **Updated AMD axgbe driver.**
> >
> >    * Added multi-process support.
> > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > index 79a1fa9cb7..5c716f7679 100644
> > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only
> on a specific Tx queue::
> >
> >  This command should be run when the port is stopped, or else it will fail.
> >
> > +config per queue Tx physical affinity
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Configure a per queue physical affinity value only on a specific Tx queue::
> > +
> > +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
> > +
> > +* ``phy_affinity``: physical port to use for sending,
> > +                    when multiple physical ports are connected to
> > +                    a single DPDK port.
> > +
> > +This command should be run when the port is stopped, otherwise it fails.
> > +
> >  Config VXLAN Encap outer layers
> >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > c129ca1eaf..2fd971b7b5 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
> >  				      less free descriptors than this value. */
> >
> >  	uint8_t tx_deferred_start; /**< Do not start queue with
> > rte_eth_dev_start(). */
> > +	/**
> > +	 * Affinity with one of the multiple physical ports connected to the
> DPDK port.
> > +	 * Value 0 means no affinity and traffic could be routed to any
> connected
> > +	 * physical port.
> > +	 * The first physical port is number 1 and so on.
> > +	 * Number of physical ports is reported by nb_phy_ports in
> rte_eth_dev_info.
> > +	 */
> > +	uint8_t tx_phy_affinity;
> >  	/**
> >  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_*
> flags.
> >  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@
> > -1744,6 +1752,8 @@ struct rte_eth_dev_info {
> >  	/** Device redirection table size, the total number of entries. */
> >  	uint16_t reta_size;
> >  	uint8_t hash_key_size; /**< Hash key size in bytes */
> > +	/** Number of physical ports connected with DPDK port. */
> > +	uint8_t nb_phy_ports;
> >  	/** Bit mask of RSS offloads, the bit offset also means flow type */
> >  	uint64_t flow_type_rss_offloads;
> >  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration
> > */


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API
  2023-02-14  9:38         ` Jiawei(Jonny) Wang
@ 2023-02-14 10:01           ` Ferruh Yigit
  0 siblings, 0 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-14 10:01 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

On 2/14/2023 9:38 AM, Jiawei(Jonny) Wang wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, February 10, 2023 3:45 AM
>> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
>> Yuying Zhang <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
>> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
>> API
>>
>> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
>>> When multiple physical ports are connected to a single DPDK port,
>>> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
>>> know which physical port is used for Rx and Tx.
>>>
>>
>> I assume "kernel bonding" is out of context, but this patch concerns DPDK
>> bonding, failsafe or softnic. (I will refer them as virtual bonding
>> device.)
>>
>> To use specific queues of the virtual bonding device may interfere with the
>> logic of these devices, like bonding modes or RSS of the underlying devices. I
>> can see feature focuses on a very specific use case, but not sure if all possible
>> side effects taken into consideration.
>>
>>
>> And although the feature is only relavent to virtual bondiong device, core
>> ethdev structures are updated for this. Most use cases won't need these, so is
>> there a way to reduce the scope of the changes to virtual bonding devices?
>>
>>
>> There are a few very core ethdev APIs, like:
>> rte_eth_dev_configure()
>> rte_eth_tx_queue_setup()
>> rte_eth_rx_queue_setup()
>> rte_eth_dev_start()
>> rte_eth_dev_info_get()
>>
>> Almost every user of ehtdev uses these APIs, since these are so fundemental I
>> am for being a little more conservative on these APIs.
>>
>> Every eccentric features are targetting these APIs first because they are
>> common and extending them gives an easy solution, but in long run making
>> these APIs more complex, harder to maintain and harder for PMDs to support
>> them correctly. So I am for not updating them unless it is a generic use case.
>>
>>
>> Also as we talked about PMDs supporting them, I assume your coming PMD
>> patch will be implementing 'tx_phy_affinity' config option only for mlx drivers.
>> What will happen for other NICs? Will they silently ignore the config option
>> from user? So this is a problem for the DPDK application portabiltiy.
>>
>>
>>
>> As far as I understand target is application controlling which sub-device is used
>> under the virtual bonding device, can you pleaes give more information why
>> this is required, perhaps it can help to provide a better/different solution.
>> Like adding the ability to use both bonding device and sub-device for data path,
>> this way application can use whichever it wants. (this is just first solution I
>> come with, I am not suggesting as replacement solution, but if you can describe
>> the problem more I am sure other people can come with better solutions.)
>>
>> And isn't this against the applicatio transparent to underneath device being
>> bonding device or actual device?
>>
>>
> 
> OK, I will send the new version with separate functions in ethdev layer, 
> to support the Map a Tx queue to port and get the number of ports.
> And these functions work with device ops callback, other NICs will reported
> The unsupported the ops callback is NULL.
> 

OK, thanks Jonny, at least this separates the fetaure to its own APIs
which reduces the impact for applications and drivers that are not using
this feature.


>>> This patch maps a DPDK Tx queue with a physical port, by adding
>>> tx_phy_affinity setting in Tx queue.
>>> The affinity number is the physical port ID where packets will be
>>> sent.
>>> Value 0 means no affinity and traffic could be routed to any connected
>>> physical ports, this is the default current behavior.
>>>
>>> The number of physical ports is reported with rte_eth_dev_info_get().
>>>
>>> The new tx_phy_affinity field is added into the padding hole of
>>> rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
>>> An ABI check rule needs to be added to avoid false warning.
>>>
>>> Add the testpmd command line:
>>> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
>>>
>>> For example, there're two physical ports connected to a single DPDK
>>> port (port id 0), and phy_affinity 1 stood for the first physical port
>>> and phy_affinity 2 stood for the second physical port.
>>> Use the below commands to config tx phy affinity for per Tx Queue:
>>>         port config 0 txq 0 phy_affinity 1
>>>         port config 0 txq 1 phy_affinity 1
>>>         port config 0 txq 2 phy_affinity 2
>>>         port config 0 txq 3 phy_affinity 2
>>>
>>> These commands config the Tx Queue index 0 and Tx Queue index 1 with
>>> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
>>> packets will be sent from the first physical port, and similar with
>>> the second physical port if sending packets with Tx Queue 2 or Tx
>>> Queue 3.
>>>
>>> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
>>> ---
>>>  app/test-pmd/cmdline.c                      | 100 ++++++++++++++++++++
>>>  app/test-pmd/config.c                       |   1 +
>>>  devtools/libabigail.abignore                |   5 +
>>>  doc/guides/rel_notes/release_23_03.rst      |   4 +
>>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  13 +++
>>>  lib/ethdev/rte_ethdev.h                     |  10 ++
>>>  6 files changed, 133 insertions(+)
>>>
>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
>>> cb8c174020..f771fcf8ac 100644
>>> --- a/app/test-pmd/cmdline.c
>>> +++ b/app/test-pmd/cmdline.c
>>> @@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void
>>> *parsed_result,
>>>
>>>  			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
>>>  			"    Cleanup txq mbufs for a specific Tx queue\n\n"
>>> +
>>> +			"port config (port_id) txq (queue_id) phy_affinity
>> (value)\n"
>>> +			"    Set the physical affinity value "
>>> +			"on a specific Tx queue\n\n"
>>>  		);
>>>  	}
>>>
>>> @@ -12633,6 +12637,101 @@ static cmdline_parse_inst_t
>> cmd_show_port_flow_transfer_proxy = {
>>>  	}
>>>  };
>>>
>>> +/* *** configure port txq phy_affinity value *** */ struct
>>> +cmd_config_tx_phy_affinity {
>>> +	cmdline_fixed_string_t port;
>>> +	cmdline_fixed_string_t config;
>>> +	portid_t portid;
>>> +	cmdline_fixed_string_t txq;
>>> +	uint16_t qid;
>>> +	cmdline_fixed_string_t phy_affinity;
>>> +	uint8_t value;
>>> +};
>>> +
>>> +static void
>>> +cmd_config_tx_phy_affinity_parsed(void *parsed_result,
>>> +				  __rte_unused struct cmdline *cl,
>>> +				  __rte_unused void *data)
>>> +{
>>> +	struct cmd_config_tx_phy_affinity *res = parsed_result;
>>> +	struct rte_eth_dev_info dev_info;
>>> +	struct rte_port *port;
>>> +	int ret;
>>> +
>>> +	if (port_id_is_invalid(res->portid, ENABLED_WARN))
>>> +		return;
>>> +
>>> +	if (res->portid == (portid_t)RTE_PORT_ALL) {
>>> +		printf("Invalid port id\n");
>>> +		return;
>>> +	}
>>> +
>>> +	port = &ports[res->portid];
>>> +
>>> +	if (strcmp(res->txq, "txq")) {
>>> +		printf("Unknown parameter\n");
>>> +		return;
>>> +	}
>>> +	if (tx_queue_id_is_invalid(res->qid))
>>> +		return;
>>> +
>>> +	ret = eth_dev_info_get_print_err(res->portid, &dev_info);
>>> +	if (ret != 0)
>>> +		return;
>>> +
>>> +	if (dev_info.nb_phy_ports == 0) {
>>> +		printf("Number of physical ports is 0 which is invalid for PHY
>> Affinity\n");
>>> +		return;
>>> +	}
>>> +	printf("The number of physical ports is %u\n", dev_info.nb_phy_ports);
>>> +	if (dev_info.nb_phy_ports < res->value) {
>>> +		printf("The PHY affinity value %u is Invalid, exceeds the "
>>> +		       "number of physical ports\n", res->value);
>>> +		return;
>>> +	}
>>> +	port->txq[res->qid].conf.tx_phy_affinity = res->value;
>>> +
>>> +	cmd_reconfig_device_queue(res->portid, 0, 1); }
>>> +
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_port =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 port, "port");
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_config =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 config, "config");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_portid =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 portid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_txq =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 txq, "txq");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_qid =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +			      qid, RTE_UINT16);
>>> +cmdline_parse_token_string_t cmd_config_tx_phy_affinity_hwport =
>>> +	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +				 phy_affinity, "phy_affinity");
>>> +cmdline_parse_token_num_t cmd_config_tx_phy_affinity_value =
>>> +	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_phy_affinity,
>>> +			      value, RTE_UINT8);
>>> +
>>> +static cmdline_parse_inst_t cmd_config_tx_phy_affinity = {
>>> +	.f = cmd_config_tx_phy_affinity_parsed,
>>> +	.data = (void *)0,
>>> +	.help_str = "port config <port_id> txq <queue_id> phy_affinity <value>",
>>> +	.tokens = {
>>> +		(void *)&cmd_config_tx_phy_affinity_port,
>>> +		(void *)&cmd_config_tx_phy_affinity_config,
>>> +		(void *)&cmd_config_tx_phy_affinity_portid,
>>> +		(void *)&cmd_config_tx_phy_affinity_txq,
>>> +		(void *)&cmd_config_tx_phy_affinity_qid,
>>> +		(void *)&cmd_config_tx_phy_affinity_hwport,
>>> +		(void *)&cmd_config_tx_phy_affinity_value,
>>> +		NULL,
>>> +	},
>>> +};
>>> +
>>>  /*
>>>
>> ****************************************************************
>> ******
>>> ********** */
>>>
>>>  /* list of instructions */
>>> @@ -12866,6 +12965,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
>>>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
>>>  	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
>>>  	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
>>> +	(cmdline_parse_inst_t *)&cmd_config_tx_phy_affinity,
>>>  	NULL,
>>>  };
>>>
>>> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index
>>> acccb6b035..b83fb17cfa 100644
>>> --- a/app/test-pmd/config.c
>>> +++ b/app/test-pmd/config.c
>>> @@ -936,6 +936,7 @@ port_infos_display(portid_t port_id)
>>>  		printf("unknown\n");
>>>  		break;
>>>  	}
>>> +	printf("Current number of physical ports: %u\n",
>>> +dev_info.nb_phy_ports);
>>>  }
>>>
>>>  void
>>> diff --git a/devtools/libabigail.abignore
>>> b/devtools/libabigail.abignore index 7a93de3ba1..ac7d3fb2da 100644
>>> --- a/devtools/libabigail.abignore
>>> +++ b/devtools/libabigail.abignore
>>> @@ -34,3 +34,8 @@
>>>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>  ; Temporary exceptions till next major ABI version ;
>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>> +
>>> +; Ignore fields inserted in padding hole of rte_eth_txconf
>>> +[suppress_type]
>>> +        name = rte_eth_txconf
>>> +        has_data_member_inserted_between =
>>> +{offset_of(tx_deferred_start), offset_of(offloads)}
>>> diff --git a/doc/guides/rel_notes/release_23_03.rst
>>> b/doc/guides/rel_notes/release_23_03.rst
>>> index 73f5d94e14..e99bd2dcb6 100644
>>> --- a/doc/guides/rel_notes/release_23_03.rst
>>> +++ b/doc/guides/rel_notes/release_23_03.rst
>>> @@ -55,6 +55,10 @@ New Features
>>>       Also, make sure to start the actual text at the margin.
>>>       =======================================================
>>>
>>> +* **Added affinity for multiple physical ports connected to a single
>>> +DPDK port.**
>>> +
>>> +  * Added Tx affinity in queue setup to map a physical port.
>>> +
>>>  * **Updated AMD axgbe driver.**
>>>
>>>    * Added multi-process support.
>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> index 79a1fa9cb7..5c716f7679 100644
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> @@ -1605,6 +1605,19 @@ Enable or disable a per queue Tx offloading only
>> on a specific Tx queue::
>>>
>>>  This command should be run when the port is stopped, or else it will fail.
>>>
>>> +config per queue Tx physical affinity
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Configure a per queue physical affinity value only on a specific Tx queue::
>>> +
>>> +   testpmd> port (port_id) txq (queue_id) phy_affinity (value)
>>> +
>>> +* ``phy_affinity``: physical port to use for sending,
>>> +                    when multiple physical ports are connected to
>>> +                    a single DPDK port.
>>> +
>>> +This command should be run when the port is stopped, otherwise it fails.
>>> +
>>>  Config VXLAN Encap outer layers
>>>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>> c129ca1eaf..2fd971b7b5 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -1138,6 +1138,14 @@ struct rte_eth_txconf {
>>>  				      less free descriptors than this value. */
>>>
>>>  	uint8_t tx_deferred_start; /**< Do not start queue with
>>> rte_eth_dev_start(). */
>>> +	/**
>>> +	 * Affinity with one of the multiple physical ports connected to the
>> DPDK port.
>>> +	 * Value 0 means no affinity and traffic could be routed to any
>> connected
>>> +	 * physical port.
>>> +	 * The first physical port is number 1 and so on.
>>> +	 * Number of physical ports is reported by nb_phy_ports in
>> rte_eth_dev_info.
>>> +	 */
>>> +	uint8_t tx_phy_affinity;
>>>  	/**
>>>  	 * Per-queue Tx offloads to be set  using RTE_ETH_TX_OFFLOAD_*
>> flags.
>>>  	 * Only offloads set on tx_queue_offload_capa or tx_offload_capa @@
>>> -1744,6 +1752,8 @@ struct rte_eth_dev_info {
>>>  	/** Device redirection table size, the total number of entries. */
>>>  	uint16_t reta_size;
>>>  	uint8_t hash_key_size; /**< Hash key size in bytes */
>>> +	/** Number of physical ports connected with DPDK port. */
>>> +	uint8_t nb_phy_ports;
>>>  	/** Bit mask of RSS offloads, the bit offset also means flow type */
>>>  	uint64_t flow_type_rss_offloads;
>>>  	struct rte_eth_rxconf default_rxconf; /**< Default Rx configuration
>>> */
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
                     ` (2 preceding siblings ...)
  2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
@ 2023-02-14 15:48   ` Jiawei Wang
  2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
  2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
  2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
  2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
  5 siblings, 2 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-14 15:48 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko; +Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/

v5:
* Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an aggregated port.
* Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated ports.
* Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

v4:
* Rebase the latest code
* Update new field description
* Update release release note
* Reword the commit log to make clear

v3:
* Update exception rule
* Update the commit log
* Add the description for PHY affinity and numbering definition
* Add the number of physical ports into device info
* Change the patch order 

v2: Update based on the comments

Jiawei Wang (2):
  ethdev: introduce the Tx map API for aggregated ports
  ethdev: add Aggregated affinity match item

 app/test-pmd/cmdline.c                      | 96 +++++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 | 28 ++++++
 doc/guides/prog_guide/rte_flow.rst          |  9 ++
 doc/guides/rel_notes/release_23_03.rst      |  8 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 18 ++++
 lib/ethdev/ethdev_driver.h                  | 39 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 49 +++++++++++
 lib/ethdev/rte_ethdev.h                     | 46 ++++++++++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 ++++++++
 lib/ethdev/version.map                      |  2 +
 13 files changed, 354 insertions(+)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
@ 2023-02-14 15:48     ` Jiawei Wang
  2023-02-15 11:41       ` Jiawei(Jonny) Wang
                         ` (2 more replies)
  2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
  1 sibling, 3 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-14 15:48 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port use for Tx via a queue.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
and rte_eth_dev_map_aggr_tx_affinity() functions.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) affinity (value)

For example, there're two physical ports connected to
a single DPDK port (port id 0), and affinity 1 stood for
the first physical port and affinity 2 stood for the second
physical port.
Use the below commands to config tx phy affinity for per Tx Queue:
        port config 0 txq 0 affinity 1
        port config 0 txq 1 affinity 1
        port config 0 txq 2 affinity 2
        port config 0 txq 3 affinity 2

These commands config the Tx Queue index 0 and Tx Queue index 1 with
phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
these packets will be sent from the first physical port, and similar
with the second physical port if sending packets with Tx Queue 2
or Tx Queue 3.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline.c                      | 96 +++++++++++++++++++++
 doc/guides/rel_notes/release_23_03.rst      |  7 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 +++
 lib/ethdev/ethdev_driver.h                  | 39 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 49 +++++++++++
 lib/ethdev/rte_ethdev.h                     | 46 ++++++++++
 lib/ethdev/version.map                      |  2 +
 9 files changed, 276 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb7ff2b449..36c798ac45 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) affinity (value)\n"
+			"    Map a Tx queue with an aggregated port "
+			"of the DPDK port\n\n"
 		);
 	}
 
@@ -12636,6 +12640,97 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq affinity value *** */
+struct cmd_config_tx_affinity_map {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_affinity_map_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_affinity_map *res = parsed_result;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = rte_eth_dev_count_aggr_ports(res->portid);
+	if (ret < 0) {
+		printf("Failed to count the aggregated ports: (%s)\n",
+			strerror(-ret));
+		return;
+	}
+	if (ret == 0) {
+		printf("Number of aggregated ports is 0 which is invalid for affinity mapping\n");
+		return;
+	}
+	if (ret < res->value) {
+		printf("The affinity value %u is Invalid, exceeds the "
+		       "number of aggregated ports\n", res->value);
+		return;
+	}
+
+	rte_eth_dev_map_aggr_tx_affinity(res->portid, res->qid, res->value);
+}
+
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_affinity =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 affinity, "affinity");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_affinity_map = {
+	.f = cmd_config_tx_affinity_map_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_affinity_map_port,
+		(void *)&cmd_config_tx_affinity_map_config,
+		(void *)&cmd_config_tx_affinity_map_portid,
+		(void *)&cmd_config_tx_affinity_map_txq,
+		(void *)&cmd_config_tx_affinity_map_qid,
+		(void *)&cmd_config_tx_affinity_map_affinity,
+		(void *)&cmd_config_tx_affinity_map_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12869,6 +12964,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
 	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
+	(cmdline_parse_inst_t *)&cmd_config_tx_affinity_map,
 	NULL,
 };
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index ab998a5357..becf6fed91 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -68,6 +68,13 @@ New Features
   * Applications can register a callback at startup via
     ``rte_lcore_register_usage_cb()`` to provide lcore usage information.
 
+* **Added support for Tx queue mapping with an aggregated port.**
+
+  * Introduced new function ``rte_eth_dev_count_aggr_ports()``
+    to get the number of aggregated ports.
+  * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
+    to map a Tx queue with an aggregated port of the DPDK port.
+
 * **Added rte_flow support for matching IPv6 routing extension header fields.**
 
   Added ``ipv6_routing_ext`` items in rte_flow to match IPv6 routing extension
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 357adb09d7..1d3c372601 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1612,6 +1612,20 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx affinity mapping
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Map a Tx queue with an aggregated port of the DPDK port (specified with port_id)::
+
+   testpmd> port (port_id) txq (queue_id) affinity (value)
+
+* ``affinity``: the number of the aggregated port.
+                When multiple ports are aggregated into a single one,
+                it allows to choose which port to use for Tx via a queue.
+
+This command should be run when the port is stopped, otherwise it fails.
+
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..b7fdc454a8 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const struct rte_eth_dev *dev,
 					uint16_t queue_id, uint16_t offset,
 					uint16_t num, FILE *file);
 
+/**
+ * @internal
+ * Get the number of aggregated ports.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ *
+ * @return
+ *   Negative errno value on error, 0 or positive on success.
+ *
+ * @retval >=0
+ *   The number of aggregated port if success.
+ * @retval -ENOTSUP
+ *   Get aggregated ports API is not supported.
+ */
+typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
+
+/**
+ * @internal
+ * Map a Tx queue with an aggregated port of the DPDK port.
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ * @param affinity
+ *   The number of the aggregated port.
+ *
+ * @return
+ *   Negative on error, 0 on success.
+ */
+typedef int (*eth_map_aggr_tx_affinity_t)(uint16_t port_id, uint16_t tx_queue_id,
+					  uint8_t affinity);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -1403,6 +1437,11 @@ struct eth_dev_ops {
 	eth_cman_config_set_t cman_config_set;
 	/** Retrieve congestion management configuration */
 	eth_cman_config_get_t cman_config_get;
+
+	/** Get the number of aggregated ports */
+	eth_count_aggr_ports_t count_aggr_ports;
+	/** Map a Tx queue with an aggregated port of the DPDK port */
+	eth_map_aggr_tx_affinity_t map_aggr_tx_affinity;
 };
 
 /**
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 9fae22c490..4e210d099b 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -1385,6 +1385,23 @@ RTE_TRACE_POINT(
 	rte_trace_point_emit_int(ret);
 )
 
+RTE_TRACE_POINT(
+	rte_eth_trace_count_aggr_ports,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_int(ret);
+)
+
+RTE_TRACE_POINT(
+	rte_eth_trace_map_aggr_tx_affinity,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id,
+			     uint8_t affinity, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_u8(affinity);
+	rte_trace_point_emit_int(ret);
+)
+
 RTE_TRACE_POINT(
 	rte_flow_trace_dynf_metadata_register,
 	RTE_TRACE_POINT_ARGS(int offset, uint64_t flag),
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 34d12e2859..61010cae56 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -475,6 +475,12 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_set,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_get,
 	lib.ethdev.cman_config_get)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
+	lib.ethdev.count_aggr_ports)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
+	lib.ethdev.map_aggr_tx_affinity)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index dc0a4eb12c..1d5b3a16b2 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6915,6 +6915,55 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
 	return j;
 }
 
+int rte_eth_dev_count_aggr_ports(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (*dev->dev_ops->count_aggr_ports == NULL)
+		return -ENOTSUP;
+	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
+
+	rte_eth_trace_count_aggr_ports(port_id, ret);
+
+	return ret;
+}
+
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (tx_queue_id >= dev->data->nb_tx_queues) {
+		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
+		return -EINVAL;
+	}
+
+	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
+		return -ENOTSUP;
+
+	if (dev->data->dev_started) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be stopped to allow configuration\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
+				tx_queue_id, affinity));
+
+	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
+
+	return ret;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..07b8250eb8 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
 __rte_experimental
 int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the number of aggregated ports.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (>=0) the number of aggregated port if success.
+ *   - (-ENOTSUP) if not supported.
+ */
+__rte_experimental
+int rte_eth_dev_count_aggr_ports(uint16_t port_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
+ *  When multiple ports are aggregated into a single one,
+ *  it allows to choose which port to use for Tx via a queue.
+ *
+ *  The application should use rte_eth_dev_map_aggr_tx_affinity()
+ *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
+ *  before rte_eth_dev_start().
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param affinity
+ *   The number of the aggregated port.
+ *   Value 0 means no affinity and traffic could be routed to any aggregated port.
+ *   The first aggregated port is number 1 and so on.
+ *   The maximum number is given by rte_eth_dev_count_aggr_ports().
+ *
+ * @return
+ *   Zero if successful. Non-zero otherwise.
+ */
+__rte_experimental
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity);
+
 /**
  * Return the NUMA socket to which an Ethernet device is connected
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index dbc2bffe64..685aa71e51 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -300,6 +300,8 @@ EXPERIMENTAL {
 	rte_mtr_meter_profile_get;
 
 	# added in 23.03
+	rte_eth_dev_count_aggr_ports;
+	rte_eth_dev_map_aggr_tx_affinity;
 	rte_flow_async_create_by_index;
 };
 
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 2/2] ethdev: add Aggregated affinity match item
  2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
  2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
@ 2023-02-14 15:48     ` Jiawei Wang
  2023-02-16 17:46       ` Thomas Monjalon
  1 sibling, 1 reply; 40+ messages in thread
From: Jiawei Wang @ 2023-02-14 15:48 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

Add the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
	end actions queue index 0 / end

The above command means that creates a flow on a single DPDK port and
matches the packet from the first physical port and redirects
these packets into Rx queue 0.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
---
 app/test-pmd/cmdline_flow.c                 | 28 +++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  9 ++++++
 doc/guides/rel_notes/release_23_03.rst      |  1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 +++++++++++++++++++++
 6 files changed, 78 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 63a0b36622..889c49a28a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -481,6 +481,8 @@ enum index {
 	ITEM_METER,
 	ITEM_METER_COLOR,
 	ITEM_METER_COLOR_NAME,
+	ITEM_AGGR_AFFINITY,
+	ITEM_AGGR_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1403,6 +1405,7 @@ static const enum index next_item[] = {
 	ITEM_L2TPV2,
 	ITEM_PPP,
 	ITEM_METER,
+	ITEM_AGGR_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1892,6 +1895,12 @@ static const enum index item_meter[] = {
 	ZERO,
 };
 
+static const enum index item_aggr_affinity[] = {
+	ITEM_AGGR_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -6694,6 +6703,22 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_AGGR_AFFINITY] = {
+		.name = "aggr_affinity",
+		.help = "match on the aggregated port receiving the packets",
+		.priv = PRIV_ITEM(AGGR_AFFINITY,
+				  sizeof(struct rte_flow_item_aggr_affinity)),
+		.next = NEXT(item_aggr_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_AGGR_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "aggregated affinity value",
+		.next = NEXT(item_aggr_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_aggr_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -11424,6 +11449,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT:
 		mask = &ipv6_routing_ext_default_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY:
+		mask = &rte_flow_item_aggr_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 161c5dce53..dcc8b42540 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1536,6 +1536,15 @@ Matches IPv6 routing extension header.
 - ``type``: IPv6 routing extension header type.
 - ``segments_left``: How many IPv6 destination addresses carries on.
 
+
+Item: ``AGGR_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the aggregated port of the received packet.
+In case of multiple aggregated ports, the affinity numbering starts from 1.
+
+- ``affinity``: Aggregated affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index becf6fed91..7bd8881077 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -74,6 +74,7 @@ New Features
     to get the number of aggregated ports.
   * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
     to map a Tx queue with an aggregated port of the DPDK port.
+  * Added Rx affinity flow matching of an aggregated port.
 
 * **Added rte_flow support for matching IPv6 routing extension header fields.**
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 1d3c372601..5a88933635 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3775,6 +3775,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``aggr_affinity``: match aggregated port.
+
+  - ``affinity {value}``: aggregated port (starts from 1).
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index b29c66e2f4..749effb19b 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -162,6 +162,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(PPP, sizeof(struct rte_flow_item_ppp)),
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
 	MK_FLOW_ITEM(IPV6_ROUTING_EXT, sizeof(struct rte_flow_item_ipv6_routing_ext)),
+	MK_FLOW_ITEM(AGGR_AFFINITY, sizeof(struct rte_flow_item_aggr_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 6a51442b8c..6fc2d37847 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -656,6 +656,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_icmp6_echo.
 	 */
 	RTE_FLOW_ITEM_TYPE_ICMP6_ECHO_REPLY,
+
+	/**
+	 * Matches on the aggregated port of the received packet.
+	 * Used in case multiple ports are aggregated to the a DPDK port.
+	 * First port is number 1.
+	 *
+	 * @see struct rte_flow_item_aggr_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY,
 };
 
 /**
@@ -2187,6 +2196,32 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY
+ *
+ * For multiple ports aggregated to a single DPDK port,
+ * match the aggregated port receiving the packets.
+ */
+struct rte_flow_item_aggr_affinity {
+	/**
+	 * An aggregated port receiving the packets.
+	 * Numbering starts from 1.
+	 * Number of aggregated ports is reported by rte_eth_dev_count_aggr_ports().
+	 */
+	uint8_t affinity;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_aggr_affinity
+rte_flow_item_aggr_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
@ 2023-02-15 11:41       ` Jiawei(Jonny) Wang
  2023-02-16 17:42       ` Thomas Monjalon
  2023-02-16 17:58       ` Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-15 11:41 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang, Ferruh Yigit
  Cc: dev, Raslan Darawsheh

Hi Ori, Thomas and Ferruh,

Could you please help to review it?

Thanks.

> -----Original Message-----
> From: Jiawei Wang <jiaweiw@nvidia.com>
> Sent: Tuesday, February 14, 2023 11:49 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
> 
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to know which
> port use for Tx via a queue.
> 
> This patch introduces the new ethdev API rte_eth_dev_map_aggr_tx_affinity(),
> it's used to map a Tx queue with an aggregated port of the DPDK port
> (specified with port_id), The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any aggregated port,
> this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> rte_eth_dev_map_aggr_tx_affinity() functions.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two physical ports connected to a single DPDK port (port
> id 0), and affinity 1 stood for the first physical port and affinity 2 stood for the
> second physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 affinity 1
>         port config 0 txq 1 affinity 1
>         port config 0 txq 2 affinity 2
>         port config 0 txq 3 affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with phy
> affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these packets will be
> sent from the first physical port, and similar with the second physical port if
> sending packets with Tx Queue 2 or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> ---
>  app/test-pmd/cmdline.c                      | 96 +++++++++++++++++++++
>  doc/guides/rel_notes/release_23_03.rst      |  7 ++
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 +++
>  lib/ethdev/ethdev_driver.h                  | 39 +++++++++
>  lib/ethdev/ethdev_trace.h                   | 17 ++++
>  lib/ethdev/ethdev_trace_points.c            |  6 ++
>  lib/ethdev/rte_ethdev.c                     | 49 +++++++++++
>  lib/ethdev/rte_ethdev.h                     | 46 ++++++++++
>  lib/ethdev/version.map                      |  2 +
>  9 files changed, 276 insertions(+)
> 
snip
> 2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
  2023-02-15 11:41       ` Jiawei(Jonny) Wang
@ 2023-02-16 17:42       ` Thomas Monjalon
  2023-02-17  6:45         ` Jiawei(Jonny) Wang
  2023-02-16 17:58       ` Ferruh Yigit
  2 siblings, 1 reply; 40+ messages in thread
From: Thomas Monjalon @ 2023-02-16 17:42 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: viacheslavo, orika, andrew.rybchenko, Aman Singh, Yuying Zhang,
	Ferruh Yigit, dev, rasland

For the title, I suggest
ethdev: add Tx queue mapping of aggregated ports

14/02/2023 16:48, Jiawei Wang:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port use for Tx via a queue.
> 
> This patch introduces the new ethdev API
> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
> with an aggregated port of the DPDK port (specified with port_id),
> The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any
> aggregated port, this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
> and rte_eth_dev_map_aggr_tx_affinity() functions.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and affinity 1 stood for
> the first physical port and affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 affinity 1
>         port config 0 txq 1 affinity 1
>         port config 0 txq 2 affinity 2
>         port config 0 txq 3 affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 2/2] ethdev: add Aggregated affinity match item
  2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
@ 2023-02-16 17:46       ` Thomas Monjalon
  2023-02-17  6:45         ` Jiawei(Jonny) Wang
  0 siblings, 1 reply; 40+ messages in thread
From: Thomas Monjalon @ 2023-02-16 17:46 UTC (permalink / raw)
  To: Jiawei Wang
  Cc: viacheslavo, orika, andrew.rybchenko, Aman Singh, Yuying Zhang,
	Ferruh Yigit, dev, rasland

For the title, I suggest
ethdev: add flow matching of aggregated port

14/02/2023 16:48, Jiawei Wang:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port is used for Rx and Tx.
> 
> This patch allows to map a Rx queue with an aggregated port by using
> a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> 
> While uses the aggregated affinity as a matching item in the flow rule,
> and sets the same affinity value by call
> rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
> the same port as the receiving one.
> The affinity numbering starts from 1, then trying to match on
> aggr_affinity 0 will result in an error.
> 
> Add the testpmd command line to match the new item:
> 	flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
> 	end actions queue index 0 / end
> 
> The above command means that creates a flow on a single DPDK port and
> matches the packet from the first physical port and redirects
> these packets into Rx queue 0.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
  2023-02-15 11:41       ` Jiawei(Jonny) Wang
  2023-02-16 17:42       ` Thomas Monjalon
@ 2023-02-16 17:58       ` Ferruh Yigit
  2023-02-17  6:44         ` Jiawei(Jonny) Wang
  2023-02-17  8:24         ` Andrew Rybchenko
  2 siblings, 2 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-16 17:58 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, andrew.rybchenko,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port use for Tx via a queue.
> 
> This patch introduces the new ethdev API
> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
> with an aggregated port of the DPDK port (specified with port_id),
> The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any
> aggregated port, this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
> and rte_eth_dev_map_aggr_tx_affinity() functions.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and affinity 1 stood for
> the first physical port and affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>         port config 0 txq 0 affinity 1
>         port config 0 txq 1 affinity 1
>         port config 0 txq 2 affinity 2
>         port config 0 txq 3 affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>

<...>

> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index dc0a4eb12c..1d5b3a16b2 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6915,6 +6915,55 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
>  	return j;
>  }
>  
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (*dev->dev_ops->count_aggr_ports == NULL)
> +		return -ENOTSUP;

What do you think to return a default value when dev_ops is not defined,
assuming device is not a bounded device.
Not sure which one is better for application, return a default value or
error.


> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> +
> +	rte_eth_trace_count_aggr_ports(port_id, ret);
> +
> +	return ret;
> +}
> +
> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
> +				     uint8_t affinity)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
> +		return -EINVAL;
> +	}
> +

Although documentation says this API should be called before configure,
if user misses it I guess above can crash, is there a way to add runtime
check, like checking 'dev->data->dev_configured'?


> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> +		return -ENOTSUP;
> +
> +	if (dev->data->dev_started) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"Port %u must be stopped to allow configuration\n",
> +			port_id);
> +		return -EBUSY;
> +	}
> +
> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> +				tx_queue_id, affinity));
> +

Should API check if port_id is a bonding port before it continue with
mapping?

> +	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
> +
> +	return ret;
> +}
> +
>  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
>  
>  RTE_INIT(ethdev_init_telemetry)
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index c129ca1eaf..07b8250eb8 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
>  __rte_experimental
>  int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the number of aggregated ports.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @return
> + *   - (>=0) the number of aggregated port if success.
> + *   - (-ENOTSUP) if not supported.
> + */
> +__rte_experimental
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id);


Can you please give more details in the function description, in the
context of this patch it is clear, but someone sees it first time can be
confused what is "aggregated ports" is.

What is expected value for regular pysical port, that doesn't have any
sub-devices, 0 or 1? Can you please document?


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
> + *  When multiple ports are aggregated into a single one,
> + *  it allows to choose which port to use for Tx via a queue.
> + *
> + *  The application should use rte_eth_dev_map_aggr_tx_affinity()
> + *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
> + *  before rte_eth_dev_start().
> + *
> + * @param port_id
> + *   The identifier of the port used in rte_eth_tx_burst().
> + * @param tx_queue_id
> + *   The index of the transmit queue used in rte_eth_tx_burst().
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @param affinity
> + *   The number of the aggregated port.
> + *   Value 0 means no affinity and traffic could be routed to any aggregated port.
> + *   The first aggregated port is number 1 and so on.
> + *   The maximum number is given by rte_eth_dev_count_aggr_ports().
> + *
> + * @return
> + *   Zero if successful. Non-zero otherwise.
> + */
> +__rte_experimental
> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
> +				     uint8_t affinity);
> +
>  /**
>   * Return the NUMA socket to which an Ethernet device is connected
>   *
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index dbc2bffe64..685aa71e51 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -300,6 +300,8 @@ EXPERIMENTAL {
>  	rte_mtr_meter_profile_get;
>  
>  	# added in 23.03
> +	rte_eth_dev_count_aggr_ports;
> +	rte_eth_dev_map_aggr_tx_affinity;
>  	rte_flow_async_create_by_index;
>  };
>  


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-16 17:58       ` Ferruh Yigit
@ 2023-02-17  6:44         ` Jiawei(Jonny) Wang
  2023-02-17  8:24         ` Andrew Rybchenko
  1 sibling, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17  6:44 UTC (permalink / raw)
  To: Ferruh Yigit, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 17, 2023 1:58 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port use for Tx via a queue.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> > rte_eth_dev_map_aggr_tx_affinity() functions.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and affinity 1 stood for the first physical port and
> > affinity 2 stood for the second physical port.
> > Use the below commands to config tx phy affinity for per Tx Queue:
> >         port config 0 txq 0 affinity 1
> >         port config 0 txq 1 affinity 1
> >         port config 0 txq 2 affinity 2
> >         port config 0 txq 3 affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> > packets will be sent from the first physical port, and similar with
> > the second physical port if sending packets with Tx Queue 2 or Tx
> > Queue 3.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> 
> <...>
> 
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > dc0a4eb12c..1d5b3a16b2 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >  	return j;
> >  }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	if (*dev->dev_ops->count_aggr_ports == NULL)
> > +		return -ENOTSUP;
> 
> What do you think to return a default value when dev_ops is not defined,
> assuming device is not a bounded device.
> Not sure which one is better for application, return a default value or error.
> 

For device which isn't a boned device, the count should be zero.
So, we can return 0 as default value if the PMD doesn't support.

Per application perspective, it only needs to check the count > 0.

> 
> > +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > +	rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > +	return ret;
> > +}
> > +
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > +				     uint8_t affinity)
> > +{
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> > +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> > +		return -EINVAL;
> > +	}
> > +
> 
> Although documentation says this API should be called before configure, if user
> misses it I guess above can crash, is there a way to add runtime check, like
> checking 'dev->data->dev_configured'?
> 

OK, I will add the checking and report the error if (dev->data->dev_configured == 0).

> 
> > +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> > +		return -ENOTSUP;
> > +
> > +	if (dev->data->dev_started) {
> > +		RTE_ETHDEV_LOG(ERR,
> > +			"Port %u must be stopped to allow configuration\n",
> > +			port_id);
> > +		return -EBUSY;
> > +	}
> > +
> > +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> > +				tx_queue_id, affinity));
> > +
> 
> Should API check if port_id is a bonding port before it continue with mapping?
> 

I added this check in the app before, will move to ethdev layer.

> > +	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity,
> > +ret);
> > +
> > +	return ret;
> > +}
> > +
> >  RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >
> >  RTE_INIT(ethdev_init_telemetry)
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > c129ca1eaf..07b8250eb8 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port,
> > uint16_t rx_port);  __rte_experimental  int
> > rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Get the number of aggregated ports.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @return
> > + *   - (>=0) the number of aggregated port if success.
> > + *   - (-ENOTSUP) if not supported.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id);
> 
> 
> Can you please give more details in the function description, in the context of
> this patch it is clear, but someone sees it first time can be confused what is
> "aggregated ports" is.
> 

OK, for multiple ports are aggregated into single one, we can call these ports as "aggregated ports".
Will add more description in next patch.

> What is expected value for regular pysical port, that doesn't have any sub-
> devices, 0 or 1? Can you please document?
> 

OK, API return 0 for regular physical port (w/o bonded).
Will add document in next patch.

> 
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + *  Map a Tx queue with an aggregated port of the DPDK port (specified with
> port_id).
> > + *  When multiple ports are aggregated into a single one,
> > + *  it allows to choose which port to use for Tx via a queue.
> > + *
> > + *  The application should use rte_eth_dev_map_aggr_tx_affinity()
> > + *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
> > + *  before rte_eth_dev_start().
> > + *
> > + * @param port_id
> > + *   The identifier of the port used in rte_eth_tx_burst().
> > + * @param tx_queue_id
> > + *   The index of the transmit queue used in rte_eth_tx_burst().
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + *   to rte_eth_dev_configure().
> > + * @param affinity
> > + *   The number of the aggregated port.
> > + *   Value 0 means no affinity and traffic could be routed to any aggregated
> port.
> > + *   The first aggregated port is number 1 and so on.
> > + *   The maximum number is given by rte_eth_dev_count_aggr_ports().
> > + *
> > + * @return
> > + *   Zero if successful. Non-zero otherwise.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > +				     uint8_t affinity);
> > +
> >  /**
> >   * Return the NUMA socket to which an Ethernet device is connected
> >   *
> > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> > dbc2bffe64..685aa71e51 100644
> > --- a/lib/ethdev/version.map
> > +++ b/lib/ethdev/version.map
> > @@ -300,6 +300,8 @@ EXPERIMENTAL {
> >  	rte_mtr_meter_profile_get;
> >
> >  	# added in 23.03
> > +	rte_eth_dev_count_aggr_ports;
> > +	rte_eth_dev_map_aggr_tx_affinity;
> >  	rte_flow_async_create_by_index;
> >  };
> >

Thanks.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v5 2/2] ethdev: add Aggregated affinity match item
  2023-02-16 17:46       ` Thomas Monjalon
@ 2023-02-17  6:45         ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17  6:45 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: Slava Ovsiienko, Ori Kam, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit, dev, Raslan Darawsheh

Hi,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 17, 2023 1:46 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>;
> dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 2/2] ethdev: add Aggregated affinity match item
> 
> For the title, I suggest
> ethdev: add flow matching of aggregated port
> 
> 14/02/2023 16:48, Jiawei Wang:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Rx and Tx.
> >
> > This patch allows to map a Rx queue with an aggregated port by using a
> > flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > While uses the aggregated affinity as a matching item in the flow
> > rule, and sets the same affinity value by call
> > rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
> > the same port as the receiving one.
> > The affinity numbering starts from 1, then trying to match on
> > aggr_affinity 0 will result in an error.
> >
> > Add the testpmd command line to match the new item:
> > 	flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
> > 	end actions queue index 0 / end
> >
> > The above command means that creates a flow on a single DPDK port and
> > matches the packet from the first physical port and redirects these
> > packets into Rx queue 0.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> 

OK, update the title next patch, thanks for Ack.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-16 17:42       ` Thomas Monjalon
@ 2023-02-17  6:45         ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17  6:45 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: Slava Ovsiienko, Ori Kam, andrew.rybchenko, Aman Singh,
	Yuying Zhang, Ferruh Yigit, dev, Raslan Darawsheh



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, February 17, 2023 1:42 AM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>
> Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> andrew.rybchenko@oktetlabs.ru; Aman Singh <aman.deep.singh@intel.com>;
> Yuying Zhang <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>;
> dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> For the title, I suggest
> ethdev: add Tx queue mapping of aggregated ports
> 
> 14/02/2023 16:48, Jiawei Wang:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port use for Tx via a queue.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> > rte_eth_dev_map_aggr_tx_affinity() functions.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and affinity 1 stood for the first physical port and
> > affinity 2 stood for the second physical port.
> > Use the below commands to config tx phy affinity for per Tx Queue:
> >         port config 0 txq 0 affinity 1
> >         port config 0 txq 1 affinity 1
> >         port config 0 txq 2 affinity 2
> >         port config 0 txq 3 affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> > packets will be sent from the first physical port, and similar with
> > the second physical port if sending packets with Tx Queue 2 or Tx
> > Queue 3.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> 

OK, update the title next patch, thanks for Ack.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-16 17:58       ` Ferruh Yigit
  2023-02-17  6:44         ` Jiawei(Jonny) Wang
@ 2023-02-17  8:24         ` Andrew Rybchenko
  2023-02-17  9:50           ` Jiawei(Jonny) Wang
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Rybchenko @ 2023-02-17  8:24 UTC (permalink / raw)
  To: Ferruh Yigit, Jiawei Wang, viacheslavo, orika, thomas,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/16/23 20:58, Ferruh Yigit wrote:
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
>> When multiple ports are aggregated into a single DPDK port,
>> (example: Linux bonding, DPDK bonding, failsafe, etc.),
>> we want to know which port use for Tx via a queue.
>>
>> This patch introduces the new ethdev API
>> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
>> with an aggregated port of the DPDK port (specified with port_id),
>> The affinity is the number of the aggregated port.
>> Value 0 means no affinity and traffic could be routed to any
>> aggregated port, this is the default current behavior.
>>
>> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
>>
>> Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
>> and rte_eth_dev_map_aggr_tx_affinity() functions.
>>
>> Add the testpmd command line:
>> testpmd> port config (port_id) txq (queue_id) affinity (value)
>>
>> For example, there're two physical ports connected to
>> a single DPDK port (port id 0), and affinity 1 stood for
>> the first physical port and affinity 2 stood for the second
>> physical port.
>> Use the below commands to config tx phy affinity for per Tx Queue:
>>          port config 0 txq 0 affinity 1
>>          port config 0 txq 1 affinity 1
>>          port config 0 txq 2 affinity 2
>>          port config 0 txq 3 affinity 2
>>
>> These commands config the Tx Queue index 0 and Tx Queue index 1 with
>> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
>> these packets will be sent from the first physical port, and similar
>> with the second physical port if sending packets with Tx Queue 2
>> or Tx Queue 3.
>>
>> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> 
> <...>
> 
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>> index dc0a4eb12c..1d5b3a16b2 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -6915,6 +6915,55 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
>>   	return j;
>>   }
>>   
>> +int rte_eth_dev_count_aggr_ports(uint16_t port_id)
>> +{
>> +	struct rte_eth_dev *dev;
>> +	int ret;
>> +
>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>> +	dev = &rte_eth_devices[port_id];
>> +
>> +	if (*dev->dev_ops->count_aggr_ports == NULL)
>> +		return -ENOTSUP;
> 
> What do you think to return a default value when dev_ops is not defined,
> assuming device is not a bounded device.
> Not sure which one is better for application, return a default value or
> error.

I think 0 is better here. It simply means that
rte_eth_dev_map_aggr_tx_affinity() cannot be used as
well as corresponding flow API item.
It will be true even for bonding as long as corresponding
API is not supported.

>> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
>> +
>> +	rte_eth_trace_count_aggr_ports(port_id, ret);
>> +
>> +	return ret;
>> +}
>> +
>> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
>> +				     uint8_t affinity)
>> +{
>> +	struct rte_eth_dev *dev;
>> +	int ret;
>> +
>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>> +	dev = &rte_eth_devices[port_id];
>> +
>> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
>> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
>> +		return -EINVAL;
>> +	}
>> +
> 
> Although documentation says this API should be called before configure,

Documentation says "after". Anyway, it is better to check vs
dev_configured.

> if user misses it I guess above can crash, is there a way to add runtime
> check, like checking 'dev->data->dev_configured'?
> 
> 
>> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
>> +		return -ENOTSUP;
>> +
>> +	if (dev->data->dev_started) {
>> +		RTE_ETHDEV_LOG(ERR,
>> +			"Port %u must be stopped to allow configuration\n",
>> +			port_id);
>> +		return -EBUSY;
>> +	}
>> +
>> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
>> +				tx_queue_id, affinity));
>> +
> 
> Should API check if port_id is a bonding port before it continue with
> mapping?

Since it is a control path I think it is a good idea to
call rte_eth_dev_count_aggr_ports() and chck affinity value.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
  2023-02-17  8:24         ` Andrew Rybchenko
@ 2023-02-17  9:50           ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17  9:50 UTC (permalink / raw)
  To: Andrew Rybchenko, Ferruh Yigit, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh



> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, February 17, 2023 4:24 PM
> To: Ferruh Yigit <ferruh.yigit@amd.com>; Jiawei(Jonny) Wang
> <jiaweiw@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam
> <orika@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Aman Singh <aman.deep.singh@intel.com>; Yuying
> Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/16/23 20:58, Ferruh Yigit wrote:
> > On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> >> When multiple ports are aggregated into a single DPDK port,
> >> (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> >> know which port use for Tx via a queue.
> >>
> >> This patch introduces the new ethdev API
> >> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> >> an aggregated port of the DPDK port (specified with port_id), The
> >> affinity is the number of the aggregated port.
> >> Value 0 means no affinity and traffic could be routed to any
> >> aggregated port, this is the default current behavior.
> >>
> >> The maximum number of affinity is given by
> rte_eth_dev_count_aggr_ports().
> >>
> >> Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> >> rte_eth_dev_map_aggr_tx_affinity() functions.
> >>
> >> Add the testpmd command line:
> >> testpmd> port config (port_id) txq (queue_id) affinity (value)
> >>
> >> For example, there're two physical ports connected to a single DPDK
> >> port (port id 0), and affinity 1 stood for the first physical port
> >> and affinity 2 stood for the second physical port.
> >> Use the below commands to config tx phy affinity for per Tx Queue:
> >>          port config 0 txq 0 affinity 1
> >>          port config 0 txq 1 affinity 1
> >>          port config 0 txq 2 affinity 2
> >>          port config 0 txq 3 affinity 2
> >>
> >> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> >> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these
> >> packets will be sent from the first physical port, and similar with
> >> the second physical port if sending packets with Tx Queue 2 or Tx
> >> Queue 3.
> >>
> >> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> >
> > <...>
> >
> >> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> >> dc0a4eb12c..1d5b3a16b2 100644
> >> --- a/lib/ethdev/rte_ethdev.c
> >> +++ b/lib/ethdev/rte_ethdev.c
> >> @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >>   	return j;
> >>   }
> >>
> >> +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> >> +	struct rte_eth_dev *dev;
> >> +	int ret;
> >> +
> >> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +	dev = &rte_eth_devices[port_id];
> >> +
> >> +	if (*dev->dev_ops->count_aggr_ports == NULL)
> >> +		return -ENOTSUP;
> >
> > What do you think to return a default value when dev_ops is not
> > defined, assuming device is not a bounded device.
> > Not sure which one is better for application, return a default value
> > or error.
> 
> I think 0 is better here. It simply means that
> rte_eth_dev_map_aggr_tx_affinity() cannot be used as well as corresponding
> flow API item.
> It will be true even for bonding as long as corresponding API is not supported.
> 

Will send the new patch later with this change.

> >> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> >> +
> >> +	rte_eth_trace_count_aggr_ports(port_id, ret);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> >> +				     uint8_t affinity)
> >> +{
> >> +	struct rte_eth_dev *dev;
> >> +	int ret;
> >> +
> >> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +	dev = &rte_eth_devices[port_id];
> >> +
> >> +	if (tx_queue_id >= dev->data->nb_tx_queues) {
> >> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >
> > Although documentation says this API should be called before
> > configure,
> 
> Documentation says "after". Anyway, it is better to check vs dev_configured.
> 

Yes, after device configure, I add the checking and will send the new patch.

> > if user misses it I guess above can crash, is there a way to add
> > runtime check, like checking 'dev->data->dev_configured'?
> >
> >
> >> +	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> >> +		return -ENOTSUP;
> >> +
> >> +	if (dev->data->dev_started) {
> >> +		RTE_ETHDEV_LOG(ERR,
> >> +			"Port %u must be stopped to allow configuration\n",
> >> +			port_id);
> >> +		return -EBUSY;
> >> +	}
> >> +
> >> +	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> >> +				tx_queue_id, affinity));
> >> +
> >
> > Should API check if port_id is a bonding port before it continue with
> > mapping?
> 
> Since it is a control path I think it is a good idea to call
> rte_eth_dev_count_aggr_ports() and chck affinity value.

OK, add the API call before map and check the affinity value as well.

Will send the v6 patch include all comments/suggestions.

Thanks.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v6 0/2] Add Tx queue mapping of aggregated ports
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
                     ` (3 preceding siblings ...)
  2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
@ 2023-02-17 10:50   ` Jiawei Wang
  2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
                       ` (2 more replies)
  2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
  5 siblings, 3 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 10:50 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit; +Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/

v6:
* Update the commit titles.
* Return 0 by default if dev_ops.count_aggr_ports is not defined.
* Adds the dev_configure and affinity value checking before call map_aggr_tx_affinity.
* Update the rte_eth_dev_count_aggr_ports description.

v5:
* Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an aggregated port.
* Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated ports.
* Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

v4:
* Rebase the latest code
* Update new field description
* Update release release note
* Reword the commit log to make clear

v3:
* Update exception rule
* Update the commit log
* Add the description for PHY affinity and numbering definition
* Add the number of physical ports into device info
* Change the patch order 

v2: Update based on the comments

Jiawei Wang (2):
  ethdev: add Tx queue mapping of aggregated ports
  ethdev: add flow matching of aggregated port

 app/test-pmd/cmdline.c                      | 92 +++++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 | 28 +++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 ++
 doc/guides/rel_notes/release_23_03.rst      |  8 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 18 ++++
 lib/ethdev/ethdev_driver.h                  | 39 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 72 ++++++++++++++++
 lib/ethdev/rte_ethdev.h                     | 50 +++++++++++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 ++++++++
 lib/ethdev/version.map                      |  2 +
 13 files changed, 376 insertions(+)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
@ 2023-02-17 10:50     ` Jiawei Wang
  2023-02-17 12:53       ` Ferruh Yigit
  2023-02-17 12:56       ` Andrew Rybchenko
  2023-02-17 10:50     ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
  2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
  2 siblings, 2 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 10:50 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port use for Tx via a queue.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
and rte_eth_dev_map_aggr_tx_affinity() functions.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) affinity (value)

For example, there're two physical ports connected to
a single DPDK port (port id 0), and affinity 1 stood for
the first physical port and affinity 2 stood for the second
physical port.
Use the below commands to config tx phy affinity for per Tx Queue:
        port config 0 txq 0 affinity 1
        port config 0 txq 1 affinity 1
        port config 0 txq 2 affinity 2
        port config 0 txq 3 affinity 2

These commands config the Tx Queue index 0 and Tx Queue index 1 with
phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
these packets will be sent from the first physical port, and similar
with the second physical port if sending packets with Tx Queue 2
or Tx Queue 3.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/cmdline.c                      | 92 +++++++++++++++++++++
 doc/guides/rel_notes/release_23_03.rst      |  7 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 ++++
 lib/ethdev/ethdev_driver.h                  | 39 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 72 ++++++++++++++++
 lib/ethdev/rte_ethdev.h                     | 50 +++++++++++
 lib/ethdev/version.map                      |  2 +
 9 files changed, 299 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb7ff2b449..151f356224 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) affinity (value)\n"
+			"    Map a Tx queue with an aggregated port "
+			"of the DPDK port\n\n"
 		);
 	}
 
@@ -12636,6 +12640,93 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq affinity value *** */
+struct cmd_config_tx_affinity_map {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_affinity_map_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_affinity_map *res = parsed_result;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = rte_eth_dev_count_aggr_ports(res->portid);
+	if (ret < 0) {
+		printf("Failed to count the aggregated ports: (%s)\n",
+			strerror(-ret));
+		return;
+	}
+
+	ret = rte_eth_dev_map_aggr_tx_affinity(res->portid, res->qid, res->value);
+	if (ret != 0) {
+		printf("Failed to map tx queue with an aggregated port: %s\n",
+			rte_strerror(-ret));
+		return;
+	}
+}
+
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_affinity =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 affinity, "affinity");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_affinity_map = {
+	.f = cmd_config_tx_affinity_map_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_affinity_map_port,
+		(void *)&cmd_config_tx_affinity_map_config,
+		(void *)&cmd_config_tx_affinity_map_portid,
+		(void *)&cmd_config_tx_affinity_map_txq,
+		(void *)&cmd_config_tx_affinity_map_qid,
+		(void *)&cmd_config_tx_affinity_map_affinity,
+		(void *)&cmd_config_tx_affinity_map_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12869,6 +12960,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
 	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
+	(cmdline_parse_inst_t *)&cmd_config_tx_affinity_map,
 	NULL,
 };
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 2ca30b3b49..30a7eaca2b 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -68,6 +68,13 @@ New Features
   * Applications can register a callback at startup via
     ``rte_lcore_register_usage_cb()`` to provide lcore usage information.
 
+* **Added support for Tx queue mapping with an aggregated port.**
+
+  * Introduced new function ``rte_eth_dev_count_aggr_ports()``
+    to get the number of aggregated ports.
+  * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
+    to map a Tx queue with an aggregated port of the DPDK port.
+
 * **Added flow matching of IPv6 routing extension.**
 
   Added ``RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT``
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 357adb09d7..1d3c372601 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1612,6 +1612,20 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx affinity mapping
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Map a Tx queue with an aggregated port of the DPDK port (specified with port_id)::
+
+   testpmd> port (port_id) txq (queue_id) affinity (value)
+
+* ``affinity``: the number of the aggregated port.
+                When multiple ports are aggregated into a single one,
+                it allows to choose which port to use for Tx via a queue.
+
+This command should be run when the port is stopped, otherwise it fails.
+
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..b7fdc454a8 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const struct rte_eth_dev *dev,
 					uint16_t queue_id, uint16_t offset,
 					uint16_t num, FILE *file);
 
+/**
+ * @internal
+ * Get the number of aggregated ports.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ *
+ * @return
+ *   Negative errno value on error, 0 or positive on success.
+ *
+ * @retval >=0
+ *   The number of aggregated port if success.
+ * @retval -ENOTSUP
+ *   Get aggregated ports API is not supported.
+ */
+typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
+
+/**
+ * @internal
+ * Map a Tx queue with an aggregated port of the DPDK port.
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ * @param affinity
+ *   The number of the aggregated port.
+ *
+ * @return
+ *   Negative on error, 0 on success.
+ */
+typedef int (*eth_map_aggr_tx_affinity_t)(uint16_t port_id, uint16_t tx_queue_id,
+					  uint8_t affinity);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -1403,6 +1437,11 @@ struct eth_dev_ops {
 	eth_cman_config_set_t cman_config_set;
 	/** Retrieve congestion management configuration */
 	eth_cman_config_get_t cman_config_get;
+
+	/** Get the number of aggregated ports */
+	eth_count_aggr_ports_t count_aggr_ports;
+	/** Map a Tx queue with an aggregated port of the DPDK port */
+	eth_map_aggr_tx_affinity_t map_aggr_tx_affinity;
 };
 
 /**
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 8932ac33e0..53d1a71ff0 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -1385,6 +1385,23 @@ RTE_TRACE_POINT(
 	rte_trace_point_emit_int(ret);
 )
 
+RTE_TRACE_POINT(
+	rte_eth_trace_count_aggr_ports,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_int(ret);
+)
+
+RTE_TRACE_POINT(
+	rte_eth_trace_map_aggr_tx_affinity,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id,
+			     uint8_t affinity, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_u8(affinity);
+	rte_trace_point_emit_int(ret);
+)
+
 RTE_TRACE_POINT(
 	rte_flow_trace_dynf_metadata_register,
 	RTE_TRACE_POINT_ARGS(int offset, uint64_t flag),
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 34d12e2859..61010cae56 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -475,6 +475,12 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_set,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_get,
 	lib.ethdev.cman_config_get)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
+	lib.ethdev.count_aggr_ports)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
+	lib.ethdev.map_aggr_tx_affinity)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 055c46082b..1a63f9fb7a 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6946,6 +6946,78 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
 	return j;
 }
 
+int rte_eth_dev_count_aggr_ports(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (*dev->dev_ops->count_aggr_ports == NULL)
+		return 0;
+	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
+
+	rte_eth_trace_count_aggr_ports(port_id, ret);
+
+	return ret;
+}
+
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity)
+{
+	struct rte_eth_dev *dev;
+	int aggr_ports;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (tx_queue_id >= dev->data->nb_tx_queues) {
+		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
+		return -EINVAL;
+	}
+
+	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
+		return -ENOTSUP;
+
+	if (dev->data->dev_configured == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be configured before Tx affinity mapping\n",
+			port_id);
+		return -EINVAL;
+	}
+
+	if (dev->data->dev_started) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be stopped to allow configuration\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	aggr_ports = rte_eth_dev_count_aggr_ports(port_id);
+	if (aggr_ports == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u number of aggregated ports is 0 which is invalid\n",
+			port_id);
+		return -ENOTSUP;
+	}
+
+	if (affinity > aggr_ports) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u map invalid affinity %u exceeds the maximum number %u\n",
+			port_id, affinity, aggr_ports);
+		return -EINVAL;
+	}
+
+	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
+				tx_queue_id, affinity));
+
+	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
+
+	return ret;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..18e14dcbd0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2589,6 +2589,56 @@ int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
 __rte_experimental
 int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Get the number of aggregated ports of the DPDK port (specified with port_id).
+ *  It is used when multiple ports are aggregated into a single one.
+ *
+ *  For the regular physical port doesn't have aggregated ports,
+ *  the number of aggregated ports is reported as 0.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (>=0) the number of aggregated port if success.
+ *   - (-ENOTSUP) if not supported.
+ */
+__rte_experimental
+int rte_eth_dev_count_aggr_ports(uint16_t port_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
+ *  When multiple ports are aggregated into a single one,
+ *  it allows to choose which port to use for Tx via a queue.
+ *
+ *  The application should use rte_eth_dev_map_aggr_tx_affinity()
+ *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
+ *  before rte_eth_dev_start().
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param affinity
+ *   The number of the aggregated port.
+ *   Value 0 means no affinity and traffic could be routed to any aggregated port.
+ *   The first aggregated port is number 1 and so on.
+ *   The maximum number is given by rte_eth_dev_count_aggr_ports().
+ *
+ * @return
+ *   Zero if successful. Non-zero otherwise.
+ */
+__rte_experimental
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity);
+
 /**
  * Return the NUMA socket to which an Ethernet device is connected
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index dbc2bffe64..685aa71e51 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -300,6 +300,8 @@ EXPERIMENTAL {
 	rte_mtr_meter_profile_get;
 
 	# added in 23.03
+	rte_eth_dev_count_aggr_ports;
+	rte_eth_dev_map_aggr_tx_affinity;
 	rte_flow_async_create_by_index;
 };
 
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v6 2/2] ethdev: add flow matching of aggregated port
  2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
  2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
@ 2023-02-17 10:50     ` Jiawei Wang
  2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 10:50 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

Add the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
	end actions queue index 0 / end

The above command means that creates a flow on a single DPDK port and
matches the packet from the first physical port and redirects
these packets into Rx queue 0.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/cmdline_flow.c                 | 28 +++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 +++++
 doc/guides/rel_notes/release_23_03.rst      |  1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 +++++++++++++++++++++
 6 files changed, 77 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 63a0b36622..889c49a28a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -481,6 +481,8 @@ enum index {
 	ITEM_METER,
 	ITEM_METER_COLOR,
 	ITEM_METER_COLOR_NAME,
+	ITEM_AGGR_AFFINITY,
+	ITEM_AGGR_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1403,6 +1405,7 @@ static const enum index next_item[] = {
 	ITEM_L2TPV2,
 	ITEM_PPP,
 	ITEM_METER,
+	ITEM_AGGR_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1892,6 +1895,12 @@ static const enum index item_meter[] = {
 	ZERO,
 };
 
+static const enum index item_aggr_affinity[] = {
+	ITEM_AGGR_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -6694,6 +6703,22 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_AGGR_AFFINITY] = {
+		.name = "aggr_affinity",
+		.help = "match on the aggregated port receiving the packets",
+		.priv = PRIV_ITEM(AGGR_AFFINITY,
+				  sizeof(struct rte_flow_item_aggr_affinity)),
+		.next = NEXT(item_aggr_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_AGGR_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "aggregated affinity value",
+		.next = NEXT(item_aggr_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_aggr_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -11424,6 +11449,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT:
 		mask = &ipv6_routing_ext_default_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY:
+		mask = &rte_flow_item_aggr_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 8ad01ef05c..d825c788d0 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1536,6 +1536,14 @@ Matches Color Marker set by a Meter.
 
 - ``color``: Metering color marker.
 
+Item: ``AGGR_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the aggregated port of the received packet.
+In case of multiple aggregated ports, the affinity numbering starts from 1.
+
+- ``affinity``: Aggregated affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 30a7eaca2b..5f37a41f78 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -74,6 +74,7 @@ New Features
     to get the number of aggregated ports.
   * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
     to map a Tx queue with an aggregated port of the DPDK port.
+  * Added Rx affinity flow matching of an aggregated port.
 
 * **Added flow matching of IPv6 routing extension.**
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 1d3c372601..5a88933635 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3775,6 +3775,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``aggr_affinity``: match aggregated port.
+
+  - ``affinity {value}``: aggregated port (starts from 1).
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index b29c66e2f4..749effb19b 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -162,6 +162,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(PPP, sizeof(struct rte_flow_item_ppp)),
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
 	MK_FLOW_ITEM(IPV6_ROUTING_EXT, sizeof(struct rte_flow_item_ipv6_routing_ext)),
+	MK_FLOW_ITEM(AGGR_AFFINITY, sizeof(struct rte_flow_item_aggr_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index fb984a1097..abaf49f781 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -656,6 +656,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_icmp6_echo.
 	 */
 	RTE_FLOW_ITEM_TYPE_ICMP6_ECHO_REPLY,
+
+	/**
+	 * Matches on the aggregated port of the received packet.
+	 * Used in case multiple ports are aggregated to the a DPDK port.
+	 * First port is number 1.
+	 *
+	 * @see struct rte_flow_item_aggr_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY,
 };
 
 /**
@@ -2187,6 +2196,32 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY
+ *
+ * For multiple ports aggregated to a single DPDK port,
+ * match the aggregated port receiving the packets.
+ */
+struct rte_flow_item_aggr_affinity {
+	/**
+	 * An aggregated port receiving the packets.
+	 * Numbering starts from 1.
+	 * Number of aggregated ports is reported by rte_eth_dev_count_aggr_ports().
+	 */
+	uint8_t affinity;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_aggr_affinity
+rte_flow_item_aggr_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
@ 2023-02-17 12:53       ` Ferruh Yigit
  2023-02-17 12:56       ` Andrew Rybchenko
  1 sibling, 0 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-17 12:53 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, andrew.rybchenko,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/17/2023 10:50 AM, Jiawei Wang wrote:
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + *  Get the number of aggregated ports of the DPDK port (specified with port_id).
> + *  It is used when multiple ports are aggregated into a single one.
> + *
> + *  For the regular physical port doesn't have aggregated ports,
> + *  the number of aggregated ports is reported as 0.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @return
> + *   - (>=0) the number of aggregated port if success.
> + *   - (-ENOTSUP) if not supported.
> + */
> +__rte_experimental
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id);

(-ENOTSUP) not returned anymore, if this is the only change I can fix it
while merging.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
  2023-02-17 12:53       ` Ferruh Yigit
@ 2023-02-17 12:56       ` Andrew Rybchenko
  2023-02-17 12:59         ` Ferruh Yigit
  2023-02-17 13:41         ` Jiawei(Jonny) Wang
  1 sibling, 2 replies; 40+ messages in thread
From: Andrew Rybchenko @ 2023-02-17 12:56 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, ferruh.yigit,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/17/23 13:50, Jiawei Wang wrote:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port use for Tx via a queue.
> 
> This patch introduces the new ethdev API
> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
> with an aggregated port of the DPDK port (specified with port_id),
> The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any
> aggregated port, this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
> and rte_eth_dev_map_aggr_tx_affinity() functions.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two physical ports connected to
> a single DPDK port (port id 0), and affinity 1 stood for
> the first physical port and affinity 2 stood for the second
> physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
>          port config 0 txq 0 affinity 1
>          port config 0 txq 1 affinity 1
>          port config 0 txq 2 affinity 2
>          port config 0 txq 3 affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with
> phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
> these packets will be sent from the first physical port, and similar
> with the second physical port if sending packets with Tx Queue 2
> or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>

[snip]

> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> index 6a550cfc83..b7fdc454a8 100644
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const struct rte_eth_dev *dev,
>   					uint16_t queue_id, uint16_t offset,
>   					uint16_t num, FILE *file);
>   
> +/**
> + * @internal
> + * Get the number of aggregated ports.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + *
> + * @return
> + *   Negative errno value on error, 0 or positive on success.
> + *
> + * @retval >=0
> + *   The number of aggregated port if success.
> + * @retval -ENOTSUP
> + *   Get aggregated ports API is not supported.
> + */
> +typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);

Why does use port_id as the first parameter whereas all other
driver callbacks use 'struct rte_eth_dev *'?

> +
> +/**
> + * @internal
> + * Map a Tx queue with an aggregated port of the DPDK port.
> + *
> + * @param port_id
> + *   The identifier of the port used in rte_eth_tx_burst().
> + * @param tx_queue_id
> + *   The index of the transmit queue used in rte_eth_tx_burst().
> + * @param affinity
> + *   The number of the aggregated port.
> + *
> + * @return
> + *   Negative on error, 0 on success.
> + */
> +typedef int (*eth_map_aggr_tx_affinity_t)(uint16_t port_id, uint16_t tx_queue_id,
> +					  uint8_t affinity);

same herre

> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 055c46082b..1a63f9fb7a 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6946,6 +6946,78 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
>   	return j;
>   }
>   
> +int rte_eth_dev_count_aggr_ports(uint16_t port_id)
> +{
> +	struct rte_eth_dev *dev;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	if (*dev->dev_ops->count_aggr_ports == NULL)

Is it OK that tracing is long in this case?

> +		return 0;
> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> +
> +	rte_eth_trace_count_aggr_ports(port_id, ret);
> +
> +	return ret;
> +}
> +

[snip]


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 12:56       ` Andrew Rybchenko
@ 2023-02-17 12:59         ` Ferruh Yigit
  2023-02-17 13:05           ` Jiawei(Jonny) Wang
  2023-02-17 13:41         ` Jiawei(Jonny) Wang
  1 sibling, 1 reply; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-17 12:59 UTC (permalink / raw)
  To: Andrew Rybchenko, Jiawei Wang, viacheslavo, orika, thomas,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

On 2/17/2023 12:56 PM, Andrew Rybchenko wrote:
>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>> index 6a550cfc83..b7fdc454a8 100644
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const
>> struct rte_eth_dev *dev,
>>                       uint16_t queue_id, uint16_t offset,
>>                       uint16_t num, FILE *file);
>>   +/**
>> + * @internal
>> + * Get the number of aggregated ports.
>> + *
>> + * @param port_id
>> + *   The port identifier of the Ethernet device.
>> + *
>> + * @return
>> + *   Negative errno value on error, 0 or positive on success.
>> + *
>> + * @retval >=0
>> + *   The number of aggregated port if success.
>> + * @retval -ENOTSUP
>> + *   Get aggregated ports API is not supported.
>> + */
>> +typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
> 
> Why does use port_id as the first parameter whereas all other
> driver callbacks use 'struct rte_eth_dev *'?

Ahh, this is wrong, internal functions should use 'struct rte_eth_dev
*', not port_id handler.

Thanks Andrew for catching it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 0/2] Add Tx queue mapping of aggregated ports
  2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
  2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
  2023-02-17 10:50     ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
@ 2023-02-17 13:01     ` Ferruh Yigit
  2023-02-17 13:07       ` Jiawei(Jonny) Wang
  2 siblings, 1 reply; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-17 13:01 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, andrew.rybchenko; +Cc: dev, rasland

On 2/17/2023 10:50 AM, Jiawei Wang wrote:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port is used for Rx and Tx.
> 
> This patch introduces the new ethdev API
> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
> with an aggregated port of the DPDK port (specified with port_id),
> The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any
> aggregated port, this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> This patch allows to map a Rx queue with an aggregated port by using
> a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> 
> While uses the aggregated affinity as a matching item in the flow rule,
> and sets the same affinity value by call
> rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
> the same port as the receiving one.
> The affinity numbering starts from 1, then trying to match on
> aggr_affinity 0 will result in an error.
> 
> RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/
> 
> v6:
> * Update the commit titles.
> * Return 0 by default if dev_ops.count_aggr_ports is not defined.
> * Adds the dev_configure and affinity value checking before call map_aggr_tx_affinity.
> * Update the rte_eth_dev_count_aggr_ports description.
> 
> v5:
> * Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an aggregated port.
> * Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated ports.
> * Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> 
> v4:
> * Rebase the latest code
> * Update new field description
> * Update release release note
> * Reword the commit log to make clear
> 
> v3:
> * Update exception rule
> * Update the commit log
> * Add the description for PHY affinity and numbering definition
> * Add the number of physical ports into device info
> * Change the patch order 
> 
> v2: Update based on the comments
> 
> Jiawei Wang (2):
>   ethdev: add Tx queue mapping of aggregated ports
>   ethdev: add flow matching of aggregated port

It looks like there will be a new version, can you please rebase next
version on top of next-net [1], there are multiple conflicts with this
patch, it is safer if you resolve them yourself.

[1]
https://git.dpdk.org/next/dpdk-next-net/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 12:59         ` Ferruh Yigit
@ 2023-02-17 13:05           ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17 13:05 UTC (permalink / raw)
  To: Ferruh Yigit, Andrew Rybchenko, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 17, 2023 9:00 PM
> To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; Jiawei(Jonny)
> Wang <jiaweiw@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Ori
> Kam <orika@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Aman Singh <aman.deep.singh@intel.com>; Yuying
> Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
> 
> On 2/17/2023 12:56 PM, Andrew Rybchenko wrote:
> >> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >> index 6a550cfc83..b7fdc454a8 100644
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const
> >> struct rte_eth_dev *dev,
> >>                       uint16_t queue_id, uint16_t offset,
> >>                       uint16_t num, FILE *file);
> >>   +/**
> >> + * @internal
> >> + * Get the number of aggregated ports.
> >> + *
> >> + * @param port_id
> >> + *   The port identifier of the Ethernet device.
> >> + *
> >> + * @return
> >> + *   Negative errno value on error, 0 or positive on success.
> >> + *
> >> + * @retval >=0
> >> + *   The number of aggregated port if success.
> >> + * @retval -ENOTSUP
> >> + *   Get aggregated ports API is not supported.
> >> + */
> >> +typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
> >
> > Why does use port_id as the first parameter whereas all other driver
> > callbacks use 'struct rte_eth_dev *'?
> 
> Ahh, this is wrong, internal functions should use 'struct rte_eth_dev *', not
> port_id handler.
> 
> Thanks Andrew for catching it.

Thanks Andrew and Ferruh, I will send the new version to fix it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v6 0/2] Add Tx queue mapping of aggregated ports
  2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
@ 2023-02-17 13:07       ` Jiawei(Jonny) Wang
  0 siblings, 0 replies; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17 13:07 UTC (permalink / raw)
  To: Ferruh Yigit, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	andrew.rybchenko
  Cc: dev, Raslan Darawsheh



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, February 17, 2023 9:01 PM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> andrew.rybchenko@oktetlabs.ru
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v6 0/2] Add Tx queue mapping of aggregated ports
> 
> On 2/17/2023 10:50 AM, Jiawei Wang wrote:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Rx and Tx.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > This patch allows to map a Rx queue with an aggregated port by using a
> > flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > While uses the aggregated affinity as a matching item in the flow
> > rule, and sets the same affinity value by call
> > rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
> > the same port as the receiving one.
> > The affinity numbering starts from 1, then trying to match on
> > aggr_affinity 0 will result in an error.
> >
> > RFC:
> > http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaw
> > eiw@nvidia.com/
> >
> > v6:
> > * Update the commit titles.
> > * Return 0 by default if dev_ops.count_aggr_ports is not defined.
> > * Adds the dev_configure and affinity value checking before call
> map_aggr_tx_affinity.
> > * Update the rte_eth_dev_count_aggr_ports description.
> >
> > v5:
> > * Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an
> aggregated port.
> > * Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated
> ports.
> > * Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > v4:
> > * Rebase the latest code
> > * Update new field description
> > * Update release release note
> > * Reword the commit log to make clear
> >
> > v3:
> > * Update exception rule
> > * Update the commit log
> > * Add the description for PHY affinity and numbering definition
> > * Add the number of physical ports into device info
> > * Change the patch order
> >
> > v2: Update based on the comments
> >
> > Jiawei Wang (2):
> >   ethdev: add Tx queue mapping of aggregated ports
> >   ethdev: add flow matching of aggregated port
> 
> It looks like there will be a new version, can you please rebase next version on
> top of next-net [1], there are multiple conflicts with this patch, it is safer if you
> resolve them yourself.
> 
> [1]
> https://git.dpdk.org/next/dpdk-next-net/

Sure, will send the v7 and rebase on it.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 12:56       ` Andrew Rybchenko
  2023-02-17 12:59         ` Ferruh Yigit
@ 2023-02-17 13:41         ` Jiawei(Jonny) Wang
  2023-02-17 15:03           ` Andrew Rybchenko
  1 sibling, 1 reply; 40+ messages in thread
From: Jiawei(Jonny) Wang @ 2023-02-17 13:41 UTC (permalink / raw)
  To: Andrew Rybchenko, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	ferruh.yigit, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, February 17, 2023 8:57 PM
> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> ferruh.yigit@amd.com; Aman Singh <aman.deep.singh@intel.com>; Yuying
> Zhang <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
> 
[snip]
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6946,6 +6946,78 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >   	return j;
> >   }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > +	struct rte_eth_dev *dev;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	dev = &rte_eth_devices[port_id];
> > +
> > +	if (*dev->dev_ops->count_aggr_ports == NULL)
> 
> Is it OK that tracing is long in this case?
> 

Do you mean that we don't need tracing in this case?

> > +		return 0;
> > +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > +	rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > +	return ret;
> > +}
> > +
> 
> [snip]


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 13:41         ` Jiawei(Jonny) Wang
@ 2023-02-17 15:03           ` Andrew Rybchenko
  2023-02-17 15:32             ` Ferruh Yigit
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Rybchenko @ 2023-02-17 15:03 UTC (permalink / raw)
  To: Jiawei(Jonny) Wang, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	ferruh.yigit, Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

On 2/17/23 16:41, Jiawei(Jonny) Wang wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Friday, February 17, 2023 8:57 PM
>> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>> ferruh.yigit@amd.com; Aman Singh <aman.deep.singh@intel.com>; Yuying
>> Zhang <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
>> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
>>
> [snip]
>>> --- a/lib/ethdev/rte_ethdev.c
>>> +++ b/lib/ethdev/rte_ethdev.c
>>> @@ -6946,6 +6946,78 @@
>> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
>> *ptypes
>>>    	return j;
>>>    }
>>>
>>> +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
>>> +	struct rte_eth_dev *dev;
>>> +	int ret;
>>> +
>>> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>> +	dev = &rte_eth_devices[port_id];
>>> +
>>> +	if (*dev->dev_ops->count_aggr_ports == NULL)
>>
>> Is it OK that tracing is long in this case?
>>
> 
> Do you mean that we don't need tracing in this case?

Sorry for typo. I'm asking if it is OK that tracing is *lost* in this case.

> 
>>> +		return 0;
>>> +	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
>>> +
>>> +	rte_eth_trace_count_aggr_ports(port_id, ret);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>
>> [snip]
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 15:03           ` Andrew Rybchenko
@ 2023-02-17 15:32             ` Ferruh Yigit
  0 siblings, 0 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-17 15:32 UTC (permalink / raw)
  To: Andrew Rybchenko, Jiawei(Jonny) Wang, Slava Ovsiienko, Ori Kam,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Aman Singh, Yuying Zhang
  Cc: dev, Raslan Darawsheh

On 2/17/2023 3:03 PM, Andrew Rybchenko wrote:
> On 2/17/23 16:41, Jiawei(Jonny) Wang wrote:
>> Hi Andrew,
>>
>>> -----Original Message-----
>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>> Sent: Friday, February 17, 2023 8:57 PM
>>> To: Jiawei(Jonny) Wang <jiaweiw@nvidia.com>; Slava Ovsiienko
>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>>> ferruh.yigit@amd.com; Aman Singh <aman.deep.singh@intel.com>; Yuying
>>> Zhang <yuying.zhang@intel.com>
>>> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
>>> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of
>>> aggregated ports
>>>
>> [snip]
>>>> --- a/lib/ethdev/rte_ethdev.c
>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>> @@ -6946,6 +6946,78 @@
>>> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
>>> *ptypes
>>>>        return j;
>>>>    }
>>>>
>>>> +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
>>>> +    struct rte_eth_dev *dev;
>>>> +    int ret;
>>>> +
>>>> +    RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>> +    dev = &rte_eth_devices[port_id];
>>>> +
>>>> +    if (*dev->dev_ops->count_aggr_ports == NULL)
>>>
>>> Is it OK that tracing is long in this case?
>>>
>>
>> Do you mean that we don't need tracing in this case?
> 
> Sorry for typo. I'm asking if it is OK that tracing is *lost* in this case.
> 

It is not black & white, but I think it is OK since there is no dev_ops
called.

Adding tracing here can make it unnecessary complex and for many cases
tracing is limited only to success path.

>>
>>>> +        return 0;
>>>> +    ret = eth_err(port_id,
>>>> (*dev->dev_ops->count_aggr_ports)(port_id));
>>>> +
>>>> +    rte_eth_trace_count_aggr_ports(port_id, ret);
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>
>>> [snip]
>>
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v7 0/2]  Add Tx queue mapping of aggregated ports
  2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
                     ` (4 preceding siblings ...)
  2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
@ 2023-02-17 15:47   ` Jiawei Wang
  2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
                       ` (2 more replies)
  5 siblings, 3 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 15:47 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit; +Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/

v7:
* Remove the -ENOTSUP return value since no need anymore.
* Use the rte_eth_dev as argument in the internal function.

v6:
* Update the commit titles.
* Return 0 by default if dev_ops.count_aggr_ports is not defined.
* Adds the dev_configure and affinity value checking before call map_aggr_tx_affinity.
* Update the rte_eth_dev_count_aggr_ports description.

v5:
* Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an aggregated port.
* Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated ports.
* Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

v4:
* Rebase the latest code
* Update new field description
* Update release release note
* Reword the commit log to make clear

v3:
* Update exception rule
* Update the commit log
* Add the description for PHY affinity and numbering definition
* Add the number of physical ports into device info
* Change the patch order 

v2: Update based on the comments

Jiawei Wang (2):
  ethdev: add Tx queue mapping of aggregated ports
  ethdev: add flow matching of aggregated port

 app/test-pmd/cmdline.c                      | 92 +++++++++++++++++++++
 app/test-pmd/cmdline_flow.c                 | 28 +++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 ++
 doc/guides/rel_notes/release_23_03.rst      |  8 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 18 ++++
 lib/ethdev/ethdev_driver.h                  | 37 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 72 ++++++++++++++++
 lib/ethdev/rte_ethdev.h                     | 49 +++++++++++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 ++++++++
 lib/ethdev/version.map                      |  2 +
 13 files changed, 373 insertions(+)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v7 1/2] ethdev: add Tx queue mapping of aggregated ports
  2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
@ 2023-02-17 15:47     ` Jiawei Wang
  2023-02-17 15:47     ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
  2023-02-17 16:45     ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 15:47 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port use for Tx via a queue.

This patch introduces the new ethdev API
rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
with an aggregated port of the DPDK port (specified with port_id),
The affinity is the number of the aggregated port.
Value 0 means no affinity and traffic could be routed to any
aggregated port, this is the default current behavior.

The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().

Add the trace point for ethdev rte_eth_dev_count_aggr_ports()
and rte_eth_dev_map_aggr_tx_affinity() functions.

Add the testpmd command line:
testpmd> port config (port_id) txq (queue_id) affinity (value)

For example, there're two physical ports connected to
a single DPDK port (port id 0), and affinity 1 stood for
the first physical port and affinity 2 stood for the second
physical port.
Use the below commands to config tx phy affinity for per Tx Queue:
        port config 0 txq 0 affinity 1
        port config 0 txq 1 affinity 1
        port config 0 txq 2 affinity 2
        port config 0 txq 3 affinity 2

These commands config the Tx Queue index 0 and Tx Queue index 1 with
phy affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets,
these packets will be sent from the first physical port, and similar
with the second physical port if sending packets with Tx Queue 2
or Tx Queue 3.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/cmdline.c                      | 92 +++++++++++++++++++++
 doc/guides/rel_notes/release_23_03.rst      |  7 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 ++++
 lib/ethdev/ethdev_driver.h                  | 37 +++++++++
 lib/ethdev/ethdev_trace.h                   | 17 ++++
 lib/ethdev/ethdev_trace_points.c            |  6 ++
 lib/ethdev/rte_ethdev.c                     | 72 ++++++++++++++++
 lib/ethdev/rte_ethdev.h                     | 49 +++++++++++
 lib/ethdev/version.map                      |  2 +
 9 files changed, 296 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb7ff2b449..151f356224 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -776,6 +776,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 
 			"port cleanup (port_id) txq (queue_id) (free_cnt)\n"
 			"    Cleanup txq mbufs for a specific Tx queue\n\n"
+
+			"port config (port_id) txq (queue_id) affinity (value)\n"
+			"    Map a Tx queue with an aggregated port "
+			"of the DPDK port\n\n"
 		);
 	}
 
@@ -12636,6 +12640,93 @@ static cmdline_parse_inst_t cmd_show_port_flow_transfer_proxy = {
 	}
 };
 
+/* *** configure port txq affinity value *** */
+struct cmd_config_tx_affinity_map {
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t config;
+	portid_t portid;
+	cmdline_fixed_string_t txq;
+	uint16_t qid;
+	cmdline_fixed_string_t affinity;
+	uint8_t value;
+};
+
+static void
+cmd_config_tx_affinity_map_parsed(void *parsed_result,
+				  __rte_unused struct cmdline *cl,
+				  __rte_unused void *data)
+{
+	struct cmd_config_tx_affinity_map *res = parsed_result;
+	int ret;
+
+	if (port_id_is_invalid(res->portid, ENABLED_WARN))
+		return;
+
+	if (res->portid == (portid_t)RTE_PORT_ALL) {
+		printf("Invalid port id\n");
+		return;
+	}
+
+	if (strcmp(res->txq, "txq")) {
+		printf("Unknown parameter\n");
+		return;
+	}
+	if (tx_queue_id_is_invalid(res->qid))
+		return;
+
+	ret = rte_eth_dev_count_aggr_ports(res->portid);
+	if (ret < 0) {
+		printf("Failed to count the aggregated ports: (%s)\n",
+			strerror(-ret));
+		return;
+	}
+
+	ret = rte_eth_dev_map_aggr_tx_affinity(res->portid, res->qid, res->value);
+	if (ret != 0) {
+		printf("Failed to map tx queue with an aggregated port: %s\n",
+			rte_strerror(-ret));
+		return;
+	}
+}
+
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 port, "port");
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_config =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 config, "config");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_portid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 portid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_txq =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 txq, "txq");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_qid =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      qid, RTE_UINT16);
+cmdline_parse_token_string_t cmd_config_tx_affinity_map_affinity =
+	TOKEN_STRING_INITIALIZER(struct cmd_config_tx_affinity_map,
+				 affinity, "affinity");
+cmdline_parse_token_num_t cmd_config_tx_affinity_map_value =
+	TOKEN_NUM_INITIALIZER(struct cmd_config_tx_affinity_map,
+			      value, RTE_UINT8);
+
+static cmdline_parse_inst_t cmd_config_tx_affinity_map = {
+	.f = cmd_config_tx_affinity_map_parsed,
+	.data = (void *)0,
+	.help_str = "port config <port_id> txq <queue_id> affinity <value>",
+	.tokens = {
+		(void *)&cmd_config_tx_affinity_map_port,
+		(void *)&cmd_config_tx_affinity_map_config,
+		(void *)&cmd_config_tx_affinity_map_portid,
+		(void *)&cmd_config_tx_affinity_map_txq,
+		(void *)&cmd_config_tx_affinity_map_qid,
+		(void *)&cmd_config_tx_affinity_map_affinity,
+		(void *)&cmd_config_tx_affinity_map_value,
+		NULL,
+	},
+};
+
 /* ******************************************************************************** */
 
 /* list of instructions */
@@ -12869,6 +12960,7 @@ static cmdline_parse_ctx_t builtin_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_capa,
 	(cmdline_parse_inst_t *)&cmd_show_port_cman_config,
 	(cmdline_parse_inst_t *)&cmd_set_port_cman_config,
+	(cmdline_parse_inst_t *)&cmd_config_tx_affinity_map,
 	NULL,
 };
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 6ecc22db3a..97a9a56bc7 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -68,6 +68,13 @@ New Features
   * Applications can register a callback at startup via
     ``rte_lcore_register_usage_cb()`` to provide lcore usage information.
 
+* **Added support for Tx queue mapping with an aggregated port.**
+
+  * Introduced new function ``rte_eth_dev_count_aggr_ports()``
+    to get the number of aggregated ports.
+  * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
+    to map a Tx queue with an aggregated port of the DPDK port.
+
 * **Added flow matching of IPv6 routing extension.**
 
   Added ``RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT``
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 357adb09d7..1d3c372601 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1612,6 +1612,20 @@ Enable or disable a per queue Tx offloading only on a specific Tx queue::
 
 This command should be run when the port is stopped, or else it will fail.
 
+config per queue Tx affinity mapping
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Map a Tx queue with an aggregated port of the DPDK port (specified with port_id)::
+
+   testpmd> port (port_id) txq (queue_id) affinity (value)
+
+* ``affinity``: the number of the aggregated port.
+                When multiple ports are aggregated into a single one,
+                it allows to choose which port to use for Tx via a queue.
+
+This command should be run when the port is stopped, otherwise it fails.
+
+
 Config VXLAN Encap outer layers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..2c9d615fb5 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1171,6 +1171,38 @@ typedef int (*eth_tx_descriptor_dump_t)(const struct rte_eth_dev *dev,
 					uint16_t queue_id, uint16_t offset,
 					uint16_t num, FILE *file);
 
+/**
+ * @internal
+ * Get the number of aggregated ports.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ *
+ * @return
+ *   Negative errno value on error, 0 or positive on success.
+ *
+ * @retval >=0
+ *   The number of aggregated port if success.
+ */
+typedef int (*eth_count_aggr_ports_t)(struct rte_eth_dev *dev);
+
+/**
+ * @internal
+ * Map a Tx queue with an aggregated port of the DPDK port.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ * @param affinity
+ *   The number of the aggregated port.
+ *
+ * @return
+ *   Negative on error, 0 on success.
+ */
+typedef int (*eth_map_aggr_tx_affinity_t)(struct rte_eth_dev *dev, uint16_t tx_queue_id,
+					  uint8_t affinity);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -1403,6 +1435,11 @@ struct eth_dev_ops {
 	eth_cman_config_set_t cman_config_set;
 	/** Retrieve congestion management configuration */
 	eth_cman_config_get_t cman_config_get;
+
+	/** Get the number of aggregated ports */
+	eth_count_aggr_ports_t count_aggr_ports;
+	/** Map a Tx queue with an aggregated port of the DPDK port */
+	eth_map_aggr_tx_affinity_t map_aggr_tx_affinity;
 };
 
 /**
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 8932ac33e0..53d1a71ff0 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -1385,6 +1385,23 @@ RTE_TRACE_POINT(
 	rte_trace_point_emit_int(ret);
 )
 
+RTE_TRACE_POINT(
+	rte_eth_trace_count_aggr_ports,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_int(ret);
+)
+
+RTE_TRACE_POINT(
+	rte_eth_trace_map_aggr_tx_affinity,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id,
+			     uint8_t affinity, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_u8(affinity);
+	rte_trace_point_emit_int(ret);
+)
+
 RTE_TRACE_POINT(
 	rte_flow_trace_dynf_metadata_register,
 	RTE_TRACE_POINT_ARGS(int offset, uint64_t flag),
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 34d12e2859..61010cae56 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -475,6 +475,12 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_set,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_cman_config_get,
 	lib.ethdev.cman_config_get)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
+	lib.ethdev.count_aggr_ports)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
+	lib.ethdev.map_aggr_tx_affinity)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 055c46082b..36f4222fe8 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6946,6 +6946,78 @@ rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes
 	return j;
 }
 
+int rte_eth_dev_count_aggr_ports(uint16_t port_id)
+{
+	struct rte_eth_dev *dev;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (*dev->dev_ops->count_aggr_ports == NULL)
+		return 0;
+	ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(dev));
+
+	rte_eth_trace_count_aggr_ports(port_id, ret);
+
+	return ret;
+}
+
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity)
+{
+	struct rte_eth_dev *dev;
+	int aggr_ports;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (tx_queue_id >= dev->data->nb_tx_queues) {
+		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n", tx_queue_id);
+		return -EINVAL;
+	}
+
+	if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
+		return -ENOTSUP;
+
+	if (dev->data->dev_configured == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be configured before Tx affinity mapping\n",
+			port_id);
+		return -EINVAL;
+	}
+
+	if (dev->data->dev_started) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u must be stopped to allow configuration\n",
+			port_id);
+		return -EBUSY;
+	}
+
+	aggr_ports = rte_eth_dev_count_aggr_ports(port_id);
+	if (aggr_ports == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u number of aggregated ports is 0 which is invalid\n",
+			port_id);
+		return -ENOTSUP;
+	}
+
+	if (affinity > aggr_ports) {
+		RTE_ETHDEV_LOG(ERR,
+			"Port %u map invalid affinity %u exceeds the maximum number %u\n",
+			port_id, affinity, aggr_ports);
+		return -EINVAL;
+	}
+
+	ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(dev,
+				tx_queue_id, affinity));
+
+	rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity, ret);
+
+	return ret;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..049641d57c 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2589,6 +2589,55 @@ int rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port);
 __rte_experimental
 int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Get the number of aggregated ports of the DPDK port (specified with port_id).
+ *  It is used when multiple ports are aggregated into a single one.
+ *
+ *  For the regular physical port doesn't have aggregated ports,
+ *  the number of aggregated ports is reported as 0.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (>=0) the number of aggregated port if success.
+ */
+__rte_experimental
+int rte_eth_dev_count_aggr_ports(uint16_t port_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ *  Map a Tx queue with an aggregated port of the DPDK port (specified with port_id).
+ *  When multiple ports are aggregated into a single one,
+ *  it allows to choose which port to use for Tx via a queue.
+ *
+ *  The application should use rte_eth_dev_map_aggr_tx_affinity()
+ *  after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
+ *  before rte_eth_dev_start().
+ *
+ * @param port_id
+ *   The identifier of the port used in rte_eth_tx_burst().
+ * @param tx_queue_id
+ *   The index of the transmit queue used in rte_eth_tx_burst().
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param affinity
+ *   The number of the aggregated port.
+ *   Value 0 means no affinity and traffic could be routed to any aggregated port.
+ *   The first aggregated port is number 1 and so on.
+ *   The maximum number is given by rte_eth_dev_count_aggr_ports().
+ *
+ * @return
+ *   Zero if successful. Non-zero otherwise.
+ */
+__rte_experimental
+int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
+				     uint8_t affinity);
+
 /**
  * Return the NUMA socket to which an Ethernet device is connected
  *
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 40d43035eb..72c443e996 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -300,6 +300,8 @@ EXPERIMENTAL {
 	rte_mtr_meter_profile_get;
 
 	# added in 23.03
+	rte_eth_dev_count_aggr_ports;
+	rte_eth_dev_map_aggr_tx_affinity;
 	rte_flow_action_handle_query_update;
 	rte_flow_async_action_handle_query_update;
 	rte_flow_async_create_by_index;
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v7 2/2] ethdev: add flow matching of aggregated port
  2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
  2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
@ 2023-02-17 15:47     ` Jiawei Wang
  2023-02-17 16:45     ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Jiawei Wang @ 2023-02-17 15:47 UTC (permalink / raw)
  To: viacheslavo, orika, thomas, andrew.rybchenko, ferruh.yigit,
	Aman Singh, Yuying Zhang
  Cc: dev, rasland

When multiple ports are aggregated into a single DPDK port,
(example: Linux bonding, DPDK bonding, failsafe, etc.),
we want to know which port is used for Rx and Tx.

This patch allows to map a Rx queue with an aggregated port by using
a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.

While uses the aggregated affinity as a matching item in the flow rule,
and sets the same affinity value by call
rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
the same port as the receiving one.
The affinity numbering starts from 1, then trying to match on
aggr_affinity 0 will result in an error.

Add the testpmd command line to match the new item:
	flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
	end actions queue index 0 / end

The above command means that creates a flow on a single DPDK port and
matches the packet from the first physical port and redirects
these packets into Rx queue 0.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/cmdline_flow.c                 | 28 +++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst          |  8 +++++
 doc/guides/rel_notes/release_23_03.rst      |  1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/ethdev/rte_flow.c                       |  1 +
 lib/ethdev/rte_flow.h                       | 35 +++++++++++++++++++++
 6 files changed, 77 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b74dabed8b..f1991a5a9a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -494,6 +494,8 @@ enum index {
 	ITEM_QUOTA,
 	ITEM_QUOTA_STATE,
 	ITEM_QUOTA_STATE_NAME,
+	ITEM_AGGR_AFFINITY,
+	ITEM_AGGR_AFFINITY_VALUE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -1449,6 +1451,7 @@ static const enum index next_item[] = {
 	ITEM_PPP,
 	ITEM_METER,
 	ITEM_QUOTA,
+	ITEM_AGGR_AFFINITY,
 	END_SET,
 	ZERO,
 };
@@ -1944,6 +1947,12 @@ static const enum index item_quota[] = {
 	ZERO,
 };
 
+static const enum index item_aggr_affinity[] = {
+	ITEM_AGGR_AFFINITY_VALUE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -6920,6 +6929,22 @@ static const struct token token_list[] = {
 				ARGS_ENTRY(struct buffer, port)),
 		.call = parse_mp,
 	},
+	[ITEM_AGGR_AFFINITY] = {
+		.name = "aggr_affinity",
+		.help = "match on the aggregated port receiving the packets",
+		.priv = PRIV_ITEM(AGGR_AFFINITY,
+				  sizeof(struct rte_flow_item_aggr_affinity)),
+		.next = NEXT(item_aggr_affinity),
+		.call = parse_vc,
+	},
+	[ITEM_AGGR_AFFINITY_VALUE] = {
+		.name = "affinity",
+		.help = "aggregated affinity value",
+		.next = NEXT(item_aggr_affinity, NEXT_ENTRY(COMMON_UNSIGNED),
+			     item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_aggr_affinity,
+					affinity)),
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -11821,6 +11846,9 @@ flow_item_default_mask(const struct rte_flow_item *item)
 	case RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT:
 		mask = &ipv6_routing_ext_default_mask;
 		break;
+	case RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY:
+		mask = &rte_flow_item_aggr_affinity_mask;
+		break;
 	default:
 		break;
 	}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 7f6e0c3350..fff9e7b3e9 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1543,6 +1543,14 @@ Matches flow quota state set by quota action.
 
 - ``state``: Flow quota state
 
+Item: ``AGGR_AFFINITY``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches on the aggregated port of the received packet.
+In case of multiple aggregated ports, the affinity numbering starts from 1.
+
+- ``affinity``: Aggregated affinity.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 97a9a56bc7..d678d28dc1 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -74,6 +74,7 @@ New Features
     to get the number of aggregated ports.
   * Introduced new function ``rte_eth_dev_map_aggr_tx_affinity()``
     to map a Tx queue with an aggregated port of the DPDK port.
+  * Added Rx affinity flow matching of an aggregated port.
 
 * **Added flow matching of IPv6 routing extension.**
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 1d3c372601..5a88933635 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3775,6 +3775,10 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``color {value}``: meter color value (green/yellow/red).
 
+- ``aggr_affinity``: match aggregated port.
+
+  - ``affinity {value}``: aggregated port (starts from 1).
+
 - ``send_to_kernel``: send packets to kernel.
 
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index e21e32a4af..69e6e749f7 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -163,6 +163,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(METER_COLOR, sizeof(struct rte_flow_item_meter_color)),
 	MK_FLOW_ITEM(IPV6_ROUTING_EXT, sizeof(struct rte_flow_item_ipv6_routing_ext)),
 	MK_FLOW_ITEM(QUOTA, sizeof(struct rte_flow_item_quota)),
+	MK_FLOW_ITEM(AGGR_AFFINITY, sizeof(struct rte_flow_item_aggr_affinity)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 362d297790..119defa438 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -663,6 +663,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_quota
 	 */
 	 RTE_FLOW_ITEM_TYPE_QUOTA,
+
+	/**
+	 * Matches on the aggregated port of the received packet.
+	 * Used in case multiple ports are aggregated to the a DPDK port.
+	 * First port is number 1.
+	 *
+	 * @see struct rte_flow_item_aggr_affinity.
+	 */
+	RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY,
 };
 
 /**
@@ -2225,6 +2234,32 @@ static const struct rte_flow_item_meter_color rte_flow_item_meter_color_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY
+ *
+ * For multiple ports aggregated to a single DPDK port,
+ * match the aggregated port receiving the packets.
+ */
+struct rte_flow_item_aggr_affinity {
+	/**
+	 * An aggregated port receiving the packets.
+	 * Numbering starts from 1.
+	 * Number of aggregated ports is reported by rte_eth_dev_count_aggr_ports().
+	 */
+	uint8_t affinity;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY. */
+#ifndef __cplusplus
+static const struct rte_flow_item_aggr_affinity
+rte_flow_item_aggr_affinity_mask = {
+	.affinity = 0xff,
+};
+#endif
+
 /**
  * Action types.
  *
-- 
2.18.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v7 0/2] Add Tx queue mapping of aggregated ports
  2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
  2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
  2023-02-17 15:47     ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
@ 2023-02-17 16:45     ` Ferruh Yigit
  2 siblings, 0 replies; 40+ messages in thread
From: Ferruh Yigit @ 2023-02-17 16:45 UTC (permalink / raw)
  To: Jiawei Wang, viacheslavo, orika, thomas, andrew.rybchenko; +Cc: dev, rasland

On 2/17/2023 3:47 PM, Jiawei Wang wrote:
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.),
> we want to know which port is used for Rx and Tx.
> 
> This patch introduces the new ethdev API
> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue
> with an aggregated port of the DPDK port (specified with port_id),
> The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any
> aggregated port, this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> This patch allows to map a Rx queue with an aggregated port by using
> a flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> 
> While uses the aggregated affinity as a matching item in the flow rule,
> and sets the same affinity value by call
> rte_eth_dev_map_aggr_tx_affinity(), then the packet can be sent from
> the same port as the receiving one.
> The affinity numbering starts from 1, then trying to match on
> aggr_affinity 0 will result in an error.
> 
> RFC: http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaweiw@nvidia.com/
> 
> v7:
> * Remove the -ENOTSUP return value since no need anymore.
> * Use the rte_eth_dev as argument in the internal function.
> 
> v6:
> * Update the commit titles.
> * Return 0 by default if dev_ops.count_aggr_ports is not defined.
> * Adds the dev_configure and affinity value checking before call map_aggr_tx_affinity.
> * Update the rte_eth_dev_count_aggr_ports description.
> 
> v5:
> * Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an aggregated port.
> * Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated ports.
> * Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> 
> v4:
> * Rebase the latest code
> * Update new field description
> * Update release release note
> * Reword the commit log to make clear
> 
> v3:
> * Update exception rule
> * Update the commit log
> * Add the description for PHY affinity and numbering definition
> * Add the number of physical ports into device info
> * Change the patch order 
> 
> v2: Update based on the comments
> 
> Jiawei Wang (2):
>   ethdev: add Tx queue mapping of aggregated ports
>   ethdev: add flow matching of aggregated port


Series applied to dpdk-next-net/main, thanks.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2023-02-17 16:45 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20221221102934.13822-1-jiaweiw@nvidia.com/>
2023-02-03  5:07 ` [PATCH v3 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-03  5:07   ` [PATCH v3 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-03 13:33   ` [PATCH v4 0/2] add new PHY affinity in the flow item and Tx queue API Jiawei Wang
2023-02-03 13:33     ` [PATCH v4 1/2] ethdev: introduce the PHY affinity field in " Jiawei Wang
2023-02-06 15:29       ` Jiawei(Jonny) Wang
2023-02-07  9:40       ` Ori Kam
2023-02-09 19:44       ` Ferruh Yigit
2023-02-10 14:06         ` Jiawei(Jonny) Wang
2023-02-14  9:38         ` Jiawei(Jonny) Wang
2023-02-14 10:01           ` Ferruh Yigit
2023-02-03 13:33     ` [PATCH v4 2/2] ethdev: add PHY affinity match item Jiawei Wang
2023-02-14 15:48   ` [PATCH v5 0/2] Added support for Tx queue mapping with an aggregated port Jiawei Wang
2023-02-14 15:48     ` [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports Jiawei Wang
2023-02-15 11:41       ` Jiawei(Jonny) Wang
2023-02-16 17:42       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-16 17:58       ` Ferruh Yigit
2023-02-17  6:44         ` Jiawei(Jonny) Wang
2023-02-17  8:24         ` Andrew Rybchenko
2023-02-17  9:50           ` Jiawei(Jonny) Wang
2023-02-14 15:48     ` [PATCH v5 2/2] ethdev: add Aggregated affinity match item Jiawei Wang
2023-02-16 17:46       ` Thomas Monjalon
2023-02-17  6:45         ` Jiawei(Jonny) Wang
2023-02-17 10:50   ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Jiawei Wang
2023-02-17 10:50     ` [PATCH v6 1/2] ethdev: add " Jiawei Wang
2023-02-17 12:53       ` Ferruh Yigit
2023-02-17 12:56       ` Andrew Rybchenko
2023-02-17 12:59         ` Ferruh Yigit
2023-02-17 13:05           ` Jiawei(Jonny) Wang
2023-02-17 13:41         ` Jiawei(Jonny) Wang
2023-02-17 15:03           ` Andrew Rybchenko
2023-02-17 15:32             ` Ferruh Yigit
2023-02-17 10:50     ` [PATCH v6 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 13:01     ` [PATCH v6 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit
2023-02-17 13:07       ` Jiawei(Jonny) Wang
2023-02-17 15:47   ` [PATCH v7 " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 1/2] ethdev: add " Jiawei Wang
2023-02-17 15:47     ` [PATCH v7 2/2] ethdev: add flow matching of aggregated port Jiawei Wang
2023-02-17 16:45     ` [PATCH v7 0/2] Add Tx queue mapping of aggregated ports Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).