DPDK patches and discussions
* [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue
@ 2021-10-18 12:08 Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 1/6] " Xueming Li
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin

In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
incoming packets. When the number of representors scales out in a switch
domain, the memory consumption becomes significant. Furthermore,
polling all ports leads to high cache-miss rates, high latency and low
throughput.

This patch introduces the shared Rx queue. A PF and representors in the
same Rx domain and switch domain can share an Rx queue set by specifying
a non-zero share group value in the Rx queue configuration.

All ports that share an Rx queue actually share the hardware descriptor
queue and feed all Rx queues from one descriptor supply, so memory is saved.

Polling any queue that uses the same shared Rx queue receives packets from
all member ports. The source port is identified by mbuf->port.

Multiple groups are supported by group ID. Ports in a shared group should
have an identical number of queues. Queue indexes are mapped 1:1 within a
shared group.
An example of two share groups:
 Group1, 4 shared Rx queues per member port: PF, repr0, repr1
 Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
 Poll first port for each group:
  core	port	queue
  0	0	0
  1	0	1
  2	0	2
  3	0	3
  4	2	0
  5	2	1

A shared Rx queue must be polled on a single thread or core. If both PF0 and
representor0 joined the same share group, pf0rxq0 cannot be polled on core1
while rep0rxq0 is polled on core2. In fact, polling one port within a share
group is sufficient, since polling any port in the group returns packets for
every port in the group.
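
A rough configuration sketch for the grouping above (a hypothetical helper,
not part of this series; only share_group, share_qid and the capability flag
come from this patch set, the rest is standard ethdev setup):

  /* Needs <errno.h> and <rte_ethdev.h>. */
  /* Put all Rx queues of one member port into a share group. */
  static int
  setup_shared_rxqs(uint16_t port_id, uint16_t nb_q, uint16_t group,
                    uint16_t nb_desc, unsigned int socket,
                    struct rte_mempool *mp)
  {
          struct rte_eth_dev_info info;
          struct rte_eth_rxconf conf;
          uint16_t q;
          int ret;

          ret = rte_eth_dev_info_get(port_id, &info);
          if (ret != 0)
                  return ret;
          if ((info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) == 0)
                  return -ENOTSUP;
          conf = info.default_rxconf;
          conf.share_group = group;   /* non-zero value enables sharing */
          for (q = 0; q < nb_q; q++) {
                  conf.share_qid = q; /* queue index is 1:1 mapped in group */
                  ret = rte_eth_rx_queue_setup(port_id, q, nb_desc, socket,
                                               &conf, mp);
                  if (ret != 0)
                          return ret;
          }
          return 0;
  }

Calling it with group 1 for PF/repr0/repr1 (4 queues) and group 2 for
repr2..repr127 (2 queues) gives the layout shown in the table above.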

There was some discussion about aggregating member ports in the same group
into a dummy port, and there are several ways to achieve it. Since it is
optional, more feedback and requirements need to be collected from users
before making a decision later.

v1:
  - initial version
v2:
  - add testpmd patches
v3:
  - change common forwarding api to macro for performance, thanks Jerin.
  - save global variable accessed in forwarding to flowstream to minimize
    cache miss
  - combined patches for each forwarding engine
  - support multiple groups in testpmd "--share-rxq" parameter
  - new api to aggregate shared rxq group
v4:
  - spelling fixes
  - remove shared-rxq support for all forwarding engines
  - add dedicate shared-rxq forwarding engine
v5:
 - fix grammars
 - remove aggregate api and leave it for later discussion
 - add release notes
 - add deployment example
v6:
 - replace RxQ offload flag with device offload capability flag
 - add Rx domain
 - RxQ is shared when share group > 0
 - update testpmd accordingly
v7:
 - fix testpmd share group id allocation
 - change rx_domain to 16bits
v8:
 - add new patch for testpmd to show device Rx domain ID and capability
 - new share_qid in RxQ configuration

Xueming Li (6):
  ethdev: introduce shared Rx queue
  app/testpmd: dump device capability and Rx domain info
  app/testpmd: new parameter to enable shared Rx queue
  app/testpmd: dump port info for shared Rx queue
  app/testpmd: force shared Rx queue polled on same core
  app/testpmd: add forwarding engine for shared Rx queue

 app/test-pmd/config.c                         | 114 +++++++++++++-
 app/test-pmd/meson.build                      |   1 +
 app/test-pmd/parameters.c                     |  13 ++
 app/test-pmd/shared_rxq_fwd.c                 | 148 ++++++++++++++++++
 app/test-pmd/testpmd.c                        |  25 ++-
 app/test-pmd/testpmd.h                        |   5 +
 app/test-pmd/util.c                           |   3 +
 doc/guides/nics/features.rst                  |  13 ++
 doc/guides/nics/features/default.ini          |   1 +
 .../prog_guide/switch_representation.rst      |  11 ++
 doc/guides/rel_notes/release_21_11.rst        |   6 +
 doc/guides/testpmd_app_ug/run_app.rst         |   8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst   |   5 +-
 lib/ethdev/rte_ethdev.c                       |   8 +
 lib/ethdev/rte_ethdev.h                       |  24 +++
 15 files changed, 379 insertions(+), 6 deletions(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 1/6] ethdev: introduce shared Rx queue
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 2/6] app/testpmd: dump device capability and Rx domain info Xueming Li
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin

In the current DPDK framework, each Rx queue is pre-loaded with mbufs to
receive incoming packets. For some PMDs, when the number of representors
scales out in a switch domain, the memory consumption becomes significant.
Polling all ports also leads to high cache-miss rates, high latency and low
throughput.

This patch introduces the shared Rx queue. Ports in the same Rx domain and
switch domain can share an Rx queue set by specifying a non-zero share
group in the Rx queue configuration.

A shared Rx queue is identified by the share_qid field of the Rx queue
configuration. Port A RxQ X can share an RxQ with port B RxQ Y by using
the same shared Rx queue ID.

No special API is defined to receive packets from a shared Rx queue.
Polling any member port of a shared Rx queue receives packets of that
queue for all member ports; the source port_id is identified by
mbuf->port. The PMD is responsible for resolving the shared Rx queue
from the device and queue data.

A shared Rx queue must be polled in the same thread or core; polling the
queue ID of any member port is essentially the same.

Multiple share groups are supported. A device should support mixed
configuration by allowing multiple share groups and non-shared Rx queues
on one port.

Example grouping and polling model to reflect service priority:
 Group1, 2 shared Rx queues per port: PF, rep0, rep1
 Group2, 1 shared Rx queue per port: rep2, rep3, ... rep127
 Core0: poll PF queue0
 Core1: poll PF queue1
 Core2: poll rep2 queue0

The PMD advertises the shared Rx queue capability via RTE_ETH_DEV_CAPA_RXQ_SHARE.

The PMD is responsible for shared Rx queue consistency checks to avoid
member ports' configurations contradicting each other.
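
A minimal polling sketch, assuming an Rx queue was already set up with a
non-zero share_group and a share_qid as described above (handle_packet() is
a hypothetical application callback, not part of this patch):

  #define BURST_SIZE 32

  /* Poll one member port only; packets of all member ports arrive here. */
  static void
  poll_shared_rxq(uint16_t member_port, uint16_t queue_id)
  {
          struct rte_mbuf *pkts[BURST_SIZE];
          uint16_t nb, i;

          nb = rte_eth_rx_burst(member_port, queue_id, pkts, BURST_SIZE);
          for (i = 0; i < nb; i++) {
                  /* mbuf->port carries the real source port of the packet. */
                  handle_packet(pkts[i]->port, pkts[i]);
          }
  }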

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 doc/guides/nics/features.rst                  | 13 ++++++++++
 doc/guides/nics/features/default.ini          |  1 +
 .../prog_guide/switch_representation.rst      | 11 +++++++++
 doc/guides/rel_notes/release_21_11.rst        |  6 +++++
 lib/ethdev/rte_ethdev.c                       |  8 +++++++
 lib/ethdev/rte_ethdev.h                       | 24 +++++++++++++++++++
 6 files changed, 63 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index e346018e4b8..89f9accbca1 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -615,6 +615,19 @@ Supports inner packet L4 checksum.
   ``tx_offload_capa,tx_queue_offload_capa:DEV_TX_OFFLOAD_OUTER_UDP_CKSUM``.
 
 
+.. _nic_features_shared_rx_queue:
+
+Shared Rx queue
+---------------
+
+Supports shared Rx queue for ports in the same Rx domain of a switch domain.
+
+* **[uses]     rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RXQ_SHARE``.
+* **[uses]     rte_eth_dev_info,rte_eth_switch_info**: ``rx_domain``, ``domain_id``.
+* **[uses]     rte_eth_rxconf**: ``share_group``, ``share_qid``.
+* **[provides] mbuf**: ``mbuf.port``.
+
+
 .. _nic_features_packet_type_parsing:
 
 Packet type parsing
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index d473b94091a..93f5d1b46f4 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -19,6 +19,7 @@ Free Tx mbuf on demand =
 Queue start/stop     =
 Runtime Rx queue setup =
 Runtime Tx queue setup =
+Shared Rx queue      =
 Burst mode info      =
 Power mgmt address monitor =
 MTU update           =
diff --git a/doc/guides/prog_guide/switch_representation.rst b/doc/guides/prog_guide/switch_representation.rst
index ff6aa91c806..fe89a7f5c33 100644
--- a/doc/guides/prog_guide/switch_representation.rst
+++ b/doc/guides/prog_guide/switch_representation.rst
@@ -123,6 +123,17 @@ thought as a software "patch panel" front-end for applications.
 .. [1] `Ethernet switch device driver model (switchdev)
        <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
 
+- For some PMDs, memory usage of representors is huge when the number of
+  representors grows, because mbufs are allocated for each descriptor of
+  every Rx queue. Polling a large number of ports brings more CPU load,
+  cache misses and latency. A shared Rx queue can be used to share the Rx
+  queue between the PF and representors in the same Rx domain.
+  ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` in device info indicates the capability.
+  Setting a non-zero share group in the Rx queue configuration enables
+  sharing; share_qid identifies the shared Rx queue within the group.
+  Polling any member port receives packets of all member ports in the
+  group; the source port ID is saved in ``mbuf.port``.
+
 Basic SR-IOV
 ------------
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index d5435a64aa1..2143e38ff11 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -75,6 +75,12 @@ New Features
     operations.
   * Added multi-process support.
 
+* **Added ethdev shared Rx queue support.**
+
+  * Added new device capability flag and rx domain field to switch info.
+  * Added share group and share queue ID to Rx queue configuration.
+  * Added testpmd support and a dedicated forwarding engine.
+
 * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.**
 
   Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 028907bc4b9..bc55f899f72 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -2159,6 +2159,14 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		return -EINVAL;
 	}
 
+	if (local_conf.share_group > 0 &&
+	    (dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Ethdev port_id=%d rx_queue_id=%d, enabled share_group=%hu while device doesn't support Rx queue share\n",
+			port_id, rx_queue_id, local_conf.share_group);
+		return -EINVAL;
+	}
+
 	/*
 	 * If LRO is enabled, check that the maximum aggregated packet
 	 * size is supported by the configured device.
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 6d80514ba7a..465293fd66d 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1044,6 +1044,14 @@ struct rte_eth_rxconf {
 	uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
 	uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
 	uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
+	/**
+	 * Share group index in Rx domain and switch domain.
+	 * A non-zero value enables Rx queue sharing; a zero value disables it.
+	 * The PMD is responsible for Rx queue consistency checks to avoid
+	 * member ports' configurations contradicting each other.
+	 */
+	uint16_t share_group;
+	uint16_t share_qid; /**< Shared Rx queue ID in group. */
 	/**
 	 * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
 	 * Only offloads set on rx_queue_offload_capa or rx_offload_capa
@@ -1445,6 +1453,16 @@ struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /** Device supports Tx queue setup after device started. */
 #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
+/**
+ * Device supports shared Rx queue among ports within an Rx domain and
+ * switch domain. Mbufs are consumed by the shared Rx queue instead of
+ * by each queue. Multiple groups are supported via share_group of the
+ * Rx queue configuration. The shared Rx queue is identified by the PMD
+ * using share_qid of the Rx queue configuration. Polling any port in
+ * the group receives packets of all member ports; the source port is
+ * identified by the mbuf->port field.
+ */
+#define RTE_ETH_DEV_CAPA_RXQ_SHARE              RTE_BIT64(2)
 /**@}*/
 
 /*
@@ -1488,6 +1506,12 @@ struct rte_eth_switch_info {
 	 * but each driver should explicitly define the mapping of switch
 	 * port identifier to that physical interconnect/switch
 	 */
+	/**
+	 * Shared Rx queue sub-domain boundary. Only ports in the same Rx domain
+	 * and switch domain can share an Rx queue. Valid only if the device
+	 * advertises the RTE_ETH_DEV_CAPA_RXQ_SHARE capability.
+	 */
+	uint16_t rx_domain;
 };
 
 /**
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 2/6] app/testpmd: dump device capability and Rx domain info
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 1/6] " Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 3/6] app/testpmd: new parameter to enable shared Rx queue Xueming Li
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin, Xiaoyun Li

Dump the device capability and the Rx domain ID if the shared Rx queue is
supported by the device.
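
With this change, "show port info <port_id>" prints something like the
following when the device reports RTE_ETH_DEV_CAPA_RXQ_SHARE (values are
illustrative only):

  Device capabilities: 0x4
  Switch Rx domain: 0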

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9c66329e96e..c0616dcd2fd 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -733,6 +733,7 @@ port_infos_display(portid_t port_id)
 	printf("Max segment number per MTU/TSO: %hu\n",
 		dev_info.tx_desc_lim.nb_mtu_seg_max);
 
+	printf("Device capabilities: 0x%"PRIx64"\n", dev_info.dev_capa);
 	/* Show switch info only if valid switch domain and port id is set */
 	if (dev_info.switch_info.domain_id !=
 		RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
@@ -743,6 +744,9 @@ port_infos_display(portid_t port_id)
 			dev_info.switch_info.domain_id);
 		printf("Switch Port Id: %u\n",
 			dev_info.switch_info.port_id);
+		if ((dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) != 0)
+			printf("Switch Rx domain: %u\n",
+			       dev_info.switch_info.rx_domain);
 	}
 }
 
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 3/6] app/testpmd: new parameter to enable shared Rx queue
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 1/6] " Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 2/6] app/testpmd: dump device capability and Rx domain info Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 4/6] app/testpmd: dump port info for " Xueming Li
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin, Xiaoyun Li

Add a "--rxq-share=X" parameter to enable shared RxQs: queues are shared if
the device supports it, otherwise they fall back to standard RxQs.

The share group number grows every X ports. X defaults to MAX, which implies
that all ports join share group 1. The queue ID is mapped 1:1 to the shared
Rx queue ID, as shown in the sketch below.

The "shared-rxq" forwarding engine should be used; it is Rx-only and updates
stream statistics correctly.
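
The mapping applied in rxtx_port_config() below is roughly (pid is the port
ID, qid the Rx queue index being configured):

  if (rxq_share > 0 &&
      (port->dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)) {
          /* Non-zero share group to enable RxQ share. */
          port->rx_conf[qid].share_group = pid / rxq_share + 1;
          port->rx_conf[qid].share_qid = qid; /* equal mapping */
  }

For example, with "--rxq-share=2" ports 0-1 land in share group 1, ports 2-3
in share group 2, and so on.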

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c                 |  7 ++++++-
 app/test-pmd/parameters.c             | 13 +++++++++++++
 app/test-pmd/testpmd.c                | 20 +++++++++++++++++---
 app/test-pmd/testpmd.h                |  2 ++
 doc/guides/testpmd_app_ug/run_app.rst |  7 +++++++
 5 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index c0616dcd2fd..f8fb8961cae 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2713,7 +2713,12 @@ rxtx_config_display(void)
 			printf("      RX threshold registers: pthresh=%d hthresh=%d "
 				" wthresh=%d\n",
 				pthresh_tmp, hthresh_tmp, wthresh_tmp);
-			printf("      RX Offloads=0x%"PRIx64"\n", offloads_tmp);
+			printf("      RX Offloads=0x%"PRIx64, offloads_tmp);
+			if (rx_conf->share_group > 0)
+				printf(" share_group=%u share_qid=%u",
+				       rx_conf->share_group,
+				       rx_conf->share_qid);
+			printf("\n");
 		}
 
 		/* per tx queue config only for first queue to be less verbose */
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 3f94a82e321..30dae326310 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -167,6 +167,7 @@ usage(char* progname)
 	printf("  --tx-ip=src,dst: IP addresses in Tx-only mode\n");
 	printf("  --tx-udp=src[,dst]: UDP ports in Tx-only mode\n");
 	printf("  --eth-link-speed: force link speed.\n");
+	printf("  --rxq-share: number of ports per shared rxq group, defaults to MAX (1 group)\n");
 	printf("  --disable-link-check: disable check on link status when "
 	       "starting/stopping ports.\n");
 	printf("  --disable-device-start: do not automatically start port\n");
@@ -607,6 +608,7 @@ launch_args_parse(int argc, char** argv)
 		{ "rxpkts",			1, 0, 0 },
 		{ "txpkts",			1, 0, 0 },
 		{ "txonly-multi-flow",		0, 0, 0 },
+		{ "rxq-share",			2, 0, 0 },
 		{ "eth-link-speed",		1, 0, 0 },
 		{ "disable-link-check",		0, 0, 0 },
 		{ "disable-device-start",	0, 0, 0 },
@@ -1271,6 +1273,17 @@ launch_args_parse(int argc, char** argv)
 			}
 			if (!strcmp(lgopts[opt_idx].name, "txonly-multi-flow"))
 				txonly_multi_flow = 1;
+			if (!strcmp(lgopts[opt_idx].name, "rxq-share")) {
+				if (optarg == NULL) {
+					rxq_share = UINT32_MAX;
+				} else {
+					n = atoi(optarg);
+					if (n >= 0)
+						rxq_share = (uint32_t)n;
+					else
+						rte_exit(EXIT_FAILURE, "rxq-share must be >= 0\n");
+				}
+			}
 			if (!strcmp(lgopts[opt_idx].name, "no-flush-rx"))
 				no_flush_rx = 1;
 			if (!strcmp(lgopts[opt_idx].name, "eth-link-speed")) {
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17ec..123142ed110 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -498,6 +498,11 @@ uint8_t record_core_cycles;
  */
 uint8_t record_burst_stats;
 
+/*
+ * Number of ports per shared Rx queue group, 0 to disable.
+ */
+uint32_t rxq_share;
+
 unsigned int num_sockets = 0;
 unsigned int socket_ids[RTE_MAX_NUMA_NODES];
 
@@ -3393,14 +3398,23 @@ dev_event_callback(const char *device_name, enum rte_dev_event_type type,
 }
 
 static void
-rxtx_port_config(struct rte_port *port)
+rxtx_port_config(portid_t pid)
 {
 	uint16_t qid;
 	uint64_t offloads;
+	struct rte_port *port = &ports[pid];
 
 	for (qid = 0; qid < nb_rxq; qid++) {
 		offloads = port->rx_conf[qid].offloads;
 		port->rx_conf[qid] = port->dev_info.default_rxconf;
+
+		if (rxq_share > 0 &&
+		    (port->dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)) {
+			/* Non-zero share group to enable RxQ share. */
+			port->rx_conf[qid].share_group = pid / rxq_share + 1;
+			port->rx_conf[qid].share_qid = qid; /* Equal mapping. */
+		}
+
 		if (offloads != 0)
 			port->rx_conf[qid].offloads = offloads;
 
@@ -3558,7 +3572,7 @@ init_port_config(void)
 				port->dev_conf.rxmode.mq_mode = ETH_MQ_RX_NONE;
 		}
 
-		rxtx_port_config(port);
+		rxtx_port_config(pid);
 
 		ret = eth_macaddr_get_print_err(pid, &port->eth_addr);
 		if (ret != 0)
@@ -3772,7 +3786,7 @@ init_port_dcb_config(portid_t pid,
 
 	memcpy(&rte_port->dev_conf, &port_conf, sizeof(struct rte_eth_conf));
 
-	rxtx_port_config(rte_port);
+	rxtx_port_config(pid);
 	/* VLAN filter */
 	rte_port->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_VLAN_FILTER;
 	for (i = 0; i < RTE_DIM(vlan_tags); i++)
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 5863b2f43f3..3dfaaad94c0 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -477,6 +477,8 @@ extern enum tx_pkt_split tx_pkt_split;
 
 extern uint8_t txonly_multi_flow;
 
+extern uint32_t rxq_share;
+
 extern uint16_t nb_pkt_per_burst;
 extern uint16_t nb_pkt_flowgen_clones;
 extern int nb_flows_flowgen;
diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index 640eadeff73..ff5908dcd50 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -389,6 +389,13 @@ The command line options are:
 
     Generate multiple flows in txonly mode.
 
+*   ``--rxq-share=[X]``
+
+    Create queues in shared Rx queue mode if the device supports it.
+    The group number grows every X ports. X defaults to MAX, which implies
+    that all ports join share group 1. The "shared-rxq" forwarding engine
+    should be used; it is Rx-only and updates stream statistics correctly.
+
 *   ``--eth-link-speed``
 
     Set a forced link speed to the ethernet port::
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 4/6] app/testpmd: dump port info for shared Rx queue
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
                   ` (2 preceding siblings ...)
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 3/6] app/testpmd: new parameter to enable shared Rx queue Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 5/6] app/testpmd: force shared Rx queue polled on same core Xueming Li
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin, Xiaoyun Li

In the case of a shared Rx queue, polling any member port returns mbufs for
all members. This patch dumps mbuf->port for each packet.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/util.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 51506e49404..e98f136d5ed 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -100,6 +100,9 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 		struct rte_flow_restore_info info = { 0, };
 
 		mb = pkts[i];
+		if (rxq_share > 0)
+			MKDUMPSTR(print_buf, buf_size, cur_len, "port %u, ",
+				  mb->port);
 		eth_hdr = rte_pktmbuf_read(mb, 0, sizeof(_eth_hdr), &_eth_hdr);
 		eth_type = RTE_BE_TO_CPU_16(eth_hdr->ether_type);
 		packet_type = mb->packet_type;
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 5/6] app/testpmd: force shared Rx queue polled on same core
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
                   ` (3 preceding siblings ...)
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 4/6] app/testpmd: dump port info for " Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 6/6] app/testpmd: add forwarding engine for shared Rx queue Xueming Li
  2021-10-18 13:05 ` [dpdk-dev] [PATCH v8 0/6] ethdev: introduce " Xueming(Steven) Li
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin, Xiaoyun Li

A shared Rx queue must be polled on the same core. This patch checks for
this and stops forwarding if a shared RxQ is scheduled on multiple cores.

It is suggested to use the same number of Rx queues and polling cores.
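
For example, a hypothetical two-port run sharing four Rx queues can keep each
shared queue on a single forwarding core like this (EAL and device arguments
are illustrative; "representor=0" is just an example devargs):

  dpdk-testpmd -a <PCI>,representor=0 -- -i --rxq=4 --txq=4 \
          --nb-cores=4 --rxq-share=2

With more forwarding cores than Rx queues, the check added here prints the
conflicting lcore/port/queue tuples and refuses to start forwarding.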

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c  | 103 +++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/testpmd.c |   4 +-
 app/test-pmd/testpmd.h |   2 +
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index f8fb8961cae..c4150d77589 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2890,6 +2890,109 @@ port_rss_hash_key_update(portid_t port_id, char rss_type[], uint8_t *hash_key,
 	}
 }
 
+/*
+ * Check whether a shared Rx queue is scheduled on other lcores.
+ */
+static bool
+fwd_stream_on_other_lcores(uint16_t domain_id, lcoreid_t src_lc,
+			   portid_t src_port, queueid_t src_rxq,
+			   uint32_t share_group, queueid_t share_rxq)
+{
+	streamid_t sm_id;
+	streamid_t nb_fs_per_lcore;
+	lcoreid_t  nb_fc;
+	lcoreid_t  lc_id;
+	struct fwd_stream *fs;
+	struct rte_port *port;
+	struct rte_eth_dev_info *dev_info;
+	struct rte_eth_rxconf *rxq_conf;
+
+	nb_fc = cur_fwd_config.nb_fwd_lcores;
+	/* Check remaining cores. */
+	for (lc_id = src_lc + 1; lc_id < nb_fc; lc_id++) {
+		sm_id = fwd_lcores[lc_id]->stream_idx;
+		nb_fs_per_lcore = fwd_lcores[lc_id]->stream_nb;
+		for (; sm_id < fwd_lcores[lc_id]->stream_idx + nb_fs_per_lcore;
+		     sm_id++) {
+			fs = fwd_streams[sm_id];
+			port = &ports[fs->rx_port];
+			dev_info = &port->dev_info;
+			rxq_conf = &port->rx_conf[fs->rx_queue];
+			if ((dev_info->dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)
+			    == 0)
+				/* Not shared rxq. */
+				continue;
+			if (domain_id != port->dev_info.switch_info.domain_id)
+				continue;
+			if (rxq_conf->share_group != share_group)
+				continue;
+			if (rxq_conf->share_qid != share_rxq)
+				continue;
+			printf("Shared Rx queue group %u queue %hu can't be scheduled on different cores:\n",
+			       share_group, share_rxq);
+			printf("  lcore %hhu Port %hu queue %hu\n",
+			       src_lc, src_port, src_rxq);
+			printf("  lcore %hhu Port %hu queue %hu\n",
+			       lc_id, fs->rx_port, fs->rx_queue);
+			printf("Please use --nb-cores=%hu to limit number of forwarding cores\n",
+			       nb_rxq);
+			return true;
+		}
+	}
+	return false;
+}
+
+/*
+ * Check shared Rx queue configuration.
+ *
+ * A shared group must not be scheduled on different cores.
+ */
+bool
+pkt_fwd_shared_rxq_check(void)
+{
+	streamid_t sm_id;
+	streamid_t nb_fs_per_lcore;
+	lcoreid_t  nb_fc;
+	lcoreid_t  lc_id;
+	struct fwd_stream *fs;
+	uint16_t domain_id;
+	struct rte_port *port;
+	struct rte_eth_dev_info *dev_info;
+	struct rte_eth_rxconf *rxq_conf;
+
+	nb_fc = cur_fwd_config.nb_fwd_lcores;
+	/*
+	 * Check streams on each core, make sure the same switch domain +
+	 * group + queue doesn't get scheduled on other cores.
+	 */
+	for (lc_id = 0; lc_id < nb_fc; lc_id++) {
+		sm_id = fwd_lcores[lc_id]->stream_idx;
+		nb_fs_per_lcore = fwd_lcores[lc_id]->stream_nb;
+		for (; sm_id < fwd_lcores[lc_id]->stream_idx + nb_fs_per_lcore;
+		     sm_id++) {
+			fs = fwd_streams[sm_id];
+			/* Update lcore info stream being scheduled. */
+			fs->lcore = fwd_lcores[lc_id];
+			port = &ports[fs->rx_port];
+			dev_info = &port->dev_info;
+			rxq_conf = &port->rx_conf[fs->rx_queue];
+			if ((dev_info->dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)
+			    == 0)
+				/* Not shared rxq. */
+				continue;
+			/* Check shared rxq not scheduled on remaining cores. */
+			domain_id = port->dev_info.switch_info.domain_id;
+			if (fwd_stream_on_other_lcores(domain_id, lc_id,
+						       fs->rx_port,
+						       fs->rx_queue,
+						       rxq_conf->share_group,
+						       rxq_conf->share_qid))
+				return false;
+		}
+	}
+	return true;
+}
+
 /*
  * Setup forwarding configuration for each logical core.
  */
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 123142ed110..f3f81ef561f 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2236,10 +2236,12 @@ start_packet_forwarding(int with_tx_first)
 
 	fwd_config_setup();
 
+	pkt_fwd_config_display(&cur_fwd_config);
+	if (!pkt_fwd_shared_rxq_check())
+		return;
 	if(!no_flush_rx)
 		flush_fwd_rx_queues();
 
-	pkt_fwd_config_display(&cur_fwd_config);
 	rxtx_config_display();
 
 	fwd_stats_reset();
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 3dfaaad94c0..f121a2da90c 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,7 @@ struct fwd_stream {
 	uint64_t     core_cycles; /**< used for RX and TX processing */
 	struct pkt_burst_stats rx_burst_stats;
 	struct pkt_burst_stats tx_burst_stats;
+	struct fwd_lcore *lcore; /**< Lcore being scheduled. */
 };
 
 /**
@@ -795,6 +796,7 @@ void port_summary_header_display(void);
 void rx_queue_infos_display(portid_t port_idi, uint16_t queue_id);
 void tx_queue_infos_display(portid_t port_idi, uint16_t queue_id);
 void fwd_lcores_config_display(void);
+bool pkt_fwd_shared_rxq_check(void);
 void pkt_fwd_config_display(struct fwd_config *cfg);
 void rxtx_config_display(void);
 void fwd_config_setup(void);
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 6/6] app/testpmd: add forwarding engine for shared Rx queue
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
                   ` (4 preceding siblings ...)
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 5/6] app/testpmd: force shared Rx queue polled on same core Xueming Li
@ 2021-10-18 12:08 ` Xueming Li
  2021-10-18 13:05 ` [dpdk-dev] [PATCH v8 0/6] ethdev: introduce " Xueming(Steven) Li
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:08 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin, Xiaoyun Li

To support the shared Rx queue, this patch introduces a dedicated forwarding
engine. The engine groups received packets by mbuf->port into sub-bursts,
updates stream statistics and simply frees the packets.
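
Example interactive usage, with the mode name as documented in
testpmd_funcs.rst below (ports are assumed to have been started with
"--rxq-share" so that shared Rx queues are actually configured):

  testpmd> set fwd shared-rxq
  testpmd> start

Received packets are counted on the stream matching mbuf->port and then
freed; nothing is retransmitted.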

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/meson.build                    |   1 +
 app/test-pmd/shared_rxq_fwd.c               | 148 ++++++++++++++++++++
 app/test-pmd/testpmd.c                      |   1 +
 app/test-pmd/testpmd.h                      |   1 +
 doc/guides/testpmd_app_ug/run_app.rst       |   1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   5 +-
 6 files changed, 156 insertions(+), 1 deletion(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdfa..07042e45b12 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -21,6 +21,7 @@ sources = files(
         'noisy_vnf.c',
         'parameters.c',
         'rxonly.c',
+        'shared_rxq_fwd.c',
         'testpmd.c',
         'txonly.c',
         'util.c',
diff --git a/app/test-pmd/shared_rxq_fwd.c b/app/test-pmd/shared_rxq_fwd.c
new file mode 100644
index 00000000000..4e262b99bc7
--- /dev/null
+++ b/app/test-pmd/shared_rxq_fwd.c
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 NVIDIA Corporation & Affiliates
+ */
+
+#include <stdarg.h>
+#include <string.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <inttypes.h>
+
+#include <sys/queue.h>
+#include <sys/stat.h>
+
+#include <rte_common.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_cycles.h>
+#include <rte_memory.h>
+#include <rte_memcpy.h>
+#include <rte_launch.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_atomic.h>
+#include <rte_branch_prediction.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_pci.h>
+#include <rte_ether.h>
+#include <rte_ethdev.h>
+#include <rte_string_fns.h>
+#include <rte_ip.h>
+#include <rte_udp.h>
+#include <rte_net.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/*
+ * Rx only sub-burst forwarding.
+ */
+static void
+forward_rx_only(uint16_t nb_rx, struct rte_mbuf **pkts_burst)
+{
+	rte_pktmbuf_free_bulk(pkts_burst, nb_rx);
+}
+
+/**
+ * Get packet source stream by source port and queue.
+ * All streams of the same shared Rx queue are located on the same core.
+ */
+static struct fwd_stream *
+forward_stream_get(struct fwd_stream *fs, uint16_t port)
+{
+	streamid_t sm_id;
+	struct fwd_lcore *fc;
+	struct fwd_stream **fsm;
+	streamid_t nb_fs;
+
+	fc = fs->lcore;
+	fsm = &fwd_streams[fc->stream_idx];
+	nb_fs = fc->stream_nb;
+	for (sm_id = 0; sm_id < nb_fs; sm_id++) {
+		if (fsm[sm_id]->rx_port == port &&
+		    fsm[sm_id]->rx_queue == fs->rx_queue)
+			return fsm[sm_id];
+	}
+	return NULL;
+}
+
+/**
+ * Forward packet by source port and queue.
+ */
+static void
+forward_sub_burst(struct fwd_stream *src_fs, uint16_t port, uint16_t nb_rx,
+		  struct rte_mbuf **pkts)
+{
+	struct fwd_stream *fs = forward_stream_get(src_fs, port);
+
+	if (fs != NULL) {
+		fs->rx_packets += nb_rx;
+		forward_rx_only(nb_rx, pkts);
+	} else {
+		/* Source stream not found, drop all packets. */
+		src_fs->fwd_dropped += nb_rx;
+		while (nb_rx > 0)
+			rte_pktmbuf_free(pkts[--nb_rx]);
+	}
+}
+
+/**
+ * Forward packets from a shared Rx queue.
+ *
+ * The source port of packets is identified by mbuf->port.
+ */
+static void
+forward_shared_rxq(struct fwd_stream *fs, uint16_t nb_rx,
+		   struct rte_mbuf **pkts_burst)
+{
+	uint16_t i, nb_sub_burst, port, last_port;
+
+	nb_sub_burst = 0;
+	last_port = pkts_burst[0]->port;
+	/* Locate sub-burst according to mbuf->port. */
+	for (i = 0; i < nb_rx - 1; ++i) {
+		rte_prefetch0(pkts_burst[i + 1]);
+		port = pkts_burst[i]->port;
+		if (i > 0 && last_port != port) {
+			/* Forward packets with same source port. */
+			forward_sub_burst(fs, last_port, nb_sub_burst,
+					  &pkts_burst[i - nb_sub_burst]);
+			nb_sub_burst = 0;
+			last_port = port;
+		}
+		nb_sub_burst++;
+	}
+	/* Last sub-burst. */
+	nb_sub_burst++;
+	forward_sub_burst(fs, last_port, nb_sub_burst,
+			  &pkts_burst[nb_rx - nb_sub_burst]);
+}
+
+static void
+shared_rxq_fwd(struct fwd_stream *fs)
+{
+	struct rte_mbuf *pkts_burst[nb_pkt_per_burst];
+	uint16_t nb_rx;
+	uint64_t start_tsc = 0;
+
+	get_start_cycles(&start_tsc);
+	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
+				 nb_pkt_per_burst);
+	inc_rx_burst_stats(fs, nb_rx);
+	if (unlikely(nb_rx == 0))
+		return;
+	forward_shared_rxq(fs, nb_rx, pkts_burst);
+	get_end_cycles(fs, start_tsc);
+}
+
+struct fwd_engine shared_rxq_engine = {
+	.fwd_mode_name  = "shared_rxq",
+	.port_fwd_begin = NULL,
+	.port_fwd_end   = NULL,
+	.packet_fwd     = shared_rxq_fwd,
+};
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index f3f81ef561f..11a85d92d9a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -188,6 +188,7 @@ struct fwd_engine * fwd_engines[] = {
 #ifdef RTE_LIBRTE_IEEE1588
 	&ieee1588_fwd_engine,
 #endif
+	&shared_rxq_engine,
 	NULL,
 };
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index f121a2da90c..f1fd607e365 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -299,6 +299,7 @@ extern struct fwd_engine five_tuple_swap_fwd_engine;
 #ifdef RTE_LIBRTE_IEEE1588
 extern struct fwd_engine ieee1588_fwd_engine;
 #endif
+extern struct fwd_engine shared_rxq_engine;
 
 extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index ff5908dcd50..e4b97844ced 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -252,6 +252,7 @@ The command line options are:
        tm
        noisy
        5tswap
+       shared-rxq
 
 *   ``--rss-ip``
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 8ead7a4a712..499874187f2 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -314,7 +314,7 @@ set fwd
 Set the packet forwarding mode::
 
    testpmd> set fwd (io|mac|macswap|flowgen| \
-                     rxonly|txonly|csum|icmpecho|noisy|5tswap) (""|retry)
+                     rxonly|txonly|csum|icmpecho|noisy|5tswap|shared-rxq) (""|retry)
 
 ``retry`` can be specified for forwarding engines except ``rx_only``.
 
@@ -357,6 +357,9 @@ The available information categories are:
 
   L4 swaps the source port and destination port of transport layer (TCP and UDP).
 
+* ``shared-rxq``: Receive only for shared Rx queue.
+  Resolve packet source port from mbuf and update stream statistics accordingly.
+
 Example::
 
    testpmd> set fwd rxonly
-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue
  2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
                   ` (5 preceding siblings ...)
  2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 6/6] app/testpmd: add forwarding engine for shared Rx queue Xueming Li
@ 2021-10-18 13:05 ` Xueming(Steven) Li
  6 siblings, 0 replies; 9+ messages in thread
From: Xueming(Steven) Li @ 2021-10-18 13:05 UTC (permalink / raw)
  To: dev
  Cc: jerinjacobk, NBU-Contact-Thomas Monjalon, andrew.rybchenko,
	Slava Ovsiienko, konstantin.ananyev, ferruh.yigit, Lior Margalit

Sorry, forgot to reply to original thread, resent.
	
Please ignore this series.


On Mon, 2021-10-18 at 20:08 +0800, Xueming Li wrote:
> In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
> incoming packets. When the number of representors scales out in a switch
> domain, the memory consumption becomes significant. Furthermore,
> polling all ports leads to high cache-miss rates, high latency and low
> throughput.
> 
> This patch introduces the shared Rx queue. A PF and representors in the
> same Rx domain and switch domain can share an Rx queue set by specifying
> a non-zero share group value in the Rx queue configuration.
> 
> All ports that share an Rx queue actually share the hardware descriptor
> queue and feed all Rx queues from one descriptor supply, so memory is saved.
> 
> Polling any queue that uses the same shared Rx queue receives packets from
> all member ports. The source port is identified by mbuf->port.
> 
> Multiple groups are supported by group ID. Ports in a shared group should
> have an identical number of queues. Queue indexes are mapped 1:1 within a
> shared group.
> An example of two share groups:
>  Group1, 4 shared Rx queues per member port: PF, repr0, repr1
>  Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
>  Poll first port for each group:
>   core	port	queue
>   0	0	0
>   1	0	1
>   2	0	2
>   3	0	3
>   4	2	0
>   5	2	1
> 
> A shared Rx queue must be polled on a single thread or core. If both PF0 and
> representor0 joined the same share group, pf0rxq0 cannot be polled on core1
> while rep0rxq0 is polled on core2. In fact, polling one port within a share
> group is sufficient, since polling any port in the group returns packets for
> every port in the group.
> 
> There was some discussion about aggregating member ports in the same group
> into a dummy port, and there are several ways to achieve it. Since it is
> optional, more feedback and requirements need to be collected from users
> before making a decision later.
> 
> v1:
>   - initial version
> v2:
>   - add testpmd patches
> v3:
>   - change common forwarding api to macro for performance, thanks Jerin.
>   - save global variable accessed in forwarding to flowstream to minimize
>     cache miss
>   - combined patches for each forwarding engine
>   - support multiple groups in testpmd "--share-rxq" parameter
>   - new api to aggregate shared rxq group
> v4:
>   - spelling fixes
>   - remove shared-rxq support for all forwarding engines
>   - add dedicate shared-rxq forwarding engine
> v5:
>  - fix grammars
>  - remove aggregate api and leave it for later discussion
>  - add release notes
>  - add deployment example
> v6:
>  - replace RxQ offload flag with device offload capability flag
>  - add Rx domain
>  - RxQ is shared when share group > 0
>  - update testpmd accordingly
> v7:
>  - fix testpmd share group id allocation
>  - change rx_domain to 16bits
> v8:
>  - add new patch for testpmd to show device Rx domain ID and capability
>  - new share_qid in RxQ configuration
> 
> Xueming Li (6):
>   ethdev: introduce shared Rx queue
>   app/testpmd: dump device capability and Rx domain info
>   app/testpmd: new parameter to enable shared Rx queue
>   app/testpmd: dump port info for shared Rx queue
>   app/testpmd: force shared Rx queue polled on same core
>   app/testpmd: add forwarding engine for shared Rx queue
> 
>  app/test-pmd/config.c                         | 114 +++++++++++++-
>  app/test-pmd/meson.build                      |   1 +
>  app/test-pmd/parameters.c                     |  13 ++
>  app/test-pmd/shared_rxq_fwd.c                 | 148 ++++++++++++++++++
>  app/test-pmd/testpmd.c                        |  25 ++-
>  app/test-pmd/testpmd.h                        |   5 +
>  app/test-pmd/util.c                           |   3 +
>  doc/guides/nics/features.rst                  |  13 ++
>  doc/guides/nics/features/default.ini          |   1 +
>  .../prog_guide/switch_representation.rst      |  11 ++
>  doc/guides/rel_notes/release_21_11.rst        |   6 +
>  doc/guides/testpmd_app_ug/run_app.rst         |   8 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst   |   5 +-
>  lib/ethdev/rte_ethdev.c                       |   8 +
>  lib/ethdev/rte_ethdev.h                       |  24 +++
>  15 files changed, 379 insertions(+), 6 deletions(-)
>  create mode 100644 app/test-pmd/shared_rxq_fwd.c
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue
  2021-07-27  3:42 [dpdk-dev] [RFC] " Xueming Li
@ 2021-10-18 12:59 ` Xueming Li
  0 siblings, 0 replies; 9+ messages in thread
From: Xueming Li @ 2021-10-18 12:59 UTC (permalink / raw)
  To: dev
  Cc: xuemingl, Jerin Jacob, Ferruh Yigit, Andrew Rybchenko,
	Viacheslav Ovsiienko, Thomas Monjalon, Lior Margalit,
	Ananyev Konstantin

In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
incoming packets. When the number of representors scales out in a switch
domain, the memory consumption becomes significant. Furthermore,
polling all ports leads to high cache-miss rates, high latency and low
throughput.

This patch introduces the shared Rx queue. A PF and representors in the
same Rx domain and switch domain can share an Rx queue set by specifying
a non-zero share group value in the Rx queue configuration.

All ports that share an Rx queue actually share the hardware descriptor
queue and feed all Rx queues from one descriptor supply, so memory is saved.

Polling any queue that uses the same shared Rx queue receives packets from
all member ports. The source port is identified by mbuf->port.

Multiple groups are supported by group ID. Ports in a shared group should
have an identical number of queues. Queue indexes are mapped 1:1 within a
shared group.
An example of two share groups:
 Group1, 4 shared Rx queues per member port: PF, repr0, repr1
 Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
 Poll first port for each group:
  core	port	queue
  0	0	0
  1	0	1
  2	0	2
  3	0	3
  4	2	0
  5	2	1

A shared Rx queue must be polled on a single thread or core. If both PF0 and
representor0 joined the same share group, pf0rxq0 cannot be polled on core1
while rep0rxq0 is polled on core2. In fact, polling one port within a share
group is sufficient, since polling any port in the group returns packets for
every port in the group.

There was some discussion about aggregating member ports in the same group
into a dummy port, and there are several ways to achieve it. Since it is
optional, more feedback and requirements need to be collected from users
before making a decision later.

v1:
  - initial version
v2:
  - add testpmd patches
v3:
  - change common forwarding api to macro for performance, thanks Jerin.
  - save global variable accessed in forwarding to flowstream to minimize
    cache miss
  - combined patches for each forwarding engine
  - support multiple groups in testpmd "--share-rxq" parameter
  - new api to aggregate shared rxq group
v4:
  - spelling fixes
  - remove shared-rxq support for all forwarding engines
  - add dedicate shared-rxq forwarding engine
v5:
 - fix grammars
 - remove aggregate api and leave it for later discussion
 - add release notes
 - add deployment example
v6:
 - replace RxQ offload flag with device offload capability flag
 - add Rx domain
 - RxQ is shared when share group > 0
 - update testpmd accordingly
v7:
 - fix testpmd share group id allocation
 - change rx_domain to 16bits
v8:
 - add new patch for testpmd to show device Rx domain ID and capability
 - new share_qid in RxQ configuration

Xueming Li (6):
  ethdev: introduce shared Rx queue
  app/testpmd: dump device capability and Rx domain info
  app/testpmd: new parameter to enable shared Rx queue
  app/testpmd: dump port info for shared Rx queue
  app/testpmd: force shared Rx queue polled on same core
  app/testpmd: add forwarding engine for shared Rx queue

 app/test-pmd/config.c                         | 114 +++++++++++++-
 app/test-pmd/meson.build                      |   1 +
 app/test-pmd/parameters.c                     |  13 ++
 app/test-pmd/shared_rxq_fwd.c                 | 148 ++++++++++++++++++
 app/test-pmd/testpmd.c                        |  25 ++-
 app/test-pmd/testpmd.h                        |   5 +
 app/test-pmd/util.c                           |   3 +
 doc/guides/nics/features.rst                  |  13 ++
 doc/guides/nics/features/default.ini          |   1 +
 .../prog_guide/switch_representation.rst      |  11 ++
 doc/guides/rel_notes/release_21_11.rst        |   6 +
 doc/guides/testpmd_app_ug/run_app.rst         |   8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst   |   5 +-
 lib/ethdev/rte_ethdev.c                       |   8 +
 lib/ethdev/rte_ethdev.h                       |  24 +++
 15 files changed, 379 insertions(+), 6 deletions(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

-- 
2.33.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-10-18 13:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-18 12:08 [dpdk-dev] [PATCH v8 0/6] ethdev: introduce shared Rx queue Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 1/6] " Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 2/6] app/testpmd: dump device capability and Rx domain info Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 3/6] app/testpmd: new parameter to enable shared Rx queue Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 4/6] app/testpmd: dump port info for " Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 5/6] app/testpmd: force shared Rx queue polled on same core Xueming Li
2021-10-18 12:08 ` [dpdk-dev] [PATCH v8 6/6] app/testpmd: add forwarding engine for shared Rx queue Xueming Li
2021-10-18 13:05 ` [dpdk-dev] [PATCH v8 0/6] ethdev: introduce " Xueming(Steven) Li
  -- strict thread matches above, loose matches on Subject: below --
2021-07-27  3:42 [dpdk-dev] [RFC] " Xueming Li
2021-10-18 12:59 ` [dpdk-dev] [PATCH v8 0/6] " Xueming Li
