DPDK patches and discussions
* [PATCH v4 0/3] DLB2 Enhancements
@ 2024-05-01 19:46 Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This patchset addresses enhancements in the DLB driver.

Abdullah Sevincer (3):
  event/dlb2: add support for HW delayed token
  event/dlb2: add support for dynamic HL entries
  event/dlb2: enhance DLB credit handling

 app/test-eventdev/test_perf_common.c       |  20 +-
 drivers/event/dlb2/dlb2.c                  | 385 ++++++++++++++++++---
 drivers/event/dlb2/dlb2_iface.c            |   3 +
 drivers/event/dlb2/dlb2_iface.h            |   4 +-
 drivers/event/dlb2/dlb2_priv.h             |  16 +-
 drivers/event/dlb2/dlb2_user.h             |  24 ++
 drivers/event/dlb2/meson.build             |  12 +
 drivers/event/dlb2/meson_options.txt       |   6 +
 drivers/event/dlb2/pf/base/dlb2_regs.h     |   9 +
 drivers/event/dlb2/pf/base/dlb2_resource.c |  95 ++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h |  19 +
 drivers/event/dlb2/pf/dlb2_pf.c            |  28 +-
 drivers/event/dlb2/rte_pmd_dlb2.c          |  29 ++
 drivers/event/dlb2/rte_pmd_dlb2.h          |  41 +++
 drivers/event/dlb2/version.map             |   3 +
 15 files changed, 630 insertions(+), 64 deletions(-)
 create mode 100644 drivers/event/dlb2/meson_options.txt

-- 
2.25.1



* [PATCH v4 1/3] event/dlb2: add support for HW delayed token
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:19   ` Jerin Jacob
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

In DLB 2.5, a hardware assist is available, complementing the delayed
token pop software implementation. When it is enabled, the feature
works as follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached, so the feature activates only when the core is congested.
If the core can handle multiple atomic flows, DLB will not try to
switch them. This is an improvement over the SW implementation, which
always switches the flows.

The feature resumes CQ scheduling when the number of pending
completions falls below a configured threshold. To emulate the older
2.0 behavior, this threshold is set to 1 by the old APIs. SW sets the
CQ to auto-pop mode for token return, as token withholding is no
longer necessary. Since HW counts completions rather than tokens,
events equal to the number of HL (History List) entries will be
scheduled to DLB before the feature activates and stops CQ scheduling.
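
A minimal usage sketch of the new API (illustrative only; the wrapper
function name and the threshold value 8 are not part of the patch):

    #include <rte_eventdev.h>
    #include <rte_pmd_dlb2.h>

    /* Enable DLB 2.5 HW inflight control on an event port. Must be
     * called after rte_event_dev_configure() but before
     * rte_event_port_setup() for this port.
     */
    static int
    enable_hw_delayed_token(uint8_t dev_id, uint8_t port_id)
    {
        struct dlb2_port_param param = {
            /* Resume CQ scheduling once pending completions fall
             * below this value.
             */
            .inflight_threshold = 8,
        };

        return rte_pmd_dlb2_set_port_param(dev_id, port_id,
                                           DLB2_FLOW_MIGRATION_THRESHOLD,
                                           &param);
    }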

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c                  | 58 ++++++++++++-
 drivers/event/dlb2/dlb2_iface.c            |  3 +
 drivers/event/dlb2/dlb2_iface.h            |  4 +-
 drivers/event/dlb2/dlb2_priv.h             |  5 ++
 drivers/event/dlb2/dlb2_user.h             | 24 ++++++
 drivers/event/dlb2/pf/base/dlb2_regs.h     |  9 ++
 drivers/event/dlb2/pf/base/dlb2_resource.c | 95 +++++++++++++++++++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h | 19 +++++
 drivers/event/dlb2/pf/dlb2_pf.c            | 21 +++++
 drivers/event/dlb2/rte_pmd_dlb2.c          | 29 +++++++
 drivers/event/dlb2/rte_pmd_dlb2.h          | 40 +++++++++
 drivers/event/dlb2/version.map             |  3 +
 12 files changed, 306 insertions(+), 4 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 628ddef649..d64274b01e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -879,8 +879,11 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2_iface_domain_reset(dlb2);
 
 	/* Free all dynamically allocated port memory */
-	for (i = 0; i < dlb2->num_ports; i++)
+	for (i = 0; i < dlb2->num_ports; i++) {
 		dlb2_free_qe_mem(&dlb2->ev_ports[i].qm_port);
+		if (!reconfig)
+			memset(&dlb2->ev_ports[i], 0, sizeof(struct dlb2_eventdev_port));
+	}
 
 	/* If reconfiguring, mark the device's queues and ports as "previously
 	 * configured." If the user doesn't reconfigure them, the PMD will
@@ -1525,7 +1528,7 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
 	struct dlb2_create_ldb_port_args cfg = { {0} };
 	int ret;
-	struct dlb2_port *qm_port = NULL;
+	struct dlb2_port *qm_port = &ev_port->qm_port;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	uint32_t qm_port_id;
 	uint16_t ldb_credit_high_watermark = 0;
@@ -1554,6 +1557,11 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	cfg.cq_depth = rte_align32pow2(dequeue_depth);
 	cfg.cq_depth_threshold = 1;
 
+	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
+		cfg.enable_inflight_ctrl = 1;
+		cfg.inflight_threshold = qm_port->inflight_threshold;
+	}
+
 	cfg.cq_history_list_size = cfg.cq_depth;
 
 	cfg.cos_id = ev_port->cos_id;
@@ -4321,6 +4329,52 @@ dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 		return dlb2_get_ldb_queue_depth(dlb2, queue);
 }
 
+int
+dlb2_set_port_param(struct dlb2_eventdev *dlb2,
+		    int port_id,
+		    uint64_t param_flags,
+		    void *param_val)
+{
+	struct dlb2_port_param *port_param = (struct dlb2_port_param *)param_val;
+	struct dlb2_port *port = &dlb2->ev_ports[port_id].qm_port;
+	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
+	int ret = 0, bit = 0;
+
+	while (param_flags) {
+		uint64_t param = rte_bit_relaxed_test_and_clear64(bit++, &param_flags);
+
+		if (!param)
+			continue;
+		switch (param) {
+		case DLB2_FLOW_MIGRATION_THRESHOLD:
+			if (dlb2->version == DLB2_HW_V2_5) {
+				struct dlb2_cq_inflight_ctrl_args args;
+				args.enable = true;
+				args.port_id = port->id;
+				args.threshold = port_param->inflight_threshold;
+
+				if (dlb2->ev_ports[port_id].setup_done)
+					ret = dlb2_iface_set_cq_inflight_ctrl(handle, &args);
+				if (ret < 0) {
+					DLB2_LOG_ERR("dlb2: can not set port parameters\n");
+					return -EINVAL;
+				}
+				port->enable_inflight_ctrl = true;
+				port->inflight_threshold = args.threshold;
+			} else {
+				DLB2_LOG_ERR("dlb2: FLOW_MIGRATION_THRESHOLD is only supported for 2.5 HW\n");
+				return -EINVAL;
+			}
+			break;
+		default:
+			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
+			return -EINVAL;
+		}
+	}
+
+	return ret;
+}
+
 static bool
 dlb2_queue_is_empty(struct dlb2_eventdev *dlb2,
 		    struct dlb2_eventdev_queue *queue)
diff --git a/drivers/event/dlb2/dlb2_iface.c b/drivers/event/dlb2/dlb2_iface.c
index 100db434d0..b829da2454 100644
--- a/drivers/event/dlb2/dlb2_iface.c
+++ b/drivers/event/dlb2/dlb2_iface.c
@@ -77,5 +77,8 @@ int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 				   struct dlb2_enable_cq_weight_args *args);
 
+int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+				       struct dlb2_cq_inflight_ctrl_args *args);
+
 int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 			     struct dlb2_set_cos_bw_args *args);
diff --git a/drivers/event/dlb2/dlb2_iface.h b/drivers/event/dlb2/dlb2_iface.h
index dc0c446ce8..55b6bdcf84 100644
--- a/drivers/event/dlb2/dlb2_iface.h
+++ b/drivers/event/dlb2/dlb2_iface.h
@@ -72,10 +72,12 @@ extern int (*dlb2_iface_get_ldb_queue_depth)(struct dlb2_hw_dev *handle,
 extern int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 				struct dlb2_get_dir_queue_depth_args *args);
 
-
 extern int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 					  struct dlb2_enable_cq_weight_args *args);
 
+extern int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+					      struct dlb2_cq_inflight_ctrl_args *args);
+
 extern int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 				    struct dlb2_set_cos_bw_args *args);
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index 49f1c6691d..d6828aa482 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -389,6 +389,8 @@ struct dlb2_port {
 	bool use_avx512;
 	uint32_t cq_weight;
 	bool is_producer; /* True if port is of type producer */
+	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
+	bool enable_inflight_ctrl; /*DLB2.5 enable HW inflight control */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -718,6 +720,9 @@ int dlb2_secondary_eventdev_probe(struct rte_eventdev *dev,
 uint32_t dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 			      struct dlb2_eventdev_queue *queue);
 
+int dlb2_set_port_param(struct dlb2_eventdev *dlb2, int port_id,
+			uint64_t flags, void *val);
+
 int dlb2_parse_params(const char *params,
 		      const char *name,
 		      struct dlb2_devargs *dlb2_args,
diff --git a/drivers/event/dlb2/dlb2_user.h b/drivers/event/dlb2/dlb2_user.h
index 8739e2a5ac..ca09c65ac4 100644
--- a/drivers/event/dlb2/dlb2_user.h
+++ b/drivers/event/dlb2/dlb2_user.h
@@ -472,6 +472,8 @@ struct dlb2_create_ldb_port_args {
 	__u16 cq_history_list_size;
 	__u8 cos_id;
 	__u8 cos_strict;
+	__u8 enable_inflight_ctrl;
+	__u16 inflight_threshold;
 };
 
 /*
@@ -717,6 +719,28 @@ struct dlb2_enable_cq_weight_args {
 	__u32 limit;
 };
 
+/*
+ * DLB2_DOMAIN_CMD_SET_CQ_INFLIGHT_CTRL: Set Per-CQ inflight control for
+ * {ATM,UNO,ORD} QEs.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - enable: True if inflight control is enabled. False otherwise
+ * - threshold: Per CQ inflight threshold.
+ *
+ * Output parameters:
+ * - response.status: Detailed error code. In certain cases, such as if the
+ *	ioctl request arg is invalid, the driver won't set status.
+ */
+struct dlb2_cq_inflight_ctrl_args {
+	/* Output parameters */
+	struct dlb2_cmd_response response;
+	/* Input parameters */
+	__u32 port_id;
+	__u16 enable;
+	__u16 threshold;
+};
+
 /*
  * Mapping sizes for memory mapping the consumer queue (CQ) memory space, and
  * producer port (PP) MMIO space.
diff --git a/drivers/event/dlb2/pf/base/dlb2_regs.h b/drivers/event/dlb2/pf/base/dlb2_regs.h
index 7167f3d2ff..b639a5b659 100644
--- a/drivers/event/dlb2/pf/base/dlb2_regs.h
+++ b/drivers/event/dlb2/pf/base/dlb2_regs.h
@@ -3238,6 +3238,15 @@
 #define DLB2_LSP_CQ_LDB_INFL_LIM_LIMIT_LOC	0
 #define DLB2_LSP_CQ_LDB_INFL_LIM_RSVD0_LOC	12
 
+#define DLB2_LSP_CQ_LDB_INFL_THRESH(x) \
+	(0x90580000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RST 0x0
+
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH	0x00000FFF
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0	0xFFFFF000
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH_LOC	0
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0_LOC	12
+
 #define DLB2_V2LSP_CQ_LDB_TKN_CNT(x) \
 	(0xa0580000 + (x) * 0x1000)
 #define DLB2_V2_5LSP_CQ_LDB_TKN_CNT(x) \
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.c b/drivers/event/dlb2/pf/base/dlb2_resource.c
index 7ce3e3531c..051d7e51c3 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.c
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.c
@@ -3062,10 +3062,14 @@ static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
 		    DLB2_CHP_LDB_CQ_DEPTH(hw->ver, port->id.phys_id),
 		    DLB2_CHP_LDB_CQ_DEPTH_RST);
 
-	if (hw->ver != DLB2_HW_V2)
+	if (hw->ver != DLB2_HW_V2) {
 		DLB2_CSR_WR(hw,
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT(port->id.phys_id),
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT_RST);
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    DLB2_LSP_CQ_LDB_INFL_THRESH_RST);
+	}
 
 	DLB2_CSR_WR(hw,
 		    DLB2_LSP_CQ_LDB_INFL_LIM(hw->ver, port->id.phys_id),
@@ -4446,6 +4450,20 @@ static int dlb2_ldb_port_configure_cq(struct dlb2_hw *hw,
 	reg = 0;
 	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(hw->ver, port->id.phys_id), reg);
 
+	if (hw->ver == DLB2_HW_V2_5) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->enable_inflight_ctrl,
+				DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+		DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+		if (args->enable_inflight_ctrl) {
+			reg = 0;
+			DLB2_BITS_SET(reg, args->inflight_threshold,
+					DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+			DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id), reg);
+		}
+	}
+
 	return 0;
 }
 
@@ -5464,6 +5482,35 @@ dlb2_get_domain_used_ldb_port(u32 id,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *
+dlb2_get_domain_ldb_port(u32 id,
+			 bool vdev_req,
+			 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_list_entry *iter __attribute__((unused));
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+	}
+
+	return NULL;
+}
+
 static void dlb2_ldb_port_change_qid_priority(struct dlb2_hw *hw,
 					      struct dlb2_ldb_port *port,
 					      int slot,
@@ -6816,3 +6863,49 @@ int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
 
 	return 0;
 }
+
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	u32 reg = 0;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	DLB2_BITS_SET(reg, args->enable,
+		      DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+	DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+	if (args->enable) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->threshold,
+			      DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    reg);
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.h b/drivers/event/dlb2/pf/base/dlb2_resource.h
index 71bd6148f1..17cc745824 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.h
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.h
@@ -1956,4 +1956,23 @@ int dlb2_hw_enable_cq_weight(struct dlb2_hw *hw,
 			     bool vdev_request,
 			     unsigned int vdev_id);
 
+/**
+ * This function configures the inflight control threshold for a cq.
+ *
+ * This must be called after creating the port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * Errors:
+ * EINVAL - The domain or port is not configured.
+ */
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id);
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 3d15250e11..249ed7ede9 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -665,6 +665,26 @@ dlb2_pf_set_cos_bandwidth(struct dlb2_hw_dev *handle,
 	return ret;
 }
 
+static int
+dlb2_pf_set_cq_inflight_ctrl(struct dlb2_hw_dev *handle,
+			     struct dlb2_cq_inflight_ctrl_args *args)
+{
+	struct dlb2_dev *dlb2_dev = (struct dlb2_dev *)handle->pf_dev;
+	struct dlb2_cmd_response response = {0};
+	int ret = 0;
+
+	DLB2_INFO(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_hw_set_cq_inflight_ctrl(&dlb2_dev->hw, handle->domain_id,
+					   args, &response, false, 0);
+	args->response = response;
+
+	DLB2_INFO(dev->dlb2_device, "Exiting %s() with ret=%d\n",
+		  __func__, ret);
+
+	return ret;
+}
+
 static void
 dlb2_pf_iface_fn_ptrs_init(void)
 {
@@ -691,6 +711,7 @@ dlb2_pf_iface_fn_ptrs_init(void)
 	dlb2_iface_get_sn_occupancy = dlb2_pf_get_sn_occupancy;
 	dlb2_iface_enable_cq_weight = dlb2_pf_enable_cq_weight;
 	dlb2_iface_set_cos_bw = dlb2_pf_set_cos_bandwidth;
+	dlb2_iface_set_cq_inflight_ctrl = dlb2_pf_set_cq_inflight_ctrl;
 }
 
 /* PCI DEV HOOKS */
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.c b/drivers/event/dlb2/rte_pmd_dlb2.c
index 43990e46ac..c72a42b466 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.c
+++ b/drivers/event/dlb2/rte_pmd_dlb2.c
@@ -33,7 +33,36 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 	if (port_id >= dlb2->num_ports || dlb2->ev_ports[port_id].setup_done)
 		return -EINVAL;
 
+	if (dlb2->version == DLB2_HW_V2_5 && mode == DELAYED_POP) {
+		dlb2->ev_ports[port_id].qm_port.enable_inflight_ctrl = true;
+		dlb2->ev_ports[port_id].qm_port.inflight_threshold = 1;
+		mode = AUTO_POP;
+	}
+
 	dlb2->ev_ports[port_id].qm_port.token_pop_mode = mode;
 
 	return 0;
 }
+
+int
+rte_pmd_dlb2_set_port_param(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    void *val)
+{
+	struct dlb2_eventdev *dlb2;
+	struct rte_eventdev *dev;
+
+	if (val == NULL)
+		return -EINVAL;
+
+	RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_eventdevs[dev_id];
+
+	dlb2 = dlb2_pmd_priv(dev);
+
+	if (port_id >= dlb2->num_ports)
+		return -EINVAL;
+
+	return dlb2_set_port_param(dlb2, port_id, flags, val);
+}
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 334c6c356d..6e78dfb5a5 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -67,6 +67,46 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 				uint8_t port_id,
 				enum dlb2_token_pop_mode mode);
 
+/** Set inflight threshold for flow migration */
+#define DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
+
+/** Set port history list */
+#define DLB2_SET_PORT_HL RTE_BIT64(1)
+
+struct dlb2_port_param {
+	uint16_t inflight_threshold : 12;
+};
+
+/*!
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Configure various port parameters.
+ * This function must be called before calling rte_event_port_setup()
+ * for the port, but after calling rte_event_dev_configure().
+ *
+ * @param dev_id
+ *    The identifier of the event device.
+ * @param port_id
+ *    The identifier of the event port.
+ * @param flags
+ *    Bitmask of the parameters being set.
+ * @param val
+ *    Structure containing the values of parameters being set.
+ *
+ * @return
+ * - 0: Success
+ * - EINVAL: Invalid dev_id, port_id, or flags
+ * - EINVAL: The DLB2 is not configured, is already running, or the port is
+ *   already setup
+ */
+__rte_experimental
+int
+rte_pmd_dlb2_set_port_param(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    void *val);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/event/dlb2/version.map b/drivers/event/dlb2/version.map
index 1d0a0a75d7..5078e4960a 100644
--- a/drivers/event/dlb2/version.map
+++ b/drivers/event/dlb2/version.map
@@ -5,6 +5,9 @@ DPDK_24 {
 EXPERIMENTAL {
 	global:
 
+	# added in 24.07
+	rte_pmd_dlb2_set_port_param;
+
 	# added in 20.11
 	rte_pmd_dlb2_set_token_pop_mode;
 };
-- 
2.25.1



* [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:23   ` Jerin Jacob
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
  3 siblings, 1 reply; 28+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

In DLB 2.5, a hardware assist is available, complementing the delayed
token pop software implementation. When it is enabled, the feature
works as follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached, so the feature activates only when the core is congested.
If the core can handle multiple atomic flows, DLB will not try to
switch them. This is an improvement over the SW implementation, which
always switches the flows.

The feature resumes CQ scheduling when the number of pending
completions falls below a configured threshold.

DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
the possible HL entries per LDB port equal 2048 / 64 = 32. So the
maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
high-performance setting.

In case all CQs are configured to have HL = 2 * CQ depth as a
performance option, the calculation of HL at the time of domain
creation will be based on the maximum possible dequeue depth. This
could result in allocating too many HL entries to the domain, as DLB
only has a limited number of HL entries that can be allocated. Hence,
it is best to allow the application to specify HL entries as a
command line argument and override the default allocation. A summary
of usage is listed below:

When 'use_default_hl = 1', the per-port HL is set to
DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
alloc_hl_entries is ignored.

When 'use_default_hl = 0', the per-LDB-port HL is 2 * CQ depth and
the default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.

The user should calculate the needed HL entries based on the CQ
depths the application will use and specify them via the command line
parameter 'alloc_hl_entries', which is used to allocate HL entries.
Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).

If alloc_hl_entries is not specified, the total HL entries for the
vdev = num_ldb_ports * 64.
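
As a worked sketch (the PCI address is a placeholder): an application
using 4 LDB ports, each with CQ depth 128, would pass
alloc_hl_entries = 4 * 128 * 2 = 1024, e.g.:

    dpdk-test-eventdev -a 0000:6b:00.0,use_default_hl=0,alloc_hl_entries=1024 \
        -- --test=perf_queue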

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c         | 124 ++++++++++++++++++++++++++++--
 drivers/event/dlb2/dlb2_priv.h    |  10 ++-
 drivers/event/dlb2/pf/dlb2_pf.c   |   7 +-
 drivers/event/dlb2/rte_pmd_dlb2.h |   1 +
 4 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index d64274b01e..11bbe30d7b 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -180,10 +180,7 @@ dlb2_hw_query_resources(struct dlb2_eventdev *dlb2)
 	 * The capabilities (CAPs) were set at compile time.
 	 */
 
-	if (dlb2->max_cq_depth != DLB2_DEFAULT_CQ_DEPTH)
-		num_ldb_ports = DLB2_MAX_HL_ENTRIES / dlb2->max_cq_depth;
-	else
-		num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
+	num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
 
 	evdev_dlb2_default_info.max_event_queues =
 		dlb2->hw_rsrc_query_results.num_ldb_queues;
@@ -631,6 +628,52 @@ set_enable_cq_weight(const char *key __rte_unused,
 	return 0;
 }
 
+static int set_hl_override(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	bool *default_hl = opaque;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	if ((*value == 'n') || (*value == 'N') || (*value == '0'))
+		*default_hl = false;
+	else
+		*default_hl = true;
+
+	return 0;
+}
+
+static int set_hl_entries(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	int hl_entries = 0;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(&hl_entries, value);
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t)hl_entries > DLB2_MAX_HL_ENTRIES) {
+		DLB2_LOG_ERR(
+		    "alloc_hl_entries %u out of range, must be in [1 - %d]\n",
+		    hl_entries, DLB2_MAX_HL_ENTRIES);
+		return -EINVAL;
+	}
+	*(uint32_t *)opaque = hl_entries;
+
+	return 0;
+}
+
 static int
 set_qid_depth_thresh(const char *key __rte_unused,
 		     const char *value,
@@ -828,8 +871,15 @@ dlb2_hw_create_sched_domain(struct dlb2_eventdev *dlb2,
 		DLB2_NUM_ATOMIC_INFLIGHTS_PER_QUEUE *
 		cfg->num_ldb_queues;
 
-	cfg->num_hist_list_entries = resources_asked->num_ldb_ports *
-		evdev_dlb2_default_info.max_event_port_dequeue_depth;
+	/* If hl_entries is non-zero, the user specified the command line
+	 * option. Otherwise compute using default_port_hl, which was set
+	 * earlier based on the use_default_hl option.
+	 */
+	if (dlb2->hl_entries)
+		cfg->num_hist_list_entries = dlb2->hl_entries;
+	else
+		cfg->num_hist_list_entries =
+		    resources_asked->num_ldb_ports * dlb2->default_port_hl;
 
 	if (device_version == DLB2_HW_V2_5) {
 		DLB2_LOG_DBG("sched domain create - ldb_qs=%d, ldb_ports=%d, dir_ports=%d, atomic_inflights=%d, hist_list_entries=%d, credits=%d\n",
@@ -1041,7 +1091,7 @@ dlb2_eventdev_port_default_conf_get(struct rte_eventdev *dev,
 	struct dlb2_eventdev *dlb2 = dlb2_pmd_priv(dev);
 
 	port_conf->new_event_threshold = dlb2->new_event_limit;
-	port_conf->dequeue_depth = 32;
+	port_conf->dequeue_depth = dlb2->default_port_hl / 2;
 	port_conf->enqueue_depth = DLB2_MAX_ENQUEUE_DEPTH;
 	port_conf->event_port_cfg = 0;
 }
@@ -1560,9 +1610,16 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
 		cfg.enable_inflight_ctrl = 1;
 		cfg.inflight_threshold = qm_port->inflight_threshold;
+		if (!qm_port->hist_list)
+			qm_port->hist_list = cfg.cq_depth;
 	}
 
-	cfg.cq_history_list_size = cfg.cq_depth;
+	if (qm_port->hist_list)
+		cfg.cq_history_list_size = qm_port->hist_list;
+	else if (dlb2->default_port_hl == DLB2_FIXED_CQ_HL_SIZE)
+		cfg.cq_history_list_size = DLB2_FIXED_CQ_HL_SIZE;
+	else
+		cfg.cq_history_list_size = cfg.cq_depth * 2;
 
 	cfg.cos_id = ev_port->cos_id;
 	cfg.cos_strict = 0;/* best effort */
@@ -4366,6 +4423,13 @@ dlb2_set_port_param(struct dlb2_eventdev *dlb2,
 				return -EINVAL;
 			}
 			break;
+		case DLB2_SET_PORT_HL:
+			if (dlb2->ev_ports[port_id].setup_done) {
+				DLB2_LOG_ERR("DLB2_SET_PORT_HL must be called before setting up port\n");
+				return -EINVAL;
+			}
+			port->hist_list = port_param->port_hl;
+			break;
 		default:
 			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
 			return -EINVAL;
@@ -4684,6 +4748,28 @@ dlb2_primary_eventdev_probe(struct rte_eventdev *dev,
 		return err;
 	}
 
+	if (dlb2_args->use_default_hl) {
+		dlb2->default_port_hl = DLB2_FIXED_CQ_HL_SIZE;
+		if (dlb2_args->alloc_hl_entries)
+			DLB2_LOG_ERR(": Ignoring 'alloc_hl_entries' and using "
+				     "default history list sizes for eventdev:"
+				     " %s\n", dev->data->name);
+		dlb2->hl_entries = 0;
+	} else {
+		dlb2->default_port_hl = 2 * DLB2_FIXED_CQ_HL_SIZE;
+
+		if (dlb2_args->alloc_hl_entries >
+		    dlb2->hw_rsrc_query_results.num_hist_list_entries) {
+			DLB2_LOG_ERR(": Insufficient HL entries asked=%d "
+				     "available=%d for eventdev: %s\n",
+				     dlb2_args->alloc_hl_entries,
+				     dlb2->hw_rsrc_query_results.num_hist_list_entries,
+				     dev->data->name);
+			return -EINVAL;
+		}
+		dlb2->hl_entries = dlb2_args->alloc_hl_entries;
+	}
+
 	dlb2_iface_hardware_init(&dlb2->qm_instance);
 
 	/* configure class of service */
@@ -4791,6 +4877,8 @@ dlb2_parse_params(const char *params,
 					     DLB2_PRODUCER_COREMASK,
 					     DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG,
 					     DLB2_ENABLE_CQ_WEIGHT_ARG,
+					     DLB2_USE_DEFAULT_HL,
+					     DLB2_ALLOC_HL_ENTRIES,
 					     NULL };
 
 	if (params != NULL && params[0] != '\0') {
@@ -4994,6 +5082,26 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
+			ret = rte_kvargs_process(kvlist, DLB2_USE_DEFAULT_HL,
+						 set_hl_override,
+						 &dlb2_args->use_default_hl);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing use_default_hl arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
+			ret = rte_kvargs_process(kvlist, DLB2_ALLOC_HL_ENTRIES,
+						 set_hl_entries,
+						 &dlb2_args->alloc_hl_entries);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing alloc_hl_entries arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
 			rte_kvargs_free(kvlist);
 		}
 	}
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index d6828aa482..dc9f98e142 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -52,6 +52,8 @@
 #define DLB2_PRODUCER_COREMASK "producer_coremask"
 #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
 #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
+#define DLB2_USE_DEFAULT_HL "use_default_hl"
+#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"
 
 /* Begin HW related defines and structs */
 
@@ -101,7 +103,8 @@
  */
 #define DLB2_MAX_HL_ENTRIES 2048
 #define DLB2_MIN_CQ_DEPTH 1
-#define DLB2_DEFAULT_CQ_DEPTH 32
+#define DLB2_DEFAULT_CQ_DEPTH 32  /* Can be overridden using max_cq_depth command line parameter */
+#define DLB2_FIXED_CQ_HL_SIZE 32  /* Used when use_default_hl is true */
 #define DLB2_MIN_HARDWARE_CQ_DEPTH 8
 #define DLB2_NUM_HIST_LIST_ENTRIES_PER_LDB_PORT \
 	DLB2_DEFAULT_CQ_DEPTH
@@ -391,6 +394,7 @@ struct dlb2_port {
 	bool is_producer; /* True if port is of type producer */
 	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
 	bool enable_inflight_ctrl; /*DLB2.5 enable HW inflight control */
+	uint16_t hist_list; /* Port history list */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -640,6 +644,8 @@ struct dlb2_eventdev {
 	uint32_t cos_bw[DLB2_COS_NUM_VALS]; /* bandwidth per cos domain */
 	uint8_t max_cos_port; /* Max LDB port from any cos */
 	bool enable_cq_weight;
+	uint16_t hl_entries; /* Num HL entries to allocate for the domain */
+	int default_port_hl;  /* Fixed or dynamic (2*CQ Depth) HL assignment */
 };
 
 /* used for collecting and passing around the dev args */
@@ -678,6 +684,8 @@ struct dlb2_devargs {
 	const char *producer_coremask;
 	bool default_ldb_port_allocation;
 	bool enable_cq_weight;
+	bool use_default_hl;
+	uint32_t alloc_hl_entries;
 };
 
 /* End Eventdev related defines and structs */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 249ed7ede9..ba22f37731 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -422,6 +422,7 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 				      cfg,
 				      cq_base,
 				      &response);
+	cfg->response = response;
 	if (ret)
 		goto create_port_err;
 
@@ -437,8 +438,6 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 
 	dlb2_list_init_head(&port_memory.list);
 
-	cfg->response = response;
-
 	return 0;
 
 create_port_err:
@@ -731,7 +730,9 @@ dlb2_eventdev_pci_init(struct rte_eventdev *eventdev)
 		.hw_credit_quanta = DLB2_SW_CREDIT_BATCH_SZ,
 		.default_depth_thresh = DLB2_DEPTH_THRESH_DEFAULT,
 		.max_cq_depth = DLB2_DEFAULT_CQ_DEPTH,
-		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH
+		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH,
+		.use_default_hl = true,
+		.alloc_hl_entries = 0
 	};
 	struct dlb2_eventdev *dlb2;
 	int q;
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 6e78dfb5a5..91b47ede11 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -75,6 +75,7 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 
 struct dlb2_port_param {
 	uint16_t inflight_threshold : 12;
+	uint16_t port_hl;
 };
 
 /*!
-- 
2.25.1



* [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:30   ` Jerin Jacob
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
  3 siblings, 1 reply; 28+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This commit improves DLB credit handling in scenarios where ports
hold on to credits but cannot release them due to insufficient
accumulation (less than 2 * credit quanta).

Worker ports now release all accumulated credits when the
back-to-back zero-poll count reaches a preset threshold.

Producer ports release all accumulated credits if enqueue fails for a
consecutive number of retries.

In a multi-producer system, some producers may exit early while
holding on to credits. These are now released during port unlink,
which needs to be performed by the application.

test-eventdev is modified to call rte_event_port_unlink() to release
any credits accumulated by producer ports.
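
The credit checks become build-time options. A sketch of how they
could be disabled via the new meson_options.txt mechanism added below
(illustrative edit; only safe under the restrictions described in the
code comments):

    # drivers/event/dlb2/meson_options.txt
    DLB2_BYPASS_FENCE_ON_PP = 0
    DLB_HW_CREDITS_CHECKS = 0
    DLB_SW_CREDITS_CHECKS = 0

Each non-comment line is turned by meson.build into a compile flag,
e.g. -DDLB_HW_CREDITS_CHECKS=0.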

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 app/test-eventdev/test_perf_common.c |  20 +--
 drivers/event/dlb2/dlb2.c            | 203 +++++++++++++++++++++------
 drivers/event/dlb2/dlb2_priv.h       |   1 +
 drivers/event/dlb2/meson.build       |  12 ++
 drivers/event/dlb2/meson_options.txt |   6 +
 5 files changed, 194 insertions(+), 48 deletions(-)
 create mode 100644 drivers/event/dlb2/meson_options.txt

diff --git a/app/test-eventdev/test_perf_common.c b/app/test-eventdev/test_perf_common.c
index 93e6132de8..b3a12e12ac 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -854,6 +854,7 @@ perf_producer_wrapper(void *arg)
 	struct rte_event_dev_info dev_info;
 	struct prod_data *p  = arg;
 	struct test_perf *t = p->t;
+	int ret = 0;
 
 	rte_event_dev_info_get(p->dev_id, &dev_info);
 	if (!t->opt->prod_enq_burst_sz) {
@@ -870,29 +871,32 @@ perf_producer_wrapper(void *arg)
 	 */
 	if (t->opt->prod_type == EVT_PROD_TYPE_SYNT &&
 			t->opt->prod_enq_burst_sz == 1)
-		return perf_producer(arg);
+		ret = perf_producer(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_SYNT &&
 			t->opt->prod_enq_burst_sz > 1) {
 		if (dev_info.max_event_port_enqueue_depth == 1)
 			evt_err("This event device does not support burst mode");
 		else
-			return perf_producer_burst(arg);
+			ret = perf_producer_burst(arg);
 	}
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_TIMER_ADPTR &&
 			!t->opt->timdev_use_burst)
-		return perf_event_timer_producer(arg);
+		ret = perf_event_timer_producer(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_TIMER_ADPTR &&
 			t->opt->timdev_use_burst)
-		return perf_event_timer_producer_burst(arg);
+		ret = perf_event_timer_producer_burst(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_CRYPTO_ADPTR) {
 		if (t->opt->prod_enq_burst_sz > 1)
-			return perf_event_crypto_producer_burst(arg);
+			ret = perf_event_crypto_producer_burst(arg);
 		else
-			return perf_event_crypto_producer(arg);
+			ret = perf_event_crypto_producer(arg);
 	} else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_DMA_ADPTR)
-		return perf_event_dma_producer(arg);
+		ret = perf_event_dma_producer(arg);
 
-	return 0;
+	/* Unlink port to release any acquired HW resources */
+	rte_event_port_unlink(p->dev_id, p->port_id, &p->queue_id, 1);
+
+	return ret;
 }
 
 static inline uint64_t
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 11bbe30d7b..2c341a5845 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -43,7 +43,47 @@
  * to DLB can go ahead of relevant application writes like updates to buffers
  * being sent with event
  */
+#ifndef DLB2_BYPASS_FENCE_ON_PP
 #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
+#endif
+
+/* HW credit checks can only be turned off for a DLB2 device if the
+ * following is true for each created eventdev:
+ * LDB credits <= DIR credits + minimum CQ Depth
+ * (CQ Depth is the minimum of all ports configured within the eventdev)
+ * This needs to be true for all eventdevs created on any DLB2 device
+ * managed by this driver.
+ * DLB2.5 does not have any such restriction, as it has a single credit pool.
+ */
+#ifndef DLB_HW_CREDITS_CHECKS
+#define DLB_HW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * SW credit checks can only be turned off if the application has a way
+ * to limit input events to the eventdev below the assigned credit limit.
+ */
+#ifndef DLB_SW_CREDITS_CHECKS
+#define DLB_SW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * To avoid deadlock situations, by default, the per-port new_event_threshold
+ * check is disabled. nb_events_limit is still checked while allocating
+ * new event credits.
+ */
+#define ENABLE_PORT_THRES_CHECK 1
+/*
+ * To avoid deadlock, ports holding credits will release them after this
+ * many consecutive zero dequeues.
+ */
+#define DLB2_ZERO_DEQ_CREDIT_RETURN_THRES 16384
+
+/*
+ * To avoid deadlock, ports holding credits will release them after this
+ * many consecutive enqueue failures.
+ */
+#define DLB2_ENQ_FAIL_CREDIT_RETURN_THRES 100
 
 /*
  * Resources exposed to eventdev. Some values overridden at runtime using
@@ -2488,6 +2528,61 @@ dlb2_event_queue_detach_ldb(struct dlb2_eventdev *dlb2,
 	return ret;
 }
 
+static inline void
+dlb2_port_credits_return(struct dlb2_port *qm_port)
+{
+	/* Return all port credits */
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		if (qm_port->cached_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_COMBINED_POOL],
+					   qm_port->cached_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_credits = 0;
+		}
+	} else {
+		if (qm_port->cached_ldb_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   qm_port->cached_ldb_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_ldb_credits = 0;
+		}
+		if (qm_port->cached_dir_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   qm_port->cached_dir_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_dir_credits = 0;
+		}
+	}
+}
+
+static inline void
+dlb2_release_sw_credits(struct dlb2_eventdev *dlb2,
+			struct dlb2_eventdev_port *ev_port, uint16_t val)
+{
+	if (ev_port->inflight_credits) {
+		__atomic_fetch_sub(&dlb2->inflights, val, __ATOMIC_SEQ_CST);
+		ev_port->inflight_credits -= val;
+	}
+}
+
+static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
+					  bool cond, uint32_t threshold)
+{
+#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS
+	if (cond) {
+		if (++ev_port->credit_return_count > threshold) {
+#if DLB_SW_CREDITS_CHECKS
+			dlb2_release_sw_credits(ev_port->dlb2, ev_port,
+						ev_port->inflight_credits);
+#endif
+#if DLB_HW_CREDITS_CHECKS
+			dlb2_port_credits_return(&ev_port->qm_port);
+#endif
+			ev_port->credit_return_count = 0;
+		}
+	} else {
+		ev_port->credit_return_count = 0;
+	}
+#endif
+}
+
 static int
 dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 			  uint8_t queues[], uint16_t nb_unlinks)
@@ -2507,14 +2602,15 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 
 	if (queues == NULL || nb_unlinks == 0) {
 		DLB2_LOG_DBG("dlb2: queues is NULL or nb_unlinks is 0\n");
-		return 0; /* Ignore and return success */
+		nb_unlinks = 0; /* Ignore and return success */
+		goto ret_credits;
 	}
 
 	if (ev_port->qm_port.is_directed) {
 		DLB2_LOG_DBG("dlb2: ignore unlink from dir port %d\n",
 			     ev_port->id);
 		rte_errno = 0;
-		return nb_unlinks; /* as if success */
+		goto ret_credits;
 	}
 
 	dlb2 = ev_port->dlb2;
@@ -2553,6 +2649,10 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 		ev_queue->num_links--;
 	}
 
+ret_credits:
+	if (ev_port->inflight_credits)
+		dlb2_check_and_return_credits(ev_port, true, 0);
+
 	return nb_unlinks;
 }
 
@@ -2752,8 +2852,7 @@ dlb2_replenish_sw_credits(struct dlb2_eventdev *dlb2,
 		/* Replenish credits, saving one quanta for enqueues */
 		uint16_t val = ev_port->inflight_credits - quanta;
 
-		__atomic_fetch_sub(&dlb2->inflights, val, __ATOMIC_SEQ_CST);
-		ev_port->inflight_credits -= val;
+		dlb2_release_sw_credits(dlb2, ev_port, val);
 	}
 }
 
@@ -2924,7 +3023,9 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 {
 	struct dlb2_eventdev *dlb2 = ev_port->dlb2;
 	struct dlb2_eventdev_queue *ev_queue;
+#if DLB_HW_CREDITS_CHECKS
 	uint16_t *cached_credits = NULL;
+#endif
 	struct dlb2_queue *qm_queue;
 
 	ev_queue = &dlb2->ev_queues[ev->queue_id];
@@ -2936,6 +3037,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		goto op_check;
 
 	if (!qm_queue->is_directed) {
+#if DLB_HW_CREDITS_CHECKS
 		/* Load balanced destination queue */
 
 		if (dlb2->version == DLB2_HW_V2) {
@@ -2951,6 +3053,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		switch (ev->sched_type) {
 		case RTE_SCHED_TYPE_ORDERED:
 			DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_ORDERED\n");
@@ -2981,7 +3084,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		}
 	} else {
 		/* Directed destination queue */
-
+#if DLB_HW_CREDITS_CHECKS
 		if (dlb2->version == DLB2_HW_V2) {
 			if (dlb2_check_enqueue_hw_dir_credits(qm_port)) {
 				rte_errno = -ENOSPC;
@@ -2995,6 +3098,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_DIRECTED\n");
 
 		*sched_type = DLB2_SCHED_DIRECTED;
@@ -3002,6 +3106,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 
 op_check:
 	switch (ev->op) {
+#if DLB_SW_CREDITS_CHECKS
 	case RTE_EVENT_OP_NEW:
 		/* Check that a sw credit is available */
 		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
@@ -3009,7 +3114,10 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			return 1;
 		}
 		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_FORWARD:
 		/* Check for outstanding_releases underflow. If this occurs,
@@ -3020,10 +3128,14 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_RELEASE:
+#if DLB_SW_CREDITS_CHECKS
 		ev_port->inflight_credits++;
+#endif
 		/* Check for outstanding_releases underflow. If this occurs,
 		 * the application is not using the EVENT_OPs correctly; for
 		 * example, forwarding or releasing events that were not
@@ -3032,9 +3144,10 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
-
+#if DLB_SW_CREDITS_CHECKS
 		/* Replenish s/w credits if enough are cached */
 		dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 		break;
 	}
 
@@ -3145,6 +3258,8 @@ __dlb2_event_enqueue_burst(void *event_port,
 			break;
 	}
 
+	dlb2_check_and_return_credits(ev_port, !i, DLB2_ENQ_FAIL_CREDIT_RETURN_THRES);
+
 	return i;
 }
 
@@ -3283,53 +3398,45 @@ dlb2_event_release(struct dlb2_eventdev *dlb2,
 		return;
 	}
 	ev_port->outstanding_releases -= i;
+#if DLB_SW_CREDITS_CHECKS
 	ev_port->inflight_credits += i;
 
 	/* Replenish s/w credits if enough releases are performed */
 	dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 }
 
 static inline void
 dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 {
 	uint32_t batch_size = qm_port->hw_credit_quanta;
+	int val;
 
 	/* increment port credits, and return to pool if exceeds threshold */
-	if (!qm_port->is_directed) {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_ldb_credits += num;
-			if (qm_port->cached_ldb_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-					qm_port->credit_pool[DLB2_LDB_QUEUE],
-					batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_ldb_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_credits -= batch_size;
-			}
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		qm_port->cached_credits += num;
+		if (qm_port->cached_credits >= 2 * batch_size) {
+			val = qm_port->cached_credits - batch_size;
+			__atomic_fetch_add(
+			    qm_port->credit_pool[DLB2_COMBINED_POOL], val,
+			    __ATOMIC_SEQ_CST);
+			qm_port->cached_credits -= val;
+		}
+	} else if (!qm_port->is_directed) {
+		qm_port->cached_ldb_credits += num;
+		if (qm_port->cached_ldb_credits >= 2 * batch_size) {
+			val = qm_port->cached_ldb_credits - batch_size;
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   val, __ATOMIC_SEQ_CST);
+			qm_port->cached_ldb_credits -= val;
 		}
 	} else {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_dir_credits += num;
-			if (qm_port->cached_dir_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-					qm_port->credit_pool[DLB2_DIR_QUEUE],
-					batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_dir_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_credits -= batch_size;
-			}
+		qm_port->cached_dir_credits += num;
+		if (qm_port->cached_dir_credits >= 2 * batch_size) {
+			val = qm_port->cached_dir_credits - batch_size;
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   val, __ATOMIC_SEQ_CST);
+			qm_port->cached_dir_credits -= val;
 		}
 	}
 }
@@ -3360,6 +3467,15 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 
 	/* Wait/poll time expired */
 	if (elapsed_ticks >= timeout) {
+
+		/* Return all credits before blocking if remaining credits in
+		 * system is less than quanta.
+		 */
+		uint32_t sw_inflights = __atomic_load_n(&dlb2->inflights, __ATOMIC_SEQ_CST);
+		uint32_t quanta = ev_port->credit_update_quanta;
+
+		if (dlb2->new_event_limit - sw_inflights < quanta)
+			dlb2_check_and_return_credits(ev_port, true, 0);
 		return 1;
 	} else if (dlb2->umwait_allowed) {
 		struct rte_power_monitor_cond pmc;
@@ -4222,8 +4338,9 @@ dlb2_hw_dequeue(struct dlb2_eventdev *dlb2,
 			dlb2_consume_qe_immediate(qm_port, num);
 
 		ev_port->outstanding_releases += num;
-
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4257,6 +4374,9 @@ dlb2_event_dequeue_burst(void *event_port, struct rte_event *ev, uint16_t num,
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
 
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
+
 	return cnt;
 }
 
@@ -4293,6 +4413,9 @@ dlb2_event_dequeue_burst_sparse(void *event_port, struct rte_event *ev,
 
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
+
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
 	return cnt;
 }
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index dc9f98e142..fd76b5b9fb 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
 	struct rte_event_port_conf conf; /* user-supplied configuration */
 	uint16_t inflight_credits; /* num credits this port has right now */
 	uint16_t credit_update_quanta;
+	uint32_t credit_return_count; /* consecutive times the credit return condition held */
 	struct dlb2_eventdev *dlb2; /* backlink optimization */
 	alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
 	struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
index 515d1795fe..77a197e32c 100644
--- a/drivers/event/dlb2/meson.build
+++ b/drivers/event/dlb2/meson.build
@@ -68,3 +68,15 @@ endif
 headers = files('rte_pmd_dlb2.h')
 
 deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
+
+if meson.version().version_compare('> 0.58.0')
+fs = import('fs')
+dlb_options = fs.read('meson_options.txt').strip().split('\n')
+
+foreach opt: dlb_options
+	if (opt.strip().startswith('#') or opt.strip() == '')
+		continue
+	endif
+	cflags += '-D' + opt.strip().to_upper().replace(' ','')
+endforeach
+endif
diff --git a/drivers/event/dlb2/meson_options.txt b/drivers/event/dlb2/meson_options.txt
new file mode 100644
index 0000000000..b57c999e54
--- /dev/null
+++ b/drivers/event/dlb2/meson_options.txt
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023-2024 Intel Corporation
+
+DLB2_BYPASS_FENCE_ON_PP = 0
+DLB_HW_CREDITS_CHECKS = 1
+DLB_SW_CREDITS_CHECKS = 1
-- 
2.25.1



* Re: [PATCH v4 0/3] DLB2 Enhancements
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
                   ` (2 preceding siblings ...)
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-05-02  7:34 ` Bruce Richardson
  2024-05-02 15:52   ` Sevincer, Abdullah
  3 siblings, 1 reply; 28+ messages in thread
From: Bruce Richardson @ 2024-05-02  7:34 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Wed, May 01, 2024 at 02:46:17PM -0500, Abdullah Sevincer wrote:
> This patchset  addresses DLB enhancements in the DLB driver.
> 
> Abdullah Sevincer (3):
>   event/dlb2: add support for HW delayed token
>   event/dlb2: add support for dynamic HL entries
>   event/dlb2: enhance DLB credit handling
> 
Hi Abdullah,

Couple of small asks/tips when sending new versions of a patchset:
1) When sending v2, v3, v4 using git-send-email, include
  "--in-reply-to <message-id-of-v1-cover-letter>" in the command. This will
  ensure all copies of the patches get put in the same email thread, rather
  than having different versions spread throughout the reader's mailbox
  (an example command follows the changelog sample below).
2) Please include in the cover letter a short one/two-line description of
  what has changed in each version, so anyone reviewing e.g. v4 after
  reading v3, is aware of what parts of v4 they need to look at
  specifically. Generally, this should be in reverse order e.g.

v4: renamed bar to foobar
v3: changed foo to bar
v2: added new foo
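
For example (the message-id is a placeholder):

    git send-email --in-reply-to='<v1-cover-letter-message-id>' \
        --to dev@dpdk.org v5-*.patch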

Thanks,
/Bruce


* RE: [PATCH v4 0/3] DLB2 Enhancements
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
@ 2024-05-02 15:52   ` Sevincer, Abdullah
  0 siblings, 0 replies; 28+ messages in thread
From: Sevincer, Abdullah @ 2024-05-02 15:52 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: dev, jerinj, Chen, Mike Ximing, Sarkar, Tirthendu, Pathak,
	Pravin, Doneria, Shivani

> Hi Abdullah,
>
> Couple of small asks/tips when sending new versions of a patchset:
> 1) When sending v2, v3, v4 using git-send-email, include
>   "--in-reply-to <message-id-of-v1-cover-letter>" in the command. This will
>   ensure all copies of the patches get put in the same email thread, rather
>   than having different versions spread throughout the reader's mailbox.
> 2) Please include in the cover letter a short one/two-line description of
>   what has changed in each version, so anyone reviewing e.g. v4 after
>   reading v3, is aware of what parts of v4 they need to look at
>   specifically. Generally, this should be in reverse order e.g.
>
> v4: renamed bar to foobar
> v3: changed foo to bar
> v2: added new foo
>
> Thanks,
> /Bruce

Hi Bruce,

Thanks for the tips, and sorry for filling up the inboxes; I will follow your
instructions for the upcoming patches.


* Re: [PATCH v4 1/3] event/dlb2: add support for HW delayed token
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-05-27 15:19   ` Jerin Jacob
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
  1 sibling, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:19 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> In DLB 2.5, hardware assist is available, complementing the Delayed
> token POP software implementation. When it is enabled, the feature
> works as follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached. So the feature is activated only if the core is
> congested. If the core can handle multiple atomic flows, DLB will not
> try to switch them. This is an improvement over SW implementation
> which always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions fall below a configured threshold. To emulate older 2.0
> behavior, this threshold is set to 1 by old APIs. SW sets CQ to
> auto-pop mode for token return, as tokens withholding is not
> necessary now. As HW counts completions and not tokens, events equal
> to HL (History List) entries will be scheduled to DLB before the
> feature activates and stops CQ scheduling.


1) Also mention the newly added PMD API, and update the PMD section of
the release notes for the new feature.
2) Fix CI http://mails.dpdk.org/archives/test-report/2024-May/657681.html


>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
 +/** Set inflight threshold for flow migration */
> +#define DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)

Fix the namespace for public API, RTE_PMD_DLB2_PORT_SET_F_FLOW_MIGRATION_...


> +
> +/** Set port history list */
> +#define DLB2_SET_PORT_HL RTE_BIT64(1)

RTE_PMD_DLB2_PORT_SET_F_PORT_HL


> +
> +struct dlb2_port_param {

fix name space, rte_pmd_dlb2_port_params

> +       uint16_t inflight_threshold : 12;
> +};
> +
> +/*!
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Configure various port parameters.
> + * AUTO_POP. This function must be called before calling rte_event_port_setup()
> + * for the port, but after calling rte_event_dev_configure().
> + *
> + * @param dev_id
> + *    The identifier of the event device.
> + * @param port_id
> + *    The identifier of the event port.
> + * @param flags
> + *    Bitmask of the parameters being set.
> + * @param val
> + *    Structure coantaining the values of parameters being set.

Why not use struct rte_pmd_dlb2_port_params itself instead of void *.

> + *
> + * @return
> + * - 0: Success
> + * - EINVAL: Invalid dev_id, port_id, or mode
> + * - EINVAL: The DLB2 is not configured, is already running, or the port is
> + *   already setup
> + */
> +__rte_experimental
> +int
> +rte_pmd_dlb2_set_port_param(uint8_t dev_id,
> +                           uint8_t port_id,
> +                           uint64_t flags,
> +                           void *val);


* Re: [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-05-27 15:23   ` Jerin Jacob
  0 siblings, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:23 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> In DLB 2.5, hardware assist is available, complementing the Delayed
> token POP software implementation. When it is enabled, the feature
> works as follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached, so the feature is activated only if the core is
> congested. If the core can handle multiple atomic flows, DLB will not
> try to switch them. This is an improvement over the SW implementation,
> which always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions falls below a configured threshold.
>
> DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
> possible HL entries per LDB port equals 2048 / 64 = 32. So, the
> maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
> high-performance setting.
>
> In case all CQs are configured to have HL = 2 * CQ Depth as a
> performance option, the calculation of HL at the time of domain
> creation will be based on the maximum possible dequeue depth. This could
> result in allocating too many HL entries to the domain, as DLB only
> has a limited number of HL entries to allocate. Hence, it is best
> to allow the application to specify HL entries as a command line argument
> and override the default allocation. A summary of usage is listed below:
>
> When 'use_default_hl = 1', per-port HL is set to
> DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
> alloc_hl_entries is ignored.
>
> When 'use_default_hl = 0', per-LDB-port HL = 2 * CQ depth and the
> default port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
>
> The user should calculate the needed HL entries based on the CQ depths
> the application will use and specify it as the command line parameter
> 'alloc_hl_entries'. This will be used to allocate HL entries.
> Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths) * 2.
>
> If alloc_hl_entries is not specified, then total HL entries for the
> vdev = num_ldb_ports * 64.
>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>

>         }
> diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
> index d6828aa482..dc9f98e142 100644
> --- a/drivers/event/dlb2/dlb2_priv.h
> +++ b/drivers/event/dlb2/dlb2_priv.h
> @@ -52,6 +52,8 @@
>  #define DLB2_PRODUCER_COREMASK "producer_coremask"
>  #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
>  #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
> +#define DLB2_USE_DEFAULT_HL "use_default_hl"
> +#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"


1) Update doc/guides/eventdevs/dlb2.rst for the new devargs.
2) Please update the PMD section of the release notes for this feature.
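
As a worked example of the sizing formula above: with all 64 LDB ports in use at CQ depth 16, alloc_hl_entries = (64 * 16) * 2 = 2048, which is exactly the number of HL entries the device provides.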

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-05-27 15:30   ` Jerin Jacob
  2024-06-04 18:22     ` Sevincer, Abdullah
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:30 UTC (permalink / raw)
  To: Abdullah Sevincer, Richardson, Bruce
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:27 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> This commit improves DLB credit handling scenarios when
> ports hold on to credits but can't release them due to insufficient
> accumulation (less than 2 * credit quanta).
>
> Worker ports now release all accumulated credits when the back-to-back
> zero-poll count reaches a preset threshold.
>
> Producer ports release all accumulated credits if enqueue fails for a
> consecutive number of retries.
>
> In a multi-producer system, some producer(s) may exit early while
> holding on to credits. These are now released during port unlink,
> which needs to be performed by the application.
>
> test-eventdev is modified to call rte_event_port_unlink() to release
> any accumulated credits by producer ports.
>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> ---
>  app/test-eventdev/test_perf_common.c |  20 +--

1) Spotted non-driver changes in driver patches. Please send the
test-eventdev changes as a separate commit with a complete rationale.
2) Fix CI issues: http://mails.dpdk.org/archives/test-report/2024-May/657683.html
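
For reference, the application-side flow implied by the commit message (a producer returning held credits at exit by unlinking its port) would look roughly like the sketch below. This is illustrative only, not the actual test-eventdev change:

#include <stdio.h>
#include <rte_errno.h>
#include <rte_eventdev.h>

/* Illustrative producer teardown: unlinking all queues lets the PMD
 * return any credits the port is still holding.
 */
static void producer_teardown(uint8_t dev_id, uint8_t port_id)
{
	/* queues == NULL with nb_unlinks == 0 unlinks every linked queue */
	int n = rte_event_port_unlink(dev_id, port_id, NULL, 0);

	if (n < 0)
		printf("port %u unlink failed: %d\n", port_id, rte_errno);
}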



>  drivers/event/dlb2/dlb2.c            | 203 +++++++++++++++++++++------
>  drivers/event/dlb2/dlb2_priv.h       |   1 +
>  drivers/event/dlb2/meson.build       |  12 ++
>  drivers/event/dlb2/meson_options.txt |   6 +
>  5 files changed, 194 insertions(+), 48 deletions(-)
>  create mode 100644 drivers/event/dlb2/meson_options.txt
>

>
>  static inline uint64_t
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 11bbe30d7b..2c341a5845 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -43,7 +43,47 @@
>   * to DLB can go ahead of relevant application writes like updates to buffers
>   * being sent with event
>   */
> +#ifndef DLB2_BYPASS_FENCE_ON_PP
>  #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
> +#endif
> +
> +/* HW credit checks can only be turned off for a DLB2 device if the
> + * following is true for each created eventdev:
> + * LDB credits <= DIR credits + minimum CQ Depth
> + * (CQ Depth is the minimum across all ports configured within the eventdev)
> + * This needs to be true for all eventdevs created on any DLB2 device
> + * managed by this driver.
> + * DLB2.5 does not have any such restriction as it has a single credit pool
> + */
> +#ifndef DLB_HW_CREDITS_CHECKS
> +#define DLB_HW_CREDITS_CHECKS 1
> +#endif
> +
> +/*
> + * SW credit checks can only be turned off if the application has a way to
> + * limit input events to the eventdev below the assigned credit limit
> + */
> +#ifndef DLB_SW_CREDITS_CHECKS
> +#define DLB_SW_CREDITS_CHECKS 1
> +#endif
> +

> +
> +static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
> +                                         bool cond, uint32_t threshold)
> +{
> +#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS


This new patch is full of compilation flags clutter, can you make it runtime?

>
> diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
> index dc9f98e142..fd76b5b9fb 100644
> --- a/drivers/event/dlb2/dlb2_priv.h
> +++ b/drivers/event/dlb2/dlb2_priv.h
> @@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
>         struct rte_event_port_conf conf; /* user-supplied configuration */
>         uint16_t inflight_credits; /* num credits this port has right now */
>         uint16_t credit_update_quanta;
> +       uint32_t credit_return_count; /* count till the credit return condition is true */
>         struct dlb2_eventdev *dlb2; /* backlink optimization */
>         alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
>         struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
> diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> index 515d1795fe..77a197e32c 100644
> --- a/drivers/event/dlb2/meson.build
> +++ b/drivers/event/dlb2/meson.build
> @@ -68,3 +68,15 @@ endif
>  headers = files('rte_pmd_dlb2.h')
>
>  deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
> +
> +if meson.version().version_compare('> 0.58.0')
> +fs = import('fs')
> +dlb_options = fs.read('meson_options.txt').strip().split('\n')
> +
> +foreach opt: dlb_options
> +       if (opt.strip().startswith('#') or opt.strip() == '')
> +               continue
> +       endif
> +       cflags += '-D' + opt.strip().to_upper().replace(' ','')
> +endforeach
> +endif
> diff --git a/drivers/event/dlb2/meson_options.txt b/drivers/event/dlb2/meson_options.txt


Adding @Richardson, Bruce   @Thomas Monjalon   to comment on this, I
am not sure driver specific
meson_options.txt is a good path?
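
For reference, with the snippet above each non-comment line in meson_options.txt, e.g. DLB_HW_CREDITS_CHECKS = 1, becomes a compiler flag of the form -DDLB_HW_CREDITS_CHECKS=1 (strip whitespace, upper-case, drop spaces, prepend '-D').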



> new file mode 100644
> index 0000000000..b57c999e54
> --- /dev/null
> +++ b/drivers/event/dlb2/meson_options.txt
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023-2024 Intel Corporation
> +
> +DLB2_BYPASS_FENCE_ON_PP = 0
> +DLB_HW_CREDITS_CHECKS = 1
> +DLB_SW_CREDITS_CHECKS = 1
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-05-27 15:30   ` Jerin Jacob
@ 2024-06-04 18:22     ` Sevincer, Abdullah
  2024-06-05  4:02       ` Jerin Jacob
  0 siblings, 1 reply; 28+ messages in thread
From: Sevincer, Abdullah @ 2024-06-04 18:22 UTC (permalink / raw)
  To: Jerin Jacob, Richardson, Bruce
  Cc: dev, jerinj, Chen, Mike Ximing, Sarkar, Tirthendu, Pathak,
	Pravin, Doneria, Shivani



>+This new patch is full of compilation flags clutter, can you make it runtime?
Thanks for the reviews, Jerin. We can make it runtime, but that would cost extra cycles and require new PMD APIs to set those runtime parameters. We chose compile-time flags to save cycles and to keep the command-line parameters from growing longer.

>+Adding @Richardson, Bruce   @Thomas Monjalon   to comment on this, I
>+am not sure driver specific
>+meson_options.txt is a good path?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-06-04 18:22     ` Sevincer, Abdullah
@ 2024-06-05  4:02       ` Jerin Jacob
  2024-06-19 21:07         ` Sevincer, Abdullah
  0 siblings, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2024-06-05  4:02 UTC (permalink / raw)
  To: Sevincer, Abdullah
  Cc: Richardson, Bruce, dev, jerinj, Chen, Mike Ximing, Sarkar,
	Tirthendu, Pathak, Pravin, Doneria, Shivani

On Tue, Jun 4, 2024 at 11:52 PM Sevincer, Abdullah
<abdullah.sevincer@intel.com> wrote:
>
>
>
> >+This new patch is full of compilation flags clutter, can you make it runtime?
> Thanks for the reviews, Jerin. We can make it runtime, but that would cost extra cycles and require new PMD APIs to set those runtime parameters. We chose compile-time flags to save cycles and to keep the command-line parameters from growing longer.

OK. At least remove the ifdefs from the slowpath code and keep them only in the fastpath.

>
> >+Adding @Richardson, Bruce   @Thomas Monjalon   to comment on this, I
> >+am not sure driver specific
> >+meson_options.txt is a good path?
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 0/5] DLB2 Enhancements
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
  2024-05-27 15:19   ` Jerin Jacob
@ 2024-06-19 21:01   ` Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 1/5] event/dlb2: add support for HW delayed token Abdullah Sevincer
                       ` (4 more replies)
  1 sibling, 5 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This patchset addresses enhancements in the DLB driver.

v5: Address reviews and update documentation.
v4: Fix CI issues.
v3: Fix CI issues.
v2: Fix compilation issues.
v1: Initial commit.

Abdullah Sevincer (5):
  event/dlb2: add support for HW delayed token
  event/dlb2: add support for dynamic HL entries
  event/dlb2: enhance DLB credit handling
  doc: update DLB2 documentation
  doc: update release notes for 24.07

 doc/guides/eventdevs/dlb2.rst              |  60 +++
 doc/guides/rel_notes/release_24_07.rst     |  32 ++
 drivers/event/dlb2/dlb2.c                  | 509 ++++++++++++++++++---
 drivers/event/dlb2/dlb2_iface.c            |   3 +
 drivers/event/dlb2/dlb2_iface.h            |   4 +-
 drivers/event/dlb2/dlb2_priv.h             |  18 +-
 drivers/event/dlb2/dlb2_user.h             |  24 +
 drivers/event/dlb2/meson.build             |  40 ++
 drivers/event/dlb2/pf/base/dlb2_regs.h     |   9 +
 drivers/event/dlb2/pf/base/dlb2_resource.c |  95 +++-
 drivers/event/dlb2/pf/base/dlb2_resource.h |  19 +
 drivers/event/dlb2/pf/dlb2_pf.c            |  28 +-
 drivers/event/dlb2/rte_pmd_dlb2.c          |  29 ++
 drivers/event/dlb2/rte_pmd_dlb2.h          |  41 ++
 drivers/event/dlb2/version.map             |   3 +
 meson_options.txt                          |   2 +
 16 files changed, 841 insertions(+), 75 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 1/5] event/dlb2: add support for HW delayed token
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
@ 2024-06-19 21:01     ` Abdullah Sevincer
  2024-06-20 12:01       ` Jerin Jacob
  2024-06-19 21:01     ` [PATCH v5 2/5] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

In DLB 2.5, hardware assist is available, complementing the Delayed
token POP software implementation. When it is enabled, the feature
works as follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached, so the feature is activated only if the core is
congested. If the core can handle multiple atomic flows, DLB will not
try to switch them. This is an improvement over the SW implementation,
which always switches the flows.

The feature will resume CQ scheduling when the number of pending
completions falls below a configured threshold. To emulate the older
2.0 behavior, this threshold is set to 1 by the old APIs. SW sets the
CQ to auto-pop mode for token return, as token withholding is no
longer necessary. As HW counts completions and not tokens, events
equal in number to the HL (History List) entries will be scheduled to
DLB before the feature activates and stops CQ scheduling.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c                  | 57 ++++++++++++-
 drivers/event/dlb2/dlb2_iface.c            |  3 +
 drivers/event/dlb2/dlb2_iface.h            |  4 +-
 drivers/event/dlb2/dlb2_priv.h             |  5 ++
 drivers/event/dlb2/dlb2_user.h             | 24 ++++++
 drivers/event/dlb2/pf/base/dlb2_regs.h     |  9 ++
 drivers/event/dlb2/pf/base/dlb2_resource.c | 95 +++++++++++++++++++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h | 19 +++++
 drivers/event/dlb2/pf/dlb2_pf.c            | 21 +++++
 drivers/event/dlb2/rte_pmd_dlb2.c          | 29 +++++++
 drivers/event/dlb2/rte_pmd_dlb2.h          | 40 +++++++++
 drivers/event/dlb2/version.map             |  3 +
 12 files changed, 305 insertions(+), 4 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 0b91f03956..70e4289097 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -879,8 +879,11 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2_iface_domain_reset(dlb2);
 
 	/* Free all dynamically allocated port memory */
-	for (i = 0; i < dlb2->num_ports; i++)
+	for (i = 0; i < dlb2->num_ports; i++) {
 		dlb2_free_qe_mem(&dlb2->ev_ports[i].qm_port);
+		if (!reconfig)
+			memset(&dlb2->ev_ports[i], 0, sizeof(struct dlb2_eventdev_port));
+	}
 
 	/* If reconfiguring, mark the device's queues and ports as "previously
 	 * configured." If the user doesn't reconfigure them, the PMD will
@@ -1525,7 +1528,7 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
 	struct dlb2_create_ldb_port_args cfg = { {0} };
 	int ret;
-	struct dlb2_port *qm_port = NULL;
+	struct dlb2_port *qm_port = &ev_port->qm_port;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	uint32_t qm_port_id;
 	uint16_t ldb_credit_high_watermark = 0;
@@ -1554,6 +1557,11 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	cfg.cq_depth = rte_align32pow2(dequeue_depth);
 	cfg.cq_depth_threshold = 1;
 
+	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
+		cfg.enable_inflight_ctrl = 1;
+		cfg.inflight_threshold = qm_port->inflight_threshold;
+	}
+
 	cfg.cq_history_list_size = cfg.cq_depth;
 
 	cfg.cos_id = ev_port->cos_id;
@@ -4321,6 +4329,51 @@ dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 		return dlb2_get_ldb_queue_depth(dlb2, queue);
 }
 
+int
+dlb2_set_port_params(struct dlb2_eventdev *dlb2,
+		    int port_id,
+		    uint64_t param_flags,
+		    struct rte_pmd_dlb2_port_params *params)
+{
+	struct dlb2_port *port = &dlb2->ev_ports[port_id].qm_port;
+	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
+	int ret = 0, bit = 0;
+
+	while (param_flags) {
+		uint64_t param = rte_bit_relaxed_test_and_clear64(bit++, &param_flags);
+
+		if (!param)
+			continue;
+		switch (param) {
+		case RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD:
+			if (dlb2->version == DLB2_HW_V2_5) {
+				struct dlb2_cq_inflight_ctrl_args args;
+				args.enable = true;
+				args.port_id = port->id;
+				args.threshold = params->inflight_threshold;
+
+				if (dlb2->ev_ports[port_id].setup_done)
+					ret = dlb2_iface_set_cq_inflight_ctrl(handle, &args);
+				if (ret < 0) {
+					DLB2_LOG_ERR("dlb2: can not set port parameters\n");
+					return -EINVAL;
+				}
+				port->enable_inflight_ctrl = true;
+				port->inflight_threshold = args.threshold;
+			} else {
+				DLB2_LOG_ERR("dlb2: FLOW_MIGRATION_THRESHOLD is only supported for 2.5 HW\n");
+				return -EINVAL;
+			}
+			break;
+		default:
+			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
+			return -EINVAL;
+		}
+	}
+
+	return ret;
+}
+
 static bool
 dlb2_queue_is_empty(struct dlb2_eventdev *dlb2,
 		    struct dlb2_eventdev_queue *queue)
diff --git a/drivers/event/dlb2/dlb2_iface.c b/drivers/event/dlb2/dlb2_iface.c
index 100db434d0..b829da2454 100644
--- a/drivers/event/dlb2/dlb2_iface.c
+++ b/drivers/event/dlb2/dlb2_iface.c
@@ -77,5 +77,8 @@ int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 				   struct dlb2_enable_cq_weight_args *args);
 
+int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+				       struct dlb2_cq_inflight_ctrl_args *args);
+
 int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 			     struct dlb2_set_cos_bw_args *args);
diff --git a/drivers/event/dlb2/dlb2_iface.h b/drivers/event/dlb2/dlb2_iface.h
index dc0c446ce8..55b6bdcf84 100644
--- a/drivers/event/dlb2/dlb2_iface.h
+++ b/drivers/event/dlb2/dlb2_iface.h
@@ -72,10 +72,12 @@ extern int (*dlb2_iface_get_ldb_queue_depth)(struct dlb2_hw_dev *handle,
 extern int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 				struct dlb2_get_dir_queue_depth_args *args);
 
-
 extern int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 					  struct dlb2_enable_cq_weight_args *args);
 
+extern int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+					      struct dlb2_cq_inflight_ctrl_args *args);
+
 extern int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 				    struct dlb2_set_cos_bw_args *args);
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index 2470ae0271..bd11c0facf 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -389,6 +389,8 @@ struct dlb2_port {
 	bool use_avx512;
 	uint32_t cq_weight;
 	bool is_producer; /* True if port is of type producer */
+	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
+	bool enable_inflight_ctrl; /* DLB2.5 enable HW inflight control */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -715,6 +717,9 @@ int dlb2_secondary_eventdev_probe(struct rte_eventdev *dev,
 uint32_t dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 			      struct dlb2_eventdev_queue *queue);
 
+int dlb2_set_port_params(struct dlb2_eventdev *dlb2, int port_id,
+			uint64_t flags, struct rte_pmd_dlb2_port_params *params);
+
 int dlb2_parse_params(const char *params,
 		      const char *name,
 		      struct dlb2_devargs *dlb2_args,
diff --git a/drivers/event/dlb2/dlb2_user.h b/drivers/event/dlb2/dlb2_user.h
index 8739e2a5ac..ca09c65ac4 100644
--- a/drivers/event/dlb2/dlb2_user.h
+++ b/drivers/event/dlb2/dlb2_user.h
@@ -472,6 +472,8 @@ struct dlb2_create_ldb_port_args {
 	__u16 cq_history_list_size;
 	__u8 cos_id;
 	__u8 cos_strict;
+	__u8 enable_inflight_ctrl;
+	__u16 inflight_threshold;
 };
 
 /*
@@ -717,6 +719,28 @@ struct dlb2_enable_cq_weight_args {
 	__u32 limit;
 };
 
+/*
+ * DLB2_DOMAIN_CMD_SET_CQ_INFLIGHT_CTRL: Set Per-CQ inflight control for
+ * {ATM,UNO,ORD} QEs.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - enable: True if inflight control is enabled. False otherwise
+ * - threshold: Per CQ inflight threshold.
+ *
+ * Output parameters:
+ * - response.status: Detailed error code. In certain cases, such as if the
+ *	ioctl request arg is invalid, the driver won't set status.
+ */
+struct dlb2_cq_inflight_ctrl_args {
+	/* Output parameters */
+	struct dlb2_cmd_response response;
+	/* Input parameters */
+	__u32 port_id;
+	__u16 enable;
+	__u16 threshold;
+};
+
 /*
  * Mapping sizes for memory mapping the consumer queue (CQ) memory space, and
  * producer port (PP) MMIO space.
diff --git a/drivers/event/dlb2/pf/base/dlb2_regs.h b/drivers/event/dlb2/pf/base/dlb2_regs.h
index 7167f3d2ff..b639a5b659 100644
--- a/drivers/event/dlb2/pf/base/dlb2_regs.h
+++ b/drivers/event/dlb2/pf/base/dlb2_regs.h
@@ -3238,6 +3238,15 @@
 #define DLB2_LSP_CQ_LDB_INFL_LIM_LIMIT_LOC	0
 #define DLB2_LSP_CQ_LDB_INFL_LIM_RSVD0_LOC	12
 
+#define DLB2_LSP_CQ_LDB_INFL_THRESH(x) \
+	(0x90580000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RST 0x0
+
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH	0x00000FFF
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0	0xFFFFF000
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH_LOC	0
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0_LOC	12
+
 #define DLB2_V2LSP_CQ_LDB_TKN_CNT(x) \
 	(0xa0580000 + (x) * 0x1000)
 #define DLB2_V2_5LSP_CQ_LDB_TKN_CNT(x) \
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.c b/drivers/event/dlb2/pf/base/dlb2_resource.c
index 7ce3e3531c..051d7e51c3 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.c
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.c
@@ -3062,10 +3062,14 @@ static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
 		    DLB2_CHP_LDB_CQ_DEPTH(hw->ver, port->id.phys_id),
 		    DLB2_CHP_LDB_CQ_DEPTH_RST);
 
-	if (hw->ver != DLB2_HW_V2)
+	if (hw->ver != DLB2_HW_V2) {
 		DLB2_CSR_WR(hw,
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT(port->id.phys_id),
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT_RST);
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    DLB2_LSP_CQ_LDB_INFL_THRESH_RST);
+	}
 
 	DLB2_CSR_WR(hw,
 		    DLB2_LSP_CQ_LDB_INFL_LIM(hw->ver, port->id.phys_id),
@@ -4446,6 +4450,20 @@ static int dlb2_ldb_port_configure_cq(struct dlb2_hw *hw,
 	reg = 0;
 	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(hw->ver, port->id.phys_id), reg);
 
+	if (hw->ver == DLB2_HW_V2_5) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->enable_inflight_ctrl,
+				DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+		DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+		if (args->enable_inflight_ctrl) {
+			reg = 0;
+			DLB2_BITS_SET(reg, args->inflight_threshold,
+					DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+			DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id), reg);
+		}
+	}
+
 	return 0;
 }
 
@@ -5464,6 +5482,35 @@ dlb2_get_domain_used_ldb_port(u32 id,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *
+dlb2_get_domain_ldb_port(u32 id,
+			 bool vdev_req,
+			 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_list_entry *iter __attribute__((unused));
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+	}
+
+	return NULL;
+}
+
 static void dlb2_ldb_port_change_qid_priority(struct dlb2_hw *hw,
 					      struct dlb2_ldb_port *port,
 					      int slot,
@@ -6816,3 +6863,49 @@ int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
 
 	return 0;
 }
+
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	u32 reg = 0;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	DLB2_BITS_SET(reg, args->enable,
+		      DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+	DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+	if (args->enable) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->threshold,
+			      DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    reg);
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.h b/drivers/event/dlb2/pf/base/dlb2_resource.h
index 71bd6148f1..17cc745824 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.h
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.h
@@ -1956,4 +1956,23 @@ int dlb2_hw_enable_cq_weight(struct dlb2_hw *hw,
 			     bool vdev_request,
 			     unsigned int vdev_id);
 
+/**
+ * This function configures the inflight control threshold for a cq.
+ *
+ * This must be called after creating the port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * Errors:
+ * EINVAL - The domain or port is not configured.
+ */
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id);
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 3d15250e11..249ed7ede9 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -665,6 +665,26 @@ dlb2_pf_set_cos_bandwidth(struct dlb2_hw_dev *handle,
 	return ret;
 }
 
+static int
+dlb2_pf_set_cq_inflight_ctrl(struct dlb2_hw_dev *handle,
+			     struct dlb2_cq_inflight_ctrl_args *args)
+{
+	struct dlb2_dev *dlb2_dev = (struct dlb2_dev *)handle->pf_dev;
+	struct dlb2_cmd_response response = {0};
+	int ret = 0;
+
+	DLB2_INFO(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_hw_set_cq_inflight_ctrl(&dlb2_dev->hw, handle->domain_id,
+					   args, &response, false, 0);
+	args->response = response;
+
+	DLB2_INFO(dev->dlb2_device, "Exiting %s() with ret=%d\n",
+		  __func__, ret);
+
+	return ret;
+}
+
 static void
 dlb2_pf_iface_fn_ptrs_init(void)
 {
@@ -691,6 +711,7 @@ dlb2_pf_iface_fn_ptrs_init(void)
 	dlb2_iface_get_sn_occupancy = dlb2_pf_get_sn_occupancy;
 	dlb2_iface_enable_cq_weight = dlb2_pf_enable_cq_weight;
 	dlb2_iface_set_cos_bw = dlb2_pf_set_cos_bandwidth;
+	dlb2_iface_set_cq_inflight_ctrl = dlb2_pf_set_cq_inflight_ctrl;
 }
 
 /* PCI DEV HOOKS */
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.c b/drivers/event/dlb2/rte_pmd_dlb2.c
index 43990e46ac..8a54cb5a31 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.c
+++ b/drivers/event/dlb2/rte_pmd_dlb2.c
@@ -33,7 +33,36 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 	if (port_id >= dlb2->num_ports || dlb2->ev_ports[port_id].setup_done)
 		return -EINVAL;
 
+	if (dlb2->version == DLB2_HW_V2_5 && mode == DELAYED_POP) {
+		dlb2->ev_ports[port_id].qm_port.enable_inflight_ctrl = true;
+		dlb2->ev_ports[port_id].qm_port.inflight_threshold = 1;
+		mode = AUTO_POP;
+	}
+
 	dlb2->ev_ports[port_id].qm_port.token_pop_mode = mode;
 
 	return 0;
 }
+
+int
+rte_pmd_dlb2_set_port_params(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    struct rte_pmd_dlb2_port_params *params)
+{
+	struct dlb2_eventdev *dlb2;
+	struct rte_eventdev *dev;
+
+	if (params == NULL)
+		return -EINVAL;
+
+	RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_eventdevs[dev_id];
+
+	dlb2 = dlb2_pmd_priv(dev);
+
+	if (port_id >= dlb2->num_ports)
+		return -EINVAL;
+
+	return dlb2_set_port_params(dlb2, port_id, flags, params);
+}
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 334c6c356d..ff2c684609 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -67,6 +67,46 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 				uint8_t port_id,
 				enum dlb2_token_pop_mode mode);
 
+/** Set inflight threshold for flow migration */
+#define RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
+
+/** Set port history list */
+#define RTE_PMD_DLB2_SET_PORT_HL RTE_BIT64(1)
+
+struct rte_pmd_dlb2_port_params {
+	uint16_t inflight_threshold : 12;
+};
+
+/*!
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Configure various port parameters.
+ * This function must be called before calling rte_event_port_setup()
+ * for the port, but after calling rte_event_dev_configure().
+ *
+ * @param dev_id
+ *    The identifier of the event device.
+ * @param port_id
+ *    The identifier of the event port.
+ * @param flags
+ *    Bitmask of the parameters being set.
+ * @param params
+ *    Structure containing the values of parameters being set.
+ *
+ * @return
+ * - 0: Success
+ * - EINVAL: Invalid dev_id, port_id, or mode
+ * - EINVAL: The DLB2 is not configured, is already running, or the port is
+ *   already setup
+ */
+__rte_experimental
+int
+rte_pmd_dlb2_set_port_params(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    struct rte_pmd_dlb2_port_params *params);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/event/dlb2/version.map b/drivers/event/dlb2/version.map
index 1d0a0a75d7..c72c8b988a 100644
--- a/drivers/event/dlb2/version.map
+++ b/drivers/event/dlb2/version.map
@@ -5,6 +5,9 @@ DPDK_24 {
 EXPERIMENTAL {
 	global:
 
+	# added in 24.07
+	rte_pmd_dlb2_set_port_params;
+
 	# added in 20.11
 	rte_pmd_dlb2_set_token_pop_mode;
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 2/5] event/dlb2: add support for dynamic HL entries
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 1/5] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-06-19 21:01     ` Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 3/5] event/dlb2: enhance DLB credit handling Abdullah Sevincer
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
possible HL entries per LDB port equals 2048 / 64 = 32. So, the
maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
high-performance setting.

In case all CQs are configured to have HL = 2 * CQ Depth as a
performance option, the calculation of HL at the time of domain
creation will be based on the maximum possible dequeue depth. This could
result in allocating too many HL entries to the domain, as DLB only
has a limited number of HL entries to allocate. Hence, it is best
to allow the application to specify HL entries as a command line argument
and override the default allocation. A summary of usage is listed below:

When 'use_default_hl = 1', per-port HL is set to
DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
alloc_hl_entries is ignored.

When 'use_default_hl = 0', per-LDB-port HL = 2 * CQ depth and the
default port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.

The user should calculate the needed HL entries based on the CQ depths
the application will use and specify it as the command line parameter
'alloc_hl_entries'. This will be used to allocate HL entries.
Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths) * 2.

If alloc_hl_entries is not specified, then total HL entries for the
vdev = num_ldb_ports * 64.
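
For illustration (hypothetical PCI address and values), the new devargs would be supplied like the existing ones, e.g. -a ea:00.0,use_default_hl=0,alloc_hl_entries=1024, to override the fixed per-port history list sizing.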

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c         | 130 ++++++++++++++++++++++++++++--
 drivers/event/dlb2/dlb2_priv.h    |  12 ++-
 drivers/event/dlb2/pf/dlb2_pf.c   |   7 +-
 drivers/event/dlb2/rte_pmd_dlb2.h |   1 +
 4 files changed, 138 insertions(+), 12 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 70e4289097..837c0639a3 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -180,10 +180,7 @@ dlb2_hw_query_resources(struct dlb2_eventdev *dlb2)
 	 * The capabilities (CAPs) were set at compile time.
 	 */
 
-	if (dlb2->max_cq_depth != DLB2_DEFAULT_CQ_DEPTH)
-		num_ldb_ports = DLB2_MAX_HL_ENTRIES / dlb2->max_cq_depth;
-	else
-		num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
+	num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
 
 	evdev_dlb2_default_info.max_event_queues =
 		dlb2->hw_rsrc_query_results.num_ldb_queues;
@@ -631,6 +628,52 @@ set_enable_cq_weight(const char *key __rte_unused,
 	return 0;
 }
 
+static int set_hl_override(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	bool *default_hl = opaque;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	if ((*value == 'n') || (*value == 'N') || (*value == '0'))
+		*default_hl = false;
+	else
+		*default_hl = true;
+
+	return 0;
+}
+
+static int set_hl_entries(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	int hl_entries = 0;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(&hl_entries, value);
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t)hl_entries > DLB2_MAX_HL_ENTRIES) {
+		DLB2_LOG_ERR(
+		    "alloc_hl_entries %u out of range, must be in [1 - %d]\n",
+		    hl_entries, DLB2_MAX_HL_ENTRIES);
+		return -EINVAL;
+	}
+	*(uint32_t *)opaque = hl_entries;
+
+	return 0;
+}
+
 static int
 set_qid_depth_thresh(const char *key __rte_unused,
 		     const char *value,
@@ -828,8 +871,19 @@ dlb2_hw_create_sched_domain(struct dlb2_eventdev *dlb2,
 		DLB2_NUM_ATOMIC_INFLIGHTS_PER_QUEUE *
 		cfg->num_ldb_queues;
 
-	cfg->num_hist_list_entries = resources_asked->num_ldb_ports *
-		evdev_dlb2_default_info.max_event_port_dequeue_depth;
+	/* If hl_entries is non-zero then user specified command line option.
+	 * Else compute using default_port_hl that has been set earlier based
+	 * on use_default_hl option
+	 */
+	if (dlb2->hl_entries) {
+		cfg->num_hist_list_entries = dlb2->hl_entries;
+		if (resources_asked->num_ldb_ports)
+			dlb2->default_port_hl = cfg->num_hist_list_entries /
+						resources_asked->num_ldb_ports;
+	} else {
+		cfg->num_hist_list_entries =
+		    resources_asked->num_ldb_ports * dlb2->default_port_hl;
+	}
 
 	if (device_version == DLB2_HW_V2_5) {
 		DLB2_LOG_DBG("sched domain create - ldb_qs=%d, ldb_ports=%d, dir_ports=%d, atomic_inflights=%d, hist_list_entries=%d, credits=%d\n",
@@ -1041,7 +1095,7 @@ dlb2_eventdev_port_default_conf_get(struct rte_eventdev *dev,
 	struct dlb2_eventdev *dlb2 = dlb2_pmd_priv(dev);
 
 	port_conf->new_event_threshold = dlb2->new_event_limit;
-	port_conf->dequeue_depth = 32;
+	port_conf->dequeue_depth = dlb2->default_port_hl / 2;
 	port_conf->enqueue_depth = DLB2_MAX_ENQUEUE_DEPTH;
 	port_conf->event_port_cfg = 0;
 }
@@ -1560,9 +1614,18 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
 		cfg.enable_inflight_ctrl = 1;
 		cfg.inflight_threshold = qm_port->inflight_threshold;
+		if (!qm_port->hist_list)
+			qm_port->hist_list = cfg.cq_depth;
 	}
 
-	cfg.cq_history_list_size = cfg.cq_depth;
+	if (qm_port->hist_list)
+		cfg.cq_history_list_size = qm_port->hist_list;
+	else if (cfg.enable_inflight_ctrl)
+		cfg.cq_history_list_size = RTE_MIN(cfg.cq_depth, dlb2->default_port_hl);
+	else if (dlb2->default_port_hl == DLB2_FIXED_CQ_HL_SIZE)
+		cfg.cq_history_list_size = DLB2_FIXED_CQ_HL_SIZE;
+	else
+		cfg.cq_history_list_size = cfg.cq_depth * 2;
 
 	cfg.cos_id = ev_port->cos_id;
 	cfg.cos_strict = 0; /* best effort */
@@ -4365,6 +4428,13 @@ dlb2_set_port_params(struct dlb2_eventdev *dlb2,
 				return -EINVAL;
 			}
 			break;
+		case RTE_PMD_DLB2_SET_PORT_HL:
+			if (dlb2->ev_ports[port_id].setup_done) {
+				DLB2_LOG_ERR("DLB2_SET_PORT_HL must be called before setting up port\n");
+				return -EINVAL;
+			}
+			port->hist_list = params->port_hl;
+			break;
 		default:
 			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
 			return -EINVAL;
@@ -4683,6 +4753,28 @@ dlb2_primary_eventdev_probe(struct rte_eventdev *dev,
 		return err;
 	}
 
+	if (dlb2_args->use_default_hl) {
+		dlb2->default_port_hl = DLB2_FIXED_CQ_HL_SIZE;
+		if (dlb2_args->alloc_hl_entries)
+			DLB2_LOG_ERR(": Ignoring 'alloc_hl_entries' and using "
+				     "default history list sizes for eventdev:"
+				     " %s\n", dev->data->name);
+		dlb2->hl_entries = 0;
+	} else {
+		dlb2->default_port_hl = 2 * DLB2_FIXED_CQ_HL_SIZE;
+
+		if (dlb2_args->alloc_hl_entries >
+		    dlb2->hw_rsrc_query_results.num_hist_list_entries) {
+			DLB2_LOG_ERR(": Insufficient HL entries asked=%d "
+				     "available=%d for eventdev: %s\n",
+				     dlb2->hl_entries,
+				     dlb2->hw_rsrc_query_results.num_hist_list_entries,
+				     dev->data->name);
+			return -EINVAL;
+		}
+		dlb2->hl_entries = dlb2_args->alloc_hl_entries;
+	}
+
 	dlb2_iface_hardware_init(&dlb2->qm_instance);
 
 	/* configure class of service */
@@ -4790,6 +4882,8 @@ dlb2_parse_params(const char *params,
 					     DLB2_PRODUCER_COREMASK,
 					     DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG,
 					     DLB2_ENABLE_CQ_WEIGHT_ARG,
+					     DLB2_USE_DEFAULT_HL,
+					     DLB2_ALLOC_HL_ENTRIES,
 					     NULL };
 
 	if (params != NULL && params[0] != '\0') {
@@ -4993,6 +5087,26 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
+			ret = rte_kvargs_process(kvlist, DLB2_USE_DEFAULT_HL,
+						 set_hl_override,
+						 &dlb2_args->use_default_hl);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing use_default_hl arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
+			ret = rte_kvargs_process(kvlist, DLB2_ALLOC_HL_ENTRIES,
+						 set_hl_entries,
+						 &dlb2_args->alloc_hl_entries);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing alloc_hl_entries arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
 			rte_kvargs_free(kvlist);
 		}
 	}
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index bd11c0facf..e7ed27251e 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -52,6 +52,8 @@
 #define DLB2_PRODUCER_COREMASK "producer_coremask"
 #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
 #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
+#define DLB2_USE_DEFAULT_HL "use_default_hl"
+#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"
 
 /* Begin HW related defines and structs */
 
@@ -101,7 +103,8 @@
  */
 #define DLB2_MAX_HL_ENTRIES 2048
 #define DLB2_MIN_CQ_DEPTH 1
-#define DLB2_DEFAULT_CQ_DEPTH 32
+#define DLB2_DEFAULT_CQ_DEPTH 128  /* Can be overridden using max_cq_depth command line parameter */
+#define DLB2_FIXED_CQ_HL_SIZE 32  /* Used when ENABLE_FIXED_HL_SIZE is true */
 #define DLB2_MIN_HARDWARE_CQ_DEPTH 8
 #define DLB2_NUM_HIST_LIST_ENTRIES_PER_LDB_PORT \
 	DLB2_DEFAULT_CQ_DEPTH
@@ -123,7 +126,7 @@
 
 #define DLB2_NUM_QES_PER_CACHE_LINE 4
 
-#define DLB2_MAX_ENQUEUE_DEPTH 32
+#define DLB2_MAX_ENQUEUE_DEPTH 128
 #define DLB2_MIN_ENQUEUE_DEPTH 4
 
 #define DLB2_NAME_SIZE 64
@@ -391,6 +394,7 @@ struct dlb2_port {
 	bool is_producer; /* True if port is of type producer */
 	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
 	bool enable_inflight_ctrl; /* DLB2.5 enable HW inflight control */
+	uint16_t hist_list; /* Port history list */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -637,6 +641,8 @@ struct dlb2_eventdev {
 	uint32_t cos_bw[DLB2_COS_NUM_VALS]; /* bandwidth per cos domain */
 	uint8_t max_cos_port; /* Max LDB port from any cos */
 	bool enable_cq_weight;
+	uint16_t hl_entries; /* Num HL entries to allocate for the domain */
+	int default_port_hl;  /* Fixed or dynamic (2*CQ Depth) HL assignment */
 };
 
 /* used for collecting and passing around the dev args */
@@ -675,6 +681,8 @@ struct dlb2_devargs {
 	const char *producer_coremask;
 	bool default_ldb_port_allocation;
 	bool enable_cq_weight;
+	bool use_default_hl;
+	uint32_t alloc_hl_entries;
 };
 
 /* End Eventdev related defines and structs */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 249ed7ede9..137bdfd656 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -422,6 +422,8 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 				      cfg,
 				      cq_base,
 				      &response);
+
+	cfg->response = response;
 	if (ret)
 		goto create_port_err;
 
@@ -437,7 +439,6 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 
 	dlb2_list_init_head(&port_memory.list);
 
-	cfg->response = response;
 
 	return 0;
 
@@ -731,7 +732,9 @@ dlb2_eventdev_pci_init(struct rte_eventdev *eventdev)
 		.hw_credit_quanta = DLB2_SW_CREDIT_BATCH_SZ,
 		.default_depth_thresh = DLB2_DEPTH_THRESH_DEFAULT,
 		.max_cq_depth = DLB2_DEFAULT_CQ_DEPTH,
-		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH
+		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH,
+		.use_default_hl = true,
+		.alloc_hl_entries = 0
 	};
 	struct dlb2_eventdev *dlb2;
 	int q;
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index ff2c684609..01aab8dc9a 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -75,6 +75,7 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 
 struct rte_pmd_dlb2_port_params {
 	uint16_t inflight_threshold : 12;
+	uint16_t port_hl;
 };
 
 /*!
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 3/5] event/dlb2: enhance DLB credit handling
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 1/5] event/dlb2: add support for HW delayed token Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 2/5] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-06-19 21:01     ` Abdullah Sevincer
  2024-06-20 12:09       ` Jerin Jacob
  2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 4/5] doc: update DLB2 documentation Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 5/5] doc: update release notes for 24.07 Abdullah Sevincer
  4 siblings, 2 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This commit improves DLB credit handling scenarios when
ports hold on to credits but can't release them due to insufficient
accumulation (less than 2 * credit quanta).

Worker ports now release all accumulated credits when the back-to-back
zero-poll count reaches a preset threshold.

Producer ports release all accumulated credits if enqueue fails for a
consecutive number of retries.

All newly introduced compilation flags are in the fastpath.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c      | 322 +++++++++++++++++++++++++++------
 drivers/event/dlb2/dlb2_priv.h |   1 +
 drivers/event/dlb2/meson.build |  40 ++++
 meson_options.txt              |   2 +
 4 files changed, 306 insertions(+), 59 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 837c0639a3..e20c0173d0 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -43,7 +43,50 @@
  * to DLB can go ahead of relevant application writes like updates to buffers
  * being sent with event
  */
+#ifndef DLB2_BYPASS_FENCE_ON_PP
 #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
+#endif
+/*
+ * HW credit checks can only be turned off for a DLB2 device if the
+ * following is true for each created eventdev:
+ * LDB credits <= DIR credits + minimum CQ Depth
+ * (CQ Depth is the minimum across all ports configured within the eventdev)
+ * This needs to be true for all eventdevs created on any DLB2 device
+ * managed by this driver.
+ * DLB2.5 does not have any such restriction as it has a single credit pool
+ */
+#ifndef DLB_HW_CREDITS_CHECKS
+#define DLB_HW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * SW credit checks can only be turned off if the application has a way to
+ * limit input events to the eventdev below the assigned credit limit
+ */
+#ifndef DLB_SW_CREDITS_CHECKS
+#define DLB_SW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * Once the application is fully validated, the type check can be turned off.
+ * HW will continue checking for the correct type and generate an alarm on mismatch
+ */
+#ifndef DLB_TYPE_CHECK
+#define DLB_TYPE_CHECK 1
+#endif
+#define DLB_TYPE_MACRO 0x010002
+
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive zero dequeues
+ */
+#define DLB2_ZERO_DEQ_CREDIT_RETURN_THRES 16384
+
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive enqueue failures
+ */
+#define DLB2_ENQ_FAIL_CREDIT_RETURN_THRES 100
 
 /*
  * Resources exposed to eventdev. Some values overridden at runtime using
@@ -366,6 +409,33 @@ set_max_num_events(const char *key __rte_unused,
 	return 0;
 }
 
+static int
+set_max_num_events_v2_5(const char *key __rte_unused,
+			const char *value,
+			void *opaque)
+{
+	int *max_num_events = opaque;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(max_num_events, value);
+	if (ret < 0)
+		return ret;
+
+	if (*max_num_events < 0 || *max_num_events >
+			DLB2_MAX_NUM_CREDITS(DLB2_HW_V2_5)) {
+		DLB2_LOG_ERR("dlb2: max_num_events must be between 0 and %d\n",
+			     DLB2_MAX_NUM_CREDITS(DLB2_HW_V2_5));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int
 set_num_dir_credits(const char *key __rte_unused,
 		    const char *value,
@@ -966,6 +1036,15 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2->num_queues = 0;
 	dlb2->num_ldb_queues = 0;
 	dlb2->num_dir_queues = 0;
+	if (dlb2->version == DLB2_HW_V2_5) {
+		dlb2->num_credits = 0;
+		dlb2->max_credits = 0;
+	} else {
+		dlb2->num_ldb_credits = 0;
+		dlb2->num_dir_credits = 0;
+		dlb2->max_ldb_credits = 0;
+		dlb2->max_dir_credits = 0;
+	}
 	dlb2->configured = false;
 }
 
@@ -1074,11 +1153,14 @@ dlb2_eventdev_configure(const struct rte_eventdev *dev)
 	if (dlb2->version == DLB2_HW_V2_5) {
 		dlb2->credit_pool = rsrcs->num_credits;
 		dlb2->max_credits = rsrcs->num_credits;
+		dlb2->num_credits = rsrcs->num_credits;
 	} else {
 		dlb2->ldb_credit_pool = rsrcs->num_ldb_credits;
 		dlb2->max_ldb_credits = rsrcs->num_ldb_credits;
+		dlb2->num_ldb_credits = rsrcs->num_ldb_credits;
 		dlb2->dir_credit_pool = rsrcs->num_dir_credits;
 		dlb2->max_dir_credits = rsrcs->num_dir_credits;
+		dlb2->num_dir_credits = rsrcs->num_dir_credits;
 	}
 
 	dlb2->configured = true;
@@ -1679,6 +1761,12 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 
 	qm_port->id = qm_port_id;
 
+	if (dlb2->version == DLB2_HW_V2) {
+		qm_port->cached_ldb_credits = 0;
+		qm_port->cached_dir_credits = 0;
+	} else
+		qm_port->cached_credits = 0;
+
 	if (dlb2->version == DLB2_HW_V2_5 && (dlb2->enable_cq_weight == true)) {
 		struct dlb2_enable_cq_weight_args cq_weight_args = { {0} };
 		cq_weight_args.port_id = qm_port->id;
@@ -2047,19 +2135,8 @@ dlb2_eventdev_port_setup(struct rte_eventdev *dev,
 	ev_port->credit_update_quanta = sw_credit_quanta;
 	ev_port->qm_port.hw_credit_quanta = hw_credit_quanta;
 
-	/*
-	 * Validate credit config before creating port
-	 */
 
-	if (port_conf->enqueue_depth > sw_credit_quanta ||
-	    port_conf->enqueue_depth > hw_credit_quanta) {
-		DLB2_LOG_ERR("Invalid port config. Enqueue depth %d must be <= credit quanta %d and batch size %d\n",
-			     port_conf->enqueue_depth,
-			     sw_credit_quanta,
-			     hw_credit_quanta);
-		return -EINVAL;
-	}
-	ev_port->enq_retries = port_conf->enqueue_depth / sw_credit_quanta;
+	ev_port->enq_retries = port_conf->enqueue_depth;
 
 	/* Save off port config for reconfig */
 	ev_port->conf = *port_conf;
@@ -2494,6 +2571,61 @@ dlb2_event_queue_detach_ldb(struct dlb2_eventdev *dlb2,
 	return ret;
 }
 
+static inline void
+dlb2_port_credits_return(struct dlb2_port *qm_port)
+{
+	/* Return all port credits */
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		if (qm_port->cached_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_COMBINED_POOL],
+					   qm_port->cached_credits, rte_memory_order_seq_cst);
+			qm_port->cached_credits = 0;
+		}
+	} else {
+		if (qm_port->cached_ldb_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   qm_port->cached_ldb_credits, rte_memory_order_seq_cst);
+			qm_port->cached_ldb_credits = 0;
+		}
+		if (qm_port->cached_dir_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   qm_port->cached_dir_credits, rte_memory_order_seq_cst);
+			qm_port->cached_dir_credits = 0;
+		}
+	}
+}
+
+static inline void
+dlb2_release_sw_credits(struct dlb2_eventdev *dlb2,
+			struct dlb2_eventdev_port *ev_port, uint16_t val)
+{
+	if (ev_port->inflight_credits) {
+		rte_atomic_fetch_sub_explicit(&dlb2->inflights, val, rte_memory_order_seq_cst);
+		ev_port->inflight_credits -= val;
+	}
+}
+
+static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
+					  bool cond, uint32_t threshold)
+{
+#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS
+	if (cond) {
+		if (++ev_port->credit_return_count > threshold) {
+#if DLB_SW_CREDITS_CHECKS
+			dlb2_release_sw_credits(ev_port->dlb2, ev_port,
+						ev_port->inflight_credits);
+#endif
+#if DLB_HW_CREDITS_CHECKS
+			dlb2_port_credits_return(&ev_port->qm_port);
+#endif
+			ev_port->credit_return_count = 0;
+		}
+	} else {
+		ev_port->credit_return_count = 0;
+	}
+#endif
+}
+
 static int
 dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 			  uint8_t queues[], uint16_t nb_unlinks)
@@ -2513,14 +2645,15 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 
 	if (queues == NULL || nb_unlinks == 0) {
 		DLB2_LOG_DBG("dlb2: queues is NULL or nb_unlinks is 0\n");
-		return 0; /* Ignore and return success */
+		nb_unlinks = 0; /* Ignore and return success */
+		goto ret_credits;
 	}
 
 	if (ev_port->qm_port.is_directed) {
 		DLB2_LOG_DBG("dlb2: ignore unlink from dir port %d\n",
 			     ev_port->id);
 		rte_errno = 0;
-		return nb_unlinks; /* as if success */
+		goto ret_credits;
 	}
 
 	dlb2 = ev_port->dlb2;
@@ -2559,6 +2692,10 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 		ev_queue->num_links--;
 	}
 
+ret_credits:
+	if (ev_port->inflight_credits)
+		dlb2_check_and_return_credits(ev_port, true, 0);
+
 	return nb_unlinks;
 }
 
@@ -2758,8 +2895,7 @@ dlb2_replenish_sw_credits(struct dlb2_eventdev *dlb2,
 		/* Replenish credits, saving one quanta for enqueues */
 		uint16_t val = ev_port->inflight_credits - quanta;
 
-		rte_atomic_fetch_sub_explicit(&dlb2->inflights, val, rte_memory_order_seq_cst);
-		ev_port->inflight_credits -= val;
+		dlb2_release_sw_credits(dlb2, ev_port, val);
 	}
 }
 
@@ -2789,10 +2925,15 @@ dlb2_check_enqueue_sw_credits(struct dlb2_eventdev *dlb2,
 			rte_errno = -ENOSPC;
 			return 1;
 		}
-
-		rte_atomic_fetch_add_explicit(&dlb2->inflights, credit_update_quanta,
-				   rte_memory_order_seq_cst);
-		ev_port->inflight_credits += (credit_update_quanta);
+		/* Application will retry if this attempt fails due to contention */
+		if (rte_atomic_compare_exchange_strong_explicit(&dlb2->inflights, &sw_inflights,
+					(sw_inflights+credit_update_quanta),
+					rte_memory_order_seq_cst, rte_memory_order_seq_cst))
+			ev_port->inflight_credits += (credit_update_quanta);
+		else {
+			rte_errno = -ENOSPC;
+			return 1;
+		}
 
 		if (ev_port->inflight_credits < num) {
 			DLB2_INC_STAT(
@@ -2930,7 +3071,9 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 {
 	struct dlb2_eventdev *dlb2 = ev_port->dlb2;
 	struct dlb2_eventdev_queue *ev_queue;
+#if DLB_HW_CREDITS_CHECKS
 	uint16_t *cached_credits = NULL;
+#endif
 	struct dlb2_queue *qm_queue;
 
 	ev_queue = &dlb2->ev_queues[ev->queue_id];
@@ -2942,6 +3085,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		goto op_check;
 
 	if (!qm_queue->is_directed) {
+#if DLB_HW_CREDITS_CHECKS
 		/* Load balanced destination queue */
 
 		if (dlb2->version == DLB2_HW_V2) {
@@ -2985,9 +3129,20 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			rte_errno = -EINVAL;
 			return 1;
 		}
+#else
+#if (RTE_SCHED_TYPE_PARALLEL != 2) || (RTE_SCHED_TYPE_ATOMIC != 1)
+#error "ERROR: RTE event schedule type values changed. Needs a code change"
+#endif
+		/* Map RTE eventdev schedule type to DLB HW schedule type */
+		if (qm_queue->sched_type != RTE_SCHED_TYPE_ORDERED)
+			/* RTE-Parallel -> DLB-UnOrd 2->1, RTE-Atm -> DLB-Atm 1->0 */
+			*sched_type = ev->sched_type - 1;
+		else /* To support CFG_ALL_TYPEs */
+			*sched_type = DLB2_SCHED_ORDERED; /* RTE-Ord -> DLB-Ord 0->2 */
+#endif
 	} else {
 		/* Directed destination queue */
-
+#if DLB_HW_CREDITS_CHECKS
 		if (dlb2->version == DLB2_HW_V2) {
 			if (dlb2_check_enqueue_hw_dir_credits(qm_port)) {
 				rte_errno = -ENOSPC;
@@ -3001,6 +3156,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_DIRECTED\n");
 
 		*sched_type = DLB2_SCHED_DIRECTED;
@@ -3009,13 +3165,17 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 op_check:
 	switch (ev->op) {
 	case RTE_EVENT_OP_NEW:
+#if DLB_SW_CREDITS_CHECKS
 		/* Check that a sw credit is available */
 		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
 			rte_errno = -ENOSPC;
 			return 1;
 		}
 		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_FORWARD:
 		/* Check for outstanding_releases underflow. If this occurs,
@@ -3026,10 +3186,14 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_RELEASE:
+#if DLB_SW_CREDITS_CHECKS
 		ev_port->inflight_credits++;
+#endif
 		/* Check for outstanding_releases underflow. If this occurs,
 		 * the application is not using the EVENT_OPs correctly; for
 		 * example, forwarding or releasing events that were not
@@ -3038,9 +3202,28 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
-
+#if DLB_SW_CREDITS_CHECKS
 		/* Replenish s/w credits if enough are cached */
 		dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
+		break;
+	/* Fragments not supported in the API, but left here for
+	 * possible future use.
+	 */
+#if DLB_SW_CREDITS_CHECKS
+		/* Check that a sw credit is available */
+		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
+			rte_errno = -ENOSPC;
+			return 1;
+		}
+#endif
+
+#if DLB_SW_CREDITS_CHECKS
+		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
+		(*cached_credits)--;
+#endif
 		break;
 	}
 
@@ -3151,6 +3334,8 @@ __dlb2_event_enqueue_burst(void *event_port,
 			break;
 	}
 
+	dlb2_check_and_return_credits(ev_port, !i, DLB2_ENQ_FAIL_CREDIT_RETURN_THRES);
+
 	return i;
 }
 
@@ -3289,53 +3474,45 @@ dlb2_event_release(struct dlb2_eventdev *dlb2,
 		return;
 	}
 	ev_port->outstanding_releases -= i;
+#if DLB_SW_CREDITS_CHECKS
 	ev_port->inflight_credits += i;
 
 	/* Replenish s/w credits if enough releases are performed */
 	dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 }
 
 static inline void
 dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 {
 	uint32_t batch_size = qm_port->hw_credit_quanta;
+	int val;
 
 	/* increment port credits, and return to pool if exceeds threshold */
-	if (!qm_port->is_directed) {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_ldb_credits += num;
-			if (qm_port->cached_ldb_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-					qm_port->credit_pool[DLB2_LDB_QUEUE],
-					batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_ldb_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_credits -= batch_size;
-			}
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		qm_port->cached_credits += num;
+		if (qm_port->cached_credits >= 2 * batch_size) {
+			val = qm_port->cached_credits - batch_size;
+			rte_atomic_fetch_add_explicit(
+			    qm_port->credit_pool[DLB2_COMBINED_POOL], val,
+			    rte_memory_order_seq_cst);
+			qm_port->cached_credits -= val;
+		}
+	} else if (!qm_port->is_directed) {
+		qm_port->cached_ldb_credits += num;
+		if (qm_port->cached_ldb_credits >= 2 * batch_size) {
+			val = qm_port->cached_ldb_credits - batch_size;
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   val, rte_memory_order_seq_cst);
+			qm_port->cached_ldb_credits -= val;
 		}
 	} else {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_dir_credits += num;
-			if (qm_port->cached_dir_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-					qm_port->credit_pool[DLB2_DIR_QUEUE],
-					batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_dir_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_credits -= batch_size;
-			}
+		qm_port->cached_dir_credits += num;
+		if (qm_port->cached_dir_credits >= 2 * batch_size) {
+			val = qm_port->cached_dir_credits - batch_size;
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   val, rte_memory_order_seq_cst);
+			qm_port->cached_dir_credits -= val;
 		}
 	}
 }
@@ -3366,6 +3543,16 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 
 	/* Wait/poll time expired */
 	if (elapsed_ticks >= timeout) {
+
+		/* Return all credits before blocking if the remaining credits in
+		 * the system are less than the quanta.
+		 */
+		uint32_t sw_inflights = rte_atomic_load_explicit(&dlb2->inflights,
+				rte_memory_order_seq_cst);
+		uint32_t quanta = ev_port->credit_update_quanta;
+
+		if (dlb2->new_event_limit - sw_inflights < quanta)
+			dlb2_check_and_return_credits(ev_port, true, 0);
 		return 1;
 	} else if (dlb2->umwait_allowed) {
 		struct rte_power_monitor_cond pmc;
@@ -4101,7 +4288,9 @@ dlb2_hw_dequeue_sparse(struct dlb2_eventdev *dlb2,
 
 		ev_port->outstanding_releases += num;
 
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4228,8 +4417,9 @@ dlb2_hw_dequeue(struct dlb2_eventdev *dlb2,
 			dlb2_consume_qe_immediate(qm_port, num);
 
 		ev_port->outstanding_releases += num;
-
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4263,6 +4453,9 @@ dlb2_event_dequeue_burst(void *event_port, struct rte_event *ev, uint16_t num,
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
 
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
+
 	return cnt;
 }
 
@@ -4299,6 +4492,9 @@ dlb2_event_dequeue_burst_sparse(void *event_port, struct rte_event *ev,
 
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
+
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
 	return cnt;
 }
 
@@ -4903,9 +5099,17 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
-			ret = rte_kvargs_process(kvlist, DLB2_MAX_NUM_EVENTS,
-						 set_max_num_events,
-						 &dlb2_args->max_num_events);
+			if (version == DLB2_HW_V2) {
+				ret = rte_kvargs_process(kvlist,
+						DLB2_MAX_NUM_EVENTS,
+						set_max_num_events,
+						&dlb2_args->max_num_events);
+			} else {
+				ret = rte_kvargs_process(kvlist,
+						DLB2_MAX_NUM_EVENTS,
+						set_max_num_events_v2_5,
+						&dlb2_args->max_num_events);
+			}
 			if (ret != 0) {
 				DLB2_LOG_ERR("%s: Error parsing max_num_events parameter",
 					     name);
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index e7ed27251e..47f76f938f 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
 	struct rte_event_port_conf conf; /* user-supplied configuration */
 	uint16_t inflight_credits; /* num credits this port has right now */
 	uint16_t credit_update_quanta;
+	uint32_t credit_return_count; /* count until the credit return condition is met */
 	struct dlb2_eventdev *dlb2; /* backlink optimization */
 	alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
 	struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
index 515d1795fe..7760f78f87 100644
--- a/drivers/event/dlb2/meson.build
+++ b/drivers/event/dlb2/meson.build
@@ -68,3 +68,43 @@ endif
 headers = files('rte_pmd_dlb2.h')
 
 deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
+
+dlb_pmd_opts = ['bypass_fence', 'hw_credits_checks', 'sw_credits_checks', 'type_check']
+dlb_pmd_defines = ['DLB2_BYPASS_FENCE_ON_PP', 'DLB_HW_CREDITS_CHECKS', 'DLB_SW_CREDITS_CHECKS', 'DLB_TYPE_CHECK']
+dlb_pmd_default = ['0','1','1','1']
+dlb_pmd_args = []
+
+#DLB PMD arguments can be provided as -Ddlb_pmd_args=option1:value1,option2:value2.. in meson command line options.
+arg_str=get_option('dlb_pmd_args').strip()
+if arg_str != ''
+	dlb_pmd_args = arg_str.split(',')
+	foreach arg: dlb_pmd_args
+		opt_args = arg.split(':')
+		if opt_args[0] not in dlb_pmd_opts
+			err_str = 'Unsupported DLB PMD option ' + opt_args[0]
+			err_str += ' Valid options are: bypass_fence, hw_credits_checks, sw_credits_checks, type_check'
+			error(err_str)
+		endif
+	endforeach
+endif
+
+index = 0
+foreach opt: dlb_pmd_opts
+	val = dlb_pmd_default[index]
+	foreach arg: dlb_pmd_args
+		opt_args = arg.split(':')
+		if opt == opt_args[0]
+			if opt_args[1] == 'enable' or opt_args[1] == '1'
+				val = '1'
+			elif opt_args[1] == 'disable' or opt_args[1] == '0'
+				val = '0'
+			else
+				err_str = 'Invalid DLB pmd option value: ' + arg + ' Valid values=enable/1/disable/0'
+				error(err_str)
+			endif
+			break
+		endif
+	endforeach
+	cflags += '-D' + dlb_pmd_defines[index] + '=' + val
+	index = index + 1
+endforeach
diff --git a/meson_options.txt b/meson_options.txt
index e49b2fc089..78b1e849f0 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -12,6 +12,8 @@ option('disable_drivers', type: 'string', value: '', description:
        'Comma-separated list of drivers to explicitly disable.')
 option('disable_libs', type: 'string', value: '', description:
        'Comma-separated list of optional libraries to explicitly disable. [NOTE: mandatory libs cannot be disabled]')
+option('dlb_pmd_args', type: 'string', value: '', description:
+       'Comma-separated list of DLB PMD arguments in option:value format')
 option('drivers_install_subdir', type: 'string', value: 'dpdk/pmds-<VERSION>', description:
        'Subdirectory of libdir where to install PMDs. Defaults to using a versioned subdirectory.')
 option('enable_docs', type: 'boolean', value: false, description:
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread
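
For orientation, below is a simplified model of the credit-return
heuristic behind the dlb2_check_and_return_credits() calls added in the
patch above: every enqueue burst that moves no events (!i) or dequeue
burst that returns no events (!cnt) counts against a threshold, after
which all locally cached credits are flushed back to the shared pool.
The real helper lives in a part of the patch not quoted here; the
struct and function names in this sketch are illustrative stand-ins,
not the driver's actual code.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the per-port state; the real driver keeps
 * credit_return_count in struct dlb2_eventdev_port (see the diff above).
 */
struct port_credits {
	uint32_t credit_return_count;  /* consecutive no-progress bursts */
	uint32_t cached_credits;       /* credits cached locally by the port */
	_Atomic uint32_t *credit_pool; /* shared pool to return credits to */
};

/* Called once per burst: 'no_progress' is true when the burst moved no
 * events, and 'threshold' corresponds to DLB2_ENQ_FAIL_CREDIT_RETURN_THRES
 * or DLB2_ZERO_DEQ_CREDIT_RETURN_THRES at the call sites above (a
 * threshold of 0 forces an immediate return, as in the dequeue-wait path).
 */
static inline void
check_and_return_credits(struct port_credits *p, bool no_progress,
			 uint32_t threshold)
{
	if (!no_progress) {
		p->credit_return_count = 0; /* forward progress: reset */
		return;
	}

	if (++p->credit_return_count >= threshold) {
		/* Flush all cached credits back to the shared pool so
		 * starved ports can make progress (avoids the
		 * "less than 2 * credit quanta" deadlock).
		 */
		atomic_fetch_add(p->credit_pool, p->cached_credits);
		p->cached_credits = 0;
		p->credit_return_count = 0;
	}
}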

* [PATCH v5 4/5] doc: update DLB2 documentation
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
                       ` (2 preceding siblings ...)
  2024-06-19 21:01     ` [PATCH v5 3/5] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-06-19 21:01     ` Abdullah Sevincer
  2024-06-19 21:01     ` [PATCH v5 5/5] doc: update release notes for 24.07 Abdullah Sevincer
  4 siblings, 0 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This commit updates the dlb2.rst eventdev guide to document
the new features: HW delayed token support, dynamic HL entries
and improved DLB credit handling.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 doc/guides/eventdevs/dlb2.rst | 60 +++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
index 2532d92888..dfa4e1d903 100644
--- a/doc/guides/eventdevs/dlb2.rst
+++ b/doc/guides/eventdevs/dlb2.rst
@@ -456,6 +456,66 @@ Example command to enable QE Weight feature:
 
        --allow ea:00.0,enable_cq_weight=<y/Y>
 
+Dynamic History List Entries
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
+possible HL entries per LDB port equals 2048 / 64 = 32. So, the
+maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
+high-performance setting.
+
+In case all CQs are configured to have HL = 2 * CQ depth as a
+performance option, then the calculation of HL at the time of domain
+creation will be based on the maximum possible dequeue depth. This could
+result in allocating too many HL entries to the domain, as DLB only
+has a limited number of HL entries to be allocated. Hence, it is best
+to allow the application to specify HL entries as a command line argument
+and override the default allocation. A summary of usage is listed below:
+
+When 'use_default_hl = 1', the per-port HL is set to
+DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
+alloc_hl_entries is ignored.
+
+When 'use_default_hl = 0', each LDB port's HL = 2 * CQ depth and the
+default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
+
+Users should calculate the needed HL entries based on the CQ depths the
+application will use and specify them as the command line parameter
+'alloc_hl_entries'. This will be used to allocate HL entries.
+Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).
+
+If alloc_hl_entries is not specified, then the total HL entries for the
+vdev = num_ldb_ports * 64.
+
+Example command to use dynamic history list entries feature:
+
+    .. code-block:: console
+
+       --allow ea:00.0,use_default_hl=0,alloc_hl_entries=1024
+
+Credit Handling Scenario Improvements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When ports hold on to credits but cannot release them due to insufficient
+accumulation (less than 2 * credit quanta), deadlocks may occur. Worker
+ports now release all accumulated credits when the back-to-back zero-poll
+count reaches a preset threshold, and producer ports release all
+accumulated credits if enqueue fails for a consecutive number of retries.
+
+New meson options are provided for handling credits. Valid options
+are ``bypass_fence``, ``hw_credits_checks``, ``sw_credits_checks`` and
+``type_check``. These options need to be provided to meson in
+comma-separated form.
+
+By default, ``bypass_fence`` is disabled and all the other options are
+enabled.
+
+Example meson command to configure the credit handling options:
+
+    .. code-block:: console
+
+       meson configure -Ddlb_pmd_args=bypass_fence:0,hw_credits_checks:1
+
 Running Eventdev Applications with DLB Device
 ---------------------------------------------
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread
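
As a worked example of the sizing rule documented above, here is a small
helper (hypothetical, not part of the driver) that computes the value to
pass as the 'alloc_hl_entries' devarg when 'use_default_hl = 0':

#include <stdint.h>

#define MAX_HL_ENTRIES 2048 /* device-wide HL budget (DLB2_MAX_HL_ENTRIES) */

/* alloc_hl_entries = sum of all LDB ports' CQ depths * 2, capped at the
 * device total, per the guide text above.
 */
static uint32_t
compute_alloc_hl_entries(const uint32_t *cq_depths, unsigned int num_ldb_ports)
{
	uint32_t total = 0;
	unsigned int i;

	for (i = 0; i < num_ldb_ports; i++)
		total += 2 * cq_depths[i];

	return total <= MAX_HL_ENTRIES ? total : MAX_HL_ENTRIES;
}

For instance, 32 LDB ports each with a CQ depth of 16 yield
32 * 16 * 2 = 1024, which matches the alloc_hl_entries=1024 value in the
example command above.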

* [PATCH v5 5/5] doc: update release notes for 24.07
  2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
                       ` (3 preceding siblings ...)
  2024-06-19 21:01     ` [PATCH v5 4/5] doc: update DLB2 documentation Abdullah Sevincer
@ 2024-06-19 21:01     ` Abdullah Sevincer
  2024-06-20  7:02       ` David Marchand
  4 siblings, 1 reply; 28+ messages in thread
From: Abdullah Sevincer @ 2024-06-19 21:01 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

Update release notes for new DLB features.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 doc/guides/rel_notes/release_24_07.rst | 32 ++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index 7c88de381b..b4eb819503 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -144,6 +144,38 @@ New Features
 
   Added an API that allows the user to reclaim the defer queue with RCU.
 
+* **Added API to support HW delayed token feature for DLB 2.5 device.**
+
+  * Added API ``rte_pmd_dlb2_set_port_params`` to support the delayed token
+    feature for the DLB 2.5 device. The feature will resume CQ scheduling
+    when the number of pending completions falls below a configured
+    threshold.
+
+* **Introduced dynamic HL (History List) feature for DLB device.**
+
+  * Users can configure history list entries dynamically by passing
+    parameters ``use_default_hl`` and ``alloc_hl_entries``.
+
+  * When 'use_default_hl = 1', the per-port HL is set to
+    DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
+    alloc_hl_entries is ignored.
+
+  * When 'use_default_hl = 0', each LDB port's HL = 2 * CQ depth and the
+    default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
+
+* **DLB credit handling scenario improvements.**
+
+  * When ports hold on to credits but cannot release them due to insufficient
+    accumulation (less than 2 * credit quanta), deadlocks may occur.
+    Worker ports now release all accumulated credits when the back-to-back
+    zero-poll count reaches a preset threshold, and producer ports release
+    all accumulated credits if enqueue fails for a consecutive number
+    of retries.
+
+  * New meson options are provided for handling credits. Valid options
+    are ``bypass_fence``, ``hw_credits_checks``, ``sw_credits_checks`` and
+    ``type_check``. These options need to be provided to meson in
+    comma-separated form.
 
 Removed Items
 -------------
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-06-05  4:02       ` Jerin Jacob
@ 2024-06-19 21:07         ` Sevincer, Abdullah
  0 siblings, 0 replies; 28+ messages in thread
From: Sevincer, Abdullah @ 2024-06-19 21:07 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Richardson, Bruce, dev, jerinj, Chen, Mike Ximing, Sarkar,
	Tirthendu, Pathak, Pravin, Doneria, Shivani

>+OK. At least, slowpath code you can remove ifdef and use only in fastpath.
Hi Jerin,
All of these compilation flags are in the fastpath. All other reviews are addressed in v5. There is one CI issue complaining about spelling in a struct field, num_hist_list_entries.
That field has been there before, and user applications rely on it, so I did not change the name, which would have affected users (applications that use this field) if I did so.
There is a stdatomic API usage warning, which I addressed.

Thanks.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 5/5] doc: update release notes for 24.07
  2024-06-19 21:01     ` [PATCH v5 5/5] doc: update release notes for 24.07 Abdullah Sevincer
@ 2024-06-20  7:02       ` David Marchand
  0 siblings, 0 replies; 28+ messages in thread
From: David Marchand @ 2024-06-20  7:02 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Wed, Jun 19, 2024 at 11:02 PM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> Update release notes for new DLB features.
>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> ---
>  doc/guides/rel_notes/release_24_07.rst | 32 ++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
> index 7c88de381b..b4eb819503 100644
> --- a/doc/guides/rel_notes/release_24_07.rst
> +++ b/doc/guides/rel_notes/release_24_07.rst
> @@ -144,6 +144,38 @@ New Features
>
>    Added an API that allows the user to reclaim the defer queue with RCU.
>
> +* **Added API to support HW delayed token feature for DLB 2.5 device.**
> +
> +  * Added API ``rte_pmd_dlb2_set_port_params`` to support delayed token
> +    feature for DLB 2.5 device. The feature will resume CQ scheduling
> +    when the number of pending completions fall below a configured
> +    threshold.
> +
> +* **Introduced dynamic HL (History List) feature for DLB device.**
> +
> +  * Users can configure history list entries dynamically by passing
> +    parameters ``use_default_hl`` and ``alloc_hl_entries``.
> +
> +  * When 'use_default_hl = 1', Per port HL is set to
> +    DLB2_FIXED_CQ_HL_SIZE (32) and command line parameter
> +    alloc_hl_entries is ignored.
> +
> +  * When 'use_default_hl = 0', Per LDB port HL = 2 * CQ depth and per
> +    port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
> +
> +* **DLB credit handling scenario improvements.**
> +
> +  * When ports hold on to credits but can't release them due to insufficient
> +    accumulation (less than 2 * credit quanta) deadlocks may occur.
> +    Improvement made for worker ports to release all accumulated credits when
> +    back-to-back zero poll count reaches preset threshold and producer ports
> +    release all accumulated credits if enqueue fails for a consecutive number
> +    of retries.
> +
> +  * New meson options are provided for handling credits. Valid options
> +    are ``bypass_fence``, ``hw_credits_checks``, ``sw_credits_checks`` and
> +    ``type_check``. These options need to be provided in meson in comma
> +    separated form.
>

Those 3 entries can be gathered under a single item about the dlb2 driver.
Like:

* **Updated dlb2 eventdev driver.**

  * Added API ``rte_pmd_dlb2_set_port_params`` to support delayed token...
    ...

  * Introduced dynamic HL (History List) feature for DLB device...
    ...
etc...


Besides, those doc updates should be split and go with the patches
that introduce the features.
This comment applies to the previous doc patch too.

Thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/5] event/dlb2: add support for HW delayed token
  2024-06-19 21:01     ` [PATCH v5 1/5] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-06-20 12:01       ` Jerin Jacob
  0 siblings, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2024-06-20 12:01 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, Jun 20, 2024 at 2:37 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> In DLB 2.5, hardware assist is available, complementing the Delayed
> token POP software implementation. When it is enabled, the feature
> works as follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached. So the feature is activated only if the core is
> congested. If the core can handle multiple atomic flows, DLB will not
> try to switch them. This is an improvement over SW implementation
> which always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions fall below a configured threshold. To emulate older 2.0
> behavior, this threshold is set to 1 by old APIs. SW sets CQ to
> auto-pop mode for token return, as tokens withholding is not
> necessary now. As HW counts completions and not tokens, events equal
> to HL (History List) entries will be scheduled to DLB before the
> feature activates and stops CQ scheduling.
>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>

>
> +/** Set inflight threshold for flow migration */
> +#define RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
> +
> +/** Set port history list */
> +#define RTE_PMD_DLB2_SET_PORT_HL RTE_BIT64(1)
> +

Missing Doxygen comment

> +struct rte_pmd_dlb2_port_params {
> +       uint16_t inflight_threshold : 12;

Missing Doxygen comment

> +};
> +
> +/*!
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Configure various port parameters.
> + * AUTO_POP. This function must be called before calling rte_event_port_setup()
> + * for the port, but after calling rte_event_dev_configure().
> + *
> + * @param dev_id
> + *    The identifier of the event device.
> + * @param port_id
> + *    The identifier of the event port.
> + * @param flags
> + *    Bitmask of the parameters being set.
> + * @param params
> + *    Structure coantaining the values of parameters being set.
> + *
> + * @return
> + * - 0: Success
> + * - EINVAL: Invalid dev_id, port_id, or mode
> + * - EINVAL: The DLB2 is not configured, is already running, or the port is
> + *   already setup
> + */
> +__rte_experimental
> +int
> +rte_pmd_dlb2_set_port_params(uint8_t dev_id,
> +                           uint8_t port_id,
> +                           uint64_t flags,
> +                           struct  rte_pmd_dlb2_port_params *params);
> +

^ permalink raw reply	[flat|nested] 28+ messages in thread
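
For reference, one way the missing Doxygen comments flagged above could
be filled in, based on the semantics described in the commit message (a
sketch only; the wording that landed in later revisions may differ):

/** Set inflight threshold for flow migration.
 *  When passed in the 'flags' bitmask of rte_pmd_dlb2_set_port_params(),
 *  the inflight_threshold field of struct rte_pmd_dlb2_port_params is
 *  applied to the port's CQ (DLB 2.5 hardware only).
 */
#define RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)

/** Set port history list.
 *  When passed in 'flags', the port's history list size is taken from
 *  the supplied parameters instead of the default allocation.
 */
#define RTE_PMD_DLB2_SET_PORT_HL RTE_BIT64(1)

struct rte_pmd_dlb2_port_params {
	/** Per-CQ inflight threshold (12-bit value) below which the
	 *  hardware resumes CQ scheduling once it has been paused.
	 */
	uint16_t inflight_threshold : 12;
};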

* Re: [PATCH v5 3/5] event/dlb2: enhance DLB credit handling
  2024-06-19 21:01     ` [PATCH v5 3/5] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-06-20 12:09       ` Jerin Jacob
  2024-06-26  0:26         ` Sevincer, Abdullah
  2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
  1 sibling, 1 reply; 28+ messages in thread
From: Jerin Jacob @ 2024-06-20 12:09 UTC (permalink / raw)
  To: Abdullah Sevincer, Richardson, Bruce, Thomas Monjalon,
	David Marchand, Ferruh Yigit
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, Jun 20, 2024 at 2:31 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> This commit improves DLB credit handling scenarios when
> ports hold on to credits but can't release them due to insufficient
> accumulation (less than 2 * credit quanta).
>
> Worker ports now release all accumulated credits when back-to-back
> zero poll count reaches preset threshold.
>
> Producer ports release all accumulated credits if enqueue fails for a
> consecutive number of retries.
>
> All newly introduced compilation flags are in the fastpath.
>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> ---
>  drivers/event/dlb2/dlb2.c      | 322 +++++++++++++++++++++++++++------
>  drivers/event/dlb2/dlb2_priv.h |   1 +
>  drivers/event/dlb2/meson.build |  40 ++++
>  meson_options.txt              |   2 +

+ @Richardson, Bruce  @Thomas Monjalon  @David Marchand @Ferruh Yigit

It is not allowed to add PMD specific build options in generic DPDK
build options.  Please check with Bruce.

You may use scheme like
https://patches.dpdk.org/project/dpdk/patch/20240522192139.3016-1-pbhagavatula@marvell.com/

or if we think, we need to standardize the PMD compilation options,
then we can do that as well.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 3/5] event/dlb2: enhance DLB credit handling
  2024-06-20 12:09       ` Jerin Jacob
@ 2024-06-26  0:26         ` Sevincer, Abdullah
  2024-06-26  9:37           ` Jerin Jacob
  0 siblings, 1 reply; 28+ messages in thread
From: Sevincer, Abdullah @ 2024-06-26  0:26 UTC (permalink / raw)
  To: Jerin Jacob, Richardson, Bruce, Thomas Monjalon, Marchand, David,
	Ferruh Yigit
  Cc: dev, jerinj, Chen, Mike Ximing, Sarkar, Tirthendu, Pathak,
	Pravin, Doneria, Shivani


>++ @Richardson, Bruce  @Thomas Monjalon  @David Marchand @Ferruh Yigit

>+It is not allowed to add PMD specific build options in generic DPDK build options.  Please check with Bruce.

>+You may use scheme like
>+>+https://patches.dpdk.org/project/dpdk/patch/20240522192139.3016-1-pbhagavatula@marvell.com/

>+or if we think, we need to standardize the PMD compilation options, then we can do that as well.

Thanks Jerin, I will check the scheme. I will send 2 patches (dynamic HL and HW delayed token) together (as a patch set) separately from this one, and address this one later, if that's ok?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 3/5] event/dlb2: enhance DLB credit handling
  2024-06-26  0:26         ` Sevincer, Abdullah
@ 2024-06-26  9:37           ` Jerin Jacob
  0 siblings, 0 replies; 28+ messages in thread
From: Jerin Jacob @ 2024-06-26  9:37 UTC (permalink / raw)
  To: Sevincer, Abdullah
  Cc: Richardson, Bruce, Thomas Monjalon, Marchand, David,
	Ferruh Yigit, dev, jerinj, Chen, Mike Ximing, Sarkar, Tirthendu,
	Pathak, Pravin, Doneria, Shivani

On Wed, Jun 26, 2024 at 5:56 AM Sevincer, Abdullah
<abdullah.sevincer@intel.com> wrote:
>
>
> >++ @Richardson, Bruce  @Thomas Monjalon  @David Marchand @Ferruh Yigit
>
> >+It is not allowed to add PMD specific build options in generic DPDK build options.  Please check with Bruce.
>
> >+You may use scheme like
> >+>+https://patches.dpdk.org/project/dpdk/patch/20240522192139.3016-1-pbhagavatula@marvell.com/
>
> >+or if we think, we need to standardize the PMD compilation options, then we can do that as well.
>
> Thanks Jerrin, I will check the scheme. I will send 2 patches(dynamic hl and hw delayed token) together(as patch set) separately from this one and address this one later if that’s ok?

Take time, no hurry from my PoV.

Furthermore, dynamic hl, I have commented at
https://patches.dpdk.org/project/dpdk/patch/20240621222408.583464-3-abdullah.sevincer@intel.com/.
Library changes we cannot take after rc1 anyway.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v6 0/3] DLB2 Enhancements
  2024-06-19 21:01     ` [PATCH v5 3/5] event/dlb2: enhance DLB credit handling Abdullah Sevincer
  2024-06-20 12:09       ` Jerin Jacob
@ 2024-07-12  0:17       ` Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
                           ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-07-12  0:17 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	Abdullah Sevincer

v6: Address review for main tree pmd options.
v5: Address reviews and update documentation.
v4: Fix CI Issues.
v3: Fix CI issues.
v2: Fix compilation issues.
v1: Initial commit.

Abdullah Sevincer (3):
  event/dlb2: add support for HW delayed token
  event/dlb2: add support for dynamic HL entries
  event/dlb2: enhance DLB credit handling

 doc/guides/eventdevs/dlb2.rst              |  58 +++
 doc/guides/rel_notes/release_24_07.rst     |  14 +
 drivers/event/dlb2/dlb2.c                  | 509 ++++++++++++++++++---
 drivers/event/dlb2/dlb2_iface.c            |   3 +
 drivers/event/dlb2/dlb2_iface.h            |   4 +-
 drivers/event/dlb2/dlb2_priv.h             |  18 +-
 drivers/event/dlb2/dlb2_user.h             |  24 +
 drivers/event/dlb2/meson.build             |  15 +
 drivers/event/dlb2/pf/base/dlb2_regs.h     |   9 +
 drivers/event/dlb2/pf/base/dlb2_resource.c |  95 +++-
 drivers/event/dlb2/pf/base/dlb2_resource.h |  19 +
 drivers/event/dlb2/pf/dlb2_pf.c            |  28 +-
 drivers/event/dlb2/rte_pmd_dlb2.c          |  29 ++
 drivers/event/dlb2/rte_pmd_dlb2.h          |  48 ++
 drivers/event/dlb2/version.map             |   3 +
 15 files changed, 801 insertions(+), 75 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v6 1/3] event/dlb2: add support for HW delayed token
  2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
@ 2024-07-12  0:17         ` Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
  2 siblings, 0 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-07-12  0:17 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	Abdullah Sevincer

In DLB 2.5, hardware assist is available, complementing the Delayed
token POP software implementation. When it is enabled, the feature
works as follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached. So the feature is activated only if the core is
congested. If the core can handle multiple atomic flows, DLB will not
try to switch them. This is an improvement over the SW implementation,
which always switches the flows.

The feature will resume CQ scheduling when the number of pending
completions falls below a configured threshold. To emulate the older
2.0 behavior, this threshold is set to 1 by the old APIs. SW sets the
CQ to auto-pop mode for token return, as token withholding is no longer
necessary. As HW counts completions and not tokens, events equal to the
number of HL (History List) entries will be scheduled to DLB before the
feature activates and stops CQ scheduling.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 doc/guides/rel_notes/release_24_07.rst     |  7 ++
 drivers/event/dlb2/dlb2.c                  | 57 ++++++++++++-
 drivers/event/dlb2/dlb2_iface.c            |  3 +
 drivers/event/dlb2/dlb2_iface.h            |  4 +-
 drivers/event/dlb2/dlb2_priv.h             |  5 ++
 drivers/event/dlb2/dlb2_user.h             | 24 ++++++
 drivers/event/dlb2/pf/base/dlb2_regs.h     |  9 ++
 drivers/event/dlb2/pf/base/dlb2_resource.c | 95 +++++++++++++++++++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h | 19 +++++
 drivers/event/dlb2/pf/dlb2_pf.c            | 21 +++++
 drivers/event/dlb2/rte_pmd_dlb2.c          | 29 +++++++
 drivers/event/dlb2/rte_pmd_dlb2.h          | 47 +++++++++++
 drivers/event/dlb2/version.map             |  3 +
 13 files changed, 319 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index 91a598e61f..858db48547 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -159,6 +159,13 @@ New Features
   * Added defer queue reclamation via RCU.
   * Added SVE support for bulk lookup.
 
+* **Updated DLB2 eventdev driver.**
+
+  * Added API ``rte_pmd_dlb2_set_port_params`` to support the delayed token
+    feature for the DLB 2.5 device. The feature will resume CQ scheduling
+    when the number of pending completions falls below a configured
+    threshold.
+
 
 Removed Items
 -------------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 0b91f03956..70e4289097 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -879,8 +879,11 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2_iface_domain_reset(dlb2);
 
 	/* Free all dynamically allocated port memory */
-	for (i = 0; i < dlb2->num_ports; i++)
+	for (i = 0; i < dlb2->num_ports; i++) {
 		dlb2_free_qe_mem(&dlb2->ev_ports[i].qm_port);
+		if (!reconfig)
+			memset(&dlb2->ev_ports[i], 0, sizeof(struct dlb2_eventdev_port));
+	}
 
 	/* If reconfiguring, mark the device's queues and ports as "previously
 	 * configured." If the user doesn't reconfigure them, the PMD will
@@ -1525,7 +1528,7 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
 	struct dlb2_create_ldb_port_args cfg = { {0} };
 	int ret;
-	struct dlb2_port *qm_port = NULL;
+	struct dlb2_port *qm_port = &ev_port->qm_port;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	uint32_t qm_port_id;
 	uint16_t ldb_credit_high_watermark = 0;
@@ -1554,6 +1557,11 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	cfg.cq_depth = rte_align32pow2(dequeue_depth);
 	cfg.cq_depth_threshold = 1;
 
+	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
+		cfg.enable_inflight_ctrl = 1;
+		cfg.inflight_threshold = qm_port->inflight_threshold;
+	}
+
 	cfg.cq_history_list_size = cfg.cq_depth;
 
 	cfg.cos_id = ev_port->cos_id;
@@ -4321,6 +4329,51 @@ dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 		return dlb2_get_ldb_queue_depth(dlb2, queue);
 }
 
+int
+dlb2_set_port_params(struct dlb2_eventdev *dlb2,
+		    int port_id,
+		    uint64_t param_flags,
+		    struct  rte_pmd_dlb2_port_params *params)
+{
+	struct dlb2_port *port = &dlb2->ev_ports[port_id].qm_port;
+	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
+	int ret = 0, bit = 0;
+
+	while (param_flags) {
+		uint64_t param = rte_bit_relaxed_test_and_clear64(bit++, &param_flags);
+
+		if (!param)
+			continue;
+		switch (param) {
+		case RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD:
+			if (dlb2->version == DLB2_HW_V2_5) {
+				struct dlb2_cq_inflight_ctrl_args args;
+				args.enable = true;
+				args.port_id = port->id;
+				args.threshold = params->inflight_threshold;
+
+				if (dlb2->ev_ports[port_id].setup_done)
+					ret = dlb2_iface_set_cq_inflight_ctrl(handle, &args);
+				if (ret < 0) {
+					DLB2_LOG_ERR("dlb2: can not set port parameters\n");
+					return -EINVAL;
+				}
+				port->enable_inflight_ctrl = true;
+				port->inflight_threshold = args.threshold;
+			} else {
+				DLB2_LOG_ERR("dlb2: FLOW_MIGRATION_THRESHOLD is only supported for 2.5 HW\n");
+				return -EINVAL;
+			}
+			break;
+		default:
+			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
+			return -EINVAL;
+		}
+	}
+
+	return ret;
+}
+
 static bool
 dlb2_queue_is_empty(struct dlb2_eventdev *dlb2,
 		    struct dlb2_eventdev_queue *queue)
diff --git a/drivers/event/dlb2/dlb2_iface.c b/drivers/event/dlb2/dlb2_iface.c
index 100db434d0..b829da2454 100644
--- a/drivers/event/dlb2/dlb2_iface.c
+++ b/drivers/event/dlb2/dlb2_iface.c
@@ -77,5 +77,8 @@ int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 				   struct dlb2_enable_cq_weight_args *args);
 
+int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+				       struct dlb2_cq_inflight_ctrl_args *args);
+
 int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 			     struct dlb2_set_cos_bw_args *args);
diff --git a/drivers/event/dlb2/dlb2_iface.h b/drivers/event/dlb2/dlb2_iface.h
index dc0c446ce8..55b6bdcf84 100644
--- a/drivers/event/dlb2/dlb2_iface.h
+++ b/drivers/event/dlb2/dlb2_iface.h
@@ -72,10 +72,12 @@ extern int (*dlb2_iface_get_ldb_queue_depth)(struct dlb2_hw_dev *handle,
 extern int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 				struct dlb2_get_dir_queue_depth_args *args);
 
-
 extern int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 					  struct dlb2_enable_cq_weight_args *args);
 
+extern int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+					      struct dlb2_cq_inflight_ctrl_args *args);
+
 extern int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 				    struct dlb2_set_cos_bw_args *args);
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index 2470ae0271..bd11c0facf 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -389,6 +389,8 @@ struct dlb2_port {
 	bool use_avx512;
 	uint32_t cq_weight;
 	bool is_producer; /* True if port is of type producer */
+	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
+	bool enable_inflight_ctrl; /* DLB2.5 enable HW inflight control */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -715,6 +717,9 @@ int dlb2_secondary_eventdev_probe(struct rte_eventdev *dev,
 uint32_t dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 			      struct dlb2_eventdev_queue *queue);
 
+int dlb2_set_port_params(struct dlb2_eventdev *dlb2, int port_id,
+			uint64_t flags, struct  rte_pmd_dlb2_port_params *params);
+
 int dlb2_parse_params(const char *params,
 		      const char *name,
 		      struct dlb2_devargs *dlb2_args,
diff --git a/drivers/event/dlb2/dlb2_user.h b/drivers/event/dlb2/dlb2_user.h
index 8739e2a5ac..ca09c65ac4 100644
--- a/drivers/event/dlb2/dlb2_user.h
+++ b/drivers/event/dlb2/dlb2_user.h
@@ -472,6 +472,8 @@ struct dlb2_create_ldb_port_args {
 	__u16 cq_history_list_size;
 	__u8 cos_id;
 	__u8 cos_strict;
+	__u8 enable_inflight_ctrl;
+	__u16 inflight_threshold;
 };
 
 /*
@@ -717,6 +719,28 @@ struct dlb2_enable_cq_weight_args {
 	__u32 limit;
 };
 
+/*
+ * DLB2_DOMAIN_CMD_SET_CQ_INFLIGHT_CTRL: Set Per-CQ inflight control for
+ * {ATM,UNO,ORD} QEs.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - enable: True if inflight control is enabled. False otherwise
+ * - threshold: Per CQ inflight threshold.
+ *
+ * Output parameters:
+ * - response.status: Detailed error code. In certain cases, such as if the
+ *	ioctl request arg is invalid, the driver won't set status.
+ */
+struct dlb2_cq_inflight_ctrl_args {
+	/* Output parameters */
+	struct dlb2_cmd_response response;
+	/* Input parameters */
+	__u32 port_id;
+	__u16 enable;
+	__u16 threshold;
+};
+
 /*
  * Mapping sizes for memory mapping the consumer queue (CQ) memory space, and
  * producer port (PP) MMIO space.
diff --git a/drivers/event/dlb2/pf/base/dlb2_regs.h b/drivers/event/dlb2/pf/base/dlb2_regs.h
index 7167f3d2ff..b639a5b659 100644
--- a/drivers/event/dlb2/pf/base/dlb2_regs.h
+++ b/drivers/event/dlb2/pf/base/dlb2_regs.h
@@ -3238,6 +3238,15 @@
 #define DLB2_LSP_CQ_LDB_INFL_LIM_LIMIT_LOC	0
 #define DLB2_LSP_CQ_LDB_INFL_LIM_RSVD0_LOC	12
 
+#define DLB2_LSP_CQ_LDB_INFL_THRESH(x) \
+	(0x90580000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RST 0x0
+
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH	0x00000FFF
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0	0xFFFFF000
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH_LOC	0
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0_LOC	12
+
 #define DLB2_V2LSP_CQ_LDB_TKN_CNT(x) \
 	(0xa0580000 + (x) * 0x1000)
 #define DLB2_V2_5LSP_CQ_LDB_TKN_CNT(x) \
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.c b/drivers/event/dlb2/pf/base/dlb2_resource.c
index 7ce3e3531c..051d7e51c3 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.c
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.c
@@ -3062,10 +3062,14 @@ static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
 		    DLB2_CHP_LDB_CQ_DEPTH(hw->ver, port->id.phys_id),
 		    DLB2_CHP_LDB_CQ_DEPTH_RST);
 
-	if (hw->ver != DLB2_HW_V2)
+	if (hw->ver != DLB2_HW_V2) {
 		DLB2_CSR_WR(hw,
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT(port->id.phys_id),
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT_RST);
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    DLB2_LSP_CQ_LDB_INFL_THRESH_RST);
+	}
 
 	DLB2_CSR_WR(hw,
 		    DLB2_LSP_CQ_LDB_INFL_LIM(hw->ver, port->id.phys_id),
@@ -4446,6 +4450,20 @@ static int dlb2_ldb_port_configure_cq(struct dlb2_hw *hw,
 	reg = 0;
 	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(hw->ver, port->id.phys_id), reg);
 
+	if (hw->ver == DLB2_HW_V2_5) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->enable_inflight_ctrl,
+				DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+		DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+		if (args->enable_inflight_ctrl) {
+			reg = 0;
+			DLB2_BITS_SET(reg, args->inflight_threshold,
+					DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+			DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id), reg);
+		}
+	}
+
 	return 0;
 }
 
@@ -5464,6 +5482,35 @@ dlb2_get_domain_used_ldb_port(u32 id,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *
+dlb2_get_domain_ldb_port(u32 id,
+			 bool vdev_req,
+			 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_list_entry *iter __attribute__((unused));
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+	}
+
+	return NULL;
+}
+
 static void dlb2_ldb_port_change_qid_priority(struct dlb2_hw *hw,
 					      struct dlb2_ldb_port *port,
 					      int slot,
@@ -6816,3 +6863,49 @@ int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
 
 	return 0;
 }
+
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	u32 reg = 0;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	DLB2_BITS_SET(reg, args->enable,
+		      DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+	DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+	if (args->enable) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->threshold,
+			      DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    reg);
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.h b/drivers/event/dlb2/pf/base/dlb2_resource.h
index 71bd6148f1..17cc745824 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.h
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.h
@@ -1956,4 +1956,23 @@ int dlb2_hw_enable_cq_weight(struct dlb2_hw *hw,
 			     bool vdev_request,
 			     unsigned int vdev_id);
 
+/**
+ * This function configures the inflight control threshold for a cq.
+ *
+ * This must be called after creating the port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * Errors:
+ * EINVAL - The domain or port is not configured.
+ */
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id);
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 3d15250e11..249ed7ede9 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -665,6 +665,26 @@ dlb2_pf_set_cos_bandwidth(struct dlb2_hw_dev *handle,
 	return ret;
 }
 
+static int
+dlb2_pf_set_cq_inflight_ctrl(struct dlb2_hw_dev *handle,
+			     struct dlb2_cq_inflight_ctrl_args *args)
+{
+	struct dlb2_dev *dlb2_dev = (struct dlb2_dev *)handle->pf_dev;
+	struct dlb2_cmd_response response = {0};
+	int ret = 0;
+
+	DLB2_INFO(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_hw_set_cq_inflight_ctrl(&dlb2_dev->hw, handle->domain_id,
+					   args, &response, false, 0);
+	args->response = response;
+
+	DLB2_INFO(dev->dlb2_device, "Exiting %s() with ret=%d\n",
+		  __func__, ret);
+
+	return ret;
+}
+
 static void
 dlb2_pf_iface_fn_ptrs_init(void)
 {
@@ -691,6 +711,7 @@ dlb2_pf_iface_fn_ptrs_init(void)
 	dlb2_iface_get_sn_occupancy = dlb2_pf_get_sn_occupancy;
 	dlb2_iface_enable_cq_weight = dlb2_pf_enable_cq_weight;
 	dlb2_iface_set_cos_bw = dlb2_pf_set_cos_bandwidth;
+	dlb2_iface_set_cq_inflight_ctrl = dlb2_pf_set_cq_inflight_ctrl;
 }
 
 /* PCI DEV HOOKS */
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.c b/drivers/event/dlb2/rte_pmd_dlb2.c
index 43990e46ac..8a54cb5a31 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.c
+++ b/drivers/event/dlb2/rte_pmd_dlb2.c
@@ -33,7 +33,36 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 	if (port_id >= dlb2->num_ports || dlb2->ev_ports[port_id].setup_done)
 		return -EINVAL;
 
+	if (dlb2->version == DLB2_HW_V2_5 && mode == DELAYED_POP) {
+		dlb2->ev_ports[port_id].qm_port.enable_inflight_ctrl = true;
+		dlb2->ev_ports[port_id].qm_port.inflight_threshold = 1;
+		mode = AUTO_POP;
+	}
+
 	dlb2->ev_ports[port_id].qm_port.token_pop_mode = mode;
 
 	return 0;
 }
+
+int
+rte_pmd_dlb2_set_port_params(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    struct  rte_pmd_dlb2_port_params *params)
+{
+	struct dlb2_eventdev *dlb2;
+	struct rte_eventdev *dev;
+
+	if (params == NULL)
+		return -EINVAL;
+
+	RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_eventdevs[dev_id];
+
+	dlb2 = dlb2_pmd_priv(dev);
+
+	if (port_id >= dlb2->num_ports)
+		return -EINVAL;
+
+	return dlb2_set_port_params(dlb2, port_id, flags, params);
+}
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 334c6c356d..027ac7413c 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -67,6 +67,53 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 				uint8_t port_id,
 				enum dlb2_token_pop_mode mode);
 
+/** Set inflight threshold for flow migration */
+#define RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
+
+/** Set port history list */
+#define RTE_PMD_DLB2_SET_PORT_HL RTE_BIT64(1)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Structure containing values for port parameters to enable HW delayed token
+ * assist and dynamic history list.
+ */
+struct rte_pmd_dlb2_port_params {
+	uint16_t inflight_threshold : 12;
+};
+
+/*!
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Configure port parameters to enable HW delayed token assist and dynamic
+ * history list. This function must be called before calling rte_event_port_setup()
+ * for the port, but after calling rte_event_dev_configure().
+ *
+ * @param dev_id
+ *    The identifier of the event device.
+ * @param port_id
+ *    The identifier of the event port.
+ * @param flags
+ *    Bitmask of the parameters being set.
+ * @param params
+ *    Structure containing the values of port parameters being set.
+ *
+ * @return
+ * - 0: Success
+ * - EINVAL: Invalid dev_id, port_id or parameters.
+ * - EINVAL: The DLB2 is not configured, is already running, or the port is
+ *   already setup
+ */
+__rte_experimental
+int
+rte_pmd_dlb2_set_port_params(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    struct  rte_pmd_dlb2_port_params *params);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/event/dlb2/version.map b/drivers/event/dlb2/version.map
index 1d0a0a75d7..c72c8b988a 100644
--- a/drivers/event/dlb2/version.map
+++ b/drivers/event/dlb2/version.map
@@ -5,6 +5,9 @@ DPDK_24 {
 EXPERIMENTAL {
 	global:
 
+	# added in 24.07
+	rte_pmd_dlb2_set_port_params;
+
 	# added in 20.11
 	rte_pmd_dlb2_set_token_pop_mode;
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread
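
For context, a minimal sketch of how an application might invoke the
new API (the threshold value is an arbitrary example; per the header
comment above, the call must land after rte_event_dev_configure() and
before rte_event_port_setup(), and it returns -EINVAL on non-2.5
hardware):

#include <stdint.h>
#include <rte_pmd_dlb2.h>

/* Enable the DLB 2.5 HW delayed token assist on one event port. */
static int
enable_hw_delayed_token(uint8_t dev_id, uint8_t port_id)
{
	struct rte_pmd_dlb2_port_params params = {
		.inflight_threshold = 32, /* example value, 12-bit field */
	};

	return rte_pmd_dlb2_set_port_params(dev_id, port_id,
			RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD, &params);
}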

* [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries
  2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-07-12  0:17         ` Abdullah Sevincer
  2024-07-23  6:45           ` Mattias Rönnblom
  2024-07-12  0:17         ` [PATCH v6 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
  2 siblings, 1 reply; 28+ messages in thread
From: Abdullah Sevincer @ 2024-07-12  0:17 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	Abdullah Sevincer

DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
possible HL entries per LDB port equals 2048 / 64 = 32. So, the
maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
high-performance setting.

In case all CQs are configured to have HL = 2 * CQ depth as a
performance option, then the calculation of HL at the time of domain
creation will be based on the maximum possible dequeue depth. This could
result in allocating too many HL entries to the domain, as DLB only
has a limited number of HL entries to be allocated. Hence, it is best
to allow the application to specify HL entries as a command line argument
and override the default allocation. A summary of usage is listed below:

When 'use_default_hl = 1', the per-port HL is set to
DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
alloc_hl_entries is ignored.

When 'use_default_hl = 0', each LDB port's HL = 2 * CQ depth and the
default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.

Users should calculate the needed HL entries based on the CQ depths the
application will use and specify them as the command line parameter
'alloc_hl_entries'. This will be used to allocate HL entries.
Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).

If alloc_hl_entries is not specified, then the total HL entries for the
vdev = num_ldb_ports * 64.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 doc/guides/eventdevs/dlb2.rst          |  37 +++++++
 doc/guides/rel_notes/release_24_07.rst |   4 +
 drivers/event/dlb2/dlb2.c              | 130 +++++++++++++++++++++++--
 drivers/event/dlb2/dlb2_priv.h         |  12 ++-
 drivers/event/dlb2/pf/dlb2_pf.c        |   7 +-
 drivers/event/dlb2/rte_pmd_dlb2.h      |   1 +
 6 files changed, 179 insertions(+), 12 deletions(-)

diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
index 2532d92888..fb920d6648 100644
--- a/doc/guides/eventdevs/dlb2.rst
+++ b/doc/guides/eventdevs/dlb2.rst
@@ -456,6 +456,43 @@ Example command to enable QE Weight feature:
 
        --allow ea:00.0,enable_cq_weight=<y/Y>
 
+Dynamic History List Entries
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
+possible HL entries per LDB port equals 2048 / 64 = 32. So, the
+maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
+high-performance setting.
+
+In case all CQs are configured to have HL = 2 * CQ depth as a
+performance option, then the calculation of HL at the time of domain
+creation will be based on the maximum possible dequeue depth. This could
+result in allocating too many HL entries to the domain, as DLB only
+has a limited number of HL entries to be allocated. Hence, it is best
+to allow the application to specify HL entries as a command line argument
+and override the default allocation. A summary of usage is listed below:
+
+When 'use_default_hl = 1', the per-port HL is set to
+DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
+alloc_hl_entries is ignored.
+
+When 'use_default_hl = 0', each LDB port's HL = 2 * CQ depth and the
+default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
+
+Users should calculate the needed HL entries based on the CQ depths the
+application will use and specify them as the command line parameter
+'alloc_hl_entries'. This will be used to allocate HL entries.
+Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).
+
+If alloc_hl_entries is not specified, then the total HL entries for the
+vdev = num_ldb_ports * 64.
+
+Example command to use dynamic history list entries feature:
+
+    .. code-block:: console
+
+       --allow ea:00.0,use_default_hl=0,alloc_hl_entries=1024
+
 Running Eventdev Applications with DLB Device
 ---------------------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index 858db48547..4f587dd47c 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -166,6 +166,10 @@ New Features
     when the number of pending completions fall below a configured
     threshold.
 
+  * Introduced the dynamic HL (History List) feature for the DLB device.
+    History list entries can be configured dynamically by passing the
+    parameters ``use_default_hl`` and ``alloc_hl_entries``.
+
 
 Removed Items
 -------------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 70e4289097..837c0639a3 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -180,10 +180,7 @@ dlb2_hw_query_resources(struct dlb2_eventdev *dlb2)
 	 * The capabilities (CAPs) were set at compile time.
 	 */
 
-	if (dlb2->max_cq_depth != DLB2_DEFAULT_CQ_DEPTH)
-		num_ldb_ports = DLB2_MAX_HL_ENTRIES / dlb2->max_cq_depth;
-	else
-		num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
+	num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
 
 	evdev_dlb2_default_info.max_event_queues =
 		dlb2->hw_rsrc_query_results.num_ldb_queues;
@@ -631,6 +628,52 @@ set_enable_cq_weight(const char *key __rte_unused,
 	return 0;
 }
 
+static int set_hl_override(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	bool *default_hl = opaque;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	if ((*value == 'n') || (*value == 'N') || (*value == '0'))
+		*default_hl = false;
+	else
+		*default_hl = true;
+
+	return 0;
+}
+
+static int set_hl_entries(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	int hl_entries = 0;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(&hl_entries, value);
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t)hl_entries > DLB2_MAX_HL_ENTRIES) {
+		DLB2_LOG_ERR(
+		    "alloc_hl_entries %u out of range, must be in [1 - %d]\n",
+		    hl_entries, DLB2_MAX_HL_ENTRIES);
+		return -EINVAL;
+	}
+	*(uint32_t *)opaque = hl_entries;
+
+	return 0;
+}
+
 static int
 set_qid_depth_thresh(const char *key __rte_unused,
 		     const char *value,
@@ -828,8 +871,19 @@ dlb2_hw_create_sched_domain(struct dlb2_eventdev *dlb2,
 		DLB2_NUM_ATOMIC_INFLIGHTS_PER_QUEUE *
 		cfg->num_ldb_queues;
 
-	cfg->num_hist_list_entries = resources_asked->num_ldb_ports *
-		evdev_dlb2_default_info.max_event_port_dequeue_depth;
+	/* If hl_entries is non-zero, the user specified the command line option.
+	 * Otherwise compute using default_port_hl, which was set earlier based
+	 * on the use_default_hl option.
+	 */
+	if (dlb2->hl_entries) {
+		cfg->num_hist_list_entries = dlb2->hl_entries;
+		if (resources_asked->num_ldb_ports)
+			dlb2->default_port_hl = cfg->num_hist_list_entries /
+						resources_asked->num_ldb_ports;
+	} else {
+		cfg->num_hist_list_entries =
+		    resources_asked->num_ldb_ports * dlb2->default_port_hl;
+	}
 
 	if (device_version == DLB2_HW_V2_5) {
 		DLB2_LOG_DBG("sched domain create - ldb_qs=%d, ldb_ports=%d, dir_ports=%d, atomic_inflights=%d, hist_list_entries=%d, credits=%d\n",
@@ -1041,7 +1095,7 @@ dlb2_eventdev_port_default_conf_get(struct rte_eventdev *dev,
 	struct dlb2_eventdev *dlb2 = dlb2_pmd_priv(dev);
 
 	port_conf->new_event_threshold = dlb2->new_event_limit;
-	port_conf->dequeue_depth = 32;
+	port_conf->dequeue_depth = dlb2->default_port_hl / 2;
 	port_conf->enqueue_depth = DLB2_MAX_ENQUEUE_DEPTH;
 	port_conf->event_port_cfg = 0;
 }
@@ -1560,9 +1614,18 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
 		cfg.enable_inflight_ctrl = 1;
 		cfg.inflight_threshold = qm_port->inflight_threshold;
+		if (!qm_port->hist_list)
+			qm_port->hist_list = cfg.cq_depth;
 	}
 
-	cfg.cq_history_list_size = cfg.cq_depth;
+	if (qm_port->hist_list)
+		cfg.cq_history_list_size = qm_port->hist_list;
+	else if (cfg.enable_inflight_ctrl)
+		cfg.cq_history_list_size = RTE_MIN(cfg.cq_depth, dlb2->default_port_hl);
+	else if (dlb2->default_port_hl == DLB2_FIXED_CQ_HL_SIZE)
+		cfg.cq_history_list_size = DLB2_FIXED_CQ_HL_SIZE;
+	else
+		cfg.cq_history_list_size = cfg.cq_depth * 2;
 
 	cfg.cos_id = ev_port->cos_id;
 	cfg.cos_strict = 0;/* best effots */
@@ -4365,6 +4428,13 @@ dlb2_set_port_params(struct dlb2_eventdev *dlb2,
 				return -EINVAL;
 			}
 			break;
+		case RTE_PMD_DLB2_SET_PORT_HL:
+			if (dlb2->ev_ports[port_id].setup_done) {
+				DLB2_LOG_ERR("DLB2_SET_PORT_HL must be called before setting up port\n");
+				return -EINVAL;
+			}
+			port->hist_list = params->port_hl;
+			break;
 		default:
 			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
 			return -EINVAL;
@@ -4683,6 +4753,28 @@ dlb2_primary_eventdev_probe(struct rte_eventdev *dev,
 		return err;
 	}
 
+	if (dlb2_args->use_default_hl) {
+		dlb2->default_port_hl = DLB2_FIXED_CQ_HL_SIZE;
+		if (dlb2_args->alloc_hl_entries)
+			DLB2_LOG_ERR(": Ignoring 'alloc_hl_entries' and using "
+				     "default history list sizes for eventdev:"
+				     " %s\n", dev->data->name);
+		dlb2->hl_entries = 0;
+	} else {
+		dlb2->default_port_hl = 2 * DLB2_FIXED_CQ_HL_SIZE;
+
+		if (dlb2_args->alloc_hl_entries >
+		    dlb2->hw_rsrc_query_results.num_hist_list_entries) {
+			DLB2_LOG_ERR(": Insufficient HL entries asked=%d "
+				     "available=%d for eventdev: %s\n",
+				     dlb2->hl_entries,
+				     dlb2_args->alloc_hl_entries,
+				     dev->data->name);
+			return -EINVAL;
+		}
+		dlb2->hl_entries = dlb2_args->alloc_hl_entries;
+	}
+
 	dlb2_iface_hardware_init(&dlb2->qm_instance);
 
 	/* configure class of service */
@@ -4790,6 +4882,8 @@ dlb2_parse_params(const char *params,
 					     DLB2_PRODUCER_COREMASK,
 					     DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG,
 					     DLB2_ENABLE_CQ_WEIGHT_ARG,
+					     DLB2_USE_DEFAULT_HL,
+					     DLB2_ALLOC_HL_ENTRIES,
 					     NULL };
 
 	if (params != NULL && params[0] != '\0') {
@@ -4993,6 +5087,26 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
+			ret = rte_kvargs_process(kvlist, DLB2_USE_DEFAULT_HL,
+						 set_hl_override,
+						 &dlb2_args->use_default_hl);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing use_default_hl arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
+			ret = rte_kvargs_process(kvlist, DLB2_ALLOC_HL_ENTRIES,
+						 set_hl_entries,
+						 &dlb2_args->alloc_hl_entries);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing hl_override arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
 			rte_kvargs_free(kvlist);
 		}
 	}
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index bd11c0facf..e7ed27251e 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -52,6 +52,8 @@
 #define DLB2_PRODUCER_COREMASK "producer_coremask"
 #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
 #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
+#define DLB2_USE_DEFAULT_HL "use_default_hl"
+#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"
 
 /* Begin HW related defines and structs */
 
@@ -101,7 +103,8 @@
  */
 #define DLB2_MAX_HL_ENTRIES 2048
 #define DLB2_MIN_CQ_DEPTH 1
-#define DLB2_DEFAULT_CQ_DEPTH 32
+#define DLB2_DEFAULT_CQ_DEPTH 128  /* Can be overridden using max_cq_depth command line parameter */
+#define DLB2_FIXED_CQ_HL_SIZE 32  /* Used when the use_default_hl devarg is set */
 #define DLB2_MIN_HARDWARE_CQ_DEPTH 8
 #define DLB2_NUM_HIST_LIST_ENTRIES_PER_LDB_PORT \
 	DLB2_DEFAULT_CQ_DEPTH
@@ -123,7 +126,7 @@
 
 #define DLB2_NUM_QES_PER_CACHE_LINE 4
 
-#define DLB2_MAX_ENQUEUE_DEPTH 32
+#define DLB2_MAX_ENQUEUE_DEPTH 128
 #define DLB2_MIN_ENQUEUE_DEPTH 4
 
 #define DLB2_NAME_SIZE 64
@@ -391,6 +394,7 @@ struct dlb2_port {
 	bool is_producer; /* True if port is of type producer */
 	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
 	bool enable_inflight_ctrl; /*DLB2.5 enable HW inflight control */
+	uint16_t hist_list; /* Port history list */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -637,6 +641,8 @@ struct dlb2_eventdev {
 	uint32_t cos_bw[DLB2_COS_NUM_VALS]; /* bandwidth per cos domain */
 	uint8_t max_cos_port; /* Max LDB port from any cos */
 	bool enable_cq_weight;
+	uint16_t hl_entries; /* Num HL entries to allocate for the domain */
+	int default_port_hl;  /* Fixed or dynamic (2 * CQ depth) HL assignment */
 };
 
 /* used for collecting and passing around the dev args */
@@ -675,6 +681,8 @@ struct dlb2_devargs {
 	const char *producer_coremask;
 	bool default_ldb_port_allocation;
 	bool enable_cq_weight;
+	bool use_default_hl;
+	uint32_t alloc_hl_entries;
 };
 
 /* End Eventdev related defines and structs */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 249ed7ede9..137bdfd656 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -422,6 +422,8 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 				      cfg,
 				      cq_base,
 				      &response);
+
+	cfg->response = response;
 	if (ret)
 		goto create_port_err;
 
@@ -437,7 +439,6 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 
 	dlb2_list_init_head(&port_memory.list);
 
-	cfg->response = response;
 
 	return 0;
 
@@ -731,7 +732,9 @@ dlb2_eventdev_pci_init(struct rte_eventdev *eventdev)
 		.hw_credit_quanta = DLB2_SW_CREDIT_BATCH_SZ,
 		.default_depth_thresh = DLB2_DEPTH_THRESH_DEFAULT,
 		.max_cq_depth = DLB2_DEFAULT_CQ_DEPTH,
-		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH
+		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH,
+		.use_default_hl = true,
+		.alloc_hl_entries = 0
 	};
 	struct dlb2_eventdev *dlb2;
 	int q;
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 027ac7413c..53556cb7ad 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -82,6 +82,7 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
  */
 struct rte_pmd_dlb2_port_params {
 	uint16_t inflight_threshold : 12;
+	uint16_t port_hl;
 };
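+
+/*
+ * Illustrative usage sketch (an assumption, not part of this patch):
+ * configure a custom history list size for an event port before
+ * rte_event_port_setup(), via the rte_pmd_dlb2_set_port_param() setter
+ * added elsewhere in this series:
+ *
+ *     struct rte_pmd_dlb2_port_params p = { .port_hl = 64 };
+ *     ret = rte_pmd_dlb2_set_port_param(dev_id, port_id,
+ *                                       RTE_PMD_DLB2_SET_PORT_HL, &p);
+ */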
 
 /*!
-- 
2.25.1



* [PATCH v6 3/3] event/dlb2: enhance DLB credit handling
  2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
  2024-07-12  0:17         ` [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-07-12  0:17         ` Abdullah Sevincer
  2 siblings, 0 replies; 28+ messages in thread
From: Abdullah Sevincer @ 2024-07-12  0:17 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	Abdullah Sevincer

This commit improves DLB credit handling in scenarios where ports
hold on to credits but cannot release them due to insufficient
accumulation (less than 2 * credit quanta).

Worker ports now release all accumulated credits when the
back-to-back zero-poll count reaches a preset threshold.

Producer ports release all accumulated credits if enqueue fails for a
consecutive number of retries.

All newly introduced compile-time flags affect only fast-path code.
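
As a rough sketch (simplified, with hypothetical type and helper
names; the actual implementation is dlb2_check_and_return_credits()
in the diff below), the release heuristic is:

    #include <stdbool.h>
    #include <stdint.h>

    struct port_credits {
        uint32_t credit_return_count;
    };

    /* Hypothetical helpers standing in for the HW/SW credit returns. */
    void return_cached_hw_credits(struct port_credits *p);
    void return_sw_credits(struct port_credits *p);

    static void
    check_and_return_credits(struct port_credits *p, bool starved,
                             uint32_t threshold)
    {
        if (!starved) {
            p->credit_return_count = 0;   /* made progress, reset */
            return;
        }
        if (++p->credit_return_count > threshold) {
            return_cached_hw_credits(p);  /* back to the device pool */
            return_sw_credits(p);         /* back to the eventdev pool */
            p->credit_return_count = 0;
        }
    }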

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 doc/guides/eventdevs/dlb2.rst          |  21 ++
 doc/guides/rel_notes/release_24_07.rst |   3 +
 drivers/event/dlb2/dlb2.c              | 322 ++++++++++++++++++++-----
 drivers/event/dlb2/dlb2_priv.h         |   1 +
 drivers/event/dlb2/meson.build         |  15 ++
 5 files changed, 303 insertions(+), 59 deletions(-)

diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
index fb920d6648..6f554a2514 100644
--- a/doc/guides/eventdevs/dlb2.rst
+++ b/doc/guides/eventdevs/dlb2.rst
@@ -493,6 +493,27 @@ Example command to use dynamic history list entries feature:
 
        --allow ea:00.0,use_default_hl=0,alloc_hl_entries=1024
 
+Credit Handling Scenario Improvements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When ports hold on to credits but cannot release them due to insufficient
+accumulation (less than 2 * credit quanta), deadlocks may occur. To address
+this, worker ports now release all accumulated credits when the back-to-back
+zero-poll count reaches a preset threshold, and producer ports release all
+accumulated credits when enqueue fails for a consecutive number of retries.
+
+New meson options, passed through ``c_args``, enable and disable the
+credit handling option flags.
+
+By default, ``DLB2_BYPASS_FENCE_ON_PP`` is disabled and all other flags
+are enabled.
+
+Example meson command to configure the credit handling flags:
+
+    .. code-block:: console
+
+       meson configure -Dc_args='-DDLB_SW_CREDITS_CHECKS=0 -DDLB_HW_CREDITS_CHECKS=1'
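+
+Flags not set explicitly in ``c_args`` keep their defaults (see the flag
+handling in ``drivers/event/dlb2/meson.build``), so a single flag can be
+overridden in isolation, for example:
+
+    .. code-block:: console
+
+       meson configure -Dc_args='-DDLB2_BYPASS_FENCE_ON_PP=1'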
+
 Running Eventdev Applications with DLB Device
 ---------------------------------------------
 
diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index 4f587dd47c..e211a44d48 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -170,6 +170,9 @@ New Features
     list entries can dynamically be configured by passing parameters
     ``use_default_hl`` and ``alloc_hl_entries``.
 
+  * Improved credit handling for the DLB driver. New meson options,
+    passed through ``c_args``, control the credit handling flags.
+
 
 Removed Items
 -------------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 837c0639a3..e20c0173d0 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -43,7 +43,50 @@
  * to DLB can go ahead of relevant application writes like updates to buffers
  * being sent with event
  */
+#ifndef DLB2_BYPASS_FENCE_ON_PP
 #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
+#endif
+/*
+ * HW credit checks can only be turned off for a DLB2 device if the
+ * following holds for each created eventdev:
+ * LDB credits <= DIR credits + minimum CQ Depth
+ * (CQ Depth is the minimum across all ports configured within the eventdev).
+ * This must be true for all eventdevs created on any DLB2 device
+ * managed by this driver.
+ * DLB2.5 does not have any such restriction as it has a single credit pool.
+ */
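+/*
+ * Worked example (illustrative numbers, not from this patch): with
+ * 2048 LDB credits, 1024 DIR credits and a minimum configured CQ depth
+ * of 8, 2048 <= 1024 + 8 does not hold, so HW credit checks must stay
+ * enabled on that DLB2 device.
+ */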
+#ifndef DLB_HW_CREDITS_CHECKS
+#define DLB_HW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * SW credit checks can only be turned off if the application has a way to
+ * limit input events to the eventdev below the assigned credit limit.
+ */
+#ifndef DLB_SW_CREDITS_CHECKS
+#define DLB_SW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * Once the application is fully validated, the type check can be turned off.
+ * HW will continue checking for the correct type and generate an alarm
+ * on mismatch.
+ */
+#ifndef DLB_TYPE_CHECK
+#define DLB_TYPE_CHECK 1
+#endif
+#define DLB_TYPE_MACRO 0x010002
+
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive zero dequeues.
+ */
+#define DLB2_ZERO_DEQ_CREDIT_RETURN_THRES 16384
+
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive enqueue failures.
+ */
+#define DLB2_ENQ_FAIL_CREDIT_RETURN_THRES 100
 
 /*
  * Resources exposed to eventdev. Some values overridden at runtime using
@@ -366,6 +409,33 @@ set_max_num_events(const char *key __rte_unused,
 	return 0;
 }
 
+static int
+set_max_num_events_v2_5(const char *key __rte_unused,
+			const char *value,
+			void *opaque)
+{
+	int *max_num_events = opaque;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(max_num_events, value);
+	if (ret < 0)
+		return ret;
+
+	if (*max_num_events < 0 || *max_num_events >
+			DLB2_MAX_NUM_CREDITS(DLB2_HW_V2_5)) {
+		DLB2_LOG_ERR("dlb2: max_num_events must be between 0 and %d\n",
+			     DLB2_MAX_NUM_CREDITS(DLB2_HW_V2_5));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int
 set_num_dir_credits(const char *key __rte_unused,
 		    const char *value,
@@ -966,6 +1036,15 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2->num_queues = 0;
 	dlb2->num_ldb_queues = 0;
 	dlb2->num_dir_queues = 0;
+	if (dlb2->version == DLB2_HW_V2_5) {
+		dlb2->num_credits = 0;
+		dlb2->max_credits = 0;
+	} else {
+		dlb2->num_ldb_credits = 0;
+		dlb2->num_dir_credits = 0;
+		dlb2->max_ldb_credits = 0;
+		dlb2->max_dir_credits = 0;
+	}
 	dlb2->configured = false;
 }
 
@@ -1074,11 +1153,14 @@ dlb2_eventdev_configure(const struct rte_eventdev *dev)
 	if (dlb2->version == DLB2_HW_V2_5) {
 		dlb2->credit_pool = rsrcs->num_credits;
 		dlb2->max_credits = rsrcs->num_credits;
+		dlb2->num_credits = rsrcs->num_credits;
 	} else {
 		dlb2->ldb_credit_pool = rsrcs->num_ldb_credits;
 		dlb2->max_ldb_credits = rsrcs->num_ldb_credits;
+		dlb2->num_ldb_credits = rsrcs->num_ldb_credits;
 		dlb2->dir_credit_pool = rsrcs->num_dir_credits;
 		dlb2->max_dir_credits = rsrcs->num_dir_credits;
+		dlb2->num_dir_credits = rsrcs->num_dir_credits;
 	}
 
 	dlb2->configured = true;
@@ -1679,6 +1761,12 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 
 	qm_port->id = qm_port_id;
 
+	if (dlb2->version == DLB2_HW_V2) {
+		qm_port->cached_ldb_credits = 0;
+		qm_port->cached_dir_credits = 0;
+	} else {
+		qm_port->cached_credits = 0;
+	}
+
 	if (dlb2->version == DLB2_HW_V2_5 && (dlb2->enable_cq_weight == true)) {
 		struct dlb2_enable_cq_weight_args cq_weight_args = { {0} };
 		cq_weight_args.port_id = qm_port->id;
@@ -2047,19 +2135,8 @@ dlb2_eventdev_port_setup(struct rte_eventdev *dev,
 	ev_port->credit_update_quanta = sw_credit_quanta;
 	ev_port->qm_port.hw_credit_quanta = hw_credit_quanta;
 
-	/*
-	 * Validate credit config before creating port
-	 */
 
-	if (port_conf->enqueue_depth > sw_credit_quanta ||
-	    port_conf->enqueue_depth > hw_credit_quanta) {
-		DLB2_LOG_ERR("Invalid port config. Enqueue depth %d must be <= credit quanta %d and batch size %d\n",
-			     port_conf->enqueue_depth,
-			     sw_credit_quanta,
-			     hw_credit_quanta);
-		return -EINVAL;
-	}
-	ev_port->enq_retries = port_conf->enqueue_depth / sw_credit_quanta;
+	ev_port->enq_retries = port_conf->enqueue_depth;
 
 	/* Save off port config for reconfig */
 	ev_port->conf = *port_conf;
@@ -2494,6 +2571,61 @@ dlb2_event_queue_detach_ldb(struct dlb2_eventdev *dlb2,
 	return ret;
 }
 
+static inline void
+dlb2_port_credits_return(struct dlb2_port *qm_port)
+{
+	/* Return all port credits */
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		if (qm_port->cached_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_COMBINED_POOL],
+					   qm_port->cached_credits, rte_memory_order_seq_cst);
+			qm_port->cached_credits = 0;
+		}
+	} else {
+		if (qm_port->cached_ldb_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   qm_port->cached_ldb_credits, rte_memory_order_seq_cst);
+			qm_port->cached_ldb_credits = 0;
+		}
+		if (qm_port->cached_dir_credits) {
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   qm_port->cached_dir_credits, rte_memory_order_seq_cst);
+			qm_port->cached_dir_credits = 0;
+		}
+	}
+}
+
+static inline void
+dlb2_release_sw_credits(struct dlb2_eventdev *dlb2,
+			struct dlb2_eventdev_port *ev_port, uint16_t val)
+{
+	if (ev_port->inflight_credits) {
+		rte_atomic_fetch_sub_explicit(&dlb2->inflights, val, rte_memory_order_seq_cst);
+		ev_port->inflight_credits -= val;
+	}
+}
+
+static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
+					  bool cond, uint32_t threshold)
+{
+#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS
+	if (cond) {
+		if (++ev_port->credit_return_count > threshold) {
+#if DLB_SW_CREDITS_CHECKS
+			dlb2_release_sw_credits(ev_port->dlb2, ev_port,
+						ev_port->inflight_credits);
+#endif
+#if DLB_HW_CREDITS_CHECKS
+			dlb2_port_credits_return(&ev_port->qm_port);
+#endif
+			ev_port->credit_return_count = 0;
+		}
+	} else {
+		ev_port->credit_return_count = 0;
+	}
+#endif
+}
+
 static int
 dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 			  uint8_t queues[], uint16_t nb_unlinks)
@@ -2513,14 +2645,15 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 
 	if (queues == NULL || nb_unlinks == 0) {
 		DLB2_LOG_DBG("dlb2: queues is NULL or nb_unlinks is 0\n");
-		return 0; /* Ignore and return success */
+		nb_unlinks = 0; /* Ignore and return success */
+		goto ret_credits;
 	}
 
 	if (ev_port->qm_port.is_directed) {
 		DLB2_LOG_DBG("dlb2: ignore unlink from dir port %d\n",
 			     ev_port->id);
 		rte_errno = 0;
-		return nb_unlinks; /* as if success */
+		goto ret_credits;
 	}
 
 	dlb2 = ev_port->dlb2;
@@ -2559,6 +2692,10 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 		ev_queue->num_links--;
 	}
 
+ret_credits:
+	if (ev_port->inflight_credits)
+		dlb2_check_and_return_credits(ev_port, true, 0);
+
 	return nb_unlinks;
 }
 
@@ -2758,8 +2895,7 @@ dlb2_replenish_sw_credits(struct dlb2_eventdev *dlb2,
 		/* Replenish credits, saving one quanta for enqueues */
 		uint16_t val = ev_port->inflight_credits - quanta;
 
-		rte_atomic_fetch_sub_explicit(&dlb2->inflights, val, rte_memory_order_seq_cst);
-		ev_port->inflight_credits -= val;
+		dlb2_release_sw_credits(dlb2, ev_port, val);
 	}
 }
 
@@ -2789,10 +2925,15 @@ dlb2_check_enqueue_sw_credits(struct dlb2_eventdev *dlb2,
 			rte_errno = -ENOSPC;
 			return 1;
 		}
-
-		rte_atomic_fetch_add_explicit(&dlb2->inflights, credit_update_quanta,
-				   rte_memory_order_seq_cst);
-		ev_port->inflight_credits += (credit_update_quanta);
+		/* Application will retry if this attempt fails due to contention */
+		if (rte_atomic_compare_exchange_strong_explicit(&dlb2->inflights, &sw_inflights,
+					(sw_inflights + credit_update_quanta),
+					rte_memory_order_seq_cst, rte_memory_order_seq_cst)) {
+			ev_port->inflight_credits += credit_update_quanta;
+		} else {
+			rte_errno = -ENOSPC;
+			return 1;
+		}
 
 		if (ev_port->inflight_credits < num) {
 			DLB2_INC_STAT(
@@ -2930,7 +3071,9 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 {
 	struct dlb2_eventdev *dlb2 = ev_port->dlb2;
 	struct dlb2_eventdev_queue *ev_queue;
+#if DLB_HW_CREDITS_CHECKS
 	uint16_t *cached_credits = NULL;
+#endif
 	struct dlb2_queue *qm_queue;
 
 	ev_queue = &dlb2->ev_queues[ev->queue_id];
@@ -2942,6 +3085,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		goto op_check;
 
 	if (!qm_queue->is_directed) {
+#if DLB_HW_CREDITS_CHECKS
 		/* Load balanced destination queue */
 
 		if (dlb2->version == DLB2_HW_V2) {
@@ -2985,9 +3129,20 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			rte_errno = -EINVAL;
 			return 1;
 		}
+#else
+#if (RTE_SCHED_TYPE_PARALLEL != 2) || (RTE_SCHED_TYPE_ATOMIC != 1)
+#error "ERROR: RTE event schedule type values changed. Needs a code change"
+#endif
+		/* Map RTE eventdev schedule type to DLB HW schedule type */
+		if (qm_queue->sched_type != RTE_SCHED_TYPE_ORDERED)
+			/* RTE-Parallel -> DLB-UnOrd 2->1, RTE-Atm -> DLB-Atm 1->0 */
+			*sched_type = ev->sched_type - 1;
+		else /* To support CFG_ALL_TYPEs */
+			*sched_type = DLB2_SCHED_ORDERED; /* RTE-Ord -> DLB-Ord 0->2 */
+#endif
 	} else {
 		/* Directed destination queue */
-
+#if DLB_HW_CREDITS_CHECKS
 		if (dlb2->version == DLB2_HW_V2) {
 			if (dlb2_check_enqueue_hw_dir_credits(qm_port)) {
 				rte_errno = -ENOSPC;
@@ -3001,6 +3156,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_DIRECTED\n");
 
 		*sched_type = DLB2_SCHED_DIRECTED;
@@ -3009,13 +3165,17 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 op_check:
 	switch (ev->op) {
 	case RTE_EVENT_OP_NEW:
+#if DLB_SW_CREDITS_CHECKS
 		/* Check that a sw credit is available */
 		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
 			rte_errno = -ENOSPC;
 			return 1;
 		}
 		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_FORWARD:
 		/* Check for outstanding_releases underflow. If this occurs,
@@ -3026,10 +3186,14 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_RELEASE:
+#if DLB_SW_CREDITS_CHECKS
 		ev_port->inflight_credits++;
+#endif
 		/* Check for outstanding_releases underflow. If this occurs,
 		 * the application is not using the EVENT_OPs correctly; for
 		 * example, forwarding or releasing events that were not
@@ -3038,9 +3202,28 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
-
+#if DLB_SW_CREDITS_CHECKS
 		/* Replenish s/w credits if enough are cached */
 		dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
+		break;
+	/* Fragments are not supported by the eventdev API, so no case label
+	 * is emitted here; the block below is intentionally unreachable and
+	 * is kept only as a template for possible future use.
+	 */
+#if DLB_SW_CREDITS_CHECKS
+		/* Check that a sw credit is available */
+		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
+			rte_errno = -ENOSPC;
+			return 1;
+		}
+		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
+		(*cached_credits)--;
+#endif
+		break;
 	}
 
@@ -3151,6 +3334,8 @@ __dlb2_event_enqueue_burst(void *event_port,
 			break;
 	}
 
+	dlb2_check_and_return_credits(ev_port, !i, DLB2_ENQ_FAIL_CREDIT_RETURN_THRES);
+
 	return i;
 }
 
@@ -3289,53 +3474,45 @@ dlb2_event_release(struct dlb2_eventdev *dlb2,
 		return;
 	}
 	ev_port->outstanding_releases -= i;
+#if DLB_SW_CREDITS_CHECKS
 	ev_port->inflight_credits += i;
 
 	/* Replenish s/w credits if enough releases are performed */
 	dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 }
 
 static inline void
 dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 {
 	uint32_t batch_size = qm_port->hw_credit_quanta;
+	int val;
 
 	/* increment port credits, and return to pool if exceeds threshold */
-	if (!qm_port->is_directed) {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_ldb_credits += num;
-			if (qm_port->cached_ldb_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-					qm_port->credit_pool[DLB2_LDB_QUEUE],
-					batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_ldb_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_credits -= batch_size;
-			}
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		qm_port->cached_credits += num;
+		if (qm_port->cached_credits >= 2 * batch_size) {
+			val = qm_port->cached_credits - batch_size;
+			rte_atomic_fetch_add_explicit(
+			    qm_port->credit_pool[DLB2_COMBINED_POOL], val,
+			    rte_memory_order_seq_cst);
+			qm_port->cached_credits -= val;
+		}
+	} else if (!qm_port->is_directed) {
+		qm_port->cached_ldb_credits += num;
+		if (qm_port->cached_ldb_credits >= 2 * batch_size) {
+			val = qm_port->cached_ldb_credits - batch_size;
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   val, rte_memory_order_seq_cst);
+			qm_port->cached_ldb_credits -= val;
 		}
 	} else {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_dir_credits += num;
-			if (qm_port->cached_dir_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-					qm_port->credit_pool[DLB2_DIR_QUEUE],
-					batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_dir_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				rte_atomic_fetch_add_explicit(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, rte_memory_order_seq_cst);
-				qm_port->cached_credits -= batch_size;
-			}
+		qm_port->cached_dir_credits += num;
+		if (qm_port->cached_dir_credits >= 2 * batch_size) {
+			val = qm_port->cached_dir_credits - batch_size;
+			rte_atomic_fetch_add_explicit(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   val, rte_memory_order_seq_cst);
+			qm_port->cached_dir_credits -= val;
 		}
 	}
 }
@@ -3366,6 +3543,16 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 
 	/* Wait/poll time expired */
 	if (elapsed_ticks >= timeout) {
+
+		/* Return all credits before blocking if the remaining credits
+		 * in the system are fewer than the quanta.
+		 */
+		uint32_t sw_inflights = rte_atomic_load_explicit(&dlb2->inflights,
+				rte_memory_order_seq_cst);
+		uint32_t quanta = ev_port->credit_update_quanta;
+
+		if (dlb2->new_event_limit - sw_inflights < quanta)
+			dlb2_check_and_return_credits(ev_port, true, 0);
 		return 1;
 	} else if (dlb2->umwait_allowed) {
 		struct rte_power_monitor_cond pmc;
@@ -4101,7 +4288,9 @@ dlb2_hw_dequeue_sparse(struct dlb2_eventdev *dlb2,
 
 		ev_port->outstanding_releases += num;
 
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4228,8 +4417,9 @@ dlb2_hw_dequeue(struct dlb2_eventdev *dlb2,
 			dlb2_consume_qe_immediate(qm_port, num);
 
 		ev_port->outstanding_releases += num;
-
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4263,6 +4453,9 @@ dlb2_event_dequeue_burst(void *event_port, struct rte_event *ev, uint16_t num,
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
 
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
+
 	return cnt;
 }
 
@@ -4299,6 +4492,9 @@ dlb2_event_dequeue_burst_sparse(void *event_port, struct rte_event *ev,
 
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
+
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
 	return cnt;
 }
 
@@ -4903,9 +5099,17 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
-			ret = rte_kvargs_process(kvlist, DLB2_MAX_NUM_EVENTS,
-						 set_max_num_events,
-						 &dlb2_args->max_num_events);
+			if (version == DLB2_HW_V2) {
+				ret = rte_kvargs_process(kvlist,
+						DLB2_MAX_NUM_EVENTS,
+						set_max_num_events,
+						&dlb2_args->max_num_events);
+			} else {
+				ret = rte_kvargs_process(kvlist,
+						DLB2_MAX_NUM_EVENTS,
+						set_max_num_events_v2_5,
+						&dlb2_args->max_num_events);
+			}
 			if (ret != 0) {
 				DLB2_LOG_ERR("%s: Error parsing max_num_events parameter",
 					     name);
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index e7ed27251e..47f76f938f 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
 	struct rte_event_port_conf conf; /* user-supplied configuration */
 	uint16_t inflight_credits; /* num credits this port has right now */
 	uint16_t credit_update_quanta;
+	uint32_t credit_return_count; /* consecutive times the credit-return condition has held */
 	struct dlb2_eventdev *dlb2; /* backlink optimization */
 	alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
 	struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
index 515d1795fe..0e4065a8f8 100644
--- a/drivers/event/dlb2/meson.build
+++ b/drivers/event/dlb2/meson.build
@@ -68,3 +68,18 @@ endif
 headers = files('rte_pmd_dlb2.h')
 
 deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
+
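+# Credit handling flags are plain -D defines: honour any value the user
+# passed via c_args, otherwise append the per-flag default below.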
+dlb_pmd_defines = ['-DDLB2_BYPASS_FENCE_ON_PP',
+                   '-DDLB_HW_CREDITS_CHECKS',
+                   '-DDLB_SW_CREDITS_CHECKS',
+                   '-DDLB_TYPE_CHECK']
+dlb_pmd_default = ['0', '1', '1', '1']
+
+c_args = get_option('c_args')
+index = 0
+foreach opt: dlb_pmd_defines
+    opt_true = opt + '=1'
+    opt_false = opt + '=0'
+    if not (c_args.contains(opt_true) or c_args.contains(opt_false))
+        cflags += opt + '=' + dlb_pmd_default[index]
+    endif
+
+    index += 1
+endforeach
-- 
2.25.1



* Re: [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries
  2024-07-12  0:17         ` [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-07-23  6:45           ` Mattias Rönnblom
  0 siblings, 0 replies; 28+ messages in thread
From: Mattias Rönnblom @ 2024-07-23  6:45 UTC (permalink / raw)
  To: Abdullah Sevincer, dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak

On 2024-07-12 02:17, Abdullah Sevincer wrote:
> DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
> possible HL entries per LDB port equals 2048 / 64 = 32. So, the
> maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
> high-performance setting.
> 

It may be worth spelling out the acronym the first time it's used. At 
least the non-obvious ones. What is HL, for example?

> In case all CQs are configured to have HL = 2 * CQ depth as a
> performance option, then the calculation of HL at the time of domain
> creation will be based on the maximum possible dequeue depth. This could
> result in allocating too many HL entries to the domain, as DLB only
> has a limited number of HL entries to allocate. Hence, it is best
> to allow the application to specify HL entries as a command line argument
> and override the default allocation. A summary of usage is listed below:
> 
> When 'use_default_hl = 1', per-port HL is set to
> DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
> alloc_hl_entries is ignored.
> 
> When 'use_default_hl = 0', per-LDB-port HL = 2 * CQ depth, and the
> default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
> 
> The user should calculate the needed HL entries based on the CQ depths
> the application will use and specify them via the command line parameter
> 'alloc_hl_entries'. This will be used to allocate HL entries.
> Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).
> 
> If alloc_hl_entries is not specified, then the total HL entries for the
> vdev = num_ldb_ports * 64.
> 
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> ---
>   doc/guides/eventdevs/dlb2.rst          |  37 +++++++
>   doc/guides/rel_notes/release_24_07.rst |   4 +
>   drivers/event/dlb2/dlb2.c              | 130 +++++++++++++++++++++++--
>   drivers/event/dlb2/dlb2_priv.h         |  12 ++-
>   drivers/event/dlb2/pf/dlb2_pf.c        |   7 +-
>   drivers/event/dlb2/rte_pmd_dlb2.h      |   1 +
>   6 files changed, 179 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/eventdevs/dlb2.rst b/doc/guides/eventdevs/dlb2.rst
> index 2532d92888..fb920d6648 100644
> --- a/doc/guides/eventdevs/dlb2.rst
> +++ b/doc/guides/eventdevs/dlb2.rst
> @@ -456,6 +456,43 @@ Example command to enable QE Weight feature:
>   
>          --allow ea:00.0,enable_cq_weight=<y/Y>
>   
> +Dynamic History List Entries
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
> +possible HL entries per LDB port equals 2048 / 64 = 32. So, the
> +maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
> +high-performance setting.
> +
> +In case all CQs are configured to have HL = 2 * CQ depth as a
> +performance option, then the calculation of HL at the time of domain
> +creation will be based on the maximum possible dequeue depth. This could
> +result in allocating too many HL entries to the domain, as DLB only
> +has a limited number of HL entries to allocate. Hence, it is best
> +to allow the application to specify HL entries as a command line argument
> +and override the default allocation. A summary of usage is listed below:
> +
> +When 'use_default_hl = 1', per-port HL is set to
> +DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
> +alloc_hl_entries is ignored.
> +
> +When 'use_default_hl = 0', per-LDB-port HL = 2 * CQ depth, and the
> +default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
> +
> +Users should calculate the needed HL entries based on the CQ depths the
> +application will use and specify them via the command line parameter
> +'alloc_hl_entries'. This will be used to allocate HL entries.
> +Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).
> +
> +If alloc_hl_entries is not specified, then the total HL entries for the
> +vdev = num_ldb_ports * 64.
> +
> +Example command to use dynamic history list entries feature:
> +
> +    .. code-block:: console
> +
> +       --allow ea:00.0,use_default_hl=0,alloc_hl_entries=1024
> +
>   Running Eventdev Applications with DLB Device
>   ---------------------------------------------
>   
> diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
> index 858db48547..4f587dd47c 100644
> --- a/doc/guides/rel_notes/release_24_07.rst
> +++ b/doc/guides/rel_notes/release_24_07.rst
> @@ -166,6 +166,10 @@ New Features
>       when the number of pending completions fall below a configured
>       threshold.
>   
> +  * Introduced dynamic HL (History List) feature for DLB device. History
> +    list entries can dynamically be configured by passing parameters
> +    ``use_default_hl`` and ``alloc_hl_entries``.
> +
>   
>   Removed Items
>   -------------
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 70e4289097..837c0639a3 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -180,10 +180,7 @@ dlb2_hw_query_resources(struct dlb2_eventdev *dlb2)
>   	 * The capabilities (CAPs) were set at compile time.
>   	 */
>   
> -	if (dlb2->max_cq_depth != DLB2_DEFAULT_CQ_DEPTH)
> -		num_ldb_ports = DLB2_MAX_HL_ENTRIES / dlb2->max_cq_depth;
> -	else
> -		num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
> +	num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
>   
>   	evdev_dlb2_default_info.max_event_queues =
>   		dlb2->hw_rsrc_query_results.num_ldb_queues;
> @@ -631,6 +628,52 @@ set_enable_cq_weight(const char *key __rte_unused,
>   	return 0;
>   }
>   
> +static int set_hl_override(const char *key __rte_unused,
> +		const char *value,
> +		void *opaque)
> +{
> +	bool *default_hl = opaque;
> +
> +	if (value == NULL || opaque == NULL) {
> +		DLB2_LOG_ERR("NULL pointer\n");
> +		return -EINVAL;
> +	}
> +
> +	if ((*value == 'n') || (*value == 'N') || (*value == '0'))
> +		*default_hl = false;
> +	else
> +		*default_hl = true;
> +
> +	return 0;
> +}
> +
> +static int set_hl_entries(const char *key __rte_unused,
> +		const char *value,
> +		void *opaque)
> +{
> +	int hl_entries = 0;
> +	int ret;
> +
> +	if (value == NULL || opaque == NULL) {
> +		DLB2_LOG_ERR("NULL pointer\n");
> +		return -EINVAL;
> +	}
> +
> +	ret = dlb2_string_to_int(&hl_entries, value);
> +	if (ret < 0)
> +		return ret;
> +
> +	if ((uint32_t)hl_entries > DLB2_MAX_HL_ENTRIES) {
> +		DLB2_LOG_ERR(
> +		    "alloc_hl_entries %d out of range, must be in [1 - %d]\n",
> +		    hl_entries, DLB2_MAX_HL_ENTRIES);
> +		return -EINVAL;
> +	}
> +	*(uint32_t *)opaque = hl_entries;
> +
> +	return 0;
> +}
> +
>   static int
>   set_qid_depth_thresh(const char *key __rte_unused,
>   		     const char *value,
> @@ -828,8 +871,19 @@ dlb2_hw_create_sched_domain(struct dlb2_eventdev *dlb2,
>   		DLB2_NUM_ATOMIC_INFLIGHTS_PER_QUEUE *
>   		cfg->num_ldb_queues;
>   
> -	cfg->num_hist_list_entries = resources_asked->num_ldb_ports *
> -		evdev_dlb2_default_info.max_event_port_dequeue_depth;
> +	/* If hl_entries is non-zero, the user specified a command line option.
> +	 * Otherwise compute using default_port_hl, which was set earlier based
> +	 * on the use_default_hl option.
> +	 */
> +	if (dlb2->hl_entries) {
> +		cfg->num_hist_list_entries = dlb2->hl_entries;
> +		if (resources_asked->num_ldb_ports)
> +			dlb2->default_port_hl = cfg->num_hist_list_entries /
> +						resources_asked->num_ldb_ports;
> +	} else {
> +		cfg->num_hist_list_entries =
> +		    resources_asked->num_ldb_ports * dlb2->default_port_hl;
> +	}
>   
>   	if (device_version == DLB2_HW_V2_5) {
>   		DLB2_LOG_DBG("sched domain create - ldb_qs=%d, ldb_ports=%d, dir_ports=%d, atomic_inflights=%d, hist_list_entries=%d, credits=%d\n",
> @@ -1041,7 +1095,7 @@ dlb2_eventdev_port_default_conf_get(struct rte_eventdev *dev,
>   	struct dlb2_eventdev *dlb2 = dlb2_pmd_priv(dev);
>   
>   	port_conf->new_event_threshold = dlb2->new_event_limit;
> -	port_conf->dequeue_depth = 32;
> +	port_conf->dequeue_depth = dlb2->default_port_hl / 2;
>   	port_conf->enqueue_depth = DLB2_MAX_ENQUEUE_DEPTH;
>   	port_conf->event_port_cfg = 0;
>   }
> @@ -1560,9 +1614,18 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
>   	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
>   		cfg.enable_inflight_ctrl = 1;
>   		cfg.inflight_threshold = qm_port->inflight_threshold;
> +		if (!qm_port->hist_list)
> +			qm_port->hist_list = cfg.cq_depth;
>   	}
>   
> -	cfg.cq_history_list_size = cfg.cq_depth;
> +	if (qm_port->hist_list)
> +		cfg.cq_history_list_size = qm_port->hist_list;
> +	else if (cfg.enable_inflight_ctrl)
> +		cfg.cq_history_list_size = RTE_MIN(cfg.cq_depth, dlb2->default_port_hl);
> +	else if (dlb2->default_port_hl == DLB2_FIXED_CQ_HL_SIZE)
> +		cfg.cq_history_list_size = DLB2_FIXED_CQ_HL_SIZE;
> +	else
> +		cfg.cq_history_list_size = cfg.cq_depth * 2;
>   
>   	cfg.cos_id = ev_port->cos_id;
>   	cfg.cos_strict = 0; /* best effort */
> @@ -4365,6 +4428,13 @@ dlb2_set_port_params(struct dlb2_eventdev *dlb2,
>   				return -EINVAL;
>   			}
>   			break;
> +		case RTE_PMD_DLB2_SET_PORT_HL:
> +			if (dlb2->ev_ports[port_id].setup_done) {
> +				DLB2_LOG_ERR("DLB2_SET_PORT_HL must be called before setting up port\n");
> +				return -EINVAL;
> +			}
> +			port->hist_list = params->port_hl;
> +			break;
>   		default:
>   			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
>   			return -EINVAL;
> @@ -4683,6 +4753,28 @@ dlb2_primary_eventdev_probe(struct rte_eventdev *dev,
>   		return err;
>   	}
>   
> +	if (dlb2_args->use_default_hl) {
> +		dlb2->default_port_hl = DLB2_FIXED_CQ_HL_SIZE;
> +		if (dlb2_args->alloc_hl_entries)
> +			DLB2_LOG_ERR(": Ignoring 'alloc_hl_entries' and using "
> +				     "default history list sizes for eventdev:"
> +				     " %s\n", dev->data->name);
> +		dlb2->hl_entries = 0;
> +	} else {
> +		dlb2->default_port_hl = 2 * DLB2_FIXED_CQ_HL_SIZE;
> +
> +		if (dlb2_args->alloc_hl_entries >
> +		    dlb2->hw_rsrc_query_results.num_hist_list_entries) {
> +			DLB2_LOG_ERR(": Insufficient HL entries asked=%d "
> +				     "available=%d for eventdev: %s\n",
> +				     dlb2_args->alloc_hl_entries,
> +				     dlb2->hw_rsrc_query_results.num_hist_list_entries,
> +				     dev->data->name);
> +			return -EINVAL;
> +		}
> +		dlb2->hl_entries = dlb2_args->alloc_hl_entries;
> +	}
> +
>   	dlb2_iface_hardware_init(&dlb2->qm_instance);
>   
>   	/* configure class of service */
> @@ -4790,6 +4882,8 @@ dlb2_parse_params(const char *params,
>   					     DLB2_PRODUCER_COREMASK,
>   					     DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG,
>   					     DLB2_ENABLE_CQ_WEIGHT_ARG,
> +					     DLB2_USE_DEFAULT_HL,
> +					     DLB2_ALLOC_HL_ENTRIES,
>   					     NULL };
>   
>   	if (params != NULL && params[0] != '\0') {
> @@ -4993,6 +5087,26 @@ dlb2_parse_params(const char *params,
>   				return ret;
>   			}
>   
> +			ret = rte_kvargs_process(kvlist, DLB2_USE_DEFAULT_HL,
> +						 set_hl_override,
> +						 &dlb2_args->use_default_hl);
> +			if (ret != 0) {
> +				DLB2_LOG_ERR("%s: Error parsing use_default_hl arg",
> +					     name);
> +				rte_kvargs_free(kvlist);
> +				return ret;
> +			}
> +
> +			ret = rte_kvargs_process(kvlist, DLB2_ALLOC_HL_ENTRIES,
> +						 set_hl_entries,
> +						 &dlb2_args->alloc_hl_entries);
> +			if (ret != 0) {
> +				DLB2_LOG_ERR("%s: Error parsing hl_override arg",
> +					     name);
> +				rte_kvargs_free(kvlist);
> +				return ret;
> +			}
> +
>   			rte_kvargs_free(kvlist);
>   		}
>   	}
> diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
> index bd11c0facf..e7ed27251e 100644
> --- a/drivers/event/dlb2/dlb2_priv.h
> +++ b/drivers/event/dlb2/dlb2_priv.h
> @@ -52,6 +52,8 @@
>   #define DLB2_PRODUCER_COREMASK "producer_coremask"
>   #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
>   #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
> +#define DLB2_USE_DEFAULT_HL "use_default_hl"
> +#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"
>   
>   /* Begin HW related defines and structs */
>   
> @@ -101,7 +103,8 @@
>    */
>   #define DLB2_MAX_HL_ENTRIES 2048
>   #define DLB2_MIN_CQ_DEPTH 1
> -#define DLB2_DEFAULT_CQ_DEPTH 32
> +#define DLB2_DEFAULT_CQ_DEPTH 128  /* Can be overridden using max_cq_depth command line parameter */
> +#define DLB2_FIXED_CQ_HL_SIZE 32  /* Used when the use_default_hl devarg is set */
>   #define DLB2_MIN_HARDWARE_CQ_DEPTH 8
>   #define DLB2_NUM_HIST_LIST_ENTRIES_PER_LDB_PORT \
>   	DLB2_DEFAULT_CQ_DEPTH
> @@ -123,7 +126,7 @@
>   
>   #define DLB2_NUM_QES_PER_CACHE_LINE 4
>   
> -#define DLB2_MAX_ENQUEUE_DEPTH 32
> +#define DLB2_MAX_ENQUEUE_DEPTH 128
>   #define DLB2_MIN_ENQUEUE_DEPTH 4
>   
>   #define DLB2_NAME_SIZE 64
> @@ -391,6 +394,7 @@ struct dlb2_port {
>   	bool is_producer; /* True if port is of type producer */
>   	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
>   	bool enable_inflight_ctrl; /*DLB2.5 enable HW inflight control */
> +	uint16_t hist_list; /* Port history list */
>   };
>   
>   /* Per-process per-port mmio and memory pointers */
> @@ -637,6 +641,8 @@ struct dlb2_eventdev {
>   	uint32_t cos_bw[DLB2_COS_NUM_VALS]; /* bandwidth per cos domain */
>   	uint8_t max_cos_port; /* Max LDB port from any cos */
>   	bool enable_cq_weight;
> +	uint16_t hl_entries; /* Num HL entries to allocate for the domain */
> +	int default_port_hl;  /* Fixed or dynamic (2 * CQ depth) HL assignment */
>   };
>   
>   /* used for collecting and passing around the dev args */
> @@ -675,6 +681,8 @@ struct dlb2_devargs {
>   	const char *producer_coremask;
>   	bool default_ldb_port_allocation;
>   	bool enable_cq_weight;
> +	bool use_default_hl;
> +	uint32_t alloc_hl_entries;
>   };
>   
>   /* End Eventdev related defines and structs */
> diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
> index 249ed7ede9..137bdfd656 100644
> --- a/drivers/event/dlb2/pf/dlb2_pf.c
> +++ b/drivers/event/dlb2/pf/dlb2_pf.c
> @@ -422,6 +422,8 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
>   				      cfg,
>   				      cq_base,
>   				      &response);
> +
> +	cfg->response = response;
>   	if (ret)
>   		goto create_port_err;
>   
> @@ -437,7 +439,6 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
>   
>   	dlb2_list_init_head(&port_memory.list);
>   
> -	cfg->response = response;
>   
>   	return 0;
>   
> @@ -731,7 +732,9 @@ dlb2_eventdev_pci_init(struct rte_eventdev *eventdev)
>   		.hw_credit_quanta = DLB2_SW_CREDIT_BATCH_SZ,
>   		.default_depth_thresh = DLB2_DEPTH_THRESH_DEFAULT,
>   		.max_cq_depth = DLB2_DEFAULT_CQ_DEPTH,
> -		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH
> +		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH,
> +		.use_default_hl = true,
> +		.alloc_hl_entries = 0
>   	};
>   	struct dlb2_eventdev *dlb2;
>   	int q;
> diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
> index 027ac7413c..53556cb7ad 100644
> --- a/drivers/event/dlb2/rte_pmd_dlb2.h
> +++ b/drivers/event/dlb2/rte_pmd_dlb2.h
> @@ -82,6 +82,7 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
>    */
>   struct rte_pmd_dlb2_port_params {
>   	uint16_t inflight_threshold : 12;
> +	uint16_t port_hl;
>   };
>   
>   /*!


end of thread

Thread overview: 28+ messages
2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
2024-05-27 15:19   ` Jerin Jacob
2024-06-19 21:01   ` [PATCH v5 0/5] DLB2 Enhancements Abdullah Sevincer
2024-06-19 21:01     ` [PATCH v5 1/5] event/dlb2: add support for HW delayed token Abdullah Sevincer
2024-06-20 12:01       ` Jerin Jacob
2024-06-19 21:01     ` [PATCH v5 2/5] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
2024-06-19 21:01     ` [PATCH v5 3/5] event/dlb2: enhance DLB credit handling Abdullah Sevincer
2024-06-20 12:09       ` Jerin Jacob
2024-06-26  0:26         ` Sevincer, Abdullah
2024-06-26  9:37           ` Jerin Jacob
2024-07-12  0:17       ` [PATCH v6 0/3] DLB2 Enhancements Abdullah Sevincer
2024-07-12  0:17         ` [PATCH v6 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
2024-07-12  0:17         ` [PATCH v6 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
2024-07-23  6:45           ` Mattias Rönnblom
2024-07-12  0:17         ` [PATCH v6 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
2024-06-19 21:01     ` [PATCH v5 4/5] doc: update DLB2 documentation Abdullah Sevincer
2024-06-19 21:01     ` [PATCH v5 5/5] doc: update release notes for 24.07 Abdullah Sevincer
2024-06-20  7:02       ` David Marchand
2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
2024-05-27 15:23   ` Jerin Jacob
2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
2024-05-27 15:30   ` Jerin Jacob
2024-06-04 18:22     ` Sevincer, Abdullah
2024-06-05  4:02       ` Jerin Jacob
2024-06-19 21:07         ` Sevincer, Abdullah
2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
2024-05-02 15:52   ` Sevincer, Abdullah
