DPDK patches and discussions
* [PATCH v4 0/3] DLB2 Enhancements
@ 2024-05-01 19:46 Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This patchset addresses enhancements in the DLB driver.

Abdullah Sevincer (3):
  event/dlb2: add support for HW delayed token
  event/dlb2: add support for dynamic HL entries
  event/dlb2: enhance DLB credit handling

 app/test-eventdev/test_perf_common.c       |  20 +-
 drivers/event/dlb2/dlb2.c                  | 385 ++++++++++++++++++---
 drivers/event/dlb2/dlb2_iface.c            |   3 +
 drivers/event/dlb2/dlb2_iface.h            |   4 +-
 drivers/event/dlb2/dlb2_priv.h             |  16 +-
 drivers/event/dlb2/dlb2_user.h             |  24 ++
 drivers/event/dlb2/meson.build             |  12 +
 drivers/event/dlb2/meson_options.txt       |   6 +
 drivers/event/dlb2/pf/base/dlb2_regs.h     |   9 +
 drivers/event/dlb2/pf/base/dlb2_resource.c |  95 ++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h |  19 +
 drivers/event/dlb2/pf/dlb2_pf.c            |  28 +-
 drivers/event/dlb2/rte_pmd_dlb2.c          |  29 ++
 drivers/event/dlb2/rte_pmd_dlb2.h          |  41 +++
 drivers/event/dlb2/version.map             |   3 +
 15 files changed, 630 insertions(+), 64 deletions(-)
 create mode 100644 drivers/event/dlb2/meson_options.txt

-- 
2.25.1


* [PATCH v4 1/3] event/dlb2: add support for HW delayed token
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:19   ` Jerin Jacob
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

In DLB 2.5, a hardware assist is available, complementing the Delayed
token POP software implementation. When enabled, the feature works as
follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached, so the feature is activated only if the core is congested.
If the core can handle multiple atomic flows, DLB will not try to
switch them. This is an improvement over the SW implementation, which
always switches the flows.

The feature will resume CQ scheduling when the number of pending
completions falls below a configured threshold. To emulate older 2.0
behavior, this threshold is set to 1 by the old APIs. SW sets the CQ
to auto-pop mode for token return, as withholding tokens is no longer
necessary. As HW counts completions and not tokens, events equal to
the HL (History List) entries will be scheduled to DLB before the
feature activates and stops CQ scheduling.
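
For illustration, a minimal usage sketch of the API added by this patch
(the names come from rte_pmd_dlb2.h below; the threshold value and the
dev_id/port_id variables are illustrative):

	#include <rte_pmd_dlb2.h>

	/* Enable HW inflight (flow migration) control on an event port.
	 * Per the API contract below, this runs after
	 * rte_event_dev_configure() and before rte_event_port_setup().
	 */
	struct dlb2_port_param param = {
		.inflight_threshold = 32, /* illustrative threshold */
	};

	if (rte_pmd_dlb2_set_port_param(dev_id, port_id,
					DLB2_FLOW_MIGRATION_THRESHOLD,
					&param) < 0)
		/* Fall back to the SW delayed token pop implementation. */
		rte_pmd_dlb2_set_token_pop_mode(dev_id, port_id, DELAYED_POP);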

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c                  | 58 ++++++++++++-
 drivers/event/dlb2/dlb2_iface.c            |  3 +
 drivers/event/dlb2/dlb2_iface.h            |  4 +-
 drivers/event/dlb2/dlb2_priv.h             |  5 ++
 drivers/event/dlb2/dlb2_user.h             | 24 ++++++
 drivers/event/dlb2/pf/base/dlb2_regs.h     |  9 ++
 drivers/event/dlb2/pf/base/dlb2_resource.c | 95 +++++++++++++++++++++-
 drivers/event/dlb2/pf/base/dlb2_resource.h | 19 +++++
 drivers/event/dlb2/pf/dlb2_pf.c            | 21 +++++
 drivers/event/dlb2/rte_pmd_dlb2.c          | 29 +++++++
 drivers/event/dlb2/rte_pmd_dlb2.h          | 40 +++++++++
 drivers/event/dlb2/version.map             |  3 +
 12 files changed, 306 insertions(+), 4 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 628ddef649..d64274b01e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -879,8 +879,11 @@ dlb2_hw_reset_sched_domain(const struct rte_eventdev *dev, bool reconfig)
 	dlb2_iface_domain_reset(dlb2);
 
 	/* Free all dynamically allocated port memory */
-	for (i = 0; i < dlb2->num_ports; i++)
+	for (i = 0; i < dlb2->num_ports; i++) {
 		dlb2_free_qe_mem(&dlb2->ev_ports[i].qm_port);
+		if (!reconfig)
+			memset(&dlb2->ev_ports[i], 0, sizeof(struct dlb2_eventdev_port));
+	}
 
 	/* If reconfiguring, mark the device's queues and ports as "previously
 	 * configured." If the user doesn't reconfigure them, the PMD will
@@ -1525,7 +1528,7 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
 	struct dlb2_create_ldb_port_args cfg = { {0} };
 	int ret;
-	struct dlb2_port *qm_port = NULL;
+	struct dlb2_port *qm_port = &ev_port->qm_port;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	uint32_t qm_port_id;
 	uint16_t ldb_credit_high_watermark = 0;
@@ -1554,6 +1557,11 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	cfg.cq_depth = rte_align32pow2(dequeue_depth);
 	cfg.cq_depth_threshold = 1;
 
+	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
+		cfg.enable_inflight_ctrl = 1;
+		cfg.inflight_threshold = qm_port->inflight_threshold;
+	}
+
 	cfg.cq_history_list_size = cfg.cq_depth;
 
 	cfg.cos_id = ev_port->cos_id;
@@ -4321,6 +4329,52 @@ dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 		return dlb2_get_ldb_queue_depth(dlb2, queue);
 }
 
+int
+dlb2_set_port_param(struct dlb2_eventdev *dlb2,
+		    int port_id,
+		    uint64_t param_flags,
+		    void *param_val)
+{
+	struct dlb2_port_param *port_param = (struct dlb2_port_param *)param_val;
+	struct dlb2_port *port = &dlb2->ev_ports[port_id].qm_port;
+	struct dlb2_hw_dev *handle = &dlb2->qm_instance;
+	int ret = 0, bit = 0;
+
+	while (param_flags) {
+		uint64_t param = rte_bit_relaxed_test_and_clear64(bit++, &param_flags);
+
+		if (!param)
+			continue;
+		switch (RTE_BIT64(bit - 1)) {
+		case DLB2_FLOW_MIGRATION_THRESHOLD:
+			if (dlb2->version == DLB2_HW_V2_5) {
+				struct dlb2_cq_inflight_ctrl_args args;
+				args.enable = true;
+				args.port_id = port->id;
+				args.threshold = port_param->inflight_threshold;
+
+				if (dlb2->ev_ports[port_id].setup_done)
+					ret = dlb2_iface_set_cq_inflight_ctrl(handle, &args);
+				if (ret < 0) {
+					DLB2_LOG_ERR("dlb2: cannot set port parameters\n");
+					return -EINVAL;
+				}
+				port->enable_inflight_ctrl = true;
+				port->inflight_threshold = args.threshold;
+			} else {
+				DLB2_LOG_ERR("dlb2: FLOW_MIGRATION_THRESHOLD is only supported for 2.5 HW\n");
+				return -EINVAL;
+			}
+			break;
+		default:
+			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
+			return -EINVAL;
+		}
+	}
+
+	return ret;
+}
+
 static bool
 dlb2_queue_is_empty(struct dlb2_eventdev *dlb2,
 		    struct dlb2_eventdev_queue *queue)
diff --git a/drivers/event/dlb2/dlb2_iface.c b/drivers/event/dlb2/dlb2_iface.c
index 100db434d0..b829da2454 100644
--- a/drivers/event/dlb2/dlb2_iface.c
+++ b/drivers/event/dlb2/dlb2_iface.c
@@ -77,5 +77,8 @@ int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 				   struct dlb2_enable_cq_weight_args *args);
 
+int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+				       struct dlb2_cq_inflight_ctrl_args *args);
+
 int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 			     struct dlb2_set_cos_bw_args *args);
diff --git a/drivers/event/dlb2/dlb2_iface.h b/drivers/event/dlb2/dlb2_iface.h
index dc0c446ce8..55b6bdcf84 100644
--- a/drivers/event/dlb2/dlb2_iface.h
+++ b/drivers/event/dlb2/dlb2_iface.h
@@ -72,10 +72,12 @@ extern int (*dlb2_iface_get_ldb_queue_depth)(struct dlb2_hw_dev *handle,
 extern int (*dlb2_iface_get_dir_queue_depth)(struct dlb2_hw_dev *handle,
 				struct dlb2_get_dir_queue_depth_args *args);
 
-
 extern int (*dlb2_iface_enable_cq_weight)(struct dlb2_hw_dev *handle,
 					  struct dlb2_enable_cq_weight_args *args);
 
+extern int (*dlb2_iface_set_cq_inflight_ctrl)(struct dlb2_hw_dev *handle,
+					      struct dlb2_cq_inflight_ctrl_args *args);
+
 extern int (*dlb2_iface_set_cos_bw)(struct dlb2_hw_dev *handle,
 				    struct dlb2_set_cos_bw_args *args);
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index 49f1c6691d..d6828aa482 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -389,6 +389,8 @@ struct dlb2_port {
 	bool use_avx512;
 	uint32_t cq_weight;
 	bool is_producer; /* True if port is of type producer */
+	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
+	bool enable_inflight_ctrl; /* DLB2.5: enable HW inflight control */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -718,6 +720,9 @@ int dlb2_secondary_eventdev_probe(struct rte_eventdev *dev,
 uint32_t dlb2_get_queue_depth(struct dlb2_eventdev *dlb2,
 			      struct dlb2_eventdev_queue *queue);
 
+int dlb2_set_port_param(struct dlb2_eventdev *dlb2, int port_id,
+			uint64_t flags, void *val);
+
 int dlb2_parse_params(const char *params,
 		      const char *name,
 		      struct dlb2_devargs *dlb2_args,
diff --git a/drivers/event/dlb2/dlb2_user.h b/drivers/event/dlb2/dlb2_user.h
index 8739e2a5ac..ca09c65ac4 100644
--- a/drivers/event/dlb2/dlb2_user.h
+++ b/drivers/event/dlb2/dlb2_user.h
@@ -472,6 +472,8 @@ struct dlb2_create_ldb_port_args {
 	__u16 cq_history_list_size;
 	__u8 cos_id;
 	__u8 cos_strict;
+	__u8 enable_inflight_ctrl;
+	__u16 inflight_threshold;
 };
 
 /*
@@ -717,6 +719,28 @@ struct dlb2_enable_cq_weight_args {
 	__u32 limit;
 };
 
+/*
+ * DLB2_DOMAIN_CMD_SET_CQ_INFLIGHT_CTRL: Set Per-CQ inflight control for
+ * {ATM,UNO,ORD} QEs.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - enable: True if inflight control is enabled. False otherwise
+ * - threshold: Per CQ inflight threshold.
+ *
+ * Output parameters:
+ * - response.status: Detailed error code. In certain cases, such as if the
+ *	ioctl request arg is invalid, the driver won't set status.
+ */
+struct dlb2_cq_inflight_ctrl_args {
+	/* Output parameters */
+	struct dlb2_cmd_response response;
+	/* Input parameters */
+	__u32 port_id;
+	__u16 enable;
+	__u16 threshold;
+};
+
 /*
  * Mapping sizes for memory mapping the consumer queue (CQ) memory space, and
  * producer port (PP) MMIO space.
diff --git a/drivers/event/dlb2/pf/base/dlb2_regs.h b/drivers/event/dlb2/pf/base/dlb2_regs.h
index 7167f3d2ff..b639a5b659 100644
--- a/drivers/event/dlb2/pf/base/dlb2_regs.h
+++ b/drivers/event/dlb2/pf/base/dlb2_regs.h
@@ -3238,6 +3238,15 @@
 #define DLB2_LSP_CQ_LDB_INFL_LIM_LIMIT_LOC	0
 #define DLB2_LSP_CQ_LDB_INFL_LIM_RSVD0_LOC	12
 
+#define DLB2_LSP_CQ_LDB_INFL_THRESH(x) \
+	(0x90580000 + (x) * 0x1000)
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RST 0x0
+
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH	0x00000FFF
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0	0xFFFFF000
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH_LOC	0
+#define DLB2_LSP_CQ_LDB_INFL_THRESH_RSVD0_LOC	12
+
 #define DLB2_V2LSP_CQ_LDB_TKN_CNT(x) \
 	(0xa0580000 + (x) * 0x1000)
 #define DLB2_V2_5LSP_CQ_LDB_TKN_CNT(x) \
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.c b/drivers/event/dlb2/pf/base/dlb2_resource.c
index 7ce3e3531c..051d7e51c3 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.c
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.c
@@ -3062,10 +3062,14 @@ static void __dlb2_domain_reset_ldb_port_registers(struct dlb2_hw *hw,
 		    DLB2_CHP_LDB_CQ_DEPTH(hw->ver, port->id.phys_id),
 		    DLB2_CHP_LDB_CQ_DEPTH_RST);
 
-	if (hw->ver != DLB2_HW_V2)
+	if (hw->ver != DLB2_HW_V2) {
 		DLB2_CSR_WR(hw,
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT(port->id.phys_id),
 			    DLB2_LSP_CFG_CQ_LDB_WU_LIMIT_RST);
+		DLB2_CSR_WR(hw,
+			    DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    DLB2_LSP_CQ_LDB_INFL_THRESH_RST);
+	}
 
 	DLB2_CSR_WR(hw,
 		    DLB2_LSP_CQ_LDB_INFL_LIM(hw->ver, port->id.phys_id),
@@ -4446,6 +4450,20 @@ static int dlb2_ldb_port_configure_cq(struct dlb2_hw *hw,
 	reg = 0;
 	DLB2_CSR_WR(hw, DLB2_LSP_CQ2PRIOV(hw->ver, port->id.phys_id), reg);
 
+	if (hw->ver == DLB2_HW_V2_5) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->enable_inflight_ctrl,
+				DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+		DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+		if (args->enable_inflight_ctrl) {
+			reg = 0;
+			DLB2_BITS_SET(reg, args->inflight_threshold,
+					DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+			DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id), reg);
+		}
+	}
+
 	return 0;
 }
 
@@ -5464,6 +5482,35 @@ dlb2_get_domain_used_ldb_port(u32 id,
 	return NULL;
 }
 
+static struct dlb2_ldb_port *
+dlb2_get_domain_ldb_port(u32 id,
+			 bool vdev_req,
+			 struct dlb2_hw_domain *domain)
+{
+	struct dlb2_list_entry *iter __attribute__((unused));
+	struct dlb2_ldb_port *port;
+	int i;
+
+	if (id >= DLB2_MAX_NUM_LDB_PORTS)
+		return NULL;
+
+	for (i = 0; i < DLB2_NUM_COS_DOMAINS; i++) {
+		DLB2_DOM_LIST_FOR(domain->used_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+
+		DLB2_DOM_LIST_FOR(domain->avail_ldb_ports[i], port, iter) {
+			if ((!vdev_req && port->id.phys_id == id) ||
+			    (vdev_req && port->id.virt_id == id))
+				return port;
+		}
+	}
+
+	return NULL;
+}
+
 static void dlb2_ldb_port_change_qid_priority(struct dlb2_hw *hw,
 					      struct dlb2_ldb_port *port,
 					      int slot,
@@ -6816,3 +6863,49 @@ int dlb2_hw_set_cos_bandwidth(struct dlb2_hw *hw, u32 cos_id, u8 bandwidth)
 
 	return 0;
 }
+
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id)
+{
+	struct dlb2_hw_domain *domain;
+	struct dlb2_ldb_port *port;
+	u32 reg = 0;
+	int id;
+
+	domain = dlb2_get_domain_from_id(hw, domain_id, vdev_req, vdev_id);
+	if (!domain) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: domain not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	id = args->port_id;
+
+	port = dlb2_get_domain_ldb_port(id, vdev_req, domain);
+	if (!port) {
+		DLB2_HW_ERR(hw,
+			    "[%s():%d] Internal error: port not found\n",
+			    __func__, __LINE__);
+		return -EINVAL;
+	}
+
+	DLB2_BITS_SET(reg, args->enable,
+		      DLB2_LSP_CFG_CTRL_GENERAL_0_ENAB_IF_THRESH_V2_5);
+	DLB2_CSR_WR(hw, DLB2_V2_5LSP_CFG_CTRL_GENERAL_0, reg);
+
+	if (args->enable) {
+		reg = 0;
+		DLB2_BITS_SET(reg, args->threshold,
+			      DLB2_LSP_CQ_LDB_INFL_THRESH_THRESH);
+		DLB2_CSR_WR(hw, DLB2_LSP_CQ_LDB_INFL_THRESH(port->id.phys_id),
+			    reg);
+	}
+
+	resp->status = 0;
+
+	return 0;
+}
diff --git a/drivers/event/dlb2/pf/base/dlb2_resource.h b/drivers/event/dlb2/pf/base/dlb2_resource.h
index 71bd6148f1..17cc745824 100644
--- a/drivers/event/dlb2/pf/base/dlb2_resource.h
+++ b/drivers/event/dlb2/pf/base/dlb2_resource.h
@@ -1956,4 +1956,23 @@ int dlb2_hw_enable_cq_weight(struct dlb2_hw *hw,
 			     bool vdev_request,
 			     unsigned int vdev_id);
 
+/**
+ * This function configures the inflight control threshold for a CQ.
+ *
+ * This must be called after creating the port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb2_error. On success,
+ * resp->status is set to 0.
+ *
+ * Errors:
+ * EINVAL - The domain or port is not configured.
+ */
+int dlb2_hw_set_cq_inflight_ctrl(struct dlb2_hw *hw, u32 domain_id,
+		struct dlb2_cq_inflight_ctrl_args *args,
+		struct dlb2_cmd_response *resp,
+		bool vdev_req,
+		unsigned int vdev_id);
+
 #endif /* __DLB2_RESOURCE_H */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 3d15250e11..249ed7ede9 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -665,6 +665,26 @@ dlb2_pf_set_cos_bandwidth(struct dlb2_hw_dev *handle,
 	return ret;
 }
 
+static int
+dlb2_pf_set_cq_inflight_ctrl(struct dlb2_hw_dev *handle,
+			     struct dlb2_cq_inflight_ctrl_args *args)
+{
+	struct dlb2_dev *dlb2_dev = (struct dlb2_dev *)handle->pf_dev;
+	struct dlb2_cmd_response response = {0};
+	int ret = 0;
+
+	DLB2_INFO(dev->dlb2_device, "Entering %s()\n", __func__);
+
+	ret = dlb2_hw_set_cq_inflight_ctrl(&dlb2_dev->hw, handle->domain_id,
+					   args, &response, false, 0);
+	args->response = response;
+
+	DLB2_INFO(dev->dlb2_device, "Exiting %s() with ret=%d\n",
+		  __func__, ret);
+
+	return ret;
+}
+
 static void
 dlb2_pf_iface_fn_ptrs_init(void)
 {
@@ -691,6 +711,7 @@ dlb2_pf_iface_fn_ptrs_init(void)
 	dlb2_iface_get_sn_occupancy = dlb2_pf_get_sn_occupancy;
 	dlb2_iface_enable_cq_weight = dlb2_pf_enable_cq_weight;
 	dlb2_iface_set_cos_bw = dlb2_pf_set_cos_bandwidth;
+	dlb2_iface_set_cq_inflight_ctrl = dlb2_pf_set_cq_inflight_ctrl;
 }
 
 /* PCI DEV HOOKS */
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.c b/drivers/event/dlb2/rte_pmd_dlb2.c
index 43990e46ac..c72a42b466 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.c
+++ b/drivers/event/dlb2/rte_pmd_dlb2.c
@@ -33,7 +33,36 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 	if (port_id >= dlb2->num_ports || dlb2->ev_ports[port_id].setup_done)
 		return -EINVAL;
 
+	if (dlb2->version == DLB2_HW_V2_5 && mode == DELAYED_POP) {
+		dlb2->ev_ports[port_id].qm_port.enable_inflight_ctrl = true;
+		dlb2->ev_ports[port_id].qm_port.inflight_threshold = 1;
+		mode = AUTO_POP;
+	}
+
 	dlb2->ev_ports[port_id].qm_port.token_pop_mode = mode;
 
 	return 0;
 }
+
+int
+rte_pmd_dlb2_set_port_param(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    void *val)
+{
+	struct dlb2_eventdev *dlb2;
+	struct rte_eventdev *dev;
+
+	if (val == NULL)
+		return -EINVAL;
+
+	RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL);
+	dev = &rte_eventdevs[dev_id];
+
+	dlb2 = dlb2_pmd_priv(dev);
+
+	if (port_id >= dlb2->num_ports)
+		return -EINVAL;
+
+	return dlb2_set_port_param(dlb2, port_id, flags, val);
+}
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 334c6c356d..6e78dfb5a5 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -67,6 +67,46 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 				uint8_t port_id,
 				enum dlb2_token_pop_mode mode);
 
+/** Set inflight threshold for flow migration */
+#define DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
+
+/** Set port history list */
+#define DLB2_SET_PORT_HL RTE_BIT64(1)
+
+struct dlb2_port_param {
+	uint16_t inflight_threshold : 12;
+};
+
+/*!
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Configure various port parameters.
+ * This function must be called before calling rte_event_port_setup()
+ * for the port, but after calling rte_event_dev_configure().
+ *
+ * @param dev_id
+ *    The identifier of the event device.
+ * @param port_id
+ *    The identifier of the event port.
+ * @param flags
+ *    Bitmask of the parameters being set.
+ * @param val
+ *    Structure containing the values of the parameters being set.
+ *
+ * @return
+ * - 0: Success
+ * - EINVAL: Invalid dev_id, port_id, or flags
+ * - EINVAL: The DLB2 is not configured, is already running, or the port is
+ *   already setup
+ */
+__rte_experimental
+int
+rte_pmd_dlb2_set_port_param(uint8_t dev_id,
+			    uint8_t port_id,
+			    uint64_t flags,
+			    void *val);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/event/dlb2/version.map b/drivers/event/dlb2/version.map
index 1d0a0a75d7..5078e4960a 100644
--- a/drivers/event/dlb2/version.map
+++ b/drivers/event/dlb2/version.map
@@ -5,6 +5,9 @@ DPDK_24 {
 EXPERIMENTAL {
 	global:
 
+	# added in 24.07
+	rte_pmd_dlb2_set_port_param;
+
 	# added in 20.11
 	rte_pmd_dlb2_set_token_pop_mode;
 };
-- 
2.25.1


* [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:23   ` Jerin Jacob
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
  3 siblings, 1 reply; 9+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

In DLB 2.5, a hardware assist is available, complementing the Delayed
token POP software implementation. When enabled, the feature works as
follows:

It stops CQ scheduling when the inflight limit associated with the CQ
is reached, so the feature is activated only if the core is congested.
If the core can handle multiple atomic flows, DLB will not try to
switch them. This is an improvement over the SW implementation, which
always switches the flows.

The feature will resume CQ scheduling when the number of pending
completions falls below a configured threshold.

DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
the HL entries available per LDB port equal 2048 / 64 = 32. So the
maximum CQ depth possible is 16, if all 64 LDB ports are needed in a
high-performance setting.

In case all CQs are configured to have HL = 2 * CQ depth as a
performance option, the calculation of HL at the time of domain
creation will be based on the maximum possible dequeue depth. This
could result in allocating too many HL entries to the domain, as DLB
only has a limited number of HL entries to allocate. Hence, it is best
to allow the application to specify HL entries as a command line
argument and override the default allocation. A summary of usage is
listed below:

When 'use_default_hl = 1', the per-port HL is set to
DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
alloc_hl_entries is ignored.

When 'use_default_hl = 0', per-LDB-port HL = 2 * CQ depth and the
default per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.

The user should calculate the needed HL entries based on the CQ depths
the application will use and specify them with the command line
parameter 'alloc_hl_entries', which is used to allocate HL entries.
Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths * 2).

If alloc_hl_entries is not specified, the total HL entries for the
vdev = num_ldb_ports * 64.
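
As a worked example of the sizing rule above: an application using 4
LDB ports, each with CQ depth 32, would pass alloc_hl_entries =
4 * 32 * 2 = 256. A hypothetical invocation (the PCI address and
test-eventdev options are placeholders):

	dpdk-test-eventdev -a 0000:6d:00.0,use_default_hl=0,alloc_hl_entries=256 \
		-- --test=perf_queue --plcores=1 --wlcores=2,3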

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 drivers/event/dlb2/dlb2.c         | 124 ++++++++++++++++++++++++++++--
 drivers/event/dlb2/dlb2_priv.h    |  10 ++-
 drivers/event/dlb2/pf/dlb2_pf.c   |   7 +-
 drivers/event/dlb2/rte_pmd_dlb2.h |   1 +
 4 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index d64274b01e..11bbe30d7b 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -180,10 +180,7 @@ dlb2_hw_query_resources(struct dlb2_eventdev *dlb2)
 	 * The capabilities (CAPs) were set at compile time.
 	 */
 
-	if (dlb2->max_cq_depth != DLB2_DEFAULT_CQ_DEPTH)
-		num_ldb_ports = DLB2_MAX_HL_ENTRIES / dlb2->max_cq_depth;
-	else
-		num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
+	num_ldb_ports = dlb2->hw_rsrc_query_results.num_ldb_ports;
 
 	evdev_dlb2_default_info.max_event_queues =
 		dlb2->hw_rsrc_query_results.num_ldb_queues;
@@ -631,6 +628,52 @@ set_enable_cq_weight(const char *key __rte_unused,
 	return 0;
 }
 
+static int set_hl_override(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	bool *default_hl = opaque;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	if ((*value == 'n') || (*value == 'N') || (*value == '0'))
+		*default_hl = false;
+	else
+		*default_hl = true;
+
+	return 0;
+}
+
+static int set_hl_entries(const char *key __rte_unused,
+		const char *value,
+		void *opaque)
+{
+	int hl_entries = 0;
+	int ret;
+
+	if (value == NULL || opaque == NULL) {
+		DLB2_LOG_ERR("NULL pointer\n");
+		return -EINVAL;
+	}
+
+	ret = dlb2_string_to_int(&hl_entries, value);
+	if (ret < 0)
+		return ret;
+
+	if ((uint32_t)hl_entries > DLB2_MAX_HL_ENTRIES) {
+		DLB2_LOG_ERR(
+		    "alloc_hl_entries %u out of range, must be in [1 - %d]\n",
+		    hl_entries, DLB2_MAX_HL_ENTRIES);
+		return -EINVAL;
+	}
+	*(uint32_t *)opaque = hl_entries;
+
+	return 0;
+}
+
 static int
 set_qid_depth_thresh(const char *key __rte_unused,
 		     const char *value,
@@ -828,8 +871,15 @@ dlb2_hw_create_sched_domain(struct dlb2_eventdev *dlb2,
 		DLB2_NUM_ATOMIC_INFLIGHTS_PER_QUEUE *
 		cfg->num_ldb_queues;
 
-	cfg->num_hist_list_entries = resources_asked->num_ldb_ports *
-		evdev_dlb2_default_info.max_event_port_dequeue_depth;
+	/* If hl_entries is non-zero, the user specified the command line
+	 * option; otherwise compute using default_port_hl, which was set
+	 * earlier based on the use_default_hl option.
+	 */
+	if (dlb2->hl_entries)
+		cfg->num_hist_list_entries = dlb2->hl_entries;
+	else
+		cfg->num_hist_list_entries =
+		    resources_asked->num_ldb_ports * dlb2->default_port_hl;
 
 	if (device_version == DLB2_HW_V2_5) {
 		DLB2_LOG_DBG("sched domain create - ldb_qs=%d, ldb_ports=%d, dir_ports=%d, atomic_inflights=%d, hist_list_entries=%d, credits=%d\n",
@@ -1041,7 +1091,7 @@ dlb2_eventdev_port_default_conf_get(struct rte_eventdev *dev,
 	struct dlb2_eventdev *dlb2 = dlb2_pmd_priv(dev);
 
 	port_conf->new_event_threshold = dlb2->new_event_limit;
-	port_conf->dequeue_depth = 32;
+	port_conf->dequeue_depth = dlb2->default_port_hl / 2;
 	port_conf->enqueue_depth = DLB2_MAX_ENQUEUE_DEPTH;
 	port_conf->event_port_cfg = 0;
 }
@@ -1560,9 +1610,16 @@ dlb2_hw_create_ldb_port(struct dlb2_eventdev *dlb2,
 	if (dlb2->version == DLB2_HW_V2_5 && qm_port->enable_inflight_ctrl) {
 		cfg.enable_inflight_ctrl = 1;
 		cfg.inflight_threshold = qm_port->inflight_threshold;
+		if (!qm_port->hist_list)
+			qm_port->hist_list = cfg.cq_depth;
 	}
 
-	cfg.cq_history_list_size = cfg.cq_depth;
+	if (qm_port->hist_list)
+		cfg.cq_history_list_size = qm_port->hist_list;
+	else if (dlb2->default_port_hl == DLB2_FIXED_CQ_HL_SIZE)
+		cfg.cq_history_list_size = DLB2_FIXED_CQ_HL_SIZE;
+	else
+		cfg.cq_history_list_size = cfg.cq_depth * 2;
 
 	cfg.cos_id = ev_port->cos_id;
 	cfg.cos_strict = 0; /* best efforts */
@@ -4366,6 +4423,13 @@ dlb2_set_port_param(struct dlb2_eventdev *dlb2,
 				return -EINVAL;
 			}
 			break;
+		case DLB2_SET_PORT_HL:
+			if (dlb2->ev_ports[port_id].setup_done) {
+				DLB2_LOG_ERR("DLB2_SET_PORT_HL must be called before setting up port\n");
+				return -EINVAL;
+			}
+			port->hist_list = port_param->port_hl;
+			break;
 		default:
 			DLB2_LOG_ERR("dlb2: Unsupported flag\n");
 			return -EINVAL;
@@ -4684,6 +4748,28 @@ dlb2_primary_eventdev_probe(struct rte_eventdev *dev,
 		return err;
 	}
 
+	if (dlb2_args->use_default_hl) {
+		dlb2->default_port_hl = DLB2_FIXED_CQ_HL_SIZE;
+		if (dlb2_args->alloc_hl_entries)
+			DLB2_LOG_ERR(": Ignoring 'alloc_hl_entries' and using "
+				     "default history list sizes for eventdev:"
+				     " %s\n", dev->data->name);
+		dlb2->hl_entries = 0;
+	} else {
+		dlb2->default_port_hl = 2 * DLB2_FIXED_CQ_HL_SIZE;
+
+		if (dlb2_args->alloc_hl_entries >
+		    dlb2->hw_rsrc_query_results.num_hist_list_entries) {
+			DLB2_LOG_ERR(": Insufficient HL entries asked=%d "
+				     "available=%d for eventdev: %s\n",
+				     dlb2_args->alloc_hl_entries,
+				     dlb2->hw_rsrc_query_results.num_hist_list_entries,
+				     dev->data->name);
+			return -EINVAL;
+		}
+		dlb2->hl_entries = dlb2_args->alloc_hl_entries;
+	}
+
 	dlb2_iface_hardware_init(&dlb2->qm_instance);
 
 	/* configure class of service */
@@ -4791,6 +4877,8 @@ dlb2_parse_params(const char *params,
 					     DLB2_PRODUCER_COREMASK,
 					     DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG,
 					     DLB2_ENABLE_CQ_WEIGHT_ARG,
+					     DLB2_USE_DEFAULT_HL,
+					     DLB2_ALLOC_HL_ENTRIES,
 					     NULL };
 
 	if (params != NULL && params[0] != '\0') {
@@ -4994,6 +5082,26 @@ dlb2_parse_params(const char *params,
 				return ret;
 			}
 
+			ret = rte_kvargs_process(kvlist, DLB2_USE_DEFAULT_HL,
+						 set_hl_override,
+						 &dlb2_args->use_default_hl);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing use_default_hl arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
+			ret = rte_kvargs_process(kvlist, DLB2_ALLOC_HL_ENTRIES,
+						 set_hl_entries,
+						 &dlb2_args->alloc_hl_entries);
+			if (ret != 0) {
+				DLB2_LOG_ERR("%s: Error parsing alloc_hl_entries arg",
+					     name);
+				rte_kvargs_free(kvlist);
+				return ret;
+			}
+
 			rte_kvargs_free(kvlist);
 		}
 	}
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index d6828aa482..dc9f98e142 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -52,6 +52,8 @@
 #define DLB2_PRODUCER_COREMASK "producer_coremask"
 #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
 #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
+#define DLB2_USE_DEFAULT_HL "use_default_hl"
+#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"
 
 /* Begin HW related defines and structs */
 
@@ -101,7 +103,8 @@
  */
 #define DLB2_MAX_HL_ENTRIES 2048
 #define DLB2_MIN_CQ_DEPTH 1
-#define DLB2_DEFAULT_CQ_DEPTH 32
+#define DLB2_DEFAULT_CQ_DEPTH 32  /* Can be overridden using max_cq_depth command line parameter */
+#define DLB2_FIXED_CQ_HL_SIZE 32  /* Used when ENABLE_FIXED_HL_SIZE is true */
 #define DLB2_MIN_HARDWARE_CQ_DEPTH 8
 #define DLB2_NUM_HIST_LIST_ENTRIES_PER_LDB_PORT \
 	DLB2_DEFAULT_CQ_DEPTH
@@ -391,6 +394,7 @@ struct dlb2_port {
 	bool is_producer; /* True if port is of type producer */
 	uint16_t inflight_threshold; /* DLB2.5 HW inflight threshold */
 	bool enable_inflight_ctrl; /* DLB2.5: enable HW inflight control */
+	uint16_t hist_list; /* Port history list */
 };
 
 /* Per-process per-port mmio and memory pointers */
@@ -640,6 +644,8 @@ struct dlb2_eventdev {
 	uint32_t cos_bw[DLB2_COS_NUM_VALS]; /* bandwidth per cos domain */
 	uint8_t max_cos_port; /* Max LDB port from any cos */
 	bool enable_cq_weight;
+	uint16_t hl_entries; /* Num HL entries to allocate for the domain */
+	int default_port_hl; /* Fixed or dynamic (2 * CQ depth) HL assignment */
 };
 
 /* used for collecting and passing around the dev args */
@@ -678,6 +684,8 @@ struct dlb2_devargs {
 	const char *producer_coremask;
 	bool default_ldb_port_allocation;
 	bool enable_cq_weight;
+	bool use_default_hl;
+	uint32_t alloc_hl_entries;
 };
 
 /* End Eventdev related defines and structs */
diff --git a/drivers/event/dlb2/pf/dlb2_pf.c b/drivers/event/dlb2/pf/dlb2_pf.c
index 249ed7ede9..ba22f37731 100644
--- a/drivers/event/dlb2/pf/dlb2_pf.c
+++ b/drivers/event/dlb2/pf/dlb2_pf.c
@@ -422,6 +422,7 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 				      cfg,
 				      cq_base,
 				      &response);
+	cfg->response = response;
 	if (ret)
 		goto create_port_err;
 
@@ -437,8 +438,6 @@ dlb2_pf_dir_port_create(struct dlb2_hw_dev *handle,
 
 	dlb2_list_init_head(&port_memory.list);
 
-	cfg->response = response;
-
 	return 0;
 
 create_port_err:
@@ -731,7 +730,9 @@ dlb2_eventdev_pci_init(struct rte_eventdev *eventdev)
 		.hw_credit_quanta = DLB2_SW_CREDIT_BATCH_SZ,
 		.default_depth_thresh = DLB2_DEPTH_THRESH_DEFAULT,
 		.max_cq_depth = DLB2_DEFAULT_CQ_DEPTH,
-		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH
+		.max_enq_depth = DLB2_MAX_ENQUEUE_DEPTH,
+		.use_default_hl = true,
+		.alloc_hl_entries = 0
 	};
 	struct dlb2_eventdev *dlb2;
 	int q;
diff --git a/drivers/event/dlb2/rte_pmd_dlb2.h b/drivers/event/dlb2/rte_pmd_dlb2.h
index 6e78dfb5a5..91b47ede11 100644
--- a/drivers/event/dlb2/rte_pmd_dlb2.h
+++ b/drivers/event/dlb2/rte_pmd_dlb2.h
@@ -75,6 +75,7 @@ rte_pmd_dlb2_set_token_pop_mode(uint8_t dev_id,
 
 struct dlb2_port_param {
 	uint16_t inflight_threshold : 12;
+	uint16_t port_hl;
 };
 
 /*!
-- 
2.25.1


* [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-05-01 19:46 ` Abdullah Sevincer
  2024-05-27 15:30   ` Jerin Jacob
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
  3 siblings, 1 reply; 9+ messages in thread
From: Abdullah Sevincer @ 2024-05-01 19:46 UTC (permalink / raw)
  To: dev
  Cc: jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria, Abdullah Sevincer

This commit improves DLB credit handling in scenarios where ports
hold on to credits but cannot release them due to insufficient
accumulation (less than 2 * credit quanta).

Worker ports now release all accumulated credits when the back-to-back
zero-poll count reaches a preset threshold.

Producer ports release all accumulated credits if enqueue fails for a
consecutive number of retries.

In a multi-producer system, some producer(s) may exit early while
holding on to credits. These are now released during port unlink,
which must be performed by the application.

test-eventdev is modified to call rte_event_port_unlink() to release
any credits accumulated by producer ports.
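
For illustration, the producer teardown now follows this sketch
(mirroring the test_perf_common.c change below; p is the test's
prod_data pointer):

	/* Unlink the producer port once the enqueue loop exits so any
	 * credits still cached by the port are returned to the pool.
	 */
	rte_event_port_unlink(p->dev_id, p->port_id, &p->queue_id, 1);

The build-time checks can be tuned by editing the new
drivers/event/dlb2/meson_options.txt added below, e.g. setting
DLB_HW_CREDITS_CHECKS = 0 to compile out HW credit checks (only safe
under the conditions documented in dlb2.c).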

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
---
 app/test-eventdev/test_perf_common.c |  20 +--
 drivers/event/dlb2/dlb2.c            | 203 +++++++++++++++++++++------
 drivers/event/dlb2/dlb2_priv.h       |   1 +
 drivers/event/dlb2/meson.build       |  12 ++
 drivers/event/dlb2/meson_options.txt |   6 +
 5 files changed, 194 insertions(+), 48 deletions(-)
 create mode 100644 drivers/event/dlb2/meson_options.txt

diff --git a/app/test-eventdev/test_perf_common.c b/app/test-eventdev/test_perf_common.c
index 93e6132de8..b3a12e12ac 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -854,6 +854,7 @@ perf_producer_wrapper(void *arg)
 	struct rte_event_dev_info dev_info;
 	struct prod_data *p  = arg;
 	struct test_perf *t = p->t;
+	int ret = 0;
 
 	rte_event_dev_info_get(p->dev_id, &dev_info);
 	if (!t->opt->prod_enq_burst_sz) {
@@ -870,29 +871,32 @@ perf_producer_wrapper(void *arg)
 	 */
 	if (t->opt->prod_type == EVT_PROD_TYPE_SYNT &&
 			t->opt->prod_enq_burst_sz == 1)
-		return perf_producer(arg);
+		ret = perf_producer(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_SYNT &&
 			t->opt->prod_enq_burst_sz > 1) {
 		if (dev_info.max_event_port_enqueue_depth == 1)
 			evt_err("This event device does not support burst mode");
 		else
-			return perf_producer_burst(arg);
+			ret = perf_producer_burst(arg);
 	}
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_TIMER_ADPTR &&
 			!t->opt->timdev_use_burst)
-		return perf_event_timer_producer(arg);
+		ret = perf_event_timer_producer(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_TIMER_ADPTR &&
 			t->opt->timdev_use_burst)
-		return perf_event_timer_producer_burst(arg);
+		ret = perf_event_timer_producer_burst(arg);
 	else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_CRYPTO_ADPTR) {
 		if (t->opt->prod_enq_burst_sz > 1)
-			return perf_event_crypto_producer_burst(arg);
+			ret = perf_event_crypto_producer_burst(arg);
 		else
-			return perf_event_crypto_producer(arg);
+			ret = perf_event_crypto_producer(arg);
 	} else if (t->opt->prod_type == EVT_PROD_TYPE_EVENT_DMA_ADPTR)
-		return perf_event_dma_producer(arg);
+		ret = perf_event_dma_producer(arg);
 
-	return 0;
+	/* Unlink port to release any acquired HW resources */
+	rte_event_port_unlink(p->dev_id, p->port_id, &p->queue_id, 1);
+
+	return ret;
 }
 
 static inline uint64_t
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 11bbe30d7b..2c341a5845 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -43,7 +43,47 @@
  * to DLB can go ahead of relevant application writes like updates to buffers
  * being sent with event
  */
+#ifndef DLB2_BYPASS_FENCE_ON_PP
 #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
+#endif
+
+/* HW credit checks can only be turned off for a DLB2 device if the
+ * following is true for each created eventdev:
+ * LDB credits <= DIR credits + minimum CQ Depth
+ * (CQ Depth is the minimum across all ports configured within the eventdev)
+ * This needs to be true for all eventdevs created on any DLB2 device
+ * managed by this driver.
+ * DLB2.5 does not have any such restriction, as it has a single credit pool.
+ */
+#ifndef DLB_HW_CREDITS_CHECKS
+#define DLB_HW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * SW credit checks can only be turned off if the application has a way
+ * to limit input events to the eventdev below the assigned credit limit.
+ */
+#ifndef DLB_SW_CREDITS_CHECKS
+#define DLB_SW_CREDITS_CHECKS 1
+#endif
+
+/*
+ * To avoid deadlock situations, the per-port new_event_threshold check
+ * is disabled by default. nb_events_limit is still checked while allocating
+ * new event credits.
+ */
+#define ENABLE_PORT_THRES_CHECK 1
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive zero dequeues.
+ */
+#define DLB2_ZERO_DEQ_CREDIT_RETURN_THRES 16384
+
+/*
+ * To avoid deadlock, ports holding on to credits will release them after
+ * this many consecutive enqueue failures.
+ */
+#define DLB2_ENQ_FAIL_CREDIT_RETURN_THRES 100
 
 /*
  * Resources exposed to eventdev. Some values overridden at runtime using
@@ -2488,6 +2528,61 @@ dlb2_event_queue_detach_ldb(struct dlb2_eventdev *dlb2,
 	return ret;
 }
 
+static inline void
+dlb2_port_credits_return(struct dlb2_port *qm_port)
+{
+	/* Return all port credits */
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		if (qm_port->cached_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_COMBINED_POOL],
+					   qm_port->cached_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_credits = 0;
+		}
+	} else {
+		if (qm_port->cached_ldb_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   qm_port->cached_ldb_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_ldb_credits = 0;
+		}
+		if (qm_port->cached_dir_credits) {
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   qm_port->cached_dir_credits, __ATOMIC_SEQ_CST);
+			qm_port->cached_dir_credits = 0;
+		}
+	}
+}
+
+static inline void
+dlb2_release_sw_credits(struct dlb2_eventdev *dlb2,
+			struct dlb2_eventdev_port *ev_port, uint16_t val)
+{
+	if (ev_port->inflight_credits) {
+		__atomic_fetch_sub(&dlb2->inflights, val, __ATOMIC_SEQ_CST);
+		ev_port->inflight_credits -= val;
+	}
+}
+
+static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
+					  bool cond, uint32_t threshold)
+{
+#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS
+	if (cond) {
+		if (++ev_port->credit_return_count > threshold) {
+#if DLB_SW_CREDITS_CHECKS
+			dlb2_release_sw_credits(ev_port->dlb2, ev_port,
+						ev_port->inflight_credits);
+#endif
+#if DLB_HW_CREDITS_CHECKS
+			dlb2_port_credits_return(&ev_port->qm_port);
+#endif
+			ev_port->credit_return_count = 0;
+		}
+	} else {
+		ev_port->credit_return_count = 0;
+	}
+#endif
+}
+
 static int
 dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 			  uint8_t queues[], uint16_t nb_unlinks)
@@ -2507,14 +2602,15 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 
 	if (queues == NULL || nb_unlinks == 0) {
 		DLB2_LOG_DBG("dlb2: queues is NULL or nb_unlinks is 0\n");
-		return 0; /* Ignore and return success */
+		nb_unlinks = 0; /* Ignore and return success */
+		goto ret_credits;
 	}
 
 	if (ev_port->qm_port.is_directed) {
 		DLB2_LOG_DBG("dlb2: ignore unlink from dir port %d\n",
 			     ev_port->id);
 		rte_errno = 0;
-		return nb_unlinks; /* as if success */
+		goto ret_credits;
 	}
 
 	dlb2 = ev_port->dlb2;
@@ -2553,6 +2649,10 @@ dlb2_eventdev_port_unlink(struct rte_eventdev *dev, void *event_port,
 		ev_queue->num_links--;
 	}
 
+ret_credits:
+	if (ev_port->inflight_credits)
+		dlb2_check_and_return_credits(ev_port, true, 0);
+
 	return nb_unlinks;
 }
 
@@ -2752,8 +2852,7 @@ dlb2_replenish_sw_credits(struct dlb2_eventdev *dlb2,
 		/* Replenish credits, saving one quanta for enqueues */
 		uint16_t val = ev_port->inflight_credits - quanta;
 
-		__atomic_fetch_sub(&dlb2->inflights, val, __ATOMIC_SEQ_CST);
-		ev_port->inflight_credits -= val;
+		dlb2_release_sw_credits(dlb2, ev_port, val);
 	}
 }
 
@@ -2924,7 +3023,9 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 {
 	struct dlb2_eventdev *dlb2 = ev_port->dlb2;
 	struct dlb2_eventdev_queue *ev_queue;
+#if DLB_HW_CREDITS_CHECKS
 	uint16_t *cached_credits = NULL;
+#endif
 	struct dlb2_queue *qm_queue;
 
 	ev_queue = &dlb2->ev_queues[ev->queue_id];
@@ -2936,6 +3037,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		goto op_check;
 
 	if (!qm_queue->is_directed) {
+#if DLB_HW_CREDITS_CHECKS
 		/* Load balanced destination queue */
 
 		if (dlb2->version == DLB2_HW_V2) {
@@ -2951,6 +3053,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		switch (ev->sched_type) {
 		case RTE_SCHED_TYPE_ORDERED:
 			DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_ORDERED\n");
@@ -2981,7 +3084,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		}
 	} else {
 		/* Directed destination queue */
-
+#if DLB_HW_CREDITS_CHECKS
 		if (dlb2->version == DLB2_HW_V2) {
 			if (dlb2_check_enqueue_hw_dir_credits(qm_port)) {
 				rte_errno = -ENOSPC;
@@ -2995,6 +3098,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			}
 			cached_credits = &qm_port->cached_credits;
 		}
+#endif
 		DLB2_LOG_DBG("dlb2: put_qe: RTE_SCHED_TYPE_DIRECTED\n");
 
 		*sched_type = DLB2_SCHED_DIRECTED;
@@ -3002,6 +3106,7 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 
 op_check:
 	switch (ev->op) {
+#if DLB_SW_CREDITS_CHECKS
 	case RTE_EVENT_OP_NEW:
 		/* Check that a sw credit is available */
 		if (dlb2_check_enqueue_sw_credits(dlb2, ev_port)) {
@@ -3009,7 +3114,10 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 			return 1;
 		}
 		ev_port->inflight_credits--;
+#endif
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_FORWARD:
 		/* Check for outstanding_releases underflow. If this occurs,
@@ -3020,10 +3128,14 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
+#if DLB_HW_CREDITS_CHECKS
 		(*cached_credits)--;
+#endif
 		break;
 	case RTE_EVENT_OP_RELEASE:
+#if DLB_SW_CREDITS_CHECKS
 		ev_port->inflight_credits++;
+#endif
 		/* Check for outstanding_releases underflow. If this occurs,
 		 * the application is not using the EVENT_OPs correctly; for
 		 * example, forwarding or releasing events that were not
@@ -3032,9 +3144,10 @@ dlb2_event_enqueue_prep(struct dlb2_eventdev_port *ev_port,
 		RTE_ASSERT(ev_port->outstanding_releases > 0);
 		ev_port->outstanding_releases--;
 		qm_port->issued_releases++;
-
+#if DLB_SW_CREDITS_CHECKS
 		/* Replenish s/w credits if enough are cached */
 		dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 		break;
 	}
 
@@ -3145,6 +3258,8 @@ __dlb2_event_enqueue_burst(void *event_port,
 			break;
 	}
 
+	dlb2_check_and_return_credits(ev_port, !i, DLB2_ENQ_FAIL_CREDIT_RETURN_THRES);
+
 	return i;
 }
 
@@ -3283,53 +3398,45 @@ dlb2_event_release(struct dlb2_eventdev *dlb2,
 		return;
 	}
 	ev_port->outstanding_releases -= i;
+#if DLB_SW_CREDITS_CHECKS
 	ev_port->inflight_credits += i;
 
 	/* Replenish s/w credits if enough releases are performed */
 	dlb2_replenish_sw_credits(dlb2, ev_port);
+#endif
 }
 
 static inline void
 dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 {
 	uint32_t batch_size = qm_port->hw_credit_quanta;
+	int val;
 
 	/* increment port credits, and return to pool if exceeds threshold */
-	if (!qm_port->is_directed) {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_ldb_credits += num;
-			if (qm_port->cached_ldb_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-					qm_port->credit_pool[DLB2_LDB_QUEUE],
-					batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_ldb_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_credits -= batch_size;
-			}
+	if (qm_port->dlb2->version == DLB2_HW_V2_5) {
+		qm_port->cached_credits += num;
+		if (qm_port->cached_credits >= 2 * batch_size) {
+			val = qm_port->cached_credits - batch_size;
+			__atomic_fetch_add(
+			    qm_port->credit_pool[DLB2_COMBINED_POOL], val,
+			    __ATOMIC_SEQ_CST);
+			qm_port->cached_credits -= val;
+		}
+	} else if (!qm_port->is_directed) {
+		qm_port->cached_ldb_credits += num;
+		if (qm_port->cached_ldb_credits >= 2 * batch_size) {
+			val = qm_port->cached_ldb_credits - batch_size;
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_LDB_QUEUE],
+					   val, __ATOMIC_SEQ_CST);
+			qm_port->cached_ldb_credits -= val;
 		}
 	} else {
-		if (qm_port->dlb2->version == DLB2_HW_V2) {
-			qm_port->cached_dir_credits += num;
-			if (qm_port->cached_dir_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-					qm_port->credit_pool[DLB2_DIR_QUEUE],
-					batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_dir_credits -= batch_size;
-			}
-		} else {
-			qm_port->cached_credits += num;
-			if (qm_port->cached_credits >= 2 * batch_size) {
-				__atomic_fetch_add(
-				      qm_port->credit_pool[DLB2_COMBINED_POOL],
-				      batch_size, __ATOMIC_SEQ_CST);
-				qm_port->cached_credits -= batch_size;
-			}
+		qm_port->cached_dir_credits += num;
+		if (qm_port->cached_dir_credits >= 2 * batch_size) {
+			val = qm_port->cached_dir_credits - batch_size;
+			__atomic_fetch_add(qm_port->credit_pool[DLB2_DIR_QUEUE],
+					   val, __ATOMIC_SEQ_CST);
+			qm_port->cached_dir_credits -= val;
 		}
 	}
 }
@@ -3360,6 +3467,15 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 
 	/* Wait/poll time expired */
 	if (elapsed_ticks >= timeout) {
+
+		/* Return all credits before blocking if the remaining credits
+		 * in the system are less than the quanta.
+		 */
+		uint32_t sw_inflights = __atomic_load_n(&dlb2->inflights, __ATOMIC_SEQ_CST);
+		uint32_t quanta = ev_port->credit_update_quanta;
+
+		if (dlb2->new_event_limit - sw_inflights < quanta)
+			dlb2_check_and_return_credits(ev_port, true, 0);
 		return 1;
 	} else if (dlb2->umwait_allowed) {
 		struct rte_power_monitor_cond pmc;
@@ -4222,8 +4338,9 @@ dlb2_hw_dequeue(struct dlb2_eventdev *dlb2,
 			dlb2_consume_qe_immediate(qm_port, num);
 
 		ev_port->outstanding_releases += num;
-
+#if DLB_HW_CREDITS_CHECKS
 		dlb2_port_credits_inc(qm_port, num);
+#endif
 	}
 
 	return num;
@@ -4257,6 +4374,9 @@ dlb2_event_dequeue_burst(void *event_port, struct rte_event *ev, uint16_t num,
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
 
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
+
 	return cnt;
 }
 
@@ -4293,6 +4413,9 @@ dlb2_event_dequeue_burst_sparse(void *event_port, struct rte_event *ev,
 
 	DLB2_INC_STAT(ev_port->stats.traffic.total_polls, 1);
 	DLB2_INC_STAT(ev_port->stats.traffic.zero_polls, ((cnt == 0) ? 1 : 0));
+
+	dlb2_check_and_return_credits(ev_port, !cnt,
+				      DLB2_ZERO_DEQ_CREDIT_RETURN_THRES);
 	return cnt;
 }
 
diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
index dc9f98e142..fd76b5b9fb 100644
--- a/drivers/event/dlb2/dlb2_priv.h
+++ b/drivers/event/dlb2/dlb2_priv.h
@@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
 	struct rte_event_port_conf conf; /* user-supplied configuration */
 	uint16_t inflight_credits; /* num credits this port has right now */
 	uint16_t credit_update_quanta;
+	uint32_t credit_return_count; /* count till the credit return condition is true */
 	struct dlb2_eventdev *dlb2; /* backlink optimization */
 	alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
 	struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
index 515d1795fe..77a197e32c 100644
--- a/drivers/event/dlb2/meson.build
+++ b/drivers/event/dlb2/meson.build
@@ -68,3 +68,15 @@ endif
 headers = files('rte_pmd_dlb2.h')
 
 deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
+
+if meson.version().version_compare('> 0.58.0')
+fs = import('fs')
+dlb_options = fs.read('meson_options.txt').strip().split('\n')
+
+foreach opt: dlb_options
+	if (opt.strip().startswith('#') or opt.strip() == '')
+		continue
+	endif
+	cflags += '-D' + opt.strip().to_upper().replace(' ','')
+endforeach
+endif
diff --git a/drivers/event/dlb2/meson_options.txt b/drivers/event/dlb2/meson_options.txt
new file mode 100644
index 0000000000..b57c999e54
--- /dev/null
+++ b/drivers/event/dlb2/meson_options.txt
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023-2024 Intel Corporation
+
+DLB2_BYPASS_FENCE_ON_PP = 0
+DLB_HW_CREDITS_CHECKS = 1
+DLB_SW_CREDITS_CHECKS = 1
-- 
2.25.1


* Re: [PATCH v4 0/3] DLB2 Enhancements
  2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
                   ` (2 preceding siblings ...)
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-05-02  7:34 ` Bruce Richardson
  2024-05-02 15:52   ` Sevincer, Abdullah
  3 siblings, 1 reply; 9+ messages in thread
From: Bruce Richardson @ 2024-05-02  7:34 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Wed, May 01, 2024 at 02:46:17PM -0500, Abdullah Sevincer wrote:
> This patchset addresses enhancements in the DLB driver.
> 
> Abdullah Sevincer (3):
>   event/dlb2: add support for HW delayed token
>   event/dlb2: add support for dynamic HL entries
>   event/dlb2: enhance DLB credit handling
> 
Hi Abdullah,

Couple of small asks/tips when sending new versions of a patchset:
1) When sending v2, v3, v4 using git-send-email, include
  "--in-reply-to <message-id-of-v1-cover-letter>" in the command. This will
  ensure all copies of the patches get put in the same email thread, rather
  than having different versions spread throughout the reader's mailbox
  (an example command is shown after the list below).
2) Please include in the cover letter a short one/two-line description of
  what has changed in each version, so anyone reviewing e.g. v4 after
  reading v3, is aware of what parts of v4 they need to look at
  specifically. Generally, this should be in reverse order e.g.

v4: renamed bar to foobar
v3: changed foo to bar
v2: added new foo
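
For (1), an example invocation might look like the following (the
message ID is a placeholder):

	git send-email --in-reply-to='<v1-cover-letter-message-id>' \
		--to dev@dpdk.org v5-*.patch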

Thanks,
/Bruce

* RE: [PATCH v4 0/3] DLB2 Enhancements
  2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
@ 2024-05-02 15:52   ` Sevincer, Abdullah
  0 siblings, 0 replies; 9+ messages in thread
From: Sevincer, Abdullah @ 2024-05-02 15:52 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: dev, jerinj, Chen, Mike Ximing, Sarkar, Tirthendu, Pathak,
	Pravin, Doneria, Shivani

> Hi Abdullah,
>
> Couple of small asks/tips when sending new versions of a patchset:
> 1) When sending v2, v3, v4 using git-send-email, include
>    "--in-reply-to <message-id-of-v1-cover-letter>" in the command. This will
>    ensure all copies of the patches get put in the same email thread, rather
>    than having different versions spread throughout the reader's mailbox.
> 2) Please include in the cover letter a short one/two-line description of
>    what has changed in each version, so anyone reviewing e.g. v4 after
>    reading v3, is aware of what parts of v4 they need to look at
>    specifically. Generally, this should be in reverse order e.g.
>
> v4: renamed bar to foobar
> v3: changed foo to bar
> v2: added new foo
>
> Thanks,
> /Bruce

Hi Bruce,

Thanks for the tips, and sorry for filling up the inboxes. I will follow
your instructions for the upcoming patches.

* Re: [PATCH v4 1/3] event/dlb2: add support for HW delayed token
  2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
@ 2024-05-27 15:19   ` Jerin Jacob
  0 siblings, 0 replies; 9+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:19 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> In DLB 2.5, a hardware assist is available, complementing the Delayed
> token POP software implementation. When enabled, the feature works as
> follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached, so the feature is activated only if the core is congested.
> If the core can handle multiple atomic flows, DLB will not try to
> switch them. This is an improvement over the SW implementation, which
> always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions falls below a configured threshold. To emulate older 2.0
> behavior, this threshold is set to 1 by the old APIs. SW sets the CQ
> to auto-pop mode for token return, as withholding tokens is no longer
> necessary. As HW counts completions and not tokens, events equal to
> the HL (History List) entries will be scheduled to DLB before the
> feature activates and stops CQ scheduling.


1) Also mention the newly added PMD API, and update the PMD section of
the release notes for the new feature.
2) Fix the CI failure: http://mails.dpdk.org/archives/test-report/2024-May/657681.html


>
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> +/** Set inflight threshold for flow migration */
> +#define DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)

Fix the namespace for the public API: RTE_PMD_DLB2_PORT_SET_F_FLOW_MIGRATION_...


> +
> +/** Set port history list */
> +#define DLB2_SET_PORT_HL RTE_BIT64(1)

RTE_PMD_DLB2_PORT_SET_F_PORT_HL


> +
> +struct dlb2_port_param {

Fix the namespace: rte_pmd_dlb2_port_params

> +       uint16_t inflight_threshold : 12;
> +};
> +
> +/*!
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Configure various port parameters.
> + * This function must be called before calling rte_event_port_setup()
> + * for the port, but after calling rte_event_dev_configure().
> + *
> + * @param dev_id
> + *    The identifier of the event device.
> + * @param port_id
> + *    The identifier of the event port.
> + * @param flags
> + *    Bitmask of the parameters being set.
> + * @param val
> + *    Structure containing the values of the parameters being set.

Why not use struct rte_pmd_dlb2_port_params itself instead of void *?

> + *
> + * @return
> + * - 0: Success
> + * - EINVAL: Invalid dev_id, port_id, or flags
> + * - EINVAL: The DLB2 is not configured, is already running, or the port is
> + *   already setup
> + */
> +__rte_experimental
> +int
> +rte_pmd_dlb2_set_port_param(uint8_t dev_id,
> +                           uint8_t port_id,
> +                           uint64_t flags,
> +                           void *val);
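
For clarity, the suggested renaming might look like the sketch below
(these names follow the review comments above; they are suggestions,
not merged code):

	/* Public names carrying the RTE_PMD_DLB2_/rte_pmd_dlb2_ prefix. */
	#define RTE_PMD_DLB2_PORT_SET_F_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
	#define RTE_PMD_DLB2_PORT_SET_F_PORT_HL                  RTE_BIT64(1)

	struct rte_pmd_dlb2_port_params {
		uint16_t inflight_threshold : 12;
	};

	__rte_experimental
	int
	rte_pmd_dlb2_set_port_param(uint8_t dev_id, uint8_t port_id,
				    uint64_t flags,
				    struct rte_pmd_dlb2_port_params *params);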

* Re: [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries
  2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
@ 2024-05-27 15:23   ` Jerin Jacob
  0 siblings, 0 replies; 9+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:23 UTC (permalink / raw)
  To: Abdullah Sevincer
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> In DLB 2.5, hardware assist is available, complementing the Delayed
> token POP software implementation. When it is enabled, the feature
> works as follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached. So the feature is activated only if the core is
> congested. If the core can handle multiple atomic flows, DLB will not
> try to switch them. This is an improvement over SW implementation
> which always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions fall below a configured threshold.
>
> DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used,
> the possible HL entries per LDB port equal 2048 / 64 = 32. Since HL
> is typically 2 * CQ depth, the maximum CQ depth possible is 16 if all
> 64 LDB ports are needed in a high-performance setting.
>
> If all CQs are configured to have HL = 2 * CQ depth as a performance
> option, then the HL calculation at the time of domain creation will
> be based on the maximum possible dequeue depth. This could result in
> allocating too many HL entries to the domain, as DLB only has a
> limited number of HL entries to allocate. Hence, it is best to allow
> the application to specify HL entries as a command line argument and
> override the default allocation. A summary of usage is listed below:
>
> When 'use_default_hl = 1', the per-port HL is set to
> DLB2_FIXED_CQ_HL_SIZE (32) and the command line parameter
> alloc_hl_entries is ignored.
>
> When 'use_default_hl = 0', per LDB port HL = 2 * CQ depth, and the
> per-port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
>
> The user should calculate the HL entries needed, based on the CQ
> depths the application will use, and specify them via the command
> line parameter 'alloc_hl_entries', which is used to allocate HL
> entries. Hence, alloc_hl_entries = (sum of all LDB ports' CQ depths) * 2.
>
> If alloc_hl_entries is not specified, the total HL entries for the
> vdev default to num_ldb_ports * 64.
>
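A worked example under assumed numbers (not from the patch): with 8 LDB
ports each configured with CQ depth 32, alloc_hl_entries = (8 * 32) * 2
= 512. If left unspecified, the default budget would also be
8 * 64 = 512 HL entries for the vdev; with CQ depth 16 instead, the
explicit value (8 * 16) * 2 = 256 would halve the reservation.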
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>

>         }
> diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
> index d6828aa482..dc9f98e142 100644
> --- a/drivers/event/dlb2/dlb2_priv.h
> +++ b/drivers/event/dlb2/dlb2_priv.h
> @@ -52,6 +52,8 @@
>  #define DLB2_PRODUCER_COREMASK "producer_coremask"
>  #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation"
>  #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight"
> +#define DLB2_USE_DEFAULT_HL "use_default_hl"
> +#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries"


1) Update doc/guides/eventdevs/dlb2.rst for the new devargs (an
illustrative usage line is sketched below).
2) Please update the PMD section of the release notes for this feature.
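For the dlb2.rst update, a usage line might look like the following
(device name and values are illustrative only):

    --vdev 'dlb2_event,use_default_hl=0,alloc_hl_entries=512'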

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
  2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
@ 2024-05-27 15:30   ` Jerin Jacob
  0 siblings, 0 replies; 9+ messages in thread
From: Jerin Jacob @ 2024-05-27 15:30 UTC (permalink / raw)
  To: Abdullah Sevincer, Richardson, Bruce
  Cc: dev, jerinj, mike.ximing.chen, tirthendu.sarkar, pravin.pathak,
	shivani.doneria

On Thu, May 2, 2024 at 1:27 AM Abdullah Sevincer
<abdullah.sevincer@intel.com> wrote:
>
> This commit improves DLB credit handling in scenarios where
> ports hold on to credits but cannot release them due to insufficient
> accumulation (less than 2 * credit quanta).
>
> Worker ports now release all accumulated credits when the
> back-to-back zero-poll count reaches a preset threshold.
>
> Producer ports release all accumulated credits if enqueue fails for a
> consecutive number of retries.
>
> In a multi-producer system, some producer(s) may exit early while
> holding on to credits. These are now released during port unlink,
> which needs to be performed by the application.
>
> test-eventdev is modified to call rte_event_port_unlink() to release
> any credits accumulated by producer ports.
>
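A minimal sketch of the kind of teardown the commit message describes,
assuming a producer port that has finished enqueuing (dev_id, port_id
and the variable names are illustrative, not the actual
test_perf_common.c change):

    /* Unlink every queue from the producer port so the PMD can
     * reclaim any credits the port is still holding.
     */
    uint8_t queues[RTE_EVENT_MAX_QUEUES_PER_DEV];
    uint8_t prios[RTE_EVENT_MAX_QUEUES_PER_DEV];
    int nb_links = rte_event_port_links_get(dev_id, port_id, queues, prios);

    if (nb_links > 0)
            rte_event_port_unlink(dev_id, port_id, queues, nb_links);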
> Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
> ---
>  app/test-eventdev/test_perf_common.c |  20 +--

1) Spotted non-driver changes in a driver patch. Please send the
test-eventdev changes as a separate commit with a complete rationale.
2) Fix CI issues http://mails.dpdk.org/archives/test-report/2024-May/657683.html



>  drivers/event/dlb2/dlb2.c            | 203 +++++++++++++++++++++------
>  drivers/event/dlb2/dlb2_priv.h       |   1 +
>  drivers/event/dlb2/meson.build       |  12 ++
>  drivers/event/dlb2/meson_options.txt |   6 +
>  5 files changed, 194 insertions(+), 48 deletions(-)
>  create mode 100644 drivers/event/dlb2/meson_options.txt
>

>
>  static inline uint64_t
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 11bbe30d7b..2c341a5845 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -43,7 +43,47 @@
>   * to DLB can go ahead of relevant application writes like updates to buffers
>   * being sent with event
>   */
> +#ifndef DLB2_BYPASS_FENCE_ON_PP
>  #define DLB2_BYPASS_FENCE_ON_PP 0  /* 1 == Bypass fence, 0 == do not bypass */
> +#endif
> +
> +/* HW credit checks can only be turned off for a DLB2 device if the
> + * following is true for each created eventdev:
> + * LDB credits <= DIR credits + minimum CQ Depth
> + * (CQ Depth is the minimum across all ports configured within the eventdev)
> + * This needs to be true for all eventdevs created on any DLB2 device
> + * managed by this driver.
> + * DLB2.5 does not have any such restriction, as it has a single credit pool.
> + */
> +#ifndef DLB_HW_CREDITS_CHECKS
> +#define DLB_HW_CREDITS_CHECKS 1
> +#endif
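For example (numbers illustrative): with 64 DIR credits and a minimum
configured CQ depth of 8, HW credit checks could only be disabled if
LDB credits <= 72, and that inequality must hold for every eventdev on
the device.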
> +
> +/*
> + * SW credit checks can only be turned off if the application has a
> + * way to limit input events to the eventdev below the assigned credit limit
> + */
> +#ifndef DLB_SW_CREDITS_CHECKS
> +#define DLB_SW_CREDITS_CHECKS 1
> +#endif
> +

> +
> +static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port,
> +                                         bool cond, uint32_t threshold)
> +{
> +#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS


This new patch is full of compile-time flag clutter; can you make it runtime?
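For illustration, a runtime alternative could parse a devarg at probe
time instead of relying on a compile-time define; the devarg name,
handler, and flag below are hypothetical, not part of this patch:

    #include <stdbool.h>
    #include <rte_kvargs.h>

    #define DLB2_HW_CREDITS_CHECKS_ARG "hw_credits_checks"

    /* rte_kvargs handler: toggle HW credit checks at runtime */
    static int
    set_hw_credits_checks(const char *key __rte_unused, const char *value,
                          void *opaque)
    {
            bool *enabled = opaque;

            *enabled = (value != NULL && *value == '1');
            return 0;
    }

    /* ...then, during probe, assuming kvlist was built from the
     * devargs string:
     */
    rte_kvargs_process(kvlist, DLB2_HW_CREDITS_CHECKS_ARG,
                       set_hw_credits_checks, &hw_credits_checks);

The fast path would then branch on the flag (or select between function
pointers at setup time) instead of compiling the checks in or out.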

>
> diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h
> index dc9f98e142..fd76b5b9fb 100644
> --- a/drivers/event/dlb2/dlb2_priv.h
> +++ b/drivers/event/dlb2/dlb2_priv.h
> @@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port {
>         struct rte_event_port_conf conf; /* user-supplied configuration */
>         uint16_t inflight_credits; /* num credits this port has right now */
>         uint16_t credit_update_quanta;
> +       uint32_t credit_return_count; /* count until the credit return condition is true */
>         struct dlb2_eventdev *dlb2; /* backlink optimization */
>         alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats;
>         struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ];
> diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> index 515d1795fe..77a197e32c 100644
> --- a/drivers/event/dlb2/meson.build
> +++ b/drivers/event/dlb2/meson.build
> @@ -68,3 +68,15 @@ endif
>  headers = files('rte_pmd_dlb2.h')
>
>  deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
> +
> +if meson.version().version_compare('> 0.58.0')
> +fs = import('fs')
> +dlb_options = fs.read('meson_options.txt').strip().split('\n')
> +
> +foreach opt: dlb_options
> +       if (opt.strip().startswith('#') or opt.strip() == '')
> +               continue
> +       endif
> +       cflags += '-D' + opt.strip().to_upper().replace(' ','')
> +endforeach
> +endif
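For reference, given the meson_options.txt introduced below, this loop
would strip the spaces from each line and add -DDLB2_BYPASS_FENCE_ON_PP=0,
-DDLB_HW_CREDITS_CHECKS=1 and -DDLB_SW_CREDITS_CHECKS=1 to cflags.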
> diff --git a/drivers/event/dlb2/meson_options.txt b/drivers/event/dlb2/meson_options.txt


Adding @Richardson, Bruce and @Thomas Monjalon to comment on this; I am
not sure a driver-specific meson_options.txt is a good path?



> new file mode 100644
> index 0000000000..b57c999e54
> --- /dev/null
> +++ b/drivers/event/dlb2/meson_options.txt
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023-2024 Intel Corporation
> +
> +DLB2_BYPASS_FENCE_ON_PP = 0
> +DLB_HW_CREDITS_CHECKS = 1
> +DLB_SW_CREDITS_CHECKS = 1
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-05-27 15:30 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-01 19:46 [PATCH v4 0/3] DLB2 Enhancements Abdullah Sevincer
2024-05-01 19:46 ` [PATCH v4 1/3] event/dlb2: add support for HW delayed token Abdullah Sevincer
2024-05-27 15:19   ` Jerin Jacob
2024-05-01 19:46 ` [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries Abdullah Sevincer
2024-05-27 15:23   ` Jerin Jacob
2024-05-01 19:46 ` [PATCH v4 3/3] event/dlb2: enhance DLB credit handling Abdullah Sevincer
2024-05-27 15:30   ` Jerin Jacob
2024-05-02  7:34 ` [PATCH v4 0/3] DLB2 Enhancements Bruce Richardson
2024-05-02 15:52   ` Sevincer, Abdullah
