From mboxrd@z Thu Jan  1 00:00:00 1970
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
CC: <dev@dpdk.org>, <sshankarnara@marvell.com>, <jerinj@marvell.com>,
 <aprabhu@marvell.com>
Subject: [PATCH v4 24/39] ml/cnxk: enable support to dump device debug info
Date: Wed, 1 Feb 2023 01:22:55 -0800
Message-ID: <20230201092310.23252-25-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230201092310.23252-1-syalavarthi@marvell.com>
References: <20221208200220.20267-1-syalavarthi@marvell.com>
 <20230201092310.23252-1-syalavarthi@marvell.com>
MIME-Version: 1.0
Content-Type: text/plain
List-Id: DPDK patches and discussions <dev.dpdk.org>

Add support for dumping device debug information. On cn10k devices,
the dump includes model state, OCM usage, and the firmware debug and
exception buffers.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
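
Note on the pagemask string: cn10k_ml_ocm_pagemask_to_str() emits a
"0x" prefix, two hex digits per mask word and a terminating NUL, so the
destination needs 2 + 2 * nwords + 1 bytes. The standalone sketch below
shows the same most-significant-word-first formatting with an explicit
length check. pagemask_to_str() and its len parameter are hypothetical
names used for illustration only; they are not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not driver code): format nwords mask bytes,
 * most significant word first, as "0x..." into str.
 * Returns 0 on success, -1 if str cannot hold the result.
 */
static int
pagemask_to_str(const uint8_t *mask, uint16_t nwords, char *str, size_t len)
{
	/* "0x" prefix + two hex digits per word + terminating NUL */
	size_t need = 2 + 2 * (size_t)nwords + 1;
	size_t off;
	int word;

	if (str == NULL || len < need)
		return -1;

	off = (size_t)snprintf(str, len, "0x");

	/* build one word at a time; snprintf keeps str NUL-terminated */
	for (word = nwords - 1; word >= 0; word--)
		off += (size_t)snprintf(str + off, len - off, "%02X",
					(unsigned int)mask[word]);

	return 0;
}
```

With a three-word mask, a caller would pass a buffer of at least nine
bytes; a smaller buffer is rejected instead of being overrun.
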
 drivers/ml/cnxk/cn10k_ml_ocm.c |  51 +++++++++
 drivers/ml/cnxk/cn10k_ml_ocm.h |   1 +
 drivers/ml/cnxk/cn10k_ml_ops.c | 189 +++++++++++++++++++++++++++++++++
 3 files changed, 241 insertions(+)
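
Note on the debug buffer dump: cn10k_ml_dev_dump() reads a head and a
tail offset per core and prints the text between them, handling the
case where the data wraps past the end of the buffer. The sketch below
illustrates that wrap-around handling in isolation. It assumes head is
the offset of the oldest unread byte, tail is one past the newest byte,
and head == tail means the buffer is empty; the patch does not state
these semantics. ring_linearize() is a hypothetical name, not a driver
or firmware function.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not driver code): copy the unread bytes of a
 * circular buffer into out as one NUL-terminated string.
 * Returns the number of bytes copied, or 0 if the buffer is empty,
 * the offsets are out of range, or out is too small.
 */
static uint32_t
ring_linearize(const char *ring, uint32_t size, uint32_t head, uint32_t tail,
	       char *out, uint32_t outlen)
{
	uint32_t n, first;

	if (out == NULL || outlen == 0)
		return 0;
	out[0] = '\0';

	if (ring == NULL || size == 0 || head >= size || tail >= size || head == tail)
		return 0;

	/* total unread bytes, with or without wrap-around */
	n = (head < tail) ? tail - head : size - head + tail;
	if (n >= outlen)
		return 0;

	/* first chunk runs from head to tail, or from head to end of buffer */
	first = (head < tail) ? n : size - head;
	memcpy(out, ring + head, first);

	/* second chunk (wrapped case only) runs from offset 0 to tail */
	memcpy(out + first, ring, n - first);
	out[n] = '\0';

	return n;
}
```

In the wrapped case the first chunk is size - head bytes long and the
second is tail bytes long, so no read goes past the end of the buffer.
The result can then be written with a single fprintf(fp, "%s\n", out).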

diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c
index 034d9546eb..2083d99f81 100644
--- a/drivers/ml/cnxk/cn10k_ml_ocm.c
+++ b/drivers/ml/cnxk/cn10k_ml_ocm.c
@@ -458,3 +458,54 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, int16_t model_id)
 		}
 	}
 }
+
+static void
+cn10k_ml_ocm_pagemask_to_str(struct cn10k_ml_ocm_tile_info *tile_info, uint16_t nwords, char *str)
+{
+	char *p = str;
+	int word;
+
+	/* add prefix 0x */
+	*p++ = '0';
+	*p++ = 'x';
+
+	/* build one word at a time */
+	for (word = nwords - 1; word >= 0; word--) {
+		sprintf(p, "%02X", tile_info->ocm_mask[word]);
+		p += 2;
+	}
+
+	/* terminate */
+	*p++ = 0;
+}
+
+void
+cn10k_ml_ocm_print(struct rte_ml_dev *dev, FILE *fp)
+{
+	char str[ML_CN10K_OCM_NUMPAGES / 4 + 3]; /* nibbles + prefix '0x' + NUL */
+	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_ocm *ocm;
+	uint8_t tile_id;
+	uint8_t word_id;
+	int wb_pages;
+
+	mldev = dev->data->dev_private;
+	ocm = &mldev->ocm;
+
+	fprintf(fp, "OCM State:\n");
+	for (tile_id = 0; tile_id < ocm->num_tiles; tile_id++) {
+		cn10k_ml_ocm_pagemask_to_str(&ocm->tile_ocm_info[tile_id], ocm->mask_words, str);
+
+		wb_pages = 0 - ocm->tile_ocm_info[tile_id].scratch_pages;
+		for (word_id = 0; word_id < ML_CN10K_OCM_MASKWORDS; word_id++)
+			wb_pages +=
+				__builtin_popcount(ocm->tile_ocm_info[tile_id].ocm_mask[word_id]);
+
+		fprintf(fp,
+			"tile = %2u, scratch_pages = %4u,"
+			" wb_pages = %4d, last_wb_page = %4d,"
+			" pagemask = %s\n",
+			tile_id, ocm->tile_ocm_info[tile_id].scratch_pages, wb_pages,
+			ocm->tile_ocm_info[tile_id].last_wb_page, str);
+	}
+}
diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.h b/drivers/ml/cnxk/cn10k_ml_ocm.h
index cd65d1d8fa..4415bbfb45 100644
--- a/drivers/ml/cnxk/cn10k_ml_ocm.h
+++ b/drivers/ml/cnxk/cn10k_ml_ocm.h
@@ -83,5 +83,6 @@ int cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16
 void cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, int16_t model_id, uint64_t tilemask,
 				int wb_page_start, uint16_t wb_pages, uint16_t scratch_pages);
 void cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, int16_t model_id);
+void cn10k_ml_ocm_print(struct rte_ml_dev *dev, FILE *fp);
 
 #endif /* _CN10K_ML_OCM_H_ */
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 88809e6e96..ad849e7abc 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -14,10 +14,25 @@
 /* ML model macros */
 #define CN10K_ML_MODEL_MEMZONE_NAME "ml_cn10k_model_mz"
 
+/* Debug print width */
+#define STR_LEN	  12
+#define FIELD_LEN 16
+#define LINE_LEN  90
+
 /* ML Job descriptor flags */
 #define ML_FLAGS_POLL_COMPL BIT(0)
 #define ML_FLAGS_SSO_COMPL  BIT(1)
 
+static void
+print_line(FILE *fp, int len)
+{
+	int i;
+
+	for (i = 0; i < len; i++)
+		fprintf(fp, "-");
+	fprintf(fp, "\n");
+}
+
 static void
 qp_memzone_name_get(char *name, int size, int dev_id, int qp_id)
 {
@@ -116,6 +131,102 @@ cn10k_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_des
 	return NULL;
 }
 
+static void
+cn10k_ml_model_print(struct rte_ml_dev *dev, int16_t model_id, FILE *fp)
+{
+
+	struct cn10k_ml_model *model;
+	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_ocm *ocm;
+	char str[STR_LEN];
+	uint8_t i;
+
+	mldev = dev->data->dev_private;
+	ocm = &mldev->ocm;
+	model = dev->data->models[model_id];
+
+	/* Print debug info */
+	print_line(fp, LINE_LEN);
+	fprintf(fp, " Model Information (%s)\n", model->metadata.model.name);
+	print_line(fp, LINE_LEN);
+	fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", model->metadata.model.name);
+	fprintf(fp, "%*s : %u.%u.%u.%u\n", FIELD_LEN, "version", model->metadata.model.version[0],
+		model->metadata.model.version[1], model->metadata.model.version[2],
+		model->metadata.model.version[3]);
+	if (strlen(model->name) != 0)
+		fprintf(fp, "%*s : %s\n", FIELD_LEN, "debug_name", model->name);
+	fprintf(fp, "%*s : 0x%016" PRIx64 "\n", FIELD_LEN, "model", PLT_U64_CAST(model));
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "index", model->model_id);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", model->metadata.model.batch_size);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_layers", model->metadata.model.num_layers);
+
+	/* Print model state */
+	if (model->state == ML_CN10K_MODEL_STATE_LOADED)
+		fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "loaded");
+	if (model->state == ML_CN10K_MODEL_STATE_JOB_ACTIVE)
+		fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "job_active");
+	if (model->state == ML_CN10K_MODEL_STATE_STARTED)
+		fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "started");
+
+	/* Print OCM status */
+	fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "wb_size",
+		model->metadata.model.ocm_wb_range_end - model->metadata.model.ocm_wb_range_start +
+			1);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "wb_pages", model->model_mem_map.wb_pages);
+	fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "scratch_size",
+		ocm->size_per_tile - model->metadata.model.ocm_tmp_range_floor);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "scratch_pages", model->model_mem_map.scratch_pages);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_tiles",
+		model->metadata.model.tile_end - model->metadata.model.tile_start + 1);
+
+	if (model->state == ML_CN10K_MODEL_STATE_STARTED) {
+		fprintf(fp, "%*s : 0x%0*" PRIx64 "\n", FIELD_LEN, "tilemask",
+			ML_CN10K_OCM_NUMTILES / 4, model->model_mem_map.tilemask);
+		fprintf(fp, "%*s : 0x%x\n", FIELD_LEN, "ocm_wb_start",
+			model->model_mem_map.wb_page_start * ML_CN10K_OCM_PAGESIZE);
+	}
+
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", model->metadata.model.num_input);
+	fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_outputs", model->metadata.model.num_output);
+	fprintf(fp, "\n");
+
+	print_line(fp, LINE_LEN);
+	fprintf(fp, "%8s  %16s  %12s  %18s  %12s  %14s\n", "input", "input_name", "input_type",
+		"model_input_type", "quantize", "format");
+	print_line(fp, LINE_LEN);
+	for (i = 0; i < model->metadata.model.num_input; i++) {
+		fprintf(fp, "%8u  ", i);
+		fprintf(fp, "%*s  ", 16, model->metadata.input[i].input_name);
+		rte_ml_io_type_to_str(model->metadata.input[i].input_type, str, STR_LEN);
+		fprintf(fp, "%*s  ", 12, str);
+		rte_ml_io_type_to_str(model->metadata.input[i].model_input_type, str, STR_LEN);
+		fprintf(fp, "%*s  ", 18, str);
+		fprintf(fp, "%*s", 12, (model->metadata.input[i].quantize == 1 ? "Yes" : "No"));
+		rte_ml_io_format_to_str(model->metadata.input[i].shape.format, str, STR_LEN);
+		fprintf(fp, "%*s", 16, str);
+		fprintf(fp, "\n");
+	}
+	fprintf(fp, "\n");
+
+	print_line(fp, LINE_LEN);
+	fprintf(fp, "%8s  %16s  %12s  %18s  %12s\n", "output", "output_name", "output_type",
+		"model_output_type", "dequantize");
+	print_line(fp, LINE_LEN);
+	for (i = 0; i < model->metadata.model.num_output; i++) {
+		fprintf(fp, "%8u  ", i);
+		fprintf(fp, "%*s  ", 16, model->metadata.output[i].output_name);
+		rte_ml_io_type_to_str(model->metadata.output[i].output_type, str, STR_LEN);
+		fprintf(fp, "%*s  ", 12, str);
+		rte_ml_io_type_to_str(model->metadata.output[i].model_output_type, str, STR_LEN);
+		fprintf(fp, "%*s  ", 18, str);
+		fprintf(fp, "%*s", 12, (model->metadata.output[i].dequantize == 1 ? "Yes" : "No"));
+		fprintf(fp, "\n");
+	}
+	fprintf(fp, "\n");
+	print_line(fp, LINE_LEN);
+	fprintf(fp, "\n");
+}
+
 static void
 cn10k_ml_prep_sp_job_descriptor(struct cn10k_ml_dev *mldev, struct cn10k_ml_model *model,
 				struct cn10k_ml_req *req, enum cn10k_ml_job_type job_type)
@@ -498,6 +609,83 @@ cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id,
 	return 0;
 }
 
+static int
+cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
+{
+	struct cn10k_ml_model *model;
+	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_fw *fw;
+
+	uint32_t head_loc;
+	uint32_t tail_loc;
+	uint32_t bufsize;
+	char *head_ptr;
+	int model_id;
+	int core_id;
+
+	if (roc_env_is_asim())
+		return 0;
+
+	mldev = dev->data->dev_private;
+	fw = &mldev->fw;
+
+	/* Dump model info */
+	for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
+		model = dev->data->models[model_id];
+		if (model != NULL) {
+			cn10k_ml_model_print(dev, model_id, fp);
+			fprintf(fp, "\n");
+		}
+	}
+
+	/* Dump OCM state */
+	cn10k_ml_ocm_print(dev, fp);
+
+	/* Dump debug buffer */
+	for (core_id = 0; core_id <= 1; core_id++) {
+		bufsize = fw->req->jd.fw_load.debug.debug_buffer_size;
+		if (core_id == 0) {
+			head_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C0);
+			tail_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C0);
+			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core0_debug_ptr);
+			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+		} else {
+			head_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_HEAD_C1);
+			tail_loc = roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_DBG_BUFFER_TAIL_C1);
+			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core1_debug_ptr);
+			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+		}
+		if (head_loc < tail_loc) {
+			fprintf(fp, "%.*s\n", tail_loc - head_loc, &head_ptr[head_loc]);
+		} else if (head_loc >= tail_loc + 1) {
+			fprintf(fp, "%.*s\n", bufsize - head_loc, &head_ptr[head_loc]);
+			fprintf(fp, "%.*s\n", tail_loc, &head_ptr[0]);
+		}
+	}
+
+	/* Dump exception info */
+	for (core_id = 0; core_id <= 1; core_id++) {
+		bufsize = fw->req->jd.fw_load.debug.exception_state_size;
+		if ((core_id == 0) &&
+		    (roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0) != 0)) {
+			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core0_exception_buffer);
+			fprintf(fp, "ML_SCRATCH_EXCEPTION_SP_C0 = 0x%016" PRIx64,
+				roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C0));
+			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+			fprintf(fp, "%.*s", bufsize, head_ptr);
+		} else if ((core_id == 1) &&
+			   (roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1) != 0)) {
+			head_ptr = PLT_PTR_CAST(fw->req->jd.fw_load.debug.core1_exception_buffer);
+			fprintf(fp, "ML_SCRATCH_EXCEPTION_SP_C1 = 0x%016" PRIx64,
+				roc_ml_reg_read64(&mldev->roc, ML_SCRATCH_EXCEPTION_SP_C1));
+			head_ptr = roc_ml_addr_mlip2ap(&mldev->roc, head_ptr);
+			fprintf(fp, "%.*s", bufsize, head_ptr);
+		}
+	}
+
+	return 0;
+}
+
 int
 cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, int16_t *model_id)
 {
@@ -1139,6 +1327,7 @@ struct rte_ml_dev_ops cn10k_ml_ops = {
 	.dev_close = cn10k_ml_dev_close,
 	.dev_start = cn10k_ml_dev_start,
 	.dev_stop = cn10k_ml_dev_stop,
+	.dev_dump = cn10k_ml_dev_dump,
 
 	/* Queue-pair handling ops */
 	.dev_queue_pair_setup = cn10k_ml_dev_queue_pair_setup,
-- 
2.17.1