From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 421C4A0093;
	Tue, 29 Nov 2022 09:22:19 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 598B542D4B;
	Tue, 29 Nov 2022 09:21:28 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com
 [67.231.148.174])
 by mails.dpdk.org (Postfix) with ESMTP id C22C642BD9
 for <dev@dpdk.org>; Tue, 29 Nov 2022 09:21:19 +0100 (CET)
Received: from pps.filterd (m0045849.ppops.net [127.0.0.1])
 by mx0a-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id
 2AT3NpLX005657 for <dev@dpdk.org>; Tue, 29 Nov 2022 00:21:19 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com;
 h=from : to : cc :
 subject : date : message-id : in-reply-to : references : mime-version :
 content-type; s=pfpt0220; bh=ZpnkRUQVsIG+LQyzhZYUZABeINrVmutRG6Fc/TM9zFk=;
 b=O3R1bvuU4BwKljHFJYvPOtvQTslmFuNcqZpKbZLFsxhfF+UecZaV8IY3ButjqtNXsFso
 bpSqDrjlvVr/IOll4RmKt0LasSMszgf5Czo8bbU7EDkf6HTair7FdZwNVwGlNDdc9NC4
 zq2PJ7YRlvyADMscX9MCGcEKIhJF4g+OG86JW0n+5SCjCNVwi5lImwfELcIpJLfgm8Um
 EhSFRS5e/AgCbTemCtQaFpiBUPRebB5V5moWZv4iMkmrB/sqYyWlKpS9aJ/bN/bJl9Lw
 Ne/FS9l4dOiZfkRze3R3CCIVWFFxflwI8BM8eS772RN9UAv+HbdwYDBHAsp9IYG2btNE nQ== 
Received: from dc5-exch02.marvell.com ([199.233.59.182])
 by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3m5a508yyp-8
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT)
 for <dev@dpdk.org>; Tue, 29 Nov 2022 00:21:18 -0800
Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server (TLS) id 15.0.1497.18;
 Tue, 29 Nov 2022 00:21:16 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com
 (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.2 via Frontend
 Transport; Tue, 29 Nov 2022 00:21:16 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id 425333F7044;
 Tue, 29 Nov 2022 00:21:16 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
CC: <dev@dpdk.org>, <sshankarnara@marvell.com>, <jerinj@marvell.com>
Subject: [PATCH v2 11/12] app/mldev: enable reporting stats in mldev app
Date: Tue, 29 Nov 2022 00:21:08 -0800
Message-ID: <20221129082109.6809-11-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221129082109.6809-1-syalavarthi@marvell.com>
References: <20221129070746.20396-2-syalavarthi@marvell.com>
 <20221129082109.6809-1-syalavarthi@marvell.com>
MIME-Version: 1.0
Content-Type: text/plain
X-Proofpoint-ORIG-GUID: e6PHXm3h3J7M1bG7y3uK6vuz6Sje_YHT
X-Proofpoint-GUID: e6PHXm3h3J7M1bG7y3uK6vuz6Sje_YHT
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-11-29_06,2022-11-28_02,2022-06-22_01
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Enable reporting of driver xstats and of inference end-to-end
latency and throughput in the mldev inference tests. Stats
reporting is off by default and is enabled with the "--stats" option.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
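Note on the latency arithmetic: the enqueue and dequeue paths only
accumulate raw TSC timestamps (start_cycles, end_cycles), so the sum of
per-request latencies is recovered as sum(end) - sum(start). Below is a
minimal standalone sketch of that arithmetic, not the code in this patch.
The helper names avg_latency_ns() and throughput_per_s() are hypothetical,
NS_PER_S is assumed to be 10^9, and the zero guards are an addition of
the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL /* assumed value, nanoseconds per second */

/*
 * Hypothetical helper: average per-request latency in nanoseconds.
 * start_sum and end_sum are the accumulated enqueue and dequeue
 * timestamps (in cycles), nb_reqs is the number of completed requests
 * and hz is the timer frequency in cycles per second.
 */
static uint64_t
avg_latency_ns(uint64_t start_sum, uint64_t end_sum, uint64_t nb_reqs, uint64_t hz)
{
	uint64_t total_cycles;

	assert(end_sum >= start_sum);
	if (nb_reqs == 0 || hz == 0)
		return 0; /* guard added in this sketch only */

	/* sum(end_i) - sum(start_i) equals sum(end_i - start_i) */
	total_cycles = end_sum - start_sum;

	return (total_cycles * NS_PER_S) / (nb_reqs * hz);
}

/*
 * Hypothetical helper: requests per second, using the same ratio as
 * the stats report (requests * hz / summed latency cycles).
 */
static uint64_t
throughput_per_s(uint64_t start_sum, uint64_t end_sum, uint64_t nb_reqs, uint64_t hz)
{
	uint64_t total_cycles;

	assert(end_sum >= start_sum);
	total_cycles = end_sum - start_sum;
	if (total_cycles == 0)
		return 0; /* guard added in this sketch only */

	return (nb_reqs * hz) / total_cycles;
}
```

For example, three requests enqueued at cycles 100, 200 and 300 and
dequeued at cycles 1100, 1300 and 1500 give start_sum = 600 and
end_sum = 3900, that is 3300 cycles of summed latency. With a 1 GHz
timer this is an average of 1100 ns per request.
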
 app/test-mldev/ml_options.c                |  22 ++--
 app/test-mldev/ml_options.h                |   2 +
 app/test-mldev/test_inference_common.c     | 139 +++++++++++++++++++++
 app/test-mldev/test_inference_common.h     |   8 ++
 app/test-mldev/test_inference_interleave.c |   4 +
 app/test-mldev/test_inference_ordered.c    |   1 +
 6 files changed, 168 insertions(+), 8 deletions(-)

diff --git a/app/test-mldev/ml_options.c b/app/test-mldev/ml_options.c
index 092303903f..0e7877eed3 100644
--- a/app/test-mldev/ml_options.c
+++ b/app/test-mldev/ml_options.c
@@ -36,6 +36,7 @@ ml_options_default(struct ml_options *opt)
 	opt->queue_size = 1;
 	opt->batches = 0;
 	opt->tolerance = 0.0;
+	opt->stats = false;
 	opt->debug = false;
 }
 
@@ -222,7 +223,8 @@ ml_dump_test_options(const char *testname)
 		       "\t\t--queue_pairs      : number of queue pairs to create\n"
 		       "\t\t--queue_size       : size fo queue-pair\n"
 		       "\t\t--batches          : number of batches of input\n"
-		       "\t\t--tolerance        : maximum tolerance (%%) for output validation\n");
+		       "\t\t--tolerance        : maximum tolerance (%%) for output validation\n"
+		       "\t\t--stats            : enable reporting performance statistics\n");
 		printf("\n");
 	}
 }
@@ -242,13 +244,12 @@ print_usage(char *program)
 	ml_test_dump_names(ml_dump_test_options);
 }
 
-static struct option lgopts[] = {{ML_TEST, 1, 0, 0},	   {ML_DEVICE_ID, 1, 0, 0},
-				 {ML_SOCKET_ID, 1, 0, 0},  {ML_MODELS, 1, 0, 0},
-				 {ML_FILELIST, 1, 0, 0},   {ML_REPETITIONS, 1, 0, 0},
-				 {ML_BURST_SIZE, 1, 0, 0}, {ML_QUEUE_PAIRS, 1, 0, 0},
-				 {ML_QUEUE_SIZE, 1, 0, 0}, {ML_BATCHES, 1, 0, 0},
-				 {ML_TOLERANCE, 1, 0, 0},  {ML_DEBUG, 0, 0, 0},
-				 {ML_HELP, 0, 0, 0},	   {NULL, 0, 0, 0}};
+static struct option lgopts[] = {
+	{ML_TEST, 1, 0, 0},	  {ML_DEVICE_ID, 1, 0, 0},   {ML_SOCKET_ID, 1, 0, 0},
+	{ML_MODELS, 1, 0, 0},	  {ML_FILELIST, 1, 0, 0},    {ML_REPETITIONS, 1, 0, 0},
+	{ML_BURST_SIZE, 1, 0, 0}, {ML_QUEUE_PAIRS, 1, 0, 0}, {ML_QUEUE_SIZE, 1, 0, 0},
+	{ML_BATCHES, 1, 0, 0},	  {ML_TOLERANCE, 1, 0, 0},   {ML_STATS, 0, 0, 0},
+	{ML_DEBUG, 0, 0, 0},	  {ML_HELP, 0, 0, 0},	     {NULL, 0, 0, 0}};
 
 static int
 ml_opts_parse_long(int opt_idx, struct ml_options *opt)
@@ -283,6 +284,11 @@ ml_options_parse(struct ml_options *opt, int argc, char **argv)
 	while ((opts = getopt_long(argc, argv, "", lgopts, &opt_idx)) != EOF) {
 		switch (opts) {
 		case 0: /* parse long options */
+			if (!strcmp(lgopts[opt_idx].name, "stats")) {
+				opt->stats = true;
+				break;
+			}
+
 			if (!strcmp(lgopts[opt_idx].name, "debug")) {
 				opt->debug = true;
 				break;
diff --git a/app/test-mldev/ml_options.h b/app/test-mldev/ml_options.h
index 79ac54de98..a375ae6750 100644
--- a/app/test-mldev/ml_options.h
+++ b/app/test-mldev/ml_options.h
@@ -24,6 +24,7 @@
 #define ML_QUEUE_SIZE  ("queue_size")
 #define ML_BATCHES     ("batches")
 #define ML_TOLERANCE   ("tolerance")
+#define ML_STATS       ("stats")
 #define ML_DEBUG       ("debug")
 #define ML_HELP	       ("help")
 
@@ -46,6 +47,7 @@ struct ml_options {
 	uint16_t queue_size;
 	uint16_t batches;
 	float tolerance;
+	bool stats;
 	bool debug;
 };
 
diff --git a/app/test-mldev/test_inference_common.c b/app/test-mldev/test_inference_common.c
index 008cee1023..8d1fc55c2f 100644
--- a/app/test-mldev/test_inference_common.c
+++ b/app/test-mldev/test_inference_common.c
@@ -11,6 +11,7 @@
 #include <unistd.h>
 
 #include <rte_common.h>
+#include <rte_cycles.h>
 #include <rte_hash_crc.h>
 #include <rte_launch.h>
 #include <rte_lcore.h>
@@ -45,6 +46,17 @@
 		}                                                                                  \
 	} while (0)
 
+static void
+print_line(uint16_t len)
+{
+	uint16_t i;
+
+	for (i = 0; i < len; i++)
+		printf("-");
+
+	printf("\n");
+}
+
 /* Enqueue inference requests with burst size equal to 1 */
 static int
 ml_enqueue_single(void *arg)
@@ -54,6 +66,7 @@ ml_enqueue_single(void *arg)
 	struct rte_ml_op *op = NULL;
 	struct ml_core_args *args;
 	uint64_t model_enq = 0;
+	uint64_t start_cycle;
 	uint32_t burst_enq;
 	uint32_t lcore_id;
 	int16_t fid;
@@ -61,6 +74,7 @@ ml_enqueue_single(void *arg)
 
 	lcore_id = rte_lcore_id();
 	args = &t->args[lcore_id];
+	args->start_cycles = 0;
 	model_enq = 0;
 
 	if (args->nb_reqs == 0)
@@ -96,10 +110,12 @@ ml_enqueue_single(void *arg)
 	req->fid = fid;
 
 enqueue_req:
+	start_cycle = rte_get_tsc_cycles();
 	burst_enq = rte_ml_enqueue_burst(t->cmn.opt->dev_id, args->qp_id, &op, 1);
 	if (burst_enq == 0)
 		goto enqueue_req;
 
+	args->start_cycles += start_cycle;
 	fid++;
 	if (likely(fid <= args->end_fid))
 		goto next_model;
@@ -123,10 +139,12 @@ ml_dequeue_single(void *arg)
 	uint64_t total_deq = 0;
 	uint8_t nb_filelist;
 	uint32_t burst_deq;
+	uint64_t end_cycle;
 	uint32_t lcore_id;
 
 	lcore_id = rte_lcore_id();
 	args = &t->args[lcore_id];
+	args->end_cycles = 0;
 	nb_filelist = args->end_fid - args->start_fid + 1;
 
 	if (args->nb_reqs == 0)
@@ -134,9 +152,11 @@ ml_dequeue_single(void *arg)
 
 dequeue_req:
 	burst_deq = rte_ml_dequeue_burst(t->cmn.opt->dev_id, args->qp_id, &op, 1);
+	end_cycle = rte_get_tsc_cycles();
 
 	if (likely(burst_deq == 1)) {
 		total_deq += burst_deq;
+		args->end_cycles += end_cycle;
 		if (unlikely(op->status == RTE_ML_OP_STATUS_ERROR)) {
 			rte_ml_op_error_get(t->cmn.opt->dev_id, op, &error);
 			ml_err("error_code = 0x%" PRIx64 ", error_message = %s\n", error.errcode,
@@ -159,6 +179,7 @@ ml_enqueue_burst(void *arg)
 {
 	struct test_inference *t = ml_test_priv((struct ml_test *)arg);
 	struct ml_core_args *args;
+	uint64_t start_cycle;
 	uint16_t ops_count;
 	uint64_t model_enq;
 	uint16_t burst_enq;
@@ -171,6 +192,7 @@ ml_enqueue_burst(void *arg)
 
 	lcore_id = rte_lcore_id();
 	args = &t->args[lcore_id];
+	args->start_cycles = 0;
 	model_enq = 0;
 
 	if (args->nb_reqs == 0)
@@ -212,8 +234,10 @@ ml_enqueue_burst(void *arg)
 	pending = ops_count;
 
 enqueue_reqs:
+	start_cycle = rte_get_tsc_cycles();
 	burst_enq =
 		rte_ml_enqueue_burst(t->cmn.opt->dev_id, args->qp_id, &args->enq_ops[idx], pending);
+	args->start_cycles += burst_enq * start_cycle;
 	pending = pending - burst_enq;
 
 	if (pending > 0) {
@@ -243,11 +267,13 @@ ml_dequeue_burst(void *arg)
 	uint64_t total_deq = 0;
 	uint16_t burst_deq = 0;
 	uint8_t nb_filelist;
+	uint64_t end_cycle;
 	uint32_t lcore_id;
 	uint32_t i;
 
 	lcore_id = rte_lcore_id();
 	args = &t->args[lcore_id];
+	args->end_cycles = 0;
 	nb_filelist = args->end_fid - args->start_fid + 1;
 
 	if (args->nb_reqs == 0)
@@ -256,9 +282,11 @@ ml_dequeue_burst(void *arg)
 dequeue_burst:
 	burst_deq = rte_ml_dequeue_burst(t->cmn.opt->dev_id, args->qp_id, args->deq_ops,
 					 t->cmn.opt->burst_size);
+	end_cycle = rte_get_tsc_cycles();
 
 	if (likely(burst_deq > 0)) {
 		total_deq += burst_deq;
+		args->end_cycles += burst_deq * end_cycle;
 
 		for (i = 0; i < burst_deq; i++) {
 			if (unlikely(args->deq_ops[i]->status == RTE_ML_OP_STATUS_ERROR)) {
@@ -387,6 +415,7 @@ test_inference_opt_dump(struct ml_options *opt)
 	ml_dump("queue_pairs", "%u", opt->queue_pairs);
 	ml_dump("queue_size", "%u", opt->queue_size);
 	ml_dump("tolerance", "%-7.3f", opt->tolerance);
+	ml_dump("stats", "%s", (opt->stats ? "true" : "false"));
 
 	if (opt->batches == 0)
 		ml_dump("batches", "%u (default)", opt->batches);
@@ -459,6 +488,11 @@ test_inference_setup(struct ml_test *test, struct ml_options *opt)
 			RTE_CACHE_LINE_SIZE, opt->socket_id);
 	}
 
+	for (i = 0; i < RTE_MAX_LCORE; i++) {
+		t->args[i].start_cycles = 0;
+		t->args[i].end_cycles = 0;
+	}
+
 	return 0;
 
 error:
@@ -985,3 +1019,108 @@ ml_inference_launch_cores(struct ml_test *test, struct ml_options *opt, int16_t
 
 	return 0;
 }
+
+int
+ml_inference_stats_get(struct ml_test *test, struct ml_options *opt)
+{
+	struct test_inference *t = ml_test_priv(test);
+	uint64_t total_cycles = 0;
+	uint32_t nb_filelist;
+	uint64_t throughput;
+	uint64_t avg_e2e;
+	uint32_t qp_id;
+	uint64_t freq;
+	int ret;
+	int i;
+
+	if (!opt->stats)
+		return 0;
+
+	/* get xstats size */
+	t->xstats_size = rte_ml_dev_xstats_names_get(opt->dev_id, NULL, 0);
+	if (t->xstats_size > 0) {
+		/* allocate for xstats_map and values */
+		t->xstats_map = rte_malloc(
+			"ml_xstats_map", t->xstats_size * sizeof(struct rte_ml_dev_xstats_map), 0);
+		if (t->xstats_map == NULL) {
+			ret = -ENOMEM;
+			goto error;
+		}
+
+		t->xstats_values =
+			rte_malloc("ml_xstats_values", t->xstats_size * sizeof(uint64_t), 0);
+		if (t->xstats_values == NULL) {
+			ret = -ENOMEM;
+			goto error;
+		}
+
+		ret = rte_ml_dev_xstats_names_get(opt->dev_id, t->xstats_map, t->xstats_size);
+		if (ret != t->xstats_size) {
+			printf("Unable to get xstats names, ret = %d\n", ret);
+			ret = -1;
+			goto error;
+		}
+
+		for (i = 0; i < t->xstats_size; i++)
+			rte_ml_dev_xstats_get(opt->dev_id, &t->xstats_map[i].id,
+					      &t->xstats_values[i], 1);
+	}
+
+	/* print xstats */
+	printf("\n");
+	print_line(80);
+	printf(" ML Device Extended Statistics\n");
+	print_line(80);
+	for (i = 0; i < t->xstats_size; i++)
+		printf(" %-64s = %" PRIu64 "\n", t->xstats_map[i].name, t->xstats_values[i]);
+	print_line(80);
+
+	/* release buffers */
+	if (t->xstats_map)
+		rte_free(t->xstats_map);
+
+	if (t->xstats_values)
+		rte_free(t->xstats_values);
+
+	/* print end-to-end stats */
+	freq = rte_get_tsc_hz();
+	for (qp_id = 0; qp_id < RTE_MAX_LCORE; qp_id++)
+		total_cycles += t->args[qp_id].end_cycles - t->args[qp_id].start_cycles;
+	avg_e2e = total_cycles / opt->repetitions;
+
+	if (freq == 0) {
+		avg_e2e = total_cycles / opt->repetitions;
+		printf(" %-64s = %" PRIu64 "\n", "Average End-to-End Latency (cycles)", avg_e2e);
+	} else {
+		avg_e2e = (total_cycles * NS_PER_S) / (opt->repetitions * freq);
+		printf(" %-64s = %" PRIu64 "\n", "Average End-to-End Latency (ns)", avg_e2e);
+	}
+
+	if (strcmp(opt->test_name, "inference_ordered") == 0)
+		nb_filelist = 1;
+	else
+		nb_filelist = t->cmn.opt->nb_filelist;
+
+	if (freq == 0) {
+		throughput = (nb_filelist * t->cmn.opt->repetitions * 1000000) / total_cycles;
+		printf(" %-64s = %" PRIu64 "\n", "Average Throughput (inferences / million cycles)",
+		       throughput);
+	} else {
+		throughput = (nb_filelist * t->cmn.opt->repetitions * freq) / total_cycles;
+		printf(" %-64s = %" PRIu64 "\n", "Average Throughput (inferences / second)",
+		       throughput);
+	}
+
+	print_line(80);
+
+	return 0;
+
+error:
+	if (t->xstats_map)
+		rte_free(t->xstats_map);
+
+	if (t->xstats_values)
+		rte_free(t->xstats_values);
+
+	return ret;
+}
diff --git a/app/test-mldev/test_inference_common.h b/app/test-mldev/test_inference_common.h
index 3f2b042360..bb2920cc30 100644
--- a/app/test-mldev/test_inference_common.h
+++ b/app/test-mldev/test_inference_common.h
@@ -32,6 +32,9 @@ struct ml_core_args {
 	struct rte_ml_op **enq_ops;
 	struct rte_ml_op **deq_ops;
 	struct ml_request **reqs;
+
+	uint64_t start_cycles;
+	uint64_t end_cycles;
 };
 
 struct test_inference {
@@ -50,6 +53,10 @@ struct test_inference {
 	int (*dequeue)(void *arg);
 
 	struct ml_core_args args[RTE_MAX_LCORE];
+
+	struct rte_ml_dev_xstats_map *xstats_map;
+	uint64_t *xstats_values;
+	int xstats_size;
 } __rte_cache_aligned;
 
 bool test_inference_cap_check(struct ml_options *opt);
@@ -67,5 +74,6 @@ void ml_inference_mem_destroy(struct ml_test *test, struct ml_options *opt);
 int ml_inference_result(struct ml_test *test, struct ml_options *opt, int16_t fid);
 int ml_inference_launch_cores(struct ml_test *test, struct ml_options *opt, int16_t start_fid,
 			      int16_t end_fid);
+int ml_inference_stats_get(struct ml_test *test, struct ml_options *opt);
 
 #endif /* _ML_TEST_INFERENCE_COMMON_ */
diff --git a/app/test-mldev/test_inference_interleave.c b/app/test-mldev/test_inference_interleave.c
index 74ad0c597f..d86838c3fa 100644
--- a/app/test-mldev/test_inference_interleave.c
+++ b/app/test-mldev/test_inference_interleave.c
@@ -60,7 +60,11 @@ test_inference_interleave_driver(struct ml_test *test, struct ml_options *opt)
 			goto error;
 
 		ml_inference_iomem_destroy(test, opt, fid);
+	}
+
+	ml_inference_stats_get(test, opt);
 
+	for (fid = 0; fid < opt->nb_filelist; fid++) {
 		ret = ml_model_stop(test, opt, &t->model[fid], fid);
 		if (ret != 0)
 			goto error;
diff --git a/app/test-mldev/test_inference_ordered.c b/app/test-mldev/test_inference_ordered.c
index 84e6bf9109..3826121a65 100644
--- a/app/test-mldev/test_inference_ordered.c
+++ b/app/test-mldev/test_inference_ordered.c
@@ -58,6 +58,7 @@ test_inference_ordered_driver(struct ml_test *test, struct ml_options *opt)
 		goto error;
 
 	ml_inference_iomem_destroy(test, opt, fid);
+	ml_inference_stats_get(test, opt);
 
 	/* stop model */
 	ret = ml_model_stop(test, opt, &t->model[fid], fid);
-- 
2.17.1