From: Srikanth Yalavarthi
To: Srikanth Yalavarthi
Subject: [PATCH v3 10/12] app/mldev: enable support for inference validation
Date: Thu, 8 Dec 2022 11:29:16 -0800
Message-ID: <20221208192918.25022-10-syalavarthi@marvell.com>
In-Reply-To: <20221208192918.25022-1-syalavarthi@marvell.com>
References: <20221129082109.6809-1-syalavarthi@marvell.com> <20221208192918.25022-1-syalavarthi@marvell.com>
List-Id: DPDK patches and discussions

Enable support for validating inference output against a reference
output provided by the user. Validation succeeds only when the
inference outputs are within the tolerance (in percent) specified
through the "--tolerance" command line option.

Signed-off-by: Srikanth Yalavarthi
---
 app/test-mldev/meson.build             |   2 +-
 app/test-mldev/ml_options.c            |  36 +++-
 app/test-mldev/ml_options.h            |   3 +
 app/test-mldev/test_inference_common.c | 218 ++++++++++++++++++++++++-
 app/test-mldev/test_inference_common.h |   1 +
 app/test-mldev/test_model_common.h     |   1 +
 6 files changed, 250 insertions(+), 11 deletions(-)

diff --git a/app/test-mldev/meson.build b/app/test-mldev/meson.build
index 41d22fb22c..15db534dc2 100644
--- a/app/test-mldev/meson.build
+++ b/app/test-mldev/meson.build
@@ -21,4 +21,4 @@ sources = files(
         'test_inference_interleave.c',
 )
 
-deps += ['mldev']
+deps += ['mldev', 'hash']
diff --git a/app/test-mldev/ml_options.c b/app/test-mldev/ml_options.c
index 331ec1704c..092303903f 100644
--- a/app/test-mldev/ml_options.c
+++ b/app/test-mldev/ml_options.c
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -34,6 +35,7 @@ ml_options_default(struct ml_options *opt)
 	opt->queue_pairs = 1;
 	opt->queue_size = 1;
 	opt->batches = 0;
+	opt->tolerance = 0.0;
 	opt->debug = false;
 }
 
@@ -139,6 +141,13 @@ ml_parse_filelist(struct ml_options *opt, const char *arg)
 	}
 	strlcpy(opt->filelist[opt->nb_filelist].output, token, PATH_MAX);
 
+	/* reference - optional */
+	token = strtok(NULL, delim);
+	if (token != NULL)
+		strlcpy(opt->filelist[opt->nb_filelist].reference, token, PATH_MAX);
+	else
+		memset(opt->filelist[opt->nb_filelist].reference, 0, PATH_MAX);
+
 	opt->nb_filelist++;
 
 	if (opt->nb_filelist == 0) {
@@ -183,6 +192,14 @@ ml_parse_batches(struct ml_options *opt, const char *arg)
 	return parser_read_uint16(&opt->batches, arg);
 }
 
+static int
+ml_parse_tolerance(struct ml_options *opt, const char *arg)
+{
+	opt->tolerance = fabs(atof(arg));
+
+	return 0;
+}
+
 static void
 ml_dump_test_options(const char *testname)
 {
@@ -199,12 +216,13 @@ ml_dump_test_options(const char *testname)
 
 	if ((strcmp(testname, "inference_ordered") == 0) ||
 	    (strcmp(testname, "inference_interleave") == 0)) {
-		printf("\t\t--filelist : comma separated list of model, input and output\n"
+		printf("\t\t--filelist : comma separated list of model, input, output and reference\n"
 		       "\t\t--repetitions : number of inference repetitions\n"
 		       "\t\t--burst_size : inference burst size\n"
 		       "\t\t--queue_pairs : number of queue pairs to create\n"
 		       "\t\t--queue_size : size fo queue-pair\n"
-		       "\t\t--batches : number of batches of input\n");
+		       "\t\t--batches : number of batches of input\n"
+		       "\t\t--tolerance : maximum tolerance (%%) for output validation\n");
 		printf("\n");
 	}
 }
@@ -224,12 +242,13 @@ print_usage(char *program)
 	ml_test_dump_names(ml_dump_test_options);
 }
 
-static struct option lgopts[] = {
-	{ML_TEST, 1, 0, 0}, {ML_DEVICE_ID, 1, 0, 0}, {ML_SOCKET_ID, 1, 0, 0},
-	{ML_MODELS, 1, 0, 0}, {ML_FILELIST, 1, 0, 0}, {ML_REPETITIONS, 1, 0, 0},
-	{ML_BURST_SIZE, 1, 0, 0}, {ML_QUEUE_PAIRS, 1, 0, 0}, {ML_QUEUE_SIZE, 1, 0, 0},
-	{ML_BATCHES, 1, 0, 0}, {ML_DEBUG, 0, 0, 0}, {ML_HELP, 0, 0, 0},
-	{NULL, 0, 0, 0}};
+static struct option lgopts[] = {{ML_TEST, 1, 0, 0}, {ML_DEVICE_ID, 1, 0, 0},
+				 {ML_SOCKET_ID, 1, 0, 0}, {ML_MODELS, 1, 0, 0},
+				 {ML_FILELIST, 1, 0, 0}, {ML_REPETITIONS, 1, 0, 0},
+				 {ML_BURST_SIZE, 1, 0, 0}, {ML_QUEUE_PAIRS, 1, 0, 0},
+				 {ML_QUEUE_SIZE, 1, 0, 0}, {ML_BATCHES, 1, 0, 0},
+				 {ML_TOLERANCE, 1, 0, 0}, {ML_DEBUG, 0, 0, 0},
+				 {ML_HELP, 0, 0, 0}, {NULL, 0, 0, 0}};
 
 static int
 ml_opts_parse_long(int opt_idx, struct ml_options *opt)
@@ -242,6 +261,7 @@ ml_opts_parse_long(int opt_idx, struct ml_options *opt)
 		{ML_FILELIST, ml_parse_filelist}, {ML_REPETITIONS, ml_parse_repetitions},
 		{ML_BURST_SIZE, ml_parse_burst_size}, {ML_QUEUE_PAIRS, ml_parse_queue_pairs},
 		{ML_QUEUE_SIZE, ml_parse_queue_size}, {ML_BATCHES, ml_parse_batches},
+		{ML_TOLERANCE, ml_parse_tolerance},
 	};
 
 	for (i = 0; i < RTE_DIM(parsermap); i++) {
diff --git a/app/test-mldev/ml_options.h b/app/test-mldev/ml_options.h
index d23e842895..79ac54de98 100644
--- a/app/test-mldev/ml_options.h
+++ b/app/test-mldev/ml_options.h
@@ -23,6 +23,7 @@
 #define ML_QUEUE_PAIRS ("queue_pairs")
 #define ML_QUEUE_SIZE ("queue_size")
 #define ML_BATCHES ("batches")
+#define ML_TOLERANCE ("tolerance")
 #define ML_DEBUG ("debug")
 #define ML_HELP ("help")
 
@@ -30,6 +31,7 @@ struct ml_filelist {
 	char model[PATH_MAX];
 	char input[PATH_MAX];
 	char output[PATH_MAX];
+	char reference[PATH_MAX];
 };
 
 struct ml_options {
@@ -43,6 +45,7 @@ struct ml_options {
 	uint16_t queue_pairs;
 	uint16_t queue_size;
 	uint16_t batches;
+	float tolerance;
 	bool debug;
 };
 
diff --git a/app/test-mldev/test_inference_common.c b/app/test-mldev/test_inference_common.c
index 4e29f6c7eb..008cee1023 100644
--- a/app/test-mldev/test_inference_common.c
+++ b/app/test-mldev/test_inference_common.c
@@ -3,12 +3,15 @@
  */
 
 #include
+#include
 #include
 
 #include
 #include
+#include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -21,6 +24,27 @@
 #include "test_common.h"
 #include "test_inference_common.h"
 
+#define ML_TEST_READ_TYPE(buffer, type) (*((type *)buffer))
+
+#define ML_TEST_CHECK_OUTPUT(output, reference, tolerance) \
+	(((float)output - (float)reference) <= (((float)reference * tolerance) / 100.0))
+
+#define ML_OPEN_WRITE_GET_ERR(name, buffer, size, err) \
+	do { \
+		FILE *fp = fopen(name, "w+"); \
+		if (fp == NULL) { \
+			ml_err("Unable to create file: %s, error: %s", name, strerror(errno)); \
+			err = true; \
+		} else { \
+			if (fwrite(buffer, 1, size, fp) != size) { \
+				ml_err("Error writing output, file: %s, error: %s", name, \
+				       strerror(errno)); \
+				err = true; \
+			} \
+			fclose(fp); \
+		} \
+	} while (0)
+
 /* Enqueue inference requests with burst size equal to 1 */
 static int
 ml_enqueue_single(void *arg)
@@ -362,6 +386,7 @@ test_inference_opt_dump(struct ml_options *opt)
 	ml_dump("burst_size", "%u", opt->burst_size);
 	ml_dump("queue_pairs", "%u", opt->queue_pairs);
 	ml_dump("queue_size", "%u", opt->queue_size);
+	ml_dump("tolerance", "%-7.3f", opt->tolerance);
 
 	if (opt->batches == 0)
 		ml_dump("batches", "%u (default)", opt->batches);
@@ -373,6 +398,8 @@ test_inference_opt_dump(struct ml_options *opt)
 		ml_dump_list("model", i, opt->filelist[i].model);
 		ml_dump_list("input", i, opt->filelist[i].input);
 		ml_dump_list("output", i, opt->filelist[i].output);
+		if (strcmp(opt->filelist[i].reference, "\0") != 0)
+			ml_dump_list("reference", i, opt->filelist[i].reference);
 	}
 	ml_dump_end;
 }
@@ -397,6 +424,7 @@ test_inference_setup(struct ml_test *test, struct ml_options *opt)
 	t = ml_test_priv(test);
 
 	t->nb_used = 0;
+	t->nb_valid = 0;
 	t->cmn.result = ML_TEST_FAILED;
 	t->cmn.opt = opt;
 
@@ -572,6 +600,9 @@ ml_inference_iomem_setup(struct ml_test *test, struct ml_options *opt, int16_t f
 
 	/* allocate buffer for user data */
 	mz_size = t->model[fid].inp_dsize + t->model[fid].out_dsize;
+	if (strcmp(opt->filelist[fid].reference, "\0") != 0)
+		mz_size += t->model[fid].out_dsize;
+
 	sprintf(mz_name, "ml_user_data_%d", fid);
 	mz = rte_memzone_reserve(mz_name, mz_size, opt->socket_id, 0);
 	if (mz == NULL) {
@@ -582,6 +613,10 @@ ml_inference_iomem_setup(struct ml_test *test, struct ml_options *opt, int16_t f
 
 	t->model[fid].input = mz->addr;
 	t->model[fid].output = t->model[fid].input + t->model[fid].inp_dsize;
+	if (strcmp(opt->filelist[fid].reference, "\0") != 0)
+		t->model[fid].reference = t->model[fid].output + t->model[fid].out_dsize;
+	else
+		t->model[fid].reference = NULL;
 
 	/* load input file */
 	fp = fopen(opt->filelist[fid].input, "r");
@@ -610,6 +645,27 @@ ml_inference_iomem_setup(struct ml_test *test, struct ml_options *opt, int16_t f
 	}
 	fclose(fp);
 
+	/* load reference file */
+	if (t->model[fid].reference != NULL) {
+		fp = fopen(opt->filelist[fid].reference, "r");
+		if (fp == NULL) {
+			ml_err("Failed to open reference file : %s\n",
+			       opt->filelist[fid].reference);
+			ret = -errno;
+			goto error;
+		}
+
+		if (fread(t->model[fid].reference, 1, t->model[fid].out_dsize, fp) !=
+		    t->model[fid].out_dsize) {
+			ml_err("Failed to read reference file : %s\n",
+			       opt->filelist[fid].reference);
+			ret = -errno;
+			fclose(fp);
+			goto error;
+		}
+		fclose(fp);
+	}
+
 	/* create mempool for quantized input and output buffers. ml_request_initialize is
 	 * used as a callback for object creation.
 	 */
@@ -694,6 +750,121 @@ ml_inference_mem_destroy(struct ml_test *test, struct ml_options *opt)
 	rte_mempool_free(t->op_pool);
 }
 
+static bool
+ml_inference_validation(struct ml_test *test, struct ml_request *req)
+{
+	struct test_inference *t = ml_test_priv((struct ml_test *)test);
+	struct ml_model *model;
+	uint32_t nb_elements;
+	uint8_t *reference;
+	uint8_t *output;
+	bool match;
+	uint32_t i;
+	uint32_t j;
+
+	model = &t->model[req->fid];
+
+	/* compare crc when tolerance is 0 */
+	if (t->cmn.opt->tolerance == 0.0) {
+		match = (rte_hash_crc(model->output, model->out_dsize, 0) ==
+			 rte_hash_crc(model->reference, model->out_dsize, 0));
+	} else {
+		output = model->output;
+		reference = model->reference;
+
+		i = 0;
+next_output:
+		nb_elements =
+			model->info.output_info[i].shape.w * model->info.output_info[i].shape.x *
+			model->info.output_info[i].shape.y * model->info.output_info[i].shape.z;
+		j = 0;
+next_element:
+		match = false;
+		switch (model->info.output_info[i].dtype) {
+		case RTE_ML_IO_TYPE_INT8:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, int8_t),
+						 ML_TEST_READ_TYPE(reference, int8_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(int8_t);
+			reference += sizeof(int8_t);
+			break;
+		case RTE_ML_IO_TYPE_UINT8:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, uint8_t),
+						 ML_TEST_READ_TYPE(reference, uint8_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(uint8_t);
+			reference += sizeof(uint8_t);
+			break;
+		case RTE_ML_IO_TYPE_INT16:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, int16_t),
+						 ML_TEST_READ_TYPE(reference, int16_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(int16_t);
+			reference += sizeof(int16_t);
+			break;
+		case RTE_ML_IO_TYPE_UINT16:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, uint16_t),
+						 ML_TEST_READ_TYPE(reference, uint16_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(uint16_t);
+			reference += sizeof(uint16_t);
+			break;
+		case RTE_ML_IO_TYPE_INT32:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, int32_t),
+						 ML_TEST_READ_TYPE(reference, int32_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(int32_t);
+			reference += sizeof(int32_t);
+			break;
+		case RTE_ML_IO_TYPE_UINT32:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, uint32_t),
+						 ML_TEST_READ_TYPE(reference, uint32_t),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(uint32_t);
+			reference += sizeof(uint32_t);
+			break;
+		case RTE_ML_IO_TYPE_FP32:
+			if (ML_TEST_CHECK_OUTPUT(ML_TEST_READ_TYPE(output, float),
+						 ML_TEST_READ_TYPE(reference, float),
+						 t->cmn.opt->tolerance))
+				match = true;
+
+			output += sizeof(float);
+			reference += sizeof(float);
+			break;
+		default: /* other types, fp8, fp16, bfloat16 */
+			match = true;
+		}
+
+		if (!match)
+			goto done;
+		j++;
+		if (j < nb_elements)
+			goto next_element;
+
+		i++;
+		if (i < model->info.nb_outputs)
+			goto next_output;
+	}
+done:
+	if (match)
+		t->nb_valid++;
+
+	return match;
+}
+
 /* Callback for mempool object iteration. This call would dequantize output data. */
 static void
 ml_request_finish(struct rte_mempool *mp, void *opaque, void *obj, unsigned int obj_idx)
@@ -701,9 +872,10 @@ ml_request_finish(struct rte_mempool *mp, void *opaque, void *obj, unsigned int
 	struct test_inference *t = ml_test_priv((struct ml_test *)opaque);
 	struct ml_request *req = (struct ml_request *)obj;
 	struct ml_model *model = &t->model[req->fid];
+	char str[PATH_MAX];
+	bool error = false;
 
 	RTE_SET_USED(mp);
-	RTE_SET_USED(obj_idx);
 
 	if (req->niters == 0)
 		return;
@@ -711,6 +883,48 @@ ml_request_finish(struct rte_mempool *mp, void *opaque, void *obj, unsigned int
 	t->nb_used++;
 	rte_ml_io_dequantize(t->cmn.opt->dev_id, model->id, t->model[req->fid].nb_batches,
 			     req->output, model->output);
+
+	if (model->reference == NULL) {
+		t->nb_valid++;
+		goto dump_output_pass;
+	}
+
+	if (!ml_inference_validation(opaque, req))
+		goto dump_output_fail;
+	else
+		goto dump_output_pass;
+
+dump_output_pass:
+	if (obj_idx == 0) {
+		/* write quantized output */
+		snprintf(str, PATH_MAX, "%s.q", t->cmn.opt->filelist[req->fid].output);
+		ML_OPEN_WRITE_GET_ERR(str, req->output, model->out_qsize, error);
+		if (error)
+			return;
+
+		/* write dequantized output */
+		snprintf(str, PATH_MAX, "%s", t->cmn.opt->filelist[req->fid].output);
+		ML_OPEN_WRITE_GET_ERR(str, model->output, model->out_dsize, error);
+		if (error)
+			return;
+	}
+
+	return;
+
+dump_output_fail:
+	if (t->cmn.opt->debug) {
+		/* dump quantized output buffer */
+		snprintf(str, PATH_MAX, "%s.q.%d", t->cmn.opt->filelist[req->fid].output, obj_idx);
+		ML_OPEN_WRITE_GET_ERR(str, req->output, model->out_qsize, error);
+		if (error)
+			return;
+
+		/* dump dequantized output buffer */
+		snprintf(str, PATH_MAX, "%s.%d", t->cmn.opt->filelist[req->fid].output, obj_idx);
+		ML_OPEN_WRITE_GET_ERR(str, model->output, model->out_dsize, error);
+		if (error)
+			return;
+	}
 }
 
 int
@@ -722,7 +936,7 @@ ml_inference_result(struct ml_test *test, struct ml_options *opt, int16_t fid)
 
 	rte_mempool_obj_iter(t->model[fid].io_pool, ml_request_finish, test);
 
-	if (t->nb_used > 0)
+	if (t->nb_used == t->nb_valid)
 		t->cmn.result = ML_TEST_SUCCESS;
 	else
 		t->cmn.result = ML_TEST_FAILED;
diff --git a/app/test-mldev/test_inference_common.h b/app/test-mldev/test_inference_common.h
index 1bac2dcfa0..3f2b042360 100644
--- a/app/test-mldev/test_inference_common.h
+++ b/app/test-mldev/test_inference_common.h
@@ -43,6 +43,7 @@ struct test_inference {
 	struct rte_mempool *op_pool;
 
 	uint64_t nb_used;
+	uint64_t nb_valid;
 	int16_t fid;
 
 	int (*enqueue)(void *arg);
diff --git a/app/test-mldev/test_model_common.h b/app/test-mldev/test_model_common.h
index dfbf568f0b..ce12cbfecc 100644
--- a/app/test-mldev/test_model_common.h
+++ b/app/test-mldev/test_model_common.h
@@ -31,6 +31,7 @@ struct ml_model {
 
 	uint8_t *input;
 	uint8_t *output;
+	uint8_t *reference;
 
 	struct rte_mempool *io_pool;
 	uint32_t nb_batches;
-- 
2.17.1
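
A note on the tolerance comparison in this patch: ML_TEST_CHECK_OUTPUT compares the signed difference (output - reference) against (reference * tolerance) / 100. As written, an output far below a positive reference still passes, and a negative reference produces a negative bound. The following is a minimal sketch of a symmetric check, assuming the intent is |output - reference| <= |reference| * tolerance / 100. The names abs_f, within_tolerance and first_mismatch are hypothetical and not part of the patch or of DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Absolute value helper, written out to avoid a libm dependency. */
static inline float
abs_f(float x)
{
	return x < 0.0f ? -x : x;
}

/*
 * Hypothetical helper (not in the patch): true when output lies within
 * tolerance_pct percent of reference, in either direction.
 */
static bool
within_tolerance(float output, float reference, float tolerance_pct)
{
	assert(tolerance_pct >= 0.0f);

	return abs_f(output - reference) <= abs_f(reference) * tolerance_pct / 100.0f;
}

/*
 * Hypothetical helper: index of the first element outside the tolerance,
 * or n when every element matches.
 */
static size_t
first_mismatch(const float *output, const float *reference, size_t n, float tolerance_pct)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!within_tolerance(output[i], reference[i], tolerance_pct))
			return i;
	}

	return n;
}
```

With a tolerance of 1%, within_tolerance(98.0f, 100.0f, 1.0f) is false, whereas the signed comparison in the macro accepts the same pair because -2 <= 1. Returning the index of the first mismatch is a design choice made for this sketch only; it would let a caller report which element failed.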