From: Srikanth Yalavarthi
To: Thomas Monjalon, Srikanth Yalavarthi
Subject: [PATCH v5 12/12] app/mldev: add documentation for mldev test cases
Date: Fri, 10 Mar 2023 00:09:34 -0800
Message-ID: <20230310080935.2460-13-syalavarthi@marvell.com>
In-Reply-To: <20230310080935.2460-1-syalavarthi@marvell.com>
References: <20221129070746.20396-1-syalavarthi@marvell.com> <20230310080935.2460-1-syalavarthi@marvell.com>
List-Id: DPDK patches and discussions

Added documentation specific to mldev test cases. Added details
about all test cases and option supported by individual tests.

Signed-off-by: Srikanth Yalavarthi
---
 MAINTAINERS                                   |   1 +
 .../tools/img/mldev_inference_interleave.svg  | 669 ++++++++++++++++++
 .../tools/img/mldev_inference_ordered.svg     | 528 ++++++++++++++
 .../tools/img/mldev_model_ops_subtest_a.svg   | 420 +++++++++++
 .../tools/img/mldev_model_ops_subtest_b.svg   | 423 +++++++++++
 .../tools/img/mldev_model_ops_subtest_c.svg   | 366 ++++++++++
 .../tools/img/mldev_model_ops_subtest_d.svg   | 424 +++++++++++
 doc/guides/tools/index.rst                    |   1 +
 doc/guides/tools/testmldev.rst                | 441 ++++++++++++
 9 files changed, 3273 insertions(+)
 create mode 100644 doc/guides/tools/img/mldev_inference_interleave.svg
 create mode 100644 doc/guides/tools/img/mldev_inference_ordered.svg
 create mode 100644 doc/guides/tools/img/mldev_model_ops_subtest_a.svg
 create mode 100644 doc/guides/tools/img/mldev_model_ops_subtest_b.svg
 create mode 100644 doc/guides/tools/img/mldev_model_ops_subtest_c.svg
 create mode 100644 doc/guides/tools/img/mldev_model_ops_subtest_d.svg
 create mode 100644 doc/guides/tools/testmldev.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index 1914c4d614..320842e13f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -481,6 +481,7 @@ M: Srikanth Yalavarthi
 F: lib/mldev/
 F: doc/guides/prog_guide/mldev.rst
 F: app/test-mldev
+F: doc/guides/tools/testmldev.rst
 
 DMA device API - EXPERIMENTAL
 M: Chengwen Feng
diff --git a/doc/guides/tools/img/mldev_inference_interleave.svg b/doc/guides/tools/img/mldev_inference_interleave.svg
new file mode 100644
index 0000000000..3a741ea627
--- /dev/null
+++ b/doc/guides/tools/img/mldev_inference_interleave.svg
@@ -0,0 +1,669 @@
[Figure "test: inference_interleave". Labels: QueuePair 0, QueuePair 1, QueuePair 2; Machine Learning Hardware Engine; Enqueue Workers (lcore 1, lcore 3, lcore 5); Dequeue Workers (lcore 2, lcore 4, lcore 6); Model 0, Model 1, Model 2, Model 3. Formulas: nb_worker_threads = 2 * MIN(nb_queue_pairs, (lcore_count - 1) / 2); inferences_per_queue_pair = nb_models * (repetitions / nb_queue_pairs)]
diff --git a/doc/guides/tools/img/mldev_inference_ordered.svg b/doc/guides/tools/img/mldev_inference_ordered.svg
new file mode 100644
index 0000000000..12fa6acaec
--- /dev/null
+++ b/doc/guides/tools/img/mldev_inference_ordered.svg
@@ -0,0 +1,528 @@
[Figure "test: inference_ordered". Labels: Model X; QueuePair 0, QueuePair 1, QueuePair 2; Machine Learning Hardware Engine; Enqueue Workers (lcore 1, lcore 3, lcore 5); Dequeue Workers (lcore 2, lcore 4, lcore 6). Formulas: nb_worker_threads = 2 * MIN(nb_queue_pairs, (lcore_count - 1) / 2); inferences_per_queue_pair = repetitions / nb_queue_pairs]
diff --git a/doc/guides/tools/img/mldev_model_ops_subtest_a.svg b/doc/guides/tools/img/mldev_model_ops_subtest_a.svg
new file mode 100644
index 0000000000..ed12cc5a05
--- /dev/null
+++ b/doc/guides/tools/img/mldev_model_ops_subtest_a.svg
@@ -0,0 +1,420 @@
[Figure "mldev: model_ops / subtest D". Boxes: Load, Start, Stop and Unload for each of Model 0, Model 1 and Model N]
diff --git a/doc/guides/tools/img/mldev_model_ops_subtest_b.svg b/doc/guides/tools/img/mldev_model_ops_subtest_b.svg
new file mode 100644
index 0000000000..173a2c6c05
--- /dev/null
+++ b/doc/guides/tools/img/mldev_model_ops_subtest_b.svg
@@ -0,0 +1,423 @@
[Figure "mldev: model_ops / subtest A". Boxes: Load, Start, Stop and Unload for each of Model 0, Model 1 and Model N]
diff --git a/doc/guides/tools/img/mldev_model_ops_subtest_c.svg b/doc/guides/tools/img/mldev_model_ops_subtest_c.svg
new file mode 100644
index 0000000000..f66f146d05
--- /dev/null
+++ b/doc/guides/tools/img/mldev_model_ops_subtest_c.svg
@@ -0,0 +1,366 @@
[Figure "mldev: model_ops / subtest C". Boxes: Load and Unload for each of Model 0, Model 1 and Model N; Start and Stop for Model 0 and Model N]
diff --git a/doc/guides/tools/img/mldev_model_ops_subtest_d.svg b/doc/guides/tools/img/mldev_model_ops_subtest_d.svg
new file mode 100644
index 0000000000..3e2b89ad25
--- /dev/null
+++ b/doc/guides/tools/img/mldev_model_ops_subtest_d.svg
@@ -0,0 +1,424 @@
[Figure "mldev: model_ops / subtest B". Boxes: Load, Start, Stop and Unload for each of Model 0, Model 1 and Model N]
diff --git a/doc/guides/tools/index.rst b/doc/guides/tools/index.rst
index f1f5b94c8c..6f84fc31ff 100644
--- a/doc/guides/tools/index.rst
+++ b/doc/guides/tools/index.rst
@@ -21,4 +21,5 @@ DPDK Tools User Guides
     comp_perf
     testeventdev
     testregex
+    testmldev
     dts
diff --git a/doc/guides/tools/testmldev.rst b/doc/guides/tools/testmldev.rst
new file mode 100644
index 0000000000..845c2d9381
--- /dev/null
+++ b/doc/guides/tools/testmldev.rst
@@ -0,0 +1,441 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright (c) 2022 Marvell.
+
+dpdk-test-mldev Application
+===========================
+
+The ``dpdk-test-mldev`` tool is a Data Plane Development Kit (DPDK) application that allows testing
+various mldev use cases. This application has a generic framework to add new mldev based test cases
+to verify functionality and measure the performance of inference execution on DPDK ML devices.
+
+
+Application and Options
+-----------------------
+
+The application has a number of command line options:
+
+.. code-block:: console
+
+   dpdk-test-mldev [EAL Options] -- [application options]
+
+EAL Options
+~~~~~~~~~~~
+
+The following are the EAL command-line options that can be used with the ``dpdk-test-mldev``
+application. See the DPDK Getting Started Guides for more information on these options.
+
+* ``-c `` or ``-l ``
+
+  Set the hexadecimal bitmask of the cores to run on. The corelist is a list of cores to use.
+
+* ``-a ``
+
+  Attach a PCI based ML device. Specific to drivers using PCI based ML devices.
+
+* ``--vdev ``
+
+  Add a virtual mldev device. Specific to drivers using an ML virtual device.
+
+
+Application Options
+~~~~~~~~~~~~~~~~~~~
+
+The following are the command-line options supported by the test application.
+
+* ``--test ``
+
+  ML tests are divided into two groups: Model and Device tests, and Inference tests. The test
+  name must be one of the following supported tests.
+
+  **ML Device Tests** ::
+
+      device_ops
+
+  **ML Model Tests** ::
+
+      model_ops
+
+  **ML Inference Tests** ::
+
+      inference_ordered
+      inference_interleave
+
+* ``--dev_id ``
+
+  Set the device id of the ML device to be used for the test. Default value is `0`.
+
+* ``--socket_id ``
+
+  Set the socket id of the application resources. Default value is `SOCKET_ID_ANY`.
+
+* ``--debug``
+
+  Enable the tests to run in debug mode.
+
+* ``--models ``
+
+  Set the list of model files to be used for the tests. Application expects the
+  ``model_list`` in comma separated form (e.g. ``--models model_A.bin,model_B.bin``).
+  Maximum number of models supported by the test is ``8``.
+
+* ``--filelist ``
+
+  Set the list of model, input, output and reference files to be used for the tests.
+  Application expects the ``file_list`` to be in comma separated form
+  (i.e. ``--filelist [,reference]``).
+
+  Multiple filelist entries can be specified when running the tests with multiple models.
+  Both quantized and dequantized outputs are written to the disk. Dequantized output file
+  would have the name specified by the user through ``--filelist`` option. A suffix ``.q``
+  is appended to quantized output filename. Maximum number of filelist entries supported
+  by the test is ``8``.
+
+* ``--repetitions ``
+
+  Set the number of inference repetitions to be executed in the test for each model. Default
+  value is `1`.
+
+* ``--burst_size ``
+
+  Set the burst size to be used when enqueuing / dequeuing inferences. Default value is `1`.
+
+* ``--queue_pairs ``
+
+  Set the number of queue-pairs to be used for inference enqueue and dequeue operations.
+  Default value is `1`.
+
+* ``--queue_size ``
+
+  Set the size of queue-pair to be created for inference enqueue / dequeue operations.
+  Queue size would translate into `rte_ml_dev_qp_conf::nb_desc` field during queue-pair
+  creation. Default value is `1`.
+
+* ``--batches ``
+
+  Set the number of batches in the input file provided for inference run. When not specified
+  the test would assume the number of batches is equal to the batch size of the model.
+
+* ``--tolerance ``
+
+  Set the tolerance value in percentage to be used for output validation. Default value
+  is `0`.
+
+* ``--stats``
+
+  Enable reporting device extended stats.
+
+
+ML Device Tests
+---------------
+
+ML device tests are functional tests to validate ML device APIs. Device tests validate the ML device
+handling APIs: configure, close, start and stop.
+
+
+Application Options
+~~~~~~~~~~~~~~~~~~~
+
+Supported command line options for the `device_ops` test are the following::
+
+        --debug
+        --test
+        --dev_id
+        --socket_id
+        --queue_pairs
+        --queue_size
+
+
+DEVICE_OPS Test
+~~~~~~~~~~~~~~~
+
+Device ops test validates the device configuration and reconfiguration support. The test configures
+ML device based on the options ``--queue_pairs`` and ``--queue_size`` specified by the user, and
+later reconfigures the ML device with the number of queue pairs and queue size based on the maximum
+specified through the device info.
+
+
+Example
+^^^^^^^
+
+Command to run device_ops test:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=device_ops
+
+
+Command to run device_ops test with user options:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=device_ops --queue_pairs --queue_size
+
+
+ML Model Tests
+--------------
+
+Model tests are functional tests to validate ML model APIs. Model tests validate the functioning
+of APIs to load, start, stop and unload ML models.
+
+
+Application Options
+~~~~~~~~~~~~~~~~~~~
+
+Supported command line options for the `model_ops` test are the following::
+
+        --debug
+        --test
+        --dev_id
+        --socket_id
+        --models
+
+
+List of model files to be used for the `model_ops` test can be specified through the option
+``--models `` as a comma separated list. Maximum number of models supported in
+the test is `8`.
+
+.. Note::
+
+    * The ``--models `` is a mandatory option for running this test.
+    * Options not supported by the test are ignored if specified.
+
+
+MODEL_OPS Test
+~~~~~~~~~~~~~~
+
+The test is a collection of multiple sub-tests, each with a different order of slow-path
+operations when handling `N` models.
+
+
+**Sub-test A:** executes the sequence of load / start / stop / unload for a model in order,
+followed by the next model.
+
+.. _figure_mldev_model_ops_subtest_a:
+
+.. figure:: img/mldev_model_ops_subtest_a.*
+
+   Execution sequence of model_ops subtest A.
+
+
+**Sub-test B:** executes load for all models, followed by a start for all models. Upon successful
+start of all models, stop is invoked for all models followed by unload.
+
+.. _figure_mldev_model_ops_subtest_b:
+
+.. figure:: img/mldev_model_ops_subtest_b.*
+
+   Execution sequence of model_ops subtest B.
+
+
+**Sub-test C:** loads all models, followed by a start and stop of all models in order. Upon
+completion of stop, unload is invoked for all models.
+
+.. _figure_mldev_model_ops_subtest_c:
+
+.. figure:: img/mldev_model_ops_subtest_c.*
+
+   Execution sequence of model_ops subtest C.
+
+
+**Sub-test D:** executes load and start for all models available. Upon successful start of all
+models, stop and unload are executed for the models.
+
+.. _figure_mldev_model_ops_subtest_d:
+
+.. figure:: img/mldev_model_ops_subtest_d.*
+
+   Execution sequence of model_ops subtest D.
+
+
+Example
+^^^^^^^
+
+Command to run model_ops test:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=model_ops --models model_1.bin,model_2.bin,model_3.bin,model_4.bin
+
+
+ML Inference Tests
+------------------
+
+Inference tests are a set of tests to validate end-to-end inference execution on ML device.
+These tests execute the full sequence of operations required to run inferences with one or
+multiple models.
+
+Application Options
+~~~~~~~~~~~~~~~~~~~
+
+Supported command line options for inference tests are the following::
+
+        --debug
+        --test
+        --dev_id
+        --socket_id
+        --filelist
+        --repetitions
+        --burst_size
+        --queue_pairs
+        --queue_size
+        --batches
+        --tolerance
+        --stats
+
+
+List of files to be used for the inference tests can be specified through the option
+``--filelist `` as a comma separated list. A filelist entry would be of the format
+``--filelist [,reference_file]`` and is used to specify the
+list of files required to test with a single model. Multiple filelist entries are supported by
+the test, one entry per model. Maximum number of file entries supported by the test is `8`.
+
+When ``--burst_size `` option is specified for the test, enqueue and dequeue burst would
+try to enqueue or dequeue ``num`` inferences per call, respectively.
+
+In the inference test, a pair of lcores is mapped to each queue pair. Minimum number of lcores
+required for the tests is equal to ``(queue_pairs * 2 + 1)``.
+
+Output validation of inference would be enabled only when a reference file is specified through
+the ``--filelist`` option. Application would additionally consider the tolerance value provided
+through ``--tolerance`` option during validation. When the tolerance value is 0, CRC32 hash of
+inference output and reference output are compared. When the tolerance is non-zero, element wise
+comparison of output is performed. Validation is considered as successful only when all the
+elements of the output tensor are within the tolerance range specified.
+
+When ``--debug`` option is specified, tests are run in debug mode.
+
+Enabling ``--stats`` would print the extended stats supported by the driver.
+
+.. Note::
+
+    * The ``--filelist `` is a mandatory option for running inference tests.
+    * Options not supported by the tests are ignored if specified.
+    * Element wise comparison is not supported when the output dtype is either fp8, fp16
+      or bfloat16. This is applicable only when the tolerance is greater than zero and for
+      pre-quantized models only.
+
+
+INFERENCE_ORDERED Test
+~~~~~~~~~~~~~~~~~~~~~~
+
+This is a functional test for validating the end-to-end inference execution on ML device. This
+test configures ML device and queue pairs as per the queue-pair related options (queue_pairs and
+queue_size) specified by the user. Upon successful configuration of the device and queue pairs,
+the first model specified through the filelist is loaded to the device and inferences are enqueued
+by a pool of worker threads to the ML device. Total number of inferences enqueued for the model
+is equal to the repetitions specified. A dedicated pool of worker threads would dequeue the
+inferences from the device. The model is unloaded upon completion of all inferences for the model.
+The test would continue loading and executing inference requests for all models specified
+through ``filelist`` option in an ordered manner.
+
+.. _figure_mldev_inference_ordered:
+
+.. figure:: img/mldev_inference_ordered.*
+
+   Execution of inference_ordered on single model.
+
+
+Example
+^^^^^^^
+
+Example command to run inference_ordered test:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_ordered --filelist model.bin,input.bin,output.bin
+
+Example command to run inference_ordered with output validation using tolerance of ``1%``:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_ordered --filelist model.bin,input.bin,output.bin,reference.bin \
+        --tolerance 1.0
+
+Example command to run inference_ordered test with multiple queue-pairs and queue size:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_ordered --filelist model.bin,input.bin,output.bin \
+        --queue_pairs 4 --queue_size 16
+
+Example command to run inference_ordered test with a specific burst size:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_ordered --filelist model.bin,input.bin,output.bin \
+        --burst_size 12
+
+
+INFERENCE_INTERLEAVE Test
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This is a stress test for validating the end-to-end inference execution on ML device. The test
+configures the ML device and queue pairs as per the queue-pair related options (queue_pairs
+and queue_size) specified by the user. Upon successful configuration of the device and queue
+pairs, all models specified through the filelist are loaded to the device. Inferences for multiple
+models are enqueued by a pool of worker threads in parallel. Inference execution by the device is
+interleaved between multiple models. Total number of inferences enqueued for a model is equal to
+the repetitions specified. An additional pool of threads would dequeue the inferences from the
+device. Models would be unloaded upon completion of inferences for all models loaded.
+
+
+.. _figure_mldev_inference_interleave:
+
+.. figure:: img/mldev_inference_interleave.*
+
+   Execution of inference_interleave with multiple models.
+
+
+Example
+^^^^^^^
+
+Example command to run inference_interleave test:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_interleave --filelist model.bin,input.bin,output.bin
+
+
+Example command to run inference_interleave test with multiple models:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_interleave --filelist model_A.bin,input_A.bin,output_A.bin \
+        --filelist model_B.bin,input_B.bin,output_B.bin
+
+
+Example command to run inference_interleave test with multiple models and output validation
+using tolerance of ``2.0%``:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_interleave \
+        --filelist model_A.bin,input_A.bin,output_A.bin,reference_A.bin \
+        --filelist model_B.bin,input_B.bin,output_B.bin,reference_B.bin \
+        --tolerance 2.0
+
+Example command to run inference_interleave test with multiple queue-pairs, queue size
+and burst size:
+
+.. code-block:: console
+
+   sudo /app/dpdk-test-mldev -c 0xf -a -- \
+        --test=inference_interleave --filelist model.bin,input.bin,output.bin \
+        --queue_pairs 8 --queue_size 12 --burst_size 16
+
+
+Debug mode
+----------
+
+ML tests can be executed in debug mode by enabling the option ``--debug``. Execution of tests in
+debug mode would enable additional prints.
+
+When a validation failure is observed, output from that buffer is written to the disk, with
+filenames following a convention similar to the one used when the test passes. Additionally, the
+index of the buffer is appended to the filenames.
--
2.17.1
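
The sizing rules and the output validation described in the ML Inference Tests section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's code: the names worker_plan and outputs_match are hypothetical, integer division is assumed in the formulas taken from the figures, output buffers are assumed to hold float32 values, zlib.crc32 stands in for whichever CRC32 variant the application uses, and the tolerance is assumed to be a percentage of the reference element's magnitude.

```python
import zlib
from array import array


def worker_plan(queue_pairs, lcore_count, repetitions, nb_models=1):
    """Evaluate the sizing formulas quoted in the text and figures.

    Hypothetical helper. Integer division is an assumption. nb_models=1
    matches the inference_ordered formula, nb_models>1 the
    inference_interleave formula.
    """
    return {
        # Text: minimum number of lcores is (queue_pairs * 2 + 1).
        "min_lcores": queue_pairs * 2 + 1,
        # Figure: nb_worker_threads = 2 * MIN(nb_queue_pairs, (lcore_count - 1) / 2)
        "nb_worker_threads": 2 * min(queue_pairs, (lcore_count - 1) // 2),
        # Figure: inferences_per_queue_pair = nb_models * (repetitions / nb_queue_pairs)
        "inferences_per_queue_pair": nb_models * (repetitions // queue_pairs),
    }


def outputs_match(output, reference, tolerance=0.0):
    """Compare two raw output buffers (bytes) as the text describes.

    Hypothetical helper; float32 elements and a relative-percentage
    metric are assumptions, not taken from the application.
    """
    if tolerance == 0:
        # Zero tolerance: compare CRC32 hashes of the two buffers.
        return zlib.crc32(output) == zlib.crc32(reference)

    # Non-zero tolerance: element-wise comparison.
    out_vals = array("f")
    out_vals.frombytes(output)
    ref_vals = array("f")
    ref_vals.frombytes(reference)
    if len(out_vals) != len(ref_vals):
        return False
    # Every element must lie within `tolerance` percent of the reference.
    return all(
        abs(o - r) <= abs(r) * tolerance / 100.0
        for o, r in zip(out_vals, ref_vals)
    )


if __name__ == "__main__":
    print(worker_plan(queue_pairs=4, lcore_count=9, repetitions=100))
    reference = array("f", [1.0, 2.0, 4.0]).tobytes()
    output = array("f", [1.0, 2.0, 3.98]).tobytes()
    print(outputs_match(output, reference))                 # CRC32 path
    print(outputs_match(output, reference, tolerance=1.0))  # element-wise path
```

Under these formulas, a run with 4 queue pairs needs at least 9 lcores, and 100 repetitions of one model are split into 25 inferences per queue pair.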