* [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
From: jerinj @ 2022-08-03 13:28 UTC
To: dev
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Machine learning inference library
==================================
Definition of machine learning inference
----------------------------------------
Inference in machine learning is the process of making an output prediction
based on new input data using a pre-trained machine learning model.
The scope of this RFC covers only inferencing with pre-trained machine learning models;
training and building/compiling ML models are out of scope for this RFC and the
DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
Motivation for the new library
------------------------------
Multiple semiconductor vendors are offering accelerator products such as DPU
(often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
integrated as part of the product. Use of ML inferencing is increasing in the domain
of packet processing for flow classification, intrusion, malware, and anomaly detection.
Lack of inferencing support through DPDK APIs would involve complexity and
increased latency from moving data across frameworks (i.e., dataplane to
non-dataplane ML frameworks and vice-versa). A standardized DPDK API for ML
inferencing would enable dataplane solutions to harness the benefit of inline
inferencing supported by the hardware.
Contents of RFC
---------------
This RFC attempts to define standard APIs for:
1) Discovery of ML capabilities (e.g., device-specific features) in a
vendor-independent fashion
2) Definition of functions to handle ML devices, which includes probing,
initialization and termination of the devices.
3) Definition of functions to handle ML models used to perform inference operations.
4) Definition of functions to handle quantize and dequantize operations
Roadmap
-------
1) Address the comments for this RFC.
2) Common code for mldev
3) SW mldev driver based on TVM (https://tvm.apache.org/)
4) HW mldev driver for cn10k
5) Add app/test-mldev application similar to other device class tests
Machine learning library framework
----------------------------------
The ML framework is built on the following model:
+-----------------+ rte_ml_[en|de]queue_burst()
| | |
| Machine o------+ +--------+ |
| Learning | | | queue | | +------+
| Inference o------+-----o |<===o===>|Core 0|
| Engine | | | pair 0 | +------+
| o----+ | +--------+
| | | |
+-----------------+ | | +--------+
^ | | | queue | +------+
| | +-----o |<=======>|Core 1|
| | | pair 1 | +------+
| | +--------+
+--------+--------+ |
| +-------------+ | | +--------+
| | Model 0 | | | | queue | +------+
| +-------------+ | +-------o |<=======>|Core N|
| +-------------+ | | pair N | +------+
| | Model 1 | | +--------+
| +-------------+ |
| +-------------+ |<------- rte_ml_model_load()
| | Model .. | |-------> rte_ml_model_info()
| +-------------+ |<------- rte_ml_model_start()
| +-------------+ |<------- rte_ml_model_stop()
| | Model N | |<------- rte_ml_model_params_update()
| +-------------+ |<------- rte_ml_model_unload()
+-----------------+
ML Device: A hardware or software-based implementation of the ML device API for
running inferences using a pre-trained ML model.
ML Model: An ML model is an algorithm trained over a dataset. A model consists of
the procedure/algorithm and the data/pattern required to make predictions on live data.
Once the model is created and trained outside of the DPDK scope, the model can be
loaded via rte_ml_model_load() and then started using the rte_ml_model_start() API.
rte_ml_model_params_update() can be used to update the model parameters such as weights
and bias without unloading the model via rte_ml_model_unload().
ML Inference: ML inference is the process of feeding data to the model via the
rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
outputs/predictions from the started model.
In all functions of the ML device API, the ML device is designated by an
integer >= 0 named the device identifier *dev_id*.
The functions exported by the ML device API to set up a device designated by
its device identifier must be invoked in the following order (a minimal
bring-up sketch follows the list):
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_dev_start()
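A minimal bring-up sketch based on the above call order (error checks omitted;
the descriptor count is illustrative):

struct rte_ml_dev_config config;
struct rte_ml_dev_qp_conf qp_conf;
struct rte_ml_dev_info dev_info;

/* Query device capabilities to size the configuration */
rte_ml_dev_info_get(0, &dev_info);

config.socket_id = rte_ml_dev_socket_id(0);
config.max_nb_models = dev_info.max_models;
config.nb_queue_pairs = dev_info.max_queue_pairs;
rte_ml_dev_configure(0, &config);

/* One queue pair, with an illustrative ring depth and no flush callback */
qp_conf.nb_desc = 64;
qp_conf.cb = NULL;
rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id);

rte_ml_dev_start(0);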
A model is required to run inference operations with the user-specified inputs.
The application needs to invoke the ML model API in the following order before queueing
inference jobs.
- rte_ml_model_load()
- rte_ml_model_start()
The rte_ml_model_info_get() API is provided to retrieve information related to the model.
The information includes the shape and type of input and output required for the inference.
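A minimal sketch of the model bring-up and info query follows; the model buffer
(model_buf/model_len) is assumed to be already read into memory:

struct rte_ml_model_params params = {.addr = model_buf, .size = model_len};
struct rte_ml_model_info info;
uint16_t model_id;
uint32_t i;

rte_ml_model_load(0, &params, &model_id);
rte_ml_model_start(0, model_id);

/* Inspect the shape and type of each input expected by the model */
rte_ml_model_info_get(0, model_id, &info);
for (i = 0; i < info.nb_inputs; i++)
	printf("input %s: qtype %d dtype %d\n", info.input_info[i].name,
	       info.input_info[i].qtype, info.input_info[i].dtype);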
Data quantization and dequantization is one of the main aspects of the ML domain. This involves
conversion of input data from a higher precision to a lower precision data type and vice-versa
for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
and output buffers holding data for multiple batches.
Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used to
get the size of quantized and dequantized multi-batch input and output buffers.
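For example, the input buffer can be sized and quantized as below (a minimal sketch
for a single batch; model_id is from rte_ml_model_load(), dbuffer is assumed to hold
the dequantized input, and the rte_ml_io_quantize() argument order follows the
description above):

uint64_t qsize, dsize;
void *qbuffer;

/* Query quantized and dequantized sizes for one batch of input */
rte_ml_io_input_size_get(0, model_id, 1, &qsize, &dsize);
qbuffer = malloc(qsize);

/* Convert the user precision input to the model's quantized type */
rte_ml_io_quantize(0, model_id, 1, dbuffer, qbuffer);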
The user can optionally update the model parameters with rte_ml_model_params_update() after
invoking the rte_ml_model_stop() API on a given model ID.
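For example, new weights and bias can be applied to a loaded model as below (a
minimal sketch; new_wb is assumed to hold wb_size bytes as reported in
struct rte_ml_model_info):

rte_ml_model_stop(0, model_id);
rte_ml_model_params_update(0, model_id, new_wb);
rte_ml_model_start(0, model_id);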
The application can invoke, in any order, the functions exported by the ML API to enqueue
inference jobs and dequeue inference responses.
If the application wants to change the device configuration (i.e., call
rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
for the given model. The application does not need to call the rte_ml_dev_stop() API for
any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
Once the device is in the started state after invoking the rte_ml_dev_start() API and the
model is in the started state after invoking the rte_ml_model_start() API, the application
can call rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the destined device and model ID.
Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
Typical application utilisation of the ML API will adhere to the following
programming flow.
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_model_load()
- rte_ml_model_start()
- rte_ml_model_info_get()
- rte_ml_dev_start()
- rte_ml_enqueue_burst()
- rte_ml_dequeue_burst()
- rte_ml_model_stop()
- rte_ml_model_unload()
- rte_ml_dev_stop()
- rte_ml_dev_close()
Regarding multi-threading, by default, all the functions of the ML device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on different logical
cores to work on the same target object. For instance, the dequeue function of a poll mode
driver cannot be invoked in parallel on two logical cores to operate on the same queue pair.
Of course, this function can be invoked in parallel by different logical cores on different
queue pairs. It is the responsibility of the user application to enforce this rule.
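For instance, each worker can be pinned to its own queue pair so that no locking is
needed (a minimal sketch; quit and process_result() are hypothetical application
helpers):

static int
inference_worker(void *arg)
{
	uint16_t qp_id = *(uint16_t *)arg;
	struct rte_ml_op *op;

	/* This lcore is the only user of queue pair qp_id; no locks required */
	while (!quit)
		if (rte_ml_dequeue_burst(0, qp_id, &op, 1) == 1)
			process_result(op);

	return 0;
}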
Example application usage for ML inferencing
--------------------------------------------
This example application demonstrates the programming model of the ML device
library. Error checks are omitted to simplify the application. The example
assumes that the input data received is quantized and that the expected output
is also quantized. In order to handle non-quantized inputs and outputs, users can
invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.
#include <getopt.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_memzone.h>
#include <rte_mempool.h>
#include <rte_mldev.h>

#define IO_MZ "io_mz"
struct app_ctx {
	char model_file[PATH_MAX];
	char inp_file[PATH_MAX];
	char out_file[PATH_MAX];
	struct rte_ml_model_params params;
	struct rte_ml_model_info info;
	uint16_t id;
	uint64_t input_size;
	uint64_t output_size;
	uint8_t *input_buffer;
	uint8_t *output_buffer;
} __rte_cache_aligned;

struct app_ctx ctx;
static int
parse_args(int argc, char **argv)
{
	int opt, option_index;
	static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
					 {"input", required_argument, NULL, 'i'},
					 {"output", required_argument, NULL, 'o'},
					 {NULL, 0, NULL, 0}};

	while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
		switch (opt) {
		case 'm':
			strncpy(ctx.model_file, optarg, PATH_MAX - 1);
			break;
		case 'i':
			strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
			break;
		case 'o':
			strncpy(ctx.out_file, optarg, PATH_MAX - 1);
			break;
		default:
			return -1;
		}

	return 0;
}
int
main(int argc, char **argv)
{
	struct rte_ml_dev_qp_conf qp_conf;
	struct rte_ml_dev_config config;
	struct rte_ml_dev_info dev_info;
	const struct rte_memzone *mz;
	struct rte_mempool *op_pool;
	struct rte_ml_op *op_enq;
	struct rte_ml_op *op_deq;
	FILE *fp;
	int rc;

	/* Initialize EAL */
	rc = rte_eal_init(argc, argv);
	if (rc < 0)
		rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
	argc -= rc;
	argv += rc;

	/* Parse application arguments (after the EAL args) */
	if (parse_args(argc, argv) < 0)
		rte_exit(EXIT_FAILURE, "Invalid application arguments\n");
	/* Step 1: Check for ML devices */
	if (rte_ml_dev_count() <= 0)
		rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");

	/* Step 2: Get device info */
	if (rte_ml_dev_info_get(0, &dev_info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get device info\n");

	/* Step 3: Configure ML device, use device 0 */
	config.socket_id = rte_ml_dev_socket_id(0);
	config.max_nb_models = dev_info.max_models;
	config.nb_queue_pairs = dev_info.max_queue_pairs;
	if (rte_ml_dev_configure(0, &config) != 0)
		rte_exit(EXIT_FAILURE, "Device configuration failed\n");

	/* Step 4: Setup queue pair, use qp_id = 0 */
	qp_conf.nb_desc = 1;
	qp_conf.cb = NULL;
	if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
		rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");

	/* Step 5: Start device */
	if (rte_ml_dev_start(0) != 0)
		rte_exit(EXIT_FAILURE, "Device start failed\n");

	/* Step 6: Read model data and update load params structure */
	fp = fopen(ctx.model_file, "r");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open model file\n");
	fseek(fp, 0, SEEK_END);
	ctx.params.size = ftell(fp);
	fseek(fp, 0, SEEK_SET);
	ctx.params.addr = malloc(ctx.params.size);
	if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read model\n");
	}
	fclose(fp);
	/* Step 7: Load the model */
	if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to load model\n");
	free(ctx.params.addr);

	/* Step 8: Start the model */
	if (rte_ml_model_start(0, ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to start model\n");

	/* Step 9: Allocate buffers for quantized input and output */

	/* Get model information */
	if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get model info\n");

	/* Get the buffer size for input and output */
	rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
	rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);

	mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
	if (mz == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");
	ctx.input_buffer = mz->addr;
	ctx.output_buffer = ctx.input_buffer + ctx.input_size;

	/* Step 10: Fill the input data */
	fp = fopen(ctx.inp_file, "r");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open input file\n");
	if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read input file\n");
	}
	fclose(fp);
	/* Step 11: Create ML op mempool */
	op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
	if (op_pool == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create op pool\n");

	/* Step 12: Form an ML op */
	rte_mempool_get_bulk(op_pool, (void **)&op_enq, 1);
	op_enq->model_id = ctx.id;
	op_enq->nb_batches = ctx.info.batch_size;
	op_enq->mempool = op_pool;
	op_enq->input.addr = ctx.input_buffer;
	op_enq->input.length = ctx.input_size;
	op_enq->input.next = NULL;
	op_enq->output.addr = ctx.output_buffer;
	op_enq->output.length = ctx.output_size;
	op_enq->output.next = NULL;

	/* Step 13: Enqueue jobs */
	rte_ml_enqueue_burst(0, 0, &op_enq, 1);

	/* Step 14: Dequeue jobs and release op pool */
	while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
		;
	/* Step 15: Write output */
	fp = fopen(ctx.out_file, "w+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open output file\n");
	fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
	fclose(fp);

	/* Step 16: Clean up */
	/* Stop ML model */
	rte_ml_model_stop(0, ctx.id);
	/* Unload ML model */
	rte_ml_model_unload(0, ctx.id);
	/* Free input/output memory */
	rte_memzone_free(rte_memzone_lookup(IO_MZ));
	/* Free the ML op back to the pool */
	rte_mempool_put_bulk(op_pool, (void **)&op_deq, 1);
	/* Free ML op pool */
	rte_mempool_free(op_pool);
	/* Stop the device */
	rte_ml_dev_stop(0);
	rte_ml_dev_close(0);
	rte_eal_cleanup();

	return 0;
}
Jerin Jacob (1):
mldev: introduce machine learning device library
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 164 +++++
lib/eal/common/eal_common_log.c | 1 +
lib/eal/include/rte_log.h | 1 +
lib/meson.build | 1 +
lib/mldev/meson.build | 12 +
lib/mldev/rte_mldev.c | 5 +
lib/mldev/rte_mldev.h | 1081 +++++++++++++++++++++++++++++++
lib/mldev/version.map | 5 +
12 files changed, 1276 insertions(+)
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
--
2.37.1
* [dpdk-dev] [RFC PATCH 1/1] mldev: introduce machine learning device library
From: jerinj @ 2022-08-03 13:28 UTC
To: dev, Bruce Richardson, Ray Kinsella
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus, dsinghrawat,
ed.czeck, evgenys, grive, g.singh, zhouguoyang, haiyue.wang,
hkalra, heinrich.kuhn, hemant.agrawal, hyonkim, igorch,
irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Add mldev API specification to standardize and use the machine learning
device and inference operations in a vendor-neutral way.
The following operations are abstracted through APIs:
- ML device capability probe
- ML device configuration
- ML device queue pair configuration
- ML device state management
- ML device stat/xstat operations
- ML model load/unload/start/stop operations
- ML model information probe
- ML IO operations to find size for input and output buffers
- ML quantize and dequantize operations
- ML ops pool creation and free operations
- ML device enqueue/dequeue fastpath inference operations
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 164 +++++
lib/eal/common/eal_common_log.c | 1 +
lib/eal/include/rte_log.h | 1 +
lib/meson.build | 1 +
lib/mldev/meson.build | 12 +
lib/mldev/rte_mldev.c | 5 +
lib/mldev/rte_mldev.h | 1081 +++++++++++++++++++++++++++++++
lib/mldev/version.map | 5 +
12 files changed, 1276 insertions(+)
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
diff --git a/config/rte_config.h b/config/rte_config.h
index 46549cb062..2adbef3f51 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -81,6 +81,9 @@
/* rawdev defines */
#define RTE_RAWDEV_MAX_DEVS 64
+/* mldev defines */
+#define RTE_MLDEV_MAX_DEVS 64
+
/* ip_fragmentation defines */
#define RTE_LIBRTE_IP_FRAG_MAX_FRAG 8
// RTE_LIBRTE_IP_FRAG_TBL_STAT is not set
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 186a258be4..d55cca5b97 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -22,6 +22,7 @@ The public API headers are grouped by topics:
[compress](@ref rte_comp.h),
[regexdev](@ref rte_regexdev.h),
[dmadev](@ref rte_dmadev.h),
+ [mldev](@ref rte_mldev.h),
[eventdev](@ref rte_eventdev.h),
[event_eth_rx_adapter](@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter](@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 608494a7c0..82b28e8b18 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -59,6 +59,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/mempool \
@TOPDIR@/lib/meter \
@TOPDIR@/lib/metrics \
+ @TOPDIR@/lib/mldev \
@TOPDIR@/lib/node \
@TOPDIR@/lib/net \
@TOPDIR@/lib/pcapng \
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..d7f2a28bdb 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
regexdev
dmadev
gpudev
+ mldev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
diff --git a/doc/guides/prog_guide/mldev.rst b/doc/guides/prog_guide/mldev.rst
new file mode 100644
index 0000000000..2ce8e2f7fe
--- /dev/null
+++ b/doc/guides/prog_guide/mldev.rst
@@ -0,0 +1,164 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(C) 2022 Marvell International Ltd.
+
+Machine Learning Device Library
+===============================
+
+The MLDEV library provides a Machine Learning device framework for the management and
+provisioning of hardware and software ML poll mode drivers, defining APIs which
+support a number of ML operations including device handling and inference processing.
+The ML model creation and training is outside of the scope of this library.
+
+Design Principles
+-----------------
+
+The MLDEV library follows the same basic principles as those used in DPDK's
+Ethernet Device framework and the Crypto framework. The MLDEV framework provides
+a generic Machine Learning device framework which supports both physical (hardware)
+and virtual (software) ML devices as well as an ML API to manage and configure ML
+devices. The APIs also supports performing ML inference operations through ML poll
+mode driver.
+
+
+Device Operations
+-----------------
+
+Device Creation
+~~~~~~~~~~~~~~~
+
+Physical ML devices are discovered during the PCI probe/enumeration, through the
+EAL functions which are executed at DPDK initialization, based on their PCI device
+identifier, each unique PCI BDF (bus/device/function). ML physical devices,
+like other physical devices in DPDK, can be allowed or blocked
+using the EAL command line options.
+
+
+Device Identification
+~~~~~~~~~~~~~~~~~~~~~
+
+Each device, whether virtual or physical is uniquely designated by two
+identifiers:
+
+- A unique device index used to designate the ML device in all functions
+ exported by the MLDEV API.
+
+- A device name used to designate the ML device in console messages, for
+ administration or debugging purposes.
+
+Device Features and Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ML devices may support different feature sets. In order to get the
+features supported by a PMD, the application can use the ``rte_ml_dev_info_get``
+API, which returns the info of the device and its supported features.
+
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+The configuration of each ML device includes the following operations:
+
+- Allocation of resources, including hardware resources if a physical device.
+- Resetting the device into a well-known default state.
+- Initialization of statistics counters.
+
+The ``rte_ml_dev_configure`` API is used to configure an ML device.
+
+.. code-block:: c
+
+   int rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *cfg);
+
+The ``rte_ml_dev_config`` structure is used to pass the configuration parameters
+for the ML device, for example number of queue pairs, maximum number of models,
+maximum size of model and so on.
+
+Configuration of Queue Pairs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each ML device can be configured with multiple queue pairs.
+Each queue pair is configured using ``rte_ml_dev_queue_pair_setup``.
+
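+For example, a queue pair can be set up as below (a minimal sketch; the
+descriptor count is illustrative):
+
+.. code-block:: c
+
+   struct rte_ml_dev_qp_conf qp_conf = {
+           .nb_desc = 64,
+           .cb = NULL, /* no stop-flush callback */
+   };
+
+   rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf, socket_id);
+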
+Logical Cores, Memory and Queue Pair Relationships
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Multiple logical cores should never share the same queue pair for enqueuing
+operations or dequeueing operations on the same ML device since this would
+require global locks and hinder performance.
+
+Configuration of Machine Learning models
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pre-trained ML models that are built using external ML compiler / training frameworks
+are used to perform inference operations. These models are configured on an ML device
+in a two-stage process that includes loading the model on an ML device, and starting
+the model to accept inference operations. Inference operations can be queued for a
+model only when the model is in started state. Model load stage assigns a Model ID,
+which is unique for the model in a driver's context. Model ID is used during all
+subsequent slow-path and fast-path operations.
+
+Model loading and start is done through the ``rte_ml_model_load`` and
+``rte_ml_model_start`` functions.
+
+Similarly, stopping and unloading are done through the ``rte_ml_model_stop`` and
+``rte_ml_model_unload`` functions.
+
+Stop and unload functions would release the resources allocated for the
+models. Inference tasks cannot be queued for a model that is stopped.
+
+Detailed information related to the model can be retrieved from the driver using the
+function ``rte_ml_model_info_get``. Model information is accessible to the application
+through the ``rte_ml_model_info`` structure. Information available to the user would
+include the details related to the inputs and outputs, and the maximum batch size
+supported by the model.
+
+The user can optionally update the model params, such as weights and bias, without
+unloading the model, through the ``rte_ml_model_params_update`` function. A model should
+be in the stopped state to update the params. The model has to be started in order to
+enqueue inference requests after a params update.
+
+Enqueue / Dequeue
+~~~~~~~~~~~~~~~~~
+
+The burst enqueue API uses a ML device identifier and a queue pair identifier
+to specify the device queue pair to schedule the processing on. The ``nb_ops``
+parameter is the number of operations to process which are supplied in the
+``ops`` array of ``rte_ml_op`` structures. The enqueue function returns the
+number of operations it enqueued for processing; a return value equal to
+``nb_ops`` means that all operations have been enqueued.
+
+The dequeue API uses the same format as the enqueue API, but the ``nb_ops`` and
+``ops`` parameters are now used to specify the maximum number of processed
+operations the user wishes to retrieve and the location in which to store them.
+The API call returns the actual number of processed operations returned; this
+can never be larger than ``nb_ops``.
+
+``rte_ml_op`` provides the required information to the driver to queue an ML inference
+task. ML op specifies the model to be used and the number of batches to be executed in
+the inference task. Input and output buffer information is specified through the
+structure ``rte_ml_buff_seg``, which supports segmented data. Input is provided through
+the ``rte_ml_op::input`` and output through ``rte_ml_op::output``. Data pointed to by
+each op should not be released until the dequeue of that op.
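+
+A minimal sketch of forming one op, enqueuing it and polling for the result follows;
+the op pool, model ID and data buffers are assumed to be already set up:
+
+.. code-block:: c
+
+   struct rte_ml_op *op, *deq_op;
+
+   rte_mempool_get(op_pool, (void **)&op);
+   op->model_id = model_id;
+   op->nb_batches = 1;
+   op->mempool = op_pool;
+   op->input.addr = in_buf;
+   op->input.length = in_len;
+   op->input.next = NULL;
+   op->output.addr = out_buf;
+   op->output.length = out_len;
+   op->output.next = NULL;
+
+   rte_ml_enqueue_burst(dev_id, 0, &op, 1);
+   while (rte_ml_dequeue_burst(dev_id, 0, &deq_op, 1) == 0)
+           ;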
+
+
+Quantize and Dequantize
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Inference operations performed with lower precision types would improve the throughput
+and efficiency of the inference execution with a minimal loss of accuracy, which is within
+the tolerance limits. Quantization and dequantization is the process of converting data
+from a higher precision type to a lower precision type and vice-versa. ML library provides
+the functions ``rte_ml_io_quantize`` and ``rte_ml_io_dequantize`` to enable data type
+conversions. The user needs to provide the address of the quantized and dequantized
+data buffers to the functions, along with the number of batches in the buffers.
+
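+A minimal sketch, assuming the prototypes take the device ID, model ID, number of
+batches and the two buffer addresses as described above:
+
+.. code-block:: c
+
+   /* dequantized input (dbuf) -> quantized input (qbuf), before enqueue */
+   rte_ml_io_quantize(dev_id, model_id, nb_batches, dbuf, qbuf);
+
+   /* quantized output (qout) -> dequantized output (dout), after dequeue */
+   rte_ml_io_dequantize(dev_id, model_id, nb_batches, qout, dout);
+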
+For quantization, the dequantized data is assumed to be of the type ``dtype`` provided by
+the ``rte_ml_model_info::input`` and the data is converted to ``qtype`` provided by the
+``rte_ml_model_info::input``.
+
+For dequantization, the quantized data is assumed to be of the type ``qtype`` provided by
+the ``rte_ml_model_info::output`` and the data is converted to ``dtype`` provided by the
+``rte_ml_model_info::output``.
+
+Size of the buffers required for the input and output can be calculated using the functions
+``rte_ml_io_input_size_get`` and ``rte_ml_io_output_size_get``. These functions would get the
+buffer sizes for both quantized and dequantized data for the given number of batches.
+
diff --git a/lib/eal/common/eal_common_log.c b/lib/eal/common/eal_common_log.c
index bd7b188ceb..5cb1b15dbe 100644
--- a/lib/eal/common/eal_common_log.c
+++ b/lib/eal/common/eal_common_log.c
@@ -369,6 +369,7 @@ static const struct logtype logtype_strings[] = {
{RTE_LOGTYPE_EFD, "lib.efd"},
{RTE_LOGTYPE_EVENTDEV, "lib.eventdev"},
{RTE_LOGTYPE_GSO, "lib.gso"},
+ {RTE_LOGTYPE_MLDEV, "lib.mldev"},
{RTE_LOGTYPE_USER1, "user1"},
{RTE_LOGTYPE_USER2, "user2"},
{RTE_LOGTYPE_USER3, "user3"},
diff --git a/lib/eal/include/rte_log.h b/lib/eal/include/rte_log.h
index 25ce42cdfc..226be9c778 100644
--- a/lib/eal/include/rte_log.h
+++ b/lib/eal/include/rte_log.h
@@ -48,6 +48,7 @@ extern "C" {
#define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
#define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
#define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
+#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
/* these log types can be used in an application */
#define RTE_LOGTYPE_USER1 24 /**< User-defined log type 1. */
diff --git a/lib/meson.build b/lib/meson.build
index c648f7d800..32c45f55ce 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -63,6 +63,7 @@ libraries = [
'flow_classify', # flow_classify lib depends on pkt framework table lib
'graph',
'node',
+ 'mldev'
]
optional_libs = [
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
new file mode 100644
index 0000000000..e1e0ffe975
--- /dev/null
+++ b/lib/mldev/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2022 Marvell.
+
+sources = files(
+ 'rte_mldev.c',
+)
+
+headers = files(
+ 'rte_mldev.h',
+)
+
+deps += ['mempool']
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
new file mode 100644
index 0000000000..c6644e6c12
--- /dev/null
+++ b/lib/mldev/rte_mldev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Marvell.
+ */
+
+#include <rte_mldev.h>
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
new file mode 100644
index 0000000000..f55cc8ffb3
--- /dev/null
+++ b/lib/mldev/rte_mldev.h
@@ -0,0 +1,1081 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Marvell.
+ */
+
+#ifndef RTE_MLDEV_H
+#define RTE_MLDEV_H
+
+/**
+ * @file rte_mldev.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * ML (Machine Learning) device API.
+ *
+ * The ML framework is built on the following model:
+ *
+ *
+ * +-----------------+ rte_ml_[en|de]queue_burst()
+ * | | |
+ * | Machine o------+ +--------+ |
+ * | Learning | | | queue | | +------+
+ * | Inference o------+-----o |<===o===>|Core 0|
+ * | Engine | | | pair 0 | +------+
+ * | o----+ | +--------+
+ * | | | |
+ * +-----------------+ | | +--------+
+ * ^ | | | queue | +------+
+ * | | +-----o |<=======>|Core 1|
+ * | | | pair 1 | +------+
+ * | | +--------+
+ * +--------+--------+ |
+ * | +-------------+ | | +--------+
+ * | | Model 0 | | | | queue | +------+
+ * | +-------------+ | +-------o |<=======>|Core N|
+ * | +-------------+ | | pair N | +------+
+ * | | Model 1 | | +--------+
+ * | +-------------+ |
+ * | +-------------+ |<------- rte_ml_model_load()
+ * | | Model .. | |-------> rte_ml_model_info()
+ * | +-------------+ |<------- rte_ml_model_start()
+ * | +-------------+ |<------- rte_ml_model_stop()
+ * | | Model N | |<------- rte_ml_model_params_update()
+ * | +-------------+ |<------- rte_ml_model_unload()
+ * +-----------------+
+ *
+ * ML Device: A hardware or software-based implementation of the ML device API for
+ * running inferences using a pre-trained ML model.
+ *
+ * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
+ * the procedure/algorithm and the data/pattern required to make predictions on live data.
+ * Once the model is created and trained outside of the DPDK scope, the model can be
+ * loaded via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+ * rte_ml_model_params_update() can be used to update the model parameters such as weights
+ * and bias without unloading the model via rte_ml_model_unload().
+ *
+ * ML Inference: ML inference is the process of feeding data to the model via the
+ * rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
+ * outputs/predictions from the started model.
+ *
+ * In all functions of the ML device API, the ML device is designated by an
+ * integer >= 0 named the device identifier *dev_id*.
+ *
+ * The functions exported by the ML device API to set up a device designated by
+ * its device identifier must be invoked in the following order:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_dev_start()
+ *
+ * A model is required to run inference operations with the user-specified inputs.
+ * The application needs to invoke the ML model API in the following order before queueing
+ * inference jobs.
+ *
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ *
+ * The rte_ml_model_info_get() API is provided to retrieve information related to the model.
+ * The information includes the shape and type of input and output required for the inference.
+ *
+ * Data quantization and dequantization is one of the main aspects of the ML domain. This
+ * involves conversion of input data from a higher precision to a lower precision data type
+ * and vice-versa for the output. APIs are provided to enable quantization through
+ * rte_ml_io_quantize() and dequantization through rte_ml_io_dequantize(). These APIs have
+ * the capability to handle input and output buffers holding data for multiple batches.
+ *
+ * Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used
+ * to get the size of quantized and dequantized multi-batch input and output buffers.
+ *
+ * The user can optionally update the model parameters with rte_ml_model_params_update() after
+ * invoking the rte_ml_model_stop() API on a given model ID.
+ *
+ * The application can invoke, in any order, the functions exported by the ML API to enqueue
+ * inference jobs and dequeue inference responses.
+ *
+ * If the application wants to change the device configuration (i.e., call
+ * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
+ * device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
+ * the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
+ * for the given model. The application does not need to call the rte_ml_dev_stop() API for
+ * any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
+ *
+ * Once the device is in the started state after invoking the rte_ml_dev_start() API and the
+ * model is in the started state after invoking the rte_ml_model_start() API, the application
+ * can call rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the destined device and
+ * model ID.
+ *
+ * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
+ *
+ * Typical application utilisation of the ML API will adhere to the following
+ * programming flow.
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ * - rte_ml_model_info_get()
+ * - rte_ml_dev_start()
+ * - rte_ml_enqueue_burst()
+ * - rte_ml_dequeue_burst()
+ * - rte_ml_model_stop()
+ * - rte_ml_model_unload()
+ * - rte_ml_dev_stop()
+ * - rte_ml_dev_close()
+ *
+ * Regarding multi-threading, by default, all the functions of the ML device API exported by a
+ * PMD are lock-free functions which are assumed not to be invoked in parallel on different
+ * logical cores to work on the same target object. For instance, the dequeue function of a
+ * poll mode driver cannot be invoked in parallel on two logical cores to operate on the same
+ * queue pair. Of course, this function can be invoked in parallel by different logical cores
+ * on different queue pairs. It is the responsibility of the user application to enforce this rule.
+ */
+
+#include <rte_common.h>
+#include <rte_mempool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_ML_STR_MAX 128
+/**< Maximum length of name string */
+
+/* Device operations */
+
+/**
+ * Get the total number of ML devices that have been successfully initialised.
+ *
+ * @return
+ * - The total number of usable ML devices.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dev_count(void);
+
+/**
+ * Check if the device is in ready state.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 if device is not in ready state.
+ * - 1 if device is in ready state.
+ */
+__rte_experimental
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id);
+
+/**
+ * Return the NUMA socket to which a device is connected.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - The NUMA socket id to which the device is connected
+ * - 0 If the socket could not be determined.
+ * - -EINVAL: if the dev_id value is not valid.
+ */
+__rte_experimental
+int
+rte_ml_dev_socket_id(int16_t dev_id);
+
+/** ML device information */
+struct rte_ml_dev_info {
+ const char *driver_name;
+ /**< Driver name */
+ int16_t max_models;
+ /**< Maximum number of models supported by the device.
+ * @see struct rte_ml_dev_config::max_nb_models
+ */
+ uint16_t max_queue_pairs;
+ /**< Maximum number of queue pairs supported by the device.
+ * @see struct rte_ml_dev_config::nb_queue_pairs
+ */
+ uint16_t max_desc;
+ /**< Maximum number of descriptors allowed per queue pair by the device.
+ * @see struct rte_ml_dev_qp_conf::nb_desc
+ */
+ uint16_t max_segments;
+ /**< Maximum number of scatter-gather entries supported by the device.
+ * @see struct rte_ml_buff_seg struct rte_ml_buff_seg::next
+ */
+};
+
+/**
+ * Retrieve the information of the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param dev_info
+ * A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
+ *
+ * @return
+ * - 0: Success, driver updates the information of the ML device
+ * - < 0: Error code returned by the driver info get function.
+ */
+__rte_experimental
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);
+
+/** ML device configuration structure */
+struct rte_ml_dev_config {
+ int socket_id;
+ /**< Socket to allocate resources on. */
+ int16_t max_nb_models;
+ /**< Max number of models allowed to be loaded on the device.
+ * This value cannot exceed the max_models which is previously provided in
+ * struct rte_ml_dev_info::max_models
+ */
+ uint16_t nb_queue_pairs;
+ /**< Number of queue pairs to configure on this device.
+ * This value cannot exceed the max_queue_pairs which is previously provided in
+ * struct rte_ml_dev_info::max_queue_pairs
+ */
+};
+
+/**
+ * Configure an ML device.
+ *
+ * This function must be invoked first before any other function in the API.
+ * This function can also be re-invoked when a device is in the stopped state.
+ *
+ * The caller may use rte_ml_dev_info_get() to get the capability of each resource available
+ * for this ML device.
+ *
+ * @param dev_id
+ * The identifier of the device to configure.
+ * @param config
+ * The ML device configuration structure.
+ *
+ * @return
+ * - 0: Success, device configured.
+ * - < 0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
+
+/* Forward declaration */
+struct rte_ml_op;
+
+/**< Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
+typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);
+
+/** ML device queue pair configuration structure. */
+struct rte_ml_dev_qp_conf {
+ uint32_t nb_desc;
+ /**< Number of descriptors per queue pair.
+ * This value cannot exceed the max_desc which is previously provided in
+ * struct rte_ml_dev_info::max_desc
+ */
+ rte_ml_dev_stop_flush_t cb;
+ /**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
+ * Value NULL is allowed, in which case callback will not be invoked.
+ * This function can be used to properly dispose of outstanding ML ops from all
+ * queue pairs, for example ops containing memory pointers.
+ * @see rte_ml_dev_stop()
+ */
+};
+
+/**
+ * Set up a queue pair for a device. This should only be called when the device is stopped.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param queue_pair_id
+ * The index of the queue pair to set up. The value must be in the range [0, nb_queue_pairs - 1]
+ * previously supplied to rte_ml_dev_configure().
+ * @param qp_conf
+ * The pointer to the configuration data to be used for the queue pair.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in case of NUMA.
+ * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
+ * for the queue pair.
+ *
+ * @return
+ * - 0: Success, queue pair correctly set up.
+ * - < 0: Queue pair configuration failed.
+ */
+__rte_experimental
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);
+
+/**
+ * Start an ML device.
+ *
+ * The device start step consists of setting the configured features and enabling the ML device
+ * to accept inference jobs.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device started.
+ * - <0: Error code of the driver device start function.
+ */
+__rte_experimental
+int
+rte_ml_dev_start(int16_t dev_id);
+
+/**
+ * Stop an ML device. A stopped device cannot accept inference jobs.
+ * The device can be restarted with a call to rte_ml_dev_start().
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device stopped.
+ * - <0: Error code of the driver device stop function.
+ */
+__rte_experimental
+int
+rte_ml_dev_stop(int16_t dev_id);
+
+/**
+ * Close an ML device. The device cannot be restarted!
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 on successfully closing device.
+ * - <0 on failure to close device.
+ */
+__rte_experimental
+int
+rte_ml_dev_close(int16_t dev_id);
+
+/** Status of ML operation */
+enum rte_ml_op_status {
+ RTE_ML_OP_STATUS_SUCCESS = 0,
+ /**< Operation completed successfully */
+ RTE_ML_OP_STATUS_NOT_PROCESSED,
+ /**< Operation has not yet been processed by the device.
+ * When an ML op is enqueued to the device, the driver sets the status as
+ * RTE_ML_OP_STATUS_NOT_PROCESSED. Upon the ML operation completion,
+ * the respective status will be updated by the driver.
+ */
+ RTE_ML_OP_STATUS_ERROR,
+ /**< Operation completed with error.
+ * Application can invoke rte_ml_op_error_get() to get PMD specific
+ * error code if needed.
+ */
+};
+
+/** ML operation's input and output buffer representation as scatter gather list
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
+
+/**
+ * ML Operation.
+ *
+ * This structure contains data related to performing an ML operation on the buffers using
+ * the model specified through model_id.
+ */
+struct rte_ml_op {
+ int16_t model_id;
+ /**< Model ID to be used for the operation. */
+ uint16_t nb_batches;
+ /**< Number of batches. Minimum value must be one.
+ * Input buffer must hold the inference data for each batch contiguously.
+ */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_mempool *mempool;
+ /**< Pool from which operation is allocated. */
+ struct rte_ml_buff_seg input;
+ /**< Input buffer to hold the inference data. */
+ struct rte_ml_buff_seg output;
+ /**< Output buffer to hold the inference output by the driver. */
+ RTE_STD_C11
+ union {
+ uint64_t user_u64;
+ /**< User data as uint64_t.*/
+ void *user_ptr;
+ /**< User data as void*.*/
+ };
+ enum rte_ml_op_status status;
+ /**< Operation status. */
+} __rte_cache_aligned;
+
+/* Enqueue/Dequeue operations */
+
+/**
+ * Enqueue a burst of ML inferences for processing on an ML device.
+ *
+ * The rte_ml_enqueue_burst() function is invoked to place ML inference
+ * operations on the queue *qp_id* of the device designated by its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of inferences to process which are
+ * supplied in the *ops* array of *rte_ml_op* structures.
+ *
+ * The rte_ml_enqueue_burst() function returns the number of inferences it
+ * actually enqueued for processing. A return value equal to *nb_ops* means that
+ * all operations have been enqueued.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair on which inferences are to be enqueued for processing.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * *rte_ml_dev_configure*.
+ * @param ops
+ * The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
+ * ML inferences to be processed.
+ * @param nb_ops
+ * The number of operations to process.
+ *
+ * @return
+ * The number of inference operations actually enqueued to the ML device.
+ * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
+ * is full or if invalid parameters are specified in a *rte_ml_op*.
+ */
+__rte_experimental
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed ML inferences operations from a queue on the ML device.
+ * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
+ * in the *ops* array.
+ *
+ * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
+ * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops*
+ * operations, and this is likely to signify that other processed operations remain in the
+ * device's output queue. Applications implementing a "retrieve as many processed operations
+ * as possible" policy can check
+ * this specific case and keep invoking the rte_ml_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
+ * the corresponding overhead.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair from which to retrieve processed packets.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * rte_ml_dev_configure().
+ * @param ops
+ * The address of an array of pointers to *rte_ml_op* structures that must be large enough to
+ * store *nb_ops* pointers in it.
+ * @param nb_ops
+ * The maximum number of inferences to dequeue.
+ *
+ * @return
+ * The number of operations actually dequeued, which is the number of pointers
+ * to *rte_ml_op* structures effectively supplied to the *ops* array.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Verbose error structure definition.
+ */
+struct rte_ml_op_error {
+ const char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
+ uint64_t errcode; /**< Vendor specific error code. */
+};
+
+/**
+ * Get PMD specific error information for an ML op.
+ *
+ * When an ML operation completes with RTE_ML_OP_STATUS_ERROR as its status,
+ * this API allows getting the PMD specific error details.
+ *
+ * @param[in] dev_id
+ * Device identifier
+ * @param[in] op
+ * Handle of ML operation
+ * @param[in] error
+ * Address of structure rte_ml_op_error to be filled
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
+
+/* Statistics operations */
+
+/** Device statistics. */
+struct rte_ml_dev_stats {
+ uint64_t enqueued_count;
+ /**< Count of all operations enqueued */
+ uint64_t dequeued_count;
+ /**< Count of all operations dequeued */
+ uint64_t enqueue_err_count;
+ /**< Total error count on operations enqueued */
+ uint64_t dequeue_err_count;
+ /**< Total error count on operations dequeued */
+};
+
+/**
+ * Retrieve the general I/O statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stats
+ * Pointer to structure to where statistics will be copied.
+ * On error, this location may or may not have been modified.
+ * @return
+ * - 0 on success
+ * - -EINVAL: If invalid parameter pointer is provided.
+ */
+__rte_experimental
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
+
+/**
+ * Reset the statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ */
+__rte_experimental
+void
+rte_ml_dev_stats_reset(int16_t dev_id);
+
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers for extended ML device statistics.
+ */
+struct rte_ml_dev_xstats_map {
+ uint16_t id;
+ /**< xstat identifier */
+ char name[RTE_ML_STR_MAX];
+ /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param[out] xstats_map
+ * Block of memory to insert ids and names into. Must be at least of the required capacity.
+ * If set to NULL, the function returns the required capacity.
+ *
+ * @return
+ * - Positive value on success:
+ * - The return value is the number of entries filled in the stats map.
+ * - If xstats_map is set to NULL, then the required capacity for xstats_map.
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param name
+ * The stat name to retrieve.
+ * @param stat_id
+ * If non-NULL, the numerical id of the stat will be returned, so that further requests for
+ * the stat can be made using rte_ml_dev_xstats_get, which will be faster as it doesn't need to
+ * scan a list of names for the stat.
+ * @param[out] value
+ * Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ * - 0: Successfully retrieved xstat value.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
+
+/**
+ * Retrieve extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * The id numbers of the stats to get. The ids can be fetched from the stat position in the
+ * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
+ * @param values
+ * The values for each stat requested by ID.
+ * @param nb_ids
+ * The number of stats requested.
+ * @return
+ * - Positive value: number of stat entries filled into the values array
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * Selects specific statistics to be reset. When NULL, all statistics will be reset.
+ * If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ * The number of ids available from the *stat_ids* array. Ignored when stat_ids is NULL.
+ * @return
+ * - 0: Successfully reset the statistics to zero.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
+
+/* Utility operations */
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *fd*.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param fd
+ * A pointer to a file for output.
+ * @return
+ * - 0: on success.
+ * - <0: on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd);
+
+/**
+ * Trigger the ML device self test.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @return
+ * - 0: Selftest successful.
+ * - -ENOTSUP: if the device doesn't support selftest.
+ * - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_selftest(int16_t dev_id);
+
+/* Model operations */
+
+/** ML model load parameters
+ *
+ * Parameters required to load an ML model.
+ */
+struct rte_ml_model_params {
+ void *addr;
+ /**< Address of model buffer */
+ size_t size;
+ /**< Size of model buffer */
+};
+
+/**
+ * Load an ML model to the device.
+ *
+ * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] params
+ * Parameters for the model to be loaded.
+ * @param[out] model_id
+ * Identifier of the model loaded.
+ *
+ * @return
+ * - 0: Success, Model created.
+ * - < 0: Failure, Error code of the model load driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id);
+
+/**
+ * Unload an ML model from the device.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be unloaded.
+ *
+ * @return
+ * - 0: Success, Model destroyed.
+ * - < 0: Failure, Error code of the model unload driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_unload(int16_t dev_id, int16_t model_id);
+
+/**
+ * Start an ML model for the given device ID.
+ *
+ * Start an ML model to accept inference requests.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be started.
+ *
+ * @return
+ * - 0: Success, Model started.
+ * - < 0: Failure, Error code of the model start driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_start(int16_t dev_id, int16_t model_id);
+
+/**
+ * Stop an ML model for the given device ID.
+ *
+ * Model stop would disable the ML model from being used for inference jobs.
+ * All inference jobs must have been completed before model stop is attempted.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be stopped.
+ *
+ * @return
+ * - 0: Success, Model stopped.
+ * - < 0: Failure, Error code of the model stop driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_stop(int16_t dev_id, int16_t model_id);
+
+/**
+ * Input and output data types. ML models can operate on reduced precision
+ * datatypes to achieve better power efficiency, lower network latency and lower memory footprint.
+ * This enum is used to represent the lower precision integer and floating point types used
+ * by ML models.
+ */
+enum rte_ml_io_type {
+ RTE_ML_IO_TYPE_UNKNOWN = 0,
+ /**< Invalid or unknown type */
+ RTE_ML_IO_TYPE_INT8,
+ /**< 8-bit integer */
+ RTE_ML_IO_TYPE_UINT8,
+ /**< 8-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT16,
+ /**< 16-bit integer */
+ RTE_ML_IO_TYPE_UINT16,
+ /**< 16-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT32,
+ /**< 32-bit integer */
+ RTE_ML_IO_TYPE_UINT32,
+ /**< 32-bit unsigned integer */
+ RTE_ML_IO_TYPE_FP8,
+ /**< 8-bit floating point number */
+ RTE_ML_IO_TYPE_FP16,
+ /**< IEEE 754 16-bit floating point number */
+ RTE_ML_IO_TYPE_FP32,
+ /**< IEEE 754 32-bit floating point number */
+ RTE_ML_IO_TYPE_BFLOAT16
+ /**< 16-bit brain floating point number. */
+};
+
+/**
+ * Input and output format. This is used to represent the encoding type of multi-dimensional
+ * data used by ML models.
+ */
+enum rte_ml_io_format {
+ RTE_ML_IO_FORMAT_NCHW = 1,
+ /**< Batch size (N) x channels (C) x height (H) x width (W) */
+ RTE_ML_IO_FORMAT_NHWC,
+ /**< Batch size (N) x height (H) x width (W) x channels (C) */
+ RTE_ML_IO_FORMAT_CHWN,
+ /**< Channels (C) x height (H) x width (W) x batch size (N) */
+ RTE_ML_IO_FORMAT_3D,
+ /**< Format to represent a 3 dimensional data */
+ RTE_ML_IO_FORMAT_2D,
+ /**< Format to represent matrix data */
+ RTE_ML_IO_FORMAT_1D,
+ /**< Format to represent vector data */
+ RTE_ML_IO_FORMAT_SCALAR,
+ /**< Format to represent scalar data */
+};
+
+/**
+ * Input and output shape. This structure represents the encoding format and dimensions
+ * of the tensor or vector.
+ *
+ * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions
+ * used for the data depends on the format. Unused dimensions are to be set to 1.
+ */
+struct rte_ml_io_shape {
+ enum rte_ml_io_format format;
+ /**< Format of the data */
+ uint32_t w;
+ /**< First dimension */
+ uint32_t x;
+ /**< Second dimension */
+ uint32_t y;
+ /**< Third dimension */
+ uint32_t z;
+ /**< Fourth dimension */
+};
+
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
+
+/** Model information structure */
+struct rte_ml_model_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Model name. */
+ char version[RTE_ML_STR_MAX];
+ /**< Model version */
+ uint16_t model_id;
+ /**< Model ID */
+ uint16_t device_id;
+ /**< Device ID */
+ uint16_t batch_size;
+ /**< Maximum number of batches that the model can process simultaneously */
+ uint32_t nb_inputs;
+ /**< Number of inputs */
+ const struct rte_ml_io_info *input_info;
+ /**< Input info array. Array size is equal to nb_inputs */
+ uint32_t nb_outputs;
+ /**< Number of outputs */
+ const struct rte_ml_io_info *output_info;
+ /**< Output info array. Array size is equal to nb_outputs */
+ uint64_t wb_size;
+ /**< Size of model weights and bias */
+};
+
+/**
+ * Get ML model information.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[out] model_info
+ * Pointer to a model info structure
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info);
+
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer);
+
+/* IO operations */
+
+/**
+ * Get size of quantized and dequantized input buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized input data.
+ * This API would return the buffer sizes for the number of batches provided and would
+ * consider the alignment requirements as per the PMD. Input sizes computed by this API can
+ * be used by the application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] input_qsize
+ * Quantized input size pointer.
+ * NULL value is allowed, in which case input_qsize is not calculated by the driver.
+ * @param[out] input_dsize
+ * Dequantized input size pointer.
+ * NULL value is allowed, in which case input_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize);
+
+/**
+ * Get size of quantized and dequantized output buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized output data.
+ * This API would return the buffer sizes for the number of batches provided and would consider
+ * the alignment requirements as per the PMD. Output sizes computed by this API can be used by the
+ * application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] output_qsize
+ * Quantized output size pointer.
+ * NULL value is allowed, in which case output_qsize is not calculated by the driver.
+ * @param[out] output_dsize
+ * Dequantized output size pointer.
+ * NULL value is allowed, in which case output_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize);
+
+/**
+ * Quantize input data.
+ *
+ * Quantization converts data from a higher precision type to a lower precision type to improve
+ * the throughput and efficiency of the model execution with minimal loss of accuracy.
+ * The types of the dequantized and quantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized input buffer
+ * @param[in] dbuffer
+ * Address of dequantized input data
+ * @param[out] qbuffer
+ * Address of quantized input data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer);
+
+/**
+ * Dequantize output data.
+ *
+ * Dequantization converts data from a lower precision type to a higher precision type.
+ * The types of the quantized and dequantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized output buffer
+ * @param[in] qbuffer
+ * Address of quantized output data
+ * @param[out] dbuffer
+ * Address of dequantized output data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer);
+
+/* ML op pool operations */
+
+/**
+ * Create an ML operation pool
+ *
+ * @param name
+ * ML operations pool name
+ * @param nb_elts
+ * Number of elements in pool
+ * @param cache_size
+ * Number of elements to cache on lcore, see
+ * *rte_mempool_create* for further details about cache size
+ * @param user_size
+ * Size of private data to allocate for user with each operation
+ * @param socket_id
+ * Socket identifier to allocate memory on
+ * @return
+ * - On success pointer to mempool
+ * - On failure NULL
+ */
+__rte_experimental
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id);
+
+/**
+ * Free an ML operation pool
+ *
+ * @param mempool
+ * A pointer to the mempool structure.
+ * If NULL then, the function does nothing.
+ */
+__rte_experimental
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MLDEV_H */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
new file mode 100644
index 0000000000..5aeea7c827
--- /dev/null
+++ b/lib/mldev/version.map
@@ -0,0 +1,5 @@
+EXPERIMENTAL {
+
+ local: *;
+};
+
--
2.37.1
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-03 13:28 [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library jerinj
2022-08-03 13:28 ` [dpdk-dev] [RFC PATCH 1/1] " jerinj
@ 2022-08-03 15:19 ` Stephen Hemminger
2022-08-16 13:13 ` Jerin Jacob
1 sibling, 1 reply; 80+ messages in thread
From: Stephen Hemminger @ 2022-08-03 15:19 UTC (permalink / raw)
To: jerinj
Cc: dev, thomas, ferruh.yigit, ajit.khaparde, aboyer,
andrew.rybchenko, beilei.xing, bruce.richardson, chas3,
chenbo.xia, ciara.loftus, dsinghrawat, ed.czeck, evgenys, grive,
g.singh, zhouguoyang, haiyue.wang, hkalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, irusskikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, kirankumark, oulijun, lironh,
longli, mw, spinler, matan, matt.peters, maxime.coquelin, mk,
humin29, pnalla, ndabilpuram, qiming.yang, qi.z.zhang, radhac,
rahul.lakkireddy, rmody, rosen.xu, sachin.saxena, skoteshwar,
shshaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara
On Wed, 3 Aug 2022 18:58:37 +0530
<jerinj@marvell.com> wrote:
> Roadmap
> -------
> 1) Address the comments for this RFC.
> 2) Common code for mldev
> 3) SW mldev driver based on TVM (https://tvm.apache.org/)
Having a SW implementation is important because then it can be covered
by tests.
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-03 15:19 ` [dpdk-dev] [RFC PATCH 0/1] " Stephen Hemminger
@ 2022-08-16 13:13 ` Jerin Jacob
2022-08-16 15:45 ` Morten Brørup
0 siblings, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2022-08-16 13:13 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Ferruh Yigit,
Ajit Khaparde, Andrew Boyer, Andrew Rybchenko, Beilei Xing,
Richardson, Bruce, Chas Williams, Xia, Chenbo, Ciara Loftus,
Devendra Singh Rawat, Ed Czeck, Evgeny Schemeilin, Gaetan Rivet,
Gagandeep Singh, Guoyang Zhou, Haiyue Wang, Harman Kalra,
Heinrich Kuhn, Hemant Agrawal, Hyong Youb Kim, Igor Chauskin,
Igor Russkikh, Jakub Grajciar, Jasvinder Singh, Jian Wang,
Jiawen Wu, Jingjing Wu, John Daley, John Miller,
John W. Linville, Wiles, Keith, Kiran Kumar K, Lijun Ou,
Liron Himi, Long Li, Marcin Wojtas, Martin Spinler, Matan Azrad,
Matt Peters, Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Honnappa Nagarahalli, Mattias Rönnblom,
Ruifeng Wang (Arm Technology China),
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara
On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 3 Aug 2022 18:58:37 +0530
> <jerinj@marvell.com> wrote:
>
> > Roadmap
> > -------
> > 1) Address the comments for this RFC.
> > 2) Common code for mldev
> > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
>
> Having a SW implementation is important because then it can be covered
> by tests.
Yes. That is the reason for adding a TVM-based SW driver as item (3).
Are there any other high-level or API-level comments before proceeding
with v1 and the implementation?
Or is anyone else interested in reviewing or contributing to this new DPDK device class?
* RE: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-16 13:13 ` Jerin Jacob
@ 2022-08-16 15:45 ` Morten Brørup
2022-08-16 16:34 ` Honnappa Nagarahalli
2022-08-17 5:37 ` Jerin Jacob
0 siblings, 2 replies; 80+ messages in thread
From: Morten Brørup @ 2022-08-16 15:45 UTC (permalink / raw)
To: Jerin Jacob
Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Ferruh Yigit,
Ajit Khaparde, Andrew Boyer, Andrew Rybchenko, Beilei Xing,
Richardson, Bruce, Chas Williams, Xia, Chenbo, Ciara Loftus,
Devendra Singh Rawat, Ed Czeck, Evgeny Schemeilin, Gaetan Rivet,
Gagandeep Singh, Guoyang Zhou, Haiyue Wang, Harman Kalra,
Heinrich Kuhn, Hemant Agrawal, Hyong Youb Kim, Igor Chauskin,
Igor Russkikh, Jakub Grajciar, Jasvinder Singh, Jian Wang,
Jiawen Wu, Jingjing Wu, John Daley, John Miller,
John W. Linville, Wiles, Keith, Kiran Kumar K, Lijun Ou,
Liron Himi, Long Li, Marcin Wojtas, Martin Spinler, Matan Azrad,
Matt Peters, Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Honnappa Nagarahalli, Mattias Rönnblom,
Ruifeng Wang (Arm Technology China),
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Tuesday, 16 August 2022 15.13
>
> On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Wed, 3 Aug 2022 18:58:37 +0530
> > <jerinj@marvell.com> wrote:
> >
> > > Roadmap
> > > -------
> > > 1) Address the comments for this RFC.
> > > 2) Common code for mldev
> > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> >
> > Having a SW implementation is important because then it can be
> covered
> > by tests.
>
> Yes. That reason for adding TVM based SW driver as item (3).
>
> Is there any other high level or API level comments before proceeding
> with v1 and implementation.
Have you seriously considered if the DPDK Project is the best home for this project? I can easily imagine the DPDK development process being a hindrance in many aspects for an evolving AI/ML library. Off the top of my head, it would probably be better off as a separate project, like SPDK.
If all this stuff can be completely omitted at build time, I have no objections.
A small note about naming (not intending to start a flame war, so please feel free to ignore!): I haven't worked seriously with ML/AI since university three decades ago, so I'm quite rusty in the domain. However, I don't see any Machine Learning functions proposed by this API. The library provides an API to an Inference Engine - but nobody says the inference model stems from Machine Learning; it might as well be a hand crafted model. Do you plan to propose APIs for training the models? If not, the name of the library could confuse some potential users.
> Or Anyone else interested to review or contribute to this new DPDK
> device class?
* RE: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-16 15:45 ` Morten Brørup
@ 2022-08-16 16:34 ` Honnappa Nagarahalli
2022-08-17 14:53 ` Jerin Jacob
2022-08-17 5:37 ` Jerin Jacob
1 sibling, 1 reply; 80+ messages in thread
From: Honnappa Nagarahalli @ 2022-08-16 16:34 UTC (permalink / raw)
To: Morten Brørup, Jerin Jacob
Cc: jerinj, dpdk-dev, thomas, Ferruh Yigit,
Ajit Khaparde (ajit.khaparde@broadcom.com),
Andrew Boyer, Andrew Rybchenko, Beilei Xing, Richardson, Bruce,
Chas Williams, Xia, Chenbo, Ciara Loftus, Devendra Singh Rawat,
Ed Czeck, Evgeny Schemeilin, Gaetan Rivet, Gagandeep Singh,
Guoyang Zhou, Haiyue Wang, Harman Kalra, Heinrich Kuhn,
hemant.agrawal, Hyong Youb Kim, Igor Chauskin, Igor Russkikh,
Jakub Grajciar, Jasvinder Singh, Jian Wang, Jiawen Wu,
Jingjing Wu, John Daley, John Miller, John W. Linville, Wiles,
Keith, Kiran Kumar K, Lijun Ou, Liron Himi, Long Li,
Marcin Wojtas, Martin Spinler, Matan Azrad, Matt Peters,
Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, nd, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Mattias Rönnblom, Ruifeng Wang,
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger, nd
<snip>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Tuesday, 16 August 2022 15.13
> >
> > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > <jerinj@marvell.com> wrote:
> > >
> > > > Roadmap
> > > > -------
> > > > 1) Address the comments for this RFC.
> > > > 2) Common code for mldev
> > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > >
> > > Having a SW implementation is important because then it can be
> > covered
> > > by tests.
> >
> > Yes. That reason for adding TVM based SW driver as item (3).
> >
> > Is there any other high level or API level comments before proceeding
> > with v1 and implementation.
>
> Have you seriously considered if the DPDK Project is the best home for this
> project? I can easily imagine the DPDK development process being a hindrance
> in many aspects for an evolving AI/ML library. Off the top of my head, it would
> probably be better off as a separate project, like SPDK.
There is a lot of talk about using ML in networking workloads, although I am not very sure what the use case looks like. For example: is the inference engine going to be inline (i.e., the packet goes through the inference engine before coming to the CPU and provides some data (what sort of data?)), or look-aside (does it require the packets to be sent to the inference engine, or is it some other data?), and what would an end-to-end use case be? A sample application using these APIs would be helpful.
IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?
>
> If all this stuff can be completely omitted at build time, I have no objections.
>
> A small note about naming (not intending to start a flame war, so please feel
> free to ignore!): I haven't worked seriously with ML/AI since university three
> decades ago, so I'm quite rusty in the domain. However, I don't see any
> Machine Learning functions proposed by this API. The library provides an API to
> an Inference Engine - but nobody says the inference model stems from
> Machine Learning; it might as well be a hand crafted model. Do you plan to
> propose APIs for training the models? If not, the name of the library could
> confuse some potential users.
I think, at least on the edge devices, we need an inference device as ML requires more cycles/power.
>
> > Or Anyone else interested to review or contribute to this new DPDK
> > device class?
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-16 15:45 ` Morten Brørup
2022-08-16 16:34 ` Honnappa Nagarahalli
@ 2022-08-17 5:37 ` Jerin Jacob
2022-08-17 6:58 ` Morten Brørup
1 sibling, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2022-08-17 5:37 UTC (permalink / raw)
To: Morten Brørup
Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Ferruh Yigit,
Ajit Khaparde, Andrew Boyer, Andrew Rybchenko, Beilei Xing,
Richardson, Bruce, Chas Williams, Xia, Chenbo, Ciara Loftus,
Devendra Singh Rawat, Ed Czeck, Evgeny Schemeilin, Gaetan Rivet,
Gagandeep Singh, Guoyang Zhou, Haiyue Wang, Harman Kalra,
Heinrich Kuhn, Hemant Agrawal, Hyong Youb Kim, Igor Chauskin,
Igor Russkikh, Jakub Grajciar, Jasvinder Singh, Jian Wang,
Jiawen Wu, Jingjing Wu, John Daley, John Miller,
John W. Linville, Wiles, Keith, Kiran Kumar K, Lijun Ou,
Liron Himi, Long Li, Marcin Wojtas, Martin Spinler, Matan Azrad,
Matt Peters, Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Honnappa Nagarahalli, Mattias Rönnblom,
Ruifeng Wang (Arm Technology China),
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger
On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Tuesday, 16 August 2022 15.13
> >
> > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > <jerinj@marvell.com> wrote:
> > >
> > > > Roadmap
> > > > -------
> > > > 1) Address the comments for this RFC.
> > > > 2) Common code for mldev
> > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > >
> > > Having a SW implementation is important because then it can be
> > covered
> > > by tests.
> >
> > Yes. That reason for adding TVM based SW driver as item (3).
> >
> > Is there any other high level or API level comments before proceeding
> > with v1 and implementation.
>
> Have you seriously considered if the DPDK Project is the best home for this project? I can easily imagine the DPDK development process being a hindrance in many aspects for an evolving AI/ML library. Off the top of my head, it would probably be better off as a separate project, like SPDK.
Yes. The reasons are the following:
# AI/ML compiler libraries are more focused on model creation and
training etc. (that is where the actual value addition of the AI/ML
libraries lies), with only a minimal part for inference (it is just
added for testing the model)
# Considering that inference is the scope for DPDK, DPDK is an ideal
place for the following reasons:
a) Inference scope is very limited.
b) Avoid memcpy of inference data (use it directly from the network or
another class of device like cryptodev, regexdev)
c) Reuse high-speed IO interfaces like PCI-backed drivers etc.
d) Integration with other DPDK subsystems like eventdev for job completion.
e) Also support more inline offloads by merging two device classes,
like rte_security does.
f) Run the inference model from different AI/ML compiler frameworks or
abstract the inference usage.
A similar concept is already applied to other DPDK device classes:
1) In regexdev, the compiler generates the rule database, which is out
of scope of DPDK. The DPDK API just loads the rule database.
2) In gpudev, the GPU kernel etc. are out of scope of DPDK. DPDK cares
about the IO interface.
> If all this stuff can be completely omitted at build time, I have no objections.
Yes, it can be completely omitted at build time. Also, there is no plan
to integrate with testpmd or other existing applications. The plan is to
add only the app/test-mldev application.
>
> A small note about naming (not intending to start a flame war, so please feel free to ignore!): I haven't worked seriously with ML/AI since university three decades ago, so I'm quite rusty in the domain. However, I don't see any Machine Learning functions proposed by this API. The library provides an API to an Inference Engine - but nobody says the inference model stems from Machine Learning; it might as well be a hand crafted model. Do you plan to propose APIs for training the models? If not, the name of the library could confuse some potential users.
No, the scope is only inference, and that is documented in the programming
guide and API header file. I am trying to keep the name similar to
regexdev, gpudev etc., which have a similar scope. But I am open to another
short name if you have something in mind.
>
> > Or Anyone else interested to review or contribute to this new DPDK
> > device class?
>
* RE: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-17 5:37 ` Jerin Jacob
@ 2022-08-17 6:58 ` Morten Brørup
2023-01-25 13:45 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Morten Brørup @ 2022-08-17 6:58 UTC (permalink / raw)
To: Jerin Jacob
Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Ferruh Yigit,
Ajit Khaparde, Andrew Boyer, Andrew Rybchenko, Beilei Xing,
Richardson, Bruce, Chas Williams, Xia, Chenbo, Ciara Loftus,
Devendra Singh Rawat, Ed Czeck, Evgeny Schemeilin, Gaetan Rivet,
Gagandeep Singh, Guoyang Zhou, Haiyue Wang, Harman Kalra,
Heinrich Kuhn, Hemant Agrawal, Hyong Youb Kim, Igor Chauskin,
Igor Russkikh, Jakub Grajciar, Jasvinder Singh, Jian Wang,
Jiawen Wu, Jingjing Wu, John Daley, John Miller,
John W. Linville, Wiles, Keith, Kiran Kumar K, Lijun Ou,
Liron Himi, Long Li, Marcin Wojtas, Martin Spinler, Matan Azrad,
Matt Peters, Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Honnappa Nagarahalli, Mattias Rönnblom,
Ruifeng Wang (Arm Technology China),
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Wednesday, 17 August 2022 07.37
>
> On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> >
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Tuesday, 16 August 2022 15.13
> > >
> > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > > >
> > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > <jerinj@marvell.com> wrote:
> > > >
> > > > > Roadmap
> > > > > -------
> > > > > 1) Address the comments for this RFC.
> > > > > 2) Common code for mldev
> > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > >
> > > > Having a SW implementation is important because then it can be
> > > covered
> > > > by tests.
> > >
> > > Yes. That reason for adding TVM based SW driver as item (3).
> > >
> > > Is there any other high level or API level comments before
> proceeding
> > > with v1 and implementation.
> >
> > Have you seriously considered if the DPDK Project is the best home
> for this project? I can easily imagine the DPDK development process
> being a hindrance in many aspects for an evolving AI/ML library. Off
> the top of my head, it would probably be better off as a separate
> project, like SPDK.
>
> Yes. The reasons are following
>
> # AI/ML compiler libraries more focused on model creation and
> training etc (Thats where actual value addition the AI/ML libraries
> can offer) and minimal part for interference(It is just added for
> testing the model)
> # Considering the inference is the scope of the DPDK. DPDK is ideal
> place for following reasons
>
> a) Inference scope is very limited.
> b) Avoid memcpy of interference data (Use directly from network or
> other class of device like cryptodev, regexdev)
> c) Reuse highspeed IO interface like PCI backed driver etc
> d) Integration with other DPDK subsystems like eventdev etc for job
> completion.
> e) Also support more inline offloads by merging two device classes
> like rte_secuity.
> f) Run the inference model from different AI/ML compiler frameworks or
> abstract the inference usage.
> Similar concept is already applied to other DPDK device classes like
> 1) In Regexdev, The compiler generates the rule database which is out
> of scope of DPDK. DPDK API just loads the rule database
> 2) In Gpudev, The GPU kernel etc out of scope of DPDK.DPDK cares about
> IO interface.
Thank you for the detailed reply, Jerin.
These are good reasons for adding the new device class to the DPDK project - especially the regexdev comparison convinced me.
>
> > If all this stuff can be completely omitted at build time, I have no
> objections.
>
> Yes, It can be completely omitted at build time.
Perfect.
> Also no plan to
> integrate to testpmd and other existing application. Planning to add
> only app/test-mldev application.
+1 to that
>
> >
> > A small note about naming (not intending to start a flame war, so
> please feel free to ignore!): I haven't worked seriously with ML/AI
> since university three decades ago, so I'm quite rusty in the domain.
> However, I don't see any Machine Learning functions proposed by this
> API. The library provides an API to an Inference Engine - but nobody
> says the inference model stems from Machine Learning; it might as well
> be a hand crafted model. Do you plan to propose APIs for training the
> models? If not, the name of the library could confuse some potential
> users.
>
> No, scope is only inference and it is documented in the programing
> guide and API header file. I am trying to keep name similar to
> regexdev, gpudev etc which have similar scope. But I am open to other
> shortname/name if you have something in mind.
The AI(Artificial Intelligence)/ML(Machine Learning)/IE(Inference Engine) chip market still seems immature and fragmented, so I can't find any consensus on generic names for such hardware accelerator devices.
Some of the chip vendors represented on the DPDK mailing list offer AI/ML/IE accelerator chips. Perhaps their marketing department could propose alternatives to "Machine Learning Device"/"mldev" for inference engine devices (with no acceleration for training the models). If not, the initially proposed name is good enough.
So: Everyone ask your marketing departments and speak up now, or the name "mldev" will be set in stone. ;-)
I'm thinking: While "Inference Engine Device"/iedev might be technically more correct, it doesn't have the same value as "Machine Learning Device"/"mldev" on a marketing scale. And we should choose a name that we expect might become industry standard consensus.
>
> >
> > > Or Anyone else interested to review or contribute to this new DPDK
> > > device class?
> >
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-16 16:34 ` Honnappa Nagarahalli
@ 2022-08-17 14:53 ` Jerin Jacob
2023-01-25 13:47 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2022-08-17 14:53 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: Morten Brørup, jerinj, dpdk-dev, thomas, Ferruh Yigit,
Ajit Khaparde (ajit.khaparde@broadcom.com),
Andrew Boyer, Andrew Rybchenko, Beilei Xing, Richardson, Bruce,
Chas Williams, Xia, Chenbo, Ciara Loftus, Devendra Singh Rawat,
Ed Czeck, Evgeny Schemeilin, Gaetan Rivet, Gagandeep Singh,
Guoyang Zhou, Haiyue Wang, Harman Kalra, Heinrich Kuhn,
hemant.agrawal, Hyong Youb Kim, Igor Chauskin, Igor Russkikh,
Jakub Grajciar, Jasvinder Singh, Jian Wang, Jiawen Wu,
Jingjing Wu, John Daley, John Miller, John W. Linville, Wiles,
Keith, Kiran Kumar K, Lijun Ou, Liron Himi, Long Li,
Marcin Wojtas, Martin Spinler, Matan Azrad, Matt Peters,
Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, nd, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Mattias Rönnblom, Ruifeng Wang,
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger
On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
> <snip>
>
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Tuesday, 16 August 2022 15.13
> > >
> > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > > >
> > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > <jerinj@marvell.com> wrote:
> > > >
> > > > > Roadmap
> > > > > -------
> > > > > 1) Address the comments for this RFC.
> > > > > 2) Common code for mldev
> > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > >
> > > > Having a SW implementation is important because then it can be
> > > covered
> > > > by tests.
> > >
> > > Yes. That reason for adding TVM based SW driver as item (3).
> > >
> > > Is there any other high level or API level comments before proceeding
> > > with v1 and implementation.
> >
> > Have you seriously considered if the DPDK Project is the best home for this
> > project? I can easily imagine the DPDK development process being a hindrance
> > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > probably be better off as a separate project, like SPDK.
> There is a lot of talk about using ML in networking workloads. Although, I am not very sure on how the use case looks like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provide some data (what sort of data?)), look aside (does it require the packets to be sent to the inference engine or is it some other data?), what would be an end to end use case? A sample application using these APIs would be helpful.
A simple application showing the inference usage is added in the cover letter.
Regarding the use cases, there are many, such as firewall, intrusion
detection etc. Most of the use cases are driven by product
requirements, and SW IP vendors try to keep them to themselves as a
product differentiating factor.
That is the prime reason for limiting the DPDK scope to inference, where
IO is involved. Model creation, training etc. will vary heavily based on
the use case, but the inference model will not.
>
> IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
Yes. For networking or ORAN use cases, the inference data comes
over the wire and the result can go back over the wire.
>
> As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?
# AI/ML compiler libraries are more focused on model creation and
training etc. (that is where the actual value addition of the AI/ML
libraries lies), with only a minimal part for inference (it is just
added for testing the model)
# Considering that inference is the scope for DPDK, DPDK is an ideal
place for the following reasons:
a) Inference scope is very limited.
b) Avoid memcpy of inference data (use it directly from the network or
another class of device like cryptodev, regexdev)
c) Reuse high-speed IO interfaces like PCI-backed drivers etc.
d) Integration with other DPDK subsystems like eventdev for job completion.
e) Also support more inline offloads by merging two device classes,
like rte_security does.
f) Run the inference model from different AI/ML compiler frameworks or
abstract the inference usage.
A similar concept is already applied to other DPDK device classes:
1) In regexdev, the compiler generates the rule database, which is out
of scope of DPDK. The DPDK API just loads the rule database.
2) In gpudev, the GPU kernel etc. are out of scope of DPDK. DPDK cares
about the IO interface.
>
> >
> > If all this stuff can be completely omitted at build time, I have no objections.
> >
> > A small note about naming (not intending to start a flame war, so please feel
> > free to ignore!): I haven't worked seriously with ML/AI since university three
> > decades ago, so I'm quite rusty in the domain. However, I don't see any
> > Machine Learning functions proposed by this API. The library provides an API to
> > an Inference Engine - but nobody says the inference model stems from
> > Machine Learning; it might as well be a hand crafted model. Do you plan to
> > propose APIs for training the models? If not, the name of the library could
> > confuse some potential users.
> I think, at least on the edge devices, we need an inference device as ML requires more cycles/power.
>
> >
> > > Or Anyone else interested to review or contribute to this new DPDK
> > > device class?
>
* [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2022-08-03 13:28 ` [dpdk-dev] [RFC PATCH 1/1] " jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
` (13 more replies)
0 siblings, 14 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Machine learning inference library
==================================
Definition of machine learning inference
----------------------------------------
Inference in machine learning is the process of making an output prediction
based on new input data using a pre-trained machine learning model.
The scope of the RFC would include only inferencing with pre-trained machine learning models,
training and building/compiling the ML models is out of scope for this RFC or
DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
Motivation for the new library
------------------------------
Multiple semiconductor vendors are offering accelerator products such as DPU
(often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
integrated as part of the product. Use of ML inferencing is increasing in the domain
of packet processing for flow classification, intrusion, malware and anomaly detection.
Lack of inferencing support through DPDK APIs will involve complexities and
increased latency from moving data across frameworks (i.e, dataplane to
non dataplane ML frameworks and vice-versa). Having a standardized DPDK APIs for ML
inferencing would enable the dataplane solutions to harness the benefit of inline
inferencing supported by the hardware.
Contents
--------
A) API specification for:
1) Discovery of ML capabilities (e.g., device specific features) in a vendor
independent fashion
2) Definition of functions to handle ML devices, which includes probing,
initialization and termination of the devices.
3) Definition of functions to handle ML models used to perform inference operations.
4) Definition of function to handle quantize and dequantize operations
B) Common code for above specification
Roadmap
-------
1) SW mldev driver based on TVM (https://tvm.apache.org/)
2) HW mldev driver for cn10k
3) Add app/test-mldev application similar to other device class tests
rfc..v1:
- Added programmer guide documentation
- Added implementation for common code
Machine learning library framework
----------------------------------
The ML framework is built on the following model:
+-----------------+ rte_ml_[en|de]queue_burst()
| | |
| Machine o------+ +--------+ |
| Learning | | | queue | | +------+
| Inference o------+-----o |<===o===>|Core 0|
| Engine | | | pair 0 | +------+
| o----+ | +--------+
| | | |
+-----------------+ | | +--------+
^ | | | queue | +------+
| | +-----o |<=======>|Core 1|
| | | pair 1 | +------+
| | +--------+
+--------+--------+ |
| +-------------+ | | +--------+
| | Model 0 | | | | queue | +------+
| +-------------+ | +-------o |<=======>|Core N|
| +-------------+ | | pair N | +------+
| | Model 1 | | +--------+
| +-------------+ |
| +-------------+ |<------- rte_ml_model_load()
| | Model .. | |-------> rte_ml_model_info_get()
| +-------------+ |<------- rte_ml_model_start()
| +-------------+ |<------- rte_ml_model_stop()
| | Model N | |<------- rte_ml_model_params_update()
| +-------------+ |<------- rte_ml_model_unload()
+-----------------+
ML Device: A hardware or software-based implementation of ML device API for
running inferences using a pre-trained ML model.
ML Model: An ML model is an algorithm trained over a dataset. A model consists of
the procedure/algorithm and the data/patterns required to make predictions on live data.
Once the model is created and trained outside of the DPDK scope, the model can be loaded
via rte_ml_model_load() and then started using the rte_ml_model_start() API.
rte_ml_model_params_update() can be used to update the model parameters, such as weights
and bias, without unloading the model via rte_ml_model_unload().
ML Inference: ML inference is the process of feeding data to the model via the
rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
outputs/predictions from the started model.
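As an illustration, a single inference could be driven as in the sketch below. This is
not part of the API, only a minimal usage sketch assuming a configured device and an
already prepared op; a real application would bound the polling:

/* Sketch: run one inference op through a queue pair and wait for the result. */
static void
run_one_inference(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op)
{
	struct rte_ml_op *done;

	while (rte_ml_enqueue_burst(dev_id, qp_id, &op, 1) != 1)
		; /* retry until the queue pair accepts the op */
	while (rte_ml_dequeue_burst(dev_id, qp_id, &done, 1) != 1)
		; /* busy-poll until the inference completes */
}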
In all functions of the ML device API, the ML device is designated by an
integer >= 0 named the device identifier *dev_id*.
The functions exported by the ML device API to set up a device designated by
its device identifier must be invoked in the following order (a sketch follows the list):
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_dev_start()
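For example, a minimal setup sequence could look as below. This is a sketch only,
assuming device 0 with a single queue pair and a single descriptor; error checks
are omitted:

/* Sketch of the mandatory device setup order. */
struct rte_ml_dev_config config;
struct rte_ml_dev_qp_conf qp_conf;
struct rte_ml_dev_info dev_info;

rte_ml_dev_info_get(0, &dev_info);
config.socket_id = rte_ml_dev_socket_id(0);
config.max_nb_models = dev_info.max_models;
config.nb_queue_pairs = 1;
rte_ml_dev_configure(0, &config);

qp_conf.nb_desc = 1;
rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id);

rte_ml_dev_start(0);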
A model is required to run the inference operations with the user-specified inputs.
The application needs to invoke the ML model APIs in the following order before queueing
inference jobs (a combined sketch follows below):
- rte_ml_model_load()
- rte_ml_model_start()
The rte_ml_model_info_get() API is provided to retrieve information related to the model.
The information includes the shape and type of the input and output required for the inference.
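A sketch of this model bring-up is shown below, assuming the model binary has already
been read into the params structure and device 0 is configured and started; error
checks are omitted:

/* Sketch: load a model, start it and query its IO requirements. */
struct rte_ml_model_params params; /* filled with the model data beforehand */
struct rte_ml_model_info info;
uint16_t model_id;

rte_ml_model_load(0, &params, &model_id);
rte_ml_model_start(0, model_id);
rte_ml_model_info_get(0, model_id, &info);
/* info.input_info / info.output_info now describe the shape and type per IO */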
Data quantization and dequantization are among the main aspects of the ML domain. This involves
conversion of input data from a higher precision to a lower precision data type and vice-versa
for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
and output buffers holding data for multiple batches.
Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used to get
the size of quantized and de-quantized multi-batch input and output buffers (see the sketch below).
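Putting the IO handling together, a sketch could look as below; model_id and info are
assumed to be obtained as above, and all buffers are assumed to be allocated by the
application:

/* Sketch: size the IO buffers, quantize the input, dequantize the output. */
uint64_t in_qsize, in_dsize, out_qsize, out_dsize;
void *in_dbuf, *in_qbuf, *out_qbuf, *out_dbuf; /* allocated by the application */

rte_ml_io_input_size_get(0, model_id, info.batch_size, &in_qsize, &in_dsize);
rte_ml_io_output_size_get(0, model_id, info.batch_size, &out_qsize, &out_dsize);

rte_ml_io_quantize(0, model_id, info.batch_size, in_dbuf, in_qbuf);
/* ... enqueue an op referencing in_qbuf / out_qbuf, wait for completion ... */
rte_ml_io_dequantize(0, model_id, info.batch_size, out_qbuf, out_dbuf);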
The user can optionally update the model parameters with rte_ml_model_params_update() after
invoking the rte_ml_model_stop() API on a given model ID, as sketched below.
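For instance, a weights/bias refresh could be sketched as below; new_wb is a
hypothetical application buffer of info.wb_size bytes:

/* Sketch: update model parameters in place; the model must be stopped first. */
void *new_wb; /* application buffer holding the new weights and bias */

rte_ml_model_stop(0, model_id);
rte_ml_model_params_update(0, model_id, new_wb);
rte_ml_model_start(0, model_id);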
The application can invoke, in any order, the functions exported by the ML API to enqueue
inference jobs and dequeue inference response.
If the application wants to change the device configuration (i.e., call
rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated, then
the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
for the given model. The application does not need to call the rte_ml_dev_stop() API for
any model re-configuration, such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
Once the device is in the started state after invoking the rte_ml_dev_start() API, and the
model is in the started state after invoking the rte_ml_model_start() API, the application can
call the rte_ml_enqueue_burst() and rte_ml_dequeue_burst() APIs on the destined device and model ID.
Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
A typical application's utilisation of the ML API will follow this
programming flow:
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_model_load()
- rte_ml_model_start()
- rte_ml_model_info_get()
- rte_ml_dev_start()
- rte_ml_enqueue_burst()
- rte_ml_dequeue_burst()
- rte_ml_model_stop()
- rte_ml_model_unload()
- rte_ml_dev_stop()
- rte_ml_dev_close()
Regarding multi-threading: by default, all the functions of the ML device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on different logical
cores on the same target object. For instance, the dequeue function of a poll mode driver cannot
be invoked in parallel on two logical cores to operate on the same queue pair. Of course, this
function can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the user application to enforce this rule.
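One common way to satisfy this rule is to dedicate one queue pair to each worker lcore,
as in the sketch below (device 0 is assumed; op preparation and launching the workers
via rte_eal_remote_launch() are omitted):

/* Sketch: each worker lcore owns one queue pair, so no locking is needed. */
static int
ml_worker(void *arg)
{
	uint16_t qp_id = (uint16_t)(uintptr_t)arg; /* unique qp_id per lcore */
	struct rte_ml_op *op = NULL, *done;

	/* ... get an op from the op pool and fill it as in the example below ... */
	rte_ml_enqueue_burst(0, qp_id, &op, 1);
	while (rte_ml_dequeue_burst(0, qp_id, &done, 1) == 0)
		;
	return 0;
}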
Example application usage for ML inferencing
--------------------------------------------
This example application demonstrates the programming model of the ML device
library. The example omits some error checks to keep the application simple. It
also assumes that the input data received is quantized and the output expected is
also quantized. In order to handle non-quantized inputs and outputs, users can
invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.
#include <getopt.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_memzone.h>
#include <rte_mempool.h>
#include <rte_mldev.h>

#define ML_MODEL_NAME "model"
#define IO_MZ "io_mz"
struct app_ctx {
	char model_file[PATH_MAX];
	char inp_file[PATH_MAX];
	char out_file[PATH_MAX];
	struct rte_ml_model_params params;
	struct rte_ml_model_info info;
	uint16_t id;
	uint64_t input_size;
	uint64_t output_size;
	uint8_t *input_buffer;
	uint8_t *output_buffer;
} __rte_cache_aligned;

struct app_ctx ctx;
static int
parse_args(int argc, char **argv)
{
	int opt, option_index;
	static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
					 {"input", required_argument, NULL, 'i'},
					 {"output", required_argument, NULL, 'o'},
					 {NULL, 0, NULL, 0}};

	while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
		switch (opt) {
		case 'm':
			strncpy(ctx.model_file, optarg, PATH_MAX - 1);
			break;
		case 'i':
			strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
			break;
		case 'o':
			strncpy(ctx.out_file, optarg, PATH_MAX - 1);
			break;
		default:
			return -1;
		}

	return 0;
}
int
main(int argc, char **argv)
{
	struct rte_ml_dev_qp_conf qp_conf;
	struct rte_ml_dev_config config;
	struct rte_ml_dev_info dev_info;
	const struct rte_memzone *mz;
	struct rte_mempool *op_pool;
	struct rte_ml_op *op_enq;
	struct rte_ml_op *op_deq;
	FILE *fp;
	int rc;

	/* Initialize EAL */
	rc = rte_eal_init(argc, argv);
	if (rc < 0)
		rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
	argc -= rc;
	argv += rc;

	/* Parse application arguments (after the EAL args) */
	if (parse_args(argc, argv) < 0)
		rte_exit(EXIT_FAILURE, "Invalid application arguments\n");

	/* Step 1: Check for ML devices */
	if (rte_ml_dev_count() <= 0)
		rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");

	/* Step 2: Get device info */
	if (rte_ml_dev_info_get(0, &dev_info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get device info\n");

	/* Step 3: Configure ML device, use device 0 */
	config.socket_id = rte_ml_dev_socket_id(0);
	config.max_nb_models = dev_info.max_models;
	config.nb_queue_pairs = dev_info.max_queue_pairs;
	if (rte_ml_dev_configure(0, &config) != 0)
		rte_exit(EXIT_FAILURE, "Device configuration failed\n");

	/* Step 4: Setup queue pairs, used qp_id = 0 */
	qp_conf.nb_desc = 1;
	if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
		rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");

	/* Step 5: Start device */
	if (rte_ml_dev_start(0) != 0)
		rte_exit(EXIT_FAILURE, "Device start failed\n");

	/* Step 6: Read model data and update load params structure */
	fp = fopen(ctx.model_file, "r+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open model file\n");

	fseek(fp, 0, SEEK_END);
	ctx.params.size = ftell(fp);
	fseek(fp, 0, SEEK_SET);

	ctx.params.addr = malloc(ctx.params.size);
	if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read model\n");
	}
	fclose(fp);
	strcpy(ctx.params.name, ML_MODEL_NAME);

	/* Step 7: Load the model */
	if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to load model\n");
	free(ctx.params.addr);

	/* Step 8: Start the model */
	if (rte_ml_model_start(0, ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to start model\n");

	/* Step 9: Allocate buffers for quantized input and output */

	/* Get model information */
	if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get model info\n");

	/* Get the buffer size for input and output */
	rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
	rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);

	mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
	if (mz == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");

	ctx.input_buffer = mz->addr;
	ctx.output_buffer = ctx.input_buffer + ctx.input_size;

	/* Step 10: Fill the input data */
	fp = fopen(ctx.inp_file, "r+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open input file\n");

	if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read input file\n");
	}
	fclose(fp);

	/* Step 11: Create ML op mempool */
	op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
	if (op_pool == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create op pool\n");

	/* Step 12: Form an ML op */
	rte_mempool_get_bulk(op_pool, (void **)&op_enq, 1);
	op_enq->model_id = ctx.id;
	op_enq->nb_batches = ctx.info.batch_size;
	op_enq->mempool = op_pool;
	op_enq->input.addr = ctx.input_buffer;
	op_enq->input.length = ctx.input_size;
	op_enq->input.next = NULL;
	op_enq->output.addr = ctx.output_buffer;
	op_enq->output.length = ctx.output_size;
	op_enq->output.next = NULL;

	/* Step 13: Enqueue jobs */
	rte_ml_enqueue_burst(0, 0, &op_enq, 1);

	/* Step 14: Dequeue jobs and release op pool */
	while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
		;

	/* Step 15: Write output */
	fp = fopen(ctx.out_file, "w+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open output file\n");
	fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
	fclose(fp);

	/* Step 16: Clean up */
	/* Stop ML model */
	rte_ml_model_stop(0, ctx.id);
	/* Unload ML model */
	rte_ml_model_unload(0, ctx.id);
	/* Free input/output memory */
	rte_memzone_free(rte_memzone_lookup(IO_MZ));
	/* Free the ml op back to pool */
	rte_mempool_put_bulk(op_pool, (void **)&op_deq, 1);
	/* Free ml op pool */
	rte_mempool_free(op_pool);
	/* Stop the device */
	rte_ml_dev_stop(0);
	rte_ml_dev_close(0);
	rte_eal_cleanup();

	return 0;
}
Jerin Jacob (1):
mldev: introduce machine learning device library
Srikanth Yalavarthi (11):
mldev: add PMD functions for ML device
mldev: support device handling functions
mldev: support device queue-pair setup
mldev: support handling ML models
mldev: support input and output data handling
mldev: support op pool and its operations
mldev: support inference enqueue and dequeue
mldev: support device statistics
mldev: support device extended statistics
mldev: support to retrieve error information
mldev: support to get debug info and test device
MAINTAINERS | 5 +
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
lib/eal/common/eal_common_log.c | 1 +
lib/eal/include/rte_log.h | 1 +
lib/meson.build | 1 +
lib/mldev/meson.build | 27 +
lib/mldev/rte_mldev.c | 901 ++++++++++++++++++
lib/mldev/rte_mldev.h | 1092 ++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 724 ++++++++++++++
lib/mldev/rte_mldev_pmd.c | 61 ++
lib/mldev/rte_mldev_pmd.h | 151 +++
lib/mldev/version.map | 49 +
17 files changed, 3919 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
create mode 100644 lib/mldev/version.map
--
2.38.1
* [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
@ 2022-11-14 12:02 ` jerinj
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
2023-02-02 5:26 ` Shivah Shankar Shankar Narayan Rao
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 02/12] mldev: add PMD functions for ML device jerinj
` (12 subsequent siblings)
13 siblings, 2 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Thomas Monjalon, Bruce Richardson, Srikanth Yalavarthi
Cc: ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus, dsinghrawat,
ed.czeck, evgenys, grive, g.singh, zhouguoyang, haiyue.wang,
hkalra, heinrich.kuhn, hemant.agrawal, hyonkim, igorch,
irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Add the mldev API specification to standardize and use the machine learning
device and inference operations in a vendor-neutral way.
The following operations are abstracted through APIs:
- ML device capability probe
- ML device configuration
- ML device queue pair configuration
- ML device state management
- ML device stat/xstat operations
- ML model load/unload/start/stop operations
- ML model information probe
- ML IO operations to find size for input and output buffers
- ML quantize and dequantize operations
- ML ops pool creation and free operations
- ML device enqueue/dequeue fastpath inference operations
Also add a programming guide.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
MAINTAINERS | 5 +
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
lib/eal/common/eal_common_log.c | 1 +
lib/eal/include/rte_log.h | 1 +
lib/meson.build | 1 +
lib/mldev/meson.build | 18 +
lib/mldev/rte_mldev.c | 5 +
lib/mldev/rte_mldev.h | 1092 ++++++++++++++++++++++
lib/mldev/version.map | 3 +
14 files changed, 2032 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 0e2fd39928..b2ab042248 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -534,6 +534,11 @@ F: drivers/raw/skeleton/
F: app/test/test_rawdev.c
F: doc/guides/prog_guide/rawdev.rst
+ML device API - EXPERIMENTAL
+M: Srikanth Yalavarthi <syalavarthi@marvell.com>
+F: lib/mldev/
+F: doc/guides/prog_guide/mldev.rst
+
Memory Pool Drivers
-------------------
diff --git a/config/rte_config.h b/config/rte_config.h
index 3c4876d434..083d37757d 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -83,6 +83,9 @@
/* rawdev defines */
#define RTE_RAWDEV_MAX_DEVS 64
+/* mldev defines */
+#define RTE_MLDEV_MAX_DEVS 64
+
/* ip_fragmentation defines */
#define RTE_LIBRTE_IP_FRAG_MAX_FRAG 8
// RTE_LIBRTE_IP_FRAG_TBL_STAT is not set
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de488c7abf..a12562977a 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -22,6 +22,7 @@ The public API headers are grouped by topics:
[compress](@ref rte_comp.h),
[regexdev](@ref rte_regexdev.h),
[dmadev](@ref rte_dmadev.h),
+ [mldev](@ref rte_mldev.h),
[eventdev](@ref rte_eventdev.h),
[event_eth_rx_adapter](@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter](@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index f0886c3bd1..5d6416d3e0 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -57,6 +57,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/mempool \
@TOPDIR@/lib/meter \
@TOPDIR@/lib/metrics \
+ @TOPDIR@/lib/mldev \
@TOPDIR@/lib/node \
@TOPDIR@/lib/net \
@TOPDIR@/lib/pcapng \
diff --git a/doc/guides/prog_guide/img/mldev_flow.svg b/doc/guides/prog_guide/img/mldev_flow.svg
new file mode 100644
index 0000000000..6c5dda14e5
--- /dev/null
+++ b/doc/guides/prog_guide/img/mldev_flow.svg
@@ -0,0 +1,714 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!-- SPDX-License-Identifier: BSD-3-Clause -->
+<!-- Copyright (c) 2022 Marvell. -->
+<!-- Created with Inkscape (http://www.inkscape.org/) -->
+
+<svg
+ width="320mm"
+ height="297mm"
+ viewBox="0 0 320 297"
+ version="1.1"
+ id="svg6899"
+ inkscape:version="1.2.1 (9c6d41e410, 2022-07-14)"
+ sodipodi:docname="mldev_flow.svg"
+ inkscape:export-filename="mldev_flow.png"
+ inkscape:export-xdpi="96"
+ inkscape:export-ydpi="96"
+ xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
+ xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
+ xmlns="http://www.w3.org/2000/svg"
+ xmlns:svg="http://www.w3.org/2000/svg">
+ <sodipodi:namedview
+ id="namedview6901"
+ pagecolor="#ffffff"
+ bordercolor="#000000"
+ borderopacity="0.25"
+ inkscape:showpageshadow="2"
+ inkscape:pageopacity="0.0"
+ inkscape:pagecheckerboard="0"
+ inkscape:deskcolor="#d1d1d1"
+ inkscape:document-units="mm"
+ showgrid="false"
+ inkscape:connector-spacing="0"
+ inkscape:lockguides="false"
+ inkscape:zoom="0.49638341"
+ inkscape:cx="640.63382"
+ inkscape:cy="525.80323"
+ inkscape:window-width="1920"
+ inkscape:window-height="986"
+ inkscape:window-x="-11"
+ inkscape:window-y="-11"
+ inkscape:window-maximized="1"
+ inkscape:current-layer="layer1" />
+ <defs
+ id="defs6896">
+ <marker
+ style="overflow:visible"
+ id="RoundedArrow"
+ refX="5"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="RoundedArrow"
+ markerWidth="6.1347523"
+ markerHeight="5.9304948"
+ viewBox="0 0 6.1347524 5.9304951"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.7)"
+ d="m -0.21114562,-4.1055728 6.42229122,3.21114561 a 1,1 90 0 1 0,1.78885438 L -0.21114562,4.1055728 A 1.236068,1.236068 31.717474 0 1 -2,3 v -6 a 1.236068,1.236068 148.28253 0 1 1.78885438,-1.1055728 z"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:none"
+ id="path1367" />
+ </marker>
+ <marker
+ style="overflow:visible"
+ id="TriangleStart"
+ refX="4"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="TriangleStart"
+ markerWidth="5.3244081"
+ markerHeight="6.155385"
+ viewBox="0 0 5.3244081 6.1553851"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.5)"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:context-stroke;stroke-width:1pt"
+ d="M 5.77,0 -2.88,5 V -5 Z"
+ id="path135" />
+ </marker>
+ </defs>
+ <g
+ inkscape:label="Layer 1"
+ inkscape:groupmode="layer"
+ id="layer1">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;paint-order:stroke fill markers"
+ id="rect39991"
+ width="312.88394"
+ height="286.7659"
+ x="3.5580292"
+ y="5.1170502"
+ ry="18.197132" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.68664,155.38145 h 32.15418"
+ id="path24358"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="m 114.68664,179.58099 h 32.15008"
+ id="path24360"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,203.78389 h 32.15008"
+ id="path24362"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,227.98576 32.14997,0"
+ id="path24364"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,252.18432 H 114.68664"
+ id="path24366"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,276.38309 H 114.68664"
+ id="path24368"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:2, 1;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24370"
+ width="18.09137"
+ height="13.568528"
+ x="127.27605"
+ y="208.81961"
+ ry="2.7394907"
+ inkscape:connector-avoid="true" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:4, 2;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 70.388979,148.58514 -1e-6,-46.3516"
+ id="path24426"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1"
+ inkscape:connection-end="#rect24176" />
+ <g
+ id="g42647">
+ <g
+ id="g31403"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844498;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844498;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-9"
+ width="99.155487"
+ height="14.152132"
+ x="190.88715"
+ y="229.93475"
+ ry="2.2479143"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-236.90309"
+ y="240.37343"
+ id="text31115"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113"
+ style="stroke:none;stroke-width:0.75"
+ x="-236.90309"
+ y="240.37343">rte_ml_model_update_params()</tspan></text>
+ </g>
+ <g
+ id="g31398"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68902, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-4"
+ width="99.155495"
+ height="14.152357"
+ x="190.88705"
+ y="205.73608"
+ ry="2.2479498"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-212.70453"
+ y="240.37334"
+ id="text31115-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-212.70453"
+ y="240.37334">rte_ml_model_stop()</tspan></text>
+ </g>
+ <g
+ id="g31408"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-2"
+ width="99.155495"
+ height="14.152359"
+ x="190.88715"
+ y="254.13341"
+ ry="2.2479503"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-261.10187"
+ y="240.37343"
+ id="text31115-1"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-3"
+ style="stroke:none;stroke-width:0.75"
+ x="-261.10187"
+ y="240.37343">rte_ml_model_unload()</tspan></text>
+ </g>
+ <g
+ id="g31393"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844566;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844566;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-5"
+ width="99.155434"
+ height="14.154394"
+ x="190.88718"
+ y="181.53319"
+ ry="2.2482734"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-188.50266"
+ y="240.37343"
+ id="text31115-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-2"
+ style="stroke:none;stroke-width:0.75"
+ x="-188.50266"
+ y="240.37343">rte_ml_model_start()</tspan></text>
+ </g>
+ <g
+ id="g31388"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844565;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844565;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-8"
+ width="99.155434"
+ height="14.154395"
+ x="190.88718"
+ y="157.33029"
+ ry="2.2482736"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-164.29976"
+ y="240.37343"
+ id="text31115-6"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-5"
+ style="stroke:none;stroke-width:0.75"
+ x="-164.29976"
+ y="240.37343">rte_ml_model_info_get()</tspan></text>
+ </g>
+ <g
+ id="g31383"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2"
+ width="99.155495"
+ height="14.152369"
+ x="190.89127"
+ y="133.13176"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-140.10022"
+ y="240.37755"
+ id="text31115-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35"
+ style="stroke:none;stroke-width:0.75"
+ x="-140.10022"
+ y="240.37755">rte_ml_model_load()</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="112.15163"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-119.12009"
+ y="233.56647"
+ id="text31115-0-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-119.12009"
+ y="233.56647">rte_ml_dequeue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.90712,47.649005 h 56.16045"
+ id="path24248"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176"
+ inkscape:connection-end="#rect24200" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 171.06762,70.71111 -56.1605,0.0024"
+ id="path24250"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="M 171.06765,93.773951 H 114.90712"
+ id="path24252"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5-2" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44396,47.649004 h 36.42795"
+ id="path24566"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.444,70.710168 h 36.42791"
+ id="path24568"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44395,93.773951 36.42796,-10e-7"
+ id="path24570"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42675">
+ <g
+ id="g31358"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200"
+ width="44.376362"
+ height="17.244751"
+ x="190.77635"
+ y="22.794853"
+ ry="2.7391431"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.802492"
+ y="212.98004"
+ id="text31256"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254"
+ style="stroke-width:0.75"
+ x="-31.802492"
+ y="212.98004">Queue Pair 0</tspan></text>
+ </g>
+ <g
+ id="g31353"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5"
+ width="44.376362"
+ height="17.244749"
+ x="190.7764"
+ y="45.856018"
+ ry="2.7391429"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-54.863655"
+ y="213.10411"
+ id="text31256-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-9"
+ style="stroke-width:0.75"
+ x="-54.863655"
+ y="213.10411">Queue Pair ..</tspan></text>
+ </g>
+ <g
+ id="g31363"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623731;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24746, 0.623731;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2"
+ width="44.37627"
+ height="17.249832"
+ x="190.77643"
+ y="68.917259"
+ ry="2.7399504"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-77.927437"
+ y="213.08859"
+ id="text31256-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-8"
+ style="stroke-width:0.75"
+ x="-77.927437"
+ y="213.08859">Queue Pair N</tspan></text>
+ </g>
+ </g>
+ <g
+ id="g42661">
+ <g
+ id="g31368"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="25.995117"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.941525"
+ y="287.03415"
+ id="text31260"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258"
+ style="stroke-width:0.75"
+ x="-31.941525"
+ y="287.03415">Core 0</tspan></text>
+ </g>
+ <g
+ id="g31373"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-4"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="49.056282"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-55.00008"
+ y="287.15549"
+ id="text31260-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-7"
+ style="stroke-width:0.75"
+ x="-55.00008"
+ y="287.15549">Core ..</tspan></text>
+ </g>
+ <g
+ id="g31378"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-41"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="72.120064"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-78.063866"
+ y="287.13998"
+ id="text31260-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-8"
+ style="stroke-width:0.75"
+ x="-78.063866"
+ y="287.13998">Core N</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5-6"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="13.539296"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-20.507757"
+ y="233.56647"
+ id="text31115-0-5-7"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8-7"
+ style="stroke:none;stroke-width:0.75"
+ x="-20.507757"
+ y="233.56647">rte_ml_enqueue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:2.25, 0.75;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#RoundedArrow)"
+ d="M 233.65793,27.691665 V 112.15163"
+ id="path36804"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42683">
+ <rect
+ style="fill:#44d7f4;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176"
+ width="89.036293"
+ height="63.036304"
+ x="25.870831"
+ y="39.197231"
+ ry="3.0941005" />
+ <text
+ xml:space="preserve"
+ style="font-size:11.2889px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-49.288273"
+ y="70.228432"
+ id="text38896"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan38894"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-49.288273"
+ y="70.228432">Machine</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-63.399399"
+ y="70.228432"
+ id="tspan38898">Learning</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-77.510529"
+ y="70.228432"
+ id="tspan38900">Inference</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-91.621651"
+ y="70.228432"
+ id="tspan38902">Engine</tspan></text>
+ </g>
+ <g
+ id="g42621">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.405;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176-1"
+ width="88.595322"
+ height="134.59531"
+ x="26.09132"
+ y="148.58514"
+ ry="6.6065331" />
+ <g
+ id="g42601">
+ <g
+ id="g39966"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="146.14212"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-157.3761"
+ y="130.49591"
+ id="text39799"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-157.3761"
+ y="130.49591">Model 0</tspan></text>
+ </g>
+ <g
+ id="g39971"
+ transform="translate(-60.175151,10.144334)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-8"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="178.65079"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-189.88477"
+ y="130.49591"
+ id="text39799-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-1"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-189.88477"
+ y="130.49591">Model 1</tspan></text>
+ </g>
+ <g
+ id="g39976"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-9"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="211.15947"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-222.39345"
+ y="130.49591"
+ id="text39799-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-8"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-222.39345"
+ y="130.49591">Model ..</tspan></text>
+ </g>
+ <g
+ id="g39981"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-7"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="243.66815"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-254.90213"
+ y="130.49591"
+ id="text39799-90"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-5"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-254.90213"
+ y="130.49591">Model N</tspan></text>
+ </g>
+ </g>
+ </g>
+ <text
+ xml:space="preserve"
+ style="font-size:14.1111px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-279.79742"
+ y="275.46826"
+ id="text38896-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:14.1111px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-279.79742"
+ y="275.46826"
+ id="tspan38902-6">mldev</tspan></text>
+ </g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..d7f2a28bdb 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
regexdev
dmadev
gpudev
+ mldev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
diff --git a/doc/guides/prog_guide/mldev.rst b/doc/guides/prog_guide/mldev.rst
new file mode 100644
index 0000000000..9809f2dba3
--- /dev/null
+++ b/doc/guides/prog_guide/mldev.rst
@@ -0,0 +1,186 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright (c) 2022 Marvell.
+
+Machine Learning Device Library
+===============================
+
+The MLDEV library provides a Machine Learning device framework for the management and
+provisioning of hardware and software ML poll mode drivers, defining APIs which
+support a number of ML operations including device handling and inference processing.
+The ML model creation and training are outside the scope of this library.
+
+The ML framework is built on the following model:
+
+.. _figure_mldev_work_flow:
+
+.. figure:: img/mldev_flow.*
+
+ Work flow of inference on MLDEV
+
+**ML Device**: A hardware or software-based implementation of ML device API for running
+inferences using a pre-trained ML model.
+
+**ML Model**: An ML model is an algorithm trained over a dataset. A model consists of
+the procedure/algorithm and the data/pattern required to make predictions on live data.
+Once the model is created and trained outside of the DPDK scope, the model can be loaded
+via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+rte_ml_model_params_update() can be used to update the model parameters, such as weights
+and bias, without unloading the model via rte_ml_model_unload().
+
+**ML Inference**: ML inference is the process of feeding data to the model via
+the rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the
+calculated outputs/predictions from the started model.
+
+Design Principles
+-----------------
+
+The MLDEV library follows the same basic principles as those used in DPDK's
+Ethernet Device framework and the Crypto framework. The MLDEV framework provides
+a generic Machine Learning device framework which supports both physical (hardware)
+and virtual (software) ML devices, as well as an ML API to manage and configure ML
+devices. The APIs also support performing ML inference operations through an ML poll
+mode driver.
+
+
+Device Operations
+-----------------
+
+Device Creation
+~~~~~~~~~~~~~~~
+
+Physical ML devices are discovered during the PCI probe/enumeration, through the
+EAL functions which are executed at DPDK initialization, based on their PCI device
+identifier, each unique PCI BDF (bus, device, function). Physical ML devices,
+like other physical devices in DPDK, can be allowed or blocked using the EAL
+command line options.
+
+
+Device Identification
+~~~~~~~~~~~~~~~~~~~~~
+
+Each device, whether virtual or physical, is uniquely designated by two
+identifiers:
+
+- A unique device index used to designate the ML device in all functions
+ exported by the MLDEV API.
+
+- A device name used to designate the ML device in console messages, for
+ administration or debugging purposes.
+
+Device Features and Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ML devices may support different feature sets. The ``rte_ml_dev_info_get`` API
+can be used to retrieve the information of the device and the features it
+supports.
+
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+The configuration of each ML device includes the following operations:
+
+- Allocation of resources, including hardware resources if a physical device.
+- Resetting the device into a well-known default state.
+- Initialization of statistics counters.
+
+The ``rte_ml_dev_configure`` API is used to configure an ML device.
+
+.. code-block:: c
+
+   int rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *cfg);
+
+The ``rte_ml_dev_config`` structure is used to pass the configuration parameters
+for the ML device, for example the number of queue pairs, the maximum number of
+models, the maximum size of a model, and so on.
+
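+A minimal configuration sketch, using the ``rte_ml_dev_info`` and ``rte_ml_dev_config``
+definitions from this patch (the single queue pair chosen here is illustrative):
+
+.. code-block:: c
+
+   struct rte_ml_dev_info dev_info;
+   struct rte_ml_dev_config config;
+
+   /* Query device capabilities first, to stay within its limits. */
+   if (rte_ml_dev_info_get(dev_id, &dev_info) < 0)
+       rte_panic("Failed to get info of ML device %d\n", dev_id);
+
+   config.socket_id = rte_ml_dev_socket_id(dev_id);
+   config.nb_models = dev_info.max_models;
+   config.nb_queue_pairs = 1; /* bounded by dev_info.max_queue_pairs */
+
+   if (rte_ml_dev_configure(dev_id, &config) < 0)
+       rte_panic("Failed to configure ML device %d\n", dev_id);
+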
+Configuration of Queue Pairs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each ML device can be configured with a number of queue pairs.
+Each queue pair is configured using ``rte_ml_dev_queue_pair_setup``, as sketched below.
+
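+A setup sketch for queue pair 0, assuming the device has been configured with at
+least one queue pair; the descriptor count is illustrative and must not exceed
+``rte_ml_dev_info::max_desc``:
+
+.. code-block:: c
+
+   struct rte_ml_dev_qp_conf qp_conf;
+
+   qp_conf.nb_desc = 64;  /* illustrative descriptor count */
+   qp_conf.cb = NULL;     /* no flush callback on rte_ml_dev_stop() */
+
+   /* Queue pair 0, with no NUMA constraint for its memory. */
+   if (rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf, SOCKET_ID_ANY) < 0)
+       rte_panic("Failed to set up queue pair 0 on device %d\n", dev_id);
+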
+Logical Cores, Memory and Queue Pair Relationships
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Multiple logical cores should never share the same queue pair for enqueuing
+operations or dequeueing operations on the same ML device since this would
+require global locks and hinder performance.
+
+Configuration of Machine Learning models
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pre-trained ML models that are built using external ML compiler/training frameworks
+are used to perform inference operations. These models are configured on an ML device
+in a two-stage process that includes loading the model on an ML device, and starting
+the model to accept inference operations. Inference operations can be queued for a
+model only when the model is in the started state. The model load stage assigns a
+model ID, which is unique for the model in a driver's context. The model ID is used
+during all subsequent slow-path and fast-path operations.
+
+Model loading and start is done through the ``rte_ml_model_load`` and
+``rte_ml_model_start`` functions.
+
+Similarly, stopping and unloading are done through the ``rte_ml_model_stop`` and
+``rte_ml_model_unload`` functions.
+
+The stop and unload functions release the resources allocated for the
+model. Inference tasks cannot be queued for a model that is stopped.
+
+Detailed information related to the model can be retrieved from the driver using the
+``rte_ml_model_info_get`` function. Model information is accessible to the application
+through the ``rte_ml_model_info`` structure. The information available to the user
+includes the details of the model's inputs and outputs, and the maximum batch size
+supported by the model.
+
+The user can optionally update the model parameters, such as weights and bias, without
+unloading the model, through the ``rte_ml_model_params_update`` function. A model must
+be in the stopped state to update the parameters, and has to be started again in order
+to enqueue inference requests after a parameter update.
+
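+A lifecycle sketch, assuming ``rte_ml_model_load`` returns the assigned model ID
+through an output argument and that ``params`` is an already populated model
+parameters structure (see ``rte_mldev.h`` for the exact prototypes):
+
+.. code-block:: c
+
+   int16_t model_id;
+
+   /* Load the pre-trained model; the driver assigns a model ID. */
+   if (rte_ml_model_load(dev_id, &params, &model_id) < 0)
+       rte_panic("Failed to load model on device %d\n", dev_id);
+
+   /* Start the model so that inference operations can be enqueued. */
+   if (rte_ml_model_start(dev_id, model_id) < 0)
+       rte_panic("Failed to start model %d\n", model_id);
+
+   /* ... enqueue and dequeue inference operations ... */
+
+   rte_ml_model_stop(dev_id, model_id);
+   rte_ml_model_unload(dev_id, model_id);
+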
+Enqueue / Dequeue
+~~~~~~~~~~~~~~~~~
+
+The burst enqueue API uses an ML device identifier and a queue pair identifier
+to specify the device queue pair to schedule the processing on. The ``nb_ops``
+parameter is the number of operations to process, which are supplied in the
+``ops`` array of ``rte_ml_op`` structures. The enqueue function returns the
+number of operations it enqueued for processing; a return value equal to
+``nb_ops`` means that all operations have been enqueued.
+
+The dequeue API uses the same format as the enqueue API, but the ``nb_ops`` and
+``ops`` parameters are now used to specify the maximum number of processed
+operations the user wishes to retrieve and the location in which to store them.
+The API call returns the actual number of processed operations returned; this
+can never be larger than ``nb_ops``.
+
+``rte_ml_op`` provides the required information to the driver to queue an ML inference
+task. The ML op specifies the model to be used and the number of batches to be executed
+in the inference task. Input and output buffer information is specified through the
+structure ``rte_ml_buff_seg``, which supports segmented data. Input is provided through
+``rte_ml_op::input`` and output through ``rte_ml_op::output``. The data pointed to by
+each op must not be released until that op has been dequeued.
+
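+A polling sketch, assuming the burst functions follow the ``(dev_id, qp_id, ops,
+nb_ops)`` convention described above and that the ops were previously allocated
+from the op mempool and populated; ``handle_failed_op()`` is a hypothetical
+application helper:
+
+.. code-block:: c
+
+   #define BURST_SIZE 32 /* illustrative burst size */
+
+   struct rte_ml_op *ops[BURST_SIZE];
+   uint16_t nb_enq, nb_deq = 0, i;
+
+   /* Each op carries model_id, nb_batches and input/output segments. */
+   nb_enq = rte_ml_enqueue_burst(dev_id, qp_id, ops, BURST_SIZE);
+
+   /* Poll until every enqueued operation has been returned. */
+   while (nb_deq < nb_enq)
+       nb_deq += rte_ml_dequeue_burst(dev_id, qp_id, &ops[nb_deq],
+                                      nb_enq - nb_deq);
+
+   for (i = 0; i < nb_deq; i++)
+       if (ops[i]->status != RTE_ML_OP_STATUS_SUCCESS)
+           handle_failed_op(ops[i]); /* hypothetical error handler */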
+
+Quantize and Dequantize
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Inference operations performed with lower precision types improve the throughput and
+efficiency of the inference execution, with a minimal loss of accuracy that is within
+tolerance limits. Quantization and dequantization are the processes of converting data
+from a higher precision type to a lower precision type and vice-versa. The ML library
+provides the functions ``rte_ml_io_quantize`` and ``rte_ml_io_dequantize`` to enable
+data type conversions. The user needs to provide the addresses of the quantized and
+dequantized data buffers to the functions, along with the number of batches in the
+buffers.
+
+For quantization, the dequantized data is assumed to be of the type ``dtype`` provided by
+the ``rte_ml_model_info::input`` and the data is converted to ``qtype`` provided by the
+``rte_ml_model_info::input``.
+
+For dequantization, the quantized data is assumed to be of the type ``qtype`` provided by
+the ``rte_ml_model_info::output`` and the data is converted to ``dtype`` provided by the
+``rte_ml_model_info::output``.
+
+The sizes of the buffers required for the input and output can be calculated using
+the functions ``rte_ml_io_input_size_get`` and ``rte_ml_io_output_size_get``. These
+functions return the buffer sizes for both quantized and dequantized data for the
+given number of batches.
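+
+A sketch of the expected conversion flow, assuming the helpers take the device ID,
+model ID and batch count along with the buffer addresses (see ``rte_mldev.h`` for
+the exact prototypes):
+
+.. code-block:: c
+
+   uint64_t qsize, dsize;
+   void *dbuffer, *qbuffer;
+
+   /* Sizes of quantized and dequantized input for nb_batches batches. */
+   rte_ml_io_input_size_get(dev_id, model_id, nb_batches, &qsize, &dsize);
+
+   dbuffer = rte_zmalloc("ml_dbuf", dsize, 0);
+   qbuffer = rte_zmalloc("ml_qbuf", qsize, 0);
+
+   /* ... fill dbuffer with higher precision (dequantized) input ... */
+
+   /* Convert to the lower precision type expected by the model. */
+   rte_ml_io_quantize(dev_id, model_id, nb_batches, dbuffer, qbuffer);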
+
diff --git a/lib/eal/common/eal_common_log.c b/lib/eal/common/eal_common_log.c
index bd7b188ceb..5cb1b15dbe 100644
--- a/lib/eal/common/eal_common_log.c
+++ b/lib/eal/common/eal_common_log.c
@@ -369,6 +369,7 @@ static const struct logtype logtype_strings[] = {
{RTE_LOGTYPE_EFD, "lib.efd"},
{RTE_LOGTYPE_EVENTDEV, "lib.eventdev"},
{RTE_LOGTYPE_GSO, "lib.gso"},
+ {RTE_LOGTYPE_MLDEV, "lib.mldev"},
{RTE_LOGTYPE_USER1, "user1"},
{RTE_LOGTYPE_USER2, "user2"},
{RTE_LOGTYPE_USER3, "user3"},
diff --git a/lib/eal/include/rte_log.h b/lib/eal/include/rte_log.h
index bba5da3d85..df6fada0b1 100644
--- a/lib/eal/include/rte_log.h
+++ b/lib/eal/include/rte_log.h
@@ -48,6 +48,7 @@ extern "C" {
#define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
#define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
#define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
+#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
/* these log types can be used in an application */
#define RTE_LOGTYPE_USER1 24 /**< User-defined log type 1. */
diff --git a/lib/meson.build b/lib/meson.build
index fd55925340..f18b352ec5 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -63,6 +63,7 @@ libraries = [
'flow_classify', # flow_classify lib depends on pkt framework table lib
'graph',
'node',
+ 'mldev'
]
optional_libs = [
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
new file mode 100644
index 0000000000..e378cfca30
--- /dev/null
+++ b/lib/mldev/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2022 Marvell.
+
+sources = files(
+ 'rte_mldev.c',
+)
+
+headers = files(
+ 'rte_mldev.h',
+)
+
+deps += ['mempool']
+
+if get_option('buildtype').contains('debug')
+ cflags += [ '-DRTE_LIBRTE_ML_DEV_DEBUG' ]
+else
+ cflags += [ '-URTE_LIBRTE_ML_DEV_DEBUG' ]
+endif
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
new file mode 100644
index 0000000000..2e3dfa0e6b
--- /dev/null
+++ b/lib/mldev/rte_mldev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <rte_mldev.h>
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
new file mode 100644
index 0000000000..83419fcecd
--- /dev/null
+++ b/lib/mldev/rte_mldev.h
@@ -0,0 +1,1092 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef RTE_MLDEV_H
+#define RTE_MLDEV_H
+
+/**
+ * @file rte_mldev.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * ML (Machine Learning) device API.
+ *
+ * The ML framework is built on the following model:
+ *
+ *
+ * +-----------------+ rte_ml_[en|de]queue_burst()
+ * | | |
+ * | Machine o------+ +--------+ |
+ * | Learning | | | queue | | +------+
+ * | Inference o------+-----o |<===o===>|Core 0|
+ * | Engine | | | pair 0 | +------+
+ * | o----+ | +--------+
+ * | | | |
+ * +-----------------+ | | +--------+
+ * ^ | | | queue | +------+
+ * | | +-----o |<=======>|Core 1|
+ * | | | pair 1 | +------+
+ * | | +--------+
+ * +--------+--------+ |
+ * | +-------------+ | | +--------+
+ * | | Model 0 | | | | queue | +------+
+ * | +-------------+ | +-------o |<=======>|Core N|
+ * | +-------------+ | | pair N | +------+
+ * | | Model 1 | | +--------+
+ * | +-------------+ |
+ * | +-------------+ |<------> rte_ml_model_load()
+ * | | Model .. | |-------> rte_ml_model_info_get()
+ * | +-------------+ |<------- rte_ml_model_start()
+ * | +-------------+ |<------- rte_ml_model_stop()
+ * | | Model N | |<------- rte_ml_model_params_update()
+ * | +-------------+ |<------- rte_ml_model_unload()
+ * +-----------------+
+ *
+ * ML Device: A hardware or software-based implementation of ML device API for
+ * running inferences using a pre-trained ML model.
+ *
+ * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
+ * the procedure/algorithm and the data/pattern required to make predictions on live data.
+ * Once the model is created and trained outside of the DPDK scope, the model can be loaded
+ * via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+ * rte_ml_model_params_update() can be used to update the model parameters such as weights
+ * and bias without unloading the model via rte_ml_model_unload().
+ *
+ * ML Inference: ML inference is the process of feeding data to the model via the
+ * rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
+ * outputs/predictions from the started model.
+ *
+ * In all functions of the ML device API, the ML device is designated by an
+ * integer >= 0 named the device identifier *dev_id*.
+ *
+ * The functions exported by the ML device API to setup a device designated by
+ * its device identifier must be invoked in the following order:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_dev_start()
+ *
+ * A model is required to run inference operations with the user-specified inputs.
+ * The application needs to invoke the ML model APIs in the following order before
+ * queueing inference jobs.
+ *
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ *
+ * A model can be loaded on a device only after the device has been configured and can be
+ * started or stopped only after a device has been started.
+ *
+ * The rte_ml_model_info_get() API is provided to retrieve the information related to the model.
+ * The information would include the shape and type of input and output required for the inference.
+ *
+ * Data quantization and dequantization is one of the main aspects in ML domain. This involves
+ * conversion of input data from a higher precision to a lower precision data type and vice-versa
+ * for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
+ * dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
+ * and output buffers holding data for multiple batches.
+ *
+ * Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used
+ * to get the size of quantized and de-quantized multi-batch input and output buffers.
+ *
+ * User can optionally update the model parameters with rte_ml_model_params_update() after
+ * invoking rte_ml_model_stop() API on a given model ID.
+ *
+ * The application can invoke, in any order, the functions exported by the ML API to enqueue
+ * inference jobs and dequeue inference responses.
+ *
+ * If the application wants to change the device configuration (i.e., call
+ * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
+ * device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated, then
+ * the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
+ * for the given model. The application does not need to call the rte_ml_dev_stop() API for
+ * any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload(), etc.
+ *
+ * Once the device is in the started state after invoking the rte_ml_dev_start() API, and the
+ * model is in the started state after invoking the rte_ml_model_start() API, the application can
+ * call rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the destined device and model ID.
+ *
+ * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
+ *
+ * Typical application utilisation of the ML API will follow this programming flow:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ * - rte_ml_model_info_get()
+ * - rte_ml_dev_start()
+ * - rte_ml_enqueue_burst()
+ * - rte_ml_dequeue_burst()
+ * - rte_ml_model_stop()
+ * - rte_ml_model_unload()
+ * - rte_ml_dev_stop()
+ * - rte_ml_dev_close()
+ *
+ * Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on different logical
+ * cores on the same target object. For instance, the dequeue function of a poll mode driver cannot
+ * be invoked in parallel on two logical cores to operate on the same queue pair. Of course, this
+ * function can be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the user application to enforce this rule.
+ */
+
+#include <rte_common.h>
+#include <rte_mempool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_ML_STR_MAX 128
+/**< Maximum length of name string */
+
+/* Device operations */
+
+/**
+ * Get the total number of ML devices that have been successfully initialised.
+ *
+ * @return
+ * - The total number of usable ML devices.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dev_count(void);
+
+/**
+ * Check if the device is in ready state.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 if the device is not in a ready state.
+ * - 1 if the device is in a ready state.
+ */
+__rte_experimental
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id);
+
+/**
+ * Return the NUMA socket to which a device is connected.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - The NUMA socket id to which the device is connected.
+ * - 0 if the socket could not be determined.
+ * - -EINVAL: if the dev_id value is not valid.
+ */
+__rte_experimental
+int
+rte_ml_dev_socket_id(int16_t dev_id);
+
+/** ML device information */
+struct rte_ml_dev_info {
+ const char *driver_name;
+ /**< Driver name */
+ int16_t max_models;
+ /**< Maximum number of models supported by the device.
+ * @see struct rte_ml_dev_config::nb_models
+ */
+ uint16_t max_queue_pairs;
+ /**< Maximum number of queue pairs supported by the device.
+ * @see struct rte_ml_dev_config::nb_queue_pairs
+ */
+ uint16_t max_desc;
+ /**< Maximum number of descriptors per queue pair allowed by the device.
+ * @see struct rte_ml_dev_qp_conf::nb_desc
+ */
+ uint16_t max_segments;
+ /**< Maximum number of scatter-gather entries supported by the device.
+ * @see struct rte_ml_buff_seg struct rte_ml_buff_seg::next
+ */
+ uint16_t min_align_size;
+ /**< Minimum alignment size of IO buffers used by the device. */
+};
+
+/**
+ * Retrieve the information of the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param dev_info
+ * A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
+ *
+ * @return
+ * - 0: Success, driver updates the information of the ML device
+ * - < 0: Error code returned by the driver info get function.
+ */
+__rte_experimental
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);
+
+/** ML device configuration structure */
+struct rte_ml_dev_config {
+ int socket_id;
+ /**< Socket to allocate resources on. */
+ int16_t nb_models;
+ /**< Number of models to be loaded on the device.
+ * This value cannot exceed the max_models which is previously provided in
+ * struct rte_ml_dev_info::max_models
+ */
+ uint16_t nb_queue_pairs;
+ /**< Number of queue pairs to configure on this device.
+ * This value cannot exceed the max_queue_pairs which is previously provided in
+ * struct rte_ml_dev_info::max_queue_pairs
+ */
+};
+
+/**
+ * Configure an ML device.
+ *
+ * This function must be invoked first before any other function in the API.
+ *
+ * An ML device can be re-configured when in a stopped state. A device cannot be re-configured after
+ * rte_ml_dev_close() is called.
+ *
+ * The caller may use rte_ml_dev_info_get() to get the capability of each resource available for
+ * this ML device.
+ *
+ * @param dev_id
+ * The identifier of the device to configure.
+ * @param config
+ * The ML device configuration structure.
+ *
+ * @return
+ * - 0: Success, device configured.
+ * - < 0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
+
+/* Forward declaration */
+struct rte_ml_op;
+
+/**< Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
+typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);
+
+/** ML device queue pair configuration structure. */
+struct rte_ml_dev_qp_conf {
+ uint32_t nb_desc;
+ /**< Number of descriptors per queue pair.
+ * This value cannot exceed the max_desc which is previously provided in
+ * struct rte_ml_dev_info::max_desc
+ */
+ rte_ml_dev_stop_flush_t cb;
+ /**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
+ * Value NULL is allowed, in which case callback will not be invoked.
+ * This function can be used to properly dispose of outstanding ML ops from all
+ * queue pairs, for example ops containing memory pointers.
+ * @see rte_ml_dev_stop()
+ */
+};
+
+/**
+ * Set up a queue pair for a device. This should only be called when the device is stopped.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param queue_pair_id
+ * The index of the queue pair to set up. The value must be in the range [0, nb_queue_pairs - 1]
+ * previously supplied to rte_ml_dev_configure().
+ * @param qp_conf
+ * The pointer to the configuration data to be used for the queue pair.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in case of NUMA.
+ * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
+ * for the queue pair.
+ *
+ * @return
+ * - 0: Success, queue pair correctly set up.
+ * - < 0: Queue pair configuration failed.
+ */
+__rte_experimental
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);
+
+/**
+ * Start an ML device.
+ *
+ * The device start step consists of setting the configured features and enabling the ML device
+ * to accept inference jobs.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device started.
+ * - <0: Error code of the driver device start function.
+ */
+__rte_experimental
+int
+rte_ml_dev_start(int16_t dev_id);
+
+/**
+ * Stop an ML device. A stopped device cannot accept inference jobs.
+ * The device can be restarted with a call to rte_ml_dev_start().
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device stopped.
+ * - <0: Error code of the driver device stop function.
+ */
+__rte_experimental
+int
+rte_ml_dev_stop(int16_t dev_id);
+
+/**
+ * Close an ML device. The device cannot be restarted!
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 on successfully closing device.
+ * - <0 on failure to close device.
+ */
+__rte_experimental
+int
+rte_ml_dev_close(int16_t dev_id);
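/*
 * Editorial sketch (not part of the patch): orderly teardown. Per the
 * comments above, a device must be stopped before it can be closed and
 * cannot be restarted after rte_ml_dev_close().
 */
#include <rte_mldev.h>

static void
shutdown_mldev(int16_t dev_id)
{
	rte_ml_dev_stop(dev_id);  /* stop accepting inference jobs */
	rte_ml_dev_close(dev_id); /* release the device for good */
}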
+
+/** Status of ML operation */
+enum rte_ml_op_status {
+ RTE_ML_OP_STATUS_SUCCESS = 0,
+ /**< Operation completed successfully */
+ RTE_ML_OP_STATUS_NOT_PROCESSED,
+ /**< Operation has not yet been processed by the device. */
+ RTE_ML_OP_STATUS_ERROR,
+ /**< Operation completed with error.
+ * Applications can invoke rte_ml_op_error_get() to get the PMD specific
+ * error code if needed.
+ */
+};
+
+/** Representation of an ML operation's input and output buffers as a scatter-gather list
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
+
+/**
+ * ML Operation.
+ *
+ * This structure contains data related to performing an ML operation on the buffers using
+ * the model specified through model_id.
+ */
+struct rte_ml_op {
+ int16_t model_id;
+ /**< Model ID to be used for the operation. */
+ uint16_t nb_batches;
+ /**< Number of batches. Minimum value must be one.
+ * The input buffer must hold inference data for each batch contiguously.
+ */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_mempool *mempool;
+ /**< Pool from which operation is allocated. */
+ struct rte_ml_buff_seg input;
+ /**< Input buffer to hold the inference data. */
+ struct rte_ml_buff_seg output;
+ /**< Output buffer to hold the inference output by the driver. */
+ RTE_STD_C11
+ union {
+ uint64_t user_u64;
+ /**< User data as uint64_t.*/
+ void *user_ptr;
+ /**< User data as void*.*/
+ };
+ enum rte_ml_op_status status;
+ /**< Operation status. */
+ uint64_t impl_opaque;
+ /**< Implementation specific opaque value.
+ * An implementation may use this field to hold
+ * implementation specific value to share between
+ * dequeue and enqueue operations.
+ * The application should not modify this field.
+ */
+} __rte_cache_aligned;
+
+/* Enqueue/Dequeue operations */
+
+/**
+ * Enqueue a burst of ML inferences for processing on an ML device.
+ *
+ * The rte_ml_enqueue_burst() function is invoked to place ML inference
+ * operations on the queue *qp_id* of the device designated by its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of inferences to process which are
+ * supplied in the *ops* array of *rte_ml_op* structures.
+ *
+ * The rte_ml_enqueue_burst() function returns the number of inferences it
+ * actually enqueued for processing. A return value equal to *nb_ops* means that
+ * all operations have been enqueued.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair which inferences are to be enqueued for processing.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * *rte_ml_dev_configure*.
+ * @param ops
+ * The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
+ * ML inferences to be processed.
+ * @param nb_ops
+ * The number of operations to process.
+ *
+ * @return
+ * The number of inference operations actually enqueued to the ML device.
+ * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
+ * is full or if invalid parameters are specified in a *rte_ml_op*.
+ */
+__rte_experimental
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed ML inference operations from a queue on the ML device.
+ * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
+ * in the *ops* array.
+ *
+ * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
+ * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops*
+ * operations, and this is likely to signify that other processed operations remain in the
+ * device's output queue. Applications implementing a "retrieve as many processed operations as
+ * possible" policy can check this specific case and keep invoking the rte_ml_dequeue_burst()
+ * function until a value less than *nb_ops* is returned.
+ *
+ * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
+ * the corresponding overhead.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair from which to retrieve processed operations.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * rte_ml_dev_configure().
+ * @param ops
+ * The address of an array of pointers to *rte_ml_op* structures that must be large enough to
+ * store *nb_ops* pointers in it.
+ * @param nb_ops
+ * The maximum number of inferences to dequeue.
+ *
+ * @return
+ * The number of operations actually dequeued, which is the number of pointers
+ * to *rte_ml_op* structures effectively supplied to the *ops* array.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
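/*
 * Editorial sketch (not part of the patch): a simple synchronous
 * enqueue/poll loop for a single op. The op is assumed to be fully
 * populated (model_id, input/output buffers) by the caller.
 */
#include <rte_mldev.h>

static int
run_inference(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op)
{
	struct rte_ml_op *done;

	while (rte_ml_enqueue_burst(dev_id, qp_id, &op, 1) != 1)
		; /* queue full, retry */

	while (rte_ml_dequeue_burst(dev_id, qp_id, &done, 1) != 1)
		; /* poll until the op completes */

	return done->status == RTE_ML_OP_STATUS_SUCCESS ? 0 : -1;
}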
+
+/**
+ * Verbose error structure definition.
+ */
+struct rte_ml_op_error {
+ char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
+ uint64_t errcode; /**< Vendor specific error code. */
+};
+
+/**
+ * Get PMD specific error information for an ML op.
+ *
+ * When an ML operation completes with RTE_ML_OP_STATUS_ERROR status,
+ * this API can be used to get PMD specific error details.
+ *
+ * @param[in] dev_id
+ * Device identifier
+ * @param[in] op
+ * Handle of ML operation
+ * @param[out] error
+ * Address of structure rte_ml_op_error to be filled
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
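/*
 * Editorial sketch (not part of the patch): report PMD-specific error
 * details for an op that completed with RTE_ML_OP_STATUS_ERROR.
 */
#include <inttypes.h>
#include <stdio.h>
#include <rte_mldev.h>

static void
report_op_error(int16_t dev_id, struct rte_ml_op *op)
{
	struct rte_ml_op_error err;

	if (op->status != RTE_ML_OP_STATUS_ERROR)
		return;
	if (rte_ml_op_error_get(dev_id, op, &err) == 0)
		printf("op failed: %s (errcode 0x%" PRIx64 ")\n",
		       err.message, err.errcode);
}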
+
+/* Statistics operations */
+
+/** Device statistics. */
+struct rte_ml_dev_stats {
+ uint64_t enqueued_count;
+ /**< Count of all operations enqueued */
+ uint64_t dequeued_count;
+ /**< Count of all operations dequeued */
+ uint64_t enqueue_err_count;
+ /**< Total error count on operations enqueued */
+ uint64_t dequeue_err_count;
+ /**< Total error count on operations dequeued */
+};
+
+/**
+ * Retrieve the general I/O statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stats
+ * Pointer to structure to where statistics will be copied.
+ * On error, this location may or may not have been modified.
+ * @return
+ * - 0 on success
+ * - -EINVAL: If invalid parameter pointer is provided.
+ */
+__rte_experimental
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
+
+/**
+ * Reset the statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ */
+__rte_experimental
+void
+rte_ml_dev_stats_reset(int16_t dev_id);
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers for extended ML device statistics.
+ */
+struct rte_ml_dev_xstats_map {
+ uint16_t id;
+ /**< xstat identifier */
+ char name[RTE_ML_STR_MAX];
+ /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param[out] xstats_map
+ * Block of memory to insert ids and names into. Must have capacity for at least *size* entries.
+ * If set to NULL, the function returns the required capacity.
+ * @param size
+ * Capacity of xstats_map (number of name-id maps).
+ *
+ * @return
+ * - Positive value on success:
+ * - The return value is the number of entries filled in the stats map.
+ * - If xstats_map is NULL, the required capacity for xstats_map.
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map,
+ uint32_t size);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param name
+ * The stat name to retrieve.
+ * @param stat_id
+ * If non-NULL, the numerical id of the stat will be returned, so that further requests for
+ * the stat can be made using rte_ml_dev_xstats_get(), which will be faster as it avoids
+ * scanning the list of names for the stat.
+ * @param[out] value
+ * Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ * - 0: Successfully retrieved xstat value.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
+
+/**
+ * Retrieve extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * The id numbers of the stats to get. The ids can be fetched from the stat position in the
+ * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
+ * @param values
+ * Array to hold the values of the stats requested by ID.
+ * @param nb_ids
+ * The number of stats requested.
+ * @return
+ * - Positive value: number of stat entries filled into the values array
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
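/*
 * Editorial sketch (not part of the patch): enumerate and print all
 * xstats of a device, using the NULL query to size the map first.
 */
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <rte_mldev.h>

static void
dump_xstats(int16_t dev_id)
{
	struct rte_ml_dev_xstats_map *map;
	uint64_t value;
	int n, i;

	n = rte_ml_dev_xstats_names_get(dev_id, NULL, 0); /* required capacity */
	if (n <= 0)
		return;
	map = malloc(n * sizeof(*map));
	if (map == NULL)
		return;
	n = rte_ml_dev_xstats_names_get(dev_id, map, n);
	for (i = 0; i < n; i++)
		if (rte_ml_dev_xstats_get(dev_id, &map[i].id, &value, 1) == 1)
			printf("%s: %" PRIu64 "\n", map[i].name, value);
	free(map);
}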
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * Selects specific statistics to be reset. When NULL, all statistics will be reset.
+ * If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ * The number of ids in the *stat_ids* array. Ignored when stat_ids is NULL.
+ * @return
+ * - 0: Successfully reset the statistics to zero.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
+
+/* Utility operations */
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *fd*.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param fd
+ * A pointer to a file for output.
+ * @return
+ * - 0: on success.
+ * - <0: on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd);
+
+/**
+ * Trigger the ML device self test.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @return
+ * - 0: Selftest successful.
+ * - -ENOTSUP: if the device doesn't support selftest.
+ * - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_selftest(int16_t dev_id);
+
+/* Model operations */
+
+/** ML model load parameters
+ *
+ * Parameters required to load an ML model.
+ */
+struct rte_ml_model_params {
+ void *addr;
+ /**< Address of model buffer */
+ size_t size;
+ /**< Size of model buffer */
+};
+
+/**
+ * Load an ML model to the device.
+ *
+ * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] params
+ * Parameters for the model to be loaded.
+ * @param[out] model_id
+ * Identifier of the model loaded.
+ *
+ * @return
+ * - 0: Success, Model loaded.
+ * - < 0: Failure, Error code of the model load driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id);
+
+/**
+ * Unload an ML model from the device.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be unloaded.
+ *
+ * @return
+ * - 0: Success, Model unloaded.
+ * - < 0: Failure, Error code of the model unload driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_unload(int16_t dev_id, int16_t model_id);
+
+/**
+ * Start an ML model for the given device ID.
+ *
+ * Start an ML model to accept inference requests.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be started.
+ *
+ * @return
+ * - 0: Success, Model started.
+ * - < 0: Failure, Error code of the model start driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_start(int16_t dev_id, int16_t model_id);
+
+/**
+ * Stop an ML model for the given device ID.
+ *
+ * Stopping a model disables it from accepting further inference jobs.
+ * All inference jobs must have been completed before model stop is attempted.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be stopped.
+ *
+ * @return
+ * - 0: Success, Model stopped.
+ * - < 0: Failure, Error code of the model stop driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_stop(int16_t dev_id, int16_t model_id);
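/*
 * Editorial sketch (not part of the patch): full model lifecycle using
 * the load/start/stop/unload APIs above. buf/len hold a model binary
 * obtained elsewhere (e.g. read from a file); inference is elided.
 */
#include <stddef.h>
#include <rte_mldev.h>

static int
run_model(int16_t dev_id, void *buf, size_t len)
{
	struct rte_ml_model_params params = { .addr = buf, .size = len };
	int16_t model_id;
	int ret;

	ret = rte_ml_model_load(dev_id, &params, &model_id);
	if (ret != 0)
		return ret;

	ret = rte_ml_model_start(dev_id, model_id);
	if (ret != 0) {
		rte_ml_model_unload(dev_id, model_id);
		return ret;
	}

	/* ... enqueue ops referencing model_id, wait for completion ... */

	rte_ml_model_stop(dev_id, model_id);
	return rte_ml_model_unload(dev_id, model_id);
}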
+
+/**
+ * Input and output data types. ML models can operate on reduced precision
+ * datatypes to achieve better power efficiency, lower network latency and lower memory footprint.
+ * This enum is used to represent the lower precision integer and floating point types used
+ * by ML models.
+ */
+enum rte_ml_io_type {
+ RTE_ML_IO_TYPE_UNKNOWN = 0,
+ /**< Invalid or unknown type */
+ RTE_ML_IO_TYPE_INT8,
+ /**< 8-bit integer */
+ RTE_ML_IO_TYPE_UINT8,
+ /**< 8-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT16,
+ /**< 16-bit integer */
+ RTE_ML_IO_TYPE_UINT16,
+ /**< 16-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT32,
+ /**< 32-bit integer */
+ RTE_ML_IO_TYPE_UINT32,
+ /**< 32-bit unsigned integer */
+ RTE_ML_IO_TYPE_FP8,
+ /**< 8-bit floating point number */
+ RTE_ML_IO_TYPE_FP16,
+ /**< IEEE 754 16-bit floating point number */
+ RTE_ML_IO_TYPE_FP32,
+ /**< IEEE 754 32-bit floating point number */
+ RTE_ML_IO_TYPE_BFLOAT16
+ /**< 16-bit brain floating point number. */
+};
+
+/**
+ * Input and output format. This is used to represent the encoding type of the
+ * multi-dimensional data used by ML models.
+ */
+enum rte_ml_io_format {
+ RTE_ML_IO_FORMAT_NCHW = 1,
+ /**< Batch size (N) x channels (C) x height (H) x width (W) */
+ RTE_ML_IO_FORMAT_NHWC,
+ /**< Batch size (N) x height (H) x width (W) x channels (C) */
+ RTE_ML_IO_FORMAT_CHWN,
+ /**< Channels (C) x height (H) x width (W) x batch size (N) */
+ RTE_ML_IO_FORMAT_3D,
+ /**< Format to represent 3-dimensional data */
+ RTE_ML_IO_FORMAT_2D,
+ /**< Format to represent matrix data */
+ RTE_ML_IO_FORMAT_1D,
+ /**< Format to represent vector data */
+ RTE_ML_IO_FORMAT_SCALAR,
+ /**< Format to represent scalar data */
+};
+
+/**
+ * Input and output shape. This structure represents the encoding format and dimensions
+ * of the tensor or vector.
+ *
+ * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions used
+ * for the data depends on the format. Unused dimensions must be set to 1.
+ */
+struct rte_ml_io_shape {
+ enum rte_ml_io_format format;
+ /**< Format of the data */
+ uint32_t w;
+ /**< First dimension */
+ uint32_t x;
+ /**< Second dimension */
+ uint32_t y;
+ /**< Third dimension */
+ uint32_t z;
+ /**< Fourth dimension */
+};
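/*
 * Editorial sketch (not part of the patch): describing one 224x224 RGB
 * image in NHWC layout, assuming the w/x/y/z fields map to the format
 * letters in order (N, H, W, C); unused dimensions would be set to 1.
 */
#include <rte_mldev.h>

static const struct rte_ml_io_shape image_shape = {
	.format = RTE_ML_IO_FORMAT_NHWC,
	.w = 1,   /* N: batch size */
	.x = 224, /* H: height */
	.y = 224, /* W: width */
	.z = 3,   /* C: channels */
};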
+
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
+
+/** Model information structure */
+struct rte_ml_model_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Model name. */
+ char version[RTE_ML_STR_MAX];
+ /**< Model version */
+ int16_t model_id;
+ /**< Model ID */
+ uint16_t device_id;
+ /**< Device ID */
+ uint16_t batch_size;
+ /**< Maximum number of batches that the model can process simultaneously */
+ uint32_t nb_inputs;
+ /**< Number of inputs */
+ const struct rte_ml_io_info *input_info;
+ /**< Input info array. Array size is equal to nb_inputs */
+ uint32_t nb_outputs;
+ /**< Number of outputs */
+ const struct rte_ml_io_info *output_info;
+ /**< Output info array. Array size is equal to nb_outputs */
+ uint64_t wb_size;
+ /**< Size of model weights and bias */
+};
+
+/**
+ * Get ML model information.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[out] model_info
+ * Pointer to a model info structure
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_info_get(int16_t dev_id, int16_t model_id, struct rte_ml_model_info *model_info);
+
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
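/*
 * Editorial sketch (not part of the patch): swap in new weights without
 * a full reload. new_wb points to wb_size bytes, as reported by
 * rte_ml_model_info_get(); the model must be stopped before the update.
 */
#include <rte_mldev.h>

static int
swap_model_params(int16_t dev_id, int16_t model_id, void *new_wb)
{
	int ret;

	ret = rte_ml_model_stop(dev_id, model_id); /* required before update */
	if (ret != 0)
		return ret;

	ret = rte_ml_model_params_update(dev_id, model_id, new_wb);
	if (ret != 0)
		return ret;

	return rte_ml_model_start(dev_id, model_id);
}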
+
+/* IO operations */
+
+/**
+ * Get size of quantized and dequantized input buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized input data.
+ * This API returns the buffer sizes for the number of batches provided, accounting for the
+ * alignment requirements of the PMD. Input sizes computed by this API can
+ * be used by the application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] input_qsize
+ * Quantized input size pointer.
+ * NULL value is allowed, in which case input_qsize is not calculated by the driver.
+ * @param[out] input_dsize
+ * Dequantized input size pointer.
+ * NULL value is allowed, in which case input_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_input_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize);
+
+/**
+ * Get size of quantized and dequantized output buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized output data.
+ * This API returns the buffer sizes for the number of batches provided, accounting for the
+ * alignment requirements of the PMD. Output sizes computed by this API can be used by the
+ * application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] output_qsize
+ * Quantized output size pointer.
+ * NULL value is allowed, in which case output_qsize is not calculated by the driver.
+ * @param[out] output_dsize
+ * Dequantized output size pointer.
+ * NULL value is allowed, in which case output_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_output_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize);
+
+/**
+ * Quantize input data.
+ *
+ * Quantization converts data from higher precision types to lower precision types to improve
+ * the throughput and efficiency of the model execution with minimal loss of accuracy.
+ * Types of dequantized data and quantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized input buffer
+ * @param[in] dbuffer
+ * Address of dequantized input data
+ * @param[out] qbuffer
+ * Address of quantized input data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_quantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer);
+
+/**
+ * Dequantize output data.
+ *
+ * Dequantization converts data from a lower precision type to a higher precision type.
+ * Types of quantized and dequantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized output buffer
+ * @param[in] qbuffer
+ * Address of quantized output data
+ * @param[out] dbuffer
+ * Address of dequantized output data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_dequantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer);
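/*
 * Editorial sketch (not part of the patch): size and allocate the
 * quantized input buffer, then quantize one batch. raw_input is
 * assumed to hold input_dsize bytes of dequantized data.
 */
#include <rte_malloc.h>
#include <rte_mldev.h>

static void *
prepare_quantized_input(int16_t dev_id, int16_t model_id, void *raw_input)
{
	uint64_t qsize, dsize;
	void *qbuf;

	if (rte_ml_io_input_size_get(dev_id, model_id, 1, &qsize, &dsize) != 0)
		return NULL;

	qbuf = rte_malloc(NULL, qsize, 0); /* PMD-aligned size from the API */
	if (qbuf == NULL)
		return NULL;

	if (rte_ml_io_quantize(dev_id, model_id, 1, raw_input, qbuf) != 0) {
		rte_free(qbuf);
		return NULL;
	}
	return qbuf;
}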
+
+/* ML op pool operations */
+
+/**
+ * Create an ML operation pool
+ *
+ * @param name
+ * ML operations pool name
+ * @param nb_elts
+ * Number of elements in pool
+ * @param cache_size
+ * Number of elements to cache on lcore, see
+ * *rte_mempool_create* for further details about cache size
+ * @param user_size
+ * Size of private data to allocate for user with each operation
+ * @param socket_id
+ * Socket identifier to allocate memory on
+ * @return
+ * - On success pointer to mempool
+ * - On failure NULL
+ */
+__rte_experimental
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id);
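/*
 * Editorial sketch (not part of the patch): create an op pool and draw
 * one op from it. The pool sizes are hypothetical; completed ops are
 * returned with rte_mempool_put().
 */
#include <rte_lcore.h>
#include <rte_mempool.h>
#include <rte_mldev.h>

static struct rte_ml_op *
get_one_op(void)
{
	static struct rte_mempool *pool;
	struct rte_ml_op *op;

	if (pool == NULL) {
		pool = rte_ml_op_pool_create("ml_op_pool", 1024, 64, 0,
					     rte_socket_id());
		if (pool == NULL)
			return NULL;
	}

	if (rte_mempool_get(pool, (void **)&op) != 0)
		return NULL;
	return op; /* caller fills model_id, input/output, then enqueues */
}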
+
+/**
+ * Free an ML operation pool
+ *
+ * @param mempool
+ * A pointer to the mempool structure.
+ * If NULL, the function does nothing.
+ */
+__rte_experimental
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MLDEV_H */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
new file mode 100644
index 0000000000..33c1b976f1
--- /dev/null
+++ b/lib/mldev/version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+ local: *;
+};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 02/12] mldev: add PMD functions for ML device
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 03/12] mldev: support device handling functions jerinj
` (11 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi, Anatoly Burakov
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, asekhar, pbhagavatula,
eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added PMD functions to handle ML devices. The rte_mldev_pmd.*
files are for drivers only, should be private to DPDK, and
are not installed for application use.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/meson.build | 9 +++
lib/mldev/rte_mldev.c | 128 +++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 115 ++++++++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.c | 61 +++++++++++++++
lib/mldev/rte_mldev_pmd.h | 149 +++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 11 +++
6 files changed, 473 insertions(+)
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
index e378cfca30..5c99532c1a 100644
--- a/lib/mldev/meson.build
+++ b/lib/mldev/meson.build
@@ -2,6 +2,7 @@
# Copyright (c) 2022 Marvell.
sources = files(
+ 'rte_mldev_pmd.c',
'rte_mldev.c',
)
@@ -9,6 +10,14 @@ headers = files(
'rte_mldev.h',
)
+indirect_headers += files(
+ 'rte_mldev_core.h',
+)
+
+driver_sdk_headers += files(
+ 'rte_mldev_pmd.h',
+)
+
deps += ['mempool']
if get_option('buildtype').contains('debug')
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 2e3dfa0e6b..a4e0b5f94f 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -3,3 +3,131 @@
*/
#include <rte_mldev.h>
+#include <rte_mldev_pmd.h>
+
+static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
+
+static struct rte_ml_dev_global ml_dev_globals = {
+ .devs = ml_devices, .data = {NULL}, .nb_devs = 0, .max_devs = RTE_MLDEV_MAX_DEVS};
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id)
+{
+ return &ml_dev_globals.devs[dev_id];
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name)
+{
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ if (name == NULL)
+ return NULL;
+
+ for (dev_id = 0; dev_id < RTE_MLDEV_MAX_DEVS; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if ((dev->attached == ML_DEV_ATTACHED) && (strcmp(dev->data->name, name) == 0))
+ return dev;
+ }
+
+ return NULL;
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ if (rte_ml_dev_pmd_get_named_dev(name) != NULL) {
+ ML_DEV_LOG(ERR, "ML device with name %s already allocated!", name);
+ return NULL;
+ }
+
+ /* Get a free device ID */
+ for (dev_id = 0; dev_id < RTE_MLDEV_MAX_DEVS; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached == ML_DEV_DETACHED)
+ break;
+ }
+
+ if (dev_id == RTE_MLDEV_MAX_DEVS) {
+ ML_DEV_LOG(ERR, "Reached maximum number of ML devices");
+ return NULL;
+ }
+
+ if (dev->data == NULL) {
+ /* Reserve memzone name */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ml_dev_data), socket_id,
+ 0);
+ ML_DEV_LOG(DEBUG, "PRIMARY: reserved memzone for %s (%p)", mz_name, mz);
+ } else {
+ mz = rte_memzone_lookup(mz_name);
+ ML_DEV_LOG(DEBUG, "SECONDARY: looked up memzone for %s (%p)", mz_name, mz);
+ }
+
+ if (mz == NULL)
+ return NULL;
+
+ ml_dev_globals.data[dev_id] = mz->addr;
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ memset(ml_dev_globals.data[dev_id], 0, sizeof(struct rte_ml_dev_data));
+
+ dev->data = ml_dev_globals.data[dev_id];
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ strlcpy(dev->data->name, name, RTE_ML_STR_MAX);
+ dev->data->dev_id = dev_id;
+ dev->data->socket_id = socket_id;
+ dev->data->dev_started = 0;
+ ML_DEV_LOG(DEBUG, "PRIMARY: init mldev data");
+ }
+
+ ML_DEV_LOG(DEBUG, "Data for %s: dev_id %d, socket %u", dev->data->name,
+ dev->data->dev_id, dev->data->socket_id);
+
+ dev->attached = ML_DEV_ATTACHED;
+ ml_dev_globals.nb_devs++;
+ }
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ int16_t dev_id;
+ int ret = 0;
+
+ if (dev == NULL)
+ return -EINVAL;
+
+ dev_id = dev->data->dev_id;
+
+ /* Memzone lookup */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ mz = rte_memzone_lookup(mz_name);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ RTE_ASSERT(ml_dev_globals.data[dev_id] == mz->addr);
+ ml_dev_globals.data[dev_id] = NULL;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ ML_DEV_LOG(DEBUG, "PRIMARY: free memzone of %s (%p)", mz_name, mz);
+ ret = rte_memzone_free(mz);
+ } else {
+ ML_DEV_LOG(DEBUG, "SECONDARY: don't free memzone of %s (%p)", mz_name, mz);
+ }
+
+ dev->attached = ML_DEV_DETACHED;
+ ml_dev_globals.nb_devs--;
+
+ return ret;
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
new file mode 100644
index 0000000000..b5cb69c5fb
--- /dev/null
+++ b/lib/mldev/rte_mldev_core.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_INTERNAL_H_
+#define _RTE_MLDEV_INTERNAL_H_
+
+/**
+ * @file
+ *
+ * MLDEV internal header
+ *
+ * This file contains MLDEV private data structures and macros.
+ *
+ * @note
+ * These APIs are for MLDEV PMDs and library only.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <dev_driver.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_mldev.h>
+
+/* Logging Macro */
+#define ML_DEV_LOG(level, fmt, args...) \
+ rte_log(RTE_LOG_##level, RTE_LOGTYPE_MLDEV, "%s(): " fmt "\n", __func__, ##args)
+
+/* Device state */
+#define ML_DEV_DETACHED (0)
+#define ML_DEV_ATTACHED (1)
+
+/**
+ * @internal
+ *
+ * The data part, with no function pointers, associated with each device. This structure is safe to
+ * place in shared memory to be common among different processes in a multi-process configuration.
+ */
+struct rte_ml_dev_data {
+ /** Unique identifier name. */
+ char name[RTE_ML_STR_MAX];
+
+ /** Device ID for this instance. */
+ int16_t dev_id;
+
+ /** Socket ID where memory is allocated. */
+ int16_t socket_id;
+
+ /** Device state: STOPPED(0) / STARTED(1) */
+ __extension__ uint8_t dev_started : 1;
+
+ /** Number of device queue pairs. */
+ uint16_t nb_queue_pairs;
+
+ /** Number of ML models. */
+ uint16_t nb_models;
+
+ /** Array of pointers to queue pairs. */
+ void **queue_pairs;
+
+ /** Array of pointers to ML models. */
+ void **models;
+
+ /** PMD-specific private data. */
+ void *dev_private;
+
+ /** Reserved for future fields */
+ uint64_t reserved[3];
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ *
+ * The data structure associated with each ML device.
+ */
+struct rte_ml_dev {
+ /** Pointer to device data. */
+ struct rte_ml_dev_data *data;
+
+ /** Backing RTE device. */
+ struct rte_device *device;
+
+ /** Flag indicating the device is attached. */
+ __extension__ uint8_t attached : 1;
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ *
+ * Global structure used for maintaining state of allocated ML devices.
+ */
+struct rte_ml_dev_global {
+ /** Device information array. */
+ struct rte_ml_dev *devs;
+
+ /** Device private data array. */
+ struct rte_ml_dev_data *data[RTE_MLDEV_MAX_DEVS];
+
+ /** Number of devices found. */
+ uint8_t nb_devs;
+
+ /** Maximum number of devices. */
+ uint8_t max_devs;
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_INTERNAL_H_ */
diff --git a/lib/mldev/rte_mldev_pmd.c b/lib/mldev/rte_mldev_pmd.c
new file mode 100644
index 0000000000..796432f1f8
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.c
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <dev_driver.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_mldev_pmd.h"
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params)
+{
+ struct rte_ml_dev *dev;
+
+ ML_DEV_LOG(INFO, "ML device initialisation - name: %s, socket_id: %u", name,
+ params->socket_id);
+
+ /* Allocate device structure */
+ dev = rte_ml_dev_pmd_allocate(name, params->socket_id);
+ if (dev == NULL) {
+ ML_DEV_LOG(ERR, "Failed to allocate ML device for %s", name);
+ return NULL;
+ }
+
+ /* Allocate private device structure */
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ dev->data->dev_private =
+ rte_zmalloc_socket("ml_dev_private", params->private_data_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+
+ if (dev->data->dev_private == NULL) {
+ ML_DEV_LOG(ERR, "Cannot allocate memory for mldev %s private data", name);
+ rte_ml_dev_pmd_release(dev);
+ return NULL;
+ }
+ }
+ dev->device = device;
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *dev)
+{
+ int ret;
+
+ ML_DEV_LOG(INFO, "Releasing ML device - name: %s", dev->device->name);
+ ret = rte_ml_dev_pmd_release(dev);
+ if (ret)
+ return ret;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ rte_free(dev->data->dev_private);
+
+ dev->data = NULL;
+ dev->device = NULL;
+
+ return 0;
+}
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
new file mode 100644
index 0000000000..33544f1b80
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -0,0 +1,149 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_PMD_H_
+#define _RTE_MLDEV_PMD_H_
+
+/**
+ * @file
+ *
+ * RTE MLDEV PMD APIs
+ *
+ * ML Device PMD interface
+ *
+ * @note
+ * These APIs are for MLDEV PMDs only and user applications should not call them directly.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_mldev.h>
+#include <rte_mldev_core.h>
+
+/**
+ * @internal
+ *
+ * Initialisation parameters for ML devices.
+ */
+struct rte_ml_dev_pmd_init_params {
+ /** Socket to use for memory allocation. */
+ uint8_t socket_id;
+
+ /** Size of device private data. */
+ uint64_t private_data_size;
+};
+
+/**
+ * @internal
+ *
+ * Get the ML device pointer for the device. Assumes a valid device index.
+ *
+ * @param dev_id
+ * Device ID value to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the given device ID.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id);
+
+/**
+ * @internal
+ *
+ * Get the rte_ml_dev structure device pointer for the named device.
+ *
+ * @param name
+ * Device name to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the given device ID.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name);
+
+/**
+ * @internal
+ *
+ * Allocates a new mldev slot for an ML device and returns the pointer to that slot for use.
+ * Function for internal use by dummy drivers.
+ *
+ * @param name
+ * Unique identifier name for each device.
+ * @param socket_id
+ * Socket to allocate resources.
+ *
+ * @return
+ * Slot in the rte_ml_dev_devices array for a new device.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id);
+
+/**
+ * @internal
+ *
+ * Release the specified mldev device.
+ *
+ * @param dev
+ * ML device.
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boiler plate code for ML driver to create and allocate resources
+ * for a new ML PMD device instance.
+ *
+ * @param name
+ * ML device name.
+ * @param device
+ * Base device handle.
+ * @param params
+ * PMD initialisation parameters.
+ *
+ * @return
+ * - ML device instance on success.
+ * - NULL on failure.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boiler plate code for ML driver to destroy and free resources
+ * associated with a ML PMD device instance.
+ *
+ * @param mldev
+ * ML device instance.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *mldev);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_PMD_H_ */
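/*
 * Editorial sketch (not part of the patch): how a hypothetical driver
 * might use the PMD assist functions above in its probe callback. The
 * dummy_ml_* names and the private structure are invented for
 * illustration.
 */
#include <errno.h>
#include <dev_driver.h>
#include <rte_mldev_pmd.h>

struct dummy_ml_dev_private {
	int dummy_state; /* driver-private state */
};

static int
dummy_ml_probe(struct rte_device *device)
{
	struct rte_ml_dev_pmd_init_params init = {
		.socket_id = (uint8_t)device->numa_node,
		.private_data_size = sizeof(struct dummy_ml_dev_private),
	};
	struct rte_ml_dev *mldev;

	mldev = rte_ml_dev_pmd_create(device->name, device, &init);
	if (mldev == NULL)
		return -ENODEV;

	return 0;
}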
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 33c1b976f1..82eedfada4 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,3 +1,14 @@
EXPERIMENTAL {
local: *;
};
+
+INTERNAL {
+ global:
+
+ rte_ml_dev_pmd_allocate;
+ rte_ml_dev_pmd_create;
+ rte_ml_dev_pmd_destroy;
+ rte_ml_dev_pmd_get_dev;
+ rte_ml_dev_pmd_get_named_dev;
+ rte_ml_dev_pmd_release;
+};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 03/12] mldev: support device handling functions
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 02/12] mldev: add PMD functions for ML device jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 04/12] mldev: support device queue-pair setup jerinj
` (10 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added device handling APIs. These APIs are used to get device
information, configure, start, stop and close ML devices. Added
function prototypes to the PMD layer, to be implemented by the
ML poll mode drivers.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 175 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 107 +++++++++++++++++++++++
lib/mldev/version.map | 11 +++
3 files changed, 293 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index a4e0b5f94f..651f8b2f7c 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -131,3 +131,178 @@ rte_ml_dev_pmd_release(struct rte_ml_dev *dev)
return ret;
}
+
+uint16_t
+rte_ml_dev_count(void)
+{
+ return ml_dev_globals.nb_devs;
+}
+
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id)
+{
+ struct rte_ml_dev *dev = NULL;
+
+ if (dev_id >= ml_dev_globals.max_devs || ml_devices[dev_id].data == NULL)
+ return 0;
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached != ML_DEV_ATTACHED)
+ return 0;
+ else
+ return 1;
+}
+
+int
+rte_ml_dev_socket_id(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+
+ return dev->data->socket_id;
+}
+
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_info_get == NULL)
+ return -ENOTSUP;
+
+ if (dev_info == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, dev_info cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(dev_info, 0, sizeof(struct rte_ml_dev_info));
+
+ return (*dev->dev_ops->dev_info_get)(dev, dev_info);
+}
+
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config)
+{
+ struct rte_ml_dev_info dev_info;
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_configure == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started) {
+ ML_DEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ if (config == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, config cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ ret = rte_ml_dev_info_get(dev_id, &dev_info);
+ if (ret < 0)
+ return ret;
+
+ if (config->nb_queue_pairs > dev_info.max_queue_pairs) {
+ ML_DEV_LOG(ERR, "Device %d num of queues %u > %u\n", dev_id, config->nb_queue_pairs,
+ dev_info.max_queue_pairs);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_configure)(dev, config);
+}
+
+int
+rte_ml_dev_close(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_close == NULL)
+ return -ENOTSUP;
+
+ /* Device must be stopped before it can be closed */
+ if (dev->data->dev_started == 1) {
+ ML_DEV_LOG(ERR, "Device %d must be stopped before closing", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_close)(dev);
+}
+
+int
+rte_ml_dev_start(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_start == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started != 0) {
+ ML_DEV_LOG(ERR, "Device %d is already started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_start)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 1;
+
+ return ret;
+}
+
+int
+rte_ml_dev_stop(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stop == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started == 0) {
+ ML_DEV_LOG(ERR, "Device %d is not started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_stop)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 0;
+
+ return ret;
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index b5cb69c5fb..1405cce7f7 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -35,6 +35,110 @@ extern "C" {
#define ML_DEV_DETACHED (0)
#define ML_DEV_ATTACHED (1)
+struct rte_ml_dev;
+
+/**
+ * Definitions of all functions exported by a driver through the generic structure of type
+ * *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
+ */
+
+/**
+ * @internal
+ *
+ * Function used to get device information.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param dev_info
+ * Pointer to info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_info_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info);
+
+/**
+ * @internal
+ *
+ * Function used to configure device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param config
+ * ML device configurations.
+ *
+ * @return
+ * - 0 on success
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_configure_t)(struct rte_ml_dev *dev, const struct rte_ml_dev_config *config);
+
+/**
+ * @internal
+ *
+ * Function used to close a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN if can't close as device is busy.
+ * - < 0, error code on failure, other than busy.
+ */
+typedef int (*mldev_close_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to start a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to stop a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * ML device operations function pointer table.
+ */
+struct rte_ml_dev_ops {
+ /** Get device information. */
+ mldev_info_get_t dev_info_get;
+
+ /** Configure device. */
+ mldev_configure_t dev_configure;
+
+ /** Close device. */
+ mldev_close_t dev_close;
+
+ /** Start device. */
+ mldev_start_t dev_start;
+
+ /** Stop device. */
+ mldev_stop_t dev_stop;
+};
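/*
 * Editorial sketch (not part of the patch): minimal no-op callbacks and
 * the ops table a hypothetical driver would export; the dummy_* names
 * are invented and match the typedefs above.
 */
#include <string.h>
#include <rte_common.h>
#include <rte_mldev_pmd.h>

static int
dummy_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *info)
{
	RTE_SET_USED(dev);
	memset(info, 0, sizeof(*info));
	return 0;
}

static int
dummy_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf)
{
	RTE_SET_USED(dev);
	RTE_SET_USED(conf);
	return 0;
}

static int
dummy_dev_close(struct rte_ml_dev *dev) { RTE_SET_USED(dev); return 0; }

static int
dummy_dev_start(struct rte_ml_dev *dev) { RTE_SET_USED(dev); return 0; }

static int
dummy_dev_stop(struct rte_ml_dev *dev) { RTE_SET_USED(dev); return 0; }

static struct rte_ml_dev_ops dummy_ml_ops = {
	.dev_info_get = dummy_dev_info_get,
	.dev_configure = dummy_dev_configure,
	.dev_close = dummy_dev_close,
	.dev_start = dummy_dev_start,
	.dev_stop = dummy_dev_stop,
};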
+
/**
* @internal
*
@@ -82,6 +186,9 @@ struct rte_ml_dev {
/** Pointer to device data. */
struct rte_ml_dev_data *data;
+ /** Functions exported by PMD. */
+ struct rte_ml_dev_ops *dev_ops;
+
/** Backing RTE device. */
struct rte_device *device;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 82eedfada4..1be508ab5f 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,4 +1,15 @@
EXPERIMENTAL {
+ global:
+
+ rte_ml_dev_close;
+ rte_ml_dev_configure;
+ rte_ml_dev_count;
+ rte_ml_dev_info_get;
+ rte_ml_dev_is_valid_dev;
+ rte_ml_dev_socket_id;
+ rte_ml_dev_start;
+ rte_ml_dev_stop;
+
local: *;
};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 04/12] mldev: support device queue-pair setup
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (2 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 03/12] mldev: support device handling functions jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 05/12] mldev: support handling ML models jerinj
` (9 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added APIs to create a queue-pair attached to an ML device.
Queue pairs are created with a user-specified ID. Added
function prototypes to be used by ML drivers for queue
pair creation and destruction.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 33 ++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 44 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 651f8b2f7c..c8672cff8e 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -306,3 +306,36 @@ rte_ml_dev_stop(int16_t dev_id)
return ret;
}
+
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_queue_pair_setup == NULL)
+ return -ENOTSUP;
+
+ if (queue_pair_id >= dev->data->nb_queue_pairs) {
+ ML_DEV_LOG(ERR, "Invalid queue_pair_id = %d", queue_pair_id);
+ return -EINVAL;
+ }
+
+ if (qp_conf == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, qp_conf cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dev->data->dev_started) {
+ ML_DEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 1405cce7f7..e2a16034d6 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -117,6 +117,44 @@ typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
*/
typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Setup a queue pair for a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ * @param queue_pair_conf
+ * Queue pair configuration structure.
+ * @param socket_id
+ * Socket index.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *queue_pair_conf,
+ int socket_id);
+
+/**
+ * @internal
+ *
+ * Release memory resources allocated by given queue pair.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN, if can't close as device is busy.
+ */
+typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+
/**
* @internal
*
@@ -137,6 +175,12 @@ struct rte_ml_dev_ops {
/** Stop device. */
mldev_stop_t dev_stop;
+
+ /** Set up a device queue pair. */
+ mldev_queue_pair_setup_t dev_queue_pair_setup;
+
+ /** Release a device queue pair. */
+ mldev_queue_pair_release_t dev_queue_pair_release;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 1be508ab5f..cd44c05f3a 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -6,6 +6,7 @@ EXPERIMENTAL {
rte_ml_dev_count;
rte_ml_dev_info_get;
rte_ml_dev_is_valid_dev;
+ rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 05/12] mldev: support handling ML models
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (3 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 04/12] mldev: support device queue-pair setup jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 06/12] mldev: support input and output data handling jerinj
` (8 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added RTE functions to handle ML models. These APIs can
load, unload, start, and stop an ML model. Additional APIs
to update model parameters and get model information are
added.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 123 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 122 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 6 ++
3 files changed, 251 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index c8672cff8e..327ed7144d 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -339,3 +339,126 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_load == NULL)
+ return -ENOTSUP;
+
+ if (params == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, params cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (model_id == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, model_id cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_load)(dev, params, model_id);
+}
+
+int
+rte_ml_model_unload(int16_t dev_id, int16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_unload == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_unload)(dev, model_id);
+}
+
+int
+rte_ml_model_start(int16_t dev_id, int16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_start == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_start)(dev, model_id);
+}
+
+int
+rte_ml_model_stop(int16_t dev_id, int16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_stop == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_stop)(dev, model_id);
+}
+
+int
+rte_ml_model_info_get(int16_t dev_id, int16_t model_id, struct rte_ml_model_info *model_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_info_get == NULL)
+ return -ENOTSUP;
+
+ if (model_info == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, model_id %d, model_info cannot be NULL\n", dev_id,
+ model_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_info_get)(dev, model_id, model_info);
+}
+
+int
+rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_params_update == NULL)
+ return -ENOTSUP;
+
+ if (buffer == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, buffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index e2a16034d6..172454c2aa 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -155,6 +155,110 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to load an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param params
+ * Model load params.
+ * @param model_id
+ * Model ID returned by the library.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_load_t)(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
+ int16_t *model_id);
+
+/**
+ * @internal
+ *
+ * Function used to unload an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_unload_t)(struct rte_ml_dev *dev, int16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to start an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_start_t)(struct rte_ml_dev *dev, int16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to stop an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_stop_t)(struct rte_ml_dev *dev, int16_t model_id);
+
+/**
+ * @internal
+ *
+ * Get info about a model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param model_info
+ * Pointer to model info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, int16_t model_id,
+ struct rte_ml_model_info *model_info);
+
+/**
+ * @internal
+ *
+ * Update model params.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param buffer
+ * Pointer to model params.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, int16_t model_id, void *buffer);
+
/**
* @internal
*
@@ -181,6 +285,24 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+
+ /** Load an ML model. */
+ mldev_model_load_t model_load;
+
+ /** Unload an ML model. */
+ mldev_model_unload_t model_unload;
+
+ /** Start an ML model. */
+ mldev_model_start_t model_start;
+
+ /** Stop an ML model. */
+ mldev_model_stop_t model_stop;
+
+ /** Get model information. */
+ mldev_model_info_get_t model_info_get;
+
+ /** Update model params. */
+ mldev_model_params_update_t model_params_update;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index cd44c05f3a..4459f02925 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -10,6 +10,12 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_model_info_get;
+ rte_ml_model_load;
+ rte_ml_model_params_update;
+ rte_ml_model_start;
+ rte_ml_model_stop;
+ rte_ml_model_unload;
local: *;
};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 06/12] mldev: support input and output data handling
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (4 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 05/12] mldev: support handling ML models jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 07/12] mldev: support op pool and its operations jerinj
` (7 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added RTE library functions to handle model input and
output data. The APIs can be used to get the size of I/O
buffers, quantize input data and dequantize output data.
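A minimal sketch of the intended flow (not part of the patch), assuming
device 0 and a started model, for a single batch of input data:
#include <errno.h>
#include <stdlib.h>
#include <rte_mldev.h>

static int
quantize_one_batch(int16_t model_id, void *dbuf, void **qbuf_out)
{
        uint64_t qsize, dsize;
        void *qbuf;
        int ret;

        ret = rte_ml_io_input_size_get(0, model_id, 1, &qsize, &dsize);
        if (ret < 0)
                return ret;

        qbuf = malloc(qsize);
        if (qbuf == NULL)
                return -ENOMEM;

        /* dbuf holds one batch of de-quantized data (dsize bytes);
         * qbuf receives the model's quantized format (qsize bytes). */
        ret = rte_ml_io_quantize(0, model_id, 1, dbuf, qbuf);
        if (ret < 0) {
                free(qbuf);
                return ret;
        }

        *qbuf_out = qbuf;
        return 0;
}
Output data would go the other way: rte_ml_io_output_size_get() sizes the
buffers and rte_ml_io_dequantize() converts the device results back.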
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 94 ++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 106 +++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 204 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 327ed7144d..13b7e93943 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -462,3 +462,97 @@ rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer)
return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
}
+
+int
+rte_ml_io_input_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_input_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_input_size_get)(dev, model_id, nb_batches, input_qsize,
+ input_dsize);
+}
+
+int
+rte_ml_io_output_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_output_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_output_size_get)(dev, model_id, nb_batches, output_qsize,
+ output_dsize);
+}
+
+int
+rte_ml_io_quantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_quantize == NULL)
+ return -ENOTSUP;
+
+ if (dbuffer == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (qbuffer == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_quantize)(dev, model_id, nb_batches, dbuffer, qbuffer);
+}
+
+int
+rte_ml_io_dequantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_dequantize == NULL)
+ return -ENOTSUP;
+
+ if (qbuffer == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dbuffer == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 172454c2aa..b388553a96 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -259,6 +259,100 @@ typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, int16_t model_id,
*/
typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, int16_t model_id, void *buffer);
+/**
+ * @internal
+ *
+ * Get size of input buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param input_qsize
+ * Size of quantized input.
+ * @param input_dsize
+ * Size of dequantized input.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_input_size_get_t)(struct rte_ml_dev *dev, int16_t model_id,
+ uint32_t nb_batches, uint64_t *input_qsize,
+ uint64_t *input_dsize);
+
+/**
+ * @internal
+ *
+ * Get size of output buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param output_qsize
+ * Size of quantized output.
+ * @param output_dsize
+ * Size of dequantized output.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_output_size_get_t)(struct rte_ml_dev *dev, int16_t model_id,
+ uint32_t nb_batches, uint64_t *output_qsize,
+ uint64_t *output_dsize);
+
+/**
+ * @internal
+ *
+ * Quantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param dbuffer
+ * Pointer to de-quantized data buffer.
+ * @param qbuffer
+ * Pointer to quantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_quantize_t)(struct rte_ml_dev *dev, int16_t model_id, uint16_t nb_batches,
+ void *dbuffer, void *qbuffer);
+
+/**
+ * @internal
+ *
+ * De-quantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param qbuffer
+ * Pointer to quantized data buffer.
+ * @param dbuffer
+ * Pointer to de-quantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_dequantize_t)(struct rte_ml_dev *dev, int16_t model_id, uint16_t nb_batches,
+ void *qbuffer, void *dbuffer);
+
/**
* @internal
*
@@ -303,6 +397,18 @@ struct rte_ml_dev_ops {
/** Update model params. */
mldev_model_params_update_t model_params_update;
+
+ /** Get input buffer size. */
+ mldev_io_input_size_get_t io_input_size_get;
+
+ /** Get output buffer size. */
+ mldev_io_output_size_get_t io_output_size_get;
+
+ /** Quantize data */
+ mldev_io_quantize_t io_quantize;
+
+ /** De-quantize data */
+ mldev_io_dequantize_t io_dequantize;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 4459f02925..0b180020db 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -10,6 +10,10 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_io_dequantize;
+ rte_ml_io_input_size_get;
+ rte_ml_io_output_size_get;
+ rte_ml_io_quantize;
rte_ml_model_info_get;
rte_ml_model_load;
rte_ml_model_params_update;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 07/12] mldev: support op pool and its operations
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (5 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 06/12] mldev: support input and output data handling jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 08/12] mldev: support inference enqueue and dequeue jerinj
` (6 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added RTE library functions to create and free an ML op pool.
The create function allocates a new ML op pool and initializes
the ML ops to their defaults.
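A minimal sketch of pool creation on the local NUMA node (not part of the
patch); all sizes are illustrative, and the pool is released with
rte_ml_op_pool_free() at teardown:
#include <rte_lcore.h>
#include <rte_mldev.h>

/* 4096 ops with a 128-op per-lcore cache and 16 bytes of
 * per-op user data; sizes are illustrative only. */
static struct rte_mempool *
create_ml_op_pool(void)
{
        return rte_ml_op_pool_create("ml_op_pool", 4096, 128, 16,
                                     rte_socket_id());
}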
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 69 +++++++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
2 files changed, 71 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 13b7e93943..cc87837d85 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -10,6 +10,17 @@ static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
static struct rte_ml_dev_global ml_dev_globals = {
.devs = ml_devices, .data = {NULL}, .nb_devs = 0, .max_devs = RTE_MLDEV_MAX_DEVS};
+/*
+ * Private data structure of an operation pool.
+ *
+ * A structure that contains ml op_pool specific data that is
+ * appended after the mempool structure (in private data).
+ */
+struct rte_ml_op_pool_private {
+ uint16_t user_size;
+ /**< Size of private user data with each operation. */
+};
+
struct rte_ml_dev *
rte_ml_dev_pmd_get_dev(int16_t dev_id)
{
@@ -556,3 +567,61 @@ rte_ml_io_dequantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void
return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
}
+
+/** Initialise rte_ml_op mempool element */
+static void
+ml_op_init(struct rte_mempool *mempool, __rte_unused void *opaque_arg, void *_op_data,
+ __rte_unused unsigned int i)
+{
+ struct rte_ml_op *op = _op_data;
+
+ memset(_op_data, 0, mempool->elt_size);
+ op->status = RTE_ML_OP_STATUS_NOT_PROCESSED;
+ op->mempool = mempool;
+}
+
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id)
+{
+ struct rte_ml_op_pool_private *priv;
+ struct rte_mempool *mp;
+ unsigned int elt_size;
+
+ /* lookup mempool in case already allocated */
+ mp = rte_mempool_lookup(name);
+ elt_size = sizeof(struct rte_ml_op) + user_size;
+
+ if (mp != NULL) {
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ if (mp->elt_size != elt_size || mp->cache_size < cache_size || mp->size < nb_elts ||
+ priv->user_size < user_size) {
+ mp = NULL;
+ ML_DEV_LOG(ERR,
+ "Mempool %s already exists but with incompatible parameters",
+ name);
+ return NULL;
+ }
+ return mp;
+ }
+
+ mp = rte_mempool_create(name, nb_elts, elt_size, cache_size,
+ sizeof(struct rte_ml_op_pool_private), NULL, NULL, ml_op_init, NULL,
+ socket_id, 0);
+ if (mp == NULL) {
+ ML_DEV_LOG(ERR, "Failed to create mempool %s", name);
+ return NULL;
+ }
+
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ priv->user_size = user_size;
+
+ return mp;
+}
+
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool)
+{
+ if (mempool != NULL)
+ rte_mempool_free(mempool);
+}
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 0b180020db..396ca0b96a 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -20,6 +20,8 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_pool_create;
+ rte_ml_op_pool_free;
local: *;
};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 08/12] mldev: support inference enqueue and dequeue
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (6 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 07/12] mldev: support op pool and its operations jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 09/12] mldev: support device statistics jerinj
` (5 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added implementations of the fast-path functions to enqueue
ML requests to and dequeue them from an ML device queue-pair.
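A minimal sketch of a synchronous submit-and-drain loop on queue pair 0 of
device 0 (not part of the patch), assuming ops[] was already populated from
an op pool with whatever model and I/O fields the driver requires:
#include <rte_mldev.h>

static void
run_burst(struct rte_ml_op **ops, uint16_t nb_ops)
{
        uint16_t nb_enq = 0, nb_deq = 0;

        /* Busy-poll until every op is submitted ... */
        while (nb_enq < nb_ops)
                nb_enq += rte_ml_enqueue_burst(0, 0, &ops[nb_enq],
                                               nb_ops - nb_enq);

        /* ... and until all completions are drained back into ops[]. */
        while (nb_deq < nb_ops)
                nb_deq += rte_ml_dequeue_burst(0, 0, &ops[nb_deq],
                                               nb_ops - nb_deq);
}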
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 76 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 46 +++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.h | 2 +
lib/mldev/version.map | 2 +
4 files changed, 126 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index cc87837d85..adf8ab8cbc 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -2,6 +2,7 @@
* Copyright (c) 2022 Marvell.
*/
+#include <rte_errno.h>
#include <rte_mldev.h>
#include <rte_mldev_pmd.h>
@@ -105,6 +106,9 @@ rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
ml_dev_globals.nb_devs++;
}
+ dev->enqueue_burst = NULL;
+ dev->dequeue_burst = NULL;
+
return dev;
}
@@ -625,3 +629,75 @@ rte_ml_op_pool_free(struct rte_mempool *mempool)
if (mempool != NULL)
rte_mempool_free(mempool);
}
+
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->enqueue_burst == NULL) {
+ rte_errno = ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ ML_DEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->enqueue_burst)(dev, qp_id, ops, nb_ops);
+}
+
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dequeue_burst == NULL) {
+ rte_errno = ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ ML_DEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index b388553a96..9c19d7badf 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -37,6 +37,46 @@ extern "C" {
struct rte_ml_dev;
+/**
+ * @internal
+ *
+ * Enqueue a burst of inference requests to a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to be enqueued.
+ * @param nb_ops
+ * Number of ops to enqueue.
+ *
+ * @return
+ * - Number of ops enqueued.
+ */
+typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
+/**
+ * @internal
+ *
+ * Dequeue a burst of inference requests from a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to be dequeued.
+ * @param nb_ops
+ * Number of ops to dequeue.
+ *
+ * @return
+ * - Number of ops dequeued.
+ */
+typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -455,6 +495,12 @@ struct rte_ml_dev_data {
* The data structure associated with each ML device.
*/
struct rte_ml_dev {
+ /** Pointer to PMD enqueue function. */
+ mldev_enqueue_t enqueue_burst;
+
+ /** Pointer to PMD dequeue function. */
+ mldev_dequeue_t dequeue_burst;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
index 33544f1b80..afe617e4bf 100644
--- a/lib/mldev/rte_mldev_pmd.h
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -40,6 +40,8 @@ struct rte_ml_dev_pmd_init_params {
uint64_t private_data_size;
};
+struct rte_ml_dev;
+
/**
* @internal
*
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 396ca0b96a..5d44d210d7 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,6 +1,7 @@
EXPERIMENTAL {
global:
+ rte_ml_dequeue_burst;
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
@@ -10,6 +11,7 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
rte_ml_io_output_size_get;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 09/12] mldev: support device statistics
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (7 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 08/12] mldev: support inference enqueue and dequeue jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 10/12] mldev: support device extended statistics jerinj
` (4 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get and reset device stats.
Device stats include the number of requests enqueued and
dequeued, and the corresponding error counts.
Added function prototypes to be used by driver implementations.
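A minimal usage sketch for device 0 (not part of the patch); the stats
field names enqueued_count/dequeued_count are assumptions for illustration,
not defined in this patch:
#include <inttypes.h>
#include <stdio.h>
#include <rte_mldev.h>

static void
show_and_reset_stats(void)
{
        struct rte_ml_dev_stats stats;

        if (rte_ml_dev_stats_get(0, &stats) == 0)
                printf("enqueued %" PRIu64 " dequeued %" PRIu64 "\n",
                       stats.enqueued_count,    /* field names assumed */
                       stats.dequeued_count);

        rte_ml_dev_stats_reset(0);
}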
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 40 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 32 ++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 74 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index adf8ab8cbc..41b2a0be84 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -355,6 +355,46 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_get == NULL)
+ return -ENOTSUP;
+
+ if (stats == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, stats cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(stats, 0, sizeof(struct rte_ml_dev_stats));
+
+ return (*dev->dev_ops->dev_stats_get)(dev, stats);
+}
+
+void
+rte_ml_dev_stats_reset(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_reset == NULL)
+ return;
+
+ (*dev->dev_ops->dev_stats_reset)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 9c19d7badf..3f05ecd9c6 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -195,6 +195,32 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to get device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stats
+ * Pointer to ML device stats structure to update.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats);
+
+/**
+ * @internal
+ *
+ * Function used to reset device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ */
+typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -420,6 +446,12 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+ /** Get device statistics. */
+ mldev_stats_get_t dev_stats_get;
+
+ /** Reset device statistics. */
+ mldev_stats_reset_t dev_stats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 5d44d210d7..d12263010b 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -10,6 +10,8 @@ EXPERIMENTAL {
rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
+ rte_ml_dev_stats_get;
+ rte_ml_dev_stats_reset;
rte_ml_dev_stop;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 10/12] mldev: support device extended statistics
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (8 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 09/12] mldev: support device statistics jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 11/12] mldev: support to retrieve error information jerinj
` (3 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to handle device extended stats.
The supported xstats are driver-specific and can include stats
specific to the ML device or to an ML model and its I/O.
Added prototypes for the functions to be implemented by device drivers.
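A minimal sketch that enumerates the xstats of device 0 and fetches their
values by ID (not part of the patch); the xstats_map field names id/name,
and the size-0 probe call, are assumptions for illustration:
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <rte_mldev.h>

static void
show_xstats(void)
{
        struct rte_ml_dev_xstats_map *map;
        uint64_t *values;
        uint16_t *ids;
        int i, n;

        /* A first call with size 0 is assumed to report the required
         * array size, per the "> size" return convention above. */
        n = rte_ml_dev_xstats_names_get(0, NULL, 0);
        if (n <= 0)
                return;

        map = calloc(n, sizeof(*map));
        ids = calloc(n, sizeof(*ids));
        values = calloc(n, sizeof(*values));
        if (map == NULL || ids == NULL || values == NULL)
                goto out;

        n = rte_ml_dev_xstats_names_get(0, map, n);
        for (i = 0; i < n; i++)
                ids[i] = map[i].id;     /* field name assumed */

        if (rte_ml_dev_xstats_get(0, ids, values, n) == n) {
                for (i = 0; i < n; i++)
                        printf("%s: %" PRIu64 "\n", map[i].name, values[i]);
        }
out:
        free(map);
        free(ids);
        free(values);
}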
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 88 ++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 93 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 185 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 41b2a0be84..cdd68e933c 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -395,6 +395,94 @@ rte_ml_dev_stats_reset(int16_t dev_id)
(*dev->dev_ops->dev_stats_reset)(dev);
}
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map, uint32_t size)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_names_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_names_get)(dev, xstats_map, size);
+}
+
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_by_name_get == NULL)
+ return -ENOTSUP;
+
+ if (name == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, name cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (value == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, value cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_by_name_get)(dev, name, stat_id, value);
+}
+
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_get == NULL)
+ return -ENOTSUP;
+
+ if (stat_ids == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, stat_ids cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (values == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, values cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_get)(dev, stat_ids, values, nb_ids);
+}
+
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_reset == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 3f05ecd9c6..82e8ed0422 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -221,6 +221,87 @@ typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats
*/
typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Function used to get names of extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param xstats_map
+ * Array to insert id and names into.
+ * @param size
+ * Size of xstats_map array.
+ *
+ * @return
+ * - >= 0 and <= size on success.
+ * - > size, error. Returns the size of xstats_map array required.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_names_get_t)(struct rte_ml_dev *dev,
+ struct rte_ml_dev_xstats_map *xstats_map, uint32_t size);
+
+/**
+ * @internal
+ *
+ * Function used to get a single extended stat by name.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param name
+ * Name of the stat to retrieve.
+ * @param stat_id
+ * ID of the stat to be returned.
+ * @param value
+ * Value of the stat to be returned.
+ *
+ * @return
+ * - >= 0 stat value.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_by_name_get_t)(struct rte_ml_dev *dev, const char *name,
+ uint16_t *stat_id, uint64_t *value);
+
+/**
+ * @internal
+ *
+ * Function used to retrieve extended stats of a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of ID numbers of the stats to be retrieved.
+ * @param values
+ * Values of the stats requested by the ID.
+ * @param nb_ids
+ * Number of stats requested.
+ *
+ * @return
+ * - >= 0, number of entries filled into the values array.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint64_t *values, uint16_t nb_ids);
+
+/**
+ * @internal
+ *
+ * Function used to reset extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of stats IDs to be reset.
+ * @param nb_ids
+ * Number of IDs in the stat_ids array.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint16_t nb_ids);
+
/**
* @internal
*
@@ -452,6 +533,18 @@ struct rte_ml_dev_ops {
/** Reset device statistics. */
mldev_stats_reset_t dev_stats_reset;
+ /** Get names of extended stats. */
+ mldev_xstats_names_get_t dev_xstats_names_get;
+
+ /** Get value of a single extended stat. */
+ mldev_xstats_by_name_get_t dev_xstats_by_name_get;
+
+ /** Get extended stats of a device. */
+ mldev_xstats_get_t dev_xstats_get;
+
+ /** Reset extended stats of the device. */
+ mldev_xstats_reset_t dev_xstats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index d12263010b..02d2eab100 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -13,6 +13,10 @@ EXPERIMENTAL {
rte_ml_dev_stats_get;
rte_ml_dev_stats_reset;
rte_ml_dev_stop;
+ rte_ml_dev_xstats_by_name_get;
+ rte_ml_dev_xstats_get;
+ rte_ml_dev_xstats_names_get;
+ rte_ml_dev_xstats_reset;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 11/12] mldev: support to retrieve error information
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (9 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 10/12] mldev: support device extended statistics jerinj
@ 2022-11-14 12:02 ` jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 12/12] mldev: support to get debug info and test device jerinj
` (2 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get error information for an ML op.
This information can include both a driver-specific error
message and an error code.
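A minimal sketch of checking a completed op on device 0 (not part of the
patch); the RTE_ML_OP_STATUS_SUCCESS value and the rte_ml_op_error field
names errcode/message are assumptions for illustration:
#include <inttypes.h>
#include <stdio.h>
#include <rte_mldev.h>

static void
report_op_error(struct rte_ml_op *op)
{
        struct rte_ml_op_error error;

        if (op->status == RTE_ML_OP_STATUS_SUCCESS)     /* value assumed */
                return;

        if (rte_ml_op_error_get(0, op, &error) == 0)
                printf("op failed: code %" PRIu64 " (%s)\n",
                       error.errcode, error.message);   /* fields assumed */
}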
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 31 +++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 22 ++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 54 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index cdd68e933c..7497a1316d 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -829,3 +829,34 @@ rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uin
return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
}
+
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->op_error_get == NULL)
+ return -ENOTSUP;
+
+ if (op == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, op cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (error == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, error cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->op_error_get)(dev, op, error);
+}
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 82e8ed0422..9e3873dd3a 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -77,6 +77,25 @@ typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, stru
typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
uint16_t nb_ops);
+/**
+ * @internal
+ *
+ * Get error information for an Op.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param op
+ * ML Op handle.
+ * @param error
+ * Pointer to error structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_op_error_get_t)(struct rte_ml_dev *dev, struct rte_ml_op *op,
+ struct rte_ml_op_error *error);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -626,6 +645,9 @@ struct rte_ml_dev {
/** Pointer to PMD dequeue function. */
mldev_dequeue_t dequeue_burst;
+ /** Pointer to PMD Op error get function. */
+ mldev_op_error_get_t op_error_get;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 02d2eab100..86ab2129ce 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -28,6 +28,7 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_error_get;
rte_ml_op_pool_create;
rte_ml_op_pool_free;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v1 12/12] mldev: support to get debug info and test device
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (10 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 11/12] mldev: support to retrieve error information jerinj
@ 2022-11-14 12:02 ` jerinj
2023-01-25 14:20 ` [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library Thomas Monjalon
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2022-11-14 12:02 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil, mdr,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added library functions for the ML device debug APIs.
The APIs are used to dump ML device debug information and to run a device selftest.
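A minimal sketch combining both debug APIs (not part of the patch):
#include <errno.h>
#include <stdio.h>
#include <rte_mldev.h>

static int
debug_device(int16_t dev_id)
{
        int ret;

        /* Dump driver debug state to stdout, tolerating drivers
         * that do not implement the dump callback. */
        ret = rte_ml_dev_dump(dev_id, stdout);
        if (ret < 0 && ret != -ENOTSUP)
                return ret;

        return rte_ml_dev_selftest(dev_id);
}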
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 39 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 37 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 7497a1316d..71aa98ee96 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -483,6 +483,45 @@ rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_id
return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
}
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_dump == NULL)
+ return -ENOTSUP;
+
+ if (fd == NULL) {
+ ML_DEV_LOG(ERR, "Dev %d, file descriptor cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_dump)(dev, fd);
+}
+
+int
+rte_ml_dev_selftest(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ ML_DEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_selftest == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_selftest)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 9e3873dd3a..b1426806a3 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -321,6 +321,37 @@ typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_i
typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
uint16_t nb_ids);
+/**
+ * @internal
+ *
+ * Function used to dump ML device debug info.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param fd
+ * File descriptor to dump the debug info.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+
+typedef int (*mldev_dump_t)(struct rte_ml_dev *dev, FILE *fd);
+
+/**
+ * @internal
+ *
+ * Function used for selftest of ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_selftest_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -564,6 +595,12 @@ struct rte_ml_dev_ops {
/** Reset extended stats of the device. */
mldev_xstats_reset_t dev_xstats_reset;
+ /** Dump ML device debug info. */
+ mldev_dump_t dev_dump;
+
+ /** Run ML device selftest. */
+ mldev_selftest_t dev_selftest;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 86ab2129ce..61955ab701 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -5,9 +5,11 @@ EXPERIMENTAL {
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
+ rte_ml_dev_dump;
rte_ml_dev_info_get;
rte_ml_dev_is_valid_dev;
rte_ml_dev_queue_pair_setup;
+ rte_ml_dev_selftest;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stats_get;
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-17 6:58 ` Morten Brørup
@ 2023-01-25 13:45 ` Thomas Monjalon
0 siblings, 0 replies; 80+ messages in thread
From: Thomas Monjalon @ 2023-01-25 13:45 UTC (permalink / raw)
To: Jerin Jacob, dev
Cc: Jerin Jacob, dpdk-dev, Ferruh Yigit, Ajit Khaparde, Andrew Boyer,
Andrew Rybchenko, Beilei Xing, Richardson, Bruce, Chas Williams,
Xia, Chenbo, Ciara Loftus, Devendra Singh Rawat, Ed Czeck,
Evgeny Schemeilin, Gaetan Rivet, Gagandeep Singh, Guoyang Zhou,
Haiyue Wang, Harman Kalra, Heinrich Kuhn, Hemant Agrawal,
Hyong Youb Kim, Igor Chauskin, Igor Russkikh, Jakub Grajciar,
Jasvinder Singh, Jian Wang, Jiawen Wu, Jingjing Wu, John Daley,
John Miller, John W. Linville, Wiles, Keith, Kiran Kumar K,
Lijun Ou, Liron Himi, Long Li, Marcin Wojtas, Martin Spinler,
Matan Azrad, Matt Peters, Maxime Coquelin, Michal Krawczyk,
Min Hu (Connor, Pradeep Kumar Nalla, Nithin Dabilpuram,
Qiming Yang, Qi Zhang, Radha Mohan Chintakuntla,
Rahul Lakkireddy, Rasesh Mody, Rosen Xu, Sachin Saxena,
Satha Koteswara Rao Kottidi, Shahed Shaikh, Shai Brandes,
Shepard Siegel, Somalapuram Amaranath, Somnath Kotur,
Stephen Hemminger, Steven Webster, Sunil Kumar Kori,
Tetsuya Mukawa, Veerasenareddy Burru, Viacheslav Ovsiienko,
Xiao Wang, Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Honnappa Nagarahalli, Mattias Rönnblom,
Ruifeng Wang (Arm Technology China),
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger,
Morten Brørup, techboard
17/08/2022 08:58, Morten Brørup:
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Wednesday, 17 August 2022 07.37
> >
> > On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup
> > <mb@smartsharesystems.com> wrote:
> > >
> > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > Sent: Tuesday, 16 August 2022 15.13
> > > >
> > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > <jerinj@marvell.com> wrote:
> > > > >
> > > > > > Roadmap
> > > > > > -------
> > > > > > 1) Address the comments for this RFC.
> > > > > > 2) Common code for mldev
> > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > >
> > > > > Having a SW implementation is important because then it can be
> > > > covered
> > > > > by tests.
> > > >
> > > > Yes. That's the reason for adding a TVM-based SW driver as item (3).
> > > >
> > > > Is there any other high level or API level comments before
> > proceeding
> > > > with v1 and implementation.
> > >
> > > Have you seriously considered if the DPDK Project is the best home
> > for this project? I can easily imagine the DPDK development process
> > being a hindrance in many aspects for an evolving AI/ML library. Off
> > the top of my head, it would probably be better off as a separate
> > project, like SPDK.
> >
> > Yes. The reasons are following
> >
> > # AI/ML compiler libraries more focused on model creation and
> > training etc (That's where the AI/ML libraries can offer actual value
> > addition) and a minimal part for inference (It is just added for
> > testing the model)
> > # Considering the inference is the scope of the DPDK. DPDK is ideal
> > place for following reasons
> >
> > a) Inference scope is very limited.
> > b) Avoid memcpy of inference data (Use directly from network or
> > other class of device like cryptodev, regexdev)
> > c) Reuse highspeed IO interface like PCI backed driver etc
> > d) Integration with other DPDK subsystems like eventdev etc for job
> > completion.
> > e) Also support more inline offloads by merging two device classes
> > like rte_security.
> > f) Run the inference model from different AI/ML compiler frameworks or
> > abstract the inference usage.
> > Similar concept is already applied to other DPDK device classes like
> > 1) In Regexdev, The compiler generates the rule database which is out
> > of scope of DPDK. DPDK API just loads the rule database
> > 2) In Gpudev, the GPU kernel etc. is out of scope of DPDK. DPDK cares about
> > IO interface.
>
> Thank you for the detailed reply, Jerin.
>
> These are good reasons for adding the new device class to the DPDK project - especially the Regexdev comparison got me convinced.
>
> >
> > > If all this stuff can be completely omitted at build time, I have no
> > objections.
> >
> > Yes, It can be completely omitted at build time.
>
> Perfect.
>
> > Also no plan to
> > integrate to testpmd and other existing application. Planning to add
> > only app/test-mldev application.
>
> +1 to that
>
> >
> > >
> > > A small note about naming (not intending to start a flame war, so
> > please feel free to ignore!): I haven't worked seriously with ML/AI
> > since university three decades ago, so I'm quite rusty in the domain.
> > However, I don't see any Machine Learning functions proposed by this
> > API. The library provides an API to an Inference Engine - but nobody
> > says the inference model stems from Machine Learning; it might as well
> > be a hand crafted model. Do you plan to propose APIs for training the
> > models? If not, the name of the library could confuse some potential
> > users.
> >
> > No, scope is only inference and it is documented in the programing
> > guide and API header file. I am trying to keep name similar to
> > regexdev, gpudev etc which have similar scope. But I am open to other
> > shortname/name if you have something in mind.
>
> The AI(Artificial Intelligence)/ML(Machine Learning)/IE(Inference Engine) chip market still seems immature and fragmented, so I can't find any consensus on generic names for such hardware accelerator devices.
>
> Some of the chip vendors represented on the DPDK mailing list offer AI/ML/IE accelerator chips. Perhaps their marketing department could propose alternatives to "Machine Learning Device"/"mldev" for inference engine devices (with no acceleration for training the models). If not, the initially proposed name is good enough.
>
> So: Everyone ask your marketing departments and speak up now, or the name "mldev" will be set in stone. ;-)
>
> I'm thinking: While "Inference Engine Device"/iedev might be technically more correct, it doesn't have same value as "Machine Learning Device"/"mldev" on a marketing scale. And we should choose a name that we expect might become industry standard consensus.
I don't know why, but I like mldev and dislike iedev.
I could be OK with aidev as well.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2022-08-17 14:53 ` Jerin Jacob
@ 2023-01-25 13:47 ` Thomas Monjalon
2023-01-25 13:54 ` Jerin Jacob
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-01-25 13:47 UTC (permalink / raw)
To: Honnappa Nagarahalli, dev
Cc: Morten Brørup, jerinj, dpdk-dev, Ferruh Yigit,
Ajit Khaparde (ajit.khaparde@broadcom.com),
Andrew Boyer, Andrew Rybchenko, Beilei Xing, Richardson, Bruce,
Chas Williams, Xia, Chenbo, Ciara Loftus, Devendra Singh Rawat,
Ed Czeck, Evgeny Schemeilin, Gaetan Rivet, Gagandeep Singh,
Guoyang Zhou, Haiyue Wang, Harman Kalra, Heinrich Kuhn,
hemant.agrawal, Hyong Youb Kim, Igor Chauskin, Igor Russkikh,
Jakub Grajciar, Jasvinder Singh, Jian Wang, Jiawen Wu,
Jingjing Wu, John Daley, John Miller, John W. Linville, Wiles,
Keith, Kiran Kumar K, Lijun Ou, Liron Himi, Long Li,
Marcin Wojtas, Martin Spinler, Matan Azrad, Matt Peters,
Maxime Coquelin, Michal Krawczyk, Min Hu (Connor,
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, nd, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Mattias Rönnblom, Ruifeng Wang,
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger, Jerin Jacob
17/08/2022 16:53, Jerin Jacob:
> On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com> wrote:
> >
> > <snip>
> >
> > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > Sent: Tuesday, 16 August 2022 15.13
> > > >
> > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > <jerinj@marvell.com> wrote:
> > > > >
> > > > > > Roadmap
> > > > > > -------
> > > > > > 1) Address the comments for this RFC.
> > > > > > 2) Common code for mldev
> > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > >
> > > > > Having a SW implementation is important because then it can be
> > > > covered
> > > > > by tests.
> > > >
> > > > Yes. That's the reason for adding a TVM-based SW driver as item (3).
> > > >
> > > > Is there any other high level or API level comments before proceeding
> > > > with v1 and implementation.
> > >
> > > Have you seriously considered if the DPDK Project is the best home for this
> > > project? I can easily imagine the DPDK development process being a hindrance
> > > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > > probably be better off as a separate project, like SPDK.
> > There is a lot of talk about using ML in networking workloads, although I am not very sure what the use cases look like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provides some data (what sort of data?)), or look aside (does it require the packets to be sent to the inference engine or is it some other data?), and what would be an end-to-end use case? A sample application using these APIs would be helpful.
>
> A simple application showing the inference usage is included in the cover letter.
>
> Regarding the use cases, there are many, like firewall, intrusion
> detection, etc. Most of the use cases are driven by product
> requirements, and SW IP vendors try to keep them to themselves as a
> product differentiating factor.
> That is the prime reason for limiting the DPDK scope to inference, where IO
> is involved. Model creation, training, etc. will vary heavily based on the
> use case, but the inference model will not.
>
> >
> > IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
>
> Yes. For networking or ORAN use cases, the inference data comes
> over the wire and the result can go over the wire.
>
> >
> > As I understand it, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than invent our own?
>
> # AI/ML compiler libraries are more focused on model creation,
> training, etc. (that is where the actual value addition of the AI/ML
> libraries lies) with only a
> minimal part for inference (it is just added for testing the model).
> # Considering that inference is the scope of DPDK, DPDK is an ideal
> place for the following reasons:
>
> a) The inference scope is very limited.
> b) Avoid memcpy of inference data (use it directly from the network or
> another class of device like cryptodev, regexdev).
> c) Reuse high-speed IO interfaces like PCI-backed drivers, etc.
> d) Integration with other DPDK subsystems like eventdev for job completion.
> e) Also support more inline offloads by merging two device classes,
> as with rte_security.
> f) Run inference models from different AI/ML compiler frameworks and
> abstract the inference usage.
> A similar concept is already applied to other DPDK device classes:
> 1) In regexdev, the compiler generates the rule database, which is out
> of scope of DPDK. The DPDK API just loads the rule database.
> 2) In gpudev, the GPU kernel etc. are out of scope of DPDK. DPDK cares
> about the IO interface.
I think Honnappa was thinking about linking an existing inference library.
What are the similar libraries?
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library
2023-01-25 13:47 ` Thomas Monjalon
@ 2023-01-25 13:54 ` Jerin Jacob
0 siblings, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-01-25 13:54 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Honnappa Nagarahalli, dev, Morten Brørup, jerinj,
Ferruh Yigit, Ajit Khaparde (ajit.khaparde@broadcom.com),
Andrew Boyer, Andrew Rybchenko, Beilei Xing, Richardson, Bruce,
Chas Williams, Xia, Chenbo, Ciara Loftus, Devendra Singh Rawat,
Ed Czeck, Evgeny Schemeilin, Gaetan Rivet, Gagandeep Singh,
Guoyang Zhou, Haiyue Wang, Harman Kalra, Heinrich Kuhn,
hemant.agrawal, Hyong Youb Kim, Igor Chauskin, Igor Russkikh,
Jakub Grajciar, Jasvinder Singh, Jian Wang, Jiawen Wu,
Jingjing Wu, John Daley, John Miller, John W. Linville, Wiles,
Keith, Kiran Kumar K, Lijun Ou, Liron Himi, Long Li,
Marcin Wojtas, Martin Spinler, Matan Azrad, Matt Peters,
Maxime Coquelin, Michal Krawczyk, Min Hu (Connor),
Pradeep Kumar Nalla, Nithin Dabilpuram, Qiming Yang, Qi Zhang,
Radha Mohan Chintakuntla, Rahul Lakkireddy, Rasesh Mody,
Rosen Xu, Sachin Saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, Shai Brandes, Shepard Siegel,
Somalapuram Amaranath, Somnath Kotur, Stephen Hemminger,
Steven Webster, Sunil Kumar Kori, Tetsuya Mukawa,
Veerasenareddy Burru, nd, Viacheslav Ovsiienko, Xiao Wang,
Xiaoyun Wang, Yisen Zhuang, Yong Wang, Ziyang Xuan,
Prasun Kapoor, nadavh, Satananda Burla, Narayana Prasad,
Akhil Goyal, Ray Kinsella, Dmitry Kozlyuk, Anatoly Burakov,
Cristian Dumitrescu, Mattias Rönnblom, Ruifeng Wang,
David Christensen, Ananyev, Konstantin, Olivier Matz,
Jayatheerthan, Jay, Ashwin Sekhar Thalakalath Kottilveetil,
Pavan Nikhilesh, Elena Agostini, Srikanth Yalavarthi, dchickles,
sshankarnara, John McNamara, Stephen Hemminger
On Wed, Jan 25, 2023 at 7:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 17/08/2022 16:53, Jerin Jacob:
> > On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com> wrote:
> > >
> > > <snip>
> > >
> > > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > > Sent: Tuesday, 16 August 2022 15.13
> > > > >
> > > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > > <stephen@networkplumber.org> wrote:
> > > > > >
> > > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > > <jerinj@marvell.com> wrote:
> > > > > >
> > > > > > > Roadmap
> > > > > > > -------
> > > > > > > 1) Address the comments for this RFC.
> > > > > > > 2) Common code for mldev
> > > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > > >
> > > > > > Having a SW implementation is important because then it can be
> > > > > covered
> > > > > > by tests.
> > > > >
> > > > > Yes. That is the reason for adding a TVM-based SW driver as item (3).
> > > > >
> > > > > Are there any other high-level or API-level comments before proceeding
> > > > > with v1 and the implementation?
> > > >
> > > > Have you seriously considered if the DPDK Project is the best home for this
> > > > project? I can easily imagine the DPDK development process being a hindrance
> > > > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > > > probably be better off as a separate project, like SPDK.
> > > There is a lot of talk about using ML in networking workloads, although I am not very sure what the use cases look like. For example: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provides some data (what sort of data?)), or look-aside (does it require the packets to be sent to the inference engine, or is it some other data?), and what would be an end-to-end use case? A sample application using these APIs would be helpful.
> >
> > A simple application showing the inference usage is included in the cover letter.
> >
> > Regarding the use cases, there are many, like firewall, intrusion
> > detection, etc. Most of the use cases are driven by product
> > requirements, and SW IP vendors try to keep them to themselves as a
> > product differentiating factor.
> > That is the prime reason for limiting the DPDK scope to inference, where IO
> > is involved. Model creation, training, etc. will vary heavily based on the
> > use case, but the inference model will not.
> >
> > >
> > > IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
> >
> > Yes. For networking or ORAN use cases, the inference data comes
> > over the wire and the result can go over the wire.
> >
> > >
> > > As I understand it, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than invent our own?
> >
> > # AI/ML compiler libraries are more focused on model creation,
> > training, etc. (that is where the actual value addition of the AI/ML
> > libraries lies) with only a
> > minimal part for inference (it is just added for testing the model).
> > # Considering that inference is the scope of DPDK, DPDK is an ideal
> > place for the following reasons:
> >
> > a) The inference scope is very limited.
> > b) Avoid memcpy of inference data (use it directly from the network or
> > another class of device like cryptodev, regexdev).
> > c) Reuse high-speed IO interfaces like PCI-backed drivers, etc.
> > d) Integration with other DPDK subsystems like eventdev for job completion.
> > e) Also support more inline offloads by merging two device classes,
> > as with rte_security.
> > f) Run inference models from different AI/ML compiler frameworks and
> > abstract the inference usage.
> > A similar concept is already applied to other DPDK device classes:
> > 1) In regexdev, the compiler generates the rule database, which is out
> > of scope of DPDK. The DPDK API just loads the rule database.
> > 2) In gpudev, the GPU kernel etc. are out of scope of DPDK. DPDK cares
> > about the IO interface.
>
> I think Honnappa was thinking about linking an existing inference library.
> What are the similar libraries?
Not sure. Honnappa can tell if there is any which meets (a) to (f).
>
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (11 preceding siblings ...)
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 12/12] mldev: support to get debug info and test device jerinj
@ 2023-01-25 14:20 ` Thomas Monjalon
2023-01-25 19:01 ` Jerin Jacob
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
13 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-01-25 14:20 UTC (permalink / raw)
To: Jerin Jacob
Cc: dev, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
bruce.richardson, david.marchand, honnappa.nagarahalli
14/11/2022 13:02, jerinj@marvell.com:
> From: Jerin Jacob <jerinj@marvell.com>
>
> Machine learning inference library
> ==================================
>
> Definition of machine learning inference
> ----------------------------------------
> Inference in machine learning is the process of making an output prediction
> based on new input data using a pre-trained machine learning model.
>
> The scope of the RFC includes only inferencing with pre-trained machine learning models;
> training and building/compiling the ML models is out of scope for this RFC and the
> DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
>
> Motivation for the new library
> ------------------------------
> Multiple semiconductor vendors are offering accelerator products such as DPU
> (often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
> integrated as part of the product. Use of ML inferencing is increasing in the domain
> of packet processing for flow classification, intrusion, malware and anomaly detection.
>
> Lack of inferencing support through DPDK APIs will involve complexities and
> increased latency from moving data across frameworks (i.e., dataplane to
> non-dataplane ML frameworks and vice versa). Having standardized DPDK APIs for ML
> inferencing would enable dataplane solutions to harness the benefit of inline
> inferencing supported by the hardware.
>
> Contents
> ---------------
>
> A) API specification for:
>
> 1) Discovery of ML capabilities (e.g., device-specific features) in a
> vendor-independent fashion
> 2) Definition of functions to handle ML devices, which include probing,
> initialization and termination of the devices.
> 3) Definition of functions to handle ML models used to perform inference operations.
> 4) Definition of functions to handle quantize and dequantize operations
>
> B) Common code for above specification
Can we compare this library with WinML?
https://learn.microsoft.com/en-us/windows/ai/windows-ml/api-reference
Are there things we can learn from it?
> ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> a procedure/algorithm and the data/pattern required to make predictions on live data.
> Once the model is created and trained outside of the DPDK scope, the model can be loaded
> via rte_ml_model_load() and then started using the rte_ml_model_start() API.
> rte_ml_model_params_update() can be used to update the model parameters, such as weights
> and bias, without unloading the model via rte_ml_model_unload().
The fact that the model is prepared outside means the model format is free
and probably different per mldev driver.
I think it is OK but it requires a lot of documentation effort to explain
how to bind the model and its parameters with the DPDK API.
Also we may need to pass some metadata from the model builder
to the inference engine in order to enable optimizations prepared in the model.
And the other way, we may need inference capabilities in order to generate
an optimized model which can run in the inference engine.
[...]
> Typical application utilisation of the ML API will follow the
> programming flow below.
>
> - rte_ml_dev_configure()
> - rte_ml_dev_queue_pair_setup()
> - rte_ml_model_load()
> - rte_ml_model_start()
> - rte_ml_model_info()
> - rte_ml_dev_start()
> - rte_ml_enqueue_burst()
> - rte_ml_dequeue_burst()
> - rte_ml_model_stop()
> - rte_ml_model_unload()
> - rte_ml_dev_stop()
> - rte_ml_dev_close()
Where is the parameters update in this flow?
Should we update all parameters at once, or can it be done in a more fine-grained way?
Question about the memory used by mldev:
Can we manage where the memory is allocated (host, device, mix, etc)?
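For reference, the flow above maps to roughly the following application code. This is a minimal sketch only: the struct fields and the buffer/queue sizes are illustrative assumptions rather than the exact v1 definitions, and error handling is omitted.

#include <rte_lcore.h>
#include <rte_mldev.h>

static void
mldev_flow_sketch(int16_t dev_id, struct rte_ml_model_params *params,
                  struct rte_ml_op *op)
{
        /* Field names here are assumptions for illustration. */
        struct rte_ml_dev_config conf = { .nb_queue_pairs = 1, .nb_models = 1 };
        struct rte_ml_dev_qp_conf qp_conf = { .nb_desc = 128 };
        struct rte_ml_op *ops[1] = { op }; /* op carries input/output buffers */
        int16_t model_id;

        rte_ml_dev_configure(dev_id, &conf);
        rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf, rte_socket_id());
        rte_ml_model_load(dev_id, params, &model_id); /* pre-trained binary */
        rte_ml_model_start(dev_id, model_id);         /* rte_ml_model_info() fits here */
        rte_ml_dev_start(dev_id);

        rte_ml_enqueue_burst(dev_id, 0, ops, 1);      /* submit one inference */
        while (rte_ml_dequeue_burst(dev_id, 0, ops, 1) == 0)
                ;                                     /* poll for completion */

        rte_ml_model_stop(dev_id, model_id);
        rte_ml_model_unload(dev_id, model_id);
        rte_ml_dev_stop(dev_id);
        rte_ml_dev_close(dev_id);
}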
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-25 14:20 ` [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library Thomas Monjalon
@ 2023-01-25 19:01 ` Jerin Jacob
2023-01-26 11:11 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2023-01-25 19:01 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Jerin Jacob, dev, ferruh.yigit, ajit.khaparde, aboyer,
andrew.rybchenko, beilei.xing, bruce.richardson, chas3,
chenbo.xia, ciara.loftus, dsinghrawat, ed.czeck, evgenys, grive,
g.singh, zhouguoyang, haiyue.wang, hkalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, irusskikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, kirankumark, oulijun, lironh,
longli, mw, spinler, matan, matt.peters, maxime.coquelin, mk,
humin29, pnalla, ndabilpuram, qiming.yang, qi.z.zhang, radhac,
rahul.lakkireddy, rmody, rosen.xu, sachin.saxena, skoteshwar,
shshaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
david.marchand
On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 14/11/2022 13:02, jerinj@marvell.com:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Machine learning inference library
> > ==================================
> >
> > Definition of machine learning inference
> > ----------------------------------------
> > Inference in machine learning is the process of making an output prediction
> > based on new input data using a pre-trained machine learning model.
> >
> > The scope of the RFC would include only inferencing with pre-trained machine learning models,
> > training and building/compiling the ML models is out of scope for this RFC or
> > DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
> >
> > Motivation for the new library
> > ------------------------------
> > Multiple semiconductor vendors are offering accelerator products such as DPU
> > (often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
> > integrated as part of the product. Use of ML inferencing is increasing in the domain
> > of packet processing for flow classification, intrusion, malware and anomaly detection.
> >
> > Lack of inferencing support through DPDK APIs will involve complexities and
> > increased latency from moving data across frameworks (i.e, dataplane to
> > non dataplane ML frameworks and vice-versa). Having a standardized DPDK APIs for ML
> > inferencing would enable the dataplane solutions to harness the benefit of inline
> > inferencing supported by the hardware.
> >
> > Contents
> > ---------------
> >
> > A) API specification for:
> >
> > 1) Discovery of ML capabilities (e.g., device specific features) in a vendor
> > independent fashion
> > 2) Definition of functions to handle ML devices, which includes probing,
> > initialization and termination of the devices.
> > 3) Definition of functions to handle ML models used to perform inference operations.
> > 4) Definition of function to handle quantize and dequantize operations
> >
> > B) Common code for above specification
Thanks for the review.
>
> Can we compare this library with WinML?
> https://learn.microsoft.com/en-us/windows/ai/windows-ml/api-reference
The proposed DPDK library supports only inferencing with pre-trained models.
> Are there things we can learn from it?
Compared to WinML, the API provides functionality similar to what the
"LearningModel*" classes provide.
Support related to handling custom operators and native APIs, as in
WinML, is provided through this API.
There may be more features which we can add where there are drivers that
support them.
>
>
> > ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> > procedure/algorithm and data/pattern required to make predictions on live data.
> > Once the model is created and trained outside of the DPDK scope, the model can be loaded
> > via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > The rte_ml_model_params_update() can be used to update the model parameters such as weight
> > and bias without unloading the model using rte_ml_model_unload().
>
> The fact that the model is prepared outside means the model format is free
> and probably different per mldev driver.
> I think it is OK but it requires a lot of documentation effort to explain
> how to bind the model and its parameters with the DPDK API.
> Also we may need to pass some metadata from the model builder
> to the inference engine in order to enable optimizations prepared in the model.
> And the other way, we may need inference capabilities in order to generate
> an optimized model which can run in the inference engine.
The base API specification is kept to the absolute minimum. Currently, weight and bias
parameters are updated through rte_ml_model_params_update(). It can be extended
when there are drivers that support it, or if you have any specific
parameter you would like to add
to rte_ml_model_params_update().
Other metadata like batch, shapes and formats is queried using rte_ml_io_info().
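As a sketch of that query path (assuming a model-info structure exposing arrays of struct rte_ml_io_info; the nb_inputs/input_info field names are illustrative assumptions, not taken from the patch):

#include <stdio.h>
#include <rte_mldev.h>

/* Print the name and types of every model input (sketch only). */
static void
print_model_inputs(int16_t dev_id, int16_t model_id)
{
        struct rte_ml_model_info info; /* structure contents assumed */
        uint32_t i;

        rte_ml_model_info_get(dev_id, model_id, &info);
        for (i = 0; i < info.nb_inputs; i++) /* assumed field names */
                printf("input %s: qtype=%d dtype=%d\n",
                       info.input_info[i].name,
                       (int)info.input_info[i].qtype,
                       (int)info.input_info[i].dtype);
}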
>
>
> [...]
> > Typical application utilisation of the ML API will follow the following
> > programming flow.
> >
> > - rte_ml_dev_configure()
> > - rte_ml_dev_queue_pair_setup()
> > - rte_ml_model_load()
> > - rte_ml_model_start()
> > - rte_ml_model_info()
> > - rte_ml_dev_start()
> > - rte_ml_enqueue_burst()
> > - rte_ml_dequeue_burst()
> > - rte_ml_model_stop()
> > - rte_ml_model_unload()
> > - rte_ml_dev_stop()
> > - rte_ml_dev_close()
>
> Where is parameters update in this flow?
Added the mandatory APIs in the top-level flow doc.
rte_ml_model_params_update() is used to update the parameters.
> Should we update all parameters at once or can it be done more fine-grain?
Currently, rte_ml_model_params_update() can be used to update weights
and bias via a buffer when the device is
in the stop state, without unloading the model.
>
>
> Question about the memory used by mldev:
> Can we manage where the memory is allocated (host, device, mix, etc)?
Just passing buffer pointers now, like the other subsystems.
Other EAL infra services can take care of the locality of memory, as it
is not specific to mldev.
+/** ML operation's input and output buffer representation as a scatter
+ * gather list.
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
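For example, two buffers can be chained into a single scatter-gather input like this (a sketch; buf0/buf1 and their lengths are assumed to be allocated elsewhere):

#include <rte_memory.h> /* rte_mem_virt2iova() */

struct rte_ml_buff_seg seg0, seg1;

/* Last segment: next == NULL terminates the list. */
seg1.addr = buf1;
seg1.iova_addr = rte_mem_virt2iova(buf1);
seg1.length = len1;
seg1.next = NULL;

/* First segment chains to the second. */
seg0.addr = buf0;
seg0.iova_addr = rte_mem_virt2iova(buf0);
seg0.length = len0;
seg0.next = &seg1;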
>
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-25 19:01 ` Jerin Jacob
@ 2023-01-26 11:11 ` Thomas Monjalon
2023-01-27 2:33 ` [EXT] " Shivah Shankar Shankar Narayan Rao
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-01-26 11:11 UTC (permalink / raw)
To: Jerin Jacob, Jerin Jacob
Cc: dev, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
dsinghrawat, ed.czeck, evgenys, grive, g.singh, zhouguoyang,
haiyue.wang, hkalra, heinrich.kuhn, hemant.agrawal, hyonkim,
igorch, irusskikh, jgrajcia, jasvinder.singh, jianwang, jiawenwu,
jingjing.wu, johndale, john.miller, linville, keith.wiles,
kirankumark, oulijun, lironh, longli, mw, spinler, matan,
matt.peters, maxime.coquelin, mk, humin29, pnalla, ndabilpuram,
qiming.yang, qi.z.zhang, radhac, rahul.lakkireddy, rmody,
rosen.xu, sachin.saxena, skoteshwar, shshaikh, shaibran,
shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, skori, mtetsuyah, vburru, viacheslavo,
xiao.w.wang, cloud.wangxiaoyun, yisen.zhuang, yongwang,
xuanziyang2, pkapoor, nadavh, sburla, pathreya, gakhil,
dmitry.kozliuk, anatoly.burakov, cristian.dumitrescu,
honnappa.nagarahalli, mattias.ronnblom, ruifeng.wang, drc,
konstantin.ananyev, olivier.matz, jay.jayatheerthan, asekhar,
pbhagavatula, eagostini, syalavarthi, dchickles, sshankarnara,
david.marchand
25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 14/11/2022 13:02, jerinj@marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> > > procedure/algorithm and data/pattern required to make predictions on live data.
> > > Once the model is created and trained outside of the DPDK scope, the model can be loaded
> > > via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > > The rte_ml_model_params_update() can be used to update the model parameters such as weight
> > > and bias without unloading the model using rte_ml_model_unload().
> >
> > The fact that the model is prepared outside means the model format is free
> > and probably different per mldev driver.
> > I think it is OK but it requires a lot of documentation effort to explain
> > how to bind the model and its parameters with the DPDK API.
> > Also we may need to pass some metadata from the model builder
> > to the inference engine in order to enable optimizations prepared in the model.
> > And the other way, we may need inference capabilities in order to generate
> > an optimized model which can run in the inference engine.
>
> The base API specification kept absolute minimum. Currently, weight and biases
> parameters updated through rte_ml_model_params_update(). It can be extended
> when there are drivers supports it or if you have any specific
> parameter you would like to add
> it in rte_ml_model_params_update().
This function is
int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
How are we supposed to provide separate parameters in this void* ?
> Other metadata data like batch, shapes, formats queried using rte_ml_io_info().
Copying:
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
Is it the right place to notify the app that some model optimizations
are supported? (example: merge some operations in the graph)
> > [...]
> > > Typical application utilisation of the ML API will follow the following
> > > programming flow.
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where is parameters update in this flow?
>
> Added the mandatory APIs in the top level flow doc.
> rte_ml_model_params_update() used to update the parameters.
The question is "where" should it be done?
Before/after start?
> > Should we update all parameters at once or can it be done more fine-grain?
>
> Currently, rte_ml_model_params_update() can be used to update weight
> and bias via buffer when device is
> in stop state and without unloading the model.
The question is "can we update a single parameter"?
And how?
> > Question about the memory used by mldev:
> > Can we manage where the memory is allocated (host, device, mix, etc)?
>
> Just passing buffer pointers now like other subsystem.
> Other EAL infra service can take care of the locality of memory as it
> is not specific to ML dev.
I was thinking about memory allocation required by the inference engine.
How to specify where to allocate? Is it just hardcoded in the driver?
^ permalink raw reply [flat|nested] 80+ messages in thread
* RE: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-26 11:11 ` Thomas Monjalon
@ 2023-01-27 2:33 ` Shivah Shankar Shankar Narayan Rao
2023-01-27 4:29 ` Jerin Jacob
0 siblings, 1 reply; 80+ messages in thread
From: Shivah Shankar Shankar Narayan Rao @ 2023-01-27 2:33 UTC (permalink / raw)
To: Thomas Monjalon, Jerin Jacob, Jerin Jacob Kollanukkaran
Cc: dev, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand
25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 14/11/2022 13:02, jerinj@marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > model consists of procedure/algorithm and data/pattern required to make predictions on live data.
> > > Once the model is created and trained outside of the DPDK scope,
> > > the model can be loaded via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > > The rte_ml_model_params_update() can be used to update the model
> > > parameters such as weight and bias without unloading the model using rte_ml_model_unload().
> >
> > The fact that the model is prepared outside means the model format
> > is free and probably different per mldev driver.
> > I think it is OK but it requires a lot of documentation effort to
> > explain how to bind the model and its parameters with the DPDK API.
> > Also we may need to pass some metadata from the model builder to the
> > inference engine in order to enable optimizations prepared in the model.
> > And the other way, we may need inference capabilities in order to
> > generate an optimized model which can run in the inference engine.
>
> The base API specification kept absolute minimum. Currently, weight
> and biases parameters updated through rte_ml_model_params_update(). It
> can be extended when there are drivers supports it or if you have any
> specific parameter you would like to add it in
> rte_ml_model_params_update().
This function is
int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
How are we supposed to provide separate parameters in this void* ?
Just to clarify what "parameters" means: it just means the weights and biases of the model, which are the parameters of a model.
Also, the proposed APIs are for running inference on a pre-trained model. For running inference, the amount of parameter tuning needed/done is limited to none.
The only parameters that may get changed are the weights and biases, which the rte_ml_model_params_update() API caters to.
While running inference on a model there won't be any random addition or removal of operators to/from the model, nor will there be any changes in the actual flow of the model.
Since the only parameters that can be changed are the weights and biases, the above API should take care of it.
> Other metadata data like batch, shapes, formats queried using rte_ml_io_info().
Copying:
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */ };
Is it the right place to notify the app that some model optimizations are supported? (example: merge some operations in the graph)
The inference is run on a pre-trained model, which means any merges/additions of operations to the graph are NOT done.
If any such things are done, then the changed model needs to go through training and compilation once again, which is out of scope of these APIs.
> > [...]
> > > Typical application utilisation of the ML API will follow the
> > > following programming flow.
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where is parameters update in this flow?
>
> Added the mandatory APIs in the top level flow doc.
> rte_ml_model_params_update() used to update the parameters.
The question is "where" should it be done?
Before/after start?
The model image comes with the weights and biases and will be loaded and used as part of rte_ml_model_load and rte_ml_model_start.
In rare scenarios where the user wants to update the weights and biases of an already loaded model, rte_ml_model_stop can be called to stop the model, the parameters (weights and biases) can be updated using the rte_ml_model_params_update() API, and rte_ml_model_start can then be called to start the model with the new weights and biases.
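In other words, the update sequence described above is the following (a sketch; new_params is assumed to hold the retrained weights and biases in the driver's expected blob format):

rte_ml_model_stop(dev_id, model_id);                      /* engine must be stopped */
rte_ml_model_params_update(dev_id, model_id, new_params); /* new weights and biases */
rte_ml_model_start(dev_id, model_id);                     /* resume with new params */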
> > Should we update all parameters at once or can it be done more fine-grain?
>
> Currently, rte_ml_model_params_update() can be used to update weight
> and bias via buffer when device is in stop state and without unloading
> the model.
The question is "can we update a single parameter"?
And how?
As mentioned above, for running inference the model is already trained; the only parameters that are updated are the weights and biases.
"Parameters" is another word for weights and biases. No other parameters are considered.
Are there any other parameters you have in mind?
> > Question about the memory used by mldev:
> > Can we manage where the memory is allocated (host, device, mix, etc)?
>
> Just passing buffer pointers now like other subsystem.
> Other EAL infra service can take care of the locality of memory as it
> is not specific to ML dev.
I was thinking about memory allocation required by the inference engine.
How to specify where to allocate? Is it just hardcoded in the driver?
Any memory within the hardware is managed by the driver.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-27 2:33 ` [EXT] " Shivah Shankar Shankar Narayan Rao
@ 2023-01-27 4:29 ` Jerin Jacob
2023-01-27 11:34 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2023-01-27 4:29 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao
Cc: Thomas Monjalon, Jerin Jacob Kollanukkaran, dev, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing,
bruce.richardson, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand
On Fri, Jan 27, 2023 at 8:04 AM Shivah Shankar Shankar Narayan Rao
<sshankarnara@marvell.com> wrote:
>
> 25/01/2023 20:01, Jerin Jacob:
> > On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 14/11/2022 13:02, jerinj@marvell.com:
> > > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > > model consists of procedure/algorithm and data/pattern required to make predictions on live data.
> > > > Once the model is created and trained outside of the DPDK scope,
> > > > the model can be loaded via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > > > The rte_ml_model_params_update() can be used to update the model
> > > > parameters such as weight and bias without unloading the model using rte_ml_model_unload().
> > >
> > > The fact that the model is prepared outside means the model format
> > > is free and probably different per mldev driver.
> > > I think it is OK but it requires a lot of documentation effort to
> > > explain how to bind the model and its parameters with the DPDK API.
> > > Also we may need to pass some metadata from the model builder to the
> > > inference engine in order to enable optimizations prepared in the model.
> > > And the other way, we may need inference capabilities in order to
> > > generate an optimized model which can run in the inference engine.
> >
> > The base API specification kept absolute minimum. Currently, weight
> > and biases parameters updated through rte_ml_model_params_update(). It
> > can be extended when there are drivers supports it or if you have any
> > specific parameter you would like to add it in
> > rte_ml_model_params_update().
>
> This function is
> int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
>
> How are we supposed to provide separate parameters in this void* ?
>
> Just to clarify on what "parameters" mean, they just mean weights and biases of the model, which are the parameters for a model.
> Also, the Proposed APIs are for running the inference on a pre-trained model. For running the inference the amount of parameters tuning needed/done is limited/none.
> The only parameters that get may get changed are the Weights and Bias which the API rte_ml_model_params_update() caters to.
>
> While running the inference on a Model there won't be any random addition or removal of operators to/from the model or there won't be any changes in the actual flow of model.
> Since the only parameter that can be changed is Weights and Biases the above API should take care.
>
> > Other metadata data like batch, shapes, formats queried using rte_ml_io_info().
>
> Copying:
> +/** Input and output data information structure
> + *
> + * Specifies the type and shape of input and output data.
> + */
> +struct rte_ml_io_info {
> + char name[RTE_ML_STR_MAX];
> + /**< Name of data */
> + struct rte_ml_io_shape shape;
> + /**< Shape of data */
> + enum rte_ml_io_type qtype;
> + /**< Type of quantized data */
> + enum rte_ml_io_type dtype;
> + /**< Type of de-quantized data */ };
>
> Is it the right place to notify the app that some model optimizations are supported? (example: merge some operations in the graph)
>
> The inference is run on a pre-trained model, which means any merges /additions of operations to the graph are NOT done.
> If any such things are done then the changed model needs to go through the training and compilation once again which is out of scope of these APIs.
>
> > > [...]
> > > > Typical application utilisation of the ML API will follow the
> > > > following programming flow.
> > > >
> > > > - rte_ml_dev_configure()
> > > > - rte_ml_dev_queue_pair_setup()
> > > > - rte_ml_model_load()
> > > > - rte_ml_model_start()
> > > > - rte_ml_model_info()
> > > > - rte_ml_dev_start()
> > > > - rte_ml_enqueue_burst()
> > > > - rte_ml_dequeue_burst()
> > > > - rte_ml_model_stop()
> > > > - rte_ml_model_unload()
> > > > - rte_ml_dev_stop()
> > > > - rte_ml_dev_close()
> > >
> > > Where is parameters update in this flow?
> >
> > Added the mandatory APIs in the top level flow doc.
> > rte_ml_model_params_update() used to update the parameters.
>
> The question is "where" should it be done?
> Before/after start?
>
> The model image comes with the Weights and Bias and will be loaded and used as a part of rte_ml_model_load and rte_ml_model_start.
> In rare scenarios where the user wants to update the Weights and Bias of an already loaded model, the rte_ml_model_stop can be called to stop the model and the Weights and Biases can be updated using the The parameters (Weights&Biases) can be updated when the rte_ml_model_params_update() API followed by rte_ml_model_start to start the model with the new Weights and Biases.
>
> > > Should we update all parameters at once or can it be done more fine-grain?
> >
> > Currently, rte_ml_model_params_update() can be used to update weight
> > and bias via buffer when device is in stop state and without unloading
> > the model.
>
> The question is "can we update a single parameter"?
> And how?
> As mentioned above for running inference the model is already trained the only parameter that is updated is the Weights and Biases.
> "Parameters" is another word for Weights and Bias. No other parameters are considered.
>
> Are there any other parameters you have on your mind?
>
> > > Question about the memory used by mldev:
> > > Can we manage where the memory is allocated (host, device, mix, etc)?
> >
> > Just passing buffer pointers now like other subsystem.
> > Other EAL infra service can take care of the locality of memory as it
> > is not specific to ML dev.
>
> I was thinking about memory allocation required by the inference engine.
> How to specify where to allocate? Is it just hardcoded in the driver?
>
> Any memory within the hardware is managed by the driver.
I think Thomas is asking about input and output memory for inference. If
so, the parameters go in
struct rte_ml_buff_seg, or we need to add a type or so. Thomas, please
propose what parameters you want here.
In case it is for internal driver memory, we can pass the memory
type in rte_ml_dev_configure(). If so, please propose
the memory types you need and the parameters. (A hypothetical sketch follows below.)
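A hypothetical shape for such a configure-time knob (purely illustrative; nothing like this exists in the patch):

/* Hypothetical sketch: where the driver's internal inference memory
 * should come from, passed via rte_ml_dev_configure().
 */
enum rte_ml_mem_type {
        RTE_ML_MEM_TYPE_HOST,   /* allocate from host memory */
        RTE_ML_MEM_TYPE_DEVICE, /* allocate from device memory */
        RTE_ML_MEM_TYPE_MIXED,  /* let the driver decide per allocation */
};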
>
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-27 4:29 ` Jerin Jacob
@ 2023-01-27 11:34 ` Thomas Monjalon
2023-01-28 11:27 ` Jerin Jacob
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-01-27 11:34 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao, Jerin Jacob
Cc: Jerin Jacob Kollanukkaran, dev, ferruh.yigit, ajit.khaparde,
aboyer, andrew.rybchenko, beilei.xing, bruce.richardson, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand
Hi,
Shivah Shankar, please quote your replies
so we can distinguish what I said from what you said.
Please try to understand my questions; you tend to reply to something else.
27/01/2023 05:29, Jerin Jacob:
> On Fri, Jan 27, 2023 at 8:04 AM Shivah Shankar Shankar Narayan Rao
> <sshankarnara@marvell.com> wrote:
> > 25/01/2023 20:01, Jerin Jacob:
> > > On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 14/11/2022 13:02, jerinj@marvell.com:
> > > > > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > > > > model consists of procedure/algorithm and data/pattern required to
> > > > > > make predictions on live data. Once the model is created and
> > > > > > trained outside of the DPDK scope,
> > > > > > the model can be loaded via rte_ml_model_load() and then start it
> > > > > > using rte_ml_model_start() API. The rte_ml_model_params_update()
> > > > > > can be used to update the model
> > > > > > parameters such as weight and bias without unloading the model
> > > > > > using rte_ml_model_unload().> > > > >
> > > > > The fact that the model is prepared outside means the model format
> > > > > is free and probably different per mldev driver.
> > > > > I think it is OK but it requires a lot of documentation effort to
> > > > > explain how to bind the model and its parameters with the DPDK API.
> > > > > Also we may need to pass some metadata from the model builder to the
> > > > > inference engine in order to enable optimizations prepared in the
> > > > > model.
> > > > > And the other way, we may need inference capabilities in order to
> > > > > generate an optimized model which can run in the inference engine.
> > > >
> > > > The base API specification kept absolute minimum. Currently, weight
> > > > and biases parameters updated through rte_ml_model_params_update(). It
> > > > can be extended when there are drivers supports it or if you have any
> > > > specific parameter you would like to add it in
> > > > rte_ml_model_params_update().
> > >
> > > This function is
> > > int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void
> > > *buffer);
> > >
> > > How are we supposed to provide separate parameters in this void* ?
> >
> > Just to clarify on what "parameters" mean,
> > they just mean weights and biases of the model,
> > which are the parameters for a model.
> > Also, the Proposed APIs are for running the inference
> > on a pre-trained model.
> > For running the inference the amount of parameters tuning
> > needed/done is limited/none.
Why is it limited?
I think you are limiting this to *your* model.
> > The only parameters that get may get changed are the Weights and Bias
> > which the API rte_ml_model_params_update() caters to.
Can we not imagine a model with more types of parameters?
> > While running the inference on a Model there won't be any random
> > addition or removal of operators to/from the model or there won't
> > be any changes in the actual flow of model.
> > Since the only parameter that can be changed is Weights and Biases
> > the above API should take care.
No, you are not replying to my question.
I want to be able to change a single parameter.
I am expecting a more fine-grained API than a simple "void*".
We could give the name of the parameter and a value, why not?
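One hypothetical shape for such a fine-grained, name/value setter (illustrative only; this function does not exist in the patch):

/* Hypothetical sketch of a named-parameter update. */
int rte_ml_model_param_set(int16_t dev_id, int16_t model_id,
                           const char *name, const void *value, size_t size);

/* e.g. replace a single bias tensor by name */
rte_ml_model_param_set(dev_id, model_id, "conv1/bias", bias_data, bias_len);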
> > > > Other metadata data like batch, shapes, formats queried using
> > > > rte_ml_io_info().
> > > Copying:
> > > +/** Input and output data information structure
> > > + *
> > > + * Specifies the type and shape of input and output data.
> > > + */
> > > +struct rte_ml_io_info {
> > > + char name[RTE_ML_STR_MAX];
> > > + /**< Name of data */
> > > + struct rte_ml_io_shape shape;
> > > + /**< Shape of data */
> > > + enum rte_ml_io_type qtype;
> > > + /**< Type of quantized data */
> > > + enum rte_ml_io_type dtype;
> > > + /**< Type of de-quantized data */ };
> > >
> > > Is it the right place to notify the app that some model optimizations
> > > are supported? (example: merge some operations in the graph)
> >
> > The inference is run on a pre-trained model, which means
> > any merges /additions of operations to the graph are NOT done.
> > If any such things are done then the changed model needs to go
> > through the training and compilation once again
> > which is out of scope of these APIs.
Please try to understand what I am saying.
I want the application to be able to know that some capabilities are supported
by the inference driver.
That would make it possible to generate the model with some optimizations.
> > > > > [...]
> > > > > > Typical application utilisation of the ML API will follow the
> > > > > > following programming flow.
> > > > > >
> > > > > > - rte_ml_dev_configure()
> > > > > > - rte_ml_dev_queue_pair_setup()
> > > > > > - rte_ml_model_load()
> > > > > > - rte_ml_model_start()
> > > > > > - rte_ml_model_info()
> > > > > > - rte_ml_dev_start()
> > > > > > - rte_ml_enqueue_burst()
> > > > > > - rte_ml_dequeue_burst()
> > > > > > - rte_ml_model_stop()
> > > > > > - rte_ml_model_unload()
> > > > > > - rte_ml_dev_stop()
> > > > > > - rte_ml_dev_close()
> > > > >
> > > > > Where is parameters update in this flow?
> > > >
> > > > Added the mandatory APIs in the top level flow doc.
> > > > rte_ml_model_params_update() used to update the parameters.
> > >
> > > The question is "where" should it be done?
> > > Before/after start?
> >
> > The model image comes with the Weights and Bias
> > and will be loaded and used as a part of rte_ml_model_load
> > and rte_ml_model_start.
> > In rare scenarios where the user wants to update
> > the Weights and Bias of an already loaded model,
> > the rte_ml_model_stop can be called to stop the model
> > and the Weights and Biases can be updated using the
> > The parameters (Weights&Biases) can be updated
> > when the rte_ml_model_params_update() API
> > followed by rte_ml_model_start to start the model
> > with the new Weights and Biases.
OK, please make sure it is documented that a parameters update
must be done on a stopped engine.
> > > > > Should we update all parameters at once or can it be done more
> > > > > fine-grain?
> > > >
> > > > Currently, rte_ml_model_params_update() can be used to update weight
> > > > and bias via buffer when device is in stop state and without unloading
> > > > the model.
Passing a raw buffer is a really opaque API.
We need to know how to fill the buffer.
> > > The question is "can we update a single parameter"?
> > > And how?
> >
> > As mentioned above for running inference the model is already trained
> > the only parameter that is updated is the Weights and Biases.
> > "Parameters" is another word for Weights and Bias.
> > No other parameters are considered.
You are not replying to the question.
How can we update a single parameter?
> > Are there any other parameters you have on your mind?
No
> > > > > Question about the memory used by mldev:
> > > > > Can we manage where the memory is allocated (host, device, mix,
> > > > > etc)?
> > > >
> > > > Just passing buffer pointers now like other subsystem.
> > > > Other EAL infra service can take care of the locality of memory as it
> > > > is not specific to ML dev.
> > >
> > > I was thinking about memory allocation required by the inference engine.
> > > How to specify where to allocate? Is it just hardcoded in the driver?
> >
> > Any memory within the hardware is managed by the driver.
>
> I think, Thomas is asking input and output memory for interference. If
> so, the parameters for
> struct rte_ml_buff_seg or needs to add type or so. Thomas, Please
> propose what parameters you want here.
> In case if it is for internal driver memory, We can pass the memory
> type in rte_ml_dev_configure(), If so, please propose
> the memory types you need and the parameters.
I'm talking about the memory used by the driver to make the inference work.
In some cases we may prefer the hardware to use host memory,
sometimes device memory.
I think that's something we could tune in the configuration.
I suppose we are fine with the allocation hardcoded in the driver for now,
as I don't have a clear need.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-27 11:34 ` Thomas Monjalon
@ 2023-01-28 11:27 ` Jerin Jacob
2023-02-01 16:57 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Jerin Jacob @ 2023-01-28 11:27 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand
On Fri, Jan 27, 2023 at 6:28 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> Hi,
>
> Shivah Shankar, please quote your replies
> so we can distinguish what I said from what you say.
>
> Please try to understand my questions, you tend to reply to something else.
>
>
> 27/01/2023 05:29, Jerin Jacob:
> > On Fri, Jan 27, 2023 at 8:04 AM Shivah Shankar Shankar Narayan Rao
> > <sshankarnara@marvell.com> wrote:
> > > 25/01/2023 20:01, Jerin Jacob:
> > > > On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > 14/11/2022 13:02, jerinj@marvell.com:
> > > > > > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > > > > > model consists of procedure/algorithm and data/pattern required to
> > > > > > > make predictions on live data. Once the model is created and
> > > > > > > trained outside of the DPDK scope,
> > > > > > > the model can be loaded via rte_ml_model_load() and then start it
> > > > > > > using rte_ml_model_start() API. The rte_ml_model_params_update()
> > > > > > > can be used to update the model
> > > > > > > parameters such as weight and bias without unloading the model
> > > > > > > using rte_ml_model_unload().> > > > >
> > > > > > The fact that the model is prepared outside means the model format
> > > > > > is free and probably different per mldev driver.
> > > > > > I think it is OK but it requires a lot of documentation effort to
> > > > > > explain how to bind the model and its parameters with the DPDK API.
> > > > > > Also we may need to pass some metadata from the model builder to the
> > > > > > inference engine in order to enable optimizations prepared in the
> > > > > > model.
> > > > > > And the other way, we may need inference capabilities in order to
> > > > > > generate an optimized model which can run in the inference engine.
> > > > >
> > > > > The base API specification kept absolute minimum. Currently, weight
> > > > > and biases parameters updated through rte_ml_model_params_update(). It
> > > > > can be extended when there are drivers supports it or if you have any
> > > > > specific parameter you would like to add it in
> > > > > rte_ml_model_params_update().
> > > >
> > > > This function is
> > > > int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void
> > > > *buffer);
> > > >
> > > > How are we supposed to provide separate parameters in this void* ?
> > >
> > > Just to clarify on what "parameters" mean,
> > > they just mean weights and biases of the model,
> > > which are the parameters for a model.
> > > Also, the Proposed APIs are for running the inference
> > > on a pre-trained model.
> > > For running the inference the amount of parameters tuning
> > > needed/done is limited/none.
>
> Why is it limited?
> I think you are limiting to *your* model.
See below.
>
> > > The only parameters that get may get changed are the Weights and Bias
> > > which the API rte_ml_model_params_update() caters to.
>
> We cannot imagine a model with more type of parameters?
>
> > > While running the inference on a Model there won't be any random
> > > addition or removal of operators to/from the model or there won't
> > > be any changes in the actual flow of model.
> > > Since the only parameter that can be changed is Weights and Biases
> > > the above API should take care.
>
> No, you don't reply to my question.
> I want to be able to change a single parameter.
> I am expecting a more fine-grain API than a simple "void*".
> We could give the name of the parameter and a value, why not?
The current API model is as follows:
1) The model is developed outside DPDK, and the binary file is loaded via
rte_ml_model_load().
2) The model's "read only" capabilities, like shape or quantized data, can
be read through the
rte_ml_model_info_get() API. If you wish to advertise any other
capability for optimization etc.,
please give an inline reply around rte_ml_io_info for the parameter and
its comment.
We can review and add it.
3) Now comes the parameter, which is the "update" on the model
loaded prior via rte_ml_model_load().
It is also created outside DPDK. The user has an "update" to the parameters
when a new round of training happens.
Currently we are assuming this is a single blob, due to the fact that it
is model specific and is just a continuous
stream of bytes from the model, and thus a void* is given.
If you have a use case, or your model supports more parameter updates as
separate blobs, we should be
able to update rte_ml_model_params_update() as needed. Please suggest a new
rte_ml_model_params_type enum or so; we can add that to
rte_ml_model_params_update(). (A hypothetical sketch follows below.)
Also, if you have a concrete data type instead of void* for a given type,
please propose the structure
for that as well. We should be able to update struct rte_ml_dev_info
with these capabilities
to abstract the model or inference engine differences.
>
> > > > > Other metadata like batch, shapes and formats is queried using
> > > > > rte_ml_io_info().
> > > > Copying:
> > > > +/** Input and output data information structure
> > > > + *
> > > > + * Specifies the type and shape of input and output data.
> > > > + */
> > > > +struct rte_ml_io_info {
> > > > + char name[RTE_ML_STR_MAX];
> > > > + /**< Name of data */
> > > > + struct rte_ml_io_shape shape;
> > > > + /**< Shape of data */
> > > > + enum rte_ml_io_type qtype;
> > > > + /**< Type of quantized data */
> > > > + enum rte_ml_io_type dtype;
> > > > + /**< Type of de-quantized data */ };
> > > >
> > > > Is it the right place to notify the app that some model optimizations
> > > > are supported? (example: merge some operations in the graph)
> > >
> > > The inference is run on a pre-trained model, which means
> > > any merges/additions of operations to the graph are NOT done.
> > > If any such things are done, then the changed model needs to go
> > > through training and compilation once again,
> > > which is out of scope of these APIs.
>
> Please try to understand what I am saying.
> I want the application to be able to know some capabilities are supported
> by the inference driver.
> So it will allow generating the model with some optimizations.
See above. Yes, this is the place to add that. Please propose any changes
that you want to add.
>
> > > > > > [...]
> > > > > > > Typical application utilisation of the ML API will follow the
> > > > > > > following programming flow.
> > > > > > >
> > > > > > > - rte_ml_dev_configure()
> > > > > > > - rte_ml_dev_queue_pair_setup()
> > > > > > > - rte_ml_model_load()
> > > > > > > - rte_ml_model_start()
> > > > > > > - rte_ml_model_info()
> > > > > > > - rte_ml_dev_start()
> > > > > > > - rte_ml_enqueue_burst()
> > > > > > > - rte_ml_dequeue_burst()
> > > > > > > - rte_ml_model_stop()
> > > > > > > - rte_ml_model_unload()
> > > > > > > - rte_ml_dev_stop()
> > > > > > > - rte_ml_dev_close()
> > > > > >
> > > > > > Where is parameters update in this flow?
> > > > >
> > > > > Added the mandatory APIs in the top level flow doc.
> > > > > rte_ml_model_params_update() used to update the parameters.
> > > >
> > > > The question is "where" should it be done?
> > > > Before/after start?
> > >
> > > The model image comes with the weights and bias
> > > and will be loaded and used as a part of rte_ml_model_load
> > > and rte_ml_model_start.
> > > In rare scenarios where the user wants to update
> > > the weights and bias of an already loaded model,
> > > rte_ml_model_stop can be called to stop the model,
> > > the weights and biases can be updated using the
> > > rte_ml_model_params_update() API,
> > > followed by rte_ml_model_start to start the model
> > > with the new weights and biases.
>
> OK, please make sure it is documented that parameter updates
> must be done on a stopped engine.
The doc is already there in the existing patch. Please see
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading
the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
>
> > > > > > Should we update all parameters at once or can it be done more
> > > > > > fine-grain?
> > > > >
> > > > > Currently, rte_ml_model_params_update() can be used to update weights
> > > > > and bias via a buffer when the device is in the stop state, without
> > > > > unloading the model.
>
> Passing a raw buffer is a really dark API.
> We need to know how to fill the buffer.
See above. Currently it is model specific, and the model emits the parameter
blob update after training. The DPDK inference engine API is a means to
transport these blobs from the model to the ML engine.
>
> > > > The question is "can we update a single parameter"?
> > > > And how?
> > >
> > > As mentioned above, for running inference the model is already trained;
> > > the only parameters that are updated are the weights and biases.
> > > "Parameters" is another word for weights and biases.
> > > No other parameters are considered.
>
> You are not replying to the question.
> How can we update a single parameter?
See above.
I see the main comments are on parameter update and getting the capabilities.
To enable that, please propose the changes around rte_ml_model_params_update()
and rte_ml_model_info. We should be able to take that and send a v2.
>
> > > Are there any other parameters you have on your mind?
>
> No
>
> > > > > > Question about the memory used by mldev:
> > > > > > Can we manage where the memory is allocated (host, device, mix,
> > > > > > etc)?
> > > > >
> > > > > Just passing buffer pointers now, like other subsystems.
> > > > > Other EAL infra services can take care of the locality of memory, as it
> > > > > is not specific to mldev.
> > > >
> > > > I was thinking about memory allocation required by the inference engine.
> > > > How to specify where to allocate? Is it just hardcoded in the driver?
> > >
> > > Any memory within the hardware is managed by the driver.
> >
> > I think Thomas is asking about input and output memory for inference. If
> > so, the parameters for
> > struct rte_ml_buff_seg need a type field or so. Thomas, please
> > propose what parameters you want here.
> > In case it is for internal driver memory, we can pass the memory
> > type in rte_ml_dev_configure(); if so, please propose
> > the memory types you need and the parameters.
>
> I'm talking about the memory used by the driver to make the inference work.
> In some cases we may prefer the hardware to use host memory,
> sometimes the device memory.
> I think that's something we may tune in the configuration.
> I suppose we are fine with allocation hardcoded in the driver for now,
> as I don't have a clear need.
OK.
>
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* RE: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
@ 2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
2023-02-03 0:25 ` Stephen Hemminger
` (2 more replies)
2023-02-02 5:26 ` Shivah Shankar Shankar Narayan Rao
1 sibling, 3 replies; 80+ messages in thread
From: Shivah Shankar Shankar Narayan Rao @ 2023-02-01 13:34 UTC (permalink / raw)
To: Jerin Jacob Kollanukkaran, dev, Thomas Monjalon,
Bruce Richardson, Srikanth Yalavarthi
Cc: ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Jerin Jacob Kollanukkaran, Parijat Shukla, Anup Prabhu,
Prince Takkar
-----Original Message-----
From: jerinj@marvell.com <jerinj@marvell.com>
Sent: Monday, November 14, 2022 5:32 PM
To: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Bruce Richardson <bruce.richardson@intel.com>; Srikanth Yalavarthi <syalavarthi@marvell.com>
Cc: ferruh.yigit@xilinx.com; ajit.khaparde@broadcom.com; aboyer@pensando.io; andrew.rybchenko@oktetlabs.ru; beilei.xing@intel.com; chas3@att.com; chenbo.xia@intel.com; ciara.loftus@intel.com; Devendra Singh Rawat <dsinghrawat@marvell.com>; ed.czeck@atomicrules.com; evgenys@amazon.com; grive@u256.net; g.singh@nxp.com; zhouguoyang@huawei.com; haiyue.wang@intel.com; Harman Kalra <hkalra@marvell.com>; heinrich.kuhn@corigine.com; hemant.agrawal@nxp.com; hyonkim@cisco.com; igorch@amazon.com; Igor Russkikh <irusskikh@marvell.com>; jgrajcia@cisco.com; jasvinder.singh@intel.com; jianwang@trustnetic.com; jiawenwu@trustnetic.com; jingjing.wu@intel.com; johndale@cisco.com; john.miller@atomicrules.com; linville@tuxdriver.com; keith.wiles@intel.com; Kiran Kumar Kokkilagadda <kirankumark@marvell.com>; oulijun@huawei.com; Liron Himi <lironh@marvell.com>; longli@microsoft.com; mw@semihalf.com; spinler@cesnet.cz; matan@nvidia.com; matt.peters@windriver.com; maxime.coquelin@redhat.com; mk@semihalf.com; humin29@huawei.com; Pradeep Kumar Nalla <pnalla@marvell.com>; Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>; qiming.yang@intel.com; qi.z.zhang@intel.com; Radha Chintakuntla <radhac@marvell.com>; rahul.lakkireddy@chelsio.com; Rasesh Mody <rmody@marvell.com>; rosen.xu@intel.com; sachin.saxena@oss.nxp.com; Satha Koteswara Rao Kottidi <skoteshwar@marvell.com>; Shahed Shaikh <shshaikh@marvell.com>; shaibran@amazon.com; shepard.siegel@atomicrules.com; asomalap@amd.com; somnath.kotur@broadcom.com; sthemmin@microsoft.com; steven.webster@windriver.com; Sunil Kumar Kori <skori@marvell.com>; mtetsuyah@gmail.com; Veerasenareddy Burru <vburru@marvell.com>; viacheslavo@nvidia.com; xiao.w.wang@intel.com; cloud.wangxiaoyun@huawei.com; yisen.zhuang@huawei.com; yongwang@vmware.com; xuanziyang2@huawei.com; Prasun Kapoor <pkapoor@marvell.com>; Nadav Haklai <nadavh@marvell.com>; Satananda Burla <sburla@marvell.com>; Narayana Prasad Raju Athreya <pathreya@marvell.com>; Akhil Goyal <gakhil@marvell.com>; mdr@ashroe.eu; dmitry.kozliuk@gmail.com; anatoly.burakov@intel.com; cristian.dumitrescu@intel.com; honnappa.nagarahalli@arm.com; mattias.ronnblom@ericsson.com; ruifeng.wang@arm.com; drc@linux.vnet.ibm.com; konstantin.ananyev@intel.com; olivier.matz@6wind.com; jay.jayatheerthan@intel.com; Ashwin Sekhar T K <asekhar@marvell.com>; Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; eagostini@nvidia.com; Derek Chickles <dchickles@marvell.com>; Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com>; Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Subject: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
From: Jerin Jacob <jerinj@marvell.com>
Add mldev API specification to standardize and use the machine learning
device and inference operations in a vendor-neutral way.
The following operations are abstracted through APIs:
- ML device capability probe
- ML device configuration
- ML device queue pair configuration
- ML device state management
- ML device stat/xstat operations
- ML model load/unload/start/stop operations
- ML model information probe
- ML IO operations to find size for input and output buffers
- ML quantize and dequantize operations
- ML ops pool creation and free operations
- ML device enqueue/dequeue fastpath inference operations
Also added programming guide.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
MAINTAINERS | 5 +
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
lib/eal/common/eal_common_log.c | 1 +
lib/eal/include/rte_log.h | 1 +
lib/meson.build | 1 +
lib/mldev/meson.build | 18 +
lib/mldev/rte_mldev.c | 5 +
lib/mldev/rte_mldev.h | 1092 ++++++++++++++++++++++
lib/mldev/version.map | 3 +
14 files changed, 2032 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
Acked-by: Shivah Shankar S <sshankarnara@marvell.com>
diff --git a/MAINTAINERS b/MAINTAINERS
index 0e2fd39928..b2ab042248 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -534,6 +534,11 @@ F: drivers/raw/skeleton/
F: app/test/test_rawdev.c
F: doc/guides/prog_guide/rawdev.rst
+ML device API - EXPERIMENTAL
+M: Srikanth Yalavarthi <syalavarthi@marvell.com>
+F: lib/mldev/
+F: doc/guides/prog_guide/mldev.rst
+
Memory Pool Drivers
-------------------
diff --git a/config/rte_config.h b/config/rte_config.h
index 3c4876d434..083d37757d 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -83,6 +83,9 @@
/* rawdev defines */
#define RTE_RAWDEV_MAX_DEVS 64
+/* mldev defines */
+#define RTE_MLDEV_MAX_DEVS 64
+
/* ip_fragmentation defines */
#define RTE_LIBRTE_IP_FRAG_MAX_FRAG 8
// RTE_LIBRTE_IP_FRAG_TBL_STAT is not set
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de488c7abf..a12562977a 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -22,6 +22,7 @@ The public API headers are grouped by topics:
[compress](@ref rte_comp.h),
[regexdev](@ref rte_regexdev.h),
[dmadev](@ref rte_dmadev.h),
+ [mldev](@ref rte_mldev.h),
[eventdev](@ref rte_eventdev.h),
[event_eth_rx_adapter](@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter](@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index f0886c3bd1..5d6416d3e0 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -57,6 +57,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/mempool \
@TOPDIR@/lib/meter \
@TOPDIR@/lib/metrics \
+ @TOPDIR@/lib/mldev \
@TOPDIR@/lib/node \
@TOPDIR@/lib/net \
@TOPDIR@/lib/pcapng \
diff --git a/doc/guides/prog_guide/img/mldev_flow.svg b/doc/guides/prog_guide/img/mldev_flow.svg
new file mode 100644
index 0000000000..6c5dda14e5
--- /dev/null
+++ b/doc/guides/prog_guide/img/mldev_flow.svg
@@ -0,0 +1,714 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!-- SPDX-License-Identifier: BSD-3-Clause -->
+<!-- Copyright (c) 2022 Marvell. -->
+<!-- Created with Inkscape (http://www.inkscape.org/) -->
+
+<svg
+ width="320mm"
+ height="297mm"
+ viewBox="0 0 320 297"
+ version="1.1"
+ id="svg6899"
+ inkscape:version="1.2.1 (9c6d41e410, 2022-07-14)"
+ sodipodi:docname="mldev_flow.svg"
+ inkscape:export-filename="mldev_flow.png"
+ inkscape:export-xdpi="96"
+ inkscape:export-ydpi="96"
+ xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
+ xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
+ xmlns="http://www.w3.org/2000/svg"
+ xmlns:svg="http://www.w3.org/2000/svg">
+ <sodipodi:namedview
+ id="namedview6901"
+ pagecolor="#ffffff"
+ bordercolor="#000000"
+ borderopacity="0.25"
+ inkscape:showpageshadow="2"
+ inkscape:pageopacity="0.0"
+ inkscape:pagecheckerboard="0"
+ inkscape:deskcolor="#d1d1d1"
+ inkscape:document-units="mm"
+ showgrid="false"
+ inkscape:connector-spacing="0"
+ inkscape:lockguides="false"
+ inkscape:zoom="0.49638341"
+ inkscape:cx="640.63382"
+ inkscape:cy="525.80323"
+ inkscape:window-width="1920"
+ inkscape:window-height="986"
+ inkscape:window-x="-11"
+ inkscape:window-y="-11"
+ inkscape:window-maximized="1"
+ inkscape:current-layer="layer1" />
+ <defs
+ id="defs6896">
+ <marker
+ style="overflow:visible"
+ id="RoundedArrow"
+ refX="5"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="RoundedArrow"
+ markerWidth="6.1347523"
+ markerHeight="5.9304948"
+ viewBox="0 0 6.1347524 5.9304951"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.7)"
+ d="m -0.21114562,-4.1055728 6.42229122,3.21114561 a 1,1 90 0 1 0,1.78885438 L -0.21114562,4.1055728 A 1.236068,1.236068 31.717474 0 1 -2,3 v -6 a 1.236068,1.236068 148.28253 0 1 1.78885438,-1.1055728 z"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:none"
+ id="path1367" />
+ </marker>
+ <marker
+ style="overflow:visible"
+ id="TriangleStart"
+ refX="4"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="TriangleStart"
+ markerWidth="5.3244081"
+ markerHeight="6.155385"
+ viewBox="0 0 5.3244081 6.1553851"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.5)"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:context-stroke;stroke-width:1pt"
+ d="M 5.77,0 -2.88,5 V -5 Z"
+ id="path135" />
+ </marker>
+ </defs>
+ <g
+ inkscape:label="Layer 1"
+ inkscape:groupmode="layer"
+ id="layer1">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;paint-order:stroke fill markers"
+ id="rect39991"
+ width="312.88394"
+ height="286.7659"
+ x="3.5580292"
+ y="5.1170502"
+ ry="18.197132" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.68664,155.38145 h 32.15418"
+ id="path24358"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="m 114.68664,179.58099 h 32.15008"
+ id="path24360"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,203.78389 h 32.15008"
+ id="path24362"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,227.98576 32.14997,0"
+ id="path24364"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,252.18432 H 114.68664"
+ id="path24366"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,276.38309 H 114.68664"
+ id="path24368"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:2, 1;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24370"
+ width="18.09137"
+ height="13.568528"
+ x="127.27605"
+ y="208.81961"
+ ry="2.7394907"
+ inkscape:connector-avoid="true" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:4, 2;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 70.388979,148.58514 -1e-6,-46.3516"
+ id="path24426"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1"
+ inkscape:connection-end="#rect24176" />
+ <g
+ id="g42647">
+ <g
+ id="g31403"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844498;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844498;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-9"
+ width="99.155487"
+ height="14.152132"
+ x="190.88715"
+ y="229.93475"
+ ry="2.2479143"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-236.90309"
+ y="240.37343"
+ id="text31115"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113"
+ style="stroke:none;stroke-width:0.75"
+ x="-236.90309"
+ y="240.37343">rte_ml_model_update_params()</tspan></text>
+ </g>
+ <g
+ id="g31398"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68902, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-4"
+ width="99.155495"
+ height="14.152357"
+ x="190.88705"
+ y="205.73608"
+ ry="2.2479498"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-212.70453"
+ y="240.37334"
+ id="text31115-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-212.70453"
+ y="240.37334">rte_ml_model_stop()</tspan></text>
+ </g>
+ <g
+ id="g31408"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-2"
+ width="99.155495"
+ height="14.152359"
+ x="190.88715"
+ y="254.13341"
+ ry="2.2479503"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-261.10187"
+ y="240.37343"
+ id="text31115-1"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-3"
+ style="stroke:none;stroke-width:0.75"
+ x="-261.10187"
+ y="240.37343">rte_ml_model_unload()</tspan></text>
+ </g>
+ <g
+ id="g31393"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844566;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844566;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-5"
+ width="99.155434"
+ height="14.154394"
+ x="190.88718"
+ y="181.53319"
+ ry="2.2482734"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-188.50266"
+ y="240.37343"
+ id="text31115-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-2"
+ style="stroke:none;stroke-width:0.75"
+ x="-188.50266"
+ y="240.37343">rte_ml_model_start()</tspan></text>
+ </g>
+ <g
+ id="g31388"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844565;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844565;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-8"
+ width="99.155434"
+ height="14.154395"
+ x="190.88718"
+ y="157.33029"
+ ry="2.2482736"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-164.29976"
+ y="240.37343"
+ id="text31115-6"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-5"
+ style="stroke:none;stroke-width:0.75"
+ x="-164.29976"
+ y="240.37343">rte_ml_model_info_get()</tspan></text>
+ </g>
+ <g
+ id="g31383"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2"
+ width="99.155495"
+ height="14.152369"
+ x="190.89127"
+ y="133.13176"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-140.10022"
+ y="240.37755"
+ id="text31115-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35"
+ style="stroke:none;stroke-width:0.75"
+ x="-140.10022"
+ y="240.37755">rte_ml_model_load()</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="112.15163"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-119.12009"
+ y="233.56647"
+ id="text31115-0-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-119.12009"
+ y="233.56647">rte_ml_dequeue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.90712,47.649005 h 56.16045"
+ id="path24248"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176"
+ inkscape:connection-end="#rect24200" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 171.06762,70.71111 -56.1605,0.0024"
+ id="path24250"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="M 171.06765,93.773951 H 114.90712"
+ id="path24252"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5-2" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44396,47.649004 h 36.42795"
+ id="path24566"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.444,70.710168 h 36.42791"
+ id="path24568"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44395,93.773951 36.42796,-10e-7"
+ id="path24570"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42675">
+ <g
+ id="g31358"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200"
+ width="44.376362"
+ height="17.244751"
+ x="190.77635"
+ y="22.794853"
+ ry="2.7391431"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.802492"
+ y="212.98004"
+ id="text31256"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254"
+ style="stroke-width:0.75"
+ x="-31.802492"
+ y="212.98004">Queue Pair 0</tspan></text>
+ </g>
+ <g
+ id="g31353"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5"
+ width="44.376362"
+ height="17.244749"
+ x="190.7764"
+ y="45.856018"
+ ry="2.7391429"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-54.863655"
+ y="213.10411"
+ id="text31256-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-9"
+ style="stroke-width:0.75"
+ x="-54.863655"
+ y="213.10411">Queue Pair ..</tspan></text>
+ </g>
+ <g
+ id="g31363"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623731;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24746, 0.623731;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2"
+ width="44.37627"
+ height="17.249832"
+ x="190.77643"
+ y="68.917259"
+ ry="2.7399504"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-77.927437"
+ y="213.08859"
+ id="text31256-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-8"
+ style="stroke-width:0.75"
+ x="-77.927437"
+ y="213.08859">Queue Pair N</tspan></text>
+ </g>
+ </g>
+ <g
+ id="g42661">
+ <g
+ id="g31368"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="25.995117"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.941525"
+ y="287.03415"
+ id="text31260"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258"
+ style="stroke-width:0.75"
+ x="-31.941525"
+ y="287.03415">Core 0</tspan></text>
+ </g>
+ <g
+ id="g31373"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-4"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="49.056282"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-55.00008"
+ y="287.15549"
+ id="text31260-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-7"
+ style="stroke-width:0.75"
+ x="-55.00008"
+ y="287.15549">Core ..</tspan></text>
+ </g>
+ <g
+ id="g31378"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-41"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="72.120064"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-78.063866"
+ y="287.13998"
+ id="text31260-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-8"
+ style="stroke-width:0.75"
+ x="-78.063866"
+ y="287.13998">Core N</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5-6"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="13.539296"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-20.507757"
+ y="233.56647"
+ id="text31115-0-5-7"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8-7"
+ style="stroke:none;stroke-width:0.75"
+ x="-20.507757"
+ y="233.56647">rte_ml_enqueue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:2.25, 0.75;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#RoundedArrow)"
+ d="M 233.65793,27.691665 V 112.15163"
+ id="path36804"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42683">
+ <rect
+ style="fill:#44d7f4;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176"
+ width="89.036293"
+ height="63.036304"
+ x="25.870831"
+ y="39.197231"
+ ry="3.0941005" />
+ <text
+ xml:space="preserve"
+ style="font-size:11.2889px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-49.288273"
+ y="70.228432"
+ id="text38896"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan38894"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-49.288273"
+ y="70.228432">Machine</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-63.399399"
+ y="70.228432"
+ id="tspan38898">Learning</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-77.510529"
+ y="70.228432"
+ id="tspan38900">Inference</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-91.621651"
+ y="70.228432"
+ id="tspan38902">Engine</tspan></text>
+ </g>
+ <g
+ id="g42621">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.405;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176-1"
+ width="88.595322"
+ height="134.59531"
+ x="26.09132"
+ y="148.58514"
+ ry="6.6065331" />
+ <g
+ id="g42601">
+ <g
+ id="g39966"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="146.14212"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-157.3761"
+ y="130.49591"
+ id="text39799"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-157.3761"
+ y="130.49591">Model 0</tspan></text>
+ </g>
+ <g
+ id="g39971"
+ transform="translate(-60.175151,10.144334)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-8"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="178.65079"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-189.88477"
+ y="130.49591"
+ id="text39799-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-1"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-189.88477"
+ y="130.49591">Model 1</tspan></text>
+ </g>
+ <g
+ id="g39976"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-9"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="211.15947"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-222.39345"
+ y="130.49591"
+ id="text39799-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-8"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-222.39345"
+ y="130.49591">Model ..</tspan></text>
+ </g>
+ <g
+ id="g39981"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-7"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="243.66815"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-254.90213"
+ y="130.49591"
+ id="text39799-90"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-5"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-254.90213"
+ y="130.49591">Model N</tspan></text>
+ </g>
+ </g>
+ </g>
+ <text
+ xml:space="preserve"
+ style="font-size:14.1111px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-279.79742"
+ y="275.46826"
+ id="text38896-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:14.1111px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-279.79742"
+ y="275.46826"
+ id="tspan38902-6">mldev</tspan></text>
+ </g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..d7f2a28bdb 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
regexdev
dmadev
gpudev
+ mldev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
diff --git a/doc/guides/prog_guide/mldev.rst b/doc/guides/prog_guide/mldev.rst
new file mode 100644
index 0000000000..9809f2dba3
--- /dev/null
+++ b/doc/guides/prog_guide/mldev.rst
@@ -0,0 +1,186 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright (c) 2022 Marvell.
+
+Machine Learning Device Library
+===============================
+
+The MLDEV library provides a Machine Learning device framework for the management and
+provisioning of hardware and software ML poll mode drivers, defining APIs which
+support a number of ML operations including device handling and inference processing.
+ML model creation and training are outside the scope of this library.
+
+The ML framework is built on the following model:
+
+.. _figure_mldev_work_flow:
+
+.. figure:: img/mldev_flow.*
+
+ Work flow of inference on MLDEV
+
+**ML Device**: A hardware or software-based implementation of ML device API for running
+inferences using a pre-trained ML model.
+
+**ML Model**: An ML model is an algorithm trained over a dataset. A model consists of
+the procedure/algorithm and data/pattern required to make predictions on live data. Once
+the model is created and trained outside of the DPDK scope, the model can be loaded
+via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+rte_ml_model_params_update() can be used to update the model parameters such as weights
+and bias without unloading the model via rte_ml_model_unload().
+
+**ML Inference**: ML inference is the process of feeding data to the model via the
+rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
+outputs / predictions from the started model.
+
+Design Principles
+-----------------
+
+The MLDEV library follows the same basic principles as those used in DPDK's
+Ethernet Device framework and the Crypto framework. The MLDEV framework provides
+a generic Machine Learning device framework which supports both physical (hardware)
+and virtual (software) ML devices, as well as an ML API to manage and configure ML
+devices. The API also supports performing ML inference operations through an ML poll
+mode driver.
+
+
+Device Operations
+-----------------
+
+Device Creation
+~~~~~~~~~~~~~~~
+
+Physical ML devices are discovered during the PCI probe/enumeration, through the
+EAL functions which are executed at DPDK initialization, based on their PCI device
+identifier, each unique PCI BDF (bus, device, function). ML physical devices,
+like other physical devices in DPDK, can be white-listed or black-listed
+using the EAL command line options.
+
+
+Device Identification
+~~~~~~~~~~~~~~~~~~~~~
+
+Each device, whether virtual or physical, is uniquely designated by two
+identifiers:
+
+- A unique device index used to designate the ML device in all functions
+ exported by the MLDEV API.
+
+- A device name used to designate the ML device in console messages, for
+ administration or debugging purposes.
+
+Device Features and Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ML devices may support different feature sets. In order to get the
+supported PMD features, use the ``rte_ml_dev_info_get`` API, which returns
+the info of the device and its supported features.
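+
+A minimal sketch of querying the device information:
+
+.. code-block:: c
+
+   struct rte_ml_dev_info dev_info;
+
+   /* Retrieve the device information and its supported limits. */
+   if (rte_ml_dev_info_get(dev_id, &dev_info) != 0)
+      rte_exit(EXIT_FAILURE, "Failed to get info of ML device %d\n", dev_id);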
+
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+The configuration of each ML device includes the following operations:
+
+- Allocation of resources, including hardware resources if a physical device.
+- Resetting the device into a well-known default state.
+- Initialization of statistics counters.
+
+The rte_ml_dev_configure API is used to configure an ML device.
+
+.. code-block:: c
+
+ int rte_ml_dev_configure(uint8_t dev_id, const struct rte_ml_dev_config *cfg);
+
+The ``rte_ml_dev_config`` structure is used to pass the configuration parameters
+for the ML device, for example number of queue pairs, maximum number of models,
+maximum size of model and so on.
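+
+A configuration sketch follows; the ``rte_ml_dev_config`` field names used
+here are illustrative assumptions, not an authoritative layout:
+
+.. code-block:: c
+
+   struct rte_ml_dev_config cfg = {
+      .socket_id = rte_socket_id(), /* NUMA node for device resources (assumed field) */
+      .nb_models = 1,               /* maximum number of loaded models (assumed field) */
+      .nb_queue_pairs = 1,          /* queue pairs to be set up next (assumed field) */
+   };
+
+   if (rte_ml_dev_configure(dev_id, &cfg) != 0)
+      rte_exit(EXIT_FAILURE, "Failed to configure ML device %d\n", dev_id);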
+
+Configuration of Queue Pairs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each ML device can be configured with a number of queue pairs.
+Each queue pair is configured using ``rte_ml_dev_queue_pair_setup``.
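+
+A sketch of setting up the queue pairs (assuming the setup call takes a
+``rte_ml_dev_qp_conf`` and a socket id; the ``nb_desc`` field name is an
+illustrative assumption):
+
+.. code-block:: c
+
+   struct rte_ml_dev_qp_conf qp_conf = {
+      .nb_desc = 1024, /* number of descriptors per queue pair */
+   };
+   uint16_t qp_id;
+
+   for (qp_id = 0; qp_id < cfg.nb_queue_pairs; qp_id++)
+      if (rte_ml_dev_queue_pair_setup(dev_id, qp_id, &qp_conf,
+               rte_socket_id()) != 0)
+         rte_exit(EXIT_FAILURE, "Failed to set up queue pair %u\n", qp_id);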
+
+Logical Cores, Memory and Queues Pair Relationships
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Multiple logical cores should never share the same queue pair for enqueuing
+operations or dequeueing operations on the same ML device since this would
+require global locks and hinder performance.
+
+Configuration of Machine Learning models
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pre-trained ML models that are built using external ML compiler / training frameworks
+are used to perform inference operations. These models are configured on an ML device
+in a two-stage process that includes loading the model on an ML device, and starting
+the model to accept inference operations. Inference operations can be queued for a
+model only when the model is in started state. Model load stage assigns a Model ID,
+which is unique for the model in a driver's context. Model ID is used during all
+subsequent slow-path and fast-path operations.
+
+Model loading and start is done through the ``rte_ml_model_load`` and
+``rte_ml_model_start`` functions.
+
+Similarly, stop and unload are done through the ``rte_ml_model_stop`` and
+``rte_ml_model_unload`` functions.
+
+Stop and unload functions would release the resources allocated for the
+models. Inference tasks cannot be queued for a model that is stopped.
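+
+A sketch of the model life cycle (the ``rte_ml_model_params`` fields carrying
+the model buffer address and size are illustrative assumptions):
+
+.. code-block:: c
+
+   struct rte_ml_model_params params = {
+      .addr = model_buffer, /* pre-trained model binary in memory (assumed field) */
+      .size = model_size,   /* size of the model binary in bytes (assumed field) */
+   };
+   int16_t model_id;
+
+   /* Load assigns a model_id; start makes the model ready for inference. */
+   if (rte_ml_model_load(dev_id, &params, &model_id) != 0 ||
+         rte_ml_model_start(dev_id, model_id) != 0)
+      rte_exit(EXIT_FAILURE, "Failed to load/start the model\n");
+
+   /* ... enqueue and dequeue inference operations ... */
+
+   rte_ml_model_stop(dev_id, model_id);
+   rte_ml_model_unload(dev_id, model_id);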
+
+Detailed information related to the model can be retrieved from the driver using the
+function ``rte_ml_model_info_get``. Model information is accessible to the application
+through the ``rte_ml_model_info`` structure. Information available to the user would
+include the details related to the inputs and outputs, and the maximum batch size
+supported by the model.
+
+The user can optionally update the model parameters, such as weights and bias, without
+unloading the model, through the ``rte_ml_model_params_update`` function. A model should
+be in the stopped state to update the parameters. The model has to be started again in
+order to enqueue inference requests after a params update.
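+
+A sketch of the required stop / update / start sequence, using the
+``rte_ml_model_params_update`` prototype discussed on the list:
+
+.. code-block:: c
+
+   /* new_wb holds the updated weights and bias blob; its size equals the
+    * wb_size reported in rte_ml_model_info. */
+   rte_ml_model_stop(dev_id, model_id);
+   rte_ml_model_params_update(dev_id, model_id, new_wb);
+   rte_ml_model_start(dev_id, model_id);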
+
+Enqueue / Dequeue
+~~~~~~~~~~~~~~~~~
+
+The burst enqueue API uses an ML device identifier and a queue pair identifier
+to specify the device queue pair to schedule the processing on. The ``nb_ops``
+parameter is the number of operations to process, which are supplied in the
+``ops`` array of ``rte_ml_op`` structures. The enqueue function returns the
+number of operations it actually enqueued for processing; a return value equal to
+``nb_ops`` means that all operations have been enqueued.
+
+The dequeue API uses the same format as the enqueue API, but
+the ``nb_ops`` and ``ops`` parameters are now used to specify the maximum number
+of processed operations the user wishes to retrieve and the location in which to
+store them. The API call returns the actual number of processed operations
+returned; this can never be larger than ``nb_ops``.
+
+``rte_ml_op`` provides the required information to the driver to queue an ML inference
+task. An ML op specifies the model to be used and the number of batches to be executed in
+the inference task. Input and output buffer information is specified through the
+structure ``rte_ml_buff_seg``, which supports segmented data. Input is provided through
+``rte_ml_op::input`` and output through ``rte_ml_op::output``. The data pointed to by
+each op should not be released until the dequeue of that op.
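+
+A minimal enqueue/dequeue sketch; the ops are assumed to have been allocated
+from an op pool with their model, input and output fields filled in beforehand,
+and completions are collected into ``deq_ops``:
+
+.. code-block:: c
+
+   uint16_t nb_enq = 0, nb_deq = 0;
+
+   /* Retry until all nb_ops inference requests are accepted. */
+   while (nb_enq < nb_ops)
+      nb_enq += rte_ml_enqueue_burst(dev_id, qp_id, &ops[nb_enq],
+               nb_ops - nb_enq);
+
+   /* Collect completions; the return count never exceeds the requested max. */
+   while (nb_deq < nb_ops)
+      nb_deq += rte_ml_dequeue_burst(dev_id, qp_id, &deq_ops[nb_deq],
+               nb_ops - nb_deq);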
+
+
+Quantize and Dequantize
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Inference operations performed with lower precision types can improve the throughput
+and efficiency of the inference execution, with a minimal loss of accuracy that is within
+the tolerance limits. Quantization and dequantization are the processes of converting data
+from a higher precision type to a lower precision type and vice-versa. The ML library provides
+the functions ``rte_ml_io_quantize`` and ``rte_ml_io_dequantize`` to enable data type
+conversions. The user needs to provide the addresses of the quantized and dequantized data
+buffers to the functions, along with the number of batches in the buffers.
+
+For quantization, the dequantized data is assumed to be of the type ``dtype`` provided by
+``rte_ml_model_info::input`` and the data is converted to the ``qtype`` provided by
+``rte_ml_model_info::input``.
+
+For dequantization, the quantized data is assumed to be of the type ``qtype`` provided by
+``rte_ml_model_info::output`` and the data is converted to the ``dtype`` provided by
+``rte_ml_model_info::output``.
+
+Size of the buffers required for the input and output can be calculated using the functions
+``rte_ml_io_input_size_get`` and ``rte_ml_io_output_size_get``. These functions would get the
+buffer sizes for both quantized and dequantized data for the given number of batches.
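+
+A sketch of the conversion flow (the exact argument order of the quantize and
+dequantize calls is an illustrative assumption):
+
+.. code-block:: c
+
+   /* Convert user data of dtype into the model's quantized input of qtype. */
+   rte_ml_io_quantize(dev_id, model_id, nb_batches, dbuffer, qbuffer);
+
+   /* ... run inference using qbuffer as the op input ... */
+
+   /* Convert the model's quantized output of qtype back to dtype. */
+   rte_ml_io_dequantize(dev_id, model_id, nb_batches, qbuffer_out, dbuffer_out);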
+
diff --git a/lib/eal/common/eal_common_log.c b/lib/eal/common/eal_common_log.c
index bd7b188ceb..5cb1b15dbe 100644
--- a/lib/eal/common/eal_common_log.c
+++ b/lib/eal/common/eal_common_log.c
@@ -369,6 +369,7 @@ static const struct logtype logtype_strings[] = {
{RTE_LOGTYPE_EFD, "lib.efd"},
{RTE_LOGTYPE_EVENTDEV, "lib.eventdev"},
{RTE_LOGTYPE_GSO, "lib.gso"},
+ {RTE_LOGTYPE_MLDEV, "lib.mldev"},
{RTE_LOGTYPE_USER1, "user1"},
{RTE_LOGTYPE_USER2, "user2"},
{RTE_LOGTYPE_USER3, "user3"},
diff --git a/lib/eal/include/rte_log.h b/lib/eal/include/rte_log.h
index bba5da3d85..df6fada0b1 100644
--- a/lib/eal/include/rte_log.h
+++ b/lib/eal/include/rte_log.h
@@ -48,6 +48,7 @@ extern "C" {
#define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
#define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
#define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
+#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
/* these log types can be used in an application */
#define RTE_LOGTYPE_USER1 24 /**< User-defined log type 1. */
diff --git a/lib/meson.build b/lib/meson.build
index fd55925340..f18b352ec5 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -63,6 +63,7 @@ libraries = [
'flow_classify', # flow_classify lib depends on pkt framework table lib
'graph',
'node',
+ 'mldev'
]
optional_libs = [
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
new file mode 100644
index 0000000000..e378cfca30
--- /dev/null
+++ b/lib/mldev/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2022 Marvell.
+
+sources = files(
+ 'rte_mldev.c',
+)
+
+headers = files(
+ 'rte_mldev.h',
+)
+
+deps += ['mempool']
+
+if get_option('buildtype').contains('debug')
+ cflags += [ '-DRTE_LIBRTE_ML_DEV_DEBUG' ]
+else
+ cflags += [ '-URTE_LIBRTE_ML_DEV_DEBUG' ]
+endif
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
new file mode 100644
index 0000000000..2e3dfa0e6b
--- /dev/null
+++ b/lib/mldev/rte_mldev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <rte_mldev.h>
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
new file mode 100644
index 0000000000..83419fcecd
--- /dev/null
+++ b/lib/mldev/rte_mldev.h
@@ -0,0 +1,1092 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef RTE_MLDEV_H
+#define RTE_MLDEV_H
+
+/**
+ * @file rte_mldev.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * ML (Machine Learning) device API.
+ *
+ * The ML framework is built on the following model:
+ *
+ *
+ * +-----------------+ rte_ml_[en|de]queue_burst()
+ * | | |
+ * | Machine o------+ +--------+ |
+ * | Learning | | | queue | | +------+
+ * | Inference o------+-----o |<===o===>|Core 0|
+ * | Engine | | | pair 0 | +------+
+ * | o----+ | +--------+
+ * | | | |
+ * +-----------------+ | | +--------+
+ * ^ | | | queue | +------+
+ * | | +-----o |<=======>|Core 1|
+ * | | | pair 1 | +------+
+ * | | +--------+
+ * +--------+--------+ |
+ * | +-------------+ | | +--------+
+ * | | Model 0 | | | | queue | +------+
+ * | +-------------+ | +-------o |<=======>|Core N|
+ * | +-------------+ | | pair N | +------+
+ * | | Model 1 | | +--------+
+ * | +-------------+ |
+ * | +-------------+ |<------> rte_ml_model_load()
+ * | | Model .. | |-------> rte_ml_model_info_get()
+ * | +-------------+ |<------- rte_ml_model_start()
+ * | +-------------+ |<------- rte_ml_model_stop()
+ * | | Model N | |<------- rte_ml_model_params_update()
+ * | +-------------+ |<------- rte_ml_model_unload()
+ * +-----------------+
+ *
+ * ML Device: A hardware or software-based implementation of ML device API for
+ * running inferences using a pre-trained ML model.
+ *
+ * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
+ * procedure/algorithm and data/pattern required to make predictions on live data.
+ * Once the model is created and trained outside of the DPDK scope, the model can be loaded
+ * via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+ * The rte_ml_model_params_update() API can be used to update model parameters such as
+ * weights and bias without unloading the model using rte_ml_model_unload().
+ *
+ * ML Inference: ML inference is the process of feeding data to the model via
+ * rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
+ * outputs/predictions from the started model.
+ *
+ * In all functions of the ML device API, the ML device is designated by an
+ * integer >= 0 named the device identifier *dev_id*.
+ *
+ * The functions exported by the ML device API to setup a device designated by
+ * its device identifier must be invoked in the following order:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_dev_start()
+ *
+ * A model is required to run inference operations with the user specified inputs.
+ * The application needs to invoke the ML model API in the following order before queueing
+ * inference jobs:
+ *
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ *
+ * A model can be loaded on a device only after the device has been configured and can be
+ * started or stopped only after a device has been started.
+ *
+ * The rte_ml_model_info_get() API is provided to retrieve information related to the model.
+ * The information includes the shape and type of the input and output required for inference.
+ *
+ * Data quantization and dequantization are among the main aspects of the ML domain. This involves
+ * conversion of input data from a higher precision to a lower precision data type and vice-versa
+ * for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
+ * dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
+ * and output buffers holding data for multiple batches.
+ *
+ * Two utility APIs rte_ml_io_input_size_get() and rte_ml_io_output_size_get() can be used to get
+ * the size of quantized and de-quantized multi-batch input and output buffers.
+ *
+ * User can optionally update the model parameters with rte_ml_model_params_update() after
+ * invoking rte_ml_model_stop() API on a given model ID.
+ *
+ * The application can invoke, in any order, the functions exported by the ML API to enqueue
+ * inference jobs and dequeue inference responses.
+ *
+ * If the application wants to change the device configuration (i.e., call
+ * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
+ * device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
+ * the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
+ * for the given model. The application does not need to call the rte_ml_dev_stop() API for
+ * any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
+ *
+ * Once the device has been started using the rte_ml_dev_start() API and the model has been
+ * started using the rte_ml_model_start() API, the application can call rte_ml_enqueue_burst()
+ * and rte_ml_dequeue_burst() APIs on the destined device and model ID.
+ *
+ * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
+ *
+ * Typical application utilisation of the ML API will follow this general
+ * programming flow (a sketch follows the list):
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ * - rte_ml_model_info_get()
+ * - rte_ml_dev_start()
+ * - rte_ml_enqueue_burst()
+ * - rte_ml_dequeue_burst()
+ * - rte_ml_model_stop()
+ * - rte_ml_model_unload()
+ * - rte_ml_dev_stop()
+ * - rte_ml_dev_close()
+ *
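+ * A minimal sketch of the setup portion of this flow, following the order listed above
+ * (values are illustrative, error checks are omitted, and model_buf/model_len are assumed
+ * to hold a pre-trained model):
+ *
+ * @code
+ * struct rte_ml_dev_config dev_conf = {
+ *         .socket_id = rte_socket_id(),
+ *         .nb_models = 1,
+ *         .nb_queue_pairs = 1,
+ * };
+ * struct rte_ml_dev_qp_conf qp_conf = { .nb_desc = 128, .cb = NULL };
+ * struct rte_ml_model_params params = { .addr = model_buf, .size = model_len };
+ * int16_t model_id;
+ *
+ * rte_ml_dev_configure(dev_id, &dev_conf);
+ * rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf, rte_socket_id());
+ * rte_ml_model_load(dev_id, &params, &model_id);
+ * rte_ml_model_start(dev_id, model_id);
+ * rte_ml_dev_start(dev_id);
+ * @endcode
+ *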
+ * Regarding multi-threading, by default, all the functions of the ML Device API exported by a
+ * PMD are lock-free functions which are assumed not to be invoked in parallel on different
+ * logical cores on the same target object. For instance, the dequeue function of a poll mode
+ * driver cannot be invoked in parallel on two logical cores to operate on the same queue pair.
+ * Of course, this function can be invoked in parallel by different logical cores on different
+ * queue pairs. It is the responsibility of the user application to enforce this rule.
+ */
+
+#include <rte_common.h>
+#include <rte_mempool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define RTE_ML_STR_MAX 128
+/**< Maximum length of name string */
+
+/* Device operations */
+
+/**
+ * Get the total number of ML devices that have been successfully initialised.
+ *
+ * @return
+ * - The total number of usable ML devices.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dev_count(void);
+
+/**
+ * Check if the device is in ready state.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 if the device is not in ready state.
+ * - 1 if the device is in ready state.
+ */
+__rte_experimental
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id);
+
+/**
+ * Return the NUMA socket to which a device is connected.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - The NUMA socket id to which the device is connected
+ * - 0 if the socket could not be determined.
+ * - -EINVAL: if the dev_id value is not valid.
+ */
+__rte_experimental
+int
+rte_ml_dev_socket_id(int16_t dev_id);
+
+/** ML device information */
+struct rte_ml_dev_info {
+ const char *driver_name;
+ /**< Driver name */
+ int16_t max_models;
+ /**< Maximum number of models supported by the device.
+ * @see struct rte_ml_dev_config::nb_models
+ */
+ uint16_t max_queue_pairs;
+ /**< Maximum number of queue pairs supported by the device.
+ * @see struct rte_ml_dev_config::nb_queue_pairs
+ */
+ uint16_t max_desc;
+ /**< Maximum allowed number of descriptors for queue pair by the device.
+ * @see struct rte_ml_dev_qp_conf::nb_desc
+ */
+ uint16_t max_segments;
+ /**< Maximum number of scatter-gather entries supported by the device.
+ * @see struct rte_ml_buff_seg struct rte_ml_buff_seg::next
+ */
+ uint16_t min_align_size;
+ /**< Minimum alignment size of IO buffers used by the device. */
+};
+
+/**
+ * Retrieve the information of the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param dev_info
+ * A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
+ *
+ * @return
+ * - 0: Success, driver updates the information of the ML device
+ * - < 0: Error code returned by the driver info get function.
+ */
+__rte_experimental
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);
+
+/** ML device configuration structure */
+struct rte_ml_dev_config {
+ int socket_id;
+ /**< Socket to allocate resources on. */
+ int16_t nb_models;
+ /**< Number of models to be loaded on the device.
+ * This value cannot exceed the max_models which is previously provided in
+ * struct rte_ml_dev_info::max_models
+ */
+ uint16_t nb_queue_pairs;
+ /**< Number of queue pairs to configure on this device.
+ * This value cannot exceed the max_queue_pairs which is previously provided in
+ * struct rte_ml_dev_info::max_queue_pairs
+ */
+};
+
+/**
+ * Configure an ML device.
+ *
+ * This function must be invoked first before any other function in the API.
+ *
+ * An ML device can be re-configured when in a stopped state. A device cannot be re-configured after
+ * rte_ml_dev_close() is called.
+ *
+ * The caller may use rte_ml_dev_info_get() to get the capabilities of each resource available for
+ * this ML device.
+ *
+ * @param dev_id
+ * The identifier of the device to configure.
+ * @param config
+ * The ML device configuration structure.
+ *
+ * @return
+ * - 0: Success, device configured.
+ * - < 0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
+
+/* Forward declaration */
+struct rte_ml_op;
+
+/** Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
+typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);
+
+/** ML device queue pair configuration structure. */
+struct rte_ml_dev_qp_conf {
+ uint32_t nb_desc;
+ /**< Number of descriptors per queue pair.
+ * This value cannot exceed the max_desc which is previously provided in
+ * struct rte_ml_dev_info::max_desc
+ */
+ rte_ml_dev_stop_flush_t cb;
+ /**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
+ * Value NULL is allowed, in which case callback will not be invoked.
+ * This function can be used to properly dispose of outstanding ML ops from all
+ * queue pairs, for example ops containing memory pointers.
+ * @see rte_ml_dev_stop()
+ */
+};
+
+/**
+ * Set up a queue pair for a device. This should only be called when the device is stopped.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param queue_pair_id
+ * The index of the queue pairs to set up. The value must be in the range [0, nb_queue_pairs - 1]
+ * previously supplied to rte_ml_dev_configure().
+ * @param qp_conf
+ * The pointer to the configuration data to be used for the queue pair.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in case of NUMA.
+ * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
+ * for the queue pair.
+ *
+ * @return
+ * - 0: Success, queue pair correctly set up.
+ * - < 0: Queue pair configuration failed.
+ */
+__rte_experimental
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);
+
+/**
+ * Start an ML device.
+ *
+ * The device start step consists of setting the configured features and enabling the ML device
+ * to accept inference jobs.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device started.
+ * - <0: Error code of the driver device start function.
+ */
+__rte_experimental
+int
+rte_ml_dev_start(int16_t dev_id);
+
+/**
+ * Stop an ML device. A stopped device cannot accept inference jobs.
+ * The device can be restarted with a call to rte_ml_dev_start().
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device stopped.
+ * - <0: Error code of the driver device stop function.
+ */
+__rte_experimental
+int
+rte_ml_dev_stop(int16_t dev_id);
+
+/**
+ * Close an ML device. The device cannot be restarted!
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 on successfully closing device.
+ * - <0 on failure to close device.
+ */
+__rte_experimental
+int
+rte_ml_dev_close(int16_t dev_id);
+
+/** Status of ML operation */
+enum rte_ml_op_status {
+ RTE_ML_OP_STATUS_SUCCESS = 0,
+ /**< Operation completed successfully */
+ RTE_ML_OP_STATUS_NOT_PROCESSED,
+ /**< Operation has not yet been processed by the device. */
+ RTE_ML_OP_STATUS_ERROR,
+ /**< Operation completed with error.
+ * Application can invoke rte_ml_op_error_get() to get PMD specific
+ * error code if needed.
+ */
+};
+
+/** ML operation's input and output buffer representation as scatter gather list
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
+
+/**
+ * ML Operation.
+ *
+ * This structure contains data related to performing an ML operation on the buffers using
+ * the model specified through model_id.
+ */
+struct rte_ml_op {
+ int16_t model_id;
+ /**< Model ID to be used for the operation. */
+ uint16_t nb_batches;
+ /**< Number of batches. Minimum value must be one.
+ * The input buffer must hold the inference data for each batch contiguously.
+ */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_mempool *mempool;
+ /**< Pool from which operation is allocated. */
+ struct rte_ml_buff_seg input;
+ /**< Input buffer to hold the inference data. */
+ struct rte_ml_buff_seg output;
+ /**< Output buffer to hold the inference output by the driver. */
+ RTE_STD_C11
+ union {
+ uint64_t user_u64;
+ /**< User data as uint64_t.*/
+ void *user_ptr;
+ /**< User data as void*.*/
+ };
+ enum rte_ml_op_status status;
+ /**< Operation status. */
+ uint64_t impl_opaque;
+ /**< Implementation specific opaque value.
+ * An implementation may use this field to hold
+ * implementation specific value to share between
+ * dequeue and enqueue operation.
+ * The application should not modify this field.
+ */
+} __rte_cache_aligned;
+
+/* Enqueue/Dequeue operations */
+
+/**
+ * Enqueue a burst of ML inferences for processing on an ML device.
+ *
+ * The rte_ml_enqueue_burst() function is invoked to place ML inference
+ * operations on the queue *qp_id* of the device designated by its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of inferences to process which are
+ * supplied in the *ops* array of *rte_ml_op* structures.
+ *
+ * The rte_ml_enqueue_burst() function returns the number of inferences it
+ * actually enqueued for processing. A return value equal to *nb_ops* means that
+ * all operations have been enqueued.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair on which inferences are to be enqueued for processing.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * *rte_ml_dev_configure*.
+ * @param ops
+ * The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
+ * ML inferences to be processed.
+ * @param nb_ops
+ * The number of operations to process.
+ *
+ * @return
+ * The number of inference operations actually enqueued to the ML device.
+ * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
+ * is full or if invalid parameters are specified in a *rte_ml_op*.
+ */
+__rte_experimental
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed ML inference operations from a queue on the ML device.
+ * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
+ * in the *ops* array.
+ *
+ * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
+ * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops*
+ * operations, and this is likely to signify that other processed operations remain in the
+ * device's output queue. Applications implementing a "retrieve as many processed operations
+ * as possible" policy can check this specific case and keep invoking the rte_ml_dequeue_burst()
+ * function until a value less than *nb_ops* is returned.
+ *
+ * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
+ * the corresponding overhead.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair from which to retrieve processed operations.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * rte_ml_dev_configure().
+ * @param ops
+ * The address of an array of pointers to *rte_ml_op* structures that must be large enough to
+ * store *nb_ops* pointers in it.
+ * @param nb_ops
+ * The maximum number of inferences to dequeue.
+ *
+ * @return
+ * The number of operations actually dequeued, which is the number of pointers
+ * to *rte_ml_op* structures effectively supplied to the *ops* array.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
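+
+/*
+ * A minimal sketch of a "retrieve as many processed operations as possible"
+ * dequeue loop (completion handling is application specific; handle_error()
+ * is a hypothetical application helper):
+ *
+ * struct rte_ml_op *ops[32];
+ * uint16_t i, nb;
+ *
+ * do {
+ *         nb = rte_ml_dequeue_burst(dev_id, qp_id, ops, RTE_DIM(ops));
+ *         for (i = 0; i < nb; i++) {
+ *                 if (ops[i]->status != RTE_ML_OP_STATUS_SUCCESS)
+ *                         handle_error(ops[i]);
+ *                 rte_mempool_put(ops[i]->mempool, ops[i]);
+ *         }
+ * } while (nb == RTE_DIM(ops));
+ */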
+
+/**
+ * Verbose error structure definition.
+ */
+struct rte_ml_op_error {
+ char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
+ uint64_t errcode; /**< Vendor specific error code. */
+};
+
+/**
+ * Get PMD specific error information for an ML op.
+ *
+ * When an ML operation completes with RTE_ML_OP_STATUS_ERROR as status,
+ * this API allows the application to get PMD specific error details.
+ *
+ * @param[in] dev_id
+ * Device identifier
+ * @param[in] op
+ * Handle of ML operation
+ * @param[out] error
+ * Address of structure rte_ml_op_error to be filled
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
+
+/* Statistics operations */
+
+/** Device statistics. */
+struct rte_ml_dev_stats {
+ uint64_t enqueued_count;
+ /**< Count of all operations enqueued */
+ uint64_t dequeued_count;
+ /**< Count of all operations dequeued */
+ uint64_t enqueue_err_count;
+ /**< Total error count on operations enqueued */
+ uint64_t dequeue_err_count;
+ /**< Total error count on operations dequeued */
+};
+
+/**
+ * Retrieve the general I/O statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stats
+ * Pointer to structure to where statistics will be copied.
+ * On error, this location may or may not have been modified.
+ * @return
+ * - 0 on success
+ * - -EINVAL: If invalid parameter pointer is provided.
+ */
+__rte_experimental
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
+
+/**
+ * Reset the statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ */
+__rte_experimental
+void
+rte_ml_dev_stats_reset(int16_t dev_id);
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers for extended ML device statistics.
+ */
+struct rte_ml_dev_xstats_map {
+ uint16_t id;
+ /**< xstat identifier */
+ char name[RTE_ML_STR_MAX];
+ /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param[out] xstats_map
+ * Block of memory to insert ids and names into. Must have capacity for at least *size*
+ * name-id map entries. If set to NULL, the function returns the required capacity.
+ * @param size
+ * Capacity of xstats_map (number of name-id maps).
+ *
+ * @return
+ * - Positive value on success:
+ * - The return value is the number of entries filled in the stats map.
+ * - If xstats_map is set to NULL, the required capacity for xstats_map.
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map,
+ uint32_t size);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param name
+ * The stat name to retrieve.
+ * @param stat_id
+ * If non-NULL, the numerical id of the stat will be returned, so that further requests for
+ * the stat can be made using rte_ml_dev_xstats_get(), which will be faster as it doesn't
+ * need to scan a list of names for the stat.
+ * @param[out] value
+ * Must be non-NULL; the retrieved xstat value will be stored at this address.
+ *
+ * @return
+ * - 0: Successfully retrieved xstat value.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
+
+/**
+ * Retrieve extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * The id numbers of the stats to get. The ids can be fetched from the stat position in the
+ * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
+ * @param values
+ * Array to hold the values of each stat requested by ID.
+ * @param nb_ids
+ * The number of stats requested.
+ * @return
+ * - Positive value: number of stat entries filled into the values array
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * Selects specific statistics to be reset. When NULL, all statistics will be reset.
+ * If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ * The number of ids available from the *ids* array. Ignored when ids is NULL.
+ * @return
+ * - 0: Successfully reset the statistics to zero.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
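+
+/*
+ * A sketch of the usual two-call pattern for reading all extended stats
+ * (return codes and allocation failures are not checked here):
+ *
+ * int i, n = rte_ml_dev_xstats_names_get(dev_id, NULL, 0);
+ * struct rte_ml_dev_xstats_map *map = calloc(n, sizeof(*map));
+ * uint64_t value;
+ *
+ * rte_ml_dev_xstats_names_get(dev_id, map, n);
+ * for (i = 0; i < n; i++) {
+ *         rte_ml_dev_xstats_by_name_get(dev_id, map[i].name, NULL, &value);
+ *         printf("%s: %" PRIu64 "\n", map[i].name, value);
+ * }
+ * free(map);
+ */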
+
+/* Utility operations */
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *fd*.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param fd
+ * A pointer to a file for output.
+ * @return
+ * - 0: on success.
+ * - <0: on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd);
+
+/**
+ * Trigger the ML device self test.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @return
+ * - 0: Selftest successful.
+ * - -ENOTSUP: if the device doesn't support selftest.
+ * - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_selftest(int16_t dev_id);
+
+/* Model operations */
+
+/** ML model load parameters
+ *
+ * Parameters required to load an ML model.
+ */
+struct rte_ml_model_params {
+ void *addr;
+ /**< Address of model buffer */
+ size_t size;
+ /**< Size of model buffer */
+};
+
+/**
+ * Load an ML model to the device.
+ *
+ * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] params
+ * Parameters for the model to be loaded.
+ * @param[out] model_id
+ * Identifier of the model loaded.
+ *
+ * @return
+ * - 0: Success, Model loaded.
+ * - < 0: Failure, Error code of the model load driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, int16_t *model_id);
+
+/**
+ * Unload an ML model from the device.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be unloaded.
+ *
+ * @return
+ * - 0: Success, Model unloaded.
+ * - < 0: Failure, Error code of the model unload driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_unload(int16_t dev_id, int16_t model_id);
+
+/**
+ * Start an ML model for the given device ID.
+ *
+ * Start an ML model to accept inference requests.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be started.
+ *
+ * @return
+ * - 0: Success, Model started.
+ * - < 0: Failure, Error code of the model start driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_start(int16_t dev_id, int16_t model_id);
+
+/**
+ * Stop an ML model for the given device ID.
+ *
+ * Stopping a model disables it from being used for inference jobs.
+ * All inference jobs must have been completed before model stop is attempted.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be stopped.
+ *
+ * @return
+ * - 0: Success, Model stopped.
+ * - < 0: Failure, Error code of the model stop driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_stop(int16_t dev_id, int16_t model_id);
+
+/**
+ * Input and output data types. ML models can operate on reduced precision
+ * datatypes to achieve better power efficiency, lower latency and a lower memory footprint.
+ * This enum is used to represent the lower precision integer and floating point types used
+ * by ML models.
+ */
+enum rte_ml_io_type {
+ RTE_ML_IO_TYPE_UNKNOWN = 0,
+ /**< Invalid or unknown type */
+ RTE_ML_IO_TYPE_INT8,
+ /**< 8-bit integer */
+ RTE_ML_IO_TYPE_UINT8,
+ /**< 8-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT16,
+ /**< 16-bit integer */
+ RTE_ML_IO_TYPE_UINT16,
+ /**< 16-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT32,
+ /**< 32-bit integer */
+ RTE_ML_IO_TYPE_UINT32,
+ /**< 32-bit unsigned integer */
+ RTE_ML_IO_TYPE_FP8,
+ /**< 8-bit floating point number */
+ RTE_ML_IO_TYPE_FP16,
+ /**< IEEE 754 16-bit floating point number */
+ RTE_ML_IO_TYPE_FP32,
+ /**< IEEE 754 32-bit floating point number */
+ RTE_ML_IO_TYPE_BFLOAT16
+ /**< 16-bit brain floating point number. */
+};
+
+/**
+ * Input and output format. This is used to represent the encoding type of the
+ * multi-dimensional data used by ML models.
+ */
+enum rte_ml_io_format {
+ RTE_ML_IO_FORMAT_NCHW = 1,
+ /**< Batch size (N) x channels (C) x height (H) x width (W) */
+ RTE_ML_IO_FORMAT_NHWC,
+ /**< Batch size (N) x height (H) x width (W) x channels (C) */
+ RTE_ML_IO_FORMAT_CHWN,
+ /**< Channels (C) x height (H) x width (W) x batch size (N) */
+ RTE_ML_IO_FORMAT_3D,
+ /**< Format to represent 3 dimensional data */
+ RTE_ML_IO_FORMAT_2D,
+ /**< Format to represent matrix data */
+ RTE_ML_IO_FORMAT_1D,
+ /**< Format to represent vector data */
+ RTE_ML_IO_FORMAT_SCALAR,
+ /**< Format to represent scalar data */
+};
+
+/**
+ * Input and output shape. This structure represents the encoding format and dimensions
+ * of the tensor or vector.
+ *
+ * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions
+ * used for the data depends on the format. Unused dimensions must be set to 1.
+ */
+struct rte_ml_io_shape {
+ enum rte_ml_io_format format;
+ /**< Format of the data */
+ uint32_t w;
+ /**< First dimension */
+ uint32_t x;
+ /**< Second dimension */
+ uint32_t y;
+ /**< Third dimension */
+ uint32_t z;
+ /**< Fourth dimension */
+};
+
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
+
+/** Model information structure */
+struct rte_ml_model_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Model name. */
+ char version[RTE_ML_STR_MAX];
+ /**< Model version */
+ int16_t model_id;
+ /**< Model ID */
+ uint16_t device_id;
+ /**< Device ID */
+ uint16_t batch_size;
+ /**< Maximum number of batches that the model can process simultaneously */
+ uint32_t nb_inputs;
+ /**< Number of inputs */
+ const struct rte_ml_io_info *input_info;
+ /**< Input info array. Array size is equal to nb_inputs */
+ uint32_t nb_outputs;
+ /**< Number of outputs */
+ const struct rte_ml_io_info *output_info;
+ /**< Output info array. Array size is equal to nb_outputs */
+ uint64_t wb_size;
+ /**< Size of model weights and bias */
+};
+
+/**
+ * Get ML model information.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[out] model_info
+ * Pointer to a model info structure
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_info_get(int16_t dev_id, int16_t model_id, struct rte_ml_model_info *model_info);
+
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
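+
+/*
+ * A sketch of an in-place parameters update (wb_buf is assumed to hold
+ * rte_ml_model_info::wb_size bytes of new weights and bias):
+ *
+ * rte_ml_model_stop(dev_id, model_id);
+ * rte_ml_model_params_update(dev_id, model_id, wb_buf);
+ * rte_ml_model_start(dev_id, model_id);
+ */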
+
+/* IO operations */
+
+/**
+ * Get size of quantized and dequantized input buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized input data.
+ * This API returns the buffer sizes for the number of batches provided, taking into
+ * account the alignment requirements of the PMD. Input sizes computed by this API can
+ * be used by the application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] input_qsize
+ * Quantized input size pointer.
+ * NULL value is allowed, in which case input_qsize is not calculated by the driver.
+ * @param[out] input_dsize
+ * Dequantized input size pointer.
+ * NULL value is allowed, in which case input_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_input_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize);
+
+/**
+ * Get size of quantized and dequantized output buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized output data.
+ * This API returns the buffer sizes for the number of batches provided, taking into account
+ * the alignment requirements of the PMD. Output sizes computed by this API can be used by the
+ * application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] output_qsize
+ * Quantized output size pointer.
+ * NULL value is allowed, in which case output_qsize is not calculated by the driver.
+ * @param[out] output_dsize
+ * Dequantized output size pointer.
+ * NULL value is allowed, in which case output_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_output_size_get(int16_t dev_id, int16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize);
+
+/**
+ * Quantize input data.
+ *
+ * Quantization converts data from a higher precision type to a lower precision type to
+ * improve the throughput and efficiency of the model execution with minimal loss of
+ * accuracy. Types of dequantized data and quantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized input buffer
+ * @param[in] dbuffer
+ * Address of dequantized input data
+ * @param[out] qbuffer
+ * Address of buffer to hold the quantized input data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_quantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer);
+
+/**
+ * Dequantize output data.
+ *
+ * Dequantization converts data from a lower precision type to a higher precision type.
+ * Types of quantized and dequantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the quantized output buffer
+ * @param[in] qbuffer
+ * Address of quantized output data
+ * @param[out] dbuffer
+ * Address of buffer to hold the dequantized output data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_dequantize(int16_t dev_id, int16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer);
+
+/* ML op pool operations */
+
+/**
+ * Create an ML operation pool
+ *
+ * @param name
+ * ML operations pool name
+ * @param nb_elts
+ * Number of elements in pool
+ * @param cache_size
+ * Number of elements to cache on lcore, see
+ * *rte_mempool_create* for further details about cache size
+ * @param user_size
+ * Size of private data to allocate for user with each operation
+ * @param socket_id
+ * Socket identifier to allocate memory on
+ * @return
+ * - On success pointer to mempool
+ * - On failure NULL
+ */
+__rte_experimental
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id);
+
+/**
+ * Free an ML operation pool
+ *
+ * @param mempool
+ * A pointer to the mempool structure.
+ * If NULL then, the function does nothing.
+ */
+__rte_experimental
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool);
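+
+/*
+ * A sketch of the op pool lifecycle (sizing values are illustrative only):
+ *
+ * struct rte_mempool *mp;
+ *
+ * mp = rte_ml_op_pool_create("ml_op_pool", 4096, 128, 0, rte_socket_id());
+ * if (mp == NULL)
+ *         rte_exit(EXIT_FAILURE, "Cannot create ML op pool\n");
+ *
+ * ... allocate ops with rte_mempool_get() and enqueue/dequeue ...
+ *
+ * rte_ml_op_pool_free(mp);
+ */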
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MLDEV_H */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
new file mode 100644
index 0000000000..33c1b976f1
--- /dev/null
+++ b/lib/mldev/version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+ local: *;
+};
--
2.38.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-01-28 11:27 ` Jerin Jacob
@ 2023-02-01 16:57 ` Thomas Monjalon
2023-02-01 17:33 ` Jerin Jacob
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-02-01 16:57 UTC (permalink / raw)
To: Jerin Jacob
Cc: dev, Shivah Shankar Shankar Narayan Rao,
Jerin Jacob Kollanukkaran, dev, ferruh.yigit, ajit.khaparde,
aboyer, andrew.rybchenko, beilei.xing, bruce.richardson, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand, jerinj
28/01/2023 12:27, Jerin Jacob:
> I see main comments are on param update and getting the capabilities.
> To enable that, please propose the changes around rte_ml_model_params_update(),
> rte_ml_model_info. We should be able to take that and send v2.
Sorry I don't have the bandwidth to work on mldev now.
I understand you took the easy path of opaque pointer,
and you are OK to refine it if needed.
Because there are not many reviews, I think we should merge it as-is
and keep it experimental for the time needed to gather more feedback
and a second vendor implementing it.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
2023-02-01 16:57 ` Thomas Monjalon
@ 2023-02-01 17:33 ` Jerin Jacob
0 siblings, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-01 17:33 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, bruce.richardson, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Srikanth Yalavarthi,
Derek Chickles, david.marchand
On Wed, Feb 1, 2023 at 10:27 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 28/01/2023 12:27, Jerin Jacob:
> > I see main comments are on param update and getting the capabilities.
> > To enable that, please propose the changes around rte_ml_model_params_update(),
> > rte_ml_model_info. We should be able to take that and send v2.
>
> Sorry I don't have the bandwidth to work on mldev now.
Understandable.
> I understand you took the easy path of opaque pointer,
I would say it is not the easy path, but rather a consequence of the use cases
I am not aware of and the model that we are supporting.
> and you are OK to refine it if needed.
Yes
> Because there are not many reviews, I think we should merge it as-is
> and keep it experimental for the time needed to gather more feedback
> and a second vendor implementing it.
Ack. I think it is reasonable; the first patch was pushed on Aug 3. It
has been around 6 months for reviews.
https://inbox.dpdk.org/dev/20220803132839.2747858-2-jerinj@marvell.com/
^ permalink raw reply [flat|nested] 80+ messages in thread
* RE: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
@ 2023-02-02 5:26 ` Shivah Shankar Shankar Narayan Rao
1 sibling, 0 replies; 80+ messages in thread
From: Shivah Shankar Shankar Narayan Rao @ 2023-02-02 5:26 UTC (permalink / raw)
To: Jerin Jacob Kollanukkaran, dev, Thomas Monjalon,
Bruce Richardson, Srikanth Yalavarthi
Cc: ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Jerin Jacob Kollanukkaran
> -----Original Message-----
> From: jerinj@marvell.com <jerinj@marvell.com>
> Sent: Monday, November 14, 2022 5:32 PM
> To: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Bruce
> Richardson <bruce.richardson@intel.com>; Srikanth Yalavarthi
> <syalavarthi@marvell.com>
> Cc: ferruh.yigit@xilinx.com; ajit.khaparde@broadcom.com;
> aboyer@pensando.io; andrew.rybchenko@oktetlabs.ru;
> beilei.xing@intel.com; chas3@att.com; chenbo.xia@intel.com;
> ciara.loftus@intel.com; Devendra Singh Rawat <dsinghrawat@marvell.com>;
> ed.czeck@atomicrules.com; evgenys@amazon.com; grive@u256.net;
> g.singh@nxp.com; zhouguoyang@huawei.com; haiyue.wang@intel.com;
> Harman Kalra <hkalra@marvell.com>; heinrich.kuhn@corigine.com;
> hemant.agrawal@nxp.com; hyonkim@cisco.com; igorch@amazon.com; Igor
> Russkikh <irusskikh@marvell.com>; jgrajcia@cisco.com;
> jasvinder.singh@intel.com; jianwang@trustnetic.com;
> jiawenwu@trustnetic.com; jingjing.wu@intel.com; johndale@cisco.com;
> john.miller@atomicrules.com; linville@tuxdriver.com;
> keith.wiles@intel.com; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; oulijun@huawei.com; Liron Himi
> <lironh@marvell.com>; longli@microsoft.com; mw@semihalf.com;
> spinler@cesnet.cz; matan@nvidia.com; matt.peters@windriver.com;
> maxime.coquelin@redhat.com; mk@semihalf.com; humin29@huawei.com;
> Pradeep Kumar Nalla <pnalla@marvell.com>; Nithin Kumar Dabilpuram
> <ndabilpuram@marvell.com>; qiming.yang@intel.com;
> qi.z.zhang@intel.com; Radha Chintakuntla <radhac@marvell.com>;
> rahul.lakkireddy@chelsio.com; Rasesh Mody <rmody@marvell.com>;
> rosen.xu@intel.com; sachin.saxena@oss.nxp.com; Satha Koteswara Rao
> Kottidi <skoteshwar@marvell.com>; Shahed Shaikh
> <shshaikh@marvell.com>; shaibran@amazon.com;
> shepard.siegel@atomicrules.com; asomalap@amd.com;
> somnath.kotur@broadcom.com; sthemmin@microsoft.com;
> steven.webster@windriver.com; Sunil Kumar Kori <skori@marvell.com>;
> mtetsuyah@gmail.com; Veerasenareddy Burru <vburru@marvell.com>;
> viacheslavo@nvidia.com; xiao.w.wang@intel.com;
> cloud.wangxiaoyun@huawei.com; yisen.zhuang@huawei.com;
> yongwang@vmware.com; xuanziyang2@huawei.com; Prasun Kapoor
> <pkapoor@marvell.com>; Nadav Haklai <nadavh@marvell.com>; Satananda
> Burla <sburla@marvell.com>; Narayana Prasad Raju Athreya
> <pathreya@marvell.com>; Akhil Goyal <gakhil@marvell.com>;
> mdr@ashroe.eu; dmitry.kozliuk@gmail.com; anatoly.burakov@intel.com;
> cristian.dumitrescu@intel.com; honnappa.nagarahalli@arm.com;
> mattias.ronnblom@ericsson.com; ruifeng.wang@arm.com;
> drc@linux.vnet.ibm.com; konstantin.ananyev@intel.com;
> olivier.matz@6wind.com; jay.jayatheerthan@intel.com; Ashwin Sekhar T K
> <asekhar@marvell.com>; Pavan Nikhilesh Bhagavatula
> <pbhagavatula@marvell.com>; eagostini@nvidia.com; Derek Chickles
> <dchickles@marvell.com>; Shivah Shankar Shankar Narayan Rao
> <sshankarnara@marvell.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>
> Subject: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning
> device library
>
> From: Jerin Jacob <jerinj@marvell.com>
>
> Add mldev API specification to standardize and use the machine learning
> device and inference operations in a vendor neutral way.
>
> Following operations are abstracted through APIs
>
> - ML device capability probe
> - ML device configuration
> - ML device queue pair configuration
> - ML device state management
> - ML device stat/xstat operations
> - ML model load/unload/start/stop operations
> - ML model information probe
> - ML IO operations to find size for input and output buffers
> - ML quantize and dequantize operations
> - ML ops pool creation and free operations
> - ML device enqueue/dequeue fastpath inference operations
>
> Also added programming guide.
>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
> ---
> MAINTAINERS | 5 +
> config/rte_config.h | 3 +
> doc/api/doxy-api-index.md | 1 +
> doc/api/doxy-api.conf.in | 1 +
> doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
> doc/guides/prog_guide/index.rst | 1 +
> doc/guides/prog_guide/mldev.rst | 186 ++++
> lib/eal/common/eal_common_log.c | 1 +
> lib/eal/include/rte_log.h | 1 +
> lib/meson.build | 1 +
> lib/mldev/meson.build | 18 +
> lib/mldev/rte_mldev.c | 5 +
> lib/mldev/rte_mldev.h | 1092 ++++++++++++++++++++++
> lib/mldev/version.map | 3 +
> 14 files changed, 2032 insertions(+)
> create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
> create mode 100644 doc/guides/prog_guide/mldev.rst
> create mode 100644 lib/mldev/meson.build
> create mode 100644 lib/mldev/rte_mldev.c
> create mode 100644 lib/mldev/rte_mldev.h
> create mode 100644 lib/mldev/version.map
>
Acked-by: Shivah Shankar S <sshankarnara@marvell.com>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
@ 2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 8:42 ` Thomas Monjalon
2023-02-03 10:01 ` Jerin Jacob
2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 0:28 ` Stephen Hemminger
2 siblings, 2 replies; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-03 0:25 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao
Cc: Jerin Jacob Kollanukkaran, dev, Thomas Monjalon,
Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Wed, 1 Feb 2023 13:34:41 +0000
Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> --- a/lib/eal/include/rte_log.h
> +++ b/lib/eal/include/rte_log.h
> @@ -48,6 +48,7 @@ extern "C" {
> #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
NAK to this part.
No new static logtypes please.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
2023-02-03 0:25 ` Stephen Hemminger
@ 2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 10:04 ` Jerin Jacob
2023-02-03 0:28 ` Stephen Hemminger
2 siblings, 1 reply; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-03 0:25 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao
Cc: Jerin Jacob Kollanukkaran, dev, Thomas Monjalon,
Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Wed, 1 Feb 2023 13:34:41 +0000
Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> +struct rte_ml_dev_info {
> + const char *driver_name;
> + /**< Driver name */
> + int16_t max_models;
Why is max_models signed? Other fields are unsigned.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 0:25 ` Stephen Hemminger
@ 2023-02-03 0:28 ` Stephen Hemminger
2023-02-03 10:03 ` Jerin Jacob
2 siblings, 1 reply; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-03 0:28 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao
Cc: Jerin Jacob Kollanukkaran, dev, Thomas Monjalon,
Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Wed, 1 Feb 2023 13:34:41 +0000
Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> +#define RTE_ML_STR_MAX 128
> +/**< Maximum length of name string */
Fixed length strings do create long term technical issues.
But this one is big enough that I doubt it matters.
You may want to make sure that the string is always at the end
of the struct to reduce cache footprint, i.e. put the important stuff first.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 0:25 ` Stephen Hemminger
@ 2023-02-03 8:42 ` Thomas Monjalon
2023-02-03 17:33 ` Stephen Hemminger
2023-02-03 10:01 ` Jerin Jacob
1 sibling, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-02-03 8:42 UTC (permalink / raw)
To: Shivah Shankar Shankar Narayan Rao, Stephen Hemminger
Cc: Jerin Jacob Kollanukkaran, dev, Bruce Richardson,
Srikanth Yalavarthi, ferruh.yigit, ajit.khaparde, aboyer,
andrew.rybchenko, beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
03/02/2023 01:25, Stephen Hemminger:
> On Wed, 1 Feb 2023 13:34:41 +0000
> Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
>
> > --- a/lib/eal/include/rte_log.h
> > +++ b/lib/eal/include/rte_log.h
> > @@ -48,6 +48,7 @@ extern "C" {
> > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
>
> NAK to this part.
> No new static logtypes please.
Good catch.
By the way, we should remove unused RTE_LOGTYPE_*.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 8:42 ` Thomas Monjalon
@ 2023-02-03 10:01 ` Jerin Jacob
1 sibling, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-03 10:01 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Thomas Monjalon, Bruce Richardson, Srikanth Yalavarthi,
ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Fri, Feb 3, 2023 at 5:55 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 1 Feb 2023 13:34:41 +0000
> Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
>
> > --- a/lib/eal/include/rte_log.h
> > +++ b/lib/eal/include/rte_log.h
> > @@ -48,6 +48,7 @@ extern "C" {
> > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
>
> NAK to this part.
> No new static logtypes please.
Ack, will fix in next version.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 0:28 ` Stephen Hemminger
@ 2023-02-03 10:03 ` Jerin Jacob
0 siblings, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-03 10:03 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Thomas Monjalon, Bruce Richardson, Srikanth Yalavarthi,
ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Fri, Feb 3, 2023 at 5:58 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 1 Feb 2023 13:34:41 +0000
> Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
>
> > +#define RTE_ML_STR_MAX 128
> > +/**< Maximum length of name string */
>
> Fixed-length strings do create long-term technical issues.
> But this one is big enough that I doubt it matters.
>
> You may want to make sure that the string is always at the end
> of the struct to reduce the cache footprint, i.e., put the important stuff first.
Yeah, fast-path stuff first in fast-path structures.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 0:25 ` Stephen Hemminger
@ 2023-02-03 10:04 ` Jerin Jacob
0 siblings, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-03 10:04 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Thomas Monjalon, Bruce Richardson, Srikanth Yalavarthi,
ferruh.yigit, ajit.khaparde, aboyer, andrew.rybchenko,
beilei.xing, chas3, chenbo.xia, ciara.loftus,
Devendra Singh Rawat, ed.czeck, evgenys, grive, g.singh,
zhouguoyang, haiyue.wang, Harman Kalra, heinrich.kuhn,
hemant.agrawal, hyonkim, igorch, Igor Russkikh, jgrajcia,
jasvinder.singh, jianwang, jiawenwu, jingjing.wu, johndale,
john.miller, linville, keith.wiles, Kiran Kumar Kokkilagadda,
oulijun, Liron Himi, longli, mw, spinler, matan, matt.peters,
maxime.coquelin, mk, humin29, Pradeep Kumar Nalla,
Nithin Kumar Dabilpuram, qiming.yang, qi.z.zhang,
Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody, rosen.xu,
sachin.saxena, Satha Koteswara Rao Kottidi, Shahed Shaikh,
shaibran, shepard.siegel, asomalap, somnath.kotur, sthemmin,
steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar
On Fri, Feb 3, 2023 at 5:56 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 1 Feb 2023 13:34:41 +0000
> Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
>
> > +struct rte_ml_dev_info {
> > + const char *driver_name;
> > + /**< Driver name */
> > + int16_t max_models;
>
> Why is max_models signed? Other fields are unsigned.
Not needed. I will change it to unsigned in the next version.
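For illustration, the agreed change might look like this (uint16_t assumed, matching the model-ID width change noted in the v2 changelog; the remaining fields are elided):

struct rte_ml_dev_info {
	const char *driver_name; /**< Driver name */
	uint16_t max_models;     /**< unsigned: a model count cannot be negative */
	/* ... other capability fields unchanged ... */
};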
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 8:42 ` Thomas Monjalon
@ 2023-02-03 17:33 ` Stephen Hemminger
2023-02-03 20:18 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-03 17:33 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
On Fri, 03 Feb 2023 09:42:45 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:
> 03/02/2023 01:25, Stephen Hemminger:
> > On Wed, 1 Feb 2023 13:34:41 +0000
> > Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> >
> > > --- a/lib/eal/include/rte_log.h
> > > +++ b/lib/eal/include/rte_log.h
> > > @@ -48,6 +48,7 @@ extern "C" {
> > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
> >
> > NAK to this part.
> > No new static logtypes please.
>
> Good catch.
> By the way, we should remove unused RTE_LOGTYPE_*.
>
>
Yes, for 23.11 would like to work down the list.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 17:33 ` Stephen Hemminger
@ 2023-02-03 20:18 ` Thomas Monjalon
2023-02-03 20:26 ` Stephen Hemminger
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-02-03 20:18 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
03/02/2023 18:33, Stephen Hemminger:
> On Fri, 03 Feb 2023 09:42:45 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
>
> > 03/02/2023 01:25, Stephen Hemminger:
> > > On Wed, 1 Feb 2023 13:34:41 +0000
> > > Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> > >
> > > > --- a/lib/eal/include/rte_log.h
> > > > +++ b/lib/eal/include/rte_log.h
> > > > @@ -48,6 +48,7 @@ extern "C" {
> > > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > > > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > > > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
> > >
> > > NAK to this part.
> > > No new static logtypes please.
> >
> > Good catch.
> > By the way, we should remove unused RTE_LOGTYPE_*.
>
> Yes, for 23.11 would like to work down the list.
Do we need to wait 23.11?
It is not an ABI breakage.
And most of these defines are already unused.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 20:18 ` Thomas Monjalon
@ 2023-02-03 20:26 ` Stephen Hemminger
2023-02-03 20:49 ` Thomas Monjalon
0 siblings, 1 reply; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-03 20:26 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
On Fri, 03 Feb 2023 21:18:40 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:
> 03/02/2023 18:33, Stephen Hemminger:
> > On Fri, 03 Feb 2023 09:42:45 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > > 03/02/2023 01:25, Stephen Hemminger:
> > > > On Wed, 1 Feb 2023 13:34:41 +0000
> > > > Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> > > >
> > > > > --- a/lib/eal/include/rte_log.h
> > > > > +++ b/lib/eal/include/rte_log.h
> > > > > @@ -48,6 +48,7 @@ extern "C" {
> > > > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > > > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > > > > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > > > > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
> > > >
> > > > NAK to this part.
> > > > No new static logtypes please.
> > >
> > > Good catch.
> > > By the way, we should remove unused RTE_LOGTYPE_*.
> >
> > Yes, for 23.11 would like to work down the list.
>
> Do we need to wait 23.11?
> It is not an ABI breakage.
> And most of these defines are already unused.
Turning them into deprecated would be API breakage though
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 20:26 ` Stephen Hemminger
@ 2023-02-03 20:49 ` Thomas Monjalon
2023-02-05 23:41 ` Stephen Hemminger
0 siblings, 1 reply; 80+ messages in thread
From: Thomas Monjalon @ 2023-02-03 20:49 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
03/02/2023 21:26, Stephen Hemminger:
> On Fri, 03 Feb 2023 21:18:40 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
>
> > 03/02/2023 18:33, Stephen Hemminger:
> > > On Fri, 03 Feb 2023 09:42:45 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > > 03/02/2023 01:25, Stephen Hemminger:
> > > > > On Wed, 1 Feb 2023 13:34:41 +0000
> > > > > Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com> wrote:
> > > > >
> > > > > > --- a/lib/eal/include/rte_log.h
> > > > > > +++ b/lib/eal/include/rte_log.h
> > > > > > @@ -48,6 +48,7 @@ extern "C" {
> > > > > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */
> > > > > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */
> > > > > > #define RTE_LOGTYPE_GSO 20 /**< Log related to GSO. */
> > > > > > +#define RTE_LOGTYPE_MLDEV 21 /**< Log related to mldev. */
> > > > >
> > > > > NAK to this part.
> > > > > No new static logtypes please.
> > > >
> > > > Good catch.
> > > > By the way, we should remove unused RTE_LOGTYPE_*.
> > >
> > > Yes, for 23.11 would like to work down the list.
> >
> > Do we need to wait 23.11?
> > It is not an ABI breakage.
> > And most of these defines are already unused.
>
> Turning them into deprecated would be API breakage though
API breakage is not forbidden.
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v1 01/12] mldev: introduce machine learning device library
2023-02-03 20:49 ` Thomas Monjalon
@ 2023-02-05 23:41 ` Stephen Hemminger
0 siblings, 0 replies; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-05 23:41 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Shivah Shankar Shankar Narayan Rao, Jerin Jacob Kollanukkaran,
dev, Bruce Richardson, Srikanth Yalavarthi, ferruh.yigit,
ajit.khaparde, aboyer, andrew.rybchenko, beilei.xing, chas3,
chenbo.xia, ciara.loftus, Devendra Singh Rawat, ed.czeck,
evgenys, grive, g.singh, zhouguoyang, haiyue.wang, Harman Kalra,
heinrich.kuhn, hemant.agrawal, hyonkim, igorch, Igor Russkikh,
jgrajcia, jasvinder.singh, jianwang, jiawenwu, jingjing.wu,
johndale, john.miller, linville, keith.wiles,
Kiran Kumar Kokkilagadda, oulijun, Liron Himi, longli, mw,
spinler, matan, matt.peters, maxime.coquelin, mk, humin29,
Pradeep Kumar Nalla, Nithin Kumar Dabilpuram, qiming.yang,
qi.z.zhang, Radha Chintakuntla, rahul.lakkireddy, Rasesh Mody,
rosen.xu, sachin.saxena, Satha Koteswara Rao Kottidi,
Shahed Shaikh, shaibran, shepard.siegel, asomalap, somnath.kotur,
sthemmin, steven.webster, Sunil Kumar Kori, mtetsuyah,
Veerasenareddy Burru, viacheslavo, xiao.w.wang,
cloud.wangxiaoyun, yisen.zhuang, yongwang, xuanziyang2,
Prasun Kapoor, Nadav Haklai, Satananda Burla,
Narayana Prasad Raju Athreya, Akhil Goyal, mdr, dmitry.kozliuk,
anatoly.burakov, cristian.dumitrescu, honnappa.nagarahalli,
mattias.ronnblom, ruifeng.wang, drc, konstantin.ananyev,
olivier.matz, jay.jayatheerthan, Ashwin Sekhar T K,
Pavan Nikhilesh Bhagavatula, eagostini, Derek Chickles,
Parijat Shukla, Anup Prabhu, Prince Takkar, david.marchand
On Fri, 03 Feb 2023 21:49:02 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > Good catch.
> > > > > By the way, we should remove unused RTE_LOGTYPE_*.
> > > >
> > > > Yes, for 23.11 would like to work down the list.
> > >
> > > Do we need to wait 23.11?
> > > It is not an ABI breakage.
> > > And most of these defines are already unused.
> >
> > Turning them into deprecated would be API breakage though
>
> API breakage is not forbidden.
>
For the internal ones it would be OK, but what about RTE_LOGTYPE_USER1, etc.?
These need to go through the regular deprecation process.
The problem is that if the types are not registered (see eal_common_log.c)
they might get reused.
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v2 00/12] mldev: introduce machine learning device library
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
` (12 preceding siblings ...)
2023-01-25 14:20 ` [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library Thomas Monjalon
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 01/12] " jerinj
` (12 more replies)
13 siblings, 13 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Machine learning inference library
==================================
Definition of machine learning inference
----------------------------------------
Inference in machine learning is the process of making an output prediction
based on new input data using a pre-trained machine learning model.
The scope of this series includes only inferencing with pre-trained machine learning models;
training and building/compiling ML models is out of scope for this series and the
DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
Motivation for the new library
------------------------------
Multiple semiconductor vendors are offering accelerator products such as DPU
(often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
integrated as part of the product. Use of ML inferencing is increasing in the domain
of packet processing for flow classification, intrusion, malware and anomaly detection.
Lack of inferencing support through DPDK APIs adds complexity and
increased latency from moving data across frameworks (i.e., from the dataplane to
non-dataplane ML frameworks and vice versa). Standardized DPDK APIs for ML
inferencing would enable dataplane solutions to harness the benefit of inline
inferencing supported by the hardware.
Contents
--------
A) API specification for:
1) Discovery of ML capabilities (e.g., device specific features) in a vendor
independent fashion
2) Definition of functions to handle ML devices, which includes probing,
initialization and termination of the devices.
3) Definition of functions to handle ML models used to perform inference operations.
4) Definition of function to handle quantize and dequantize operations
B) Common code for above specification
rfc..v1:
- Added programmer guide documentation
- Added implementation for common code
v1..v2:
- Moved to a dynamic log type (Stephen)
- Changed model ID to uint16_t from int16_t (Stephen)
- Added release note updates
Machine learning library framework
----------------------------------
The ML framework is built on the following model:
+-----------------+ rte_ml_[en|de]queue_burst()
| | |
| Machine o------+ +--------+ |
| Learning | | | queue | | +------+
| Inference o------+-----o |<===o===>|Core 0|
| Engine | | | pair 0 | +------+
| o----+ | +--------+
| | | |
+-----------------+ | | +--------+
^ | | | queue | +------+
| | +-----o |<=======>|Core 1|
| | | pair 1 | +------+
| | +--------+
+--------+--------+ |
| +-------------+ | | +--------+
| | Model 0 | | | | queue | +------+
| +-------------+ | +-------o |<=======>|Core N|
| +-------------+ | | pair N | +------+
| | Model 1 | | +--------+
| +-------------+ |
| +-------------+ |<------- rte_ml_model_load()
| | Model .. | |-------> rte_ml_model_info()
| +-------------+ |<------- rte_ml_model_start()
| +-------------+ |<------- rte_ml_model_stop()
| | Model N | |<------- rte_ml_model_params_update()
| +-------------+ |<------- rte_ml_model_unload()
+-----------------+
ML Device: A hardware- or software-based implementation of the ML device API for
running inferences using a pre-trained ML model.
ML Model: An ML model is an algorithm trained over a dataset. A model consists of
the procedure/algorithm and the data/patterns required to make predictions on live data.
Once the model is created and trained outside of the DPDK scope, it can be loaded
via rte_ml_model_load() and then started using the rte_ml_model_start() API.
rte_ml_model_params_update() can be used to update model parameters such as weights
and bias without unloading the model via rte_ml_model_unload().
ML Inference: ML inference is the process of feeding data to the model via the
rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
outputs/predictions from the started model.
In all functions of the ML device API, the ML device is designated by an
integer >= 0 named the device identifier *dev_id*.
The functions exported by the ML device API to set up a device, designated by
its device identifier, must be invoked in the following order:
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_dev_start()
A model is required to run inference operations on user-specified inputs.
The application needs to invoke the ML model APIs in the following order before queueing
inference jobs (a combined bring-up sketch follows the list):
- rte_ml_model_load()
- rte_ml_model_start()
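Putting the two sequences together, a minimal bring-up sketch for device 0 (error handling
omitted; the configuration values are illustrative, and params must be populated with the
model name/address/size beforehand, as in the full example application below):

	struct rte_ml_dev_config config = {0};
	struct rte_ml_dev_qp_conf qp_conf = {0};
	struct rte_ml_model_params params; /* filled by the application */
	uint16_t model_id;

	config.socket_id = rte_ml_dev_socket_id(0);
	config.max_nb_models = 1;
	config.nb_queue_pairs = 1;
	qp_conf.nb_desc = 1;

	rte_ml_dev_configure(0, &config);                              /* 1 */
	rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id); /* 2 */
	rte_ml_dev_start(0);                                           /* 3 */
	rte_ml_model_load(0, &params, &model_id);                      /* 4 */
	rte_ml_model_start(0, model_id);                               /* 5 */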
The rte_ml_model_info() API is provided to retrieve information related to the model.
This information includes the shape and type of the input and output required for the inference.
Data quantization and dequantization are among the main aspects of the ML domain. They involve
converting input data from a higher-precision to a lower-precision data type, and vice versa
for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
dequantization through rte_ml_io_dequantize(). These APIs can handle input
and output buffers holding data for multiple batches.
Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used to get the
size of the quantized and dequantized multi-batch input and output buffers.
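For instance, preparing a quantized input buffer could look like the sketch below. The
meaning of the two size arguments (quantized size first, dequantized size second) and
the (dbuffer, qbuffer) argument order of rte_ml_io_quantize() are assumptions based on
this series:

	uint64_t qsize, dsize;
	void *dbuffer, *qbuffer;
	uint16_t nb_batches = 1; /* or info.batch_size from the model info */

	/* query buffer sizes for one multi-batch input */
	rte_ml_io_input_size_get(0, model_id, nb_batches, &qsize, &dsize);
	/* ... allocate dbuffer (dsize) and qbuffer (qsize), fill dbuffer ... */
	/* convert user-precision input to the model's quantized data type */
	rte_ml_io_quantize(0, model_id, nb_batches, dbuffer, qbuffer);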
The user can optionally update the model parameters with rte_ml_model_params_update() after
invoking the rte_ml_model_stop() API on a given model ID.
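A minimal sketch of that sequence, assuming the (dev_id, model_id, buffer) prototype used
in this series (new_params is a hypothetical buffer holding the updated weights/bias):

	void *new_params; /* updated weights/bias blob, prepared by the application */

	rte_ml_model_stop(0, model_id);
	rte_ml_model_params_update(0, model_id, new_params);
	rte_ml_model_start(0, model_id);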
The application can invoke, in any order, the functions exported by the ML API to enqueue
inference jobs and dequeue inference responses.
If the application wants to change the device configuration (i.e., call
rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated, then
the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
for the given model. The application does not need to call the rte_ml_dev_stop() API for
any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload(), etc.
Once the device is in the started state after invoking the rte_ml_dev_start() API and the model is in
the started state after invoking the rte_ml_model_start() API, the application can call the
rte_ml_enqueue_burst() and rte_ml_dequeue_burst() APIs on the destined device and model ID.
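For instance, a single synchronous inference might look like this sketch (op_pool is an ML
op pool created with rte_ml_op_pool_create(), as in the full example application below):

	struct rte_ml_op *op;

	rte_mempool_get_bulk(op_pool, (void **)&op, 1);
	/* fill op->model_id and op->input/op->output as in the example below */
	rte_ml_enqueue_burst(0, 0, &op, 1);
	while (rte_ml_dequeue_burst(0, 0, &op, 1) != 1)
		; /* poll queue pair 0 until the inference completes */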
Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
Typical application utilisation of the ML API follows this
programming flow.
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_model_load()
- rte_ml_model_start()
- rte_ml_model_info()
- rte_ml_dev_start()
- rte_ml_enqueue_burst()
- rte_ml_dequeue_burst()
- rte_ml_model_stop()
- rte_ml_model_unload()
- rte_ml_dev_stop()
- rte_ml_dev_close()
Regarding multi-threading: by default, all the functions of the ML device API exported by a PMD
are lock-free functions, which are assumed not to be invoked in parallel on different logical cores
on the same target object. For instance, the dequeue function of a poll mode driver cannot be
invoked in parallel on two logical cores to operate on the same queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the user application to enforce this rule.
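A minimal sketch of one way an application can honor this rule, assuming it sets up one
queue pair per worker lcore and passes each worker a unique qp_id (all names here are
illustrative):

#include <stdbool.h>
#include <stdint.h>
#include <rte_common.h>
#include <rte_mldev.h>

static volatile bool keep_running = true;

/* launched once per lcore, e.g. via rte_eal_remote_launch(); each worker
 * polls only its own queue pair on device 0, so the lock-free PMD
 * functions are never invoked in parallel on the same queue pair */
static int
ml_worker(void *arg)
{
	uint16_t qp_id = (uint16_t)(uintptr_t)arg; /* unique per lcore */
	struct rte_ml_op *ops[32];
	uint16_t nb;

	while (keep_running) {
		nb = rte_ml_dequeue_burst(0, qp_id, ops, RTE_DIM(ops));
		/* ... process 'nb' completed inferences, enqueue new ops ... */
	}
	return 0;
}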
Example application usage for ML inferencing
--------------------------------------------
This example application demonstrates the programming model of the ML device
library. The example omits some error checks to keep the application simple. It
also assumes that the input data received is already quantized and that the expected output
is also quantized. In order to handle non-quantized inputs and outputs, users can
invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.
#define ML_MODEL_NAME "model"
#define IO_MZ "io_mz"
struct app_ctx {
char model_file[PATH_MAX];
char inp_file[PATH_MAX];
char out_file[PATH_MAX];
struct rte_ml_model_params params;
struct rte_ml_model_info info;
uint16_t id;
uint64_t input_size;
uint64_t output_size;
uint8_t *input_buffer;
uint8_t *output_buffer;
} __rte_cache_aligned;
struct app_ctx ctx;
static int
parse_args(int argc, char **argv)
{
int opt, option_index;
static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
{"input", required_argument, NULL, 'i'},
{"output", required_argument, NULL, 'o'},
{NULL, 0, NULL, 0}};
while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
switch (opt) {
case 'm':
strncpy(ctx.model_file, optarg, PATH_MAX - 1);
break;
case 'i':
strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
break;
case 'o':
strncpy(ctx.out_file, optarg, PATH_MAX - 1);
break;
default:
return -1;
}
return 0;
}
int
main(int argc, char **argv)
{
struct rte_ml_dev_qp_conf qp_conf;
struct rte_ml_dev_config config;
struct rte_ml_dev_info dev_info;
const struct rte_memzone *mz;
struct rte_mempool *op_pool;
struct rte_ml_op *op_enq;
struct rte_ml_op *op_deq;
FILE *fp;
int rc;
/* Initialize EAL */
rc = rte_eal_init(argc, argv);
if (rc < 0)
rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
argc -= rc;
argv += rc;
/* Parse application arguments (after the EAL args) */
if (parse_args(argc, argv) < 0)
rte_exit(EXIT_FAILURE, "Invalid application arguments\n");
/* Step 1: Check for ML devices */
if (rte_ml_dev_count() <= 0)
rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");
/* Step 2: Get device info */
if (rte_ml_dev_info_get(0, &dev_info) != 0)
rte_exit(EXIT_FAILURE, "Failed to get device info\n");
/* Step 3: Configure ML device, use device 0 */
config.socket_id = rte_ml_dev_socket_id(0);
config.max_nb_models = dev_info.max_models;
config.nb_queue_pairs = dev_info.max_queue_pairs;
if (rte_ml_dev_configure(0, &config) != 0)
rte_exit(EXIT_FAILURE, "Device configuration failed\n");
/* Step 4: Setup queue pairs, used qp_id = 0 */
qp_conf.nb_desc = 1;
if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");
/* Step 5: Start device */
if (rte_ml_dev_start(0) != 0)
rte_exit(EXIT_FAILURE, "Device start failed\n");
/* Step 6: Read model data and update load params structure */
fp = fopen(ctx.model_file, "r");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open model file\n");
fseek(fp, 0, SEEK_END);
ctx.params.size = ftell(fp);
fseek(fp, 0, SEEK_SET);
ctx.params.addr = malloc(ctx.params.size);
if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size) {
fclose(fp);
rte_exit(EXIT_FAILURE, "Failed to read model\n");
}
fclose(fp);
strcpy(ctx.params.name, ML_MODEL_NAME);
/* Step 7: Load the model */
if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
rte_exit(EXIT_FAILURE, "Failed to load model\n");
free(ctx.params.addr);
/* Step 8: Start the model */
if (rte_ml_model_start(0, ctx.id) != 0)
rte_exit(EXIT_FAILURE, "Failed to start model\n");
/* Step 9: Allocate buffers for quantized input and output */
/* Get model information */
if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
rte_exit(EXIT_FAILURE, "Failed to get model info\n");
/* Get the buffer size for input and output */
rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);
mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
if (mz == NULL)
rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");
ctx.input_buffer = mz->addr;
ctx.output_buffer = ctx.input_buffer + ctx.input_size;
/* Step 10: Fill the input data */
fp = fopen(ctx.inp_file, "r");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open input file\n");
if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
fclose(fp);
rte_exit(EXIT_FAILURE, "Failed to read input file\n");
}
fclose(fp);
/* Step 11: Create ML op mempool */
op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
if (op_pool == NULL)
rte_exit(EXIT_FAILURE, "Failed to create op pool\n");
/* Step 12: Form an ML op */
rte_mempool_get_bulk(op_pool, (void **)&op_enq, 1);
op_enq->model_id = ctx.id;
op_enq->nb_batches = ctx.info.batch_size;
op_enq->mempool = op_pool;
op_enq->input.addr = ctx.input_buffer;
op_enq->input.length = ctx.input_size;
op_enq->input.next = NULL;
op_enq->output.addr = ctx.output_buffer;
op_enq->output.length = ctx.output_size;
op_enq->output.next = NULL;
/* Step 13: Enqueue jobs */
rte_ml_enqueue_burst(0, 0, &op_enq, 1);
/* Step 14: Dequeue jobs and release op pool */
while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
;
/* Step 15: Write output */
fp = fopen(ctx.out_file, "w+");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open output file\n");
fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
fclose(fp);
/* Step 16: Clean up */
/* Stop ML model */
rte_ml_model_stop(0, ctx.id);
/* Unload ML model */
rte_ml_model_unload(0, ctx.id);
/* Free input/output memory */
rte_memzone_free(rte_memzone_lookup(IO_MZ));
/* Free the ml op back to pool */
rte_mempool_put_bulk(op_pool, (void **)&op_deq, 1);
/* Free ml op pool */
rte_mempool_free(op_pool);
/* Stop the device */
rte_ml_dev_stop(0);
rte_ml_dev_close(0);
rte_eal_cleanup();
return 0;
}
Jerin Jacob (1):
mldev: introduce machine learning device library
Srikanth Yalavarthi (11):
mldev: support PMD functions for ML device
mldev: support ML device handling functions
mldev: support ML device queue-pair setup
mldev: support handling ML models
mldev: support input and output data handling
mldev: support ML op pool and ops
mldev: support inference enqueue and dequeue
mldev: support device statistics
mldev: support device extended statistics
mldev: support to retrieve error information
mldev: support to get debug info and test device
MAINTAINERS | 5 +
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
doc/guides/rel_notes/release_23_03.rst | 5 +
lib/meson.build | 1 +
lib/mldev/meson.build | 27 +
lib/mldev/rte_mldev.c | 905 ++++++++++++++++++
lib/mldev/rte_mldev.h | 1099 ++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 720 ++++++++++++++
lib/mldev/rte_mldev_pmd.c | 62 ++
lib/mldev/rte_mldev_pmd.h | 151 +++
lib/mldev/version.map | 50 +
16 files changed, 3931 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
create mode 100644 lib/mldev/version.map
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v2 01/12] mldev: introduce machine learning device library
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device jerinj
` (11 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Thomas Monjalon, Bruce Richardson, Srikanth Yalavarthi
Cc: ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor, Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Add the mldev API specification to standardize and use machine learning
device and inference operations in a vendor-neutral way.
The following operations are abstracted through the APIs:
- ML device capability probe
- ML device configuration
- ML device queue pair configuration
- ML device state management
- ML device stat/xstat operations
- ML model load/unload/start/stop operations
- ML model information probe
- ML IO operations to find size for input and output buffers
- ML quantize and dequantize operations
- ML ops pool creation and free operations
- ML device enqueue/dequeue fast-path inference operations
Also added programming guide.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
MAINTAINERS | 5 +
config/rte_config.h | 3 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
doc/guides/rel_notes/release_23_03.rst | 5 +
lib/meson.build | 1 +
lib/mldev/meson.build | 18 +
lib/mldev/rte_mldev.c | 8 +
lib/mldev/rte_mldev.h | 1099 ++++++++++++++++++++++
lib/mldev/version.map | 7 +
13 files changed, 2049 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 9a0f416d2e..a39c00a608 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -538,6 +538,11 @@ F: drivers/raw/skeleton/
F: app/test/test_rawdev.c
F: doc/guides/prog_guide/rawdev.rst
+ML device API - EXPERIMENTAL
+M: Srikanth Yalavarthi <syalavarthi@marvell.com>
+F: lib/mldev/
+F: doc/guides/prog_guide/mldev.rst
+
Memory Pool Drivers
-------------------
diff --git a/config/rte_config.h b/config/rte_config.h
index 7b8c85e948..2c91c2b3d3 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -82,6 +82,9 @@
/* rawdev defines */
#define RTE_RAWDEV_MAX_DEVS 64
+/* mldev defines */
+#define RTE_MLDEV_MAX_DEVS 64
+
/* ip_fragmentation defines */
#define RTE_LIBRTE_IP_FRAG_MAX_FRAG 8
// RTE_LIBRTE_IP_FRAG_TBL_STAT is not set
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de488c7abf..a12562977a 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -22,6 +22,7 @@ The public API headers are grouped by topics:
[compress](@ref rte_comp.h),
[regexdev](@ref rte_regexdev.h),
[dmadev](@ref rte_dmadev.h),
+ [mldev](@ref rte_mldev.h),
[eventdev](@ref rte_eventdev.h),
[event_eth_rx_adapter](@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter](@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index f0886c3bd1..5d6416d3e0 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -57,6 +57,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/mempool \
@TOPDIR@/lib/meter \
@TOPDIR@/lib/metrics \
+ @TOPDIR@/lib/mldev \
@TOPDIR@/lib/node \
@TOPDIR@/lib/net \
@TOPDIR@/lib/pcapng \
diff --git a/doc/guides/prog_guide/img/mldev_flow.svg b/doc/guides/prog_guide/img/mldev_flow.svg
new file mode 100644
index 0000000000..6c5dda14e5
--- /dev/null
+++ b/doc/guides/prog_guide/img/mldev_flow.svg
@@ -0,0 +1,714 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!-- SPDX-License-Identifier: BSD-3-Clause -->
+<!-- Copyright (c) 2022 Marvell. -->
+<!-- Created with Inkscape (http://www.inkscape.org/) -->
+
+<svg
+ width="320mm"
+ height="297mm"
+ viewBox="0 0 320 297"
+ version="1.1"
+ id="svg6899"
+ inkscape:version="1.2.1 (9c6d41e410, 2022-07-14)"
+ sodipodi:docname="mldev_flow.svg"
+ inkscape:export-filename="mldev_flow.png"
+ inkscape:export-xdpi="96"
+ inkscape:export-ydpi="96"
+ xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
+ xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
+ xmlns="http://www.w3.org/2000/svg"
+ xmlns:svg="http://www.w3.org/2000/svg">
+ <sodipodi:namedview
+ id="namedview6901"
+ pagecolor="#ffffff"
+ bordercolor="#000000"
+ borderopacity="0.25"
+ inkscape:showpageshadow="2"
+ inkscape:pageopacity="0.0"
+ inkscape:pagecheckerboard="0"
+ inkscape:deskcolor="#d1d1d1"
+ inkscape:document-units="mm"
+ showgrid="false"
+ inkscape:connector-spacing="0"
+ inkscape:lockguides="false"
+ inkscape:zoom="0.49638341"
+ inkscape:cx="640.63382"
+ inkscape:cy="525.80323"
+ inkscape:window-width="1920"
+ inkscape:window-height="986"
+ inkscape:window-x="-11"
+ inkscape:window-y="-11"
+ inkscape:window-maximized="1"
+ inkscape:current-layer="layer1" />
+ <defs
+ id="defs6896">
+ <marker
+ style="overflow:visible"
+ id="RoundedArrow"
+ refX="5"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="RoundedArrow"
+ markerWidth="6.1347523"
+ markerHeight="5.9304948"
+ viewBox="0 0 6.1347524 5.9304951"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.7)"
+ d="m -0.21114562,-4.1055728 6.42229122,3.21114561 a 1,1 90 0 1 0,1.78885438 L -0.21114562,4.1055728 A 1.236068,1.236068 31.717474 0 1 -2,3 v -6 a 1.236068,1.236068 148.28253 0 1 1.78885438,-1.1055728 z"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:none"
+ id="path1367" />
+ </marker>
+ <marker
+ style="overflow:visible"
+ id="TriangleStart"
+ refX="4"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="TriangleStart"
+ markerWidth="5.3244081"
+ markerHeight="6.155385"
+ viewBox="0 0 5.3244081 6.1553851"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.5)"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:context-stroke;stroke-width:1pt"
+ d="M 5.77,0 -2.88,5 V -5 Z"
+ id="path135" />
+ </marker>
+ </defs>
+ <g
+ inkscape:label="Layer 1"
+ inkscape:groupmode="layer"
+ id="layer1">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;paint-order:stroke fill markers"
+ id="rect39991"
+ width="312.88394"
+ height="286.7659"
+ x="3.5580292"
+ y="5.1170502"
+ ry="18.197132" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.68664,155.38145 h 32.15418"
+ id="path24358"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="m 114.68664,179.58099 h 32.15008"
+ id="path24360"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,203.78389 h 32.15008"
+ id="path24362"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,227.98576 32.14997,0"
+ id="path24364"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,252.18432 H 114.68664"
+ id="path24366"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,276.38309 H 114.68664"
+ id="path24368"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:2, 1;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24370"
+ width="18.09137"
+ height="13.568528"
+ x="127.27605"
+ y="208.81961"
+ ry="2.7394907"
+ inkscape:connector-avoid="true" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:4, 2;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 70.388979,148.58514 -1e-6,-46.3516"
+ id="path24426"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1"
+ inkscape:connection-end="#rect24176" />
+ <g
+ id="g42647">
+ <g
+ id="g31403"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844498;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844498;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-9"
+ width="99.155487"
+ height="14.152132"
+ x="190.88715"
+ y="229.93475"
+ ry="2.2479143"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-236.90309"
+ y="240.37343"
+ id="text31115"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113"
+ style="stroke:none;stroke-width:0.75"
+ x="-236.90309"
+ y="240.37343">rte_ml_model_update_params()</tspan></text>
+ </g>
+ <g
+ id="g31398"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68902, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-4"
+ width="99.155495"
+ height="14.152357"
+ x="190.88705"
+ y="205.73608"
+ ry="2.2479498"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-212.70453"
+ y="240.37334"
+ id="text31115-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-212.70453"
+ y="240.37334">rte_ml_model_stop()</tspan></text>
+ </g>
+ <g
+ id="g31408"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-2"
+ width="99.155495"
+ height="14.152359"
+ x="190.88715"
+ y="254.13341"
+ ry="2.2479503"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-261.10187"
+ y="240.37343"
+ id="text31115-1"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-3"
+ style="stroke:none;stroke-width:0.75"
+ x="-261.10187"
+ y="240.37343">rte_ml_model_unload()</tspan></text>
+ </g>
+ <g
+ id="g31393"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844566;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844566;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-5"
+ width="99.155434"
+ height="14.154394"
+ x="190.88718"
+ y="181.53319"
+ ry="2.2482734"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-188.50266"
+ y="240.37343"
+ id="text31115-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-2"
+ style="stroke:none;stroke-width:0.75"
+ x="-188.50266"
+ y="240.37343">rte_ml_model_start()</tspan></text>
+ </g>
+ <g
+ id="g31388"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844565;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844565;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-8"
+ width="99.155434"
+ height="14.154395"
+ x="190.88718"
+ y="157.33029"
+ ry="2.2482736"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-164.29976"
+ y="240.37343"
+ id="text31115-6"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-5"
+ style="stroke:none;stroke-width:0.75"
+ x="-164.29976"
+ y="240.37343">rte_ml_model_info_get()</tspan></text>
+ </g>
+ <g
+ id="g31383"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2"
+ width="99.155495"
+ height="14.152369"
+ x="190.89127"
+ y="133.13176"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-140.10022"
+ y="240.37755"
+ id="text31115-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35"
+ style="stroke:none;stroke-width:0.75"
+ x="-140.10022"
+ y="240.37755">rte_ml_model_load()</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="112.15163"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-119.12009"
+ y="233.56647"
+ id="text31115-0-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-119.12009"
+ y="233.56647">rte_ml_dequeue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.90712,47.649005 h 56.16045"
+ id="path24248"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176"
+ inkscape:connection-end="#rect24200" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 171.06762,70.71111 -56.1605,0.0024"
+ id="path24250"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="M 171.06765,93.773951 H 114.90712"
+ id="path24252"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5-2" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44396,47.649004 h 36.42795"
+ id="path24566"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.444,70.710168 h 36.42791"
+ id="path24568"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44395,93.773951 36.42796,-10e-7"
+ id="path24570"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42675">
+ <g
+ id="g31358"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200"
+ width="44.376362"
+ height="17.244751"
+ x="190.77635"
+ y="22.794853"
+ ry="2.7391431"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.802492"
+ y="212.98004"
+ id="text31256"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254"
+ style="stroke-width:0.75"
+ x="-31.802492"
+ y="212.98004">Queue Pair 0</tspan></text>
+ </g>
+ <g
+ id="g31353"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5"
+ width="44.376362"
+ height="17.244749"
+ x="190.7764"
+ y="45.856018"
+ ry="2.7391429"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-54.863655"
+ y="213.10411"
+ id="text31256-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-9"
+ style="stroke-width:0.75"
+ x="-54.863655"
+ y="213.10411">Queue Pair ..</tspan></text>
+ </g>
+ <g
+ id="g31363"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623731;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24746, 0.623731;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2"
+ width="44.37627"
+ height="17.249832"
+ x="190.77643"
+ y="68.917259"
+ ry="2.7399504"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-77.927437"
+ y="213.08859"
+ id="text31256-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-8"
+ style="stroke-width:0.75"
+ x="-77.927437"
+ y="213.08859">Queue Pair N</tspan></text>
+ </g>
+ </g>
+ <g
+ id="g42661">
+ <g
+ id="g31368"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="25.995117"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.941525"
+ y="287.03415"
+ id="text31260"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258"
+ style="stroke-width:0.75"
+ x="-31.941525"
+ y="287.03415">Core 0</tspan></text>
+ </g>
+ <g
+ id="g31373"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-4"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="49.056282"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-55.00008"
+ y="287.15549"
+ id="text31260-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-7"
+ style="stroke-width:0.75"
+ x="-55.00008"
+ y="287.15549">Core ..</tspan></text>
+ </g>
+ <g
+ id="g31378"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-41"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="72.120064"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-78.063866"
+ y="287.13998"
+ id="text31260-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-8"
+ style="stroke-width:0.75"
+ x="-78.063866"
+ y="287.13998">Core N</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5-6"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="13.539296"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-20.507757"
+ y="233.56647"
+ id="text31115-0-5-7"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8-7"
+ style="stroke:none;stroke-width:0.75"
+ x="-20.507757"
+ y="233.56647">rte_ml_enqueue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:2.25, 0.75;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#RoundedArrow)"
+ d="M 233.65793,27.691665 V 112.15163"
+ id="path36804"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42683">
+ <rect
+ style="fill:#44d7f4;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176"
+ width="89.036293"
+ height="63.036304"
+ x="25.870831"
+ y="39.197231"
+ ry="3.0941005" />
+ <text
+ xml:space="preserve"
+ style="font-size:11.2889px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-49.288273"
+ y="70.228432"
+ id="text38896"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan38894"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-49.288273"
+ y="70.228432">Machine</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-63.399399"
+ y="70.228432"
+ id="tspan38898">Learning</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-77.510529"
+ y="70.228432"
+ id="tspan38900">Inference</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-91.621651"
+ y="70.228432"
+ id="tspan38902">Engine</tspan></text>
+ </g>
+ <g
+ id="g42621">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.405;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176-1"
+ width="88.595322"
+ height="134.59531"
+ x="26.09132"
+ y="148.58514"
+ ry="6.6065331" />
+ <g
+ id="g42601">
+ <g
+ id="g39966"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="146.14212"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-157.3761"
+ y="130.49591"
+ id="text39799"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-157.3761"
+ y="130.49591">Model 0</tspan></text>
+ </g>
+ <g
+ id="g39971"
+ transform="translate(-60.175151,10.144334)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-8"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="178.65079"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-189.88477"
+ y="130.49591"
+ id="text39799-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-1"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-189.88477"
+ y="130.49591">Model 1</tspan></text>
+ </g>
+ <g
+ id="g39976"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-9"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="211.15947"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-222.39345"
+ y="130.49591"
+ id="text39799-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-8"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-222.39345"
+ y="130.49591">Model ..</tspan></text>
+ </g>
+ <g
+ id="g39981"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-7"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="243.66815"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-254.90213"
+ y="130.49591"
+ id="text39799-90"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-5"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-254.90213"
+ y="130.49591">Model N</tspan></text>
+ </g>
+ </g>
+ </g>
+ <text
+ xml:space="preserve"
+ style="font-size:14.1111px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-279.79742"
+ y="275.46826"
+ id="text38896-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:14.1111px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-279.79742"
+ y="275.46826"
+ id="tspan38902-6">mldev</tspan></text>
+ </g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..d7f2a28bdb 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
regexdev
dmadev
gpudev
+ mldev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
diff --git a/doc/guides/prog_guide/mldev.rst b/doc/guides/prog_guide/mldev.rst
new file mode 100644
index 0000000000..a0bd370e72
--- /dev/null
+++ b/doc/guides/prog_guide/mldev.rst
@@ -0,0 +1,186 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright (c) 2022 Marvell.
+
+Machine Learning Device Library
+===============================
+
+The MLDEV library provides a Machine Learning device framework for the management and
+provisioning of hardware and software ML poll mode drivers, defining APIs which
+support a number of ML operations including device handling and inference processing.
+ML model creation and training are outside the scope of this library.
+
+The ML framework is built on the following model:
+
+.. _figure_mldev_work_flow:
+
+.. figure:: img/mldev_flow.*
+
+ Work flow of inference on MLDEV
+
+**ML Device**: A hardware or software-based implementation of ML device API for running
+inferences using a pre-trained ML model.
+
+**ML Model**: An ML model is an algorithm trained over a dataset. A model consists of
+the procedure/algorithm and the data/patterns required to make predictions on live data.
+Once the model is created and trained outside of the DPDK scope, it can be loaded
+via the rte_ml_model_load() API and then started using the rte_ml_model_start() API.
+The rte_ml_model_params_update() API can be used to update model parameters, such as
+weights and bias, without having to unload the model with rte_ml_model_unload().
+
+**ML Inference**: ML inference is the process of feeding data to the model via the
+rte_ml_enqueue_burst() API and retrieving the computed outputs/predictions from the
+started model via the rte_ml_dequeue_burst() API.
+
+Design Principles
+-----------------
+
+The MLDEV library follows the same basic principles as those used in DPDK's
+Ethernet Device framework and the Crypto framework. The MLDEV framework provides
+a generic Machine Learning device framework which supports both physical (hardware)
+and virtual (software) ML devices as well as an ML API to manage and configure ML
+devices. The API also supports performing ML inference operations through an ML poll
+mode driver.
+
+
+Device Operations
+-----------------
+
+Device Creation
+~~~~~~~~~~~~~~~
+
+Physical ML devices are discovered during PCI probe/enumeration, through the
+EAL functions which are executed at DPDK initialization, based on their PCI device
+identifier, a unique PCI BDF (bus/bridge, device, function). Like other physical
+devices in DPDK, physical ML devices can be white-listed or black-listed
+using the EAL command line options.
+
+
+Device Identification
+~~~~~~~~~~~~~~~~~~~~~
+
+Each device, whether virtual or physical, is uniquely designated by two
+identifiers:
+
+- A unique device index used to designate the ML device in all functions
+ exported by the MLDEV API.
+
+- A device name used to designate the ML device in console messages, for
+ administration or debugging purposes.
+
+Device Features and Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ML devices may support different feature sets. The ``rte_ml_dev_info_get`` API
+can be used to retrieve information about the device and the features it
+supports.
+
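+As an illustration, a minimal sketch of querying the device capabilities
+(error handling is omitted for brevity):
+
+.. code-block:: c
+
+   struct rte_ml_dev_info dev_info;
+
+   /* Retrieve the capabilities of ML device 0 */
+   if (rte_ml_dev_info_get(0, &dev_info) == 0) {
+       /* dev_info.max_models, dev_info.max_queue_pairs and dev_info.max_desc
+        * can now be used to size the device configuration.
+        */
+   }
+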
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+The configuration of each ML device includes the following operations:
+
+- Allocation of resources, including hardware resources if a physical device.
+- Resetting the device into a well-known default state.
+- Initialization of statistics counters.
+
+The ``rte_ml_dev_configure`` API is used to configure an ML device.
+
+.. code-block:: c
+
+ int rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *cfg);
+
+The ``rte_ml_dev_config`` structure is used to pass the configuration parameters
+for the ML device, for example the number of queue pairs, the maximum number of
+models, the maximum model size and so on.
+
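+A minimal configuration sketch is shown below; the chosen values are
+illustrative assumptions, not requirements of the API:
+
+.. code-block:: c
+
+   struct rte_ml_dev_info dev_info;
+   struct rte_ml_dev_config config;
+   int16_t dev_id = 0;
+
+   rte_ml_dev_info_get(dev_id, &dev_info);
+
+   config.socket_id = rte_ml_dev_socket_id(dev_id);
+   config.nb_models = 1;                             /* <= dev_info.max_models */
+   config.nb_queue_pairs = dev_info.max_queue_pairs;
+
+   if (rte_ml_dev_configure(dev_id, &config) < 0)
+       rte_panic("configuration of device %d failed\n", dev_id);
+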
+Configuration of Queue Pairs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each ML device can be configured with multiple queue pairs.
+Each queue pair is configured using ``rte_ml_dev_queue_pair_setup``.
+
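+A minimal queue pair setup sketch follows; the descriptor count is an
+illustrative assumption and must not exceed ``rte_ml_dev_info::max_desc``:
+
+.. code-block:: c
+
+   struct rte_ml_dev_qp_conf qp_conf = {
+       .nb_desc = 128, /* number of descriptors for this queue pair */
+       .cb = NULL,     /* no flush callback during rte_ml_dev_stop() */
+   };
+
+   /* Set up queue pair 0 of device 0 with no NUMA constraint */
+   if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, SOCKET_ID_ANY) < 0)
+       rte_panic("queue pair setup failed\n");
+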
+Logical Cores, Memory and Queues Pair Relationships
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Multiple logical cores should never share the same queue pair for enqueuing
+operations or dequeuing operations on the same ML device, since this would
+require global locks and hinder performance.
+
+Configuration of Machine Learning models
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pre-trained ML models that are built using external ML compiler / training frameworks
+are used to perform inference operations. These models are configured on an ML device
+in a two-stage process that includes loading the model on an ML device, and starting
+the model to accept inference operations. Inference operations can be queued for a
+model only when the model is in the started state. The model load stage assigns a
+model ID, which is unique for the model in a driver's context. The model ID is used
+during all subsequent slow-path and fast-path operations.
+
+Model loading and start is done through the ``rte_ml_model_load`` and
+``rte_ml_model_start`` functions.
+
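+A minimal sketch of loading and starting a model follows; ``model_buffer`` and
+``model_size`` are assumed to hold a pre-compiled model read into memory
+beforehand:
+
+.. code-block:: c
+
+   struct rte_ml_model_params params = {
+       .addr = model_buffer, /* buffer holding the pre-trained model */
+       .size = model_size,   /* size of the model buffer in bytes */
+   };
+   uint16_t model_id;
+
+   /* Load the model on the device; a model ID is assigned on success */
+   if (rte_ml_model_load(dev_id, &params, &model_id) < 0)
+       rte_panic("model load failed\n");
+
+   /* Start the model so that inference operations can be enqueued */
+   if (rte_ml_model_start(dev_id, model_id) < 0)
+       rte_panic("model start failed\n");
+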
+Similarly, stop and unload are done through the ``rte_ml_model_stop`` and
+``rte_ml_model_unload`` functions.
+
+The stop and unload functions release the resources allocated for the
+model. Inference tasks cannot be queued for a model that is stopped.
+
+Detailed information related to the model can be retrieved from the driver using the
+function ``rte_ml_model_info_get``. Model information is accessible to the application
+through the ``rte_ml_model_info`` structure. Information available to the user
+includes details of the inputs and outputs, and the maximum batch size
+supported by the model.
+
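+For example, a short sketch of retrieving and printing the model information:
+
+.. code-block:: c
+
+   struct rte_ml_model_info info;
+
+   if (rte_ml_model_info_get(dev_id, model_id, &info) == 0)
+       printf("model %s: %u inputs, %u outputs, max batch size %u\n",
+              info.name, info.nb_inputs, info.nb_outputs, info.batch_size);
+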
+The user can optionally update model parameters, such as weights and bias, without
+unloading the model, through the ``rte_ml_model_params_update`` function. A model
+must be in the stopped state to update its parameters, and must be started again
+before inference requests can be enqueued.
+
+Enqueue / Dequeue
+~~~~~~~~~~~~~~~~~
+
+The burst enqueue API uses a ML device identifier and a queue pair identifier
+to specify the device queue pair to schedule the processing on. The ``nb_ops``
+parameter is the number of operations to process which are supplied in the
+``ops`` array of ``rte_ml_op`` structures. The enqueue function returns the
+number of operations it enqueued for processing; a return value equal to
+``nb_ops`` means that all operations have been enqueued.
+
+The dequeue API uses the same format as the enqueue API, but the ``nb_ops`` and
+``ops`` parameters are now used to specify the maximum number of processed
+operations the user wishes to retrieve and the location in which to store them.
+The API call returns the actual number of processed operations returned; this
+can never be larger than ``nb_ops``.
+
+``rte_ml_op`` provides the required information to the driver to queue an ML inference
+task. ML op specifies the model to be used and the number of batches to be executed in
+the inference task. Input and output buffer information is specified through the
+structure ``rte_ml_buff_seg``, which supports segmented data. Input is provided through
+the ``rte_ml_op::input`` and output through ``rte_ml_op::output``. The data pointed
+to by each op should not be released until that op has been dequeued.
+
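+A simplified enqueue/dequeue sketch is shown below. ``BURST_SIZE`` is an
+illustrative constant, and the ops are assumed to have been allocated from an
+``rte_ml_op_pool_create()`` pool and populated with the model ID, batch count
+and input/output buffers; a real application would also check the status of
+each dequeued op:
+
+.. code-block:: c
+
+   struct rte_ml_op *ops[BURST_SIZE];
+   uint16_t nb_enq, nb_deq;
+
+   /* Enqueue a burst of inference operations on queue pair qp_id */
+   nb_enq = rte_ml_enqueue_burst(dev_id, qp_id, ops, BURST_SIZE);
+
+   /* Poll the same queue pair until the processed operations come back */
+   do {
+       nb_deq = rte_ml_dequeue_burst(dev_id, qp_id, ops, nb_enq);
+   } while (nb_deq == 0);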
+
+Quantize and Dequantize
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Inference operations performed with lower precision types improve the throughput and
+efficiency of inference execution, with a minimal loss of accuracy that is within
+the tolerance limits. Quantization and dequantization are the processes of converting
+data from a higher precision type to a lower precision type and vice-versa. The ML
+library provides the functions ``rte_ml_io_quantize`` and ``rte_ml_io_dequantize``
+to enable data type conversions. The user needs to provide the addresses of the
+quantized and dequantized data buffers to the functions, along with the number of
+batches in the buffers.
+
+For quantization, the dequantized data is assumed to be of the type ``dtype`` provided by
+the ``rte_ml_model_info::input`` and the data is converted to ``qtype`` provided by the
+``rte_ml_model_info::input``.
+
+For dequantization, the quantized data is assumed to be of the type ``qtype`` provided by
+the ``rte_ml_model_info::output`` and the data is converted to ``dtype`` provided by the
+``rte_ml_model_info::output``.
+
+Size of the buffers required for the input and output can be calculated using the functions
+``rte_ml_io_input_size_get`` and ``rte_ml_io_output_size_get``. These functions return
+the buffer sizes for both quantized and dequantized data for the given number of batches.
+
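+A sketch of input buffer sizing and quantization for a single batch follows;
+the use of rte_malloc() for buffer allocation is an illustrative choice:
+
+.. code-block:: c
+
+   uint64_t qsize, dsize;
+   void *qbuf, *dbuf;
+
+   /* Query quantized and dequantized input buffer sizes for one batch */
+   rte_ml_io_input_size_get(dev_id, model_id, 1, &qsize, &dsize);
+
+   qbuf = rte_malloc(NULL, qsize, 0);
+   dbuf = rte_malloc(NULL, dsize, 0);
+
+   /* ... fill dbuf with higher precision (dequantized) input data ... */
+
+   /* Convert the input to the quantized type expected by the model */
+   rte_ml_io_quantize(dev_id, model_id, 1, dbuf, qbuf);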
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 73f5d94e14..354916d1fc 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -78,6 +78,11 @@ New Features
``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
required for eth_rx, eth_tx, crypto and timer eventdev adapters.
+* **Added machine learning inference device library.**
+
* Added a machine learning inference device framework for management and provisioning of
+ hardware and software machine learning inference devices.
+
Removed Items
-------------
diff --git a/lib/meson.build b/lib/meson.build
index a90fee31b7..ad91819375 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -63,6 +63,7 @@ libraries = [
'flow_classify', # flow_classify lib depends on pkt framework table lib
'graph',
'node',
+ 'mldev',
]
optional_libs = [
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
new file mode 100644
index 0000000000..e378cfca30
--- /dev/null
+++ b/lib/mldev/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2022 Marvell.
+
+sources = files(
+ 'rte_mldev.c',
+)
+
+headers = files(
+ 'rte_mldev.h',
+)
+
+deps += ['mempool']
+
+if get_option('buildtype').contains('debug')
+ cflags += [ '-DRTE_LIBRTE_ML_DEV_DEBUG' ]
+else
+ cflags += [ '-URTE_LIBRTE_ML_DEV_DEBUG' ]
+endif
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
new file mode 100644
index 0000000000..70aad4c44b
--- /dev/null
+++ b/lib/mldev/rte_mldev.c
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <rte_log.h>
+#include <rte_mldev.h>
+
+RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
new file mode 100644
index 0000000000..7b2cc1c270
--- /dev/null
+++ b/lib/mldev/rte_mldev.h
@@ -0,0 +1,1099 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef RTE_MLDEV_H
+#define RTE_MLDEV_H
+
+/**
+ * @file rte_mldev.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * ML (Machine Learning) device API.
+ *
+ * The ML framework is built on the following model:
+ *
+ *
+ * +-----------------+ rte_ml_[en|de]queue_burst()
+ * | | |
+ * | Machine o------+ +--------+ |
+ * | Learning | | | queue | | +------+
+ * | Inference o------+-----o |<===o===>|Core 0|
+ * | Engine | | | pair 0 | +------+
+ * | o----+ | +--------+
+ * | | | |
+ * +-----------------+ | | +--------+
+ * ^ | | | queue | +------+
+ * | | +-----o |<=======>|Core 1|
+ * | | | pair 1 | +------+
+ * | | +--------+
+ * +--------+--------+ |
+ * | +-------------+ | | +--------+
+ * | | Model 0 | | | | queue | +------+
+ * | +-------------+ | +-------o |<=======>|Core N|
+ * | +-------------+ | | pair N | +------+
+ * | | Model 1 | | +--------+
+ * | +-------------+ |
+ * | +-------------+ |<------> rte_ml_model_load()
+ * | | Model .. | |-------> rte_ml_model_info_get()
+ * | +-------------+ |<------- rte_ml_model_start()
+ * | +-------------+ |<------- rte_ml_model_stop()
+ * | | Model N | |<------- rte_ml_model_params_update()
+ * | +-------------+ |<------- rte_ml_model_unload()
+ * +-----------------+
+ *
+ * ML Device: A hardware or software-based implementation of ML device API for
+ * running inferences using a pre-trained ML model.
+ *
+ * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
+ * the procedure/algorithm and the data/patterns required to make predictions on live data.
+ * Once the model is created and trained outside of the DPDK scope, it can be loaded
+ * via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+ * rte_ml_model_params_update() can be used to update model parameters such as weights
+ * and bias without having to unload the model with rte_ml_model_unload().
+ *
+ * ML Inference: ML inference is the process of feeding data to the model via
+ * rte_ml_enqueue_burst() and retrieving the computed outputs/predictions from the
+ * started model via rte_ml_dequeue_burst().
+ *
+ * In all functions of the ML device API, the ML device is designated by an
+ * integer >= 0 named the device identifier *dev_id*.
+ *
+ * The functions exported by the ML device API to setup a device designated by
+ * its device identifier must be invoked in the following order:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_dev_start()
+ *
+ * A model is required to run inference operations with the user specified inputs.
+ * The application needs to invoke the ML model API in the following order before
+ * queueing inference jobs.
+ *
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ *
+ * A model can be loaded on a device only after the device has been configured and can be
+ * started or stopped only after a device has been started.
+ *
+ * The rte_ml_model_info_get() API is provided to retrieve information related to the model.
+ * The information includes the shape and type of input and output required for the inference.
+ *
+ * Data quantization and dequantization is one of the main aspects in ML domain. This involves
+ * conversion of input data from a higher precision to a lower precision data type and vice-versa
+ * for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
+ * dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
+ * and output buffers holding data for multiple batches.
+ *
+ * Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used
+ * to get the size of quantized and de-quantized multi-batch input and output buffers.
+ *
+ * User can optionally update the model parameters with rte_ml_model_params_update() after
+ * invoking rte_ml_model_stop() API on a given model ID.
+ *
+ * The application can invoke, in any order, the functions exported by the ML API to enqueue
+ * inference jobs and dequeue inference responses.
+ *
+ * If the application wants to change the device configuration (i.e., call
+ * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
+ * device using rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
+ * the application must call rte_ml_model_stop() followed by rte_ml_model_params_update() API
+ * for the given model. The application does not need to call rte_ml_dev_stop() API for
+ * any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
+ *
+ * Once the device has been started by invoking the rte_ml_dev_start() API and the model has
+ * been started by invoking the rte_ml_model_start() API, the application can call
+ * rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the designated device and model ID.
+ *
+ * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
+ *
+ * Typical application utilisation of the ML API will follow this
+ * programming flow.
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_model_load()
+ * - rte_ml_dev_start()
+ * - rte_ml_model_start()
+ * - rte_ml_model_info_get()
+ * - rte_ml_enqueue_burst()
+ * - rte_ml_dequeue_burst()
+ * - rte_ml_model_stop()
+ * - rte_ml_model_unload()
+ * - rte_ml_dev_stop()
+ * - rte_ml_dev_close()
+ *
+ * Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on different logical
+ * cores on the same target object. For instance, the dequeue function of a poll mode driver
+ * cannot be invoked in parallel on two logical cores to operate on the same queue pair. Of
+ * course, this function can be invoked in parallel by different logical cores on different
+ * queue pairs. It is the responsibility of the user application to enforce this rule.
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_mempool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Logging Macro */
+extern int rte_ml_dev_logtype;
+
+#define RTE_MLDEV_LOG(level, fmt, args...) \
+ rte_log(RTE_LOG_##level, rte_ml_dev_logtype, "%s(): " fmt "\n", __func__, ##args)
+
+#define RTE_ML_STR_MAX 128
+/**< Maximum length of name string */
+
+/* Device operations */
+
+/**
+ * Get the total number of ML devices that have been successfully initialised.
+ *
+ * @return
+ * - The total number of usable ML devices.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dev_count(void);
+
+/**
+ * Check if the device is in ready state.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 if the device is not in the ready state.
+ * - 1 if the device is in the ready state.
+ */
+__rte_experimental
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id);
+
+/**
+ * Return the NUMA socket to which a device is connected.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - The NUMA socket id to which the device is connected
+ * - 0 If the socket could not be determined.
+ * - -EINVAL: if the dev_id value is not valid.
+ */
+__rte_experimental
+int
+rte_ml_dev_socket_id(int16_t dev_id);
+
+/** ML device information */
+struct rte_ml_dev_info {
+ const char *driver_name;
+ /**< Driver name */
+ uint16_t max_models;
+ /**< Maximum number of models supported by the device.
+ * @see struct rte_ml_dev_config::nb_models
+ */
+ uint16_t max_queue_pairs;
+ /**< Maximum number of queues pairs supported by the device.
+ * @see struct rte_ml_dev_config::nb_queue_pairs
+ */
+ uint16_t max_desc;
+ /**< Maximum allowed number of descriptors for queue pair by the device.
+ * @see struct rte_ml_dev_qp_conf::nb_desc
+ */
+ uint16_t max_segments;
+ /**< Maximum number of scatter-gather entries supported by the device.
+ * @see struct rte_ml_buff_seg struct rte_ml_buff_seg::next
+ */
+ uint16_t min_align_size;
+ /**< Minimum alignment size of IO buffers used by the device. */
+};
+
+/**
+ * Retrieve the information of the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param dev_info
+ * A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
+ *
+ * @return
+ * - 0: Success, driver updates the information of the ML device
+ * - < 0: Error code returned by the driver info get function.
+ */
+__rte_experimental
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);
+
+/** ML device configuration structure */
+struct rte_ml_dev_config {
+ int socket_id;
+ /**< Socket to allocate resources on. */
+ uint16_t nb_models;
+ /**< Number of models to be loaded on the device.
+ * This value cannot exceed the max_models which is previously provided in
+ * struct rte_ml_dev_info::max_models
+ */
+ uint16_t nb_queue_pairs;
+ /**< Number of queue pairs to configure on this device.
+ * This value cannot exceed the max_queue_pairs which is previously provided in
+ * struct rte_ml_dev_info::max_queue_pairs
+ */
+};
+
+/**
+ * Configure an ML device.
+ *
+ * This function must be invoked first before any other function in the API.
+ *
+ * ML Device can be re-configured, when in a stopped state. Device cannot be re-configured after
+ * rte_ml_dev_close() is called.
+ *
+ * The caller may use rte_ml_dev_info_get() to get the capabilities of the resources available for
+ * this ML device.
+ *
+ * @param dev_id
+ * The identifier of the device to configure.
+ * @param config
+ * The ML device configuration structure.
+ *
+ * @return
+ * - 0: Success, device configured.
+ * - < 0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
+
+/* Forward declaration */
+struct rte_ml_op;
+
+/**< Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
+typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);
+
+/** ML device queue pair configuration structure. */
+struct rte_ml_dev_qp_conf {
+ uint32_t nb_desc;
+ /**< Number of descriptors per queue pair.
+ * This value cannot exceed the max_desc which is previously provided in
+ * struct rte_ml_dev_info::max_desc
+ */
+ rte_ml_dev_stop_flush_t cb;
+ /**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
+ * Value NULL is allowed, in which case callback will not be invoked.
+ * This function can be used to properly dispose of outstanding ML ops from all
+ * queue pairs, for example ops containing memory pointers.
+ * @see rte_ml_dev_stop()
+ */
+};
+
+/**
+ * Set up a queue pair for a device. This should only be called when the device is stopped.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param queue_pair_id
+ * The index of the queue pair to set up. The value must be in the range [0, nb_queue_pairs - 1]
+ * previously supplied to rte_ml_dev_configure().
+ * @param qp_conf
+ * The pointer to the configuration data to be used for the queue pair.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in case of NUMA.
+ * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
+ * for the queue pair.
+ *
+ * @return
+ * - 0: Success, queue pair correctly set up.
+ * - < 0: Queue pair configuration failed.
+ */
+__rte_experimental
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);
+
+/**
+ * Start an ML device.
+ *
+ * The device start step consists of setting the configured features and enabling the ML device
+ * to accept inference jobs.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device started.
+ * - <0: Error code of the driver device start function.
+ */
+__rte_experimental
+int
+rte_ml_dev_start(int16_t dev_id);
+
+/**
+ * Stop an ML device. A stopped device cannot accept inference jobs.
+ * The device can be restarted with a call to rte_ml_dev_start().
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device stopped.
+ * - <0: Error code of the driver device stop function.
+ */
+__rte_experimental
+int
+rte_ml_dev_stop(int16_t dev_id);
+
+/**
+ * Close an ML device. The device cannot be restarted!
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 on successfully closing device.
+ * - <0 on failure to close device.
+ */
+__rte_experimental
+int
+rte_ml_dev_close(int16_t dev_id);
+
+/** Status of ML operation */
+enum rte_ml_op_status {
+ RTE_ML_OP_STATUS_SUCCESS = 0,
+ /**< Operation completed successfully */
+ RTE_ML_OP_STATUS_NOT_PROCESSED,
+ /**< Operation has not yet been processed by the device. */
+ RTE_ML_OP_STATUS_ERROR,
+ /**< Operation completed with error.
+ * Application can invoke rte_ml_op_error_get() to get PMD specific
+ * error code if needed.
+ */
+};
+
+/** ML operation's input and output buffer representation as scatter gather list
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
+
+/**
+ * ML Operation.
+ *
+ * This structure contains data related to performing an ML operation on the buffers using
+ * the model specified through model_id.
+ */
+struct rte_ml_op {
+ uint16_t model_id;
+ /**< Model ID to be used for the operation. */
+ uint16_t nb_batches;
+ /**< Number of batches. Minimum value must be one.
+ * The input buffer must hold the inference data for each batch contiguously.
+ */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_mempool *mempool;
+ /**< Pool from which operation is allocated. */
+ struct rte_ml_buff_seg input;
+ /**< Input buffer to hold the inference data. */
+ struct rte_ml_buff_seg output;
+ /**< Output buffer to hold the inference output by the driver. */
+ RTE_STD_C11
+ union {
+ uint64_t user_u64;
+ /**< User data as uint64_t.*/
+ void *user_ptr;
+ /**< User data as void*.*/
+ };
+ enum rte_ml_op_status status;
+ /**< Operation status. */
+ uint64_t impl_opaque;
+ /**< Implementation specific opaque value.
+ * An implementation may use this field to hold
+ * implementation specific value to share between
+ * dequeue and enqueue operation.
+ * The application should not modify this field.
+ */
+} __rte_cache_aligned;
+
+/* Enqueue/Dequeue operations */
+
+/**
+ * Enqueue a burst of ML inferences for processing on an ML device.
+ *
+ * The rte_ml_enqueue_burst() function is invoked to place ML inference
+ * operations on the queue *qp_id* of the device designated by its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of inferences to process which are
+ * supplied in the *ops* array of *rte_ml_op* structures.
+ *
+ * The rte_ml_enqueue_burst() function returns the number of inferences it
+ * actually enqueued for processing. A return value equal to *nb_ops* means that
+ * all packets have been enqueued.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair to which inferences are to be enqueued for processing.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * *rte_ml_dev_configure*.
+ * @param ops
+ * The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
+ * ML inferences to be processed.
+ * @param nb_ops
+ * The number of operations to process.
+ *
+ * @return
+ * The number of inference operations actually enqueued to the ML device.
+ * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
+ * is full or if invalid parameters are specified in a *rte_ml_op*.
+ */
+__rte_experimental
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed ML inferences operations from a queue on the ML device.
+ * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
+ * in the *ops* array.
+ *
+ * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
+ * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops*
+ * operations, and this is likely to signify that other processed operations remain in the
+ * device's output queue. Applications implementing a "retrieve as many processed operations
+ * as possible" policy can check this specific case and keep invoking the
+ * rte_ml_dequeue_burst() function until a value less than *nb_ops* is returned.
+ *
+ * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
+ * the corresponding overhead.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair from which to retrieve processed packets.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * rte_ml_dev_configure().
+ * @param ops
+ * The address of an array of pointers to *rte_ml_op* structures that must be large enough to
+ * store *nb_ops* pointers in it.
+ * @param nb_ops
+ * The maximum number of inferences to dequeue.
+ *
+ * @return
+ * The number of operations actually dequeued, which is the number of pointers
+ * to *rte_ml_op* structures effectively supplied to the *ops* array.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Verbose error structure definition.
+ */
+struct rte_ml_op_error {
+ char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
+ uint64_t errcode; /**< Vendor specific error code. */
+};
+
+/**
+ * Get PMD specific error information for an ML op.
+ *
+ * When an ML operation completes with status RTE_ML_OP_STATUS_ERROR,
+ * this API allows retrieval of PMD specific error details.
+ *
+ * @param[in] dev_id
+ * Device identifier
+ * @param[in] op
+ * Handle of ML operation
+ * @param[in] error
+ * Address of structure rte_ml_op_error to be filled
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
+
+/* Statistics operations */
+
+/** Device statistics. */
+struct rte_ml_dev_stats {
+ uint64_t enqueued_count;
+ /**< Count of all operations enqueued */
+ uint64_t dequeued_count;
+ /**< Count of all operations dequeued */
+ uint64_t enqueue_err_count;
+ /**< Total error count on operations enqueued */
+ uint64_t dequeue_err_count;
+ /**< Total error count on operations dequeued */
+};
+
+/**
+ * Retrieve the general I/O statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stats
+ * Pointer to structure to where statistics will be copied.
+ * On error, this location may or may not have been modified.
+ * @return
+ * - 0 on success
+ * - -EINVAL: If invalid parameter pointer is provided.
+ */
+__rte_experimental
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
+
+/**
+ * Reset the statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ */
+__rte_experimental
+void
+rte_ml_dev_stats_reset(int16_t dev_id);
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers for extended ML device statistics.
+ */
+struct rte_ml_dev_xstats_map {
+ uint16_t id;
+ /**< xstat identifier */
+ char name[RTE_ML_STR_MAX];
+ /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param[out] xstats_map
+ * Block of memory to insert ids and names into. It must have a capacity of at least *size* entries.
+ * If set to NULL, function returns required capacity.
+ * @param size
+ * Capacity of xstats_map (number of name-id maps).
+ *
+ * @return
+ * - Positive value on success:
+ * - The return value is the number of entries filled in the stats map.
+ * - If xstats_map set to NULL then required capacity for xstats_map.
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map,
+ uint32_t size);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param name
+ * The stat name to retrieve.
+ * @param stat_id
+ * If non-NULL, the numerical id of the stat will be returned, so that further requests for
+ * the stat can be made using rte_ml_dev_xstats_get, which will be faster as it doesn't need to
+ * scan a list of names for the stat.
+ * @param[out] value
+ * Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ * - 0: Successfully retrieved xstat value.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
+
+/**
+ * Retrieve extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * The id numbers of the stats to get. The ids can be fetched from the stat position in the
+ * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
+ * @param values
+ * The values for each stat requested by ID.
+ * @param nb_ids
+ * The number of stats requested.
+ * @return
+ * - Positive value: number of stat entries filled into the values array
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * Selects specific statistics to be reset. When NULL, all statistics will be reset.
+ * If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ * The number of ids available from the *ids* array. Ignored when ids is NULL.
+ * @return
+ * - 0: Successfully reset the statistics to zero.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
+
+/* Utility operations */
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *fd*.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param fd
+ * A pointer to a file for output.
+ * @return
+ * - 0: on success.
+ * - <0: on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd);
+
+/**
+ * Trigger the ML device self test.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @return
+ * - 0: Selftest successful.
+ * - -ENOTSUP: if the device doesn't support selftest.
+ * - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_selftest(int16_t dev_id);
+
+/* Model operations */
+
+/** ML model load parameters
+ *
+ * Parameters required to load an ML model.
+ */
+struct rte_ml_model_params {
+ void *addr;
+ /**< Address of model buffer */
+ size_t size;
+ /**< Size of model buffer */
+};
+
+/**
+ * Load an ML model to the device.
+ *
+ * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] params
+ * Parameters for the model to be loaded.
+ * @param[out] model_id
+ * Identifier of the model loaded.
+ *
+ * @return
+ * - 0: Success, Model loaded.
+ * - < 0: Failure, Error code of the model load driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id);
+
+/**
+ * Unload an ML model from the device.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be unloaded.
+ *
+ * @return
+ * - 0: Success, Model unloaded.
+ * - < 0: Failure, Error code of the model unload driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_unload(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Start an ML model for the given device ID.
+ *
+ * Start an ML model to accept inference requests.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be started.
+ *
+ * @return
+ * - 0: Success, Model started.
+ * - < 0: Failure, Error code of the model start driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_start(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Stop an ML model for the given device ID.
+ *
+ * Model stop disables the ML model from being used for inference jobs.
+ * All inference jobs must have been completed before model stop is attempted.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be stopped.
+ *
+ * @return
+ * - 0: Success, Model stopped.
+ * - < 0: Failure, Error code of the model stop driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_stop(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Input and output data types. ML models can operate on reduced precision
+ * datatypes to achieve better power efficiency, lower network latency and lower memory footprint.
+ * This enum is used to represent the lower precision integer and floating point types used
+ * by ML models.
+ */
+enum rte_ml_io_type {
+ RTE_ML_IO_TYPE_UNKNOWN = 0,
+ /**< Invalid or unknown type */
+ RTE_ML_IO_TYPE_INT8,
+ /**< 8-bit integer */
+ RTE_ML_IO_TYPE_UINT8,
+ /**< 8-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT16,
+ /**< 16-bit integer */
+ RTE_ML_IO_TYPE_UINT16,
+ /**< 16-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT32,
+ /**< 32-bit integer */
+ RTE_ML_IO_TYPE_UINT32,
+ /**< 32-bit unsigned integer */
+ RTE_ML_IO_TYPE_FP8,
+ /**< 8-bit floating point number */
+ RTE_ML_IO_TYPE_FP16,
+ /**< IEEE 754 16-bit floating point number */
+ RTE_ML_IO_TYPE_FP32,
+ /**< IEEE 754 32-bit floating point number */
+ RTE_ML_IO_TYPE_BFLOAT16
+ /**< 16-bit brain floating point number. */
+};
+
+/**
+ * Input and output format. This is used to represent the encoding type of
+ * multi-dimensional data used by ML models.
+ */
+enum rte_ml_io_format {
+ RTE_ML_IO_FORMAT_NCHW = 1,
+ /**< Batch size (N) x channels (C) x height (H) x width (W) */
+ RTE_ML_IO_FORMAT_NHWC,
+ /**< Batch size (N) x height (H) x width (W) x channels (C) */
+ RTE_ML_IO_FORMAT_CHWN,
+ /**< Channels (C) x height (H) x width (W) x batch size (N) */
+ RTE_ML_IO_FORMAT_3D,
+ /**< Format to represent a 3 dimensional data */
+ RTE_ML_IO_FORMAT_2D,
+ /**< Format to represent matrix data */
+ RTE_ML_IO_FORMAT_1D,
+ /**< Format to represent vector data */
+ RTE_ML_IO_FORMAT_SCALAR,
+ /**< Format to represent scalar data */
+};
+
+/**
+ * Input and output shape. This structure represents the encoding format and dimensions
+ * of the tensor or vector.
+ *
+ * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions
+ * used for the data depends on the format. Unused dimensions are to be set to 1.
+ */
+struct rte_ml_io_shape {
+ enum rte_ml_io_format format;
+ /**< Format of the data */
+ uint32_t w;
+ /**< First dimension */
+ uint32_t x;
+ /**< Second dimension */
+ uint32_t y;
+ /**< Third dimension */
+ uint32_t z;
+ /**< Fourth dimension */
+};
+
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
+
+/** Model information structure */
+struct rte_ml_model_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Model name. */
+ char version[RTE_ML_STR_MAX];
+ /**< Model version */
+ uint16_t model_id;
+ /**< Model ID */
+ uint16_t device_id;
+ /**< Device ID */
+ uint16_t batch_size;
+ /**< Maximum number of batches that the model can process simultaneously */
+ uint32_t nb_inputs;
+ /**< Number of inputs */
+ const struct rte_ml_io_info *input_info;
+ /**< Input info array. Array size is equal to nb_inputs */
+ uint32_t nb_outputs;
+ /**< Number of outputs */
+ const struct rte_ml_io_info *output_info;
+ /**< Output info array. Array size is equal to nb_outputs */
+ uint64_t wb_size;
+ /**< Size of model weights and bias */
+};
+
+/**
+ * Get ML model information.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[out] model_info
+ * Pointer to a model info structure
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info);
+
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer);
+
+/* IO operations */
+
+/**
+ * Get size of quantized and dequantized input buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized input data.
+ * This API would return the buffer sizes for the number of batches provided and would
+ * consider the alignment requirements as per the PMD. Input sizes computed by this API can
+ * be used by the application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] input_qsize
+ * Quantized input size pointer.
+ * NULL value is allowed, in which case input_qsize is not calculated by the driver.
+ * @param[out] input_dsize
+ * Dequantized input size pointer.
+ * NULL value is allowed, in which case input_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize);
+
+/**
+ * Get size of quantized and dequantized output buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized output data.
+ * This API would return the buffer sizes for the number of batches provided and would consider
+ * the alignment requirements as per the PMD. Output sizes computed by this API can be used by the
+ * application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] output_qsize
+ * Quantized output size pointer.
+ * NULL value is allowed, in which case output_qsize is not calculated by the driver.
+ * @param[out] output_dsize
+ * Dequantized output size pointer.
+ * NULL value is allowed, in which case output_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize);
+
+/**
+ * Quantize input data.
+ *
+ * Quantization converts data from higher precision types to lower precision types to improve
+ * the throughput and efficiency of the model execution with minimal loss of accuracy.
+ * Types of dequantized data and quantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized input buffer
+ * @param[in] dbuffer
+ * Address of dequantized input data
+ * @param[in] qbuffer
+ * Address of quantized input data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer);
+
+/**
+ * Dequantize output data.
+ *
+ * Dequantization converts data from a lower precision type to a higher precision type.
+ * The types of the quantized and dequantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized output buffer
+ * @param[in] qbuffer
+ * Address of quantized output data
+ * @param[in] dbuffer
+ * Address of dequantized output data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer);
+
+/* ML op pool operations */
+
+/**
+ * Create an ML operation pool
+ *
+ * @param name
+ * ML operations pool name
+ * @param nb_elts
+ * Number of elements in pool
+ * @param cache_size
+ * Number of elements to cache on lcore, see
+ * *rte_mempool_create* for further details about cache size
+ * @param user_size
+ * Size of private user data to allocate with each operation
+ * @param socket_id
+ * Socket identifier to allocate memory on
+ * @return
+ * - On success pointer to mempool
+ * - On failure NULL
+ */
+__rte_experimental
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id);
+
+/**
+ * Free an ML operation pool
+ *
+ * @param mempool
+ * A pointer to the mempool structure.
+ * If NULL, the function does nothing.
+ */
+__rte_experimental
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MLDEV_H */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
new file mode 100644
index 0000000000..3793380442
--- /dev/null
+++ b/lib/mldev/version.map
@@ -0,0 +1,7 @@
+EXPERIMENTAL {
+ global:
+
+ rte_ml_dev_logtype;
+
+ local: *;
+};
--
2.39.1
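Before moving to the v2 series below, a minimal application-side sketch of how
the I/O APIs declared above are meant to compose. It assumes dev_id/model_id
refer to a started device and model; the rte_malloc-based allocation and the
single-batch count are illustrative only, not part of the proposal:

#include <rte_malloc.h>
#include <rte_mldev.h>

static int
prepare_quantized_input(int16_t dev_id, uint16_t model_id,
                        void **dbuf, void **qbuf)
{
        uint64_t qsize, dsize;

        /* Buffer sizes for one batch; either output pointer may be NULL. */
        if (rte_ml_io_input_size_get(dev_id, model_id, 1, &qsize, &dsize) != 0)
                return -1;

        *dbuf = rte_malloc(NULL, dsize, 0); /* dequantized (e.g. float32) */
        *qbuf = rte_malloc(NULL, qsize, 0); /* quantized, device format */
        if (*dbuf == NULL || *qbuf == NULL)
                return -1;

        /* ... application fills *dbuf with one batch of input ... */

        /* Convert to the model's quantized input type. */
        return rte_ml_io_quantize(dev_id, model_id, 1, *dbuf, *qbuf);
}

After an inference completes, rte_ml_io_output_size_get() and
rte_ml_io_dequantize() are used the same way in the opposite direction.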
* [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 01/12] " jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 21:04 ` Stephen Hemminger
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 03/12] mldev: support ML device handling functions jerinj
` (10 subsequent siblings)
12 siblings, 1 reply; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi, Anatoly Burakov
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added PMD functions to handle ML devices. The rte_mldev_pmd.*
files are for drivers only, are private to DPDK, and are not
installed for application use.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/meson.build | 9 +++
lib/mldev/rte_mldev.c | 129 ++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 111 +++++++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.c | 62 +++++++++++++++
lib/mldev/rte_mldev_pmd.h | 149 +++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 11 +++
6 files changed, 471 insertions(+)
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
index e378cfca30..5c99532c1a 100644
--- a/lib/mldev/meson.build
+++ b/lib/mldev/meson.build
@@ -2,6 +2,7 @@
# Copyright (c) 2022 Marvell.
sources = files(
+ 'rte_mldev_pmd.c',
'rte_mldev.c',
)
@@ -9,6 +10,14 @@ headers = files(
'rte_mldev.h',
)
+indirect_headers += files(
+ 'rte_mldev_core.h',
+)
+
+driver_sdk_headers += files(
+ 'rte_mldev_pmd.h',
+)
+
deps += ['mempool']
if get_option('buildtype').contains('debug')
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 70aad4c44b..06396de680 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -4,5 +4,134 @@
#include <rte_log.h>
#include <rte_mldev.h>
+#include <rte_mldev_pmd.h>
+
+static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
+
+static struct rte_ml_dev_global ml_dev_globals = {
+ .devs = ml_devices, .data = {NULL}, .nb_devs = 0, .max_devs = RTE_MLDEV_MAX_DEVS};
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id)
+{
+ return &ml_dev_globals.devs[dev_id];
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name)
+{
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ if (name == NULL)
+ return NULL;
+
+ for (dev_id = 0; dev_id < RTE_MLDEV_MAX_DEVS; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if ((dev->attached == ML_DEV_ATTACHED) && (strcmp(dev->data->name, name) == 0))
+ return dev;
+ }
+
+ return NULL;
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ if (rte_ml_dev_pmd_get_named_dev(name) != NULL) {
+ RTE_MLDEV_LOG(ERR, "ML device with name %s already allocated!", name);
+ return NULL;
+ }
+
+ /* Get a free device ID */
+ for (dev_id = 0; dev_id < RTE_MLDEV_MAX_DEVS; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached == ML_DEV_DETACHED)
+ break;
+ }
+
+ if (dev_id == RTE_MLDEV_MAX_DEVS) {
+ RTE_MLDEV_LOG(ERR, "Reached maximum number of ML devices");
+ return NULL;
+ }
+
+ if (dev->data == NULL) {
+ /* Reserve memzone name */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ml_dev_data), socket_id,
+ 0);
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: reserved memzone for %s (%p)", mz_name, mz);
+ } else {
+ mz = rte_memzone_lookup(mz_name);
+ RTE_MLDEV_LOG(DEBUG, "SECONDARY: looked up memzone for %s (%p)", mz_name,
+ mz);
+ }
+
+ if (mz == NULL)
+ return NULL;
+
+ ml_dev_globals.data[dev_id] = mz->addr;
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ memset(ml_dev_globals.data[dev_id], 0, sizeof(struct rte_ml_dev_data));
+
+ dev->data = ml_dev_globals.data[dev_id];
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ strlcpy(dev->data->name, name, RTE_ML_STR_MAX);
+ dev->data->dev_id = dev_id;
+ dev->data->socket_id = socket_id;
+ dev->data->dev_started = 0;
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: init mldev data");
+ }
+
+ RTE_MLDEV_LOG(DEBUG, "Data for %s: dev_id %d, socket %u", dev->data->name,
+ dev->data->dev_id, dev->data->socket_id);
+
+ dev->attached = ML_DEV_ATTACHED;
+ ml_dev_globals.nb_devs++;
+ }
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ int16_t dev_id;
+ int ret = 0;
+
+ if (dev == NULL)
+ return -EINVAL;
+
+ dev_id = dev->data->dev_id;
+
+ /* Memzone lookup */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ mz = rte_memzone_lookup(mz_name);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ RTE_ASSERT(ml_dev_globals.data[dev_id] == mz->addr);
+ ml_dev_globals.data[dev_id] = NULL;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: free memzone of %s (%p)", mz_name, mz);
+ ret = rte_memzone_free(mz);
+ } else {
+ RTE_MLDEV_LOG(DEBUG, "SECONDARY: don't free memzone of %s (%p)", mz_name, mz);
+ }
+
+ dev->attached = ML_DEV_DETACHED;
+ ml_dev_globals.nb_devs--;
+
+ return ret;
+}
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
new file mode 100644
index 0000000000..1c989a5ecf
--- /dev/null
+++ b/lib/mldev/rte_mldev_core.h
@@ -0,0 +1,111 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_INTERNAL_H_
+#define _RTE_MLDEV_INTERNAL_H_
+
+/**
+ * @file
+ *
+ * MLDEV internal header
+ *
+ * This file contains MLDEV private data structures and macros.
+ *
+ * @note
+ * These APIs are for MLDEV PMDs and library only.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <dev_driver.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_mldev.h>
+
+/* Device state */
+#define ML_DEV_DETACHED (0)
+#define ML_DEV_ATTACHED (1)
+
+/**
+ * @internal
+ *
+ * The data part, with no function pointers, associated with each device. This structure is safe to
+ * place in shared memory to be common among different processes in a multi-process configuration.
+ */
+struct rte_ml_dev_data {
+ /** Unique identifier name. */
+ char name[RTE_ML_STR_MAX];
+
+ /** Device ID for this instance. */
+ int16_t dev_id;
+
+ /** Socket ID where memory is allocated. */
+ int16_t socket_id;
+
+ /** Device state: STOPPED(0) / STARTED(1) */
+ __extension__ uint8_t dev_started : 1;
+
+ /** Number of device queue pairs. */
+ uint16_t nb_queue_pairs;
+
+ /** Number of ML models. */
+ uint16_t nb_models;
+
+ /** Array of pointers to queue pairs. */
+ void **queue_pairs;
+
+ /** Array of pointers to ML models. */
+ void **models;
+
+ /** PMD-specific private data. */
+ void *dev_private;
+
+ /** Reserved for future fields */
+ uint64_t reserved[3];
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ *
+ * The data structure associated with each ML device.
+ */
+struct rte_ml_dev {
+ /** Pointer to device data. */
+ struct rte_ml_dev_data *data;
+
+ /** Backing RTE device. */
+ struct rte_device *device;
+
+ /** Flag indicating the device is attached. */
+ __extension__ uint8_t attached : 1;
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ *
+ * Global structure used for maintaining state of allocated ML devices.
+ */
+struct rte_ml_dev_global {
+ /** Device information array. */
+ struct rte_ml_dev *devs;
+
+ /** Device private data array. */
+ struct rte_ml_dev_data *data[RTE_MLDEV_MAX_DEVS];
+
+ /** Number of devices found. */
+ uint8_t nb_devs;
+
+ /** Maximum number of devices. */
+ uint8_t max_devs;
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_INTERNAL_H_ */
diff --git a/lib/mldev/rte_mldev_pmd.c b/lib/mldev/rte_mldev_pmd.c
new file mode 100644
index 0000000000..3169e5d4fa
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.c
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <dev_driver.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_mldev_pmd.h"
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params)
+{
+ struct rte_ml_dev *dev;
+
+ RTE_MLDEV_LOG(INFO, "ML device initialisation - name: %s, socket_id: %u", name,
+ params->socket_id);
+
+ /* Allocate device structure */
+ dev = rte_ml_dev_pmd_allocate(name, params->socket_id);
+ if (dev == NULL) {
+ RTE_MLDEV_LOG(ERR, "Failed to allocate ML device for %s", name);
+ return NULL;
+ }
+
+ /* Allocate private device structure */
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ dev->data->dev_private =
+ rte_zmalloc_socket("ml_dev_private", params->private_data_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+
+ if (dev->data->dev_private == NULL) {
+ RTE_MLDEV_LOG(ERR, "Cannot allocate memory for mldev %s private data",
+ name);
+ rte_ml_dev_pmd_release(dev);
+ return NULL;
+ }
+ }
+ dev->device = device;
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *dev)
+{
+ int ret;
+
+ RTE_MLDEV_LOG(INFO, "Releasing ML device - name: %s", dev->device->name);
+ ret = rte_ml_dev_pmd_release(dev);
+ if (ret)
+ return ret;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ rte_free(dev->data->dev_private);
+
+ dev->data = NULL;
+ dev->device = NULL;
+
+ return 0;
+}
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
new file mode 100644
index 0000000000..33544f1b80
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -0,0 +1,149 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_PMD_H_
+#define _RTE_MLDEV_PMD_H_
+
+/**
+ * @file
+ *
+ * RTE MLDEV PMD APIs
+ *
+ * ML Device PMD interface
+ *
+ * @note
+ * These APIs are for MLDEV PMDs only and user applications should not call them directly.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_mldev.h>
+#include <rte_mldev_core.h>
+
+/**
+ * @internal
+ *
+ * Initialisation parameters for ML devices.
+ */
+struct rte_ml_dev_pmd_init_params {
+ /** Socket to use for memory allocation. */
+ uint8_t socket_id;
+
+ /** Size of device private data. */
+ uint64_t private_data_size;
+};
+
+/**
+ * @internal
+ *
+ * Get the ML device pointer for the device. Assumes a valid device index.
+ *
+ * @param dev_id
+ * Device ID value to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the given device ID.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id);
+
+/**
+ * @internal
+ *
+ * Get the rte_ml_dev structure device pointer for the named device.
+ *
+ * @param name
+ * Device name to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the device with the given name, or NULL if not found.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name);
+
+/**
+ * @internal
+ *
+ * Allocates a new mldev slot for an ML device and returns the pointer to that slot for use.
+ * Function for internal use by dummy drivers.
+ *
+ * @param name
+ * Unique identifier name for each device.
+ * @param socket_id
+ * Socket on which to allocate resources.
+ *
+ * @return
+ * Slot in the rte_ml_dev_devices array for a new device.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id);
+
+/**
+ * @internal
+ *
+ * Release the specified mldev device.
+ *
+ * @param dev
+ * ML device.
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boilerplate code for an ML driver to create and allocate
+ * resources for a new ML PMD device instance.
+ *
+ * @param name
+ * ML device name.
+ * @param device
+ * Base device handle.
+ * @param params
+ * PMD initialisation parameters.
+ *
+ * @return
+ * - ML device instance on success.
+ * - NULL on failure.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boilerplate code for an ML driver to destroy and free
+ * resources associated with an ML PMD device instance.
+ *
+ * @param mldev
+ * ML device instance.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *mldev);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_PMD_H_ */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 3793380442..98a5e7d117 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -5,3 +5,14 @@ EXPERIMENTAL {
local: *;
};
+
+INTERNAL {
+ global:
+
+ rte_ml_dev_pmd_allocate;
+ rte_ml_dev_pmd_create;
+ rte_ml_dev_pmd_destroy;
+ rte_ml_dev_pmd_get_dev;
+ rte_ml_dev_pmd_get_named_dev;
+ rte_ml_dev_pmd_release;
+};
--
2.39.1
* [dpdk-dev] [PATCH v2 03/12] mldev: support ML device handling functions
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 01/12] " jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 04/12] mldev: support ML device queue-pair setup jerinj
` (9 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added ML device handling APIs. These APIs are used to get device
information, and to configure, start, stop and close ML devices.
Added function prototypes to the PMD layer, which are used by the
ML poll mode driver implementations.
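For illustration, a hedged sketch of the resulting application-side bring-up;
the rte_ml_dev_config and rte_ml_dev_info field names (socket_id, nb_models,
max_models, nb_queue_pairs, max_queue_pairs) are assumed from patch 01 of this
series, and requesting the device maxima is a policy choice for the example,
not a requirement:

#include <string.h>

#include <rte_mldev.h>

static int
ml_dev_bringup(int16_t dev_id)
{
        struct rte_ml_dev_info info;
        struct rte_ml_dev_config conf;
        int ret;

        ret = rte_ml_dev_info_get(dev_id, &info);
        if (ret != 0)
                return ret;

        memset(&conf, 0, sizeof(conf));
        conf.socket_id = rte_ml_dev_socket_id(dev_id);
        conf.nb_models = info.max_models;
        conf.nb_queue_pairs = info.max_queue_pairs;

        ret = rte_ml_dev_configure(dev_id, &conf);
        if (ret != 0)
                return ret;

        /* Queue pairs and models are set up here (later patches). */
        return rte_ml_dev_start(dev_id);
}

Teardown reverses the order: rte_ml_dev_stop() followed by rte_ml_dev_close();
as the checks in rte_ml_dev_close() show, close fails with -EBUSY while the
device is still started.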
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 175 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 107 +++++++++++++++++++++++
lib/mldev/version.map | 8 ++
3 files changed, 290 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 06396de680..ddbd371bc9 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -134,4 +134,179 @@ rte_ml_dev_pmd_release(struct rte_ml_dev *dev)
return ret;
}
+uint16_t
+rte_ml_dev_count(void)
+{
+ return ml_dev_globals.nb_devs;
+}
+
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id)
+{
+ struct rte_ml_dev *dev = NULL;
+
+ if (dev_id >= ml_dev_globals.max_devs || ml_devices[dev_id].data == NULL)
+ return 0;
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached != ML_DEV_ATTACHED)
+ return 0;
+ else
+ return 1;
+}
+
+int
+rte_ml_dev_socket_id(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+
+ return dev->data->socket_id;
+}
+
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_info_get == NULL)
+ return -ENOTSUP;
+
+ if (dev_info == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dev_info cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(dev_info, 0, sizeof(struct rte_ml_dev_info));
+
+ return (*dev->dev_ops->dev_info_get)(dev, dev_info);
+}
+
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config)
+{
+ struct rte_ml_dev_info dev_info;
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_configure == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ if (config == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, config cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ ret = rte_ml_dev_info_get(dev_id, &dev_info);
+ if (ret < 0)
+ return ret;
+
+ if (config->nb_queue_pairs > dev_info.max_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Device %d num of queues %u > %u\n", dev_id,
+ config->nb_queue_pairs, dev_info.max_queue_pairs);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_configure)(dev, config);
+}
+
+int
+rte_ml_dev_close(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_close == NULL)
+ return -ENOTSUP;
+
+ /* Device must be stopped before it can be closed */
+ if (dev->data->dev_started == 1) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped before closing", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_close)(dev);
+}
+
+int
+rte_ml_dev_start(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_start == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started != 0) {
+ RTE_MLDEV_LOG(ERR, "Device %d is already started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_start)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 1;
+
+ return ret;
+}
+
+int
+rte_ml_dev_stop(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stop == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started == 0) {
+ RTE_MLDEV_LOG(ERR, "Device %d is not started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_stop)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 0;
+
+ return ret;
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 1c989a5ecf..f4ed5badfb 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -31,6 +31,110 @@ extern "C" {
#define ML_DEV_DETACHED (0)
#define ML_DEV_ATTACHED (1)
+struct rte_ml_dev;
+
+/**
+ * Definitions of all functions exported by a driver through the generic structure of type
+ * *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
+ */
+
+/**
+ * @internal
+ *
+ * Function used to get device information.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param dev_info
+ * Pointer to info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_info_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info);
+
+/**
+ * @internal
+ *
+ * Function used to configure device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param config
+ * ML device configurations.
+ *
+ * @return
+ * - 0 on success
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_configure_t)(struct rte_ml_dev *dev, const struct rte_ml_dev_config *config);
+
+/**
+ * @internal
+ *
+ * Function used to close a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN if can't close as device is busy.
+ * - < 0, error code on failure, other than busy.
+ */
+typedef int (*mldev_close_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to start a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to stop a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * ML device operations function pointer table.
+ */
+struct rte_ml_dev_ops {
+ /** Get device information. */
+ mldev_info_get_t dev_info_get;
+
+ /** Configure device. */
+ mldev_configure_t dev_configure;
+
+ /** Close device. */
+ mldev_close_t dev_close;
+
+ /** Start device. */
+ mldev_start_t dev_start;
+
+ /** Stop device. */
+ mldev_stop_t dev_stop;
+};
+
/**
* @internal
*
@@ -78,6 +182,9 @@ struct rte_ml_dev {
/** Pointer to device data. */
struct rte_ml_dev_data *data;
+ /** Functions exported by PMD. */
+ struct rte_ml_dev_ops *dev_ops;
+
/** Backing RTE device. */
struct rte_device *device;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 98a5e7d117..a2b3163c97 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,7 +1,15 @@
EXPERIMENTAL {
global:
+ rte_ml_dev_close;
+ rte_ml_dev_configure;
+ rte_ml_dev_count;
+ rte_ml_dev_info_get;
+ rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
+ rte_ml_dev_socket_id;
+ rte_ml_dev_start;
+ rte_ml_dev_stop;
local: *;
};
--
2.39.1
* [dpdk-dev] [PATCH v2 04/12] mldev: support ML device queue-pair setup
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (2 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 03/12] mldev: support ML device handling functions jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 05/12] mldev: support handling ML models jerinj
` (8 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added APIs to create a queue pair attached to an ML device.
Queue pairs are created with a user-specified ID. Added
function prototypes to be used by ML drivers for queue
pair creation and destruction.
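A hedged sketch of the application-side call, to be made after
rte_ml_dev_configure() and before rte_ml_dev_start() (the setup path below
rejects a started device with -EBUSY); the nb_desc field of
rte_ml_dev_qp_conf is assumed from patch 01 and the depth of 128 is arbitrary:

#include <rte_mldev.h>

static int
setup_all_queue_pairs(int16_t dev_id, uint16_t nb_queue_pairs)
{
        struct rte_ml_dev_qp_conf qp_conf = {
                .nb_desc = 128, /* ring depth; field assumed from patch 01 */
        };
        uint16_t qp_id;
        int ret;

        /* One queue pair per configured qp_id in [0, nb_queue_pairs). */
        for (qp_id = 0; qp_id < nb_queue_pairs; qp_id++) {
                ret = rte_ml_dev_queue_pair_setup(dev_id, qp_id, &qp_conf,
                                                  rte_ml_dev_socket_id(dev_id));
                if (ret != 0)
                        return ret;
        }

        return 0;
}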
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 33 ++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 44 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index ddbd371bc9..4452c21dd6 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -309,4 +309,37 @@ rte_ml_dev_stop(int16_t dev_id)
return ret;
}
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_queue_pair_setup == NULL)
+ return -ENOTSUP;
+
+ if (queue_pair_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid queue_pair_id = %d", queue_pair_id);
+ return -EINVAL;
+ }
+
+ if (qp_conf == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qp_conf cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dev->data->dev_started) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index f4ed5badfb..b7a692fc7a 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -113,6 +113,44 @@ typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
*/
typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Setup a queue pair for a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ * @param queue_pair_conf
+ * Queue pair configuration structure.
+ * @param socket_id
+ * Socket index.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *queue_pair_conf,
+ int socket_id);
+
+/**
+ * @internal
+ *
+ * Release memory resources allocated by given queue pair.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN, if can't close as device is busy.
+ */
+typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+
/**
* @internal
*
@@ -133,6 +171,12 @@ struct rte_ml_dev_ops {
/** Stop device. */
mldev_stop_t dev_stop;
+
+ /** Set up a device queue pair. */
+ mldev_queue_pair_setup_t dev_queue_pair_setup;
+
+ /** Release a device queue pair. */
+ mldev_queue_pair_release_t dev_queue_pair_release;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index a2b3163c97..8de1e8bec7 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -7,6 +7,7 @@ EXPERIMENTAL {
rte_ml_dev_info_get;
rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
+ rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
--
2.39.1
* [dpdk-dev] [PATCH v2 05/12] mldev: support handling ML models
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (3 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 04/12] mldev: support ML device queue-pair setup jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 06/12] mldev: support input and output data handling jerinj
` (7 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added RTE functions to handle ML models. These APIs can
load, unload, start, and stop an ML model. Additional APIs
are added to update model parameters and to get model
information.
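A hedged sketch of the model lifecycle these APIs define; params is assumed
to point at a model binary prepared per the rte_ml_model_params definition in
patch 01:

#include <rte_mldev.h>

static int
run_model(int16_t dev_id, struct rte_ml_model_params *params)
{
        uint16_t model_id;
        int ret;

        ret = rte_ml_model_load(dev_id, params, &model_id);
        if (ret != 0)
                return ret;

        ret = rte_ml_model_start(dev_id, model_id);
        if (ret != 0)
                goto unload;

        /* ... enqueue/dequeue inference requests against model_id ... */

        ret = rte_ml_model_stop(dev_id, model_id);

unload:
        rte_ml_model_unload(dev_id, model_id);
        return ret;
}

Note that rte_ml_model_params_update() slots into this sequence only between
stop and the next start, as its documentation in the earlier RFC header
requires.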
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 123 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 122 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 6 ++
3 files changed, 251 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 4452c21dd6..3b8c073615 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -342,4 +342,127 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_load == NULL)
+ return -ENOTSUP;
+
+ if (params == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, params cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (model_id == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, model_id cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_load)(dev, params, model_id);
+}
+
+int
+rte_ml_model_unload(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_unload == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_unload)(dev, model_id);
+}
+
+int
+rte_ml_model_start(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_start == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_start)(dev, model_id);
+}
+
+int
+rte_ml_model_stop(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_stop == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_stop)(dev, model_id);
+}
+
+int
+rte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_info_get == NULL)
+ return -ENOTSUP;
+
+ if (model_info == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, model_id %u, model_info cannot be NULL\n", dev_id,
+ model_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_info_get)(dev, model_id, model_info);
+}
+
+int
+rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_params_update == NULL)
+ return -ENOTSUP;
+
+ if (buffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, buffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index b7a692fc7a..4f1f32b583 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -151,6 +151,110 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to load an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param params
+ * Model load params.
+ * @param model_id
+ * Model ID returned by the library.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_load_t)(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
+ uint16_t *model_id);
+
+/**
+ * @internal
+ *
+ * Function used to unload an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_unload_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to start an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_start_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to stop an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_stop_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Get info about a model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param model_info
+ * Pointer to model info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ struct rte_ml_model_info *model_info);
+
+/**
+ * @internal
+ *
+ * Update model params.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param buffer
+ * Pointer to model params.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+
/**
* @internal
*
@@ -177,6 +281,24 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+
+ /** Load an ML model. */
+ mldev_model_load_t model_load;
+
+ /** Unload an ML model. */
+ mldev_model_unload_t model_unload;
+
+ /** Start an ML model. */
+ mldev_model_start_t model_start;
+
+ /** Stop an ML model. */
+ mldev_model_stop_t model_stop;
+
+ /** Get model information. */
+ mldev_model_info_get_t model_info_get;
+
+ /** Update model params. */
+ mldev_model_params_update_t model_params_update;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 8de1e8bec7..640671efff 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -11,6 +11,12 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_model_info_get;
+ rte_ml_model_load;
+ rte_ml_model_params_update;
+ rte_ml_model_start;
+ rte_ml_model_stop;
+ rte_ml_model_unload;
local: *;
};
--
2.39.1
* [dpdk-dev] [PATCH v2 06/12] mldev: support input and output data handling
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (4 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 05/12] mldev: support handling ML models jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 07/12] mldev: support ML op pool and ops jerinj
` (6 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added library functions to handle model input and
output data. The APIs can be used to get the size of I/O
buffers, quantize input data and dequantize output data.
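On the driver side, these library calls land in the new dev_ops entries. A
hedged sketch of a handler matching mldev_io_quantize_t follows; the element
count and scale would come from model metadata in a real PMD, and the
placeholder values here are purely illustrative:

#include <stdint.h>

#include <rte_common.h>
#include <rte_mldev_core.h>

static int
dummy_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id,
                     uint16_t nb_batches, void *dbuffer, void *qbuffer)
{
        const float *in = dbuffer;
        int8_t *out = qbuffer;
        uint64_t i, nb_elems;
        float scale;

        RTE_SET_USED(dev);
        RTE_SET_USED(model_id);

        /* Placeholders; a real driver reads these from the model. */
        nb_elems = (uint64_t)nb_batches * 16;
        scale = 127.0f;

        /* Affine float32 -> int8 quantization with a per-model scale. */
        for (i = 0; i < nb_elems; i++)
                out[i] = (int8_t)(in[i] * scale);

        return 0;
}

The handler would then be advertised as .io_quantize in the PMD's
rte_ml_dev_ops table.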
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 94 ++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 106 +++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 204 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 3b8c073615..62f0e95c85 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -465,4 +465,98 @@ rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer)
return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
}
+int
+rte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_input_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_input_size_get)(dev, model_id, nb_batches, input_qsize,
+ input_dsize);
+}
+
+int
+rte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_output_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_output_size_get)(dev, model_id, nb_batches, output_qsize,
+ output_dsize);
+}
+
+int
+rte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_quantize == NULL)
+ return -ENOTSUP;
+
+ if (dbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (qbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_quantize)(dev, model_id, nb_batches, dbuffer, qbuffer);
+}
+
+int
+rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_dequantize == NULL)
+ return -ENOTSUP;
+
+ if (qbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 4f1f32b583..1bbc9fda0e 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -255,6 +255,100 @@ typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
*/
typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+/**
+ * @internal
+ *
+ * Get size of input buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param input_qsize
+ * Size of quantized input.
+ * @param input_dsize
+ * Size of dequantized input.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_input_size_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ uint32_t nb_batches, uint64_t *input_qsize,
+ uint64_t *input_dsize);
+
+/**
+ * @internal
+ *
+ * Get size of output buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param output_qsize
+ * Size of quantized output.
+ * @param output_dsize
+ * Size of dequantized output.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_output_size_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ uint32_t nb_batches, uint64_t *output_qsize,
+ uint64_t *output_dsize);
+
+/**
+ * @internal
+ *
+ * Quantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param dbuffer
+ * Pointer to dequantized data buffer.
+ * @param qbuffer
+ * Pointer to quantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_quantize_t)(struct rte_ml_dev *dev, uint16_t model_id, uint16_t nb_batches,
+ void *dbuffer, void *qbuffer);
+
+/**
+ * @internal
+ *
+ * Dequantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param qbuffer
+ * Pointer to quantized data buffer.
+ * @param dbuffer
+ * Pointer to dequantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_dequantize_t)(struct rte_ml_dev *dev, uint16_t model_id, uint16_t nb_batches,
+ void *qbuffer, void *dbuffer);
+
/**
* @internal
*
@@ -299,6 +393,18 @@ struct rte_ml_dev_ops {
/** Update model params. */
mldev_model_params_update_t model_params_update;
+
+ /** Get input buffer size. */
+ mldev_io_input_size_get_t io_input_size_get;
+
+ /** Get output buffer size. */
+ mldev_io_output_size_get_t io_output_size_get;
+
+ /** Quantize data */
+ mldev_io_quantize_t io_quantize;
+
+ /** De-quantize data */
+ mldev_io_dequantize_t io_dequantize;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 640671efff..d87c7781df 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -11,6 +11,10 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_io_dequantize;
+ rte_ml_io_input_size_get;
+ rte_ml_io_output_size_get;
+ rte_ml_io_quantize;
rte_ml_model_info_get;
rte_ml_model_load;
rte_ml_model_params_update;
--
2.39.1
* [dpdk-dev] [PATCH v2 07/12] mldev: support ML op pool and ops
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (5 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 06/12] mldev: support input and output data handling jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 08/12] mldev: support inference enqueue and dequeue jerinj
` (5 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added library functions to create and free an ML op pool.
The create function allocates a new ML op pool and
initializes the ML ops to their defaults.
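A hedged usage sketch; the pool name, element count and per-lcore cache value
are illustrative:

#include <rte_lcore.h>
#include <rte_mempool.h>
#include <rte_mldev.h>

static struct rte_mempool *
create_op_pool(void)
{
        /* 1024 ops, 64-entry per-lcore cache, no private user data. */
        return rte_ml_op_pool_create("ml_op_pool", 1024, 64, 0,
                                     rte_socket_id());
}

Ops are then obtained with rte_mempool_get() and returned with
rte_mempool_put(); as ml_op_init() below shows, each element starts out
zeroed, with status RTE_ML_OP_STATUS_NOT_PROCESSED and a back-pointer to its
pool.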
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 69 +++++++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
2 files changed, 71 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 62f0e95c85..b669d866bf 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -11,6 +11,17 @@ static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
static struct rte_ml_dev_global ml_dev_globals = {
.devs = ml_devices, .data = {NULL}, .nb_devs = 0, .max_devs = RTE_MLDEV_MAX_DEVS};
+/*
+ * Private data structure of an operation pool.
+ *
+ * A structure that contains ML op pool specific data that is
+ * appended after the mempool structure (in private data).
+ */
+struct rte_ml_op_pool_private {
+ uint16_t user_size;
+	/**< Size of private user data with each operation. */
+};
+
struct rte_ml_dev *
rte_ml_dev_pmd_get_dev(int16_t dev_id)
{
@@ -559,4 +570,62 @@ rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, voi
return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
}
+/** Initialise rte_ml_op mempool element */
+static void
+ml_op_init(struct rte_mempool *mempool, __rte_unused void *opaque_arg, void *_op_data,
+ __rte_unused unsigned int i)
+{
+ struct rte_ml_op *op = _op_data;
+
+ memset(_op_data, 0, mempool->elt_size);
+ op->status = RTE_ML_OP_STATUS_NOT_PROCESSED;
+ op->mempool = mempool;
+}
+
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id)
+{
+ struct rte_ml_op_pool_private *priv;
+ struct rte_mempool *mp;
+ unsigned int elt_size;
+
+ /* lookup mempool in case already allocated */
+ mp = rte_mempool_lookup(name);
+ elt_size = sizeof(struct rte_ml_op) + user_size;
+
+ if (mp != NULL) {
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ if (mp->elt_size != elt_size || mp->cache_size < cache_size || mp->size < nb_elts ||
+ priv->user_size < user_size) {
+ mp = NULL;
+ RTE_MLDEV_LOG(ERR,
+ "Mempool %s already exists but with incompatible parameters",
+ name);
+ return NULL;
+ }
+ return mp;
+ }
+
+ mp = rte_mempool_create(name, nb_elts, elt_size, cache_size,
+ sizeof(struct rte_ml_op_pool_private), NULL, NULL, ml_op_init, NULL,
+ socket_id, 0);
+ if (mp == NULL) {
+ RTE_MLDEV_LOG(ERR, "Failed to create mempool %s", name);
+ return NULL;
+ }
+
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ priv->user_size = user_size;
+
+ return mp;
+}
+
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool)
+{
+ if (mempool != NULL)
+ rte_mempool_free(mempool);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index d87c7781df..b5ee45108b 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -21,6 +21,8 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_pool_create;
+ rte_ml_op_pool_free;
local: *;
};
--
2.39.1
* [dpdk-dev] [PATCH v2 08/12] mldev: support inference enqueue and dequeue
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (6 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 07/12] mldev: support ML op pool and ops jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 09/12] mldev: support device statistics jerinj
` (4 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added implementations of fast-path functions to enqueue ML
requests to, and dequeue them from, an ML device queue pair.
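A hedged fast-path sketch built on these two calls; op is assumed to be
fully populated (model, I/O addresses) per the rte_ml_op layout from patch
01, RTE_ML_OP_STATUS_SUCCESS is assumed from the status enum defined there,
and the busy-poll loops are for illustration only:

#include <rte_mldev.h>

static int
run_one_inference(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op)
{
        struct rte_ml_op *done;

        /* Retry while the queue pair is full. */
        while (rte_ml_enqueue_burst(dev_id, qp_id, &op, 1) != 1)
                ;

        /* Poll until the device hands the op back. */
        while (rte_ml_dequeue_burst(dev_id, qp_id, &done, 1) != 1)
                ;

        return done->status == RTE_ML_OP_STATUS_SUCCESS ? 0 : -1;
}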
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 76 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 46 +++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.h | 2 +
lib/mldev/version.map | 2 +
4 files changed, 126 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index b669d866bf..872e4e200a 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -2,6 +2,7 @@
* Copyright (c) 2022 Marvell.
*/
+#include <rte_errno.h>
#include <rte_log.h>
#include <rte_mldev.h>
#include <rte_mldev_pmd.h>
@@ -107,6 +108,9 @@ rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
ml_dev_globals.nb_devs++;
}
+ dev->enqueue_burst = NULL;
+ dev->dequeue_burst = NULL;
+
return dev;
}
@@ -628,4 +632,76 @@ rte_ml_op_pool_free(struct rte_mempool *mempool)
rte_mempool_free(mempool);
}
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->enqueue_burst == NULL) {
+ rte_errno = -ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->enqueue_burst)(dev, qp_id, ops, nb_ops);
+}
+
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dequeue_burst == NULL) {
+ rte_errno = -ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = -EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 1bbc9fda0e..b0144aaf0c 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -33,6 +33,46 @@ extern "C" {
struct rte_ml_dev;
+/**
+ * @internal
+ *
+ * Enqueue a burst of inference requests to a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to be enqueued.
+ * @param nb_ops
+ * Number of ops to enqueue.
+ *
+ * @return
+ * - Number of ops enqueued.
+ */
+typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
+/**
+ * @internal
+ *
+ * Dequeue a burst of inference requests from a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to dequeued.
+ * @param nb_ops
+ * Number of ops to dequeue.
+ *
+ * @return
+ * - Number of ops dequeued.
+ */
+typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -451,6 +491,12 @@ struct rte_ml_dev_data {
* The data structure associated with each ML device.
*/
struct rte_ml_dev {
+ /** Pointer to PMD enqueue function. */
+ mldev_enqueue_t enqueue_burst;
+
+ /** Pointer to PMD dequeue function. */
+ mldev_dequeue_t dequeue_burst;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
index 33544f1b80..afe617e4bf 100644
--- a/lib/mldev/rte_mldev_pmd.h
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -40,6 +40,8 @@ struct rte_ml_dev_pmd_init_params {
uint64_t private_data_size;
};
+struct rte_ml_dev;
+
/**
* @internal
*
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index b5ee45108b..b585b09ec1 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,6 +1,7 @@
EXPERIMENTAL {
global:
+ rte_ml_dequeue_burst;
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
@@ -11,6 +12,7 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
rte_ml_io_output_size_get;
--
2.39.1
* [dpdk-dev] [PATCH v2 09/12] mldev: support device statistics
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (7 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 08/12] mldev: support inference enqueue and dequeue jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 10/12] mldev: support device extended statistics jerinj
` (3 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get and reset device stats. Device stats
include the number of requests enqueued and dequeued, and the
number of errors. Added function prototypes to be used by
driver implementations.
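A hedged sketch of the resulting application usage; the rte_ml_dev_stats
field names are assumed from patch 01 of this series:

#include <inttypes.h>
#include <stdio.h>

#include <rte_mldev.h>

static void
dump_and_reset_stats(int16_t dev_id)
{
        struct rte_ml_dev_stats stats;

        if (rte_ml_dev_stats_get(dev_id, &stats) != 0)
                return;

        /* Field names assumed from patch 01. */
        printf("enqueued %" PRIu64 ", dequeued %" PRIu64 "\n",
               stats.enqueued_count, stats.dequeued_count);

        rte_ml_dev_stats_reset(dev_id);
}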
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 40 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 32 ++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 74 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 872e4e200a..bd65a44be5 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -357,6 +357,46 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_get == NULL)
+ return -ENOTSUP;
+
+ if (stats == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, stats cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(stats, 0, sizeof(struct rte_ml_dev_stats));
+
+ return (*dev->dev_ops->dev_stats_get)(dev, stats);
+}
+
+void
+rte_ml_dev_stats_reset(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_reset == NULL)
+ return;
+
+ (*dev->dev_ops->dev_stats_reset)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index b0144aaf0c..73eefc48c0 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -191,6 +191,32 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to get device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stats
+ * Pointer to ML device stats structure to update.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats);
+
+/**
+ * @internal
+ *
+ * Function used to reset device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ */
+typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -416,6 +442,12 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+ /** Get device statistics. */
+ mldev_stats_get_t dev_stats_get;
+
+ /** Reset device statistics. */
+ mldev_stats_reset_t dev_stats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index b585b09ec1..58803722be 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -11,6 +11,8 @@ EXPERIMENTAL {
rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
+ rte_ml_dev_stats_get;
+ rte_ml_dev_stats_reset;
rte_ml_dev_stop;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v2 10/12] mldev: support device extended statistics
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (8 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 09/12] mldev: support device statistics jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 11/12] mldev: support to retrieve error information jerinj
` (2 subsequent siblings)
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to handle device extended stats. The supported
xstats are driver-specific and can include stats specific to the
ML device, or to an ML model and its I/O. Added prototypes for
the functions to be called by the device drivers.
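A minimal usage sketch pairing the two query calls (not part of the patch;
the xstats_map field names id and name are assumptions):

    struct rte_ml_dev_xstats_map map[16];
    uint64_t value;
    int n, i;

    /* Per the driver-op contract below, a return value larger than the
     * array size means the array is too small and gives the required size. */
    n = rte_ml_dev_xstats_names_get(0, map, 16);
    if (n < 0 || n > 16)
        return;

    for (i = 0; i < n; i++) {
        /* Fetch one value by its id and print "name = value". */
        if (rte_ml_dev_xstats_get(0, &map[i].id, &value, 1) == 1)
            printf("%s = %" PRIu64 "\n", map[i].name, value);
    }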
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 88 ++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 93 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 185 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index bd65a44be5..da4c272d57 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -397,6 +397,94 @@ rte_ml_dev_stats_reset(int16_t dev_id)
(*dev->dev_ops->dev_stats_reset)(dev);
}
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map, uint32_t size)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_names_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_names_get)(dev, xstats_map, size);
+}
+
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_by_name_get == NULL)
+ return -ENOTSUP;
+
+ if (name == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, name cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (value == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, value cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_by_name_get)(dev, name, stat_id, value);
+}
+
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_get == NULL)
+ return -ENOTSUP;
+
+ if (stat_ids == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, stat_ids cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (values == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, values cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_get)(dev, stat_ids, values, nb_ids);
+}
+
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_reset == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 73eefc48c0..b2ddf8fb5e 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -217,6 +217,87 @@ typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats
*/
typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Function used to get names of extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param xstats_map
+ * Array to insert id and names into.
+ * @param size
+ * Size of xstats_map array.
+ *
+ * @return
+ * - >= 0 and <= size on success.
+ * - > size, error. Returns the size of xstats_map array required.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_names_get_t)(struct rte_ml_dev *dev,
+ struct rte_ml_dev_xstats_map *xstats_map, uint32_t size);
+
+/**
+ * @internal
+ *
+ * Function used to get a single extended stat by name.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param name
+ * Name of the stat to retrieve.
+ * @param stat_id
+ * ID of the stat to be returned.
+ * @param value
+ * Value of the stat to be returned.
+ *
+ * @return
+ * - >= 0 stat value.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_by_name_get_t)(struct rte_ml_dev *dev, const char *name,
+ uint16_t *stat_id, uint64_t *value);
+
+/**
+ * @internal
+ *
+ * Function used to retrieve extended stats of a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of ID numbers of the stats to be retrieved.
+ * @param values
+ * Values of the stats requested by the ID.
+ * @param nb_ids
+ * Number of stats requested.
+ *
+ * @return
+ * - >= 0, number of entries filled into the values array.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint64_t *values, uint16_t nb_ids);
+
+/**
+ * @internal
+ *
+ * Function used to reset extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of stats IDs to be reset.
+ * @param nb_ids
+ * Number of IDs in the stat_ids array.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint16_t nb_ids);
+
/**
* @internal
*
@@ -448,6 +529,18 @@ struct rte_ml_dev_ops {
/** Reset device statistics. */
mldev_stats_reset_t dev_stats_reset;
+ /** Get names of extended stats. */
+ mldev_xstats_names_get_t dev_xstats_names_get;
+
+ /** Get value of a single extended stat. */
+ mldev_xstats_by_name_get_t dev_xstats_by_name_get;
+
+ /** Get extended stats of a device. */
+ mldev_xstats_get_t dev_xstats_get;
+
+ /** Reset extended stats of the device. */
+ mldev_xstats_reset_t dev_xstats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 58803722be..ddf340ef8e 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -14,6 +14,10 @@ EXPERIMENTAL {
rte_ml_dev_stats_get;
rte_ml_dev_stats_reset;
rte_ml_dev_stop;
+ rte_ml_dev_xstats_by_name_get;
+ rte_ml_dev_xstats_get;
+ rte_ml_dev_xstats_names_get;
+ rte_ml_dev_xstats_reset;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v2 11/12] mldev: support to retrieve error information
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (9 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 10/12] mldev: support device extended statistics jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 12/12] mldev: support to get debug info and test device jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get error information for an ML op.
This information can include both a driver-specific error
message and an error code.
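A minimal usage sketch (not part of the patch; the rte_ml_op_error field
names errcode and message are assumptions):

    struct rte_ml_op_error error;

    /* op is an rte_ml_op previously returned by rte_ml_dequeue_burst(). */
    if (rte_ml_op_error_get(0, op, &error) == 0)
        printf("op failed: errcode = 0x%" PRIx64 ", message = %s\n",
               error.errcode, error.message);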
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 31 +++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 22 ++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 54 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index da4c272d57..1a8d8d4987 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -832,4 +832,35 @@ rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uin
return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
}
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->op_error_get == NULL)
+ return -ENOTSUP;
+
+ if (op == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, op cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (error == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, error cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->op_error_get)(dev, op, error);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index b2ddf8fb5e..29bec93c5f 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -73,6 +73,25 @@ typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, stru
typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
uint16_t nb_ops);
+/**
+ * @internal
+ *
+ * Get error information for an Op.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param op
+ * ML Op handle.
+ * @param error
+ * Pointer to error structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_op_error_get_t)(struct rte_ml_dev *dev, struct rte_ml_op *op,
+ struct rte_ml_op_error *error);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -622,6 +641,9 @@ struct rte_ml_dev {
/** Pointer to PMD dequeue function. */
mldev_dequeue_t dequeue_burst;
+ /** Pointer to PMD Op error get function. */
+ mldev_op_error_get_t op_error_get;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index ddf340ef8e..ebe69765e6 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -29,6 +29,7 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_error_get;
rte_ml_op_pool_create;
rte_ml_op_pool_free;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v2 12/12] mldev: support to get debug info and test device
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (10 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 11/12] mldev: support to retrieve error information jerinj
@ 2023-02-06 20:24 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
12 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-06 20:24 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, pkapoor,
Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions for the ML device debug APIs. The APIs are
used to dump ML device debug information and to run a selftest.
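A minimal usage sketch for the two calls (not part of the patch):

    /* Dump device 0 debug state to stdout, then run the driver selftest. */
    if (rte_ml_dev_dump(0, stdout) != 0)
        printf("device dump failed or not supported\n");

    if (rte_ml_dev_selftest(0) == 0)
        printf("selftest passed\n");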
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 39 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 37 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 1a8d8d4987..f2d6158689 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -485,6 +485,45 @@ rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_id
return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
}
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_dump == NULL)
+ return -ENOTSUP;
+
+ if (fd == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, file descriptor cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_dump)(dev, fd);
+}
+
+int
+rte_ml_dev_selftest(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_selftest == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_selftest)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 29bec93c5f..0bc181c680 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -317,6 +317,37 @@ typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_i
typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
uint16_t nb_ids);
+/**
+ * @internal
+ *
+ * Function used to dump ML device debug info.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param fd
+ * File descriptor to dump the debug info.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+
+typedef int (*mldev_dump_t)(struct rte_ml_dev *dev, FILE *fd);
+
+/**
+ * @internal
+ *
+ * Function used for selftest of ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_selftest_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -560,6 +591,12 @@ struct rte_ml_dev_ops {
/** Reset extended stats of the device. */
mldev_xstats_reset_t dev_xstats_reset;
+ /** Dump ML device debug info. */
+ mldev_dump_t dev_dump;
+
+ /** Run selftest on ML device. */
+ mldev_selftest_t dev_selftest;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index ebe69765e6..745cf2e1cc 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -5,10 +5,12 @@ EXPERIMENTAL {
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
+ rte_ml_dev_dump;
rte_ml_dev_info_get;
rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
rte_ml_dev_queue_pair_setup;
+ rte_ml_dev_selftest;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stats_get;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device jerinj
@ 2023-02-06 21:04 ` Stephen Hemminger
2023-02-06 22:17 ` Thomas Monjalon
2023-02-07 5:16 ` Jerin Jacob
0 siblings, 2 replies; 80+ messages in thread
From: Stephen Hemminger @ 2023-02-06 21:04 UTC (permalink / raw)
To: jerinj
Cc: dev, Srikanth Yalavarthi, Anatoly Burakov, thomas, ferruh.yigit,
dchickles, sshankarnara, pkapoor
On Tue, 7 Feb 2023 01:54:43 +0530
<jerinj@marvell.com> wrote:
> +static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
>
This will reserve space for 64 devices, but almost all users
will only have one. Maybe a level of indirection and allocate as needed?
You could even use a single allocation for the pmd and device private
data portion.
> + */
> +struct rte_ml_dev_data {
> + /** Unique identifier name. */
> + char name[RTE_ML_STR_MAX];
Why is name first, it is the least used field. Might want it to be last
for cache locality.
> + /** Reserved for future fields */
> + uint64_t reserved[3];
Reserved fields have been a problem in the past.
Why do this? Are they just available pad elements to be cache line size?
And why bother being cache aligned for an info struct?
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device
2023-02-06 21:04 ` Stephen Hemminger
@ 2023-02-06 22:17 ` Thomas Monjalon
2023-02-07 5:16 ` Jerin Jacob
1 sibling, 0 replies; 80+ messages in thread
From: Thomas Monjalon @ 2023-02-06 22:17 UTC (permalink / raw)
To: jerinj, Stephen Hemminger
Cc: dev, Srikanth Yalavarthi, Anatoly Burakov, ferruh.yigit,
dchickles, sshankarnara, pkapoor
06/02/2023 22:04, Stephen Hemminger:
> On Tue, 7 Feb 2023 01:54:43 +0530
> <jerinj@marvell.com> wrote:
>
> > +static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
> >
>
> This will reserve space for 64 devices, but almost all users
> will only have one. Maybe a level of indirection and allocate as needed?
>
> You could even use a single allocation for the pmd and device private
> data portion.
I like what we did for GPU class: rte_gpu_init(size_t dev_max)
If not called by the app, it is called automatically with a default size.
So you can have a small default and there are no compilation settings.
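A hypothetical sketch of that approach applied to mldev (all names here are
illustrative, not from any posted patch):

    static struct rte_ml_dev *ml_devices; /* replaces the fixed 64-entry array */

    int
    rte_ml_dev_init(size_t dev_max)
    {
        if (ml_devices != NULL)
            return -EBUSY; /* already initialized */

        ml_devices = rte_zmalloc("rte_ml_dev", dev_max * sizeof(*ml_devices), 0);
        return ml_devices == NULL ? -ENOMEM : 0;
    }

    /* Probe paths would call rte_ml_dev_init() with a small default size
     * when the application has not called it explicitly. */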
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device
2023-02-06 21:04 ` Stephen Hemminger
2023-02-06 22:17 ` Thomas Monjalon
@ 2023-02-07 5:16 ` Jerin Jacob
1 sibling, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-07 5:16 UTC (permalink / raw)
To: Stephen Hemminger
Cc: jerinj, dev, Srikanth Yalavarthi, Anatoly Burakov, thomas,
ferruh.yigit, dchickles, sshankarnara, pkapoor
On Tue, Feb 7, 2023 at 2:34 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 7 Feb 2023 01:54:43 +0530
> <jerinj@marvell.com> wrote:
>
> > +static struct rte_ml_dev ml_devices[RTE_MLDEV_MAX_DEVS];
> >
>
> This will reserve space for 64 devices, but almost all users
> will only have one. Maybe a level of indirection and allocate as needed?
As Thomas suggested, I will add something similar to rte_gpu_init()
>
> You could even use a single allocation for the pmd and device private
> data portion.
>
> > + */
> > +struct rte_ml_dev_data {
> > + /** Unique identifier name. */
> > + char name[RTE_ML_STR_MAX];
>
>
> Why is name first, it is the least used field. Might want it to be last
> for cache locality.
It is slow path, so it does not matter. But there is no harm in moving
it. I will move it to the end of the structure.
>
> > + /** Reserved for future fields */
> > + uint64_t reserved[3];
>
> Reserved fields have been a problem in the past.
Will remove it.
> Why do this? Are they just available pad elements to be cache line size?
Yes.
>
> And why bother being cache aligned for an info struct?
We can remove it. Will remove the reserved fields and the cache alignment.
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
` (11 preceding siblings ...)
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 12/12] mldev: support to get debug info and test device jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 01/12] " jerinj
` (13 more replies)
12 siblings, 14 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev; +Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Machine learning inference library
==================================
Definition of machine learning inference
----------------------------------------
Inference in machine learning is the process of making an output prediction
based on new input data using a pre-trained machine learning model.
The scope of the RFC would include only inferencing with pre-trained machine learning models,
training and building/compiling the ML models is out of scope for this RFC or
DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
Motivation for the new library
------------------------------
Multiple semiconductor vendors are offering accelerator products such as DPU
(often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
integrated as part of the product. Use of ML inferencing is increasing in the domain
of packet processing for flow classification, intrusion, malware and anomaly detection.
Lack of inferencing support through DPDK APIs will involve complexities and
increased latency from moving data across frameworks (i.e, dataplane to
non dataplane ML frameworks and vice-versa). Having a standardized DPDK APIs for ML
inferencing would enable the dataplane solutions to harness the benefit of inline
inferencing supported by the hardware.
Contents
--------
A) API specification for:
1) Discovery of ML capabilities (e.g., device specific features) in a vendor
independent fashion
2) Definition of functions to handle ML devices, which includes probing,
initialization and termination of the devices.
3) Definition of functions to handle ML models used to perform inference operations.
4) Definition of function to handle quantize and dequantize operations
B) Common code for above specification
rfc..v1:
- Added programmer guide documentation
- Added implementation for common code
v1..v2:
- Moved dynamic log (Stephen)
- Changed model id to uint16_t from int16_t (Stephen)
- Added release note updates
v2..v3:
- Introduced rte_ml_dev_init() similar to rte_gpu_init() (Stephen, Thomas)
- In struct rte_ml_dev_data, removed reserved[3] and __rte_cache_aligned.
Also, moved name field to the end (Stephen)
Machine learning library framework
----------------------------------
The ML framework is built on the following model:
+-----------------+               rte_ml_[en|de]queue_burst()
|                 |                          |
|     Machine     o------+     +--------+    |
|     Learning    |      |     | queue  |    |    +------+
|     Inference   o------+-----o        |<===o===>|Core 0|
|     Engine      |      |     | pair 0 |         +------+
|                 o----+ |     +--------+
|                 |    | |
+-----------------+    | |     +--------+
         ^             | |     | queue  |         +------+
         |             | +-----o        |<=======>|Core 1|
         |             |       | pair 1 |         +------+
         |             |       +--------+
+--------+--------+    |
| +-------------+ |    |       +--------+
| |   Model 0   | |    |       | queue  |         +------+
| +-------------+ |    +-------o        |<=======>|Core N|
| +-------------+ |            | pair N |         +------+
| |   Model 1   | |            +--------+
| +-------------+ |
| +-------------+ |<------- rte_ml_model_load()
| |   Model ..  | |-------> rte_ml_model_info()
| +-------------+ |<------- rte_ml_model_start()
| +-------------+ |<------- rte_ml_model_stop()
| |   Model N   | |<------- rte_ml_model_params_update()
| +-------------+ |<------- rte_ml_model_unload()
+-----------------+
ML Device: A hardware or software-based implementation of the ML device API for
running inferences using a pre-trained ML model.
ML Model: An ML model is an algorithm trained over a dataset. A model consists of a
procedure/algorithm and the data/pattern required to make predictions on live data.
Once the model is created and trained outside of the DPDK scope, the model can be loaded
via rte_ml_model_load() and then started using the rte_ml_model_start() API.
The rte_ml_model_params_update() can be used to update model parameters such as weights
and biases without unloading the model via rte_ml_model_unload().
ML Inference: ML inference is the process of feeding data to the model via the
rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
outputs/predictions from the started model.
In all functions of the ML device API, the ML device is designated by an
integer >= 0, named the device identifier *dev_id*.
The functions exported by the ML device API to setup a device designated by
its device identifier must be invoked in the following order:
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_dev_start()
A model is required to run the inference operations with the user-specified inputs.
The application needs to invoke the ML model API in the following order before queueing
inference jobs.
- rte_ml_model_load()
- rte_ml_model_start()
The rte_ml_model_info() API is provided to retrieve information related to the model.
The information includes the shape and type of input and output required for the inference.
Data quantization and dequantization is one of the main aspects of the ML domain. This involves
conversion of input data from a higher precision to a lower precision data type and vice-versa
for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
dequantization through rte_ml_io_dequantize(). These APIs can handle input
and output buffers holding data for multiple batches.
Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used to get
the size of quantized and de-quantized multi-batch input and output buffers.
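A short sketch of the size queries (the meaning of the last two output
arguments, quantized size and dequantized size, is an assumption inferred
from the example application below, which passes NULL for the unused one):

    uint64_t in_q_size, in_d_size, out_q_size, out_d_size;

    /* Buffer sizes for one batch of model model_id on device 0. */
    rte_ml_io_input_size_get(0, model_id, 1, &in_q_size, &in_d_size);
    rte_ml_io_output_size_get(0, model_id, 1, &out_q_size, &out_d_size);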
The user can optionally update the model parameters with rte_ml_model_params_update() after
invoking the rte_ml_model_stop() API on a given model ID.
The application can invoke, in any order, the functions exported by the ML API to enqueue
inference jobs and dequeue inference responses.
If the application wants to change the device configuration (i.e., call
rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated, then
the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
for the given model. The application does not need to call the rte_ml_dev_stop() API for
any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
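A minimal sketch of that update sequence (only the call order is taken from
the text above; the rte_ml_model_params_update() argument list is an
assumption):

    /* The device keeps running; only the model is stopped for the update. */
    rte_ml_model_stop(0, model_id);
    rte_ml_model_params_update(0, model_id, new_params); /* new_params: hypothetical buffer */
    rte_ml_model_start(0, model_id);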
Once the device is in the start state after invoking the rte_ml_dev_start() API and the model is
in the start state after invoking the rte_ml_model_start() API, the application can call
rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the destined device and model ID.
Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
Typical application utilisation of the ML API will follow this
programming flow.
- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_model_load()
- rte_ml_model_start()
- rte_ml_model_info()
- rte_ml_dev_start()
- rte_ml_enqueue_burst()
- rte_ml_dequeue_burst()
- rte_ml_model_stop()
- rte_ml_model_unload()
- rte_ml_dev_stop()
- rte_ml_dev_close()
Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
are lock-free functions, which are assumed not to be invoked in parallel on different logical
cores on the same target object. For instance, the dequeue function of a poll mode driver cannot
be invoked in parallel on two logical cores to operate on the same queue pair. Of course, this
function can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the user application to enforce this rule.
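A minimal sketch of one way to honour that rule, dedicating one queue pair
per worker lcore (the 1:1 lcore-to-queue-pair mapping is illustrative, not
mandated by the API):

    /* Each worker owns one queue pair, so no locking is needed. */
    uint16_t qp_id = rte_lcore_id(); /* assumes nb_queue_pairs >= number of lcores */
    struct rte_ml_op *ops[32];       /* ops prepared earlier by this lcore */
    uint16_t nb_enq, nb_deq;

    nb_enq = rte_ml_enqueue_burst(0, qp_id, ops, 32);
    nb_deq = rte_ml_dequeue_burst(0, qp_id, ops, 32);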
Example application usage for ML inferencing
--------------------------------------------
This example application demonstrates the programming model of the ML device
library. The example omits error checks to simplify the application. It also
assumes that the input data received is quantized and the output expected is
also quantized. In order to handle non-quantized inputs and outputs, users can
invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.
#define ML_MODEL_NAME "model"
#define IO_MZ "io_mz"
struct app_ctx {
char model_file[PATH_MAX];
char inp_file[PATH_MAX];
char out_file[PATH_MAX];
struct rte_ml_model_params params;
struct rte_ml_model_info info;
uint16_t id;
uint64_t input_size;
uint64_t output_size;
uint8_t *input_buffer;
uint8_t *output_buffer;
} __rte_cache_aligned;
struct app_ctx ctx;
static int
parse_args(int argc, char **argv)
{
int opt, option_index;
static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
{"input", required_argument, NULL, 'i'},
{"output", required_argument, NULL, 'o'},
{NULL, 0, NULL, 0}};
while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
switch (opt) {
case 'm':
strncpy(ctx.model_file, optarg, PATH_MAX - 1);
break;
case 'i':
strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
break;
case 'o':
strncpy(ctx.out_file, optarg, PATH_MAX - 1);
break;
default:
return -1;
}
return 0;
}
int
main(int argc, char **argv)
{
struct rte_ml_dev_qp_conf qp_conf;
struct rte_ml_dev_config config;
struct rte_ml_dev_info dev_info;
const struct rte_memzone *mz;
struct rte_mempool *op_pool;
struct rte_ml_op *op_enq;
struct rte_ml_op *op_deq;
FILE *fp;
int rc;
/* Initialize EAL */
rc = rte_eal_init(argc, argv);
if (rc < 0)
rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
argc -= rc;
argv += rc;
/* Parse application arguments (after the EAL args) */
if (parse_args(argc, argv) < 0)
rte_exit(EXIT_FAILURE, "Invalid application arguments\n");
/* Step 1: Check for ML devices */
if (rte_ml_dev_count() <= 0)
rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");
/* Step 2: Get device info */
if (rte_ml_dev_info_get(0, &dev_info) != 0)
rte_exit(EXIT_FAILURE, "Failed to get device info\n");
/* Step 3: Configure ML device, use device 0 */
config.socket_id = rte_ml_dev_socket_id(0);
config.max_nb_models = dev_info.max_models;
config.nb_queue_pairs = dev_info.max_queue_pairs;
if (rte_ml_dev_configure(0, &config) != 0)
rte_exit(EXIT_FAILURE, "Device configuration failed\n");
/* Step 4: Setup queue pairs, used qp_id = 0 */
qp_conf.nb_desc = 1;
if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");
/* Step 5: Start device */
if (rte_ml_dev_start(0) != 0)
rte_exit(EXIT_FAILURE, "Device start failed\n");
/* Step 6: Read model data and update load params structure */
fp = fopen(ctx.model_file, "r+");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open model file\n");
fseek(fp, 0, SEEK_END);
ctx.params.size = ftell(fp);
fseek(fp, 0, SEEK_SET);
ctx.params.addr = malloc(ctx.params.size);
if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size) {
fclose(fp);
rte_exit(EXIT_FAILURE, "Failed to read model\n");
}
fclose(fp);
strcpy(ctx.params.name, ML_MODEL_NAME);
/* Step 7: Load the model */
if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
rte_exit(EXIT_FAILURE, "Failed to load model\n");
free(ctx.params.addr);
/* Step 8: Start the model */
if (rte_ml_model_start(0, ctx.id) != 0)
rte_exit(EXIT_FAILURE, "Failed to start model\n");
/* Step 9: Allocate buffers for quantized input and output */
/* Get model information */
if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
rte_exit(EXIT_FAILURE, "Failed to get model info\n");
/* Get the buffer size for input and output */
rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);
mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
if (mz == NULL)
rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");
ctx.input_buffer = mz->addr;
ctx.output_buffer = ctx.input_buffer + ctx.input_size;
/* Step 10: Fill the input data */
fp = fopen(ctx.inp_file, "r+");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open input file\n");
if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
fclose(fp);
rte_exit(EXIT_FAILURE, "Failed to read input file\n");
}
fclose(fp);
/* Step 11: Create ML op mempool */
op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
if (op_pool == NULL)
rte_exit(EXIT_FAILURE, "Failed to create op pool\n");
/* Step 12: Form an ML op */
rte_mempool_get_bulk(op_pool, (void **)&op_enq, 1);
op_enq->model_id = ctx.id;
op_enq->nb_batches = ctx.info.batch_size;
op_enq->mempool = op_pool;
op_enq->input.addr = ctx.input_buffer;
op_enq->input.length = ctx.input_size;
op_enq->input.next = NULL;
op_enq->output.addr = ctx.output_buffer;
op_enq->output.length = ctx.output_size;
op_enq->output.next = NULL;
/* Step 13: Enqueue jobs */
rte_ml_enqueue_burst(0, 0, &op_enq, 1);
/* Step 14: Dequeue jobs and release op pool */
while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
;
/* Step 15: Write output */
fp = fopen(ctx.out_file, "w+");
if (fp == NULL)
rte_exit(EXIT_FAILURE, "Failed to open output file\n");
fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
fclose(fp);
/* Step 16: Clean up */
/* Stop ML model */
rte_ml_model_stop(0, ctx.id);
/* Unload ML model */
rte_ml_model_unload(0, ctx.id);
/* Free input/output memory */
rte_memzone_free(rte_memzone_lookup(IO_MZ));
/* Free the ml op back to pool */
rte_mempool_put_bulk(op_pool, (void **)&op_deq, 1);
/* Free ml op pool */
rte_mempool_free(op_pool);
/* Stop the device */
rte_ml_dev_stop(0);
rte_ml_dev_close(0);
rte_eal_cleanup();
return 0;
}
Jerin Jacob (1):
mldev: introduce machine learning device library
Srikanth Yalavarthi (11):
mldev: support PMD functions for ML device
mldev: support ML device handling functions
mldev: support ML device queue-pair setup
mldev: support handling ML models
mldev: support input and output data handling
mldev: support ML op pool and ops
mldev: support inference enqueue and dequeue
mldev: support device statistics
mldev: support device extended statistics
mldev: support to retrieve error information
mldev: support to get debug info and test device
MAINTAINERS | 5 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
doc/guides/rel_notes/release_23_03.rst | 5 +
lib/meson.build | 1 +
lib/mldev/meson.build | 27 +
lib/mldev/rte_mldev.c | 947 ++++++++++++++++++
lib/mldev/rte_mldev.h | 1119 ++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 717 ++++++++++++++
lib/mldev/rte_mldev_pmd.c | 62 ++
lib/mldev/rte_mldev_pmd.h | 151 +++
lib/mldev/version.map | 51 +
15 files changed, 3988 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
create mode 100644 lib/mldev/version.map
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v3 01/12] mldev: introduce machine learning device library
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 02/12] mldev: support PMD functions for ML device jerinj
` (12 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Thomas Monjalon, Srikanth Yalavarthi
Cc: ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Jerin Jacob <jerinj@marvell.com>
Add mldev API specification to standardize and use the machine learning
device and inference operations in vendor neutral way.
Following operations are abstracted through APIs
- ML device capability probe
- ML device configuration
- ML device queue pair configuration
- ML device state management
- ML device stat/xstat operations
- ML model load/unload/start/stop operations
- ML model information probe
- ML IO operations to find size for input and output buffers
- ML quantize and dequantize operations
- ML ops pool creation and free operations
- ML device enqueue/dequeue fastpath interference operations
Also added programming guide.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
MAINTAINERS | 5 +
doc/api/doxy-api-index.md | 1 +
doc/api/doxy-api.conf.in | 1 +
doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/mldev.rst | 186 ++++
doc/guides/rel_notes/release_23_03.rst | 5 +
lib/meson.build | 1 +
lib/mldev/meson.build | 18 +
lib/mldev/rte_mldev.c | 8 +
lib/mldev/rte_mldev.h | 1119 ++++++++++++++++++++++
lib/mldev/version.map | 7 +
12 files changed, 2066 insertions(+)
create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
create mode 100644 doc/guides/prog_guide/mldev.rst
create mode 100644 lib/mldev/meson.build
create mode 100644 lib/mldev/rte_mldev.c
create mode 100644 lib/mldev/rte_mldev.h
create mode 100644 lib/mldev/version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 3495946d0f..fa91900a20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -538,6 +538,11 @@ F: drivers/raw/skeleton/
F: app/test/test_rawdev.c
F: doc/guides/prog_guide/rawdev.rst
+ML device API - EXPERIMENTAL
+M: Srikanth Yalavarthi <syalavarthi@marvell.com>
+F: lib/mldev/
+F: doc/guides/prog_guide/mldev.rst
+
Memory Pool Drivers
-------------------
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de488c7abf..a12562977a 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -22,6 +22,7 @@ The public API headers are grouped by topics:
[compress](@ref rte_comp.h),
[regexdev](@ref rte_regexdev.h),
[dmadev](@ref rte_dmadev.h),
+ [mldev](@ref rte_mldev.h),
[eventdev](@ref rte_eventdev.h),
[event_eth_rx_adapter](@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter](@ref rte_event_eth_tx_adapter.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index f0886c3bd1..5d6416d3e0 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -57,6 +57,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/mempool \
@TOPDIR@/lib/meter \
@TOPDIR@/lib/metrics \
+ @TOPDIR@/lib/mldev \
@TOPDIR@/lib/node \
@TOPDIR@/lib/net \
@TOPDIR@/lib/pcapng \
diff --git a/doc/guides/prog_guide/img/mldev_flow.svg b/doc/guides/prog_guide/img/mldev_flow.svg
new file mode 100644
index 0000000000..6c5dda14e5
--- /dev/null
+++ b/doc/guides/prog_guide/img/mldev_flow.svg
@@ -0,0 +1,714 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!-- SPDX-License-Identifier: BSD-3-Clause -->
+<!-- Copyright (c) 2022 Marvell. -->
+<!-- Created with Inkscape (http://www.inkscape.org/) -->
+
+<svg
+ width="320mm"
+ height="297mm"
+ viewBox="0 0 320 297"
+ version="1.1"
+ id="svg6899"
+ inkscape:version="1.2.1 (9c6d41e410, 2022-07-14)"
+ sodipodi:docname="mldev_flow.svg"
+ inkscape:export-filename="mldev_flow.png"
+ inkscape:export-xdpi="96"
+ inkscape:export-ydpi="96"
+ xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
+ xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
+ xmlns="http://www.w3.org/2000/svg"
+ xmlns:svg="http://www.w3.org/2000/svg">
+ <sodipodi:namedview
+ id="namedview6901"
+ pagecolor="#ffffff"
+ bordercolor="#000000"
+ borderopacity="0.25"
+ inkscape:showpageshadow="2"
+ inkscape:pageopacity="0.0"
+ inkscape:pagecheckerboard="0"
+ inkscape:deskcolor="#d1d1d1"
+ inkscape:document-units="mm"
+ showgrid="false"
+ inkscape:connector-spacing="0"
+ inkscape:lockguides="false"
+ inkscape:zoom="0.49638341"
+ inkscape:cx="640.63382"
+ inkscape:cy="525.80323"
+ inkscape:window-width="1920"
+ inkscape:window-height="986"
+ inkscape:window-x="-11"
+ inkscape:window-y="-11"
+ inkscape:window-maximized="1"
+ inkscape:current-layer="layer1" />
+ <defs
+ id="defs6896">
+ <marker
+ style="overflow:visible"
+ id="RoundedArrow"
+ refX="5"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="RoundedArrow"
+ markerWidth="6.1347523"
+ markerHeight="5.9304948"
+ viewBox="0 0 6.1347524 5.9304951"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.7)"
+ d="m -0.21114562,-4.1055728 6.42229122,3.21114561 a 1,1 90 0 1 0,1.78885438 L -0.21114562,4.1055728 A 1.236068,1.236068 31.717474 0 1 -2,3 v -6 a 1.236068,1.236068 148.28253 0 1 1.78885438,-1.1055728 z"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:none"
+ id="path1367" />
+ </marker>
+ <marker
+ style="overflow:visible"
+ id="TriangleStart"
+ refX="4"
+ refY="0"
+ orient="auto-start-reverse"
+ inkscape:stockid="TriangleStart"
+ markerWidth="5.3244081"
+ markerHeight="6.155385"
+ viewBox="0 0 5.3244081 6.1553851"
+ inkscape:isstock="true"
+ inkscape:collect="always"
+ preserveAspectRatio="xMidYMid">
+ <path
+ transform="scale(0.5)"
+ style="fill:context-stroke;fill-rule:evenodd;stroke:context-stroke;stroke-width:1pt"
+ d="M 5.77,0 -2.88,5 V -5 Z"
+ id="path135" />
+ </marker>
+ </defs>
+ <g
+ inkscape:label="Layer 1"
+ inkscape:groupmode="layer"
+ id="layer1">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;paint-order:stroke fill markers"
+ id="rect39991"
+ width="312.88394"
+ height="286.7659"
+ x="3.5580292"
+ y="5.1170502"
+ ry="18.197132" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.68664,155.38145 h 32.15418"
+ id="path24358"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="m 114.68664,179.58099 h 32.15008"
+ id="path24360"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,203.78389 h 32.15008"
+ id="path24362"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-start:url(#TriangleStart)"
+ d="m 114.68664,227.98576 32.14997,0"
+ id="path24364"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,252.18432 H 114.68664"
+ id="path24366"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#TriangleStart)"
+ d="M 146.8367,276.38309 H 114.68664"
+ id="path24368"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176-1" />
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:2, 1;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24370"
+ width="18.09137"
+ height="13.568528"
+ x="127.27605"
+ y="208.81961"
+ ry="2.7394907"
+ inkscape:connector-avoid="true" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:4, 2;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 70.388979,148.58514 -1e-6,-46.3516"
+ id="path24426"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176-1"
+ inkscape:connection-end="#rect24176" />
+ <g
+ id="g42647">
+ <g
+ id="g31403"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844498;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844498;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-9"
+ width="99.155487"
+ height="14.152132"
+ x="190.88715"
+ y="229.93475"
+ ry="2.2479143"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-236.90309"
+ y="240.37343"
+ id="text31115"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113"
+ style="stroke:none;stroke-width:0.75"
+ x="-236.90309"
+ y="240.37343">rte_ml_model_update_params()</tspan></text>
+ </g>
+ <g
+ id="g31398"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68902, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-4"
+ width="99.155495"
+ height="14.152357"
+ x="190.88705"
+ y="205.73608"
+ ry="2.2479498"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-212.70453"
+ y="240.37334"
+ id="text31115-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-212.70453"
+ y="240.37334">rte_ml_model_stop()</tspan></text>
+ </g>
+ <g
+ id="g31408"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844505;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68901, 0.844505;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-2"
+ width="99.155495"
+ height="14.152359"
+ x="190.88715"
+ y="254.13341"
+ ry="2.2479503"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-261.10187"
+ y="240.37343"
+ id="text31115-1"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-3"
+ style="stroke:none;stroke-width:0.75"
+ x="-261.10187"
+ y="240.37343">rte_ml_model_unload()</tspan></text>
+ </g>
+ <g
+ id="g31393"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844566;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844566;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2-5"
+ width="99.155434"
+ height="14.154394"
+ x="190.88718"
+ y="181.53319"
+ ry="2.2482734"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-188.50266"
+ y="240.37343"
+ id="text31115-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-2"
+ style="stroke:none;stroke-width:0.75"
+ x="-188.50266"
+ y="240.37343">rte_ml_model_start()</tspan></text>
+ </g>
+ <g
+ id="g31388"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844565;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.68914, 0.844565;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-8"
+ width="99.155434"
+ height="14.154395"
+ x="190.88718"
+ y="157.33029"
+ ry="2.2482736"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-164.29976"
+ y="240.37343"
+ id="text31115-6"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-5"
+ style="stroke:none;stroke-width:0.75"
+ x="-164.29976"
+ y="240.37343">rte_ml_model_info_get()</tspan></text>
+ </g>
+ <g
+ id="g31383"
+ transform="translate(-44.050451,15.173444)">
+ <rect
+ style="fill:#cadae7;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2"
+ width="99.155495"
+ height="14.152369"
+ x="190.89127"
+ y="133.13176"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-140.10022"
+ y="240.37755"
+ id="text31115-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35"
+ style="stroke:none;stroke-width:0.75"
+ x="-140.10022"
+ y="240.37755">rte_ml_model_load()</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="112.15163"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-119.12009"
+ y="233.56647"
+ id="text31115-0-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8"
+ style="stroke:none;stroke-width:0.75"
+ x="-119.12009"
+ y="233.56647">rte_ml_dequeue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 114.90712,47.649005 h 56.16045"
+ id="path24248"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-start="#rect24176"
+ inkscape:connection-end="#rect24200" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 171.06762,70.71111 -56.1605,0.0024"
+ id="path24250"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="M 171.06765,93.773951 H 114.90712"
+ id="path24252"
+ inkscape:connector-type="orthogonal"
+ inkscape:connector-curvature="0"
+ inkscape:connection-end="#rect24176"
+ inkscape:connection-start="#rect24200-5-2" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44396,47.649004 h 36.42795"
+ id="path24566"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.444,70.710168 h 36.42791"
+ id="path24568"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:3, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-start:url(#TriangleStart);marker-end:url(#TriangleStart)"
+ d="m 215.44395,93.773951 36.42796,-10e-7"
+ id="path24570"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42675">
+ <g
+ id="g31358"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200"
+ width="44.376362"
+ height="17.244751"
+ x="190.77635"
+ y="22.794853"
+ ry="2.7391431"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.802492"
+ y="212.98004"
+ id="text31256"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254"
+ style="stroke-width:0.75"
+ x="-31.802492"
+ y="212.98004">Queue Pair 0</tspan></text>
+ </g>
+ <g
+ id="g31353"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623639;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24728, 0.623639;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5"
+ width="44.376362"
+ height="17.244749"
+ x="190.7764"
+ y="45.856018"
+ ry="2.7391429"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-54.863655"
+ y="213.10411"
+ id="text31256-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-9"
+ style="stroke-width:0.75"
+ x="-54.863655"
+ y="213.10411">Queue Pair ..</tspan></text>
+ </g>
+ <g
+ id="g31363"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#dcf4d3;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.623731;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.24746, 0.623731;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-5-2"
+ width="44.37627"
+ height="17.249832"
+ x="190.77643"
+ y="68.917259"
+ ry="2.7399504"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-77.927437"
+ y="213.08859"
+ id="text31256-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31254-8"
+ style="stroke-width:0.75"
+ x="-77.927437"
+ y="213.08859">Queue Pair N</tspan></text>
+ </g>
+ </g>
+ <g
+ id="g42661">
+ <g
+ id="g31368"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="25.995117"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-31.941525"
+ y="287.03415"
+ id="text31260"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258"
+ style="stroke-width:0.75"
+ x="-31.941525"
+ y="287.03415">Core 0</tspan></text>
+ </g>
+ <g
+ id="g31373"
+ transform="translate(-19.708778,16.231776)">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-4"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="49.056282"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-55.00008"
+ y="287.15549"
+ id="text31260-0"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-7"
+ style="stroke-width:0.75"
+ x="-55.00008"
+ y="287.15549">Core ..</tspan></text>
+ </g>
+ <g
+ id="g31378"
+ transform="translate(-19.708778,16.231776)"
+ inkscape:connector-avoid="true">
+ <rect
+ style="fill:#ffeeaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.08598;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24479-41"
+ width="30.914017"
+ height="10.84422"
+ x="271.58066"
+ y="72.120064"
+ ry="2.2564735" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:4, 2;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-78.063866"
+ y="287.13998"
+ id="text31260-5"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31258-8"
+ style="stroke-width:0.75"
+ x="-78.063866"
+ y="287.13998">Core N</tspan></text>
+ </g>
+ </g>
+ <rect
+ style="fill:#ffccaa;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.844503;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:1.689, 0.844503;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24200-2-5-6"
+ width="99.155495"
+ height="14.152369"
+ x="184.08008"
+ y="13.539296"
+ ry="2.2479515"
+ inkscape:connector-avoid="true" />
+ <text
+ xml:space="preserve"
+ style="font-size:6.35px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.750001;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:3, 1.5;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-20.507757"
+ y="233.56647"
+ id="text31115-0-5-7"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan31113-35-8-7"
+ style="stroke:none;stroke-width:0.75"
+ x="-20.507757"
+ y="233.56647">rte_ml_enqueue_burst()</tspan></text>
+ <path
+ style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.75;stroke-linecap:butt;stroke-linejoin:miter;stroke-dasharray:2.25, 0.75;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#RoundedArrow)"
+ d="M 233.65793,27.691665 V 112.15163"
+ id="path36804"
+ inkscape:connector-type="polyline"
+ inkscape:connector-curvature="0" />
+ <g
+ id="g42683">
+ <rect
+ style="fill:#44d7f4;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176"
+ width="89.036293"
+ height="63.036304"
+ x="25.870831"
+ y="39.197231"
+ ry="3.0941005" />
+ <text
+ xml:space="preserve"
+ style="font-size:11.2889px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-49.288273"
+ y="70.228432"
+ id="text38896"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan38894"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-49.288273"
+ y="70.228432">Machine</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-63.399399"
+ y="70.228432"
+ id="tspan38898">Learning</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-77.510529"
+ y="70.228432"
+ id="tspan38900">Inference</tspan><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:11.2889px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-91.621651"
+ y="70.228432"
+ id="tspan38902">Engine</tspan></text>
+ </g>
+ <g
+ id="g42621">
+ <rect
+ style="fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.405;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect24176-1"
+ width="88.595322"
+ height="134.59531"
+ x="26.09132"
+ y="148.58514"
+ ry="6.6065331" />
+ <g
+ id="g42601">
+ <g
+ id="g39966"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="146.14212"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-157.3761"
+ y="130.49591"
+ id="text39799"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-157.3761"
+ y="130.49591">Model 0</tspan></text>
+ </g>
+ <g
+ id="g39971"
+ transform="translate(-60.175151,10.144334)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-8"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="178.65079"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-189.88477"
+ y="130.49591"
+ id="text39799-8"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-1"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-189.88477"
+ y="130.49591">Model 1</tspan></text>
+ </g>
+ <g
+ id="g39976"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-9"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="211.15947"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-222.39345"
+ y="130.49591"
+ id="text39799-9"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-8"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-222.39345"
+ y="130.49591">Model ..</tspan></text>
+ </g>
+ <g
+ id="g39981"
+ transform="translate(-60.175145,10.144324)">
+ <rect
+ style="fill:#007cab;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.236524;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ id="rect38962-7"
+ width="48.620556"
+ height="21.483501"
+ x="106.25385"
+ y="243.66815"
+ ry="1.9712806" />
+ <text
+ xml:space="preserve"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:6.35px;font-family:Arial;-inkscape-font-specification:'Arial Bold';text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-254.90213"
+ y="130.49591"
+ id="text39799-90"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ id="tspan39797-5"
+ style="font-size:6.35px;fill:#000000;stroke-width:0.265"
+ x="-254.90213"
+ y="130.49591">Model N</tspan></text>
+ </g>
+ </g>
+ </g>
+ <text
+ xml:space="preserve"
+ style="font-size:14.1111px;font-family:Arial;-inkscape-font-specification:Arial;text-align:center;writing-mode:tb-rl;text-anchor:middle;fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:0.264999;stroke-linecap:round;stroke-linejoin:round;stroke-dasharray:none;stroke-dashoffset:0;paint-order:stroke fill markers"
+ x="-279.79742"
+ y="275.46826"
+ id="text38896-4"
+ transform="rotate(-90)"><tspan
+ sodipodi:role="line"
+ style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:14.1111px;font-family:Arial;-inkscape-font-specification:'Arial Bold';stroke-width:0.265"
+ x="-279.79742"
+ y="275.46826"
+ id="tspan38902-6">mldev</tspan></text>
+ </g>
+</svg>
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..d7f2a28bdb 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -30,6 +30,7 @@ Programmer's Guide
regexdev
dmadev
gpudev
+ mldev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
diff --git a/doc/guides/prog_guide/mldev.rst b/doc/guides/prog_guide/mldev.rst
new file mode 100644
index 0000000000..a0bd370e72
--- /dev/null
+++ b/doc/guides/prog_guide/mldev.rst
@@ -0,0 +1,186 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright (c) 2022 Marvell.
+
+Machine Learning Device Library
+===============================
+
+The MLDEV library provides a Machine Learning device framework for the management and
+provisioning of hardware and software ML poll mode drivers, defining APIs which
+support a number of ML operations including device handling and inference processing.
+The ML model creation and training are outside of the scope of this library.
+
+The ML framework is built on the following model:
+
+.. _figure_mldev_work_flow:
+
+.. figure:: img/mldev_flow.*
+
+ Work flow of inference on MLDEV
+
+**ML Device**: A hardware or software-based implementation of ML device API for running
+inferences using a pre-trained ML model.
+
+**ML Model**: An ML model is an algorithm trained over a dataset. A model consists of
+the procedure/algorithm and the data/pattern required to make predictions on live data.
+Once the model is created and trained outside of the DPDK scope, the model can be loaded
+via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+rte_ml_model_params_update() can be used to update model parameters such as weights
+and bias without unloading the model via rte_ml_model_unload().
+
+**ML Inference**: ML inference is the process of feeding data to the model via
+the rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the
+calculated outputs / predictions from the started model.
+
+Design Principles
+-----------------
+
+The MLDEV library follows the same basic principles as those used in DPDK's
+Ethernet Device framework and the Crypto framework. The MLDEV framework provides
+a generic Machine Learning device framework which supports both physical (hardware)
+and virtual (software) ML devices as well as an ML API to manage and configure ML
+devices. The API also supports performing ML inference operations through an ML poll
+mode driver.
+
+
+Device Operations
+-----------------
+
+Device Creation
+~~~~~~~~~~~~~~~
+
+Physical ML devices are discovered during PCI probe/enumeration, through the
+EAL functions which are executed at DPDK initialization, based on their PCI device
+identifier, each unique PCI BDF (bus/bridge, device, function). Physical ML devices,
+like other physical devices in DPDK, can be allowed or blocked
+using the EAL command line options.
+
+
+Device Identification
+~~~~~~~~~~~~~~~~~~~~~
+
+Each device, whether virtual or physical, is uniquely designated by two
+identifiers:
+
+- A unique device index used to designate the ML device in all functions
+ exported by the MLDEV API.
+
+- A device name used to designate the ML device in console messages, for
+ administration or debugging purposes.
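+
+As an illustration, an application might scan the probed devices as in the
+sketch below (densely numbered device indices are an assumption of the
+example, not a guarantee of the API):
+
+.. code-block:: c
+
+   int16_t dev_id;
+
+   /* Visit every valid ML device that has been probed. */
+   for (dev_id = 0; dev_id < (int16_t)rte_ml_dev_count(); dev_id++) {
+       if (!rte_ml_dev_is_valid_dev(dev_id))
+           continue;
+       /* dev_id can now be used in subsequent MLDEV calls. */
+   }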
+
+Device Features and Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ML devices may support different feature sets. The ``rte_ml_dev_info_get`` API
+returns the information of the device, including its supported features.
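+
+For example, the device limits can be queried before configuration (a minimal
+sketch; error handling is simplified):
+
+.. code-block:: c
+
+   struct rte_ml_dev_info dev_info;
+
+   /* Query driver name and limits such as max_models and max_queue_pairs. */
+   if (rte_ml_dev_info_get(dev_id, &dev_info) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to get ML device info\n");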
+
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~
+
+The configuration of each ML device includes the following operations:
+
+- Allocation of resources, including hardware resources if a physical device.
+- Resetting the device into a well-known default state.
+- Initialization of statistics counters.
+
+The ``rte_ml_dev_configure`` API is used to configure an ML device.
+
+.. code-block:: c
+
+ int rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *cfg);
+
+The ``rte_ml_dev_config`` structure is used to pass the configuration parameters
+for the ML device, for example the number of queue pairs, the maximum number of
+models, the maximum size of a model, and so on.
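+
+A hedged configuration sketch (the counts are illustrative and must respect
+the limits reported by ``rte_ml_dev_info_get``):
+
+.. code-block:: c
+
+   struct rte_ml_dev_config cfg = {
+       .socket_id = rte_ml_dev_socket_id(dev_id),
+       .nb_models = 1,       /* <= dev_info.max_models */
+       .nb_queue_pairs = 1,  /* <= dev_info.max_queue_pairs */
+   };
+
+   if (rte_ml_dev_configure(dev_id, &cfg) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to configure ML device\n");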
+
+Configuration of Queue Pairs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each ML device can be configured with a number of queue pairs.
+Each queue pair is configured using ``rte_ml_dev_queue_pair_setup``, as in the
+sketch below.
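+
+The descriptor count below is illustrative and is bounded by
+``rte_ml_dev_info::max_desc``:
+
+.. code-block:: c
+
+   struct rte_ml_dev_qp_conf qp_conf = {
+       .nb_desc = 128,  /* illustrative queue depth */
+       .cb = NULL,      /* no flush callback on device stop */
+   };
+
+   if (rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf, SOCKET_ID_ANY) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to set up queue pair 0\n");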
+
+Logical Cores, Memory and Queue Pair Relationships
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Multiple logical cores should never share the same queue pair for enqueuing
+or dequeuing operations on the same ML device, since this would
+require global locks and hinder performance.
+
+Configuration of Machine Learning models
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pre-trained ML models that are built using external ML compiler / training frameworks
+are used to perform inference operations. These models are configured on an ML device
+in a two-stage process that includes loading the model on an ML device, and starting
+the model to accept inference operations. Inference operations can be queued for a
+model only when the model is in the started state. The model load stage assigns a
+model ID, which is unique for the model in a driver's context. The model ID is used
+during all subsequent slow-path and fast-path operations.
+
+Model loading and starting are done through the ``rte_ml_model_load`` and
+``rte_ml_model_start`` functions.
+
+Similarly, stopping and unloading are done through the ``rte_ml_model_stop`` and
+``rte_ml_model_unload`` functions.
+
+The stop and unload functions release the resources allocated for the
+model. Inference tasks cannot be queued for a model that is stopped.
+
+Detailed information related to the model can be retrieved from the driver using the
+function ``rte_ml_model_info_get``. Model information is accessible to the application
+through the ``rte_ml_model_info`` structure. Information available to the user
+includes the details of the inputs and outputs, and the maximum batch size
+supported by the model.
+
+The user can optionally update the model parameters, such as weights and bias, without
+unloading the model, through the ``rte_ml_model_params_update`` function. A model must
+be in the stopped state to update its parameters, and must be restarted before inference
+requests can be enqueued again. The overall model lifecycle is sketched below.
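+
+In this sketch, ``model_buffer`` and ``model_size`` are assumptions of the
+example, holding a model image prepared outside of DPDK:
+
+.. code-block:: c
+
+   struct rte_ml_model_params params = {
+       .addr = model_buffer,
+       .size = model_size,
+   };
+   uint16_t model_id;
+
+   /* Load assigns a model ID; start makes the model ready for inference. */
+   if (rte_ml_model_load(dev_id, &params, &model_id) < 0 ||
+       rte_ml_model_start(dev_id, model_id) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to load/start model\n");
+
+   /* ... enqueue and dequeue inference operations ... */
+
+   rte_ml_model_stop(dev_id, model_id);
+   rte_ml_model_unload(dev_id, model_id);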
+
+Enqueue / Dequeue
+~~~~~~~~~~~~~~~~~
+
+The burst enqueue API uses a ML device identifier and a queue pair identifier
+to specify the device queue pair to schedule the processing on. The ``nb_ops``
+parameter is the number of operations to process which are supplied in the
+``ops`` array of ``rte_ml_op`` structures. The enqueue function returns the
+number of operations it enqueued for processing; a return value equal to
+``nb_ops`` means that all operations have been enqueued.
+
+The dequeue API uses the same format as the enqueue API, but
+the ``nb_ops`` and ``ops`` parameters are now used to specify the maximum number of
+processed operations the user wishes to retrieve and the location in which to store them.
+The API call returns the actual number of processed operations returned; this
+can never be larger than ``nb_ops``.
+
+``rte_ml_op`` provides the required information to the driver to queue an ML inference
+task. An ML op specifies the model to be used and the number of batches to be executed in
+the inference task. Input and output buffer information is specified through the
+structure ``rte_ml_buff_seg``, which supports segmented data. Input is provided through
+``rte_ml_op::input`` and output through ``rte_ml_op::output``. The data pointed to by
+each op must not be released until that op has been dequeued.
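+
+A polling sketch of the fast path (queue pair ``0``, the burst size of 32 and
+the pool sizes are assumptions of the example; filling in each op is
+abbreviated):
+
+.. code-block:: c
+
+   struct rte_mempool *op_pool;
+   struct rte_ml_op *ops[32];
+   uint16_t nb_enq, nb_deq = 0;
+
+   /* Pool the ops are drawn from; element and cache counts are illustrative. */
+   op_pool = rte_ml_op_pool_create("ml_op_pool", 1024, 64, 0, rte_socket_id());
+
+   /* ops[] are allocated from op_pool and populated with model_id,
+    * nb_batches and input/output rte_ml_buff_seg descriptors (elided).
+    */
+   nb_enq = rte_ml_enqueue_burst(dev_id, 0, ops, 32);
+
+   while (nb_deq < nb_enq)
+       nb_deq += rte_ml_dequeue_burst(dev_id, 0, &ops[nb_deq],
+                                      nb_enq - nb_deq);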
+
+
+Quantize and Dequantize
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Performing inference operations with lower precision types improves the throughput
+and efficiency of the inference execution, with a minimal loss of accuracy that is within
+the tolerance limits. Quantization and dequantization are the processes of converting data
+from a higher precision type to a lower precision type and vice-versa. The ML library
+provides the functions ``rte_ml_io_quantize`` and ``rte_ml_io_dequantize`` to enable these
+data type conversions. The user needs to provide the addresses of the quantized and
+dequantized data buffers to the functions, along with the number of batches in the buffers.
+
+For quantization, the dequantized data is assumed to be of the type ``dtype`` provided by
+the ``rte_ml_model_info::input`` and the data is converted to ``qtype`` provided by the
+``rte_ml_model_info::input``.
+
+For dequantization, the quantized data is assumed to be of the type ``qtype`` provided by
+the ``rte_ml_model_info::output`` and the data is converted to ``dtype`` provided by the
+``rte_ml_model_info::output``.
+
+The size of the buffers required for the input and output can be calculated using the
+functions ``rte_ml_io_input_size_get`` and ``rte_ml_io_output_size_get``. These functions
+return the buffer sizes for both quantized and dequantized data for the given number of
+batches, as in the sketch below.
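+
+Here, one batch is assumed and allocation of ``dbuf`` and ``qbuf`` is elided:
+
+.. code-block:: c
+
+   uint64_t qsize, dsize;
+
+   /* Query the quantized/dequantized input buffer sizes for one batch. */
+   if (rte_ml_io_input_size_get(dev_id, model_id, 1, &qsize, &dsize) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to get input sizes\n");
+
+   /* dbuf holds dsize bytes of dequantized input; qbuf receives qsize
+    * bytes of quantized input suitable for the device.
+    */
+   if (rte_ml_io_quantize(dev_id, model_id, 1, dbuf, qbuf) < 0)
+       rte_exit(EXIT_FAILURE, "Failed to quantize input\n");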
+
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index 1fa101c420..f23b58f416 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -87,6 +87,11 @@ New Features
``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
required for eth_rx, eth_tx, crypto and timer eventdev adapters.
+* **Added machine learning inference device library.**
+
+ * Added a machine learning inference device framework for the management and
+ provisioning of hardware and software machine learning inference devices.
+
Removed Items
-------------
diff --git a/lib/meson.build b/lib/meson.build
index a90fee31b7..ad91819375 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -63,6 +63,7 @@ libraries = [
'flow_classify', # flow_classify lib depends on pkt framework table lib
'graph',
'node',
+ 'mldev',
]
optional_libs = [
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
new file mode 100644
index 0000000000..e378cfca30
--- /dev/null
+++ b/lib/mldev/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2022 Marvell.
+
+sources = files(
+ 'rte_mldev.c',
+)
+
+headers = files(
+ 'rte_mldev.h',
+)
+
+deps += ['mempool']
+
+if get_option('buildtype').contains('debug')
+ cflags += [ '-DRTE_LIBRTE_ML_DEV_DEBUG' ]
+else
+ cflags += [ '-URTE_LIBRTE_ML_DEV_DEBUG' ]
+endif
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
new file mode 100644
index 0000000000..70aad4c44b
--- /dev/null
+++ b/lib/mldev/rte_mldev.c
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <rte_log.h>
+#include <rte_mldev.h>
+
+RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
new file mode 100644
index 0000000000..b22234730d
--- /dev/null
+++ b/lib/mldev/rte_mldev.h
@@ -0,0 +1,1119 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef RTE_MLDEV_H
+#define RTE_MLDEV_H
+
+/**
+ * @file rte_mldev.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * ML (Machine Learning) device API.
+ *
+ * The ML framework is built on the following model:
+ *
+ *
+ * +-----------------+ rte_ml_[en|de]queue_burst()
+ * | | |
+ * | Machine o------+ +--------+ |
+ * | Learning | | | queue | | +------+
+ * | Inference o------+-----o |<===o===>|Core 0|
+ * | Engine | | | pair 0 | +------+
+ * | o----+ | +--------+
+ * | | | |
+ * +-----------------+ | | +--------+
+ * ^ | | | queue | +------+
+ * | | +-----o |<=======>|Core 1|
+ * | | | pair 1 | +------+
+ * | | +--------+
+ * +--------+--------+ |
+ * | +-------------+ | | +--------+
+ * | | Model 0 | | | | queue | +------+
+ * | +-------------+ | +-------o |<=======>|Core N|
+ * | +-------------+ | | pair N | +------+
+ * | | Model 1 | | +--------+
+ * | +-------------+ |
+ * | +-------------+ |<------> rte_ml_model_load()
+ * | | Model .. | |-------> rte_ml_model_info_get()
+ * | +-------------+ |<------- rte_ml_model_start()
+ * | +-------------+ |<------- rte_ml_model_stop()
+ * | | Model N | |<------- rte_ml_model_params_update()
+ * | +-------------+ |<------- rte_ml_model_unload()
+ * +-----------------+
+ *
+ * ML Device: A hardware or software-based implementation of ML device API for
+ * running inferences using a pre-trained ML model.
+ *
+ * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
+ * the procedure/algorithm and the data/pattern required to make predictions on live data.
+ * Once the model is created and trained outside of the DPDK scope, the model can be loaded
+ * via rte_ml_model_load() and then started using the rte_ml_model_start() API.
+ * rte_ml_model_params_update() can be used to update model parameters such as weights
+ * and bias without unloading the model via rte_ml_model_unload().
+ *
+ * ML Inference: ML inference is the process of feeding data to the model via
+ * rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the
+ * calculated outputs/predictions from the started model.
+ *
+ * In all functions of the ML device API, the ML device is designated by an
+ * integer >= 0 named as device identifier *dev_id*.
+ *
+ * The functions exported by the ML device API to setup a device designated by
+ * its device identifier must be invoked in the following order:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_dev_start()
+ *
+ * A model is required to run the inference operations with the user specified inputs.
+ * The application needs to invoke the ML model API in the following order before queueing
+ * inference jobs:
+ *
+ * - rte_ml_model_load()
+ * - rte_ml_model_start()
+ *
+ * A model can be loaded on a device only after the device has been configured and can be
+ * started or stopped only after a device has been started.
+ *
+ * The rte_ml_model_info_get() API is provided to retrieve the information related to the model.
+ * The information includes the shape and type of input and output required for the inference.
+ *
+ * Data quantization and dequantization is one of the main aspects of the ML domain. This involves
+ * conversion of input data from a higher precision to a lower precision data type and vice-versa
+ * for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
+ * dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
+ * and output buffers holding data for multiple batches.
+ *
+ * Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used
+ * to get the size of quantized and de-quantized multi-batch input and output buffers.
+ *
+ * User can optionally update the model parameters with rte_ml_model_params_update() after
+ * invoking rte_ml_model_stop() API on a given model ID.
+ *
+ * The application can invoke, in any order, the functions exported by the ML API to enqueue
+ * inference jobs and dequeue inference responses.
+ *
+ * If the application wants to change the device configuration (i.e., call
+ * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop
+ * the device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be
+ * updated, then the application must call rte_ml_model_stop() followed by the
+ * rte_ml_model_params_update() API for the given model. The application does not need to
+ * call rte_ml_dev_stop() for model re-configuration such as rte_ml_model_params_update(),
+ * rte_ml_model_unload() etc.
+ *
+ * Once the device is in the started state after invoking rte_ml_dev_start() and the model is
+ * in the started state after invoking rte_ml_model_start(), the application can call
+ * rte_ml_enqueue_burst() and rte_ml_dequeue_burst() on the destined device and model ID.
+ *
+ * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
+ *
+ * A typical application utilisation of the ML API will follow this
+ * programming flow:
+ *
+ * - rte_ml_dev_configure()
+ * - rte_ml_dev_queue_pair_setup()
+ * - rte_ml_model_load()
+ * - rte_ml_dev_start()
+ * - rte_ml_model_start()
+ * - rte_ml_model_info_get()
+ * - rte_ml_enqueue_burst()
+ * - rte_ml_dequeue_burst()
+ * - rte_ml_model_stop()
+ * - rte_ml_model_unload()
+ * - rte_ml_dev_stop()
+ * - rte_ml_dev_close()
+ *
+ * Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on different logical
+ * cores on the same target object. For instance, the dequeue function of a poll mode driver
+ * cannot be invoked in parallel on two logical cores to operate on the same queue pair. Of
+ * course, this function can be invoked in parallel by different logical cores on different
+ * queue pairs. It is the responsibility of the user application to enforce this rule.
+ */
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_mempool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Logging Macro */
+extern int rte_ml_dev_logtype;
+
+#define RTE_MLDEV_LOG(level, fmt, args...) \
+ rte_log(RTE_LOG_##level, rte_ml_dev_logtype, "%s(): " fmt "\n", __func__, ##args)
+
+#define RTE_ML_STR_MAX 128
+/**< Maximum length of name string */
+
+#define RTE_MLDEV_DEFAULT_MAX 32
+/** Maximum number of devices if rte_ml_dev_init() is not called. */
+
+/* Device operations */
+
+/**
+ * Initialize the device array before probing devices. If not called, the first device probed would
+ * initialize the array to a size of RTE_MLDEV_DEFAULT_MAX.
+ *
+ * @param dev_max
+ * Maximum number of devices.
+ *
+ * @return
+ * 0 on success, -rte_errno otherwise:
+ * - ENOMEM if out of memory
+ * - EINVAL if 0 size
+ * - EBUSY if already initialized
+ */
+__rte_experimental
+int
+rte_ml_dev_init(size_t dev_max);
+
+/**
+ * Get the total number of ML devices that have been successfully initialised.
+ *
+ * @return
+ * - The total number of usable ML devices.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dev_count(void);
+
+/**
+ * Check if the device is in ready state.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 if the device is not in the ready state.
+ * - 1 if the device is in the ready state.
+ */
+__rte_experimental
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id);
+
+/**
+ * Return the NUMA socket to which a device is connected.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - The NUMA socket id to which the device is connected
+ * - 0 if the socket could not be determined.
+ * - -EINVAL: if the dev_id value is not valid.
+ */
+__rte_experimental
+int
+rte_ml_dev_socket_id(int16_t dev_id);
+
+/** ML device information */
+struct rte_ml_dev_info {
+ const char *driver_name;
+ /**< Driver name */
+ uint16_t max_models;
+ /**< Maximum number of models supported by the device.
+ * @see struct rte_ml_dev_config::nb_models
+ */
+ uint16_t max_queue_pairs;
+ /**< Maximum number of queue pairs supported by the device.
+ * @see struct rte_ml_dev_config::nb_queue_pairs
+ */
+ uint16_t max_desc;
+ /**< Maximum allowed number of descriptors for queue pair by the device.
+ * @see struct rte_ml_dev_qp_conf::nb_desc
+ */
+ uint16_t max_segments;
+ /**< Maximum number of scatter-gather entries supported by the device.
+ * @see struct rte_ml_buff_seg struct rte_ml_buff_seg::next
+ */
+ uint16_t min_align_size;
+ /**< Minimum alignment size of IO buffers used by the device. */
+};
+
+/**
+ * Retrieve the information of the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param dev_info
+ * A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
+ *
+ * @return
+ * - 0: Success, driver updates the information of the ML device
+ * - < 0: Error code returned by the driver info get function.
+ */
+__rte_experimental
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);
+
+/** ML device configuration structure */
+struct rte_ml_dev_config {
+ int socket_id;
+ /**< Socket to allocate resources on. */
+ uint16_t nb_models;
+ /**< Number of models to be loaded on the device.
+ * This value cannot exceed the max_models which is previously provided in
+ * struct rte_ml_dev_info::max_models
+ */
+ uint16_t nb_queue_pairs;
+ /**< Number of queue pairs to configure on this device.
+ * This value cannot exceed the max_queue_pairs which is previously provided in
+ * struct rte_ml_dev_info::max_queue_pairs
+ */
+};
+
+/**
+ * Configure an ML device.
+ *
+ * This function must be invoked first before any other function in the API.
+ *
+ * ML Device can be re-configured, when in a stopped state. Device cannot be re-configured after
+ * rte_ml_dev_close() is called.
+ *
+ * The caller may use rte_ml_dev_info_get() to get the capability of each resource available for
+ * this ML device.
+ *
+ * @param dev_id
+ * The identifier of the device to configure.
+ * @param config
+ * The ML device configuration structure.
+ *
+ * @return
+ * - 0: Success, device configured.
+ * - < 0: Error code returned by the driver configuration function.
+ */
+__rte_experimental
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
+
+/* Forward declaration */
+struct rte_ml_op;
+
+/**< Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
+typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);
+
+/** ML device queue pair configuration structure. */
+struct rte_ml_dev_qp_conf {
+ uint32_t nb_desc;
+ /**< Number of descriptors per queue pair.
+ * This value cannot exceed the max_desc which is previously provided in
+ * struct rte_ml_dev_info::max_desc
+ */
+ rte_ml_dev_stop_flush_t cb;
+ /**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
+ * Value NULL is allowed, in which case callback will not be invoked.
+ * This function can be used to properly dispose of outstanding ML ops from all
+ * queue pairs, for example ops containing memory pointers.
+ * @see rte_ml_dev_stop()
+ */
+};
+
+/**
+ * Set up a queue pair for a device. This should only be called when the device is stopped.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param queue_pair_id
+ * The index of the queue pair to set up. The value must be in the range [0, nb_queue_pairs - 1]
+ * previously supplied to rte_ml_dev_configure().
+ * @param qp_conf
+ * The pointer to the configuration data to be used for the queue pair.
+ * @param socket_id
+ * The *socket_id* argument is the socket identifier in case of NUMA.
+ * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
+ * for the queue pair.
+ *
+ * @return
+ * - 0: Success, queue pair correctly set up.
+ * - < 0: Queue pair configuration failed.
+ */
+__rte_experimental
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);
+
+/**
+ * Start an ML device.
+ *
+ * The device start step consists of setting the configured features and enabling the ML device
+ * to accept inference jobs.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device started.
+ * - <0: Error code of the driver device start function.
+ */
+__rte_experimental
+int
+rte_ml_dev_start(int16_t dev_id);
+
+/**
+ * Stop an ML device. A stopped device cannot accept inference jobs.
+ * The device can be restarted with a call to rte_ml_dev_start().
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0: Success, device stopped.
+ * - <0: Error code of the driver device stop function.
+ */
+__rte_experimental
+int
+rte_ml_dev_stop(int16_t dev_id);
+
+/**
+ * Close an ML device. The device cannot be restarted!
+ *
+ * @param dev_id
+ * The identifier of the device.
+ *
+ * @return
+ * - 0 on successfully closing device.
+ * - <0 on failure to close device.
+ */
+__rte_experimental
+int
+rte_ml_dev_close(int16_t dev_id);
+
+/** Status of ML operation */
+enum rte_ml_op_status {
+ RTE_ML_OP_STATUS_SUCCESS = 0,
+ /**< Operation completed successfully */
+ RTE_ML_OP_STATUS_NOT_PROCESSED,
+ /**< Operation has not yet been processed by the device. */
+ RTE_ML_OP_STATUS_ERROR,
+ /**< Operation completed with error.
+ * Application can invoke rte_ml_op_error_get() to get PMD specific
+ * error code if needed.
+ */
+};
+
+/** ML operation's input and output buffer representation as scatter gather list
+ */
+struct rte_ml_buff_seg {
+ rte_iova_t iova_addr;
+ /**< IOVA address of segment buffer. */
+ void *addr;
+ /**< Virtual address of segment buffer. */
+ uint32_t length;
+ /**< Segment length. */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_ml_buff_seg *next;
+ /**< Points to next segment. Value NULL represents the last segment. */
+};
+
+/**
+ * ML Operation.
+ *
+ * This structure contains data related to performing an ML operation on the buffers using
+ * the model specified through model_id.
+ */
+struct rte_ml_op {
+ uint16_t model_id;
+ /**< Model ID to be used for the operation. */
+ uint16_t nb_batches;
+ /**< Number of batches. Minimum value must be one.
+ * Input buffer must hold inference data for each batch as contiguous.
+ */
+ uint32_t reserved;
+ /**< Reserved for future use. */
+ struct rte_mempool *mempool;
+ /**< Pool from which operation is allocated. */
+ struct rte_ml_buff_seg input;
+ /**< Input buffer to hold the inference data. */
+ struct rte_ml_buff_seg output;
+ /**< Output buffer to hold the inference output by the driver. */
+ RTE_STD_C11
+ union {
+ uint64_t user_u64;
+ /**< User data as uint64_t.*/
+ void *user_ptr;
+ /**< User data as void*.*/
+ };
+ enum rte_ml_op_status status;
+ /**< Operation status. */
+ uint64_t impl_opaque;
+ /**< Implementation specific opaque value.
+ * An implementation may use this field to hold
+ * implementation specific value to share between
+ * dequeue and enqueue operation.
+ * The application should not modify this field.
+ */
+} __rte_cache_aligned;
+
+/* Enqueue/Dequeue operations */
+
+/**
+ * Enqueue a burst of ML inferences for processing on an ML device.
+ *
+ * The rte_ml_enqueue_burst() function is invoked to place ML inference
+ * operations on the queue *qp_id* of the device designated by its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of inferences to process which are
+ * supplied in the *ops* array of *rte_ml_op* structures.
+ *
+ * The rte_ml_enqueue_burst() function returns the number of inferences it
+ * actually enqueued for processing. A return value equal to *nb_ops* means that
+ * all operations have been enqueued.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair on which inferences are to be enqueued for processing.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * *rte_ml_dev_configure*.
+ * @param ops
+ * The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
+ * ML inferences to be processed.
+ * @param nb_ops
+ * The number of operations to process.
+ *
+ * @return
+ * The number of inference operations actually enqueued to the ML device.
+ * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
+ * is full or if invalid parameters are specified in a *rte_ml_op*.
+ */
+__rte_experimental
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Dequeue a burst of processed ML inferences operations from a queue on the ML device.
+ * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
+ * in the *ops* array.
+ *
+ * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
+ * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops* operations,
+ * and this is likely to signify that other processed operations remain in the device's output queue.
+ * Applications implementing a "retrieve as many processed operations as possible" policy can check
+ * this specific case and keep invoking the rte_ml_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
+ * the corresponding overhead.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param qp_id
+ * The index of the queue pair from which to retrieve processed packets.
+ * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
+ * rte_ml_dev_configure().
+ * @param ops
+ * The address of an array of pointers to *rte_ml_op* structures that must be large enough to
+ * store *nb_ops* pointers in it.
+ * @param nb_ops
+ * The maximum number of inferences to dequeue.
+ *
+ * @return
+ * The number of operations actually dequeued, which is the number of pointers
+ * to *rte_ml_op* structures effectively supplied to the *ops* array.
+ */
+__rte_experimental
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
+
+/**
+ * Verbose error structure definition.
+ */
+struct rte_ml_op_error {
+ char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
+ uint64_t errcode; /**< Vendor specific error code. */
+};
+
+/**
+ * Get PMD specific error information for an ML op.
+ *
+ * When an ML operation has completed with status RTE_ML_OP_STATUS_ERROR,
+ * this API allows retrieving PMD-specific error details.
+ *
+ * @param[in] dev_id
+ * Device identifier
+ * @param[in] op
+ * Handle of ML operation
+ * @param[in] error
+ * Address of structure rte_ml_op_error to be filled
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
+
+/* Statistics operations */
+
+/** Device statistics. */
+struct rte_ml_dev_stats {
+ uint64_t enqueued_count;
+ /**< Count of all operations enqueued */
+ uint64_t dequeued_count;
+ /**< Count of all operations dequeued */
+ uint64_t enqueue_err_count;
+ /**< Total error count on operations enqueued */
+ uint64_t dequeue_err_count;
+ /**< Total error count on operations dequeued */
+};
+
+/**
+ * Retrieve the general I/O statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stats
+ * Pointer to structure to where statistics will be copied.
+ * On error, this location may or may not have been modified.
+ * @return
+ * - 0 on success
+ * - -EINVAL: If invalid parameter pointer is provided.
+ */
+__rte_experimental
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
+
+/**
+ * Reset the statistics of a device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ */
+__rte_experimental
+void
+rte_ml_dev_stats_reset(int16_t dev_id);
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers for extended ML device statistics.
+ */
+struct rte_ml_dev_xstats_map {
+ uint16_t id;
+ /**< xstat identifier */
+ char name[RTE_ML_STR_MAX];
+ /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param[out] xstats_map
+ * Block of memory to insert ids and names into. Must have a capacity of at least *size*.
+ * If set to NULL, the function returns the required capacity.
+ * @param size
+ * Capacity of xstats_map (number of name-id maps).
+ *
+ * @return
+ * - Positive value on success:
+ * - The return value is the number of entries filled in the stats map.
+ * - If xstats_map is set to NULL, then the required capacity for xstats_map.
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map,
+ uint32_t size);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param name
+ * The stat name to retrieve.
+ * @param stat_id
+ * If non-NULL, the numerical id of the stat will be returned, so that further requests for
+ * the stat can be made using rte_ml_dev_xstats_get, which will be faster as it doesn't need to
+ * scan a list of names for the stat.
+ * @param[out] value
+ * Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ * - 0: Successfully retrieved xstat value.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
+
+/**
+ * Retrieve extended statistics of an ML device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * The id numbers of the stats to get. The ids can be fetched from the stat position in the
+ * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
+ * @param values
+ * The values for each stat requested by ID.
+ * @param nb_ids
+ * The number of stats requested.
+ * @return
+ * - Positive value: number of stat entries filled into the values array
+ * - Negative value on error:
+ * - -ENODEV: for invalid *dev_id*.
+ * - -ENOTSUP: if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param stat_ids
+ * Selects specific statistics to be reset. When NULL, all statistics will be reset.
+ * If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ * The number of ids available from the *stat_ids* array. Ignored when *stat_ids* is NULL.
+ * @return
+ * - 0: Successfully reset the statistics to zero.
+ * - -EINVAL: invalid parameters.
+ * - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
+
+/* Utility operations */
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *fd*.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @param fd
+ * A pointer to a file for output.
+ * @return
+ * - 0: on success.
+ * - <0: on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd);
+
+/**
+ * Trigger the ML device self test.
+ *
+ * @param dev_id
+ * The identifier of the device.
+ * @return
+ * - 0: Selftest successful.
+ * - -ENOTSUP: if the device doesn't support selftest.
+ * - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_ml_dev_selftest(int16_t dev_id);
+
+/* Model operations */
+
+/** ML model load parameters
+ *
+ * Parameters required to load an ML model.
+ */
+struct rte_ml_model_params {
+ void *addr;
+ /**< Address of model buffer */
+ size_t size;
+ /**< Size of model buffer */
+};
+
+/**
+ * Load an ML model to the device.
+ *
+ * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] params
+ * Parameters for the model to be loaded.
+ * @param[out] model_id
+ * Identifier of the model loaded.
+ *
+ * @return
+ * - 0: Success, Model loaded.
+ * - < 0: Failure, Error code of the model load driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id);
+
+/**
+ * Unload an ML model from the device.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be unloaded.
+ *
+ * @return
+ * - 0: Success, Model unloaded.
+ * - < 0: Failure, Error code of the model unload driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_unload(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Start an ML model for the given device ID.
+ *
+ * Start an ML model to accept inference requests.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be started.
+ *
+ * @return
+ * - 0: Success, Model started.
+ * - < 0: Failure, Error code of the model start driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_start(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Stop an ML model for the given device ID.
+ *
+ * Model stop would disable the ML model to be used for inference jobs.
+ * All inference jobs must have been completed before model stop is attempted.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier of the model to be stopped.
+ *
+ * @return
+ * - 0: Success, Model stopped.
+ * - < 0: Failure, Error code of the model stop driver function.
+ */
+__rte_experimental
+int
+rte_ml_model_stop(int16_t dev_id, uint16_t model_id);
+
+/**
+ * Input and output data types. ML models can operate on reduced precision
+ * datatypes to achieve better power efficiency, lower network latency and lower memory footprint.
+ * This enum is used to represent the lower precision integer and floating point types used
+ * by ML models.
+ */
+enum rte_ml_io_type {
+ RTE_ML_IO_TYPE_UNKNOWN = 0,
+ /**< Invalid or unknown type */
+ RTE_ML_IO_TYPE_INT8,
+ /**< 8-bit integer */
+ RTE_ML_IO_TYPE_UINT8,
+ /**< 8-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT16,
+ /**< 16-bit integer */
+ RTE_ML_IO_TYPE_UINT16,
+ /**< 16-bit unsigned integer */
+ RTE_ML_IO_TYPE_INT32,
+ /**< 32-bit integer */
+ RTE_ML_IO_TYPE_UINT32,
+ /**< 32-bit unsigned integer */
+ RTE_ML_IO_TYPE_FP8,
+ /**< 8-bit floating point number */
+ RTE_ML_IO_TYPE_FP16,
+ /**< IEEE 754 16-bit floating point number */
+ RTE_ML_IO_TYPE_FP32,
+ /**< IEEE 754 32-bit floating point number */
+ RTE_ML_IO_TYPE_BFLOAT16
+ /**< 16-bit brain floating point number. */
+};
+
+/**
+ * Input and output format. This is used to represent the encoding type of the
+ * multi-dimensional data used by ML models.
+ */
+enum rte_ml_io_format {
+ RTE_ML_IO_FORMAT_NCHW = 1,
+ /**< Batch size (N) x channels (C) x height (H) x width (W) */
+ RTE_ML_IO_FORMAT_NHWC,
+ /**< Batch size (N) x height (H) x width (W) x channels (C) */
+ RTE_ML_IO_FORMAT_CHWN,
+ /**< Channels (C) x height (H) x width (W) x batch size (N) */
+ RTE_ML_IO_FORMAT_3D,
+ /**< Format to represent 3 dimensional data */
+ RTE_ML_IO_FORMAT_2D,
+ /**< Format to represent matrix data */
+ RTE_ML_IO_FORMAT_1D,
+ /**< Format to represent vector data */
+ RTE_ML_IO_FORMAT_SCALAR,
+ /**< Format to represent scalar data */
+};
+
+/**
+ * Input and output shape. This structure represents the encoding format and dimensions
+ * of the tensor or vector.
+ *
+ * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions
+ * used for the data depends on the format. Unused dimensions are to be set to 1.
+ */
+struct rte_ml_io_shape {
+ enum rte_ml_io_format format;
+ /**< Format of the data */
+ uint32_t w;
+ /**< First dimension */
+ uint32_t x;
+ /**< Second dimension */
+ uint32_t y;
+ /**< Third dimension */
+ uint32_t z;
+ /**< Fourth dimension */
+};
+
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
+
+/** Model information structure */
+struct rte_ml_model_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Model name. */
+ char version[RTE_ML_STR_MAX];
+ /**< Model version */
+ uint16_t model_id;
+ /**< Model ID */
+ uint16_t device_id;
+ /**< Device ID */
+ uint16_t batch_size;
+ /**< Maximum number of batches that the model can process simultaneously */
+ uint32_t nb_inputs;
+ /**< Number of inputs */
+ const struct rte_ml_io_info *input_info;
+ /**< Input info array. Array size is equal to nb_inputs */
+ uint32_t nb_outputs;
+ /**< Number of outputs */
+ const struct rte_ml_io_info *output_info;
+ /**< Output info array. Array size is equal to nb_outputs */
+ uint64_t wb_size;
+ /**< Size of model weights and bias */
+};
+
+/**
+ * Get ML model information.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[out] model_info
+ * Pointer to a model info structure
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info);
+
+/**
+ * Update the model parameters without unloading model.
+ *
+ * Update model parameters such as weights and bias without unloading the model.
+ * rte_ml_model_stop() must be called before invoking this API.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] buffer
+ * Pointer to the model weights and bias buffer.
+ * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer);
+
+/* IO operations */
+
+/**
+ * Get size of quantized and dequantized input buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized input data.
+ * This API would return the buffer sizes for the number of batches provided and would
+ * consider the alignment requirements as per the PMD. Input sizes computed by this API can
+ * be used by the application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] input_qsize
+ * Quantized input size pointer.
+ * NULL value is allowed, in which case input_qsize is not calculated by the driver.
+ * @param[out] input_dsize
+ * Dequantized input size pointer.
+ * NULL value is allowed, in which case input_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize);
+
+/**
+ * Get size of quantized and dequantized output buffers.
+ *
+ * Calculate the size of buffers required for quantized and dequantized output data.
+ * This API would return the buffer sizes for the number of batches provided and would consider
+ * the alignment requirements as per the PMD. Output sizes computed by this API can be used by the
+ * application to allocate buffers.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model created
+ * @param[in] nb_batches
+ * Number of batches of input to be processed in a single inference job
+ * @param[out] output_qsize
+ * Quantized output size pointer.
+ * NULL value is allowed, in which case output_qsize is not calculated by the driver.
+ * @param[out] output_dsize
+ * Dequantized output size pointer.
+ * NULL value is allowed, in which case output_dsize is not calculated by the driver.
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize);
+
+/**
+ * Quantize input data.
+ *
+ * Quantization converts data from a higher precision type to a lower precision type to improve
+ * the throughput and efficiency of the model execution with minimal loss of accuracy.
+ * Types of dequantized data and quantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized input buffer
+ * @param[in] dbuffer
+ * Address of dequantized input data
+ * @param[in] qbuffer
+ * Address of quantized input data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer);
+
+/**
+ * Dequantize output data.
+ *
+ * Dequantization converts data from a lower precision type to a higher precision type.
+ * The types of the quantized and dequantized data are specified by the model.
+ *
+ * @param[in] dev_id
+ * The identifier of the device.
+ * @param[in] model_id
+ * Identifier for the model
+ * @param[in] nb_batches
+ * Number of batches in the dequantized output buffer
+ * @param[in] qbuffer
+ * Address of quantized output data
+ * @param[in] dbuffer
+ * Address of dequantized output data
+ *
+ * @return
+ * - Returns 0 on success
+ * - Returns negative value on failure
+ */
+__rte_experimental
+int
+rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer);
+
+/* ML op pool operations */
+
+/**
+ * Create an ML operation pool
+ *
+ * @param name
+ * ML operations pool name
+ * @param nb_elts
+ * Number of elements in pool
+ * @param cache_size
+ * Number of elements to cache on lcore, see
+ * *rte_mempool_create* for further details about cache size
+ * @param user_size
+ * Size of user private data to allocate with each operation
+ * @param socket_id
+ * Socket identifier to allocate memory on
+ * @return
+ * - On success pointer to mempool
+ * - On failure NULL
+ */
+__rte_experimental
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id);
+
+/**
+ * Free an ML operation pool
+ *
+ * @param mempool
+ * A pointer to the mempool structure.
+ * If NULL then, the function does nothing.
+ */
+__rte_experimental
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MLDEV_H */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
new file mode 100644
index 0000000000..3793380442
--- /dev/null
+++ b/lib/mldev/version.map
@@ -0,0 +1,7 @@
+EXPERIMENTAL {
+ global:
+
+ rte_ml_dev_logtype;
+
+ local: *;
+};
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
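For illustration, a minimal sketch of the parameter-update sequence implied by the APIs above: stop the model, replace the weights and bias, then restart it. The helper name is hypothetical, and the buffer is assumed to hold wb_size bytes as reported in *rte_ml_model_info*:

#include <rte_mldev.h>

/* Hypothetical helper; dev_id and model_id come from earlier probe/load. */
static int
update_weights(int16_t dev_id, uint16_t model_id, void *new_wb)
{
    int ret;

    /* The model must be stopped before its parameters can be updated. */
    ret = rte_ml_model_stop(dev_id, model_id);
    if (ret != 0)
        return ret;

    /* new_wb must hold wb_size bytes, per *rte_ml_model_info*. */
    ret = rte_ml_model_params_update(dev_id, model_id, new_wb);
    if (ret != 0)
        return ret;

    return rte_ml_model_start(dev_id, model_id);
}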
* [dpdk-dev] [PATCH v3 02/12] mldev: support PMD functions for ML device
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 01/12] " jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 03/12] mldev: support ML device handling functions jerinj
` (11 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi, Anatoly Burakov
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added PMD functions to handle ML devices. The rte_mldev_pmd.*
files are for drivers only, should be private to DPDK, and
are not installed for application use. Also added the
implementation of rte_ml_dev_init.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/meson.build | 9 ++
lib/mldev/rte_mldev.c | 172 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 108 +++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.c | 62 +++++++++++++
lib/mldev/rte_mldev_pmd.h | 149 ++++++++++++++++++++++++++++++++
lib/mldev/version.map | 12 +++
6 files changed, 512 insertions(+)
create mode 100644 lib/mldev/rte_mldev_core.h
create mode 100644 lib/mldev/rte_mldev_pmd.c
create mode 100644 lib/mldev/rte_mldev_pmd.h
diff --git a/lib/mldev/meson.build b/lib/mldev/meson.build
index e378cfca30..5c99532c1a 100644
--- a/lib/mldev/meson.build
+++ b/lib/mldev/meson.build
@@ -2,6 +2,7 @@
# Copyright (c) 2022 Marvell.
sources = files(
+ 'rte_mldev_pmd.c',
'rte_mldev.c',
)
@@ -9,6 +10,14 @@ headers = files(
'rte_mldev.h',
)
+indirect_headers += files(
+ 'rte_mldev_core.h',
+)
+
+driver_sdk_headers += files(
+ 'rte_mldev_pmd.h',
+)
+
deps += ['mempool']
if get_option('buildtype').contains('debug')
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 70aad4c44b..833afcbf87 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -2,7 +2,179 @@
* Copyright (c) 2022 Marvell.
*/
+#include <rte_errno.h>
#include <rte_log.h>
#include <rte_mldev.h>
+#include <rte_mldev_pmd.h>
+
+#include <stdlib.h>
+
+static struct rte_ml_dev_global ml_dev_globals = {
+ .devs = NULL, .data = NULL, .nb_devs = 0, .max_devs = RTE_MLDEV_DEFAULT_MAX};
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id)
+{
+ return &ml_dev_globals.devs[dev_id];
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name)
+{
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ if (name == NULL)
+ return NULL;
+
+ for (dev_id = 0; dev_id < ml_dev_globals.max_devs; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if ((dev->attached == ML_DEV_ATTACHED) && (strcmp(dev->data->name, name) == 0))
+ return dev;
+ }
+
+ return NULL;
+}
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ struct rte_ml_dev *dev;
+ int16_t dev_id;
+
+ /* implicit initialization of library before adding first device */
+ if (ml_dev_globals.devs == NULL) {
+ if (rte_ml_dev_init(RTE_MLDEV_DEFAULT_MAX) != 0)
+ return NULL;
+ }
+
+ if (rte_ml_dev_pmd_get_named_dev(name) != NULL) {
+ RTE_MLDEV_LOG(ERR, "ML device with name %s already allocated!", name);
+ return NULL;
+ }
+
+ /* Get a free device ID */
+ for (dev_id = 0; dev_id < ml_dev_globals.max_devs; dev_id++) {
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached == ML_DEV_DETACHED)
+ break;
+ }
+
+ if (dev_id == ml_dev_globals.max_devs) {
+ RTE_MLDEV_LOG(ERR, "Reached maximum number of ML devices");
+ return NULL;
+ }
+
+ if (dev->data == NULL) {
+ /* Reserve memzone name */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ml_dev_data), socket_id,
+ 0);
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: reserved memzone for %s (%p)", mz_name, mz);
+ } else {
+ mz = rte_memzone_lookup(mz_name);
+ RTE_MLDEV_LOG(DEBUG, "SECONDARY: looked up memzone for %s (%p)", mz_name,
+ mz);
+ }
+
+ if (mz == NULL)
+ return NULL;
+
+ ml_dev_globals.data[dev_id] = mz->addr;
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ memset(ml_dev_globals.data[dev_id], 0, sizeof(struct rte_ml_dev_data));
+
+ dev->data = ml_dev_globals.data[dev_id];
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ strlcpy(dev->data->name, name, RTE_ML_STR_MAX);
+ dev->data->dev_id = dev_id;
+ dev->data->socket_id = socket_id;
+ dev->data->dev_started = 0;
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: init mldev data");
+ }
+
+ RTE_MLDEV_LOG(DEBUG, "Data for %s: dev_id %d, socket %u", dev->data->name,
+ dev->data->dev_id, dev->data->socket_id);
+
+ dev->attached = ML_DEV_ATTACHED;
+ ml_dev_globals.nb_devs++;
+ }
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev)
+{
+ char mz_name[RTE_MEMZONE_NAMESIZE];
+ const struct rte_memzone *mz;
+ int16_t dev_id;
+ int ret = 0;
+
+ if (dev == NULL)
+ return -EINVAL;
+
+ dev_id = dev->data->dev_id;
+
+ /* Memzone lookup */
+ sprintf(mz_name, "rte_ml_dev_data_%d", dev_id);
+ mz = rte_memzone_lookup(mz_name);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ RTE_ASSERT(ml_dev_globals.data[dev_id] == mz->addr);
+ ml_dev_globals.data[dev_id] = NULL;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ RTE_MLDEV_LOG(DEBUG, "PRIMARY: free memzone of %s (%p)", mz_name, mz);
+ ret = rte_memzone_free(mz);
+ } else {
+ RTE_MLDEV_LOG(DEBUG, "SECONDARY: don't free memzone of %s (%p)", mz_name, mz);
+ }
+
+ dev->attached = ML_DEV_DETACHED;
+ ml_dev_globals.nb_devs--;
+
+ return ret;
+}
+
+int
+rte_ml_dev_init(size_t dev_max)
+{
+ if (dev_max == 0 || dev_max > INT16_MAX) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_max = %zu (> %d)\n", dev_max, INT16_MAX);
+ rte_errno = EINVAL;
+ return -rte_errno;
+ }
+
+ /* No lock, it must be called before or during first probing. */
+ if (ml_dev_globals.devs != NULL) {
+ RTE_MLDEV_LOG(ERR, "Device array already initialized");
+ rte_errno = EBUSY;
+ return -rte_errno;
+ }
+
+ ml_dev_globals.devs = calloc(dev_max, sizeof(struct rte_ml_dev));
+ if (ml_dev_globals.devs == NULL) {
+ RTE_MLDEV_LOG(ERR, "Cannot initialize MLDEV library");
+ rte_errno = ENOMEM;
+ return -rte_errno;
+ }
+
+ ml_dev_globals.data = calloc(dev_max, sizeof(struct rte_ml_dev_data *));
+ if (ml_dev_globals.data == NULL) {
+ RTE_MLDEV_LOG(ERR, "Cannot initialize MLDEV library");
+ free(ml_dev_globals.devs);
+ ml_dev_globals.devs = NULL;
+ rte_errno = ENOMEM;
+ return -rte_errno;
+ }
+
+ ml_dev_globals.max_devs = dev_max;
+
+ return 0;
+}
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
new file mode 100644
index 0000000000..1564d0fa4d
--- /dev/null
+++ b/lib/mldev/rte_mldev_core.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_INTERNAL_H_
+#define _RTE_MLDEV_INTERNAL_H_
+
+/**
+ * @file
+ *
+ * MLDEV internal header
+ *
+ * This file contains MLDEV private data structures and macros.
+ *
+ * @note
+ * These APIs are for MLDEV PMDs and library only.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <dev_driver.h>
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_mldev.h>
+
+/* Device state */
+#define ML_DEV_DETACHED (0)
+#define ML_DEV_ATTACHED (1)
+
+/**
+ * @internal
+ *
+ * The data part, with no function pointers, associated with each device. This structure is safe to
+ * place in shared memory to be common among different processes in a multi-process configuration.
+ */
+struct rte_ml_dev_data {
+ /** Device ID for this instance. */
+ int16_t dev_id;
+
+ /** Socket ID where memory is allocated. */
+ int16_t socket_id;
+
+ /** Device state: STOPPED(0) / STARTED(1) */
+ __extension__ uint8_t dev_started : 1;
+
+ /** Number of device queue pairs. */
+ uint16_t nb_queue_pairs;
+
+ /** Number of ML models. */
+ uint16_t nb_models;
+
+ /** Array of pointers to queue pairs. */
+ void **queue_pairs;
+
+ /** Array of pointers to ML models. */
+ void **models;
+
+ /** PMD-specific private data. */
+ void *dev_private;
+
+ /** Unique identifier name. */
+ char name[RTE_ML_STR_MAX];
+};
+
+/**
+ * @internal
+ *
+ * The data structure associated with each ML device.
+ */
+struct rte_ml_dev {
+ /** Pointer to device data. */
+ struct rte_ml_dev_data *data;
+
+ /** Backing RTE device. */
+ struct rte_device *device;
+
+ /** Flag indicating the device is attached. */
+ __extension__ uint8_t attached : 1;
+} __rte_cache_aligned;
+
+/**
+ * @internal
+ *
+ * Global structure used for maintaining state of allocated ML devices.
+ */
+struct rte_ml_dev_global {
+ /** Device information array. */
+ struct rte_ml_dev *devs;
+
+ /** Device private data array. */
+ struct rte_ml_dev_data **data;
+
+ /** Number of devices found. */
+ int16_t nb_devs;
+
+ /** Maximum number of devices. */
+ int16_t max_devs;
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_INTERNAL_H_ */
diff --git a/lib/mldev/rte_mldev_pmd.c b/lib/mldev/rte_mldev_pmd.c
new file mode 100644
index 0000000000..3169e5d4fa
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.c
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#include <dev_driver.h>
+#include <rte_eal.h>
+#include <rte_malloc.h>
+
+#include "rte_mldev_pmd.h"
+
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params)
+{
+ struct rte_ml_dev *dev;
+
+ RTE_MLDEV_LOG(INFO, "ML device initialisation - name: %s, socket_id: %u", name,
+ params->socket_id);
+
+ /* Allocate device structure */
+ dev = rte_ml_dev_pmd_allocate(name, params->socket_id);
+ if (dev == NULL) {
+ RTE_MLDEV_LOG(ERR, "Failed to allocate ML device for %s", name);
+ return NULL;
+ }
+
+ /* Allocate private device structure */
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+ dev->data->dev_private =
+ rte_zmalloc_socket("ml_dev_private", params->private_data_size,
+ RTE_CACHE_LINE_SIZE, params->socket_id);
+
+ if (dev->data->dev_private == NULL) {
+ RTE_MLDEV_LOG(ERR, "Cannot allocate memory for mldev %s private data",
+ name);
+ rte_ml_dev_pmd_release(dev);
+ return NULL;
+ }
+ }
+ dev->device = device;
+
+ return dev;
+}
+
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *dev)
+{
+ int ret;
+
+ RTE_MLDEV_LOG(INFO, "Releasing ML device - name: %s", dev->device->name);
+ ret = rte_ml_dev_pmd_release(dev);
+ if (ret)
+ return ret;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ rte_free(dev->data->dev_private);
+
+ dev->data = NULL;
+ dev->device = NULL;
+
+ return 0;
+}
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
new file mode 100644
index 0000000000..33544f1b80
--- /dev/null
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -0,0 +1,149 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Marvell.
+ */
+
+#ifndef _RTE_MLDEV_PMD_H_
+#define _RTE_MLDEV_PMD_H_
+
+/**
+ * @file
+ *
+ * RTE MLDEV PMD APIs
+ *
+ * ML Device PMD interface
+ *
+ * @note
+ * These APIs are for MLDEV PMDs only and user applications should not call them directly.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_compat.h>
+#include <rte_mldev.h>
+#include <rte_mldev_core.h>
+
+/**
+ * @internal
+ *
+ * Initialisation parameters for ML devices.
+ */
+struct rte_ml_dev_pmd_init_params {
+ /** Socket to use for memory allocation. */
+ uint8_t socket_id;
+
+ /** Size of device private data. */
+ uint64_t private_data_size;
+};
+
+/**
+ * @internal
+ *
+ * Get the ML device pointer for the device. Assumes a valid device index.
+ *
+ * @param dev_id
+ * Device ID value to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the given device ID.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_dev(int16_t dev_id);
+
+/**
+ * @internal
+ *
+ * Get the rte_ml_dev structure device pointer for the named device.
+ *
+ * @param name
+ * Device name to select the device structure.
+ *
+ * @return
+ * The rte_ml_dev pointer for the given device name, or NULL if not found.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_get_named_dev(const char *name);
+
+/**
+ * @internal
+ *
+ * Allocates a new mldev slot for an ML device and returns the pointer to that slot for use.
+ * Function for internal use by dummy drivers.
+ *
+ * @param name
+ * Unique identifier name for each device.
+ * @param socket_id
+ * Socket on which to allocate resources.
+ *
+ * @return
+ * Pointer to a free slot in the global device array for the new device.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id);
+
+/**
+ * @internal
+ *
+ * Release the specified mldev device.
+ *
+ * @param dev
+ * ML device.
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_release(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boilerplate code for an ML driver to create and allocate resources
+ * for a new ML PMD device instance.
+ *
+ * @param name
+ * ML device name.
+ * @param device
+ * Base device handle.
+ * @param params
+ * PMD initialisation parameters.
+ *
+ * @return
+ * - ML device instance on success.
+ * - NULL on failure.
+ */
+__rte_internal
+struct rte_ml_dev *
+rte_ml_dev_pmd_create(const char *name, struct rte_device *device,
+ struct rte_ml_dev_pmd_init_params *params);
+
+/**
+ * @internal
+ *
+ * PMD assist function to provide boilerplate code for an ML driver to destroy and free resources
+ * associated with an ML PMD device instance.
+ *
+ * @param mldev
+ * ML device instance.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+__rte_internal
+int
+rte_ml_dev_pmd_destroy(struct rte_ml_dev *mldev);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MLDEV_PMD_H_ */
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 3793380442..d6bf7c8ebb 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,7 +1,19 @@
EXPERIMENTAL {
global:
+ rte_ml_dev_init;
rte_ml_dev_logtype;
local: *;
};
+
+INTERNAL {
+ global:
+
+ rte_ml_dev_pmd_allocate;
+ rte_ml_dev_pmd_create;
+ rte_ml_dev_pmd_destroy;
+ rte_ml_dev_pmd_get_dev;
+ rte_ml_dev_pmd_get_named_dev;
+ rte_ml_dev_pmd_release;
+};
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
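For illustration, a rough sketch of how a driver's probe and remove callbacks might use these PMD helpers. The driver name, private data size and error codes are assumptions:

#include <errno.h>
#include <rte_mldev_pmd.h>

/* Hypothetical probe/remove pair for an imaginary "my_ml" driver. */
static int
my_ml_probe(struct rte_device *rte_dev)
{
    struct rte_ml_dev_pmd_init_params init_params = {
        .socket_id = 0,          /* could be derived from rte_dev->numa_node */
        .private_data_size = 256, /* size of the driver's private state */
    };
    struct rte_ml_dev *dev;

    dev = rte_ml_dev_pmd_create(rte_dev->name, rte_dev, &init_params);
    return dev == NULL ? -ENODEV : 0;
}

static int
my_ml_remove(struct rte_device *rte_dev)
{
    struct rte_ml_dev *dev = rte_ml_dev_pmd_get_named_dev(rte_dev->name);

    return dev == NULL ? -ENODEV : rte_ml_dev_pmd_destroy(dev);
}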
* [dpdk-dev] [PATCH v3 03/12] mldev: support ML device handling functions
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 01/12] " jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 02/12] mldev: support PMD functions for ML device jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 04/12] mldev: support ML device queue-pair setup jerinj
` (10 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added ML device handling APIs. These APIs are used to get device
information, and to configure, start, stop and close ML devices.
Added function prototypes to the PMD layer for use by the ML
driver implementations.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 175 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 107 +++++++++++++++++++++++
lib/mldev/version.map | 8 ++
3 files changed, 290 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 833afcbf87..961e12d150 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -177,4 +177,179 @@ rte_ml_dev_init(size_t dev_max)
return 0;
}
+uint16_t
+rte_ml_dev_count(void)
+{
+ return ml_dev_globals.nb_devs;
+}
+
+int
+rte_ml_dev_is_valid_dev(int16_t dev_id)
+{
+ struct rte_ml_dev *dev = NULL;
+
+ if (dev_id >= ml_dev_globals.max_devs || ml_dev_globals.devs[dev_id].data == NULL)
+ return 0;
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (dev->attached != ML_DEV_ATTACHED)
+ return 0;
+ else
+ return 1;
+}
+
+int
+rte_ml_dev_socket_id(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+
+ return dev->data->socket_id;
+}
+
+int
+rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_info_get == NULL)
+ return -ENOTSUP;
+
+ if (dev_info == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dev_info cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(dev_info, 0, sizeof(struct rte_ml_dev_info));
+
+ return (*dev->dev_ops->dev_info_get)(dev, dev_info);
+}
+
+int
+rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config)
+{
+ struct rte_ml_dev_info dev_info;
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_configure == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ if (config == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, config cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ ret = rte_ml_dev_info_get(dev_id, &dev_info);
+ if (ret < 0)
+ return ret;
+
+ if (config->nb_queue_pairs > dev_info.max_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Device %d num of queues %u > %u\n", dev_id,
+ config->nb_queue_pairs, dev_info.max_queue_pairs);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_configure)(dev, config);
+}
+
+int
+rte_ml_dev_close(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_close == NULL)
+ return -ENOTSUP;
+
+ /* Device must be stopped before it can be closed */
+ if (dev->data->dev_started == 1) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped before closing", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_close)(dev);
+}
+
+int
+rte_ml_dev_start(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_start == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started != 0) {
+ RTE_MLDEV_LOG(ERR, "Device %d is already started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_start)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 1;
+
+ return ret;
+}
+
+int
+rte_ml_dev_stop(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+ int ret;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stop == NULL)
+ return -ENOTSUP;
+
+ if (dev->data->dev_started == 0) {
+ RTE_MLDEV_LOG(ERR, "Device %d is not started", dev_id);
+ return -EBUSY;
+ }
+
+ ret = (*dev->dev_ops->dev_stop)(dev);
+ if (ret == 0)
+ dev->data->dev_started = 0;
+
+ return ret;
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 1564d0fa4d..dc79c5f630 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -31,6 +31,110 @@ extern "C" {
#define ML_DEV_DETACHED (0)
#define ML_DEV_ATTACHED (1)
+struct rte_ml_dev;
+
+/**
+ * Definitions of all functions exported by a driver through the generic structure of type
+ * *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
+ */
+
+/**
+ * @internal
+ *
+ * Function used to get device information.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param dev_info
+ * Pointer to info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_info_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info);
+
+/**
+ * @internal
+ *
+ * Function used to configure device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param config
+ * ML device configurations.
+ *
+ * @return
+ * - 0 on success
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_configure_t)(struct rte_ml_dev *dev, const struct rte_ml_dev_config *config);
+
+/**
+ * @internal
+ *
+ * Function used to close a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN if the device cannot be closed because it is busy.
+ * - < 0, error code on failure, other than busy.
+ */
+typedef int (*mldev_close_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to start a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * Function used to stop a configured device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+
+/**
+ * @internal
+ *
+ * ML device operations function pointer table.
+ */
+struct rte_ml_dev_ops {
+ /** Get device information. */
+ mldev_info_get_t dev_info_get;
+
+ /** Configure device. */
+ mldev_configure_t dev_configure;
+
+ /** Close device. */
+ mldev_close_t dev_close;
+
+ /** Start device. */
+ mldev_start_t dev_start;
+
+ /** Stop device. */
+ mldev_stop_t dev_stop;
+};
+
/**
* @internal
*
@@ -75,6 +179,9 @@ struct rte_ml_dev {
/** Pointer to device data. */
struct rte_ml_dev_data *data;
+ /** Functions exported by PMD. */
+ struct rte_ml_dev_ops *dev_ops;
+
/** Backing RTE device. */
struct rte_device *device;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index d6bf7c8ebb..028b6a464d 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,8 +1,16 @@
EXPERIMENTAL {
global:
+ rte_ml_dev_close;
+ rte_ml_dev_configure;
+ rte_ml_dev_count;
+ rte_ml_dev_info_get;
rte_ml_dev_init;
+ rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
+ rte_ml_dev_socket_id;
+ rte_ml_dev_start;
+ rte_ml_dev_stop;
local: *;
};
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
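For illustration, a minimal application-side bring-up sequence using the handling APIs in this patch. The nb_queue_pairs field is taken from the checks above; the socket_id field of the config structure is an assumption:

#include <errno.h>
#include <string.h>
#include <rte_mldev.h>

/* Configure and start one ML device with a single queue pair. */
static int
ml_dev_bringup(int16_t dev_id)
{
    struct rte_ml_dev_info info;
    struct rte_ml_dev_config conf;

    if (!rte_ml_dev_is_valid_dev(dev_id))
        return -ENODEV;

    if (rte_ml_dev_info_get(dev_id, &info) != 0)
        return -EIO;

    memset(&conf, 0, sizeof(conf));
    conf.socket_id = rte_ml_dev_socket_id(dev_id); /* assumed field */
    conf.nb_queue_pairs = 1; /* must not exceed info.max_queue_pairs */

    if (rte_ml_dev_configure(dev_id, &conf) != 0)
        return -EIO;

    return rte_ml_dev_start(dev_id);
}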
* [dpdk-dev] [PATCH v3 04/12] mldev: support ML device queue-pair setup
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (2 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 03/12] mldev: support ML device handling functions jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 05/12] mldev: support handling ML models jerinj
` (9 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added APIs to create a queue pair attached to an ML device.
Queue pairs are created with a user-specified ID. Added
function prototypes to be used by ML drivers for queue
pair create and destroy.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 33 ++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 44 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 961e12d150..928ea90ab5 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -352,4 +352,37 @@ rte_ml_dev_stop(int16_t dev_id)
return ret;
}
+int
+rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *qp_conf, int socket_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_queue_pair_setup == NULL)
+ return -ENOTSUP;
+
+ if (queue_pair_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid queue_pair_id = %d", queue_pair_id);
+ return -EINVAL;
+ }
+
+ if (qp_conf == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qp_conf cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dev->data->dev_started) {
+ RTE_MLDEV_LOG(ERR, "Device %d must be stopped to allow configuration", dev_id);
+ return -EBUSY;
+ }
+
+ return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index dc79c5f630..f8d247b43e 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -113,6 +113,44 @@ typedef int (*mldev_start_t)(struct rte_ml_dev *dev);
*/
typedef int (*mldev_stop_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Setup a queue pair for a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ * @param queue_pair_conf
+ * Queue pair configuration structure.
+ * @param socket_id
+ * Socket index.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id,
+ const struct rte_ml_dev_qp_conf *queue_pair_conf,
+ int socket_id);
+
+/**
+ * @internal
+ *
+ * Release memory resources allocated by given queue pair.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param queue_pair_id
+ * Queue pair index.
+ *
+ * @return
+ * - 0 on success.
+ * - -EAGAIN if the queue pair cannot be released because the device is busy.
+ */
+typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+
/**
* @internal
*
@@ -133,6 +171,12 @@ struct rte_ml_dev_ops {
/** Stop device. */
mldev_stop_t dev_stop;
+
+ /** Set up a device queue pair. */
+ mldev_queue_pair_setup_t dev_queue_pair_setup;
+
+ /** Release a device queue pair. */
+ mldev_queue_pair_release_t dev_queue_pair_release;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 028b6a464d..fcf48f1760 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -8,6 +8,7 @@ EXPERIMENTAL {
rte_ml_dev_init;
rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
+ rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
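For illustration, a sketch of queue-pair creation, done after rte_ml_dev_configure() and before rte_ml_dev_start(). The nb_desc field of the qp_conf structure is an assumption:

#include <string.h>
#include <rte_mldev.h>

/* Create queue pair 0 on the device's own socket. */
static int
ml_qp_setup(int16_t dev_id)
{
    struct rte_ml_dev_qp_conf qp_conf;

    memset(&qp_conf, 0, sizeof(qp_conf));
    qp_conf.nb_desc = 1024; /* assumed field: descriptor ring depth */

    return rte_ml_dev_queue_pair_setup(dev_id, 0, &qp_conf,
                                       rte_ml_dev_socket_id(dev_id));
}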
* [dpdk-dev] [PATCH v3 05/12] mldev: support handling ML models
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (3 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 04/12] mldev: support ML device queue-pair setup jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 06/12] mldev: support input and output data handling jerinj
` (8 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added RTE functions to handle ML models. These APIs can
load, unload, start, and stop an ML model. Additional APIs
to update model parameters and get model information are
added.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 123 +++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 122 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 6 ++
3 files changed, 251 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 928ea90ab5..9642e62815 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -385,4 +385,127 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+int
+rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_load == NULL)
+ return -ENOTSUP;
+
+ if (params == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, params cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (model_id == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, model_id cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_load)(dev, params, model_id);
+}
+
+int
+rte_ml_model_unload(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_unload == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_unload)(dev, model_id);
+}
+
+int
+rte_ml_model_start(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_start == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_start)(dev, model_id);
+}
+
+int
+rte_ml_model_stop(int16_t dev_id, uint16_t model_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_stop == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->model_stop)(dev, model_id);
+}
+
+int
+rte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_info_get == NULL)
+ return -ENOTSUP;
+
+ if (model_info == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, model_id %u, model_info cannot be NULL\n", dev_id,
+ model_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_info_get)(dev, model_id, model_info);
+}
+
+int
+rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->model_params_update == NULL)
+ return -ENOTSUP;
+
+ if (buffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, buffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index f8d247b43e..602f541589 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -151,6 +151,110 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to load an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param params
+ * Model load params.
+ * @param model_id
+ * Model ID returned by the library.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_load_t)(struct rte_ml_dev *dev, struct rte_ml_model_params *params,
+ uint16_t *model_id);
+
+/**
+ * @internal
+ *
+ * Function used to unload an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_model_unload_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to start an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_start_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Function used to stop an ML model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_stop_t)(struct rte_ml_dev *dev, uint16_t model_id);
+
+/**
+ * @internal
+ *
+ * Get info about a model.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param model_info
+ * Pointer to model info structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ struct rte_ml_model_info *model_info);
+
+/**
+ * @internal
+ *
+ * Update model params.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param buffer
+ * Pointer to model params.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+
/**
* @internal
*
@@ -177,6 +281,24 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+
+ /** Load an ML model. */
+ mldev_model_load_t model_load;
+
+ /** Unload an ML model. */
+ mldev_model_unload_t model_unload;
+
+ /** Start an ML model. */
+ mldev_model_start_t model_start;
+
+ /** Stop an ML model. */
+ mldev_model_stop_t model_stop;
+
+ /** Get model information. */
+ mldev_model_info_get_t model_info_get;
+
+ /** Update model params. */
+ mldev_model_params_update_t model_params_update;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index fcf48f1760..8da70f2443 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -12,6 +12,12 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_model_info_get;
+ rte_ml_model_load;
+ rte_ml_model_params_update;
+ rte_ml_model_start;
+ rte_ml_model_stop;
+ rte_ml_model_unload;
local: *;
};
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
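For illustration, a sketch of the model lifecycle enabled by these APIs: load from a buffer, start, and unload on failure. The addr/size field names of rte_ml_model_params are assumptions:

#include <stddef.h>
#include <rte_mldev.h>

static int
ml_model_setup(int16_t dev_id, void *model_buf, size_t model_len,
               uint16_t *model_id)
{
    struct rte_ml_model_params params = {
        .addr = model_buf, /* assumed field names */
        .size = model_len,
    };
    int ret;

    ret = rte_ml_model_load(dev_id, &params, model_id);
    if (ret != 0)
        return ret;

    ret = rte_ml_model_start(dev_id, *model_id);
    if (ret != 0)
        rte_ml_model_unload(dev_id, *model_id);

    return ret;
}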
* [dpdk-dev] [PATCH v3 06/12] mldev: support input and output data handling
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (4 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 05/12] mldev: support handling ML models jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 07/12] mldev: support ML op pool and ops jerinj
` (7 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added library functions to handle model input and
output data. The APIs can be used to get the size of I/O
buffers, quantize input data and dequantize output data.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 94 ++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 106 +++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 204 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 9642e62815..f9684e90ea 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -508,4 +508,98 @@ rte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer)
return (*dev->dev_ops->model_params_update)(dev, model_id, buffer);
}
+int
+rte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *input_qsize, uint64_t *input_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_input_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_input_size_get)(dev, model_id, nb_batches, input_qsize,
+ input_dsize);
+}
+
+int
+rte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
+ uint64_t *output_qsize, uint64_t *output_dsize)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_output_size_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->io_output_size_get)(dev, model_id, nb_batches, output_qsize,
+ output_dsize);
+}
+
+int
+rte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
+ void *qbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_quantize == NULL)
+ return -ENOTSUP;
+
+ if (dbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (qbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_quantize)(dev, model_id, nb_batches, dbuffer, qbuffer);
+}
+
+int
+rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
+ void *dbuffer)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->io_dequantize == NULL)
+ return -ENOTSUP;
+
+ if (qbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, qbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (dbuffer == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, dbuffer cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 602f541589..46e6e7a4e5 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -255,6 +255,100 @@ typedef int (*mldev_model_info_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
*/
typedef int (*mldev_model_params_update_t)(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+/**
+ * @internal
+ *
+ * Get size of input buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param input_qsize
+ * Size of quantized input.
+ * @param input_dsize
+ * Size of dequantized input.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_input_size_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ uint32_t nb_batches, uint64_t *input_qsize,
+ uint64_t *input_dsize);
+
+/**
+ * @internal
+ *
+ * Get size of output buffers.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param output_qsize
+ * Size of quantized output.
+ * @param output_dsize
+ * Size of dequantized output.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_output_size_get_t)(struct rte_ml_dev *dev, uint16_t model_id,
+ uint32_t nb_batches, uint64_t *output_qsize,
+ uint64_t *output_dsize);
+
+/**
+ * @internal
+ *
+ * Quantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param dbuffer
+ * Pointer to the dequantized data buffer.
+ * @param qbuffer
+ * Pointer to the quantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_quantize_t)(struct rte_ml_dev *dev, uint16_t model_id, uint16_t nb_batches,
+ void *dbuffer, void *qbuffer);
+
+/**
+ * @internal
+ *
+ * Dequantize model data.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param model_id
+ * Model ID to use.
+ * @param nb_batches
+ * Number of batches.
+ * @param qbuffer
+ * Pointer to the quantized data buffer.
+ * @param dbuffer
+ * Pointer to the dequantized data buffer.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_io_dequantize_t)(struct rte_ml_dev *dev, uint16_t model_id, uint16_t nb_batches,
+ void *qbuffer, void *dbuffer);
+
/**
* @internal
*
@@ -299,6 +393,18 @@ struct rte_ml_dev_ops {
/** Update model params. */
mldev_model_params_update_t model_params_update;
+
+ /** Get input buffer size. */
+ mldev_io_input_size_get_t io_input_size_get;
+
+ /** Get output buffer size. */
+ mldev_io_output_size_get_t io_output_size_get;
+
+ /** Quantize data */
+ mldev_io_quantize_t io_quantize;
+
+ /** De-quantize data */
+ mldev_io_dequantize_t io_dequantize;
};
/**
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 8da70f2443..d65b0792ab 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -12,6 +12,10 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_io_dequantize;
+ rte_ml_io_input_size_get;
+ rte_ml_io_output_size_get;
+ rte_ml_io_quantize;
rte_ml_model_info_get;
rte_ml_model_load;
rte_ml_model_params_update;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
* [dpdk-dev] [PATCH v3 07/12] mldev: support ML op pool and ops
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (5 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 06/12] mldev: support input and output data handling jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 08/12] mldev: support inference enqueue and dequeue jerinj
` (6 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added library functions to create and free an ML op pool.
The create function allocates a new ML op pool and initializes
the ML ops to their defaults.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 69 +++++++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
2 files changed, 71 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index f9684e90ea..da65819c08 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -12,6 +12,17 @@
static struct rte_ml_dev_global ml_dev_globals = {
.devs = NULL, .data = NULL, .nb_devs = 0, .max_devs = RTE_MLDEV_DEFAULT_MAX};
+/*
+ * Private data structure of an operation pool.
+ *
+ * A structure that contains ml op_pool specific data that is
+ * appended after the mempool structure (in private data).
+ */
+struct rte_ml_op_pool_private {
+ uint16_t user_size;
+ /**< Size of private user data with each operation. */
+};
+
struct rte_ml_dev *
rte_ml_dev_pmd_get_dev(int16_t dev_id)
{
@@ -602,4 +613,62 @@ rte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, voi
return (*dev->dev_ops->io_dequantize)(dev, model_id, nb_batches, qbuffer, dbuffer);
}
+/** Initialise rte_ml_op mempool element */
+static void
+ml_op_init(struct rte_mempool *mempool, __rte_unused void *opaque_arg, void *_op_data,
+ __rte_unused unsigned int i)
+{
+ struct rte_ml_op *op = _op_data;
+
+ memset(_op_data, 0, mempool->elt_size);
+ op->status = RTE_ML_OP_STATUS_NOT_PROCESSED;
+ op->mempool = mempool;
+}
+
+struct rte_mempool *
+rte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
+ uint16_t user_size, int socket_id)
+{
+ struct rte_ml_op_pool_private *priv;
+ struct rte_mempool *mp;
+ unsigned int elt_size;
+
+ /* lookup mempool in case already allocated */
+ mp = rte_mempool_lookup(name);
+ elt_size = sizeof(struct rte_ml_op) + user_size;
+
+ if (mp != NULL) {
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ if (mp->elt_size != elt_size || mp->cache_size < cache_size || mp->size < nb_elts ||
+ priv->user_size < user_size) {
+ RTE_MLDEV_LOG(ERR,
+ "Mempool %s already exists but with incompatible parameters",
+ name);
+ return NULL;
+ }
+ return mp;
+ }
+
+ mp = rte_mempool_create(name, nb_elts, elt_size, cache_size,
+ sizeof(struct rte_ml_op_pool_private), NULL, NULL, ml_op_init, NULL,
+ socket_id, 0);
+ if (mp == NULL) {
+ RTE_MLDEV_LOG(ERR, "Failed to create mempool %s", name);
+ return NULL;
+ }
+
+ priv = (struct rte_ml_op_pool_private *)rte_mempool_get_priv(mp);
+ priv->user_size = user_size;
+
+ return mp;
+}
+
+void
+rte_ml_op_pool_free(struct rte_mempool *mempool)
+{
+ if (mempool != NULL)
+ rte_mempool_free(mempool);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index d65b0792ab..7665226d0e 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -22,6 +22,8 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_pool_create;
+ rte_ml_op_pool_free;
local: *;
};
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
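For illustration, a sketch of one-time pool creation at application init. The sizing values (4096 ops, a 128-element per-lcore cache, 16 bytes of user data per op) are assumptions:

#include <errno.h>
#include <rte_mldev.h>

static struct rte_mempool *ml_op_pool;

/* Create the op pool once, before the fast path starts. */
static int
ml_op_pool_setup(int socket_id)
{
    ml_op_pool = rte_ml_op_pool_create("ml_op_pool", 4096, 128, 16,
                                       socket_id);
    return ml_op_pool == NULL ? -ENOMEM : 0;
}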
* [dpdk-dev] [PATCH v3 08/12] mldev: support inference enqueue and dequeue
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (6 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 07/12] mldev: support ML op pool and ops jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 09/12] mldev: support device statistics jerinj
` (5 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added implementations of fast-path functions to enqueue ML
requests to, and dequeue them from, an ML device queue pair.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 75 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 46 +++++++++++++++++++++++
lib/mldev/rte_mldev_pmd.h | 2 +
lib/mldev/version.map | 2 +
4 files changed, 125 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index da65819c08..184d87c70a 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -114,6 +114,9 @@ rte_ml_dev_pmd_allocate(const char *name, uint8_t socket_id)
ml_dev_globals.nb_devs++;
}
+ dev->enqueue_burst = NULL;
+ dev->dequeue_burst = NULL;
+
return dev;
}
@@ -671,4 +674,76 @@ rte_ml_op_pool_free(struct rte_mempool *mempool)
rte_mempool_free(mempool);
}
+uint16_t
+rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->enqueue_burst == NULL) {
+ rte_errno = ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->enqueue_burst)(dev, qp_id, ops, nb_ops);
+}
+
+uint16_t
+rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dequeue_burst == NULL) {
+ rte_errno = ENOTSUP;
+ return 0;
+ }
+
+ if (ops == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, ops cannot be NULL\n", dev_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+
+ if (qp_id >= dev->data->nb_queue_pairs) {
+ RTE_MLDEV_LOG(ERR, "Invalid qp_id %u\n", qp_id);
+ rte_errno = EINVAL;
+ return 0;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 46e6e7a4e5..1c02813c87 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -33,6 +33,46 @@ extern "C" {
struct rte_ml_dev;
+/**
+ * @internal
+ *
+ * Enqueue a burst of inference requests to a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to be enqueued.
+ * @param nb_ops
+ * Number of ops to enqueue.
+ *
+ * @return
+ * - Number of ops enqueued.
+ */
+typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
+/**
+ * @internal
+ *
+ * Dequeue a burst of inference requests from a queue on ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param qp_id
+ * Queue-pair ID.
+ * @param ops
+ * Array of ML ops to dequeued.
+ * @param nb_ops
+ * Number of ops to dequeue.
+ *
+ * @return
+ * - Number of ops dequeued.
+ */
+typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
+ uint16_t nb_ops);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -448,6 +488,12 @@ struct rte_ml_dev_data {
* The data structure associated with each ML device.
*/
struct rte_ml_dev {
+ /** Pointer to PMD enqueue function. */
+ mldev_enqueue_t enqueue_burst;
+
+ /** Pointer to PMD dequeue function. */
+ mldev_dequeue_t dequeue_burst;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/rte_mldev_pmd.h b/lib/mldev/rte_mldev_pmd.h
index 33544f1b80..afe617e4bf 100644
--- a/lib/mldev/rte_mldev_pmd.h
+++ b/lib/mldev/rte_mldev_pmd.h
@@ -40,6 +40,8 @@ struct rte_ml_dev_pmd_init_params {
uint64_t private_data_size;
};
+struct rte_ml_dev;
+
/**
* @internal
*
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 7665226d0e..e6b1ac4a4d 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -1,6 +1,7 @@
EXPERIMENTAL {
global:
+ rte_ml_dequeue_burst;
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
@@ -12,6 +13,7 @@ EXPERIMENTAL {
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stop;
+ rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
rte_ml_io_output_size_get;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
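For illustration, a sketch of the poll-mode fast path built on these two calls. The queue-pair ID and burst size are assumptions; completed ops carry a status field to be checked before reuse:

#include <rte_mldev.h>

/* Push a burst of inference ops and poll until all accepted ops complete. */
static void
ml_infer_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops,
               uint16_t nb_ops)
{
    struct rte_ml_op *deq[32];
    uint16_t sent, done = 0;

    /* Enqueue may accept fewer than nb_ops if the queue is full. */
    sent = rte_ml_enqueue_burst(dev_id, qp_id, ops, nb_ops);

    while (done < sent) {
        uint16_t n = rte_ml_dequeue_burst(dev_id, qp_id, deq, 32);
        /* Check deq[i]->status for each completed op before reuse. */
        done += n;
    }
}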
* [dpdk-dev] [PATCH v3 09/12] mldev: support device statistics
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (7 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 08/12] mldev: support inference enqueue and dequeue jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 10/12] mldev: support device extended statistics jerinj
` (4 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get and reset device stats. Device stats
include the number of requests enqueued and dequeued, and the
number of errors. Added function prototypes to be used by
driver implementations.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 40 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 32 ++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 74 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 184d87c70a..f096ed2bc3 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -399,6 +399,46 @@ rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
return (*dev->dev_ops->dev_queue_pair_setup)(dev, queue_pair_id, qp_conf, socket_id);
}
+int
+rte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_get == NULL)
+ return -ENOTSUP;
+
+ if (stats == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, stats cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+ memset(stats, 0, sizeof(struct rte_ml_dev_stats));
+
+ return (*dev->dev_ops->dev_stats_get)(dev, stats);
+}
+
+void
+rte_ml_dev_stats_reset(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_stats_reset == NULL)
+ return;
+
+ (*dev->dev_ops->dev_stats_reset)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 1c02813c87..7c9877731e 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -191,6 +191,32 @@ typedef int (*mldev_queue_pair_setup_t)(struct rte_ml_dev *dev, uint16_t queue_p
*/
typedef int (*mldev_queue_pair_release_t)(struct rte_ml_dev *dev, uint16_t queue_pair_id);
+/**
+ * @internal
+ *
+ * Function used to get device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stats
+ * Pointer to ML device stats structure to update.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats);
+
+/**
+ * @internal
+ *
+ * Function used to reset device statistics.
+ *
+ * @param dev
+ * ML device pointer.
+ */
+typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -416,6 +442,12 @@ struct rte_ml_dev_ops {
/** Release a device queue pair. */
mldev_queue_pair_release_t dev_queue_pair_release;
+ /** Get device statistics. */
+ mldev_stats_get_t dev_stats_get;
+
+ /** Reset device statistics. */
+ mldev_stats_reset_t dev_stats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index e6b1ac4a4d..7c652f1f9b 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -12,6 +12,8 @@ EXPERIMENTAL {
rte_ml_dev_queue_pair_setup;
rte_ml_dev_socket_id;
rte_ml_dev_start;
+ rte_ml_dev_stats_get;
+ rte_ml_dev_stats_reset;
rte_ml_dev_stop;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
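For reference, a minimal application-side sketch of the stats API added by
the patch above, assuming a configured and started device dev_id. The field
names used here (enqueued_count, dequeued_count, enqueue_err_count) are
assumptions based on the counters named in the commit message; they are not
defined in this patch.

#include <inttypes.h>
#include <stdio.h>
#include <rte_mldev.h>

static void
dump_dev_stats(int16_t dev_id)
{
	struct rte_ml_dev_stats stats;

	/* Fetch the aggregate enqueue/dequeue counters for the device. */
	if (rte_ml_dev_stats_get(dev_id, &stats) != 0)
		return;

	printf("enq %" PRIu64 " deq %" PRIu64 " enq_err %" PRIu64 "\n",
	       stats.enqueued_count, stats.dequeued_count,
	       stats.enqueue_err_count);

	/* Clear the counters, e.g. at the start of a measurement window. */
	rte_ml_dev_stats_reset(dev_id);
}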
* [dpdk-dev] [PATCH v3 10/12] mldev: support device extended statistics
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (8 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 09/12] mldev: support device statistics jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 11/12] mldev: support to retrieve error information jerinj
` (3 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to handle device extended stats. The supported
xstats are driver-specific and can include stats specific
to the ML device or to an ML model and its I/O. Added prototypes
for the functions to be implemented by the device drivers.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 88 ++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 93 ++++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 4 ++
3 files changed, 185 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index f096ed2bc3..f6c5282f39 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -439,6 +439,94 @@ rte_ml_dev_stats_reset(int16_t dev_id)
(*dev->dev_ops->dev_stats_reset)(dev);
}
+int
+rte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map, uint32_t size)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_names_get == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_names_get)(dev, xstats_map, size);
+}
+
+int
+rte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_by_name_get == NULL)
+ return -ENOTSUP;
+
+ if (name == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, name cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (value == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, value cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_by_name_get)(dev, name, stat_id, value);
+}
+
+int
+rte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_get == NULL)
+ return -ENOTSUP;
+
+ if (stat_ids == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, stat_ids cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (values == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, values cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_xstats_get)(dev, stat_ids, values, nb_ids);
+}
+
+int
+rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_xstats_reset == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 7c9877731e..bc94420000 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -217,6 +217,87 @@ typedef int (*mldev_stats_get_t)(struct rte_ml_dev *dev, struct rte_ml_dev_stats
*/
typedef void (*mldev_stats_reset_t)(struct rte_ml_dev *dev);
+/**
+ * @internal
+ *
+ * Function used to get names of extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param xstats_map
+ * Array to insert id and names into.
+ * @param size
+ * Size of xstats_map array.
+ *
+ * @return
+ * - >= 0 and <= size on success.
+ * - > size, error. Returns the size of xstats_map array required.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_names_get_t)(struct rte_ml_dev *dev,
+ struct rte_ml_dev_xstats_map *xstats_map, uint32_t size);
+
+/**
+ * @internal
+ *
+ * Function used to get a single extended stat by name.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param name
+ * Name of the stat to retrieve.
+ * @param stat_id
+ * ID of the stat to be returned.
+ * @param value
+ * Value of the stat to be returned.
+ *
+ * @return
+ * - >= 0 stat value.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_by_name_get_t)(struct rte_ml_dev *dev, const char *name,
+ uint16_t *stat_id, uint64_t *value);
+
+/**
+ * @internal
+ *
+ * Function used to retrieve extended stats of a device.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of ID numbers of the stats to be retrieved.
+ * @param values
+ * Values of the stats requested by the ID.
+ * @param nb_ids
+ * Number of stats requested.
+ *
+ * @return
+ * - >= 0, number of entries filled into the values array.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint64_t *values, uint16_t nb_ids);
+
+/**
+ * @internal
+ *
+ * Function used to reset extended stats.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param stat_ids
+ * Array of stats IDs to be reset.
+ * @param nb_ids
+ * Number of IDs in the stat_ids array.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
+ uint16_t nb_ids);
+
/**
* @internal
*
@@ -448,6 +529,18 @@ struct rte_ml_dev_ops {
/** Reset device statistics. */
mldev_stats_reset_t dev_stats_reset;
+ /** Get names of extended stats. */
+ mldev_xstats_names_get_t dev_xstats_names_get;
+
+ /** Get value of a single extended stat. */
+ mldev_xstats_by_name_get_t dev_xstats_by_name_get;
+
+ /** Get extended stats of a device. */
+ mldev_xstats_get_t dev_xstats_get;
+
+ /** Reset extended stats of the device. */
+ mldev_xstats_reset_t dev_xstats_reset;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 7c652f1f9b..1e7c1ab2b2 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -15,6 +15,10 @@ EXPERIMENTAL {
rte_ml_dev_stats_get;
rte_ml_dev_stats_reset;
rte_ml_dev_stop;
+ rte_ml_dev_xstats_by_name_get;
+ rte_ml_dev_xstats_get;
+ rte_ml_dev_xstats_names_get;
+ rte_ml_dev_xstats_reset;
rte_ml_enqueue_burst;
rte_ml_io_dequantize;
rte_ml_io_input_size_get;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
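For reference, a sketch of the enumerate-then-fetch pattern enabled by the
patch above. Per the documented return convention, calling
rte_ml_dev_xstats_names_get() with size 0 yields the required array size.
Passing a NULL map at size 0 and the xstats_map field names (id, name) are
assumptions, as the structure is not shown in this patch; allocation error
handling is elided.

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <rte_mldev.h>

static void
dump_xstats(int16_t dev_id)
{
	struct rte_ml_dev_xstats_map *map;
	uint64_t *values;
	uint16_t *ids;
	int i, n;

	/* First call with size 0; a return value > 0 is the required size. */
	n = rte_ml_dev_xstats_names_get(dev_id, NULL, 0);
	if (n <= 0)
		return;

	map = calloc(n, sizeof(*map));
	values = calloc(n, sizeof(*values));
	ids = calloc(n, sizeof(*ids));

	/* Second call fills id/name pairs for all supported xstats. */
	rte_ml_dev_xstats_names_get(dev_id, map, n);
	for (i = 0; i < n; i++)
		ids[i] = map[i].id;

	/* Bulk-fetch the values by id and print name/value pairs. */
	if (rte_ml_dev_xstats_get(dev_id, ids, values, n) == n) {
		for (i = 0; i < n; i++)
			printf("%s: %" PRIu64 "\n", map[i].name, values[i]);
	}

	free(ids);
	free(values);
	free(map);
}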
* [dpdk-dev] [PATCH v3 11/12] mldev: support to retrieve error information
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (9 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 10/12] mldev: support device extended statistics jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 12/12] mldev: support to get debug info and test device jerinj
` (2 subsequent siblings)
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions to get error information for an ML op.
This information can include both a driver-specific error
message and an error code.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 31 +++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 22 ++++++++++++++++++++++
lib/mldev/version.map | 1 +
3 files changed, 54 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index f6c5282f39..9258f44466 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -874,4 +874,35 @@ rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uin
return (*dev->dequeue_burst)(dev, qp_id, ops, nb_ops);
}
+int
+rte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error)
+{
+ struct rte_ml_dev *dev;
+
+#ifdef RTE_LIBRTE_ML_DEV_DEBUG
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->op_error_get == NULL)
+ return -ENOTSUP;
+
+ if (op == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, op cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ if (error == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, error cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+#else
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+#endif
+
+ return (*dev->op_error_get)(dev, op, error);
+}
+
RTE_LOG_REGISTER_DEFAULT(rte_ml_dev_logtype, INFO);
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index bc94420000..14c33175d2 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -73,6 +73,25 @@ typedef uint16_t (*mldev_enqueue_t)(struct rte_ml_dev *dev, uint16_t qp_id, stru
typedef uint16_t (*mldev_dequeue_t)(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op **ops,
uint16_t nb_ops);
+/**
+ * @internal
+ *
+ * Get error information for an Op.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param op
+ * ML Op handle.
+ * @param error
+ * Pointer to error structure.
+ *
+ * @return
+ * - 0 on success.
+ * - <0, error on failure.
+ */
+typedef int (*mldev_op_error_get_t)(struct rte_ml_dev *dev, struct rte_ml_op *op,
+ struct rte_ml_op_error *error);
+
/**
* Definitions of all functions exported by a driver through the generic structure of type
* *ml_dev_ops* supplied in the *rte_ml_dev* structure associated with a device.
@@ -619,6 +638,9 @@ struct rte_ml_dev {
/** Pointer to PMD dequeue function. */
mldev_dequeue_t dequeue_burst;
+ /** Pointer to PMD Op error get function. */
+ mldev_op_error_get_t op_error_get;
+
/** Pointer to device data. */
struct rte_ml_dev_data *data;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index 1e7c1ab2b2..ea91912f5f 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -30,6 +30,7 @@ EXPERIMENTAL {
rte_ml_model_start;
rte_ml_model_stop;
rte_ml_model_unload;
+ rte_ml_op_error_get;
rte_ml_op_pool_create;
rte_ml_op_pool_free;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
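For reference, a sketch of how an application might use the call added above
on a dequeued op. The op status check (RTE_ML_OP_STATUS_ERROR) and the
rte_ml_op_error field names (errcode, message) are assumptions, as neither
is defined in this patch.

#include <inttypes.h>
#include <stdio.h>
#include <rte_mldev.h>

static void
report_op_error(int16_t dev_id, struct rte_ml_op *op)
{
	struct rte_ml_op_error error;

	/* Assumed status field and value; not part of this patch. */
	if (op->status != RTE_ML_OP_STATUS_ERROR)
		return;

	/* Ask the driver for the error details of this op. */
	if (rte_ml_op_error_get(dev_id, op, &error) == 0)
		printf("op failed: code 0x%" PRIx64 " (%s)\n",
		       error.errcode, error.message);
}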
* [dpdk-dev] [PATCH v3 12/12] mldev: support to get debug info and test device
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (10 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 11/12] mldev: support to retrieve error information jerinj
@ 2023-02-07 15:13 ` jerinj
2023-02-15 12:55 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library Ferruh Yigit
2023-03-09 17:33 ` Thomas Monjalon
13 siblings, 0 replies; 80+ messages in thread
From: jerinj @ 2023-02-07 15:13 UTC (permalink / raw)
To: dev, Srikanth Yalavarthi
Cc: thomas, ferruh.yigit, stephen, dchickles, sshankarnara, Jerin Jacob
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
Added functions for the ML device debug APIs. The APIs
are used to dump ML device debug information and to run a device selftest.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
lib/mldev/rte_mldev.c | 39 ++++++++++++++++++++++++++++++++++++++
lib/mldev/rte_mldev_core.h | 37 ++++++++++++++++++++++++++++++++++++
lib/mldev/version.map | 2 ++
3 files changed, 78 insertions(+)
diff --git a/lib/mldev/rte_mldev.c b/lib/mldev/rte_mldev.c
index 9258f44466..eeddb8e874 100644
--- a/lib/mldev/rte_mldev.c
+++ b/lib/mldev/rte_mldev.c
@@ -527,6 +527,45 @@ rte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_id
return (*dev->dev_ops->dev_xstats_reset)(dev, stat_ids, nb_ids);
}
+int
+rte_ml_dev_dump(int16_t dev_id, FILE *fd)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_dump == NULL)
+ return -ENOTSUP;
+
+ if (fd == NULL) {
+ RTE_MLDEV_LOG(ERR, "Dev %d, file descriptor cannot be NULL\n", dev_id);
+ return -EINVAL;
+ }
+
+ return (*dev->dev_ops->dev_dump)(dev, fd);
+}
+
+int
+rte_ml_dev_selftest(int16_t dev_id)
+{
+ struct rte_ml_dev *dev;
+
+ if (!rte_ml_dev_is_valid_dev(dev_id)) {
+ RTE_MLDEV_LOG(ERR, "Invalid dev_id = %d\n", dev_id);
+ return -EINVAL;
+ }
+
+ dev = rte_ml_dev_pmd_get_dev(dev_id);
+ if (*dev->dev_ops->dev_selftest == NULL)
+ return -ENOTSUP;
+
+ return (*dev->dev_ops->dev_selftest)(dev);
+}
+
int
rte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id)
{
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 14c33175d2..98851e0fd5 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -317,6 +317,37 @@ typedef int (*mldev_xstats_get_t)(struct rte_ml_dev *dev, const uint16_t *stat_i
typedef int (*mldev_xstats_reset_t)(struct rte_ml_dev *dev, const uint16_t *stat_ids,
uint16_t nb_ids);
+/**
+ * @internal
+ *
+ * Function used to dump ML device debug info.
+ *
+ * @param dev
+ * ML device pointer.
+ * @param fd
+ * File descriptor to dump the debug info.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error code on failure.
+ */
+
+typedef int (*mldev_dump_t)(struct rte_ml_dev *dev, FILE *fd);
+
+/**
+ * @internal
+ *
+ * Function used for selftest of ML device.
+ *
+ * @param dev
+ * ML device pointer.
+ *
+ * @return
+ * - 0 on success.
+ * - < 0, error on failure.
+ */
+typedef int (*mldev_selftest_t)(struct rte_ml_dev *dev);
+
/**
* @internal
*
@@ -560,6 +591,12 @@ struct rte_ml_dev_ops {
/** Reset extended stats of the device. */
mldev_xstats_reset_t dev_xstats_reset;
+ /** Dump ML device debug info. */
+ mldev_dump_t dev_dump;
+
+ /** Run ML device selftest. */
+ mldev_selftest_t dev_selftest;
+
/** Load an ML model. */
mldev_model_load_t model_load;
diff --git a/lib/mldev/version.map b/lib/mldev/version.map
index ea91912f5f..d2b30a991a 100644
--- a/lib/mldev/version.map
+++ b/lib/mldev/version.map
@@ -5,11 +5,13 @@ EXPERIMENTAL {
rte_ml_dev_close;
rte_ml_dev_configure;
rte_ml_dev_count;
+ rte_ml_dev_dump;
rte_ml_dev_info_get;
rte_ml_dev_init;
rte_ml_dev_is_valid_dev;
rte_ml_dev_logtype;
rte_ml_dev_queue_pair_setup;
+ rte_ml_dev_selftest;
rte_ml_dev_socket_id;
rte_ml_dev_start;
rte_ml_dev_stats_get;
--
2.39.1
^ permalink raw reply [flat|nested] 80+ messages in thread
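For reference, a minimal sketch using the two calls added above; both
signatures are taken from the patch, and per the library code each returns
-ENOTSUP when the driver does not implement the corresponding hook.

#include <stdio.h>
#include <rte_mldev.h>

static int
debug_ml_device(int16_t dev_id)
{
	/* Dump driver-specific debug state to the console. */
	if (rte_ml_dev_dump(dev_id, stdout) != 0)
		return -1;

	/* Run the driver selftest; 0 indicates a pass. */
	return rte_ml_dev_selftest(dev_id);
}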
* Re: [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (11 preceding siblings ...)
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 12/12] mldev: support to get debug info and test device jerinj
@ 2023-02-15 12:55 ` Ferruh Yigit
2023-02-15 17:03 ` Jerin Jacob
2023-03-09 17:33 ` Thomas Monjalon
13 siblings, 1 reply; 80+ messages in thread
From: Ferruh Yigit @ 2023-02-15 12:55 UTC (permalink / raw)
To: jerinj, dev
Cc: thomas, stephen, dchickles, sshankarnara, Bahri, Aziz,
O'Donohoe, Fionn
On 2/7/2023 3:13 PM, jerinj@marvell.com wrote:
> From: Jerin Jacob <jerinj@marvell.com>
>
Hi Jerin,
Please find some comments/questions gathered with the help of some
colleagues.
> Machine learning inference library
> ==================================
>
> Definition of machine learning inference
> ----------------------------------------
> Inference in machine learning is the process of making an output prediction
> based on new input data using a pre-trained machine learning model.
>
> The scope of the RFC would include only inferencing with pre-trained machine learning models,
> training and building/compiling the ML models is out of scope for this RFC or
> DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
>
> Motivation for the new library
> ------------------------------
> Multiple semiconductor vendors are offering accelerator products such as DPU
> (often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
> integrated as part of the product. Use of ML inferencing is increasing in the domain
> of packet processing for flow classification, intrusion, malware and anomaly detection.
>
Agree on this need.
> Lack of inferencing support through DPDK APIs will involve complexities and
> increased latency from moving data across frameworks (i.e, dataplane to
> non dataplane ML frameworks and vice-versa). Having a standardized DPDK APIs for ML
> inferencing would enable the dataplane solutions to harness the benefit of inline
> inferencing supported by the hardware.
>
ack
> Contents
> ---------------
> A) API specification for:
>
> 1) Discovery of ML capabilities (e.g., device specific features) in a vendor
> independent fashion
> 2) Definition of functions to handle ML devices, which includes probing,
> initialization and termination of the devices.
> 3) Definition of functions to handle ML models used to perform inference operations.
> 4) Definition of function to handle quantize and dequantize operations
>
> B) Common code for above specification
>
> rfc..v1:
> - Added programmer guide documentation
> - Added implementation for common code
>
> v2..v1:
> - Moved dynamic log (Stephen)
> - model id to uint16_t from int16t_t (Stephen)
> - added release note updates
>
> v3..v2:
> - Introduced rte_ml_dev_init() similar to rte_gpu_init() (Stephen, Thomas)
> - In struct rte_ml_dev_data, removed reserved[3] and __rte_cache_aligned.
> Also, moved name field to the end(Stephen)
>
> Machine learning library framework
> ----------------------------------
>
> The ML framework is built on the following model:
>
>
> +-----------------+ rte_ml_[en|de]queue_burst()
> | | |
> | Machine o------+ +--------+ |
> | Learning | | | queue | | +------+
> | Inference o------+-----o |<===o===>|Core 0|
> | Engine | | | pair 0 | +------+
> | o----+ | +--------+
> | | | |
> +-----------------+ | | +--------+
> ^ | | | queue | +------+
> | | +-----o |<=======>|Core 1|
> | | | pair 1 | +------+
> | | +--------+
> +--------+--------+ |
> | +-------------+ | | +--------+
> | | Model 0 | | | | queue | +------+
> | +-------------+ | +-------o |<=======>|Core N|
> | +-------------+ | | pair N | +------+
> | | Model 1 | | +--------+
> | +-------------+ |
> | +-------------+ |<------- rte_ml_model_load()
> | | Model .. | |-------> rte_ml_model_info()
> | +-------------+ |<------- rte_ml_model_start()
> | +-------------+ |<------- rte_ml_model_stop()
> | | Model N | |<------- rte_ml_model_params_update()
> | +-------------+ |<------- rte_ml_model_unload()
> +-----------------+
>
Should model load/unload, params_update be part of DPDK, or can DPDK
assume these are already in place? For FPGA both options work; what is
the benefit of having these APIs as part of DPDK? What are the use cases
for other architectures?
Are multiple active models supported at the same time?
For the FPGA case multiple models may exist at the same time; it would be
good to have a way to select the model to use, like a model handle that
the API accepts.
Similarly, a model handle may help chaining models, possibly with the
help of additional APIs to define the chaining.
> ML Device: A hardware or software-based implementation of ML device API for
> running inferences using a pre-trained ML model.
>
Can this device consume multiple queues in parallel?
> ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> procedure/algorithm and data/pattern required to make predictions on live data.
> Once the model is created and trained outside of the DPDK scope, the model can be loaded
> via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> The rte_ml_model_params_update() can be used to update the model parameters such as weight
> and bias without unloading the model using rte_ml_model_unload().
>
> ML Inference: ML inference is the process of feeding data to the model via
> rte_ml_enqueue_burst() API and use rte_ml_dequeue_burst() API to get the calculated
> outputs/predictions from the started model.
>
> In all functions of the ML device API, the ML device is designated by an
> integer >= 0 named as device identifier *dev_id*.
>
> The functions exported by the ML device API to setup a device designated by
> its device identifier must be invoked in the following order:
>
> - rte_ml_dev_configure()
> - rte_ml_dev_queue_pair_setup()
> - rte_ml_dev_start()
>
> A model is required to run the inference operations with the user specified inputs.
> Application needs to invoke the ML model API in the following order before queueing
> inference jobs.
>
> - rte_ml_model_load()
> - rte_ml_model_start()
>
> The rte_ml_model_info() API is provided to retrieve the information related to the model.
> The information would include the shape and type of input and output required for the inference.
>
It seems there is a standardization effort for model description, called
ONNX (https://onnx.ai/), supported by many vendors.
Does it make sense that 'rte_ml_model_info()' describes the data, and
perhaps the model itself too, using the ONNX format?
> Data quantization and dequantization is one of the main aspects in ML domain. This involves
> conversion of input data from a higher precision to a lower precision data type and vice-versa
> for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
> dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
> and output buffers holding data for multiple batches.
> Two utility APIs rte_ml_io_input_size_get() and rte_ml_io_output_size_get() can used to get the
> size of quantized and de-quantized multi-batch input and output buffers.
>
It seems quantize and dequantize can be part of the model and optimized
during training; can you please share some information on the HW
architecture that needs these APIs?
Does it make sense to have quantize/dequantize as a capability, so that
in case the HW has specific support for it this can be used, else the
host can provide this functionality?
> User can optionally update the model parameters with rte_ml_model_params_update() after
> invoking rte_ml_model_stop() API on a given model ID.
>
> The application can invoke, in any order, the functions exported by the ML API to enqueue
> inference jobs and dequeue inference response.
>
> If the application wants to change the device configuration (i.e., call
> rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then application must stop the
> device using rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
> the application must call rte_ml_model_stop() followed by rte_ml_model_params_update() API
> for the given model. The application does not need to call rte_ml_dev_stop() API for
> any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
>
> Once the device is in the start state after invoking rte_ml_dev_start() API and the model is in
> start state after invoking rte_ml_model_start() API, then the application can call
> rte_ml_enqueue() and rte_ml_dequeue() API on the destined device and model ID.
>
> Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
>
> Typical application utilisation of the ML API will follow the following
> programming flow.
>
> - rte_ml_dev_configure()
> - rte_ml_dev_queue_pair_setup()
> - rte_ml_model_load()
> - rte_ml_model_start()
> - rte_ml_model_info()
> - rte_ml_dev_start()
> - rte_ml_enqueue_burst()
> - rte_ml_dequeue_burst()
> - rte_ml_model_stop()
> - rte_ml_model_unload()
> - rte_ml_dev_stop()
> - rte_ml_dev_close()
>
Is a 'reset()' API needed?
> Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
> are lock-free functions which assume to not be invoked in parallel on different logical cores
> on the same target object. For instance, the dequeue function of a poll mode driver cannot be
> invoked in parallel on two logical cores to operate on same queue pair. Of course, this function
> can be invoked in parallel by different logical core on different queue pair.
> It is the responsibility of the user application to enforce this rule.
>
> Example application usage for ML inferencing
> --------------------------------------------
> This example application is to demonstrate the programming model of ML device
> library. This example omits the error checks to simplify the application. This
> example also assumes that the input data received is quantized and output expected
> is also quantized. In order to handle non-quantized inputs and outputs, users can
> invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.
>
> #define ML_MODEL_NAME "model"
> #define IO_MZ "io_mz"
>
> struct app_ctx {
> char model_file[PATH_MAX];
> char inp_file[PATH_MAX];
> char out_file[PATH_MAX];
>
> struct rte_ml_model_params params;
> struct rte_ml_model_info info;
> uint16_t id;
>
> uint64_t input_size;
> uint64_t output_size;
> uint8_t *input_buffer;
> uint8_t *output_buffer;
> } __rte_cache_aligned;
>
> struct app_ctx ctx;
>
> static int
> parse_args(int argc, char **argv)
> {
> int opt, option_index;
> static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
> {"input", required_argument, NULL, 'i'},
> {"output", required_argument, NULL, 'o'},
> {NULL, 0, NULL, 0}};
>
> while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
> switch (opt) {
> case 'm':
> strncpy(ctx.model_file, optarg, PATH_MAX - 1);
> break;
> case 'i':
> strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
> break;
> case 'o':
> strncpy(ctx.out_file, optarg, PATH_MAX - 1);
> break;
> default:
> return -1;
> }
>
> return 0;
> }
>
> int
> main(int argc, char **argv)
> {
> struct rte_ml_dev_qp_conf qp_conf;
> struct rte_ml_dev_config config;
> struct rte_ml_dev_info dev_info;
> const struct rte_memzone *mz;
> struct rte_mempool *op_pool;
> struct rte_ml_op *op_enq;
> struct rte_ml_op *op_deq;
>
> FILE *fp;
> int rc;
>
> /* Initialize EAL */
> rc = rte_eal_init(argc, argv);
> if (rc < 0)
> rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
> argc -= rc;
> argv += rc;
>
> /* Parse application arguments (after the EAL args) */
> if (parse_args(argc, argv) < 0)
> rte_exit(EXIT_FAILURE, "Invalid application arguments\n");
>
> /* Step 1: Check for ML devices */
> if (rte_ml_dev_count() <= 0)
> rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");
>
> /* Step 2: Get device info */
> if (rte_ml_dev_info_get(0, &dev_info) != 0)
> rte_exit(EXIT_FAILURE, "Failed to get device info\n");
>
> /* Step 3: Configure ML device, use device 0 */
> config.socket_id = rte_ml_dev_socket_id(0);
> config.max_nb_models = dev_info.max_models;
> config.nb_queue_pairs = dev_info.max_queue_pairs;
> if (rte_ml_dev_configure(0, &config) != 0)
> rte_exit(EXIT_FAILURE, "Device configuration failed\n");
>
> /* Step 4: Setup queue pairs, used qp_id = 0 */
> qp_conf.nb_desc = 1;
> if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
> rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");
>
> /* Step 5: Start device */
> if (rte_ml_dev_start(0) != 0)
> rte_exit(EXIT_FAILURE, "Device start failed\n");
>
> /* Step 6: Read model data and update load params structure */
> fp = fopen(ctx.model_file, "r+");
> if (fp == NULL)
> rte_exit(EXIT_FAILURE, "Failed to open model file\n");
>
> fseek(fp, 0, SEEK_END);
> ctx.params.size = ftell(fp);
> fseek(fp, 0, SEEK_SET);
>
> ctx.params.addr = malloc(ctx.params.size);
> if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size){
> fclose(fp);
> rte_exit(EXIT_FAILURE, "Failed to read model\n");
> }
> fclose(fp);
> strcpy(ctx.params.name, ML_MODEL_NAME);
>
> /* Step 7: Load the model */
> if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
> rte_exit(EXIT_FAILURE, "Failed to load model\n");
> free(ctx.params.addr);
>
> /* Step 8: Start the model */
> if (rte_ml_model_start(0, ctx.id) != 0)
> rte_exit(EXIT_FAILURE, "Failed to start model\n");
>
> /* Step 9: Allocate buffers for quantized input and output */
>
> /* Get model information */
> if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
> rte_exit(EXIT_FAILURE, "Failed to get model info\n");
>
> /* Get the buffer size for input and output */
> rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
> rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);
>
> mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
> if (mz == NULL)
> rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");
>
> ctx.input_buffer = mz->addr;
> ctx.output_buffer = ctx.input_buffer + ctx.input_size;
>
> /* Step 10: Fill the input data */
> fp = fopen(ctx.inp_file, "r+");
> if (fp == NULL)
> rte_exit(EXIT_FAILURE, "Failed to open input file\n");
>
> if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
> fclose(fp);
> rte_exit(EXIT_FAILURE, "Failed to read input file\n");
> }
> fclose(fp);
>
> /* Step 11: Create ML op mempool */
> op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
> if (op_pool == NULL)
> rte_exit(EXIT_FAILURE, "Failed to create op pool\n");
>
> /* Step 12: Form an ML op */
> rte_mempool_get_bulk(op_pool, (void *)op_enq, 1);
> op_enq->model_id = ctx.id;
> op_enq->nb_batches = ctx.info.batch_size;
> op_enq->mempool = op_pool;
> op_enq->input.addr = ctx.input_buffer;
> op_enq->input.length = ctx.input_size;
> op_enq->input.next = NULL;
> op_enq->output.addr = ctx.output_buffer;
> op_enq->output.length = ctx.output_size;
> op_enq->output.next = NULL;
>
> /* Step 13: Enqueue jobs */
> rte_ml_enqueue_burst(0, 0, &op_enq, 1);
>
> /* Step 14: Dequeue jobs and release op pool */
> while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
> ;
>
> /* Step 15: Write output */
> fp = fopen(ctx.out_file, "w+");
> if (fp == NULL)
> rte_exit(EXIT_FAILURE, "Failed to open output file\n");
> fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
> fclose(fp);
>
> /* Step 16: Clean up */
> /* Stop ML model */
> rte_ml_model_stop(0, ctx.id);
> /* Unload ML model */
> rte_ml_model_unload(0, ctx.id);
> /* Free input/output memory */
> rte_memzone_free(rte_memzone_lookup(IO_MZ));
> /* Free the ml op back to pool */
> rte_mempool_put_bulk(op_pool, (void *)op_deq, 1);
> /* Free ml op pool */
> rte_mempool_free(op_pool);
> /* Stop the device */
> rte_ml_dev_stop(0);
> rte_ml_dev_close(0);
> rte_eal_cleanup();
>
> return 0;
> }
>
>
> Jerin Jacob (1):
> mldev: introduce machine learning device library
>
> Srikanth Yalavarthi (11):
> mldev: support PMD functions for ML device
> mldev: support ML device handling functions
> mldev: support ML device queue-pair setup
> mldev: support handling ML models
> mldev: support input and output data handling
> mldev: support ML op pool and ops
> mldev: support inference enqueue and dequeue
> mldev: support device statistics
> mldev: support device extended statistics
> mldev: support to retrieve error information
> mldev: support to get debug info and test device
>
> MAINTAINERS | 5 +
> doc/api/doxy-api-index.md | 1 +
> doc/api/doxy-api.conf.in | 1 +
> doc/guides/prog_guide/img/mldev_flow.svg | 714 ++++++++++++++
> doc/guides/prog_guide/index.rst | 1 +
> doc/guides/prog_guide/mldev.rst | 186 ++++
> doc/guides/rel_notes/release_23_03.rst | 5 +
> lib/meson.build | 1 +
> lib/mldev/meson.build | 27 +
> lib/mldev/rte_mldev.c | 947 ++++++++++++++++++
> lib/mldev/rte_mldev.h | 1119 ++++++++++++++++++++++
> lib/mldev/rte_mldev_core.h | 717 ++++++++++++++
> lib/mldev/rte_mldev_pmd.c | 62 ++
> lib/mldev/rte_mldev_pmd.h | 151 +++
> lib/mldev/version.map | 51 +
> 15 files changed, 3988 insertions(+)
> create mode 100644 doc/guides/prog_guide/img/mldev_flow.svg
> create mode 100644 doc/guides/prog_guide/mldev.rst
> create mode 100644 lib/mldev/meson.build
> create mode 100644 lib/mldev/rte_mldev.c
> create mode 100644 lib/mldev/rte_mldev.h
> create mode 100644 lib/mldev/rte_mldev_core.h
> create mode 100644 lib/mldev/rte_mldev_pmd.c
> create mode 100644 lib/mldev/rte_mldev_pmd.h
> create mode 100644 lib/mldev/version.map
>
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library
2023-02-15 12:55 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library Ferruh Yigit
@ 2023-02-15 17:03 ` Jerin Jacob
0 siblings, 0 replies; 80+ messages in thread
From: Jerin Jacob @ 2023-02-15 17:03 UTC (permalink / raw)
To: Ferruh Yigit
Cc: jerinj, dev, thomas, stephen, dchickles, sshankarnara, Bahri,
Aziz, O'Donohoe, Fionn, Srikanth Yalavarthi
On Wed, Feb 15, 2023 at 6:25 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>
> On 2/7/2023 3:13 PM, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
>
> Hi Jerin,
>
> Please find some comments/questions gathered with the help of some
> colleagues.
Thanks Ferruh for the review.
>
> > Machine learning inference library
> > ==================================
> >
> > Definition of machine learning inference
> > ----------------------------------------
> > Inference in machine learning is the process of making an output prediction
> > based on new input data using a pre-trained machine learning model.
> >
> > The scope of the RFC would include only inferencing with pre-trained machine learning models,
> > training and building/compiling the ML models is out of scope for this RFC or
> > DPDK mldev API. Use existing machine learning compiler frameworks for model creation.
> >
> > Motivation for the new library
> > ------------------------------
> > Multiple semiconductor vendors are offering accelerator products such as DPU
> > (often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
> > integrated as part of the product. Use of ML inferencing is increasing in the domain
> > of packet processing for flow classification, intrusion, malware and anomaly detection.
> >
>
> Agree on this need.
>
> > Lack of inferencing support through DPDK APIs will involve complexities and
> > increased latency from moving data across frameworks (i.e, dataplane to
> > non dataplane ML frameworks and vice-versa). Having a standardized DPDK APIs for ML
> > inferencing would enable the dataplane solutions to harness the benefit of inline
> > inferencing supported by the hardware.
> >
>
> ack
>
> > Contents
> > ---------------
> > A) API specification for:
> >
> > 1) Discovery of ML capabilities (e.g., device specific features) in a vendor
> > independent fashion
> > 2) Definition of functions to handle ML devices, which includes probing,
> > initialization and termination of the devices.
> > 3) Definition of functions to handle ML models used to perform inference operations.
> > 4) Definition of function to handle quantize and dequantize operations
> >
> > B) Common code for above specification
> >
> > rfc..v1:
> > - Added programmer guide documentation
> > - Added implementation for common code
> >
> > v2..v1:
> > - Moved dynamic log (Stephen)
> > - model id to uint16_t from int16t_t (Stephen)
> > - added release note updates
> >
> > v3..v2:
> > - Introduced rte_ml_dev_init() similar to rte_gpu_init() (Stephen, Thomas)
> > - In struct rte_ml_dev_data, removed reserved[3] and __rte_cache_aligned.
> > Also, moved name field to the end(Stephen)
> >
> > Machine learning library framework
> > ----------------------------------
> >
> > The ML framework is built on the following model:
> >
> >
> > +-----------------+ rte_ml_[en|de]queue_burst()
> > | | |
> > | Machine o------+ +--------+ |
> > | Learning | | | queue | | +------+
> > | Inference o------+-----o |<===o===>|Core 0|
> > | Engine | | | pair 0 | +------+
> > | o----+ | +--------+
> > | | | |
> > +-----------------+ | | +--------+
> > ^ | | | queue | +------+
> > | | +-----o |<=======>|Core 1|
> > | | | pair 1 | +------+
> > | | +--------+
> > +--------+--------+ |
> > | +-------------+ | | +--------+
> > | | Model 0 | | | | queue | +------+
> > | +-------------+ | +-------o |<=======>|Core N|
> > | +-------------+ | | pair N | +------+
> > | | Model 1 | | +--------+
> > | +-------------+ |
> > | +-------------+ |<------- rte_ml_model_load()
> > | | Model .. | |-------> rte_ml_model_info()
> > | +-------------+ |<------- rte_ml_model_start()
> > | +-------------+ |<------- rte_ml_model_stop()
> > | | Model N | |<------- rte_ml_model_params_update()
> > | +-------------+ |<------- rte_ml_model_unload()
> > +-----------------+
> >
>
>
> Should model load/unload, params_update be part of DPDK, or can DPDK
> assume these are already in place?
The driver hooks can be NOPs if the model is already loaded, or for
fixed-model FPGA solutions.
Probably we can add a parameter to info_get() in case someone thinks the
user needs to be aware of it.
Currently it is an experimental API; when such a device comes we can
extend the rte_ml_dev_info structure if/as needed.
> For FPGA both options work; what is
> the benefit of having these APIs as part of DPDK?
Support for runtime model load/unload if the ML device supports it.
Also, it has some bearing on the data path, as inference
needs to be stopped if one needs to unload the model.
> What are the use cases for other architectures?
When a device supports a maximum of N models (rte_ml_dev_info::max_models),
a user can replace an unused model at runtime once the maximum number of
models is reached.
>
>
> Are multiple active models supported at the same time?
> For the FPGA case multiple models may exist at the same time; it would be
> good to have a way to select the model to use, like a model handle that
> the API accepts.
> Similarly, a model handle may help chaining models, possibly with the
> help of additional APIs to define the chaining.
Yes, multiple active models are supported simultaneously.
Each loaded model has a unique model_id assigned by the driver, which can
be used as a handle while queuing inference requests or doing slow-path
operations, as sketched below.
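For illustration, a fragment loading two models side by side on device 0,
following the load/start calls from the example application; params0 and
params1 are hypothetical, pre-populated rte_ml_model_params structures, and
error handling is elided.

uint16_t id0, id1;

/* Each load returns a driver-assigned model_id, used as the handle. */
rte_ml_model_load(0, &params0, &id0);
rte_ml_model_load(0, &params1, &id1);
rte_ml_model_start(0, id0);
rte_ml_model_start(0, id1);

/* Ops are steered to a model via the handle. */
op_enq->model_id = id0; /* infer on the first model */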
>
> > ML Device: A hardware or software-based implementation of ML device API for
> > running inferences using a pre-trained ML model.
> >
>
> Can this device consume multiple queues in parallel?
Yes, the spec doesn't impose any restrictions on the number of queues that
can be consumed by the device.
The actual number of queues permitted is dependent on the device;
see rte_ml_dev_info::max_queue_pairs.
>
> > ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> > procedure/algorithm and data/pattern required to make predictions on live data.
> > Once the model is created and trained outside of the DPDK scope, the model can be loaded
> > via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > The rte_ml_model_params_update() can be used to update the model parameters such as weight
> > and bias without unloading the model using rte_ml_model_unload().
> >
> > ML Inference: ML inference is the process of feeding data to the model via
> > rte_ml_enqueue_burst() API and use rte_ml_dequeue_burst() API to get the calculated
> > outputs/predictions from the started model.
> >
> > In all functions of the ML device API, the ML device is designated by an
> > integer >= 0 named as device identifier *dev_id*.
> >
> > The functions exported by the ML device API to setup a device designated by
> > its device identifier must be invoked in the following order:
> >
> > - rte_ml_dev_configure()
> > - rte_ml_dev_queue_pair_setup()
> > - rte_ml_dev_start()
> >
> > A model is required to run the inference operations with the user specified inputs.
> > Application needs to invoke the ML model API in the following order before queueing
> > inference jobs.
> >
> > - rte_ml_model_load()
> > - rte_ml_model_start()
> >
> > The rte_ml_model_info() API is provided to retrieve the information related to the model.
> > The information would include the shape and type of input and output required for the inference.
> >
>
> It seems there is a standardization effort for model description, called
> ONNX (https://onnx.ai/), supported by many vendors.
>
> Does it make sense that 'rte_ml_model_info()' describes the data, and
> perhaps the model itself too, using the ONNX format?
The ONNX format presents a higher level of detail, which may not be
required from a dataplane point of view.
We have proposed an rte_ml_model_info struct with the most important
fields that are required. This structure can be expanded further as
needed.
>
> > Data quantization and dequantization is one of the main aspects in ML domain. This involves
> > conversion of input data from a higher precision to a lower precision data type and vice-versa
> > for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
> > dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
> > and output buffers holding data for multiple batches.
> > Two utility APIs rte_ml_io_input_size_get() and rte_ml_io_output_size_get() can used to get the
> > size of quantized and de-quantized multi-batch input and output buffers.
> >
>
>
> It seems quantize and dequantize can be part of the model and optimized
> during training; can you please share some information on the HW
> architecture that needs these APIs?
Marvell HW engines don't support quantization/dequantization in HW, and
the same has to be done by software on ARM64/x64 cores.
The same applies to SW-based ML devices.
So it can be a NOP for a device which already supports it, or we can
introduce the capability when such drivers are added to DPDK. There is
no issue in updating the API when a new driver comes.
>
> Does it make sense to have quantize/dequantize as a capability, so that
> in case the HW has specific support for it this can be used, else the
> host can provide this functionality?
We tried to keep the APIs minimal in the initial version. Yes, a
quantize/dequantize capability can be part of the device capabilities
when such devices are added.
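As a fragment illustrating the SW path described above, de-quantizing the
device output after an inference might look as follows; ctx refers to the
example application's context. The exact rte_ml_io_dequantize() signature
used here (dev_id, model_id, nb_batches, quantized buffer, de-quantized
buffer) is an assumption, not taken from this thread, and deq_buffer is a
hypothetical application buffer.

/* Convert the device's quantized output into application precision. */
rte_ml_io_dequantize(0, ctx.id, ctx.info.batch_size,
		     ctx.output_buffer, deq_buffer);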
>
> > User can optionally update the model parameters with rte_ml_model_params_update() after
> > invoking rte_ml_model_stop() API on a given model ID.
> >
> > The application can invoke, in any order, the functions exported by the ML API to enqueue
> > inference jobs and dequeue inference response.
> >
> > If the application wants to change the device configuration (i.e., call
> > rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then application must stop the
> > device using rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
> > the application must call rte_ml_model_stop() followed by rte_ml_model_params_update() API
> > for the given model. The application does not need to call rte_ml_dev_stop() API for
> > any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
> >
> > Once the device is in the start state after invoking rte_ml_dev_start() API and the model is in
> > start state after invoking rte_ml_model_start() API, then the application can call
> > rte_ml_enqueue() and rte_ml_dequeue() API on the destined device and model ID.
> >
> > Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
> >
> > Typical application utilisation of the ML API will follow the following
> > programming flow.
> >
> > - rte_ml_dev_configure()
> > - rte_ml_dev_queue_pair_setup()
> > - rte_ml_model_load()
> > - rte_ml_model_start()
> > - rte_ml_model_info()
> > - rte_ml_dev_start()
> > - rte_ml_enqueue_burst()
> > - rte_ml_dequeue_burst()
> > - rte_ml_model_stop()
> > - rte_ml_model_unload()
> > - rte_ml_dev_stop()
> > - rte_ml_dev_close()
> >
>
> Is a 'reset()' API needed?
We can add it in the future, if a specific HW needs/supports it. We are
keeping the bare minimum for the first version.
>
> > [...]
^ permalink raw reply [flat|nested] 80+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
` (12 preceding siblings ...)
2023-02-15 12:55 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library Ferruh Yigit
@ 2023-03-09 17:33 ` Thomas Monjalon
13 siblings, 0 replies; 80+ messages in thread
From: Thomas Monjalon @ 2023-03-09 17:33 UTC (permalink / raw)
To: sshankarnara, Jerin Jacob; +Cc: dev, ferruh.yigit, stephen, dchickles
07/02/2023 16:13, jerinj@marvell.com:
> Jerin Jacob (1):
> mldev: introduce machine learning device library
>
> Srikanth Yalavarthi (11):
> mldev: support PMD functions for ML device
> mldev: support ML device handling functions
> mldev: support ML device queue-pair setup
> mldev: support handling ML models
> mldev: support input and output data handling
> mldev: support ML op pool and ops
> mldev: support inference enqueue and dequeue
> mldev: support device statistics
> mldev: support device extended statistics
> mldev: support to retrieve error information
> mldev: support to get debug info and test device
Applied with a few styling improvements for the doc, thanks.
One more library in DPDK :)
^ permalink raw reply [flat|nested] 80+ messages in thread
end of thread, other threads:[~2023-03-09 17:33 UTC | newest]
Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-03 13:28 [dpdk-dev] [RFC PATCH 0/1] mldev: introduce machine learning device library jerinj
2022-08-03 13:28 ` [dpdk-dev] [RFC PATCH 1/1] " jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 00/12] " jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 01/12] " jerinj
2023-02-01 13:34 ` Shivah Shankar Shankar Narayan Rao
2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 8:42 ` Thomas Monjalon
2023-02-03 17:33 ` Stephen Hemminger
2023-02-03 20:18 ` Thomas Monjalon
2023-02-03 20:26 ` Stephen Hemminger
2023-02-03 20:49 ` Thomas Monjalon
2023-02-05 23:41 ` Stephen Hemminger
2023-02-03 10:01 ` Jerin Jacob
2023-02-03 0:25 ` Stephen Hemminger
2023-02-03 10:04 ` Jerin Jacob
2023-02-03 0:28 ` Stephen Hemminger
2023-02-03 10:03 ` Jerin Jacob
2023-02-02 5:26 ` Shivah Shankar Shankar Narayan Rao
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 02/12] mldev: add PMD functions for ML device jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 03/12] mldev: support device handling functions jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 04/12] mldev: support device queue-pair setup jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 05/12] mldev: support handling ML models jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 06/12] mldev: support input and output data handling jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 07/12] mldev: support op pool and its operations jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 08/12] mldev: support inference enqueue and dequeue jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 09/12] mldev: support device statistics jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 10/12] mldev: support device extended statistics jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 11/12] mldev: support to retrieve error information jerinj
2022-11-14 12:02 ` [dpdk-dev] [PATCH v1 12/12] mldev: support to get debug info and test device jerinj
2023-01-25 14:20 ` [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library Thomas Monjalon
2023-01-25 19:01 ` Jerin Jacob
2023-01-26 11:11 ` Thomas Monjalon
2023-01-27 2:33 ` [EXT] " Shivah Shankar Shankar Narayan Rao
2023-01-27 4:29 ` Jerin Jacob
2023-01-27 11:34 ` Thomas Monjalon
2023-01-28 11:27 ` Jerin Jacob
2023-02-01 16:57 ` Thomas Monjalon
2023-02-01 17:33 ` Jerin Jacob
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 " jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 01/12] " jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 02/12] mldev: support PMD functions for ML device jerinj
2023-02-06 21:04 ` Stephen Hemminger
2023-02-06 22:17 ` Thomas Monjalon
2023-02-07 5:16 ` Jerin Jacob
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 03/12] mldev: support ML device handling functions jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 04/12] mldev: support ML device queue-pair setup jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 05/12] mldev: support handling ML models jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 06/12] mldev: support input and output data handling jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 07/12] mldev: support ML op pool and ops jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 08/12] mldev: support inference enqueue and dequeue jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 09/12] mldev: support device statistics jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 10/12] mldev: support device extended statistics jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 11/12] mldev: support to retrieve error information jerinj
2023-02-06 20:24 ` [dpdk-dev] [PATCH v2 12/12] mldev: support to get debug info and test device jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 01/12] " jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 02/12] mldev: support PMD functions for ML device jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 03/12] mldev: support ML device handling functions jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 04/12] mldev: support ML device queue-pair setup jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 05/12] mldev: support handling ML models jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 06/12] mldev: support input and output data handling jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 07/12] mldev: support ML op pool and ops jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 08/12] mldev: support inference enqueue and dequeue jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 09/12] mldev: support device statistics jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 10/12] mldev: support device extended statistics jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 11/12] mldev: support to retrieve error information jerinj
2023-02-07 15:13 ` [dpdk-dev] [PATCH v3 12/12] mldev: support to get debug info and test device jerinj
2023-02-15 12:55 ` [dpdk-dev] [PATCH v3 00/12] mldev: introduce machine learning device library Ferruh Yigit
2023-02-15 17:03 ` Jerin Jacob
2023-03-09 17:33 ` Thomas Monjalon
2022-08-03 15:19 ` [dpdk-dev] [RFC PATCH 0/1] " Stephen Hemminger
2022-08-16 13:13 ` Jerin Jacob
2022-08-16 15:45 ` Morten Brørup
2022-08-16 16:34 ` Honnappa Nagarahalli
2022-08-17 14:53 ` Jerin Jacob
2023-01-25 13:47 ` Thomas Monjalon
2023-01-25 13:54 ` Jerin Jacob
2022-08-17 5:37 ` Jerin Jacob
2022-08-17 6:58 ` Morten Brørup
2023-01-25 13:45 ` Thomas Monjalon