From: Jerin Jacob
Date: Thu, 26 Jan 2023 00:31:43 +0530
Subject: Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
To: Thomas Monjalon
Cc: dev@dpdk.org
In-Reply-To: <3877164.bqPgKRP4r2@thomas>
References: <20220803132839.2747858-2-jerinj@marvell.com> <20221114120238.2143832-1-jerinj@marvell.com> <3877164.bqPgKRP4r2@thomas>
List-Id: DPDK patches and discussions

On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon wrote:
>
> 14/11/2022 13:02, jerinj@marvell.com:
> > From: Jerin Jacob
> >
> > Machine learning inference library
> > ==================================
> >
> > Definition of machine learning inference
> > ----------------------------------------
> > Inference in machine learning is the process of making an output prediction
> > based on new input data using a pre-trained machine learning model.
> >
> > The scope of this RFC includes only inferencing with pre-trained machine
> > learning models; training and building/compiling ML models is out of scope
> > for this RFC and the DPDK mldev API. Existing machine learning compiler
> > frameworks should be used for model creation.
> >
> > Motivation for the new library
> > ------------------------------
> > Multiple semiconductor vendors are offering accelerator products such as
> > DPUs (often called SmartNICs), FPGAs, GPUs, etc., which have ML inferencing
> > capabilities integrated as part of the product. Use of ML inferencing is
> > increasing in the domain of packet processing for flow classification,
> > intrusion, malware and anomaly detection.
> >
> > Lack of inferencing support through DPDK APIs will involve complexities and
> > increased latency from moving data across frameworks (i.e., dataplane to
> > non-dataplane ML frameworks and vice versa). Having standardized DPDK APIs
> > for ML inferencing would enable dataplane solutions to harness the benefit
> > of inline inferencing supported by the hardware.
> >
> > Contents
> > --------
> >
> > A) API specification for:
> >
> > 1) Discovery of ML capabilities (e.g., device-specific features) in a
> >    vendor-independent fashion
> > 2) Definition of functions to handle ML devices, which includes probing,
> >    initialization and termination of the devices.
> > 3) Definition of functions to handle ML models used to perform inference
> >    operations.
> > 4) Definition of functions to handle quantize and dequantize operations
> >
> > B) Common code for the above specification

Thanks for the review.

> Can we compare this library with WinML?
> https://learn.microsoft.com/en-us/windows/ai/windows-ml/api-reference

The proposed DPDK library supports only inferencing with pre-trained models.

> Are there things we can learn from it?

Compared to WinML, this API provides functionality similar to what the
"LearningModel*" classes provide. Support for handling custom operators and
for native APIs like WinML's is provided through this API. There may be more
features we can add where there are drivers that support them.

> > ML Model: An ML model is an algorithm trained over a dataset.
> > A model consists of
> > the procedure/algorithm and the data/patterns required to make predictions
> > on live data. Once the model is created and trained outside of the DPDK
> > scope, the model can be loaded via rte_ml_model_load() and then started
> > using the rte_ml_model_start() API. rte_ml_model_params_update() can be
> > used to update model parameters such as weights and bias without unloading
> > the model via rte_ml_model_unload().
>
> The fact that the model is prepared outside means the model format is free
> and probably different per mldev driver.
> I think it is OK but it requires a lot of documentation effort to explain
> how to bind the model and its parameters with the DPDK API.
> Also we may need to pass some metadata from the model builder
> to the inference engine in order to enable optimizations prepared in the model.
> And the other way, we may need inference capabilities in order to generate
> an optimized model which can run in the inference engine.

The base API specification is kept to an absolute minimum. Currently, the
weight and bias parameters are updated through rte_ml_model_params_update().
It can be extended when there are drivers that support it, or if you have a
specific parameter you would like added to rte_ml_model_params_update().

Other metadata such as batch, shapes and formats can be queried using
rte_ml_io_info().

> > [...]
> > A typical application's use of the ML API will follow this programming
> > flow:
> >
> > - rte_ml_dev_configure()
> > - rte_ml_dev_queue_pair_setup()
> > - rte_ml_model_load()
> > - rte_ml_model_start()
> > - rte_ml_model_info()
> > - rte_ml_dev_start()
> > - rte_ml_enqueue_burst()
> > - rte_ml_dequeue_burst()
> > - rte_ml_model_stop()
> > - rte_ml_model_unload()
> > - rte_ml_dev_stop()
> > - rte_ml_dev_close()
>
> Where is parameters update in this flow?

The list above covers the mandatory APIs in the top-level flow;
rte_ml_model_params_update() is used to update the parameters.
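To make the ordering concrete, the flow above (with the parameter update
slotted in) might look roughly as follows in application code. This is an
outline only: the argument lists and the names dev_conf, qp_conf,
model_params, new_params, info, ops, etc. are placeholders from this RFC
discussion, not a finalized API.

```
/* one-time setup */
rte_ml_dev_configure(dev_id, &dev_conf);
rte_ml_dev_queue_pair_setup(dev_id, qp_id, &qp_conf, socket_id);
rte_ml_model_load(dev_id, &model_params, &model_id);
rte_ml_model_start(dev_id, model_id);
rte_ml_model_info(dev_id, model_id, &info);  /* query shapes, formats, batch */
rte_ml_dev_start(dev_id);

/* fast path: submit inference ops, then collect completions */
nb_enq = rte_ml_enqueue_burst(dev_id, qp_id, ops, nb_ops);
nb_deq = rte_ml_dequeue_burst(dev_id, qp_id, ops, nb_ops);

/* while the device is stopped, weights/bias can be refreshed
 * in place without unloading the model */
rte_ml_dev_stop(dev_id);
rte_ml_model_params_update(dev_id, model_id, new_params);
rte_ml_dev_start(dev_id);

/* teardown */
rte_ml_model_stop(dev_id, model_id);
rte_ml_model_unload(dev_id, model_id);
rte_ml_dev_stop(dev_id);
rte_ml_dev_close(dev_id);
```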
> Should we update all parameters at once or can it be done more fine-grain?

Currently, rte_ml_model_params_update() can be used to update the weights and
bias via a buffer when the device is in the stop state, without unloading the
model.

> Question about the memory used by mldev:
> Can we manage where the memory is allocated (host, device, mix, etc)?

We are just passing buffer pointers now, like other subsystems. Other EAL
infrastructure services can take care of memory locality, as it is not
specific to mldev.

+/** ML operation's input and output buffer representation as a scatter-gather
+ * list.
+ */
+struct rte_ml_buff_seg {
+	rte_iova_t iova_addr;
+	/**< IOVA address of segment buffer. */
+	void *addr;
+	/**< Virtual address of segment buffer. */
+	uint32_t length;
+	/**< Segment length. */
+	uint32_t reserved;
+	/**< Reserved for future use. */
+	struct rte_ml_buff_seg *next;
+	/**< Points to next segment. Value NULL represents the last segment. */
+};