From: Thomas Monjalon
To: Jerin Jacob
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
Date: Wed, 25 Jan 2023 15:20:04 +0100
Message-ID: <3877164.bqPgKRP4r2@thomas>
In-Reply-To: <20221114120238.2143832-1-jerinj@marvell.com>
References: <20220803132839.2747858-2-jerinj@marvell.com> <20221114120238.2143832-1-jerinj@marvell.com>
List-Id: DPDK patches and discussions

14/11/2022 13:02, jerinj@marvell.com:
> From: Jerin Jacob
>
> Machine learning inference library
> ==================================
>
> Definition of machine learning inference
> ----------------------------------------
> Inference in machine learning is the process of making an output prediction
> based on new input data using a pre-trained machine learning model.
>
> The scope of this RFC includes only inferencing with pre-trained machine
> learning models; training and building/compiling ML models are out of scope
> for this RFC and the DPDK mldev API. Use existing machine learning compiler
> frameworks for model creation.
>
> Motivation for the new library
> ------------------------------
> Multiple semiconductor vendors are offering accelerator products such as DPU
> (often called SmartNIC), FPGA, GPU, etc., which have ML inferencing
> capabilities integrated as part of the product. Use of ML inferencing is
> increasing in the domain of packet processing for flow classification,
> intrusion, malware and anomaly detection.
>
> Lack of inferencing support through DPDK APIs would involve complexity and
> increased latency from moving data across frameworks (i.e., from the
> dataplane to non-dataplane ML frameworks and vice versa). Having
> standardized DPDK APIs for ML inferencing would enable dataplane solutions
> to harness the benefit of inline inferencing supported by the hardware.
>
> Contents
> --------
>
> A) API specification for:
>
> 1) Discovery of ML capabilities (e.g., device-specific features) in a
>    vendor-independent fashion
> 2) Definition of functions to handle ML devices, which includes probing,
>    initialization and termination of the devices.
> 3) Definition of functions to handle ML models used to perform inference
>    operations.
> 4) Definition of functions to handle quantize and dequantize operations.
>
> B) Common code for the above specification

Can we compare this library with WinML?
https://learn.microsoft.com/en-us/windows/ai/windows-ml/api-reference
Are there things we can learn from it?
> ML Model: An ML model is an algorithm trained over a dataset. A model
> consists of the procedure/algorithm and the data/pattern required to make
> predictions on live data. Once the model is created and trained outside of
> the DPDK scope, it can be loaded via rte_ml_model_load() and then started
> using the rte_ml_model_start() API. rte_ml_model_params_update() can be
> used to update model parameters such as weights and bias without unloading
> the model via rte_ml_model_unload().

The fact that the model is prepared outside means the model format is free
and probably different per mldev driver.
I think that is OK, but it requires a lot of documentation effort to explain
how to bind the model and its parameters with the DPDK API.
Also, we may need to pass some metadata from the model builder to the
inference engine in order to enable optimizations prepared in the model.
And the other way around, we may need inference capabilities in order to
generate an optimized model which can run in the inference engine.

[...]

> Typical application utilisation of the ML API will follow the following
> programming flow.
>
> - rte_ml_dev_configure()
> - rte_ml_dev_queue_pair_setup()
> - rte_ml_model_load()
> - rte_ml_model_start()
> - rte_ml_model_info()
> - rte_ml_dev_start()
> - rte_ml_enqueue_burst()
> - rte_ml_dequeue_burst()
> - rte_ml_model_stop()
> - rte_ml_model_unload()
> - rte_ml_dev_stop()
> - rte_ml_dev_close()

Where does the parameters update fit in this flow?
Should we update all parameters at once, or can it be done in a more
fine-grained way?

A question about the memory used by mldev:
can we manage where the memory is allocated (host, device, mix, etc.)?
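Stitching the flow above together, this is roughly what I understand an application would look like. This is a sketch in C syntax for orientation only, not compilable code: the argument lists, struct names and loop structure are my assumptions from reading the RFC, not the proposed prototypes.

```c
/* Sketch only: prototypes and struct fields are assumptions, error
 * handling omitted. Only the function names come from the RFC. */
int16_t dev_id = 0;
uint16_t qp_id = 0, model_id;

rte_ml_dev_configure(dev_id, &dev_conf);             /* nb queue pairs, socket */
rte_ml_dev_queue_pair_setup(dev_id, qp_id, &qp_conf, socket_id);

rte_ml_model_load(dev_id, &model_params, &model_id); /* vendor-format blob */
rte_ml_model_start(dev_id, model_id);
rte_ml_model_info(dev_id, model_id, &info);          /* I/O shapes, quantization */

rte_ml_dev_start(dev_id);

for (;;) {                                           /* dataplane loop */
	rte_ml_enqueue_burst(dev_id, qp_id, ops, burst_sz);
	/* ... prepare next inputs ... */
	rte_ml_dequeue_burst(dev_id, qp_id, done, burst_sz);
}

rte_ml_model_stop(dev_id, model_id);
rte_ml_model_unload(dev_id, model_id);
rte_ml_dev_stop(dev_id);
rte_ml_dev_close(dev_id);
```

My parameters-update question above is about this picture: it is unclear whether rte_ml_model_params_update() could be called inside the dataplane loop, or only while the model is stopped.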