From: Thomas Monjalon
To: Jerin Jacob, Jerin Jacob
Cc: dev@dpdk.org, ferruh.yigit@amd.com, ajit.khaparde@broadcom.com,
    aboyer@pensando.io, andrew.rybchenko@oktetlabs.ru, beilei.xing@intel.com,
    bruce.richardson@intel.com, chas3@att.com, chenbo.xia@intel.com,
    ciara.loftus@intel.com, dsinghrawat@marvell.com, ed.czeck@atomicrules.com,
    evgenys@amazon.com, grive@u256.net, g.singh@nxp.com,
    zhouguoyang@huawei.com, haiyue.wang@intel.com, hkalra@marvell.com,
    heinrich.kuhn@corigine.com, hemant.agrawal@nxp.com, hyonkim@cisco.com,
    igorch@amazon.com,
    irusskikh@marvell.com, jgrajcia@cisco.com, jasvinder.singh@intel.com,
    jianwang@trustnetic.com, jiawenwu@trustnetic.com, jingjing.wu@intel.com,
    johndale@cisco.com, john.miller@atomicrules.com, linville@tuxdriver.com,
    keith.wiles@intel.com, kirankumark@marvell.com, oulijun@huawei.com,
    lironh@marvell.com, longli@microsoft.com, mw@semihalf.com,
    spinler@cesnet.cz, matan@nvidia.com, matt.peters@windriver.com,
    maxime.coquelin@redhat.com, mk@semihalf.com, humin29@huawei.com,
    pnalla@marvell.com, ndabilpuram@marvell.com, qiming.yang@intel.com,
    qi.z.zhang@intel.com, radhac@marvell.com, rahul.lakkireddy@chelsio.com,
    rmody@marvell.com, rosen.xu@intel.com, sachin.saxena@oss.nxp.com,
    skoteshwar@marvell.com, shshaikh@marvell.com, shaibran@amazon.com,
    shepard.siegel@atomicrules.com, asomalap@amd.com,
    somnath.kotur@broadcom.com, sthemmin@microsoft.com,
    steven.webster@windriver.com, skori@marvell.com, mtetsuyah@gmail.com,
    vburru@marvell.com, viacheslavo@nvidia.com, xiao.w.wang@intel.com,
    cloud.wangxiaoyun@huawei.com, yisen.zhuang@huawei.com,
    yongwang@vmware.com, xuanziyang2@huawei.com, pkapoor@marvell.com,
    nadavh@marvell.com, sburla@marvell.com, pathreya@marvell.com,
    gakhil@marvell.com, dmitry.kozliuk@gmail.com, anatoly.burakov@intel.com,
    cristian.dumitrescu@intel.com, honnappa.nagarahalli@arm.com,
    mattias.ronnblom@ericsson.com, ruifeng.wang@arm.com,
    drc@linux.vnet.ibm.com, konstantin.ananyev@intel.com,
    olivier.matz@6wind.com, jay.jayatheerthan@intel.com, asekhar@marvell.com,
    pbhagavatula@marvell.com, eagostini@nvidia.com, syalavarthi@marvell.com,
    dchickles@marvell.com, sshankarnara@marvell.com, david.marchand@redhat.com
Subject: Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
Date: Thu, 26 Jan 2023 12:11:53 +0100
Message-ID: <2491874.vBoWY3egPC@thomas>
References: <20220803132839.2747858-2-jerinj@marvell.com> <3877164.bqPgKRP4r2@thomas>

25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon wrote:
> > 14/11/2022 13:02, jerinj@marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A model
> > > consists of a procedure/algorithm and the data/patterns required to
> > > make predictions on live data. Once the model has been created and
> > > trained outside the scope of DPDK, it can be loaded via
> > > rte_ml_model_load() and started using the rte_ml_model_start() API.
> > > rte_ml_model_params_update() can be used to update model parameters
> > > such as weights and bias without unloading the model via
> > > rte_ml_model_unload().
> >
> > The fact that the model is prepared outside means the model format is free
> > and probably different per mldev driver.
> > I think that is OK, but it requires a lot of documentation effort to
> > explain how to bind the model and its parameters to the DPDK API.
> > We may also need to pass some metadata from the model builder
> > to the inference engine in order to enable optimizations prepared in the model.
> > And the other way around, we may need inference capabilities in order to
> > generate an optimized model which can run in the inference engine.
>
> The base API specification is kept to an absolute minimum.
> Currently, the weight and bias parameters are updated through
> rte_ml_model_params_update(). It can be extended when drivers support
> more, or if there is any specific parameter you would like to add to
> rte_ml_model_params_update().

This function is:

	int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);

How are we supposed to provide separate parameters in this void *?
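To make the concern concrete, here is a purely hypothetical sketch of what an
application/driver contract for that buffer could look like. None of these
types exist in the v1 patch; they are illustrative assumptions only:

	#include <stdint.h>

	/* Hypothetical tag telling the driver which parameter blob follows.
	 * NOT part of the proposed rte_mldev API. */
	enum param_kind { PARAM_WEIGHTS, PARAM_BIAS };

	struct param_update_desc {
		enum param_kind kind; /* which parameter set is replaced */
		uint32_t offset;      /* byte offset into the parameter area */
		uint32_t size;        /* size of the payload that follows */
		uint8_t payload[];    /* raw data, in the model's own format */
	};

	/* With v1 as proposed, the application can only do:
	 *   rte_ml_model_params_update(dev_id, model_id, (void *)desc);
	 * so the layout of *desc is a private contract with each driver. */

If every driver invents its own buffer layout, applications will not be
portable across drivers.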
> Other metadata, such as batch, shapes and formats, is queried using
> rte_ml_io_info().

Copying:

+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+	char name[RTE_ML_STR_MAX];
+	/**< Name of data */
+	struct rte_ml_io_shape shape;
+	/**< Shape of data */
+	enum rte_ml_io_type qtype;
+	/**< Type of quantized data */
+	enum rte_ml_io_type dtype;
+	/**< Type of de-quantized data */
+};

Is this the right place to notify the app that some model optimizations
are supported? (example: merging some operations in the graph)

> > [...]
> > > A typical application's use of the ML API will follow this
> > > programming flow:
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where is the parameters update in this flow?
>
> Added the mandatory APIs in the top-level flow doc.
> rte_ml_model_params_update() is used to update the parameters.

The question is *where* it should be done: before or after start?
(The sketch at the end of this mail shows where it could sit.)

> > Should we update all parameters at once, or can it be done in a more
> > fine-grained way?
>
> Currently, rte_ml_model_params_update() can be used to update the weights
> and bias via a buffer while the device is in the stopped state, without
> unloading the model.

The question is: can we update a single parameter? And how?

> > Question about the memory used by mldev:
> > can we manage where the memory is allocated (host, device, mix, etc.)?
>
> We are just passing buffer pointers now, like other subsystems do.
> Other EAL infra services can take care of memory locality, as it is not
> specific to mldev.

I was thinking about the memory allocations required by the inference engine.
How do we specify where to allocate? Is it just hardcoded in the driver?
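For reference, here is a hedged sketch of the programming flow quoted above,
to make the ordering question concrete. Only the function names come from the
v1 patch; the header name, argument lists and struct contents are assumptions
loosely modeled on other DPDK device classes (cryptodev/eventdev) and may not
match the actual mldev headers:

	#include <rte_lcore.h>
	#include <rte_mldev.h>  /* assumed header name from the v1 patch */

	#define QP_ID    0
	#define BURST_SZ 32

	static void
	run_inference(int16_t dev_id, struct rte_ml_model_params *mp)
	{
		/* Struct contents and return-code checks are elided; the
		 * zero-initialized configs are placeholders only. */
		struct rte_ml_dev_config dev_conf = {0}; /* assumed config type */
		struct rte_ml_dev_qp_conf qp_conf = {0}; /* assumed QP config type */
		struct rte_ml_op *ops[BURST_SZ];         /* assumed op type */
		int16_t model_id;                        /* per the quoted signature */
		uint16_t nb;

		rte_ml_dev_configure(dev_id, &dev_conf);
		rte_ml_dev_queue_pair_setup(dev_id, QP_ID, &qp_conf, rte_socket_id());

		rte_ml_model_load(dev_id, mp, &model_id); /* vendor-specific model blob */
		rte_ml_model_start(dev_id, model_id);

		/* Open question from this thread: is rte_ml_model_params_update()
		 * only legal here, before rte_ml_dev_start(), or also once the
		 * device is running? */

		rte_ml_dev_start(dev_id);

		/* Allocating and filling ops[] with quantized inputs is elided. */
		nb = rte_ml_enqueue_burst(dev_id, QP_ID, ops, BURST_SZ);
		nb = rte_ml_dequeue_burst(dev_id, QP_ID, ops, nb);
		(void)nb;

		rte_ml_dev_stop(dev_id);
		rte_ml_model_stop(dev_id, model_id);
		rte_ml_model_unload(dev_id, model_id);
		rte_ml_dev_close(dev_id);
	}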