From: Thomas Monjalon
To: Shivah Shankar Shankar Narayan Rao, Jerin Jacob
Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
Date: Fri, 27 Jan 2023 12:34:15 +0100
Message-ID: <2871791.H8VbNj7W2P@thomas>
References: <20220803132839.2747858-2-jerinj@marvell.com>

Hi,

Shivah Shankar, please quote your
replies so we can distinguish what I said from what you say.

Please try to understand my questions, you tend to reply to something else.


27/01/2023 05:29, Jerin Jacob:
> On Fri, Jan 27, 2023 at 8:04 AM Shivah Shankar Shankar Narayan Rao
> wrote:
> > 25/01/2023 20:01, Jerin Jacob:
> > > On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon wrote:
> > > > 14/11/2022 13:02, jerinj@marvell.com:
> > > > > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > > > > model consists of procedure/algorithm and data/pattern required to
> > > > > > make predictions on live data. Once the model is created and
> > > > > > trained outside of the DPDK scope,
> > > > > > the model can be loaded via rte_ml_model_load() and then start it
> > > > > > using rte_ml_model_start() API. The rte_ml_model_params_update()
> > > > > > can be used to update the model
> > > > > > parameters such as weight and bias without unloading the model
> > > > > > using rte_ml_model_unload().
> > > > > >
> > > > > The fact that the model is prepared outside means the model format
> > > > > is free and probably different per mldev driver.
> > > > > I think it is OK but it requires a lot of documentation effort to
> > > > > explain how to bind the model and its parameters with the DPDK API.
> > > > > Also we may need to pass some metadata from the model builder to the
> > > > > inference engine in order to enable optimizations prepared in the
> > > > > model.
> > > > > And the other way, we may need inference capabilities in order to
> > > > > generate an optimized model which can run in the inference engine.
> > > >
> > > > The base API specification kept absolute minimum. Currently, weight
> > > > and biases parameters updated through rte_ml_model_params_update(). It
> > > > can be extended when there are drivers supports it or if you have any
> > > > specific parameter you would like to add it in
> > > > rte_ml_model_params_update().
> > >
> > > This function is
> > > int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void
> > > *buffer);
> > >
> > > How are we supposed to provide separate parameters in this void* ?
> >
> > Just to clarify on what "parameters" mean,
> > they just mean weights and biases of the model,
> > which are the parameters for a model.
> > Also, the Proposed APIs are for running the inference
> > on a pre-trained model.
> > For running the inference the amount of parameters tuning
> > needed/done is limited/none.

Why is it limited?
I think you are limiting to *your* model.

> > The only parameters that get may get changed are the Weights and Bias
> > which the API rte_ml_model_params_update() caters to.

Can we not imagine a model with more types of parameters?

> > While running the inference on a Model there won't be any random
> > addition or removal of operators to/from the model or there won't
> > be any changes in the actual flow of model.
> > Since the only parameter that can be changed is Weights and Biases
> > the above API should take care.

No, you don't reply to my question.
I want to be able to change a single parameter.
I am expecting a more fine-grained API than a simple "void*".
We could give the name of the parameter and a value, why not?

> > > > Other metadata data like batch, shapes, formats queried using
> > > > rte_ml_io_info().
> > > Copying:
> > > +/** Input and output data information structure
> > > + *
> > > + * Specifies the type and shape of input and output data.
> > > + */
> > > +struct rte_ml_io_info {
> > > + char name[RTE_ML_STR_MAX];
> > > + /**< Name of data */
> > > + struct rte_ml_io_shape shape;
> > > + /**< Shape of data */
> > > + enum rte_ml_io_type qtype;
> > > + /**< Type of quantized data */
> > > + enum rte_ml_io_type dtype;
> > > + /**< Type of de-quantized data */
> > > +};
> > >
> > > Is it the right place to notify the app that some model optimizations
> > > are supported?
> > > (example: merge some operations in the graph)
> >
> > The inference is run on a pre-trained model, which means
> > any merges /additions of operations to the graph are NOT done.
> > If any such things are done then the changed model needs to go
> > through the training and compilation once again
> > which is out of scope of these APIs.

Please try to understand what I am saying.
I want the application to be able to know which capabilities
are supported by the inference driver.
That would allow generating the model with some optimizations.

> > > > > [...]
> > > > > > Typical application utilisation of the ML API will follow the
> > > > > > following programming flow.
> > > > > >
> > > > > > - rte_ml_dev_configure()
> > > > > > - rte_ml_dev_queue_pair_setup()
> > > > > > - rte_ml_model_load()
> > > > > > - rte_ml_model_start()
> > > > > > - rte_ml_model_info()
> > > > > > - rte_ml_dev_start()
> > > > > > - rte_ml_enqueue_burst()
> > > > > > - rte_ml_dequeue_burst()
> > > > > > - rte_ml_model_stop()
> > > > > > - rte_ml_model_unload()
> > > > > > - rte_ml_dev_stop()
> > > > > > - rte_ml_dev_close()
> > > > >
> > > > > Where is parameters update in this flow?
> > > >
> > > > Added the mandatory APIs in the top level flow doc.
> > > > rte_ml_model_params_update() used to update the parameters.
> > >
> > > The question is "where" should it be done?
> > > Before/after start?
> >
> > The model image comes with the Weights and Bias
> > and will be loaded and used as a part of rte_ml_model_load
> > and rte_ml_model_start.
> > In rare scenarios where the user wants to update
> > the Weights and Bias of an already loaded model,
> > the rte_ml_model_stop can be called to stop the model
> > and the Weights and Biases can be updated using the
> > The parameters (Weights&Biases) can be updated
> > when the rte_ml_model_params_update() API
> > followed by rte_ml_model_start to start the model
> > with the new Weights and Biases.
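
The ordering described in the quoted text above (stop the model, update the parameters, start it again) can be illustrated with a minimal, self-contained sketch. It does not call the real mldev functions: the `sketch_*` names, the `struct sketch_model` layout and the choice of `-EBUSY` as the error for an update on a running model are all assumptions made for illustration, not part of the proposed API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a loaded model; not a DPDK type. */
enum sketch_model_state { SKETCH_MODEL_STOPPED, SKETCH_MODEL_STARTED };

struct sketch_model {
	enum sketch_model_state state;
	float params[4]; /* weights and biases; layout assumed for illustration */
};

/* Plays the role of rte_ml_model_start(): the model begins serving inference. */
static int sketch_model_start(struct sketch_model *m)
{
	if (m->state == SKETCH_MODEL_STARTED)
		return -EALREADY;
	m->state = SKETCH_MODEL_STARTED;
	return 0;
}

/* Plays the role of rte_ml_model_stop(): the model stops serving inference. */
static int sketch_model_stop(struct sketch_model *m)
{
	if (m->state == SKETCH_MODEL_STOPPED)
		return -EALREADY;
	m->state = SKETCH_MODEL_STOPPED;
	return 0;
}

/*
 * Plays the role of rte_ml_model_params_update(): the whole parameter
 * buffer is replaced, and the call is refused while the model is running.
 * Returning -EBUSY here is an assumption of this sketch.
 */
static int sketch_model_params_update(struct sketch_model *m, const void *buffer)
{
	if (buffer == NULL)
		return -EINVAL;
	if (m->state != SKETCH_MODEL_STOPPED)
		return -EBUSY;
	memcpy(m->params, buffer, sizeof(m->params));
	return 0;
}
```

With these stand-ins, an update attempted between start and stop fails, while the sequence stop, update, start succeeds and the model restarts with the new values.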

OK, please make sure it is documented that the parameters update
must be done on a stopped engine.

> > > > > Should we update all parameters at once or can it be done more
> > > > > fine-grain?
> > > >
> > > > Currently, rte_ml_model_params_update() can be used to update weight
> > > > and bias via buffer when device is in stop state and without unloading
> > > > the model.

Passing a raw buffer is a really dark API.
We need to know how to fill the buffer.

> > > The question is "can we update a single parameter"?
> > > And how?
> >
> > As mentioned above for running inference the model is already trained
> > the only parameter that is updated is the Weights and Biases.
> > "Parameters" is another word for Weights and Bias.
> > No other parameters are considered.

You are not replying to the question.
How can we update a single parameter?

> > Are there any other parameters you have on your mind?

No

> > > > > Question about the memory used by mldev:
> > > > > Can we manage where the memory is allocated (host, device, mix,
> > > > > etc)?
> > > >
> > > > Just passing buffer pointers now like other subsystem.
> > > > Other EAL infra service can take care of the locality of memory as it
> > > > is not specific to ML dev.
> > >
> > > I was thinking about memory allocation required by the inference engine.
> > > How to specify where to allocate? Is it just hardcoded in the driver?
> >
> > Any memory within the hardware is managed by the driver.
>
> I think, Thomas is asking input and output memory for interference. If
> so, the parameters for
> struct rte_ml_buff_seg or needs to add type or so. Thomas, Please
> propose what parameters you want here.
> In case if it is for internal driver memory, We can pass the memory
> type in rte_ml_dev_configure(), If so, please propose
> the memory types you need and the parameters.

I'm talking about the memory used by the driver to make the inference work.
In some cases we may prefer the hardware to use host memory,
and in others the device memory.
I think that's something we could tune in the configuration.
I suppose we are fine with allocation hardcoded in the driver for now,
as I don't have a clear need.
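
The finer-grained update asked for earlier in this message (give the name of a parameter and a value, instead of filling an opaque "void*" buffer) could look roughly like the sketch below. Everything in it is hypothetical: `struct sketch_param`, `struct sketch_param_table`, `sketch_param_set()`, `sketch_param_get()` and the example parameter names are invented for illustration and are not part of the proposed mldev API or of any driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical named parameter; nothing like this exists in the proposed API. */
struct sketch_param {
	const char *name;
	float value;
};

/* Hypothetical per-model table of named parameters. */
struct sketch_param_table {
	struct sketch_param *params;
	size_t nb_params;
};

/* Set a single parameter by name; returns 0, -EINVAL or -ENOENT. */
static int sketch_param_set(struct sketch_param_table *t, const char *name,
			    float value)
{
	size_t i;

	if (t == NULL || name == NULL)
		return -EINVAL;
	for (i = 0; i < t->nb_params; i++) {
		if (strcmp(t->params[i].name, name) == 0) {
			t->params[i].value = value;
			return 0;
		}
	}
	return -ENOENT; /* no parameter with that name */
}

/* Read a single parameter back by name; returns 0, -EINVAL or -ENOENT. */
static int sketch_param_get(const struct sketch_param_table *t,
			    const char *name, float *value)
{
	size_t i;

	if (t == NULL || name == NULL || value == NULL)
		return -EINVAL;
	for (i = 0; i < t->nb_params; i++) {
		if (strcmp(t->params[i].name, name) == 0) {
			*value = t->params[i].value;
			return 0;
		}
	}
	return -ENOENT;
}
```

Compared with a raw buffer, the caller of such an interface does not need to know the driver's buffer layout, and an unknown name is reported as an error rather than silently overwriting other parameters.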