From: Jerin Jacob
Date: Fri, 27 Jan 2023 09:59:48 +0530
Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
To: Shivah Shankar Shankar Narayan Rao
References: <20220803132839.2747858-2-jerinj@marvell.com> <3877164.bqPgKRP4r2@thomas> <2491874.vBoWY3egPC@thomas>
Cc: Thomas Monjalon, Jerin Jacob Kollanukkaran, dev@dpdk.org, and others
On Fri, Jan 27, 2023 at 8:04 AM Shivah Shankar Shankar Narayan Rao wrote:
>
> External Email
>
> ----------------------------------------------------------------------
> 25/01/2023 20:01, Jerin Jacob:
> > On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon wrote:
> > > 14/11/2022 13:02, jerinj@marvell.com:
> > > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > > model consists of a procedure/algorithm and the data/patterns
> > > > required to make predictions on live data.
> > > > Once the model is created and trained outside of the DPDK scope,
> > > > the model can be loaded via rte_ml_model_load() and then started
> > > > using the rte_ml_model_start() API.
> > > > The rte_ml_model_params_update() can be used to update the model
> > > > parameters, such as weight and bias, without unloading the model
> > > > using rte_ml_model_unload().
> > >
> > > The fact that the model is prepared outside means the model format
> > > is free and probably different per mldev driver.
> > > I think it is OK but it requires a lot of documentation effort to
> > > explain how to bind the model and its parameters with the DPDK API.
> > > Also we may need to pass some metadata from the model builder to the
> > > inference engine in order to enable optimizations prepared in the
> > > model.
> > > And the other way, we may need inference capabilities in order to
> > > generate an optimized model which can run in the inference engine.
> >
> > The base API specification is kept to an absolute minimum. Currently,
> > the weight and bias parameters are updated through
> > rte_ml_model_params_update(). It can be extended when there are
> > drivers that support it, or if you have any specific parameter you
> > would like to add to rte_ml_model_params_update().
>
> This function is
> int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
>
> How are we supposed to provide separate parameters in this void* ?
>
> Just to clarify what "parameters" mean: they are simply the weights and
> biases of the model, which are the parameters of a model.
> Also, the proposed APIs are for running inference on a pre-trained
> model. For running inference, the amount of parameter tuning needed/done
> is limited/none.
> The only parameters that may get changed are the Weights and Bias, which
> the API rte_ml_model_params_update() caters to.
>
> While running inference on a Model there won't be any random addition
> or removal of operators to/from the model, nor will there be any
> changes in the actual flow of the model.
> Since the only parameters that can be changed are the Weights and
> Biases, the above API should take care of it.
>
> > Other metadata like batch, shapes, and formats are queried using
> > rte_ml_io_info().
>
> Copying:
> +/** Input and output data information structure
> + *
> + * Specifies the type and shape of input and output data.
> + */
> +struct rte_ml_io_info {
> +	char name[RTE_ML_STR_MAX];
> +	/**< Name of data */
> +	struct rte_ml_io_shape shape;
> +	/**< Shape of data */
> +	enum rte_ml_io_type qtype;
> +	/**< Type of quantized data */
> +	enum rte_ml_io_type dtype;
> +	/**< Type of de-quantized data */
> +};
>
> Is it the right place to notify the app that some model optimizations
> are supported? (example: merge some operations in the graph)
>
> The inference is run on a pre-trained model, which means any merges/
> additions of operations to the graph are NOT done.
> If any such things are done then the changed model needs to go through
> training and compilation once again, which is out of scope of these
> APIs.
>
> > > [...]
> > > > Typical application utilisation of the ML API will follow the
> > > > following programming flow.
> > > >
> > > > - rte_ml_dev_configure()
> > > > - rte_ml_dev_queue_pair_setup()
> > > > - rte_ml_model_load()
> > > > - rte_ml_model_start()
> > > > - rte_ml_model_info()
> > > > - rte_ml_dev_start()
> > > > - rte_ml_enqueue_burst()
> > > > - rte_ml_dequeue_burst()
> > > > - rte_ml_model_stop()
> > > > - rte_ml_model_unload()
> > > > - rte_ml_dev_stop()
> > > > - rte_ml_dev_close()
> > >
> > > Where is parameters update in this flow?
> >
> > Added the mandatory APIs in the top level flow doc.
> > rte_ml_model_params_update() is used to update the parameters.
>
> The question is "where" should it be done?
> Before/after start?
>
> The model image comes with the Weights and Bias and will be loaded and
> used as a part of rte_ml_model_load and rte_ml_model_start.
> In rare scenarios where the user wants to update the Weights and Bias
> of an already loaded model, rte_ml_model_stop can be called to stop the
> model, the parameters (Weights & Biases) can be updated using the
> rte_ml_model_params_update() API, followed by rte_ml_model_start to
> start the model with the new Weights and Biases.
>
> > > Should we update all parameters at once or can it be done more
> > > fine-grain?
> >
> > Currently, rte_ml_model_params_update() can be used to update weight
> > and bias via a buffer when the device is in the stop state and
> > without unloading the model.
>
> The question is "can we update a single parameter"?
> And how?
>
> As mentioned above, for running inference the model is already trained;
> the only parameters that are updated are the Weights and Biases.
> "Parameters" is another word for Weights and Bias. No other parameters
> are considered.
>
> Are there any other parameters you have on your mind?
>
> > > Question about the memory used by mldev:
> > > Can we manage where the memory is allocated (host, device, mix,
> > > etc)?
> >
> > Just passing buffer pointers now like other subsystems.
> > Other EAL infra services can take care of the locality of memory as
> > it is not specific to mldev.
>
> I was thinking about memory allocation required by the inference
> engine.
> How to specify where to allocate? Is it just hardcoded in the driver?
>
> Any memory within the hardware is managed by the driver.

I think Thomas is asking about input and output memory for inference.
If so, the parameters could go into struct rte_ml_buff_seg, or we need
to add a type field or similar. Thomas, please propose what parameters
you want here.
In case it is for internal driver memory, we can pass the memory type
in rte_ml_dev_configure(); if so, please propose the memory types you
need and the parameters.