From: Shivah Shankar Shankar Narayan Rao <sshankarnara@marvell.com>
To: Thomas Monjalon <thomas@monjalon.net>,
Jerin Jacob <jerinjacobk@gmail.com>,
Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
"aboyer@pensando.io" <aboyer@pensando.io>,
"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
"beilei.xing@intel.com" <beilei.xing@intel.com>,
"bruce.richardson@intel.com" <bruce.richardson@intel.com>,
"chas3@att.com" <chas3@att.com>,
"chenbo.xia@intel.com" <chenbo.xia@intel.com>,
"ciara.loftus@intel.com" <ciara.loftus@intel.com>,
Devendra Singh Rawat <dsinghrawat@marvell.com>,
"ed.czeck@atomicrules.com" <ed.czeck@atomicrules.com>,
"evgenys@amazon.com" <evgenys@amazon.com>,
"grive@u256.net" <grive@u256.net>,
"g.singh@nxp.com" <g.singh@nxp.com>,
"zhouguoyang@huawei.com" <zhouguoyang@huawei.com>,
"haiyue.wang@intel.com" <haiyue.wang@intel.com>,
Harman Kalra <hkalra@marvell.com>,
"heinrich.kuhn@corigine.com" <heinrich.kuhn@corigine.com>,
"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
"hyonkim@cisco.com" <hyonkim@cisco.com>,
"igorch@amazon.com" <igorch@amazon.com>,
Igor Russkikh <irusskikh@marvell.com>,
"jgrajcia@cisco.com" <jgrajcia@cisco.com>,
"jasvinder.singh@intel.com" <jasvinder.singh@intel.com>,
"jianwang@trustnetic.com" <jianwang@trustnetic.com>,
"jiawenwu@trustnetic.com" <jiawenwu@trustnetic.com>,
"jingjing.wu@intel.com" <jingjing.wu@intel.com>,
"johndale@cisco.com" <johndale@cisco.com>,
"john.miller@atomicrules.com" <john.miller@atomicrules.com>,
"linville@tuxdriver.com" <linville@tuxdriver.com>,
"keith.wiles@intel.com" <keith.wiles@intel.com>,
Kiran Kumar Kokkilagadda <kirankumark@marvell.com>,
"oulijun@huawei.com" <oulijun@huawei.com>,
Liron Himi <lironh@marvell.com>,
"longli@microsoft.com" <longli@microsoft.com>,
"mw@semihalf.com" <mw@semihalf.com>,
"spinler@cesnet.cz" <spinler@cesnet.cz>,
"matan@nvidia.com" <matan@nvidia.com>,
"matt.peters@windriver.com" <matt.peters@windriver.com>,
"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
"mk@semihalf.com" <mk@semihalf.com>,
"humin29@huawei.com" <humin29@huawei.com>,
Pradeep Kumar Nalla <pnalla@marvell.com>,
Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>,
"qiming.yang@intel.com" <qiming.yang@intel.com>,
"qi.z.zhang@intel.com" <qi.z.zhang@intel.com>,
Radha Chintakuntla <radhac@marvell.com>,
"rahul.lakkireddy@chelsio.com" <rahul.lakkireddy@chelsio.com>,
Rasesh Mody <rmody@marvell.com>,
"rosen.xu@intel.com" <rosen.xu@intel.com>,
"sachin.saxena@oss.nxp.com" <sachin.saxena@oss.nxp.com>,
Satha Koteswara Rao Kottidi <skoteshwar@marvell.com>,
Shahed Shaikh <shshaikh@marvell.com>,
"shaibran@amazon.com" <shaibran@amazon.com>,
"shepard.siegel@atomicrules.com" <shepard.siegel@atomicrules.com>,
"asomalap@amd.com" <asomalap@amd.com>,
"somnath.kotur@broadcom.com" <somnath.kotur@broadcom.com>,
"sthemmin@microsoft.com" <sthemmin@microsoft.com>,
"steven.webster@windriver.com" <steven.webster@windriver.com>,
Sunil Kumar Kori <skori@marvell.com>,
"mtetsuyah@gmail.com" <mtetsuyah@gmail.com>,
Veerasenareddy Burru <vburru@marvell.com>,
"viacheslavo@nvidia.com" <viacheslavo@nvidia.com>,
"xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
"cloud.wangxiaoyun@huawei.com" <cloud.wangxiaoyun@huawei.com>,
"yisen.zhuang@huawei.com" <yisen.zhuang@huawei.com>,
"yongwang@vmware.com" <yongwang@vmware.com>,
"xuanziyang2@huawei.com" <xuanziyang2@huawei.com>,
Prasun Kapoor <pkapoor@marvell.com>,
Nadav Haklai <nadavh@marvell.com>,
Satananda Burla <sburla@marvell.com>,
Narayana Prasad Raju Athreya <pathreya@marvell.com>,
Akhil Goyal <gakhil@marvell.com>,
"dmitry.kozliuk@gmail.com" <dmitry.kozliuk@gmail.com>,
"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
"cristian.dumitrescu@intel.com" <cristian.dumitrescu@intel.com>,
"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
"mattias.ronnblom@ericsson.com" <mattias.ronnblom@ericsson.com>,
"ruifeng.wang@arm.com" <ruifeng.wang@arm.com>,
"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>,
"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
"jay.jayatheerthan@intel.com" <jay.jayatheerthan@intel.com>,
Ashwin Sekhar T K <asekhar@marvell.com>,
Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>,
"eagostini@nvidia.com" <eagostini@nvidia.com>,
Srikanth Yalavarthi <syalavarthi@marvell.com>,
Derek Chickles <dchickles@marvell.com>,
"david.marchand@redhat.com" <david.marchand@redhat.com>
Subject: RE: [EXT] Re: [dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library
Date: Fri, 27 Jan 2023 02:33:11 +0000
Message-ID: <PH0PR18MB47667A02CF72A4791DC901B5DBCC9@PH0PR18MB4766.namprd18.prod.outlook.com>
In-Reply-To: <2491874.vBoWY3egPC@thomas>
25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > 14/11/2022 13:02, jerinj@marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A
> > > model consists of procedure/algorithm and data/pattern required to make predictions on live data.
> > > Once the model is created and trained outside of the DPDK scope,
> > > the model can be loaded via rte_ml_model_load() and then start it using rte_ml_model_start() API.
> > > The rte_ml_model_params_update() can be used to update the model
> > > parameters such as weight and bias without unloading the model using rte_ml_model_unload().
> >
> > The fact that the model is prepared outside means the model format
> > is free and probably different per mldev driver.
> > I think it is OK but it requires a lot of documentation effort to
> > explain how to bind the model and its parameters with the DPDK API.
> > Also we may need to pass some metadata from the model builder to the
> > inference engine in order to enable optimizations prepared in the model.
> > And the other way, we may need inference capabilities in order to
> > generate an optimized model which can run in the inference engine.
>
> The base API specification kept absolute minimum. Currently, weight
> and biases parameters updated through rte_ml_model_params_update(). It
> can be extended when there are drivers supports it or if you have any
> specific parameter you would like to add it in
> rte_ml_model_params_update().
This function is
int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
How are we supposed to provide separate parameters in this void* ?
Just to clarify what "parameters" means here: it refers only to the weights and biases of the model.
Also, the proposed APIs are for running inference on a pre-trained model. For inference, little or no parameter tuning is needed.
The only parameters that may change are the weights and biases, which is what rte_ml_model_params_update() caters to.
While running inference on a model, operators are not added to or removed from the model, and the actual flow of the model does not change.
Since the weights and biases are the only parameters that can change, the above API should be sufficient.
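To make the void * question concrete, here is a minimal sketch of how an application might hand both weights and biases over in one buffer. The layout (all weights followed by all biases, as raw bytes) and the helper name pack_params() are assumptions for illustration only; the actual layout expected behind rte_ml_model_params_update() would be defined by the driver and the model format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the proposed API.
 * Assumed layout: weights first, biases immediately after, as raw bytes.
 * Returns a heap buffer the caller must free(), or NULL on allocation failure.
 */
void *
pack_params(const void *weights, size_t wsize,
	    const void *biases, size_t bsize)
{
	uint8_t *buf = malloc(wsize + bsize);

	if (buf == NULL)
		return NULL;

	memcpy(buf, weights, wsize);        /* weights at offset 0 */
	memcpy(buf + wsize, biases, bsize); /* biases right after the weights */
	return buf;
}
```

The resulting pointer is what would be passed as the buffer argument; since the prototype quoted above carries no size or offset, this only works if both sides already agree on the layout.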
> Other metadata data like batch, shapes, formats queried using rte_ml_io_info().
Copying:
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+ char name[RTE_ML_STR_MAX];
+ /**< Name of data */
+ struct rte_ml_io_shape shape;
+ /**< Shape of data */
+ enum rte_ml_io_type qtype;
+ /**< Type of quantized data */
+ enum rte_ml_io_type dtype;
+ /**< Type of de-quantized data */
+};
Is it the right place to notify the app that some model optimizations are supported? (example: merge some operations in the graph)
Inference is run on a pre-trained model, which means no merges or additions of operations to the graph are done.
If any such changes are made, the changed model needs to go through training and compilation again, which is out of scope for these APIs.
> > [...]
> > > Typical application utilisation of the ML API will follow the
> > > following programming flow.
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where is parameters update in this flow?
>
> Added the mandatory APIs in the top level flow doc.
> rte_ml_model_params_update() used to update the parameters.
The question is "where" should it be done?
Before/after start?
The model image comes with the weights and biases, which are loaded and used as part of rte_ml_model_load() and rte_ml_model_start().
In rare scenarios where the user wants to update the weights and biases of an already loaded model, rte_ml_model_stop() can be called to stop the model, the weights and biases can then be updated using rte_ml_model_params_update(), and rte_ml_model_start() can be called to start the model with the new weights and biases.
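The stop, update, start ordering can be sketched as below. This is only an illustration under stated assumptions: the mock_* functions and struct mock_model are hypothetical stand-ins that track state in memory, not the mldev implementation, and returning -EBUSY for an update on a started model is an assumed behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model states; a loaded model is either stopped or started. */
enum model_state { MODEL_STOPPED, MODEL_STARTED };

/* Hypothetical stand-in for a loaded model. */
struct mock_model {
	enum model_state state;
	const void *params; /* last weights/biases buffer applied */
};

int mock_model_start(struct mock_model *m)
{
	m->state = MODEL_STARTED;
	return 0;
}

int mock_model_stop(struct mock_model *m)
{
	m->state = MODEL_STOPPED;
	return 0;
}

/* Assumption: updating a started model is rejected rather than applied. */
int mock_model_params_update(struct mock_model *m, const void *buffer)
{
	if (m->state == MODEL_STARTED)
		return -EBUSY;
	m->params = buffer;
	return 0;
}

/* Stop the model, apply new weights/biases, then start it again. */
int reload_params(struct mock_model *m, const void *buffer)
{
	int ret = mock_model_stop(m);

	if (ret != 0)
		return ret;
	ret = mock_model_params_update(m, buffer);
	if (ret != 0)
		return ret;
	return mock_model_start(m);
}
```

In this sketch the model stays loaded throughout; only the start/stop state changes around the update, which matches the point that unloading is not needed.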
> > Should we update all parameters at once or can it be done more fine-grain?
>
> Currently, rte_ml_model_params_update() can be used to update weight
> and bias via buffer when device is in stop state and without unloading
> the model.
The question is "can we update a single parameter"?
And how?
As mentioned above, for running inference the model is already trained; the only parameters that are updated are the weights and biases.
"Parameters" is just another word for weights and biases. No other parameters are considered.
Are there any other parameters you have in mind?
> > Question about the memory used by mldev:
> > Can we manage where the memory is allocated (host, device, mix, etc)?
>
> Just passing buffer pointers now like other subsystem.
> Other EAL infra service can take care of the locality of memory as it
> is not specific to ML dev.
I was thinking about memory allocation required by the inference engine.
How to specify where to allocate? Is it just hardcoded in the driver?
Any memory within the hardware is managed by the driver.