DPDK patches and discussions
From: Jerin Jacob <jerinjacobk@gmail.com>
To: Elena Agostini <eagostini@nvidia.com>
Cc: "NBU-Contact-Thomas Monjalon" <thomas@monjalon.net>,
	"Jerin Jacob" <jerinj@marvell.com>, dpdk-dev <dev@dpdk.org>,
	"Ferruh Yigit" <ferruh.yigit@intel.com>,
	"Ajit Khaparde" <ajit.khaparde@broadcom.com>,
	"Andrew Boyer" <aboyer@pensando.io>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	"Beilei Xing" <beilei.xing@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Chas Williams" <chas3@att.com>,
	"Xia, Chenbo" <chenbo.xia@intel.com>,
	"Ciara Loftus" <ciara.loftus@intel.com>,
	"Devendra Singh Rawat" <dsinghrawat@marvell.com>,
	"Ed Czeck" <ed.czeck@atomicrules.com>,
	"Evgeny Schemeilin" <evgenys@amazon.com>,
	"Gaetan Rivet" <grive@u256.net>,
	"Gagandeep Singh" <g.singh@nxp.com>,
	"Guoyang Zhou" <zhouguoyang@huawei.com>,
	"Haiyue Wang" <haiyue.wang@intel.com>,
	"Harman Kalra" <hkalra@marvell.com>,
	"heinrich.kuhn@corigine.com" <heinrich.kuhn@corigine.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Hyong Youb Kim" <hyonkim@cisco.com>,
	"Igor Chauskin" <igorch@amazon.com>,
	"Igor Russkikh" <irusskikh@marvell.com>,
	"Jakub Grajciar" <jgrajcia@cisco.com>,
	"Jasvinder Singh" <jasvinder.singh@intel.com>,
	"Jian Wang" <jianwang@trustnetic.com>,
	"Jiawen Wu" <jiawenwu@trustnetic.com>,
	"Jingjing Wu" <jingjing.wu@intel.com>,
	"John Daley" <johndale@cisco.com>,
	"John Miller" <john.miller@atomicrules.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"Kiran Kumar K" <kirankumark@marvell.com>,
	"Lijun Ou" <oulijun@huawei.com>,
	"Liron Himi" <lironh@marvell.com>,
	NBU-Contact-longli <longli@microsoft.com>,
	"Marcin Wojtas" <mw@semihalf.com>,
	"Martin Spinler" <spinler@cesnet.cz>,
	"Matan Azrad" <matan@nvidia.com>,
	"Matt Peters" <matt.peters@windriver.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Michal Krawczyk" <mk@semihalf.com>,
	"Min Hu (Connor)" <humin29@huawei.com>,
	"Pradeep Kumar Nalla" <pnalla@marvell.com>,
	"Nithin Dabilpuram" <ndabilpuram@marvell.com>,
	"Qiming Yang" <qiming.yang@intel.com>,
	"Qi Zhang" <qi.z.zhang@intel.com>,
	"Radha Mohan Chintakuntla" <radhac@marvell.com>,
	"Rahul Lakkireddy" <rahul.lakkireddy@chelsio.com>,
	"Rasesh Mody" <rmody@marvell.com>,
	"Rosen Xu" <rosen.xu@intel.com>,
	"Sachin Saxena" <sachin.saxena@oss.nxp.com>,
	"Satha Koteswara Rao Kottidi" <skoteshwar@marvell.com>,
	"Shahed Shaikh" <shshaikh@marvell.com>,
	"Shai Brandes" <shaibran@amazon.com>,
	"Shepard Siegel" <shepard.siegel@atomicrules.com>,
	"Somalapuram Amaranath" <asomalap@amd.com>,
	"Somnath Kotur" <somnath.kotur@broadcom.com>,
	"Stephen Hemminger" <sthemmin@microsoft.com>,
	"Steven Webster" <steven.webster@windriver.com>,
	"Sunil Kumar Kori" <skori@marvell.com>,
	"Tetsuya Mukawa" <mtetsuyah@gmail.com>,
	"Veerasenareddy Burru" <vburru@marvell.com>,
	"Slava Ovsiienko" <viacheslavo@nvidia.com>,
	"Xiao Wang" <xiao.w.wang@intel.com>,
	"Xiaoyun Wang" <cloud.wangxiaoyun@huawei.com>,
	"Yisen Zhuang" <yisen.zhuang@huawei.com>,
	"Yong Wang" <yongwang@vmware.com>,
	"Ziyang Xuan" <xuanziyang2@huawei.com>,
	"Prasun Kapoor" <pkapoor@marvell.com>,
	"nadavh@marvell.com" <nadavh@marvell.com>,
	"Satananda Burla" <sburla@marvell.com>,
	"Narayana Prasad" <pathreya@marvell.com>,
	"Akhil Goyal" <gakhil@marvell.com>,
	"Ray Kinsella" <mdr@ashroe.eu>,
	"Dmitry Kozlyuk" <dmitry.kozliuk@gmail.com>,
	"Anatoly Burakov" <anatoly.burakov@intel.com>,
	"Cristian Dumitrescu" <cristian.dumitrescu@intel.com>,
	"Honnappa Nagarahalli" <honnappa.nagarahalli@arm.com>,
	"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
	"Ruifeng Wang (Arm Technology China)" <ruifeng.wang@arm.com>,
	"David Christensen" <drc@linux.vnet.ibm.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Olivier Matz" <olivier.matz@6wind.com>,
	"Jayatheerthan, Jay" <jay.jayatheerthan@intel.com>,
	"Ashwin Sekhar Thalakalath Kottilveetil" <asekhar@marvell.com>,
	"Pavan Nikhilesh" <pbhagavatula@marvell.com>,
	"David Marchand" <david.marchand@redhat.com>,
	"tom@herbertland.com" <tom@herbertland.com>
Subject: Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library
Date: Fri, 22 Oct 2021 19:09:52 +0530	[thread overview]
Message-ID: <CALBAE1NvrrhSsXdoPwAeC3o94Ksx13eW3qs1FZtvwkROQ_YYHA@mail.gmail.com> (raw)
In-Reply-To: <DM6PR12MB41070B8DDFEBAED51EFF4470CD809@DM6PR12MB4107.namprd12.prod.outlook.com>

On Fri, Oct 22, 2021 at 5:30 PM Elena Agostini <eagostini@nvidia.com> wrote:
>
> On Tue, Oct 19, 2021 at 21:36 Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Wed, Oct 20, 2021 at 12:38 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 19/10/2021 20:14, jerinj@marvell.com:
> > > > Definition of Dataplane Workload Accelerator
> > > > --------------------------------------------
> > > > A Dataplane Workload Accelerator (DWA) typically contains a set of CPUs,
> > > > network controllers, and programmable data acceleration engines for
> > > > packet processing, cryptography, regex engines, baseband processing, etc.
> > > > This allows a DWA to offload compute/packet-processing/baseband/
> > > > cryptography-related workloads from the host CPU, saving cost and power,
> > > > and enables scaling the workload by adding DWAs to the host CPU as needed.
> > > >
> > > > Unlike other devices in DPDK, the DWA device is not fixed-function,
> > > > because it has CPUs and programmable HW accelerators.
> > > > This makes the DWA personality/workload completely programmable.
> > > > Typical examples of DWA offloads are flow/session management,
> > > > virtual switch, TLS offload, IPsec offload, l3fwd offload, etc.
>
> > >
> > > If I understand well, the idea is to abstract the offload
> > > of some stack layers in the hardware.
> >
> > Yes. It may not be just HW; expressing complicated workloads
> > may need CPUs and/or other HW accelerators.
> >
> > > I am not sure we should give an API for such stack layers in DPDK.
> >
> > Why not?
> >
> > > It looks to be the role of the dataplane application to finely manage
> > > how to use the hardware for a specific dataplane.
> >
> > It is possible with this scheme.
> >
> > > I believe the API for such a layer would be either too big, or too limited,
> > > or not optimized for specific needs.
> >
> > It will be optimized for specific needs, as applications ask for what to do,
> > not how to do it.
> >
> > > If we really want to automate or abstract the HW/SW co-design,
> > > I think we should better look at compiler work like P4 or PANDA.
> >
> > The compiler stuff is very static in nature. It can address packet
> > transformation workloads, not ones like IPsec or baseband offload.
> > Another way to look at it: the GPU RFC started just because you are not
> > able to express all the workloads in P4.
>
> That’s not the purpose of the GPU RFC.
>
> The gpudev library's goal is to enhance the dialog between GPU, CPU, and NIC, offering the possibility to:
>
> - Make DPDK aware of non-CPU memory like device memory (e.g., similarly to what happened with MPI)
> - Hide some GPU-library-specific memory management implementation details
> - Reduce the gap between network activity and device activity (e.g., receive/send packets directly using device memory)
> - Reduce the gap between CPU activity and application-defined GPU workloads
> - Open to the capability to interact with the GPU device, not managing it

Agree. I am not advocating P4 as a replacement for gpudev or DWA. If
someone thinks it is possible, it would be great to see how to express
a complex workload like TLS offload or an ORAN 7.2-split high-PHY
baseband offload in it.

Could you give more details on "Open to the capability to interact
with the GPU device, not managing it"?
What do you mean by managing it, and what is this RFC doing to manage it?


>
> The gpudev library can be easily embedded in any GPU-specific application with relatively small effort.
> The application can allocate, communicate with, and manage device memory transparently through DPDK.

See below


>
> What you are providing here is different and out of the scope of the gpudev library: control and manage the workload submission of possibly any accelerator device, hiding a lot of implementation details within DPDK.

No. It has both a control plane and a user plane, which also allows an
implementation to allocate, communicate with, and manage device memory
transparently through DPDK using a user action. TLV messages can be
defined at any level: we can define profiles at a lower or higher level
based on which features we need to offload, or chain multiple small
profiles to create complex workloads.
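
To make the model concrete, here is a minimal sketch of the TLV idea
in C; every name below is illustrative for this discussion, not the
RFC's settled API:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* A generic TLV message: the application states WHAT to offload;
 * the DWA implementation decides HOW to realize it on the
 * accelerator (CPU cores, fixed-function HW, or both). */
struct dwa_tlv {
	uint32_t type;     /* profile-scoped message type */
	uint32_t length;   /* payload length in bytes */
	uint8_t payload[]; /* message-specific data */
};

/* Build a TLV message; each profile would define its payload layouts. */
static struct dwa_tlv *
dwa_tlv_new(uint32_t type, const void *data, uint32_t len)
{
	struct dwa_tlv *tlv = malloc(sizeof(*tlv) + len);

	if (tlv == NULL)
		return NULL;
	tlv->type = type;
	tlv->length = len;
	memcpy(tlv->payload, data, len);
	return tlv;
}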


>
> A wrapper for accelerator-device-specific libraries, and I think that it’s too far-fetched to be realistic.
> As a GPU user, I don’t want to delegate my tasks to DWA because it can’t be fully optimized, updated to the latest GPU-specific features, etc.

DWA is the GPU in this case. Tasks are expressed in a generic
representation, so they can be optimized for a GPU/DPU/IPU based on
accelerator specifics.


>
> Additionally, a generic DWA won't work for a GPU:
>
> - Memory copies from DWA to CPU / CPU to DWA are latency-expensive. Packets can be received directly in device memory

No copy is involved. The host port is just an abstract model; you can
use plain shared memory underneath. Also, if you look at the RFC, we
can add new host ports that are specific to a transport category
(Ethernet, PCIe, shared memory).
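
On the host port point, a rough sketch of what that abstraction could
look like; all names are assumed for illustration, not the RFC's final
API:

#include <stddef.h>
#include <stdint.h>

/* The transport is a property of the port type, so a shared-memory
 * host port implies no copy at all between host and DWA. */
enum dwa_host_port_type {
	DWA_HOST_PORT_ETHERNET, /* DWA reached over a network controller */
	DWA_HOST_PORT_PCIE,     /* DWA attached to the local PCIe bus */
	DWA_HOST_PORT_SHMEM,    /* DWA sharing memory with the host CPU */
};

struct dwa_host_port_conf {
	enum dwa_host_port_type type;
	union {
		struct { uint16_t ethdev_id; } ethernet;  /* DPDK port id */
		struct { void *base; size_t len; } shmem; /* shared region */
	} transport;
};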

>
> - When launching multiple processing blocks, efficiency may be compromised

How are you avoiding that with gpudev? The same logic can be moved to
the driver implementation, right?

>
> I don’t actually see a real comparison between gpudev and DWA.
>
> If in the future we’ll expose some GPU workloads through the gpudev library, it will be for some network-specific and well-defined problems.

How do you want to represent a "network-specific" and "well-defined"
problem from the application's PoV?

The problem I am trying to address: if every vendor expresses the
workload in an accelerator-specific fashion, then we need N libraries
and N application code paths to solve a single problem.

I have provided an example for L3FWD; it would be good to know how it
cannot map to a GPU. That level of depth in the discussion will give
more insight than staying at an abstract level. Or you can take up a
workload that can NOT be expressed with the DWA RFC; that helps to
understand the gap.
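
As an illustration of that "what, not how" contract, here is a
hypothetical payload for an L3FWD-profile "route add" message; field
and message names are assumptions for discussion only:

#include <stdint.h>

/* The application states the route (WHAT); whether the lookup runs on
 * GPU threads, DPU cores, or fixed-function HW is the implementation's
 * choice (HOW). */
struct dwa_l3fwd_route_add {
	uint32_t prefix;      /* IPv4 prefix, network byte order */
	uint8_t depth;        /* prefix length in bits */
	uint16_t dwa_port_id; /* egress Ethernet port on the DWA */
};

Combined with the TLV helper sketched earlier, the application wraps
this payload in a profile-scoped message; nothing in it is
accelerator-specific.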

I think the TB (Technical Board)/DPDK community needs to decide the
direction on the following questions:

1) Agree/disagree on the need for workload offload accelerators in DPDK.

2) Do we need to expose accelerator-specific workload libraries (i.e.,
separate libraries for GPU, DPU, etc.) and let the _DPDK_ application
deal with acceleration-specific APIs for the workload? If the majority
thinks yes, then we can have a dpudev library in addition to gpudev;
basically, that removes the profile concept from this RFC.

3) Allow both accelerator-specific libraries and the DWA kind of model,
and let applications pick the model they want.

Thread overview: 20+ messages
2021-10-19 18:14 jerinj
2021-10-19 18:14 ` [dpdk-dev] [RFC PATCH 1/1] dwa: introduce dataplane workload accelerator subsystem jerinj
2021-10-19 19:08 ` [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library Thomas Monjalon
2021-10-19 19:36   ` Jerin Jacob
2021-10-19 20:42     ` Stephen Hemminger
2021-10-20  5:25       ` Jerin Jacob
2021-10-19 20:42     ` Tom Herbert
2021-10-20  5:38       ` Jerin Jacob
2021-10-22 12:00     ` Elena Agostini
2021-10-22 13:39       ` Jerin Jacob [this message]
2021-10-25  7:35 ` Mattias Rönnblom
2021-10-25  9:03   ` Jerin Jacob
2021-10-29 11:57     ` Mattias Rönnblom
2021-10-29 15:51       ` Jerin Jacob
2021-10-31  9:18         ` Mattias Rönnblom
2021-10-31 14:01           ` Jerin Jacob
2021-10-31 19:34             ` Thomas Monjalon
2021-10-31 21:13               ` Jerin Jacob
2021-10-31 21:55                 ` Thomas Monjalon
2021-10-31 22:19                   ` Jerin Jacob
