Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library

DPDK patches and discussions

From: Jerin Jacob <jerinjacobk@gmail.com>
To: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: "jerinj@marvell.com" <jerinj@marvell.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	 "thomas@monjalon.net" <thomas@monjalon.net>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	 "ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
	"aboyer@pensando.io" <aboyer@pensando.io>,
	 "andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
	 "beilei.xing@intel.com" <beilei.xing@intel.com>,
	 "bruce.richardson@intel.com" <bruce.richardson@intel.com>,
	"chas3@att.com" <chas3@att.com>,
	 "chenbo.xia@intel.com" <chenbo.xia@intel.com>,
	"ciara.loftus@intel.com" <ciara.loftus@intel.com>,
	 "dsinghrawat@marvell.com" <dsinghrawat@marvell.com>,
	 "ed.czeck@atomicrules.com" <ed.czeck@atomicrules.com>,
	"evgenys@amazon.com" <evgenys@amazon.com>,
	 "grive@u256.net" <grive@u256.net>,
	"g.singh@nxp.com" <g.singh@nxp.com>,
	 "zhouguoyang@huawei.com" <zhouguoyang@huawei.com>,
	"haiyue.wang@intel.com" <haiyue.wang@intel.com>,
	 "hkalra@marvell.com" <hkalra@marvell.com>,
	 "heinrich.kuhn@corigine.com" <heinrich.kuhn@corigine.com>,
	 "hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"hyonkim@cisco.com" <hyonkim@cisco.com>,
	 "igorch@amazon.com" <igorch@amazon.com>,
	"irusskikh@marvell.com" <irusskikh@marvell.com>,
	 "jgrajcia@cisco.com" <jgrajcia@cisco.com>,
	 "jasvinder.singh@intel.com" <jasvinder.singh@intel.com>,
	 "jianwang@trustnetic.com" <jianwang@trustnetic.com>,
	 "jiawenwu@trustnetic.com" <jiawenwu@trustnetic.com>,
	"jingjing.wu@intel.com" <jingjing.wu@intel.com>,
	 "johndale@cisco.com" <johndale@cisco.com>,
	 "john.miller@atomicrules.com" <john.miller@atomicrules.com>,
	 "linville@tuxdriver.com" <linville@tuxdriver.com>,
	"keith.wiles@intel.com" <keith.wiles@intel.com>,
	 "kirankumark@marvell.com" <kirankumark@marvell.com>,
	"oulijun@huawei.com" <oulijun@huawei.com>,
	 "lironh@marvell.com" <lironh@marvell.com>,
	"longli@microsoft.com" <longli@microsoft.com>,
	 "mw@semihalf.com" <mw@semihalf.com>,
	"spinler@cesnet.cz" <spinler@cesnet.cz>,
	 "matan@nvidia.com" <matan@nvidia.com>,
	"matt.peters@windriver.com" <matt.peters@windriver.com>,
	 "maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"mk@semihalf.com" <mk@semihalf.com>,
	 "humin29@huawei.com" <humin29@huawei.com>,
	"pnalla@marvell.com" <pnalla@marvell.com>,
	 "ndabilpuram@marvell.com" <ndabilpuram@marvell.com>,
	"qiming.yang@intel.com" <qiming.yang@intel.com>,
	 "qi.z.zhang@intel.com" <qi.z.zhang@intel.com>,
	"radhac@marvell.com" <radhac@marvell.com>,
	 "rahul.lakkireddy@chelsio.com" <rahul.lakkireddy@chelsio.com>,
	"rmody@marvell.com" <rmody@marvell.com>,
	 "rosen.xu@intel.com" <rosen.xu@intel.com>,
	 "sachin.saxena@oss.nxp.com" <sachin.saxena@oss.nxp.com>,
	 "skoteshwar@marvell.com" <skoteshwar@marvell.com>,
	"shshaikh@marvell.com" <shshaikh@marvell.com>,
	 "shaibran@amazon.com" <shaibran@amazon.com>,
	 "shepard.siegel@atomicrules.com"
	<shepard.siegel@atomicrules.com>,
	"asomalap@amd.com" <asomalap@amd.com>,
	 "somnath.kotur@broadcom.com" <somnath.kotur@broadcom.com>,
	 "sthemmin@microsoft.com" <sthemmin@microsoft.com>,
	 "steven.webster@windriver.com" <steven.webster@windriver.com>,
	"skori@marvell.com" <skori@marvell.com>,
	 "mtetsuyah@gmail.com" <mtetsuyah@gmail.com>,
	"vburru@marvell.com" <vburru@marvell.com>,
	 "viacheslavo@nvidia.com" <viacheslavo@nvidia.com>,
	"xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
	 "cloud.wangxiaoyun@huawei.com" <cloud.wangxiaoyun@huawei.com>,
	 "yisen.zhuang@huawei.com" <yisen.zhuang@huawei.com>,
	"yongwang@vmware.com" <yongwang@vmware.com>,
	 "xuanziyang2@huawei.com" <xuanziyang2@huawei.com>,
	"pkapoor@marvell.com" <pkapoor@marvell.com>,
	 "nadavh@marvell.com" <nadavh@marvell.com>,
	"sburla@marvell.com" <sburla@marvell.com>,
	 "pathreya@marvell.com" <pathreya@marvell.com>,
	"gakhil@marvell.com" <gakhil@marvell.com>,
	 "mdr@ashroe.eu" <mdr@ashroe.eu>,
	"dmitry.kozliuk@gmail.com" <dmitry.kozliuk@gmail.com>,
	 "anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
	 "cristian.dumitrescu@intel.com" <cristian.dumitrescu@intel.com>,
	 "honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	 "ruifeng.wang@arm.com" <ruifeng.wang@arm.com>,
	"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>,
	 "konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
	 "olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	 "jay.jayatheerthan@intel.com" <jay.jayatheerthan@intel.com>,
	"asekhar@marvell.com" <asekhar@marvell.com>,
	 "pbhagavatula@marvell.com" <pbhagavatula@marvell.com>,
	Elana Agostini <eagostini@nvidia.com>
Subject: Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library
Date: Sun, 31 Oct 2021 19:31:13 +0530
Message-ID: <CALBAE1N4nyw8U9Q=cJqRxWetxT3o8G-hXMFL0uju7K0CPx1tOA@mail.gmail.com>
In-Reply-To: <fcb82807-16bd-32e0-731a-09e2ec162052@ericsson.com>

On Sun, Oct 31, 2021 at 2:48 PM Mattias Rönnblom
<mattias.ronnblom@ericsson.com> wrote:
>
> On 2021-10-29 17:51, Jerin Jacob wrote:
> > On Fri, Oct 29, 2021 at 5:27 PM Mattias Rönnblom
> > <mattias.ronnblom@ericsson.com> wrote:
> >> On 2021-10-25 11:03, Jerin Jacob wrote:
> >>> On Mon, Oct 25, 2021 at 1:05 PM Mattias Rönnblom
> >>> <mattias.ronnblom@ericsson.com> wrote:
> >>>> On 2021-10-19 20:14, jerinj@marvell.com wrote:
> >>>>> From: Jerin Jacob <jerinj@marvell.com>
> >>>>>
> >>>>>
> >>>>> Dataplane Workload Accelerator library
> >>>>> ======================================
> >>>>>
> >>>>> Definition of Dataplane Workload Accelerator
> >>>>> --------------------------------------------
> >>>>> Dataplane Workload Accelerator(DWA) typically contains a set of CPUs,
> >>>>> Network controllers and programmable data acceleration engines for
> >>>>> packet processing, cryptography, regex engines, baseband processing, etc.
> >>>>> This allows DWA to offload  compute/packet processing/baseband/
> >>>>> cryptography-related workload from the host CPU to save the cost and power.
> >>>>> Also to enable scaling the workload by adding DWAs to the Host CPU as needed.
> >>>>>
> >>>>> Unlike other devices in DPDK, the DWA device is not fixed-function
> >>>>> due to the fact that it has CPUs and programmable HW accelerators.
> >>>> There are already several instances of DPDK devices with pure-software
> >>>> implementation. In this regard, a DPU/SmartNIC represents nothing new.
> >>>> What's new, it seems to me, is a much-increased need to
> >>>> configure/arrange the processing in complex manners, to avoid bouncing
> >>>> everything to the host CPU.
> >>> Yes and No. It will be based on the profile. The TLV type TYPE_USER_PLANE will
> >>> have user plane traffic from/to host. For example, offloading ORAN split 7.2
> >>> baseband profile. Transport blocks sent to/from host as TYPE_USER_PLANE.
> >>>
> >>>> Something like P4 or rte_flow-based hooks or
> >>>> some other kind of extension. The eventdev adapters solve the same
> >>>> problem (where on some systems packets go through the host CPU on their
> >>>> way to the event device, and others do not) - although on a *much*
> >>>> smaller scale.
> >>> Yes. Eventdev Adapters only for event device plumbing.
> >>>
> >>>
> >>>> "Not-fixed function" seems to call for more hot plug support in the
> >>>> device APIs. Such functionality could then be reused by anything that
> >>>> can be reconfigured dynamically (FPGAs, firmware-programmed
> >>>> accelerators, etc.),
> >>> Yes.
> >>>
> >>>> but which may not be able to serve as a RPC
> >>>> endpoint, like a SmartNIC.
> >>> It can. That's the reason for choosing TLVs. So that
> >>> any higher level language can use TLVs like https://github.com/ustropo/uttlv
> >>> to communicate with the accelerator.  TLVs follow the request and
> >>> response scheme like RPC. So it can be wrapped under the application if needed.
> >>>
> >>>> DWA could be some kind of DPDK-internal framework for managing certain
> >>>> type of DPUs, but should it be exposed to the user application?
> >>> Could you clarify a bit more.
> >>> The offload is represented as a set of TLVs in generic fashion. There
> >>> is no DPU specific bit in offload representation. See
> >>> rte_dwa_profiile_l3fwd.h header file.
> >>
> >> It seems a bit cumbersome to work with TLVs on the user application
> >> side. Would it be an alternative to have the profile API as a set of C
> >> APIs instead of TLV-based messaging interface? The underlying
> >> implementation could still - in many or all cases - be TLVs sent over
> >> some appropriate transport.
> > The reason to pick TLVs is as follows
> >
> > 1) Very easy to enable ABI compatibility. (Learned from rte_flow)
>
>
> Do you include the TLV-defined profile interface in "ABI"? Or do you
> with ABI only mean the C ABI to send/receive TLVs? To me, the former
> makes the most sense, since changing the profile will break binary
> compatibility with then-existing applications.

The TLV payload will be part of the ABI, just like rte_flow.
If there is an ABI breakage on any TLV, we can add a new Tag and its associated
payload to enable backward compatibility, i.e. the old TLV will keep working
without any change.
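
To illustrate the point, here is a minimal sketch of adding a new tag instead
of changing an existing payload. Everything in it (the demo_* names, the
payload fields, the default value used for old senders) is hypothetical and
not taken from the RFC header files; it only shows that a handler can keep
accepting the old tag untouched while a new tag carries the extended payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tags: V2 is added later, V1 is never changed or removed. */
enum demo_tag {
	DEMO_TAG_PORT_INFO_V1 = 1,
	DEMO_TAG_PORT_INFO_V2 = 2,
};

/* Hypothetical payloads. The V1 layout is frozen once it is released. */
struct demo_port_info_v1 {
	uint16_t port_id;
};

struct demo_port_info_v2 {
	uint16_t port_id;
	uint16_t nb_queues; /* new field that would have broken the V1 layout */
};

/* Hypothetical TLV view: tag, payload length and pointer to the payload. */
struct demo_tlv {
	uint32_t tag;
	uint32_t len;
	const void *payload;
};

/*
 * Decode either version. Returns 0 on success, -1 on an unknown tag or a
 * payload length that does not match the tag. Old V1 senders keep working
 * and get an assumed default of one queue.
 */
static int
demo_handle_port_info(const struct demo_tlv *tlv, uint16_t *port_id,
		      uint16_t *nb_queues)
{
	assert(tlv != NULL && port_id != NULL && nb_queues != NULL);

	switch (tlv->tag) {
	case DEMO_TAG_PORT_INFO_V1: {
		struct demo_port_info_v1 p;

		if (tlv->len != sizeof(p))
			return -1;
		memcpy(&p, tlv->payload, sizeof(p));
		*port_id = p.port_id;
		*nb_queues = 1;
		return 0;
	}
	case DEMO_TAG_PORT_INFO_V2: {
		struct demo_port_info_v2 p;

		if (tlv->len != sizeof(p))
			return -1;
		memcpy(&p, tlv->payload, sizeof(p));
		*port_id = p.port_id;
		*nb_queues = p.nb_queues;
		return 0;
	}
	default:
		return -1;
	}
}
```

An application built against the V1 tag needs no rebuild in this scheme; only
code that wants the new field has to switch to the V2 tag.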

>
>
> > 2) If it needs to be transported over network etc it needs to be
> > packed so that way
> > it is easy for implementation to do that with TLV also it gives better
> > performance in such
> > cases by avoiding reformatting or possibly avoiding memcpy etc.
>
> My question was not "why TLVs", but the more specific "why are TLVs
> exposed to the user application." I find it likely the user applications
> are going to wrap the TLV serialization and de-serialization into their
> own functions.

We can stack up the TLVs, unlike traditional function calls.
That is really needed if the device supports N profiles, so that multiple TLVs
can be sent in a single shot in the fastpath.
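
A rough sketch of what stacking means in practice, assuming a hypothetical
8-byte header (tag + length) followed by the payload. The demo_* names and the
header layout are made up for illustration and are not the RFC's TLV struct.
Several TLVs are appended back to back into one buffer, and the receiver walks
them in one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header; not the TLV struct from the RFC. */
struct demo_tlv_hdr {
	uint32_t tag;
	uint32_t len; /* payload bytes that follow the header */
};

/*
 * Append one TLV at offset 'off'. Returns the new offset, or 0 if the TLV
 * does not fit (a valid new offset is never 0).
 */
static size_t
demo_tlv_push(uint8_t *buf, size_t cap, size_t off, uint32_t tag,
	      const void *payload, uint32_t len)
{
	struct demo_tlv_hdr h = { .tag = tag, .len = len };

	assert(buf != NULL && off <= cap);
	if (cap - off < sizeof(h) || cap - off - sizeof(h) < len)
		return 0;
	memcpy(buf + off, &h, sizeof(h));
	if (len != 0)
		memcpy(buf + off + sizeof(h), payload, len);
	return off + sizeof(h) + len;
}

typedef void (*demo_tlv_cb)(uint32_t tag, const uint8_t *payload,
			    uint32_t len, void *arg);

/*
 * Walk all stacked TLVs in buf[0..used). Calls 'cb' (if not NULL) for each
 * one. Returns the number of TLVs, or -1 if the buffer is truncated.
 */
static int
demo_tlv_walk(const uint8_t *buf, size_t used, demo_tlv_cb cb, void *arg)
{
	size_t off = 0;
	int n = 0;

	while (off < used) {
		struct demo_tlv_hdr h;

		if (used - off < sizeof(h))
			return -1;
		memcpy(&h, buf + off, sizeof(h));
		off += sizeof(h);
		if (used - off < h.len)
			return -1;
		if (cb != NULL)
			cb(h.tag, buf + off, h.len, arg);
		off += h.len;
		n++;
	}
	return n;
}
```

With per-profile C functions, the same three requests would be three separate
calls; here they travel as one buffer handed to the transport.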


>
>
> > 3) It is easy to plug in with another high-level programming language as
> > just one API
>
>
> Makes sense. One note though: the transport is just one API, but then
> each profile makes up an API as well, although it's not C, but TLV-based.

Yes.

>
>
> > 4) Easy to decouple DWA core library functionalities from profile.
> > 5) Easy to enable asynchronous scheme using request and response TLVs.
> > 6) Most importantly, We could introduce type notion with TLV
> > (connected with the type of message  See TYPE_ATTACHED, TYPE_STOPPED,
> > TYPE_USER_PLANE etc ),
> > That way, we can have a uniform outlook of profiles instead of each profile
> > coming with a setup of its own APIs and __rules__ on the state machine.
> > I think, for a framework to leverage communication mechanisms and other
> > aspects between profiles, it's important to have some synergy between profiles.
> >
> >
> > Yes. I agree that a bit more logic is required on the application side
> > to use TLV,
> > But I think we can have a wrapper function getting req and response structures.
>
>
> Do you think ethdev, eventdev, cryptodev and the other DPDK APIs had
> been better off as TLV-based messaging interfaces as well? From a user
> point of view, I'm not sure I see what's so special about talking to a
> SmartNIC compared to functions implemented in a GPU, FPGA, a
> fixed-function ASIC, a large array of garden gnomes or some other manner.
> More functionality and more need for asynchronicity (if that's a word)
> maybe.

No. I am trying to avoid creating 1000s of APIs and their driver hooks
for all profiles, and to enable symmetry between all the profiles by
attaching state and type attributes to the TLV so that we get a unified view.
Nothing is specific to SmartNIC/GPU/FPGA.
Also, TLVs are very common in interoperable solutions like
https://scf.io/en/documents/222_5G_FAPI_PHY_API_Specification.php
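
As a sketch of the "unified view": if every TLV carries a type, one generic
check can gate it against the device state, regardless of which profile the
TLV belongs to. The type names below mirror TYPE_ATTACHED, TYPE_STOPPED and
TYPE_USER_PLANE from the RFC, but the device states and the exact rules are
assumptions made only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the RFC's TLV type notion; the demo_ prefix marks it as a sketch. */
enum demo_tlv_type {
	DEMO_TYPE_ATTACHED,
	DEMO_TYPE_STOPPED,
	DEMO_TYPE_USER_PLANE,
};

/* Assumed device states, not taken from the RFC. */
enum demo_dev_state {
	DEMO_STATE_DETACHED,
	DEMO_STATE_STOPPED, /* attached, but not started */
	DEMO_STATE_RUNNING,
};

/*
 * One profile-independent rule table (assumed semantics):
 *  - ATTACHED TLVs are valid in any state once the device is attached,
 *  - STOPPED TLVs are valid only while the device is stopped,
 *  - USER_PLANE TLVs are valid only while the device is running.
 */
static bool
demo_tlv_type_allowed(enum demo_tlv_type type, enum demo_dev_state state)
{
	switch (type) {
	case DEMO_TYPE_ATTACHED:
		return state != DEMO_STATE_DETACHED;
	case DEMO_TYPE_STOPPED:
		return state == DEMO_STATE_STOPPED;
	case DEMO_TYPE_USER_PLANE:
		return state == DEMO_STATE_RUNNING;
	}
	return false;
}
```

The same function would serve an l3fwd profile and a baseband profile alike,
which is the symmetry a set of per-profile C APIs would each have to
re-document.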


>
> >> Such a C API could still be asynchronous, and still be a profile API
> >> (rather than a set of new DPDK device types).
> >>
> >>
> >> What I tried to ask during the meeting but where I didn't get an answer
> >> (or at least one that I could understand) was how the profiles were to be
> >> specified and/or documented. Maybe the above is what you had in mind
> >> already.
> > Yes. Documentation is easy, please check the RFC header file for Doxygen
> > meta to express all the attributes of a TLV.
> >
> >
> > +enum rte_dwa_port_host_ethernet {
> > + /**
> > + * Attribute |  Value
> > + * ----------|--------
> > + * Tag       | RTE_DWA_TAG_PORT_HOST_ETHERNET
> > + * Stag      | RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO
> > + * Direction | H2D
> > + * Type      | TYPE_ATTACHED
> > + * Payload   | NA
> > + * Pair TLV  | RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO
> > + *
> > + * Request DWA host ethernet port information.
> > + */
> > + RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO,
> > + /**
> > + * Attribute |  Value
> > + * ----------|---------
> > + * Tag       | RTE_DWA_TAG_PORT_HOST_ETHERNET
> > + * Stag      | RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO
> > + * Direction | D2H
> > + * Type      | TYPE_ATTACHED
> > + * Payload   | struct rte_dwa_port_host_ethernet_d2h_info
> > + * Pair TLV  | RTE_DWA_STAG_PORT_HOST_ETHERNET_H2D_INFO
> > + *
> > + * Response for DWA host ethernet port information.
> > + */
> > + RTE_DWA_STAG_PORT_HOST_ETHERNET_D2H_INFO,
>
>
> Thanks for the pointer.
>
>
> It would make sense to have a machine-readable schema, so you can
> generate the (in my view) inevitable wrapper code. Much like what gRPC
> is to protobuf, or Sun RPC to XDR.

I thought of doing that, but I concluded it may not be a good fit due to:
1) Additional library dependencies.
2) Performance overhead of such solutions.
Also, not all transports are supported by all such libraries,
and we want to allow drivers to enable any sort of transport.
3) Keeping it simple.
4) Better asynchronous support.
5) If someone needs a gRPC kind of thing, it can be wrapped over TLV.

Since rte_flow already has the TLV concept, it may not be new to DPDK.
I really liked how rte_flow enables ABI compatibility and how easy it is to add
new stuff. I am trying to follow a similar approach, which is proven in DPDK.
I.e., new profile creation will be very easy: it will be a matter of identifying
the TLVs and their type and payload, rather than everyone coming up with
new APIs in every profile.
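
For example, a profile could be described as little more than a table of TLV
descriptors carrying the same attributes as the Doxygen tables in the RFC
(Stag, Direction, Type, Payload, Pair TLV). The sketch below is only an
illustration: the demo_* names, the descriptor struct and the payload fields
are hypothetical and do not come from the RFC header files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sub-tags of one profile, modelled on the host ethernet port. */
enum demo_stag {
	DEMO_STAG_H2D_INFO,
	DEMO_STAG_D2H_INFO,
};

enum demo_dir { DEMO_DIR_H2D, DEMO_DIR_D2H };
enum demo_type { DEMO_TYPE_ATTACHED, DEMO_TYPE_STOPPED, DEMO_TYPE_USER_PLANE };

/* Hypothetical response payload; the fields are invented for the example. */
struct demo_d2h_info {
	uint16_t nb_rx_queues;
	uint16_t nb_tx_queues;
};

/* One row per TLV: the attributes the RFC documents in its Doxygen tables. */
struct demo_tlv_desc {
	uint16_t stag;
	uint8_t dir;          /* enum demo_dir */
	uint8_t type;         /* enum demo_type */
	uint32_t payload_len; /* 0 when the TLV has no payload */
	uint16_t pair_stag;   /* matching request or response */
};

/* The whole "API" of this sketch profile is this table. */
static const struct demo_tlv_desc demo_profile[] = {
	{ DEMO_STAG_H2D_INFO, DEMO_DIR_H2D, DEMO_TYPE_ATTACHED,
	  0, DEMO_STAG_D2H_INFO },
	{ DEMO_STAG_D2H_INFO, DEMO_DIR_D2H, DEMO_TYPE_ATTACHED,
	  sizeof(struct demo_d2h_info), DEMO_STAG_H2D_INFO },
};

/* Generic lookup shared by all profiles; returns NULL for unknown sub-tags. */
static const struct demo_tlv_desc *
demo_profile_lookup(uint16_t stag)
{
	size_t i;

	for (i = 0; i < sizeof(demo_profile) / sizeof(demo_profile[0]); i++) {
		if (demo_profile[i].stag == stag)
			return &demo_profile[i];
	}
	return NULL;
}
```

Adding a TLV to the profile is then one more row in the table, and the core
library code that validates, routes or pairs TLVs does not change.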


>
>
> Why not use protobuf and its IDL to specify the interface?
>
>
