From: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
To: Jerin Jacob <jerinjacobk@gmail.com>
Cc: "jerinj@marvell.com" <jerinj@marvell.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"thomas@monjalon.net" <thomas@monjalon.net>,
"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
"aboyer@pensando.io" <aboyer@pensando.io>,
"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
"beilei.xing@intel.com" <beilei.xing@intel.com>,
"bruce.richardson@intel.com" <bruce.richardson@intel.com>,
"chas3@att.com" <chas3@att.com>,
"chenbo.xia@intel.com" <chenbo.xia@intel.com>,
"ciara.loftus@intel.com" <ciara.loftus@intel.com>,
"dsinghrawat@marvell.com" <dsinghrawat@marvell.com>,
"ed.czeck@atomicrules.com" <ed.czeck@atomicrules.com>,
"evgenys@amazon.com" <evgenys@amazon.com>,
"grive@u256.net" <grive@u256.net>,
"g.singh@nxp.com" <g.singh@nxp.com>,
"zhouguoyang@huawei.com" <zhouguoyang@huawei.com>,
"haiyue.wang@intel.com" <haiyue.wang@intel.com>,
"hkalra@marvell.com" <hkalra@marvell.com>,
"heinrich.kuhn@corigine.com" <heinrich.kuhn@corigine.com>,
"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
"hyonkim@cisco.com" <hyonkim@cisco.com>,
"igorch@amazon.com" <igorch@amazon.com>,
"irusskikh@marvell.com" <irusskikh@marvell.com>,
"jgrajcia@cisco.com" <jgrajcia@cisco.com>,
"jasvinder.singh@intel.com" <jasvinder.singh@intel.com>,
"jianwang@trustnetic.com" <jianwang@trustnetic.com>,
"jiawenwu@trustnetic.com" <jiawenwu@trustnetic.com>,
"jingjing.wu@intel.com" <jingjing.wu@intel.com>,
"johndale@cisco.com" <johndale@cisco.com>,
"john.miller@atomicrules.com" <john.miller@atomicrules.com>,
"linville@tuxdriver.com" <linville@tuxdriver.com>,
"keith.wiles@intel.com" <keith.wiles@intel.com>,
"kirankumark@marvell.com" <kirankumark@marvell.com>,
"oulijun@huawei.com" <oulijun@huawei.com>,
"lironh@marvell.com" <lironh@marvell.com>,
"longli@microsoft.com" <longli@microsoft.com>,
"mw@semihalf.com" <mw@semihalf.com>,
"spinler@cesnet.cz" <spinler@cesnet.cz>,
"matan@nvidia.com" <matan@nvidia.com>,
"matt.peters@windriver.com" <matt.peters@windriver.com>,
"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
"mk@semihalf.com" <mk@semihalf.com>,
"humin29@huawei.com" <humin29@huawei.com>,
"pnalla@marvell.com" <pnalla@marvell.com>,
"ndabilpuram@marvell.com" <ndabilpuram@marvell.com>,
"qiming.yang@intel.com" <qiming.yang@intel.com>,
"qi.z.zhang@intel.com" <qi.z.zhang@intel.com>,
"radhac@marvell.com" <radhac@marvell.com>,
"rahul.lakkireddy@chelsio.com" <rahul.lakkireddy@chelsio.com>,
"rmody@marvell.com" <rmody@marvell.com>,
"rosen.xu@intel.com" <rosen.xu@intel.com>,
"sachin.saxena@oss.nxp.com" <sachin.saxena@oss.nxp.com>,
"skoteshwar@marvell.com" <skoteshwar@marvell.com>,
"shshaikh@marvell.com" <shshaikh@marvell.com>,
"shaibran@amazon.com" <shaibran@amazon.com>,
"shepard.siegel@atomicrules.com" <shepard.siegel@atomicrules.com>,
"asomalap@amd.com" <asomalap@amd.com>,
"somnath.kotur@broadcom.com" <somnath.kotur@broadcom.com>,
"sthemmin@microsoft.com" <sthemmin@microsoft.com>,
"steven.webster@windriver.com" <steven.webster@windriver.com>,
"skori@marvell.com" <skori@marvell.com>,
"mtetsuyah@gmail.com" <mtetsuyah@gmail.com>,
"vburru@marvell.com" <vburru@marvell.com>,
"viacheslavo@nvidia.com" <viacheslavo@nvidia.com>,
"xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
"cloud.wangxiaoyun@huawei.com" <cloud.wangxiaoyun@huawei.com>,
"yisen.zhuang@huawei.com" <yisen.zhuang@huawei.com>,
"yongwang@vmware.com" <yongwang@vmware.com>,
"xuanziyang2@huawei.com" <xuanziyang2@huawei.com>,
"pkapoor@marvell.com" <pkapoor@marvell.com>,
"nadavh@marvell.com" <nadavh@marvell.com>,
"sburla@marvell.com" <sburla@marvell.com>,
"pathreya@marvell.com" <pathreya@marvell.com>,
"gakhil@marvell.com" <gakhil@marvell.com>,
"mdr@ashroe.eu" <mdr@ashroe.eu>,
"dmitry.kozliuk@gmail.com" <dmitry.kozliuk@gmail.com>,
"anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
"cristian.dumitrescu@intel.com" <cristian.dumitrescu@intel.com>,
"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
"ruifeng.wang@arm.com" <ruifeng.wang@arm.com>,
"drc@linux.vnet.ibm.com" <drc@linux.vnet.ibm.com>,
"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
"jay.jayatheerthan@intel.com" <jay.jayatheerthan@intel.com>,
"asekhar@marvell.com" <asekhar@marvell.com>,
"pbhagavatula@marvell.com" <pbhagavatula@marvell.com>,
Elena Agostini <eagostini@nvidia.com>
Subject: Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library
Date: Fri, 29 Oct 2021 11:57:08 +0000
Message-ID: <35f086cb-bef3-9a11-6a85-7e695c0b0e7c@ericsson.com>
In-Reply-To: <CALBAE1N-i36g5miz2ahF=D9svLQ3LjhLsZueZ4VJ8fqMpLQR8A@mail.gmail.com>
On 2021-10-25 11:03, Jerin Jacob wrote:
> On Mon, Oct 25, 2021 at 1:05 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>> On 2021-10-19 20:14, jerinj@marvell.com wrote:
>>> From: Jerin Jacob <jerinj@marvell.com>
>>>
>>>
>>> Dataplane Workload Accelerator library
>>> ======================================
>>>
>>> Definition of Dataplane Workload Accelerator
>>> --------------------------------------------
>>> Dataplane Workload Accelerator(DWA) typically contains a set of CPUs,
>>> Network controllers and programmable data acceleration engines for
>>> packet processing, cryptography, regex engines, baseband processing, etc.
>>> This allows DWA to offload compute/packet processing/baseband/
>>> cryptography-related workload from the host CPU to save the cost and power.
>>> Also to enable scaling the workload by adding DWAs to the Host CPU as needed.
>>>
>>> Unlike other devices in DPDK, the DWA device is not fixed-function
>>> due to the fact that it has CPUs and programmable HW accelerators.
>>
>> There are already several instances of DPDK devices with pure-software
>> implementation. In this regard, a DPU/SmartNIC represents nothing new.
>> What's new, it seems to me, is a much-increased need to
>> configure/arrange the processing in complex manners, to avoid bouncing
>> everything to the host CPU.
> Yes and No. It will be based on the profile. The TLV type TYPE_USER_PLANE will
> have user plane traffic from/to host. For example, offloading ORAN split 7.2
> baseband profile. Transport blocks sent to/from host as TYPE_USER_PLANE.
>
>> Something like P4 or rte_flow-based hooks or
>> some other kind of extension. The eventdev adapters solve the same
>> problem (where on some systems packets go through the host CPU on their
>> way to the event device, and others do not) - although on a *much*
>> smaller scale.
> Yes. Eventdev Adapters only for event device plumbing.
>
>
>>
>> "Not-fixed function" seems to call for more hot plug support in the
>> device APIs. Such functionality could then be reused by anything that
>> can be reconfigured dynamically (FPGAs, firmware-programmed
>> accelerators, etc.),
> Yes.
>
>> but which may not be able to serve as a RPC
>> endpoint, like a SmartNIC.
> It can. That's the reason for choosing TLVs. So that
> any higher level language can use TLVs like https://github.com/ustropo/uttlv
> to communicate with the accelerator. TLVs follow the request and
> response scheme like RPC. So it can wrap it under application if needed.
>
>>
>> DWA could be some kind of DPDK-internal framework for managing certain
>> type of DPUs, but should it be exposed to the user application?
>
> Could you clarify a bit more.
> The offload is represented as a set of TLVs in generic fashion. There
> is no DPU specific bit in offload representation. See
> rte_dwa_profile_l3fwd.h header file.
It seems a bit cumbersome to work with TLVs on the user application
side. Would it be an alternative to have the profile API as a set of C
APIs instead of a TLV-based messaging interface? The underlying
implementation could still be - in many or all cases - TLVs sent over
some appropriate transport.

Such a C API could still be asynchronous, and still be a profile API
(rather than a set of new DPDK device types).

What I tried to ask during the meeting, but didn't get an answer to
(or at least not one that I could understand), was how the profiles were
to be specified and/or documented. Maybe the above is what you had in mind
already.
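To make the idea concrete, here is a minimal sketch of what such a typed C call layered on top of TLVs could look like. Everything in it is hypothetical and not taken from the RFC: the `tlv` header layout, the `tlv_xfer_t` transport hook, the tag values, the rule layout and the `l3fwd_lookup_add_v4()` name are assumptions made for illustration only. `loopback_xfer()` merely stands in for a real control port.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical TLV layout (an assumption, not the RFC's rte_dwa_tlv). */
struct tlv {
	uint16_t tag;   /* message group */
	uint16_t stag;  /* message within the group */
	uint32_t len;   /* payload length in bytes */
	uint8_t msg[];  /* payload */
};

/* Hypothetical transport hook: sends a request and returns a malloc'ed
 * response that the caller frees, or NULL on failure. */
typedef struct tlv *(*tlv_xfer_t)(void *dev, const struct tlv *req);

/* Made-up tag values for this sketch. */
enum { TAG_PROFILE_L3FWD = 1, STAG_H2D_LOOKUP_ADD = 3 };

/* Made-up payload for an IPv4 lookup rule. */
struct l3fwd_v4_rule {
	uint32_t ip_dst;
	uint16_t eth_port_dst;
	uint8_t depth;
};

/* Typed C API: the TLV is built internally, so the application never
 * handles it. Returns 0 on success and -1 on failure. */
int
l3fwd_lookup_add_v4(void *dev, tlv_xfer_t xfer, uint32_t ip_dst,
		    uint8_t depth, uint16_t eth_port_dst)
{
	struct l3fwd_v4_rule rule;
	struct tlv *req, *rsp;

	assert(xfer != NULL);
	assert(depth <= 32);

	memset(&rule, 0, sizeof(rule));
	rule.ip_dst = ip_dst;
	rule.eth_port_dst = eth_port_dst;
	rule.depth = depth;

	req = malloc(sizeof(*req) + sizeof(rule));
	if (req == NULL)
		return -1;
	req->tag = TAG_PROFILE_L3FWD;
	req->stag = STAG_H2D_LOOKUP_ADD;
	req->len = sizeof(rule);
	memcpy(req->msg, &rule, sizeof(rule));

	rsp = xfer(dev, req);
	free(req);
	if (rsp == NULL)
		return -1;
	free(rsp);
	return 0;
}

/* Stand-in transport: records the last request header and returns an
 * empty response. A real one would use the device's control port. */
uint16_t last_tag, last_stag;
uint32_t last_len;

struct tlv *
loopback_xfer(void *dev, const struct tlv *req)
{
	(void)dev;
	last_tag = req->tag;
	last_stag = req->stag;
	last_len = req->len;
	return calloc(1, sizeof(struct tlv));
}
```

With something along these lines, the application would call `l3fwd_lookup_add_v4(dev, xfer, ip, 24, port)` and never touch a TLV. An asynchronous variant could return a request handle instead of waiting for the response.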
> TB hosted a meeting for this at Date: Wednesday, October 27th Time:
> 3pm UTC, https://meet.jit.si/DPDK
> Feel free to join.
>
>
>>
>>> This enables DWA personality/workload to be completely programmable.
>>> Typical examples of DWA offloads are Flow/Session management,
>>> Virtual switch, TLS offload, IPsec offload, l3fwd offload, etc.
>>> Motivation for the new library
>>> ------------------------------
>>> Even though a lot of semiconductor vendors offer different forms of DWA,
>>> such as DPU (often called Smart-NIC), GPU, IPU, XPU, etc.,
>>> due to the lack of standard APIs to "Define the workload" and
>>> "Communication between HOST and DWA", it is difficult for DPDK
>>> consumers to use them in a portable way across different DWA vendors
>>> and enable it in cloud environments.
>>>
>>>
>>> Contents of RFC
>>> ------------------
>>> This RFC attempts to define standard APIs for:
>>>
>>> 1) Definition of Profiles corresponding to well defined workloads, which includes
>>> a set of TLV(Messages) as a request and response scheme to define
>>> the contract between host and DWA to offload a workload.
>>> (See lib/dwa/rte_dwa_profile_* header files)
>>> 2) Discovery of a DWA's capabilities (e.g. which specific workloads it can support)
>>> in a vendor independent fashion. (See rte_dwa_dev_disc_profiles())
>>> 3) Attaching a set of profiles to a DWA device(See rte_dwa_dev_attach())
>>> 4) A communication framework between Host and DWA(See rte_dwa_ctrl_op() for
>>> control plane and rte_dwa_port_host_* for user plane)
>>> 5) Virtualization of DWA hardware and firmware (Use standard DPDK device/bus model)
>>> 6) Enablement of administrative functions such as FW updates and
>>> resource partitioning in a DWA, i.e. items that are global in
>>> nature and applicable to all DWA devices under the DWA.
>>> (See rte_dwa_profile_admin.h)
>>>
>>> Also, this RFC defines the L3FWD profile to offload the L3FWD workload to DWA.
>>> This RFC defines an ethernet-style host port for Host to DWA communication.
>>> Different host port types may be required to cover the large spectrum of DWA types as
>>> transports like PCIe DMA, Shared Memory, or Ethernet are fundamentally different,
>>> and optimal performance needs host port specific APIs.
>>>
>>> The framework does not force an abstraction of different transport interfaces as
>>> a single API; instead, it decouples TLV from the transport interface and focuses on
>>> defining the TLVs and leaving vendors to specify the host ports
>>> specific to their DWA architecture.
>>>
>>>
>>> Roadmap
>>> -------
>>> 1) Address the comments for this RFC and enable the common code
>>> 2) SW drivers/infrastructure for `DWA` and `DWA device`
>>> as two separate DPDK processes over `memif` DPDK ethdev driver for
>>> L3FWD offload. This is to enable the framework without any special HW.
>>> 3) Example DWA device application for L3FWD profile.
>>> 4) Marvell DWA Device drivers.
>>> 5) Based on community interest new profile can be added in the future.
>>>
>>>
>>> DWA library framework
>>> ---------------------
>>>
>>> DWA components:
>>>
>>> +--> rte_dwa_port_host_*()
>>> | (User Plane traffic as TLV)
>>> |
>>> +----------------------+ | +--------------------+
>>> | | | | DPDK DWA Device[0] |
>>> | +----------------+ | Host Port | +----------------+ |
>>> | | | |<========+==>| | | |
>>> | | Profile 0 | | | | Profile X | |
>>> | | | | | | | |
>>> <=============>| +----------------+ | Control Port| +----------------+ |
>>> DWA Port0 | +----------------+ |<========+==>| |
>>> | | | | | +--------------------+
>>> | | Profile 1 | | |
>>> | | | | +--> rte_dwa_ctrl_op()
>>> | +----------------+ | (Control Plane traffic as TLV)
>>> <=============>| Dataplane |
>>> DWA Port1 | Workload |
>>> | Accelerator | +---------- ---------+
>>> | (HW/FW/SW) | | DPDK DWA Device[N] |
>>> | | Host Port | +----------------+ |
>>> <=============>| +----------------+ |<===========>| | | |
>>> DWA PortN | | | | | | Profile Y | |
>>> | | Profile N | | | | ^ | |
>>> | | | | Control Port| +-----------|----+ |
>>> | +-------|--------+ |<===========>| | |
>>> | | | +-------------|------+
>>> +----------|-----------+ |
>>> | |
>>> +---------------------------------------+
>>> ^
>>> |
>>> +--rte_dwa_dev_attach()
>>>
>>>
>>> Dataplane Workload Accelerator: It is an abstract model. The model is
>>> capable of offloading the dataplane workload from application via
>>> DPDK API over host and control ports of a DWA device.
>>> Dataplane Workload Accelerator(DWA) typically contains a set of CPUs,
>>> Network controllers, and programmable data acceleration engines for
>>> packet processing, cryptography, regex engines, base-band processing, etc.
>>> This allows DWA to offload compute/packet processing/base-band/cryptography-related
>>> workload from the host CPU to save cost and power. Also,
>>> enable scaling the workload by adding DWAs to the host CPU as needed.
>>>
>>> DWA device: A DWA can be sliced to N number of DPDK DWA device(s)
>>> based on the resources available in DWA.
>>> The DPDK API interface operates on the DPDK DWA device.
>>> It is a representation of a set of resources in DWA.
>>>
>>> TLV: A TLV (tag-length-value) encoded data stream contains the tag as
>>> message ID, followed by message length, and finally the message payload.
>>> The 32bit message ID consists of two parts, 16bit Tag and 16bit Subtag.
>>> The tag represents the ID of a group of similar messages,
>>> whereas the subtag represents a message tag ID under the group.
>>>
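A minimal sketch of how such a 32-bit message ID could be packed and unpacked. The text above does not say which half holds the tag, so placing the tag in the upper 16 bits is an assumption, and the `tlv_mk_id()`, `tlv_id_tag()` and `tlv_id_stag()` helpers are hypothetical names rather than the RFC's RTE_DWA_TLV_MK_ID() macro.

```c
#include <stdint.h>

/* Assumed layout: tag in bits 31..16, subtag in bits 15..0. */
static inline uint32_t
tlv_mk_id(uint16_t tag, uint16_t stag)
{
	return ((uint32_t)tag << 16) | stag;
}

/* Extract the group tag from a message ID. */
static inline uint16_t
tlv_id_tag(uint32_t id)
{
	return (uint16_t)(id >> 16);
}

/* Extract the per-group subtag from a message ID. */
static inline uint16_t
tlv_id_stag(uint32_t id)
{
	return (uint16_t)(id & 0xffffu);
}
```

Whatever the actual layout turns out to be, it would help to spell it out in the profile documentation, since both ends of the transport need to agree on it.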
>>> Control Port: Used for transferring the control plane TLVs. Every DPDK
>>> DWA device must have a control port. Only one outstanding TLV can be
>>> processed via this port by a single DWA device. This makes the control
>>> port suitable for the control plane.
>>>
>>> Host Port: Used for transferring the user plane TLVs.
>>> Ethernet, PCIe DMA, Shared Memory, etc. are examples of
>>> different transport mechanisms abstracted under the host port.
>>> The primary purpose of the host port is to decouple the user plane TLVs from
>>> underlying transport mechanism differences.
>>> Unlike the control port, more than one outstanding TLV can be processed by
>>> a single DWA device via this port.
>>> This makes the host port transfer asynchronous in nature,
>>> to support high-volume, low-latency user plane traffic.
>>>
>>> DWA Port: Used for transferring data between the external source and DWA.
>>> Ethernet, eCPRI are examples of DWA ports. Unlike host ports,
>>> the host CPU is not involved in transferring the data to/from DWA ports.
>>> These ports are typically connected to the Network controller inside the
>>> DWA to transfer the traffic from the external source.
>>>
>>> TLV direction: `Host to DWA` and `DWA to Host` are the directions
>>> of TLV messages. The former is specified as H2D, and the latter is
>>> specified as D2H. The H2D control TLVs are used for requesting DWA to perform
>>> a specific action and D2H control TLVs are used to respond to the requested
>>> actions. The H2D user plane messages are used for transferring data from the
>>> host to the DWA. The D2H user plane messages are used for transferring
>>> data from the DWA to the host.
>>>
>>> DWA device states: Following are the different states of a DWA device.
>>> - READY: DWA Device is ready to attach the profile.
>>> See rte_dwa_dev_disc_profiles() API to discover the profile.
>>> - ATTACHED: DWA Device attached to one or more profiles.
>>> See rte_dwa_dev_attach() API to attach the profile(s).
>>> - STOPPED: Profile is in the stop state.
>>> TLV type `TYPE_ATTACHED` and `TYPE_STOPPED` messages are valid in this state.
>>> rte_dwa_dev_attach() or explicitly invoking the rte_dwa_stop() API
>>> brings the device to this state.
>>> - RUNNING: Invoking rte_dwa_start() brings the device to this state.
>>> TLV type `TYPE_STARTED` and `TYPE_USER_PLANE` are valid in this state.
>>> - DETACHED: Invoking rte_dwa_dev_detach() brings the device to this state.
>>> The device and profile must be in the STOPPED state prior to
>>> invoking the rte_dwa_dev_detach().
>>> - CLOSED: A stopped/detached DWA device has been closed. The device cannot be restarted.
>>> Invoking rte_dwa_dev_close() brings the device to this state.
>>>
>>> TLV types: Following are the different TLV types
>>> - TYPE_ATTACHED: Valid when the device is in `ATTACHED`, `STOPPED` and `RUNNING` state.
>>> - TYPE_STOPPED: Valid when the device is in `STOPPED` state.
>>> - TYPE_STARTED: Valid when the device is in `RUNNING` state.
>>> - TYPE_USER_PLANE: Valid when the device is in `RUNNING` state and
>>> used to transfer only user plane traffic.
>>>
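The validity rules for the four TLV types can be written down as a small predicate. This is a sketch only: the enum and function names are hypothetical, and the mapping is simply a transcription of the list above.

```c
#include <stdbool.h>

/* Hypothetical names for the device states listed above. */
enum dev_state {
	DEV_READY, DEV_ATTACHED, DEV_STOPPED,
	DEV_RUNNING, DEV_DETACHED, DEV_CLOSED
};

/* Hypothetical names for the TLV types listed above. */
enum tlv_type {
	TYPE_ATTACHED, TYPE_STOPPED, TYPE_STARTED, TYPE_USER_PLANE
};

/* Returns true if a TLV of the given type may be issued in the given state. */
bool
tlv_type_valid(enum tlv_type type, enum dev_state state)
{
	switch (type) {
	case TYPE_ATTACHED:
		return state == DEV_ATTACHED || state == DEV_STOPPED ||
		       state == DEV_RUNNING;
	case TYPE_STOPPED:
		return state == DEV_STOPPED;
	case TYPE_STARTED:
	case TYPE_USER_PLANE:
		return state == DEV_RUNNING;
	}
	return false;
}
```

A check of this kind could live in the library, so that a TLV sent in the wrong state is rejected on the host side instead of travelling to the device first.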
>>> Profile: Specifies a workload that the dataplane workload accelerator
>>> processes on behalf of a DPDK application through a DPDK DWA device.
>>> A profile is expressed as a set of TLV messages for control plane and user plane
>>> functions. Each TLV message must have Tag, SubTag, Direction, Type, Payload attributes.
>>>
>>> Programming model: Typical application programming sequence is as follows,
>>> 1) In the EAL initialization phase, the DWA devices shall be probed,
>>> the application can query the number of available DWA devices with
>>> rte_dwa_dev_count() API.
>>> 2) Application discovers the available profile(s) in a DWA device using
>>> rte_dwa_dev_disc_profiles() API.
>>> 3) Application attaches one or more profile(s) to a DWA device using
>>> rte_dwa_dev_attach().
>>> 4) Once the profile is attached, the device shall be in the STOPPED state.
>>> Configure the profile(s) with `TYPE_ATTACHED` and `TYPE_STOPPED`
>>> type TLVs using rte_dwa_ctrl_op() API.
>>> 5) Once the profile is configured, move the profile to the `RUNNING` state
>>> by invoking rte_dwa_start() API.
>>> 6) Once the profile is in running state and if it has user plane TLV,
>>> transfer those TLVs using rte_dwa_port_host_*() API based on the available
>>> host port for the given profile attached.
>>> 7) Application can change the dynamic configuration aspects in
>>> `RUNNING` state using rte_dwa_ctrl_op() API by issuing `TYPE_STARTED` type
>>> of TLV messages.
>>> 8) Finally, use rte_dwa_stop(), rte_dwa_dev_detach(), rte_dwa_dev_close()
>>> sequence for tear-down.
>>>
>>>
>>> L3FWD profile
>>> -------------
>>>
>>> +-------------->--[1]--------------+
>>> | |
>>> +-----------|----------+ |
>>> | | | |
>>> | +--------|-------+ | |
>>> | | | | |
>>> | | L3FWD Profile | | |
>>> \ | | | | |
>>> <====\========>| +----------------+ | |
>>> DWA \Port0 | Lookup Table | +---------|----------+
>>> \ | +----------------+ | | DPDK DWA|Device[0] |
>>> \ | | IP | Dport | | Host Port | +-------|--------+ |
>>> \ | +----------------+ |<===========>| | | | |
>>> +~[3]~~~|~~~~~~~|~~~~~~~~|~~~~~~~~~~~~~~~~~>|->L3FWD Profile | |
>>> <=============>| +----------------+ | | | | |
>>> DWA Port1 | | | | | Control Port| +-|---------|----+ |
>>> | +----------------+ |<===========>| | | |
>>> ~~~>~~[5]~~~~|~~|~~~+ | | | +---|---------|------+
>>> | +---+------------+ | | |
>>> ~~~<~~~~~~~~~|~~|~~~+ | |<-|------[2]--------+ |
>>> | +----------------+<-|------[4]------------------+
>>> | Dataplane |
>>> <=============>| Workload |
>>> DWA PortN | Accelerator |
>>> | (HW/FW/SW) |
>>> +----------------------+
>>>
>>>
>>> L3FWD profile offloads Layer-3 forwarding between the DWA Ethernet ports.
>>>
>>> The above diagram depicts the profile and application programming sequence.
>>> 1) DWA device attaches the L3FWD profile using rte_dwa_dev_attach().
>>> 2) Configure the L3FWD profile:
>>> a) The application requests L3FWD profile capabilities of the DWA
>>> by using RTE_DWA_STAG_PROFILE_L3FWD_H2D_INFO, On response,
>>> the RTE_DWA_STAG_PROFILE_L3FWD_D2H_INFO returns the lookup modes
>>> supported, max rules supported, and available host ports for this profile.
>>> b) The application configures a set of DWA ports to use a
>>> lookup mode(EM, LPM, or FIB) via RTE_DWA_STAG_PROFILE_L3FWD_H2D_CONFIG.
>>> c) The application configures a valid host port to receive exception packets.
>>> 3) An exception packet that does not match any forwarding table entry comes as
>>> RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS TLV to host. DWA stores the exception
>>> packet send back destination ports after completing step (4).
>>> 4) Parse the exception packet and add rules to the FWD table using
>>> RTE_DWA_STAG_PROFILE_L3FWD_H2D_LOOKUP_ADD. If the application knows the rules beforehand,
>>> it can add the rules in step 2.
>>> 5) When DWA ports receive the matching flows in the lookup table, DWA forwards
>>> to DWA Ethernet ports without host CPU intervention.
>>>
>>>
>>> Example application usage with L3FWD profile
>>> --------------------------------------------
>>> This example application demonstrates the programming model of the DWA library.
>>> This example omits the error checks to simplify the application.
>>>
>>> void
>>> dwa_profile_l3fwd_add_rule(rte_dwa_obj_t obj, struct rte_mbuf *mbuf)
>>> {
>>> struct rte_dwa_profile_l3fwd_h2d_lookup_add *lookup;
>>> struct rte_dwa_tlv *h2d, *d2h;
>>> struct rte_ether_hdr *eth_hdr;
>>> struct rte_ipv4_hdr *ipv4_hdr;
>>> uint32_t id;
>>> size_t len;
>>>
>>> id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_LOOKUP_ADD);
>>> len = sizeof(struct rte_dwa_profile_l3fwd_h2d_lookup_add);
>>> h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>> lookup = h2d->msg;
>>> /* Simply hardcode to IPv4 instead of looking for Packet type to simplify example */
>>> lookup->rule_type = RTE_DWA_PROFILE_L3FWD_RULE_TYPE_IPV4;
>>> lookup->v4_rule.prefix.depth = 24;
>>>
>>> eth_hdr = rte_pktmbuf_mtod(mbuf, struct rte_ether_hdr *);
>>> ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>> lookup->v4_rule.prefix.ip_dst = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
>>> lookup->eth_port_dst = mbuf->port;
>>>
>>> rte_dwa_tlv_fill(h2d, id, len, h2d);
>>> d2h = rte_dwa_ctrl_op(obj, h2d);
>>> free(h2d);
>>> free(d2h);
>>> }
>>>
>>> void
>>> dwa_profile_l3fwd_port_host_ethernet_worker(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>> struct rte_dwa_profile_l3fwd_d2h_exception_pkts *msg;
>>> struct rte_dwa_tlv *tlv;
>>> uint16_t i, rc, nb_tlvs;
>>> struct rte_mbuf *mbuf;
>>>
>>> while (!ctx->done) {
>>> rc = rte_dwa_port_host_ethernet_rx(obj, 0, &tlv, 1);
>>> if (!rc)
>>> continue;
>>>
>>> /* Since L3FWD profile has only one User Plane TLV, Message must be
>>> * RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS message
>>> */
>>> msg = (struct rte_dwa_profile_l3fwd_d2h_exception_pkts *)tlv->msg;
>>> for (i = 0; i < msg->nb_pkts; i++) {
>>> mbuf = msg->pkts[i];
>>> /* Got a exception pkt from DWA, handle it by adding as new rule in
>>> * lookup table in DWA
>>> */
>>> dwa_profile_l3fwd_add_rule(obj, mbuf);
>>> /* Free the mbuf to pool */
>>> rte_pktmbuf_free(mbuf);
>>> }
>>>
>>> /* Done with TLV mbuf container, free it back */
>>> rte_mempool_put(ctx->tlv_pool, tlv);
>>> }
>>> }
>>>
>>> bool
>>> dwa_port_host_ethernet_config(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>> struct rte_dwa_tlv info_h2d, *info_d2h, *h2d = NULL, *d2h;
>>> struct rte_dwa_port_host_ethernet_d2h_info *info;
>>> int tlv_pool_element_sz;
>>> bool rc = false;
>>> uint32_t id;
>>> size_t len;
>>>
>>> /* Get the Ethernet host port info */
>>> id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_INFO);
>>> rte_dwa_tlv_fill(&info_h2d, id, 0, NULL);
>>> info_d2h = rte_dwa_ctrl_op(obj, &info_h2d);
>>>
>>> info = rte_dwa_tlv_d2h_to_msg(info_d2h);
>>> if (info == NULL)
>>> goto fail;
>>> /* Need min one Rx queue to Receive exception traffic */
>>> if (info->nb_rx_queues == 0)
>>> goto fail;
>>> /* Done with message from DWA. Free back to implementation */
>>> free(obj, info_d2h);
>>>
>>> /* Allocate exception packet pool */
>>> ctx->pkt_pool = rte_pktmbuf_pool_create("exception pool", /* Name */
>>> ctx->pkt_pool_depth, /* Number of elements*/
>>> 512, /* Cache size*/
>>> 0,
>>> RTE_MBUF_DEFAULT_BUF_SIZE,
>>> ctx->socket_id);
>>>
>>>
>>> tlv_pool_element_sz = DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ * sizeof(struct rte_mbuf *);
>>> tlv_pool_element_sz += sizeof(struct rte_dwa_profile_l3fwd_d2h_exception_pkts);
>>>
>>> /* Allocate TLV pool for RTE_DWA_STAG_PROFILE_L3FWD_D2H_EXCEPTION_PACKETS tag */
>>> ctx->tlv_pool = rte_mempool_create("TLV pool", /* mempool name */
>>> ctx->tlv_pool_depth, /* Number of elements*/
>>> tlv_pool_element_sz, /* Element size*/
>>> 512, /* cache size*/
>>> 0, NULL, NULL, NULL /* Obj constructor */, NULL,
>>> ctx->socket_id, 0 /* flags */);
>>>
>>>
>>> /* Configure Ethernet host port */
>>> id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_CONFIG);
>>> len = sizeof(struct rte_dwa_port_host_ethernet_config);
>>> h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>> struct rte_dwa_port_host_ethernet_config *cfg = h2d->msg;
>>> /* Update the Ethernet configuration parameters */
>>> cfg->nb_rx_queues = 1;
>>> cfg->nb_tx_queues = 0;
>>> cfg->max_burst = DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ;
>>> cfg->pkt_pool = ctx->pkt_pool;
>>> cfg->tlv_pool = ctx->tlv_pool;
>>> rte_dwa_tlv_fill(h2d, id, len, h2d);
>>> d2h = rte_dwa_ctrl_op(obj, h2d);
>>> if (d2h == NULL)
>>> goto fail;
>>>
>>> free(h2d);
>>>
>>> /* Configure Rx queue 0 to receive exception traffic */
>>> id = RTE_DWA_TLV_MK_ID(PORT_HOST_ETHERNET, H2D_QUEUE_CONFIG);
>>> len = sizeof(struct rte_dwa_port_host_ethernet_queue_config);
>>> h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>> struct rte_dwa_port_host_ethernet_queue_config *qcfg = h2d->msg;
>>> qcfg->id = 0; /* 0th Queue */
>>> qcfg->enable = 1;
>>> qcfg->is_tx = 0; /* Rx queue */
>>> qcfg->depth = ctx->rx_queue_depth;
>>> rte_dwa_tlv_fill(h2d, id, len, h2d);
>>> d2h = rte_dwa_ctrl_op(obj, h2d);
>>> if (d2h == NULL)
>>> goto fail;
>>>
>>> free(h2d);
>>>
>>> return true;
>>> fail:
>>> if (h2d)
>>> free(h2d);
>>> return rc;
>>> }
>>>
>>> bool
>>> dwa_profile_l3fwd_config(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>> struct rte_dwa_tlv info_h2d, *info_d2h = NULL, *h2d, *d2h = NULL;
>>> struct rte_dwa_port_dwa_ethernet_d2h_info *info;
>>> struct rte_dwa_profile_l3fwd_h2d_config *cfg;
>>> bool rc = false;
>>> uint32_t id;
>>> size_t len;
>>>
>>> /* Get DWA Ethernet port info */
>>> id = RTE_DWA_TLV_MK_ID(PORT_DWA_ETHERNET, H2D_INFO);
>>> rte_dwa_tlv_fill(&info_h2d, id, 0, NULL);
>>> info_d2h = rte_dwa_ctrl_op(obj, &info_h2d);
>>>
>>> info = rte_dwa_tlv_d2h_to_msg(info_d2h);
>>> if (info == NULL)
>>> goto fail;
>>>
>>> /* Not found any DWA ethernet ports */
>>> if (info->nb_ports == 0)
>>> goto fail;
>>>
>>> /* Configure L3FWD profile */
>>> id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_CONFIG);
>>> len = sizeof(struct rte_dwa_profile_l3fwd_h2d_config) + (sizeof(uint16_t) * info->nb_ports);
>>> h2d = malloc(RTE_DWA_TLV_HDR_SZ + len);
>>>
>>> cfg = h2d->msg;
>>> /* Update the L3FWD configuration parameters */
>>> cfg->mode = ctx->mode;
>>> /* Attach all DWA Ethernet ports onto L3FWD profile */
>>> cfg->nb_eth_ports = info->nb_ports;
>>> memcpy(cfg->eth_ports, info->avail_ports, sizeof(uint16_t) * info->nb_ports);
>>>
>>> rte_dwa_tlv_fill(h2d, id, len, h2d);
>>> d2h = rte_dwa_ctrl_op(obj, h2d);
>>> free(h2d);
>>>
>>> /* All good */
>>> rc = true;
>>> fail:
>>> if (info_d2h)
>>> free(obj, info_d2h);
>>> if (d2h)
>>> free(obj, d2h);
>>>
>>> return rc;
>>> }
>>>
>>> bool
>>> dwa_profile_l3fwd_has_capa(rte_dwa_obj_t obj, struct app_ctx *ctx)
>>> {
>>> struct rte_dwa_profile_l3fwd_d2h_info *info;
>>> struct rte_dwa_tlv h2d, *d2h;
>>> bool found = false;
>>> uint32_t id, i;
>>>
>>> /* Get L3FWD profile info */
>>> id = RTE_DWA_TLV_MK_ID(PROFILE_L3FWD, H2D_INFO);
>>> rte_dwa_tlv_fill(&h2d, id, 0, NULL);
>>> d2h = rte_dwa_ctrl_op(obj, &h2d);
>>>
>>> info = rte_dwa_tlv_d2h_to_msg(d2h);
>>> /* Request failed */
>>> if (info == NULL)
>>> goto fail;
>>> /* Required lookup modes is not supported */
>>> if (!(info->modes_supported & ctx->mode))
>>> goto fail;
>>>
>>> /* Check profile supports HOST_ETHERNET port as this application
>>> * supports only host port as Ethernet
>>> */
>>> for (i = 0; i < info->nb_host_ports; i++) {
>>> if (info->host_ports[i] == RTE_DWA_TAG_PORT_HOST_ETHERNET) {
>>> found = true;
>>> }
>>> }
>>>
>>> /* Done with response, Free the d2h memory allocated by implementation */
>>> free(obj, d2h);
>>> fail:
>>> return found;
>>> }
>>>
>>>
>>> bool
>>> dwa_has_profile(enum rte_dwa_tag_profile pf)
>>> {
>>> enum rte_dwa_tag_profile *pfs = NULL;
>>> bool found = false;
>>> int nb_pfs, i;
>>>
>>> /* Get the number of profiles on the DWA device */
>>> nb_pfs = rte_dwa_dev_disc_profiles(0, NULL);
>>> pfs = malloc(sizeof(enum rte_dwa_tag_profile) * nb_pfs);
>>> /* Fetch all the profiles */
>>> nb_pfs = rte_dwa_dev_disc_profiles(0, pfs);
>>>
>>> /* Check the list has requested profile */
>>> for (i = 0; i < nb_pfs; i++) {
>>> if (pfs[i] == pf)
>>> found = true;
>>> }
>>> free(pfs);
>>>
>>>
>>> return found;
>>> }
>>>
>>>
>>> #include <rte_dwa.h>
>>>
>>> #define DWA_EXCEPTION_PACKETS_PKT_BURST_MAX_SZ 32
>>>
>>> struct app_ctx {
>>> bool done;
>>> struct rte_mempool *pkt_pool;
>>> struct rte_mempool *tlv_pool;
>>> enum rte_dwa_profile_l3fwd_lookup_mode mode;
>>> int socket_id;
>>> int pkt_pool_depth;
>>> int tlv_pool_depth;
>>> int rx_queue_depth;
>>> } __rte_cache_aligned;
>>>
>>> int
>>> main(int argc, char **argv)
>>> {
>>> rte_dwa_obj_t obj = NULL;
>>> struct app_ctx ctx;
>>> int rc;
>>>
>>> /* Initialize EAL */
>>> rc = rte_eal_init(argc, argv);
>>> if (rc < 0)
>>> rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
>>> argc -= rc;
>>> argv += rc;
>>>
>>>
>>> memset(&ctx, 0, sizeof(ctx));
>>> /* Set application default values */
>>> ctx.mode = RTE_DWA_PROFILE_L3FWD_MODE_LPM;
>>> ctx.socket_id = SOCKET_ID_ANY;
>>> ctx.pkt_pool_depth = 10000;
>>> ctx.tlv_pool_depth = 10000;
>>> ctx.rx_queue_depth = 10000;
>>>
>>> /* Step 1: Check any DWA devices present */
>>> rc = rte_dwa_dev_count();
>>> if (rc <= 0)
>>> rte_exit(EXIT_FAILURE, "Failed to find DWA devices\n");
>>>
>>> /* Step 2: Check DWA device has L3FWD profile or not */
>>> if (!dwa_has_profile(RTE_DWA_TAG_PROFILE_L3FWD))
>>> rte_exit(EXIT_FAILURE, "L3FWD profile not found\n");
>>>
>>> /*
>>> * Step 3: Now that, workload accelerator has L3FWD profile,
>>> * offload L3FWD workload to accelerator by attaching the profile
>>> * to accelerator.
>>> */
>>> enum rte_dwa_tag_profile profile[] = {RTE_DWA_TAG_PROFILE_L3FWD};
>>> obj = rte_dwa_dev_attach(0, "my_custom_accelerator_device", profile, 1);
>>>
>>> /* Step 4: Check Attached L3FWD profile has required capability to proceed */
>>> if (!dwa_profile_l3fwd_has_capa(obj, &ctx))
>>> rte_exit(EXIT_FAILURE, "L3FWD profile does not have enough capability \n");
>>>
>>> /* Step 5: Configure l3fwd profile */
>>> if (!dwa_profile_l3fwd_config(obj, &ctx))
>>> rte_exit(EXIT_FAILURE, "L3FWD profile configure failed \n");
>>>
>>> /* Step 6: Configure ethernet host port to receive exception packets */
>>> if (!dwa_port_host_ethernet_config(obj, &ctx))
>>> rte_exit(EXIT_FAILURE, "Ethernet host port configure failed\n");
>>>
>>> /* Step 7 : Move DWA profiles to start state */
>>> rte_dwa_start(obj);
>>>
>>> /* Step 8: Handle exception packets and add lookup rules for them */
>>> dwa_profile_l3fwd_port_host_ethernet_worker(obj, &ctx);
>>>
>>> /* Step 9: Clean up */
>>> rte_dwa_stop(obj);
>>> rte_dwa_dev_detach(0, obj);
>>> rte_dwa_dev_close(0);
>>>
>>> return 0;
>>> }
>>>
>>>
>>> Jerin Jacob (1):
>>> dwa: introduce dataplane workload accelerator subsystem
>>>
>>> doc/api/doxy-api-index.md | 13 +
>>> doc/api/doxy-api.conf.in | 1 +
>>> lib/dwa/dwa.c | 7 +
>>> lib/dwa/meson.build | 17 ++
>>> lib/dwa/rte_dwa.h | 184 +++++++++++++
>>> lib/dwa/rte_dwa_core.h | 264 +++++++++++++++++++
>>> lib/dwa/rte_dwa_dev.h | 154 +++++++++++
>>> lib/dwa/rte_dwa_port_dwa_ethernet.h | 68 +++++
>>> lib/dwa/rte_dwa_port_host_ethernet.h | 178 +++++++++++++
>>> lib/dwa/rte_dwa_profile_admin.h | 85 ++++++
>>> lib/dwa/rte_dwa_profile_l3fwd.h | 378 +++++++++++++++++++++++++++
>>> lib/dwa/version.map | 3 +
>>> lib/meson.build | 1 +
>>> 13 files changed, 1353 insertions(+)
>>> create mode 100644 lib/dwa/dwa.c
>>> create mode 100644 lib/dwa/meson.build
>>> create mode 100644 lib/dwa/rte_dwa.h
>>> create mode 100644 lib/dwa/rte_dwa_core.h
>>> create mode 100644 lib/dwa/rte_dwa_dev.h
>>> create mode 100644 lib/dwa/rte_dwa_port_dwa_ethernet.h
>>> create mode 100644 lib/dwa/rte_dwa_port_host_ethernet.h
>>> create mode 100644 lib/dwa/rte_dwa_profile_admin.h
>>> create mode 100644 lib/dwa/rte_dwa_profile_l3fwd.h
>>> create mode 100644 lib/dwa/version.map
>>>
2021-10-19 18:14 jerinj
2021-10-19 18:14 ` [dpdk-dev] [RFC PATCH 1/1] dwa: introduce dataplane workload accelerator subsystem jerinj
2021-10-19 19:08 ` [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library Thomas Monjalon
2021-10-19 19:36 ` Jerin Jacob
2021-10-19 20:42 ` Stephen Hemminger
2021-10-20 5:25 ` Jerin Jacob
2021-10-19 20:42 ` Tom Herbert
2021-10-20 5:38 ` Jerin Jacob
2021-10-22 12:00 ` Elena Agostini
2021-10-22 13:39 ` Jerin Jacob
2021-10-25 7:35 ` Mattias Rönnblom
2021-10-25 9:03 ` Jerin Jacob
2021-10-29 11:57 ` Mattias Rönnblom [this message]
2021-10-29 15:51 ` Jerin Jacob
2021-10-31 9:18 ` Mattias Rönnblom
2021-10-31 14:01 ` Jerin Jacob
2021-10-31 19:34 ` Thomas Monjalon
2021-10-31 21:13 ` Jerin Jacob
2021-10-31 21:55 ` Thomas Monjalon
2021-10-31 22:19 ` Jerin Jacob