From: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>
To: "antonin@barefootnetworks.com" <antonin@barefootnetworks.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Daly, Dan" <dan.daly@intel.com>
Subject: Re: [dpdk-dev] [RFC] P4 enablement in DPDK
Date: Tue, 19 Jun 2018 17:52:17 +0000
Message-ID: <3EB4FA525960D640B5BDFFD6A3D891268E75B899@IRSMSX108.ger.corp.intel.com>
In-Reply-To: <767DDC46-5501-4147-9004-0AB38C34C17B@barefootnetworks.com>
Hi Antonin,
Thanks very much for your input and your help going forward!
More comments inline below.
From: antonin@barefootnetworks.com [mailto:antonin@barefootnetworks.com]
Sent: Saturday, June 16, 2018 12:26 AM
To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
Cc: dev@dpdk.org; Daly, Dan <dan.daly@intel.com>
Subject: Re: [dpdk-dev] [RFC] P4 enablement in DPDK
Hi,
I want to express support for this proposal and adding P4 capabilities in DPDK. For example, I personally see a lot of demand for a production-quality P4-programmable software switch.
[Cristian] Excellent, thank you!
A few comments on this:
1) I see a lot of similarities between the proposed PDEV table runtime API and the existing PI C API: https://github.com/p4lang/PI/tree/master/include/PI. I wonder if there would be value in trying to re-use them - at least partially - for this.
[Cristian] Yes, DPDK PDEV should be very much aligned with the P4 language and the P4Runtime API.
PI is very much aligned with P4 and strictly Program Independent. That does not seem to be completely the case for the PDEV table runtime API (DSCP table, TTL, …). I’m not familiar enough with DPDK to understand the rationale for this, but I don’t see why DPDK couldn’t have its own extensions to the PI API.
[Cristian] Yes, these are simply extensions to support the frequent actions on IP packets, as most of the packets are IP packets, which justifies specializations for performance reasons.
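As an illustration of why a specialized IP action can be cheaper than a generic field write, here is a minimal sketch of a TTL decrement that patches the IPv4 header checksum incrementally (RFC 1624, equation 3) instead of recomputing it over the whole header. The function names are hypothetical and are not taken from rte_pdev.h; the header is treated as a raw byte array to keep the sketch independent of byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Full ones' complement checksum over the header bytes as stored. */
static uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word. */
static uint16_t csum_update16(uint16_t hc, uint16_t m_old, uint16_t m_new)
{
    uint32_t sum = (uint32_t)(uint16_t)~hc + (uint32_t)(uint16_t)~m_old + m_new;

    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Hypothetical specialized action: decrement the TTL (byte 8) and patch the
 * checksum (bytes 10-11) in place. Returns the new TTL. The caller is assumed
 * to have already dropped packets whose TTL is 0.
 */
static uint8_t ipv4_ttl_dec(uint8_t *hdr)
{
    uint16_t hc = (uint16_t)((hdr[10] << 8) | hdr[11]);
    uint16_t m_old = (uint16_t)((hdr[8] << 8) | hdr[9]);
    uint16_t m_new;

    hdr[8]--;
    m_new = (uint16_t)((hdr[8] << 8) | hdr[9]);
    hc = csum_update16(hc, m_old, m_new);
    hdr[10] = (uint8_t)(hc >> 8);
    hdr[11] = (uint8_t)(hc & 0xFF);
    return hdr[8];
}
```

The incremental update touches a single 16-bit word, whereas the full recomputation walks every word of the header. ipv4_checksum() is kept here only as a reference: run over a header that already contains a valid checksum, it returns 0.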
2) For the sake of avoiding fragmentation of the community, I would strongly recommend making sure that there is an available P4Runtime (https://p4.org/p4-spec/docs/P4Runtime-v1.0.0.pdf) implementation for DPDK. That would require a mapping from P4Runtime messages to PDEV API calls. The advantage of trying to align PDEV with PI (first bullet point) is that there is already a mapping from P4Runtime messages to PI API calls.
[Cristian] Yes, we need to fit all the P4Runtime functionality in the PDEV API.
The burden of supporting P4Runtime can probably be reduced by leveraging the Stratum project (https://stratumproject.org/), which unfortunately is not open-source yet.
[Cristian] Yes, the “instrumentation” layer translating between the gRPC calls of P4RT and the DPDK PDEV API probably fits into an SDN controller such as Stratum.
3) It seems that the notion of “action profile” here is more general than in P4, or more precisely than in the P4_16 PSA architecture (Portable Switch Architecture). Since this term has a strong connotation in the P4 world, maybe another term should be used instead if possible.
[Cristian] Yes, we’ll probably rename “action profile” to something like “action configuration template” to avoid name clash with the action profile construct from the P4 language.
4) I recommend looking into the notion of “architecture” in P4_16 and trying to decide if you want to a) have generic support for all P4 architectures (at least for the CPU implementation), b) support the PSA architecture specifically (which is the primary / only architecture used as part of Stratum) or c) define your own architecture specifically for targets that are going to support P4 through DPDK drivers (which may limit your impact).
[Cristian] The PDEV API should support all features of the P4 language, the set of extern constructs defined by the PSA architecture and the configuration API defined by the P4RT; of course, support for each PDEV API feature is subject to the target supporting it. The PDEV API must be agnostic of whether the implementation (DPDK driver) addresses a HW or SW target. As stated, we want to support a generous range of P4 capable devices (FPGAs, ASICs, NICs, Smart NICs, etc) as well as the SW target (CPU based), with the latter likely to be implemented based on DPDK Packet Framework libraries.
5) Conceptually the APIs can be split into 2 parts: a) the table runtime APIs, which are generally pretty-straightforward and b) pipeline query & configuration APIs. Both P4Runtime (SetForwardingPipelineConfig) & PI (pi_device_update_[start|end]) include mechanisms to re-configure the data-plane, by providing the compiler output to the target.
For b), I strongly recommend looking into what we have done with P4Runtime. SetForwardingPipelineConfig provides the target with a P4Info message (which is target-agnostic and describes the interface of each runtime-controllable P4 object; in a way I believe it is similar to your table_create PDEV API) and a target-specific opaque “blob”. For reconfigurable SW & HW the “blob” is essentially a description of the pipeline: it can be some text file, binary register values, an object file, etc…
The case of fixed-function devices is usually trickier. We actually do not have a pipeline discovery mechanism in P4Runtime & PI. In P4Runtime, we just assume that the control-plane is aware of the pipeline and has access to a P4Info message for it. We still require the P4Runtime client to call SetForwardingPipelineConfig with the “right” P4Info message (we expect the target to return an error if the P4Info is not the right one) and a potentially empty “blob”.
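The fixed-function behaviour described above can be sketched roughly as follows. This is only a model of the semantics, not P4Runtime or PI code: the struct, the function name, the error codes and the use of a plain string in place of a serialized P4Info message are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the payload of SetForwardingPipelineConfig. */
struct pipeline_config {
    const char *p4info;  /* target-agnostic interface description */
    const void *blob;    /* target-specific opaque data, may be NULL */
    size_t blob_len;     /* 0 for an empty blob */
};

/* The one P4Info this imaginary fixed-function device was built for. */
static const char fixed_p4info[] = "tables { name: \"ipv4_lpm\" }";

/*
 * A fixed-function target cannot be reprogrammed, so all it can do is check
 * that the client's view of the pipeline matches its own. The blob is
 * ignored in this sketch because there is nothing to load.
 */
static int fixed_set_pipeline_config(const struct pipeline_config *cfg)
{
    if (cfg == NULL || cfg->p4info == NULL)
        return -EINVAL;
    if (strcmp(cfg->p4info, fixed_p4info) != 0)
        return -EINVAL;  /* not the "right" P4Info for this device */
    return 0;
}
```

A reconfigurable target would instead pass the blob on to whatever loads the compiler output into the device.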
I think the take-away is that there isn’t a unified pipeline creation mechanism across programmable targets, i.e. it is difficult to break down pipeline creation into a sequence of universal sub-API calls, such as “create_table”, “create_parser”, etc… However, it would make perfect sense IMO to design and implement such an API in the context of a specific DPDK SW switch. The P4 compiler backend would then be in charge of generating the appropriate sequence of API calls.
[Cristian] Yes, it makes perfect sense for the PDEV API to support the pipeline query/discovery service and the run-time management of pipelines, same as P4RT. The pipeline creation service is very useful for the CPU SW target, and probably for other targets where the application pipeline can be specified/constructed incrementally; it can be left unimplemented by the targets that only support a monolithic specification/creation mechanism.
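One way to picture an optional pipeline creation service is a per-driver operations table in which the incremental creation callback may be left NULL. The sketch below is an assumption about how this could look; none of the type or function names come from rte_pdev.h or rte_pdev_driver.h, and the choice of -ENOTSUP for a missing callback is illustrative.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver operations: incremental creation is optional. */
struct pdev_ops {
    int (*pipeline_load)(const void *blob, size_t len); /* monolithic path */
    int (*table_create)(const char *name);              /* may be NULL */
};

/* Generic API wrapper: report "not supported" when the driver opted out. */
static int pdev_table_create(const struct pdev_ops *ops, const char *name)
{
    if (ops == NULL || name == NULL)
        return -EINVAL;
    if (ops->table_create == NULL)
        return -ENOTSUP;
    return ops->table_create(name);
}

/* Imaginary CPU SW target: tables can be added one at a time. */
static int sw_n_tables;

static int sw_table_create(const char *name)
{
    (void)name;
    sw_n_tables++;
    return 0;
}

static int sw_pipeline_load(const void *blob, size_t len)
{
    (void)blob;
    (void)len;
    return 0;
}

static const struct pdev_ops sw_ops = {
    .pipeline_load = sw_pipeline_load,
    .table_create = sw_table_create,
};

/* Imaginary monolithic target: only accepts a complete compiler output. */
static const struct pdev_ops monolithic_ops = {
    .pipeline_load = sw_pipeline_load,
    .table_create = NULL,
};
```

With this layout, a compiler backend for the SW target could emit a sequence of pdev_table_create() calls, while a backend for a monolithic target would only ever call pipeline_load.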
Overall I’m very excited to see some work being done in this area. I believe a lot of people will be able to help, especially with compiler backend development. To summarize my 5 bullet points above, I would say that there are 2 important areas of investigation as far as I can tell:
1) what should be the compiler backend output for the DPDK CPU SW target (sequence of API calls)? For non-programmable devices, having the “right” P4Info is usually enough. Existing P4-programmable hardware already comes with its own compiler backend (Barefoot Tofino ASIC, Xilinx FPGAs).
[Cristian] See my comment above. Likely more work required here for you and me☺.
2) can we try to avoid fragmentation and re-use existing code with P4Runtime / PI / Stratum?
[Cristian] Yes, this is work that spans across multiple projects: dpdk.org (PDEV API and drivers), p4.org (DPDK back-end for P4 compiler), stratumproject.org (SDN controller adaptation layers).
Thanks,
Antonin
On Apr 18, 2018, at 10:22 AM, Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:
P4 is a language for programming the data plane of network devices [1]. The P4
language is developed by p4.org, which is joining ONF and Linux Foundation [2].
This API provides a way to program P4 capable devices through DPDK. The purpose
of this API is to enable P4 compilers [3] to generate high performance DPDK code
out of P4 programs.
The main advantage of this approach is that P4 enablement of network devices can
be done through DPDK in a unified way:
1. This API serves as the interface between the P4 compiler front-end (target
independent) and the P4 compiler back-ends (target specific).
2. Device vendors develop their device drivers as part of DPDK by
implementing this API. The device driver is agnostic of being called by the
P4 front-end. The device driver serves as the P4 compiler target specific
back-end.
3. The P4 compiler front-end is target independent. The amount of C code it
generates is minimized by calling this API directly for every P4 feature
as opposed to vendor-specific free-style C code generation.
This API introduces a pipeline device (PDEV) by using a similar approach to the
existing ethdev and eventdev DPDK device-like APIs implemented by the DPDK Poll
Mode Drivers (PMDs). Main features:
1. Discovery of built-in pipeline devices and their capabilities.
2. Creation of new pipelines out of input ports, output ports, tables and
actions.
3. Registration of packet protocol header and meta-data fields.
4. Action definition for input ports, output ports and tables.
5. Pipeline run-time API for table population, statistics read, etc.
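To make item 5 more concrete, here is a toy model of run-time table population and statistics read for an exact-match table. It is a sketch only: the types, names, fixed table size and linear search are all assumptions for illustration and do not reflect the contents of rte_pdev.h.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 8

struct toy_entry {
    uint32_t key;        /* e.g. an IPv4 destination address */
    uint32_t action_id;  /* identifies the action to run on a hit */
};

struct toy_table {
    struct toy_entry entries[TOY_MAX_ENTRIES];
    uint32_t n_entries;
    uint64_t n_hits;     /* statistics readable by the control plane */
    uint64_t n_misses;
};

/* Run-time population: add an entry, or update the action of an existing key. */
static int toy_table_entry_add(struct toy_table *t, uint32_t key,
                               uint32_t action_id)
{
    if (t == NULL)
        return -EINVAL;
    for (uint32_t i = 0; i < t->n_entries; i++) {
        if (t->entries[i].key == key) {
            t->entries[i].action_id = action_id;
            return 0;
        }
    }
    if (t->n_entries == TOY_MAX_ENTRIES)
        return -ENOSPC;
    t->entries[t->n_entries].key = key;
    t->entries[t->n_entries].action_id = action_id;
    t->n_entries++;
    return 0;
}

/* Data-plane lookup: returns 0 and the action on a hit, -ENOENT on a miss. */
static int toy_table_lookup(struct toy_table *t, uint32_t key,
                            uint32_t *action_id)
{
    if (t == NULL || action_id == NULL)
        return -EINVAL;
    for (uint32_t i = 0; i < t->n_entries; i++) {
        if (t->entries[i].key == key) {
            *action_id = t->entries[i].action_id;
            t->n_hits++;
            return 0;
        }
    }
    t->n_misses++;
    return -ENOENT;
}
```

A real driver would replace the linear search with a hash or hardware lookup; the point here is only the split between control-plane population and data-plane lookup with counters.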
This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
well as CPUs. Let’s remember that the first P in P4 stands for Programmable, and
the CPUs are arguably the most programmable devices. The implementation for the
CPU SW target is expected to use the DPDK Packet Framework libraries such as
librte_pipeline, librte_port, librte_table with some expected but moderate API
and implementation adjustments.
Links:
[1] P4-16 language specification:
https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf
[2] p4.org to join ONF and LF: https://p4.org/p4/onward-and-upward.html
[3] p4c: https://github.com/p4lang/p4c
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
lib/librte_pipeline/rte_pdev.h | 1654 +++++++++++++++++++++++++++++++++
lib/librte_pipeline/rte_pdev_driver.h | 283 ++++++
2 files changed, 1937 insertions(+)
create mode 100644 lib/librte_pipeline/rte_pdev.h
create mode 100644 lib/librte_pipeline/rte_pdev_driver.h