DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC] P4 enablement in DPDK
@ 2018-04-18 17:22 Cristian Dumitrescu
  2018-04-19  5:04 ` Kuusisaari, Juhamatti
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Cristian Dumitrescu @ 2018-04-18 17:22 UTC (permalink / raw)
  To: dev; +Cc: dan.daly

P4 is a language for programming the data plane of network devices [1]. The P4
language is developed by p4.org which is joining ONF and Linux Foundation [2].

This API provides a way to program P4 capable devices through DPDK. The purpose
of this API is to enable P4 compilers [3] to generate high performance DPDK code
out of P4 programs.

The main advantage of this approach is that P4 enablement of network devices can
be done through DPDK in a unified way:

   1. This API serves as the interface between the P4 compiler front-end (target
      independent) and the P4 compiler backe-ends (target specific).

   2. Device vendors develop their device drivers as part of DPDK by
      implementing this API. The device driver is agostic of being called by the
      P4 front-end. The device driver serves as the P4 compiler taget specific
      back-end.

   3. The P4 compiler front-end is target independent. The amount of C code it
      generates is minimized by calling this API directly for every P4 feature
      as opposed to vendor-specific free-style C code generation.

This API introduces a pipeline device (PDEV) by using a similar approach to the
existing ethdev and eventdev DPDK device-like APIs implemented by the DPDK Poll
Mode Drivers (PMDs). Main features:

   1. Discovery of built-in pipeline devices and their capabilities.

   2. Creation of new pipelines out of input ports, output ports, tables and
      actions.

   3. Registration of packet protocol header and meta-data fields.

   4. Action definition for input ports, output ports and tables.

   5. Pipeline run-time API for table population, statistics read, etc.

This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
well as CPUs. Let’s remember that the first P in P4 stands for Programmable, and
the CPUs are arguably the most programmable devices. The implementation for the
CPU SW target is expected to use the DPDK Packet Framework libraries such as
librte_pipeline, librte_port, librte_table with some expected but moderate API
and implementation adjustments.

Links:

   [1] P4-16 language specification:
       https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf

   [2] p4.org to join ONF and LF: https://p4.org/p4/onward-and-upward.html

   [3] p4c: https://github.com/p4lang/p4c

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 lib/librte_pipeline/rte_pdev.h        | 1654 +++++++++++++++++++++++++++++++++
 lib/librte_pipeline/rte_pdev_driver.h |  283 ++++++
 2 files changed, 1937 insertions(+)
 create mode 100644 lib/librte_pipeline/rte_pdev.h
 create mode 100644 lib/librte_pipeline/rte_pdev_driver.h

diff --git a/lib/librte_pipeline/rte_pdev.h b/lib/librte_pipeline/rte_pdev.h
new file mode 100644
index 0000000..7095197
--- /dev/null
+++ b/lib/librte_pipeline/rte_pdev.h
@@ -0,0 +1,1654 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef __INCLUDE_RTE_PDEV_H__
+#define __INCLUDE_RTE_PDEV_H__
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file
+ * RTE Pipeline Device (PDEV)
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_dev.h>
+#include <rte_ether.h>
+
+/** PDEV device handle data type. */
+struct rte_pdev;
+
+/**
+ * PDEV Capability API
+ */
+
+/** PDEV capabilities. */
+struct rte_pdev_capabilities {
+	/** Number of built-in pipelines.
+	 * @see rte_pdev_next_get()
+	 */
+	uint32_t n_pipelines_builtin;
+
+	/** Non-zero when new pipelines can be created, zero otherwise.
+	 * @see rte_pdev_create()
+	 */
+	int create;
+};
+
+/**
+ * PDEV capabilities get
+ *
+ * @param[in] dev
+ *   Current device.
+ * @param[out] cap
+ *   PDEV capabilities. Must be non-NULL.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_capabilities_get(struct rte_device *dev,
+	struct rte_pdev_capabilities *cap);
+
+/**
+ * PDEV Discovery API
+ *
+ */
+
+/**
+ * PDEV get next
+ *
+ * This API is used to discover pre-existent pipelines on given device. This is
+ * typically the case for built-in HW pipelines.
+ *
+ * @param[in] dev
+ *   Current device.
+ * @param[in] pdev
+ *   Handle to current PDEV on *dev*. Set to NULL during the first invocation
+ *   for *dev* device.
+ * @return
+ *   When non-NULL, handle to next PDEV, otherwise no more PDEV.
+ *
+ * @see struct rte_pdev_capabilities::n_pipelines_builtin
+ */
+struct rte_pdev *
+rte_pdev_next_get(struct rte_device *dev, struct rte_pdev *pdev);
+
+/**
+ * PDEV Create API
+ */
+
+/** PDEV statistics counter type. */
+enum rte_pdev_stats_type {
+	/** Number of packets. */
+	RTE_PDEV_STATS_N_PKTS = 1 << 0,
+
+	/** Number of packet bytes. */
+	RTE_PDEV_STATS_N_BYTES = 1 << 1,
+};
+
+/** PDEV parameters. */
+struct rte_pdev_params {
+	/** PDEV name. */
+	const char *name;
+
+	/** Statistics counters to be enabled.
+	 * @see enum rte_pdev_stats_type
+	 */
+	uint64_t stats_mask;
+};
+
+/**
+ * PDEV create
+ *
+ * This API is to be called to create new pipelines on given device. This is
+ * typically supported by reconfigurable HW devices and SW pipelines.
+ *
+ * @param[in] dev
+ *   Current device.
+ * @param[in] params
+ *   PDEV parameters. Must be non-NULL and valid.
+ * @return
+ *   When non-NULL, handle to created PDEV, otherwise error.
+ *
+ * @see struct rte_pdev_capabilities::create
+ */
+struct rte_pdev *
+rte_pdev_create(struct rte_device *dev,
+	struct rte_pdev_params *params);
+
+/**
+ * PDEV free
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_free(struct rte_pdev *pdev);
+
+/**
+ * PDEV start
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_start(struct rte_pdev *pdev);
+
+/** PDEV input port types. */
+enum rte_pdev_port_in_type {
+	/** Builtin device. */
+	RTE_PDEV_PORT_IN_BUILTIN = 0,
+
+	/** Ethernet device. */
+	RTE_PDEV_PORT_IN_ETHDEV,
+};
+
+/** PDEV input port parameters. */
+struct rte_pdev_port_in_params {
+	/** Type. */
+	enum rte_pdev_port_in_type type;
+
+	/** Device specific parameters. */
+	union {
+		/** Builtin device. */
+		struct {
+			/** Builtin device name. */
+			const char *name;
+		} builtin;
+
+		/** Ethernet device. */
+		struct {
+			/** Ethernet device name. */
+			const char *name;
+
+			/** Reception side queue ID. */
+			uint32_t rx_queue_id;
+
+			/** Burst size. */
+			uint32_t burst_size;
+		} ethdev;
+	} dev;
+};
+
+/**
+ * PDEV input port create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV input port ID. Must not be used by any existing *pdev* input port.
+ * @params[in] params
+ *   Input port parameters. Must be non-NULL and valid.
+ * @params[in] enable
+ *   When non-zero, the new input port is initially enabled, otherwise disabled.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_create(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_in_params *params,
+	int enable);
+
+/**
+ * PDEV input port connect to table
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   Input port ID.
+ * @params[in] table_id
+ *   Table ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_connect(struct rte_pdev *pdev,
+	uint32_t port_id,
+	uint32_t table_id);
+
+/** PDEV table match type. */
+enum rte_pdev_table_match_type {
+	/**  Wildcard match. */
+	RTE_PDEV_TABLE_MATCH_WILDCARD = 0,
+
+	/** Exact match. */
+	RTE_PDEV_TABLE_MATCH_EXACT,
+
+	/** Longest Prefix Match (LPM). */
+	RTE_PDEV_TABLE_MATCH_LPM,
+
+	/** Index match. */
+	RTE_PDEV_TABLE_MATCH_INDEX,
+
+	/** Stub match. No real match process: the default rule is always hit. */
+	RTE_PDEV_TABLE_MATCH_STUB,
+};
+
+/** PDEV table match parameters. */
+struct rte_pdev_table_match_params {
+	/** Packet field or packet meta-data field name at match offset 0. */
+	const char *start;
+
+	/** Match size (in bits).
+	 *
+	 * For LPM match type, typical values are 32 bits to match a single IPv4
+	 * address and 128 bits to match a single IPv6 address, but other values
+	 * are possible, for example for Virtual Routing and Forwarding (VRF).
+	 *
+	 * For INDEX match type, the maximum allowed value is 32 bits.
+	 */
+	uint32_t size;
+
+	/** Match mask (*size* bits are used and must be valid).
+	 *
+	 * For LPM match type, this parameter is ignored, as *size* implicitly
+	 * defines *mask* as *size* bits of 1.
+	 */
+	uint8_t *mask;
+};
+
+/** PDEV exact match table parameters. */
+struct rte_pdev_table_exact_match_params {
+	/** Number of hash table buckets. This parameter represents a hint that
+	 * the underlying implementation may ignore.
+	 */
+	uint32_t n_buckets;
+
+	/** Hash table type. Non-zero for extendable bucket hash table, zero for
+	 * Least Recently Used (LRU) hash table.
+	 */
+	int extendable_bucket;
+};
+
+/** PDEV table parameters. */
+struct rte_pdev_table_params {
+	/** Match type. */
+	enum rte_pdev_table_match_type match_type;
+
+	/** Match parameters. Ignored for STUB match type. */
+	struct rte_pdev_table_match_params match;
+
+	/** Match type specific parameters. */
+	RTE_STD_C11
+	union {
+		/** Exact match table specific parameters. */
+		struct rte_pdev_table_exact_match_params exact;
+	};
+
+	/** Maximum number of rules to be stored in the current table. */
+	uint32_t n_rules;
+};
+
+/**
+ * PDEV table create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] table_id
+ *   PDEV table ID. Must not be used by any existing *pdev* table.
+ * @params[in] params
+ *   Table parameters. Must be non-NULL and valid.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_create(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_table_params *params);
+
+/** PDEV output port type. */
+enum rte_pdev_port_out_type {
+	/** Builtin device. */
+	RTE_PDEV_PORT_OUT_BUILTIN = 0,
+
+	/** Ethernet device. */
+	RTE_PDEV_PORT_OUT_ETHDEV,
+
+	/** Drop all packets device. */
+	RTE_PDEV_PORT_OUT_DROP,
+};
+
+/** PDEV output port parameters. */
+struct rte_pdev_port_out_params {
+	/** Type. */
+	enum rte_pdev_port_out_type type;
+
+	/** Device specific parameters. */
+	union {
+		/** Builtin device. */
+		struct {
+			/** Builtin device name. */
+			const char *name;
+		} builtin;
+
+		/** Ethernet device. */
+		struct {
+			/** Ethernet device name. */
+			const char *name;
+
+			/** Transmission side queue ID. */
+			uint32_t tx_queue_id;
+
+			/** Burst size. */
+			uint32_t burst_size;
+		} ethdev;
+	} dev;
+};
+
+/**
+ * PDEV output port create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV output port ID. Must not be used by any existing *pdev* output port.
+ * @params[in] params
+ *   Output port parameters. Must be non-NULL and valid.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_create(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_out_params *params);
+
+/**
+ * PDEV meta-data definition API
+ */
+
+/**
+ * PDEV packet field registration.
+ *
+ * Create symbolic alias for a protocol header or header field in the input
+ * packet. This alias can then be used as part of assignment actions registered
+ * for PDEV input ports, tables or output ports, either as left hand side value
+ * or as one of the right hand side expression operands, as appropriate. The
+ * packet field registered with the name of "x" is used as "pkt.x".
+ *
+ * This alias is typically translated to its offset and size, which are then
+ * used during the execution of assignment actions to access the associated data
+ * bytes within the packet.
+ *
+ * The scope of the packet field aliases is the PDEV instance. The attributes
+ * such as offset or size cannot be changed after the alias registration. This
+ * approach assumes the input packet type is known in advance, as opposed to
+ * having each input packet parsed to detect its type. This is a reasonable
+ * assumption, given that NIC capabilities to filter each packet type to a
+ * different RX queue are quite common; the NIC is configured transparently to
+ * the PDEV, with each NIC RX queue mapped as different PDEV input port.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] name
+ *   Symbolic alias for the input packet header or header field.
+ * @params[in] offset
+ *   Byte offset within the input packet. Offset 0 points to the first byte of
+ *   the packet.
+ * @params[in] size
+ *   Field size (in bytes).
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_pkt_field_register(struct rte_pdev *pdev,
+	const char *name,
+	uint32_t offset,
+	uint32_t size);
+
+/**
+ * PDEV packet meta-data field registration.
+ *
+ * Reserve field into the packet meta-data and assign a symbolic alias to it.
+ * This alias can then be used as part of assignment actions registered for PDEV
+ * input ports, tables and output ports, either as left hand side value or as
+ * one of the right hand side expression operands, as appropriate. The packet
+ * meta-data field registered with the name of "x" is used as "meta.x".
+ *
+ * Each input packet has its own private memory area reserved to store its
+ * meta-data, which is valid for the lifetime of the packet within the PDEV.
+ * This alias is typically translated to its offset and size, which are then
+ * used during the execution of assignment actions to access the associated
+ * packet meta-data bytes.
+ *
+ * The scope of the packet meta-data field aliases is the PDEV instance,
+ * therefore the meta-data layout is commmon for all the PDEV input packets.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] name
+ *   Symbolic alias for the packet meta-data field.
+ * @params[in] size
+ *   Field size (in bytes).
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_pkt_meta_field_register(struct rte_pdev *pdev,
+	const char *name,
+	uint32_t size);
+
+/**
+ * PDEV table rule data field registration.
+ *
+ * Reserve field into the table rule data and assign a symbolic alias to it.
+ * This alias can then be used as part of assignment actions registered for PDEV
+ * tables, either as left hand side value or as one of the right hand side
+ * expression operands, as appropriate. The table rule data field registered
+ * with the name of "x" is used as "table.x".
+ *
+ * This alias is typically translated to its offset and size, which are then
+ * used during the execution of assignment actions to access the associated
+ * table rule data bytes.
+ *
+ * The table rule data layout is common for all the rules of a given table that
+ * share the same action profile, therefore the scope of the table rule data
+ * field alias is its (table, action profile) pair.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] table_id
+ *   PDEV table ID.
+ * @params[in] action_profile_id
+ *   Table action profile ID.
+ * @params[in] name
+ *   Symbolic alias for the table rule data field.
+ * @params[in] size
+ *   Field size (in bytes).
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_field_register(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t action_profile_id,
+	const char *name,
+	uint32_t size);
+
+/**
+ * PDEV input port action API
+ */
+enum rte_pdev_port_in_action_type {
+	/** Assignment: lvalue = expression. */
+	RTE_PDEV_PORT_IN_ACTION_ASSIGN = 0,
+};
+
+/**
+ * RTE_PDEV_PORT_IN_ACTION_ASSIGN
+ */
+struct rte_pdev_port_in_action_assign_config {
+	/** Left hand side value for the assignment. Must be one of the
+	 * pre-registered packet meta-data field symbolic aliases. Packet field
+	 * and table rule data field symbolic aliases are not allowed.
+	 */
+	const char *lvalue;
+
+	/** Expression with operands and operators. The operands must be
+	 * pre-registered packet or packet meta-data field symbolic aliases.
+	 * Table rule data field aliases are not allowed.
+	 */
+	const char *expression;
+};
+
+/**
+ * PDEV input port action profile create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Input port action profile ID. Must not be used by any existing *pdev* input
+ *   port action profile.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_action_profile_create(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV input port action profile action register
+ *
+ * The action registration order is important, as it determines the action
+ * execution order. The same action type can be registered several times for the
+ * same profile.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Input port action profile ID.
+ * @params[in] action_type
+ *   Input port action type.
+ * @params[in] action_config
+ *   Input port action configuration. For input port action X, this parameter
+ *   needs to point to pre-allocated and valid instance of struct
+ *   rte_pdev_port_in_action_X_config.
+ *   Input port action configuration. For input port action X, this parameter
+ *   must point to valid instance of struct rte_pdev_port_in_action_X_config
+ *   when this structure is defined by the API or to NULL otherwise.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_action_profile_action_register(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_port_in_action_type action_type,
+	void *action_config);
+
+/**
+ * PDEV input port action profile freeze
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Input port action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_action_profile_freeze(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV input port action profile register
+ *
+ * Zero or at most one action profile can be registered for each input port.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV input port ID.
+ * @params[in] profile_id
+ *   Input port action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_action_profile_register(struct rte_pdev *pdev,
+	uint32_t port_id,
+	uint32_t profile_id);
+
+/**
+ * PDEV table action API
+ *
+ * For some actions below (e.g. packet encapsulations, NAT, statistics, etc),
+ * the same effect can be obtained through a sequence of assignment actions, but
+ * usually this approach is significantly less performant than using specialized
+ * actions.
+ *
+ * Some of the actions below (e.g. metering, timestamp update, etc) require the
+ * definition of specialized actions.
+ */
+enum rte_pdev_table_action_type {
+	/** Assignment: lvalue = expression. */
+	RTE_PDEV_TABLE_ACTION_ASSIGN = 0,
+
+	/** Load balancing. */
+	RTE_PDEV_TABLE_ACTION_LB,
+
+	/** Traffic metering and policing. */
+	RTE_PDEV_TABLE_ACTION_METER,
+
+	/** Packet encapsulation. */
+	RTE_PDEV_TABLE_ACTION_ENCAP,
+
+	/** Network Address Translation (NAT). */
+	RTE_PDEV_TABLE_ACTION_NAT,
+
+	/** Time to Leave (TTL) update. */
+	RTE_PDEV_TABLE_ACTION_TTL,
+
+	/** Statistics per table rule. */
+	RTE_PDEV_TABLE_ACTION_STATS,
+
+	/** Time stamp update. */
+	RTE_PDEV_TABLE_ACTION_TIME,
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_ASSIGN
+ */
+struct rte_pdev_table_action_assign_config {
+	/** Left hand side value for the assignment. Must be one of the
+	 * pre-registered packet, packet meta-data or table field symbolic
+	 * aliases.
+	 */
+	const char *lvalue;
+
+	/** Expression with operands and operators. The operands must be
+	 * pre-registered packet, packet meta-data or table field symbolic
+	 * aliases.
+	 */
+	const char *expression;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_LB
+ */
+/** Load balance action configuration (per table action profile). */
+struct rte_pdev_table_action_lb_config {
+	/** Hash key parameters. */
+	struct rte_pdev_table_match_params hash_key;
+
+	/** Hash function name. This parameter represents a hint that the
+	 * underlying implementation may ignore.
+	 */
+	const char *hash_func;
+
+	/** Hash function seed value. This parameter represents a hint that the
+	 * underlying implementation may ignore.
+	 */
+	uint64_t hash_seed;
+
+	/** Number of elements in the table storing the output values. */
+	uint32_t table_size;
+
+	/** Packet meta-data field name where the output value should be saved. */
+	const char *out;
+};
+
+/** Load balance action parameters (per table rule). */
+struct rte_pdev_table_action_lb_params {
+	/** Table defining the output values and their weights. Needs to be
+	 * pre-allocated with exactly *table_size* elements. The weights are set
+	 * in 1 / *table_size* increments. To assign a weight of N / *table_size*
+	 * to a given output value (0 <= N <= *table_size*), the same output
+	 * value needs to show up exactly N times in this table.
+	 */
+	uint32_t *table;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_MTR
+ */
+/** Packet color. */
+enum rte_pdev_meter_color {
+	RTE_PDEV_METER_COLOR_GREEN = 0, /**< Green. */
+	RTE_PDEV_METER_COLOR_YELLOW, /**< Yellow. */
+	RTE_PDEV_METER_COLOR_RED, /**< Red. */
+	RTE_PDEV_METER_COLORS /**< Number of colors. */
+};
+
+/** Differentiated Services Code Point (DSCP) translation table entry. */
+struct rte_pdev_dscp_table_entry {
+	/** Traffic class ID. Has to be strictly less than *n_tc*. */
+	uint32_t tc_id;
+
+	/** Packet input color. Used by the traffic metering algorithm in
+	 * color aware mode.
+	 */
+	enum rte_pdev_meter_color color;
+};
+
+/** DSCP translation table. */
+struct rte_pdev_dscp_table {
+	/** Array of DSCP table entries */
+	struct rte_pdev_dscp_table_entry entry[64];
+};
+
+/** Supported traffic metering algorithms. */
+enum rte_pdev_meter_algorithm {
+	/** Single Rate Three Color Marker (srTCM) - IETF RFC 2697. */
+	RTE_PDEV_METER_SRTCM_RFC2697,
+
+	/** Two Rate Three Color Marker (trTCM) - IETF RFC 2698. */
+	RTE_PDEV_METER_TRTCM_RFC2698,
+
+	/** Two Rate Three Color Marker (trTCM) - IETF RFC 4115. */
+	RTE_PDEV_METER_TRTCM_RFC4115,
+};
+
+/** Traffic metering profile (configuration template). */
+struct rte_pdev_meter_profile {
+	/** Traffic metering algorithm. */
+	enum rte_pdev_meter_algorithm alg;
+
+	RTE_STD_C11
+	union {
+		/** Items only valid when *alg* is set to srTCM - RFC 2697. */
+		struct {
+			/** Committed Information Rate (CIR) (bytes/second). */
+			uint64_t cir;
+
+			/** Committed Burst Size (CBS) (bytes). */
+			uint64_t cbs;
+
+			/** Excess Burst Size (EBS) (bytes). */
+			uint64_t ebs;
+		} srtcm_rfc2697;
+
+		/** Items only valid when *alg* is set to trTCM - RFC 2698. */
+		struct {
+			/** Committed Information Rate (CIR) (bytes/second). */
+			uint64_t cir;
+
+			/** Peak Information Rate (PIR) (bytes/second). */
+			uint64_t pir;
+
+			/** Committed Burst Size (CBS) (byes). */
+			uint64_t cbs;
+
+			/** Peak Burst Size (PBS) (bytes). */
+			uint64_t pbs;
+		} trtcm_rfc2698;
+
+		/** Items only valid when *alg* is set to trTCM - RFC 4115. */
+		struct {
+			/** Committed Information Rate (CIR) (bytes/second). */
+			uint64_t cir;
+
+			/** Excess Information Rate (EIR) (bytes/second). */
+			uint64_t eir;
+
+			/** Committed Burst Size (CBS) (byes). */
+			uint64_t cbs;
+
+			/** Excess Burst Size (EBS) (bytes). */
+			uint64_t ebs;
+		} trtcm_rfc4115;
+	};
+};
+
+/** Policer actions. */
+enum rte_pdev_policer {
+	/** Recolor the packet as green. */
+	RTE_PDEV_POLICER_COLOR_GREEN = 0,
+
+	/** Recolor the packet as yellow. */
+	RTE_PDEV_POLICER_COLOR_YELLOW,
+
+	/** Recolor the packet as red. */
+	RTE_PDEV_POLICER_COLOR_RED,
+
+	/** Drop the packet. */
+	RTE_PDEV_POLICER_DROP,
+};
+
+/** Meter action configuration per traffic class. */
+struct rte_pdev_table_action_mtr_tc_params {
+	/** Meter profile ID. */
+	uint32_t meter_profile_id;
+
+	/** Policer actions. */
+	enum rte_pdev_policer policer[RTE_PDEV_METER_COLORS];
+};
+
+/** Meter action statistics counters per traffic class. */
+struct rte_pdev_table_action_mtr_tc_counters {
+	/** Number of packets per color at the output of the traffic metering
+	 * and before the policer actions are executed. Only valid when
+	 * *n_packets_valid* is non-zero.
+	 */
+	uint64_t n_packets[RTE_PDEV_METER_COLORS];
+
+	/** Number of packet bytes per color at the output of the traffic
+	 * metering and before the policer actions are executed. Only valid when
+	 * *n_bytes_valid* is non-zero.
+	 */
+	uint64_t n_bytes[RTE_PDEV_METER_COLORS];
+
+	/** When non-zero, the *n_packets* field is valid. */
+	int n_packets_valid;
+
+	/** When non-zero, the *n_bytes* field is valid. */
+	int n_bytes_valid;
+};
+
+/** Meter action configuration (per table action profile). */
+struct rte_pdev_table_action_mtr_config {
+	/** Packet field for IP header. */
+	const char *ip;
+
+	/** IP protocol version. Non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** DSCP translation table. */
+	struct rte_pdev_dscp_table *dscp_table;
+
+	/** Number of traffic classes. Each traffic class has its own traffic
+	 * meter and policer instances.
+	 */
+	uint32_t n_tc;
+
+	/** Meter algorithm. */
+	enum rte_pdev_meter_algorithm alg;
+
+	/** When non-zero, the *n_packets* meter stats counter is enabled,
+	 * otherwise it is disabled.
+	 *
+	 * @see struct rte_pdev_table_action_mtr_tc_counters
+	 */
+	int n_packets_enabled;
+
+	/** When non-zero, the *n_bytes* meter stats counter is enabled,
+	 * otherwise it is disabled.
+	 *
+	 * @see struct rte_pdev_table_action_mtr_tc_counters
+	 */
+	int n_bytes_enabled;
+};
+
+/** Meter action parameters (per table rule). */
+struct rte_pdev_table_action_mtr_params {
+	/** Traffic meter and policer parameters for all traffic classes. Array
+	 * of *n_tc* elements.
+	 */
+	struct rte_pdev_table_action_mtr_tc_params *mtr;
+};
+
+/** Meter action statistics counters (per table rule). */
+struct rte_pdev_table_action_mtr_counters {
+	/** Stats counters for all traffic classes. Array of *n_tc* elements. */
+	struct rte_pdev_table_action_mtr_tc_counters *stats;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_ENCAP
+ */
+/** Supported packet encapsulation types. */
+enum rte_pdev_encap_type {
+	/** IP -> { Ether | IP } */
+	RTE_PDEV_ENCAP_ETHER = 0,
+
+	/** IP -> { Ether | VLAN | IP } */
+	RTE_PDEV_ENCAP_VLAN,
+
+	/** IP -> { Ether | S-VLAN | C-VLAN | IP } */
+	RTE_PDEV_ENCAP_QINQ,
+
+	/** IP -> { Ether | MPLS | IP } */
+	RTE_PDEV_ENCAP_MPLS,
+
+	/** IP -> { Ether | PPPoE | PPP | IP } */
+	RTE_PDEV_ENCAP_PPPOE,
+};
+
+/** Pre-computed Ethernet header fields for encapsulation action. */
+struct rte_pdev_ether_hdr {
+	struct ether_addr da; /**< Destination address. */
+	struct ether_addr sa; /**< Source address. */
+};
+
+/** Pre-computed VLAN header fields for encapsulation action. */
+struct rte_pdev_vlan_hdr {
+	uint8_t pcp; /**< Priority Code Point (PCP). */
+	uint8_t dei; /**< Drop Eligibility Indicator (DEI). */
+	uint16_t vid; /**< VLAN Identifier (VID). */
+};
+
+/** Pre-computed MPLS header fields for encapsulation action. */
+struct rte_pdev_mpls_hdr {
+	uint32_t label; /**< Label. */
+	uint8_t tc; /**< Traffic Class (TC). */
+	uint8_t ttl; /**< Time to Live (TTL). */
+};
+
+/** Pre-computed PPPoE header fields for encapsulation action. */
+struct rte_pdev_pppoe_hdr {
+	uint16_t session_id; /**< Session ID. */
+};
+
+/** Ether encap parameters. */
+struct rte_pdev_encap_ether_params {
+	struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
+};
+
+/** VLAN encap parameters. */
+struct rte_pdev_encap_vlan_params {
+	struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
+	struct rte_pdev_vlan_hdr vlan; /**< VLAN header. */
+};
+
+/** QinQ encap parameters. */
+struct rte_pdev_encap_qinq_params {
+	struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
+	struct rte_pdev_vlan_hdr svlan; /**< Service VLAN header. */
+	struct rte_pdev_vlan_hdr cvlan; /**< Customer VLAN header. */
+};
+
+/** Max number of MPLS labels per output packet for MPLS encapsulation. */
+#ifndef RTE_PDEV_MPLS_LABELS_MAX
+#define RTE_PDEV_MPLS_LABELS_MAX                            4
+#endif
+
+/** MPLS encap parameters. */
+struct rte_pdev_encap_mpls_params {
+	/** Ethernet header. */
+	struct rte_pdev_ether_hdr ether;
+
+	/** MPLS header. */
+	struct rte_pdev_mpls_hdr mpls[RTE_PDEV_MPLS_LABELS_MAX];
+
+	/** Number of MPLS labels in MPLS header. */
+	uint32_t mpls_count;
+
+	/** Non-zero for MPLS unicast, zero for MPLS multicast. */
+	int unicast;
+};
+
+/** PPPoE encap parameters. */
+struct rte_pdev_encap_pppoe_params {
+	struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
+	struct rte_pdev_pppoe_hdr pppoe; /**< PPPoE/PPP headers. */
+};
+
+/** Encap action configuration (per table action profile). */
+struct rte_pdev_table_action_encap_config {
+	/** Packet field for IP header. */
+	const char *ip;
+
+	/** IP protocol version. Non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** Bit mask defining the set of packet encapsulations enabled for the
+	 * current table action profile. If bit (1 << N) is set in *encap_mask*,
+	 * then packet encapsulation N is enabled, otherwise it is disabled.
+	 *
+	 * @see enum rte_pdev_encap_type
+	 */
+	uint64_t encap_mask;
+};
+
+/** Encap action parameters (per table rule). */
+struct rte_pdev_table_action_encap_params {
+	/** Encapsulation type. */
+	enum rte_pdev_encap_type type;
+
+	RTE_STD_C11
+	union {
+		/** Only valid when *type* is set to Ether. */
+		struct rte_pdev_encap_ether_params ether;
+
+		/** Only valid when *type* is set to VLAN. */
+		struct rte_pdev_encap_vlan_params vlan;
+
+		/** Only valid when *type* is set to QinQ. */
+		struct rte_pdev_encap_qinq_params qinq;
+
+		/** Only valid when *type* is set to MPLS. */
+		struct rte_pdev_encap_mpls_params mpls;
+
+		/** Only valid when *type* is set to PPPoE. */
+		struct rte_pdev_encap_pppoe_params pppoe;
+	};
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_NAT
+ */
+/** NAT action configuration (per table action profile). */
+struct rte_pdev_table_action_nat_config {
+	/** Packet field for IP header. */
+	const char *ip;
+
+	/** IP protocol version. Non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** When non-zero, the IP source address and L4 protocol source port are
+	 * translated. When zero, the IP destination address and L4 protocol
+	 * destination port are translated.
+	 */
+	int source_nat;
+
+	/** Layer 4 protocol, for example TCP (0x06) or UDP (0x11). The checksum
+	 * field is computed differently and placed at different header offset
+	 * by each layer 4 protocol.
+	 */
+	uint8_t proto;
+};
+
+/** NAT action parameters (per table rule). */
+struct rte_pdev_table_action_nat_params {
+	/** IP version for *addr*: non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** IP address. */
+	union {
+		/** IPv4 address; only valid when *ip_version* is IPv4. */
+		uint32_t ipv4;
+
+		/** IPv6 address; only valid when *ip_version* is IPv6. */
+		uint8_t ipv6[16];
+	} addr;
+
+	/** Port. */
+	uint16_t port;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_TTL
+ */
+/** TTL action configuration (per table action profile). */
+struct rte_pdev_table_action_ttl_config {
+	/** Packet field for IP header. */
+	const char *ip;
+
+	/** IP protocol version. Non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** Packet meta-data field to be set to *port_out_id* when the updated
+	 * IPv4 Time to Live (TTL) field or IPv6 Hop Limit (HL) field is zero.
+	 */
+	const char *port_out;
+
+	/** Output port ID to be stored into *port_out* packet meta-data field
+	 * when the updated IPv4 TTL field or IPv6 HL field is zero.
+	 */
+	uint32_t port_out_id;
+
+	/** When non-zero, the *n_packets* stats counter for TTL action is
+	 * enabled, otherwise disabled.
+	 *
+	 * @see struct rte_pdev_table_action_ttl_counters
+	 */
+	int n_packets_enabled;
+};
+
+/** TTL action parameters (per table rule). */
+struct rte_pdev_table_action_ttl_params {
+	/** When non-zero, decrement the IPv4 TTL field and update the checksum
+	 * field, or decrement the IPv6 HL field. When zero, the IPv4 TTL field
+	 * or the IPv6 HL field is not changed.
+	 */
+	int decrement;
+};
+
+/** TTL action statistics packets (per table rule). */
+struct rte_pdev_table_action_ttl_counters {
+	/** Number of IPv4 packets whose updated TTL field is zero or IPv6
+	 * packets whose updated HL field is zero.
+	 */
+	uint64_t n_packets;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_STATS
+ */
+/** Stats action configuration (per table action profile). */
+struct rte_pdev_table_action_stats_config {
+	/** Packet field for IP header. */
+	const char *ip;
+
+	/** IP protocol version. Non-zero for IPv4, zero for IPv6. */
+	int ip_version;
+
+	/** When non-zero, the *n_packets* stats counter is enabled, otherwise
+	 * disabled.
+	 *
+	 * @see struct rte_pdev_table_action_stats_counters
+	 */
+	int n_packets_enabled;
+
+	/** When non-zero, the *n_bytes* stats counter is enabled, otherwise
+	 * disabled.
+	 *
+	 * @see struct rte_pdev_table_action_stats_counters
+	 */
+	int n_bytes_enabled;
+};
+
+/** Stats action parameters (per table rule). */
+struct rte_pdev_table_action_stats_params {
+	/** Initial value for the *n_packets* stats counter. Typically set to 0.
+	 *
+	 * @see struct rte_pdev_table_action_stats_counters
+	 */
+	uint64_t n_packets;
+
+	/** Initial value for the *n_bytes* stats counter. Typically set to 0.
+	 *
+	 * @see struct rte_pdev_table_action_stats_counters
+	 */
+	uint64_t n_bytes;
+};
+
+/** Stats action counters (per table rule). */
+struct rte_pdev_table_action_stats_counters {
+	/** Number of packets. Valid only when *n_packets_valid* is non-zero. */
+	uint64_t n_packets;
+
+	/** Number of bytes. Valid only when *n_bytes_valid* is non-zero. */
+	uint64_t n_bytes;
+
+	/** When non-zero, the *n_packets* field is valid, otherwise invalid. */
+	int n_packets_valid;
+
+	/** When non-zero, the *n_bytes* field is valid, otherwise invalid. */
+	int n_bytes_valid;
+};
+
+/**
+ * RTE_PDEV_TABLE_ACTION_TIME
+ */
+/** Timestamp action parameters (per table rule). */
+struct rte_pdev_table_action_time_params {
+	/** Initial timestamp value. Typically set to current time. */
+	uint64_t time;
+};
+
+/**
+ * PDEV table action profile create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Table action profile ID. Must not be used by any existing *pdev* table
+ *   action profile.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_action_profile_create(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV table action profile action register
+ *
+ * The action registration order is important, as it determines the action
+ * execution order. The same action type can be registered several times for the
+ * same profile. 
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Table action profile ID.
+ * @params[in] action_type
+ *   Table action type.
+ * @params[in] action_config
+ *   Table action configuration. For table action X, this parameter must point
+ *   to valid instance of struct rte_pdev_table_action_X_config when this
+ *   structure is defined by the API or to NULL otherwise.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_action_profile_action_register(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_table_action_type action_type,
+	void *action_config);
+
+/**
+ * PDEV table action profile freeze
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Table action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_action_profile_freeze(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV table action profile register
+ *
+ * Zero or several action profiles can be registered for each table.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] table_id
+ *   PDEV table ID.
+ * @params[in] profile_id
+ *   Table action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_action_profile_register(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t profile_id);
+
+/**
+ * PDEV output port action API
+ */
+enum rte_pdev_port_out_action_type {
+	/** Assignment: lvalue = expression. */
+	RTE_PDEV_PORT_OUT_ACTION_ASSIGN = 0,
+};
+
+/**
+ * RTE_PDEV_PORT_OUT_ACTION_ASSIGN
+ */
+struct rte_pdev_port_out_action_assign_config {
+	/** Left hand side value for the assignment. Must be one of the
+	 * pre-registered packet field symbolic aliases. Packet meta-data field
+	 * and table rule data field aliases are not allowed.
+	 */
+	const char *lvalue;
+
+	/** Expression with operands and operators. The operands must be
+	 * pre-registered packet or packet meta-data field symbolic aliases.
+	 * Table rule data field aliases are not allowed.
+	 */
+	const char *expression;
+};
+
+/**
+ * PDEV output port action profile create
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Output port action profile ID. Must not be used by any existing *pdev*
+ *   output port action profile.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_action_profile_create(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV output port action profile action register
+ *
+ * The action registration order is important, as it determines the action
+ * execution order. The same action type can be registered several times for the
+ * same profile. 
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Output port action profile ID.
+ * @params[in] action_type
+ *   Output port action type.
+ * @params[in] action_config
+ *   Output port action configuration. For output port action X, this parameter
+ *   must point to valid instance of struct rte_pdev_port_out_action_X_config
+ *   when this structure is defined by the API or to NULL otherwise. 
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_action_profile_action_register(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_port_out_action_type action_type,
+	void *action_config);
+
+/**
+ * PDEV output port action profile freeze
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] profile_id
+ *   Output port action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_action_profile_freeze(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/**
+ * PDEV output port action profile register
+ *
+ * Zero or at most one action profile can be registered for each output port.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV output port ID.
+ * @params[in] profile_id
+ *   Output port action profile ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_action_profile_register(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t profile_id);
+
+/**
+ * PDEV input port run-time API
+ */
+
+/**
+ * PDEV input port enable
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV input port ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_enable(struct rte_pdev *pdev,
+	uint32_t port_id);
+
+/**
+ * PDEV input port disable
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV input port ID.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_disable(struct rte_pdev *pdev,
+	uint32_t port_id);
+
+/** PDEV input port statistics counters. */
+struct rte_pdev_port_in_stats {
+	/** Number of packets read from this input port. */
+	uint64_t n_packets;
+
+	/** Number of bytes associated with *n_packets*. */
+	uint64_t n_bytes;
+};
+
+/**
+ * PDEV input port statistics counters read
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV input port ID.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_in_stats_read(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_in_stats *stats,
+	int clear);
+
+/**
+ * PDEV table run-time API
+ */
+
+/**
+ * PDEV table rule add.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] match
+ *   Rule match. For any match type, the NULL value indicates the default rule.
+ * @param[in] match_mask
+ *   Rule match bit-mask. Ignored when *match* is set to NULL. Ignored when
+ *   *table_id* match type is not WILDCARD or LPM.
+ * @param[in] match_priority
+ *   Rule match priority. Ignored when *match* is set to NULL. Ignored when
+ *   *table_id* match type is not WILDCARD.
+ * @param[in] action_profile_id
+ *   Table action profile ID.
+ * @param[in] action_params
+ *   Array of action parameters. The number of elements must be equal to the
+ *   number of actions registered for the *action_profile_id* table action
+ *   profile. If X is the N-th action registered for *action_profile_id*, then
+ *   the N-th element of this array needs to be pointer to valid instance of
+ *   struct rte_pdev_table_action_X_params when defined by the API or to NULL
+ *   otherwise.
+ * @param[out] rule_handle
+ *   Rule handle.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_add(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint8_t *match,
+	uint8_t *match_mask,
+	uint32_t match_priority,
+	uint32_t action_profile_id,
+	void **action_params,
+	void **rule_handle);
+
+/**
+ * PDEV table rule delete.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] match
+ *   Rule match. For any match type, the NULL value indicates the default rule.
+ * @param[in] match_mask
+ *   Rule match bit-mask. Ignored when *match* is set to NULL. Ignored when
+ *   *table_id* match type is not WILDCARD or LPM.
+  * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_delete(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint8_t *match,
+	uint8_t *match_mask);
+
+/**
+ * PDEV table DSCP table update.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] dscp_table
+ *   DSCP table. Must be pre-allocated and valid.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_dscp_table_update(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_dscp_table *dscp_table);
+
+/**
+ * PDEV table meter profile add.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] meter_profile_id
+ *   Meter profile ID. Must not be used by any existing *table_id* meter
+ *   profile.
+ * @param[in] profile
+ *   Meter profile parameters.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_meter_profile_add(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t meter_profile_id,
+	struct rte_pdev_meter_profile *profile);
+
+/**
+ * PDEV table meter profile delete.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] meter_profile_id
+ *   Meter profile ID. Must be one of the existing *table_id* meter profiles.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_meter_profile_delete(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t meter_profile_id);
+
+/**
+ * PDEV table rule meter statistics counters read.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] rule_handle
+ *   Rule handle.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_meter_read(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_mtr_counters *stats,
+	int clear);
+
+/**
+ * PDEV table rule TTL statistics counters read.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] rule_handle
+ *   Rule handle.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_ttl_read(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_ttl_counters *stats,
+	int clear);
+
+/**
+ * PDEV table rule statistics counters read.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] rule_handle
+ *   Rule handle.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_stats_read(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_stats_counters *stats,
+	int clear);
+
+/**
+ * PDEV table rule timestamp read.
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @param[in] table_id
+ *   PDEV table ID.
+ * @param[in] rule_handle
+ *   Rule handle.
+ * @param[out] timestamp
+ *   Current timestamp value. Must be non-NULL.
+ * @return
+ *   Zero on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_rule_timestamp_read(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	uint64_t *timestamp);
+
+/** PDEV table statistics counters. */
+struct rte_pdev_table_stats {
+	/** Number of packets looked up in this table. */
+	uint64_t n_packets;
+
+	/** Number of bytes associated with *n_packets*. */
+	uint64_t n_bytes;	
+};
+
+/**
+ * PDEV table statistics counters read
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] table_id
+ *   PDEV table ID.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_table_stats_read(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_table_stats *stats,
+	int clear);
+
+/**
+ * PDEV output port run-time API
+ */
+
+/** PDEV output port statistics counters. */
+struct rte_pdev_port_out_stats {
+	/** Number of packets written to this output port. */
+	uint64_t n_packets;
+
+	/** Number of bytes associated with *n_packets*. */
+	uint64_t n_bytes;
+};
+
+/**
+ * PDEV output port statistics counters read
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ * @params[in] port_id
+ *   PDEV output port ID.
+ * @params[out] stats
+ *   When non-NULL, the statistics counters are read and saved here.
+ * @params[in] clear
+ *   When non-zero, the statistics counters are cleared after read, otherwise
+ *   they are not modified.
+ * @return
+ *   0 on success, non-zero error code otherwise.
+ */
+int
+rte_pdev_port_out_stats_read(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_out_stats *stats,
+	int clear);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/librte_pipeline/rte_pdev_driver.h b/lib/librte_pipeline/rte_pdev_driver.h
new file mode 100644
index 0000000..4b97784
--- /dev/null
+++ b/lib/librte_pipeline/rte_pdev_driver.h
@@ -0,0 +1,283 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef __INCLUDE_RTE_PDEV_DRIVER_H__
+#define __INCLUDE_RTE_PDEV_DRIVER_H__
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file
+ * RTE Pipeline Device (PDEV) - Device Driver Interface
+ */
+
+#include <stdint.h>
+
+#include "rte_pdev.h"
+
+/** @internal PDEV create */
+typedef struct rte_pdev * (*rte_pdev_create_t)(struct rte_device *dev,
+	struct rte_pdev_params *params);
+
+/** @internal PDEV free */
+typedef int (*rte_pdev_free_t)(struct rte_pdev *pdev);
+
+/** @internal PDEV start */
+typedef int (*rte_pdev_start_t)(struct rte_pdev *pdev);
+
+/** @internal PDEV input port create */
+typedef int (*rte_pdev_port_in_create_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_in_params *params,
+	int enable);
+
+/** @internal PDEV input port connect */
+typedef int (*rte_pdev_port_in_connect_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	uint32_t table_id);
+
+/** @internal PDEV table create */
+typedef int (*rte_pdev_table_create_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_table_params *params);
+
+/** @internal PDEV output port create */
+typedef int (*rte_pdev_port_out_create_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_out_params *params);
+
+/** @internal PDEV packet field register */
+typedef int (*rte_pdev_pkt_field_register_t)(struct rte_pdev *pdev,
+	const char *name,
+	uint32_t offset,
+	uint32_t size);
+
+/** @internal PDEV packet meta-data field register */
+typedef int (*rte_pdev_pkt_meta_field_register_t)(struct rte_pdev *pdev,
+	const char *name,
+	uint32_t size);
+
+/** @internal PDEV table rule data field register */
+typedef int (*rte_pdev_table_field_register_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t action_profile_id,
+	const char *name,
+	uint32_t size);
+
+/** @internal PDEV input port action profile create */
+typedef int (*rte_pdev_port_in_action_profile_create_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV input port action profile action register */
+typedef int (*rte_pdev_port_in_action_profile_action_register_t)(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_port_in_action_type action_type,
+	void *action_config);
+
+/** @internal PDEV input port action profile freeze */
+typedef int (*rte_pdev_port_in_action_profile_freeze_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV input port action profile register */
+typedef int (*rte_pdev_port_in_action_profile_register_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	uint32_t profile_id);
+
+/** @internal PDEV table action profile create */
+typedef int (*rte_pdev_table_action_profile_create_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV table action profile action register */
+typedef int (*rte_pdev_table_action_profile_action_register_t)(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_table_action_type action_type,
+	void *action_config);
+
+/** @internal PDEV table action profile freeze */
+typedef int (*rte_pdev_table_action_profile_freeze_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV table action profile register */
+typedef int (*rte_pdev_table_action_profile_register_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t profile_id);
+
+/** @internal PDEV output port action profile create */
+typedef int (*rte_pdev_port_out_action_profile_create_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV output port action profile action register */
+typedef int (*rte_pdev_port_out_action_profile_action_register_t)(struct rte_pdev *pdev,
+	uint32_t profile_id,
+	enum rte_pdev_port_out_action_type action_type,
+	void *action_config);
+
+/** @internal PDEV output port action profile freeze */
+typedef int (*rte_pdev_port_out_action_profile_freeze_t)(struct rte_pdev *pdev,
+	uint32_t profile_id);
+
+/** @internal PDEV output port action profile register */
+typedef int (*rte_pdev_port_out_action_profile_register_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t profile_id);
+
+/** @internal PDEV input port enable */
+typedef int (*rte_pdev_port_in_enable_t)(struct rte_pdev *pdev,
+	uint32_t port_id);
+
+/** @internal PDEV input port disable */
+typedef int (*rte_pdev_port_in_disable_t)(struct rte_pdev *pdev,
+	uint32_t port_id);
+
+/** @internal PDEV input port stats read */
+typedef int (*rte_pdev_port_in_stats_read_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_in_stats *stats,
+	int clear);
+
+/** @internal PDEV table rule add */
+typedef int (*rte_pdev_table_rule_add_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint8_t *match,
+	uint8_t *match_mask,
+	uint32_t match_priority,
+	uint32_t action_profile_id,
+	void **action_params,
+	void **rule_handle);
+
+/** @internal PDEV table rule delete */
+typedef int (*rte_pdev_table_rule_delete_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint8_t *match,
+	uint8_t *match_mask);
+
+/** @internal PDEV table DSCP table update */
+typedef int (*rte_pdev_table_dscp_table_update_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_dscp_table *dscp_table);
+
+/** @internal PDEV table meter profile add */
+typedef int (*rte_pdev_table_meter_profile_add_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t meter_profile_id,
+	struct rte_pdev_meter_profile *profile);
+
+/** @internal PDEV table meter profile delete */
+typedef int (*rte_pdev_table_meter_profile_delete_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	uint32_t meter_profile_id);
+
+/** @internal PDEV table rule meter stats read */
+typedef int (*rte_pdev_table_rule_meter_read_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_mtr_counters *stats,
+	int clear);
+
+/** @internal PDEV table rule TTL stats read */
+typedef int (*rte_pdev_table_rule_ttl_read_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_ttl_counters *stats,
+	int clear);
+
+/** @internal PDEV table rule stats read */
+typedef int (*rte_pdev_table_rule_stats_read_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	struct rte_pdev_table_action_stats_counters *stats,
+	int clear);
+
+/** @internal PDEV table rule tiemstamp read */
+typedef int (*rte_pdev_table_rule_timestamp_read_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	void *rule_handle,
+	uint64_t *timestamp);
+
+/** @internal PDEV table stats read */
+typedef int (*rte_pdev_table_stats_read_t)(struct rte_pdev *pdev,
+	uint32_t table_id,
+	struct rte_pdev_table_stats *stats,
+	int clear);
+
+/** @internal PDEV output port stats read */
+typedef int (*rte_pdev_port_out_stats_read_t)(struct rte_pdev *pdev,
+	uint32_t port_id,
+	struct rte_pdev_port_out_stats *stats,
+	int clear);
+
+/** PDEV ops */
+struct rte_pdev_ops {
+	/** PDEV create API */
+	rte_pdev_create_t create;
+	rte_pdev_free_t free;
+	rte_pdev_start_t start;
+	rte_pdev_port_in_create_t port_in_create;
+	rte_pdev_port_in_connect_t port_in_connect;
+	rte_pdev_table_create_t table_create;
+	rte_pdev_port_out_create_t port_out_create;
+
+	/** PDEV meta-data API */
+	rte_pdev_pkt_field_register_t pkt_field_register;
+	rte_pdev_pkt_meta_field_register_t pkt_meta_field_register;
+	rte_pdev_table_field_register_t table_field_register;
+
+	/** PDEV input port action API */
+	rte_pdev_port_in_action_profile_create_t port_in_action_profile_create;
+	rte_pdev_port_in_action_profile_action_register_t port_in_action_profile_action_register;
+	rte_pdev_port_in_action_profile_freeze_t port_in_action_profile_freeze;
+	rte_pdev_port_in_action_profile_register_t port_in_action_profile_register;
+
+	/** PDEV table action API */
+	rte_pdev_table_action_profile_create_t table_action_profile_create;
+	rte_pdev_table_action_profile_action_register_t table_action_profile_action_register;
+	rte_pdev_table_action_profile_freeze_t table_action_profile_freeze;
+	rte_pdev_table_action_profile_register_t table_action_profile_register;
+
+	/** PDEV output port action API */
+	rte_pdev_port_out_action_profile_create_t port_out_action_profile_create;
+	rte_pdev_port_out_action_profile_action_register_t port_out_action_profile_action_register;
+	rte_pdev_port_out_action_profile_freeze_t port_out_action_profile_freeze;
+	rte_pdev_port_out_action_profile_register_t port_out_action_profile_register;
+
+	/** PDEV input port run-time API */
+	rte_pdev_port_in_enable_t port_in_enable;
+	rte_pdev_port_in_disable_t port_in_disable;
+	rte_pdev_port_in_stats_read_t port_in_stats_read;
+
+	/** PDEV table run-time API */
+	rte_pdev_table_rule_add_t table_rule_add;
+	rte_pdev_table_rule_delete_t table_rule_delete;
+	rte_pdev_table_dscp_table_update_t table_dscp_table_update;
+	rte_pdev_table_meter_profile_add_t table_meter_profile_add;
+	rte_pdev_table_meter_profile_delete_t table_meter_profile_delete;
+	rte_pdev_table_rule_meter_read_t table_rule_meter_read;
+	rte_pdev_table_rule_ttl_read_t table_rule_ttl_read;
+	rte_pdev_table_rule_stats_read_t table_rule_stats_read;
+	rte_pdev_table_rule_timestamp_read_t table_rule_timestamp_read;
+	rte_pdev_table_stats_read_t table_stats_read;
+
+	/** PDEV output port run-time API */
+	rte_pdev_port_out_stats_read_t port_out_stats_read;	
+};
+
+/**
+ * Get PDEV ops
+ *
+ * @param[in] pdev
+ *   PDEV handle.
+ *
+ * @return
+ *   PDEV ops on success, NULL otherwise.
+ */
+const struct rte_pdev_ops *
+rte_pdev_ops_get(struct rte_pdev *pdev);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
-- 
2.7.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] P4 enablement in DPDK
  2018-04-18 17:22 [dpdk-dev] [RFC] P4 enablement in DPDK Cristian Dumitrescu
@ 2018-04-19  5:04 ` Kuusisaari, Juhamatti
  2018-06-15 23:25 ` antonin
  2018-06-20  6:13 ` Jerin Jacob
  2 siblings, 0 replies; 6+ messages in thread
From: Kuusisaari, Juhamatti @ 2018-04-19  5:04 UTC (permalink / raw)
  To: Cristian Dumitrescu, dev; +Cc: dan.daly

Hello,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cristian Dumitrescu
> Sent: Wednesday, April 18, 2018 8:22 PM
> To: dev@dpdk.org
> Cc: dan.daly@intel.com
> Subject: [dpdk-dev] [RFC] P4 enablement in DPDK
> 
> P4 is a language for programming the data plane of network devices [1]. The
> P4
> language is developed by p4.org which is joining ONF and Linux Foundation
> [2].
> 
> This API provides a way to program P4 capable devices through DPDK. The
> purpose
> of this API is to enable P4 compilers [3] to generate high performance DPDK
> code
> out of P4 programs.
> 
> The main advantage of this approach is that P4 enablement of network
> devices can
> be done through DPDK in a unified way:
> 
>    1. This API serves as the interface between the P4 compiler front-end
> (target
>       independent) and the P4 compiler backe-ends (target specific).
> 
>    2. Device vendors develop their device drivers as part of DPDK by
>       implementing this API. The device driver is agostic of being called by the
>       P4 front-end. The device driver serves as the P4 compiler taget specific
>       back-end.
> 
>    3. The P4 compiler front-end is target independent. The amount of C code it
>       generates is minimized by calling this API directly for every P4 feature
>       as opposed to vendor-specific free-style C code generation.
> 
> This API introduces a pipeline device (PDEV) by using a similar approach to
> the
> existing ethdev and eventdev DPDK device-like APIs implemented by the
> DPDK Poll
> Mode Drivers (PMDs). Main features:
> 
>    1. Discovery of built-in pipeline devices and their capabilities.
> 
>    2. Creation of new pipelines out of input ports, output ports, tables and
>       actions.
> 
>    3. Registration of packet protocol header and meta-data fields.
> 
>    4. Action definition for input ports, output ports and tables.
> 
>    5. Pipeline run-time API for table population, statistics read, etc.
> 
> This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
> well as CPUs. Let’s remember that the first P in P4 stands for Programmable,
> and
> the CPUs are arguably the most programmable devices. The implementation
> for the
> CPU SW target is expected to use the DPDK Packet Framework libraries such
> as
> librte_pipeline, librte_port, librte_table with some expected but moderate
> API
> and implementation adjustments.
> 
> Links:
> 
>    [1] P4-16 language specification:
>        https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf
> 
>    [2] p4.org to join ONF and LF: https://p4.org/p4/onward-and-upward.html
> 
>    [3] p4c: https://github.com/p4lang/p4c
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

+1 for adding P4 support in general.

--
 Juhamatti

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] P4 enablement in DPDK
  2018-04-18 17:22 [dpdk-dev] [RFC] P4 enablement in DPDK Cristian Dumitrescu
  2018-04-19  5:04 ` Kuusisaari, Juhamatti
@ 2018-06-15 23:25 ` antonin
  2018-06-19 17:52   ` Dumitrescu, Cristian
  2018-06-20  6:13 ` Jerin Jacob
  2 siblings, 1 reply; 6+ messages in thread
From: antonin @ 2018-06-15 23:25 UTC (permalink / raw)
  To: Dumitrescu, Cristian; +Cc: dev, Daly, Dan

Hi,

I want to express support for this proposal and adding P4 capabilities in DPDK. For example, I personally see a lot of demand for a production-quality P4-programmable software switch.

A few comments on this:

1) I see a lot of similarities between the proposed PDEV table runtime API and the existing PI C API: https://github.com/p4lang/PI/tree/master/include/PI <https://github.com/p4lang/PI/tree/master/include/PI>. I wonder if there would be value in trying to re-use them - at least partially - for this.
PI is very much aligned on P4 and strictly Program Independent. That does not seem to be completely the case for the PDEV table runtime API (dscp table, TTL, …) and I’m not familiar enough with DPDK to understand the rationale for this, but I don’t see why DPDK couldn’t have its own extensions to the PI API.

2) For the sake of avoiding fragmentation of the community, I would strongly recommend making sure that there is an available P4Runtime (https://p4.org/p4-spec/docs/P4Runtime-v1.0.0.pdf <https://p4.org/p4-spec/docs/P4Runtime-v1.0.0.pdf>) implementation for DPDK. That would require a mapping from P4Runtime messages to PDEV API calls. The advantage of trying to align PDEV with PI (first bullet point) is that there is already a mapping from P4Runtime messages to PI API calls.
The burden of supporting P4Runtime can probably be reduced by leveraging the Stratum project (https://stratumproject.org/ <https://stratumproject.org/>), which unfortunately is not open-source yet.

3) It seems that the notion of “action profile” here is more general than in P4, or more precisely than in the P4_16 PSA architecture (Portable Switch Architecture). Since this term has a strong connotation in the P4 world, maybe another term should be used instead if possible.

4) I recommend looking into the notion of “architecture” in P4_16 and trying to decide if you want to a) have generic support for all P4 architectures (at least for the CPU implementation), b) support the PSA architecture specifically (which is the primary / only architecture used as part of Stratum) or c) define your own architecture specifically for targets that are going to support P4 through DPDK drivers (which may limit your impact).

 5) Conceptually the APIs can be split into 2 parts: a) the table runtime APIs, which are generally pretty-straightforward and b) pipeline query & configuration APIs. Both P4Runtime (SetForwardingPipelineConfig) & PI (pi_device_update_[start|end]) include mechanisms to re-configure the data-plane, by providing the compiler output to the target.
For b), I strongly recommend looking into what we have done with P4Runtime. SetForwardingPipelineConfig provides the target with a P4Info message (which is target-agnostic and describes the interface of each runtime-controllable P4 object; in a way I believe it is similar to your table_create PDEV API) and a target-specific opaque “blob”. For reconfigurable SW & HW the “blob” is essentially a description of the pipeline: it can be some text file, binary register values, an object file, etc…
The case of fixed-function devices is usually trickier. We actually do not have a pipeline discovery mechanism in P4Runtime & PI. In P4Runtime, we just assume that the control-plane is aware of the pipeline and has access to a P4Info message for it. We still require the P4Runtime client to call SetForwardingPipelineConfig with the “right" P4Info message (we expect the target to return an error if the P4Info is not the right one) and a potentially empty “blob”.
I think the take-away is that there isn’t a unified pipeline creation mechanism across programmable targets, i.e. it is difficult to break down pipeline creation into a sequence of universal sub-API calls, such as “create_table”, “create_parser”, etc… However it would make perfect sense IMO to design and implement such an API in the context of a specific DPDK SW switch. The P4 compiler backend would then be in-charge of generating the appropriate sequence of API calls.

Overall I’m very excited to see some work being done in this area. I believe a lot of people will be able to help, especially with compiler backend development. To summarize my 5 bullet points above, I would say that there are 2 import areas of investigation as far as I can tell:
1) what should be the compiler backend output for the DPDK CPU SW target (sequence of API calls)? For non-programmable devices, having the “right” P4Info is usually enough. Existing P4-programmable hardware already comes with its own compiler backend (Barefoot Tofino ASIC, Xilinx FPGAs).
2) can we try to avoid fragmentation and re-use existing code with P4Runtime / PI / Stratum?

Thanks,

Antonin

> On Apr 18, 2018, at 10:22 AM, Dumitrescu, Cristian <cristian.dumitrescu@intel.com> wrote:
> 
> P4 is a language for programming the data plane of network devices [1]. The P4
> language is developed by p4.org which is joining ONF and Linux Foundation [2].
> 
> This API provides a way to program P4 capable devices through DPDK. The purpose
> of this API is to enable P4 compilers [3] to generate high performance DPDK code
> out of P4 programs.
> 
> The main advantage of this approach is that P4 enablement of network devices can
> be done through DPDK in a unified way:
> 
>   1. This API serves as the interface between the P4 compiler front-end (target
>      independent) and the P4 compiler backe-ends (target specific).
> 
>   2. Device vendors develop their device drivers as part of DPDK by
>      implementing this API. The device driver is agostic of being called by the
>      P4 front-end. The device driver serves as the P4 compiler taget specific
>      back-end.
> 
>   3. The P4 compiler front-end is target independent. The amount of C code it
>      generates is minimized by calling this API directly for every P4 feature
>      as opposed to vendor-specific free-style C code generation.
> 
> This API introduces a pipeline device (PDEV) by using a similar approach to the
> existing ethdev and eventdev DPDK device-like APIs implemented by the DPDK Poll
> Mode Drivers (PMDs). Main features:
> 
>   1. Discovery of built-in pipeline devices and their capabilities.
> 
>   2. Creation of new pipelines out of input ports, output ports, tables and
>      actions.
> 
>   3. Registration of packet protocol header and meta-data fields.
> 
>   4. Action definition for input ports, output ports and tables.
> 
>   5. Pipeline run-time API for table population, statistics read, etc.
> 
> This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
> well as CPUs. Let’s remember that the first P in P4 stands for Programmable, and
> the CPUs are arguably the most programmable devices. The implementation for the
> CPU SW target is expected to use the DPDK Packet Framework libraries such as
> librte_pipeline, librte_port, librte_table with some expected but moderate API
> and implementation adjustments.
> 
> Links:
> 
>   [1] P4-16 language specification:
>       https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf
> 
>   [2] p4.org to join ONF and LF: https://p4.org/p4/onward-and-upward.html
> 
>   [3] p4c: https://github.com/p4lang/p4c
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
> lib/librte_pipeline/rte_pdev.h        | 1654 +++++++++++++++++++++++++++++++++
> lib/librte_pipeline/rte_pdev_driver.h |  283 ++++++
> 2 files changed, 1937 insertions(+)
> create mode 100644 lib/librte_pipeline/rte_pdev.h
> create mode 100644 lib/librte_pipeline/rte_pdev_driver.h
> 
> diff --git a/lib/librte_pipeline/rte_pdev.h b/lib/librte_pipeline/rte_pdev.h
> new file mode 100644
> index 0000000..7095197
> --- /dev/null
> +++ b/lib/librte_pipeline/rte_pdev.h
> @@ -0,0 +1,1654 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#ifndef __INCLUDE_RTE_PDEV_H__
> +#define __INCLUDE_RTE_PDEV_H__
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file
> + * RTE Pipeline Device (PDEV)
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_common.h>
> +#include <rte_dev.h>
> +#include <rte_ether.h>
> +
> +/** PDEV device handle data type. */
> +struct rte_pdev;
> +
> +/**
> + * PDEV Capability API
> + */
> +
> +/** PDEV capabilities. */
> +struct rte_pdev_capabilities {
> +       /** Number of built-in pipelines.
> +        * @see rte_pdev_next_get()
> +        */
> +       uint32_t n_pipelines_builtin;
> +
> +       /** Non-zero when new pipelines can be created, zero otherwise.
> +        * @see rte_pdev_create()
> +        */
> +       int create;
> +};
> +
> +/**
> + * PDEV capabilities get
> + *
> + * @param[in] dev
> + *   Current device.
> + * @param[out] cap
> + *   PDEV capabilities. Must be non-NULL.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_capabilities_get(struct rte_device *dev,
> +       struct rte_pdev_capabilities *cap);
> +
> +/**
> + * PDEV Discovery API
> + *
> + */
> +
> +/**
> + * PDEV get next
> + *
> + * This API is used to discover pre-existent pipelines on given device. This is
> + * typically the case for built-in HW pipelines.
> + *
> + * @param[in] dev
> + *   Current device.
> + * @param[in] pdev
> + *   Handle to current PDEV on *dev*. Set to NULL during the first invocation
> + *   for *dev* device.
> + * @return
> + *   When non-NULL, handle to next PDEV, otherwise no more PDEV.
> + *
> + * @see struct rte_pdev_capabilities::n_pipelines_builtin
> + */
> +struct rte_pdev *
> +rte_pdev_next_get(struct rte_device *dev, struct rte_pdev *pdev);
> +
> +/**
> + * PDEV Create API
> + */
> +
> +/** PDEV statistics counter type. */
> +enum rte_pdev_stats_type {
> +       /** Number of packets. */
> +       RTE_PDEV_STATS_N_PKTS = 1 << 0,
> +
> +       /** Number of packet bytes. */
> +       RTE_PDEV_STATS_N_BYTES = 1 << 1,
> +};
> +
> +/** PDEV parameters. */
> +struct rte_pdev_params {
> +       /** PDEV name. */
> +       const char *name;
> +
> +       /** Statistics counters to be enabled.
> +        * @see enum rte_pdev_stats_type
> +        */
> +       uint64_t stats_mask;
> +};
> +
> +/**
> + * PDEV create
> + *
> + * This API is to be called to create new pipelines on given device. This is
> + * typically supported by reconfigurable HW devices and SW pipelines.
> + *
> + * @param[in] dev
> + *   Current device.
> + * @param[in] params
> + *   PDEV parameters. Must be non-NULL and valid.
> + * @return
> + *   When non-NULL, handle to created PDEV, otherwise error.
> + *
> + * @see struct rte_pdev_capabilities::create
> + */
> +struct rte_pdev *
> +rte_pdev_create(struct rte_device *dev,
> +       struct rte_pdev_params *params);
> +
> +/**
> + * PDEV free
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_free(struct rte_pdev *pdev);
> +
> +/**
> + * PDEV start
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_start(struct rte_pdev *pdev);
> +
> +/** PDEV input port types. */
> +enum rte_pdev_port_in_type {
> +       /** Builtin device. */
> +       RTE_PDEV_PORT_IN_BUILTIN = 0,
> +
> +       /** Ethernet device. */
> +       RTE_PDEV_PORT_IN_ETHDEV,
> +};
> +
> +/** PDEV input port parameters. */
> +struct rte_pdev_port_in_params {
> +       /** Type. */
> +       enum rte_pdev_port_in_type type;
> +
> +       /** Device specific parameters. */
> +       union {
> +               /** Builtin device. */
> +               struct {
> +                       /** Builtin device name. */
> +                       const char *name;
> +               } builtin;
> +
> +               /** Ethernet device. */
> +               struct {
> +                       /** Ethernet device name. */
> +                       const char *name;
> +
> +                       /** Reception side queue ID. */
> +                       uint32_t rx_queue_id;
> +
> +                       /** Burst size. */
> +                       uint32_t burst_size;
> +               } ethdev;
> +       } dev;
> +};
> +
> +/**
> + * PDEV input port create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV input port ID. Must not be used by any existing *pdev* input port.
> + * @params[in] params
> + *   Input port parameters. Must be non-NULL and valid.
> + * @params[in] enable
> + *   When non-zero, the new input port is initially enabled, otherwise disabled.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_create(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_in_params *params,
> +       int enable);
> +
> +/**
> + * PDEV input port connect to table
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   Input port ID.
> + * @params[in] table_id
> + *   Table ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_connect(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       uint32_t table_id);
> +
> +/** PDEV table match type. */
> +enum rte_pdev_table_match_type {
> +       /**  Wildcard match. */
> +       RTE_PDEV_TABLE_MATCH_WILDCARD = 0,
> +
> +       /** Exact match. */
> +       RTE_PDEV_TABLE_MATCH_EXACT,
> +
> +       /** Longest Prefix Match (LPM). */
> +       RTE_PDEV_TABLE_MATCH_LPM,
> +
> +       /** Index match. */
> +       RTE_PDEV_TABLE_MATCH_INDEX,
> +
> +       /** Stub match. No real match process: the default rule is always hit. */
> +       RTE_PDEV_TABLE_MATCH_STUB,
> +};
> +
> +/** PDEV table match parameters. */
> +struct rte_pdev_table_match_params {
> +       /** Packet field or packet meta-data field name at match offset 0. */
> +       const char *start;
> +
> +       /** Match size (in bits).
> +        *
> +        * For LPM match type, typical values are 32 bits to match a single IPv4
> +        * address and 128 bits to match a single IPv6 address, but other values
> +        * are possible, for example for Virtual Routing and Forwarding (VRF).
> +        *
> +        * For INDEX match type, the maximum allowed value is 32 bits.
> +        */
> +       uint32_t size;
> +
> +       /** Match mask (*size* bits are used and must be valid).
> +        *
> +        * For LPM match type, this parameter is ignored, as *size* implicitly
> +        * defines *mask* as *size* bits of 1.
> +        */
> +       uint8_t *mask;
> +};
> +
> +/** PDEV exact match table parameters. */
> +struct rte_pdev_table_exact_match_params {
> +       /** Number of hash table buckets. This parameter represents a hint that
> +        * the underlying implementation may ignore.
> +        */
> +       uint32_t n_buckets;
> +
> +       /** Hash table type. Non-zero for extendable bucket hash table, zero for
> +        * Least Recently Used (LRU) hash table.
> +        */
> +       int extendable_bucket;
> +};
> +
> +/** PDEV table parameters. */
> +struct rte_pdev_table_params {
> +       /** Match type. */
> +       enum rte_pdev_table_match_type match_type;
> +
> +       /** Match parameters. Ignored for STUB match type. */
> +       struct rte_pdev_table_match_params match;
> +
> +       /** Match type specific parameters. */
> +       RTE_STD_C11
> +       union {
> +               /** Exact match table specific parameters. */
> +               struct rte_pdev_table_exact_match_params exact;
> +       };
> +
> +       /** Maximum number of rules to be stored in the current table. */
> +       uint32_t n_rules;
> +};
> +
> +/**
> + * PDEV table create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] table_id
> + *   PDEV table ID. Must not be used by any existing *pdev* table.
> + * @params[in] params
> + *   Table parameters. Must be non-NULL and valid.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_create(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_table_params *params);
> +
> +/** PDEV output port type. */
> +enum rte_pdev_port_out_type {
> +       /** Builtin device. */
> +       RTE_PDEV_PORT_OUT_BUILTIN = 0,
> +
> +       /** Ethernet device. */
> +       RTE_PDEV_PORT_OUT_ETHDEV,
> +
> +       /** Drop all packets device. */
> +       RTE_PDEV_PORT_OUT_DROP,
> +};
> +
> +/** PDEV output port parameters. */
> +struct rte_pdev_port_out_params {
> +       /** Type. */
> +       enum rte_pdev_port_out_type type;
> +
> +       /** Device specific parameters. */
> +       union {
> +               /** Builtin device. */
> +               struct {
> +                       /** Builtin device name. */
> +                       const char *name;
> +               } builtin;
> +
> +               /** Ethernet device. */
> +               struct {
> +                       /** Ethernet device name. */
> +                       const char *name;
> +
> +                       /** Transmission side queue ID. */
> +                       uint32_t tx_queue_id;
> +
> +                       /** Burst size. */
> +                       uint32_t burst_size;
> +               } ethdev;
> +       } dev;
> +};
> +
> +/**
> + * PDEV output port create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV output port ID. Must not be used by any existing *pdev* output port.
> + * @params[in] params
> + *   Output port parameters. Must be non-NULL and valid.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_create(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_out_params *params);
> +
> +/**
> + * PDEV meta-data definition API
> + */
> +
> +/**
> + * PDEV packet field registration.
> + *
> + * Create symbolic alias for a protocol header or header field in the input
> + * packet. This alias can then be used as part of assignment actions registered
> + * for PDEV input ports, tables or output ports, either as left hand side value
> + * or as one of the right hand side expression operands, as appropriate. The
> + * packet field registered with the name of "x" is used as "pkt.x".
> + *
> + * This alias is typically translated to its offset and size, which are then
> + * used during the execution of assignment actions to access the associated data
> + * bytes within the packet.
> + *
> + * The scope of the packet field aliases is the PDEV instance. The attributes
> + * such as offset or size cannot be changed after the alias registration. This
> + * approach assumes the input packet type is known in advance, as opposed to
> + * having each input packet parsed to detect its type. This is a reasonable
> + * assumption, given that NIC capabilities to filter each packet type to a
> + * different RX queue are quite common; the NIC is configured transparently to
> + * the PDEV, with each NIC RX queue mapped as different PDEV input port.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] name
> + *   Symbolic alias for the input packet header or header field.
> + * @params[in] offset
> + *   Byte offset within the input packet. Offset 0 points to the first byte of
> + *   the packet.
> + * @params[in] size
> + *   Field size (in bytes).
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_pkt_field_register(struct rte_pdev *pdev,
> +       const char *name,
> +       uint32_t offset,
> +       uint32_t size);
> +
> +/**
> + * PDEV packet meta-data field registration.
> + *
> + * Reserve field into the packet meta-data and assign a symbolic alias to it.
> + * This alias can then be used as part of assignment actions registered for PDEV
> + * input ports, tables and output ports, either as left hand side value or as
> + * one of the right hand side expression operands, as appropriate. The packet
> + * meta-data field registered with the name of "x" is used as "meta.x".
> + *
> + * Each input packet has its own private memory area reserved to store its
> + * meta-data, which is valid for the lifetime of the packet within the PDEV.
> + * This alias is typically translated to its offset and size, which are then
> + * used during the execution of assignment actions to access the associated
> + * packet meta-data bytes.
> + *
> + * The scope of the packet meta-data field aliases is the PDEV instance,
> + * therefore the meta-data layout is commmon for all the PDEV input packets.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] name
> + *   Symbolic alias for the packet meta-data field.
> + * @params[in] size
> + *   Field size (in bytes).
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_pkt_meta_field_register(struct rte_pdev *pdev,
> +       const char *name,
> +       uint32_t size);
> +
> +/**
> + * PDEV table rule data field registration.
> + *
> + * Reserve field into the table rule data and assign a symbolic alias to it.
> + * This alias can then be used as part of assignment actions registered for PDEV
> + * tables, either as left hand side value or as one of the right hand side
> + * expression operands, as appropriate. The table rule data field registered
> + * with the name of "x" is used as "table.x".
> + *
> + * This alias is typically translated to its offset and size, which are then
> + * used during the execution of assignment actions to access the associated
> + * table rule data bytes.
> + *
> + * The table rule data layout is common for all the rules of a given table that
> + * share the same action profile, therefore the scope of the table rule data
> + * field alias is its (table, action profile) pair.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] table_id
> + *   PDEV table ID.
> + * @params[in] action_profile_id
> + *   Table action profile ID.
> + * @params[in] name
> + *   Symbolic alias for the table rule data field.
> + * @params[in] size
> + *   Field size (in bytes).
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_field_register(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t action_profile_id,
> +       const char *name,
> +       uint32_t size);
> +
> +/**
> + * PDEV input port action API
> + */
> +enum rte_pdev_port_in_action_type {
> +       /** Assignment: lvalue = expression. */
> +       RTE_PDEV_PORT_IN_ACTION_ASSIGN = 0,
> +};
> +
> +/**
> + * RTE_PDEV_PORT_IN_ACTION_ASSIGN
> + */
> +struct rte_pdev_port_in_action_assign_config {
> +       /** Left hand side value for the assignment. Must be one of the
> +        * pre-registered packet meta-data field symbolic aliases. Packet field
> +        * and table rule data field symbolic aliases are not allowed.
> +        */
> +       const char *lvalue;
> +
> +       /** Expression with operands and operators. The operands must be
> +        * pre-registered packet or packet meta-data field symbolic aliases.
> +        * Table rule data field aliases are not allowed.
> +        */
> +       const char *expression;
> +};
> +
> +/**
> + * PDEV input port action profile create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Input port action profile ID. Must not be used by any existing *pdev* input
> + *   port action profile.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_action_profile_create(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV input port action profile action register
> + *
> + * The action registration order is important, as it determines the action
> + * execution order. The same action type can be registered several times for the
> + * same profile.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Input port action profile ID.
> + * @params[in] action_type
> + *   Input port action type.
> + * @params[in] action_config
> + *   Input port action configuration. For input port action X, this parameter
> + *   needs to point to pre-allocated and valid instance of struct
> + *   rte_pdev_port_in_action_X_config.
> + *   Input port action configuration. For input port action X, this parameter
> + *   must point to valid instance of struct rte_pdev_port_in_action_X_config
> + *   when this structure is defined by the API or to NULL otherwise.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_action_profile_action_register(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_port_in_action_type action_type,
> +       void *action_config);
> +
> +/**
> + * PDEV input port action profile freeze
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Input port action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_action_profile_freeze(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV input port action profile register
> + *
> + * Zero or at most one action profile can be registered for each input port.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV input port ID.
> + * @params[in] profile_id
> + *   Input port action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_action_profile_register(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV table action API
> + *
> + * For some actions below (e.g. packet encapsulations, NAT, statistics, etc),
> + * the same effect can be obtained through a sequence of assignment actions, but
> + * usually this approach is significantly less performant than using specialized
> + * actions.
> + *
> + * Some of the actions below (e.g. metering, timestamp update, etc) require the
> + * definition of specialized actions.
> + */
> +enum rte_pdev_table_action_type {
> +       /** Assignment: lvalue = expression. */
> +       RTE_PDEV_TABLE_ACTION_ASSIGN = 0,
> +
> +       /** Load balancing. */
> +       RTE_PDEV_TABLE_ACTION_LB,
> +
> +       /** Traffic metering and policing. */
> +       RTE_PDEV_TABLE_ACTION_METER,
> +
> +       /** Packet encapsulation. */
> +       RTE_PDEV_TABLE_ACTION_ENCAP,
> +
> +       /** Network Address Translation (NAT). */
> +       RTE_PDEV_TABLE_ACTION_NAT,
> +
> +       /** Time to Leave (TTL) update. */
> +       RTE_PDEV_TABLE_ACTION_TTL,
> +
> +       /** Statistics per table rule. */
> +       RTE_PDEV_TABLE_ACTION_STATS,
> +
> +       /** Time stamp update. */
> +       RTE_PDEV_TABLE_ACTION_TIME,
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_ASSIGN
> + */
> +struct rte_pdev_table_action_assign_config {
> +       /** Left hand side value for the assignment. Must be one of the
> +        * pre-registered packet, packet meta-data or table field symbolic
> +        * aliases.
> +        */
> +       const char *lvalue;
> +
> +       /** Expression with operands and operators. The operands must be
> +        * pre-registered packet, packet meta-data or table field symbolic
> +        * aliases.
> +        */
> +       const char *expression;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_LB
> + */
> +/** Load balance action configuration (per table action profile). */
> +struct rte_pdev_table_action_lb_config {
> +       /** Hash key parameters. */
> +       struct rte_pdev_table_match_params hash_key;
> +
> +       /** Hash function name. This parameter represents a hint that the
> +        * underlying implementation may ignore.
> +        */
> +       const char *hash_func;
> +
> +       /** Hash function seed value. This parameter represents a hint that the
> +        * underlying implementation may ignore.
> +        */
> +       uint64_t hash_seed;
> +
> +       /** Number of elements in the table storing the output values. */
> +       uint32_t table_size;
> +
> +       /** Packet meta-data field name where the output value should be saved. */
> +       const char *out;
> +};
> +
> +/** Load balance action parameters (per table rule). */
> +struct rte_pdev_table_action_lb_params {
> +       /** Table defining the output values and their weights. Needs to be
> +        * pre-allocated with exactly *table_size* elements. The weights are set
> +        * in 1 / *table_size* increments. To assign a weight of N / *table_size*
> +        * to a given output value (0 <= N <= *table_size*), the same output
> +        * value needs to show up exactly N times in this table.
> +        */
> +       uint32_t *table;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_MTR
> + */
> +/** Packet color. */
> +enum rte_pdev_meter_color {
> +       RTE_PDEV_METER_COLOR_GREEN = 0, /**< Green. */
> +       RTE_PDEV_METER_COLOR_YELLOW, /**< Yellow. */
> +       RTE_PDEV_METER_COLOR_RED, /**< Red. */
> +       RTE_PDEV_METER_COLORS /**< Number of colors. */
> +};
> +
> +/** Differentiated Services Code Point (DSCP) translation table entry. */
> +struct rte_pdev_dscp_table_entry {
> +       /** Traffic class ID. Has to be strictly less than *n_tc*. */
> +       uint32_t tc_id;
> +
> +       /** Packet input color. Used by the traffic metering algorithm in
> +        * color aware mode.
> +        */
> +       enum rte_pdev_meter_color color;
> +};
> +
> +/** DSCP translation table. */
> +struct rte_pdev_dscp_table {
> +       /** Array of DSCP table entries */
> +       struct rte_pdev_dscp_table_entry entry[64];
> +};
> +
> +/** Supported traffic metering algorithms. */
> +enum rte_pdev_meter_algorithm {
> +       /** Single Rate Three Color Marker (srTCM) - IETF RFC 2697. */
> +       RTE_PDEV_METER_SRTCM_RFC2697,
> +
> +       /** Two Rate Three Color Marker (trTCM) - IETF RFC 2698. */
> +       RTE_PDEV_METER_TRTCM_RFC2698,
> +
> +       /** Two Rate Three Color Marker (trTCM) - IETF RFC 4115. */
> +       RTE_PDEV_METER_TRTCM_RFC4115,
> +};
> +
> +/** Traffic metering profile (configuration template). */
> +struct rte_pdev_meter_profile {
> +       /** Traffic metering algorithm. */
> +       enum rte_pdev_meter_algorithm alg;
> +
> +       RTE_STD_C11
> +       union {
> +               /** Items only valid when *alg* is set to srTCM - RFC 2697. */
> +               struct {
> +                       /** Committed Information Rate (CIR) (bytes/second). */
> +                       uint64_t cir;
> +
> +                       /** Committed Burst Size (CBS) (bytes). */
> +                       uint64_t cbs;
> +
> +                       /** Excess Burst Size (EBS) (bytes). */
> +                       uint64_t ebs;
> +               } srtcm_rfc2697;
> +
> +               /** Items only valid when *alg* is set to trTCM - RFC 2698. */
> +               struct {
> +                       /** Committed Information Rate (CIR) (bytes/second). */
> +                       uint64_t cir;
> +
> +                       /** Peak Information Rate (PIR) (bytes/second). */
> +                       uint64_t pir;
> +
> +                       /** Committed Burst Size (CBS) (byes). */
> +                       uint64_t cbs;
> +
> +                       /** Peak Burst Size (PBS) (bytes). */
> +                       uint64_t pbs;
> +               } trtcm_rfc2698;
> +
> +               /** Items only valid when *alg* is set to trTCM - RFC 4115. */
> +               struct {
> +                       /** Committed Information Rate (CIR) (bytes/second). */
> +                       uint64_t cir;
> +
> +                       /** Excess Information Rate (EIR) (bytes/second). */
> +                       uint64_t eir;
> +
> +                       /** Committed Burst Size (CBS) (byes). */
> +                       uint64_t cbs;
> +
> +                       /** Excess Burst Size (EBS) (bytes). */
> +                       uint64_t ebs;
> +               } trtcm_rfc4115;
> +       };
> +};
> +
> +/** Policer actions. */
> +enum rte_pdev_policer {
> +       /** Recolor the packet as green. */
> +       RTE_PDEV_POLICER_COLOR_GREEN = 0,
> +
> +       /** Recolor the packet as yellow. */
> +       RTE_PDEV_POLICER_COLOR_YELLOW,
> +
> +       /** Recolor the packet as red. */
> +       RTE_PDEV_POLICER_COLOR_RED,
> +
> +       /** Drop the packet. */
> +       RTE_PDEV_POLICER_DROP,
> +};
> +
> +/** Meter action configuration per traffic class. */
> +struct rte_pdev_table_action_mtr_tc_params {
> +       /** Meter profile ID. */
> +       uint32_t meter_profile_id;
> +
> +       /** Policer actions. */
> +       enum rte_pdev_policer policer[RTE_PDEV_METER_COLORS];
> +};
> +
> +/** Meter action statistics counters per traffic class. */
> +struct rte_pdev_table_action_mtr_tc_counters {
> +       /** Number of packets per color at the output of the traffic metering
> +        * and before the policer actions are executed. Only valid when
> +        * *n_packets_valid* is non-zero.
> +        */
> +       uint64_t n_packets[RTE_PDEV_METER_COLORS];
> +
> +       /** Number of packet bytes per color at the output of the traffic
> +        * metering and before the policer actions are executed. Only valid when
> +        * *n_bytes_valid* is non-zero.
> +        */
> +       uint64_t n_bytes[RTE_PDEV_METER_COLORS];
> +
> +       /** When non-zero, the *n_packets* field is valid. */
> +       int n_packets_valid;
> +
> +       /** When non-zero, the *n_bytes* field is valid. */
> +       int n_bytes_valid;
> +};
> +
> +/** Meter action configuration (per table action profile). */
> +struct rte_pdev_table_action_mtr_config {
> +       /** Packet field for IP header. */
> +       const char *ip;
> +
> +       /** IP protocol version. Non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** DSCP translation table. */
> +       struct rte_pdev_dscp_table *dscp_table;
> +
> +       /** Number of traffic classes. Each traffic class has its own traffic
> +        * meter and policer instances.
> +        */
> +       uint32_t n_tc;
> +
> +       /** Meter algorithm. */
> +       enum rte_pdev_meter_algorithm alg;
> +
> +       /** When non-zero, the *n_packets* meter stats counter is enabled,
> +        * otherwise it is disabled.
> +        *
> +        * @see struct rte_pdev_table_action_mtr_tc_counters
> +        */
> +       int n_packets_enabled;
> +
> +       /** When non-zero, the *n_bytes* meter stats counter is enabled,
> +        * otherwise it is disabled.
> +        *
> +        * @see struct rte_pdev_table_action_mtr_tc_counters
> +        */
> +       int n_bytes_enabled;
> +};
> +
> +/** Meter action parameters (per table rule). */
> +struct rte_pdev_table_action_mtr_params {
> +       /** Traffic meter and policer parameters for all traffic classes. Array
> +        * of *n_tc* elements.
> +        */
> +       struct rte_pdev_table_action_mtr_tc_params *mtr;
> +};
> +
> +/** Meter action statistics counters (per table rule). */
> +struct rte_pdev_table_action_mtr_counters {
> +       /** Stats counters for all traffic classes. Array of *n_tc* elements. */
> +       struct rte_pdev_table_action_mtr_tc_counters *stats;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_ENCAP
> + */
> +/** Supported packet encapsulation types. */
> +enum rte_pdev_encap_type {
> +       /** IP -> { Ether | IP } */
> +       RTE_PDEV_ENCAP_ETHER = 0,
> +
> +       /** IP -> { Ether | VLAN | IP } */
> +       RTE_PDEV_ENCAP_VLAN,
> +
> +       /** IP -> { Ether | S-VLAN | C-VLAN | IP } */
> +       RTE_PDEV_ENCAP_QINQ,
> +
> +       /** IP -> { Ether | MPLS | IP } */
> +       RTE_PDEV_ENCAP_MPLS,
> +
> +       /** IP -> { Ether | PPPoE | PPP | IP } */
> +       RTE_PDEV_ENCAP_PPPOE,
> +};
> +
> +/** Pre-computed Ethernet header fields for encapsulation action. */
> +struct rte_pdev_ether_hdr {
> +       struct ether_addr da; /**< Destination address. */
> +       struct ether_addr sa; /**< Source address. */
> +};
> +
> +/** Pre-computed VLAN header fields for encapsulation action. */
> +struct rte_pdev_vlan_hdr {
> +       uint8_t pcp; /**< Priority Code Point (PCP). */
> +       uint8_t dei; /**< Drop Eligibility Indicator (DEI). */
> +       uint16_t vid; /**< VLAN Identifier (VID). */
> +};
> +
> +/** Pre-computed MPLS header fields for encapsulation action. */
> +struct rte_pdev_mpls_hdr {
> +       uint32_t label; /**< Label. */
> +       uint8_t tc; /**< Traffic Class (TC). */
> +       uint8_t ttl; /**< Time to Live (TTL). */
> +};
> +
> +/** Pre-computed PPPoE header fields for encapsulation action. */
> +struct rte_pdev_pppoe_hdr {
> +       uint16_t session_id; /**< Session ID. */
> +};
> +
> +/** Ether encap parameters. */
> +struct rte_pdev_encap_ether_params {
> +       struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
> +};
> +
> +/** VLAN encap parameters. */
> +struct rte_pdev_encap_vlan_params {
> +       struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
> +       struct rte_pdev_vlan_hdr vlan; /**< VLAN header. */
> +};
> +
> +/** QinQ encap parameters. */
> +struct rte_pdev_encap_qinq_params {
> +       struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
> +       struct rte_pdev_vlan_hdr svlan; /**< Service VLAN header. */
> +       struct rte_pdev_vlan_hdr cvlan; /**< Customer VLAN header. */
> +};
> +
> +/** Max number of MPLS labels per output packet for MPLS encapsulation. */
> +#ifndef RTE_PDEV_MPLS_LABELS_MAX
> +#define RTE_PDEV_MPLS_LABELS_MAX                            4
> +#endif
> +
> +/** MPLS encap parameters. */
> +struct rte_pdev_encap_mpls_params {
> +       /** Ethernet header. */
> +       struct rte_pdev_ether_hdr ether;
> +
> +       /** MPLS header. */
> +       struct rte_pdev_mpls_hdr mpls[RTE_PDEV_MPLS_LABELS_MAX];
> +
> +       /** Number of MPLS labels in MPLS header. */
> +       uint32_t mpls_count;
> +
> +       /** Non-zero for MPLS unicast, zero for MPLS multicast. */
> +       int unicast;
> +};
> +
> +/** PPPoE encap parameters. */
> +struct rte_pdev_encap_pppoe_params {
> +       struct rte_pdev_ether_hdr ether; /**< Ethernet header. */
> +       struct rte_pdev_pppoe_hdr pppoe; /**< PPPoE/PPP headers. */
> +};
> +
> +/** Encap action configuration (per table action profile). */
> +struct rte_pdev_table_action_encap_config {
> +       /** Packet field for IP header. */
> +       const char *ip;
> +
> +       /** IP protocol version. Non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** Bit mask defining the set of packet encapsulations enabled for the
> +        * current table action profile. If bit (1 << N) is set in *encap_mask*,
> +        * then packet encapsulation N is enabled, otherwise it is disabled.
> +        *
> +        * @see enum rte_pdev_encap_type
> +        */
> +       uint64_t encap_mask;
> +};
> +
> +/** Encap action parameters (per table rule). */
> +struct rte_pdev_table_action_encap_params {
> +       /** Encapsulation type. */
> +       enum rte_pdev_encap_type type;
> +
> +       RTE_STD_C11
> +       union {
> +               /** Only valid when *type* is set to Ether. */
> +               struct rte_pdev_encap_ether_params ether;
> +
> +               /** Only valid when *type* is set to VLAN. */
> +               struct rte_pdev_encap_vlan_params vlan;
> +
> +               /** Only valid when *type* is set to QinQ. */
> +               struct rte_pdev_encap_qinq_params qinq;
> +
> +               /** Only valid when *type* is set to MPLS. */
> +               struct rte_pdev_encap_mpls_params mpls;
> +
> +               /** Only valid when *type* is set to PPPoE. */
> +               struct rte_pdev_encap_pppoe_params pppoe;
> +       };
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_NAT
> + */
> +/** NAT action configuration (per table action profile). */
> +struct rte_pdev_table_action_nat_config {
> +       /** Packet field for IP header. */
> +       const char *ip;
> +
> +       /** IP protocol version. Non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** When non-zero, the IP source address and L4 protocol source port are
> +        * translated. When zero, the IP destination address and L4 protocol
> +        * destination port are translated.
> +        */
> +       int source_nat;
> +
> +       /** Layer 4 protocol, for example TCP (0x06) or UDP (0x11). The checksum
> +        * field is computed differently and placed at different header offset
> +        * by each layer 4 protocol.
> +        */
> +       uint8_t proto;
> +};
> +
> +/** NAT action parameters (per table rule). */
> +struct rte_pdev_table_action_nat_params {
> +       /** IP version for *addr*: non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** IP address. */
> +       union {
> +               /** IPv4 address; only valid when *ip_version* is IPv4. */
> +               uint32_t ipv4;
> +
> +               /** IPv6 address; only valid when *ip_version* is IPv6. */
> +               uint8_t ipv6[16];
> +       } addr;
> +
> +       /** Port. */
> +       uint16_t port;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_TTL
> + */
> +/** TTL action configuration (per table action profile). */
> +struct rte_pdev_table_action_ttl_config {
> +       /** Packet field for IP header. */
> +       const char *ip;
> +
> +       /** IP protocol version. Non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** Packet meta-data field to be set to *port_out_id* when the updated
> +        * IPv4 Time to Live (TTL) field or IPv6 Hop Limit (HL) field is zero.
> +        */
> +       const char *port_out;
> +
> +       /** Output port ID to be stored into *port_out* packet meta-data field
> +        * when the updated IPv4 TTL field or IPv6 HL field is zero.
> +        */
> +       uint32_t port_out_id;
> +
> +       /** When non-zero, the *n_packets* stats counter for TTL action is
> +        * enabled, otherwise disabled.
> +        *
> +        * @see struct rte_pdev_table_action_ttl_counters
> +        */
> +       int n_packets_enabled;
> +};
> +
> +/** TTL action parameters (per table rule). */
> +struct rte_pdev_table_action_ttl_params {
> +       /** When non-zero, decrement the IPv4 TTL field and update the checksum
> +        * field, or decrement the IPv6 HL field. When zero, the IPv4 TTL field
> +        * or the IPv6 HL field is not changed.
> +        */
> +       int decrement;
> +};
> +
> +/** TTL action statistics packets (per table rule). */
> +struct rte_pdev_table_action_ttl_counters {
> +       /** Number of IPv4 packets whose updated TTL field is zero or IPv6
> +        * packets whose updated HL field is zero.
> +        */
> +       uint64_t n_packets;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_STATS
> + */
> +/** Stats action configuration (per table action profile). */
> +struct rte_pdev_table_action_stats_config {
> +       /** Packet field for IP header. */
> +       const char *ip;
> +
> +       /** IP protocol version. Non-zero for IPv4, zero for IPv6. */
> +       int ip_version;
> +
> +       /** When non-zero, the *n_packets* stats counter is enabled, otherwise
> +        * disabled.
> +        *
> +        * @see struct rte_pdev_table_action_stats_counters
> +        */
> +       int n_packets_enabled;
> +
> +       /** When non-zero, the *n_bytes* stats counter is enabled, otherwise
> +        * disabled.
> +        *
> +        * @see struct rte_pdev_table_action_stats_counters
> +        */
> +       int n_bytes_enabled;
> +};
> +
> +/** Stats action parameters (per table rule). */
> +struct rte_pdev_table_action_stats_params {
> +       /** Initial value for the *n_packets* stats counter. Typically set to 0.
> +        *
> +        * @see struct rte_pdev_table_action_stats_counters
> +        */
> +       uint64_t n_packets;
> +
> +       /** Initial value for the *n_bytes* stats counter. Typically set to 0.
> +        *
> +        * @see struct rte_pdev_table_action_stats_counters
> +        */
> +       uint64_t n_bytes;
> +};
> +
> +/** Stats action counters (per table rule). */
> +struct rte_pdev_table_action_stats_counters {
> +       /** Number of packets. Valid only when *n_packets_valid* is non-zero. */
> +       uint64_t n_packets;
> +
> +       /** Number of bytes. Valid only when *n_bytes_valid* is non-zero. */
> +       uint64_t n_bytes;
> +
> +       /** When non-zero, the *n_packets* field is valid, otherwise invalid. */
> +       int n_packets_valid;
> +
> +       /** When non-zero, the *n_bytes* field is valid, otherwise invalid. */
> +       int n_bytes_valid;
> +};
> +
> +/**
> + * RTE_PDEV_TABLE_ACTION_TIME
> + */
> +/** Timestamp action parameters (per table rule). */
> +struct rte_pdev_table_action_time_params {
> +       /** Initial timestamp value. Typically set to current time. */
> +       uint64_t time;
> +};
> +
> +/**
> + * PDEV table action profile create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Table action profile ID. Must not be used by any existing *pdev* table
> + *   action profile.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_action_profile_create(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV table action profile action register
> + *
> + * The action registration order is important, as it determines the action
> + * execution order. The same action type can be registered several times for the
> + * same profile.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Table action profile ID.
> + * @params[in] action_type
> + *   Table action type.
> + * @params[in] action_config
> + *   Table action configuration. For table action X, this parameter must point
> + *   to valid instance of struct rte_pdev_table_action_X_config when this
> + *   structure is defined by the API or to NULL otherwise.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_action_profile_action_register(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_table_action_type action_type,
> +       void *action_config);
> +
> +/**
> + * PDEV table action profile freeze
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Table action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_action_profile_freeze(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV table action profile register
> + *
> + * Zero or several action profiles can be registered for each table.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] table_id
> + *   PDEV table ID.
> + * @params[in] profile_id
> + *   Table action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_action_profile_register(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV output port action API
> + */
> +enum rte_pdev_port_out_action_type {
> +       /** Assignment: lvalue = expression. */
> +       RTE_PDEV_PORT_OUT_ACTION_ASSIGN = 0,
> +};
> +
> +/**
> + * RTE_PDEV_PORT_OUT_ACTION_ASSIGN
> + */
> +struct rte_pdev_port_out_action_assign_config {
> +       /** Left hand side value for the assignment. Must be one of the
> +        * pre-registered packet field symbolic aliases. Packet meta-data field
> +        * and table rule data field aliases are not allowed.
> +        */
> +       const char *lvalue;
> +
> +       /** Expression with operands and operators. The operands must be
> +        * pre-registered packet or packet meta-data field symbolic aliases.
> +        * Table rule data field aliases are not allowed.
> +        */
> +       const char *expression;
> +};
> +
> +/**
> + * PDEV output port action profile create
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Output port action profile ID. Must not be used by any existing *pdev*
> + *   output port action profile.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_action_profile_create(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV output port action profile action register
> + *
> + * The action registration order is important, as it determines the action
> + * execution order. The same action type can be registered several times for the
> + * same profile.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Output port action profile ID.
> + * @params[in] action_type
> + *   Output port action type.
> + * @params[in] action_config
> + *   Output port action configuration. For output port action X, this parameter
> + *   must point to valid instance of struct rte_pdev_port_out_action_X_config
> + *   when this structure is defined by the API or to NULL otherwise.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_action_profile_action_register(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_port_out_action_type action_type,
> +       void *action_config);
> +
> +/**
> + * PDEV output port action profile freeze
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] profile_id
> + *   Output port action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_action_profile_freeze(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV output port action profile register
> + *
> + * Zero or at most one action profile can be registered for each output port.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV output port ID.
> + * @params[in] profile_id
> + *   Output port action profile ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_action_profile_register(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t profile_id);
> +
> +/**
> + * PDEV input port run-time API
> + */
> +
> +/**
> + * PDEV input port enable
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV input port ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_enable(struct rte_pdev *pdev,
> +       uint32_t port_id);
> +
> +/**
> + * PDEV input port disable
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV input port ID.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_disable(struct rte_pdev *pdev,
> +       uint32_t port_id);
> +
> +/** PDEV input port statistics counters. */
> +struct rte_pdev_port_in_stats {
> +       /** Number of packets read from this input port. */
> +       uint64_t n_packets;
> +
> +       /** Number of bytes associated with *n_packets*. */
> +       uint64_t n_bytes;
> +};
> +
> +/**
> + * PDEV input port statistics counters read
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV input port ID.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_in_stats_read(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_in_stats *stats,
> +       int clear);
> +
> +/**
> + * PDEV table run-time API
> + */
> +
> +/**
> + * PDEV table rule add.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] match
> + *   Rule match. For any match type, the NULL value indicates the default rule.
> + * @param[in] match_mask
> + *   Rule match bit-mask. Ignored when *match* is set to NULL. Ignored when
> + *   *table_id* match type is not WILDCARD or LPM.
> + * @param[in] match_priority
> + *   Rule match priority. Ignored when *match* is set to NULL. Ignored when
> + *   *table_id* match type is not WILDCARD.
> + * @param[in] action_profile_id
> + *   Table action profile ID.
> + * @param[in] action_params
> + *   Array of action parameters. The number of elements must be equal to the
> + *   number of actions registered for the *action_profile_id* table action
> + *   profile. If X is the N-th action registered for *action_profile_id*, then
> + *   the N-th element of this array needs to be pointer to valid instance of
> + *   struct rte_pdev_table_action_X_params when defined by the API or to NULL
> + *   otherwise.
> + * @param[out] rule_handle
> + *   Rule handle.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_add(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint8_t *match,
> +       uint8_t *match_mask,
> +       uint32_t match_priority,
> +       uint32_t action_profile_id,
> +       void **action_params,
> +       void **rule_handle);
> +
> +/**
> + * PDEV table rule delete.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] match
> + *   Rule match. For any match type, the NULL value indicates the default rule.
> + * @param[in] match_mask
> + *   Rule match bit-mask. Ignored when *match* is set to NULL. Ignored when
> + *   *table_id* match type is not WILDCARD or LPM.
> +  * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_delete(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint8_t *match,
> +       uint8_t *match_mask);
> +
> +/**
> + * PDEV table DSCP table update.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] dscp_table
> + *   DSCP table. Must be pre-allocated and valid.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_dscp_table_update(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_dscp_table *dscp_table);
> +
> +/**
> + * PDEV table meter profile add.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] meter_profile_id
> + *   Meter profile ID. Must not be used by any existing *table_id* meter
> + *   profile.
> + * @param[in] profile
> + *   Meter profile parameters.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_meter_profile_add(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t meter_profile_id,
> +       struct rte_pdev_meter_profile *profile);
> +
> +/**
> + * PDEV table meter profile delete.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] meter_profile_id
> + *   Meter profile ID. Must be one of the existing *table_id* meter profiles.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_meter_profile_delete(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t meter_profile_id);
> +
> +/**
> + * PDEV table rule meter statistics counters read.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] rule_handle
> + *   Rule handle.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_meter_read(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_mtr_counters *stats,
> +       int clear);
> +
> +/**
> + * PDEV table rule TTL statistics counters read.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] rule_handle
> + *   Rule handle.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_ttl_read(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_ttl_counters *stats,
> +       int clear);
> +
> +/**
> + * PDEV table rule statistics counters read.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] rule_handle
> + *   Rule handle.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_stats_read(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_stats_counters *stats,
> +       int clear);
> +
> +/**
> + * PDEV table rule timestamp read.
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @param[in] table_id
> + *   PDEV table ID.
> + * @param[in] rule_handle
> + *   Rule handle.
> + * @param[out] timestamp
> + *   Current timestamp value. Must be non-NULL.
> + * @return
> + *   Zero on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_rule_timestamp_read(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       uint64_t *timestamp);
> +
> +/** PDEV table statistics counters. */
> +struct rte_pdev_table_stats {
> +       /** Number of packets looked up in this table. */
> +       uint64_t n_packets;
> +
> +       /** Number of bytes associated with *n_packets*. */
> +       uint64_t n_bytes;
> +};
> +
> +/**
> + * PDEV table statistics counters read
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] table_id
> + *   PDEV table ID.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_table_stats_read(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_table_stats *stats,
> +       int clear);
> +
> +/**
> + * PDEV output port run-time API
> + */
> +
> +/** PDEV output port statistics counters. */
> +struct rte_pdev_port_out_stats {
> +       /** Number of packets written to this output port. */
> +       uint64_t n_packets;
> +
> +       /** Number of bytes associated with *n_packets*. */
> +       uint64_t n_bytes;
> +};
> +
> +/**
> + * PDEV output port statistics counters read
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + * @params[in] port_id
> + *   PDEV output port ID.
> + * @params[out] stats
> + *   When non-NULL, the statistics counters are read and saved here.
> + * @params[in] clear
> + *   When non-zero, the statistics counters are cleared after read, otherwise
> + *   they are not modified.
> + * @return
> + *   0 on success, non-zero error code otherwise.
> + */
> +int
> +rte_pdev_port_out_stats_read(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_out_stats *stats,
> +       int clear);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/librte_pipeline/rte_pdev_driver.h b/lib/librte_pipeline/rte_pdev_driver.h
> new file mode 100644
> index 0000000..4b97784
> --- /dev/null
> +++ b/lib/librte_pipeline/rte_pdev_driver.h
> @@ -0,0 +1,283 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Intel Corporation
> + */
> +
> +#ifndef __INCLUDE_RTE_PDEV_DRIVER_H__
> +#define __INCLUDE_RTE_PDEV_DRIVER_H__
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file
> + * RTE Pipeline Device (PDEV) - Device Driver Interface
> + */
> +
> +#include <stdint.h>
> +
> +#include "rte_pdev.h"
> +
> +/** @internal PDEV create */
> +typedef struct rte_pdev * (*rte_pdev_create_t)(struct rte_device *dev,
> +       struct rte_pdev_params *params);
> +
> +/** @internal PDEV free */
> +typedef int (*rte_pdev_free_t)(struct rte_pdev *pdev);
> +
> +/** @internal PDEV start */
> +typedef int (*rte_pdev_start_t)(struct rte_pdev *pdev);
> +
> +/** @internal PDEV input port create */
> +typedef int (*rte_pdev_port_in_create_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_in_params *params,
> +       int enable);
> +
> +/** @internal PDEV input port connect */
> +typedef int (*rte_pdev_port_in_connect_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       uint32_t table_id);
> +
> +/** @internal PDEV table create */
> +typedef int (*rte_pdev_table_create_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_table_params *params);
> +
> +/** @internal PDEV output port create */
> +typedef int (*rte_pdev_port_out_create_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_out_params *params);
> +
> +/** @internal PDEV packet field register */
> +typedef int (*rte_pdev_pkt_field_register_t)(struct rte_pdev *pdev,
> +       const char *name,
> +       uint32_t offset,
> +       uint32_t size);
> +
> +/** @internal PDEV packet meta-data field register */
> +typedef int (*rte_pdev_pkt_meta_field_register_t)(struct rte_pdev *pdev,
> +       const char *name,
> +       uint32_t size);
> +
> +/** @internal PDEV table rule data field register */
> +typedef int (*rte_pdev_table_field_register_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t action_profile_id,
> +       const char *name,
> +       uint32_t size);
> +
> +/** @internal PDEV input port action profile create */
> +typedef int (*rte_pdev_port_in_action_profile_create_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV input port action profile action register */
> +typedef int (*rte_pdev_port_in_action_profile_action_register_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_port_in_action_type action_type,
> +       void *action_config);
> +
> +/** @internal PDEV input port action profile freeze */
> +typedef int (*rte_pdev_port_in_action_profile_freeze_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV input port action profile register */
> +typedef int (*rte_pdev_port_in_action_profile_register_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV table action profile create */
> +typedef int (*rte_pdev_table_action_profile_create_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV table action profile action register */
> +typedef int (*rte_pdev_table_action_profile_action_register_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_table_action_type action_type,
> +       void *action_config);
> +
> +/** @internal PDEV table action profile freeze */
> +typedef int (*rte_pdev_table_action_profile_freeze_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV table action profile register */
> +typedef int (*rte_pdev_table_action_profile_register_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV output port action profile create */
> +typedef int (*rte_pdev_port_out_action_profile_create_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV output port action profile action register */
> +typedef int (*rte_pdev_port_out_action_profile_action_register_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id,
> +       enum rte_pdev_port_out_action_type action_type,
> +       void *action_config);
> +
> +/** @internal PDEV output port action profile freeze */
> +typedef int (*rte_pdev_port_out_action_profile_freeze_t)(struct rte_pdev *pdev,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV output port action profile register */
> +typedef int (*rte_pdev_port_out_action_profile_register_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t profile_id);
> +
> +/** @internal PDEV input port enable */
> +typedef int (*rte_pdev_port_in_enable_t)(struct rte_pdev *pdev,
> +       uint32_t port_id);
> +
> +/** @internal PDEV input port disable */
> +typedef int (*rte_pdev_port_in_disable_t)(struct rte_pdev *pdev,
> +       uint32_t port_id);
> +
> +/** @internal PDEV input port stats read */
> +typedef int (*rte_pdev_port_in_stats_read_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_in_stats *stats,
> +       int clear);
> +
> +/** @internal PDEV table rule add */
> +typedef int (*rte_pdev_table_rule_add_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint8_t *match,
> +       uint8_t *match_mask,
> +       uint32_t match_priority,
> +       uint32_t action_profile_id,
> +       void **action_params,
> +       void **rule_handle);
> +
> +/** @internal PDEV table rule delete */
> +typedef int (*rte_pdev_table_rule_delete_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint8_t *match,
> +       uint8_t *match_mask);
> +
> +/** @internal PDEV table DSCP table update */
> +typedef int (*rte_pdev_table_dscp_table_update_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_dscp_table *dscp_table);
> +
> +/** @internal PDEV table meter profile add */
> +typedef int (*rte_pdev_table_meter_profile_add_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t meter_profile_id,
> +       struct rte_pdev_meter_profile *profile);
> +
> +/** @internal PDEV table meter profile delete */
> +typedef int (*rte_pdev_table_meter_profile_delete_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       uint32_t meter_profile_id);
> +
> +/** @internal PDEV table rule meter stats read */
> +typedef int (*rte_pdev_table_rule_meter_read_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_mtr_counters *stats,
> +       int clear);
> +
> +/** @internal PDEV table rule TTL stats read */
> +typedef int (*rte_pdev_table_rule_ttl_read_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_ttl_counters *stats,
> +       int clear);
> +
> +/** @internal PDEV table rule stats read */
> +typedef int (*rte_pdev_table_rule_stats_read_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       struct rte_pdev_table_action_stats_counters *stats,
> +       int clear);
> +
> +/** @internal PDEV table rule tiemstamp read */
> +typedef int (*rte_pdev_table_rule_timestamp_read_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       void *rule_handle,
> +       uint64_t *timestamp);
> +
> +/** @internal PDEV table stats read */
> +typedef int (*rte_pdev_table_stats_read_t)(struct rte_pdev *pdev,
> +       uint32_t table_id,
> +       struct rte_pdev_table_stats *stats,
> +       int clear);
> +
> +/** @internal PDEV output port stats read */
> +typedef int (*rte_pdev_port_out_stats_read_t)(struct rte_pdev *pdev,
> +       uint32_t port_id,
> +       struct rte_pdev_port_out_stats *stats,
> +       int clear);
> +
> +/** PDEV ops */
> +struct rte_pdev_ops {
> +       /** PDEV create API */
> +       rte_pdev_create_t create;
> +       rte_pdev_free_t free;
> +       rte_pdev_start_t start;
> +       rte_pdev_port_in_create_t port_in_create;
> +       rte_pdev_port_in_connect_t port_in_connect;
> +       rte_pdev_table_create_t table_create;
> +       rte_pdev_port_out_create_t port_out_create;
> +
> +       /** PDEV meta-data API */
> +       rte_pdev_pkt_field_register_t pkt_field_register;
> +       rte_pdev_pkt_meta_field_register_t pkt_meta_field_register;
> +       rte_pdev_table_field_register_t table_field_register;
> +
> +       /** PDEV input port action API */
> +       rte_pdev_port_in_action_profile_create_t port_in_action_profile_create;
> +       rte_pdev_port_in_action_profile_action_register_t port_in_action_profile_action_register;
> +       rte_pdev_port_in_action_profile_freeze_t port_in_action_profile_freeze;
> +       rte_pdev_port_in_action_profile_register_t port_in_action_profile_register;
> +
> +       /** PDEV table action API */
> +       rte_pdev_table_action_profile_create_t table_action_profile_create;
> +       rte_pdev_table_action_profile_action_register_t table_action_profile_action_register;
> +       rte_pdev_table_action_profile_freeze_t table_action_profile_freeze;
> +       rte_pdev_table_action_profile_register_t table_action_profile_register;
> +
> +       /** PDEV output port action API */
> +       rte_pdev_port_out_action_profile_create_t port_out_action_profile_create;
> +       rte_pdev_port_out_action_profile_action_register_t port_out_action_profile_action_register;
> +       rte_pdev_port_out_action_profile_freeze_t port_out_action_profile_freeze;
> +       rte_pdev_port_out_action_profile_register_t port_out_action_profile_register;
> +
> +       /** PDEV input port run-time API */
> +       rte_pdev_port_in_enable_t port_in_enable;
> +       rte_pdev_port_in_disable_t port_in_disable;
> +       rte_pdev_port_in_stats_read_t port_in_stats_read;
> +
> +       /** PDEV table run-time API */
> +       rte_pdev_table_rule_add_t table_rule_add;
> +       rte_pdev_table_rule_delete_t table_rule_delete;
> +       rte_pdev_table_dscp_table_update_t table_dscp_table_update;
> +       rte_pdev_table_meter_profile_add_t table_meter_profile_add;
> +       rte_pdev_table_meter_profile_delete_t table_meter_profile_delete;
> +       rte_pdev_table_rule_meter_read_t table_rule_meter_read;
> +       rte_pdev_table_rule_ttl_read_t table_rule_ttl_read;
> +       rte_pdev_table_rule_stats_read_t table_rule_stats_read;
> +       rte_pdev_table_rule_timestamp_read_t table_rule_timestamp_read;
> +       rte_pdev_table_stats_read_t table_stats_read;
> +
> +       /** PDEV output port run-time API */
> +       rte_pdev_port_out_stats_read_t port_out_stats_read;
> +};
> +
> +/**
> + * Get PDEV ops
> + *
> + * @param[in] pdev
> + *   PDEV handle.
> + *
> + * @return
> + *   PDEV ops on success, NULL otherwise.
> + */
> +const struct rte_pdev_ops *
> +rte_pdev_ops_get(struct rte_pdev *pdev);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] P4 enablement in DPDK
  2018-06-15 23:25 ` antonin
@ 2018-06-19 17:52   ` Dumitrescu, Cristian
  0 siblings, 0 replies; 6+ messages in thread
From: Dumitrescu, Cristian @ 2018-06-19 17:52 UTC (permalink / raw)
  To: antonin; +Cc: dev, Daly, Dan

Hi Antonin,

Thanks very much for your input and your help going forward!

More comments inline below.

From: antonin@barefootnetworks.com [mailto:antonin@barefootnetworks.com]
Sent: Saturday, June 16, 2018 12:26 AM
To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
Cc: dev@dpdk.org; Daly, Dan <dan.daly@intel.com>
Subject: Re: [dpdk-dev] [RFC] P4 enablement in DPDK

Hi,

I want to express support for this proposal and adding P4 capabilities in DPDK. For example, I personally see a lot of demand for a production-quality P4-programmable software switch.
[Cristian] Excellent, thank you!

A few comments on this:

1) I see a lot of similarities between the proposed PDEV table runtime API and the existing PI C API: https://github.com/p4lang/PI/tree/master/include/PI. I wonder if there would be value in trying to re-use them - at least partially - for this.
[Cristian] Yes, DPDK PDEV should be very much aligned with the P4 language and the P4Runtime API.

PI is very much aligned on P4 and strictly Program Independent. That does not seem to be completely the case for the PDEV table runtime API (dscp table, TTL, …) and I’m not familiar enough with DPDK to understand the rationale for this, but I don’t see why DPDK couldn’t have its own extensions to the PI API.
[Cristian] Yes, these are simply extensions to support the frequent actions on IP packets, as most of the packets are IP packets, which justifies specializations for performance reasons.

2) For the sake of avoiding fragmentation of the community, I would strongly recommend making sure that there is an available P4Runtime (https://p4.org/p4-spec/docs/P4Runtime-v1.0.0.pdf) implementation for DPDK. That would require a mapping from P4Runtime messages to PDEV API calls. The advantage of trying to align PDEV with PI (first bullet point) is that there is already a mapping from P4Runtime messages to PI API calls.
[Cristian] Yes, we need to fit all the P4Runtime functionality in the PDEV API.

The burden of supporting P4Runtime can probably be reduced by leveraging the Stratum project (https://stratumproject.org/), which unfortunately is not open-source yet.
[Cristian] Yes, the “instrumentation” layer translating between the gRPC calls of P4RT and the DPDK PDEV API probably fits into an SDN controller such as Stratum.

3) It seems that the notion of “action profile” here is more general than in P4, or more precisely than in the P4_16 PSA architecture (Portable Switch Architecture). Since this term has a strong connotation in the P4 world, maybe another term should be used instead if possible.
[Cristian] Yes, we’ll probably rename “action profile” to something like “action configuration template” to avoid name clash with the action profile construct from the P4 language.

4) I recommend looking into the notion of “architecture” in P4_16 and trying to decide if you want to a) have generic support for all P4 architectures (at least for the CPU implementation), b) support the PSA architecture specifically (which is the primary / only architecture used as part of Stratum) or c) define your own architecture specifically for targets that are going to support P4 through DPDK drivers (which may limit your impact).
[Cristian] The PDEV API should support all features of the P4 language, the set of extern constructs defined by the PSA architecture and the configuration API defined by the P4RT; of course, support for each PDEV API feature is subject to the target supporting it.  The PDEV API must be agnostic of whether the implementation (DPDK driver) addresses a HW or SW target. As stated, we want to support a generous range of P4 capable devices (FPGAs, ASICs, NICs, Smart NICs, etc) as well as the SW target (CPU based), with the latter likely to be implemented based on DPDK Packet Framework libraries.

 5) Conceptually the APIs can be split into 2 parts: a) the table runtime APIs, which are generally pretty-straightforward and b) pipeline query & configuration APIs. Both P4Runtime (SetForwardingPipelineConfig) & PI (pi_device_update_[start|end]) include mechanisms to re-configure the data-plane, by providing the compiler output to the target.
For b), I strongly recommend looking into what we have done with P4Runtime. SetForwardingPipelineConfig provides the target with a P4Info message (which is target-agnostic and describes the interface of each runtime-controllable P4 object; in a way I believe it is similar to your table_create PDEV API) and a target-specific opaque “blob”. For reconfigurable SW & HW the “blob” is essentially a description of the pipeline: it can be some text file, binary register values, an object file, etc…
The case of fixed-function devices is usually trickier. We actually do not have a pipeline discovery mechanism in P4Runtime & PI. In P4Runtime, we just assume that the control-plane is aware of the pipeline and has access to a P4Info message for it. We still require the P4Runtime client to call SetForwardingPipelineConfig with the “right" P4Info message (we expect the target to return an error if the P4Info is not the right one) and a potentially empty “blob”.
I think the take-away is that there isn’t a unified pipeline creation mechanism across programmable targets, i.e. it is difficult to break down pipeline creation into a sequence of universal sub-API calls, such as “create_table”, “create_parser”, etc… However it would make perfect sense IMO to design and implement such an API in the context of a specific DPDK SW switch. The P4 compiler backend would then be in-charge of generating the appropriate sequence of API calls.
[Cristian] Yes, it makes perfect sense for PDEV API to support the pipeline query/discovery service and the run-time management of pipelines, same as P4RT. The pipeline creation service is very useful for the CPU SW target and probably for other targets where the application pipeline can be specified/constructed incrementally), it can be left unimplemented by the targets that only support a monolithic specification/creation mechanism.

Overall I’m very excited to see some work being done in this area. I believe a lot of people will be able to help, especially with compiler backend development. To summarize my 5 bullet points above, I would say that there are 2 import areas of investigation as far as I can tell:
1) what should be the compiler backend output for the DPDK CPU SW target (sequence of API calls)? For non-programmable devices, having the “right” P4Info is usually enough. Existing P4-programmable hardware already comes with its own compiler backend (Barefoot Tofino ASIC, Xilinx FPGAs).
[Cristian] See my comment above. Likely more work required here for you and me☺.

2) can we try to avoid fragmentation and re-use existing code with P4Runtime / PI / Stratum?
[Cristian] Yes, this is work that spans across multiple projects: dpdk.org (PDEV API and drivers), p4.org (DPDK back-end for P4 compiler), stratumproject.org (SDN controller adaptation layers).

Thanks,

Antonin


On Apr 18, 2018, at 10:22 AM, Dumitrescu, Cristian <cristian.dumitrescu@intel.com<mailto:cristian.dumitrescu@intel.com>> wrote:

P4 is a language for programming the data plane of network devices [1]. The P4
language is developed by p4.org<http://p4.org> which is joining ONF and Linux Foundation [2].

This API provides a way to program P4 capable devices through DPDK. The purpose
of this API is to enable P4 compilers [3] to generate high performance DPDK code
out of P4 programs.

The main advantage of this approach is that P4 enablement of network devices can
be done through DPDK in a unified way:

  1. This API serves as the interface between the P4 compiler front-end (target
     independent) and the P4 compiler backe-ends (target specific).

  2. Device vendors develop their device drivers as part of DPDK by
     implementing this API. The device driver is agostic of being called by the
     P4 front-end. The device driver serves as the P4 compiler taget specific
     back-end.

  3. The P4 compiler front-end is target independent. The amount of C code it
     generates is minimized by calling this API directly for every P4 feature
     as opposed to vendor-specific free-style C code generation.

This API introduces a pipeline device (PDEV) by using a similar approach to the
existing ethdev and eventdev DPDK device-like APIs implemented by the DPDK Poll
Mode Drivers (PMDs). Main features:

  1. Discovery of built-in pipeline devices and their capabilities.

  2. Creation of new pipelines out of input ports, output ports, tables and
     actions.

  3. Registration of packet protocol header and meta-data fields.

  4. Action definition for input ports, output ports and tables.

  5. Pipeline run-time API for table population, statistics read, etc.

This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
well as CPUs. Let’s remember that the first P in P4 stands for Programmable, and
the CPUs are arguably the most programmable devices. The implementation for the
CPU SW target is expected to use the DPDK Packet Framework libraries such as
librte_pipeline, librte_port, librte_table with some expected but moderate API
and implementation adjustments.

Links:

  [1] P4-16 language specification:
      https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf

  [2] p4.org<http://p4.org> to join ONF and LF: https://p4.org/p4/onward-and-upward.html

  [3] p4c: https://github.com/p4lang/p4c

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com<mailto:cristian.dumitrescu@intel.com>>
---
lib/librte_pipeline/rte_pdev.h        | 1654 +++++++++++++++++++++++++++++++++
lib/librte_pipeline/rte_pdev_driver.h |  283 ++++++
2 files changed, 1937 insertions(+)
create mode 100644 lib/librte_pipeline/rte_pdev.h
create mode 100644 lib/librte_pipeline/rte_pdev_driver.h




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] P4 enablement in DPDK
  2018-04-18 17:22 [dpdk-dev] [RFC] P4 enablement in DPDK Cristian Dumitrescu
  2018-04-19  5:04 ` Kuusisaari, Juhamatti
  2018-06-15 23:25 ` antonin
@ 2018-06-20  6:13 ` Jerin Jacob
  2018-06-20 11:56   ` Dumitrescu, Cristian
  2 siblings, 1 reply; 6+ messages in thread
From: Jerin Jacob @ 2018-06-20  6:13 UTC (permalink / raw)
  To: Cristian Dumitrescu; +Cc: dev, dan.daly

-----Original Message-----
> Date: Wed, 18 Apr 2018 18:22:01 +0100
> From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> To: dev@dpdk.org
> CC: dan.daly@intel.com
> Subject: [dpdk-dev] [RFC] P4 enablement in DPDK
> X-Mailer: git-send-email 2.7.4
> 
> P4 is a language for programming the data plane of network devices [1]. The P4
> language is developed by p4.org which is joining ONF and Linux Foundation [2].
> 
> This API provides a way to program P4 capable devices through DPDK. The purpose
> of this API is to enable P4 compilers [3] to generate high performance DPDK code
> out of P4 programs.
> 
> The main advantage of this approach is that P4 enablement of network devices can
> be done through DPDK in a unified way:
> 
>    1. This API serves as the interface between the P4 compiler front-end (target
>       independent) and the P4 compiler backe-ends (target specific).
> 
>    2. Device vendors develop their device drivers as part of DPDK by
>       implementing this API. The device driver is agostic of being called by the
>       P4 front-end. The device driver serves as the P4 compiler taget specific
>       back-end.
> 
>    3. The P4 compiler front-end is target independent. The amount of C code it
>       generates is minimized by calling this API directly for every P4 feature
>       as opposed to vendor-specific free-style C code generation.
> 
> This API introduces a pipeline device (PDEV) by using a similar approach to the
> existing ethdev and eventdev DPDK device-like APIs implemented by the DPDK Poll
> Mode Drivers (PMDs). Main features:
> 
>    1. Discovery of built-in pipeline devices and their capabilities.
> 
>    2. Creation of new pipelines out of input ports, output ports, tables and
>       actions.
> 
>    3. Registration of packet protocol header and meta-data fields.
> 
>    4. Action definition for input ports, output ports and tables.
> 
>    5. Pipeline run-time API for table population, statistics read, etc.
> 
> This API targets P4 capable devices such as NICs, FPGAs, NPUs, ASICs, etc, as
> well as CPUs. Let’s remember that the first P in P4 stands for Programmable, and
> the CPUs are arguably the most programmable devices. The implementation for the
> CPU SW target is expected to use the DPDK Packet Framework libraries such as
> librte_pipeline, librte_port, librte_table with some expected but moderate API
> and implementation adjustments.
> 
> Links:
> 
>    [1] P4-16 language specification:
>        https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.pdf
> 
>    [2] p4.org to join ONF and LF: https://p4.org/p4/onward-and-upward.html
> 
>    [3] p4c: https://github.com/p4lang/p4c
> 
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> ---
>  lib/librte_pipeline/rte_pdev.h        | 1654 +++++++++++++++++++++++++++++++++
>  lib/librte_pipeline/rte_pdev_driver.h |  283 ++++++

How about moving this as separate library(pipeline dev) with driver(plugin) interface
and librte_pipeline based API being used as one plugin/driver. This will enable us
to hook another HW based or HW-SW combination of plugins in future.

Eventdev has capability to create pipeline that can be add in future
as plugin/driver. The code to deal with rte_tm etc in plugin, you can
make it as common code in the library. So that new plugin can be created based eventdev
+ other building blocks such as rte_tm.


/Jerin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC] P4 enablement in DPDK
  2018-06-20  6:13 ` Jerin Jacob
@ 2018-06-20 11:56   ` Dumitrescu, Cristian
  0 siblings, 0 replies; 6+ messages in thread
From: Dumitrescu, Cristian @ 2018-06-20 11:56 UTC (permalink / raw)
  To: Jerin Jacob; +Cc: dev, Daly, Dan

<snip>

> 
> How about moving this as separate library(pipeline dev) with driver(plugin)
> interface
> and librte_pipeline based API being used as one plugin/driver. This will
> enable us
> to hook another HW based or HW-SW combination of plugins in future.

This is exactly what we are looking to do. The PDEV API is meant to be generic, while the current librte_pipeline API is limited to SW only. The intention is to transform the current librte_pipeline code into driver/plugin for the new PDEV API, with hopefully lots of other devices supporting PDEV through their own drivers.

> 
> Eventdev has capability to create pipeline that can be add in future
> as plugin/driver. The code to deal with rte_tm etc in plugin, you can
> make it as common code in the library. So that new plugin can be created
> based eventdev
> + other building blocks such as rte_tm.
> 

It would be great to be able to expose some/all of the eventdev features through PDEV API, let's look at this together.

> 
> /Jerin

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-06-20 11:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-18 17:22 [dpdk-dev] [RFC] P4 enablement in DPDK Cristian Dumitrescu
2018-04-19  5:04 ` Kuusisaari, Juhamatti
2018-06-15 23:25 ` antonin
2018-06-19 17:52   ` Dumitrescu, Cristian
2018-06-20  6:13 ` Jerin Jacob
2018-06-20 11:56   ` Dumitrescu, Cristian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).