DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC 0/3] introduce  Stateful Flow Table
@ 2020-09-09 20:30 Andrey Vesnovaty
  2020-09-09 20:30 ` [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT Andrey Vesnovaty
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-09-09 20:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, orika, viacheslavo, andrey.vesnovaty, ozsh, elibr, alexr, roniba

The RFC introduces Stateful Flow Table (SFT) API and changes needed in
both ethdev an RTE flow to support SFT functionality.

SFT library provides a framework for applications that need to maintain
context across different packets of the connection.

The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.

Andrey Vesnovaty (3):
  ethdev: add item/action for SFT
  ethdev: support SFT APIs
  sft: introduce API

 lib/librte_ethdev/rte_ethdev.c      |   7 +
 lib/librte_ethdev/rte_ethdev.h      |  16 +
 lib/librte_ethdev/rte_ethdev_core.h |   1 +
 lib/librte_ethdev/rte_flow.h        |  84 +++
 lib/librte_sft/Makefile             |  28 +
 lib/librte_sft/meson.build          |   7 +
 lib/librte_sft/rte_sft.c            |   9 +
 lib/librte_sft/rte_sft.h            | 845 ++++++++++++++++++++++++++++
 lib/librte_sft/rte_sft_driver.h     | 195 +++++++
 lib/librte_sft/rte_sft_version.map  |  21 +
 10 files changed, 1213 insertions(+)
 create mode 100644 lib/librte_sft/Makefile
 create mode 100644 lib/librte_sft/meson.build
 create mode 100644 lib/librte_sft/rte_sft.c
 create mode 100644 lib/librte_sft/rte_sft.h
 create mode 100644 lib/librte_sft/rte_sft_driver.h
 create mode 100644 lib/librte_sft/rte_sft_version.map

-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
@ 2020-09-09 20:30 ` Andrey Vesnovaty
  2020-09-16 15:46   ` Ori Kam
  2020-09-09 20:30 ` [dpdk-dev] [RFC 2/3] ethdev: support SFT APIs Andrey Vesnovaty
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-09-09 20:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, orika, viacheslavo, andrey.vesnovaty, ozsh, elibr, alexr,
	roniba, Ori Kam, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko

Attach SFT flow context to packet with SFT action.
Match on SFT flow context (attached to packet),
with SFT item.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
---
 lib/librte_ethdev/rte_flow.h | 84 ++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5489..24390e6ab4 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -537,6 +537,12 @@ enum rte_flow_item_type {
 	 */
 	RTE_FLOW_ITEM_TYPE_ECPRI,
 
+	/**
+	 * Matches SFT context (see fields of struct rte_flow_item_sft).
+	 *
+	 * See struct rte_flow_item_sft.
+	 */
+	RTE_FLOW_ITEM_TYPE_SFT,
 };
 
 /**
@@ -1579,6 +1585,54 @@ static const struct rte_flow_item_ecpri rte_flow_item_ecpri_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_SFT
+ *
+ * Matches context of flow in SFT table.
+ *
+ * 5-tuple: src/dest IP + src/dest port + IP protocol.
+ * zone: application defined value cupled with 5-tuple to identify flow,
+ * example - VxLAN, VLAN.
+ * SFT: Statfull flow table
+ * SFT in scope of ethernet device (port) is HW offloaded lookup table
+ * where key is zone + 5-tuple & value is statefull flow context.
+ * Contents of the SFT maintained by SFT PMD (see SFT PMD API in rte_sft).
+ *
+ * The structure describes SFT flow context.
+ * All the fields of the structure, except @p fid, should be considered as
+ * user defined.
+ * The @p fid assigned by RTE SFT & used as unique flow identifier.
+ * SFT context attached to packet by action ``SFT`` (see RTE_FLOW_ACTION_SFT).
+ *
+ * SFT default context defined as context attached to packet when there is no
+ * entry for the flow in SFT. The @p state has application reserved value
+ * meaning that SFT context for the packet undefined since entry wasn't found
+ * in SFT. If state 'undefined' then @p zone should be valid othervice @p fid
+ * should be valid.
+ *
+ * Context considered virtual since the method of storing this info on packet
+ * is PMD/implementation specific & may involve mapping methods if there is
+ * 'not enough bits' to store entire contents of struct rte_flow_item_sft.
+ *
+ * Maximal value/size of each field depends on HW capabilities and considered
+ * as implementation specific.
+ */
+struct rte_flow_item_sft {
+	union {
+		uint32_t fid; /**< SFT flow identifier. */
+		uint32_t zone; /**< Zone assigned to flow. */
+	};
+	uint8_t state; /**< User defined flow state. */
+	uint8_t fid_valid:1; /**< fid field validity bit. */
+	uint8_t zone_valid:1; /**< zone fieald validity bit. */
+	uint8_t state_valid:1; /**< state fieald validity bit. */
+	uint8_t user_data_size; /**< user_data buffer size. */
+	uint8_t *user_data; /**< Arbitrary user data. */
+};
+
 /**
  * Matching pattern item definition.
  *
@@ -2132,6 +2186,15 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * RTE_FLOW_ACTION_TYPE_SFT
+	 *
+	 * Set SFT context and redirect to continue processing.
+	 *
+	 * See struct rte_flow_action_sft.
+	 */
+	RTE_FLOW_ACTION_TYPE_SFT,
 };
 
 /**
@@ -2721,6 +2784,27 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
 	*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SFT
+ *
+ * Attaches an SFT context (see struct rte_flow_item_sft) to packet.
+ *
+ * Performs lookup by *zone* and 5-tuple in SFT; if entry found the related SFT
+ * context will be attached othervise default SFT context attached (see
+ * 'SFT default context' in struct rte_flow_item_sft description).
+ * Adding action of type ``SFT`` to the list of rule actions may impose
+ * limitations on other rule actions added to the list, depending on specific
+ * PMD implementation.
+ *
+ * For 5-tuple, zone & SFT definitions see `struct rte_flow_item_sft`.
+ */
+struct rte_flow_action_sft {
+	uint32_t zone; /**< Zone for lookup in SFT */
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC 2/3] ethdev: support SFT APIs
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
  2020-09-09 20:30 ` [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT Andrey Vesnovaty
@ 2020-09-09 20:30 ` Andrey Vesnovaty
  2020-09-09 20:30 ` [dpdk-dev] [RFC 3/3] sft: introduce API Andrey Vesnovaty
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-09-09 20:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, orika, viacheslavo, andrey.vesnovaty, ozsh, elibr, alexr,
	roniba, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko

ethdev updated to support SFT lookup offload
to ethernet device.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
---
 lib/librte_ethdev/rte_ethdev.c      |  7 +++++++
 lib/librte_ethdev/rte_ethdev.h      | 16 ++++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7858ad5f11..fcdcfcce6d 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -752,6 +752,13 @@ rte_eth_dev_get_sec_ctx(uint16_t port_id)
 	return rte_eth_devices[port_id].security_ctx;
 }
 
+void *
+rte_eth_dev_get_sft_ctx(uint16_t port_id)
+{
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
+	return rte_eth_devices[port_id].sft_ctx;
+}
+
 uint16_t
 rte_eth_dev_count_avail(void)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 70295d7ab7..83a71a8532 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1228,6 +1228,7 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
 #define DEV_RX_OFFLOAD_RSS_HASH		0x00080000
+#define DEV_RX_OFFLOAD_SFT		0x00100000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
@@ -4388,6 +4389,21 @@ rte_eth_dev_pool_ops_supported(uint16_t port_id, const char *pool);
 void *
 rte_eth_dev_get_sec_ctx(uint16_t port_id);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Get the SFT context for the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of the Ethernet device
+ * @return
+ *   - NULL on error.
+ *   - pointer to SFT context on success.
+ */
+void *
+rte_eth_dev_get_sft_ctx(uint16_t port_id);
+
 /**
  * @warning
  * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 32407dd418..4ff458bfe0 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -806,6 +806,7 @@ struct rte_eth_dev {
 	struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
 	enum rte_eth_dev_state state; /**< Flag indicating the port state */
 	void *security_ctx; /**< Context for security ops */
+	void *sft_ctx; /**< Context for SFT ops */
 
 	uint64_t reserved_64s[4]; /**< Reserved for future fields */
 	void *reserved_ptrs[4];   /**< Reserved for future fields */
-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC 3/3] sft: introduce API
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
  2020-09-09 20:30 ` [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT Andrey Vesnovaty
  2020-09-09 20:30 ` [dpdk-dev] [RFC 2/3] ethdev: support SFT APIs Andrey Vesnovaty
@ 2020-09-09 20:30 ` Andrey Vesnovaty
  2020-09-16 18:33   ` Ori Kam
  2020-09-18 13:34   ` Kinsella, Ray
  2020-09-15 11:59 ` [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-09-09 20:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, orika, viacheslavo, andrey.vesnovaty, ozsh, elibr, alexr,
	roniba, Ray Kinsella, Neil Horman

Defines RTE SFT APIs for Statefull Flow Table library.

SFT General description:
SFT library provides a framework for applications that need to maintain
context across different packets of the connection.
Examples for such applications:
- Next-generation firewalls
- Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
- SW/Virtual Switching: OVS
The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
---
 lib/librte_sft/Makefile            |  28 +
 lib/librte_sft/meson.build         |   7 +
 lib/librte_sft/rte_sft.c           |   9 +
 lib/librte_sft/rte_sft.h           | 845 +++++++++++++++++++++++++++++
 lib/librte_sft/rte_sft_driver.h    | 195 +++++++
 lib/librte_sft/rte_sft_version.map |  21 +
 6 files changed, 1105 insertions(+)
 create mode 100644 lib/librte_sft/Makefile
 create mode 100644 lib/librte_sft/meson.build
 create mode 100644 lib/librte_sft/rte_sft.c
 create mode 100644 lib/librte_sft/rte_sft.h
 create mode 100644 lib/librte_sft/rte_sft_driver.h
 create mode 100644 lib/librte_sft/rte_sft_version.map

diff --git a/lib/librte_sft/Makefile b/lib/librte_sft/Makefile
new file mode 100644
index 0000000000..23c6eee849
--- /dev/null
+++ b/lib/librte_sft/Makefile
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2020 Mellanox Technologies, Ltd
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_sft.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_sft.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft_driver.h
+
+# versioning export map
+EXPORT_MAP := rte_sft_version.map
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_sft/meson.build b/lib/librte_sft/meson.build
new file mode 100644
index 0000000000..b210e43f29
--- /dev/null
+++ b/lib/librte_sft/meson.build
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2020 Mellanox Technologies, Ltd
+
+sources = files('rte_sft.c')
+headers = files('rte_sft.h',
+	'rte_sft_driver.h')
+deps += ['mbuf']
diff --git a/lib/librte_sft/rte_sft.c b/lib/librte_sft/rte_sft.c
new file mode 100644
index 0000000000..f3d3945545
--- /dev/null
+++ b/lib/librte_sft/rte_sft.c
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+
+#include "rte_sft.h"
+#include "rte_sft_driver.h"
+
+/* Placeholder for RTE SFT library APIs implementation */
diff --git a/lib/librte_sft/rte_sft.h b/lib/librte_sft/rte_sft.h
new file mode 100644
index 0000000000..5c9f92ea9f
--- /dev/null
+++ b/lib/librte_sft/rte_sft.h
@@ -0,0 +1,845 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef _RTE_SFT_H_
+#define _RTE_SFT_H_
+
+/**
+ * @file
+ *
+ * RTE SFT API
+ *
+ * Defines RTE SFT APIs for Statefull Flow Table library.
+ *
+ * SFT General description:
+ * SFT library provides a framework for applications that need to maintain
+ * context across different packets of the connection.
+ * Examples for such applications:
+ * - Next-generation firewalls
+ * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
+ * - SW/Virtual Switching: OVS
+ * The goals of the SFT library:
+ * - Accelerate flow recognition & its context retrieval for further lookaside
+ *   processing.
+ * - Enable context-aware flow handling offload.
+ *
+ * Definitions and Abbreviations:
+ * - 5-tuple: defined by:
+ *     -- Source IP address
+ *     -- Source port
+ *     -- Destination IP address
+ *     -- Destination port
+ *     -- IP protocol number
+ * - 7-tuple: 5-tuple zone and port (see struct rte_sft_7tuple)
+ * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
+ * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
+ * - application: SFT library API consumer
+ * - APP: see application
+ * - CID: client ID
+ * - CT: connection tracking
+ * - FID: Flow identifier
+ * - FIF: First In Flow
+ * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
+ * - SFT: Stateful Flow Table
+ * - user: see application
+ * - zone: additional user defined value used as differentiator for
+ *         connections having same 5-tuple (for example different VxLan
+ *         connections with same inner 5-tuple).
+ *
+ * SFT components:
+ *
+ * +-----------------------------------+
+ * | RTE flow                          |
+ * |                                   |
+ * | +-------------------------------+ |  +----------------+
+ * | | group X                       | |  | RTE_SFT        |
+ * | |                               | |  |                |
+ * | | +---------------------------+ | |  |                |
+ * | | | rule ...                  | | |  |                |
+ * | | | .                         | | |  +-----------+----+
+ * | | | .                         | | |              |
+ * | | | .                         | | |          entry
+ * | | +---------------------------+ | |            create
+ * | | | rule                      | | |              |
+ * | | |   patterns ...            +---------+        |
+ * | | |   actions                 | | |     |        |
+ * | | |     SFT (zone=Z)          | | |     |        |
+ * | | |     JUMP (group=Y)        | | |  lookup      |
+ * | | +---------------------------+ | |    zone=Z,   |
+ * | | | rule ...                  | | |    5tuple    |
+ * | | | .                         | | |     |        |
+ * | | | .                         | | |  +--v-------------+
+ * | | | .                         | | |  | SFT       |    |
+ * | | |                           | | |  |           |    |
+ * | | +---------------------------+ | |  |        +--v--+ |
+ * | |                               | |  |        |     | |
+ * | +-------------------------------+ |  |        | PMD | |
+ * |                                   |  |        |     | |
+ * |                                   |  |        +-----+ |
+ * | +-------------------------------+ |  |                |
+ * | | group Y                       | |  |                |
+ * | |                               | |  | set flow CTX   |
+ * | | +---------------------------+ | |  |                |
+ * | | | rule                      | | |  +--------+-------+
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=UNDEFINED) | | |           |
+ * | | |   actions RSS             | | |           |
+ * | | +---------------------------+ | |           |
+ * | | | rule                      | | |           |
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=INVALID)   | <-------------+
+ * | | |   actions DROP            | | |  forward
+ * | | +---------------------------+ | |    group=Y
+ * | | | rule                      | | |
+ * | | |   patterns                | | |
+ * | | |     SFT (state=ACCEPTED)  | | |
+ * | | |   actions PORT            | | |
+ * | | +---------------------------+ | |
+ * | |  ...                          | |
+ * | |                               | |
+ * | +-------------------------------+ |
+ * |  ...                              |
+ * |                                   |
+ * +-----------------------------------+
+ *
+ * SFT as datastructure:
+ * SFT can be treated as datastructure maintaining flow context across its
+ * lifetime. SFT flow entry represent bidirectional network flow and defined by
+ * 7-tuple & its reverse 7-tuple.
+ * Each entry in SFT has:
+ * - FID: 1:1 mapped & used as entry handle & encapsulating internal
+ *   implementation of the entry.
+ * - State: user-defined value attached to each entry, the only library
+ *   reserved value for state unset (the actual value defined by SFT
+ *   configuration). The application should define flow state encodings and
+ *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
+ *   applied on packets can be defined via related RTE flow rule matching SFT
+ *   state (see rules in SFT components diagram above).
+ * - Timestamp: for the last seen in flow packet used for flow aging mechanism
+ *   implementation.
+ * - Client Objects: user-defined flow contexts attached as opaques to flow.
+ * - Acceleration & offloading - utilize RTE flow capabilities, when supported
+ *   (see action ``SFT``), for flow lookup acceleration and further
+ *   context-aware flow handling offload.
+ * - CT state: optionally for TCP connections CT state can be maintained
+ *   (see enum rte_sft_flow_ct_state).
+ * - Out of order TCP packets: optionally SFT can keep out of order TCP
+ *   packets aside the flow context till the arrival of the missing in-order
+ *   packet.
+ *
+ * RTE flow changes:
+ * The SFT flow state (or context) for RTE flow is defined by fields of
+ * struct rte_flow_item_sft.
+ * To utilize SFT capabilities new item and action types introduced:
+ * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
+ * - action SFT: retrieve SFT flow context and attache it to the processed
+ *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
+ *
+ * The contents of per port SFT serving RTE flow action ``SFT`` managed via
+ * SFT PMD APIs (see struct rte_sft_ops).
+ * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
+ * action argument and processed packet 5-tuple.
+ * If in scope of action ``SFT`` there is no context/state for the flow in SFT
+ * undefined sate attached to the packet meaning that the flow is not
+ * recognized by SFT, most probably FIF packet.
+ *
+ * Once the SFT state set for a packet it can match on item SFT
+ * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for the
+ * packet, for example:
+ * - if state value == x than queue for further processing by the application
+ * - if state value == y than forward it to eth port (full offload)
+ * - if state value == 'undefined' than queue for further processing by
+ *   the application (handle FIF packets)
+ *
+ * Processing packets with SFT library:
+ *
+ * FIF packet:
+ * To recognize upcoming packets of the SFT flow every FIF packet should be
+ * forwarded to the application utilizing the SFT library. Non-FIF packets can
+ * be processed by the application or its processing can be fully offloaded.
+ * Processing of the packets in SFT library starts with rte_sft_process_mbuf
+ * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
+ * should make a design to destroy flow or complete flow creation process in
+ * SFT using rte_sft_flow_activate.
+ *
+ * Recognized SFT flow:
+ * Once struct rte_sft_flow_status with valid fid field posesed by application
+ * it can:
+ * - mange client objects on it (see client_obj field in
+ *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
+ * - analyze user-defined flow state and CT state (see state & ct_sate fields
+ *   in struct rte_sft_flow_status).
+ * - set flow state to be attached to the upcoming packets by action ``SFT``
+ *   via struct rte_sft_flow_status API.
+ * - decide to destroy flow via rte_sft_flow_destroy API.
+ *
+ * Flow aging:
+ *
+ * SFT library manages the aging for each flow. On flow creation, it's
+ * assigned an aging value, the maximal number of seconds passed since the
+ * last flow packet arrived, once exceeded flow considered aged.
+ * The application notified of aged flow asynchronously via event queues.
+ * The device and port IDs tuple to identify the event queue to enqueue
+ * flow aged events passed on flow creation as arguments
+ * (see rte_sft_flow_activate). It's the application responsibility to
+ * initialize event queues and assign them to each flow for EOF event
+ * notifications.
+ * Aged EOF event handling:
+ * - Should be considered as application responsibility.
+ * - The last stage should be the release of the flow resources via
+ *    rte_sft_flow_destroy API.
+ * - All client objects should be removed from flow before the
+ *   rte_sft_flow_destroy API call.
+ * See the description of rete_sft_flow_destroy for an example of aged flow
+ * handling.
+ *
+ * SFT API thread safety:
+ *
+ * SFT library APIs are thread-safe while handling of specific flow can be
+ * done in a single thread simultaneously. Exclusive access to specific SFT
+ * flow guaranteed by:
+ * - rte_sft_process_mbuf
+ * - rte_sft_process_mbuf_with_zone
+ * - rte_sft_flow_create
+ * - rte_sft_flow_lock
+ * When application is done with the flow handling for the current packet it
+ * should call rte_sft_flow_unlock API to maintain exclusive access to the
+ * flow with other threads.
+ *
+ * SFT Library initialization and cleanup:
+ *
+ * SFT library should be considered as a single instance, preconfigured and
+ * initialized via rte_sft_init() API.
+ * SFT library resource deallocation and cleanup should be done via
+ * rte_sft_init() API as a stage of the application termination procedure.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+
+/**
+ * L3/L4 5-tuple - src/dest IP and port and IP protocol.
+ *
+ * Used for flow/connection identification.
+ */
+struct rte_sft_5tuple {
+	union {
+		struct {
+			rte_be32_t src_addr; /**< IPv4 source address. */
+			rte_be32_t dst_addr; /**< IPv4 destination address. */
+		} ipv4;
+		struct {
+			uint8_t src_addr[16]; /**< IPv6 source address. */
+			uint8_t dst_addr[16]; /**< IPv6 destination address. */
+		} ipv6;
+	};
+	uint16_t src_port; /**< Source port. */
+	uint16_t dst_port; /**< Destination port. */
+	uint8_t proto; /**< IP protocol. */
+	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
+};
+
+/**
+ * Port flow identification.
+ *
+ * @p zone used for setups where 5-tuple is not enough to identify flow.
+ * For example different VLANs/VXLANs may have similar 5-tuples.
+ */
+struct rte_sft_7tuple {
+	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
+	uint32_t zone; /**< Zone assigned to flow. */
+	uint16_t port_id; /** <Port identifier of Ethernet device. */
+};
+
+/**
+ * Flow connection tracking states
+ */
+enum rte_sft_flow_ct_state {
+	RTE_SFT_FLOW_CT_STATE_NEW  = (1 << 0),
+	RTE_SFT_FLOW_CT_STATE_EST  = (1 << 1),
+	RTE_SFT_FLOW_CT_STATE_REL  = (1 << 2),
+	RTE_SFT_FLOW_CT_STATE_RPL  = (1 << 3),
+	RTE_SFT_FLOW_CT_STATE_INV  = (1 << 4),
+	RTE_SFT_FLOW_CT_STATE_TRK  = (1 << 5),
+	RTE_SFT_FLOW_CT_STATE_SNAT = (1 << 6),
+	RTE_SFT_FLOW_CT_STATE_DNAT = (1 << 7),
+};
+
+/**
+ * Structure describes SFT library configuration
+ */
+struct rte_sft_conf {
+	uint32_t UDP_aging; /**< UDP proto default aging. */
+	uint32_t TCP_aging; /**< TCP proto default aging. */
+	uint32_t TCP_SYN_aging; /**< TCP SYN default aging. */
+	uint32_t OTHER_aging; /**< All unlisted proto default aging. */
+	uint32_t size; /**< Max entries in SFT. */
+	uint8_t undefined_state; /**< Undefined state constant. */
+	uint8_t reorder_enable: 1;
+	/**< TCP packet reordering feature enabled bit. */
+	uint8_t ct_enable: 1; /**< Connection tracking feature enabled bit. */
+};
+
+/**
+ * Structure describes the state of the flow in SFT.
+ */
+struct rte_sft_flow_status {
+	uint32_t fid; /**< SFT flow id. */
+	uint32_t zone; /**< Zone for lookup in SFT */
+	uint8_t state; /**< Application defined bidirectional flow state. */
+	uint8_t ct_state; /**< Connection tracking flow state. */
+	uint32_t age; /**< Seconds passed since last flown packet. */
+	uint32_t aging;
+	/**< Flow considered aged once this age (seconds) reached. */
+	uint32_t nb_in_order_mbufs;
+	/**< Number of in-order mbufs available for drain */
+	void **client_obj; /**< Array of clients attached to flow. */
+	int nb_clients; /**< Number of clients attached to flow. */
+	uint8_t defined: 1; /**< Flow defined in SFT bit. */
+	uint8_t activated: 1; /**< Flow activation bit. */
+	uint8_t fragmented: 1; /**< Last flow mbuf was fragmented. */
+	uint8_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get SFT flow status.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] status
+ *   Structure to dump actual SFT flow status.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_get_status(const uint32_t fid,
+			struct rte_sft_flow_status *status,
+			struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined context.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * Updates per ethernet dev SFT entries:
+ * - flow lookup acceleration
+ * - partial/full flow offloading managed by flow context
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param ctx
+ *   User defined state to set.
+ *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_set_ctx(uint32_t fid,
+		     const struct rte_flow_item_sft *ctx,
+		     struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Initialize SFT library instance.
+ *
+ * @param conf
+ *   SFT library instance configuration.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_init(const struct rte_sft_conf *conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Finalize SFT library instance.
+ * Cleanup & release allocated resources.
+ */
+void
+rte_sft_fini(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue.
+ *
+ * Fragmentation handling (SFT fragmentation feature configured):
+ * If *mbuf_in* of fragmented packet received it will be stored by SFT library.
+ * status->fragmented bit will be set and *mbuf_out* will be set to NULL.
+ * On reception of all related fragments of IP packet it will be reassembled
+ * and further processed by this function on reception of last fragment.
+ *
+ * Flow definition:
+ * SFT flow defined by one of its 7-tuples, since there is no zone value as
+ * argument flow should be defined by context attached to mbuf with action
+ * ``SFT`` (see RTE flow RTE_FLOW_ACTION_TYPE_SFT). Otherwise status->defined
+ * field will be turned off & *mbuf_out* will be set to *mbuf_in*.
+ * In order to define flow for *mbuf_in* without attached sft context
+ * rte_sft_process_mbuf_with_zone() should be used with *zone* argument
+ * supplied by caller.
+ *
+ * Flow lookup:
+ * If SFT flow identifier can't be retrieved from SFT context attached to
+ * *mbuf_in* by action ``SFT`` - SFT lookup should be performmed by zone,
+ * retrieved from SFT context attached to *mbuf_in*, and 5-tuple, extracted
+ * form mbuf outer header contents.
+ *
+ * Flow defined but does not exists:
+ * If flow not found in SFT inactivated flow will be created in SFT.
+ * status->activated field will be turned off & *mbuf_out* be set to *mbuf_in*.
+ * In order to activate created flow rte_sft_flow_activate() should be used
+ * with reverse 7-tuple supplied by caller.
+ * This is first phase of flow creation in SFT for second phase & more detailed
+ * descriotion of flow creation see rte_sft_flow_activate.
+ *
+ * Out of order (SFT out of oreder feature configured):
+ * If flow defined & activated but *mbuf_in* is TCP out of order packet it will
+ * be stored by SFT library. status->out_of_order bit will be set & *mbuf_out*
+ * will be set to NULL. On reception of the first missing in order packet
+ * status->nb_in_order_mbufs will be set to number of mbufs that available for
+ * processing with rte_sft_drain_mbuf().
+ *
+ * Flow defined & activated, mbuf not fragmented and 'in order':
+ * - Flow aging related data (see age field in `struct rte_sft_flow_status`)
+ *   will be updated according to *mbuf_in* timestamp.
+ * - Flow connection tracking state (see ct_state field in
+ *   `struct rte_sft_flow_status`)  will be updated according to *mbuf_in* L4
+ *   header contents.
+ * - *mbuf_out* will be set to last processed mbuf.
+ *
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pinter considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Structure to dump SFT flow status once updated according to contents of
+ *   *mbuf_in*.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success:
+ *   - *mbuf_out* contains valid mbuf pointer, locked SFT flow recognized by
+ *     status->fid.
+ *   - *mbuf_out* is NULL and status->fragmented bit on in case of
+ *     non last fragment *mbuf_in*.
+ *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
+ *     order *mbuf_in*, locked SFT flow recognized by status->fid.
+ *   On failure a negative errno value and rte_errno is set.
+ */
+int
+rte_sft_process_mbuf(struct rte_mbuf *mbuf_in,
+		     struct rte_mbuf **mbuf_out,
+		     struct rte_sft_flow_status *status,
+		     struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue while zone value provided by caller.
+ *
+ * The behaviour of this function is similar to rte_sft_process_mbuf except
+ * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
+ * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
+ *
+ * @see rte_sft_process_mbuf
+ *
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pinter considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Structure to dump SFT flow status once updated according to contents of
+ *   *mbuf_in*.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success:
+ *   - *mbuf_out* contains valid mbuf pointer.
+ *   - *mbuf_out* is NULL and status->fragmented bit on in case of
+ *     non last fragment *mbuf_in*.
+ *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
+ *     order *mbuf_in*.
+ *   On failure a negative errno value and rte_errno is set.
+ */
+int
+rte_sft_process_mbuf_with_zone(struct rte_mbuf *mbuf_in,
+			       uint32_t zone,
+			       struct rte_mbuf **mbuf_out,
+			       struct rte_sft_flow_status *status,
+			       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Drain next in order mbuf.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * This function behaves similar to rte_sft_process_mbuf() but acts on packets
+ * accumulated in SFT flow due to missing in order packet. Processing done on
+ * single mbuf at a time and `in order`. Other than above the behavior is
+ * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
+ * fragmented & 'in order'. This function should be called when
+ * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
+ * status->nb_in_order_mbufs output param !=0 and until
+ * status->nb_in_order_mbufs == 0.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] status
+ *   Structure to dump SFT flow status once updated according to contents of
+ *   *mbuf_in*.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid mbuf in case of success, NULL otherwise and rte_errno is set.
+ */
+struct rte_mbuf *
+rte_sft_drain_mbuf(uint32_t fid,
+		   struct rte_sft_flow_status *status,
+		   struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Activate flow in SFT.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * This function performs second phase of flow creation in SFT.
+ * The reasons for 2 phase flow creation procedure:
+ * 1. Missing reverse flow - flow context is shared for both flow directions
+ *    i.e. in order maintain bidirectional flow context in RTE SFT packets
+ *    arriving from both dirrections should be identified as packets of the
+ *    RTE SFT flow. Consequently before creation of the SFT flow caller should
+ *    provide reverse flow direction 7-tuple.
+ * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone should
+ *   be notified that arrived mbuf is first in flow & decide weather to
+ *   create new flow or it distroy before it was activated with
+ *   rte_sft_flow_destroy.
+ * This function completes creation of the bidirectional SFT flow & creates
+ * entry for 7-tuple on SFT PMD defined by the tuple port for both
+ * initiator/initiate 7-tuples.
+ * Flow aging, connection tracking state & out of order handling will be
+ * initialized according to the content of the *mbuf_in* passes to
+ * rte_sft_process_mbuf/_with_zone during the phase 1 of flow creation.
+ * Once this function returns upcoming calls rte_sft_process_mbuf/_with_zone
+ * with 7-tuple or its reverse will return handle to this flow.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param reverse_tuple
+ *   Expected response flow 7-tuple.
+ * @param ctx
+ *   User defined state to set.
+ *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
+ * @param ct_enable
+ *   Enables maintenance of status->ct_state connection tracking value for the
+ *   flow; otherwise status->ct_state will be initialized with zeros.
+ * @param evdev_id
+ *   Event dev ID to enqueue end of flow event.
+ * @param evport_id
+ *   Event port ID to enqueue end of flow event.
+ * @param[out] status
+ *   Structure to dump SFT flow status once activated.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_activate(uint32_t fid,
+		      const struct rte_sft_7tuple *reverse_tuple,
+		      const struct rte_flow_item_sft *ctx,
+		      uint8_t ct_enable,
+		      uint8_t dev_id,
+		      uint8_t port_id,
+		      struct rte_sft_flow_status *status,
+		      struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Artificially create SFT flow.
+ *
+ * Function to create SFT flow before reception of the first flow packet.
+ *
+ * @param tuple
+ *   Expected initiator flow 7-tuple.
+ * @param reverse_tuple
+ *   Expected initiate flow 7-tuple.
+ * @param ctx
+ *   User defined state to set.
+ *   Setting of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
+ * @param[out] ct_enable
+ *   Enables maintenance of status->ct_state connection tracking value for the
+ *   flow; otherwise status->ct_state will be initialized with zeros.
+ * @param[out] status
+ *   Structure to dump SFT flow status once created.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   - on success: 0, locked SFT flow recognized by status->fid.
+ *   - on error: a negative errno value otherwise and rte_errno is set.
+ */
+
+int
+rte_sft_flow_create(const struct rte_sft_7tuple *tuple,
+		    const struct rte_sft_7tuple *reverse_tuple,
+		    const struct rte_flow_item_sft *ctx,
+		    uint8_t ct_enable,
+		    struct rte_sft_flow_status *status,
+		    struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Lock exclusively SFT flow.
+ *
+ * Explicit flow locking; used for handling aged flows.
+ *
+ * @param fid
+ *   SFT flow ID.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_lock(uint32_t fid);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Release exclusively locked SFT flow.
+ *
+ * When rte_sft_process_mbuf/_with_zone and rte_sft_flow_create
+ * return *status* containing fid with defined bit on the flow considered
+ * exclusively locked and should be unlocked with this function.
+ *
+ * @param fid
+ *   SFT flow ID.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_unlock(uint32_t fid);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Removes flow from SFT.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * - Flow should be locked by caller in order to remove it.
+ * - Flow should have no client objects attached.
+ *
+ * Should be applied on aged flows, when flow aged event received.
+ *
+ * @code{.c}
+ *     while (1) {
+ *         rte_event_dequeue_burst(...);
+ *         FOR_EACH_EV(ev) {
+ *             uint32_t fid = ev.u64;
+ *             rte_sft_flow_lock(fid);
+ *             FOR_EACH_CLIENT(fid, client_id) {
+ *                 rte_sft_flow_reset_client_obj(fid, client_obj);
+ *                 // detached client object handling
+ *             }
+ *             rte_sft_flow_destroy(fid, &error);
+ *         }
+ *     }
+ * @endcode
+ *
+ * @param fid
+ *   SFT flow ID to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_destroy(uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset flow age to zero.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * Simulates last flow packet with timestamp set to just now.
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_touch(uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set flow aging to specific value.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param aging
+ *   New flow aging value.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_set_aging(uint32_t fid,
+		       uint32_t aging,
+		       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set client object for given client ID.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to set object for.
+ * @param client_obj
+ *   Pointer to opaque client object structure.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_sft_flow_set_client_obj(uint32_t fid,
+			    uint8_t client_id,
+			    void *client_obj,
+			    struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get client object for given client ID.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to get object for.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid client object opaque pointer in case of success, NULL otherwise
+ *   and rte_errno is set.
+ */
+void *
+rte_sft_flow_get_client_obj(const uint32_t fid,
+			    uint8_t client_id,
+			    struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Remove client object for given client ID.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * Detaches client object from SFT flow and returns the ownership for the
+ * client object to the caller by returning client object pointer value.
+ * The pointer returned by this function won't be accessed any more, the caller
+ * may release all client obj related resources & the memory allocated for
+ * this client object.
+ *
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to remove object for.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid client object opaque pointer in case of success, NULL otherwise
+ *   and rte_errno is set.
+ */
+void *
+rte_sft_flow_reset_client_obj(uint32_t fid,
+			      uint8_t client_id,
+			      struct rte_sft_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SFT_H_ */
diff --git a/lib/librte_sft/rte_sft_driver.h b/lib/librte_sft/rte_sft_driver.h
new file mode 100644
index 0000000000..0c9e28fe17
--- /dev/null
+++ b/lib/librte_sft/rte_sft_driver.h
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef _RTE_SFT_DRIVER_H_
+#define _RTE_SFT_DRIVER_H_
+
+/**
+ * @file
+ *
+ * RTE SFT Ethernet device PMD API
+ *
+ * APIs that are used by the SFT library to offload SFT operationons
+ * to Ethernet device.
+ */
+
+#include "rte_sft.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Opaque type returned after successfully creating an entry in SFT.
+ *
+ * This handle can be used to manage and query the related entry (e.g. to
+ * destroy it or update age).
+ */
+struct rte_sft_entry;
+
+/**
+ * Create SFT entry in eth_dev SFT.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param tuple
+ *   L3/L4 5-tuple - src/dest IP and port and IP protocol.
+ * @param nat_tuple
+ *   L3/L4 5-tuple to replace in packet original 5-tuple in order to implement
+ *   NAT offloading; if NULL NAT offloading won't be configured for the flow.
+ * @param aging
+ *   Flow aging timeout in seconds.
+ * @param ctx
+ *   Initial values in SFT flow context
+ *   (see RTE flow struct rte_flow_item_sft).
+ *   ctx->zone should be valid.
+ * @param fid
+ *   SFT flow ID for the entry to create on *device*.
+ *   If there is an entry for the *fid* in PMD it will be updated with the
+ *   values of *ctx*.
+ * @param[out] queue_index
+ *   if PMD can figure out the queue where the flow packets will
+ *   arrive in RX data path it will set the value of queue_index; otherwise
+ *   all bits will be turned on.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set.
+ */
+typedef struct rte_sft_entry *(*sft_entry_create_t) (struct rte_eth_dev *dev,
+		const struct rte_sft_5tuple *tuple,
+		const struct rte_sft_5tuple *nat_tuple,
+		const uint32_t aging,
+		const struct rte_flow_item_sft *ctx,
+		const uint32_t fid,
+		uint16_t *queue_index,
+		struct rte_sft_error *error);
+
+/**
+ * Destroy SFT entry in eth_dev SFT.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param entry
+ *   Handle to the SFT entry to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+typedef int (*sft_entry_destroy_t)(struct rte_eth_dev *dev,
+		struct rte_sft_entry *entry,
+		struct rte_sft_error *error);
+
+/**
+ * Decodes SFT flow context if attached to mbuf by action ``SFT``.
+ * @see RTE flow RTE_FLOW_ACTION_TYPE_SFT.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mbuf
+ *   mbuf of the packet to decode attached state from.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid SFT flow context in case of success, NULL otherwise and rte_errno
+ *   is set.
+ */
+typedef struct rte_flow_item_sft *(*sft_entry_mbuf_decode_ctx_t)(
+		struct rte_eth_dev *dev,
+		const struct rte_mbuf *mbuf,
+		struct rte_sft_error *error);
+
+/**
+ * Get aged-out SFT entries.
+ *
+ * Report entry as aged-out if timeout passed without any matching
+ * on the SFT entry.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device structure.
+ * @param[in, out] fid_aged
+ *   The address of an array of aged-out SFT flow IDs.
+ * @param[in] nb_aged
+ *   The length of *fid_aged* array pointers.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialized in case of
+ *   error only.
+ *
+ * @return
+ *   if nb_aged is 0, return the amount of all aged flows.
+ *   if nb_aged is not 0 , return the amount of aged flows reported
+ *   in the *fid_aged* array, otherwise negative errno value.
+ */
+typedef int (*sft_entry_get_aged_entries_t)(struct rte_eth_dev *dev,
+		uint32_t *fid_aged,
+		int nb_aged,
+		struct rte_sft_error *error);
+
+/**
+ * Simulate SFT entry match in terms of entry aging.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param fid
+ *   SFT flow ID paired with dev to retrieve related SFT entry.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+typedef int (*sft_entry_touch_t)(struct rte_eth_dev *dev,
+		uint32_t fid,
+		struct rte_sft_error *error);
+
+/**
+ * Set SFT entry aging to specific value.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param fid
+ *   SFT flow ID paired with dev to retrieve related SFT entry.
+ * @param aging
+ *   New entry aging value.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+typedef int (*sft_entry_set_aging_t)(struct rte_eth_dev *dev,
+		uint32_t fid,
+		uint32_t aging,
+		struct rte_sft_error *error);
+
+/** SFT operations function pointer table */
+struct rte_sft_ops {
+	sft_entry_create_t entry_create;
+	/**< Create SFT entry in eth_dev SFT. */
+	sft_entry_destroy_t entry_destroy;
+	/**< Destroy SFT entry in eth_dev SFT. */
+	sft_entry_mbuf_decode_ctx_t mbuf_decode_ctx;
+	/**< Decodes SFT flow context if attached to mbuf by action ``SFT``. */
+	sft_entry_get_aged_entries_t get_aged_entries;
+	/**< Get aged-out SFT entries. */
+	sft_entry_touch_t entry_touch;
+	/**< Simulate SFT entry match in terms of entry aging. */
+	sft_entry_set_aging_t set_aging;
+	/**< Set SFT entry aging to specific value. */
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SFT_DRIVER_H_ */
diff --git a/lib/librte_sft/rte_sft_version.map b/lib/librte_sft/rte_sft_version.map
new file mode 100644
index 0000000000..747e100ac5
--- /dev/null
+++ b/lib/librte_sft/rte_sft_version.map
@@ -0,0 +1,21 @@
+EXPERIMENTAL {
+	global:
+
+	rte_sft_flow_get_status;
+	rte_sft_flow_set_ctx;
+	rte_sft_init;
+	rte_sft_fini;
+	rte_sft_process_mbuf;
+	rte_sft_process_mbuf_with_zone;
+	rte_sft_drain_mbuf;
+	rte_sft_flow_activate;
+	rte_sft_flow_create;
+	rte_sft_flow_lock;
+	rte_sft_flow_unlock;
+	rte_sft_flow_destroy;
+	rte_sft_flow_touch;
+	rte_sft_flow_set_aging;
+	rte_sft_flow_set_client_obj;
+	rte_sft_flow_get_client_obj;
+	rte_sft_flow_reset_client_obj;
+};
-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 0/3] introduce  Stateful Flow Table
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
                   ` (2 preceding siblings ...)
  2020-09-09 20:30 ` [dpdk-dev] [RFC 3/3] sft: introduce API Andrey Vesnovaty
@ 2020-09-15 11:59 ` Andrey Vesnovaty
  2020-11-04 12:59 ` [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table Ori Kam
  2020-11-04 13:17 ` [dpdk-dev] [RFC v3 0/2] introduce stateful flow table Ori Kam
  5 siblings, 0 replies; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-09-15 11:59 UTC (permalink / raw)
  To: dev
  Cc: thomas, Ori Kam, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ferruh Yigit,
	Andrew Rybchenko

+ Ferruh & Andrew.
Adding more people that may find this discussion relevant.
Any feedback highly appreciated.

Thanks,
Andrey

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrey Vesnovaty
> Sent: Wednesday, September 9, 2020 11:30 PM
> To: dev@dpdk.org
> Cc: thomas@nvidia.net; Ori Kam <orika@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; andrey.vesnovaty@gmail.com; Oz Shlomo
> <ozsh@nvidia.com>; Eli Britstein <elibr@nvidia.com>; Alex Rosenbaum
> <alexr@nvidia.com>; Roni Bar Yanai <roniba@nvidia.com>
> Subject: [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table
> 
> The RFC introduces Stateful Flow Table (SFT) API and changes needed in
> both ethdev an RTE flow to support SFT functionality.
> 
> SFT library provides a framework for applications that need to maintain
> context across different packets of the connection.
> 
> The goals of the SFT library:
> - Accelerate flow recognition & its context retrieval for further
>   lookaside processing.
> - Enable context-aware flow handling offload.
> 
> Andrey Vesnovaty (3):
>   ethdev: add item/action for SFT
>   ethdev: support SFT APIs
>   sft: introduce API
> 
>  lib/librte_ethdev/rte_ethdev.c      |   7 +
>  lib/librte_ethdev/rte_ethdev.h      |  16 +
>  lib/librte_ethdev/rte_ethdev_core.h |   1 +
>  lib/librte_ethdev/rte_flow.h        |  84 +++
>  lib/librte_sft/Makefile             |  28 +
>  lib/librte_sft/meson.build          |   7 +
>  lib/librte_sft/rte_sft.c            |   9 +
>  lib/librte_sft/rte_sft.h            | 845 ++++++++++++++++++++++++++++
>  lib/librte_sft/rte_sft_driver.h     | 195 +++++++
>  lib/librte_sft/rte_sft_version.map  |  21 +
>  10 files changed, 1213 insertions(+)
>  create mode 100644 lib/librte_sft/Makefile
>  create mode 100644 lib/librte_sft/meson.build
>  create mode 100644 lib/librte_sft/rte_sft.c
>  create mode 100644 lib/librte_sft/rte_sft.h
>  create mode 100644 lib/librte_sft/rte_sft_driver.h
>  create mode 100644 lib/librte_sft/rte_sft_version.map
> 
> --
> 2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT
  2020-09-09 20:30 ` [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT Andrey Vesnovaty
@ 2020-09-16 15:46   ` Ori Kam
  2020-09-18  7:04     ` Andrew Rybchenko
  0 siblings, 1 reply; 17+ messages in thread
From: Ori Kam @ 2020-09-16 15:46 UTC (permalink / raw)
  To: Andrey Vesnovaty, dev
  Cc: thomas, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ori Kam,
	NBU-Contact-Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko

Hi Andrey,

PSB

> -----Original Message-----
> From: Andrey Vesnovaty <andreyv@nvidia.com>
> Sent: Wednesday, September 9, 2020 11:30 PM
> 
> Attach SFT flow context to packet with SFT action.
> Match on SFT flow context (attached to packet),
> with SFT item.
> 
> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
> ---
>  lib/librte_ethdev/rte_flow.h | 84 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 84 insertions(+)
> 
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index da8bfa5489..24390e6ab4 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -537,6 +537,12 @@ enum rte_flow_item_type {
>  	 */
>  	RTE_FLOW_ITEM_TYPE_ECPRI,
> 
> +	/**
You are missing the Meta, tag not relevant for RFC but please notice for the patch.

> +	 * Matches SFT context (see fields of struct rte_flow_item_sft).
> +	 *
> +	 * See struct rte_flow_item_sft.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_SFT,
>  };
> 
>  /**
> @@ -1579,6 +1585,54 @@ static const struct rte_flow_item_ecpri
> rte_flow_item_ecpri_mask = {
>  };
>  #endif
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_SFT
> + *
> + * Matches context of flow in SFT table.
> + *
> + * 5-tuple: src/dest IP + src/dest port + IP protocol.
> + * zone: application defined value cupled with 5-tuple to identify flow,
> + * example - VxLAN, VLAN.
> + * SFT: Statfull flow table
> + * SFT in scope of ethernet device (port) is HW offloaded lookup table
> + * where key is zone + 5-tuple & value is statefull flow context.
> + * Contents of the SFT maintained by SFT PMD (see SFT PMD API in rte_sft).
> + *
> + * The structure describes SFT flow context.
> + * All the fields of the structure, except @p fid, should be considered as
> + * user defined.
> + * The @p fid assigned by RTE SFT & used as unique flow identifier.
> + * SFT context attached to packet by action ``SFT`` (see
> RTE_FLOW_ACTION_SFT).
> + *
> + * SFT default context defined as context attached to packet when there is no
> + * entry for the flow in SFT. The @p state has application reserved value
> + * meaning that SFT context for the packet undefined since entry wasn't found
> + * in SFT. If state 'undefined' then @p zone should be valid othervice @p fid
> + * should be valid.
> + *
> + * Context considered virtual since the method of storing this info on packet
> + * is PMD/implementation specific & may involve mapping methods if there is
> + * 'not enough bits' to store entire contents of struct rte_flow_item_sft.
> + *
> + * Maximal value/size of each field depends on HW capabilities and
> considered
> + * as implementation specific.
> + */
> +struct rte_flow_item_sft {
> +	union {
> +		uint32_t fid; /**< SFT flow identifier. */
> +		uint32_t zone; /**< Zone assigned to flow. */
> +	};
> +	uint8_t state; /**< User defined flow state. */
> +	uint8_t fid_valid:1; /**< fid field validity bit. */
> +	uint8_t zone_valid:1; /**< zone fieald validity bit. */
> +	uint8_t state_valid:1; /**< state fieald validity bit. */
> +	uint8_t user_data_size; /**< user_data buffer size. */
> +	uint8_t *user_data; /**< Arbitrary user data. */
> +};
> +
This object is only used to match and not set so
why do we need the union? I understand that later when reporting to the SFT in the application layer
sometimes you will get zone while other time you will get fid.
From rte flow you are matching on given object which is 32 bit.
What are the matchable  fields? (fid / zone / user_data / fid_valid ... )
Do you think that some of the times the match will be on he fid other on the zone?
If so they should not be union.
I think zone is the responsibility of the application to save and to match. So I don't see why it is
needed here.

>  /**
>   * Matching pattern item definition.
>   *
> @@ -2132,6 +2186,15 @@ enum rte_flow_action_type {
>  	 * see enum RTE_ETH_EVENT_FLOW_AGED
>  	 */
>  	RTE_FLOW_ACTION_TYPE_AGE,
> +
> +	/**
> +	 * RTE_FLOW_ACTION_TYPE_SFT
> +	 *
> +	 * Set SFT context and redirect to continue processing.
> +	 *
> +	 * See struct rte_flow_action_sft.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_SFT,
>  };
> 
>  /**
> @@ -2721,6 +2784,27 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m,
> uint32_t v)
>  	*RTE_FLOW_DYNF_METADATA(m) = v;
>  }
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_SFT
> + *
> + * Attaches an SFT context (see struct rte_flow_item_sft) to packet.
> + *
> + * Performs lookup by *zone* and 5-tuple in SFT; if entry found the related SFT
> + * context will be attached othervise default SFT context attached (see
> + * 'SFT default context' in struct rte_flow_item_sft description).
> + * Adding action of type ``SFT`` to the list of rule actions may impose
> + * limitations on other rule actions added to the list, depending on specific
> + * PMD implementation.
> + *
> + * For 5-tuple, zone & SFT definitions see `struct rte_flow_item_sft`.
> + */
> +struct rte_flow_action_sft {
> +	uint32_t zone; /**< Zone for lookup in SFT */
> +};
> +
>  /*
>   * Definition of a single action.
>   *
> --
> 2.26.2

Thanks,
Ori


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 3/3] sft: introduce API
  2020-09-09 20:30 ` [dpdk-dev] [RFC 3/3] sft: introduce API Andrey Vesnovaty
@ 2020-09-16 18:33   ` Ori Kam
  2020-09-18  7:43     ` Andrew Rybchenko
  2020-09-18 13:34   ` Kinsella, Ray
  1 sibling, 1 reply; 17+ messages in thread
From: Ori Kam @ 2020-09-16 18:33 UTC (permalink / raw)
  To: Andrey Vesnovaty, dev
  Cc: thomas, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko

Hi Andery,
PSB

> -----Original Message-----
> From: Andrey Vesnovaty <andreyv@nvidia.com>
> Sent: Wednesday, September 9, 2020 11:30 PM
> To: dev@dpdk.org
> Subject: [RFC 3/3] sft: introduce API
> 
> Defines RTE SFT APIs for Statefull Flow Table library.
> 
> SFT General description:
> SFT library provides a framework for applications that need to maintain
> context across different packets of the connection.
> Examples for such applications:
> - Next-generation firewalls
> - Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
> - SW/Virtual Switching: OVS
> The goals of the SFT library:
> - Accelerate flow recognition & its context retrieval for further
>   lookaside processing.
> - Enable context-aware flow handling offload.
> 
> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
> ---
>  lib/librte_sft/Makefile            |  28 +
>  lib/librte_sft/meson.build         |   7 +
>  lib/librte_sft/rte_sft.c           |   9 +
>  lib/librte_sft/rte_sft.h           | 845 +++++++++++++++++++++++++++++
>  lib/librte_sft/rte_sft_driver.h    | 195 +++++++
>  lib/librte_sft/rte_sft_version.map |  21 +
>  6 files changed, 1105 insertions(+)
>  create mode 100644 lib/librte_sft/Makefile
>  create mode 100644 lib/librte_sft/meson.build
>  create mode 100644 lib/librte_sft/rte_sft.c
>  create mode 100644 lib/librte_sft/rte_sft.h
>  create mode 100644 lib/librte_sft/rte_sft_driver.h
>  create mode 100644 lib/librte_sft/rte_sft_version.map
> 
> diff --git a/lib/librte_sft/Makefile b/lib/librte_sft/Makefile
> new file mode 100644
> index 0000000000..23c6eee849
> --- /dev/null
> +++ b/lib/librte_sft/Makefile
> @@ -0,0 +1,28 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2020 Mellanox Technologies, Ltd
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_sft.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf
> +
> +# library source files
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_sft.c
> +
> +# export include files
> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft_driver.h
> +
> +# versioning export map
> +EXPORT_MAP := rte_sft_version.map
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_sft/meson.build b/lib/librte_sft/meson.build
> new file mode 100644
> index 0000000000..b210e43f29
> --- /dev/null
> +++ b/lib/librte_sft/meson.build
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2020 Mellanox Technologies, Ltd
> +
> +sources = files('rte_sft.c')
> +headers = files('rte_sft.h',
> +	'rte_sft_driver.h')
> +deps += ['mbuf']
> diff --git a/lib/librte_sft/rte_sft.c b/lib/librte_sft/rte_sft.c
> new file mode 100644
> index 0000000000..f3d3945545
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft.c
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +
> +#include "rte_sft.h"
> +#include "rte_sft_driver.h"
> +
> +/* Placeholder for RTE SFT library APIs implementation */
> diff --git a/lib/librte_sft/rte_sft.h b/lib/librte_sft/rte_sft.h
> new file mode 100644
> index 0000000000..5c9f92ea9f
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft.h
> @@ -0,0 +1,845 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +#ifndef _RTE_SFT_H_
> +#define _RTE_SFT_H_
> +
> +/**
> + * @file
> + *
> + * RTE SFT API
> + *
> + * Defines RTE SFT APIs for Statefull Flow Table library.
> + *
> + * SFT General description:
> + * SFT library provides a framework for applications that need to maintain
> + * context across different packets of the connection.
> + * Examples for such applications:
> + * - Next-generation firewalls
> + * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
> + * - SW/Virtual Switching: OVS
> + * The goals of the SFT library:
> + * - Accelerate flow recognition & its context retrieval for further lookaside
> + *   processing.
> + * - Enable context-aware flow handling offload.
> + *
> + * Definitions and Abbreviations:
> + * - 5-tuple: defined by:
> + *     -- Source IP address
> + *     -- Source port
> + *     -- Destination IP address
> + *     -- Destination port
> + *     -- IP protocol number
> + * - 7-tuple: 5-tuple zone and port (see struct rte_sft_7tuple)
> + * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
> + * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
> + * - application: SFT library API consumer
> + * - APP: see application
> + * - CID: client ID
> + * - CT: connection tracking
> + * - FID: Flow identifier
> + * - FIF: First In Flow
> + * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
> + * - SFT: Stateful Flow Table
> + * - user: see application
> + * - zone: additional user defined value used as differentiator for
> + *         connections having same 5-tuple (for example different VxLan
> + *         connections with same inner 5-tuple).
> + *
> + * SFT components:
> + *
> + * +-----------------------------------+
> + * | RTE flow                          |
> + * |                                   |
> + * | +-------------------------------+ |  +----------------+
> + * | | group X                       | |  | RTE_SFT        |
> + * | |                               | |  |                |
> + * | | +---------------------------+ | |  |                |
> + * | | | rule ...                  | | |  |                |
> + * | | | .                         | | |  +-----------+----+
> + * | | | .                         | | |              |
> + * | | | .                         | | |          entry
> + * | | +---------------------------+ | |            create
> + * | | | rule                      | | |              |
> + * | | |   patterns ...            +---------+        |
> + * | | |   actions                 | | |     |        |
> + * | | |     SFT (zone=Z)          | | |     |        |
> + * | | |     JUMP (group=Y)        | | |  lookup      |
> + * | | +---------------------------+ | |    zone=Z,   |
> + * | | | rule ...                  | | |    5tuple    |
> + * | | | .                         | | |     |        |
> + * | | | .                         | | |  +--v-------------+
> + * | | | .                         | | |  | SFT       |    |
> + * | | |                           | | |  |           |    |
> + * | | +---------------------------+ | |  |        +--v--+ |
> + * | |                               | |  |        |     | |
> + * | +-------------------------------+ |  |        | PMD | |
> + * |                                   |  |        |     | |
> + * |                                   |  |        +-----+ |
> + * | +-------------------------------+ |  |                |
> + * | | group Y                       | |  |                |
> + * | |                               | |  | set flow CTX   |
> + * | | +---------------------------+ | |  |                |
> + * | | | rule                      | | |  +--------+-------+
> + * | | |   patterns                | | |           |
> + * | | |     SFT (state=UNDEFINED) | | |           |
> + * | | |   actions RSS             | | |           |
> + * | | +---------------------------+ | |           |
> + * | | | rule                      | | |           |
> + * | | |   patterns                | | |           |
> + * | | |     SFT (state=INVALID)   | <-------------+
> + * | | |   actions DROP            | | |  forward
> + * | | +---------------------------+ | |    group=Y
> + * | | | rule                      | | |
> + * | | |   patterns                | | |
> + * | | |     SFT (state=ACCEPTED)  | | |
> + * | | |   actions PORT            | | |
> + * | | +---------------------------+ | |
> + * | |  ...                          | |
> + * | |                               | |
> + * | +-------------------------------+ |
> + * |  ...                              |
> + * |                                   |
> + * +-----------------------------------+
> + *
> + * SFT as datastructure:
> + * SFT can be treated as datastructure maintaining flow context across its
> + * lifetime. SFT flow entry represent bidirectional network flow and defined by
> + * 7-tuple & its reverse 7-tuple.
> + * Each entry in SFT has:
> + * - FID: 1:1 mapped & used as entry handle & encapsulating internal
> + *   implementation of the entry.
> + * - State: user-defined value attached to each entry, the only library
> + *   reserved value for state unset (the actual value defined by SFT
> + *   configuration). The application should define flow state encodings and
> + *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
> + *   applied on packets can be defined via related RTE flow rule matching SFT
> + *   state (see rules in SFT components diagram above).
> + * - Timestamp: for the last seen in flow packet used for flow aging
> mechanism
> + *   implementation.
> + * - Client Objects: user-defined flow contexts attached as opaques to flow.
> + * - Acceleration & offloading - utilize RTE flow capabilities, when supported
> + *   (see action ``SFT``), for flow lookup acceleration and further
> + *   context-aware flow handling offload.
> + * - CT state: optionally for TCP connections CT state can be maintained
> + *   (see enum rte_sft_flow_ct_state).
> + * - Out of order TCP packets: optionally SFT can keep out of order TCP
> + *   packets aside the flow context till the arrival of the missing in-order
> + *   packet.
> + *
> + * RTE flow changes:
> + * The SFT flow state (or context) for RTE flow is defined by fields of
> + * struct rte_flow_item_sft.
> + * To utilize SFT capabilities new item and action types introduced:
> + * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
> + * - action SFT: retrieve SFT flow context and attache it to the processed
> + *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
> + *
> + * The contents of per port SFT serving RTE flow action ``SFT`` managed via
> + * SFT PMD APIs (see struct rte_sft_ops).
> + * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
> + * action argument and processed packet 5-tuple.
> + * If in scope of action ``SFT`` there is no context/state for the flow in SFT
> + * undefined sate attached to the packet meaning that the flow is not
> + * recognized by SFT, most probably FIF packet.
> + *
> + * Once the SFT state set for a packet it can match on item SFT
> + * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for
> the
> + * packet, for example:
> + * - if state value == x than queue for further processing by the application
> + * - if state value == y than forward it to eth port (full offload)
> + * - if state value == 'undefined' than queue for further processing by
> + *   the application (handle FIF packets)
> + *
> + * Processing packets with SFT library:
> + *
> + * FIF packet:
> + * To recognize upcoming packets of the SFT flow every FIF packet should be
> + * forwarded to the application utilizing the SFT library. Non-FIF packets can
> + * be processed by the application or its processing can be fully offloaded.
> + * Processing of the packets in SFT library starts with rte_sft_process_mbuf
> + * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
> + * should make a design to destroy flow or complete flow creation process in
> + * SFT using rte_sft_flow_activate.
> + *
> + * Recognized SFT flow:
> + * Once struct rte_sft_flow_status with valid fid field posesed by application
> + * it can:
> + * - mange client objects on it (see client_obj field in
> + *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
> + * - analyze user-defined flow state and CT state (see state & ct_sate fields
> + *   in struct rte_sft_flow_status).
> + * - set flow state to be attached to the upcoming packets by action ``SFT``
> + *   via struct rte_sft_flow_status API.
> + * - decide to destroy flow via rte_sft_flow_destroy API.
> + *
> + * Flow aging:
> + *
> + * SFT library manages the aging for each flow. On flow creation, it's
> + * assigned an aging value, the maximal number of seconds passed since the
> + * last flow packet arrived, once exceeded flow considered aged.
> + * The application notified of aged flow asynchronously via event queues.
> + * The device and port IDs tuple to identify the event queue to enqueue
> + * flow aged events passed on flow creation as arguments
> + * (see rte_sft_flow_activate). It's the application responsibility to
> + * initialize event queues and assign them to each flow for EOF event
> + * notifications.
> + * Aged EOF event handling:
> + * - Should be considered as application responsibility.
> + * - The last stage should be the release of the flow resources via
> + *    rte_sft_flow_destroy API.
> + * - All client objects should be removed from flow before the
> + *   rte_sft_flow_destroy API call.
> + * See the description of rete_sft_flow_destroy for an example of aged flow
> + * handling.
> + *
> + * SFT API thread safety:
> + *
> + * SFT library APIs are thread-safe while handling of specific flow can be
> + * done in a single thread simultaneously. Exclusive access to specific SFT
> + * flow guaranteed by:

The line above contradict itself, if you are working with single thread you can't work simultaneously.
Does the SFT allow the access to a single flow from two threads in the same time? or it is the responsibility 
Of the application to protect itself. I think it should be the application responsibility the SFT should protect
itself only on SFT global functions. For example calling process_mbuf should be protected, so application can 
call the same function from different threads.
I think we can assume that all packets from a specific flow will arrive to the same queue and the same thread.

So I don't see the usage of the lock API.
 
> + * - rte_sft_process_mbuf
> + * - rte_sft_process_mbuf_with_zone
> + * - rte_sft_flow_create
> + * - rte_sft_flow_lock
> + * When application is done with the flow handling for the current packet it
> + * should call rte_sft_flow_unlock API to maintain exclusive access to the
> + * flow with other threads.
> + *
> + * SFT Library initialization and cleanup:
> + *
> + * SFT library should be considered as a single instance, preconfigured and
> + * initialized via rte_sft_init() API.
> + * SFT library resource deallocation and cleanup should be done via
> + * rte_sft_init() API as a stage of the application termination procedure.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_errno.h>
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +#include <rte_flow.h>
> +
> +/**
> + * L3/L4 5-tuple - src/dest IP and port and IP protocol.
> + *
> + * Used for flow/connection identification.
> + */
> +struct rte_sft_5tuple {
> +	union {
> +		struct {
> +			rte_be32_t src_addr; /**< IPv4 source address. */
> +			rte_be32_t dst_addr; /**< IPv4 destination address. */
> +		} ipv4;
> +		struct {
> +			uint8_t src_addr[16]; /**< IPv6 source address. */
> +			uint8_t dst_addr[16]; /**< IPv6 destination address. */
> +		} ipv6;
> +	};
> +	uint16_t src_port; /**< Source port. */
> +	uint16_t dst_port; /**< Destination port. */
> +	uint8_t proto; /**< IP protocol. */
> +	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
> +};
> +
> +/**
> + * Port flow identification.
> + *
> + * @p zone used for setups where 5-tuple is not enough to identify flow.
> + * For example different VLANs/VXLANs may have similar 5-tuples.
> + */
> +struct rte_sft_7tuple {
> +	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
> +	uint32_t zone; /**< Zone assigned to flow. */
> +	uint16_t port_id; /** <Port identifier of Ethernet device. */
> +};
> +
> +/**
> + * Flow connection tracking states
> + */
> +enum rte_sft_flow_ct_state {
> +	RTE_SFT_FLOW_CT_STATE_NEW  = (1 << 0),
> +	RTE_SFT_FLOW_CT_STATE_EST  = (1 << 1),
> +	RTE_SFT_FLOW_CT_STATE_REL  = (1 << 2),
> +	RTE_SFT_FLOW_CT_STATE_RPL  = (1 << 3),
> +	RTE_SFT_FLOW_CT_STATE_INV  = (1 << 4),
> +	RTE_SFT_FLOW_CT_STATE_TRK  = (1 << 5),
> +	RTE_SFT_FLOW_CT_STATE_SNAT = (1 << 6),
> +	RTE_SFT_FLOW_CT_STATE_DNAT = (1 << 7),
> +};
> +
> +/**
> + * Structure describes SFT library configuration
> + */
> +struct rte_sft_conf {
> +	uint32_t UDP_aging; /**< UDP proto default aging. */
> +	uint32_t TCP_aging; /**< TCP proto default aging. */
> +	uint32_t TCP_SYN_aging; /**< TCP SYN default aging. */
> +	uint32_t OTHER_aging; /**< All unlisted proto default aging. */
> +	uint32_t size; /**< Max entries in SFT. */
> +	uint8_t undefined_state; /**< Undefined state constant. */
> +	uint8_t reorder_enable: 1;
> +	/**< TCP packet reordering feature enabled bit. */
> +	uint8_t ct_enable: 1; /**< Connection tracking feature enabled bit. */
> +};
> +
> +/**
> + * Structure describes the state of the flow in SFT.
> + */
> +struct rte_sft_flow_status {
> +	uint32_t fid; /**< SFT flow id. */
> +	uint32_t zone; /**< Zone for lookup in SFT */
> +	uint8_t state; /**< Application defined bidirectional flow state. */
> +	uint8_t ct_state; /**< Connection tracking flow state. */
> +	uint32_t age; /**< Seconds passed since last flown packet. */
> +	uint32_t aging;
> +	/**< Flow considered aged once this age (seconds) reached. */
> +	uint32_t nb_in_order_mbufs;
> +	/**< Number of in-order mbufs available for drain */
> +	void **client_obj; /**< Array of clients attached to flow. */
> +	int nb_clients; /**< Number of clients attached to flow. */
> +	uint8_t defined: 1; /**< Flow defined in SFT bit. */
> +	uint8_t activated: 1; /**< Flow activation bit. */
> +	uint8_t fragmented: 1; /**< Last flow mbuf was fragmented. */
> +	uint8_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get SFT flow status.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] status
> + *   Structure to dump actual SFT flow status.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_get_status(const uint32_t fid,
> +			struct rte_sft_flow_status *status,
> +			struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set user defined context.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Updates per ethernet dev SFT entries:
> + * - flow lookup acceleration
> + * - partial/full flow offloading managed by flow context
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param ctx
> + *   User defined state to set.
> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_ctx(uint32_t fid,
> +		     const struct rte_flow_item_sft *ctx,
> +		     struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Initialize SFT library instance.
> + *
> + * @param conf
> + *   SFT library instance configuration.
> + *
> + * @return
> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_init(const struct rte_sft_conf *conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Finalize SFT library instance.
> + * Cleanup & release allocated resources.
> + */
> +void
> +rte_sft_fini(void);
> +

I think we should use stop. It is not commons in DPDK to have fini functions.
Maybe we should also add start function, so the app can init and then start the SFT.

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Process mbuf received on RX queue.
> + *
> + * Fragmentation handling (SFT fragmentation feature configured):
> + * If *mbuf_in* of fragmented packet received it will be stored by SFT library.
> + * status->fragmented bit will be set and *mbuf_out* will be set to NULL.
> + * On reception of all related fragments of IP packet it will be reassembled
> + * and further processed by this function on reception of last fragment.
> + *
Does this function allocate a new mbuf? Does it releases all old mbufs?

> + * Flow definition:
> + * SFT flow defined by one of its 7-tuples, since there is no zone value as
> + * argument flow should be defined by context attached to mbuf with action
> + * ``SFT`` (see RTE flow RTE_FLOW_ACTION_TYPE_SFT). Otherwise status-
> >defined
> + * field will be turned off & *mbuf_out* will be set to *mbuf_in*.
> + * In order to define flow for *mbuf_in* without attached sft context
> + * rte_sft_process_mbuf_with_zone() should be used with *zone* argument
> + * supplied by caller.
> + *
> + * Flow lookup:
> + * If SFT flow identifier can't be retrieved from SFT context attached to
> + * *mbuf_in* by action ``SFT`` - SFT lookup should be performmed by zone,
> + * retrieved from SFT context attached to *mbuf_in*, and 5-tuple, extracted
> + * form mbuf outer header contents.
> + *
> + * Flow defined but does not exists:
> + * If flow not found in SFT inactivated flow will be created in SFT.
> + * status->activated field will be turned off & *mbuf_out* be set to
> *mbuf_in*.
> + * In order to activate created flow rte_sft_flow_activate() should be used
> + * with reverse 7-tuple supplied by caller.
> + * This is first phase of flow creation in SFT for second phase & more detailed
> + * descriotion of flow creation see rte_sft_flow_activate.
> + *
> + * Out of order (SFT out of oreder feature configured):
> + * If flow defined & activated but *mbuf_in* is TCP out of order packet it will
> + * be stored by SFT library. status->out_of_order bit will be set & *mbuf_out*
> + * will be set to NULL. On reception of the first missing in order packet
> + * status->nb_in_order_mbufs will be set to number of mbufs that available
> for
> + * processing with rte_sft_drain_mbuf().
> + *
It is possible that some packets will get trapped in the SFT do to this feature.
if it supports ordering. For example the following case:
Packets arrive to the application. After draining the packets the 
Application changed the flow to full offload. This means that
all future packets will not arrive to the application.
But until the flow is offloaded some packets do arrive not in order.
Then the flow is offloaded, this will result in the situation that no more
packets will arrive to the application so some packets will get stack
in the SFT.
I think we must have some force drain or, notify the SFT that no more
packets should arrive to even if the packets are not in order it will release them.

Also the same with fragmented does this function allocate new mbufs? are you releasing the
old ones?

> + * Flow defined & activated, mbuf not fragmented and 'in order':
> + * - Flow aging related data (see age field in `struct rte_sft_flow_status`)
> + *   will be updated according to *mbuf_in* timestamp.
> + * - Flow connection tracking state (see ct_state field in
> + *   `struct rte_sft_flow_status`)  will be updated according to *mbuf_in* L4
> + *   header contents.
> + * - *mbuf_out* will be set to last processed mbuf.
> + *
> + * @param[in] mbuf_in
> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be
> used
> + *   after successful call to this function.
> + * @param[out] mbuf_out
> + *   last processed not fragmented and in order mbuf.

If the in mbuf is not fragmented and in order, this pointer will point to the in one?

> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.

Does the status bits for example fragmented is kept per connection or per flow?
Since it is possible to get fragmented packets from both sides.
The same goes for out of order packets


> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success:
> + *   - *mbuf_out* contains valid mbuf pointer, locked SFT flow recognized by
> + *     status->fid.
> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> + *     non last fragment *mbuf_in*.
> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> + *     order *mbuf_in*, locked SFT flow recognized by status->fid.
> + *   On failure a negative errno value and rte_errno is set.
> + */
> +int
> +rte_sft_process_mbuf(struct rte_mbuf *mbuf_in,
> +		     struct rte_mbuf **mbuf_out,
> +		     struct rte_sft_flow_status *status,
> +		     struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Process mbuf received on RX queue while zone value provided by caller.
> + *
> + * The behaviour of this function is similar to rte_sft_process_mbuf except
> + * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
> + * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
> + *
> + * @see rte_sft_process_mbuf
> + *
> + * @param[in] mbuf_in
> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be
> used
> + *   after successful call to this function.
> + * @param[out] mbuf_out
> + *   last processed not fragmented and in order mbuf.
> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success:
> + *   - *mbuf_out* contains valid mbuf pointer.
> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> + *     non last fragment *mbuf_in*.
> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> + *     order *mbuf_in*.
> + *   On failure a negative errno value and rte_errno is set.
> + */
> +int
> +rte_sft_process_mbuf_with_zone(struct rte_mbuf *mbuf_in,
> +			       uint32_t zone,
> +			       struct rte_mbuf **mbuf_out,
> +			       struct rte_sft_flow_status *status,
> +			       struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Drain next in order mbuf.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * This function behaves similar to rte_sft_process_mbuf() but acts on packets
> + * accumulated in SFT flow due to missing in order packet. Processing done on
> + * single mbuf at a time and `in order`. Other than above the behavior is
> + * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
> + * fragmented & 'in order'. This function should be called when
> + * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
> + * status->nb_in_order_mbufs output param !=0 and until
> + * status->nb_in_order_mbufs == 0.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid mbuf in case of success, NULL otherwise and rte_errno is set.
> + */
> +struct rte_mbuf *
> +rte_sft_drain_mbuf(uint32_t fid,
> +		   struct rte_sft_flow_status *status,
> +		   struct rte_sft_error *error);
> +

Fid represent a connection, so which direction do we drain the packets?
since we can have inordered packet in from both directions right?

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Activate flow in SFT.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * This function performs second phase of flow creation in SFT.
> + * The reasons for 2 phase flow creation procedure:
> + * 1. Missing reverse flow - flow context is shared for both flow directions
> + *    i.e. in order maintain bidirectional flow context in RTE SFT packets
> + *    arriving from both dirrections should be identified as packets of the
> + *    RTE SFT flow. Consequently before creation of the SFT flow caller should
> + *    provide reverse flow direction 7-tuple.
> + * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone
> should
> + *   be notified that arrived mbuf is first in flow & decide weather to
> + *   create new flow or it distroy before it was activated with
> + *   rte_sft_flow_destroy.
> + * This function completes creation of the bidirectional SFT flow & creates
> + * entry for 7-tuple on SFT PMD defined by the tuple port for both
> + * initiator/initiate 7-tuples.
> + * Flow aging, connection tracking state & out of order handling will be
> + * initialized according to the content of the *mbuf_in* passes to
> + * rte_sft_process_mbuf/_with_zone during the phase 1 of flow creation.
> + * Once this function returns upcoming calls
> rte_sft_process_mbuf/_with_zone
> + * with 7-tuple or its reverse will return handle to this flow.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param reverse_tuple
> + *   Expected response flow 7-tuple.
> + * @param ctx
> + *   User defined state to set.
> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param ct_enable
> + *   Enables maintenance of status->ct_state connection tracking value for the
> + *   flow; otherwise status->ct_state will be initialized with zeros.
> + * @param evdev_id
> + *   Event dev ID to enqueue end of flow event.
> + * @param evport_id
> + *   Event port ID to enqueue end of flow event.
> + * @param[out] status
> + *   Structure to dump SFT flow status once activated.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_activate(uint32_t fid,
> +		      const struct rte_sft_7tuple *reverse_tuple,
> +		      const struct rte_flow_item_sft *ctx,
> +		      uint8_t ct_enable,
> +		      uint8_t dev_id,
> +		      uint8_t port_id,
> +		      struct rte_sft_flow_status *status,
> +		      struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Artificially create SFT flow.
> + *
> + * Function to create SFT flow before reception of the first flow packet.
> + *
> + * @param tuple
> + *   Expected initiator flow 7-tuple.
> + * @param reverse_tuple
> + *   Expected initiate flow 7-tuple.
> + * @param ctx
> + *   User defined state to set.
> + *   Setting of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param[out] ct_enable
> + *   Enables maintenance of status->ct_state connection tracking value for the
> + *   flow; otherwise status->ct_state will be initialized with zeros.
> + * @param[out] status
> + *   Structure to dump SFT flow status once created.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   - on success: 0, locked SFT flow recognized by status->fid.
> + *   - on error: a negative errno value otherwise and rte_errno is set.
> + */
> +
> +int
> +rte_sft_flow_create(const struct rte_sft_7tuple *tuple,
> +		    const struct rte_sft_7tuple *reverse_tuple,
> +		    const struct rte_flow_item_sft *ctx,
> +		    uint8_t ct_enable,
> +		    struct rte_sft_flow_status *status,
> +		    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Lock exclusively SFT flow.
> + *
> + * Explicit flow locking; used for handling aged flows.
> + *
> + * @param fid
> + *   SFT flow ID.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_lock(uint32_t fid);
> + 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Release exclusively locked SFT flow.
> + *
> + * When rte_sft_process_mbuf/_with_zone and rte_sft_flow_create
> + * return *status* containing fid with defined bit on the flow considered
> + * exclusively locked and should be unlocked with this function.
> + *
> + * @param fid
> + *   SFT flow ID.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_unlock(uint32_t fid);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Removes flow from SFT.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * - Flow should be locked by caller in order to remove it.
> + * - Flow should have no client objects attached.
> + *
> + * Should be applied on aged flows, when flow aged event received.
> + *
> + * @code{.c}
> + *     while (1) {
> + *         rte_event_dequeue_burst(...);
> + *         FOR_EACH_EV(ev) {
> + *             uint32_t fid = ev.u64;
> + *             rte_sft_flow_lock(fid);
> + *             FOR_EACH_CLIENT(fid, client_id) {
> + *                 rte_sft_flow_reset_client_obj(fid, client_obj);
> + *                 // detached client object handling
> + *             }
> + *             rte_sft_flow_destroy(fid, &error);
> + *         }
> + *     }
> + * @endcode
> + *
> + * @param fid
> + *   SFT flow ID to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_destroy(uint32_t fid, struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset flow age to zero.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Simulates last flow packet with timestamp set to just now.
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_touch(uint32_t fid, struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set flow aging to specific value.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param aging
> + *   New flow aging value.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_aging(uint32_t fid,
> +		       uint32_t aging,
> +		       struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to set object for.
> + * @param client_obj
> + *   Pointer to opaque client object structure.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_client_obj(uint32_t fid,
> +			    uint8_t client_id,
> +			    void *client_obj,
> +			    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to get object for.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid client object opaque pointer in case of success, NULL otherwise
> + *   and rte_errno is set.
> + */
> +void *
> +rte_sft_flow_get_client_obj(const uint32_t fid,
> +			    uint8_t client_id,
> +			    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Remove client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Detaches client object from SFT flow and returns the ownership for the
> + * client object to the caller by returning client object pointer value.
> + * The pointer returned by this function won't be accessed any more, the
> caller
> + * may release all client obj related resources & the memory allocated for
> + * this client object.
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to remove object for.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid client object opaque pointer in case of success, NULL otherwise
> + *   and rte_errno is set.
> + */
> +void *
> +rte_sft_flow_reset_client_obj(uint32_t fid,
> +			      uint8_t client_id,
> +			      struct rte_sft_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_SFT_H_ */
> diff --git a/lib/librte_sft/rte_sft_driver.h b/lib/librte_sft/rte_sft_driver.h
> new file mode 100644
> index 0000000000..0c9e28fe17
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft_driver.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +#ifndef _RTE_SFT_DRIVER_H_
> +#define _RTE_SFT_DRIVER_H_
> +
> +/**
> + * @file
> + *
> + * RTE SFT Ethernet device PMD API
> + *
> + * APIs that are used by the SFT library to offload SFT operationons
> + * to Ethernet device.
> + */
> +
> +#include "rte_sft.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Opaque type returned after successfully creating an entry in SFT.
> + *
> + * This handle can be used to manage and query the related entry (e.g. to
> + * destroy it or update age).
> + */
> +struct rte_sft_entry;
> +
> +/**
> + * Create SFT entry in eth_dev SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param tuple
> + *   L3/L4 5-tuple - src/dest IP and port and IP protocol.
> + * @param nat_tuple
> + *   L3/L4 5-tuple to replace in packet original 5-tuple in order to implement
> + *   NAT offloading; if NULL NAT offloading won't be configured for the flow.
> + * @param aging
> + *   Flow aging timeout in seconds.
> + * @param ctx
> + *   Initial values in SFT flow context
> + *   (see RTE flow struct rte_flow_item_sft).
> + *   ctx->zone should be valid.
> + * @param fid
> + *   SFT flow ID for the entry to create on *device*.
> + *   If there is an entry for the *fid* in PMD it will be updated with the
> + *   values of *ctx*.
> + * @param[out] queue_index
> + *   if PMD can figure out the queue where the flow packets will
> + *   arrive in RX data path it will set the value of queue_index; otherwise
> + *   all bits will be turned on.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid handle in case of success, NULL otherwise and rte_errno is set.
> + */
> +typedef struct rte_sft_entry *(*sft_entry_create_t) (struct rte_eth_dev *dev,
> +		const struct rte_sft_5tuple *tuple,
> +		const struct rte_sft_5tuple *nat_tuple,
> +		const uint32_t aging,
> +		const struct rte_flow_item_sft *ctx,
> +		const uint32_t fid,
> +		uint16_t *queue_index,
> +		struct rte_sft_error *error);
> +

I think for easier reading, the API should change to have 6 tuple (5 + zone)
the ctx should be removed and replaced with the state.

Then add new API to modify the ctx
typedef int (*sft_modify_state)(struct rte_eth_dev *dev, uint8 state);
The main issue we my suggestion is that it will force the PMD to store the information to recreate
the rule, data that is already  saved by the SFT.

Also I don't see why we need queue index, since the RSS and queue will be configured by the RTE flow
in a different group.

> +/**
> + * Destroy SFT entry in eth_dev SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param entry
> + *   Handle to the SFT entry to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_destroy_t)(struct rte_eth_dev *dev,
> +		struct rte_sft_entry *entry,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Decodes SFT flow context if attached to mbuf by action ``SFT``.
> + * @see RTE flow RTE_FLOW_ACTION_TYPE_SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param mbuf
> + *   mbuf of the packet to decode attached state from.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid SFT flow context in case of success, NULL otherwise and rte_errno
> + *   is set.
> + */
> +typedef struct rte_flow_item_sft *(*sft_entry_mbuf_decode_ctx_t)(
> +		struct rte_eth_dev *dev,
> +		const struct rte_mbuf *mbuf,
> +		struct rte_sft_error *error);
> +

What about returning int as error code, and return the rte_flow_item_sft
as out parameter?
This will remove the allocation and free.

> +/**
> + * Get aged-out SFT entries.
> + *
> + * Report entry as aged-out if timeout passed without any matching
> + * on the SFT entry.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet device structure.
> + * @param[in, out] fid_aged
> + *   The address of an array of aged-out SFT flow IDs.
> + * @param[in] nb_aged
> + *   The length of *fid_aged* array pointers.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. Initialized in case of
> + *   error only.
> + *
> + * @return
> + *   if nb_aged is 0, return the amount of all aged flows.
> + *   if nb_aged is not 0 , return the amount of aged flows reported
> + *   in the *fid_aged* array, otherwise negative errno value.
> + */
> +typedef int (*sft_entry_get_aged_entries_t)(struct rte_eth_dev *dev,
> +		uint32_t *fid_aged,
> +		int nb_aged,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Simulate SFT entry match in terms of entry aging.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param fid
> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_touch_t)(struct rte_eth_dev *dev,
> +		uint32_t fid,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Set SFT entry aging to specific value.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param fid
> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> + * @param aging
> + *   New entry aging value.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_set_aging_t)(struct rte_eth_dev *dev,
> +		uint32_t fid,
> +		uint32_t aging,
> +		struct rte_sft_error *error);
> +
> +/** SFT operations function pointer table */
> +struct rte_sft_ops {
> +	sft_entry_create_t entry_create;
> +	/**< Create SFT entry in eth_dev SFT. */
> +	sft_entry_destroy_t entry_destroy;
> +	/**< Destroy SFT entry in eth_dev SFT. */
> +	sft_entry_mbuf_decode_ctx_t mbuf_decode_ctx;
> +	/**< Decodes SFT flow context if attached to mbuf by action ``SFT``. */
> +	sft_entry_get_aged_entries_t get_aged_entries;
> +	/**< Get aged-out SFT entries. */
> +	sft_entry_touch_t entry_touch;
> +	/**< Simulate SFT entry match in terms of entry aging. */
> +	sft_entry_set_aging_t set_aging;
> +	/**< Set SFT entry aging to specific value. */
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_SFT_DRIVER_H_ */
> diff --git a/lib/librte_sft/rte_sft_version.map
> b/lib/librte_sft/rte_sft_version.map
> new file mode 100644
> index 0000000000..747e100ac5
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft_version.map
> @@ -0,0 +1,21 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_sft_flow_get_status;
> +	rte_sft_flow_set_ctx;
> +	rte_sft_init;
> +	rte_sft_fini;
> +	rte_sft_process_mbuf;
> +	rte_sft_process_mbuf_with_zone;
> +	rte_sft_drain_mbuf;
> +	rte_sft_flow_activate;
> +	rte_sft_flow_create;
> +	rte_sft_flow_lock;
> +	rte_sft_flow_unlock;
> +	rte_sft_flow_destroy;
> +	rte_sft_flow_touch;
> +	rte_sft_flow_set_aging;
> +	rte_sft_flow_set_client_obj;
> +	rte_sft_flow_get_client_obj;
> +	rte_sft_flow_reset_client_obj;
> +};
> --
> 2.26.2

Best,
Ori

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT
  2020-09-16 15:46   ` Ori Kam
@ 2020-09-18  7:04     ` Andrew Rybchenko
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Rybchenko @ 2020-09-18  7:04 UTC (permalink / raw)
  To: Ori Kam, Andrey Vesnovaty, dev
  Cc: thomas, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ori Kam,
	NBU-Contact-Thomas Monjalon, Ferruh Yigit

On 9/16/20 6:46 PM, Ori Kam wrote:
> Hi Andrey,
> 
> PSB
> 
>> -----Original Message-----
>> From: Andrey Vesnovaty <andreyv@nvidia.com>
>> Sent: Wednesday, September 9, 2020 11:30 PM
>>
>> Attach SFT flow context to packet with SFT action.
>> Match on SFT flow context (attached to packet),
>> with SFT item.

Since it is the first patch which introduces SFT, it would
be useful define abbreviation in the changeset description.

The description does not explain what is SFT flow context.
It does not explain why we should attach it using action
and why we should match on it using pattern items.

Please, help the reader to understand how it is supposed
to be used in the future.

>>
>> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
>> ---
>>  lib/librte_ethdev/rte_flow.h | 84 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 84 insertions(+)
>>
>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
>> index da8bfa5489..24390e6ab4 100644
>> --- a/lib/librte_ethdev/rte_flow.h
>> +++ b/lib/librte_ethdev/rte_flow.h
>> @@ -537,6 +537,12 @@ enum rte_flow_item_type {
>>  	 */
>>  	RTE_FLOW_ITEM_TYPE_ECPRI,
>>
>> +	/**
> You are missing the Meta, tag not relevant for RFC but please notice for the patch.
> 
>> +	 * Matches SFT context (see fields of struct rte_flow_item_sft).
>> +	 *
>> +	 * See struct rte_flow_item_sft.
>> +	 */
>> +	RTE_FLOW_ITEM_TYPE_SFT,
>>  };
>>
>>  /**
>> @@ -1579,6 +1585,54 @@ static const struct rte_flow_item_ecpri
>> rte_flow_item_ecpri_mask = {
>>  };
>>  #endif
>>
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this structure may change without prior notice
>> + *
>> + * RTE_FLOW_ITEM_TYPE_SFT
>> + *
>> + * Matches context of flow in SFT table.
>> + *

It looks like the content of SFT context is defined below.
If so, please, say so.

>> + * 5-tuple: src/dest IP + src/dest port + IP protocol.
>> + * zone: application defined value cupled with 5-tuple to identify flow,

cupled -> coupled

>> + * example - VxLAN, VLAN.
>> + * SFT: Statfull flow table

Statfull -> Stateful

>> + * SFT in scope of ethernet device (port) is HW offloaded lookup table

ethernet -> Ethernet

>> + * where key is zone + 5-tuple & value is statefull flow context.

statefull -> stateful

>> + * Contents of the SFT maintained by SFT PMD (see SFT PMD API in rte_sft).
>> + *
>> + * The structure describes SFT flow context.
>> + * All the fields of the structure, except @p fid, should be considered as
>> + * user defined.
>> + * The @p fid assigned by RTE SFT & used as unique flow identifier.
>> + * SFT context attached to packet by action ``SFT`` (see RTE_FLOW_ACTION_SFT).
>> + *
>> + * SFT default context defined as context attached to packet when there is no
>> + * entry for the flow in SFT. The @p state has application reserved value
>> + * meaning that SFT context for the packet undefined since entry wasn't found
>> + * in SFT. If state 'undefined' then @p zone should be valid othervice @p fid

othervice -> otherwise

>> + * should be valid.
>> + *
>> + * Context considered virtual since the method of storing this info on packet
>> + * is PMD/implementation specific & may involve mapping methods if there is
>> + * 'not enough bits' to store entire contents of struct rte_flow_item_sft.
>> + *
>> + * Maximal value/size of each field depends on HW capabilities and considered
>> + * as implementation specific.
>> + */
>> +struct rte_flow_item_sft {
>> +	union {
>> +		uint32_t fid; /**< SFT flow identifier. */
>> +		uint32_t zone; /**< Zone assigned to flow. */
>> +	};

Is RTE_STD_C11 missing?

>> +	uint8_t state; /**< User defined flow state. */
>> +	uint8_t fid_valid:1; /**< fid field validity bit. */
>> +	uint8_t zone_valid:1; /**< zone fieald validity bit. */

fieald -> field

>> +	uint8_t state_valid:1; /**< state fieald validity bit. */

fieald -> field

>> +	uint8_t user_data_size; /**< user_data buffer size. */
>> +	uint8_t *user_data; /**< Arbitrary user data. */
>> +};
>> +
> This object is only used to match and not set so
> why do we need the union? I understand that later when reporting to the SFT in the application layer
> sometimes you will get zone while other time you will get fid.
> From rte flow you are matching on given object which is 32 bit.
> What are the matchable  fields? (fid / zone / user_data / fid_valid ... )
> Do you think that some of the times the match will be on he fid other on the zone?
> If so they should not be union.
> I think zone is the responsibility of the application to save and to match. So I don't see why it is
> needed here.

+1

> 
>>  /**
>>   * Matching pattern item definition.
>>   *
>> @@ -2132,6 +2186,15 @@ enum rte_flow_action_type {
>>  	 * see enum RTE_ETH_EVENT_FLOW_AGED
>>  	 */
>>  	RTE_FLOW_ACTION_TYPE_AGE,
>> +
>> +	/**
>> +	 * RTE_FLOW_ACTION_TYPE_SFT
>> +	 *
>> +	 * Set SFT context and redirect to continue processing.
>> +	 *
>> +	 * See struct rte_flow_action_sft.
>> +	 */
>> +	RTE_FLOW_ACTION_TYPE_SFT,
>>  };
>>
>>  /**
>> @@ -2721,6 +2784,27 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m,
>> uint32_t v)
>>  	*RTE_FLOW_DYNF_METADATA(m) = v;
>>  }
>>
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this structure may change without prior notice
>> + *
>> + * RTE_FLOW_ACTION_TYPE_SFT
>> + *
>> + * Attaches an SFT context (see struct rte_flow_item_sft) to packet.
>> + *
>> + * Performs lookup by *zone* and 5-tuple in SFT; if entry found the related SFT
>> + * context will be attached othervise default SFT context attached (see

othervise -> otherwise

>> + * 'SFT default context' in struct rte_flow_item_sft description).
>> + * Adding action of type ``SFT`` to the list of rule actions may impose
>> + * limitations on other rule actions added to the list, depending on specific
>> + * PMD implementation.
>> + *
>> + * For 5-tuple, zone & SFT definitions see `struct rte_flow_item_sft`.
>> + */
>> +struct rte_flow_action_sft {
>> +	uint32_t zone; /**< Zone for lookup in SFT */
>> +};
>> +
>>  /*
>>   * Definition of a single action.
>>   *
>> --
>> 2.26.2
> 
> Thanks,
> Ori
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 3/3] sft: introduce API
  2020-09-16 18:33   ` Ori Kam
@ 2020-09-18  7:43     ` Andrew Rybchenko
  2020-11-02 10:49       ` Andrey Vesnovaty
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Rybchenko @ 2020-09-18  7:43 UTC (permalink / raw)
  To: Ori Kam, Andrey Vesnovaty, dev
  Cc: thomas, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ray Kinsella,
	Neil Horman, Ferruh Yigit

Hi Andrey,

looks very interesting, but a bit hard to review.
I hope I'll do deeper review on the next version.
Right not just few cosmetic things to make the
next version a bit clearer.

Do you plan to create/publish an example appliction
which uses the API and demonstrates how to do it?

Plesee, see below.

Thanks,
Andrew.

On 9/16/20 9:33 PM, Ori Kam wrote:
> Hi Andery,
> PSB
> 
>> -----Original Message-----
>> From: Andrey Vesnovaty <andreyv@nvidia.com>
>> Sent: Wednesday, September 9, 2020 11:30 PM
>> To: dev@dpdk.org
>> Subject: [RFC 3/3] sft: introduce API
>>
>> Defines RTE SFT APIs for Statefull Flow Table library.
>>
>> SFT General description:
>> SFT library provides a framework for applications that need to maintain
>> context across different packets of the connection.
>> Examples for such applications:
>> - Next-generation firewalls
>> - Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
>> - SW/Virtual Switching: OVS
>> The goals of the SFT library:
>> - Accelerate flow recognition & its context retrieval for further
>>   lookaside processing.
>> - Enable context-aware flow handling offload.
>>
>> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
>> ---
>>  lib/librte_sft/Makefile            |  28 +
>>  lib/librte_sft/meson.build         |   7 +
>>  lib/librte_sft/rte_sft.c           |   9 +
>>  lib/librte_sft/rte_sft.h           | 845 +++++++++++++++++++++++++++++
>>  lib/librte_sft/rte_sft_driver.h    | 195 +++++++
>>  lib/librte_sft/rte_sft_version.map |  21 +
>>  6 files changed, 1105 insertions(+)
>>  create mode 100644 lib/librte_sft/Makefile
>>  create mode 100644 lib/librte_sft/meson.build
>>  create mode 100644 lib/librte_sft/rte_sft.c
>>  create mode 100644 lib/librte_sft/rte_sft.h
>>  create mode 100644 lib/librte_sft/rte_sft_driver.h
>>  create mode 100644 lib/librte_sft/rte_sft_version.map
>>
>> diff --git a/lib/librte_sft/Makefile b/lib/librte_sft/Makefile
>> new file mode 100644
>> index 0000000000..23c6eee849
>> --- /dev/null
>> +++ b/lib/librte_sft/Makefile
>> @@ -0,0 +1,28 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright 2020 Mellanox Technologies, Ltd
>> +
>> +include $(RTE_SDK)/mk/rte.vars.mk
>> +
>> +# library name
>> +LIB = librte_sft.a
>> +
>> +# library version
>> +LIBABIVER := 1
>> +
>> +# build flags
>> +CFLAGS += -O3
>> +CFLAGS += $(WERROR_FLAGS)
>> +LDLIBS += -lrte_eal -lrte_mbuf
>> +
>> +# library source files
>> +# all source are stored in SRCS-y
>> +SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_sft.c
>> +
>> +# export include files
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft.h
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft_driver.h
>> +
>> +# versioning export map
>> +EXPORT_MAP := rte_sft_version.map
>> +
>> +include $(RTE_SDK)/mk/rte.lib.mk
>> diff --git a/lib/librte_sft/meson.build b/lib/librte_sft/meson.build
>> new file mode 100644
>> index 0000000000..b210e43f29
>> --- /dev/null
>> +++ b/lib/librte_sft/meson.build
>> @@ -0,0 +1,7 @@
>> +# SPDX-License-Identifier: BSD-3-Clause
>> +# Copyright 2020 Mellanox Technologies, Ltd
>> +
>> +sources = files('rte_sft.c')
>> +headers = files('rte_sft.h',
>> +	'rte_sft_driver.h')
>> +deps += ['mbuf']
>> diff --git a/lib/librte_sft/rte_sft.c b/lib/librte_sft/rte_sft.c
>> new file mode 100644
>> index 0000000000..f3d3945545
>> --- /dev/null
>> +++ b/lib/librte_sft/rte_sft.c
>> @@ -0,0 +1,9 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright 2020 Mellanox Technologies, Ltd
>> + */
>> +
>> +
>> +#include "rte_sft.h"
>> +#include "rte_sft_driver.h"
>> +
>> +/* Placeholder for RTE SFT library APIs implementation */
>> diff --git a/lib/librte_sft/rte_sft.h b/lib/librte_sft/rte_sft.h
>> new file mode 100644
>> index 0000000000..5c9f92ea9f
>> --- /dev/null
>> +++ b/lib/librte_sft/rte_sft.h
>> @@ -0,0 +1,845 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright 2020 Mellanox Technologies, Ltd
>> + */
>> +
>> +#ifndef _RTE_SFT_H_
>> +#define _RTE_SFT_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE SFT API
>> + *
>> + * Defines RTE SFT APIs for Statefull Flow Table library.
>> + *
>> + * SFT General description:
>> + * SFT library provides a framework for applications that need to maintain
>> + * context across different packets of the connection.
>> + * Examples for such applications:
>> + * - Next-generation firewalls
>> + * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
>> + * - SW/Virtual Switching: OVS
>> + * The goals of the SFT library:
>> + * - Accelerate flow recognition & its context retrieval for further lookaside

lookaside -> look-aside

>> + *   processing.
>> + * - Enable context-aware flow handling offload.
>> + *
>> + * Definitions and Abbreviations:
>> + * - 5-tuple: defined by:
>> + *     -- Source IP address
>> + *     -- Source port
>> + *     -- Destination IP address
>> + *     -- Destination port
>> + *     -- IP protocol number
>> + * - 7-tuple: 5-tuple zone and port (see struct rte_sft_7tuple)

I guess comma is missing after "5-tuple", since I read it as
"5-tuple zone"???

>> + * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
>> + * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
>> + * - application: SFT library API consumer
>> + * - APP: see application
>> + * - CID: client ID
>> + * - CT: connection tracking
>> + * - FID: Flow identifier
>> + * - FIF: First In Flow
>> + * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
>> + * - SFT: Stateful Flow Table
>> + * - user: see application
>> + * - zone: additional user defined value used as differentiator for
>> + *         connections having same 5-tuple (for example different VxLan

VxLan -> VXLAN (see devtools/words-case.txt)

>> + *         connections with same inner 5-tuple).
>> + *
>> + * SFT components:
>> + *
>> + * +-----------------------------------+
>> + * | RTE flow                          |
>> + * |                                   |
>> + * | +-------------------------------+ |  +----------------+
>> + * | | group X                       | |  | RTE_SFT        |
>> + * | |                               | |  |                |
>> + * | | +---------------------------+ | |  |                |
>> + * | | | rule ...                  | | |  |                |
>> + * | | | .                         | | |  +-----------+----+
>> + * | | | .                         | | |              |
>> + * | | | .                         | | |          entry
>> + * | | +---------------------------+ | |            create
>> + * | | | rule                      | | |              |
>> + * | | |   patterns ...            +---------+        |
>> + * | | |   actions                 | | |     |        |
>> + * | | |     SFT (zone=Z)          | | |     |        |
>> + * | | |     JUMP (group=Y)        | | |  lookup      |
>> + * | | +---------------------------+ | |    zone=Z,   |
>> + * | | | rule ...                  | | |    5tuple    |
>> + * | | | .                         | | |     |        |
>> + * | | | .                         | | |  +--v-------------+
>> + * | | | .                         | | |  | SFT       |    |
>> + * | | |                           | | |  |           |    |
>> + * | | +---------------------------+ | |  |        +--v--+ |
>> + * | |                               | |  |        |     | |
>> + * | +-------------------------------+ |  |        | PMD | |
>> + * |                                   |  |        |     | |
>> + * |                                   |  |        +-----+ |
>> + * | +-------------------------------+ |  |                |
>> + * | | group Y                       | |  |                |
>> + * | |                               | |  | set flow CTX   |
>> + * | | +---------------------------+ | |  |                |
>> + * | | | rule                      | | |  +--------+-------+
>> + * | | |   patterns                | | |           |
>> + * | | |     SFT (state=UNDEFINED) | | |           |
>> + * | | |   actions RSS             | | |           |
>> + * | | +---------------------------+ | |           |
>> + * | | | rule                      | | |           |
>> + * | | |   patterns                | | |           |
>> + * | | |     SFT (state=INVALID)   | <-------------+
>> + * | | |   actions DROP            | | |  forward
>> + * | | +---------------------------+ | |    group=Y
>> + * | | | rule                      | | |
>> + * | | |   patterns                | | |
>> + * | | |     SFT (state=ACCEPTED)  | | |
>> + * | | |   actions PORT            | | |
>> + * | | +---------------------------+ | |
>> + * | |  ...                          | |
>> + * | |                               | |
>> + * | +-------------------------------+ |
>> + * |  ...                              |
>> + * |                                   |
>> + * +-----------------------------------+
>> + *
>> + * SFT as datastructure:
>> + * SFT can be treated as datastructure maintaining flow context across its
>> + * lifetime. SFT flow entry represent bidirectional network flow and defined by

represent -> represents

>> + * 7-tuple & its reverse 7-tuple.
>> + * Each entry in SFT has:
>> + * - FID: 1:1 mapped & used as entry handle & encapsulating internal
>> + *   implementation of the entry.
>> + * - State: user-defined value attached to each entry, the only library
>> + *   reserved value for state unset (the actual value defined by SFT
>> + *   configuration). The application should define flow state encodings and
>> + *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
>> + *   applied on packets can be defined via related RTE flow rule matching SFT
>> + *   state (see rules in SFT components diagram above).
>> + * - Timestamp: for the last seen in flow packet used for flow aging mechanism
>> + *   implementation.
>> + * - Client Objects: user-defined flow contexts attached as opaques to flow.
>> + * - Acceleration & offloading - utilize RTE flow capabilities, when supported
>> + *   (see action ``SFT``), for flow lookup acceleration and further
>> + *   context-aware flow handling offload.
>> + * - CT state: optionally for TCP connections CT state can be maintained
>> + *   (see enum rte_sft_flow_ct_state).
>> + * - Out of order TCP packets: optionally SFT can keep out of order TCP
>> + *   packets aside the flow context till the arrival of the missing in-order
>> + *   packet.
>> + *
>> + * RTE flow changes:
>> + * The SFT flow state (or context) for RTE flow is defined by fields of
>> + * struct rte_flow_item_sft.
>> + * To utilize SFT capabilities new item and action types introduced:
>> + * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
>> + * - action SFT: retrieve SFT flow context and attache it to the processed
>> + *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
>> + *
>> + * The contents of per port SFT serving RTE flow action ``SFT`` managed via
>> + * SFT PMD APIs (see struct rte_sft_ops).
>> + * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
>> + * action argument and processed packet 5-tuple.
>> + * If in scope of action ``SFT`` there is no context/state for the flow in SFT
>> + * undefined sate attached to the packet meaning that the flow is not
>> + * recognized by SFT, most probably FIF packet.
>> + *
>> + * Once the SFT state set for a packet it can match on item SFT
>> + * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for the
>> + * packet, for example:
>> + * - if state value == x than queue for further processing by the application
>> + * - if state value == y than forward it to eth port (full offload)
>> + * - if state value == 'undefined' than queue for further processing by
>> + *   the application (handle FIF packets)
>> + *
>> + * Processing packets with SFT library:
>> + *
>> + * FIF packet:
>> + * To recognize upcoming packets of the SFT flow every FIF packet should be
>> + * forwarded to the application utilizing the SFT library. Non-FIF packets can
>> + * be processed by the application or its processing can be fully offloaded.
>> + * Processing of the packets in SFT library starts with rte_sft_process_mbuf
>> + * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
>> + * should make a design to destroy flow or complete flow creation process in
>> + * SFT using rte_sft_flow_activate.
>> + *
>> + * Recognized SFT flow:
>> + * Once struct rte_sft_flow_status with valid fid field posesed by application

posesed -> possessed

>> + * it can:
>> + * - mange client objects on it (see client_obj field in
>> + *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
>> + * - analyze user-defined flow state and CT state (see state & ct_sate fields
>> + *   in struct rte_sft_flow_status).
>> + * - set flow state to be attached to the upcoming packets by action ``SFT``
>> + *   via struct rte_sft_flow_status API.
>> + * - decide to destroy flow via rte_sft_flow_destroy API.
>> + *
>> + * Flow aging:
>> + *
>> + * SFT library manages the aging for each flow. On flow creation, it's
>> + * assigned an aging value, the maximal number of seconds passed since the
>> + * last flow packet arrived, once exceeded flow considered aged.
>> + * The application notified of aged flow asynchronously via event queues.
>> + * The device and port IDs tuple to identify the event queue to enqueue
>> + * flow aged events passed on flow creation as arguments
>> + * (see rte_sft_flow_activate). It's the application responsibility to
>> + * initialize event queues and assign them to each flow for EOF event
>> + * notifications.
>> + * Aged EOF event handling:
>> + * - Should be considered as application responsibility.
>> + * - The last stage should be the release of the flow resources via
>> + *    rte_sft_flow_destroy API.
>> + * - All client objects should be removed from flow before the
>> + *   rte_sft_flow_destroy API call.
>> + * See the description of rete_sft_flow_destroy for an example of aged flow

rete_sft_flow_destroy -> rte_sft_flow_destroy

>> + * handling.
>> + *
>> + * SFT API thread safety:
>> + *
>> + * SFT library APIs are thread-safe while handling of specific flow can be
>> + * done in a single thread simultaneously. Exclusive access to specific SFT
>> + * flow guaranteed by:
> 
> The line above contradict itself, if you are working with single thread you can't work simultaneously.
> Does the SFT allow the access to a single flow from two threads in the same time? or it is the responsibility 
> Of the application to protect itself. I think it should be the application responsibility the SFT should protect
> itself only on SFT global functions. For example calling process_mbuf should be protected, so application can 
> call the same function from different threads.
> I think we can assume that all packets from a specific flow will arrive to the same queue and the same thread.
> 
> So I don't see the usage of the lock API.
>  
>> + * - rte_sft_process_mbuf
>> + * - rte_sft_process_mbuf_with_zone
>> + * - rte_sft_flow_create
>> + * - rte_sft_flow_lock
>> + * When application is done with the flow handling for the current packet it
>> + * should call rte_sft_flow_unlock API to maintain exclusive access to the
>> + * flow with other threads.
>> + *
>> + * SFT Library initialization and cleanup:
>> + *
>> + * SFT library should be considered as a single instance, preconfigured and
>> + * initialized via rte_sft_init() API.
>> + * SFT library resource deallocation and cleanup should be done via
>> + * rte_sft_init() API as a stage of the application termination procedure.
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <rte_common.h>
>> +#include <rte_config.h>
>> +#include <rte_errno.h>
>> +#include <rte_mbuf.h>
>> +#include <rte_ethdev.h>
>> +#include <rte_flow.h>
>> +
>> +/**
>> + * L3/L4 5-tuple - src/dest IP and port and IP protocol.
>> + *
>> + * Used for flow/connection identification.
>> + */
>> +struct rte_sft_5tuple {
>> +	union {
>> +		struct {
>> +			rte_be32_t src_addr; /**< IPv4 source address. */
>> +			rte_be32_t dst_addr; /**< IPv4 destination address. */
>> +		} ipv4;
>> +		struct {
>> +			uint8_t src_addr[16]; /**< IPv6 source address. */
>> +			uint8_t dst_addr[16]; /**< IPv6 destination address. */
>> +		} ipv6;
>> +	};

RTE_STD_C11 missing?

>> +	uint16_t src_port; /**< Source port. */
>> +	uint16_t dst_port; /**< Destination port. */

If it is really host-endian, please, highlight it in above
descriptions. Also it would be interesting to understand
why.

>> +	uint8_t proto; /**< IP protocol. */
>> +	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
>> +};
>> +
>> +/**
>> + * Port flow identification.
>> + *
>> + * @p zone used for setups where 5-tuple is not enough to identify flow.
>> + * For example different VLANs/VXLANs may have similar 5-tuples.
>> + */
>> +struct rte_sft_7tuple {
>> +	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
>> +	uint32_t zone; /**< Zone assigned to flow. */
>> +	uint16_t port_id; /** <Port identifier of Ethernet device. */
>> +};
>> +
>> +/**
>> + * Flow connection tracking states
>> + */
>> +enum rte_sft_flow_ct_state {
>> +	RTE_SFT_FLOW_CT_STATE_NEW  = (1 << 0),
>> +	RTE_SFT_FLOW_CT_STATE_EST  = (1 << 1),
>> +	RTE_SFT_FLOW_CT_STATE_REL  = (1 << 2),
>> +	RTE_SFT_FLOW_CT_STATE_RPL  = (1 << 3),
>> +	RTE_SFT_FLOW_CT_STATE_INV  = (1 << 4),
>> +	RTE_SFT_FLOW_CT_STATE_TRK  = (1 << 5),
>> +	RTE_SFT_FLOW_CT_STATE_SNAT = (1 << 6),
>> +	RTE_SFT_FLOW_CT_STATE_DNAT = (1 << 7),
>> +};
>> +
>> +/**
>> + * Structure describes SFT library configuration
>> + */
>> +struct rte_sft_conf {
>> +	uint32_t UDP_aging; /**< UDP proto default aging. */
>> +	uint32_t TCP_aging; /**< TCP proto default aging. */
>> +	uint32_t TCP_SYN_aging; /**< TCP SYN default aging. */
>> +	uint32_t OTHER_aging; /**< All unlisted proto default aging. */

May I suggest to stick to lowercase fields, please.

>> +	uint32_t size; /**< Max entries in SFT. */
>> +	uint8_t undefined_state; /**< Undefined state constant. */
>> +	uint8_t reorder_enable: 1;
>> +	/**< TCP packet reordering feature enabled bit. */
>> +	uint8_t ct_enable: 1; /**< Connection tracking feature enabled bit. */
>> +};
>> +
>> +/**
>> + * Structure describes the state of the flow in SFT.
>> + */
>> +struct rte_sft_flow_status {
>> +	uint32_t fid; /**< SFT flow id. */
>> +	uint32_t zone; /**< Zone for lookup in SFT */
>> +	uint8_t state; /**< Application defined bidirectional flow state. */
>> +	uint8_t ct_state; /**< Connection tracking flow state. */
>> +	uint32_t age; /**< Seconds passed since last flown packet. */
>> +	uint32_t aging;
>> +	/**< Flow considered aged once this age (seconds) reached. */
>> +	uint32_t nb_in_order_mbufs;
>> +	/**< Number of in-order mbufs available for drain */
>> +	void **client_obj; /**< Array of clients attached to flow. */
>> +	int nb_clients; /**< Number of clients attached to flow. */
>> +	uint8_t defined: 1; /**< Flow defined in SFT bit. */
>> +	uint8_t activated: 1; /**< Flow activation bit. */
>> +	uint8_t fragmented: 1; /**< Last flow mbuf was fragmented. */
>> +	uint8_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
>> +};
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Get SFT flow status.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *

Dup lines above

>> + * @param fid
>> + *   SFT flow ID.
>> + * @param[out] status
>> + *   Structure to dump actual SFT flow status.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_get_status(const uint32_t fid,
>> +			struct rte_sft_flow_status *status,
>> +			struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Set user defined context.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * Updates per ethernet dev SFT entries:

ethernet -> Ethernet
dev -> device

>> + * - flow lookup acceleration
>> + * - partial/full flow offloading managed by flow context
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param ctx
>> + *   User defined state to set.
>> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success , a negative errno value otherwise and rte_errno is set.

Remove space before comma

>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_set_ctx(uint32_t fid,
>> +		     const struct rte_flow_item_sft *ctx,
>> +		     struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Initialize SFT library instance.
>> + *
>> + * @param conf
>> + *   SFT library instance configuration.
>> + *
>> + * @return
>> + *   0 on success , a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_init(const struct rte_sft_conf *conf);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Finalize SFT library instance.
>> + * Cleanup & release allocated resources.
>> + */
>> +void
>> +rte_sft_fini(void);
>> +
> 
> I think we should use stop. It is not commons in DPDK to have fini functions.
> Maybe we should also add start function, so the app can init and then start the SFT.
> 
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Process mbuf received on RX queue.
>> + *
>> + * Fragmentation handling (SFT fragmentation feature configured):
>> + * If *mbuf_in* of fragmented packet received it will be stored by SFT library.
>> + * status->fragmented bit will be set and *mbuf_out* will be set to NULL.
>> + * On reception of all related fragments of IP packet it will be reassembled
>> + * and further processed by this function on reception of last fragment.
>> + *
> Does this function allocate a new mbuf? Does it releases all old mbufs?
> 
>> + * Flow definition:
>> + * SFT flow defined by one of its 7-tuples, since there is no zone value as
>> + * argument flow should be defined by context attached to mbuf with action
>> + * ``SFT`` (see RTE flow RTE_FLOW_ACTION_TYPE_SFT). Otherwise status-
>>> defined
>> + * field will be turned off & *mbuf_out* will be set to *mbuf_in*.
>> + * In order to define flow for *mbuf_in* without attached sft context
>> + * rte_sft_process_mbuf_with_zone() should be used with *zone* argument
>> + * supplied by caller.
>> + *
>> + * Flow lookup:
>> + * If SFT flow identifier can't be retrieved from SFT context attached to
>> + * *mbuf_in* by action ``SFT`` - SFT lookup should be performmed by zone,

performmed -> performed

>> + * retrieved from SFT context attached to *mbuf_in*, and 5-tuple, extracted
>> + * form mbuf outer header contents.
>> + *
>> + * Flow defined but does not exists:
>> + * If flow not found in SFT inactivated flow will be created in SFT.
>> + * status->activated field will be turned off & *mbuf_out* be set to *mbuf_in*.
>> + * In order to activate created flow rte_sft_flow_activate() should be used
>> + * with reverse 7-tuple supplied by caller.
>> + * This is first phase of flow creation in SFT for second phase & more detailed
>> + * descriotion of flow creation see rte_sft_flow_activate.

descriotion -> description

>> + *
>> + * Out of order (SFT out of oreder feature configured):

oreder -> order

>> + * If flow defined & activated but *mbuf_in* is TCP out of order packet it will
>> + * be stored by SFT library. status->out_of_order bit will be set & *mbuf_out*
>> + * will be set to NULL. On reception of the first missing in order packet
>> + * status->nb_in_order_mbufs will be set to number of mbufs that available
>> for
>> + * processing with rte_sft_drain_mbuf().
>> + *
> It is possible that some packets will get trapped in the SFT do to this feature.
> if it supports ordering. For example the following case:
> Packets arrive to the application. After draining the packets the 
> Application changed the flow to full offload. This means that
> all future packets will not arrive to the application.
> But until the flow is offloaded some packets do arrive not in order.
> Then the flow is offloaded, this will result in the situation that no more
> packets will arrive to the application so some packets will get stack
> in the SFT.
> I think we must have some force drain or, notify the SFT that no more
> packets should arrive to even if the packets are not in order it will release them.
> 
> Also the same with fragmented does this function allocate new mbufs? are you releasing the
> old ones?
> 
>> + * Flow defined & activated, mbuf not fragmented and 'in order':
>> + * - Flow aging related data (see age field in `struct rte_sft_flow_status`)
>> + *   will be updated according to *mbuf_in* timestamp.
>> + * - Flow connection tracking state (see ct_state field in
>> + *   `struct rte_sft_flow_status`)  will be updated according to *mbuf_in* L4
>> + *   header contents.
>> + * - *mbuf_out* will be set to last processed mbuf.
>> + *
>> + * @param[in] mbuf_in
>> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be
>> used
>> + *   after successful call to this function.
>> + * @param[out] mbuf_out
>> + *   last processed not fragmented and in order mbuf.
> 
> If the in mbuf is not fragmented and in order, this pointer will point to the in one?
> 
>> + * @param[out] status
>> + *   Structure to dump SFT flow status once updated according to contents of
>> + *   *mbuf_in*.
> 
> Does the status bits for example fragmented is kept per connection or per flow?
> Since it is possible to get fragmented packets from both sides.
> The same goes for out of order packets
> 
> 
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success:
>> + *   - *mbuf_out* contains valid mbuf pointer, locked SFT flow recognized by
>> + *     status->fid.
>> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
>> + *     non last fragment *mbuf_in*.
>> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
>> + *     order *mbuf_in*, locked SFT flow recognized by status->fid.
>> + *   On failure a negative errno value and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_process_mbuf(struct rte_mbuf *mbuf_in,
>> +		     struct rte_mbuf **mbuf_out,
>> +		     struct rte_sft_flow_status *status,
>> +		     struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Process mbuf received on RX queue while zone value provided by caller.
>> + *
>> + * The behaviour of this function is similar to rte_sft_process_mbuf except
>> + * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
>> + * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
>> + *
>> + * @see rte_sft_process_mbuf
>> + *
>> + * @param[in] mbuf_in
>> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be used

pinter -> pointer

>> + *   after successful call to this function.
>> + * @param[out] mbuf_out
>> + *   last processed not fragmented and in order mbuf.
>> + * @param[out] status
>> + *   Structure to dump SFT flow status once updated according to contents of
>> + *   *mbuf_in*.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success:
>> + *   - *mbuf_out* contains valid mbuf pointer.
>> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
>> + *     non last fragment *mbuf_in*.
>> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
>> + *     order *mbuf_in*.
>> + *   On failure a negative errno value and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_process_mbuf_with_zone(struct rte_mbuf *mbuf_in,
>> +			       uint32_t zone,
>> +			       struct rte_mbuf **mbuf_out,
>> +			       struct rte_sft_flow_status *status,
>> +			       struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Drain next in order mbuf.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * This function behaves similar to rte_sft_process_mbuf() but acts on packets
>> + * accumulated in SFT flow due to missing in order packet. Processing done on
>> + * single mbuf at a time and `in order`. Other than above the behavior is
>> + * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
>> + * fragmented & 'in order'. This function should be called when
>> + * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
>> + * status->nb_in_order_mbufs output param !=0 and until
>> + * status->nb_in_order_mbufs == 0.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param[out] status
>> + *   Structure to dump SFT flow status once updated according to contents of
>> + *   *mbuf_in*.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   A valid mbuf in case of success, NULL otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +struct rte_mbuf *
>> +rte_sft_drain_mbuf(uint32_t fid,
>> +		   struct rte_sft_flow_status *status,
>> +		   struct rte_sft_error *error);
>> +
> 
> Fid represent a connection, so which direction do we drain the packets?
> since we can have inordered packet in from both directions right?
> 
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Activate flow in SFT.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * This function performs second phase of flow creation in SFT.
>> + * The reasons for 2 phase flow creation procedure:
>> + * 1. Missing reverse flow - flow context is shared for both flow directions
>> + *    i.e. in order maintain bidirectional flow context in RTE SFT packets
>> + *    arriving from both dirrections should be identified as packets of the
>> + *    RTE SFT flow. Consequently before creation of the SFT flow caller should
>> + *    provide reverse flow direction 7-tuple.
>> + * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone
>> should
>> + *   be notified that arrived mbuf is first in flow & decide weather to
>> + *   create new flow or it distroy before it was activated with
>> + *   rte_sft_flow_destroy.
>> + * This function completes creation of the bidirectional SFT flow & creates
>> + * entry for 7-tuple on SFT PMD defined by the tuple port for both
>> + * initiator/initiate 7-tuples.
>> + * Flow aging, connection tracking state & out of order handling will be
>> + * initialized according to the content of the *mbuf_in* passes to
>> + * rte_sft_process_mbuf/_with_zone during the phase 1 of flow creation.
>> + * Once this function returns upcoming calls
>> rte_sft_process_mbuf/_with_zone
>> + * with 7-tuple or its reverse will return handle to this flow.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param reverse_tuple
>> + *   Expected response flow 7-tuple.
>> + * @param ctx
>> + *   User defined state to set.
>> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
>> + * @param ct_enable
>> + *   Enables maintenance of status->ct_state connection tracking value for the
>> + *   flow; otherwise status->ct_state will be initialized with zeros.
>> + * @param evdev_id
>> + *   Event dev ID to enqueue end of flow event.
>> + * @param evport_id
>> + *   Event port ID to enqueue end of flow event.
>> + * @param[out] status
>> + *   Structure to dump SFT flow status once activated.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_activate(uint32_t fid,
>> +		      const struct rte_sft_7tuple *reverse_tuple,
>> +		      const struct rte_flow_item_sft *ctx,
>> +		      uint8_t ct_enable,
>> +		      uint8_t dev_id,
>> +		      uint8_t port_id,
>> +		      struct rte_sft_flow_status *status,
>> +		      struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Artificially create SFT flow.
>> + *
>> + * Function to create SFT flow before reception of the first flow packet.
>> + *
>> + * @param tuple
>> + *   Expected initiator flow 7-tuple.
>> + * @param reverse_tuple
>> + *   Expected initiate flow 7-tuple.
>> + * @param ctx
>> + *   User defined state to set.
>> + *   Setting of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
>> + * @param[out] ct_enable
>> + *   Enables maintenance of status->ct_state connection tracking value for the
>> + *   flow; otherwise status->ct_state will be initialized with zeros.
>> + * @param[out] status
>> + *   Structure to dump SFT flow status once created.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   - on success: 0, locked SFT flow recognized by status->fid.
>> + *   - on error: a negative errno value otherwise and rte_errno is set.
>> + */
>> +

No extra empty line and __rte_experimental

>> +int
>> +rte_sft_flow_create(const struct rte_sft_7tuple *tuple,
>> +		    const struct rte_sft_7tuple *reverse_tuple,
>> +		    const struct rte_flow_item_sft *ctx,
>> +		    uint8_t ct_enable,
>> +		    struct rte_sft_flow_status *status,
>> +		    struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Lock exclusively SFT flow.
>> + *
>> + * Explicit flow locking; used for handling aged flows.
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_lock(uint32_t fid);
>> + 
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Release exclusively locked SFT flow.
>> + *
>> + * When rte_sft_process_mbuf/_with_zone and rte_sft_flow_create
>> + * return *status* containing fid with defined bit on the flow considered
>> + * exclusively locked and should be unlocked with this function.
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_unlock(uint32_t fid);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Removes flow from SFT.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * - Flow should be locked by caller in order to remove it.
>> + * - Flow should have no client objects attached.
>> + *
>> + * Should be applied on aged flows, when flow aged event received.
>> + *
>> + * @code{.c}
>> + *     while (1) {
>> + *         rte_event_dequeue_burst(...);
>> + *         FOR_EACH_EV(ev) {
>> + *             uint32_t fid = ev.u64;
>> + *             rte_sft_flow_lock(fid);
>> + *             FOR_EACH_CLIENT(fid, client_id) {
>> + *                 rte_sft_flow_reset_client_obj(fid, client_obj);
>> + *                 // detached client object handling
>> + *             }
>> + *             rte_sft_flow_destroy(fid, &error);
>> + *         }
>> + *     }
>> + * @endcode
>> + *
>> + * @param fid
>> + *   SFT flow ID to destroy.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_destroy(uint32_t fid, struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Reset flow age to zero.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * Simulates last flow packet with timestamp set to just now.
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_touch(uint32_t fid, struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Set flow aging to specific value.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param aging
>> + *   New flow aging value.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_set_aging(uint32_t fid,
>> +		       uint32_t aging,
>> +		       struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Set client object for given client ID.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param client_id
>> + *   Client ID to set object for.
>> + * @param client_obj
>> + *   Pointer to opaque client object structure.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */

__rte_experimental

>> +int
>> +rte_sft_flow_set_client_obj(uint32_t fid,
>> +			    uint8_t client_id,
>> +			    void *client_obj,
>> +			    struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Get client object for given client ID.
>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param client_id
>> + *   Client ID to get object for.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   A valid client object opaque pointer in case of success, NULL otherwise
>> + *   and rte_errno is set.
>> + */

__rte_experimental

>> +void *
>> +rte_sft_flow_get_client_obj(const uint32_t fid,
>> +			    uint8_t client_id,
>> +			    struct rte_sft_error *error);
>> +
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Remove client object for given client ID.

Function name uses "reset", but description says "remove".
May be synchronize it?

>> + * Flow should be locked by caller (see rte_sft_flow_lock).
>> + *
>> + * Detaches client object from SFT flow and returns the ownership for the
>> + * client object to the caller by returning client object pointer value.
>> + * The pointer returned by this function won't be accessed any more, the caller
>> + * may release all client obj related resources & the memory allocated for
>> + * this client object.
>> + *
>> + * @param fid
>> + *   SFT flow ID.
>> + * @param client_id
>> + *   Client ID to remove object for.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   A valid client object opaque pointer in case of success, NULL otherwise
>> + *   and rte_errno is set.
>> + */

__rte_experimental

>> +void *
>> +rte_sft_flow_reset_client_obj(uint32_t fid,
>> +			      uint8_t client_id,
>> +			      struct rte_sft_error *error);
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_SFT_H_ */
>> diff --git a/lib/librte_sft/rte_sft_driver.h b/lib/librte_sft/rte_sft_driver.h
>> new file mode 100644
>> index 0000000000..0c9e28fe17
>> --- /dev/null
>> +++ b/lib/librte_sft/rte_sft_driver.h
>> @@ -0,0 +1,195 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright 2020 Mellanox Technologies, Ltd
>> + */
>> +
>> +#ifndef _RTE_SFT_DRIVER_H_
>> +#define _RTE_SFT_DRIVER_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE SFT Ethernet device PMD API
>> + *
>> + * APIs that are used by the SFT library to offload SFT operationons
>> + * to Ethernet device.
>> + */
>> +
>> +#include "rte_sft.h"
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +/**
>> + * Opaque type returned after successfully creating an entry in SFT.
>> + *
>> + * This handle can be used to manage and query the related entry (e.g. to
>> + * destroy it or update age).
>> + */
>> +struct rte_sft_entry;
>> +
>> +/**
>> + * Create SFT entry in eth_dev SFT.
>> + *
>> + * @param dev
>> + *   Pointer to Ethernet device structure.
>> + * @param tuple
>> + *   L3/L4 5-tuple - src/dest IP and port and IP protocol.
>> + * @param nat_tuple
>> + *   L3/L4 5-tuple to replace in packet original 5-tuple in order to implement
>> + *   NAT offloading; if NULL NAT offloading won't be configured for the flow.
>> + * @param aging
>> + *   Flow aging timeout in seconds.
>> + * @param ctx
>> + *   Initial values in SFT flow context
>> + *   (see RTE flow struct rte_flow_item_sft).
>> + *   ctx->zone should be valid.
>> + * @param fid
>> + *   SFT flow ID for the entry to create on *device*.
>> + *   If there is an entry for the *fid* in PMD it will be updated with the
>> + *   values of *ctx*.
>> + * @param[out] queue_index
>> + *   if PMD can figure out the queue where the flow packets will
>> + *   arrive in RX data path it will set the value of queue_index; otherwise
>> + *   all bits will be turned on.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   A valid handle in case of success, NULL otherwise and rte_errno is set.
>> + */
>> +typedef struct rte_sft_entry *(*sft_entry_create_t) (struct rte_eth_dev *dev,
>> +		const struct rte_sft_5tuple *tuple,
>> +		const struct rte_sft_5tuple *nat_tuple,
>> +		const uint32_t aging,
>> +		const struct rte_flow_item_sft *ctx,
>> +		const uint32_t fid,
>> +		uint16_t *queue_index,
>> +		struct rte_sft_error *error);
>> +
> 
> I think for easier reading, the API should change to have 6 tuple (5 + zone)
> the ctx should be removed and replaced with the state.
> 
> Then add new API to modify the ctx
> typedef int (*sft_modify_state)(struct rte_eth_dev *dev, uint8 state);
> The main issue we my suggestion is that it will force the PMD to store the information to recreate
> the rule, data that is already  saved by the SFT.
> 
> Also I don't see why we need queue index, since the RSS and queue will be configured by the RTE flow
> in a different group.
> 
>> +/**
>> + * Destroy SFT entry in eth_dev SFT.
>> + *
>> + * @param dev
>> + *   Pointer to Ethernet device structure.
>> + * @param entry
>> + *   Handle to the SFT entry to destroy.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */
>> +typedef int (*sft_entry_destroy_t)(struct rte_eth_dev *dev,
>> +		struct rte_sft_entry *entry,
>> +		struct rte_sft_error *error);
>> +
>> +/**
>> + * Decodes SFT flow context if attached to mbuf by action ``SFT``.
>> + * @see RTE flow RTE_FLOW_ACTION_TYPE_SFT.
>> + *
>> + * @param dev
>> + *   Pointer to Ethernet device structure.
>> + * @param mbuf
>> + *   mbuf of the packet to decode attached state from.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   A valid SFT flow context in case of success, NULL otherwise and rte_errno
>> + *   is set.
>> + */
>> +typedef struct rte_flow_item_sft *(*sft_entry_mbuf_decode_ctx_t)(
>> +		struct rte_eth_dev *dev,
>> +		const struct rte_mbuf *mbuf,
>> +		struct rte_sft_error *error);
>> +
> 
> What about returning int as error code, and return the rte_flow_item_sft
> as out parameter?
> This will remove the allocation and free.
> 
>> +/**
>> + * Get aged-out SFT entries.
>> + *
>> + * Report entry as aged-out if timeout passed without any matching
>> + * on the SFT entry.
>> + *
>> + * @param[in] dev
>> + *   Pointer to Ethernet device structure.
>> + * @param[in, out] fid_aged
>> + *   The address of an array of aged-out SFT flow IDs.
>> + * @param[in] nb_aged
>> + *   The length of *fid_aged* array pointers.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. Initialized in case of
>> + *   error only.
>> + *
>> + * @return
>> + *   if nb_aged is 0, return the amount of all aged flows.
>> + *   if nb_aged is not 0 , return the amount of aged flows reported
>> + *   in the *fid_aged* array, otherwise negative errno value.
>> + */
>> +typedef int (*sft_entry_get_aged_entries_t)(struct rte_eth_dev *dev,
>> +		uint32_t *fid_aged,
>> +		int nb_aged,
>> +		struct rte_sft_error *error);
>> +
>> +/**
>> + * Simulate SFT entry match in terms of entry aging.
>> + *
>> + * @param dev
>> + *   Pointer to Ethernet device structure.
>> + * @param fid
>> + *   SFT flow ID paired with dev to retrieve related SFT entry.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */
>> +typedef int (*sft_entry_touch_t)(struct rte_eth_dev *dev,
>> +		uint32_t fid,
>> +		struct rte_sft_error *error);
>> +
>> +/**
>> + * Set SFT entry aging to specific value.
>> + *
>> + * @param dev
>> + *   Pointer to Ethernet device structure.
>> + * @param fid
>> + *   SFT flow ID paired with dev to retrieve related SFT entry.
>> + * @param aging
>> + *   New entry aging value.
>> + * @param[out] error
>> + *   Perform verbose error reporting if not NULL. PMDs initialize this
>> + *   structure in case of error only.
>> + *
>> + * @return
>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>> + */
>> +typedef int (*sft_entry_set_aging_t)(struct rte_eth_dev *dev,
>> +		uint32_t fid,
>> +		uint32_t aging,
>> +		struct rte_sft_error *error);
>> +
>> +/** SFT operations function pointer table */
>> +struct rte_sft_ops {
>> +	sft_entry_create_t entry_create;
>> +	/**< Create SFT entry in eth_dev SFT. */
>> +	sft_entry_destroy_t entry_destroy;
>> +	/**< Destroy SFT entry in eth_dev SFT. */
>> +	sft_entry_mbuf_decode_ctx_t mbuf_decode_ctx;
>> +	/**< Decodes SFT flow context if attached to mbuf by action ``SFT``. */
>> +	sft_entry_get_aged_entries_t get_aged_entries;
>> +	/**< Get aged-out SFT entries. */
>> +	sft_entry_touch_t entry_touch;
>> +	/**< Simulate SFT entry match in terms of entry aging. */
>> +	sft_entry_set_aging_t set_aging;
>> +	/**< Set SFT entry aging to specific value. */
>> +};
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_SFT_DRIVER_H_ */
>> diff --git a/lib/librte_sft/rte_sft_version.map
>> b/lib/librte_sft/rte_sft_version.map
>> new file mode 100644
>> index 0000000000..747e100ac5
>> --- /dev/null
>> +++ b/lib/librte_sft/rte_sft_version.map
>> @@ -0,0 +1,21 @@
>> +EXPERIMENTAL {
>> +	global:
>> +
>> +	rte_sft_flow_get_status;
>> +	rte_sft_flow_set_ctx;
>> +	rte_sft_init;
>> +	rte_sft_fini;
>> +	rte_sft_process_mbuf;
>> +	rte_sft_process_mbuf_with_zone;
>> +	rte_sft_drain_mbuf;
>> +	rte_sft_flow_activate;
>> +	rte_sft_flow_create;
>> +	rte_sft_flow_lock;
>> +	rte_sft_flow_unlock;
>> +	rte_sft_flow_destroy;
>> +	rte_sft_flow_touch;
>> +	rte_sft_flow_set_aging;
>> +	rte_sft_flow_set_client_obj;
>> +	rte_sft_flow_get_client_obj;
>> +	rte_sft_flow_reset_client_obj;

If I'm not mistaken, it should be alphabetially sorted.

>> +};
>> --
>> 2.26.2
> 
> Best,
> Ori
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 3/3] sft: introduce API
  2020-09-09 20:30 ` [dpdk-dev] [RFC 3/3] sft: introduce API Andrey Vesnovaty
  2020-09-16 18:33   ` Ori Kam
@ 2020-09-18 13:34   ` Kinsella, Ray
  1 sibling, 0 replies; 17+ messages in thread
From: Kinsella, Ray @ 2020-09-18 13:34 UTC (permalink / raw)
  To: Andrey Vesnovaty, dev
  Cc: thomas, orika, viacheslavo, andrey.vesnovaty, ozsh, elibr, alexr,
	roniba, Neil Horman



On 09/09/2020 21:30, Andrey Vesnovaty wrote:
> Defines RTE SFT APIs for Statefull Flow Table library.
> 
> SFT General description:
> SFT library provides a framework for applications that need to maintain
> context across different packets of the connection.
> Examples for such applications:
> - Next-generation firewalls
> - Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
> - SW/Virtual Switching: OVS
> The goals of the SFT library:
> - Accelerate flow recognition & its context retrieval for further
>   lookaside processing.
> - Enable context-aware flow handling offload.
> 
> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
> ---
>  lib/librte_sft/Makefile            |  28 +
>  lib/librte_sft/meson.build         |   7 +
>  lib/librte_sft/rte_sft.c           |   9 +
>  lib/librte_sft/rte_sft.h           | 845 +++++++++++++++++++++++++++++
>  lib/librte_sft/rte_sft_driver.h    | 195 +++++++
>  lib/librte_sft/rte_sft_version.map |  21 +
>  6 files changed, 1105 insertions(+)
>  create mode 100644 lib/librte_sft/Makefile
>  create mode 100644 lib/librte_sft/meson.build
>  create mode 100644 lib/librte_sft/rte_sft.c
>  create mode 100644 lib/librte_sft/rte_sft.h
>  create mode 100644 lib/librte_sft/rte_sft_driver.h
>  create mode 100644 lib/librte_sft/rte_sft_version.map
> 
> diff --git a/lib/librte_sft/Makefile b/lib/librte_sft/Makefile
> new file mode 100644
> index 0000000000..23c6eee849
> --- /dev/null
> +++ b/lib/librte_sft/Makefile
> @@ -0,0 +1,28 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2020 Mellanox Technologies, Ltd
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_sft.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf
> +
> +# library source files
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_sft.c
> +
> +# export include files
> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft_driver.h
> +
> +# versioning export map
> +EXPORT_MAP := rte_sft_version.map
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_sft/meson.build b/lib/librte_sft/meson.build
> new file mode 100644
> index 0000000000..b210e43f29
> --- /dev/null
> +++ b/lib/librte_sft/meson.build
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright 2020 Mellanox Technologies, Ltd
> +
> +sources = files('rte_sft.c')
> +headers = files('rte_sft.h',
> +	'rte_sft_driver.h')
> +deps += ['mbuf']
> diff --git a/lib/librte_sft/rte_sft.c b/lib/librte_sft/rte_sft.c
> new file mode 100644
> index 0000000000..f3d3945545
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft.c
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +
> +#include "rte_sft.h"
> +#include "rte_sft_driver.h"
> +
> +/* Placeholder for RTE SFT library APIs implementation */
> diff --git a/lib/librte_sft/rte_sft.h b/lib/librte_sft/rte_sft.h
> new file mode 100644
> index 0000000000..5c9f92ea9f
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft.h
> @@ -0,0 +1,845 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +#ifndef _RTE_SFT_H_
> +#define _RTE_SFT_H_
> +
> +/**
> + * @file
> + *
> + * RTE SFT API
> + *
> + * Defines RTE SFT APIs for Statefull Flow Table library.
> + *
> + * SFT General description:
> + * SFT library provides a framework for applications that need to maintain
> + * context across different packets of the connection.
> + * Examples for such applications:
> + * - Next-generation firewalls
> + * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
> + * - SW/Virtual Switching: OVS
> + * The goals of the SFT library:
> + * - Accelerate flow recognition & its context retrieval for further lookaside
> + *   processing.
> + * - Enable context-aware flow handling offload.
> + *
> + * Definitions and Abbreviations:
> + * - 5-tuple: defined by:
> + *     -- Source IP address
> + *     -- Source port
> + *     -- Destination IP address
> + *     -- Destination port
> + *     -- IP protocol number
> + * - 7-tuple: 5-tuple zone and port (see struct rte_sft_7tuple)
> + * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
> + * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
> + * - application: SFT library API consumer
> + * - APP: see application
> + * - CID: client ID
> + * - CT: connection tracking
> + * - FID: Flow identifier
> + * - FIF: First In Flow
> + * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
> + * - SFT: Stateful Flow Table
> + * - user: see application
> + * - zone: additional user defined value used as differentiator for
> + *         connections having same 5-tuple (for example different VxLan
> + *         connections with same inner 5-tuple).
> + *
> + * SFT components:
> + *
> + * +-----------------------------------+
> + * | RTE flow                          |
> + * |                                   |
> + * | +-------------------------------+ |  +----------------+
> + * | | group X                       | |  | RTE_SFT        |
> + * | |                               | |  |                |
> + * | | +---------------------------+ | |  |                |
> + * | | | rule ...                  | | |  |                |
> + * | | | .                         | | |  +-----------+----+
> + * | | | .                         | | |              |
> + * | | | .                         | | |          entry
> + * | | +---------------------------+ | |            create
> + * | | | rule                      | | |              |
> + * | | |   patterns ...            +---------+        |
> + * | | |   actions                 | | |     |        |
> + * | | |     SFT (zone=Z)          | | |     |        |
> + * | | |     JUMP (group=Y)        | | |  lookup      |
> + * | | +---------------------------+ | |    zone=Z,   |
> + * | | | rule ...                  | | |    5tuple    |
> + * | | | .                         | | |     |        |
> + * | | | .                         | | |  +--v-------------+
> + * | | | .                         | | |  | SFT       |    |
> + * | | |                           | | |  |           |    |
> + * | | +---------------------------+ | |  |        +--v--+ |
> + * | |                               | |  |        |     | |
> + * | +-------------------------------+ |  |        | PMD | |
> + * |                                   |  |        |     | |
> + * |                                   |  |        +-----+ |
> + * | +-------------------------------+ |  |                |
> + * | | group Y                       | |  |                |
> + * | |                               | |  | set flow CTX   |
> + * | | +---------------------------+ | |  |                |
> + * | | | rule                      | | |  +--------+-------+
> + * | | |   patterns                | | |           |
> + * | | |     SFT (state=UNDEFINED) | | |           |
> + * | | |   actions RSS             | | |           |
> + * | | +---------------------------+ | |           |
> + * | | | rule                      | | |           |
> + * | | |   patterns                | | |           |
> + * | | |     SFT (state=INVALID)   | <-------------+
> + * | | |   actions DROP            | | |  forward
> + * | | +---------------------------+ | |    group=Y
> + * | | | rule                      | | |
> + * | | |   patterns                | | |
> + * | | |     SFT (state=ACCEPTED)  | | |
> + * | | |   actions PORT            | | |
> + * | | +---------------------------+ | |
> + * | |  ...                          | |
> + * | |                               | |
> + * | +-------------------------------+ |
> + * |  ...                              |
> + * |                                   |
> + * +-----------------------------------+
> + *
> + * SFT as datastructure:
> + * SFT can be treated as datastructure maintaining flow context across its
> + * lifetime. SFT flow entry represent bidirectional network flow and defined by
> + * 7-tuple & its reverse 7-tuple.
> + * Each entry in SFT has:
> + * - FID: 1:1 mapped & used as entry handle & encapsulating internal
> + *   implementation of the entry.
> + * - State: user-defined value attached to each entry, the only library
> + *   reserved value for state unset (the actual value defined by SFT
> + *   configuration). The application should define flow state encodings and
> + *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
> + *   applied on packets can be defined via related RTE flow rule matching SFT
> + *   state (see rules in SFT components diagram above).
> + * - Timestamp: for the last seen in flow packet used for flow aging mechanism
> + *   implementation.
> + * - Client Objects: user-defined flow contexts attached as opaques to flow.
> + * - Acceleration & offloading - utilize RTE flow capabilities, when supported
> + *   (see action ``SFT``), for flow lookup acceleration and further
> + *   context-aware flow handling offload.
> + * - CT state: optionally for TCP connections CT state can be maintained
> + *   (see enum rte_sft_flow_ct_state).
> + * - Out of order TCP packets: optionally SFT can keep out of order TCP
> + *   packets aside the flow context till the arrival of the missing in-order
> + *   packet.
> + *
> + * RTE flow changes:
> + * The SFT flow state (or context) for RTE flow is defined by fields of
> + * struct rte_flow_item_sft.
> + * To utilize SFT capabilities new item and action types introduced:
> + * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
> + * - action SFT: retrieve SFT flow context and attache it to the processed
> + *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
> + *
> + * The contents of per port SFT serving RTE flow action ``SFT`` managed via
> + * SFT PMD APIs (see struct rte_sft_ops).
> + * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
> + * action argument and processed packet 5-tuple.
> + * If in scope of action ``SFT`` there is no context/state for the flow in SFT
> + * undefined sate attached to the packet meaning that the flow is not
> + * recognized by SFT, most probably FIF packet.
> + *
> + * Once the SFT state set for a packet it can match on item SFT
> + * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for the
> + * packet, for example:
> + * - if state value == x than queue for further processing by the application
> + * - if state value == y than forward it to eth port (full offload)
> + * - if state value == 'undefined' than queue for further processing by
> + *   the application (handle FIF packets)
> + *
> + * Processing packets with SFT library:
> + *
> + * FIF packet:
> + * To recognize upcoming packets of the SFT flow every FIF packet should be
> + * forwarded to the application utilizing the SFT library. Non-FIF packets can
> + * be processed by the application or its processing can be fully offloaded.
> + * Processing of the packets in SFT library starts with rte_sft_process_mbuf
> + * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
> + * should make a design to destroy flow or complete flow creation process in
> + * SFT using rte_sft_flow_activate.
> + *
> + * Recognized SFT flow:
> + * Once struct rte_sft_flow_status with valid fid field posesed by application
> + * it can:
> + * - mange client objects on it (see client_obj field in
> + *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
> + * - analyze user-defined flow state and CT state (see state & ct_sate fields
> + *   in struct rte_sft_flow_status).
> + * - set flow state to be attached to the upcoming packets by action ``SFT``
> + *   via struct rte_sft_flow_status API.
> + * - decide to destroy flow via rte_sft_flow_destroy API.
> + *
> + * Flow aging:
> + *
> + * SFT library manages the aging for each flow. On flow creation, it's
> + * assigned an aging value, the maximal number of seconds passed since the
> + * last flow packet arrived, once exceeded flow considered aged.
> + * The application notified of aged flow asynchronously via event queues.
> + * The device and port IDs tuple to identify the event queue to enqueue
> + * flow aged events passed on flow creation as arguments
> + * (see rte_sft_flow_activate). It's the application responsibility to
> + * initialize event queues and assign them to each flow for EOF event
> + * notifications.
> + * Aged EOF event handling:
> + * - Should be considered as application responsibility.
> + * - The last stage should be the release of the flow resources via
> + *    rte_sft_flow_destroy API.
> + * - All client objects should be removed from flow before the
> + *   rte_sft_flow_destroy API call.
> + * See the description of rete_sft_flow_destroy for an example of aged flow
> + * handling.
> + *
> + * SFT API thread safety:
> + *
> + * SFT library APIs are thread-safe while handling of specific flow can be
> + * done in a single thread simultaneously. Exclusive access to specific SFT
> + * flow guaranteed by:
> + * - rte_sft_process_mbuf
> + * - rte_sft_process_mbuf_with_zone
> + * - rte_sft_flow_create
> + * - rte_sft_flow_lock
> + * When application is done with the flow handling for the current packet it
> + * should call rte_sft_flow_unlock API to maintain exclusive access to the
> + * flow with other threads.
> + *
> + * SFT Library initialization and cleanup:
> + *
> + * SFT library should be considered as a single instance, preconfigured and
> + * initialized via rte_sft_init() API.
> + * SFT library resource deallocation and cleanup should be done via
> + * rte_sft_init() API as a stage of the application termination procedure.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_errno.h>
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +#include <rte_flow.h>
> +
> +/**
> + * L3/L4 5-tuple - src/dest IP and port and IP protocol.
> + *
> + * Used for flow/connection identification.
> + */
> +struct rte_sft_5tuple {
> +	union {
> +		struct {
> +			rte_be32_t src_addr; /**< IPv4 source address. */
> +			rte_be32_t dst_addr; /**< IPv4 destination address. */
> +		} ipv4;
> +		struct {
> +			uint8_t src_addr[16]; /**< IPv6 source address. */
> +			uint8_t dst_addr[16]; /**< IPv6 destination address. */
> +		} ipv6;
> +	};
> +	uint16_t src_port; /**< Source port. */
> +	uint16_t dst_port; /**< Destination port. */
> +	uint8_t proto; /**< IP protocol. */
> +	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
> +};
> +
> +/**
> + * Port flow identification.
> + *
> + * @p zone used for setups where 5-tuple is not enough to identify flow.
> + * For example different VLANs/VXLANs may have similar 5-tuples.
> + */
> +struct rte_sft_7tuple {
> +	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
> +	uint32_t zone; /**< Zone assigned to flow. */
> +	uint16_t port_id; /** <Port identifier of Ethernet device. */
> +};
> +
> +/**
> + * Flow connection tracking states
> + */
> +enum rte_sft_flow_ct_state {
> +	RTE_SFT_FLOW_CT_STATE_NEW  = (1 << 0),
> +	RTE_SFT_FLOW_CT_STATE_EST  = (1 << 1),
> +	RTE_SFT_FLOW_CT_STATE_REL  = (1 << 2),
> +	RTE_SFT_FLOW_CT_STATE_RPL  = (1 << 3),
> +	RTE_SFT_FLOW_CT_STATE_INV  = (1 << 4),
> +	RTE_SFT_FLOW_CT_STATE_TRK  = (1 << 5),
> +	RTE_SFT_FLOW_CT_STATE_SNAT = (1 << 6),
> +	RTE_SFT_FLOW_CT_STATE_DNAT = (1 << 7),
> +};
> +
> +/**
> + * Structure describes SFT library configuration
> + */
> +struct rte_sft_conf {
> +	uint32_t UDP_aging; /**< UDP proto default aging. */
> +	uint32_t TCP_aging; /**< TCP proto default aging. */
> +	uint32_t TCP_SYN_aging; /**< TCP SYN default aging. */
> +	uint32_t OTHER_aging; /**< All unlisted proto default aging. */
> +	uint32_t size; /**< Max entries in SFT. */
> +	uint8_t undefined_state; /**< Undefined state constant. */
> +	uint8_t reorder_enable: 1;
> +	/**< TCP packet reordering feature enabled bit. */
> +	uint8_t ct_enable: 1; /**< Connection tracking feature enabled bit. */
> +};
> +
> +/**
> + * Structure describes the state of the flow in SFT.
> + */
> +struct rte_sft_flow_status {
> +	uint32_t fid; /**< SFT flow id. */
> +	uint32_t zone; /**< Zone for lookup in SFT */
> +	uint8_t state; /**< Application defined bidirectional flow state. */
> +	uint8_t ct_state; /**< Connection tracking flow state. */
> +	uint32_t age; /**< Seconds passed since last flown packet. */
> +	uint32_t aging;
> +	/**< Flow considered aged once this age (seconds) reached. */
> +	uint32_t nb_in_order_mbufs;
> +	/**< Number of in-order mbufs available for drain */
> +	void **client_obj; /**< Array of clients attached to flow. */
> +	int nb_clients; /**< Number of clients attached to flow. */
> +	uint8_t defined: 1; /**< Flow defined in SFT bit. */
> +	uint8_t activated: 1; /**< Flow activation bit. */
> +	uint8_t fragmented: 1; /**< Last flow mbuf was fragmented. */
> +	uint8_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get SFT flow status.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] status
> + *   Structure to dump actual SFT flow status.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_get_status(const uint32_t fid,
> +			struct rte_sft_flow_status *status,
> +			struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set user defined context.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Updates per ethernet dev SFT entries:
> + * - flow lookup acceleration
> + * - partial/full flow offloading managed by flow context
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param ctx
> + *   User defined state to set.
> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_ctx(uint32_t fid,
> +		     const struct rte_flow_item_sft *ctx,
> +		     struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Initialize SFT library instance.
> + *
> + * @param conf
> + *   SFT library instance configuration.
> + *
> + * @return
> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_init(const struct rte_sft_conf *conf);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Finalize SFT library instance.
> + * Cleanup & release allocated resources.
> + */
> +void
> +rte_sft_fini(void);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Process mbuf received on RX queue.
> + *
> + * Fragmentation handling (SFT fragmentation feature configured):
> + * If *mbuf_in* of fragmented packet received it will be stored by SFT library.
> + * status->fragmented bit will be set and *mbuf_out* will be set to NULL.
> + * On reception of all related fragments of IP packet it will be reassembled
> + * and further processed by this function on reception of last fragment.
> + *
> + * Flow definition:
> + * SFT flow defined by one of its 7-tuples, since there is no zone value as
> + * argument flow should be defined by context attached to mbuf with action
> + * ``SFT`` (see RTE flow RTE_FLOW_ACTION_TYPE_SFT). Otherwise status->defined
> + * field will be turned off & *mbuf_out* will be set to *mbuf_in*.
> + * In order to define flow for *mbuf_in* without attached sft context
> + * rte_sft_process_mbuf_with_zone() should be used with *zone* argument
> + * supplied by caller.
> + *
> + * Flow lookup:
> + * If SFT flow identifier can't be retrieved from SFT context attached to
> + * *mbuf_in* by action ``SFT`` - SFT lookup should be performmed by zone,
> + * retrieved from SFT context attached to *mbuf_in*, and 5-tuple, extracted
> + * form mbuf outer header contents.
> + *
> + * Flow defined but does not exists:
> + * If flow not found in SFT inactivated flow will be created in SFT.
> + * status->activated field will be turned off & *mbuf_out* be set to *mbuf_in*.
> + * In order to activate created flow rte_sft_flow_activate() should be used
> + * with reverse 7-tuple supplied by caller.
> + * This is first phase of flow creation in SFT for second phase & more detailed
> + * descriotion of flow creation see rte_sft_flow_activate.
> + *
> + * Out of order (SFT out of oreder feature configured):
> + * If flow defined & activated but *mbuf_in* is TCP out of order packet it will
> + * be stored by SFT library. status->out_of_order bit will be set & *mbuf_out*
> + * will be set to NULL. On reception of the first missing in order packet
> + * status->nb_in_order_mbufs will be set to number of mbufs that available for
> + * processing with rte_sft_drain_mbuf().
> + *
> + * Flow defined & activated, mbuf not fragmented and 'in order':
> + * - Flow aging related data (see age field in `struct rte_sft_flow_status`)
> + *   will be updated according to *mbuf_in* timestamp.
> + * - Flow connection tracking state (see ct_state field in
> + *   `struct rte_sft_flow_status`)  will be updated according to *mbuf_in* L4
> + *   header contents.
> + * - *mbuf_out* will be set to last processed mbuf.
> + *
> + * @param[in] mbuf_in
> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be used
> + *   after successful call to this function.
> + * @param[out] mbuf_out
> + *   last processed not fragmented and in order mbuf.
> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success:
> + *   - *mbuf_out* contains valid mbuf pointer, locked SFT flow recognized by
> + *     status->fid.
> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> + *     non last fragment *mbuf_in*.
> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> + *     order *mbuf_in*, locked SFT flow recognized by status->fid.
> + *   On failure a negative errno value and rte_errno is set.
> + */
> +int
> +rte_sft_process_mbuf(struct rte_mbuf *mbuf_in,
> +		     struct rte_mbuf **mbuf_out,
> +		     struct rte_sft_flow_status *status,
> +		     struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Process mbuf received on RX queue while zone value provided by caller.
> + *
> + * The behaviour of this function is similar to rte_sft_process_mbuf except
> + * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
> + * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
> + *
> + * @see rte_sft_process_mbuf
> + *
> + * @param[in] mbuf_in
> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be used
> + *   after successful call to this function.
> + * @param[out] mbuf_out
> + *   last processed not fragmented and in order mbuf.
> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success:
> + *   - *mbuf_out* contains valid mbuf pointer.
> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> + *     non last fragment *mbuf_in*.
> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> + *     order *mbuf_in*.
> + *   On failure a negative errno value and rte_errno is set.
> + */
> +int
> +rte_sft_process_mbuf_with_zone(struct rte_mbuf *mbuf_in,
> +			       uint32_t zone,
> +			       struct rte_mbuf **mbuf_out,
> +			       struct rte_sft_flow_status *status,
> +			       struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Drain next in order mbuf.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * This function behaves similar to rte_sft_process_mbuf() but acts on packets
> + * accumulated in SFT flow due to missing in order packet. Processing done on
> + * single mbuf at a time and `in order`. Other than above the behavior is
> + * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
> + * fragmented & 'in order'. This function should be called when
> + * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
> + * status->nb_in_order_mbufs output param !=0 and until
> + * status->nb_in_order_mbufs == 0.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] status
> + *   Structure to dump SFT flow status once updated according to contents of
> + *   *mbuf_in*.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid mbuf in case of success, NULL otherwise and rte_errno is set.
> + */
> +struct rte_mbuf *
> +rte_sft_drain_mbuf(uint32_t fid,
> +		   struct rte_sft_flow_status *status,
> +		   struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Activate flow in SFT.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * This function performs second phase of flow creation in SFT.
> + * The reasons for 2 phase flow creation procedure:
> + * 1. Missing reverse flow - flow context is shared for both flow directions
> + *    i.e. in order maintain bidirectional flow context in RTE SFT packets
> + *    arriving from both dirrections should be identified as packets of the
> + *    RTE SFT flow. Consequently before creation of the SFT flow caller should
> + *    provide reverse flow direction 7-tuple.
> + * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone should
> + *   be notified that arrived mbuf is first in flow & decide weather to
> + *   create new flow or it distroy before it was activated with
> + *   rte_sft_flow_destroy.
> + * This function completes creation of the bidirectional SFT flow & creates
> + * entry for 7-tuple on SFT PMD defined by the tuple port for both
> + * initiator/initiate 7-tuples.
> + * Flow aging, connection tracking state & out of order handling will be
> + * initialized according to the content of the *mbuf_in* passes to
> + * rte_sft_process_mbuf/_with_zone during the phase 1 of flow creation.
> + * Once this function returns upcoming calls rte_sft_process_mbuf/_with_zone
> + * with 7-tuple or its reverse will return handle to this flow.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param reverse_tuple
> + *   Expected response flow 7-tuple.
> + * @param ctx
> + *   User defined state to set.
> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param ct_enable
> + *   Enables maintenance of status->ct_state connection tracking value for the
> + *   flow; otherwise status->ct_state will be initialized with zeros.
> + * @param evdev_id
> + *   Event dev ID to enqueue end of flow event.
> + * @param evport_id
> + *   Event port ID to enqueue end of flow event.
> + * @param[out] status
> + *   Structure to dump SFT flow status once activated.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_activate(uint32_t fid,
> +		      const struct rte_sft_7tuple *reverse_tuple,
> +		      const struct rte_flow_item_sft *ctx,
> +		      uint8_t ct_enable,
> +		      uint8_t dev_id,
> +		      uint8_t port_id,
> +		      struct rte_sft_flow_status *status,
> +		      struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Artificially create SFT flow.
> + *
> + * Function to create SFT flow before reception of the first flow packet.
> + *
> + * @param tuple
> + *   Expected initiator flow 7-tuple.
> + * @param reverse_tuple
> + *   Expected initiate flow 7-tuple.
> + * @param ctx
> + *   User defined state to set.
> + *   Setting of *fid* or *zone* fields in struct rte_flow_item_sft unsupported.
> + * @param[out] ct_enable
> + *   Enables maintenance of status->ct_state connection tracking value for the
> + *   flow; otherwise status->ct_state will be initialized with zeros.
> + * @param[out] status
> + *   Structure to dump SFT flow status once created.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   - on success: 0, locked SFT flow recognized by status->fid.
> + *   - on error: a negative errno value otherwise and rte_errno is set.
> + */
> +
> +int
> +rte_sft_flow_create(const struct rte_sft_7tuple *tuple,
> +		    const struct rte_sft_7tuple *reverse_tuple,
> +		    const struct rte_flow_item_sft *ctx,
> +		    uint8_t ct_enable,
> +		    struct rte_sft_flow_status *status,
> +		    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Lock exclusively SFT flow.
> + *
> + * Explicit flow locking; used for handling aged flows.
> + *
> + * @param fid
> + *   SFT flow ID.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_lock(uint32_t fid);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Release exclusively locked SFT flow.
> + *
> + * When rte_sft_process_mbuf/_with_zone and rte_sft_flow_create
> + * return *status* containing fid with defined bit on the flow considered
> + * exclusively locked and should be unlocked with this function.
> + *
> + * @param fid
> + *   SFT flow ID.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_unlock(uint32_t fid);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Removes flow from SFT.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * - Flow should be locked by caller in order to remove it.
> + * - Flow should have no client objects attached.
> + *
> + * Should be applied on aged flows, when flow aged event received.
> + *
> + * @code{.c}
> + *     while (1) {
> + *         rte_event_dequeue_burst(...);
> + *         FOR_EACH_EV(ev) {
> + *             uint32_t fid = ev.u64;
> + *             rte_sft_flow_lock(fid);
> + *             FOR_EACH_CLIENT(fid, client_id) {
> + *                 rte_sft_flow_reset_client_obj(fid, client_obj);
> + *                 // detached client object handling
> + *             }
> + *             rte_sft_flow_destroy(fid, &error);
> + *         }
> + *     }
> + * @endcode
> + *
> + * @param fid
> + *   SFT flow ID to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_destroy(uint32_t fid, struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Reset flow age to zero.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Simulates last flow packet with timestamp set to just now.
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_touch(uint32_t fid, struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set flow aging to specific value.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param aging
> + *   New flow aging value.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_aging(uint32_t fid,
> +		       uint32_t aging,
> +		       struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Set client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to set object for.
> + * @param client_obj
> + *   Pointer to opaque client object structure.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_sft_flow_set_client_obj(uint32_t fid,
> +			    uint8_t client_id,
> +			    void *client_obj,
> +			    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to get object for.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid client object opaque pointer in case of success, NULL otherwise
> + *   and rte_errno is set.
> + */
> +void *
> +rte_sft_flow_get_client_obj(const uint32_t fid,
> +			    uint8_t client_id,
> +			    struct rte_sft_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Remove client object for given client ID.
> + * Flow should be locked by caller (see rte_sft_flow_lock).
> + *
> + * Detaches client object from SFT flow and returns the ownership for the
> + * client object to the caller by returning client object pointer value.
> + * The pointer returned by this function won't be accessed any more, the caller
> + * may release all client obj related resources & the memory allocated for
> + * this client object.
> + *
> + * @param fid
> + *   SFT flow ID.
> + * @param client_id
> + *   Client ID to remove object for.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid client object opaque pointer in case of success, NULL otherwise
> + *   and rte_errno is set.
> + */
> +void *
> +rte_sft_flow_reset_client_obj(uint32_t fid,
> +			      uint8_t client_id,
> +			      struct rte_sft_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_SFT_H_ */
> diff --git a/lib/librte_sft/rte_sft_driver.h b/lib/librte_sft/rte_sft_driver.h
> new file mode 100644
> index 0000000000..0c9e28fe17
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft_driver.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +
> +#ifndef _RTE_SFT_DRIVER_H_
> +#define _RTE_SFT_DRIVER_H_
> +
> +/**
> + * @file
> + *
> + * RTE SFT Ethernet device PMD API
> + *
> + * APIs that are used by the SFT library to offload SFT operationons
> + * to Ethernet device.
> + */
> +
> +#include "rte_sft.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Opaque type returned after successfully creating an entry in SFT.
> + *
> + * This handle can be used to manage and query the related entry (e.g. to
> + * destroy it or update age).
> + */
> +struct rte_sft_entry;
> +
> +/**
> + * Create SFT entry in eth_dev SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param tuple
> + *   L3/L4 5-tuple - src/dest IP and port and IP protocol.
> + * @param nat_tuple
> + *   L3/L4 5-tuple to replace in packet original 5-tuple in order to implement
> + *   NAT offloading; if NULL NAT offloading won't be configured for the flow.
> + * @param aging
> + *   Flow aging timeout in seconds.
> + * @param ctx
> + *   Initial values in SFT flow context
> + *   (see RTE flow struct rte_flow_item_sft).
> + *   ctx->zone should be valid.
> + * @param fid
> + *   SFT flow ID for the entry to create on *device*.
> + *   If there is an entry for the *fid* in PMD it will be updated with the
> + *   values of *ctx*.
> + * @param[out] queue_index
> + *   if PMD can figure out the queue where the flow packets will
> + *   arrive in RX data path it will set the value of queue_index; otherwise
> + *   all bits will be turned on.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid handle in case of success, NULL otherwise and rte_errno is set.
> + */
> +typedef struct rte_sft_entry *(*sft_entry_create_t) (struct rte_eth_dev *dev,
> +		const struct rte_sft_5tuple *tuple,
> +		const struct rte_sft_5tuple *nat_tuple,
> +		const uint32_t aging,
> +		const struct rte_flow_item_sft *ctx,
> +		const uint32_t fid,
> +		uint16_t *queue_index,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Destroy SFT entry in eth_dev SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param entry
> + *   Handle to the SFT entry to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_destroy_t)(struct rte_eth_dev *dev,
> +		struct rte_sft_entry *entry,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Decodes SFT flow context if attached to mbuf by action ``SFT``.
> + * @see RTE flow RTE_FLOW_ACTION_TYPE_SFT.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param mbuf
> + *   mbuf of the packet to decode attached state from.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid SFT flow context in case of success, NULL otherwise and rte_errno
> + *   is set.
> + */
> +typedef struct rte_flow_item_sft *(*sft_entry_mbuf_decode_ctx_t)(
> +		struct rte_eth_dev *dev,
> +		const struct rte_mbuf *mbuf,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Get aged-out SFT entries.
> + *
> + * Report entry as aged-out if timeout passed without any matching
> + * on the SFT entry.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet device structure.
> + * @param[in, out] fid_aged
> + *   The address of an array of aged-out SFT flow IDs.
> + * @param[in] nb_aged
> + *   The length of *fid_aged* array pointers.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. Initialized in case of
> + *   error only.
> + *
> + * @return
> + *   if nb_aged is 0, return the amount of all aged flows.
> + *   if nb_aged is not 0 , return the amount of aged flows reported
> + *   in the *fid_aged* array, otherwise negative errno value.
> + */
> +typedef int (*sft_entry_get_aged_entries_t)(struct rte_eth_dev *dev,
> +		uint32_t *fid_aged,
> +		int nb_aged,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Simulate SFT entry match in terms of entry aging.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param fid
> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_touch_t)(struct rte_eth_dev *dev,
> +		uint32_t fid,
> +		struct rte_sft_error *error);
> +
> +/**
> + * Set SFT entry aging to specific value.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param fid
> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> + * @param aging
> + *   New entry aging value.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +typedef int (*sft_entry_set_aging_t)(struct rte_eth_dev *dev,
> +		uint32_t fid,
> +		uint32_t aging,
> +		struct rte_sft_error *error);
> +
> +/** SFT operations function pointer table */
> +struct rte_sft_ops {
> +	sft_entry_create_t entry_create;
> +	/**< Create SFT entry in eth_dev SFT. */
> +	sft_entry_destroy_t entry_destroy;
> +	/**< Destroy SFT entry in eth_dev SFT. */
> +	sft_entry_mbuf_decode_ctx_t mbuf_decode_ctx;
> +	/**< Decodes SFT flow context if attached to mbuf by action ``SFT``. */
> +	sft_entry_get_aged_entries_t get_aged_entries;
> +	/**< Get aged-out SFT entries. */
> +	sft_entry_touch_t entry_touch;
> +	/**< Simulate SFT entry match in terms of entry aging. */
> +	sft_entry_set_aging_t set_aging;
> +	/**< Set SFT entry aging to specific value. */
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_SFT_DRIVER_H_ */
> diff --git a/lib/librte_sft/rte_sft_version.map b/lib/librte_sft/rte_sft_version.map
> new file mode 100644
> index 0000000000..747e100ac5
> --- /dev/null
> +++ b/lib/librte_sft/rte_sft_version.map
> @@ -0,0 +1,21 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_sft_flow_get_status;
> +	rte_sft_flow_set_ctx;
> +	rte_sft_init;
> +	rte_sft_fini;
> +	rte_sft_process_mbuf;
> +	rte_sft_process_mbuf_with_zone;
> +	rte_sft_drain_mbuf;
> +	rte_sft_flow_activate;
> +	rte_sft_flow_create;
> +	rte_sft_flow_lock;
> +	rte_sft_flow_unlock;
> +	rte_sft_flow_destroy;
> +	rte_sft_flow_touch;
> +	rte_sft_flow_set_aging;
> +	rte_sft_flow_set_client_obj;
> +	rte_sft_flow_get_client_obj;
> +	rte_sft_flow_reset_client_obj;
> +};
> 

Missing the __rte_experimental  attribute in front of all these in the header.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dpdk-dev] [RFC 3/3] sft: introduce API
  2020-09-18  7:43     ` Andrew Rybchenko
@ 2020-11-02 10:49       ` Andrey Vesnovaty
  0 siblings, 0 replies; 17+ messages in thread
From: Andrey Vesnovaty @ 2020-11-02 10:49 UTC (permalink / raw)
  To: Andrew Rybchenko, Ori Kam, Kinsella, Ray, dev
  Cc: thomas, Slava Ovsiienko, andrey.vesnovaty, Oz Shlomo,
	Eli Britstein, Alex Rosenbaum, Roni Bar Yanai, Ray Kinsella,
	Neil Horman, Ferruh Yigit

Hi Andrew, Ray and everybody reviewing SFT RFC patch series.

Anything you comments to this patchset coupled with other open issues
made me & Ori rethink the suggested API.
Ori is about to publish V2 for SFT RFC very soon addressing al the mentioned
above.   

Thanks,
Andrey

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Friday, September 18, 2020 10:44 AM
> To: Ori Kam <orika@nvidia.com>; Andrey Vesnovaty <andreyv@nvidia.com>;
> dev@dpdk.org
> Cc: thomas@nvidia.net; Slava Ovsiienko <viacheslavo@nvidia.com>;
> andrey.vesnovaty@gmail.com; Oz Shlomo <ozsh@nvidia.com>; Eli Britstein
> <elibr@nvidia.com>; Alex Rosenbaum <alexr@nvidia.com>; Roni Bar Yanai
> <roniba@nvidia.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman
> <nhorman@tuxdriver.com>; Ferruh Yigit <ferruh.yigit@intel.com>
> Subject: Re: [RFC 3/3] sft: introduce API
> 
> Hi Andrey,
> 
> looks very interesting, but a bit hard to review.
> I hope I'll do deeper review on the next version.
> Right not just few cosmetic things to make the
> next version a bit clearer.
> 
> Do you plan to create/publish an example appliction
> which uses the API and demonstrates how to do it?
> 
> Plesee, see below.
> 
> Thanks,
> Andrew.
> 
> On 9/16/20 9:33 PM, Ori Kam wrote:
> > Hi Andery,
> > PSB
> >
> >> -----Original Message-----
> >> From: Andrey Vesnovaty <andreyv@nvidia.com>
> >> Sent: Wednesday, September 9, 2020 11:30 PM
> >> To: dev@dpdk.org
> >> Subject: [RFC 3/3] sft: introduce API
> >>
> >> Defines RTE SFT APIs for Statefull Flow Table library.
> >>
> >> SFT General description:
> >> SFT library provides a framework for applications that need to maintain
> >> context across different packets of the connection.
> >> Examples for such applications:
> >> - Next-generation firewalls
> >> - Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
> >> - SW/Virtual Switching: OVS
> >> The goals of the SFT library:
> >> - Accelerate flow recognition & its context retrieval for further
> >>   lookaside processing.
> >> - Enable context-aware flow handling offload.
> >>
> >> Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
> >> ---
> >>  lib/librte_sft/Makefile            |  28 +
> >>  lib/librte_sft/meson.build         |   7 +
> >>  lib/librte_sft/rte_sft.c           |   9 +
> >>  lib/librte_sft/rte_sft.h           | 845 +++++++++++++++++++++++++++++
> >>  lib/librte_sft/rte_sft_driver.h    | 195 +++++++
> >>  lib/librte_sft/rte_sft_version.map |  21 +
> >>  6 files changed, 1105 insertions(+)
> >>  create mode 100644 lib/librte_sft/Makefile
> >>  create mode 100644 lib/librte_sft/meson.build
> >>  create mode 100644 lib/librte_sft/rte_sft.c
> >>  create mode 100644 lib/librte_sft/rte_sft.h
> >>  create mode 100644 lib/librte_sft/rte_sft_driver.h
> >>  create mode 100644 lib/librte_sft/rte_sft_version.map
> >>
> >> diff --git a/lib/librte_sft/Makefile b/lib/librte_sft/Makefile
> >> new file mode 100644
> >> index 0000000000..23c6eee849
> >> --- /dev/null
> >> +++ b/lib/librte_sft/Makefile
> >> @@ -0,0 +1,28 @@
> >> +# SPDX-License-Identifier: BSD-3-Clause
> >> +# Copyright 2020 Mellanox Technologies, Ltd
> >> +
> >> +include $(RTE_SDK)/mk/rte.vars.mk
> >> +
> >> +# library name
> >> +LIB = librte_sft.a
> >> +
> >> +# library version
> >> +LIBABIVER := 1
> >> +
> >> +# build flags
> >> +CFLAGS += -O3
> >> +CFLAGS += $(WERROR_FLAGS)
> >> +LDLIBS += -lrte_eal -lrte_mbuf
> >> +
> >> +# library source files
> >> +# all source are stored in SRCS-y
> >> +SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_sft.c
> >> +
> >> +# export include files
> >> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft.h
> >> +SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_sft_driver.h
> >> +
> >> +# versioning export map
> >> +EXPORT_MAP := rte_sft_version.map
> >> +
> >> +include $(RTE_SDK)/mk/rte.lib.mk
> >> diff --git a/lib/librte_sft/meson.build b/lib/librte_sft/meson.build
> >> new file mode 100644
> >> index 0000000000..b210e43f29
> >> --- /dev/null
> >> +++ b/lib/librte_sft/meson.build
> >> @@ -0,0 +1,7 @@
> >> +# SPDX-License-Identifier: BSD-3-Clause
> >> +# Copyright 2020 Mellanox Technologies, Ltd
> >> +
> >> +sources = files('rte_sft.c')
> >> +headers = files('rte_sft.h',
> >> +	'rte_sft_driver.h')
> >> +deps += ['mbuf']
> >> diff --git a/lib/librte_sft/rte_sft.c b/lib/librte_sft/rte_sft.c
> >> new file mode 100644
> >> index 0000000000..f3d3945545
> >> --- /dev/null
> >> +++ b/lib/librte_sft/rte_sft.c
> >> @@ -0,0 +1,9 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright 2020 Mellanox Technologies, Ltd
> >> + */
> >> +
> >> +
> >> +#include "rte_sft.h"
> >> +#include "rte_sft_driver.h"
> >> +
> >> +/* Placeholder for RTE SFT library APIs implementation */
> >> diff --git a/lib/librte_sft/rte_sft.h b/lib/librte_sft/rte_sft.h
> >> new file mode 100644
> >> index 0000000000..5c9f92ea9f
> >> --- /dev/null
> >> +++ b/lib/librte_sft/rte_sft.h
> >> @@ -0,0 +1,845 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright 2020 Mellanox Technologies, Ltd
> >> + */
> >> +
> >> +#ifndef _RTE_SFT_H_
> >> +#define _RTE_SFT_H_
> >> +
> >> +/**
> >> + * @file
> >> + *
> >> + * RTE SFT API
> >> + *
> >> + * Defines RTE SFT APIs for Statefull Flow Table library.
> >> + *
> >> + * SFT General description:
> >> + * SFT library provides a framework for applications that need to maintain
> >> + * context across different packets of the connection.
> >> + * Examples for such applications:
> >> + * - Next-generation firewalls
> >> + * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
> >> + * - SW/Virtual Switching: OVS
> >> + * The goals of the SFT library:
> >> + * - Accelerate flow recognition & its context retrieval for further lookaside
> 
> lookaside -> look-aside
> 
> >> + *   processing.
> >> + * - Enable context-aware flow handling offload.
> >> + *
> >> + * Definitions and Abbreviations:
> >> + * - 5-tuple: defined by:
> >> + *     -- Source IP address
> >> + *     -- Source port
> >> + *     -- Destination IP address
> >> + *     -- Destination port
> >> + *     -- IP protocol number
> >> + * - 7-tuple: 5-tuple zone and port (see struct rte_sft_7tuple)
> 
> I guess comma is missing after "5-tuple", since I read it as
> "5-tuple zone"???
> 
> >> + * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
> >> + * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
> >> + * - application: SFT library API consumer
> >> + * - APP: see application
> >> + * - CID: client ID
> >> + * - CT: connection tracking
> >> + * - FID: Flow identifier
> >> + * - FIF: First In Flow
> >> + * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
> >> + * - SFT: Stateful Flow Table
> >> + * - user: see application
> >> + * - zone: additional user defined value used as differentiator for
> >> + *         connections having same 5-tuple (for example different VxLan
> 
> VxLan -> VXLAN (see devtools/words-case.txt)
> 
> >> + *         connections with same inner 5-tuple).
> >> + *
> >> + * SFT components:
> >> + *
> >> + * +-----------------------------------+
> >> + * | RTE flow                          |
> >> + * |                                   |
> >> + * | +-------------------------------+ |  +----------------+
> >> + * | | group X                       | |  | RTE_SFT        |
> >> + * | |                               | |  |                |
> >> + * | | +---------------------------+ | |  |                |
> >> + * | | | rule ...                  | | |  |                |
> >> + * | | | .                         | | |  +-----------+----+
> >> + * | | | .                         | | |              |
> >> + * | | | .                         | | |          entry
> >> + * | | +---------------------------+ | |            create
> >> + * | | | rule                      | | |              |
> >> + * | | |   patterns ...            +---------+        |
> >> + * | | |   actions                 | | |     |        |
> >> + * | | |     SFT (zone=Z)          | | |     |        |
> >> + * | | |     JUMP (group=Y)        | | |  lookup      |
> >> + * | | +---------------------------+ | |    zone=Z,   |
> >> + * | | | rule ...                  | | |    5tuple    |
> >> + * | | | .                         | | |     |        |
> >> + * | | | .                         | | |  +--v-------------+
> >> + * | | | .                         | | |  | SFT       |    |
> >> + * | | |                           | | |  |           |    |
> >> + * | | +---------------------------+ | |  |        +--v--+ |
> >> + * | |                               | |  |        |     | |
> >> + * | +-------------------------------+ |  |        | PMD | |
> >> + * |                                   |  |        |     | |
> >> + * |                                   |  |        +-----+ |
> >> + * | +-------------------------------+ |  |                |
> >> + * | | group Y                       | |  |                |
> >> + * | |                               | |  | set flow CTX   |
> >> + * | | +---------------------------+ | |  |                |
> >> + * | | | rule                      | | |  +--------+-------+
> >> + * | | |   patterns                | | |           |
> >> + * | | |     SFT (state=UNDEFINED) | | |           |
> >> + * | | |   actions RSS             | | |           |
> >> + * | | +---------------------------+ | |           |
> >> + * | | | rule                      | | |           |
> >> + * | | |   patterns                | | |           |
> >> + * | | |     SFT (state=INVALID)   | <-------------+
> >> + * | | |   actions DROP            | | |  forward
> >> + * | | +---------------------------+ | |    group=Y
> >> + * | | | rule                      | | |
> >> + * | | |   patterns                | | |
> >> + * | | |     SFT (state=ACCEPTED)  | | |
> >> + * | | |   actions PORT            | | |
> >> + * | | +---------------------------+ | |
> >> + * | |  ...                          | |
> >> + * | |                               | |
> >> + * | +-------------------------------+ |
> >> + * |  ...                              |
> >> + * |                                   |
> >> + * +-----------------------------------+
> >> + *
> >> + * SFT as datastructure:
> >> + * SFT can be treated as datastructure maintaining flow context across its
> >> + * lifetime. SFT flow entry represent bidirectional network flow and defined
> by
> 
> represent -> represents
> 
> >> + * 7-tuple & its reverse 7-tuple.
> >> + * Each entry in SFT has:
> >> + * - FID: 1:1 mapped & used as entry handle & encapsulating internal
> >> + *   implementation of the entry.
> >> + * - State: user-defined value attached to each entry, the only library
> >> + *   reserved value for state unset (the actual value defined by SFT
> >> + *   configuration). The application should define flow state encodings and
> >> + *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
> >> + *   applied on packets can be defined via related RTE flow rule matching SFT
> >> + *   state (see rules in SFT components diagram above).
> >> + * - Timestamp: for the last seen in flow packet used for flow aging
> mechanism
> >> + *   implementation.
> >> + * - Client Objects: user-defined flow contexts attached as opaques to flow.
> >> + * - Acceleration & offloading - utilize RTE flow capabilities, when supported
> >> + *   (see action ``SFT``), for flow lookup acceleration and further
> >> + *   context-aware flow handling offload.
> >> + * - CT state: optionally for TCP connections CT state can be maintained
> >> + *   (see enum rte_sft_flow_ct_state).
> >> + * - Out of order TCP packets: optionally SFT can keep out of order TCP
> >> + *   packets aside the flow context till the arrival of the missing in-order
> >> + *   packet.
> >> + *
> >> + * RTE flow changes:
> >> + * The SFT flow state (or context) for RTE flow is defined by fields of
> >> + * struct rte_flow_item_sft.
> >> + * To utilize SFT capabilities new item and action types introduced:
> >> + * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
> >> + * - action SFT: retrieve SFT flow context and attache it to the processed
> >> + *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
> >> + *
> >> + * The contents of per port SFT serving RTE flow action ``SFT`` managed via
> >> + * SFT PMD APIs (see struct rte_sft_ops).
> >> + * The SFT flow state/context retrieval performed by user-defined zone
> ``SFT``
> >> + * action argument and processed packet 5-tuple.
> >> + * If in scope of action ``SFT`` there is no context/state for the flow in SFT
> >> + * undefined sate attached to the packet meaning that the flow is not
> >> + * recognized by SFT, most probably FIF packet.
> >> + *
> >> + * Once the SFT state set for a packet it can match on item SFT
> >> + * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for
> the
> >> + * packet, for example:
> >> + * - if state value == x than queue for further processing by the application
> >> + * - if state value == y than forward it to eth port (full offload)
> >> + * - if state value == 'undefined' than queue for further processing by
> >> + *   the application (handle FIF packets)
> >> + *
> >> + * Processing packets with SFT library:
> >> + *
> >> + * FIF packet:
> >> + * To recognize upcoming packets of the SFT flow every FIF packet should be
> >> + * forwarded to the application utilizing the SFT library. Non-FIF packets can
> >> + * be processed by the application or its processing can be fully offloaded.
> >> + * Processing of the packets in SFT library starts with rte_sft_process_mbuf
> >> + * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF
> application
> >> + * should make a design to destroy flow or complete flow creation process
> in
> >> + * SFT using rte_sft_flow_activate.
> >> + *
> >> + * Recognized SFT flow:
> >> + * Once struct rte_sft_flow_status with valid fid field posesed by application
> 
> posesed -> possessed
> 
> >> + * it can:
> >> + * - mange client objects on it (see client_obj field in
> >> + *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
> >> + * - analyze user-defined flow state and CT state (see state & ct_sate fields
> >> + *   in struct rte_sft_flow_status).
> >> + * - set flow state to be attached to the upcoming packets by action ``SFT``
> >> + *   via struct rte_sft_flow_status API.
> >> + * - decide to destroy flow via rte_sft_flow_destroy API.
> >> + *
> >> + * Flow aging:
> >> + *
> >> + * SFT library manages the aging for each flow. On flow creation, it's
> >> + * assigned an aging value, the maximal number of seconds passed since the
> >> + * last flow packet arrived, once exceeded flow considered aged.
> >> + * The application notified of aged flow asynchronously via event queues.
> >> + * The device and port IDs tuple to identify the event queue to enqueue
> >> + * flow aged events passed on flow creation as arguments
> >> + * (see rte_sft_flow_activate). It's the application responsibility to
> >> + * initialize event queues and assign them to each flow for EOF event
> >> + * notifications.
> >> + * Aged EOF event handling:
> >> + * - Should be considered as application responsibility.
> >> + * - The last stage should be the release of the flow resources via
> >> + *    rte_sft_flow_destroy API.
> >> + * - All client objects should be removed from flow before the
> >> + *   rte_sft_flow_destroy API call.
> >> + * See the description of rete_sft_flow_destroy for an example of aged flow
> 
> rete_sft_flow_destroy -> rte_sft_flow_destroy
> 
> >> + * handling.
> >> + *
> >> + * SFT API thread safety:
> >> + *
> >> + * SFT library APIs are thread-safe while handling of specific flow can be
> >> + * done in a single thread simultaneously. Exclusive access to specific SFT
> >> + * flow guaranteed by:
> >
> > The line above contradict itself, if you are working with single thread you can't
> work simultaneously.
> > Does the SFT allow the access to a single flow from two threads in the same
> time? or it is the responsibility
> > Of the application to protect itself. I think it should be the application
> responsibility the SFT should protect
> > itself only on SFT global functions. For example calling process_mbuf should be
> protected, so application can
> > call the same function from different threads.
> > I think we can assume that all packets from a specific flow will arrive to the
> same queue and the same thread.
> >
> > So I don't see the usage of the lock API.
> >
> >> + * - rte_sft_process_mbuf
> >> + * - rte_sft_process_mbuf_with_zone
> >> + * - rte_sft_flow_create
> >> + * - rte_sft_flow_lock
> >> + * When application is done with the flow handling for the current packet it
> >> + * should call rte_sft_flow_unlock API to maintain exclusive access to the
> >> + * flow with other threads.
> >> + *
> >> + * SFT Library initialization and cleanup:
> >> + *
> >> + * SFT library should be considered as a single instance, preconfigured and
> >> + * initialized via rte_sft_init() API.
> >> + * SFT library resource deallocation and cleanup should be done via
> >> + * rte_sft_init() API as a stage of the application termination procedure.
> >> + */
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +#include <rte_common.h>
> >> +#include <rte_config.h>
> >> +#include <rte_errno.h>
> >> +#include <rte_mbuf.h>
> >> +#include <rte_ethdev.h>
> >> +#include <rte_flow.h>
> >> +
> >> +/**
> >> + * L3/L4 5-tuple - src/dest IP and port and IP protocol.
> >> + *
> >> + * Used for flow/connection identification.
> >> + */
> >> +struct rte_sft_5tuple {
> >> +	union {
> >> +		struct {
> >> +			rte_be32_t src_addr; /**< IPv4 source address. */
> >> +			rte_be32_t dst_addr; /**< IPv4 destination address. */
> >> +		} ipv4;
> >> +		struct {
> >> +			uint8_t src_addr[16]; /**< IPv6 source address. */
> >> +			uint8_t dst_addr[16]; /**< IPv6 destination address. */
> >> +		} ipv6;
> >> +	};
> 
> RTE_STD_C11 missing?
> 
> >> +	uint16_t src_port; /**< Source port. */
> >> +	uint16_t dst_port; /**< Destination port. */
> 
> If it is really host-endian, please, highlight it in above
> descriptions. Also it would be interesting to understand
> why.
> 
> >> +	uint8_t proto; /**< IP protocol. */
> >> +	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
> >> +};
> >> +
> >> +/**
> >> + * Port flow identification.
> >> + *
> >> + * @p zone used for setups where 5-tuple is not enough to identify flow.
> >> + * For example different VLANs/VXLANs may have similar 5-tuples.
> >> + */
> >> +struct rte_sft_7tuple {
> >> +	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
> >> +	uint32_t zone; /**< Zone assigned to flow. */
> >> +	uint16_t port_id; /** <Port identifier of Ethernet device. */
> >> +};
> >> +
> >> +/**
> >> + * Flow connection tracking states
> >> + */
> >> +enum rte_sft_flow_ct_state {
> >> +	RTE_SFT_FLOW_CT_STATE_NEW  = (1 << 0),
> >> +	RTE_SFT_FLOW_CT_STATE_EST  = (1 << 1),
> >> +	RTE_SFT_FLOW_CT_STATE_REL  = (1 << 2),
> >> +	RTE_SFT_FLOW_CT_STATE_RPL  = (1 << 3),
> >> +	RTE_SFT_FLOW_CT_STATE_INV  = (1 << 4),
> >> +	RTE_SFT_FLOW_CT_STATE_TRK  = (1 << 5),
> >> +	RTE_SFT_FLOW_CT_STATE_SNAT = (1 << 6),
> >> +	RTE_SFT_FLOW_CT_STATE_DNAT = (1 << 7),
> >> +};
> >> +
> >> +/**
> >> + * Structure describes SFT library configuration
> >> + */
> >> +struct rte_sft_conf {
> >> +	uint32_t UDP_aging; /**< UDP proto default aging. */
> >> +	uint32_t TCP_aging; /**< TCP proto default aging. */
> >> +	uint32_t TCP_SYN_aging; /**< TCP SYN default aging. */
> >> +	uint32_t OTHER_aging; /**< All unlisted proto default aging. */
> 
> May I suggest to stick to lowercase fields, please.
> 
> >> +	uint32_t size; /**< Max entries in SFT. */
> >> +	uint8_t undefined_state; /**< Undefined state constant. */
> >> +	uint8_t reorder_enable: 1;
> >> +	/**< TCP packet reordering feature enabled bit. */
> >> +	uint8_t ct_enable: 1; /**< Connection tracking feature enabled bit. */
> >> +};
> >> +
> >> +/**
> >> + * Structure describes the state of the flow in SFT.
> >> + */
> >> +struct rte_sft_flow_status {
> >> +	uint32_t fid; /**< SFT flow id. */
> >> +	uint32_t zone; /**< Zone for lookup in SFT */
> >> +	uint8_t state; /**< Application defined bidirectional flow state. */
> >> +	uint8_t ct_state; /**< Connection tracking flow state. */
> >> +	uint32_t age; /**< Seconds passed since last flown packet. */
> >> +	uint32_t aging;
> >> +	/**< Flow considered aged once this age (seconds) reached. */
> >> +	uint32_t nb_in_order_mbufs;
> >> +	/**< Number of in-order mbufs available for drain */
> >> +	void **client_obj; /**< Array of clients attached to flow. */
> >> +	int nb_clients; /**< Number of clients attached to flow. */
> >> +	uint8_t defined: 1; /**< Flow defined in SFT bit. */
> >> +	uint8_t activated: 1; /**< Flow activation bit. */
> >> +	uint8_t fragmented: 1; /**< Last flow mbuf was fragmented. */
> >> +	uint8_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
> >> +};
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Get SFT flow status.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> 
> Dup lines above
> 
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param[out] status
> >> + *   Structure to dump actual SFT flow status.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_get_status(const uint32_t fid,
> >> +			struct rte_sft_flow_status *status,
> >> +			struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Set user defined context.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * Updates per ethernet dev SFT entries:
> 
> ethernet -> Ethernet
> dev -> device
> 
> >> + * - flow lookup acceleration
> >> + * - partial/full flow offloading managed by flow context
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param ctx
> >> + *   User defined state to set.
> >> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft
> unsupported.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> 
> Remove space before comma
> 
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_set_ctx(uint32_t fid,
> >> +		     const struct rte_flow_item_sft *ctx,
> >> +		     struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Initialize SFT library instance.
> >> + *
> >> + * @param conf
> >> + *   SFT library instance configuration.
> >> + *
> >> + * @return
> >> + *   0 on success , a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_init(const struct rte_sft_conf *conf);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Finalize SFT library instance.
> >> + * Cleanup & release allocated resources.
> >> + */
> >> +void
> >> +rte_sft_fini(void);
> >> +
> >
> > I think we should use stop. It is not commons in DPDK to have fini functions.
> > Maybe we should also add start function, so the app can init and then start the
> SFT.
> >
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Process mbuf received on RX queue.
> >> + *
> >> + * Fragmentation handling (SFT fragmentation feature configured):
> >> + * If *mbuf_in* of fragmented packet received it will be stored by SFT
> library.
> >> + * status->fragmented bit will be set and *mbuf_out* will be set to NULL.
> >> + * On reception of all related fragments of IP packet it will be reassembled
> >> + * and further processed by this function on reception of last fragment.
> >> + *
> > Does this function allocate a new mbuf? Does it releases all old mbufs?
> >
> >> + * Flow definition:
> >> + * SFT flow defined by one of its 7-tuples, since there is no zone value as
> >> + * argument flow should be defined by context attached to mbuf with action
> >> + * ``SFT`` (see RTE flow RTE_FLOW_ACTION_TYPE_SFT). Otherwise status-
> >>> defined
> >> + * field will be turned off & *mbuf_out* will be set to *mbuf_in*.
> >> + * In order to define flow for *mbuf_in* without attached sft context
> >> + * rte_sft_process_mbuf_with_zone() should be used with *zone* argument
> >> + * supplied by caller.
> >> + *
> >> + * Flow lookup:
> >> + * If SFT flow identifier can't be retrieved from SFT context attached to
> >> + * *mbuf_in* by action ``SFT`` - SFT lookup should be performmed by zone,
> 
> performmed -> performed
> 
> >> + * retrieved from SFT context attached to *mbuf_in*, and 5-tuple, extracted
> >> + * form mbuf outer header contents.
> >> + *
> >> + * Flow defined but does not exists:
> >> + * If flow not found in SFT inactivated flow will be created in SFT.
> >> + * status->activated field will be turned off & *mbuf_out* be set to
> *mbuf_in*.
> >> + * In order to activate created flow rte_sft_flow_activate() should be used
> >> + * with reverse 7-tuple supplied by caller.
> >> + * This is first phase of flow creation in SFT for second phase & more
> detailed
> >> + * descriotion of flow creation see rte_sft_flow_activate.
> 
> descriotion -> description
> 
> >> + *
> >> + * Out of order (SFT out of oreder feature configured):
> 
> oreder -> order
> 
> >> + * If flow defined & activated but *mbuf_in* is TCP out of order packet it
> will
> >> + * be stored by SFT library. status->out_of_order bit will be set &
> *mbuf_out*
> >> + * will be set to NULL. On reception of the first missing in order packet
> >> + * status->nb_in_order_mbufs will be set to number of mbufs that available
> >> for
> >> + * processing with rte_sft_drain_mbuf().
> >> + *
> > It is possible that some packets will get trapped in the SFT do to this feature.
> > if it supports ordering. For example the following case:
> > Packets arrive to the application. After draining the packets the
> > Application changed the flow to full offload. This means that
> > all future packets will not arrive to the application.
> > But until the flow is offloaded some packets do arrive not in order.
> > Then the flow is offloaded, this will result in the situation that no more
> > packets will arrive to the application so some packets will get stack
> > in the SFT.
> > I think we must have some force drain or, notify the SFT that no more
> > packets should arrive to even if the packets are not in order it will release
> them.
> >
> > Also the same with fragmented does this function allocate new mbufs? are
> you releasing the
> > old ones?
> >
> >> + * Flow defined & activated, mbuf not fragmented and 'in order':
> >> + * - Flow aging related data (see age field in `struct rte_sft_flow_status`)
> >> + *   will be updated according to *mbuf_in* timestamp.
> >> + * - Flow connection tracking state (see ct_state field in
> >> + *   `struct rte_sft_flow_status`)  will be updated according to *mbuf_in* L4
> >> + *   header contents.
> >> + * - *mbuf_out* will be set to last processed mbuf.
> >> + *
> >> + * @param[in] mbuf_in
> >> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be
> >> used
> >> + *   after successful call to this function.
> >> + * @param[out] mbuf_out
> >> + *   last processed not fragmented and in order mbuf.
> >
> > If the in mbuf is not fragmented and in order, this pointer will point to the in
> one?
> >
> >> + * @param[out] status
> >> + *   Structure to dump SFT flow status once updated according to contents
> of
> >> + *   *mbuf_in*.
> >
> > Does the status bits for example fragmented is kept per connection or per
> flow?
> > Since it is possible to get fragmented packets from both sides.
> > The same goes for out of order packets
> >
> >
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success:
> >> + *   - *mbuf_out* contains valid mbuf pointer, locked SFT flow recognized by
> >> + *     status->fid.
> >> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> >> + *     non last fragment *mbuf_in*.
> >> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> >> + *     order *mbuf_in*, locked SFT flow recognized by status->fid.
> >> + *   On failure a negative errno value and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_process_mbuf(struct rte_mbuf *mbuf_in,
> >> +		     struct rte_mbuf **mbuf_out,
> >> +		     struct rte_sft_flow_status *status,
> >> +		     struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Process mbuf received on RX queue while zone value provided by caller.
> >> + *
> >> + * The behaviour of this function is similar to rte_sft_process_mbuf except
> >> + * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
> >> + * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
> >> + *
> >> + * @see rte_sft_process_mbuf
> >> + *
> >> + * @param[in] mbuf_in
> >> + *   mbuf to process; mbuf pinter considered 'consumed' and should not be
> used
> 
> pinter -> pointer
> 
> >> + *   after successful call to this function.
> >> + * @param[out] mbuf_out
> >> + *   last processed not fragmented and in order mbuf.
> >> + * @param[out] status
> >> + *   Structure to dump SFT flow status once updated according to contents
> of
> >> + *   *mbuf_in*.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success:
> >> + *   - *mbuf_out* contains valid mbuf pointer.
> >> + *   - *mbuf_out* is NULL and status->fragmented bit on in case of
> >> + *     non last fragment *mbuf_in*.
> >> + *   - *mbuf_out* is NULL and status->out_of_order bit on in case of out of
> >> + *     order *mbuf_in*.
> >> + *   On failure a negative errno value and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_process_mbuf_with_zone(struct rte_mbuf *mbuf_in,
> >> +			       uint32_t zone,
> >> +			       struct rte_mbuf **mbuf_out,
> >> +			       struct rte_sft_flow_status *status,
> >> +			       struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Drain next in order mbuf.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * This function behaves similar to rte_sft_process_mbuf() but acts on
> packets
> >> + * accumulated in SFT flow due to missing in order packet. Processing done
> on
> >> + * single mbuf at a time and `in order`. Other than above the behavior is
> >> + * same as of rte_sft_process_mbuf for flow defined & activated & mbuf
> isn't
> >> + * fragmented & 'in order'. This function should be called when
> >> + * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
> >> + * status->nb_in_order_mbufs output param !=0 and until
> >> + * status->nb_in_order_mbufs == 0.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param[out] status
> >> + *   Structure to dump SFT flow status once updated according to contents
> of
> >> + *   *mbuf_in*.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   A valid mbuf in case of success, NULL otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +struct rte_mbuf *
> >> +rte_sft_drain_mbuf(uint32_t fid,
> >> +		   struct rte_sft_flow_status *status,
> >> +		   struct rte_sft_error *error);
> >> +
> >
> > Fid represent a connection, so which direction do we drain the packets?
> > since we can have inordered packet in from both directions right?
> >
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Activate flow in SFT.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * This function performs second phase of flow creation in SFT.
> >> + * The reasons for 2 phase flow creation procedure:
> >> + * 1. Missing reverse flow - flow context is shared for both flow directions
> >> + *    i.e. in order maintain bidirectional flow context in RTE SFT packets
> >> + *    arriving from both dirrections should be identified as packets of the
> >> + *    RTE SFT flow. Consequently before creation of the SFT flow caller
> should
> >> + *    provide reverse flow direction 7-tuple.
> >> + * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone
> >> should
> >> + *   be notified that arrived mbuf is first in flow & decide weather to
> >> + *   create new flow or it distroy before it was activated with
> >> + *   rte_sft_flow_destroy.
> >> + * This function completes creation of the bidirectional SFT flow & creates
> >> + * entry for 7-tuple on SFT PMD defined by the tuple port for both
> >> + * initiator/initiate 7-tuples.
> >> + * Flow aging, connection tracking state & out of order handling will be
> >> + * initialized according to the content of the *mbuf_in* passes to
> >> + * rte_sft_process_mbuf/_with_zone during the phase 1 of flow creation.
> >> + * Once this function returns upcoming calls
> >> rte_sft_process_mbuf/_with_zone
> >> + * with 7-tuple or its reverse will return handle to this flow.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param reverse_tuple
> >> + *   Expected response flow 7-tuple.
> >> + * @param ctx
> >> + *   User defined state to set.
> >> + *   Update of *fid* or *zone* fields in struct rte_flow_item_sft
> unsupported.
> >> + * @param ct_enable
> >> + *   Enables maintenance of status->ct_state connection tracking value for
> the
> >> + *   flow; otherwise status->ct_state will be initialized with zeros.
> >> + * @param evdev_id
> >> + *   Event dev ID to enqueue end of flow event.
> >> + * @param evport_id
> >> + *   Event port ID to enqueue end of flow event.
> >> + * @param[out] status
> >> + *   Structure to dump SFT flow status once activated.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_activate(uint32_t fid,
> >> +		      const struct rte_sft_7tuple *reverse_tuple,
> >> +		      const struct rte_flow_item_sft *ctx,
> >> +		      uint8_t ct_enable,
> >> +		      uint8_t dev_id,
> >> +		      uint8_t port_id,
> >> +		      struct rte_sft_flow_status *status,
> >> +		      struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Artificially create SFT flow.
> >> + *
> >> + * Function to create SFT flow before reception of the first flow packet.
> >> + *
> >> + * @param tuple
> >> + *   Expected initiator flow 7-tuple.
> >> + * @param reverse_tuple
> >> + *   Expected initiate flow 7-tuple.
> >> + * @param ctx
> >> + *   User defined state to set.
> >> + *   Setting of *fid* or *zone* fields in struct rte_flow_item_sft
> unsupported.
> >> + * @param[out] ct_enable
> >> + *   Enables maintenance of status->ct_state connection tracking value for
> the
> >> + *   flow; otherwise status->ct_state will be initialized with zeros.
> >> + * @param[out] status
> >> + *   Structure to dump SFT flow status once created.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   - on success: 0, locked SFT flow recognized by status->fid.
> >> + *   - on error: a negative errno value otherwise and rte_errno is set.
> >> + */
> >> +
> 
> No extra empty line and __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_create(const struct rte_sft_7tuple *tuple,
> >> +		    const struct rte_sft_7tuple *reverse_tuple,
> >> +		    const struct rte_flow_item_sft *ctx,
> >> +		    uint8_t ct_enable,
> >> +		    struct rte_sft_flow_status *status,
> >> +		    struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Lock exclusively SFT flow.
> >> + *
> >> + * Explicit flow locking; used for handling aged flows.
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_lock(uint32_t fid);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Release exclusively locked SFT flow.
> >> + *
> >> + * When rte_sft_process_mbuf/_with_zone and rte_sft_flow_create
> >> + * return *status* containing fid with defined bit on the flow considered
> >> + * exclusively locked and should be unlocked with this function.
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_unlock(uint32_t fid);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Removes flow from SFT.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * - Flow should be locked by caller in order to remove it.
> >> + * - Flow should have no client objects attached.
> >> + *
> >> + * Should be applied on aged flows, when flow aged event received.
> >> + *
> >> + * @code{.c}
> >> + *     while (1) {
> >> + *         rte_event_dequeue_burst(...);
> >> + *         FOR_EACH_EV(ev) {
> >> + *             uint32_t fid = ev.u64;
> >> + *             rte_sft_flow_lock(fid);
> >> + *             FOR_EACH_CLIENT(fid, client_id) {
> >> + *                 rte_sft_flow_reset_client_obj(fid, client_obj);
> >> + *                 // detached client object handling
> >> + *             }
> >> + *             rte_sft_flow_destroy(fid, &error);
> >> + *         }
> >> + *     }
> >> + * @endcode
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID to destroy.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_destroy(uint32_t fid, struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Reset flow age to zero.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * Simulates last flow packet with timestamp set to just now.
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_touch(uint32_t fid, struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Set flow aging to specific value.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param aging
> >> + *   New flow aging value.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_set_aging(uint32_t fid,
> >> +		       uint32_t aging,
> >> +		       struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Set client object for given client ID.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param client_id
> >> + *   Client ID to set object for.
> >> + * @param client_obj
> >> + *   Pointer to opaque client object structure.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +int
> >> +rte_sft_flow_set_client_obj(uint32_t fid,
> >> +			    uint8_t client_id,
> >> +			    void *client_obj,
> >> +			    struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Get client object for given client ID.
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param client_id
> >> + *   Client ID to get object for.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   A valid client object opaque pointer in case of success, NULL otherwise
> >> + *   and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +void *
> >> +rte_sft_flow_get_client_obj(const uint32_t fid,
> >> +			    uint8_t client_id,
> >> +			    struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Remove client object for given client ID.
> 
> Function name uses "reset", but description says "remove".
> May be synchronize it?
> 
> >> + * Flow should be locked by caller (see rte_sft_flow_lock).
> >> + *
> >> + * Detaches client object from SFT flow and returns the ownership for the
> >> + * client object to the caller by returning client object pointer value.
> >> + * The pointer returned by this function won't be accessed any more, the
> caller
> >> + * may release all client obj related resources & the memory allocated for
> >> + * this client object.
> >> + *
> >> + * @param fid
> >> + *   SFT flow ID.
> >> + * @param client_id
> >> + *   Client ID to remove object for.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   A valid client object opaque pointer in case of success, NULL otherwise
> >> + *   and rte_errno is set.
> >> + */
> 
> __rte_experimental
> 
> >> +void *
> >> +rte_sft_flow_reset_client_obj(uint32_t fid,
> >> +			      uint8_t client_id,
> >> +			      struct rte_sft_error *error);
> >> +
> >> +#ifdef __cplusplus
> >> +}
> >> +#endif
> >> +
> >> +#endif /* _RTE_SFT_H_ */
> >> diff --git a/lib/librte_sft/rte_sft_driver.h b/lib/librte_sft/rte_sft_driver.h
> >> new file mode 100644
> >> index 0000000000..0c9e28fe17
> >> --- /dev/null
> >> +++ b/lib/librte_sft/rte_sft_driver.h
> >> @@ -0,0 +1,195 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright 2020 Mellanox Technologies, Ltd
> >> + */
> >> +
> >> +#ifndef _RTE_SFT_DRIVER_H_
> >> +#define _RTE_SFT_DRIVER_H_
> >> +
> >> +/**
> >> + * @file
> >> + *
> >> + * RTE SFT Ethernet device PMD API
> >> + *
> >> + * APIs that are used by the SFT library to offload SFT operationons
> >> + * to Ethernet device.
> >> + */
> >> +
> >> +#include "rte_sft.h"
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +/**
> >> + * Opaque type returned after successfully creating an entry in SFT.
> >> + *
> >> + * This handle can be used to manage and query the related entry (e.g. to
> >> + * destroy it or update age).
> >> + */
> >> +struct rte_sft_entry;
> >> +
> >> +/**
> >> + * Create SFT entry in eth_dev SFT.
> >> + *
> >> + * @param dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param tuple
> >> + *   L3/L4 5-tuple - src/dest IP and port and IP protocol.
> >> + * @param nat_tuple
> >> + *   L3/L4 5-tuple to replace in packet original 5-tuple in order to implement
> >> + *   NAT offloading; if NULL NAT offloading won't be configured for the
> flow.
> >> + * @param aging
> >> + *   Flow aging timeout in seconds.
> >> + * @param ctx
> >> + *   Initial values in SFT flow context
> >> + *   (see RTE flow struct rte_flow_item_sft).
> >> + *   ctx->zone should be valid.
> >> + * @param fid
> >> + *   SFT flow ID for the entry to create on *device*.
> >> + *   If there is an entry for the *fid* in PMD it will be updated with the
> >> + *   values of *ctx*.
> >> + * @param[out] queue_index
> >> + *   if PMD can figure out the queue where the flow packets will
> >> + *   arrive in RX data path it will set the value of queue_index; otherwise
> >> + *   all bits will be turned on.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   A valid handle in case of success, NULL otherwise and rte_errno is set.
> >> + */
> >> +typedef struct rte_sft_entry *(*sft_entry_create_t) (struct rte_eth_dev
> *dev,
> >> +		const struct rte_sft_5tuple *tuple,
> >> +		const struct rte_sft_5tuple *nat_tuple,
> >> +		const uint32_t aging,
> >> +		const struct rte_flow_item_sft *ctx,
> >> +		const uint32_t fid,
> >> +		uint16_t *queue_index,
> >> +		struct rte_sft_error *error);
> >> +
> >
> > I think for easier reading, the API should change to have 6 tuple (5 + zone)
> > the ctx should be removed and replaced with the state.
> >
> > Then add new API to modify the ctx
> > typedef int (*sft_modify_state)(struct rte_eth_dev *dev, uint8 state);
> > The main issue we my suggestion is that it will force the PMD to store the
> information to recreate
> > the rule, data that is already  saved by the SFT.
> >
> > Also I don't see why we need queue index, since the RSS and queue will be
> configured by the RTE flow
> > in a different group.
> >
> >> +/**
> >> + * Destroy SFT entry in eth_dev SFT.
> >> + *
> >> + * @param dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param entry
> >> + *   Handle to the SFT entry to destroy.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> >> +typedef int (*sft_entry_destroy_t)(struct rte_eth_dev *dev,
> >> +		struct rte_sft_entry *entry,
> >> +		struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * Decodes SFT flow context if attached to mbuf by action ``SFT``.
> >> + * @see RTE flow RTE_FLOW_ACTION_TYPE_SFT.
> >> + *
> >> + * @param dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param mbuf
> >> + *   mbuf of the packet to decode attached state from.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   A valid SFT flow context in case of success, NULL otherwise and
> rte_errno
> >> + *   is set.
> >> + */
> >> +typedef struct rte_flow_item_sft *(*sft_entry_mbuf_decode_ctx_t)(
> >> +		struct rte_eth_dev *dev,
> >> +		const struct rte_mbuf *mbuf,
> >> +		struct rte_sft_error *error);
> >> +
> >
> > What about returning int as error code, and return the rte_flow_item_sft
> > as out parameter?
> > This will remove the allocation and free.
> >
> >> +/**
> >> + * Get aged-out SFT entries.
> >> + *
> >> + * Report entry as aged-out if timeout passed without any matching
> >> + * on the SFT entry.
> >> + *
> >> + * @param[in] dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param[in, out] fid_aged
> >> + *   The address of an array of aged-out SFT flow IDs.
> >> + * @param[in] nb_aged
> >> + *   The length of *fid_aged* array pointers.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. Initialized in case of
> >> + *   error only.
> >> + *
> >> + * @return
> >> + *   if nb_aged is 0, return the amount of all aged flows.
> >> + *   if nb_aged is not 0 , return the amount of aged flows reported
> >> + *   in the *fid_aged* array, otherwise negative errno value.
> >> + */
> >> +typedef int (*sft_entry_get_aged_entries_t)(struct rte_eth_dev *dev,
> >> +		uint32_t *fid_aged,
> >> +		int nb_aged,
> >> +		struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * Simulate SFT entry match in terms of entry aging.
> >> + *
> >> + * @param dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param fid
> >> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> >> +typedef int (*sft_entry_touch_t)(struct rte_eth_dev *dev,
> >> +		uint32_t fid,
> >> +		struct rte_sft_error *error);
> >> +
> >> +/**
> >> + * Set SFT entry aging to specific value.
> >> + *
> >> + * @param dev
> >> + *   Pointer to Ethernet device structure.
> >> + * @param fid
> >> + *   SFT flow ID paired with dev to retrieve related SFT entry.
> >> + * @param aging
> >> + *   New entry aging value.
> >> + * @param[out] error
> >> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> >> + *   structure in case of error only.
> >> + *
> >> + * @return
> >> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >> + */
> >> +typedef int (*sft_entry_set_aging_t)(struct rte_eth_dev *dev,
> >> +		uint32_t fid,
> >> +		uint32_t aging,
> >> +		struct rte_sft_error *error);
> >> +
> >> +/** SFT operations function pointer table */
> >> +struct rte_sft_ops {
> >> +	sft_entry_create_t entry_create;
> >> +	/**< Create SFT entry in eth_dev SFT. */
> >> +	sft_entry_destroy_t entry_destroy;
> >> +	/**< Destroy SFT entry in eth_dev SFT. */
> >> +	sft_entry_mbuf_decode_ctx_t mbuf_decode_ctx;
> >> +	/**< Decodes SFT flow context if attached to mbuf by action ``SFT``. */
> >> +	sft_entry_get_aged_entries_t get_aged_entries;
> >> +	/**< Get aged-out SFT entries. */
> >> +	sft_entry_touch_t entry_touch;
> >> +	/**< Simulate SFT entry match in terms of entry aging. */
> >> +	sft_entry_set_aging_t set_aging;
> >> +	/**< Set SFT entry aging to specific value. */
> >> +};
> >> +
> >> +#ifdef __cplusplus
> >> +}
> >> +#endif
> >> +
> >> +#endif /* _RTE_SFT_DRIVER_H_ */
> >> diff --git a/lib/librte_sft/rte_sft_version.map
> >> b/lib/librte_sft/rte_sft_version.map
> >> new file mode 100644
> >> index 0000000000..747e100ac5
> >> --- /dev/null
> >> +++ b/lib/librte_sft/rte_sft_version.map
> >> @@ -0,0 +1,21 @@
> >> +EXPERIMENTAL {
> >> +	global:
> >> +
> >> +	rte_sft_flow_get_status;
> >> +	rte_sft_flow_set_ctx;
> >> +	rte_sft_init;
> >> +	rte_sft_fini;
> >> +	rte_sft_process_mbuf;
> >> +	rte_sft_process_mbuf_with_zone;
> >> +	rte_sft_drain_mbuf;
> >> +	rte_sft_flow_activate;
> >> +	rte_sft_flow_create;
> >> +	rte_sft_flow_lock;
> >> +	rte_sft_flow_unlock;
> >> +	rte_sft_flow_destroy;
> >> +	rte_sft_flow_touch;
> >> +	rte_sft_flow_set_aging;
> >> +	rte_sft_flow_set_client_obj;
> >> +	rte_sft_flow_get_client_obj;
> >> +	rte_sft_flow_reset_client_obj;
> 
> If I'm not mistaken, it should be alphabetially sorted.
> 
> >> +};
> >> --
> >> 2.26.2
> >
> > Best,
> > Ori
> >


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
                   ` (3 preceding siblings ...)
  2020-09-15 11:59 ` [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
@ 2020-11-04 12:59 ` Ori Kam
  2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 1/2] ethdev: add item/action for SFT Ori Kam
  2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 2/2] ethdev: introduce sft lib Ori Kam
  2020-11-04 13:17 ` [dpdk-dev] [RFC v3 0/2] introduce stateful flow table Ori Kam
  5 siblings, 2 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 12:59 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

The RFC introduces Stateful Flow Table (SFT) API and changes needed in
both ethdev an RTE flow to support SFT functionality.

SFT library provides a framework for applications that need to maintain
context across different packets of the connection.

The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.


Andrey Vesnovaty (1):
  ethdev: add item/action for SFT

Ori Kam (1):
  ethdev: introduce sft lib

 lib/librte_ethdev/meson.build            |   3 +
 lib/librte_ethdev/rte_ethdev_version.map |  19 +
 lib/librte_ethdev/rte_flow.h             |  75 ++
 lib/librte_ethdev/rte_sft.c              |   9 +
 lib/librte_ethdev/rte_sft.h              | 878 +++++++++++++++++++++++
 lib/librte_ethdev/rte_sft_driver.h       | 201 ++++++
 6 files changed, 1185 insertions(+)
 create mode 100644 lib/librte_ethdev/rte_sft.c
 create mode 100644 lib/librte_ethdev/rte_sft.h
 create mode 100644 lib/librte_ethdev/rte_sft_driver.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v2 1/2] ethdev: add item/action for SFT
  2020-11-04 12:59 ` [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table Ori Kam
@ 2020-11-04 12:59   ` Ori Kam
  2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 2/2] ethdev: introduce sft lib Ori Kam
  1 sibling, 0 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 12:59 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

From: Andrey Vesnovaty <andreyv@nvidia.com>

Attach SFT flow context to packet with SFT action.
Match on SFT flow context (attached to packet),
with SFT item.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
---
 lib/librte_ethdev/rte_flow.h | 75 ++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5489..7ca47cc87c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -537,6 +537,14 @@ enum rte_flow_item_type {
 	 */
 	RTE_FLOW_ITEM_TYPE_ECPRI,
 
+	/**
+	 * [META]
+	 *
+	 * Matches SFT context.
+	 *
+	 * See struct rte_flow_item_sft.
+	 */
+	RTE_FLOW_ITEM_TYPE_SFT,
 };
 
 /**
@@ -1579,6 +1587,48 @@ static const struct rte_flow_item_ecpri rte_flow_item_ecpri_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_SFT
+ *
+ * Matches context of flow that wasw created by SFT.
+ *
+ * The structure describes SFT flow context.
+ * All the fields of the structure, except @p fid, should be considered as
+ * user defined.
+ * The @p fid assigned by RTE SFT & used as unique flow identifier.
+ * SFT context attached to packet by action ``SFT`` (see RTE_FLOW_ACTION_SFT),
+ * which sends the packet to the SFT module.
+ *
+ * SFT default context defined as context attached to packet when there is no
+ * entry for the flow in SFT. The @p state has application reserved value
+ * meaning that SFT context for the packet undefined since entry wasn't found
+ * in SFT. If state 'undefined' then @p zone should be valid othervice @p fid
+ * should be valid.
+ *
+ * Context considered virtual since the method of storing this info on packet
+ * is PMD/implementation specific & may involve mapping methods if there is
+ * 'not enough bits' to store entire contents of struct rte_flow_item_sft.
+ *
+ * Maximal value/size of each field depends on HW capabilities and considered
+ * as implementation specific.
+ */
+RTE_STD_C11
+struct rte_flow_item_sft {
+	union {
+		uint32_t fid; /**< SFT flow identifier. */
+		uint32_t zone; /**< Zone assigned to flow. */
+	};
+	uint32_t fid_valid:1; /**< The fid member is valid. */
+	uint32_t zone_valid:1; /**< The zone member is valid. */
+	uint32_t reserved:30; /**< Reserved. */
+	uint8_t state; /**< User defined flow state. */
+	uint8_t user_data_size; /**< user_data buffer size. */
+	uint8_t *user_data; /**< Arbitrary user data. */
+};
+
 /**
  * Matching pattern item definition.
  *
@@ -2132,6 +2182,15 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * RTE_FLOW_ACTION_TYPE_SFT
+	 *
+	 * Direct the packet to SFT module.
+	 *
+	 * See struct rte_flow_action_sft.
+	 */
+	RTE_FLOW_ACTION_TYPE_SFT,
 };
 
 /**
@@ -2721,6 +2780,22 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
 	*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SFT
+ *
+ * Performs lookup by *zone* and 5-tuple in SFT.
+ * If entry is found the related SFT context will be attached otherwise,
+ * default SFT context will be attached.
+ *
+ * This action may result in termination of actions that following this action.
+ */
+struct rte_flow_action_sft {
+	uint32_t zone; /**< Zone for lookup in SFT */
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] ethdev: introduce sft lib
  2020-11-04 12:59 ` [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table Ori Kam
  2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 1/2] ethdev: add item/action for SFT Ori Kam
@ 2020-11-04 12:59   ` Ori Kam
  1 sibling, 0 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 12:59 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

Defines RTE SFT (Stateful Flow Table) APIs for Stateful Flow Table library.

Currently, DPDK enables only stateless offloading, using the rte_flow.
stateless means that each packet is handled without any knowledge of
privious or future packets.

As we look at the industry, there is much demand to save a context across
packets that belong to a connection.

Examples for such applications:
- Next-generation firewalls
- Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
- SW/Virtual Switching: OVS
The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.

The solution suggested is to create a lib that will enable saving states
between different packets that belong to the same connection.

The solution will also enable better HW abstraction than the one we get
from using the rte_flow. The reason for this is that saving states is
not atomic action like we have in rte_flow and also can't be done fully
in HW (The first packets must be seen by the application).
Saying the above this lib is based on interacting with the rte_flow but it
doesn't replace it or encapsulate it.

Key design points.
- The SFT should offload as much as possible to HW.
- The SFT is designed to work alongside the rte_flow.
- The SFT has its own ops that the PMD needs to implement.
- The SFT works on 5 tuple + zone (a user-defined value)

Basic usage flow:
1. Application insert a flow that matches all eth traffic and have
   sft action along with jump action. (in the future this jump can be
   avoided and in doing so saving some jumps,
   but for the most generic and complete solution we think
   that allow the application full control of the packet process using
   rte_flow is better.)
2. Application insert a flow in the target group that matches the packet
   state. Based on this state the application performs the needed
   actions. This flow can also be merged with other matching criteria.
   The application will also add a flow in the target group that will
   upload to the application any packet with a miss state.
3. First eth packet arrives and is routed to the SFT HW component.
   since this is the first packet the SFT will have a miss and will
   mark the packet with miss state and forward it to the target group.
4. The application will pull the packet from the queue and will send it to
   be processed by the sft lib.
5. The SFT will extract HW packet state and if valid the zone or the
   flow-id, and report it back to the application.
6. Application will see that this is a new connection, so it will issue
   SFT command to create a new connection with a selected state.
   The SFT will create a HW flow, that matches the 5 tuple + zone and
   sets the state of the packet. The state can be any u8 value, it is
   the responsibility of the application to match on the value.
7. On the next packet arriving to the HW it will jump to the SFT and
   in the SFT HW there will be a match which will result in setting
   the packet state and ID according to the application request.
8. In case of later miss (at some other group) or application logic
   that the packet should be routed back to the application.
   The application will call the SFT lib with the new mbuf,
   which will result in the flow-id returend to the application
   along with the context attached to this connection.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Signed-off-by: Ori Kam <orika@nvidia.com>
---
 lib/librte_ethdev/meson.build            |   3 +
 lib/librte_ethdev/rte_ethdev_version.map |  19 +
 lib/librte_ethdev/rte_sft.c              |   9 +
 lib/librte_ethdev/rte_sft.h              | 878 +++++++++++++++++++++++
 lib/librte_ethdev/rte_sft_driver.h       | 201 ++++++
 5 files changed, 1110 insertions(+)
 create mode 100644 lib/librte_ethdev/rte_sft.c
 create mode 100644 lib/librte_ethdev/rte_sft.h
 create mode 100644 lib/librte_ethdev/rte_sft_driver.h

diff --git a/lib/librte_ethdev/meson.build b/lib/librte_ethdev/meson.build
index 8fc24e8c8a..064e3c9443 100644
--- a/lib/librte_ethdev/meson.build
+++ b/lib/librte_ethdev/meson.build
@@ -9,6 +9,7 @@ sources = files('ethdev_private.c',
 	'rte_ethdev.c',
 	'rte_flow.c',
 	'rte_mtr.c',
+	'rte_sft.c',
 	'rte_tm.c')
 
 headers = files('rte_ethdev.h',
@@ -24,6 +25,8 @@ headers = files('rte_ethdev.h',
 	'rte_flow_driver.h',
 	'rte_mtr.h',
 	'rte_mtr_driver.h',
+	'rte_sft.h',
+	'rte_sft_driver.h',
 	'rte_tm.h',
 	'rte_tm_driver.h')
 
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index f8a0945812..e3c829b494 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -232,6 +232,25 @@ EXPERIMENTAL {
 	rte_eth_fec_get_capability;
 	rte_eth_fec_get;
 	rte_eth_fec_set;
+	rte_sft_drain_mbuf;
+	rte_sft_fini;
+	rte_sft_flow_activate;
+	rte_sft_flow_create;
+	rte_sft_flow_destroy;
+	rte_sft_flow_get_client_obj;
+	rte_sft_flow_get_status;
+	rte_sft_flow_query;
+	rte_sft_flow_set_aging;
+	rte_sft_flow_set_client_obj;
+	rte_sft_flow_set_data;
+	rte_sft_flow_set_offload;
+	rte_sft_flow_set_state;
+	rte_sft_flow_touch;
+	rte_sft_init;
+	rte_sft_process_mbuf;
+	rte_sft_process_mbuf_with_zone;
+
+
 };
 
 INTERNAL {
diff --git a/lib/librte_ethdev/rte_sft.c b/lib/librte_ethdev/rte_sft.c
new file mode 100644
index 0000000000..f3d3945545
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft.c
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+
+#include "rte_sft.h"
+#include "rte_sft_driver.h"
+
+/* Placeholder for RTE SFT library APIs implementation */
diff --git a/lib/librte_ethdev/rte_sft.h b/lib/librte_ethdev/rte_sft.h
new file mode 100644
index 0000000000..edd8671cad
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft.h
@@ -0,0 +1,878 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef _RTE_SFT_H_
+#define _RTE_SFT_H_
+
+/**
+ * @file
+ *
+ * RTE SFT API
+ *
+ * Defines RTE SFT APIs for Statefull Flow Table library.
+ *
+ * The SFT lib is part of the ethdev class, the reason for this is that the main
+ * idea is to leverage the HW offload that the ethdev allow using the rte_flow.
+ *
+ * SFT General description:
+ * SFT library provides a framework for applications that need to maintain
+ * context across different packets of the connection.
+ * Examples for such applications:
+ * - Next-generation firewalls
+ * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
+ * - SW/Virtual Switching: OVS
+ * The goals of the SFT library:
+ * - Accelerate flow recognition & its context retrieval for further look-aside
+ *   processing.
+ * - Enable context-aware flow handling offload.
+ *
+ * The SFT is designed to use HW offload to get the best performance.
+ * This is done on two levels. The first one is marking the packet with flow id
+ * to speed the lookup of the flow in the data structure.
+ * The second is done be connecting the SFT results to the rte_flow for
+ * continuing packet process.
+ *
+ * Definitions and Abbreviations:
+ * - 5-tuple: defined by:
+ *     -- Source IP address
+ *     -- Source port
+ *     -- Destination IP address
+ *     -- Destination port
+ *     -- IP protocol number
+ * - 7-tuple: 5-tuple, zone and port (see struct rte_sft_7tuple)
+ * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
+ * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
+ * - application: SFT library API consumer
+ * - APP: see application
+ * - CID: client ID
+ * - CT: connection tracking
+ * - FID: Flow identifier
+ * - FIF: First In Flow
+ * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
+ * - SFT: Stateful Flow Table
+ * - user: see application
+ * - zone: additional user defined value used as differentiator for
+ *         connections having same 5-tuple (for example different VXLAN
+ *         connections with same inner 5-tuple).
+ *
+ * SFT components:
+ *
+ * +-----------------------------------+
+ * | RTE flow                          |
+ * |                                   |
+ * | +-------------------------------+ |  +----------------+
+ * | | group X                       | |  | RTE_SFT        |
+ * | |                               | |  |                |
+ * | | +---------------------------+ | |  |                |
+ * | | | rule ...                  | | |  |                |
+ * | | | .                         | | |  +-----------+----+
+ * | | | .                         | | |              |
+ * | | | .                         | | |          entry
+ * | | +---------------------------+ | |            create
+ * | | | rule                      | | |              |
+ * | | |   patterns ...            +---------+        |
+ * | | |   actions                 | | |     |        |
+ * | | |     SFT (zone=Z)          | | |     |        |
+ * | | |     JUMP (group=Y)        | | |  lookup      |
+ * | | +---------------------------+ | |    zone=Z,   |
+ * | | | rule ...                  | | |    5tuple    |
+ * | | | .                         | | |     |        |
+ * | | | .                         | | |  +--v-------------+
+ * | | | .                         | | |  | SFT       |    |
+ * | | |                           | | |  |           |    |
+ * | | +---------------------------+ | |  |        +--v--+ |
+ * | |                               | |  |        |     | |
+ * | +-------------------------------+ |  |        | PMD | |
+ * |                                   |  |        |     | |
+ * |                                   |  |        +-----+ |
+ * | +-------------------------------+ |  |                |
+ * | | group Y                       | |  |                |
+ * | |                               | |  | set state      |
+ * | | +---------------------------+ | |  | set data       |
+ * | | | rule                      | | |  +--------+-------+
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=UNDEFINED) | | |           |
+ * | | |   actions RSS             | | |           |
+ * | | +---------------------------+ | |           |
+ * | | | rule                      | | |           |
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=INVALID)   | <-------------+
+ * | | |   actions DROP            | | |  forward
+ * | | +---------------------------+ | |    group=Y
+ * | | | rule                      | | |
+ * | | |   patterns                | | |
+ * | | |     SFT (state=ACCEPTED)  | | |
+ * | | |   actions PORT            | | |
+ * | | +---------------------------+ | |
+ * | |  ...                          | |
+ * | |                               | |
+ * | +-------------------------------+ |
+ * |  ...                              |
+ * |                                   |
+ * +-----------------------------------+
+ *
+ * SFT as datastructure:
+ * SFT can be treated as datastructure maintaining flow context across its
+ * lifetime. SFT flow entry represents bidirectional network flow and defined by
+ * 7-tuple & its reverse 7-tuple.
+ * Each entry in SFT has:
+ * - FID: 1:1 mapped & used as entry handle & encapsulating internal
+ *   implementation of the entry.
+ * - State: user-defined value attached to each entry, the only library
+ *   reserved value for state unset (the actual value defined by SFT
+ *   configuration). The application should define flow state encodings and
+ *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
+ *   applied on packets can be defined via related RTE flow rule matching SFT
+ *   state (see rules in SFT components diagram above).
+ * - Timestamp: for the last seen in flow packet used for flow aging mechanism
+ *   implementation.
+ * - Client Objects: user-defined flow contexts attached as opaques to flow.
+ * - Acceleration & offloading - utilize RTE flow capabilities, when supported
+ *   (see action ``SFT``), for flow lookup acceleration and further
+ *   context-aware flow handling offload.
+ * - CT state: optionally for TCP connections CT state can be maintained
+ *   (see enum rte_sft_flow_ct_state).
+ * - Out of order TCP packets: optionally SFT can keep out of order TCP
+ *   packets aside the flow context till the arrival of the missing in-order
+ *   packet.
+ *
+ * RTE flow changes:
+ * The SFT flow state (or context) for RTE flow is defined by fields of
+ * struct rte_flow_item_sft.
+ * To utilize SFT capabilities new item and action types introduced:
+ * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
+ * - action SFT: retrieve SFT flow context and attache it to the processed
+ *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
+ *
+ * The contents of per port SFT serving RTE flow action ``SFT`` managed via
+ * SFT PMD APIs (see struct rte_sft_ops).
+ * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
+ * action argument and processed packet 5-tuple.
+ * If in scope of action ``SFT`` there is no context/state for the flow in SFT
+ * undefined sate attached to the packet meaning that the flow is not
+ * recognized by SFT, most probably FIF packet.
+ *
+ * Once the SFT state set for a packet it can match on item SFT
+ * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for the
+ * packet, for example:
+ * - if state value == x than queue for further processing by the application
+ * - if state value == y than forward it to eth port (full offload)
+ * - if state value == 'undefined' than queue for further processing by
+ *   the application (handle FIF packets)
+ *
+ * Processing packets with SFT library:
+ *
+ * FIF packet:
+ * To recognize upcoming packets of the SFT flow every FIF packet should be
+ * forwarded to the application utilizing the SFT library. Non-FIF packets can
+ * be processed by the application or its processing can be fully offloaded.
+ * Processing of the packets in SFT library starts with rte_sft_process_mbuf
+ * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
+ * should make a design to destroy flow or complete flow creation process in
+ * SFT using rte_sft_flow_activate.
+ *
+ * Recognized SFT flow:
+ * Once struct rte_sft_flow_status with valid fid field possessed by application
+ * it can:
+ * - mange client objects on it (see client_obj field in
+ *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
+ * - analyze user-defined flow state and CT state (see state & ct_sate fields
+ *   in struct rte_sft_flow_status).
+ * - set flow state to be attached to the upcoming packets by action ``SFT``
+ *   via struct rte_sft_flow_status API.
+ * - decide to destroy flow via rte_sft_flow_destroy API.
+ *
+ * Flow aging:
+ *
+ * SFT library manages the aging for each flow. On flow creation, it's
+ * assigned an aging value, the maximal number of seconds passed since the
+ * last flow packet arrived, once exceeded flow considered aged.
+ * The application notified of aged flow asynchronously via event queues.
+ * The device and port IDs tuple to identify the event queue to enqueue
+ * flow aged events passed on flow creation as arguments
+ * (see rte_sft_flow_activate). It's the application responsibility to
+ * initialize event queues and assign them to each flow for EOF event
+ * notifications.
+ * Aged EOF event handling:
+ * - Should be considered as application responsibility.
+ * - The last stage should be the release of the flow resources via
+ *    rte_sft_flow_destroy API.
+ * - All client objects should be removed from flow before the
+ *   rte_sft_flow_destroy API call.
+ * See the description of ret_sft_flow_destroy for an example of aged flow
+ * handling.
+ *
+ * SFT API thread safety:
+ *
+ * Since the SFT lib is designed to work as part of the Fast-Path, The SFT
+ * is not thread safe, in order to enable better working with multiple threads
+ * the SFT lib uses the queue approach, where each queue can only be accessesd
+ * by one thread while one thread can access multiple queues.
+ *
+ * SFT Library initialization and cleanup:
+ *
+ * SFT library should be considered as a single instance, preconfigured and
+ * initialized via rte_sft_init() API.
+ * SFT library resource deallocation and cleanup should be done via
+ * rte_sft_init() API as a stage of the application termination procedure.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+
+/**
+ * L3/L4 5-tuple - src/dest IP and port and IP protocol.
+ *
+ * Used for flow/connection identification.
+ */
+RTE_STD_C11
+struct rte_sft_5tuple {
+	union {
+		struct {
+			rte_be32_t src_addr; /**< IPv4 source address. */
+			rte_be32_t dst_addr; /**< IPv4 destination address. */
+		} ipv4;
+		struct {
+			uint8_t src_addr[16]; /**< IPv6 source address. */
+			uint8_t dst_addr[16]; /**< IPv6 destination address. */
+		} ipv6;
+	};
+	rte_be16_t src_port; /**< Source port. */
+	rte_be16_t dst_port; /**< Destination port. */
+	uint8_t proto; /**< IP protocol. */
+	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
+};
+
+/**
+ * Port flow identification.
+ *
+ * @p zone used for setups where 5-tuple is not enough to identify flow.
+ * For example different VLANs/VXLANs may have similar 5-tuples.
+ */
+struct rte_sft_7tuple {
+	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
+	uint32_t zone; /**< Zone assigned to flow. */
+	uint16_t port_id; /** <Port identifier of Ethernet device. */
+};
+
+/**
+ * Structure describes SFT library configuration
+ */
+struct rte_sft_conf {
+	uint16_t nb_queues; /**< Preferred number of queues */
+	uint32_t udp_aging; /**< UDP proto default aging in sec */
+	uint32_t tcp_aging; /**< TCP proto default aging in sec */
+	uint32_t tcp_syn_aging; /**< TCP SYN default aging in sec. */
+	uint32_t default_aging; /**< All unlisted proto default aging in sec. */
+	uint32_t nb_max_entries; /**< Max entries in SFT. */
+	uint8_t app_data_len; /**< Number of uint32 of app data. */
+	uint32_t support_partial_match: 1;
+	/**< App can partial match on the data. */
+	uint32_t reorder_enable: 1;
+	/**< TCP packet reordering feature enabled bit. */
+	uint32_t tcp_ct_enable: 1;
+	/**< TCP connection tracking based on standard. */
+	uint32_t reserved: 30;
+};
+
+/**
+ *  Structure that holds the action configuration.
+ */
+struct rte_sft_actions_specs {
+	struct rte_sft_5tuple *initiator_nat;
+	/**< The NAT configuration for the initiator flow. */
+	struct rte_sft_5tuple *reverse_nat;
+	/**< The NAT configuration for the reverse flow. */
+	uint64_t aging; /**< the aging time out in sec. */
+};
+
+#define RTE_SFT_ACTION_INITIATOR_NAT (1ul << 0)
+/**< NAT action should be done on the initiator traffic. */
+#define RTE_SFT_ACTION_REVERSE_NAT (1ul << 1)
+/**< NAT action should be done on the reverse traffic. */
+#define RTE_SFT_ACTION_COUNT (1ul << 2) /**< Enable count action. */
+#define RTE_SFT_ACTION_AGE (1ul << 3) /**< Enable ageing action. */
+
+
+/**
+ * Structure that holds the count data.
+ */
+struct rte_sft_query_data {
+	uint64_t nb_bytes; /**< Number of bytes that passed in the flow. */
+	uint64_t nb_packets; /**< Number of packets that passed in the flow. */
+	uint32_t age; /**< Seconds passed since last seen packet. */
+	uint32_t aging;
+	/**< Flow considered aged once this age (seconds) reached. */
+	uint32_t nb_bytes_valid: 1; /**< Number of bytes is valid. */
+	uint32_t nb_packets_valid: 1; /* Number of packets is valid. */
+	uint32_t nb_age_valid: 1; /* Age is valid. */
+	uint32_t nb_aging_valid: 1; /* Aging is valid. */
+	uint32_t reserved: 28;
+};
+
+/**
+ * Structure describes the state of the flow in SFT.
+ */
+struct rte_sft_flow_status {
+	uint32_t fid; /**< SFT flow id. */
+	uint32_t zone; /**< Zone for lookup in SFT */
+	uint8_t state; /**< Application defined bidirectional flow state. */
+	uint8_t proto_state; /**< The state based on the protocol. */
+	uint16_t proto; /**< L4 protocol. */
+	/**< Connection tracking flow state, based on standard. */
+	uint32_t nb_in_order_mbufs;
+	/**< Number of in-order mbufs available for drain */
+	uint32_t activated: 1; /**< Flow was activated. */
+	uint32_t zone_valid: 1; /**< Zone field is valid. */
+	uint32_t proto_state_change: 1; /**< Protocol state was changed. */
+	uint32_t fragmented: 1; /**< Last flow mbuf was fragmented. */
+	uint32_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
+	uint32_t offloaded: 1;
+	/**< The connection is offload and no packet should be stored. */
+	uint32_t initiator: 1; /**< marks if the mbuf is from the initiator. */
+	uint32_t reserved: 25;
+	uint32_t data[];
+	/**< Application data. The length is defined by the configuration. */
+};
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_sft_error_type {
+	RTE_SFT_ERROR_TYPE_NONE, /**< No error. */
+	RTE_SFT_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_SFT_ERROR_TYPE_FLOW_NOT_DEFINED, /**< The FID is not defined. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by SFT, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_sft_error {
+	enum rte_sft_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get SFT flow status, based on the fid.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] status
+ *   Structure to dump actual SFT flow status.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_get_status(const uint16_t queue, const uint32_t fid,
+			struct rte_sft_flow_status *status,
+			struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined data.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param data
+ *   User defined data. The len is defined at configuration time.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_data(uint16_t queue, uint32_t fid, const uint32_t *data,
+		      struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined state.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param state
+ *   User state.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_state(uint16_t queue, uint32_t fid, const uint8_t state,
+		       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined state.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param offload
+ *   set if flow is offloaded.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_offload(uint16_t queue, uint32_t fid, bool offload,
+			 struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Initialize SFT library instance.
+ *
+ * @param conf
+ *   SFT library instance configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_init(const struct rte_sft_conf *conf, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Finalize SFT library instance.
+ * Cleanup & release allocated resources.
+ */
+__rte_experimental
+void
+rte_sft_fini(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue.
+ *
+ * This function checks the mbuf against the SFT database and return the
+ * connection status that this mbuf belongs to.
+ *
+ * If status.activated = 1 and status.offloaded = 0 the input mbuf is
+ * considered consumed and the application is not allowed to use it or free it,
+ * instead the application should use the mbuf pointed by the mbuf_out.
+ * Incase the mbuf is out of order or fragmented the mbuf_out will be NULL.
+ *
+ * If status.activated = 0 or status.offloaded = 1, the input mbuf is not
+ * consumed and the mbuf_out will allways be NULL.
+ *
+ * This function doesn't create new entry in the SFT.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   if status.activated and status.offload = 0.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Connection status based on the last in mbuf.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_process_mbuf(uint16_t queue, struct rte_mbuf *mbuf_in,
+		     struct rte_mbuf **mbuf_out,
+		     struct rte_sft_flow_status *status,
+		     struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue while zone value provided by caller.
+ *
+ * The behaviour of this function is similar to rte_sft_process_mbuf except
+ * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
+ * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
+ *
+ * @see rte_sft_process_mbuf
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Connection status based on the last in mbuf.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_process_mbuf_with_zone(uint16_t queue, struct rte_mbuf *mbuf_in,
+			       uint32_t zone, struct rte_mbuf **mbuf_out,
+			       struct rte_sft_flow_status *status,
+			       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Drain next in order mbuf.
+ *
+ * This function behaves similar to rte_sft_process_mbuf() but acts on packets
+ * accumulated in SFT flow due to missing in order packet. Processing done on
+ * single mbuf at a time and `in order`. Other than above the behavior is
+ * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
+ * fragmented & 'in order'. This function should be called when
+ * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
+ * status->nb_in_order_mbufs output param !=0 and until
+ * status->nb_in_order_mbufs == 0.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param nb_out
+ *   Number of buffers to be drained.
+ * @param initiator
+ *   true packets that will be drained belongs to the initiator.
+ * @param[out] status
+ *   Connection status based on the last mbuf that was drained.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   The number of mbufs that were drained, negative value in case
+ *   of error and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_drain_mbuf(uint16_t queue, uint32_t fid, struct rte_mbuf **mbuf_out,
+		   uint16_t nb_out, bool initiator,
+		   struct rte_sft_flow_status *status,
+		   struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Activate flow in SFT.
+ *
+ * This function creates an entry in the SFT for this connection.
+ * The reasons for 2 phase flow creation procedure:
+ * 1. Missing reverse flow - flow context is shared for both flow directions
+ *    i.e. in order maintain bidirectional flow context in RTE SFT packets
+ *    arriving from both directions should be identified as packets of the
+ *    RTE SFT flow. Consequently, before the creation of the SFT flow caller
+ *    should provide reverse flow direction 7-tuple.
+ * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone should
+ *   be notified that arrived mbuf is first in flow & decide whether to
+ *   create a new flow or disregard this packet.
+ * This function completes the creation of the bidirectional SFT flow & creates
+ * entry for 7-tuple on SFT PMD defined by the tuple port for both
+ * initiator/initiate 7-tuples.
+ * Flow aging, connection tracking state & out of order handling will be
+ * initialized according to the content of the *mbuf_in* passes to
+ * rte_sft_process_mbuf/_with_zone during phase 1 of flow creation.
+ * Once this function returns upcoming calls rte_sft_process_mbuf/_with_zone
+ * with 7-tuple or its reverse will return the handle to this flow.
+ * Flow should be locked by the caller (see rte_sft_flow_lock).
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param reverse_tuple
+ *   Expected response flow 7-tuple.
+ * @param state
+ *   User defined state to set.
+ * @param data
+ *   User defined data, the len is configured during sft init.
+ * @param proto_enable
+ *   Enables maintenance of status->proto_state connection tracking value
+ *   for the flow. otherwise status->proto_state will be initialized with zeros.
+ * @param dev_id
+ *   Event dev ID to enqueue end of flow event.
+ * @param port_id
+ *   Event port ID to enqueue end of flow event.
+ * @param actions
+ *   Flags that indicate which actions should be done on the packet before
+ *   returning it to the rte_flow.
+ * @param action_specs
+ *   Hold the actions configuration.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Structure to dump SFT flow status once activated.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_activate(uint16_t queue, struct rte_mbuf *mbuf_in,
+		      const struct rte_sft_7tuple *reverse_tuple,
+		      uint8_t state, uint32_t *data, uint8_t proto_enable,
+		      uint8_t dev_id, uint8_t port_id, uint64_t actions,
+		      const struct rte_sft_actions_specs *action_specs,
+		      struct rte_mbuf **mbuf_out,
+		      struct rte_sft_flow_status *status,
+		      struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Artificially create SFT flow.
+ *
+ * Function to create SFT flow before reception of the first flow packet.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param tuple
+ *   Expected initiator flow 7-tuple.
+ * @param reverse_tuple
+ *   Expected initiate flow 7-tuple.
+ * @param state
+ *   User defined state to set.
+ * @param data
+ *   User defined data, the len is configured during sft init.
+ * @param proto_enable
+ *   Enables maintenance of status->proto_state connection tracking value
+ *   for the flow. otherwise status->proto_state will be initialized with zeros.
+ * @param[out] status
+ *   Connection status.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   - on success: 0, locked SFT flow recognized by status->fid.
+ *   - on error: a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_create(uint16_t queue, const struct rte_sft_7tuple *tuple,
+		    const struct rte_sft_7tuple *reverse_tuple,
+		    const struct rte_flow_item_sft *ctx,
+		    uint8_t ct_enable,
+		    struct rte_sft_flow_status *status,
+		    struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Removes flow from SFT.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_destroy(uint16_t queue, uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Query counter and aging data.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] data.
+ *   SFT flow ID.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_query(uint16_t queue, uint32_t fid,
+		   struct rte_sft_query_data *data,
+		   struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset flow age to zero.
+ *
+ * Simulates last flow packet with timestamp set to just now.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_touch(uint16_t queue, uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set flow aging to specific value.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param aging
+ *   New flow aging value.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_aging(uint16_t queue, uint32_t fid, uint32_t aging,
+		       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set client object for given client ID.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to set object for.
+ * @param client_obj
+ *   Pointer to opaque client object structure.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_client_obj(uint16_t queue, uint32_t fid, uint8_t client_id,
+			    void *client_obj, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get client object for given client ID.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to get object for.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid client object opaque pointer in case of success, NULL otherwise
+ *   and rte_sft_error is set.
+ */
+__rte_experimental
+void *
+rte_sft_flow_get_client_obj(uint16_t queue, const uint32_t fid,
+			    uint8_t client_id, struct rte_sft_error *error);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SFT_H_ */
diff --git a/lib/librte_ethdev/rte_sft_driver.h b/lib/librte_ethdev/rte_sft_driver.h
new file mode 100644
index 0000000000..4f1964dab6
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft_driver.h
@@ -0,0 +1,201 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef RTE_SFT_DRIVER_H_
+#define RTE_SFT_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic SFT API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs, they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include "rte_ethdev.h"
+#include "rte_ethdev_driver.h"
+#include "rte_sft.h"
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_sft_entry;
+
+#define RTE_SFT_STATE_FLAG_FID_VALID (1 << 0)
+#define RTE_SFT_STATE_FLAG_ZONE_VALID (1 << 1)
+#define RTE_SFT_STATE_FLAG_FLOW_MISS (1 << 2)
+
+#define RTE_SFT_MISS_TCP_FLAGS (1 << 0)
+
+RTE_STD_C11
+struct rte_sft_decode_info {
+	union {
+		uint32_t fid; /**< The fid value. */
+		uint32_t zone; /**< The zone value. */
+	};
+	uint32_t state;
+	/**< Flags that mark the packet state. see RTE_SFT_STATE_FLAG_*. */
+};
+
+/**
+ * @internal
+ * Insert a flow to the SFT HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param fid
+ *   Flow ID.
+ * @param queue
+ *   The sft working queue.
+ * @param pattern
+ *   The matching pattern.
+ * @param miss_conditions
+ *   The conditions that forces a miss even if the 5 tuple was matched
+ *   see RTE_SFT_MISS_*.
+ * @param actions
+ *   Set pf actions to apply in case the flow was hit. If no terminating action
+ *   (queue, rss, drop, port) was given, the terminating action should be taken
+ *   from the flow that resulted in the SFT.
+ * @param miss_actions
+ *   Set pf actions to apply in case the flow was hit. but the miss conditions
+ *   were hit. (6 tuple match but tcp flags are on) If no terminating action
+ *   (queue, rss, drop, port) was given, the terminating action should be taken
+ *   from the flow that resulted in the SFT.
+ * @param data
+ *   The application data to attached to the flow.
+ * @param data_len
+ *   The length of the data in uint32_t increments.
+ * @param state
+ *   The application state to set.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Pointer to sft_entry in case of success, null otherwise and rte_sft_error
+ *   is set.
+ */
+typedef struct rte_sft_entry *(*sft_entry_create_t)
+		(struct rte_eth_dev *dev, uint32_t fid, uint16_t queue,
+		 const struct rte_flow_item *pattern, uint64_t miss_conditions,
+		 const struct rte_flow_action *actions,
+		 const struct rte_flow_action *miss_actions,
+		 const uint32_t *data, uint16_t data_len, uint8_t state,
+		 struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Modify the state and the data of SFT flow in HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param data
+ *   The application data to attached to the flow.
+ * @param data_len
+ *   The length of the data in uint32_t increments.
+ * @param state
+ *   The application state to set.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_modify_t)(struct rte_eth_dev *dev,
+				   struct rte_sft_entry *entry, uint16_t queue,
+				   const uint32_t *data, uint16_t data_len,
+				   uint8_t state, struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Destroy SFT flow in HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_destory_t)(struct rte_eth_dev *dev,
+				    struct rte_sft_entry *entry, uint16_t queue,
+				    struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Decode sft state and FID from mbuf.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param mbuf
+ *   The input mbuf.
+ * @param info[out]
+ *   The decoded sft data.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_decode_t)(struct rte_eth_dev *dev,
+				   struct rte_sft_entry *entry, uint16_t queue,
+				   struct rte_mbuf *mbuf,
+				   struct rte_sft_decode_info *info,
+				   struct rte_sft_error *error);
+
+/**
+ * Generic sft operations structure implemented and returned by PMDs.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific.
+ *
+ * See also rte_sft_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_sft.h.
+ */
+struct rte_sft_ops {
+	sft_entry_create_t sft_create_entry;
+	sft_entry_modify_t sft_entry_modify;
+	sft_entry_destory_t sft_entry_destory;
+	sft_entry_decode_t sft_entry_decode;
+};
+
+/**
+ * Get generic sft operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_sft_ops *
+rte_sft_ops_get(uint16_t port_id, struct rte_sft_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_SFT_DRIVER_H_ */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC v3 0/2] introduce stateful flow table
  2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
                   ` (4 preceding siblings ...)
  2020-11-04 12:59 ` [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table Ori Kam
@ 2020-11-04 13:17 ` Ori Kam
  2020-11-04 13:17   ` [dpdk-dev] [RFC v3 1/2] ethdev: add item/action for SFT Ori Kam
  2020-11-04 13:17   ` [dpdk-dev] [RFC v3 2/2] ethdev: introduce sft lib Ori Kam
  5 siblings, 2 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 13:17 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

The RFC introduces Stateful Flow Table (SFT) API and changes needed in
both ethdev an RTE flow to support SFT functionality.

SFT library provides a framework for applications that need to maintain
context across different packets of the connection.

The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.

Change log:
v3:
- add change log.
- change to RFC

v2:
- Add queue approach in the SFT.
- Move to ethdev.
- update based on ML commentes.

*** BLURB HERE ***

Andrey Vesnovaty (1):
  ethdev: add item/action for SFT

Ori Kam (1):
  ethdev: introduce sft lib

 lib/librte_ethdev/meson.build            |   3 +
 lib/librte_ethdev/rte_ethdev_version.map |  19 +
 lib/librte_ethdev/rte_flow.h             |  75 ++
 lib/librte_ethdev/rte_sft.c              |   9 +
 lib/librte_ethdev/rte_sft.h              | 877 +++++++++++++++++++++++
 lib/librte_ethdev/rte_sft_driver.h       | 201 ++++++
 6 files changed, 1184 insertions(+)
 create mode 100644 lib/librte_ethdev/rte_sft.c
 create mode 100644 lib/librte_ethdev/rte_sft.h
 create mode 100644 lib/librte_ethdev/rte_sft_driver.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC v3 1/2] ethdev: add item/action for SFT
  2020-11-04 13:17 ` [dpdk-dev] [RFC v3 0/2] introduce stateful flow table Ori Kam
@ 2020-11-04 13:17   ` Ori Kam
  2020-11-04 13:17   ` [dpdk-dev] [RFC v3 2/2] ethdev: introduce sft lib Ori Kam
  1 sibling, 0 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 13:17 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

From: Andrey Vesnovaty <andreyv@nvidia.com>

Attach SFT flow context to packet with SFT action.
Match on SFT flow context (attached to packet),
with SFT item.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
---
 lib/librte_ethdev/rte_flow.h | 75 ++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5489..7ca47cc87c 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -537,6 +537,14 @@ enum rte_flow_item_type {
 	 */
 	RTE_FLOW_ITEM_TYPE_ECPRI,
 
+	/**
+	 * [META]
+	 *
+	 * Matches SFT context.
+	 *
+	 * See struct rte_flow_item_sft.
+	 */
+	RTE_FLOW_ITEM_TYPE_SFT,
 };
 
 /**
@@ -1579,6 +1587,48 @@ static const struct rte_flow_item_ecpri rte_flow_item_ecpri_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_SFT
+ *
+ * Matches context of flow that wasw created by SFT.
+ *
+ * The structure describes SFT flow context.
+ * All the fields of the structure, except @p fid, should be considered as
+ * user defined.
+ * The @p fid assigned by RTE SFT & used as unique flow identifier.
+ * SFT context attached to packet by action ``SFT`` (see RTE_FLOW_ACTION_SFT),
+ * which sends the packet to the SFT module.
+ *
+ * SFT default context defined as context attached to packet when there is no
+ * entry for the flow in SFT. The @p state has application reserved value
+ * meaning that SFT context for the packet undefined since entry wasn't found
+ * in SFT. If state 'undefined' then @p zone should be valid othervice @p fid
+ * should be valid.
+ *
+ * Context considered virtual since the method of storing this info on packet
+ * is PMD/implementation specific & may involve mapping methods if there is
+ * 'not enough bits' to store entire contents of struct rte_flow_item_sft.
+ *
+ * Maximal value/size of each field depends on HW capabilities and considered
+ * as implementation specific.
+ */
+RTE_STD_C11
+struct rte_flow_item_sft {
+	union {
+		uint32_t fid; /**< SFT flow identifier. */
+		uint32_t zone; /**< Zone assigned to flow. */
+	};
+	uint32_t fid_valid:1; /**< The fid member is valid. */
+	uint32_t zone_valid:1; /**< The zone member is valid. */
+	uint32_t reserved:30; /**< Reserved. */
+	uint8_t state; /**< User defined flow state. */
+	uint8_t user_data_size; /**< user_data buffer size. */
+	uint8_t *user_data; /**< Arbitrary user data. */
+};
+
 /**
  * Matching pattern item definition.
  *
@@ -2132,6 +2182,15 @@ enum rte_flow_action_type {
 	 * see enum RTE_ETH_EVENT_FLOW_AGED
 	 */
 	RTE_FLOW_ACTION_TYPE_AGE,
+
+	/**
+	 * RTE_FLOW_ACTION_TYPE_SFT
+	 *
+	 * Direct the packet to SFT module.
+	 *
+	 * See struct rte_flow_action_sft.
+	 */
+	RTE_FLOW_ACTION_TYPE_SFT,
 };
 
 /**
@@ -2721,6 +2780,22 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
 	*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SFT
+ *
+ * Performs lookup by *zone* and 5-tuple in SFT.
+ * If entry is found the related SFT context will be attached otherwise,
+ * default SFT context will be attached.
+ *
+ * This action may result in termination of actions that following this action.
+ */
+struct rte_flow_action_sft {
+	uint32_t zone; /**< Zone for lookup in SFT */
+};
+
 /*
  * Definition of a single action.
  *
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dpdk-dev] [RFC v3 2/2] ethdev: introduce sft lib
  2020-11-04 13:17 ` [dpdk-dev] [RFC v3 0/2] introduce stateful flow table Ori Kam
  2020-11-04 13:17   ` [dpdk-dev] [RFC v3 1/2] ethdev: add item/action for SFT Ori Kam
@ 2020-11-04 13:17   ` Ori Kam
  1 sibling, 0 replies; 17+ messages in thread
From: Ori Kam @ 2020-11-04 13:17 UTC (permalink / raw)
  To: andreyv, mdr
  Cc: alexr, andrey.vesnovaty, arybchenko, dev, elibr, ferruh.yigit,
	orika, ozsh, roniba, thomas, viacheslavo

Defines RTE SFT (Stateful Flow Table) APIs for Stateful Flow Table library.

Currently, DPDK enables only stateless offloading, using the rte_flow.
stateless means that each packet is handled without any knowledge of
previous or future packets.

As we look at the industry, there is much demand to save a context across
packets that belong to a connection.

Examples for such applications:
- Next-generation firewalls
- Intrusion detection/prevention systems (IDS/IPS): Suricata, snort
- SW/Virtual Switching: OVS
The goals of the SFT library:
- Accelerate flow recognition & its context retrieval for further
  lookaside processing.
- Enable context-aware flow handling offload.

The solution suggested is to create a lib that will enable saving states
between different packets that belong to the same connection.

The solution will also enable better HW abstraction than the one we get
from using the rte_flow. The reason for this is that saving states is
not atomic action like we have in rte_flow and also can't be done fully
in HW (The first packets must be seen by the application).
Saying the above this lib is based on interacting with the rte_flow but it
doesn't replace it or encapsulate it.

Key design points.
- The SFT should offload as much as possible to HW.
- The SFT is designed to work alongside the rte_flow.
- The SFT has its own ops that the PMD needs to implement.
- The SFT works on 5 tuple + zone (a user-defined value)

Basic usage flow:
1. Application insert a flow that matches all eth traffic and have
   sft action along with jump action. (in the future this jump can be
   avoided and in doing so saving some jumps,
   but for the most generic and complete solution we think
   that allow the application full control of the packet process using
   rte_flow is better.)
2. Application insert a flow in the target group that matches the packet
   state. Based on this state the application performs the needed
   actions. This flow can also be merged with other matching criteria.
   The application will also add a flow in the target group that will
   upload to the application any packet with a miss state.
3. First eth packet arrives and is routed to the SFT HW component.
   since this is the first packet the SFT will have a miss and will
   mark the packet with miss state and forward it to the target group.
4. The application will pull the packet from the queue and will send it to
   be processed by the sft lib.
5. The SFT will extract HW packet state and if valid the zone or the
   flow-id, and report it back to the application.
6. Application will see that this is a new connection, so it will issue
   SFT command to create a new connection with a selected state.
   The SFT will create a HW flow, that matches the 5 tuple + zone and
   sets the state of the packet. The state can be any u8 value, it is
   the responsibility of the application to match on the value.
7. On the next packet arriving to the HW it will jump to the SFT and
   in the SFT HW there will be a match which will result in setting
   the packet state and ID according to the application request.
8. In case of later miss (at some other group) or application logic
   that the packet should be routed back to the application.
   The application will call the SFT lib with the new mbuf,
   which will result in the flow-id returned to the application
   along with the context attached to this connection.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Signed-off-by: Ori Kam <orika@nvidia.com>
---
 lib/librte_ethdev/meson.build            |   3 +
 lib/librte_ethdev/rte_ethdev_version.map |  19 +
 lib/librte_ethdev/rte_sft.c              |   9 +
 lib/librte_ethdev/rte_sft.h              | 877 +++++++++++++++++++++++
 lib/librte_ethdev/rte_sft_driver.h       | 201 ++++++
 5 files changed, 1109 insertions(+)
 create mode 100644 lib/librte_ethdev/rte_sft.c
 create mode 100644 lib/librte_ethdev/rte_sft.h
 create mode 100644 lib/librte_ethdev/rte_sft_driver.h

diff --git a/lib/librte_ethdev/meson.build b/lib/librte_ethdev/meson.build
index 8fc24e8c8a..064e3c9443 100644
--- a/lib/librte_ethdev/meson.build
+++ b/lib/librte_ethdev/meson.build
@@ -9,6 +9,7 @@ sources = files('ethdev_private.c',
 	'rte_ethdev.c',
 	'rte_flow.c',
 	'rte_mtr.c',
+	'rte_sft.c',
 	'rte_tm.c')
 
 headers = files('rte_ethdev.h',
@@ -24,6 +25,8 @@ headers = files('rte_ethdev.h',
 	'rte_flow_driver.h',
 	'rte_mtr.h',
 	'rte_mtr_driver.h',
+	'rte_sft.h',
+	'rte_sft_driver.h',
 	'rte_tm.h',
 	'rte_tm_driver.h')
 
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index f8a0945812..e3c829b494 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -232,6 +232,25 @@ EXPERIMENTAL {
 	rte_eth_fec_get_capability;
 	rte_eth_fec_get;
 	rte_eth_fec_set;
+	rte_sft_drain_mbuf;
+	rte_sft_fini;
+	rte_sft_flow_activate;
+	rte_sft_flow_create;
+	rte_sft_flow_destroy;
+	rte_sft_flow_get_client_obj;
+	rte_sft_flow_get_status;
+	rte_sft_flow_query;
+	rte_sft_flow_set_aging;
+	rte_sft_flow_set_client_obj;
+	rte_sft_flow_set_data;
+	rte_sft_flow_set_offload;
+	rte_sft_flow_set_state;
+	rte_sft_flow_touch;
+	rte_sft_init;
+	rte_sft_process_mbuf;
+	rte_sft_process_mbuf_with_zone;
+
+
 };
 
 INTERNAL {
diff --git a/lib/librte_ethdev/rte_sft.c b/lib/librte_ethdev/rte_sft.c
new file mode 100644
index 0000000000..f3d3945545
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft.c
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+
+#include "rte_sft.h"
+#include "rte_sft_driver.h"
+
+/* Placeholder for RTE SFT library APIs implementation */
diff --git a/lib/librte_ethdev/rte_sft.h b/lib/librte_ethdev/rte_sft.h
new file mode 100644
index 0000000000..d295bb0b7a
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft.h
@@ -0,0 +1,877 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef _RTE_SFT_H_
+#define _RTE_SFT_H_
+
+/**
+ * @file
+ *
+ * RTE SFT API
+ *
+ * Defines RTE SFT APIs for Statefull Flow Table library.
+ *
+ * The SFT lib is part of the ethdev class, the reason for this is that the main
+ * idea is to leverage the HW offload that the ethdev allow using the rte_flow.
+ *
+ * SFT General description:
+ * SFT library provides a framework for applications that need to maintain
+ * context across different packets of the connection.
+ * Examples for such applications:
+ * - Next-generation firewalls
+ * - Intrusion detection/prevention systems (IDS/IPS): Suricata, Snort
+ * - SW/Virtual Switching: OVS
+ * The goals of the SFT library:
+ * - Accelerate flow recognition & its context retrieval for further look-aside
+ *   processing.
+ * - Enable context-aware flow handling offload.
+ *
+ * The SFT is designed to use HW offload to get the best performance.
+ * This is done on two levels. The first one is marking the packet with flow id
+ * to speed the lookup of the flow in the data structure.
+ * The second is done be connecting the SFT results to the rte_flow for
+ * continuing packet process.
+ *
+ * Definitions and Abbreviations:
+ * - 5-tuple: defined by:
+ *     -- Source IP address
+ *     -- Source port
+ *     -- Destination IP address
+ *     -- Destination port
+ *     -- IP protocol number
+ * - 7-tuple: 5-tuple, zone and port (see struct rte_sft_7tuple)
+ * - 5/7-tuple: 5/7-tuple of the packet from connection initiator
+ * - revers 5/7-tuple: 5/7-tuple of the packet from connection initiate
+ * - application: SFT library API consumer
+ * - APP: see application
+ * - CID: client ID
+ * - CT: connection tracking
+ * - FID: Flow identifier
+ * - FIF: First In Flow
+ * - Flow: defined by 7-tuple and its reverse i.e. flow is bidirectional
+ * - SFT: Stateful Flow Table
+ * - user: see application
+ * - zone: additional user defined value used as differentiator for
+ *         connections having same 5-tuple (for example different VXLAN
+ *         connections with same inner 5-tuple).
+ *
+ * SFT components:
+ *
+ * +-----------------------------------+
+ * | RTE flow                          |
+ * |                                   |
+ * | +-------------------------------+ |  +----------------+
+ * | | group X                       | |  | RTE_SFT        |
+ * | |                               | |  |                |
+ * | | +---------------------------+ | |  |                |
+ * | | | rule ...                  | | |  |                |
+ * | | | .                         | | |  +-----------+----+
+ * | | | .                         | | |              |
+ * | | | .                         | | |          entry
+ * | | +---------------------------+ | |            create
+ * | | | rule                      | | |              |
+ * | | |   patterns ...            +---------+        |
+ * | | |   actions                 | | |     |        |
+ * | | |     SFT (zone=Z)          | | |     |        |
+ * | | |     JUMP (group=Y)        | | |  lookup      |
+ * | | +---------------------------+ | |    zone=Z,   |
+ * | | | rule ...                  | | |    5tuple    |
+ * | | | .                         | | |     |        |
+ * | | | .                         | | |  +--v-------------+
+ * | | | .                         | | |  | SFT       |    |
+ * | | |                           | | |  |           |    |
+ * | | +---------------------------+ | |  |        +--v--+ |
+ * | |                               | |  |        |     | |
+ * | +-------------------------------+ |  |        | PMD | |
+ * |                                   |  |        |     | |
+ * |                                   |  |        +-----+ |
+ * | +-------------------------------+ |  |                |
+ * | | group Y                       | |  |                |
+ * | |                               | |  | set state      |
+ * | | +---------------------------+ | |  | set data       |
+ * | | | rule                      | | |  +--------+-------+
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=UNDEFINED) | | |           |
+ * | | |   actions RSS             | | |           |
+ * | | +---------------------------+ | |           |
+ * | | | rule                      | | |           |
+ * | | |   patterns                | | |           |
+ * | | |     SFT (state=INVALID)   | <-------------+
+ * | | |   actions DROP            | | |  forward
+ * | | +---------------------------+ | |    group=Y
+ * | | | rule                      | | |
+ * | | |   patterns                | | |
+ * | | |     SFT (state=ACCEPTED)  | | |
+ * | | |   actions PORT            | | |
+ * | | +---------------------------+ | |
+ * | |  ...                          | |
+ * | |                               | |
+ * | +-------------------------------+ |
+ * |  ...                              |
+ * |                                   |
+ * +-----------------------------------+
+ *
+ * SFT as datastructure:
+ * SFT can be treated as datastructure maintaining flow context across its
+ * lifetime. SFT flow entry represents bidirectional network flow and defined by
+ * 7-tuple & its reverse 7-tuple.
+ * Each entry in SFT has:
+ * - FID: 1:1 mapped & used as entry handle & encapsulating internal
+ *   implementation of the entry.
+ * - State: user-defined value attached to each entry, the only library
+ *   reserved value for state unset (the actual value defined by SFT
+ *   configuration). The application should define flow state encodings and
+ *   set it for flow via rte_sft_flow_set_ctx() than what actions should be
+ *   applied on packets can be defined via related RTE flow rule matching SFT
+ *   state (see rules in SFT components diagram above).
+ * - Timestamp: for the last seen in flow packet used for flow aging mechanism
+ *   implementation.
+ * - Client Objects: user-defined flow contexts attached as opaques to flow.
+ * - Acceleration & offloading - utilize RTE flow capabilities, when supported
+ *   (see action ``SFT``), for flow lookup acceleration and further
+ *   context-aware flow handling offload.
+ * - CT state: optionally for TCP connections CT state can be maintained
+ *   (see enum rte_sft_flow_ct_state).
+ * - Out of order TCP packets: optionally SFT can keep out of order TCP
+ *   packets aside the flow context till the arrival of the missing in-order
+ *   packet.
+ *
+ * RTE flow changes:
+ * The SFT flow state (or context) for RTE flow is defined by fields of
+ * struct rte_flow_item_sft.
+ * To utilize SFT capabilities new item and action types introduced:
+ * - item SFT: matching on SFT flow state (see RTE_FLOW_ITEM_TYPE_SFT).
+ * - action SFT: retrieve SFT flow context and attached it to the processed
+ *   packet (see RTE_FLOW_ACTION_TYPE_SFT).
+ *
+ * The contents of per port SFT serving RTE flow action ``SFT`` managed via
+ * SFT PMD APIs (see struct rte_sft_ops).
+ * The SFT flow state/context retrieval performed by user-defined zone ``SFT``
+ * action argument and processed packet 5-tuple.
+ * If in scope of action ``SFT`` there is no context/state for the flow in SFT
+ * undefined state attached to the packet meaning that the flow is not
+ * recognized by SFT, most probably FIF packet.
+ *
+ * Once the SFT state set for a packet it can match on item SFT
+ * (see RTE_FLOW_ITEM_TYPE_SFT) and forwarding design can be done for the
+ * packet, for example:
+ * - if state value == x than queue for further processing by the application
+ * - if state value == y than forward it to eth port (full offload)
+ * - if state value == 'undefined' than queue for further processing by
+ *   the application (handle FIF packets)
+ *
+ * Processing packets with SFT library:
+ *
+ * FIF packet:
+ * To recognize upcoming packets of the SFT flow every FIF packet should be
+ * forwarded to the application utilizing the SFT library. Non-FIF packets can
+ * be processed by the application or its processing can be fully offloaded.
+ * Processing of the packets in SFT library starts with rte_sft_process_mbuf
+ * or rte_sft_process_mbuf_with_zone. If mbuf recognized as FIF application
+ * should make a design to destroy flow or complete flow creation process in
+ * SFT using rte_sft_flow_activate.
+ *
+ * Recognized SFT flow:
+ * Once struct rte_sft_flow_status with valid fid field possessed by application
+ * it can:
+ * - mange client objects on it (see client_obj field in
+ *   struct rte_sft_flow_status) using rte_sft_flow_<OP>_client_obj APIs
+ * - analyze user-defined flow state and CT state.
+ * - set flow state to be attached to the upcoming packets by action ``SFT``
+ *   via struct rte_sft_flow_status API.
+ * - decide to destroy flow via rte_sft_flow_destroy API.
+ *
+ * Flow aging:
+ *
+ * SFT library manages the aging for each flow. On flow creation, it's
+ * assigned an aging value, the maximal number of seconds passed since the
+ * last flow packet arrived, once exceeded flow considered aged.
+ * The application notified of aged flow asynchronously via event queues.
+ * The device and port IDs tuple to identify the event queue to enqueue
+ * flow aged events passed on flow creation as arguments
+ * (see rte_sft_flow_activate). It's the application responsibility to
+ * initialize event queues and assign them to each flow for EOF event
+ * notifications.
+ * Aged EOF event handling:
+ * - Should be considered as application responsibility.
+ * - The last stage should be the release of the flow resources via
+ *    rte_sft_flow_destroy API.
+ * - All client objects should be removed from flow before the
+ *   rte_sft_flow_destroy API call.
+ * See the description of ret_sft_flow_destroy for an example of aged flow
+ * handling.
+ *
+ * SFT API thread safety:
+ *
+ * Since the SFT lib is designed to work as part of the Fast-Path, The SFT
+ * is not thread safe, in order to enable better working with multiple threads
+ * the SFT lib uses the queue approach, where each queue can only be accessesd
+ * by one thread while one thread can access multiple queues.
+ *
+ * SFT Library initialization and cleanup:
+ *
+ * SFT library should be considered as a single instance, preconfigured and
+ * initialized via rte_sft_init() API.
+ * SFT library resource deallocation and cleanup should be done via
+ * rte_sft_init() API as a stage of the application termination procedure.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+
+/**
+ * L3/L4 5-tuple - src/dest IP and port and IP protocol.
+ *
+ * Used for flow/connection identification.
+ */
+RTE_STD_C11
+struct rte_sft_5tuple {
+	union {
+		struct {
+			rte_be32_t src_addr; /**< IPv4 source address. */
+			rte_be32_t dst_addr; /**< IPv4 destination address. */
+		} ipv4;
+		struct {
+			uint8_t src_addr[16]; /**< IPv6 source address. */
+			uint8_t dst_addr[16]; /**< IPv6 destination address. */
+		} ipv6;
+	};
+	rte_be16_t src_port; /**< Source port. */
+	rte_be16_t dst_port; /**< Destination port. */
+	uint8_t proto; /**< IP protocol. */
+	uint8_t is_ipv6: 1; /**< True for valid IPv6 fields. Otherwise IPv4. */
+};
+
+/**
+ * Port flow identification.
+ *
+ * @p zone used for setups where 5-tuple is not enough to identify flow.
+ * For example different VLANs/VXLANs may have similar 5-tuples.
+ */
+struct rte_sft_7tuple {
+	struct rte_sft_5tuple flow_5tuple; /**< L3/L4 5-tuple. */
+	uint32_t zone; /**< Zone assigned to flow. */
+	uint16_t port_id; /** <Port identifier of Ethernet device. */
+};
+
+/**
+ * Structure describes SFT library configuration
+ */
+struct rte_sft_conf {
+	uint16_t nb_queues; /**< Preferred number of queues */
+	uint32_t udp_aging; /**< UDP proto default aging in sec */
+	uint32_t tcp_aging; /**< TCP proto default aging in sec */
+	uint32_t tcp_syn_aging; /**< TCP SYN default aging in sec. */
+	uint32_t default_aging; /**< All unlisted proto default aging in sec. */
+	uint32_t nb_max_entries; /**< Max entries in SFT. */
+	uint8_t app_data_len; /**< Number of uint32 of app data. */
+	uint32_t support_partial_match: 1;
+	/**< App can partial match on the data. */
+	uint32_t reorder_enable: 1;
+	/**< TCP packet reordering feature enabled bit. */
+	uint32_t tcp_ct_enable: 1;
+	/**< TCP connection tracking based on standard. */
+	uint32_t reserved: 30;
+};
+
+/**
+ *  Structure that holds the action configuration.
+ */
+struct rte_sft_actions_specs {
+	struct rte_sft_5tuple *initiator_nat;
+	/**< The NAT configuration for the initiator flow. */
+	struct rte_sft_5tuple *reverse_nat;
+	/**< The NAT configuration for the reverse flow. */
+	uint64_t aging; /**< the aging time out in sec. */
+};
+
+#define RTE_SFT_ACTION_INITIATOR_NAT (1ul << 0)
+/**< NAT action should be done on the initiator traffic. */
+#define RTE_SFT_ACTION_REVERSE_NAT (1ul << 1)
+/**< NAT action should be done on the reverse traffic. */
+#define RTE_SFT_ACTION_COUNT (1ul << 2) /**< Enable count action. */
+#define RTE_SFT_ACTION_AGE (1ul << 3) /**< Enable ageing action. */
+
+
+/**
+ * Structure that holds the count data.
+ */
+struct rte_sft_query_data {
+	uint64_t nb_bytes; /**< Number of bytes that passed in the flow. */
+	uint64_t nb_packets; /**< Number of packets that passed in the flow. */
+	uint32_t age; /**< Seconds passed since last seen packet. */
+	uint32_t aging;
+	/**< Flow considered aged once this age (seconds) reached. */
+	uint32_t nb_bytes_valid: 1; /**< Number of bytes is valid. */
+	uint32_t nb_packets_valid: 1; /* Number of packets is valid. */
+	uint32_t nb_age_valid: 1; /* Age is valid. */
+	uint32_t nb_aging_valid: 1; /* Aging is valid. */
+	uint32_t reserved: 28;
+};
+
+/**
+ * Structure describes the state of the flow in SFT.
+ */
+struct rte_sft_flow_status {
+	uint32_t fid; /**< SFT flow id. */
+	uint32_t zone; /**< Zone for lookup in SFT */
+	uint8_t state; /**< Application defined bidirectional flow state. */
+	uint8_t proto_state; /**< The state based on the protocol. */
+	uint16_t proto; /**< L4 protocol. */
+	/**< Connection tracking flow state, based on standard. */
+	uint32_t nb_in_order_mbufs;
+	/**< Number of in-order mbufs available for drain */
+	uint32_t activated: 1; /**< Flow was activated. */
+	uint32_t zone_valid: 1; /**< Zone field is valid. */
+	uint32_t proto_state_change: 1; /**< Protocol state was changed. */
+	uint32_t fragmented: 1; /**< Last flow mbuf was fragmented. */
+	uint32_t out_of_order: 1; /**< Last flow mbuf was out of order (TCP). */
+	uint32_t offloaded: 1;
+	/**< The connection is offload and no packet should be stored. */
+	uint32_t initiator: 1; /**< marks if the mbuf is from the initiator. */
+	uint32_t reserved: 25;
+	uint32_t data[];
+	/**< Application data. The length is defined by the configuration. */
+};
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_sft_error_type {
+	RTE_SFT_ERROR_TYPE_NONE, /**< No error. */
+	RTE_SFT_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_SFT_ERROR_TYPE_FLOW_NOT_DEFINED, /**< The FID is not defined. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by SFT, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_sft_error {
+	enum rte_sft_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get SFT flow status, based on the fid.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] status
+ *   Structure to dump actual SFT flow status.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_get_status(const uint16_t queue, const uint32_t fid,
+			struct rte_sft_flow_status *status,
+			struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined data.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param data
+ *   User defined data. The len is defined at configuration time.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_data(uint16_t queue, uint32_t fid, const uint32_t *data,
+		      struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined state.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param state
+ *   User state.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_state(uint16_t queue, uint32_t fid, const uint8_t state,
+		       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set user defined state.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param offload
+ *   set if flow is offloaded.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_offload(uint16_t queue, uint32_t fid, bool offload,
+			 struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Initialize SFT library instance.
+ *
+ * @param conf
+ *   SFT library instance configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_init(const struct rte_sft_conf *conf, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Finalize SFT library instance.
+ * Cleanup & release allocated resources.
+ */
+__rte_experimental
+void
+rte_sft_fini(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue.
+ *
+ * This function checks the mbuf against the SFT database and return the
+ * connection status that this mbuf belongs to.
+ *
+ * If status.activated = 1 and status.offloaded = 0 the input mbuf is
+ * considered consumed and the application is not allowed to use it or free it,
+ * instead the application should use the mbuf pointed by the mbuf_out.
+ * In case the mbuf is out of order or fragmented the mbuf_out will be NULL.
+ *
+ * If status.activated = 0 or status.offloaded = 1, the input mbuf is not
+ * consumed and the mbuf_out will always be NULL.
+ *
+ * This function doesn't create new entry in the SFT.
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   if status.activated and status.offload = 0.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Connection status based on the last in mbuf.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_process_mbuf(uint16_t queue, struct rte_mbuf *mbuf_in,
+		     struct rte_mbuf **mbuf_out,
+		     struct rte_sft_flow_status *status,
+		     struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Process mbuf received on RX queue while zone value provided by caller.
+ *
+ * The behaviour of this function is similar to rte_sft_process_mbuf except
+ * the lookup in SFT procedure. The lookup in SFT always done by the *zone*
+ * arg and 5-tuple 5-tuple, extracted form mbuf outer header contents.
+ *
+ * @see rte_sft_process_mbuf
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Connection status based on the last in mbuf.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   0 on success , a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_process_mbuf_with_zone(uint16_t queue, struct rte_mbuf *mbuf_in,
+			       uint32_t zone, struct rte_mbuf **mbuf_out,
+			       struct rte_sft_flow_status *status,
+			       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Drain next in order mbuf.
+ *
+ * This function behaves similar to rte_sft_process_mbuf() but acts on packets
+ * accumulated in SFT flow due to missing in order packet. Processing done on
+ * single mbuf at a time and `in order`. Other than above the behavior is
+ * same as of rte_sft_process_mbuf for flow defined & activated & mbuf isn't
+ * fragmented & 'in order'. This function should be called when
+ * rte_sft_process_mbuf or rte_sft_process_mbuf_with_zone sets
+ * status->nb_in_order_mbufs output param !=0 and until
+ * status->nb_in_order_mbufs == 0.
+ * Flow should be locked by caller (see rte_sft_flow_lock).
+ *
+ * @param queue
+ *   The sft queue number.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param nb_out
+ *   Number of buffers to be drained.
+ * @param initiator
+ *   true packets that will be drained belongs to the initiator.
+ * @param[out] status
+ *   Connection status based on the last mbuf that was drained.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. Initialize in case of
+ *   error only.
+ *
+ * @return
+ *   The number of mbufs that were drained, negative value in case
+ *   of error and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_drain_mbuf(uint16_t queue, uint32_t fid, struct rte_mbuf **mbuf_out,
+		   uint16_t nb_out, bool initiator,
+		   struct rte_sft_flow_status *status,
+		   struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Activate flow in SFT.
+ *
+ * This function creates an entry in the SFT for this connection.
+ * The reasons for 2 phase flow creation procedure:
+ * 1. Missing reverse flow - flow context is shared for both flow directions
+ *    i.e. in order maintain bidirectional flow context in RTE SFT packets
+ *    arriving from both directions should be identified as packets of the
+ *    RTE SFT flow. Consequently, before the creation of the SFT flow caller
+ *    should provide reverse flow direction 7-tuple.
+ * 2. The caller of rte_sft_process_mbuf/rte_sft_process_mbuf_with_zone should
+ *   be notified that arrived mbuf is first in flow & decide whether to
+ *   create a new flow or disregard this packet.
+ * This function completes the creation of the bidirectional SFT flow & creates
+ * entry for 7-tuple on SFT PMD defined by the tuple port for both
+ * initiator/initiate 7-tuples.
+ * Flow aging, connection tracking state & out of order handling will be
+ * initialized according to the content of the *mbuf_in* passes to
+ * rte_sft_process_mbuf/_with_zone during phase 1 of flow creation.
+ * Once this function returns upcoming calls rte_sft_process_mbuf/_with_zone
+ * with 7-tuple or its reverse will return the handle to this flow.
+ * Flow should be locked by the caller (see rte_sft_flow_lock).
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param[in] mbuf_in
+ *   mbuf to process; mbuf pointer considered 'consumed' and should not be used
+ *   after successful call to this function.
+ * @param reverse_tuple
+ *   Expected response flow 7-tuple.
+ * @param state
+ *   User defined state to set.
+ * @param data
+ *   User defined data, the len is configured during sft init.
+ * @param proto_enable
+ *   Enables maintenance of status->proto_state connection tracking value
+ *   for the flow. otherwise status->proto_state will be initialized with zeros.
+ * @param dev_id
+ *   Event dev ID to enqueue end of flow event.
+ * @param port_id
+ *   Event port ID to enqueue end of flow event.
+ * @param actions
+ *   Flags that indicate which actions should be done on the packet before
+ *   returning it to the rte_flow.
+ * @param action_specs
+ *   Hold the actions configuration.
+ * @param[out] mbuf_out
+ *   last processed not fragmented and in order mbuf.
+ * @param[out] status
+ *   Structure to dump SFT flow status once activated.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_activate(uint16_t queue, struct rte_mbuf *mbuf_in,
+		      const struct rte_sft_7tuple *reverse_tuple,
+		      uint8_t state, uint32_t *data, uint8_t proto_enable,
+		      uint8_t dev_id, uint8_t port_id, uint64_t actions,
+		      const struct rte_sft_actions_specs *action_specs,
+		      struct rte_mbuf **mbuf_out,
+		      struct rte_sft_flow_status *status,
+		      struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Artificially create SFT flow.
+ *
+ * Function to create SFT flow before reception of the first flow packet.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param tuple
+ *   Expected initiator flow 7-tuple.
+ * @param reverse_tuple
+ *   Expected initiate flow 7-tuple.
+ * @param state
+ *   User defined state to set.
+ * @param data
+ *   User defined data, the len is configured during sft init.
+ * @param proto_enable
+ *   Enables maintenance of status->proto_state connection tracking value
+ *   for the flow. otherwise status->proto_state will be initialized with zeros.
+ * @param[out] status
+ *   Connection status.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   - on success: 0, locked SFT flow recognized by status->fid.
+ *   - on error: a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_create(uint16_t queue, const struct rte_sft_7tuple *tuple,
+		    const struct rte_sft_7tuple *reverse_tuple,
+		    const struct rte_flow_item_sft *ctx,
+		    uint8_t ct_enable,
+		    struct rte_sft_flow_status *status,
+		    struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Removes flow from SFT.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_destroy(uint16_t queue, uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Query counter and aging data.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] data.
+ *   SFT flow ID.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_query(uint16_t queue, uint32_t fid,
+		   struct rte_sft_query_data *data,
+		   struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset flow age to zero.
+ *
+ * Simulates last flow packet with timestamp set to just now.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_touch(uint16_t queue, uint32_t fid, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set flow aging to specific value.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param aging
+ *   New flow aging value.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_aging(uint16_t queue, uint32_t fid, uint32_t aging,
+		       struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set client object for given client ID.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to set object for.
+ * @param client_obj
+ *   Pointer to opaque client object structure.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_sft_error is set.
+ */
+__rte_experimental
+int
+rte_sft_flow_set_client_obj(uint16_t queue, uint32_t fid, uint8_t client_id,
+			    void *client_obj, struct rte_sft_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get client object for given client ID.
+ *
+ * @param queue
+ *   The SFT queue.
+ * @param fid
+ *   SFT flow ID.
+ * @param client_id
+ *   Client ID to get object for.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. SFT initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid client object opaque pointer in case of success, NULL otherwise
+ *   and rte_sft_error is set.
+ */
+__rte_experimental
+void *
+rte_sft_flow_get_client_obj(uint16_t queue, const uint32_t fid,
+			    uint8_t client_id, struct rte_sft_error *error);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SFT_H_ */
diff --git a/lib/librte_ethdev/rte_sft_driver.h b/lib/librte_ethdev/rte_sft_driver.h
new file mode 100644
index 0000000000..6ae3c4b997
--- /dev/null
+++ b/lib/librte_ethdev/rte_sft_driver.h
@@ -0,0 +1,201 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+
+#ifndef RTE_SFT_DRIVER_H_
+#define RTE_SFT_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic SFT API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs, they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include "rte_ethdev.h"
+#include "rte_ethdev_driver.h"
+#include "rte_sft.h"
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct rte_sft_entry;
+
+#define RTE_SFT_STATE_FLAG_FID_VALID (1 << 0)
+#define RTE_SFT_STATE_FLAG_ZONE_VALID (1 << 1)
+#define RTE_SFT_STATE_FLAG_FLOW_MISS (1 << 2)
+
+#define RTE_SFT_MISS_TCP_FLAGS (1 << 0)
+
+RTE_STD_C11
+struct rte_sft_decode_info {
+	union {
+		uint32_t fid; /**< The fid value. */
+		uint32_t zone; /**< The zone value. */
+	};
+	uint32_t state;
+	/**< Flags that mark the packet state. see RTE_SFT_STATE_FLAG_*. */
+};
+
+/**
+ * @internal
+ * Insert a flow to the SFT HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param fid
+ *   Flow ID.
+ * @param queue
+ *   The sft working queue.
+ * @param pattern
+ *   The matching pattern.
+ * @param miss_conditions
+ *   The conditions that forces a miss even if the 5 tuple was matched
+ *   see RTE_SFT_MISS_*.
+ * @param actions
+ *   Set pf actions to apply in case the flow was hit. If no terminating action
+ *   (queue, rss, drop, port) was given, the terminating action should be taken
+ *   from the flow that resulted in the SFT.
+ * @param miss_actions
+ *   Set pf actions to apply in case the flow was hit. but the miss conditions
+ *   were hit. (6 tuple match but tcp flags are on) If no terminating action
+ *   (queue, rss, drop, port) was given, the terminating action should be taken
+ *   from the flow that resulted in the SFT.
+ * @param data
+ *   The application data to attached to the flow.
+ * @param data_len
+ *   The length of the data in uint32_t increments.
+ * @param state
+ *   The application state to set.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Pointer to sft_entry in case of success, null otherwise and rte_sft_error
+ *   is set.
+ */
+typedef struct rte_sft_entry *(*sft_entry_create_t)
+		(struct rte_eth_dev *dev, uint32_t fid, uint16_t queue,
+		 const struct rte_flow_item *pattern, uint64_t miss_conditions,
+		 const struct rte_flow_action *actions,
+		 const struct rte_flow_action *miss_actions,
+		 const uint32_t *data, uint16_t data_len, uint8_t state,
+		 struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Modify the state and the data of SFT flow in HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param data
+ *   The application data to attached to the flow.
+ * @param data_len
+ *   The length of the data in uint32_t increments.
+ * @param state
+ *   The application state to set.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_modify_t)(struct rte_eth_dev *dev,
+				   struct rte_sft_entry *entry, uint16_t queue,
+				   const uint32_t *data, uint16_t data_len,
+				   uint8_t state, struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Destroy SFT flow in HW component.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_destroy_t)(struct rte_eth_dev *dev,
+				    struct rte_sft_entry *entry, uint16_t queue,
+				    struct rte_sft_error *error);
+
+/**
+ * @internal
+ * Decode sft state and FID from mbuf.
+ *
+ * @param dev
+ *   ethdev handle of port.
+ * @param entry
+ *   The entry to modify.
+ * @param queue
+ *   The sft working queue.
+ * @param mbuf
+ *   The input mbuf.
+ * @param info[out]
+ *   The decoded sft data.
+ * @param error[out]
+ *   Verbose of the error.
+ *
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+typedef int *(*sft_entry_decode_t)(struct rte_eth_dev *dev,
+				   struct rte_sft_entry *entry, uint16_t queue,
+				   struct rte_mbuf *mbuf,
+				   struct rte_sft_decode_info *info,
+				   struct rte_sft_error *error);
+
+/**
+ * Generic sft operations structure implemented and returned by PMDs.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific.
+ *
+ * See also rte_sft_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_sft.h.
+ */
+struct rte_sft_ops {
+	sft_entry_create_t sft_create_entry;
+	sft_entry_modify_t sft_entry_modify;
+	sft_entry_destroy_t sft_entry_destroy;
+	sft_entry_decode_t sft_entry_decode;
+};
+
+/**
+ * Get generic sft operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_sft_ops *
+rte_sft_ops_get(uint16_t port_id, struct rte_sft_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_SFT_DRIVER_H_ */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-11-04 13:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-09 20:30 [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
2020-09-09 20:30 ` [dpdk-dev] [RFC 1/3] ethdev: add item/action for SFT Andrey Vesnovaty
2020-09-16 15:46   ` Ori Kam
2020-09-18  7:04     ` Andrew Rybchenko
2020-09-09 20:30 ` [dpdk-dev] [RFC 2/3] ethdev: support SFT APIs Andrey Vesnovaty
2020-09-09 20:30 ` [dpdk-dev] [RFC 3/3] sft: introduce API Andrey Vesnovaty
2020-09-16 18:33   ` Ori Kam
2020-09-18  7:43     ` Andrew Rybchenko
2020-11-02 10:49       ` Andrey Vesnovaty
2020-09-18 13:34   ` Kinsella, Ray
2020-09-15 11:59 ` [dpdk-dev] [RFC 0/3] introduce Stateful Flow Table Andrey Vesnovaty
2020-11-04 12:59 ` [dpdk-dev] [PATCH v2 0/2] introduce stateful flow table Ori Kam
2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 1/2] ethdev: add item/action for SFT Ori Kam
2020-11-04 12:59   ` [dpdk-dev] [PATCH v2 2/2] ethdev: introduce sft lib Ori Kam
2020-11-04 13:17 ` [dpdk-dev] [RFC v3 0/2] introduce stateful flow table Ori Kam
2020-11-04 13:17   ` [dpdk-dev] [RFC v3 1/2] ethdev: add item/action for SFT Ori Kam
2020-11-04 13:17   ` [dpdk-dev] [RFC v3 2/2] ethdev: introduce sft lib Ori Kam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).