DPDK patches and discussions
* [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
@ 2021-03-18  7:30 Bing Zhao
  2021-03-22 15:16 ` Andrew Rybchenko
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Bing Zhao @ 2021-03-18  7:30 UTC (permalink / raw)
  To: orika, thomas, ferruh.yigit, andrew.rybchenko; +Cc: dev

This commit introduces the conntrack action and item.

HW offloading is usually stateless. For stateful offloading such as
a TCP connection, the HW module can provide full offloading without
SW participation once the connection has been established.

The basic usage is that in the first flow the application should add
the conntrack action and in the following flow(s) the application
should use the conntrack item to match on the result.

A TCP connection carries traffic in two directions. To set a conntrack
action context correctly, information from packets of both directions
is required.

The conntrack action should be created on one port, with the peer
port supplied as a parameter to the action. After the context is
created, it can only be used between these two ports (dual-port mode)
or on a single port. In dual-port mode, the application should modify
the action via the action_ctx_update interface before each use, in
order to set the correct direction for the following rte flow.
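
As a rough sketch of the dual-port step above (not part of this patch):
assuming the update interface takes the rte_flow_modify_conntrack
wrapper defined in the diff below, a direction-only update might be
prepared like this. The helper name ct_fill_direction_update is made up
for illustration, and the action structure is trimmed to two fields.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Trimmed copy of the action structure from this patch; only the two
 * fields needed for the illustration are kept.
 */
struct rte_flow_action_conntrack {
	uint16_t peer_port;
	uint32_t is_original_dir:1;
};

/* Wrapper structure with the same fields as in this patch. */
struct rte_flow_modify_conntrack {
	struct rte_flow_action_conntrack new_ct;
	uint32_t direction:1;
	uint32_t state:1;
	uint32_t reserved:30;
};

/* Hypothetical helper: request an update of the direction field only. */
static void
ct_fill_direction_update(struct rte_flow_modify_conntrack *mod,
			 bool original_dir)
{
	memset(mod, 0, sizeof(*mod));
	mod->new_ct.is_original_dir = original_dir ? 1 : 0;
	mod->direction = 1; /* the direction field will be updated */
	mod->state = 0;     /* all other fields are left untouched */
}
```

The filled wrapper would then be handed to the action_ctx_update
interface before creating the next flow.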

The current packet information and connection status can be queried
via the action_ctx_query interface.

For packets received during the conntrack setup, it is suggested to
re-inject them in order to take full advantage of the conntrack. Only
valid packets should pass the conntrack; packets with invalid TCP
information (e.g. out of window) or with an invalid header (e.g.
malformed) should not pass.

Testpmd command line example:

set conntrack [index] enable is 1 last_seq is xxx last ack is xxx /
... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ... /
rply_dir ... / end
flow action_ctx [CTX] create ingress ... / conntrack is [index] / end
flow create 0 group X ingress patterns ... / tcp / end actions action_ctx [CTX]
/ jump group Y / end
flow create 0 group Y ingress patterns ... / ct is [Valid] / end actions
queue index [hairpin queue] / end
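
To illustrate what "ct is [Valid]" is expected to match, here is a
minimal sketch of the usual rte_flow spec/mask comparison applied to
the conntrack flags. The flag values are copied from the diff below;
ct_item_match is a hypothetical helper, since the real comparison is
done by the PMD/HW.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from this patch. */
#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID   (1 << 0)
#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
#define RTE_FLOW_CONNTRACK_FLAG_ERROR         (1 << 2)
#define RTE_FLOW_CONNTRACK_FLAG_DISABLED      (1 << 3)
#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT       (1 << 4)

/* Hypothetical helper: only the bits selected by the mask are compared
 * between the packet's conntrack result and the item spec.
 */
static bool
ct_item_match(uint32_t pkt_flags, uint32_t spec, uint32_t mask)
{
	return (pkt_flags & mask) == (spec & mask);
}
```

With the default mask (all bits set) a packet reported as
VALID | STATE_CHANGED would not match a spec of VALID alone; narrowing
the mask to the VALID bit makes it match.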

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 lib/librte_ethdev/rte_flow.h | 191 +++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 669e677e91..b2e4f0751a 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -550,6 +550,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_geneve_opt
 	 */
 	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
+
+	/**
+	 * [META]
+	 *
+	 * Matches conntrack state.
+	 *
+	 * See struct rte_flow_item_conntrack.
+	 */
+	RTE_FLOW_ITEM_TYPE_CONNTRACK,
 };
 
 /**
@@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
 };
 #endif
 
+/**
+ * The packet is valid.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
+/**
+ * The state of the connection was changed.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
+/**
+ * Error state was detected on this packet for this connection.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
+/**
+ * The HW connection tracking module is disabled.
+ * It can be due to application command or an invalid state.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
+/**
+ * The packet contains some bad field(s).
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_CONNTRACK
+ *
+ * Matches the state of a packet after it passed the connection tracking
+ * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
+ * or a reasonable combination of these bits.
+ */
+struct rte_flow_item_conntrack {
+	uint32_t flags;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
+#ifndef __cplusplus
+static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
+	.flags = 0xffffffff,
+};
+#endif
+
 /**
  * Matching pattern item definition.
  *
@@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_modify_field.
 	 */
 	RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
+
+	/**
+	 * [META]
+	 *
+	 * Enable tracking a TCP connection state.
+	 *
+	 * Send packet to HW connection tracking module for examination.
+	 *
+	 * See struct rte_flow_action_conntrack.
+	 */
+	RTE_FLOW_ACTION_TYPE_CONNTRACK,
 };
 
 /**
@@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
  */
 struct rte_flow_shared_action;
 
+/**
+ * The state of a TCP connection.
+ */
+enum rte_flow_conntrack_state {
+	RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
+	/**< SYN-ACK packet was seen. */
+	RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
+	/**< 3-way handshake was done. */
+	RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
+	/**< First FIN packet was received to close the connection. */
+	RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
+	/**< First FIN was ACKed. */
+	RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
+	/**< After second FIN, waiting for the last ACK. */
+	RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
+	/**< Second FIN was ACKed, connection was closed. */
+};
+
+/**
+ * The last passed TCP packet flags of a connection.
+ */
+enum rte_flow_conntrack_index {
+	RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
+	RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN flag. */
+	RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With SYN+ACK flag. */
+	RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN flag. */
+	RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK flag. */
+	RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST flag. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * Configuration parameters for each direction of a TCP connection.
+ */
+struct rte_flow_tcp_dir_param {
+	uint32_t scale:4; /**< TCP window scaling factor, 0xF to disable. */
+	uint32_t close_initiated:1; /**< The FIN was sent by this direction. */
+	uint32_t last_ack_seen:1;
+	/**< An ACK packet has been received by this side. */
+	uint32_t data_unacked:1;
+	/**< If set, indicates that there is unacked data of the connection. */
+	uint32_t sent_end;
+	/**< Maximal value of sequence + payload length over sent
+	 * packets (next ACK from the opposite direction).
+	 */
+	uint32_t reply_end;
+	/**< Maximal value of (ACK + window size) over received packet + length
+	 * over sent packet (maximal sequence could be sent).
+	 */
+	uint32_t max_win;
+	/**< Maximal value of actual window size over sent packets. */
+	uint32_t max_ack;
+	/**< Maximal value of ACK over sent packets. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_CONNTRACK
+ *
+ * Configuration and initial state for the connection tracking module.
+ * This structure could be used for both setting and query.
+ */
+struct rte_flow_action_conntrack {
+	uint16_t peer_port; /**< The peer port number, can be the same port. */
+	uint32_t is_original_dir:1;
+	/**< Direction of this connection when creating a flow, the value only
+	 * affects the subsequent flows creation.
+	 */
+	uint32_t enable:1;
+	/**< Enable / disable the conntrack HW module. When disabled, the
+	 * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
+	 * In this state the HW will act as passthrough.
+	 */
+	uint32_t live_connection:1;
+	/**< At least one ack was seen, after the connection was established. */
+	uint32_t selective_ack:1;
+	/**< Enable selective ACK on this connection. */
+	uint32_t challenge_ack_passed:1;
+	/**< A challenge ack has passed. */
+	uint32_t last_direction:1;
+	/**< 1: The last packet is seen that comes from the original direction.
+	 * 0: From the reply direction.
+	 */
+	uint32_t liberal_mode:1;
+	/**< No TCP check will be done except the state change. */
+	enum rte_flow_conntrack_state state;
+	/**< The current state of the connection. */
+	uint8_t max_ack_window;
+	/**< Scaling factor for maximal allowed ACK window. */
+	uint8_t retransmission_limit;
+	/**< Maximal allowed number of retransmission times. */
+	struct rte_flow_tcp_dir_param original_dir;
+	/**< TCP parameters of the original direction. */
+	struct rte_flow_tcp_dir_param reply_dir;
+	/**< TCP parameters of the reply direction. */
+	uint16_t last_window;
+	/**< The window value of the last packet passed this conntrack. */
+	enum rte_flow_conntrack_index last_index;
+	uint32_t last_seq;
+	/**< The sequence of the last packet passed this conntrack. */
+	uint32_t last_ack;
+	/**< The acknowledgement of the last packet passed this conntrack. */
+	uint32_t last_end;
+	/**< The total value ACK + payload length of the last packet passed
+	 * this conntrack.
+	 */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_CONNTRACK
+ *
+ * Wrapper structure for the context update interface.
+ * Ports cannot support updating, and the only valid solution is to
+ * destroy the old context and create a new one instead.
+ */
+struct rte_flow_modify_conntrack {
+	struct rte_flow_action_conntrack new_ct;
+	/**< New connection tracking parameters to be updated. */
+	uint32_t direction:1; /**< The direction field will be updated. */
+	uint32_t state:1;
+	/**< All the other fields except direction will be updated. */
+	uint32_t reserved:30; /**< Reserved bits for the future usage. */
+};
+
 /**
  * Field IDs for MODIFY_FIELD action.
  */
-- 
2.19.0.windows.1



* Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
  2021-03-18  7:30 [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item Bing Zhao
@ 2021-03-22 15:16 ` Andrew Rybchenko
  2021-04-07  7:43   ` Bing Zhao
  2021-03-23 23:27 ` Ajit Khaparde
  2021-04-10 13:46 ` [dpdk-dev] [PATCH] " Bing Zhao
  2 siblings, 1 reply; 6+ messages in thread
From: Andrew Rybchenko @ 2021-03-22 15:16 UTC (permalink / raw)
  To: Bing Zhao, orika, thomas, ferruh.yigit; +Cc: dev

On 3/18/21 10:30 AM, Bing Zhao wrote:
> This commit introduced the conntrack action and item.
> 
> Usually the HW offloading is stateless. For some stateful offloading
> like a TCP connection, HW module will help provide the ability of a
> full offloading w/o SW participation after the connection was
> established.
> 
> The basic usage is that in the first flow the application should add
> the conntrack action and in the following flow(s) the application
> should use the conntrack item to match on the result.
> 
> A TCP connection has two directions traffic. To set a conntrack
> action context correctly, information from packets of both directions
> are required.
> 
> The conntrack action should be created on one port and supply the
> peer port as a parameter to the action. After context creating, it
> could only be used between the ports (dual-port mode) or a single
> port. The application should modify the action via action_ctx_update
> interface before each use in dual-port mode, in order to set the
> correct direction for the following rte flow.

Sorry, but "update interface before each use" sounds frightening. Maybe
I simply don't understand all the reasons behind it.

> Query will be supported via action_ctx_query interface, about the
> current packets information and connection status.
> 
> For the packets received during the conntrack setup, it is suggested
> to re-inject the packets in order to take full advantage of the
> conntrack. Only the valid packets should pass the conntrack, packets
> with invalid TCP information, like out of window, or with invalid
> header, like malformed, should not pass.
> 
> Testpmd command line example:
> 
> set conntrack [index] enable is 1 last_seq is xxx last ack is xxx /
> ... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ... /
> rply_dir ... / end
> flow action_ctx [CTX] create ingress ... / conntrack is [index] / end
> flow create 0 group X ingress patterns ... / tcp / end actions action_ctx [CTX]
> / jump group Y / end
> flow create 0 group Y ingress patterns ... / ct is [Valid] / end actions
> queue index [hairpin queue] / end
> 
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
> ---
>  lib/librte_ethdev/rte_flow.h | 191 +++++++++++++++++++++++++++++++++++
>  1 file changed, 191 insertions(+)
> 
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 669e677e91..b2e4f0751a 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -550,6 +550,15 @@ enum rte_flow_item_type {
>  	 * See struct rte_flow_item_geneve_opt
>  	 */
>  	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches conntrack state.
> +	 *
> +	 * See struct rte_flow_item_conntrack.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_CONNTRACK,
>  };
>  
>  /**
> @@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
>  };
>  #endif
>  
> +/**
> + * The packet is with valid.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)

The name sounds like the conntrack state is valid, rather than the
packet being valid from the conntrack point of view. Maybe:
RTE_FLOW_CONNTRACK_FLAG_PKT_VALID? Or _VALID_PKT to
go with _BAD_PKT.

> +/**
> + * The state of the connection was changed.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
> +/**
> + * Error state was detected on this packet for this connection.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
> +/**
> + * The HW connection tracking module is disabled.
> + * It can be due to application command or an invalid state.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
> +/**
> + * The packet contains some bad field(s).
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_CONNTRACK
> + *
> + * Matches the state of a packet after it passed the connection tracking
> + * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
> + * or a reasonable combination of these bits.
> + */
> +struct rte_flow_item_conntrack {
> +	uint32_t flags;
> +};
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
> +	.flags = 0xffffffff,
> +};
> +#endif
> +
>  /**
>   * Matching pattern item definition.
>   *
> @@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
>  	 * See struct rte_flow_action_modify_field.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Enable tracking a TCP connection state.
> +	 *
> +	 * Send packet to HW connection tracking module for examination.
> +	 *
> +	 * See struct rte_flow_action_conntrack.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_CONNTRACK,
>  };
>  
>  /**
> @@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
>   */
>  struct rte_flow_shared_action;
>  
> +/**
> + * The state of a TCP connection.
> + */
> +enum rte_flow_conntrack_state {
> +	RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
> +	/**< SYN-ACK packet was seen. */

May I suggest putting comments before the enum member? IMHO it is
more readable. A comment after makes sense if it is on the same
line; otherwise, it is better to put the comment before the code.
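
Something like the following sketch (same members as in the patch,
only the comment placement changed):

```c
/**
 * The state of a TCP connection.
 */
enum rte_flow_conntrack_state {
	/** SYN-ACK packet was seen. */
	RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
	/** 3-way handshake was done. */
	RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
	/** First FIN packet was received to close the connection. */
	RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
	/** First FIN was ACKed. */
	RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
	/** After second FIN, waiting for the last ACK. */
	RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
	/** Second FIN was ACKed, connection was closed. */
	RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
};
```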

> +	RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
> +	/**< 3-way handshark was done. */
> +	RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
> +	/**< First FIN packet was received to close the connection. */
> +	RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
> +	/**< First FIN was ACKed. */
> +	RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
> +	/**< After second FIN, waiting for the last ACK. */
> +	RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
> +	/**< Second FIN was ACKed, connection was closed. */
> +};
> +
> +/**
> + * The last passed TCP packet flags of a connection.
> + */
> +enum rte_flow_conntrack_index {

Sorry, I don't understand why it is named conntrack_index.

> +	RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With SYN+ACK flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST flag. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * Configuration parameters for each direction of a TCP connection.
> + */
> +struct rte_flow_tcp_dir_param {
> +	uint32_t scale:4; /**< TCP window scaling factor, 0xF to disable. */
> +	uint32_t close_initiated:1; /**< The FIN was sent by this direction. */
> +	uint32_t last_ack_seen:1;
> +	/**< An ACK packet has been received by this side. */

Same here about comments after fields.

> +	uint32_t data_unacked:1;
> +	/**< If set, indicates that there is unacked data of the connection. */
> +	uint32_t sent_end;
> +	/**< Maximal value of sequence + payload length over sent
> +	 * packets (next ACK from the opposite direction).
> +	 */
> +	uint32_t reply_end;
> +	/**< Maximal value of (ACK + window size) over received packet + length
> +	 * over sent packet (maximal sequence could be sent).
> +	 */
> +	uint32_t max_win;
> +	/**< Maximal value of actual window size over sent packets. */
> +	uint32_t max_ack;
> +	/**< Maximal value of ACK over sent packets. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Configuration and initial state for the connection tracking module.
> + * This structure could be used for both setting and query.
> + */
> +struct rte_flow_action_conntrack {
> +	uint16_t peer_port; /**< The peer port number, can be the same port. */
> +	uint32_t is_original_dir:1;
> +	/**< Direction of this connection when creating a flow, the value only
> +	 * affects the subsequent flows creation.
> +	 */

and here too

> +	uint32_t enable:1;
> +	/**< Enable / disable the conntrack HW module. When disabled, the
> +	 * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
> +	 * In this state the HW will act as passthrough.
> +	 */

Does it disable the entire conntrack HW module for all flows?
It sounds like it. If so, that is confusing.

> +	uint32_t live_connection:1;
> +	/**< At least one ack was seen, after the connection was established. */
> +	uint32_t selective_ack:1;
> +	/**< Enable selective ACK on this connection. */
> +	uint32_t challenge_ack_passed:1;
> +	/**< A challenge ack has passed. */
> +	uint32_t last_direction:1;
> +	/**< 1: The last packet is seen that comes from the original direction.
> +	 * 0: From the reply direction.
> +	 */
> +	uint32_t liberal_mode:1;
> +	/**< No TCP check will be done except the state change. */
> +	enum rte_flow_conntrack_state state;
> +	/**< The current state of the connection. */
> +	uint8_t max_ack_window;
> +	/**< Scaling factor for maximal allowed ACK window. */
> +	uint8_t retransmission_limit;
> +	/**< Maximal allowed number of retransmission times. */
> +	struct rte_flow_tcp_dir_param original_dir;
> +	/**< TCP parameters of the original direction. */
> +	struct rte_flow_tcp_dir_param reply_dir;
> +	/**< TCP parameters of the reply direction. */
> +	uint16_t last_window;
> +	/**< The window value of the last packet passed this conntrack. */
> +	enum rte_flow_conntrack_index last_index;
> +	uint32_t last_seq;
> +	/**< The sequence of the last packet passed this conntrack. */
> +	uint32_t last_ack;
> +	/**< The acknowledgement of the last packet passed this conntrack. */
> +	uint32_t last_end;
> +	/**< The total value ACK + payload length of the last packet passed
> +	 * this conntrack.
> +	 */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Wrapper structure for the context update interface.
> + * Ports cannot support updating, and the only valid solution is to
> + * destroy the old context and create a new one instead.
> + */
> +struct rte_flow_modify_conntrack {
> +	struct rte_flow_action_conntrack new_ct;
> +	/**< New connection tracking parameters to be updated. */

and here

> +	uint32_t direction:1; /**< The direction field will be updated. */
> +	uint32_t state:1;
> +	/**< All the other fields except direction will be updated. */
> +	uint32_t reserved:30; /**< Reserved bits for the future usage. */
> +};
> +
>  /**
>   * Field IDs for MODIFY_FIELD action.
>   */
> 



* Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
  2021-03-18  7:30 [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item Bing Zhao
  2021-03-22 15:16 ` Andrew Rybchenko
@ 2021-03-23 23:27 ` Ajit Khaparde
  2021-04-07  2:41   ` Bing Zhao
  2021-04-10 13:46 ` [dpdk-dev] [PATCH] " Bing Zhao
  2 siblings, 1 reply; 6+ messages in thread
From: Ajit Khaparde @ 2021-03-23 23:27 UTC (permalink / raw)
  To: Bing Zhao
  Cc: Ori Kam, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, dpdk-dev


On Thu, Mar 18, 2021 at 12:30 AM Bing Zhao <bingz@nvidia.com> wrote:
>
> This commit introduced the conntrack action and item.
>
> Usually the HW offloading is stateless. For some stateful offloading
> like a TCP connection, HW module will help provide the ability of a
> full offloading w/o SW participation after the connection was
> established.
>
> The basic usage is that in the first flow the application should add
> the conntrack action and in the following flow(s) the application
> should use the conntrack item to match on the result.
>
> A TCP connection has two directions traffic. To set a conntrack
> action context correctly, information from packets of both directions
> are required.
>
> The conntrack action should be created on one port and supply the
> peer port as a parameter to the action. After context creating, it
> could only be used between the ports (dual-port mode) or a single
> port. The application should modify the action via action_ctx_update
> interface before each use in dual-port mode, in order to set the
> correct direction for the following rte flow.
>
> Query will be supported via action_ctx_query interface, about the
> current packets information and connection status.
>
> For the packets received during the conntrack setup, it is suggested
> to re-inject the packets in order to take full advantage of the
> conntrack. Only the valid packets should pass the conntrack, packets
> with invalid TCP information, like out of window, or with invalid
> header, like malformed, should not pass.
>
> Testpmd command line example:
>
> set conntrack [index] enable is 1 last_seq is xxx last ack is xxx /
> ... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ... /
> rply_dir ... / end
> flow action_ctx [CTX] create ingress ... / conntrack is [index] / end
> flow create 0 group X ingress patterns ... / tcp / end actions action_ctx [CTX]
> / jump group Y / end
> flow create 0 group Y ingress patterns ... / ct is [Valid] / end actions
> queue index [hairpin queue] / end
>
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
> ---
>  lib/librte_ethdev/rte_flow.h | 191 +++++++++++++++++++++++++++++++++++
>  1 file changed, 191 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 669e677e91..b2e4f0751a 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -550,6 +550,15 @@ enum rte_flow_item_type {
>          * See struct rte_flow_item_geneve_opt
>          */
>         RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> +
> +       /**
> +        * [META]
> +        *
> +        * Matches conntrack state.
> +        *
> +        * See struct rte_flow_item_conntrack.
> +        */
> +       RTE_FLOW_ITEM_TYPE_CONNTRACK,
>  };
>
>  /**
> @@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
>  };
>  #endif
>
> +/**
> + * The packet is with valid.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
> +/**
> + * The state of the connection was changed.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
> +/**
> + * Error state was detected on this packet for this connection.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
> +/**
> + * The HW connection tracking module is disabled.
> + * It can be due to application command or an invalid state.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
> +/**
> + * The packet contains some bad field(s).
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
Why not an enum? We could use the bits, but group them under an enum?
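
For instance, a rough sketch (the enum tag rte_flow_conntrack_flag is
only a placeholder name; the bit values are unchanged from the defines
above):

```c
/* Sketch only: same bits as the #defines, grouped under one enum. */
enum rte_flow_conntrack_flag {
	/** The packet is valid. */
	RTE_FLOW_CONNTRACK_FLAG_STATE_VALID = (1 << 0),
	/** The state of the connection was changed. */
	RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED = (1 << 1),
	/** Error state was detected on this packet for this connection. */
	RTE_FLOW_CONNTRACK_FLAG_ERROR = (1 << 2),
	/** The HW connection tracking module is disabled. */
	RTE_FLOW_CONNTRACK_FLAG_DISABLED = (1 << 3),
	/** The packet contains some bad field(s). */
	RTE_FLOW_CONNTRACK_FLAG_BAD_PKT = (1 << 4),
};
```

The item could keep "uint32_t flags" and still take a combination of
these values.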

> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_CONNTRACK
> + *
> + * Matches the state of a packet after it passed the connection tracking
> + * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
> + * or a reasonable combination of these bits.
> + */
> +struct rte_flow_item_conntrack {
> +       uint32_t flags;
> +};
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
> +       .flags = 0xffffffff,
> +};
> +#endif
> +
>  /**
>   * Matching pattern item definition.
>   *
> @@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
>          * See struct rte_flow_action_modify_field.
>          */
>         RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
> +
> +       /**
> +        * [META]
> +        *
> +        * Enable tracking a TCP connection state.
> +        *
> +        * Send packet to HW connection tracking module for examination.
> +        *
> +        * See struct rte_flow_action_conntrack.
> +        */
> +       RTE_FLOW_ACTION_TYPE_CONNTRACK,
>  };
>
>  /**
> @@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
>   */
>  struct rte_flow_shared_action;
>
> +/**
> + * The state of a TCP connection.
> + */
> +enum rte_flow_conntrack_state {
> +       RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
> +       /**< SYN-ACK packet was seen. */
> +       RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
> +       /**< 3-way handshark was done. */
> +       RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
> +       /**< First FIN packet was received to close the connection. */
> +       RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
> +       /**< First FIN was ACKed. */
> +       RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
> +       /**< After second FIN, waiting for the last ACK. */
> +       RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
> +       /**< Second FIN was ACKed, connection was closed. */
> +};
> +
> +/**
> + * The last passed TCP packet flags of a connection.
> + */
> +enum rte_flow_conntrack_index {
> +       RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
> +       RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN flag. */
> +       RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With SYN+ACK flag. */
> +       RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN flag. */
> +       RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK flag. */
> +       RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST flag. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * Configuration parameters for each direction of a TCP connection.
> + */
> +struct rte_flow_tcp_dir_param {
> +       uint32_t scale:4; /**< TCP window scaling factor, 0xF to disable. */
> +       uint32_t close_initiated:1; /**< The FIN was sent by this direction. */
> +       uint32_t last_ack_seen:1;
> +       /**< An ACK packet has been received by this side. */
> +       uint32_t data_unacked:1;
> +       /**< If set, indicates that there is unacked data of the connection. */
> +       uint32_t sent_end;
> +       /**< Maximal value of sequence + payload length over sent
> +        * packets (next ACK from the opposite direction).
> +        */
> +       uint32_t reply_end;
> +       /**< Maximal value of (ACK + window size) over received packet + length
> +        * over sent packet (maximal sequence could be sent).
> +        */
> +       uint32_t max_win;
> +       /**< Maximal value of actual window size over sent packets. */
> +       uint32_t max_ack;
> +       /**< Maximal value of ACK over sent packets. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Configuration and initial state for the connection tracking module.
> + * This structure could be used for both setting and query.
Can we split the structure into set and query?
Some of the fields seem to be relevant for a query.
Also the names will be simpler and easier to understand that way.

> + */
> +struct rte_flow_action_conntrack {
> +       uint16_t peer_port; /**< The peer port number, can be the same port. */
> +       uint32_t is_original_dir:1;
> +       /**< Direction of this connection when creating a flow, the value only
> +        * affects the subsequent flows creation.
> +        */
> +       uint32_t enable:1;
> +       /**< Enable / disable the conntrack HW module. When disabled, the
> +        * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
> +        * In this state the HW will act as passthrough.
> +        */
We should be able to enable the block in HW implicitly based on the
rte_flow_create.
I don't think this is needed.

> +       uint32_t live_connection:1;
> +       /**< At least one ack was seen, after the connection was established. */
> +       uint32_t selective_ack:1;
> +       /**< Enable selective ACK on this connection. */
> +       uint32_t challenge_ack_passed:1;
> +       /**< A challenge ack has passed. */
> +       uint32_t last_direction:1;
> +       /**< 1: The last packet is seen that comes from the original direction.
> +        * 0: From the reply direction.
> +        */
> +       uint32_t liberal_mode:1;
> +       /**< No TCP check will be done except the state change. */
> +       enum rte_flow_conntrack_state state;
initial_state or cur_state?

> +       /**< The current state of the connection. */
> +       uint8_t max_ack_window;
> +       /**< Scaling factor for maximal allowed ACK window. */
> +       uint8_t retransmission_limit;
> +       /**< Maximal allowed number of retransmission times. */
> +       struct rte_flow_tcp_dir_param original_dir;
> +       /**< TCP parameters of the original direction. */
> +       struct rte_flow_tcp_dir_param reply_dir;
> +       /**< TCP parameters of the reply direction. */
> +       uint16_t last_window;
> +       /**< The window value of the last packet passed this conntrack. */
> +       enum rte_flow_conntrack_index last_index;
Do you mean rte_flow_conntrack_last_state - as in last state as seen
by HW block?
Or maybe it is the TCP flag and not state?

> +       uint32_t last_seq;
> +       /**< The sequence of the last packet passed this conntrack. */
> +       uint32_t last_ack;
> +       /**< The acknowledgement of the last packet passed this conntrack. */
> +       uint32_t last_end;
> +       /**< The total value ACK + payload length of the last packet passed
> +        * this conntrack.
> +        */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Wrapper structure for the context update interface.
> + * Ports cannot support updating, and the only valid solution is to
> + * destroy the old context and create a new one instead.
> + */
In that case why not destroy the flow and create a new one?

> +struct rte_flow_modify_conntrack {
> +       struct rte_flow_action_conntrack new_ct;
> +       /**< New connection tracking parameters to be updated. */
> +       uint32_t direction:1; /**< The direction field will be updated. */
> +       uint32_t state:1;
> +       /**< All the other fields except direction will be updated. */
> +       uint32_t reserved:30; /**< Reserved bits for the future usage. */
> +};
> +
>  /**
>   * Field IDs for MODIFY_FIELD action.
>   */
> --
> 2.19.0.windows.1
>


* Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
  2021-03-23 23:27 ` Ajit Khaparde
@ 2021-04-07  2:41   ` Bing Zhao
  0 siblings, 0 replies; 6+ messages in thread
From: Bing Zhao @ 2021-04-07  2:41 UTC (permalink / raw)
  To: Ajit Khaparde
  Cc: Ori Kam, NBU-Contact-Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, dpdk-dev

Hello,

> -----Original Message-----
> From: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Sent: Wednesday, March 24, 2021 7:27 AM
> To: Bing Zhao <bingz@nvidia.com>
> Cc: Ori Kam <orika@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; dpdk-dev <dev@dpdk.org>
> Subject: Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow
> action and item
> 
> On Thu, Mar 18, 2021 at 12:30 AM Bing Zhao <bingz@nvidia.com> wrote:
> >
> > This commit introduced the conntrack action and item.
> >
> > Usually the HW offloading is stateless. For some stateful
> offloading
> > like a TCP connection, HW module will help provide the ability of
> a
> > full offloading w/o SW participation after the connection was
> > established.
> >
> > The basic usage is that in the first flow the application should
> add
> > the conntrack action and in the following flow(s) the application
> > should use the conntrack item to match on the result.
> >
> > A TCP connection has two directions traffic. To set a conntrack
> > action context correctly, information from packets of both
> directions
> > are required.
> >
> > The conntrack action should be created on one port and supply the
> > peer port as a parameter to the action. After context creating, it
> > could only be used between the ports (dual-port mode) or a single
> > port. The application should modify the action via
> action_ctx_update
> > interface before each use in dual-port mode, in order to set the
> > correct direction for the following rte flow.
> >
> > Query will be supported via action_ctx_query interface, about the
> > current packets information and connection status.
> >
> > For the packets received during the conntrack setup, it is
> suggested
> > to re-inject the packets in order to take full advantage of the
> > conntrack. Only the valid packets should pass the conntrack,
> packets
> > with invalid TCP information, like out of window, or with invalid
> > header, like malformed, should not pass.
> >
> > Testpmd command line example:
> >
> > set conntrack [index] enable is 1 last_seq is xxx last ack is xxx
> /
> > ... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ...
> /
> > rply_dir ... / end
> > flow action_ctx [CTX] create ingress ... / conntrack is [index] /
> end
> > flow create 0 group X ingress patterns ... / tcp / end actions
> action_ctx [CTX]
> > / jump group Y / end
> > flow create 0 group Y ingress patterns ... / ct is [Valid] / end
> actions
> > queue index [hairpin queue] / end
> >
> > Signed-off-by: Bing Zhao <bingz@nvidia.com>
> > ---
> >  lib/librte_ethdev/rte_flow.h | 191
> +++++++++++++++++++++++++++++++++++
> >  1 file changed, 191 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_flow.h
> b/lib/librte_ethdev/rte_flow.h
> > index 669e677e91..b2e4f0751a 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -550,6 +550,15 @@ enum rte_flow_item_type {
> >          * See struct rte_flow_item_geneve_opt
> >          */
> >         RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> > +
> > +       /**
> > +        * [META]
> > +        *
> > +        * Matches conntrack state.
> > +        *
> > +        * See struct rte_flow_item_conntrack.
> > +        */
> > +       RTE_FLOW_ITEM_TYPE_CONNTRACK,
> >  };
> >
> >  /**
> > @@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
> >  };
> >  #endif
> >
> > +/**
> > + * The packet is with valid.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
> > +/**
> > + * The state of the connection was changed.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
> > +/**
> > + * Error state was detected on this packet for this connection.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
> > +/**
> > + * The HW connection tracking module is disabled.
> > + * It can be due to application command or an invalid state.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
> > +/**
> > + * The packet contains some bad field(s).
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
> Why not an enum? We could use the bits, but group them under an enum?
> 

It could be. BTW, is there any convention to describe when to use #define macros and when to use enum types?
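
For illustration only, grouping the bits under an enum could look like the sketch below. The enum name, the shortened constant names and the helper are hypothetical and are not part of the RFC; only the bit positions are taken from the #define list above.

```c
#include <stdint.h>

/* Hypothetical enum; bit positions mirror the #define list in the RFC. */
enum conntrack_pkt_flag {
	CONNTRACK_FLAG_STATE_VALID   = 1 << 0, /* packet passed the check */
	CONNTRACK_FLAG_STATE_CHANGED = 1 << 1, /* connection state changed */
	CONNTRACK_FLAG_ERROR         = 1 << 2, /* error state detected */
	CONNTRACK_FLAG_DISABLED      = 1 << 3, /* HW module disabled */
	CONNTRACK_FLAG_BAD_PKT       = 1 << 4, /* packet has bad field(s) */
};

/* Return 1 when every bit in 'wanted' is also set in 'flags'. */
static inline int
conntrack_flags_match(uint32_t flags, uint32_t wanted)
{
	return (flags & wanted) == wanted;
}
```

The item field would stay a plain uint32_t, so combinations of bits can still be matched; the enum only gives the values a common type name for documentation and debuggers.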

> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_CONNTRACK
> > + *
> > + * Matches the state of a packet after it passed the connection
> tracking
> > + * examination. The state is a bit mask of one
> RTE_FLOW_CONNTRACK_FLAG*
> > + * or a reasonable combination of these bits.
> > + */
> > +struct rte_flow_item_conntrack {
> > +       uint32_t flags;
> > +};
> > +
> > +/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
> > +#ifndef __cplusplus
> > +static const struct rte_flow_item_conntrack
> rte_flow_item_conntrack_mask = {
> > +       .flags = 0xffffffff,
> > +};
> > +#endif
> > +
> >  /**
> >   * Matching pattern item definition.
> >   *
> > @@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
> >          * See struct rte_flow_action_modify_field.
> >          */
> >         RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
> > +
> > +       /**
> > +        * [META]
> > +        *
> > +        * Enable tracking a TCP connection state.
> > +        *
> > +        * Send packet to HW connection tracking module for
> examination.
> > +        *
> > +        * See struct rte_flow_action_conntrack.
> > +        */
> > +       RTE_FLOW_ACTION_TYPE_CONNTRACK,
> >  };
> >
> >  /**
> > @@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
> >   */
> >  struct rte_flow_shared_action;
> >
> > +/**
> > + * The state of a TCP connection.
> > + */
> > +enum rte_flow_conntrack_state {
> > +       RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
> > +       /**< SYN-ACK packet was seen. */
> > +       RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
> > +       /**< 3-way handshark was done. */
> > +       RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
> > +       /**< First FIN packet was received to close the connection.
> */
> > +       RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
> > +       /**< First FIN was ACKed. */
> > +       RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
> > +       /**< After second FIN, waiting for the last ACK. */
> > +       RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
> > +       /**< Second FIN was ACKed, connection was closed. */
> > +};
> > +
> > +/**
> > + * The last passed TCP packet flags of a connection.
> > + */
> > +enum rte_flow_conntrack_index {
> > +       RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
> > +       RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN
> flag. */
> > +       RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With
> SYN+ACK flag. */
> > +       RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN
> flag. */
> > +       RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK
> flag. */
> > +       RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST
> flag. */
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * Configuration parameters for each direction of a TCP
> connection.
> > + */
> > +struct rte_flow_tcp_dir_param {
> > +       uint32_t scale:4; /**< TCP window scaling factor, 0xF to
> disable. */
> > +       uint32_t close_initiated:1; /**< The FIN was sent by this
> direction. */
> > +       uint32_t last_ack_seen:1;
> > +       /**< An ACK packet has been received by this side. */
> > +       uint32_t data_unacked:1;
> > +       /**< If set, indicates that there is unacked data of the
> connection. */
> > +       uint32_t sent_end;
> > +       /**< Maximal value of sequence + payload length over sent
> > +        * packets (next ACK from the opposite direction).
> > +        */
> > +       uint32_t reply_end;
> > +       /**< Maximal value of (ACK + window size) over received
> packet + length
> > +        * over sent packet (maximal sequence could be sent).
> > +        */
> > +       uint32_t max_win;
> > +       /**< Maximal value of actual window size over sent packets.
> */
> > +       uint32_t max_ack;
> > +       /**< Maximal value of ACK over sent packets. */
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> > + *
> > + * Configuration and initial state for the connection tracking
> module.
> > + * This structure could be used for both setting and query.
> Can we split the structure into set and query.
> Some of the fields seem to be relevant for a query.
> Also the names will be simpler and easier to understand that way.
> 

To my understanding, it may be better to keep all of them in a single structure. Different HW may have different query capabilities, and most of the fields are used for both query and create/update.
If a field is not supported for querying, the PMD could return a default value, e.g., 0. And if we split query and create/update:
1. The query struct may also need to be the UNION of the query capabilities of different HW.
2. The two structures will have a lot of common fields (most of them).
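
A minimal sketch of the "return a default value for unsupported fields" idea, using a trimmed-down stand-in structure. The struct, the capability bits and ct_query() are all hypothetical names invented for this example; they are not part of the RFC or of any PMD.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct rte_flow_action_conntrack. */
struct ct_params {
	uint32_t last_seq;
	uint32_t last_ack;
	uint16_t last_window;
};

/* Hypothetical capability bits describing what a device can report. */
#define CT_QUERY_CAP_LAST_SEQ    (1u << 0)
#define CT_QUERY_CAP_LAST_ACK    (1u << 1)
#define CT_QUERY_CAP_LAST_WINDOW (1u << 2)

/* Copy only the fields the device can report; all other fields stay 0. */
static void
ct_query(const struct ct_params *hw_state, uint32_t caps,
	 struct ct_params *out)
{
	memset(out, 0, sizeof(*out));
	if (caps & CT_QUERY_CAP_LAST_SEQ)
		out->last_seq = hw_state->last_seq;
	if (caps & CT_QUERY_CAP_LAST_ACK)
		out->last_ack = hw_state->last_ack;
	if (caps & CT_QUERY_CAP_LAST_WINDOW)
		out->last_window = hw_state->last_window;
}
```

With this shape the same structure serves create, update and query, and a device that cannot report a field simply leaves it at the default.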

> > + */
> > +struct rte_flow_action_conntrack {
> > +       uint16_t peer_port; /**< The peer port number, can be the
> same port. */
> > +       uint32_t is_original_dir:1;
> > +       /**< Direction of this connection when creating a flow,
> the value only
> > +        * affects the subsequent flows creation.
> > +        */
> > +       uint32_t enable:1;
> > +       /**< Enable / disable the conntrack HW module. When
> disabled, the
> > +        * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
> > +        * In this state the HW will act as passthrough.
> > +        */
> We should be able to enable the block in HW implicitly based on the
> rte_flow_create.
> I don't think this is needed.
> 
> > +       uint32_t live_connection:1;
> > +       /**< At least one ack was seen, after the connection was
> established. */
> > +       uint32_t selective_ack:1;
> > +       /**< Enable selective ACK on this connection. */
> > +       uint32_t challenge_ack_passed:1;
> > +       /**< A challenge ack has passed. */
> > +       uint32_t last_direction:1;
> > +       /**< 1: The last packet is seen that comes from the
> original direction.
> > +        * 0: From the reply direction.
> > +        */
> > +       uint32_t liberal_mode:1;
> > +       /**< No TCP check will be done except the state change. */
> > +       enum rte_flow_conntrack_state state;
> initial_state or cur_state?
> 
> > +       /**< The current state of the connection. */
> > +       uint8_t max_ack_window;
> > +       /**< Scaling factor for maximal allowed ACK window. */
> > +       uint8_t retransmission_limit;
> > +       /**< Maximal allowed number of retransmission times. */
> > +       struct rte_flow_tcp_dir_param original_dir;
> > +       /**< TCP parameters of the original direction. */
> > +       struct rte_flow_tcp_dir_param reply_dir;
> > +       /**< TCP parameters of the reply direction. */
> > +       uint16_t last_window;
> > +       /**< The window value of the last packet passed this
> conntrack. */
> > +       enum rte_flow_conntrack_index last_index;
> Do you mean rte_flow_conntrack_last_state - as in last state as seen
> by HW block?
> Or maybe it is the TCP flag and not state?

They are a little different. It should be the second one: the TCP flag of the last packet that passed the connection tracking module, not the connection state.

> 
> > +       uint32_t last_seq;
> > +       /**< The sequence of the last packet passed this conntrack.
> */
> > +       uint32_t last_ack;
> > +       /**< The acknowledgement of the last packet passed this
> conntrack. */
> > +       uint32_t last_end;
> > +       /**< The total value ACK + payload length of the last
> packet passed
> > +        * this conntrack.
> > +        */
> > +};
> > +
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> > + *
> > + * Wrapper structure for the context update interface.
> > + * Ports cannot support updating, and the only valid solution is
> to
> > + * destroy the old context and create a new one instead.
> > + */
> In that case why not destroy the flow and create a new one?

I may not have fully understood your question, but I will try to answer in more detail; please comment.
The connection tracking action context is created before any flow, and is then used by the flows:
1. The conntrack action is used for the flows of bi-directional traffic, and creating it requires information from TCP packets of both directions.
2. The conntrack action can be used by multiple flows over a single port or dual ports.
3. A flow can be destroyed w/o destroying the action, which can then be reused by a new flow (if needed).
4. The flow of one direction can be destroyed w/o destroying the flow of the opposite direction.

So if the user wants to destroy the action context, it should call the destroy interface directly. The action context is still "alive" after a flow that uses it is destroyed. It cannot be destroyed together with the flow.
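
The lifetime rules above can be sketched with a toy model. Everything here is hypothetical (struct names, function names, the user counter); it does not call any DPDK API. Refusing to destroy a context that still has users is an assumption of this sketch, not something the RFC specifies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a conntrack action context shared by several flows. */
struct ct_ctx {
	unsigned int users; /* number of flows currently using the context */
	bool alive;         /* false once the context has been destroyed */
};

/* Toy stand-in for a flow rule that references a context. */
struct flow {
	struct ct_ctx *ctx;
};

static void
ct_ctx_create(struct ct_ctx *c)
{
	c->users = 0;
	c->alive = true;
}

/* A flow can only be created on a context that is still alive. */
static int
flow_create(struct flow *f, struct ct_ctx *c)
{
	if (!c->alive)
		return -1;
	f->ctx = c;
	c->users++;
	return 0;
}

/* Destroying a flow releases its reference but never the context itself. */
static void
flow_destroy(struct flow *f)
{
	if (f->ctx != NULL) {
		f->ctx->users--;
		f->ctx = NULL;
	}
}

/* Assumption of this sketch: destroy is refused while flows still use it. */
static int
ct_ctx_destroy(struct ct_ctx *c)
{
	if (c->users != 0)
		return -1;
	c->alive = false;
	return 0;
}
```

In this model the original and reply flows can be destroyed independently, and the context stays usable for a new flow until ct_ctx_destroy() is called explicitly.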

> 
> > +struct rte_flow_modify_conntrack {
> > +       struct rte_flow_action_conntrack new_ct;
> > +       /**< New connection tracking parameters to be updated. */
> > +       uint32_t direction:1; /**< The direction field will be
> updated. */
> > +       uint32_t state:1;
> > +       /**< All the other fields except direction will be updated.
> */
> > +       uint32_t reserved:30; /**< Reserved bits for the future
> usage. */
> > +};
> > +
> >  /**
> >   * Field IDs for MODIFY_FIELD action.
> >   */
> > --
> > 2.19.0.windows.1
> >


* Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
  2021-03-22 15:16 ` Andrew Rybchenko
@ 2021-04-07  7:43   ` Bing Zhao
  0 siblings, 0 replies; 6+ messages in thread
From: Bing Zhao @ 2021-04-07  7:43 UTC (permalink / raw)
  To: Andrew Rybchenko, Ori Kam, NBU-Contact-Thomas Monjalon, ferruh.yigit; +Cc: dev

Hi Andrew,
Sorry for the late reply.

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Monday, March 22, 2021 11:17 PM
> To: Bing Zhao <bingz@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-
> Contact-Thomas Monjalon <thomas@monjalon.net>;
> ferruh.yigit@intel.com
> Cc: dev@dpdk.org
> Subject: Re: [RFC] ethdev: introduce conntrack flow action and item
> 
> On 3/18/21 10:30 AM, Bing Zhao wrote:
> > This commit introduced the conntrack action and item.
> >
> > Usually the HW offloading is stateless. For some stateful
> offloading
> > like a TCP connection, HW module will help provide the ability of
> a
> > full offloading w/o SW participation after the connection was
> > established.
> >
> > The basic usage is that in the first flow the application should
> add
> > the conntrack action and in the following flow(s) the application
> > should use the conntrack item to match on the result.
> >
> > A TCP connection has two directions traffic. To set a conntrack
> action
> > context correctly, information from packets of both directions are
> > required.
> >
> > The conntrack action should be created on one port and supply the
> peer
> > port as a parameter to the action. After context creating, it
> could
> > only be used between the ports (dual-port mode) or a single port.
> The
> > application should modify the action via action_ctx_update
> interface
> > before each use in dual-port mode, in order to set the correct
> > direction for the following rte flow.
> 
> Sorry, but "update interface before each use" sounds frightening.
> May be I simply don't understand all reasons behind.

Sorry for the unclear description; "each use in dual-port mode" should be "single-port mode". It is a suggestion rather than a must, depending on the HW. Connection tracking is a bi-directional action and is usually used by both the original and the reply direction flows.
In dual-port mode, the original traffic and the reply traffic usually come from different ports, so the SW can distinguish them implicitly.
But in single-port mode, as in a gateway scenario, all the traffic is ingress and goes into the same port, so it is hard to distinguish the direction.
The update of the action may happen only at the SW level, or also at the HW level, depending on the NIC features.
If the next several flows to be created with this action context are for the same direction, there is no need to call this API. The interface only needs to be called when the context is used for the opposite direction.
The interface is also needed if the action context itself has to change, e.g. the seq/ACK/window.
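
As a rough sketch of "update only when the direction changes", with hypothetical names throughout (this is a toy model, not the action_ctx_update interface itself):

```c
#include <stdbool.h>

/* Toy context: tracks the direction that the next flow will be created for. */
struct ct_ctx {
	bool is_original_dir;  /* direction currently set in the context */
	unsigned int updates;  /* how many times the update path was taken */
};

/* Stand-in for the update interface; here it only flips the SW state. */
static void
ct_ctx_update_dir(struct ct_ctx *c, bool original)
{
	c->is_original_dir = original;
	c->updates++;
}

/* Call before creating a flow: update only if the direction differs. */
static void
ct_prepare_flow(struct ct_ctx *c, bool flow_is_original)
{
	if (c->is_original_dir != flow_is_original)
		ct_ctx_update_dir(c, flow_is_original);
}
```

Creating several flows for the same direction in a row then costs no update call; only the switch to the opposite direction does.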

> 
> > Query will be supported via action_ctx_query interface, about the
> > current packets information and connection status.
> >
> > For the packets received during the conntrack setup, it is
> suggested
> > to re-inject the packets in order to take full advantage of the
> > conntrack. Only the valid packets should pass the conntrack,
> packets
> > with invalid TCP information, like out of window, or with invalid
> > header, like malformed, should not pass.
> >
> > Testpmd command line example:
> >
> > set conntrack [index] enable is 1 last_seq is xxx last ack is xxx
> /
> > ... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ...
> /
> > rply_dir ... / end flow action_ctx [CTX] create ingress ... /
> > conntrack is [index] / end flow create 0 group X ingress
> patterns ...
> > / tcp / end actions action_ctx [CTX] / jump group Y / end flow
> create
> > 0 group Y ingress patterns ... / ct is [Valid] / end actions queue
> > index [hairpin queue] / end

@Andrew @Ori
Is such a command line interface OK from your point of view?

> >
> > Signed-off-by: Bing Zhao <bingz@nvidia.com>
> > ---
> >  lib/librte_ethdev/rte_flow.h | 191
> > +++++++++++++++++++++++++++++++++++
> >  1 file changed, 191 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 669e677e91..b2e4f0751a 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -550,6 +550,15 @@ enum rte_flow_item_type {
> >        * See struct rte_flow_item_geneve_opt
> >        */
> >       RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> > +
> > +     /**
> > +      * [META]
> > +      *
> > +      * Matches conntrack state.
> > +      *
> > +      * See struct rte_flow_item_conntrack.
> > +      */
> > +     RTE_FLOW_ITEM_TYPE_CONNTRACK,
> >  };
> >
> >  /**
> > @@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {  };
> #endif
> >
> > +/**
> > + * The packet is with valid.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
> 
> It sounds like conntrack state is valid, but not packet is valid
> from conntrack point of view. May be:
> RTE_FLOW_CONNTRACK_FLAG_PKT_VALID? Or _VALID_PKT to go with _BAD_PKT.

The original idea is that the state is valid after the packet integrity checking.
1. If some fields of the packet itself have an error, it is _BAD_PKT, and the packet will not be checked by the HW.
Then
2. If the packet passes the HW connection tracking module, it should be considered _VALID.
3. If the packet fails the HW module checking, e.g., out of window, it should be considered to be in an INVALID or ERROR state.
But yes, the names should describe themselves more clearly.
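
The three cases can be written down as a small decision function. The flag values are copied from the RFC hunk quoted above; ct_classify() and its two boolean inputs are hypothetical and only model the order of the checks, not real HW behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the RFC hunk (names may change in the patch). */
#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)

/*
 * header_ok:    the packet itself is well formed (integrity check).
 * tcp_check_ok: the packet passed the conntrack TCP checks, e.g. it is
 *               inside the window. Only meaningful when header_ok is true.
 */
static uint32_t
ct_classify(bool header_ok, bool tcp_check_ok)
{
	/* Case 1: a malformed packet is never examined by the CT module. */
	if (!header_ok)
		return RTE_FLOW_CONNTRACK_FLAG_BAD_PKT;
	/* Case 3: examined, but failed the TCP checks. */
	if (!tcp_check_ok)
		return RTE_FLOW_CONNTRACK_FLAG_ERROR;
	/* Case 2: examined and passed. */
	return RTE_FLOW_CONNTRACK_FLAG_STATE_VALID;
}
```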

> 
> > +/**
> > + * The state of the connection was changed.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
> > +/**
> > + * Error state was detected on this packet for this connection.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
> > +/**
> > + * The HW connection tracking module is disabled.
> > + * It can be due to application command or an invalid state.
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
> > +/**
> > + * The packet contains some bad field(s).
> > + */
> > +#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_CONNTRACK
> > + *
> > + * Matches the state of a packet after it passed the connection
> > +tracking
> > + * examination. The state is a bit mask of one
> > +RTE_FLOW_CONNTRACK_FLAG*
> > + * or a reasonable combination of these bits.
> > + */
> > +struct rte_flow_item_conntrack {
> > +     uint32_t flags;
> > +};
> > +
> > +/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */ #ifndef
> > +__cplusplus static const struct rte_flow_item_conntrack
> > +rte_flow_item_conntrack_mask = {
> > +     .flags = 0xffffffff,
> > +};
> > +#endif
> > +
> >  /**
> >   * Matching pattern item definition.
> >   *
> > @@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
> >        * See struct rte_flow_action_modify_field.
> >        */
> >       RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
> > +
> > +     /**
> > +      * [META]
> > +      *
> > +      * Enable tracking a TCP connection state.
> > +      *
> > +      * Send packet to HW connection tracking module for
> examination.
> > +      *
> > +      * See struct rte_flow_action_conntrack.
> > +      */
> > +     RTE_FLOW_ACTION_TYPE_CONNTRACK,
> >  };
> >
> >  /**
> > @@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
> >   */
> >  struct rte_flow_shared_action;
> >
> > +/**
> > + * The state of a TCP connection.
> > + */
> > +enum rte_flow_conntrack_state {
> > +     RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
> > +     /**< SYN-ACK packet was seen. */
> 
> May I suggest to put comments before enum member. IMHO it is more
> readable. Comment after makes sense if it is on the same line,
> otherwise, it is better to use comments before code.

Sure, I will change it in the patch itself.

> 
> > +     RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
> > +     /**< 3-way handshark was done. */
> > +     RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
> > +     /**< First FIN packet was received to close the connection.
> */
> > +     RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
> > +     /**< First FIN was ACKed. */
> > +     RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
> > +     /**< After second FIN, waiting for the last ACK. */
> > +     RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
> > +     /**< Second FIN was ACKed, connection was closed. */ };
> > +
> > +/**
> > + * The last passed TCP packet flags of a connection.
> > + */
> > +enum rte_flow_conntrack_index {
> 
> Sorry, I don't understand why it is named conntrack_index.

Maybe "flag" would be a better name than "index"? Or any other suggestion?

> 
> > +     RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
> > +     RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN flag.
> */
> > +     RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With
> SYN+ACK flag. */
> > +     RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN flag.
> */
> > +     RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK flag.
> */
> > +     RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST flag.
> */
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * Configuration parameters for each direction of a TCP
> connection.
> > + */
> > +struct rte_flow_tcp_dir_param {
> > +     uint32_t scale:4; /**< TCP window scaling factor, 0xF to
> disable. */
> > +     uint32_t close_initiated:1; /**< The FIN was sent by this
> direction. */
> > +     uint32_t last_ack_seen:1;
> > +     /**< An ACK packet has been received by this side. */
> 
> Same here about comments after fields.

Will change them all.

> 
> > +     uint32_t data_unacked:1;
> > +     /**< If set, indicates that there is unacked data of the
> connection. */
> > +     uint32_t sent_end;
> > +     /**< Maximal value of sequence + payload length over sent
> > +      * packets (next ACK from the opposite direction).
> > +      */
> > +     uint32_t reply_end;
> > +     /**< Maximal value of (ACK + window size) over received
> packet + length
> > +      * over sent packet (maximal sequence could be sent).
> > +      */
> > +     uint32_t max_win;
> > +     /**< Maximal value of actual window size over sent packets.
> */
> > +     uint32_t max_ack;
> > +     /**< Maximal value of ACK over sent packets. */ };
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior
> notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> > + *
> > + * Configuration and initial state for the connection tracking
> module.
> > + * This structure could be used for both setting and query.
> > + */
> > +struct rte_flow_action_conntrack {
> > +     uint16_t peer_port; /**< The peer port number, can be the
> same port. */
> > +     uint32_t is_original_dir:1;
> > +     /**< Direction of this connection when creating a flow, the
> value only
> > +      * affects the subsequent flows creation.
> > +      */
> 
> and here tool

Will change them all in the patch.

> 
> > +     uint32_t enable:1;
> > +     /**< Enable / disable the conntrack HW module. When disabled,
> the
> > +      * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
> > +      * In this state the HW will act as passthrough.
> > +      */
> 
> Does it disable entire conntrack HW module for all flows?
> It sounds like this. If so - confusing.

Yes, for all the flows that use this connection tracking context. The remaining flows, which use other CT contexts, will not be impacted.

> 
> > +     uint32_t live_connection:1;
> > +     /**< At least one ack was seen, after the connection was
> established. */
> > +     uint32_t selective_ack:1;
> > +     /**< Enable selective ACK on this connection. */
> > +     uint32_t challenge_ack_passed:1;
> > +     /**< A challenge ack has passed. */
> > +     uint32_t last_direction:1;
> > +     /**< 1: The last packet is seen that comes from the original
> direction.
> > +      * 0: From the reply direction.
> > +      */
> > +     uint32_t liberal_mode:1;
> > +     /**< No TCP check will be done except the state change. */
> > +     enum rte_flow_conntrack_state state;
> > +     /**< The current state of the connection. */
> > +     uint8_t max_ack_window;
> > +     /**< Scaling factor for maximal allowed ACK window. */
> > +     uint8_t retransmission_limit;
> > +     /**< Maximal allowed number of retransmission times. */
> > +     struct rte_flow_tcp_dir_param original_dir;
> > +     /**< TCP parameters of the original direction. */
> > +     struct rte_flow_tcp_dir_param reply_dir;
> > +     /**< TCP parameters of the reply direction. */
> > +     uint16_t last_window;
> > +     /**< The window value of the last packet passed this
> conntrack. */
> > +     enum rte_flow_conntrack_index last_index;
> > +     uint32_t last_seq;
> > +     /**< The sequence of the last packet passed this conntrack.
> */
> > +     uint32_t last_ack;
> > +     /**< The acknowledgement of the last packet passed this
> conntrack. */
> > +     uint32_t last_end;
> > +     /**< The total value ACK + payload length of the last packet
> passed
> > +      * this conntrack.
> > +      */
> > +};
> > +
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> > + *
> > + * Wrapper structure for the context update interface.
> > + * Ports cannot support updating, and the only valid solution is
> to
> > + * destroy the old context and create a new one instead.
> > + */
> > +struct rte_flow_modify_conntrack {
> > +     struct rte_flow_action_conntrack new_ct;
> > +     /**< New connection tracking parameters to be updated. */
> 
> and here

Will change them all in the formal patch.

> 
> > +     uint32_t direction:1; /**< The direction field will be
> updated. */
> > +     uint32_t state:1;
> > +     /**< All the other fields except direction will be updated.
> */
> > +     uint32_t reserved:30; /**< Reserved bits for the future
> usage.
> > +*/ };
> > +
> >  /**
> >   * Field IDs for MODIFY_FIELD action.
> >   */
> >



* [dpdk-dev] [PATCH] ethdev: introduce conntrack flow action and item
  2021-03-18  7:30 [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item Bing Zhao
  2021-03-22 15:16 ` Andrew Rybchenko
  2021-03-23 23:27 ` Ajit Khaparde
@ 2021-04-10 13:46 ` Bing Zhao
  2 siblings, 0 replies; 6+ messages in thread
From: Bing Zhao @ 2021-04-10 13:46 UTC (permalink / raw)
  To: orika, thomas, ferruh.yigit, andrew.rybchenko; +Cc: dev, ajit.khaparde

This commit introduced the conntrack action and item.

Usually the HW offloading is stateless. For some stateful offloading
like a TCP connection, HW module will help provide the ability of a
full offloading w/o SW participation after the connection was
established.

The basic usage is that in the first flow the application should add
the conntrack action and in the following flow(s) the application
should use the conntrack item to match on the result.

A TCP connection has traffic in two directions. To set a conntrack
action context correctly, information from packets of both directions
is required.

The conntrack action should be created on one port and supply the
peer port as a parameter to the action. After context creating, it
could only be used between the ports (dual-port mode) or a single
port. The application should modify the action via the API
"action_handle_update" only before using it to create a flow
with the opposite direction. This will help the driver to recognize the
direction of the flow to be created, especially in single-port mode.
The traffic from both directions will go through the same port if
the application works as a "forwarding engine" rather than an end point.
There is no need to call the update interface if the subsequent flows
have nothing to be changed.

Query will be supported via action_ctx_query interface, about the
current packets information and connection status. The fields that
can be queried depend on the HW.

For the packets received during the conntrack setup, it is suggested
to re-inject the packets in order to take full advantage of the
conntrack. Only the valid packets should pass the conntrack, packets
with invalid TCP information, like out of window, or with invalid
header, like malformed, should not pass.

Naming and definition:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/netfilter/nf_conntrack_tcp.h
https://elixir.bootlin.com/linux/latest/source/net/netfilter/nf_conntrack_proto_tcp.c

Other reference:
https://www.usenix.org/legacy/events/sec01/invitedtalks/rooij.pdf

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 lib/librte_ethdev/rte_flow.h | 195 +++++++++++++++++++++++++++++++++++
 1 file changed, 195 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 6cc57136ac..d506377f7e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -551,6 +551,15 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_geneve_opt
 	 */
 	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
+
+	/**
+	 * [META]
+	 *
+	 * Matches conntrack state.
+	 *
+	 * See struct rte_flow_item_conntrack.
+	 */
+	RTE_FLOW_ITEM_TYPE_CONNTRACK,
 };
 
 /**
@@ -1685,6 +1694,51 @@ rte_flow_item_geneve_opt_mask = {
 };
 #endif
 
+/**
+ * The packet is with valid state after conntrack checking.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_PKT_STATE_VALID (1 << 0)
+/**
+ * The state of the connection was changed.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_PKT_STATE_CHANGED (1 << 1)
+/**
+ * Error is detected on this packet for this connection and
+ * an invalid state is set.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_PKT_STATE_INVAL (1 << 2)
+/**
+ * The HW connection tracking module is disabled.
+ * It can be due to application command or an invalid state.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_HW_DISABLED (1 << 3)
+/**
+ * The packet contains some bad field(s) and cannot continue
+ * with the conntrack module checking.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_PKT_BAD (1 << 4)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_CONNTRACK
+ *
+ * Matches the state of a packet after it passed the connection tracking
+ * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
+ * or a reasonable combination of these bits.
+ */
+struct rte_flow_item_conntrack {
+	uint32_t flags;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
+#ifndef __cplusplus
+static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
+	.flags = 0xffffffff,
+};
+#endif
+
 /**
  * Matching pattern item definition.
  *
@@ -2267,6 +2321,17 @@ enum rte_flow_action_type {
 	 * See struct rte_flow_action_modify_field.
 	 */
 	RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
+
+	/**
+	 * [META]
+	 *
+	 * Enable tracking a TCP connection state.
+	 *
+	 * Send packet to HW connection tracking module for examination.
+	 *
+	 * See struct rte_flow_action_conntrack.
+	 */
+	RTE_FLOW_ACTION_TYPE_CONNTRACK,
 };
 
 /**
@@ -2859,6 +2924,136 @@ struct rte_flow_action_set_dscp {
  */
 struct rte_flow_shared_action;
 
+/**
+ * The state of a TCP connection.
+ */
+enum rte_flow_conntrack_state {
+	/**< SYN-ACK packet was seen. */
+	RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
+	/**< 3-way handshake was done. */
+	RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
+	/**< First FIN packet was received to close the connection. */
+	RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
+	/**< First FIN was ACKed. */
+	RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
+	/**< Second FIN was received, waiting for the last ACK. */
+	RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
+	/**< Second FIN was ACKed, connection was closed. */
+	RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
+};
+
+/**
+ * The last passed TCP packet flags of a connection.
+ */
+enum rte_flow_conntrack_tcp_last_index {
+	RTE_FLOW_CONNTRACK_FLAG_NONE = 0, /**< No Flag. */
+	RTE_FLOW_CONNTRACK_FLAG_SYN = (1 << 0), /**< With SYN flag. */
+	RTE_FLOW_CONNTRACK_FLAG_SYNACK = (1 << 1), /**< With SYN+ACK flag. */
+	RTE_FLOW_CONNTRACK_FLAG_FIN = (1 << 2), /**< With FIN flag. */
+	RTE_FLOW_CONNTRACK_FLAG_ACK = (1 << 3), /**< With ACK flag. */
+	RTE_FLOW_CONNTRACK_FLAG_RST = (1 << 4), /**< With RST flag. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * Configuration parameters for each direction of a TCP connection.
+ */
+struct rte_flow_tcp_dir_param {
+	uint32_t scale:4; /**< TCP window scaling factor, 0xF to disable. */
+	uint32_t close_initiated:1; /**< The FIN was sent by this direction. */
+	/**< An ACK packet has been received by this side. */
+	uint32_t last_ack_seen:1;
+	/**< If set, indicates that there is unacked data of the connection. */
+	uint32_t data_unacked:1;
+	/**< Maximal value of sequence + payload length over sent
+	 * packets (next ACK from the opposite direction).
+	 */
+	uint32_t sent_end;
+	/**< Maximal value of (ACK + window size) over received packet + length
+	 * over sent packet (maximal sequence could be sent).
+	 */
+	uint32_t reply_end;
+	/**< Maximal value of actual window size over sent packets. */
+	uint32_t max_win;
+	/**< Maximal value of ACK over sent packets. */
+	uint32_t max_ack;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_CONNTRACK
+ *
+ * Configuration and initial state for the connection tracking module.
+ * This structure could be used for both setting and query.
+ */
+struct rte_flow_action_conntrack {
+	uint16_t peer_port; /**< The peer port number, can be the same port. */
+	/**< Direction of this connection when creating a flow; the value only
+	 * affects the creation of subsequent flows.
+	 */
+	uint32_t is_original_dir:1;
+	/**< Enable / disable the conntrack HW module. When disabled, the
+	 * result will always be RTE_FLOW_CONNTRACK_FLAG_HW_DISABLED.
+	 * In this state the HW will act as passthrough.
+	 * It only affects this conntrack object in the HW without any effect
+	 * to the other objects.
+	 */
+	uint32_t enable:1;
+	/**< At least one ack was seen, after the connection was established. */
+	uint32_t live_connection:1;
+	/**< Enable selective ACK on this connection. */
+	uint32_t selective_ack:1;
+	/**< A challenge ack has passed. */
+	uint32_t challenge_ack_passed:1;
+	/**< 1: The last packet seen came from the original direction.
+	 * 0: It came from the reply direction.
+	 */
+	uint32_t last_direction:1;
+	/**< No TCP check will be done except the state change. */
+	uint32_t liberal_mode:1;
+	/**< The current state of the connection. */
+	enum rte_flow_conntrack_state state;
+	/**< Scaling factor for maximal allowed ACK window. */
+	uint8_t max_ack_window;
+	/**< Maximal allowed number of retransmission times. */
+	uint8_t retransmission_limit;
+	/**< TCP parameters of the original direction. */
+	struct rte_flow_tcp_dir_param original_dir;
+	/**< TCP parameters of the reply direction. */
+	struct rte_flow_tcp_dir_param reply_dir;
+	/**< The window value of the last packet passed this conntrack. */
+	uint16_t last_window;
+	enum rte_flow_conntrack_tcp_last_index last_index;
+	/**< The sequence of the last packet passed this conntrack. */
+	uint32_t last_seq;
+	/**< The acknowledgement of the last packet passed this conntrack. */
+	uint32_t last_ack;
+	/**< The total value ACK + payload length of the last packet passed
+	 * this conntrack.
+	 */
+	uint32_t last_end;
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_CONNTRACK
+ *
+ * Wrapper structure for the context update interface.
+ * Some ports may not support updating; for those, the only valid solution
+ * is to destroy the old context and create a new one instead.
+ */
+struct rte_flow_modify_conntrack {
+	/**< New connection tracking parameters to be updated. */
+	struct rte_flow_action_conntrack new_ct;
+	uint32_t direction:1; /**< The direction field will be updated. */
+	/**< All the other fields except direction will be updated. */
+	uint32_t state:1;
+	uint32_t reserved:30; /**< Reserved bits for the future usage. */
+};
+
 /**
  * Field IDs for MODIFY_FIELD action.
  */
-- 
2.30.0.windows.2


end of thread, other threads:[~2021-04-10 13:46 UTC | newest]

Thread overview: 6+ messages
2021-03-18  7:30 [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item Bing Zhao
2021-03-22 15:16 ` Andrew Rybchenko
2021-04-07  7:43   ` Bing Zhao
2021-03-23 23:27 ` Ajit Khaparde
2021-04-07  2:41   ` Bing Zhao
2021-04-10 13:46 ` [dpdk-dev] [PATCH] " Bing Zhao
