DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Bing Zhao <bingz@nvidia.com>,
	orika@nvidia.com, thomas@monjalon.net, ferruh.yigit@intel.com
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item
Date: Mon, 22 Mar 2021 18:16:57 +0300	[thread overview]
Message-ID: <9ad370a3-ebb7-8c15-9cdb-ec0b35a9bd3e@oktetlabs.ru> (raw)
In-Reply-To: <1616052616-344969-1-git-send-email-bingz@nvidia.com>

On 3/18/21 10:30 AM, Bing Zhao wrote:
> This commit introduced the conntrack action and item.
> 
> Usually the HW offloading is stateless. For some stateful offloading
> like a TCP connection, HW module will help provide the ability of a
> full offloading w/o SW participation after the connection was
> established.
> 
> The basic usage is that in the first flow the application should add
> the conntrack action and in the following flow(s) the application
> should use the conntrack item to match on the result.
> 
> A TCP connection has two directions traffic. To set a conntrack
> action context correctly, information from packets of both directions
> are required.
> 
> The conntrack action should be created on one port and supply the
> peer port as a parameter to the action. After context creating, it
> could only be used between the ports (dual-port mode) or a single
> port. The application should modify the action via action_ctx_update
> interface before each use in dual-port mode, in order to set the
> correct direction for the following rte flow.

Sorry, but "update interface before each use" sounds frightening. May be
I simply don't understand all
reasons behind.

> Query will be supported via action_ctx_query interface, about the
> current packets information and connection status.
> 
> For the packets received during the conntrack setup, it is suggested
> to re-inject the packets in order to take full advantage of the
> conntrack. Only the valid packets should pass the conntrack, packets
> with invalid TCP information, like out of window, or with invalid
> header, like malformed, should not pass.
> 
> Testpmd command line example:
> 
> set conntrack [index] enable is 1 last_seq is xxx last ack is xxx /
> ... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ... /
> rply_dir ... / end
> flow action_ctx [CTX] create ingress ... / conntrack is [index] / end
> flow create 0 group X ingress patterns ... / tcp / end actions action_ctx [CTX]
> / jump group Y / end
> flow create 0 group Y ingress patterns ... / ct is [Valid] / end actions
> queue index [hairpin queue] / end
> 
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
> ---
>  lib/librte_ethdev/rte_flow.h | 191 +++++++++++++++++++++++++++++++++++
>  1 file changed, 191 insertions(+)
> 
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index 669e677e91..b2e4f0751a 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -550,6 +550,15 @@ enum rte_flow_item_type {
>  	 * See struct rte_flow_item_geneve_opt
>  	 */
>  	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches conntrack state.
> +	 *
> +	 * See struct rte_flow_item_conntrack.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_CONNTRACK,
>  };
>  
>  /**
> @@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
>  };
>  #endif
>  
> +/**
> + * The packet is with valid.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)

It sounds like conntrack state is valid, but not packet is
valid from conntrack point of view. May be:
RTE_FLOW_CONNTRACK_FLAG_PKT_VALID? Or _VALID_PKT to
go with _BAD_PKT.

> +/**
> + * The state of the connection was changed.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
> +/**
> + * Error state was detected on this packet for this connection.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
> +/**
> + * The HW connection tracking module is disabled.
> + * It can be due to application command or an invalid state.
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
> +/**
> + * The packet contains some bad field(s).
> + */
> +#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_CONNTRACK
> + *
> + * Matches the state of a packet after it passed the connection tracking
> + * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
> + * or a reasonable combination of these bits.
> + */
> +struct rte_flow_item_conntrack {
> +	uint32_t flags;
> +};
> +
> +/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
> +#ifndef __cplusplus
> +static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
> +	.flags = 0xffffffff,
> +};
> +#endif
> +
>  /**
>   * Matching pattern item definition.
>   *
> @@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
>  	 * See struct rte_flow_action_modify_field.
>  	 */
>  	RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Enable tracking a TCP connection state.
> +	 *
> +	 * Send packet to HW connection tracking module for examination.
> +	 *
> +	 * See struct rte_flow_action_conntrack.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_CONNTRACK,
>  };
>  
>  /**
> @@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
>   */
>  struct rte_flow_shared_action;
>  
> +/**
> + * The state of a TCP connection.
> + */
> +enum rte_flow_conntrack_state {
> +	RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
> +	/**< SYN-ACK packet was seen. */

May I suggest to put comments before enum member. IMHO it is
more readable. Comment after makes sense if it is on the same
line, otherwise, it is better to use comments before code.

> +	RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
> +	/**< 3-way handshark was done. */
> +	RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
> +	/**< First FIN packet was received to close the connection. */
> +	RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
> +	/**< First FIN was ACKed. */
> +	RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
> +	/**< After second FIN, waiting for the last ACK. */
> +	RTE_FLOW_CONNTRACK_STATE_TIME_WAIT,
> +	/**< Second FIN was ACKed, connection was closed. */
> +};
> +
> +/**
> + * The last passed TCP packet flags of a connection.
> + */
> +enum rte_flow_conntrack_index {

Sorry, I don't understand why it is named conntrack_index.

> +	RTE_FLOW_CONNTRACK_INDEX_NONE = 0, /**< No Flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_SYN = (1 << 0), /**< With SYN flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_SYN_ACK = (1 << 1), /**< With SYN+ACK flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_FIN = (1 << 2), /**< With FIN flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_ACK = (1 << 3), /**< With ACK flag. */
> +	RTE_FLOW_CONNTRACK_INDEX_RST = (1 << 4), /**< With RST flag. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * Configuration parameters for each direction of a TCP connection.
> + */
> +struct rte_flow_tcp_dir_param {
> +	uint32_t scale:4; /**< TCP window scaling factor, 0xF to disable. */
> +	uint32_t close_initiated:1; /**< The FIN was sent by this direction. */
> +	uint32_t last_ack_seen:1;
> +	/**< An ACK packet has been received by this side. */

Same here about comments after fields.

> +	uint32_t data_unacked:1;
> +	/**< If set, indicates that there is unacked data of the connection. */
> +	uint32_t sent_end;
> +	/**< Maximal value of sequence + payload length over sent
> +	 * packets (next ACK from the opposite direction).
> +	 */
> +	uint32_t reply_end;
> +	/**< Maximal value of (ACK + window size) over received packet + length
> +	 * over sent packet (maximal sequence could be sent).
> +	 */
> +	uint32_t max_win;
> +	/**< Maximal value of actual window size over sent packets. */
> +	uint32_t max_ack;
> +	/**< Maximal value of ACK over sent packets. */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Configuration and initial state for the connection tracking module.
> + * This structure could be used for both setting and query.
> + */
> +struct rte_flow_action_conntrack {
> +	uint16_t peer_port; /**< The peer port number, can be the same port. */
> +	uint32_t is_original_dir:1;
> +	/**< Direction of this connection when creating a flow, the value only
> +	 * affects the subsequent flows creation.
> +	 */

and here tool

> +	uint32_t enable:1;
> +	/**< Enable / disable the conntrack HW module. When disabled, the
> +	 * result will always be RTE_FLOW_CONNTRACK_FLAG_DISABLED.
> +	 * In this state the HW will act as passthrough.
> +	 */

Does it disable entire conntrack HW module for all flows?
It sounds like this. If so - confusing.

> +	uint32_t live_connection:1;
> +	/**< At least one ack was seen, after the connection was established. */
> +	uint32_t selective_ack:1;
> +	/**< Enable selective ACK on this connection. */
> +	uint32_t challenge_ack_passed:1;
> +	/**< A challenge ack has passed. */
> +	uint32_t last_direction:1;
> +	/**< 1: The last packet is seen that comes from the original direction.
> +	 * 0: From the reply direction.
> +	 */
> +	uint32_t liberal_mode:1;
> +	/**< No TCP check will be done except the state change. */
> +	enum rte_flow_conntrack_state state;
> +	/**< The current state of the connection. */
> +	uint8_t max_ack_window;
> +	/**< Scaling factor for maximal allowed ACK window. */
> +	uint8_t retransmission_limit;
> +	/**< Maximal allowed number of retransmission times. */
> +	struct rte_flow_tcp_dir_param original_dir;
> +	/**< TCP parameters of the original direction. */
> +	struct rte_flow_tcp_dir_param reply_dir;
> +	/**< TCP parameters of the reply direction. */
> +	uint16_t last_window;
> +	/**< The window value of the last packet passed this conntrack. */
> +	enum rte_flow_conntrack_index last_index;
> +	uint32_t last_seq;
> +	/**< The sequence of the last packet passed this conntrack. */
> +	uint32_t last_ack;
> +	/**< The acknowledgement of the last packet passed this conntrack. */
> +	uint32_t last_end;
> +	/**< The total value ACK + payload length of the last packet passed
> +	 * this conntrack.
> +	 */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_CONNTRACK
> + *
> + * Wrapper structure for the context update interface.
> + * Ports cannot support updating, and the only valid solution is to
> + * destroy the old context and create a new one instead.
> + */
> +struct rte_flow_modify_conntrack {
> +	struct rte_flow_action_conntrack new_ct;
> +	/**< New connection tracking parameters to be updated. */

and here

> +	uint32_t direction:1; /**< The direction field will be updated. */
> +	uint32_t state:1;
> +	/**< All the other fields except direction will be updated. */
> +	uint32_t reserved:30; /**< Reserved bits for the future usage. */
> +};
> +
>  /**
>   * Field IDs for MODIFY_FIELD action.
>   */
> 


  reply	other threads:[~2021-03-22 15:17 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-18  7:30 Bing Zhao
2021-03-22 15:16 ` Andrew Rybchenko [this message]
2021-04-07  7:43   ` Bing Zhao
2021-03-23 23:27 ` Ajit Khaparde
2021-04-07  2:41   ` Bing Zhao
2021-04-10 13:46 ` [dpdk-dev] [PATCH] " Bing Zhao
2021-04-15 16:24   ` Ori Kam
2021-04-15 16:44     ` Bing Zhao
2021-04-15 16:41   ` [dpdk-dev] [PATCH v2 0/2] " Bing Zhao
2021-04-15 16:41     ` [dpdk-dev] [PATCH v2 1/2] " Bing Zhao
2021-04-16 10:49       ` Thomas Monjalon
2021-04-16 18:18         ` Bing Zhao
2021-04-16 12:41       ` Ori Kam
2021-04-16 18:05         ` Bing Zhao
2021-04-16 21:47           ` Ajit Khaparde
2021-04-17  6:10             ` Bing Zhao
2021-04-17 14:54               ` Ajit Khaparde
2021-04-15 16:41     ` [dpdk-dev] [PATCH v2 2/2] app/testpmd: add CLI for conntrack Bing Zhao
2021-04-16  8:46       ` Ori Kam
2021-04-16 18:20         ` Bing Zhao
2021-04-16 17:54   ` [dpdk-dev] [PATCH v3 0/3] ethdev: introduce conntrack flow action and item Bing Zhao
2021-04-16 17:54     ` [dpdk-dev] [PATCH v3 1/3] " Bing Zhao
2021-04-16 18:30       ` Ajit Khaparde
2021-04-19 14:08         ` Thomas Monjalon
2021-04-19 16:21           ` Bing Zhao
2021-04-19 14:06       ` Thomas Monjalon
2021-04-19 16:13         ` Bing Zhao
2021-04-16 17:54     ` [dpdk-dev] [PATCH v3 2/3] app/testpmd: add CLI for conntrack Bing Zhao
2021-04-16 17:54     ` [dpdk-dev] [PATCH v3 3/3] doc: update " Bing Zhao
2021-04-16 18:22       ` Thomas Monjalon
2021-04-16 18:30       ` Ajit Khaparde
2021-04-19 17:28         ` Bing Zhao
2021-04-19 17:16   ` [dpdk-dev] [PATCH v4 0/3] ethdev: introduce conntrack flow action and item Bing Zhao
2021-04-19 17:16     ` [dpdk-dev] [PATCH v4 1/3] " Bing Zhao
2021-04-19 17:33       ` Ori Kam
2021-04-19 17:16     ` [dpdk-dev] [PATCH v4 2/3] app/testpmd: add CLI for conntrack Bing Zhao
2021-04-19 17:35       ` Ori Kam
2021-04-19 17:16     ` [dpdk-dev] [PATCH v4 3/3] doc: update " Bing Zhao
2021-04-19 17:32       ` Thomas Monjalon
2021-04-19 17:37       ` Ori Kam
2021-04-19 17:51   ` [dpdk-dev] [PATCH v5 0/2] ethdev: introduce conntrack flow action and item Bing Zhao
2021-04-19 17:51     ` [dpdk-dev] [PATCH v5 1/2] " Bing Zhao
2021-04-19 18:07       ` Thomas Monjalon
2021-04-19 23:29         ` Ferruh Yigit
2021-04-19 17:51     ` [dpdk-dev] [PATCH v5 2/2] app/testpmd: add CLI for conntrack Bing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9ad370a3-ebb7-8c15-9cdb-ec0b35a9bd3e@oktetlabs.ru \
    --to=andrew.rybchenko@oktetlabs.ru \
    --cc=bingz@nvidia.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=orika@nvidia.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).