From: Ivan Malov
To: Alexander Kozyrev, dev@dpdk.org
Cc: thomas@monjalon.net, orika@nvidia.com, andrew.rybchenko@oktetlabs.ru, ferruh.yigit@intel.com, mohammad.abdul.awal@intel.com, qi.z.zhang@intel.com, jerinj@marvell.com, ajit.khaparde@broadcom.com
Date: Wed, 6 Oct 2021 19:24:58 +0300
In-Reply-To: <20211006044835.3936226-4-akozyrev@nvidia.com>
Subject: Re: [dpdk-dev] [PATCH 3/3] ethdev: add async queue-based flow rules operations

Hi,

On 06/10/2021 07:48, Alexander Kozyrev wrote:
> A new, faster, queue-based flow rules management mechanism is needed for
> applications offloading rules inside the datapath. This asynchronous
> and lockless mechanism frees the CPU for further packet processing and
> reduces the performance impact of the flow rules creation/destruction
> on the datapath. Note that queues are not thread-safe and queue-based
> operations can be safely invoked without any locks from a single thread.
>
> The rte_flow_q_flow_create() function enqueues a flow creation to the
> requested queue. It benefits from already configured resources and sets
> unique values on top of item and action templates. A flow rule is enqueued
> on the specified flow queue and offloaded asynchronously to the hardware.
> The function returns immediately to spare CPU for further packet
> processing. The application must invoke the rte_flow_q_dequeue() function
> to complete the flow rule operation offloading, to clear the queue, and to
> receive the operation status. The rte_flow_q_flow_destroy() function
> enqueues a flow destruction to the requested queue.
>
> Signed-off-by: Alexander Kozyrev
> Suggested-by: Ori Kam
> ---
>  lib/ethdev/rte_flow.h | 288 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 288 insertions(+)
>
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index ba3204b17e..8cdffd8d2e 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -4298,6 +4298,13 @@ struct rte_flow_port_attr {
>       * Version of the struct layout, should be 0.
>       */
>      uint32_t version;
> +    /**
> +     * Number of flow queues to be configured.
> +     * Flow queues are used for asyncronous flow rule creation/destruction.

Typo: asyncronous --> asynchronous

> +     * The order of operations is not guaranteed inside a queue.
> +     * Flow queues are not thread-safe.
> +     */
> +    uint16_t nb_queues;
>      /**
>       * Memory size allocated for the flow rules management.
>       * If set to 0, memory is allocated dynamically.
> @@ -4330,6 +4337,21 @@ struct rte_flow_port_attr {
>      uint32_t fixed_resource_size:1;
>  };
>
> +/**
> + * Flow engine queue configuration.
> + */
> +__extension__
> +struct rte_flow_queue_attr {

Perhaps "struct rte_flow_queue_mode" or "struct rte_flow_queue_conf". I don't insist.

> +    /**
> +     * Version of the struct layout, should be 0.
> +     */
> +    uint32_t version;
> +    /**
> +     * Number of flow rule operations a queue can hold.
> +     */
> +    uint32_t size;
> +};
> +
>  /**
>   * @warning
>   * @b EXPERIMENTAL: this API may change without prior notice.
> @@ -4346,6 +4368,8 @@ struct rte_flow_port_attr {
>   *   Port identifier of Ethernet device.
>   * @param[in] port_attr
>   *   Port configuration attributes.
> + * @param[in] queue_attr
> + *   Array that holds attributes for each queue.

This should probably say that the number of queues / array size is taken
from port_attr->nb_queues. Also, consider "... for each flow queue".

>   * @param[out] error
>   *   Perform verbose error reporting if not NULL.
>   *   PMDs initialize this structure in case of error only.
>   *
> @@ -4357,6 +4381,7 @@
>  __rte_experimental
>  int
>  rte_flow_configure(uint16_t port_id,
>          const struct rte_flow_port_attr *port_attr,
> +        const struct rte_flow_queue_attr *queue_attr[],
>          struct rte_flow_error *error);
>
>  /**
> @@ -4626,6 +4651,269 @@
>  __rte_experimental
>  int
>  rte_flow_table_destroy(uint16_t port_id, struct rte_flow_table *table,
>          struct rte_flow_error *error);
> +
> +/**
> + * Queue operation attributes
> + */
> +__extension__
> +struct rte_flow_q_ops_attr {
> +    /**
> +     * Version of the struct layout, should be 0.
> +     */
> +    uint32_t version;
> +    /**
> +     * The user data that will be returned on the completion.

Maybe "on completion events"?

> +     */
> +    void *user_data;
> +    /**
> +     * When set, the requested action must be sent to the HW without
> +     * any delay. Any prior requests must be also sent to the HW.
> +     * If this bit is cleared, the application must call the
> +     * rte_flow_queue_flush API to actually send the request to the HW.

Not sure that I understand the "Any prior requests ..." part. If this
structure configures operation mode for the whole queue and not for each
enqueue request, then no "prior requests" can exist in the first place
because each submission is meant to be immediately sent to the HW. But
if this structure can vary across enqueue requests, then this
documentation should be improved to say this clearly.

> +     */
> +    uint32_t flush:1;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue rule creation operation.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue used to insert the rule.
> + * @param[in] attr
> + *   Rule creation operation attributes.

At a bare minimum, please consider renaming this to "queue_attr". More
variants (see above): "queue_mode", "queue_conf". I suggest doing so to
avoid confusion with "struct rte_flow_attr" which sits in "struct
rte_flow_table_attr" in fact.

If this structure must be exactly the same as the one used in
rte_flow_configure(), please say so. If, however, this structure can be
completely different on enqueue operations, then the argument name
should indicate it somehow. Maybe "override_queue_conf". Otherwise,
it's unclear.

There are similar occurrences below.

> + * @param[in] table
> + *   Table to select templates from.

Perhaps "template_table"?

> + * @param[in] items
> + *   List of pattern items to be used.
> + *   The list order should match the order in the item template.
> + *   The spec is the only relevant member of the item that is being used.
> + * @param[in] item_template_index
> + *   Item template index in the table.
> + * @param[in] actions
> + *   List of actions to be used.
> + *   The list order should match the order in the action template.
> + * @param[in] action_template_index
> + *   Action template index in the table.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   Handle on success, NULL otherwise and rte_errno is set.
> + *   The rule handle doesn't mean that the rule was offloaded.
> + *   Only completion result indicates that the rule was offloaded.
> + */
> +__rte_experimental
> +struct rte_flow *
> +rte_flow_q_flow_create(uint16_t port_id, uint32_t queue,
> +            const struct rte_flow_q_ops_attr *attr,
> +            const struct rte_flow_table *table,
> +            const struct rte_flow_item items[],
> +            uint8_t item_template_index,
> +            const struct rte_flow_action actions[],
> +            uint8_t action_template_index,
> +            struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue rule destruction operation.
> + *
> + * This function enqueues a destruction operation on the queue.
> + * Application should assume that after calling this function
> + * the rule handle is not valid anymore.
> + * Completion indicates the full removal of the rule from the HW.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue which is used to destroy the rule.
> + *   This must match the queue on which the rule was created.
> + * @param[in] attr
> + *   Rule destroy operation attributes.
> + * @param[in] flow
> + *   Flow handle to be destroyed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_flow_destroy(uint16_t port_id, uint32_t queue,
> +            struct rte_flow_q_ops_attr *attr,
> +            struct rte_flow *flow,
> +            struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue indirect rule creation operation.
> + * @see rte_flow_action_handle_create

Why "indirect rule"? This API seems to enqueue an "indirect action"...

> + *
> + * @param[in] port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] queue
> + *   Flow queue which is used to create the rule.
> + * @param[in] attr
> + *   Queue operation attributes.
> + * @param[in] conf
> + *   Action configuration for the indirect action object creation.

Perhaps "indir_conf" or "indir_action_conf"?

> + * @param[in] action
> + *   Specific configuration of the indirect action object.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   - (0) if success.
> + *   - (-ENODEV) if *port_id* invalid.
> + *   - (-ENOSYS) if underlying device does not support this functionality.
> + *   - (-EIO) if underlying device is removed.
> + *   - (-ENOENT) if action pointed by *action* handle was not found.
> + *   - (-EBUSY) if action pointed by *action* handle still used by some rules
> + *   rte_errno is also set.
> + */
> +__rte_experimental
> +struct rte_flow_action_handle *
> +rte_flow_q_action_handle_create(uint16_t port_id, uint32_t queue,
> +            const struct rte_flow_q_ops_attr *attr,
> +            const struct rte_flow_indir_action_conf *conf,
> +            const struct rte_flow_action *action,
> +            struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue indirect rule destruction operation.

Please see above. Did you mean "indirect action"?
> + * The destroy queue must be the same
> + * as the queue on which the action was created.
> + *
> + * @param[in] port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] queue
> + *   Flow queue which is used to destroy the rule.
> + * @param[in] attr
> + *   Queue operation attributes.
> + * @param[in] handle
> + *   Handle for the indirect action object to be destroyed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   - (0) if success.
> + *   - (-ENODEV) if *port_id* invalid.
> + *   - (-ENOSYS) if underlying device does not support this functionality.
> + *   - (-EIO) if underlying device is removed.
> + *   - (-ENOENT) if action pointed by *action* handle was not found.
> + *   - (-EBUSY) if action pointed by *action* handle still used by some rules
> + *   rte_errno is also set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_action_handle_destroy(uint16_t port_id, uint32_t queue,
> +            struct rte_flow_q_ops_attr *attr,
> +            struct rte_flow_action_handle *handle,
> +            struct rte_flow_error *error);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Flush all internally stored rules to the HW.
> + * Non-flushed rules are rules that were inserted without the flush flag set.
> + * Can be used to notify the HW about batch of rules prepared by the SW to
> + * reduce the number of communications between the HW and SW.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue to be flushed.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_flush(uint16_t port_id, uint32_t queue,
> +        struct rte_flow_error *error);
> +
> +/**
> + * Dequeue operation status
> + */
> +enum rte_flow_q_op_status {
> +    /**
> +     * The operation was completed successfully.
> +     */
> +    RTE_FLOW_Q_OP_SUCCESS,
> +    /**
> +     * The operation was not completed successfully.
> +     */
> +    RTE_FLOW_Q_OP_ERROR,
> +};
> +
> +/**
> + * Dequeue operation result.
> + */
> +struct rte_flow_q_op_res {
> +    /**
> +     * Version of the struct layout, should be 0.
> +     */
> +    uint32_t version;
> +    /**
> +     * Returns the status of the operation that this completion signals.
> +     */
> +    enum rte_flow_q_op_status status;
> +    /**
> +     * User data that was supplied during operation submission.
> +     */
> +    void *user_data;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Dequeue a rte flow operation.
> + * The application must invoke this function in order to complete
> + * the flow rule offloading and to receive the flow rule operation status.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param queue
> + *   Flow queue which is used to dequeue the operation.
> + * @param[out] res
> + *   Array of results that will be set.
> + * @param[in] n_res
> + *   Maximum number of results that can be returned.
> + *   This value is equal to the size of the res array.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *   PMDs initialize this structure in case of error only.
> + *
> + * @return
> + *   Number of results that were dequeued,
> + *   a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_q_dequeue(uint16_t port_id, uint32_t queue,
> +        struct rte_flow_q_op_res res[], uint16_t n_res,
> +        struct rte_flow_error *error);
>  #ifdef __cplusplus
>  }
>  #endif

-- 
Ivan M