Date: Tue, 8 Feb 2022 18:23:57 +0300 (MSK)
From: Ivan Malov
To: Alexander Kozyrev
Cc: Jerin Jacob, dpdk-dev, Ori Kam, "NBU-Contact-Thomas Monjalon (EXTERNAL)", Andrew Rybchenko, Ferruh Yigit, mohammad.abdul.awal@intel.com, Qi Zhang, Jerin Jacob, Ajit Khaparde
Subject: RE: [PATCH v3 03/10] ethdev: bring in async queue-based flow rules operations
References: <20220118153027.3947448-1-akozyrev@nvidia.com> <20220206032526.816079-1-akozyrev@nvidia.com> <20220206032526.816079-4-akozyrev@nvidia.com>
Hi, PSB

On Tue, 8 Feb 2022, Alexander Kozyrev wrote:

>> On Tuesday, February 8, 2022 5:57 Jerin Jacob wrote:
>> On Sun, Feb 6, 2022 at 8:57 AM Alexander Kozyrev wrote:
>
> Hi Jerin, thank you for reviewing my patch. I appreciate your input.
> I'm planning to send v4 with addressed comments today to be on time for RC1.
> I hope that my answers are satisfactory for the rest of the questions you raised.
>
>>> A new, faster, queue-based flow rules management mechanism is needed
>>> for applications offloading rules inside the datapath. This asynchronous
>>> and lockless mechanism frees the CPU for further packet processing and
>>> reduces the performance impact of the flow rules creation/destruction
>>> on the datapath. Note that queues are not thread-safe and the queue
>>> should be accessed from the same thread for all queue operations.
>>> It is the responsibility of the app to sync the queue functions in case
>>> of multi-threaded access to the same queue.
>>>
>>> The rte_flow_q_flow_create() function enqueues a flow creation to the
>>> requested queue. It benefits from already configured resources and sets
>>> unique values on top of item and action templates. A flow rule is enqueued
>>> on the specified flow queue and offloaded asynchronously to the hardware.
>>> The function returns immediately to spare CPU for further packet
>>> processing. The application must invoke the rte_flow_q_pull() function
>>> to complete the flow rule operation offloading, to clear the queue, and to
>>> receive the operation status. The rte_flow_q_flow_destroy() function
>>> enqueues a flow destruction to the requested queue.
>>
>> It is good to see the implementation, specifically to understand,
>
> We will send the PMD implementation in the next few days.
>
>> 1)
>> I understand, we are creating queues to make multiple producers
>> enqueue multiple jobs in parallel.
>> On the consumer side, is it HW or some other cores that consume the jobs?
>
> From the API point of view there is no restriction on the type of consumer.
> It could be a hardware or a software implementation, but in most cases
> (and in our driver) it will be the NIC that handles the requests.
>
>> Can consumers operate in parallel?
>
> Yes, we can have multiple separate hardware queues to handle operations
> in parallel, independently and without any locking mechanism needed.
>
>> 2) Is the queue part of HW, or just a SW primitive to submit the work as a channel?
>
> The queue is a software primitive.
>
>>
>>> Signed-off-by: Alexander Kozyrev
>>> ---
>>>  doc/guides/prog_guide/img/rte_flow_q_init.svg |  71 ++++
>>>  .../prog_guide/img/rte_flow_q_usage.svg       |  60 +++
>>>  doc/guides/prog_guide/rte_flow.rst            | 159 +++++++-
>>>  doc/guides/rel_notes/release_22_03.rst        |   8 +
>>>  lib/ethdev/rte_flow.c                         | 173 ++++++++-
>>>  lib/ethdev/rte_flow.h                         | 342 ++++++++++++++++++
>>>  lib/ethdev/rte_flow_driver.h                  |  55 +++
>>>  lib/ethdev/version.map                        |   7 +
>>>  8 files changed, 873 insertions(+), 2 deletions(-)
>>>  create mode 100644 doc/guides/prog_guide/img/rte_flow_q_init.svg
>>>  create mode 100644 doc/guides/prog_guide/img/rte_flow_q_usage.svg
>>>
>>> diff --git a/doc/guides/prog_guide/img/rte_flow_q_init.svg b/doc/guides/prog_guide/img/rte_flow_q_init.svg
>>> new file mode 100644
>>> index 0000000000..2080bf4c04
>>
>> Some comments on the diagrams:
>> # rte_flow_q_create_flow and rte_flow_q_destroy_flow are used instead of
>>   rte_flow_q_flow_create/destroy
>> # rte_flow_q_pull's brackets (i.e. ()) are not aligned
>
> Will fix this, thanks for noticing.
>
>>
>>> +
>>> \ No newline at end of file
>>> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
>>> index b7799c5abe..734294e65d 100644
>>> --- a/doc/guides/prog_guide/rte_flow.rst
>>> +++ b/doc/guides/prog_guide/rte_flow.rst
>>> @@ -3607,12 +3607,16 @@ Hints about the expected number of counters or meters in an application,
>>>  for example, allow PMD to prepare and optimize NIC memory layout in advance.
>>>  ``rte_flow_configure()`` must be called before any flow rule is created,
>>>  but after an Ethernet device is configured.
>>> +It also creates flow queues for asynchronous flow rules operations via
>>> +queue-based API, see `Asynchronous operations`_ section.
>>>
>>>  .. code-block:: c
>>>
>>>    int
>>>    rte_flow_configure(uint16_t port_id,
>>>                       const struct rte_flow_port_attr *port_attr,
>>> +                     uint16_t nb_queue,
>>
>> # rte_flow_info_get() doesn't report the number of queues; why not add
>>   the number of queues to rte_flow_port_attr?
>
> Good suggestion, I'll add it to the capabilities structure.
>
>> # And additional APIs for queue_setup(), like ethdev has.
>
> ethdev has the start function which tells the PMD when all configurations are done.
> In our case there is no such function and the device is ready to create flows as soon
> as we exit rte_flow_configure(). In addition, the number of queues may affect
> the resource allocation, so it is best to process all the requested resources at the same time.
>
>>> +                     const struct rte_flow_queue_attr *queue_attr[],
>>>                       struct rte_flow_error *error);
>>>
>>>  Information about resources that can benefit from pre-allocation can be
>>> @@ -3737,7 +3741,7 @@ and pattern and actions templates are created.
>>>
>>>  .. code-block:: c
>>>
>>> -   rte_flow_configure(port, *port_attr, *error);
>>> +   rte_flow_configure(port, *port_attr, nb_queue, *queue_attr, *error);
>>>
>>>    struct rte_flow_pattern_template *pattern_templates[0] =
>>>        rte_flow_pattern_template_create(port, &itr, &pattern, &error);
>>> @@ -3750,6 +3754,159 @@ and pattern and actions templates are created.
>>>                            *actions_templates, nb_actions_templates,
>>>                            *error);
>>>
>>> +Asynchronous operations
>>> +-----------------------
>>> +
>>> +Flow rules management can be done via special lockless flow management queues.
>>> +- Queue operations are asynchronous and not thread-safe.
>>> +- Operations can thus be invoked by the app's datapath;
>>> +  packet processing can continue while queue operations are processed by NIC.
>>> +- The queue number is configured at initialization stage.
>>> +- Available operation types: rule creation, rule destruction,
>>> +  indirect rule creation, indirect rule destruction, indirect rule update.
>>> +- Operations may be reordered within a queue.
>>> +- Operations can be postponed and pushed to NIC in batches.
>>> +- Results pulling must be done on time to avoid queue overflows.
>>> +- User data is returned as part of the result to identify an operation.
>>> +- Flow handle is valid once the creation operation is enqueued and must be
>>> +  destroyed even if the operation is not successful and the rule is not inserted.
>>
>> You need a CR between the lines, as otherwise the rendered text does not come
>> out as new lines between the items.
>
> OK.
>
>>> +
>>> +The asynchronous flow rule insertion logic can be broken into two phases.
>>> +
>>> +1. Initialization stage as shown here:
>>> +
>>> +.. _figure_rte_flow_q_init:
>>> +
>>> +.. figure:: img/rte_flow_q_init.*
>>> +
>>> +2. Main loop as presented on a datapath application example:
>>> +
>>> +.. _figure_rte_flow_q_usage:
>>> +
>>> +.. figure:: img/rte_flow_q_usage.*
>>
>> it is better to add the sequence of operations as text to understand the flow.
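For archive readers, a rough idea of what such a textual sequence could look like, as C-style pseudocode (function names are taken from this patch; argument lists are abbreviated and the surrounding loop is invented for illustration):

```
/* Initialization stage (once, after the Ethernet device is configured): */
rte_flow_configure(port, &port_attr, nb_queue, queue_attr, &error);
pt = rte_flow_pattern_template_create(port, ...);
at = rte_flow_actions_template_create(port, ...);
table = rte_flow_table_create(port, ..., &pt, 1, &at, 1, &error);

/* Main loop (one thread per queue, since queues are not thread-safe): */
for (;;) {
        /* ... receive packets, decide to offload a new flow ... */
        flow = rte_flow_q_flow_create(port, queue, &ops_attr, table,
                                      pattern, 0, actions, 0, &error);
        /* If ops_attr postpones operations, flush the batch explicitly. */
        rte_flow_q_push(port, queue, &error);
        /* Pull completions in time; user_data identifies each result. */
        n = rte_flow_q_pull(port, queue, res, n_res, &error);
        /* Check each status; even a failed creation still requires
         * rte_flow_q_flow_destroy() on the returned handle. */
}
```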
>
> I prefer keeping the diagram here, it looks cleaner and more concise.
> A block of text gives no new information and is harder to follow, imho.
>
>>
>>> +
>>> +Enqueue creation operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Enqueueing a flow rule creation operation is similar to simple creation.
>>
>> If it is an enqueue operation, why not call it rte_flow_q_flow_enqueue()?
>>
>>> +
>>> +.. code-block:: c
>>> +
>>> +   struct rte_flow *
>>> +   rte_flow_q_flow_create(uint16_t port_id,
>>> +                          uint32_t queue_id,
>>> +                          const struct rte_flow_q_ops_attr *q_ops_attr,
>>> +                          struct rte_flow_table *table,
>>> +                          const struct rte_flow_item pattern[],
>>> +                          uint8_t pattern_template_index,
>>> +                          const struct rte_flow_action actions[],
>>
>> If I understand correctly, the table is the pre-configured object that has
>> N patterns and N actions.
>> Why give items[] and actions[] again?
>
> Table only contains templates for pattern and actions.

Then why not reflect it in the argument name? Perhaps, "template_table"?
Or even in the struct name: "struct rte_flow_template_table". Chances are
that readers will misread "rte_flow_table" as "flow entry table" in the
OpenFlow sense.

> We still need to provide the values for those templates when we create a flow.
> Thus we specify patterns and actions here.

All of that is clear in terms of this review cycle, but please consider
improving the argument names to help future readers.

>
>>> +                          uint8_t actions_template_index,
>>> +                          struct rte_flow_error *error);
>>> +
>>> +A valid handle in case of success is returned. It must be destroyed later
>>> +by calling ``rte_flow_q_flow_destroy()`` even if the rule is rejected by HW.
>>> +
>>> +Enqueue destruction operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Queue destruction operation.
>
> We are not destroying a queue, we are enqueuing the flow destruction operation.
>
>>
>>> +
>>> +Enqueueing a flow rule destruction operation is similar to simple destruction.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   int
>>> +   rte_flow_q_flow_destroy(uint16_t port_id,
>>> +                           uint32_t queue_id,
>>> +                           const struct rte_flow_q_ops_attr *q_ops_attr,
>>> +                           struct rte_flow *flow,
>>> +                           struct rte_flow_error *error);
>>> +
>>> +Push enqueued operations
>>> +~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Pushing all internally stored rules from a queue to the NIC.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   int
>>> +   rte_flow_q_push(uint16_t port_id,
>>> +                   uint32_t queue_id,
>>> +                   struct rte_flow_error *error);
>>> +
>>> +There is the postpone attribute in the queue operation attributes.
>>> +When it is set, multiple operations can be bulked together and not sent to HW
>>> +right away to save SW/HW interactions and prioritize throughput over latency.
>>> +The application must invoke this function to actually push all outstanding
>>> +operations to HW in this case.
>>> +
>>> +Pull enqueued operations
>>> +~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Pulling asynchronous operations results.
>>> +
>>> +The application must invoke this function in order to complete asynchronous
>>> +flow rule operations and to receive flow rule operations statuses.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   int
>>> +   rte_flow_q_pull(uint16_t port_id,
>>> +                   uint32_t queue_id,
>>> +                   struct rte_flow_q_op_res res[],
>>> +                   uint16_t n_res,
>>> +                   struct rte_flow_error *error);
>>> +
>>> +Multiple outstanding operation results can be pulled simultaneously.
>>> +User data may be provided during a flow creation/destruction in order
>>> +to distinguish between multiple operations. User data is returned as part
>>> +of the result to provide a method to detect which operation is completed.
>>> +
>>> +Enqueue indirect action creation operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Asynchronous version of indirect action creation API.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   struct rte_flow_action_handle *
>>> +   rte_flow_q_action_handle_create(uint16_t port_id,
>>
>> What is the use case for this?
>
> Indirect action creation may take time, as it may depend on hardware resource
> allocation. So we add the asynchronous way of creating it in the same manner.
>
>> How does an application need to use this? We are already creating a flow_table.
>> Is that not sufficient?
>
> The indirect action object is used in flow rules via its handle.
> This is an extension to the already existing API in order to speed up
> the creation of these objects.
>
>>
>>> +                   uint32_t queue_id,
>>> +                   const struct rte_flow_q_ops_attr *q_ops_attr,
>>> +                   const struct rte_flow_indir_action_conf *indir_action_conf,
>>> +                   const struct rte_flow_action *action,
>>> +                   struct rte_flow_error *error);
>>> +
>>> +A valid handle in case of success is returned. It must be destroyed later by
>>> +calling ``rte_flow_q_action_handle_destroy()`` even if the rule is rejected.
>>> +
>>> +Enqueue indirect action destruction operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Asynchronous version of indirect action destruction API.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   int
>>> +   rte_flow_q_action_handle_destroy(uint16_t port_id,
>>> +                   uint32_t queue_id,
>>> +                   const struct rte_flow_q_ops_attr *q_ops_attr,
>>> +                   struct rte_flow_action_handle *action_handle,
>>> +                   struct rte_flow_error *error);
>>> +
>>> +Enqueue indirect action update operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Asynchronous version of indirect action update API.
>>> +
>>> +.. code-block:: c
>>> +
>>> +   int
>>> +   rte_flow_q_action_handle_update(uint16_t port_id,
>>> +                   uint32_t queue_id,
>>> +                   const struct rte_flow_q_ops_attr *q_ops_attr,
>>> +                   struct rte_flow_action_handle *action_handle,
>>> +                   const void *update,
>>> +                   struct rte_flow_error *error);
>>> +
>>> .. _flow_isolated_mode:
>>>
>>> Flow isolated mode
>>>
>>> diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
>>> index d23d1591df..80a85124e6 100644
>>> --- a/doc/guides/rel_notes/release_22_03.rst
>>> +++ b/doc/guides/rel_notes/release_22_03.rst
>>> @@ -67,6 +67,14 @@ New Features
>>>     ``rte_flow_table_destroy``, ``rte_flow_pattern_template_destroy``
>>>     and ``rte_flow_actions_template_destroy``.
>>>
>>> +* ethdev: Added ``rte_flow_q_flow_create`` and ``rte_flow_q_flow_destroy`` API
>>> +  to enqueue flow creation/destruction operations asynchronously as well as
>>> +  ``rte_flow_q_pull`` to poll and retrieve results of these operations and
>>> +  ``rte_flow_q_push`` to push all the in-flight operations to the NIC.
>>> +  Introduced asynchronous API for indirect actions management as well:
>>> +  ``rte_flow_q_action_handle_create``, ``rte_flow_q_action_handle_destroy`` and
>>> +  ``rte_flow_q_action_handle_update``.
>>> +
>
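For completeness, here is how the asynchronous indirect-action calls quoted above appear intended to compose, as a C-style pseudocode sketch (function names are from the patch; the argument values and ordering around them are illustrative, not normative):

```
/* Enqueue creation of the indirect action and flush the batch. */
handle = rte_flow_q_action_handle_create(port, queue, &ops_attr,
                                         &indir_action_conf, &action,
                                         &error);
rte_flow_q_push(port, queue, &error);
/* Pull the result to learn whether the creation succeeded;
 * the handle must be destroyed later even if it did not. */
rte_flow_q_pull(port, queue, res, n_res, &error);

/* Update the action in place later, again asynchronously; the
 * "update" argument is the action-specific update specification. */
rte_flow_q_action_handle_update(port, queue, &ops_attr, handle,
                                update, &error);

/* Enqueue destruction once no rule references the handle. */
rte_flow_q_action_handle_destroy(port, queue, &ops_attr, handle,
                                 &error);
```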