From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id E85F9A0352; Tue, 8 Feb 2022 18:37:07 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 8358D41140; Tue, 8 Feb 2022 18:37:07 +0100 (CET) Received: from mail-il1-f174.google.com (mail-il1-f174.google.com [209.85.166.174]) by mails.dpdk.org (Postfix) with ESMTP id 645CC4111B for ; Tue, 8 Feb 2022 18:37:06 +0100 (CET) Received: by mail-il1-f174.google.com with SMTP id f13so2367748ilq.5 for ; Tue, 08 Feb 2022 09:37:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=7hrXz8s7v5NsSWq/eoXPgBdVmxQFUAyNaFlTN3sJPaQ=; b=EVM4IR6Ag4oklQLCwitWf1unuZjgL28mUfhyjaJehUM4xqzej88ulD6mh2HPR/TRBq Oxsp46knfVPV25Ew3I17Ug4jeUeZHOG48sW7sTeDiBdur5CAdH1+QA0bOHtCPGcJgwkY o8kCAR/idPl6Uu07gRvPxLJO80SiCsglHUAOVLCe7feILh+LuPJJ/+0MvYebjD7dH9eO mXlErXwscvNkevE9GuZ43Gn08vPK5dEAE8GJwlXP0idhc/end665pCkikywhTZezQiSQ QqB33WzCv/n5k15YFBqWW3jqmNam9U1WdKfZvJ7waBcnjNi53Q4iQhCP1HrEcrKvyXag KiFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=7hrXz8s7v5NsSWq/eoXPgBdVmxQFUAyNaFlTN3sJPaQ=; b=6i7+bH1oL4qktyf9IY1Ms0u1IvTzK272jnemH5cZXbRc3G3+aY54y5RAm6hOK0tAgP TsxI+Slam9levL856604Aeb7Rd+Bc63rlMDZPC80UrxNUI+EValTciTMccZDdV6HeLpB DDZv5ErUd96ODXPWS02RYgQ6Kk1a3WCMjNkfQHat9ZKDjhcLQUMwKjcfHLeo45Ue0JbG jGDgkHm7zPyQpgnygGb6uC3HlvYI7w2JtrTegm1Qheu1wx4b7b26KAP4iApRdCOHwrmR GrkdeI7p+xAfAhisxUgAjMhrrHzqkQoOJOOR5R8c81rGI8ogMwmfhfecqsFD8TpvKsl0 9Y1Q== X-Gm-Message-State: AOAM532MjeYkNxFBR68o7gkHWK/1X1SaXO1GEw2LbqcO0R/nOcvT8u0d uTCeA2alidhC5z3Ce7T+IabfyfbnMmzFlG9jFA8= X-Google-Smtp-Source: ABdhPJxeDvJuF7/QEttSCQmcvkIrexQVSCnFzaWgXsuN/gXhc6ciaBy2Nc/Ur7+VHQp+OYxTIyh9QNkobr5U/Pxv7x8= X-Received: by 2002:a05:6e02:160b:: with SMTP id t11mr2940128ilu.75.1644341825497; Tue, 08 Feb 2022 09:37:05 -0800 (PST) MIME-Version: 1.0 References: <20220118153027.3947448-1-akozyrev@nvidia.com> <20220206032526.816079-1-akozyrev@nvidia.com> <20220206032526.816079-4-akozyrev@nvidia.com> In-Reply-To: From: Jerin Jacob Date: Tue, 8 Feb 2022 23:06:39 +0530 Message-ID: Subject: Re: [PATCH v3 03/10] ethdev: bring in async queue-based flow rules operations To: Alexander Kozyrev Cc: dpdk-dev , Ori Kam , "NBU-Contact-Thomas Monjalon (EXTERNAL)" , Ivan Malov , Andrew Rybchenko , Ferruh Yigit , "mohammad.abdul.awal@intel.com" , Qi Zhang , Jerin Jacob , Ajit Khaparde Content-Type: text/plain; charset="UTF-8" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Tue, Feb 8, 2022 at 7:42 PM Alexander Kozyrev wrote: > > > On Tuesday, February 8, 2022 5:57 Jerin Jacob wrote: > > On Sun, Feb 6, 2022 at 8:57 AM Alexander Kozyrev > > wrote: > > Hi Jerin, thanks you for reviewing my patch. I appreciate your input. Hi Alex, > I'm planning to send v4 today with addressed comments today to be on time for RC1. > I hope that my answers are satisfactory for the rest of questions raised by you. Comments looks good to me. Please remove version field in the next patch. > > > > > > > A new, faster, queue-based flow rules management mechanism is needed > > for > > > applications offloading rules inside the datapath. This asynchronous > > > and lockless mechanism frees the CPU for further packet processing and > > > reduces the performance impact of the flow rules creation/destruction > > > on the datapath. Note that queues are not thread-safe and the queue > > > should be accessed from the same thread for all queue operations. > > > It is the responsibility of the app to sync the queue functions in case > > > of multi-threaded access to the same queue. > > > > > > The rte_flow_q_flow_create() function enqueues a flow creation to the > > > requested queue. It benefits from already configured resources and sets > > > unique values on top of item and action templates. A flow rule is enqueued > > > on the specified flow queue and offloaded asynchronously to the > > hardware. > > > The function returns immediately to spare CPU for further packet > > > processing. The application must invoke the rte_flow_q_pull() function > > > to complete the flow rule operation offloading, to clear the queue, and to > > > receive the operation status. The rte_flow_q_flow_destroy() function > > > enqueues a flow destruction to the requested queue. > > > > It is good to see the implementation, specifically to understand, > > We will send PMD implementation in the next few days. > > > 1) > > I understand, We are creating queues to make multiple producers to > > enqueue multiple jobs in parallel. > > On the consumer side, Is it HW or some other cores to consume the job? > > From API point of view there is no restriction on the type of consumer. > It could be hardware or software implementation, but in most cases > (and in our driver) it will be the NIC to handle the requests. > > > Can we operate in consumer in parallel? > > Yes, we can have separate multiple hardware queues to handle operations > in parallel independently and without any locking mechanism needed. > > > 2) Is Queue part of HW or just SW primitive to submit the work as a channel. > > The queue is a software primitive. > > > > > > > > > Signed-off-by: Alexander Kozyrev > > > --- > > > doc/guides/prog_guide/img/rte_flow_q_init.svg | 71 ++++ > > > .../prog_guide/img/rte_flow_q_usage.svg | 60 +++ > > > doc/guides/prog_guide/rte_flow.rst | 159 +++++++- > > > doc/guides/rel_notes/release_22_03.rst | 8 + > > > lib/ethdev/rte_flow.c | 173 ++++++++- > > > lib/ethdev/rte_flow.h | 342 ++++++++++++++++++ > > > lib/ethdev/rte_flow_driver.h | 55 +++ > > > lib/ethdev/version.map | 7 + > > > 8 files changed, 873 insertions(+), 2 deletions(-) > > > create mode 100644 doc/guides/prog_guide/img/rte_flow_q_init.svg > > > create mode 100644 doc/guides/prog_guide/img/rte_flow_q_usage.svg > > > > > > diff --git a/doc/guides/prog_guide/img/rte_flow_q_init.svg > > b/doc/guides/prog_guide/img/rte_flow_q_init.svg > > > new file mode 100644 > > > index 0000000000..2080bf4c04 > > > > > > > > Some comments on the diagrams: > > # rte_flow_q_create_flow and rte_flow_q_destroy_flow used instead of > > rte_flow_q_flow_create/destroy > > # rte_flow_q_pull's brackets(i.e ()) not aligned > > Will fix this, thanks for noticing. > > > > > > + > > > \ No newline at end of file > > > diff --git a/doc/guides/prog_guide/rte_flow.rst > > b/doc/guides/prog_guide/rte_flow.rst > > > index b7799c5abe..734294e65d 100644 > > > --- a/doc/guides/prog_guide/rte_flow.rst > > > +++ b/doc/guides/prog_guide/rte_flow.rst > > > @@ -3607,12 +3607,16 @@ Hints about the expected number of counters > > or meters in an application, > > > for example, allow PMD to prepare and optimize NIC memory layout in > > advance. > > > ``rte_flow_configure()`` must be called before any flow rule is created, > > > but after an Ethernet device is configured. > > > +It also creates flow queues for asynchronous flow rules operations via > > > +queue-based API, see `Asynchronous operations`_ section. > > > > > > .. code-block:: c > > > > > > int > > > rte_flow_configure(uint16_t port_id, > > > const struct rte_flow_port_attr *port_attr, > > > + uint16_t nb_queue, > > > > # rte_flow_info_get() don't have number of queues, why not adding > > number queues in rte_flow_port_attr. > > Good suggestion, I'll add it to the capabilities structure. > > > # And additional APIs for queue_setup() like ethdev. > > ethdev has the start function which tells the PMD when all configurations are done. > In our case there is no such function and device is ready to create a flows as soon > as we exit the rte_flow_configure(). In addition, the number of queues may affect > the resource allocation it is best to process all the requested resources at the same time. > > > > > > + const struct rte_flow_queue_attr *queue_attr[], > > > struct rte_flow_error *error); > > > > > > Information about resources that can benefit from pre-allocation can be > > > @@ -3737,7 +3741,7 @@ and pattern and actions templates are created. > > > > > > .. code-block:: c > > > > > > - rte_flow_configure(port, *port_attr, *error); > > > + rte_flow_configure(port, *port_attr, nb_queue, *queue_attr, > > *error); > > > > > > struct rte_flow_pattern_template *pattern_templates[0] = > > > rte_flow_pattern_template_create(port, &itr, &pattern, &error); > > > @@ -3750,6 +3754,159 @@ and pattern and actions templates are created. > > > *actions_templates, nb_actions_templates, > > > *error); > > > > > > +Asynchronous operations > > > +----------------------- > > > + > > > +Flow rules management can be done via special lockless flow > > management queues. > > > +- Queue operations are asynchronous and not thread-safe. > > > +- Operations can thus be invoked by the app's datapath, > > > +packet processing can continue while queue operations are processed by > > NIC. > > > +- The queue number is configured at initialization stage. > > > +- Available operation types: rule creation, rule destruction, > > > +indirect rule creation, indirect rule destruction, indirect rule update. > > > +- Operations may be reordered within a queue. > > > +- Operations can be postponed and pushed to NIC in batches. > > > +- Results pulling must be done on time to avoid queue overflows. > > > +- User data is returned as part of the result to identify an operation. > > > +- Flow handle is valid once the creation operation is enqueued and must > > be > > > +destroyed even if the operation is not successful and the rule is not > > inserted. > > > > You need CR between lines as rendered text does comes as new line in > > between the items. > > OK. > > > > > > + > > > +The asynchronous flow rule insertion logic can be broken into two phases. > > > + > > > +1. Initialization stage as shown here: > > > + > > > +.. _figure_rte_flow_q_init: > > > + > > > +.. figure:: img/rte_flow_q_init.* > > > + > > > +2. Main loop as presented on a datapath application example: > > > + > > > +.. _figure_rte_flow_q_usage: > > > + > > > +.. figure:: img/rte_flow_q_usage.* > > > > it is better to add sequence operations as text to understand the flow. > > I prefer keeping the diagram here, it looks more clean and concise. > Block of text gives no new information and harder to follow, imho. > > > > > > + > > > +Enqueue creation operation > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Enqueueing a flow rule creation operation is similar to simple creation. > > > > If it is enqueue operation, why not call it ad rte_flow_q_flow_enqueue() > > > > > + > > > +.. code-block:: c > > > + > > > + struct rte_flow * > > > + rte_flow_q_flow_create(uint16_t port_id, > > > + uint32_t queue_id, > > > + const struct rte_flow_q_ops_attr *q_ops_attr, > > > + struct rte_flow_table *table, > > > + const struct rte_flow_item pattern[], > > > + uint8_t pattern_template_index, > > > + const struct rte_flow_action actions[], > > > > If I understand correctly, table is the pre-configured object that has > > N number of patterns and N number of actions. > > Why giving items[] and actions[] again? > > Table only contains templates for pattern and actions. > We still need to provide the values for those templates when we create a flow. > Thus we specify patterns and action here. > > > > + uint8_t actions_template_index, > > > + struct rte_flow_error *error); > > > + > > > +A valid handle in case of success is returned. It must be destroyed later > > > +by calling ``rte_flow_q_flow_destroy()`` even if the rule is rejected by > > HW. > > > + > > > +Enqueue destruction operation > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > Queue destruction operation. > > We are not destroying queue, we are enqueuing the flow destruction operation. > > > > > > + > > > +Enqueueing a flow rule destruction operation is similar to simple > > destruction. > > > + > > > +.. code-block:: c > > > + > > > + int > > > + rte_flow_q_flow_destroy(uint16_t port_id, > > > + uint32_t queue_id, > > > + const struct rte_flow_q_ops_attr *q_ops_attr, > > > + struct rte_flow *flow, > > > + struct rte_flow_error *error); > > > + > > > +Push enqueued operations > > > +~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Pushing all internally stored rules from a queue to the NIC. > > > + > > > +.. code-block:: c > > > + > > > + int > > > + rte_flow_q_push(uint16_t port_id, > > > + uint32_t queue_id, > > > + struct rte_flow_error *error); > > > + > > > +There is the postpone attribute in the queue operation attributes. > > > +When it is set, multiple operations can be bulked together and not sent to > > HW > > > +right away to save SW/HW interactions and prioritize throughput over > > latency. > > > +The application must invoke this function to actually push all outstanding > > > +operations to HW in this case. > > > + > > > +Pull enqueued operations > > > +~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Pulling asynchronous operations results. > > > + > > > +The application must invoke this function in order to complete > > asynchronous > > > +flow rule operations and to receive flow rule operations statuses. > > > + > > > +.. code-block:: c > > > + > > > + int > > > + rte_flow_q_pull(uint16_t port_id, > > > + uint32_t queue_id, > > > + struct rte_flow_q_op_res res[], > > > + uint16_t n_res, > > > + struct rte_flow_error *error); > > > + > > > +Multiple outstanding operation results can be pulled simultaneously. > > > +User data may be provided during a flow creation/destruction in order > > > +to distinguish between multiple operations. User data is returned as part > > > +of the result to provide a method to detect which operation is completed. > > > + > > > +Enqueue indirect action creation operation > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Asynchronous version of indirect action creation API. > > > + > > > +.. code-block:: c > > > + > > > + struct rte_flow_action_handle * > > > + rte_flow_q_action_handle_create(uint16_t port_id, > > > > What is the use case for this? > > Indirect action creation may take time, it may depend on hardware resources > allocation. So we add the asynchronous way of creating it the same way. > > > How application needs to use this. We already creating flow_table. Is > > that not sufficient? > > The indirect action object is used in flow rules via its handle. > This is an extension to the already existing API in order to speed up > the creation of these objects. > > > > > > + uint32_t queue_id, > > > + const struct rte_flow_q_ops_attr *q_ops_attr, > > > + const struct rte_flow_indir_action_conf *indir_action_conf, > > > + const struct rte_flow_action *action, > > > + struct rte_flow_error *error); > > > + > > > +A valid handle in case of success is returned. It must be destroyed later by > > > +calling ``rte_flow_q_action_handle_destroy()`` even if the rule is > > rejected. > > > + > > > +Enqueue indirect action destruction operation > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Asynchronous version of indirect action destruction API. > > > + > > > +.. code-block:: c > > > + > > > + int > > > + rte_flow_q_action_handle_destroy(uint16_t port_id, > > > + uint32_t queue_id, > > > + const struct rte_flow_q_ops_attr *q_ops_attr, > > > + struct rte_flow_action_handle *action_handle, > > > + struct rte_flow_error *error); > > > + > > > +Enqueue indirect action update operation > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Asynchronous version of indirect action update API. > > > + > > > +.. code-block:: c > > > + > > > + int > > > + rte_flow_q_action_handle_update(uint16_t port_id, > > > + uint32_t queue_id, > > > + const struct rte_flow_q_ops_attr *q_ops_attr, > > > + struct rte_flow_action_handle *action_handle, > > > + const void *update, > > > + struct rte_flow_error *error); > > > + > > > .. _flow_isolated_mode: > > > > > > Flow isolated mode > > > diff --git a/doc/guides/rel_notes/release_22_03.rst > > b/doc/guides/rel_notes/release_22_03.rst > > > index d23d1591df..80a85124e6 100644 > > > --- a/doc/guides/rel_notes/release_22_03.rst > > > +++ b/doc/guides/rel_notes/release_22_03.rst > > > @@ -67,6 +67,14 @@ New Features > > > ``rte_flow_table_destroy``, ``rte_flow_pattern_template_destroy`` > > > and ``rte_flow_actions_template_destroy``. > > > > > > +* ethdev: Added ``rte_flow_q_flow_create`` and > > ``rte_flow_q_flow_destroy`` API > > > + to enqueue flow creaion/destruction operations asynchronously as well > > as > > > + ``rte_flow_q_pull`` to poll and retrieve results of these operations and > > > + ``rte_flow_q_push`` to push all the in-flight operations to the NIC. > > > + Introduced asynchronous API for indirect actions management as well: > > > + ``rte_flow_q_action_handle_create``, > > ``rte_flow_q_action_handle_destroy`` and > > > + ``rte_flow_q_action_handle_update``. > > > +