* [dpdk-dev] [RFC] Generic flow director/filtering/classification API
From: Adrien Mazarguil @ 2016-07-05 18:16 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, Pablo de Lara,
	Olga Shern

Hi All,

First, forgive me for this large message, I know our mailboxes already
suffer quite a bit from the amount of traffic on this ML.

This is not exactly yet another thread about how flow director should be
extended, but rather about a brand new API to handle filtering and
classification for incoming packets in the most PMD-generic and
application-friendly fashion we can come up with. Reasons described below.

I think this topic is important enough to include both the users of this API
and PMD maintainers. So far I have CC'ed librte_ether (especially
rte_eth_ctrl.h contributors), testpmd and PMD maintainers (with and without
a .filter_ctrl implementation), but if you know application maintainers
other than testpmd who use FDIR or might be interested in this discussion,
feel free to add them.

The issues we found with the current approach are already summarized in the
following document, but here is a quick summary for TL;DR folks:

- PMDs do not expose a common set of filter types and even when they do,
  their behavior more or less differs.

- Applications need to determine and adapt to device-specific limitations
  and quirks on their own, without help from PMDs.

- Writing an application that creates flow rules targeting all devices
  supported by DPDK is thus difficult, if not impossible.

- The current API has too many unspecified areas (particularly regarding
  side effects of flow rules) that make PMD implementation tricky.

This RFC API handles everything currently supported by .filter_ctrl, the
idea being to reimplement all of these to make them fully usable by
applications in a more generic and well defined fashion. It has a very small
set of mandatory features and an easy method to let applications probe for
supported capabilities.

The only downside is more work for the software control side of PMDs because
they have to adapt to the API instead of the reverse. I think helpers can be
added to EAL to assist with this.

HTML version:

 https://rawgit.com/6WIND/rte_flow/master/rte_flow.html

PDF version:

 https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf

Related draft header file (for reference while reading the specification):

 https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h

Git tree for completeness (latest .rst version can be retrieved from here):

 https://github.com/6WIND/rte_flow

What follows is the ReST source of the above, for inline comments and
discussion. I intend to update that specification accordingly.

========================
Generic filter interface
========================

.. footer::

   v0.6

.. contents::
.. sectnum::
.. raw:: pdf

   PageBreak

Overview
========

DPDK provides several competing interfaces added over time to perform packet
matching and related actions such as filtering and classification.

They must be extended to implement the features supported by newer devices
in order to expose them to applications. However, the current design has
several drawbacks:

- Complicated filter combinations which have not been hard-coded cannot be
  expressed.
- Prone to API/ABI breakage when new features must be added to an existing
  filter type, which frequently happens.

From an application point of view:

- Having disparate interfaces, all optional and lacking in features does not
  make this API easy to use.
- Seemingly arbitrary built-in limitations of filter types based on the
  device they were initially designed for.
- Undefined relationship between different filter types.
- High complexity, considerable undocumented and/or undefined behavior.

Considering the growing number of devices supported by DPDK, adding a new
filter type each time a new feature must be implemented is not sustainable
in the long term. Applications not written to target a specific device
cannot really benefit from such an API.

For these reasons, this document defines an extensible unified API that
encompasses and supersedes these legacy filter types.

.. raw:: pdf

   PageBreak

Current API
===========

Rationale
---------

Several competing (and mostly overlapping) filtering APIs are present in
DPDK because of its nature as a thin layer between hardware and software.

Each subsequent interface has been added to better match the capabilities
and limitations of the latest supported device, which usually happened to
need an incompatible configuration approach. Because of this, many ended up
device-centric and not usable by applications that were not written for that
particular device.

This document is not the first attempt to address this proliferation issue.
A lot of work has already been done to create a more generic interface while
somewhat keeping compatibility with legacy ones through a common call
interface (``rte_eth_dev_filter_ctrl()`` with the ``.filter_ctrl`` PMD
callback in ``rte_ethdev.h``).

Today, these previously incompatible interfaces are known as filter types
(``RTE_ETH_FILTER_*`` from ``enum rte_filter_type`` in ``rte_eth_ctrl.h``).

However, while this framework is trivial to extend with new types, it only
shifted the underlying problem: applications still need to be written for
one kind of filter type, which, as described in the following sections, is
not necessarily implemented by all PMDs that support filtering.

.. raw:: pdf

   PageBreak

Filter types
------------

This section summarizes the capabilities of each filter type.

Although the following list is exhaustive, the description of individual
types may contain inaccuracies due to the lack of documentation or usage
examples.

Note: names are prefixed with ``RTE_ETH_FILTER_``.

``MACVLAN``
~~~~~~~~~~~

Matching:

- L2 source/destination addresses.
- Optional 802.1Q VLAN ID.
- Masking individual fields on a rule basis is not supported.

Action:

- Packets are redirected either to a given VF device using its ID or to the
  PF.

``ETHERTYPE``
~~~~~~~~~~~~~

Matching:

- L2 source/destination addresses (optional).
- Ethertype (no VLAN ID?).
- Masking individual fields on a rule basis is not supported.

Action:

- Receive packets on a given queue.
- Drop packets.

``FLEXIBLE``
~~~~~~~~~~~~

Matching:

- At most 128 consecutive bytes anywhere in packets.
- Masking is supported with byte granularity.
- Priorities are supported (relative to this filter type, undefined
  otherwise).

Action:

- Receive packets on a given queue.

``SYN``
~~~~~~~

Matching:

- TCP SYN packets only.
- One high priority bit can be set to give the highest possible priority to
  this type when other filters with different types are configured.

Action:

- Receive packets on a given queue.

``NTUPLE``
~~~~~~~~~~

Matching:

- Source/destination IPv4 addresses (optional in 2-tuple mode).
- Source/destination TCP/UDP port (mandatory in 2 and 5-tuple modes).
- L4 protocol (2 and 5-tuple modes).
- Masking individual fields is supported.
- TCP flags.
- Up to 7 levels of priority relative to this filter type, undefined
  otherwise.
- No IPv6.

Action:

- Receive packets on a given queue.

``TUNNEL``
~~~~~~~~~~

Matching:

- Outer L2 source/destination addresses.
- Inner L2 source/destination addresses.
- Inner VLAN ID.
- IPv4/IPv6 source (destination?) address.
- Tunnel type to match (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE, 802.1BR
  E-Tag).
- Tenant ID for tunneling protocols that have one.
- Any combination of the above can be specified.
- Masking individual fields on a rule basis is not supported.

Action:

- Receive packets on a given queue.

.. raw:: pdf

   PageBreak

``FDIR``
~~~~~~~~

Queries:

- Device capabilities and limitations.
- Device statistics about configured filters (resource usage, collisions).
- Device configuration (matching input set and masks).

Matching:

- Device mode of operation: none (to disable filtering), signature
  (hash-based dispatching from masked fields) or perfect (either MAC VLAN or
  tunnel).
- L2 Ethertype.
- Outer L2 destination address (MAC VLAN mode).
- Inner L2 destination address, tunnel type (NVGRE, VXLAN) and tunnel ID
  (tunnel mode).
- IPv4 source/destination addresses, ToS, TTL and protocol fields.
- IPv6 source/destination addresses, TC, protocol and hop limits fields.
- UDP source/destination IPv4/IPv6 and ports.
- TCP source/destination IPv4/IPv6 and ports.
- SCTP source/destination IPv4/IPv6, ports and verification tag field.
- Note, only one protocol type at once (either only L2 Ethertype, basic
  IPv6, IPv4+UDP, IPv4+TCP and so on).
- VLAN TCI (extended API).
- At most 16 bytes to match in payload (extended API). A global device
  look-up table specifies for each possible protocol layer (unknown, raw,
  L2, L3, L4) the offset to use for each byte (they do not need to be
  contiguous) and the related bitmask.
- Whether packet is addressed to PF or VF, in that case its ID can be
  matched as well (extended API).
- Masking most of the above fields is supported, but simultaneously affects
  all filters configured on a device.
- Input set can be modified in a similar fashion for a given device to
  ignore individual fields of filters (e.g. do not match the destination
  address in an IPv4 filter, refer to **RTE_ETH_INPUT_SET_**
  macros). Configuring this also affects RSS processing on **i40e**.
- Filters can also provide 32 bits of arbitrary data to return as part of
  matched packets.

Action:

- **RTE_ETH_FDIR_ACCEPT**: receive (accept) packet on a given queue.
- **RTE_ETH_FDIR_REJECT**: drop packet immediately.
- **RTE_ETH_FDIR_PASSTHRU**: similar to accept for the last filter in list,
  otherwise process it with subsequent filters.
- For accepted packets and if requested by filter, either 32 bits of
  arbitrary data and four bytes of matched payload (only in case of flex
  bytes matching), or eight bytes of matched payload (flex also) are added
  to meta data.

.. raw:: pdf

   PageBreak

``HASH``
~~~~~~~~

Not an actual filter type. Provides and retrieves the global device
configuration (per port or entire NIC) for hash functions and their
properties.

Hash function selection: "default" (keep current), XOR or Toeplitz.

This function can be configured per flow type (**RTE_ETH_FLOW_**
definitions), supported types are:

- Unknown.
- Raw.
- Fragmented or non-fragmented IPv4.
- Non-fragmented IPv4 with L4 (TCP, UDP, SCTP or other).
- Fragmented or non-fragmented IPv6.
- Non-fragmented IPv6 with L4 (TCP, UDP, SCTP or other).
- L2 payload.
- IPv6 with extensions.
- IPv6 with L4 (TCP, UDP) and extensions.

``L2_TUNNEL``
~~~~~~~~~~~~~

Matching:

- All packets received on a given port.

Action:

- Add tunnel encapsulation (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE,
  802.1BR E-Tag) using the provided Ethertype and tunnel ID (only E-Tag
  is implemented at the moment).
- VF ID to use for tag insertion (currently unused).
- Destination pool for tag based forwarding (pools are IDs that can be
  assigned to ports, duplication occurs if the same ID is shared by several
  ports of the same NIC).

.. raw:: pdf

   PageBreak

Driver support
--------------

======== ======= ========= ======== === ====== ====== ==== ==== =========
Driver   MACVLAN ETHERTYPE FLEXIBLE SYN NTUPLE TUNNEL FDIR HASH L2_TUNNEL
======== ======= ========= ======== === ====== ====== ==== ==== =========
bnx2x
cxgbe
e1000            yes       yes      yes yes
ena
enic                                                  yes
fm10k
i40e     yes     yes                           yes    yes  yes
ixgbe            yes                yes yes           yes       yes
mlx4
mlx5                                                  yes
szedata2
======== ======= ========= ======== === ====== ====== ==== ==== =========

Flow director
-------------

Flow director (FDIR) is the name of the most capable filter type, which
covers most features offered by others. As such, it is the most widespread
in PMDs that support filtering (i.e. all of them besides **e1000**).

It is also the only type that allows an arbitrary 32-bit value provided by
applications to be attached to a filter and returned with matching packets
instead of relying on the destination queue to recognize flows.

Unfortunately, even FDIR requires applications to be aware of low-level
capabilities and limitations (most of which come directly from **ixgbe** and
**i40e**):

- Bitmasks are set globally per device (port?), not per filter.
- Configuration state is not expected to be saved by the driver, and
  stopping/restarting a port requires the application to perform it again
  (API documentation is also unclear about this).
- Monolithic approach with ABI issues as soon as a new kind of flow or
  combination needs to be supported.
- Cryptic global statistics/counters.
- Unclear about how priorities are managed; filters seem to be arranged as a
  linked list in hardware (possibly related to configuration order).

Packet alteration
-----------------

One interesting feature is that the L2 tunnel filter type implements the
ability to alter incoming packets through a filter (in this case to
encapsulate them), thus the **mlx5** flow encap/decap features are not a
foreign concept.

.. raw:: pdf

   PageBreak

Proposed API
============

Terminology
-----------

- **Filtering API**: overall framework affecting the fate of selected
  packets, covers everything described in this document.
- **Matching pattern**: properties to look for in received packets, a
  combination of any number of items.
- **Pattern item**: part of a pattern that either matches packet data
  (protocol header, payload or derived information), or specifies properties
  of the pattern itself.
- **Actions**: what needs to be done when a packet matches a pattern.
- **Flow rule**: this is the result of combining a *matching pattern* with
  *actions*.
- **Filter rule**: a less generic term than *flow rule*, can otherwise be
  used interchangeably.
- **Hit**: a flow rule is said to be *hit* when processing a matching
  packet.

Requirements
------------

As described in the previous section, there is a growing need for a common
method to configure filtering and related actions in a hardware independent
fashion.

The filtering API should not disallow any filter combination by design and
must remain as simple as possible to use. It can simply be defined as a
method to perform one or several actions on selected packets.

PMDs are aware of the capabilities of the device they manage and should be
responsible for preventing unsupported or conflicting combinations.

This approach is fundamentally different as it places most of the burden on
the software side of the PMD instead of having device capabilities directly
mapped to API functions, then expecting applications to work around ensuing
compatibility issues.

Requirements for a new API:

- Flexible and extensible without causing API/ABI problems for existing
  applications.
- Should be unambiguous and easy to use.
- Support existing filtering features and actions listed in `Filter types`_.
- Support packet alteration.
- In case of overlapping filters, their priority should be well documented.
- Support filter queries (for example to retrieve counters).

.. raw:: pdf

   PageBreak

High level design
-----------------

The chosen approach to make filtering as generic as possible is to express
matching patterns through lists of items instead of the flat structures used
in DPDK today, enabling combinations that are not predefined and thus being
more versatile.

Flow rules can have several distinct actions (such as counting,
encapsulating, decapsulating before redirecting packets to a particular
queue, etc.), instead of relying on several rules to achieve this and having
applications deal with hardware implementation details regarding their
order.

Support for different priority levels on a rule basis is provided, for
example in order to force a more specific rule to come before a more generic
one for packets matched by both. However, hardware support for more than a
single priority level cannot be guaranteed. When supported, the number of
available priority levels is usually low, which is why they can also be
implemented in software by PMDs (e.g. to simulate missing priority levels by
reordering rules).

In order to remain as hardware agnostic as possible, by default all rules
are considered to have the same priority, which means that the order between
overlapping rules (when a packet is matched by several filters) is
undefined. Packet duplication may even occur as a result.

PMDs may refuse to create overlapping rules at a given priority level when
they can be detected (e.g. if a pattern matches an existing filter).

Thus predictable results for a given priority level can only be achieved
with non-overlapping rules, using perfect matching on all protocol layers.

Support for multiple actions per rule may be implemented internally on top
of non-default hardware priorities. As a result, both features may not be
simultaneously available to applications.

Considering that allowed pattern/actions combinations cannot be known in
advance and would result in an impractically large number of capabilities to
expose, a method is provided to validate a given rule from the current
device configuration state without actually adding it (akin to a "dry run"
mode).

This enables applications to check if the rule types they need are supported
at initialization time, before starting their data path. This method can be
used anytime, its only requirement being that the resources needed by a rule
must exist (e.g. a target RX queue must be configured first).

Each defined rule is associated with an opaque handle managed by the PMD.
Applications are responsible for keeping it. Handles can be used for queries
and rules management, such as retrieving counters or other data and
destroying rules.

Handles must be destroyed before releasing associated resources such as
queues.
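
The "dry run" validation and the handle life cycle described above can be
pictured with a toy model. This is only a sketch written for discussion
purposes: ``toy_dev``, ``toy_rule_validate()``, ``toy_rule_create()`` and
``toy_rule_destroy()`` are hypothetical names that do not come from the
draft header, a rule is reduced to its target RX queue, and the handle
structure is visible here while a real PMD would keep it opaque.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 4
#define TOY_MAX_RULES 8

/* Hypothetical rule, applications only keep a pointer to it (handle). */
struct toy_rule {
	int used;
	int queue;
};

/* Hypothetical device state, a stand-in for what a PMD would track. */
struct toy_dev {
	int queue_configured[TOY_MAX_QUEUES];
	struct toy_rule rules[TOY_MAX_RULES];
};

/* Return a free rule slot or NULL, without modifying anything. */
static struct toy_rule *
toy_rule_slot(struct toy_dev *dev)
{
	size_t i;

	for (i = 0; i != TOY_MAX_RULES; ++i)
		if (!dev->rules[i].used)
			return &dev->rules[i];
	return NULL;
}

/*
 * "Dry run": return 0 if a rule targeting the given queue can be created
 * from the current device state, -1 otherwise. Nothing is added.
 */
static int
toy_rule_validate(struct toy_dev *dev, int queue)
{
	assert(dev != NULL);
	if (queue < 0 || queue >= TOY_MAX_QUEUES ||
	    !dev->queue_configured[queue])
		return -1; /* The target queue must be configured first. */
	return toy_rule_slot(dev) ? 0 : -1;
}

/* Create a rule and return its handle, NULL on failure. */
static struct toy_rule *
toy_rule_create(struct toy_dev *dev, int queue)
{
	struct toy_rule *rule;

	if (toy_rule_validate(dev, queue))
		return NULL;
	rule = toy_rule_slot(dev);
	rule->used = 1;
	rule->queue = queue;
	return rule;
}

/* Destroy a rule, to be done before releasing its target queue. */
static void
toy_rule_destroy(struct toy_rule *rule)
{
	assert(rule != NULL && rule->used);
	rule->used = 0;
}
```

In this model an application would call ``toy_rule_validate()`` once its
queues are configured, keep the pointer returned by ``toy_rule_create()``,
and hand it back to ``toy_rule_destroy()`` before tearing the queue down.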

Integration
-----------

To avoid ABI breakage, this new interface will be implemented through the
existing filtering control framework (``rte_eth_dev_filter_ctrl()``) using
**RTE_ETH_FILTER_GENERIC** as a new filter type.

However a public front-end API described in `Rules management`_ will
be added as the preferred method to use it.

Once discussions with the community have converged to a definite API, legacy
filter types should be deprecated and a deadline defined to remove their
support entirely.

PMDs will have to be gradually converted to **RTE_ETH_FILTER_GENERIC** or
drop filtering support entirely. Less maintained PMDs for older hardware may
lose support at this point.

The notion of filter type will then be deprecated and subsequently dropped
to avoid confusion between both frameworks.

Implementation details
======================

Flow rule
---------

A flow rule is the combination of a matching pattern with a list of actions,
and is the basis of this API.

Priorities
~~~~~~~~~~

A priority can be assigned to a matching pattern.

The default priority level is 0 and is also the highest. Support for more
than a single priority level in hardware is not guaranteed.

If a packet is matched by several filters at a given priority level, the
outcome is undefined. It can take any path and can even be duplicated.
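
One way a PMD might emulate priority levels in software is to order rules
before programming them into a device that processes them sequentially. The
sketch below assumes exactly that. ``toy_rule`` and its helpers are
hypothetical, they only show that level 0 sorts first and that nothing is
promised about rules sharing a level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical software view of a rule, for illustration only. */
struct toy_rule {
	uint32_t priority; /* 0 is the default level and also the highest. */
	unsigned int id;   /* Arbitrary identifier. */
};

/* Lower numeric values sort first since they have a higher priority. */
static int
toy_rule_cmp(const void *a, const void *b)
{
	const struct toy_rule *ra = a;
	const struct toy_rule *rb = b;

	return (ra->priority > rb->priority) - (ra->priority < rb->priority);
}

/*
 * Arrange rules in the order a device processing them sequentially should
 * see them. The relative order of rules sharing a priority level is left
 * unspecified (qsort() is not stable), as is the outcome for such rules.
 */
static void
toy_rules_order(struct toy_rule *rules, size_t n)
{
	assert(rules != NULL || n == 0);
	if (n)
		qsort(rules, n, sizeof(*rules), toy_rule_cmp);
}
```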

Matching pattern
~~~~~~~~~~~~~~~~

A matching pattern comprises any number of items of various types.

Items are arranged in a list to form a matching pattern for packets. They
fall in two categories:

- Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, VXLAN and so
  on), usually associated with a specification structure. These must be
  stacked in the same order as the protocol layers to match, starting from
  L2.

- Affecting how the pattern is processed (END, VOID, INVERT, PF, VF,
  SIGNATURE and so on), often without a specification structure. Since they
  are meta data that does not match packet contents, these can be specified
  anywhere within item lists without affecting the protocol matching items.

Most item specifications can be optionally paired with a mask to narrow the
specific fields or bits to be matched.

- Items are defined with ``struct rte_flow_item``.
- Patterns are defined with ``struct rte_flow_pattern``.

Example of an item specification matching an Ethernet header:

+--------------------------------------------+
| Ethernet                                   |
+==========+=========+=======================+
| ``spec`` | ``src`` | ``00:00:01:02:03:04`` |
|          +---------+-----------------------+
|          | ``dst`` | ``00:00:2a:66:00:01`` |
+----------+---------+-----------------------+
| ``mask`` | ``src`` | ``00:ff:ff:ff:ff:00`` |
|          +---------+-----------------------+
|          | ``dst`` | ``00:00:00:00:00:ff`` |
+----------+---------+-----------------------+

Non-masked bits stand for any value. Ethernet headers with the following
properties are thus matched:

- ``src``: ``??:00:01:02:03:??``
- ``dst``: ``??:??:??:??:??:01``

Except for meta types that do not need one, ``spec`` must be a valid pointer
to a structure of the related item type. A ``mask`` of the same type can be
provided to tell which bits in ``spec`` are to be matched.

A mask is normally only needed for ``spec`` fields matching packet data,
ignored otherwise. See individual item types for more information.

A ``NULL`` mask pointer is allowed and is similar to matching with a full
mask (all ones) on the ``spec`` fields supported by hardware. The remaining
fields are ignored (all zeroes), thus there is no error checking for
unsupported fields.
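
The masking rules above boil down to a bitwise comparison. The following
sketch illustrates it on raw bytes. ``toy_masked_match()`` is a hypothetical
helper that is not part of the proposed API, and treating a ``NULL`` mask as
all ones for every byte is a simplification (a PMD would restrict it to the
fields supported by hardware).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare packet bytes against an item specification,
 * looking only at the bits set in mask. A NULL mask is treated here as
 * "all ones", i.e. every bit of spec must match.
 */
static int
toy_masked_match(const uint8_t *data, const uint8_t *spec,
		 const uint8_t *mask, size_t len)
{
	size_t i;

	assert(data != NULL && spec != NULL);
	for (i = 0; i != len; ++i) {
		uint8_t m = mask ? mask[i] : 0xff;

		/* Bits cleared in the mask stand for any value. */
		if ((data[i] ^ spec[i]) & m)
			return 0;
	}
	return 1;
}
```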

Matching pattern items for packet data must be naturally stacked (ordered
from lowest to highest protocol layer), as in the following examples:

+--------------+
| TCPv4 as L4  |
+===+==========+
| 0 | Ethernet |
+---+----------+
| 1 | IPv4     |
+---+----------+
| 2 | TCP      |
+---+----------+

+----------------+
| TCPv6 in VXLAN |
+===+============+
| 0 | Ethernet   |
+---+------------+
| 1 | IPv4       |
+---+------------+
| 2 | UDP        |
+---+------------+
| 3 | VXLAN      |
+---+------------+
| 4 | Ethernet   |
+---+------------+
| 5 | IPv6       |
+---+------------+
| 6 | TCP        |
+---+------------+

+-----------------------------+
| TCPv4 as L4 with meta items |
+===+=========================+
| 0 | VOID                    |
+---+-------------------------+
| 1 | Ethernet                |
+---+-------------------------+
| 2 | VOID                    |
+---+-------------------------+
| 3 | IPv4                    |
+---+-------------------------+
| 4 | TCP                     |
+---+-------------------------+
| 5 | VOID                    |
+---+-------------------------+
| 6 | VOID                    |
+---+-------------------------+

The above example shows how meta items do not affect packet data matching
items, as long as those remain stacked properly. The resulting matching
pattern is identical to "TCPv4 as L4".

+----------------+
| UDPv6 anywhere |
+===+============+
| 0 | IPv6       |
+---+------------+
| 1 | UDP        |
+---+------------+

If supported by the PMD, omitting one or several protocol layers at the
bottom of the stack as in the above example (missing an Ethernet
specification) enables hardware to look anywhere in packets.

It is unspecified whether the payload of supported encapsulations
(e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
inner, outer or both packets.

+---------------------+
| Invalid, missing L3 |
+===+=================+
| 0 | Ethernet        |
+---+-----------------+
| 1 | UDP             |
+---+-----------------+

The above pattern is invalid due to a missing L3 specification between L2
and L4. Omitting protocol layers is only allowed at the bottom and at the
top of the stack.
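
The stacking rule illustrated by the tables above can be checked
mechanically. The sketch below is a simplified take on it: the item types
and the ``toy_`` helpers are hypothetical, and tunnels (where L2 appears a
second time after VXLAN) are deliberately left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types, named after those used in this document. */
enum toy_item_type {
	TOY_ITEM_END = 0, /* End marker, 0 for convenience. */
	TOY_ITEM_VOID,    /* Meta item, ignored. */
	TOY_ITEM_ETH,
	TOY_ITEM_IPV4,
	TOY_ITEM_IPV6,
	TOY_ITEM_UDP,
	TOY_ITEM_TCP,
};

/* Protocol layer of an item type, 0 for meta items. */
static int
toy_item_layer(enum toy_item_type type)
{
	switch (type) {
	case TOY_ITEM_ETH:
		return 2;
	case TOY_ITEM_IPV4:
	case TOY_ITEM_IPV6:
		return 3;
	case TOY_ITEM_UDP:
	case TOY_ITEM_TCP:
		return 4;
	default:
		return 0;
	}
}

/*
 * Return 1 if protocol items are stacked from lowest to highest layer
 * without a gap in the middle, 0 otherwise. Layers may only be omitted at
 * the bottom or at the top of the stack.
 */
static int
toy_pattern_is_stacked(const enum toy_item_type *items)
{
	int prev = 0;
	size_t i;

	assert(items != NULL);
	for (i = 0; items[i] != TOY_ITEM_END; ++i) {
		int layer = toy_item_layer(items[i]);

		if (!layer)
			continue; /* Meta items do not affect stacking. */
		if (prev && layer != prev + 1)
			return 0; /* Gap, repetition or wrong order. */
		prev = layer;
	}
	return 1;
}
```

With these definitions, the "TCPv4 as L4" list is accepted with or without
VOID items, "UDPv6 anywhere" is accepted, and Ethernet directly followed by
UDP is rejected.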

Meta item types
~~~~~~~~~~~~~~~

These do not match packet data but affect how the pattern is processed. Most
of them do not need a specification structure. This particularity allows
them to be specified anywhere without affecting other item types.

``END``
^^^^^^^

End marker for item lists. Prevents further processing of items, thereby
ending the pattern.

- Its numeric value is **0** for convenience.
- PMD support is mandatory.
- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| END                |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

``VOID``
^^^^^^^^

Used as a placeholder for convenience. It is ignored and simply discarded by
PMDs.

- PMD support is mandatory.
- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| VOID               |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

One usage example for this type is generating rules that share a common
prefix quickly without reallocating memory, only by updating item types:

+------------------------+
| TCP, UDP or ICMP as L4 |
+===+====================+
| 0 | Ethernet           |
+---+--------------------+
| 1 | IPv4               |
+---+------+------+------+
| 2 | UDP  | VOID | VOID |
+---+------+------+------+
| 3 | VOID | TCP  | VOID |
+---+------+------+------+
| 4 | VOID | VOID | ICMP |
+---+------+------+------+
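
The table above can be read as a single item list whose L4 slots are
switched on and off in place. A minimal sketch of that idea follows. The
item layout (``toy_item`` with ``type``, ``spec`` and ``mask``) and the
helper name are assumptions made for this example, not definitions taken
from the draft header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types and item layout, for illustration only. */
enum toy_item_type {
	TOY_ITEM_END = 0,
	TOY_ITEM_VOID,
	TOY_ITEM_ETH,
	TOY_ITEM_IPV4,
	TOY_ITEM_UDP,
	TOY_ITEM_TCP,
	TOY_ITEM_ICMP,
};

struct toy_item {
	enum toy_item_type type;
	const void *spec;
	const void *mask;
};

/* Slots 2 to 4 of the table above, one per L4 protocol. */
#define TOY_L4_FIRST 2
#define TOY_L4_COUNT 3

static const enum toy_item_type toy_l4_types[TOY_L4_COUNT] = {
	TOY_ITEM_UDP, TOY_ITEM_TCP, TOY_ITEM_ICMP,
};

/*
 * Enable exactly one L4 slot (0: UDP, 1: TCP, 2: ICMP) and void the
 * others. The item list is updated in place, nothing is reallocated.
 */
static void
toy_select_l4(struct toy_item *items, unsigned int which)
{
	unsigned int i;

	assert(items != NULL && which < TOY_L4_COUNT);
	for (i = 0; i != TOY_L4_COUNT; ++i)
		items[TOY_L4_FIRST + i].type =
			(i == which) ? toy_l4_types[i] : TOY_ITEM_VOID;
}
```

An application would allocate the six-entry list once (Ethernet, IPv4,
three L4 slots, END) and call ``toy_select_l4()`` before creating each of
the three rules.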

.. raw:: pdf

   PageBreak

``INVERT``
^^^^^^^^^^

Inverted matching, i.e. process packets that do not match the pattern.

- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| INVERT             |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

Usage example in order to match non-TCPv4 packets only:

+--------------------+
| Anything but TCPv4 |
+===+================+
| 0 | INVERT         |
+---+----------------+
| 1 | Ethernet       |
+---+----------------+
| 2 | IPv4           |
+---+----------------+
| 3 | TCP            |
+---+----------------+

``PF``
^^^^^^

Matches packets addressed to the physical function of the device.

- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| PF                 |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

``VF``
^^^^^^

Matches packets addressed to the given virtual function ID of the device.

- Only ``spec`` needs to be defined, ``mask`` is ignored.

+----------------------------------------+
| VF                                     |
+==========+=========+===================+
| ``spec`` | ``vf``  | destination VF ID |
+----------+---------+-------------------+
| ``mask`` | ignored                     |
+----------+-----------------------------+

``SIGNATURE``
^^^^^^^^^^^^^

Requests hash-based signature dispatching for this rule.

Considering this is a global setting on devices that support it, all
subsequent filter rules may have to be created with it as well.

- Only ``spec`` needs to be defined, ``mask`` is ignored.

+--------------------+
| SIGNATURE          |
+==========+=========+
| ``spec`` | TBD     |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

.. raw:: pdf

   PageBreak

Data matching item types
~~~~~~~~~~~~~~~~~~~~~~~~

Most of these are basically protocol header definitions with associated
bitmasks. They must be specified (stacked) from lowest to highest protocol
layer.

The following list is not exhaustive as new protocols will be added in the
future.

``ANY``
^^^^^^^

Matches any protocol in place of the current layer. A single ANY may also
stand for several protocol layers.

This is usually specified as the first pattern item when looking for a
protocol anywhere in a packet.

- A maximum value of **0** requests matching any number of protocol layers
  above or equal to the minimum value. Any other maximum value lower than
  the minimum one is invalid.
- Only ``spec`` needs to be defined, ``mask`` is ignored.

+-----------------------------------------------------------------------+
| ANY                                                                   |
+==========+=========+==================================================+
| ``spec`` | ``min`` | minimum number of layers covered                 |
|          +---------+--------------------------------------------------+
|          | ``max`` | maximum number of layers covered, 0 for infinity |
+----------+---------+--------------------------------------------------+
| ``mask`` | ignored                                                    |
+----------+------------------------------------------------------------+

Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
and L4 (UDP), both matched by the first ANY specification, and of inner L3
(IPv4 or IPv6), matched by the second ANY specification:

+----------------------------------+
| TCP in VXLAN with wildcards      |
+===+==============================+
| 0 | Ethernet                     |
+---+-----+----------+---------+---+
| 1 | ANY | ``spec`` | ``min`` | 2 |
|   |     |          +---------+---+
|   |     |          | ``max`` | 2 |
+---+-----+----------+---------+---+
| 2 | VXLAN                        |
+---+------------------------------+
| 3 | Ethernet                     |
+---+-----+----------+---------+---+
| 4 | ANY | ``spec`` | ``min`` | 1 |
|   |     |          +---------+---+
|   |     |          | ``max`` | 1 |
+---+-----+----------+---------+---+
| 5 | TCP                          |
+---+------------------------------+
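
The ``min``/``max`` semantics of ANY are easy to get wrong because 0 means
"no upper bound" only for ``max``. The sketch below spells them out. The
structure layout (two 32-bit fields) and the helper names are assumptions
made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical specification layout for the ANY item. */
struct toy_item_any {
	uint32_t min; /* Minimum number of layers covered. */
	uint32_t max; /* Maximum number of layers covered, 0 for infinity. */
};

/* A nonzero maximum lower than the minimum is invalid. */
static int
toy_any_is_valid(const struct toy_item_any *any)
{
	assert(any != NULL);
	return any->max == 0 || any->max >= any->min;
}

/* Return 1 if an ANY item may stand for exactly "layers" layers. */
static int
toy_any_covers(const struct toy_item_any *any, uint32_t layers)
{
	if (!toy_any_is_valid(any) || layers < any->min)
		return 0;
	return any->max == 0 || layers <= any->max;
}
```

In the table above, the first ANY (``min`` and ``max`` both 2) covers
exactly the outer L3 and L4 headers, while the second one (both 1) covers
the inner L3 header only.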

.. raw:: pdf

   PageBreak

``RAW``
^^^^^^^

Matches a string of a given length at a given offset (in bytes), or anywhere
in the payload of the current protocol layer (including L2 header if used as
the first item in the stack).

This does not increment the protocol layer count as it is not a protocol
definition. Subsequent RAW items modulate the first absolute one with
relative offsets.

- Using **-1** as the ``offset`` of the first RAW item makes its absolute
  offset not fixed, i.e. the pattern is searched everywhere.
- ``mask`` only affects the pattern.

+--------------------------------------------------------------+
| RAW                                                          |
+==========+=============+=====================================+
| ``spec`` | ``offset``  | absolute or relative pattern offset |
|          +-------------+-------------------------------------+
|          | ``length``  | pattern length                      |
|          +-------------+-------------------------------------+
|          | ``pattern`` | byte string of the above length     |
+----------+-------------+-------------------------------------+
| ``mask`` | ``offset``  | ignored                             |
|          +-------------+-------------------------------------+
|          | ``length``  | ignored                             |
|          +-------------+-------------------------------------+
|          | ``pattern`` | bitmask with the same byte length   |
+----------+-------------+-------------------------------------+

Example pattern looking for several strings at various offsets of a UDP
payload, using combined RAW items:

+------------------------------------------+
| UDP payload matching                     |
+===+======================================+
| 0 | Ethernet                             |
+---+--------------------------------------+
| 1 | IPv4                                 |
+---+--------------------------------------+
| 2 | UDP                                  |
+---+-----+----------+-------------+-------+
| 3 | RAW | ``spec`` | ``offset``  | -1    |
|   |     |          +-------------+-------+
|   |     |          | ``length``  | 3     |
|   |     |          +-------------+-------+
|   |     |          | ``pattern`` | "foo" |
+---+-----+----------+-------------+-------+
| 4 | RAW | ``spec`` | ``offset``  | 20    |
|   |     |          +-------------+-------+
|   |     |          | ``length``  | 3     |
|   |     |          +-------------+-------+
|   |     |          | ``pattern`` | "bar" |
+---+-----+----------+-------------+-------+
| 5 | RAW | ``spec`` | ``offset``  | -30   |
|   |     |          +-------------+-------+
|   |     |          | ``length``  | 3     |
|   |     |          +-------------+-------+
|   |     |          | ``pattern`` | "baz" |
+---+-----+----------+-------------+-------+

This translates to:

- Locate "foo" in UDP payload, remember its offset.
- Check "bar" at "foo"'s offset plus 20 bytes.
- Check "baz" at "foo"'s offset minus 30 bytes.
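
The three steps above can be modelled in software. The following is only a
minimal sketch of the matching semantics, not a PMD implementation:
``struct raw_item`` and both helper functions are hypothetical names, and
retrying every occurrence of the first pattern is an assumption, since this
document does not say what happens when the first pattern appears more than
once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the RAW item fields described above. */
struct raw_item {
	int32_t offset;         /* first item: absolute, -1 = search;
	                         * next items: relative to the first one */
	uint32_t length;        /* pattern length in bytes */
	const uint8_t *pattern; /* bytes to look for */
	const uint8_t *mask;    /* optional bitmask, NULL = exact match */
};

/* Compare one item against the payload at position pos (bounds-checked). */
static int
raw_item_match_at(const uint8_t *buf, size_t len, long pos,
		  const struct raw_item *item)
{
	uint32_t i;

	if (pos < 0 || (size_t)pos > len || item->length > len - (size_t)pos)
		return 0;
	for (i = 0; i < item->length; i++) {
		/* The mask only affects the pattern, never the offset. */
		uint8_t m = item->mask ? item->mask[i] : 0xff;

		if ((buf[pos + i] ^ item->pattern[i]) & m)
			return 0;
	}
	return 1;
}

/* Return 1 if all combined RAW items match the payload, 0 otherwise. */
static int
raw_items_match(const uint8_t *buf, size_t len,
		const struct raw_item *items, size_t count)
{
	long first, last, base;
	size_t i;

	assert(buf != NULL && items != NULL && count > 0);
	if (items[0].offset < 0) {
		/* Floating first item: try every possible position. */
		first = 0;
		last = (long)len - (long)items[0].length;
	} else {
		first = last = items[0].offset;
	}
	for (base = first; base <= last; base++) {
		if (!raw_item_match_at(buf, len, base, &items[0]))
			continue;
		/* Subsequent items are relative to the first item's offset. */
		for (i = 1; i < count; i++)
			if (!raw_item_match_at(buf, len,
					       base + items[i].offset,
					       &items[i]))
				break;
		if (i == count)
			return 1;
	}
	return 0;
}
```

With "foo" at payload offset 30, the sketch expects "bar" at offset 50 and
"baz" at offset 0, which mirrors the table above.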

.. raw:: pdf

   PageBreak

``ETH``
^^^^^^^

Matches an Ethernet header.

- ``dst``: destination MAC.
- ``src``: source MAC.
- ``type``: EtherType.
- ``tags``: number of 802.1Q/ad tags defined.
- ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:

  - ``tpid``: Tag protocol identifier.
  - ``tci``: Tag control information.

``IPV4``
^^^^^^^^

Matches an IPv4 header.

- ``src``: source IP address.
- ``dst``: destination IP address.
- ``tos``: ToS/DSCP field.
- ``ttl``: TTL field.
- ``proto``: protocol number for the next layer.

``IPV6``
^^^^^^^^

Matches an IPv6 header.

- ``src``: source IP address.
- ``dst``: destination IP address.
- ``tc``: traffic class field.
- ``nh``: Next header field (protocol).
- ``hop_limit``: hop limit field (TTL).

``ICMP``
^^^^^^^^

Matches an ICMP header.

- TBD.

``UDP``
^^^^^^^

Matches a UDP header.

- ``sport``: source port.
- ``dport``: destination port.
- ``length``: UDP length.
- ``checksum``: UDP checksum.

.. raw:: pdf

   PageBreak

``TCP``
^^^^^^^

Matches a TCP header.

- ``sport``: source port.
- ``dport``: destination port.
- All other TCP fields and bits.

``VXLAN``
^^^^^^^^^

Matches a VXLAN header.

- TBD.

.. raw:: pdf

   PageBreak

Actions
~~~~~~~

Each possible action is represented by a type. Some have associated
configuration structures. Several actions combined in a list can be assigned
to a flow rule. That list is not ordered.

At least one action must be defined in a filter rule in order to do
something with matched packets.

- Actions are defined with ``struct rte_flow_action``.
- A list of actions is defined with ``struct rte_flow_actions``.

They fall in three categories:

- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
  processing matched packets by subsequent flow rules, unless overridden
  with PASSTHRU.

- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
  additional processing by subsequent flow rules.

- Other non-terminating meta actions that do not affect the fate of packets
  (END, VOID, ID, COUNT).

When several actions are combined in a flow rule, they should all have
different types (e.g. dropping a packet twice is not possible). However,
since the VOID type is an exception to this rule, the defined behavior is
for PMDs to take into account only the last action of a given type found in
the list. PMDs still perform error checking on the entire list.

*Note that PASSTHRU is the only action able to override a terminating rule.*

.. raw:: pdf

   PageBreak

Example of an action that redirects packets to queue index 10:

+----------------+
| QUEUE          |
+===========+====+
| ``queue`` | 10 |
+-----------+----+

Examples of action lists follow. Their order is not significant;
applications must consider all actions to be performed simultaneously:

+----------------+
| Count and drop |
+=======+========+
| COUNT |        |
+-------+--------+
| DROP  |        |
+-------+--------+

+--------------------------+
| Tag, count and redirect  |
+=======+===========+======+
| ID    | ``id``    | 0x2a |
+-------+-----------+------+
| COUNT |                  |
+-------+-----------+------+
| QUEUE | ``queue`` | 10   |
+-------+-----------+------+

+-----------------------+
| Redirect to queue 5   |
+=======+===============+
| DROP  |               |
+-------+-----------+---+
| QUEUE | ``queue`` | 5 |
+-------+-----------+---+

In the above example, considering both actions are performed simultaneously,
its end result is that only QUEUE has any effect.

+-----------------------+
| Redirect to queue 3   |
+=======+===========+===+
| QUEUE | ``queue`` | 5 |
+-------+-----------+---+
| VOID  |               |
+-------+-----------+---+
| QUEUE | ``queue`` | 3 |
+-------+-----------+---+

As previously described, only the last action of a given type found in the
list is taken into account. The above example also shows that VOID is
ignored.
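
The "last action of a given type wins" rule can be sketched as a single pass
over the list. ``struct action`` and the enumeration values are hypothetical
stand-ins for ``struct rte_flow_action``, whose layout is not given in this
section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action types; only END = 0 is stated by this document. */
enum action_type {
	ACTION_END = 0,
	ACTION_VOID,
	ACTION_QUEUE,
	ACTION_DROP,
};

/* Hypothetical action entry with a single configuration field. */
struct action {
	enum action_type type;
	unsigned int queue; /* only meaningful for ACTION_QUEUE */
};

/*
 * Return the last action of the requested type found before END, or NULL
 * when there is none. VOID entries are skipped unconditionally.
 */
static const struct action *
last_action_of_type(const struct action *list, enum action_type type)
{
	const struct action *found = NULL;

	assert(list != NULL);
	for (; list->type != ACTION_END; list++) {
		if (list->type == ACTION_VOID)
			continue;
		if (list->type == type)
			found = list;
	}
	return found;
}
```

Applied to the "Redirect to queue 3" list, looking up QUEUE yields the entry
whose ``queue`` is 3.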

.. raw:: pdf

   PageBreak

Action types
~~~~~~~~~~~~

Common action types are described in this section. Like pattern item types,
this list is not exhaustive as new actions will be added in the future.

``END`` (action)
^^^^^^^^^^^^^^^^

End marker for action lists. Prevents further processing of actions, thereby
ending the list.

- Its numeric value is **0** for convenience.
- PMD support is mandatory.
- No configurable property.

+---------------+
| END           |
+===============+
| no properties |
+---------------+

``VOID`` (action)
^^^^^^^^^^^^^^^^^

Used as a placeholder for convenience. It is ignored and simply discarded by
PMDs.

- PMD support is mandatory.
- No configurable property.

+---------------+
| VOID          |
+===============+
| no properties |
+---------------+

``PASSTHRU``
^^^^^^^^^^^^

Leaves packets up for additional processing by subsequent flow rules. This
is the default when a rule does not contain a terminating action, but can be
specified to force a rule to become non-terminating.

- No configurable property.

+---------------+
| PASSTHRU      |
+===============+
| no properties |
+---------------+

Example to copy a packet to a queue and continue processing by subsequent
flow rules:

+--------------------------+
| Copy to queue 8          |
+==========+===============+
| PASSTHRU |               |
+----------+-----------+---+
| QUEUE    | ``queue`` | 8 |
+----------+-----------+---+

``ID``
^^^^^^

Attaches a 32-bit value to packets.

+----------------------------------------------+
| ID                                           |
+========+=====================================+
| ``id`` | 32-bit value to return with packets |
+--------+-------------------------------------+

.. raw:: pdf

   PageBreak

``QUEUE``
^^^^^^^^^

Assigns packets to a given queue index.

- Terminating by default.

+--------------------------------+
| QUEUE                          |
+===========+====================+
| ``queue`` | queue index to use |
+-----------+--------------------+

``DROP``
^^^^^^^^

Drop packets.

- No configurable property.
- Terminating by default.
- PASSTHRU overrides this action if both are specified.

+---------------+
| DROP          |
+===============+
| no properties |
+---------------+

``COUNT``
^^^^^^^^^

Enables hits counter for this rule.

This counter can be retrieved and reset through ``rte_flow_query()``, see
``struct rte_flow_query_count``.

- No configurable property.

+---------------+
| COUNT         |
+===============+
| no properties |
+---------------+

Query structure to retrieve and reset the flow rule hits counter:

+------------------------------------------------+
| COUNT query                                    |
+===========+=====+==============================+
| ``reset`` | in  | reset counter after query    |
+-----------+-----+------------------------------+
| ``hits``  | out | number of hits for this flow |
+-----------+-----+------------------------------+

``DUP``
^^^^^^^

Duplicates packets to a given queue index.

This is normally combined with QUEUE, however when used alone, it is
actually similar to QUEUE + PASSTHRU.

- Non-terminating by default.

+------------------------------------------------+
| DUP                                            |
+===========+====================================+
| ``queue`` | queue index to duplicate packet to |
+-----------+------------------------------------+

.. raw:: pdf

   PageBreak

``RSS``
^^^^^^^

Similar to QUEUE, except RSS is additionally performed on packets to spread
them among several queues according to the provided parameters.

- Terminating by default.

+---------------------------------------------+
| RSS                                         |
+==============+==============================+
| ``rss_conf`` | RSS parameters               |
+--------------+------------------------------+
| ``queues``   | number of entries in queue[] |
+--------------+------------------------------+
| ``queue[]``  | queue indices to use         |
+--------------+------------------------------+

``PF`` (action)
^^^^^^^^^^^^^^^

Redirects packets to the physical function (PF) of the current device.

- No configurable property.
- Terminating by default.

+---------------+
| PF            |
+===============+
| no properties |
+---------------+

``VF`` (action)
^^^^^^^^^^^^^^^

Redirects packets to the virtual function (VF) of the current device with
the specified ID.

- Terminating by default.

+---------------------------------------+
| VF                                    |
+========+==============================+
| ``id`` | VF ID to redirect packets to |
+--------+------------------------------+

Planned types
~~~~~~~~~~~~~

Other action types are planned but not defined yet. These actions will add
the ability to alter matching packets in several ways, such as performing
encapsulation/decapsulation of tunnel headers on specific flows.

.. raw:: pdf

   PageBreak

Rules management
----------------

A simple API with only four functions is provided to fully manage flows.

Each created flow rule is associated with an opaque, PMD-specific handle
pointer. The application is responsible for keeping it until the rule is
destroyed.

Flow rules are defined with ``struct rte_flow``.

Validation
~~~~~~~~~~

Given that expressing a definite set of device capabilities with this API is
not practical, a dedicated function is provided to check if a flow rule is
supported and can be created.

::

 int
 rte_flow_validate(uint8_t port_id,
                   const struct rte_flow_pattern *pattern,
                   const struct rte_flow_actions *actions);

While this function has no effect on the target device, the flow rule is
validated against its current configuration state and the returned value
should be considered valid by the caller for that state only.

The returned value is guaranteed to remain valid only as long as no
successful calls to rte_flow_create() or rte_flow_destroy() are made in the
meantime and no device parameter affecting flow rules in any way is
modified, due to possible collisions or resource limitations (although in
such cases ``EINVAL`` should not be returned).

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``pattern``: pattern specification to check.
- ``actions``: actions associated with the flow definition.

Return value:

- **0** if the flow rule is valid and can be created, a negative errno value
  otherwise (``rte_errno`` is also set). The following errors are defined:
- ``-EINVAL``: unknown or invalid rule specification.
- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial masks
  are unsupported).
- ``-EEXIST``: collision with an existing rule.
- ``-ENOMEM``: not enough resources.
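
An application could dispatch on these error codes as follows. The
``rule_next_step`` enumeration and the chosen reactions are purely
illustrative assumptions; this document does not mandate how applications
should react to each code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical application-level reactions to a validation result. */
enum rule_next_step {
	RULE_CREATE,         /* rule can be created as is */
	RULE_FIX_SPEC,       /* rule specification is wrong */
	RULE_FALLBACK,       /* handle this traffic in software instead */
	RULE_FREE_RESOURCES, /* destroy other rules, then try again */
	RULE_GIVE_UP,        /* error not defined by this document */
};

/* Map the value returned by a validation call to a reaction. */
static enum rule_next_step
next_step_after_validate(int ret)
{
	assert(ret <= 0);
	switch (ret) {
	case 0:
		return RULE_CREATE;
	case -EINVAL:
		return RULE_FIX_SPEC;
	case -ENOTSUP:
		return RULE_FALLBACK;
	case -EEXIST:
	case -ENOMEM:
		return RULE_FREE_RESOURCES;
	default:
		return RULE_GIVE_UP;
	}
}
```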

.. raw:: pdf

   PageBreak

Creation
~~~~~~~~

Creating a flow rule is similar to validating one, except the rule is
actually created.

::

 struct rte_flow *
 rte_flow_create(uint8_t port_id,
                 const struct rte_flow_pattern *pattern,
                 const struct rte_flow_actions *actions);

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``pattern``: pattern specification to add.
- ``actions``: actions associated with the flow definition.

Return value:

A valid flow pointer in case of success, NULL otherwise and ``rte_errno`` is
set to the positive version of one of the error codes defined for
``rte_flow_validate()``.

Destruction
~~~~~~~~~~~

Flow rule destruction is not automatic, and a queue should not be released
if any are still attached to it. Applications must take care of performing
this step before releasing resources.

::

 int
 rte_flow_destroy(uint8_t port_id,
                  struct rte_flow *flow);


Failure to destroy a flow rule may occur when other flow rules depend on it,
and destroying it would result in an inconsistent state.

This function is only guaranteed to succeed if flow rules are destroyed in
reverse order of their creation.

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``flow``: flow rule to destroy.

Return value:

- **0** on success, a negative errno value otherwise and ``rte_errno`` is
  set.
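
Since destruction is only guaranteed to succeed in reverse order of
creation, an application may want to keep its handles in a stack. The
following is a minimal sketch; ``struct flow_stack``, its fixed capacity and
the use of ``void *`` for the opaque handle are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

#define FLOW_STACK_MAX 64

/* Opaque handles returned by rule creation, most recent last. */
struct flow_stack {
	void *handle[FLOW_STACK_MAX];
	size_t count;
};

/* Remember a newly created rule. Returns 0 on success, -1 when full. */
static int
flow_stack_push(struct flow_stack *stack, void *handle)
{
	assert(stack != NULL && handle != NULL);
	if (stack->count == FLOW_STACK_MAX)
		return -1;
	stack->handle[stack->count++] = handle;
	return 0;
}

/* Return the most recently created rule still tracked, or NULL. */
static void *
flow_stack_pop(struct flow_stack *stack)
{
	assert(stack != NULL);
	return stack->count ? stack->handle[--stack->count] : NULL;
}
```

Before releasing queues, the application would pop handles one at a time
and pass each to the destruction function until NULL is returned.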

.. raw:: pdf

   PageBreak

Query
~~~~~

Query an existing flow rule.

This function allows retrieving flow-specific data such as counters. Data
is gathered by special actions which must be present in the flow rule
definition.

::

 int
 rte_flow_query(uint8_t port_id,
                struct rte_flow *flow,
                enum rte_flow_action_type action,
                void *data);

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``flow``: flow rule to query.
- ``action``: action type to query.
- ``data``: pointer to storage for the associated query data type.

Return value:

- **0** on success, a negative errno value otherwise and ``rte_errno`` is
  set.

.. raw:: pdf

   PageBreak

Behavior
--------

- API operations are synchronous and blocking (``EAGAIN`` cannot be
  returned).

- There is no provision for reentrancy/multi-thread safety, although nothing
  should prevent different devices from being configured at the same
  time. PMDs may protect their control path functions accordingly.

- Stopping the data path (TX/RX) should not be necessary when managing flow
  rules. If this cannot be achieved naturally or with workarounds (such as
  temporarily replacing the burst function pointers), an appropriate error
  code must be returned (``EBUSY``).

- PMDs, not applications, are responsible for maintaining flow rules
  configuration when stopping and restarting a port or performing other
  actions which may affect them. They can only be destroyed explicitly.

.. raw:: pdf

   PageBreak

Compatibility
-------------

No known hardware implementation supports all the features described in this
document.

Unsupported features or combinations are not expected to be fully emulated
in software by PMDs for performance reasons. Partially supported features
may be completed in software as long as hardware performs most of the work
(such as queue redirection and packet recognition).

However PMDs are expected to do their best to satisfy application requests
by working around hardware limitations as long as doing so does not affect
the behavior of existing flow rules.

The following sections provide a few examples of such cases. They are based
on limitations built into the previous APIs.

Global bitmasks
~~~~~~~~~~~~~~~

Each flow rule comes with its own, per-layer bitmasks, while hardware may
support only a single, device-wide bitmask for a given layer type, so that
two IPv4 rules cannot use different bitmasks.

The expected behavior in this case is that PMDs automatically configure
global bitmasks according to the needs of the first created flow rule.

Subsequent rules are allowed only if their bitmasks match those; the
``EEXIST`` error code should be returned otherwise.
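
A PMD could track the device-wide bitmask along these lines. This is a
sketch only: ``struct global_mask``, the fixed mask size and the function
name are hypothetical, and releasing the global bitmask once the last rule
using it is destroyed is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define L3_MASK_SIZE 16 /* arbitrary size chosen for this example */

/* Device-wide bitmask for one layer type. */
struct global_mask {
	int configured;              /* set once the first rule is created */
	uint8_t bytes[L3_MASK_SIZE]; /* bitmask currently programmed */
};

/*
 * Called for each new rule: the first one configures the global bitmask,
 * later ones must use exactly the same bitmask.
 */
static int
global_mask_claim(struct global_mask *global,
		  const uint8_t mask[L3_MASK_SIZE])
{
	assert(global != NULL && mask != NULL);
	if (!global->configured) {
		memcpy(global->bytes, mask, L3_MASK_SIZE);
		global->configured = 1;
		return 0;
	}
	return memcmp(global->bytes, mask, L3_MASK_SIZE) ? -EEXIST : 0;
}
```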

Unsupported layer types
~~~~~~~~~~~~~~~~~~~~~~~

Many protocols can be simulated by crafting patterns with the `RAW`_ type.

PMDs can rely on this capability to simulate support for protocols with
fixed headers not directly recognized by hardware.

``ANY`` pattern item
~~~~~~~~~~~~~~~~~~~~

This pattern item stands for anything, which can be difficult to translate
to something hardware would understand, particularly if followed by more
specific types.

Consider the following pattern:

+---+--------------------------------+
| 0 | ETH                            |
+---+--------------------------------+
| 1 | ANY (``min`` = 1, ``max`` = 1) |
+---+--------------------------------+
| 2 | TCP                            |
+---+--------------------------------+

Knowing that TCP does not make sense with something other than IPv4 and IPv6
as L3, such a pattern may be translated to two flow rules instead:

+---+--------------------+
| 0 | ETH                |
+---+--------------------+
| 1 | IPV4 (zeroed mask) |
+---+--------------------+
| 2 | TCP                |
+---+--------------------+

+---+--------------------+
| 0 | ETH                |
+---+--------------------+
| 1 | IPV6 (zeroed mask) |
+---+--------------------+
| 2 | TCP                |
+---+--------------------+

Note that as soon as an ANY rule covers several layers, this approach may
yield a large number of hidden flow rules. It is thus suggested to only
support the most common scenarios (anything as L2 and/or L3).
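
The translation shown above can be sketched as a pattern expansion step.
The item enumeration, ``PATTERN_MAX`` and the function name are
hypothetical, and the sketch only handles the simplest case: a single ANY
item (``min`` = 1, ``max`` = 1) standing for L3 at index 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pattern item types. */
enum item_type {
	ITEM_END = 0,
	ITEM_ANY,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP,
};

#define PATTERN_MAX 8

/*
 * Expand a pattern whose second item is ANY into one pattern per L3
 * candidate. Returns the number of patterns written to out, or 0 when the
 * input does not have the expected shape.
 */
static size_t
expand_any_l3(const enum item_type *pattern, size_t len,
	      enum item_type out[2][PATTERN_MAX])
{
	static const enum item_type l3[2] = { ITEM_IPV4, ITEM_IPV6 };
	size_t i, j;

	assert(pattern != NULL && out != NULL);
	if (len < 2 || len > PATTERN_MAX || pattern[1] != ITEM_ANY)
		return 0;
	for (i = 0; i < 2; i++)
		for (j = 0; j < len; j++)
			out[i][j] = (j == 1) ? l3[i] : pattern[j];
	return 2;
}
```

Each generated pattern would then be programmed as a hidden flow rule with a
zeroed mask on the substituted item.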

.. raw:: pdf

   PageBreak

Unsupported actions
~~~~~~~~~~~~~~~~~~~

- When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
  tagging (`ID`_) may be implemented in software as long as the target queue
  is used by a single rule.

- A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
  rules combining `QUEUE`_ and `PASSTHRU`_.

- When a single target queue is provided, `RSS`_ can also be implemented
  through `QUEUE`_.

Flow rules priority
~~~~~~~~~~~~~~~~~~~

While it would naturally make sense, flow rules cannot be assumed to be
processed by hardware in the same order as their creation for several
reasons:

- They may be managed internally as a tree or a hash table instead of a
  list.
- Removing a flow rule before adding another one can either put the new rule
  at the end of the list or reuse a freed entry.
- Duplication may occur when packets are matched by several rules.

For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
predictable behavior is only guaranteed by using different priority levels.

Priority levels are not necessarily implemented in hardware, or may be
severely limited (e.g. a single priority bit).

For these reasons, priority levels may be implemented purely in software by
PMDs.

- For devices expecting flow rules to be added in the correct order, PMDs
  may destroy and re-create existing rules after adding a new one with
  a higher priority.

- A configurable number of dummy or empty rules can be created at
  initialization time to save high priority slots for later.

- In order to save priority levels, PMDs may evaluate whether rules are
  likely to collide and adjust their priority accordingly.
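
For devices that expect rules in the correct order, the cost of inserting a
rule can be estimated by counting the existing rules that must be destroyed
and re-created after it. This sketch assumes that a lower numeric value
means a higher priority and that the device evaluates rules in table order;
neither is stated in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * prio[] holds the priorities of rules already programmed, in hardware
 * order (sorted, highest priority first). Return how many of them must be
 * destroyed and re-created so that a rule with new_prio lands in the
 * right position.
 */
static size_t
rules_to_recreate(const uint32_t *prio, size_t count, uint32_t new_prio)
{
	size_t pos = 0;

	assert(prio != NULL || count == 0);
	/* Skip rules that must stay ahead of the new one. */
	while (pos < count && prio[pos] <= new_prio)
		pos++;
	/* Everything from pos onward has to move behind the new rule. */
	return count - pos;
}
```

Reserving dummy rules at initialization time, as suggested above, is one way
to keep this number at zero for the most common priority levels.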

.. raw:: pdf

   PageBreak

API migration
=============

Exhaustive list of deprecated filter types and how to convert them to
generic flow rules.

``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
---------------------------------------

`MACVLAN`_ can be translated to a basic `ETH`_ flow rule with a `VF
(action)`_ or `PF (action)`_ terminating action.

+------------------------------------+
| MACVLAN                            |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | ETH | ``spec`` | any | VF,     |
|   |     +----------+-----+ PF      |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
----------------------------------------------

`ETHERTYPE`_ is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as
a terminating action.

+------------------------------------+
| ETHERTYPE                          |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | ETH | ``spec`` | any | QUEUE,  |
|   |     +----------+-----+ DROP    |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``FLEXIBLE`` to ``RAW`` → ``QUEUE``
-----------------------------------

`FLEXIBLE`_ can be translated to one `RAW`_ pattern with `QUEUE`_ as the
terminating action and a defined priority level.

+------------------------------------+
| FLEXIBLE                           |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | RAW | ``spec`` | any | QUEUE   |
|   |     +----------+-----+         |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``SYN`` to ``TCP`` → ``QUEUE``
------------------------------

`SYN`_ is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
`QUEUE`_ as the terminating action.

Priority level can be set to simulate the high priority bit.

+---------------------------------------------+
| SYN                                         |
+-----------------------------------+---------+
| Pattern                           | Actions |
+===+======+==========+=============+=========+
| 0 | ETH  | ``spec`` | N/A         | QUEUE   |
|   |      +----------+-------------+         |
|   |      | ``mask`` | empty       |         |
+---+------+----------+-------------+         |
| 1 | IPV4 | ``spec`` | N/A         |         |
|   |      +----------+-------------+         |
|   |      | ``mask`` | empty       |         |
+---+------+----------+-------------+         |
| 2 | TCP  | ``spec`` | ``syn`` = 1 |         |
|   |      +----------+-------------+         |
|   |      | ``mask`` | ``syn`` = 1 |         |
+---+------+----------+-------------+---------+
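
The ``spec``/``mask`` pair of the TCP item above can be read as a masked
comparison. The helper below is a hypothetical sketch of that semantic, not
part of the proposed API; only the position of the SYN bit (0x02 in the TCP
flags byte) comes from the TCP specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_SYN 0x02 /* SYN bit in the TCP flags byte */

/* Return true when value equals spec on every bit set in mask. */
static bool
masked_match(const uint8_t *value, const uint8_t *spec,
	     const uint8_t *mask, size_t len)
{
	size_t i;

	assert(value != NULL && spec != NULL && mask != NULL);
	for (i = 0; i < len; i++)
		if ((value[i] ^ spec[i]) & mask[i])
			return false;
	return true;
}
```

With only the ``syn`` bit masked, both SYN and SYN+ACK segments match, while
an empty (zeroed) mask such as the one used for ETH and IPV4 matches any
value.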

``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
----------------------------------------------------

`NTUPLE`_ is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
`UDP`_ as L4 and `QUEUE`_ as the terminating action.

A priority level can be specified as well.

+---------------------------------------+
| NTUPLE                                |
+-----------------------------+---------+
| Pattern                     | Actions |
+===+======+==========+=======+=========+
| 0 | ETH  | ``spec`` | N/A   | QUEUE   |
|   |      +----------+-------+         |
|   |      | ``mask`` | empty |         |
+---+------+----------+-------+         |
| 1 | IPV4 | ``spec`` | any   |         |
|   |      +----------+-------+         |
|   |      | ``mask`` | any   |         |
+---+------+----------+-------+         |
| 2 | TCP, | ``spec`` | any   |         |
|   | UDP  +----------+-------+         |
|   |      | ``mask`` | any   |         |
+---+------+----------+-------+---------+

``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
---------------------------------------------------------------------------

`TUNNEL`_ matches common IPv4 and IPv6 L3/L4-based tunnel types.

In the following table, `ANY`_ is used to cover the optional L4.

+------------------------------------------------+
| TUNNEL                                         |
+--------------------------------------+---------+
| Pattern                              | Actions |
+===+=========+==========+=============+=========+
| 0 | ETH     | ``spec`` | any         | QUEUE   |
|   |         +----------+-------------+         |
|   |         | ``mask`` | any         |         |
+---+---------+----------+-------------+         |
| 1 | IPV4,   | ``spec`` | any         |         |
|   | IPV6    +----------+-------------+         |
|   |         | ``mask`` | any         |         |
+---+---------+----------+-------------+         |
| 2 | ANY     | ``spec`` | ``min`` = 0 |         |
|   |         |          +-------------+         |
|   |         |          | ``max`` = 0 |         |
|   |         +----------+-------------+         |
|   |         | ``mask`` | N/A         |         |
+---+---------+----------+-------------+         |
| 3 | VXLAN,  | ``spec`` | any         |         |
|   | GENEVE, +----------+-------------+         |
|   | TEREDO, | ``mask`` | any         |         |
|   | NVGRE,  |          |             |         |
|   | GRE,    |          |             |         |
|   | ...     |          |             |         |
+---+---------+----------+-------------+---------+

.. raw:: pdf

   PageBreak

``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
---------------------------------------------------------------

`FDIR`_ is more complex than any other type and there are several methods to
emulate its functionality. It is summarized for the most part in the table
below.

A few features are intentionally not supported:

- The ability to configure the matching input set and masks for the entire
  device; PMDs should take care of it automatically according to flow rules.

- Returning four or eight bytes of matched data when using flex bytes
  filtering. Although a specific action could implement it, it conflicts
  with the much more useful 32-bit tagging on devices that support it.

- Side effects on RSS processing of the entire device. Flow rules that
  conflict with the current device configuration should not be
  allowed. Similarly, device configuration should not be allowed when it
  affects existing flow rules.

- Device modes of operation. "none" is unsupported since filtering cannot be
  disabled as long as a flow rule is present.

- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
  according to the created flow rules.

+----------------------------------------------+
| FDIR                                         |
+---------------------------------+------------+
| Pattern                         | Actions    |
+===+============+==========+=====+============+
| 0 | ETH,       | ``spec`` | any | QUEUE,     |
|   | RAW        +----------+-----+ DROP,      |
|   |            | ``mask`` | any | PASSTHRU   |
+---+------------+----------+-----+------------+
| 1 | IPV4,      | ``spec`` | any | ID         |
|   | IPV6       +----------+-----+ (optional) |
|   |            | ``mask`` | any |            |
+---+------------+----------+-----+            |
| 2 | TCP,       | ``spec`` | any |            |
|   | UDP,       +----------+-----+            |
|   | SCTP       | ``mask`` | any |            |
+---+------------+----------+-----+            |
| 3 | VF,        | ``spec`` | any |            |
|   | PF,        +----------+-----+            |
|   | SIGNATURE  | ``mask`` | any |            |
|   | (optional) |          |     |            |
+---+------------+----------+-----+------------+

``HASH``
~~~~~~~~

Hashing configuration is set per rule through the `SIGNATURE`_ item.

Since it is usually a global device setting, all flow rules created with
this item may have to share the same specification.

``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All packets are matched. This type alters incoming packets to encapsulate
them in a chosen tunnel type and optionally redirects them to a VF as well.

The destination pool for tag based forwarding can be emulated with other
flow rules using `DUP`_ as the action.

+----------------------------------------+
| L2_TUNNEL                              |
+---------------------------+------------+
| Pattern                   | Actions    |
+===+======+==========+=====+============+
| 0 | VOID | ``spec`` | N/A | VXLAN,     |
|   |      |          |     | GENEVE,    |
|   |      |          |     | ...        |
|   |      +----------+-----+------------+
|   |      | ``mask`` | N/A | VF         |
|   |      |          |     | (optional) |
+---+------+----------+-----+------------+

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
@ 2016-07-07  7:14 ` Lu, Wenzhuo
  2016-07-07 10:26   ` Adrien Mazarguil
  2016-07-07 23:15 ` Chandran, Sugesh
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-07-07  7:14 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Adrien,
I have some questions, please see inline, thanks.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, July 6, 2016 2:17 AM
> To: dev@dpdk.org
> Cc: Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody; Ajit Khaparde;
> Rahul Lakkireddy; Lu, Wenzhuo; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: [RFC] Generic flow director/filtering/classification API
> 
> 
> Requirements for a new API:
> 
> - Flexible and extensible without causing API/ABI problems for existing
>   applications.
> - Should be unambiguous and easy to use.
> - Support existing filtering features and actions listed in `Filter types`_.
> - Support packet alteration.
> - In case of overlapping filters, their priority should be well documented.
Does that mean we don't guarantee consistent priority? The priority can differ between NICs, so the behavior of the actions can differ too. Right?
It seems users still need to be aware of some HW details. Do we need to add negotiation for the priority?

> 
> Flow rules can have several distinct actions (such as counting,
> encapsulating, decapsulating before redirecting packets to a particular
> queue, etc.), instead of relying on several rules to achieve this and having
> applications deal with hardware implementation details regarding their
> order.
I think HW normally doesn't support several actions in one rule. If a rule has several actions, it seems HW has to split it into several rules. The order can still be a problem.

> 
> ``ETH``
> ^^^^^^^
> 
> Matches an Ethernet header.
> 
> - ``dst``: destination MAC.
> - ``src``: source MAC.
> - ``type``: EtherType.
> - ``tags``: number of 802.1Q/ad tags defined.
> - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> 
>  - ``tpid``: Tag protocol identifier.
>  - ``tci``: Tag control information.
"ETH" means all the parameters, dst, src, type... need to be matched? The same question for IPv4, IPv6 ...

> 
> ``UDP``
> ^^^^^^^
> 
> Matches a UDP header.
> 
> - ``sport``: source port.
> - ``dport``: destination port.
> - ``length``: UDP length.
> - ``checksum``: UDP checksum.
Why checksum? Do we need to filter the packets by checksum value?

> 
> ``VOID`` (action)
> ^^^^^^^^^^^^^^^^^
> 
> Used as a placeholder for convenience. It is ignored and simply discarded by
> PMDs.
I don't understand why we need VOID. If it's about the format, why not guarantee it in the rte layer?

> 
> Behavior
> --------
> 
> - API operations are synchronous and blocking (``EAGAIN`` cannot be
>   returned).
> 
> - There is no provision for reentrancy/multi-thread safety, although nothing
>   should prevent different devices from being configured at the same
>   time. PMDs may protect their control path functions accordingly.
> 
> - Stopping the data path (TX/RX) should not be necessary when managing flow
>   rules. If this cannot be achieved naturally or with workarounds (such as
>   temporarily replacing the burst function pointers), an appropriate error
>   code must be returned (``EBUSY``).
A PMD cannot stop the data path without adding a lock. So I think if some rules cannot be applied without stopping RX/TX, the PMD has to return failure,
or let the application stop the data path.

> 
> - PMDs, not applications, are responsible for maintaining flow rules
>   configuration when stopping and restarting a port or performing other
>   actions which may affect them. They can only be destroyed explicitly.
I don't understand "They can only be destroyed explicitly." If a new rule conflicts with an old one, what should we do? Return failure?

> 
> ``ANY`` pattern item
> ~~~~~~~~~~~~~~~~~~~~
> 
> This pattern item stands for anything, which can be difficult to translate
> to something hardware would understand, particularly if followed by more
> specific types.
> 
> Consider the following pattern:
> 
> +---+--------------------------------+
> | 0 | ETHER                          |
> +---+--------------------------------+
> | 1 | ANY (``min`` = 1, ``max`` = 1) |
> +---+--------------------------------+
> | 2 | TCP                            |
> +---+--------------------------------+
> 
> Knowing that TCP does not make sense with something other than IPv4 and IPv6
> as L3, such a pattern may be translated to two flow rules instead:
> 
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV4 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
> 
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV6 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
> 
> Note that as soon as an ANY rule covers several layers, this approach may
> yield a large number of hidden flow rules. It is thus suggested to only
> support the most common scenarios (anything as L2 and/or L3).
I think "any" may make things confusing. What if the NIC doesn't support IPv6? Should we return failure for this rule?

> 
> Flow rules priority
> ~~~~~~~~~~~~~~~~~~~
> 
> While it would naturally make sense, flow rules cannot be assumed to be
> processed by hardware in the same order as their creation for several
> reasons:
> 
> - They may be managed internally as a tree or a hash table instead of a
>   list.
> - Removing a flow rule before adding another one can either put the new rule
>   at the end of the list or reuse a freed entry.
> - Duplication may occur when packets are matched by several rules.
> 
> For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
> predictable behavior is only guaranteed by using different priority levels.
> 
> Priority levels are not necessarily implemented in hardware, or may be
> severely limited (e.g. a single priority bit).
> 
> For these reasons, priority levels may be implemented purely in software by
> PMDs.
> 
> - For devices expecting flow rules to be added in the correct order, PMDs
>   may destroy and re-create existing rules after adding a new one with
>   a higher priority.
> 
> - A configurable number of dummy or empty rules can be created at
>   initialization time to save high priority slots for later.
> 
> - In order to save priority levels, PMDs may evaluate whether rules are
>   likely to collide and adjust their priority accordingly.
If there are 3 rules, r1, r2 and r3, and the rules say the priority is r1 > r2 > r3, but the PMD can only support r1 > r3 > r2, or doesn't support r2, should the PMD apply r1 and r3, or reject them all?

A generic question: is the parsing supposed to be done by rte or by the PMD?


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-07  7:14 ` Lu, Wenzhuo
@ 2016-07-07 10:26   ` Adrien Mazarguil
  2016-07-19  8:11     ` Lu, Wenzhuo
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-07 10:26 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Lu Wenzhuo,

Thanks for your feedback, I'm replying below as well.

On Thu, Jul 07, 2016 at 07:14:18AM +0000, Lu, Wenzhuo wrote:
> Hi Adrien,
> I have some questions, please see inline, thanks.
> 
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Wednesday, July 6, 2016 2:17 AM
> > To: dev@dpdk.org
> > Cc: Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody; Ajit Khaparde;
> > Rahul Lakkireddy; Lu, Wenzhuo; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> > Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> > Guarch, Pablo; Olga Shern
> > Subject: [RFC] Generic flow director/filtering/classification API
> > 
> > 
> > Requirements for a new API:
> > 
> > - Flexible and extensible without causing API/ABI problems for existing
> >   applications.
> > - Should be unambiguous and easy to use.
> > - Support existing filtering features and actions listed in `Filter types`_.
> > - Support packet alteration.
> > - In case of overlapping filters, their priority should be well documented.
> Does that mean we don't guarantee consistent priorities? The priority can be different on different NICs, so the behavior of the actions can be different. Right?

No, the intent is precisely to define what happens in order to get a
consistent result across different devices, and document cases with
undefined behavior. There must be no room left for interpretation.

For example, the API must describe what happens when two overlapping filters
(e.g. one matching an Ethernet header, another one matching an IP header)
match a given packet at a given priority level.

It is documented in section 4.1.1 (priorities) as "undefined behavior".
Applications remain free to do it and deal with consequences, at least they
know they cannot expect a consistent outcome, unless they use different
priority levels for both rules, see also 4.4.5 (flow rules priority).

> It seems users still need to be aware of some details of the HW? Do we need to add negotiation for the priority?

Priorities as defined in this document may not be directly mappable to HW
capabilities (e.g. HW does not support enough priorities, or some corner
cases make them not work as described), in which case the PMD may choose to
simulate priorities (again 4.4.5), as long as the end result follows the
specification.

So users need not be aware of HW details; the PMD is, and must perform
the needed workarounds to suit their expectations. Users may only be
impacted by errors while attempting to create rules that are either
unsupported or would cause them (or existing rules) to diverge from the
spec.
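
As a rough illustration of what "simulating priorities" in software could
look like, here is a minimal sketch. It is not taken from the draft header:
``struct sw_rule``, ``sw_table_insert()`` and ``sw_table_lookup()`` are
hypothetical names, and the convention that a lower value means a higher
priority is an assumption made for the example. The PMD keeps its rules
sorted and would program (or re-program) the device in that order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of a flow rule; not part of any DPDK API. */
struct sw_rule {
	uint32_t priority;  /* lower value = higher priority (assumption) */
	uint16_t ethertype; /* 0 matches any EtherType */
	uint16_t queue;     /* target queue */
};

#define SW_RULES_MAX 16

struct sw_table {
	struct sw_rule rules[SW_RULES_MAX];
	size_t count;
};

/* Insert while keeping the table sorted by priority; -1 when full. */
static int sw_table_insert(struct sw_table *t, struct sw_rule r)
{
	size_t i;

	if (t->count == SW_RULES_MAX)
		return -1;
	/* Shift lower-priority rules down to make room for the new one. */
	for (i = t->count; i > 0 && t->rules[i - 1].priority > r.priority; i--)
		t->rules[i] = t->rules[i - 1];
	t->rules[i] = r;
	t->count++;
	return 0;
}

/* First match in priority order wins; NULL when nothing matches. */
static const struct sw_rule *sw_table_lookup(const struct sw_table *t,
					     uint16_t ethertype)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->rules[i].ethertype == 0 ||
		    t->rules[i].ethertype == ethertype)
			return &t->rules[i];
	return NULL;
}
```

With a catch-all rule at priority 5 and an IPv4 rule at priority 1, an IPv4
lookup returns the IPv4 rule regardless of the order in which the two rules
were inserted.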

> > Flow rules can have several distinct actions (such as counting,
> > encapsulating, decapsulating before redirecting packets to a particular
> > queue, etc.), instead of relying on several rules to achieve this and having
> > applications deal with hardware implementation details regarding their
> > order.
> I think normally HW doesn't support several actions in one rule. If a rule has several actions, it seems HW has to split it into several rules. The order can still be a problem.

Note that, except for a very small subset of pattern items and actions,
supporting multiple actions for a given rule is not mandatory, and can be
emulated as you said by splitting them into several rules, each with its
own priority, if possible (see 3.3 "high level design").

Also, a rule "action" as defined in this API can be just about anything, for
example combining a queue redirection with 32-bit tagging. FDIR supports
many cases of what can be described as several actions, see 5.7 "FDIR to
most item types → QUEUE, DROP, PASSTHRU".

If you were thinking about having two queue targets for a given rule, then
I agree with you - that is why a rule cannot have more than a single action
of a given type (see 4.1.5 actions), to avoid such abuse from applications.
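
The "at most one action of a given type" constraint can be sketched as a
simple validation pass. This is only an illustration under stated
assumptions: the ``enum act_type`` values, ``struct act`` and
``acts_validate()`` are hypothetical, the list is assumed to be
END-terminated, and letting VOID placeholders repeat is my own reading.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical action types, loosely named after the RFC's actions. */
enum act_type {
	ACT_END = 0,
	ACT_VOID,
	ACT_PASSTHRU,
	ACT_QUEUE,
	ACT_DROP,
	ACT_COUNT,
	ACT_TYPE_MAX
};

struct act {
	enum act_type type;
	uint32_t arg; /* e.g. queue index; meaning depends on the type */
};

/*
 * Check an END-terminated action list.
 * Returns 0 if valid, -EINVAL if a type is unknown or appears twice.
 */
static int acts_validate(const struct act *list)
{
	uint32_t seen = 0;

	for (; list->type != ACT_END; list++) {
		unsigned int t = (unsigned int)list->type;

		if (t >= ACT_TYPE_MAX)
			return -EINVAL;
		if (list->type == ACT_VOID)
			continue; /* placeholders may repeat (assumption) */
		if (seen & (UINT32_C(1) << t))
			return -EINVAL; /* second action of the same type */
		seen |= UINT32_C(1) << t;
	}
	return 0;
}
```

A list such as COUNT, VOID, VOID, QUEUE passes, while QUEUE followed by a
second QUEUE is rejected.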

Applications must use several pass-through rules with different priority
levels if they want to perform a given action several times on a given
packet. Again, PMD support is not mandatory as pass-through is optional.

> > ``ETH``
> > ^^^^^^^
> > 
> > Matches an Ethernet header.
> > 
> > - ``dst``: destination MAC.
> > - ``src``: source MAC.
> > - ``type``: EtherType.
> > - ``tags``: number of 802.1Q/ad tags defined.
> > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > 
> >  - ``tpid``: Tag protocol identifier.
> >  - ``tci``: Tag control information.
> Does "ETH" mean that all the parameters (dst, src, type, ...) need to be matched? The same question applies to IPv4, IPv6, ...

Yes, it's basically the description of all Ethernet header fields including
VLAN tags (same for other protocols). Please see the linked draft header
file which should make all of this easier to understand:

 https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h

> > ``UDP``
> > ^^^^^^^
> > 
> > Matches a UDP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - ``length``: UDP length.
> > - ``checksum``: UDP checksum.
> Why checksum? Do we need to filter the packets by checksum value?

Well, I've decided to include all protocol header fields for completeness
(so the ABI does not need to be broken later when they become necessary, or
require another pattern item), not that all of them make sense in a pattern.

In this specific case, all PMDs I know of must reject a pattern
specification with a nonzero mask for the checksum field, because none of
them support it.
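
To make the mask rule concrete, here is a minimal sketch of the check a PMD
might perform. The structure layout, the function name and the ``ENOTSUP``
error code are assumptions made for illustration; only the field names come
from the UDP item description above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical layout mirroring the UDP item fields listed in the RFC. */
struct udp_fields {
	uint16_t sport;
	uint16_t dport;
	uint16_t length;
	uint16_t checksum;
};

/*
 * Decide whether a PMD that cannot match on the checksum accepts a mask.
 * A zero bit in the mask means "ignore this bit of the field".
 */
static int udp_mask_check(const struct udp_fields *mask)
{
	if (mask->checksum != 0)
		return -ENOTSUP; /* field exists in the ABI but is unmatchable */
	return 0;
}
```

A mask covering only ``sport`` and ``dport`` is accepted; setting any bit in
``checksum`` makes the whole pattern specification fail.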

> > ``VOID`` (action)
> > ^^^^^^^^^^^^^^^^^
> > 
> > Used as a placeholder for convenience. It is ignored and simply discarded by
> > PMDs.
> I don't understand why we need VOID. If it's about the format, why not guarantee it in the rte layer?

I'm not sure I understand your question about the rte layer, but this type is
fully managed by the PMD and is not supposed to be translated to a hardware
action.

I think it may come in handy in some cases (like the VOID pattern item), so it
is defined just in case. Should be relatively trivial to support.

Applications may find a use for it when they want to statically define
templates for flow rules, when they need room for some reason.
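
One possible use of VOID in a static template could look like the sketch
below. Everything in it is hypothetical (types, names and the fixed
three-slot layout); it only shows a reserved slot that is either left as
VOID or turned into a real action when the template is instantiated.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types for the template example. */
enum tmpl_act_type { TMPL_END = 0, TMPL_VOID, TMPL_QUEUE, TMPL_COUNT };

struct tmpl_act {
	enum tmpl_act_type type;
	uint32_t arg;
};

/* Static template: slot 0 is reserved room, slot 1 is always QUEUE. */
static const struct tmpl_act tmpl_default[] = {
	{ TMPL_VOID, 0 }, /* placeholder, may become COUNT */
	{ TMPL_QUEUE, 0 },
	{ TMPL_END, 0 },
};

/* Copy the template and optionally enable counting in the reserved slot. */
static void tmpl_instantiate(struct tmpl_act out[3], uint32_t queue, int count)
{
	for (size_t i = 0; i < 3; i++)
		out[i] = tmpl_default[i];
	if (count)
		out[0].type = TMPL_COUNT;
	out[1].arg = queue;
}

/* What a PMD would act upon: list length once VOID entries are skipped. */
static size_t tmpl_effective_len(const struct tmpl_act *list)
{
	size_t n = 0;

	for (; list->type != TMPL_END; list++)
		if (list->type != TMPL_VOID)
			n++;
	return n;
}
```

The application never has to shift array entries around: the layout stays
the same whether or not the optional action is enabled.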

> > Behavior
> > --------
> > 
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> > 
> > - There is no provision for reentrancy/multi-thread safety, although nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> > 
> > - Stopping the data path (TX/RX) should not be necessary when managing flow
> >   rules. If this cannot be achieved naturally or with workarounds (such as
> >   temporarily replacing the burst function pointers), an appropriate error
> >   code must be returned (``EBUSY``).
> The PMD cannot stop the data path without adding a lock. So I think if some rules cannot be applied without stopping RX/TX, the PMD has to return failure.
> Or let the application stop the data path.

Agreed, that is the intent. If the PMD cannot touch flow rules for some
reason even after trying really hard, then it just returns EBUSY.

Perhaps we should write down that applications may get a different outcome
after stopping the data path if they get EBUSY?
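
From the application side, that could translate to something like the
following sketch. The ``struct fake_port`` model and both functions are
hypothetical stand-ins, not proposed API; the point is only that the PMD
reports ``EBUSY`` and the application decides whether to stop the data path
and retry.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a port; a real PMD keeps far more state. */
struct fake_port {
	bool running;          /* data path active */
	bool needs_quiescence; /* device cannot update rules while running */
	int rules;             /* number of rules programmed */
};

/* Models a PMD entry point: returns -EBUSY rather than stopping RX/TX. */
static int fake_rule_create(struct fake_port *p)
{
	if (p->running && p->needs_quiescence)
		return -EBUSY;
	p->rules++;
	return 0;
}

/* Application policy: on -EBUSY, stop the port, retry once, restart. */
static int app_rule_create(struct fake_port *p)
{
	int ret = fake_rule_create(p);

	if (ret != -EBUSY)
		return ret;
	p->running = false; /* application chooses to stop the data path */
	ret = fake_rule_create(p);
	p->running = true;  /* restart regardless of the outcome */
	return ret;
}
```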

> > - PMDs, not applications, are responsible for maintaining flow rules
> >   configuration when stopping and restarting a port or performing other
> >   actions which may affect them. They can only be destroyed explicitly.
> I don't understand "They can only be destroyed explicitly."

This part says that as long as an application has not called
rte_flow_destroy() on a flow rule, it never disappears, whatever happens to
the port (stopped, restarted). The application is not responsible for
re-creating rules after that.

Note that according to the specification, this may translate to not being
able to stop a port as long as a flow rule is present, depending on how nice
the PMD intends to be with applications. Implementation can be done in small
steps with minimal amount of code on the PMD side.

> If a new rule conflicts with an old one, what should we do? Return failure?

That should not happen. If say 100 rules have been created with various
priorities and the port is happily running with them, stopping the port may
require the PMD to destroy them, it then has to re-create all 100 of them
exactly as they were automatically when restarting the port.

If re-creating them is not possible for some reason, the port cannot be
restarted as long as rules that cannot be added back haven't been destroyed
by the application. Frankly, this should not happen.

To manage this case, I suggest preventing applications from doing things
that conflict with existing flow rules while the port is stopped (just like
when it is not stopped, as described in 5.7 "FDIR to most item types").
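
A very small model of the stop/restart behavior described above, assuming
the PMD keeps a software shadow copy of every rule. All names are
hypothetical, the hardware table is simulated by a slot counter, and
``ENOSPC`` is an arbitrary choice of error code for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SHADOW_MAX 8

/* Hypothetical per-port shadow copy of rules, kept by the PMD in software. */
struct shadow_port {
	int rule_ids[SHADOW_MAX]; /* application-visible rules */
	size_t count;             /* rules the application has not destroyed */
	size_t hw_slots;          /* capacity of the simulated hardware table */
	size_t hw_used;
	bool started;
};

/* Stopping flushes hardware state but keeps the shadow list intact. */
static void shadow_port_stop(struct shadow_port *p)
{
	p->hw_used = 0;
	p->started = false;
}

/* Starting replays every rule; the port stays stopped if a replay fails. */
static int shadow_port_start(struct shadow_port *p)
{
	for (size_t i = 0; i < p->count; i++) {
		if (p->hw_used == p->hw_slots) {
			p->hw_used = 0; /* undo the partial replay */
			return -ENOSPC; /* error code is an assumption */
		}
		p->hw_used++;
	}
	p->started = true;
	return 0;
}
```

The application-visible rule count never changes across a stop/start cycle;
only an explicit destroy call (not modeled here) would reduce it.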

> > ``ANY`` pattern item
> > ~~~~~~~~~~~~~~~~~~~~
> > 
> > This pattern item stands for anything, which can be difficult to translate
> > to something hardware would understand, particularly if followed by more
> > specific types.
> > 
> > Consider the following pattern:
> > 
> > +---+--------------------------------+
> > | 0 | ETHER                          |
> > +---+--------------------------------+
> > | 1 | ANY (``min`` = 1, ``max`` = 1) |
> > +---+--------------------------------+
> > | 2 | TCP                            |
> > +---+--------------------------------+
> > 
> > Knowing that TCP does not make sense with something other than IPv4 and IPv6
> > as L3, such a pattern may be translated to two flow rules instead:
> > 
> > +---+--------------------+
> > | 0 | ETHER              |
> > +---+--------------------+
> > | 1 | IPV4 (zeroed mask) |
> > +---+--------------------+
> > | 2 | TCP                |
> > +---+--------------------+
> > 
> > +---+--------------------+
> > | 0 | ETHER              |
> > +---+--------------------+
> > | 1 | IPV6 (zeroed mask) |
> > +---+--------------------+
> > | 2 | TCP                |
> > +---+--------------------+
> > 
> > Note that as soon as an ANY rule covers several layers, this approach may
> > yield a large number of hidden flow rules. It is thus suggested to only
> > support the most common scenarios (anything as L2 and/or L3).
> I think "any" may make things confusing. What if the NIC doesn't support IPv6? Should we return failure for this rule?

In a sense you are right, ANY relies on HW capabilities so you cannot know
that it won't match unsupported protocols. The above example would be
somewhat useless for a conscientious application which should really have
created two separate flow rules (and gotten an error on the IPv6 one).

So an ANY flow rule only able to match v4 packets won't return an error.

ANY can be useful to match outer packets when only a tunnel header and the
inner packet are meaningful to the application. HW that does not recognize
the outer packet is not able to recognize the inner one anyway.

This section only says that PMDs should do their best to make HW match what
they can when faced with ANY.

Also once again, ANY support is not mandatory.
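
The ETHER / ANY / TCP expansion from the quoted section can be sketched as
follows. The item type names follow the tables in the RFC, but the enum, the
fixed ``PATTERN_MAX`` bound and ``any_expand()`` are hypothetical. The list
of L3 types passed in stands for whatever the device supports, so a v4-only
device simply yields a single rule and no error.

```c
#include <stddef.h>

/* Hypothetical item types; names follow the tables in the RFC. */
enum item_type {
	ITEM_END = 0,
	ITEM_ETHER,
	ITEM_ANY,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP
};

#define PATTERN_MAX 8

/*
 * Expand the first ANY item of an END-terminated pattern into one concrete
 * pattern per supported L3 type. Returns the number of patterns written to
 * out (0 if the pattern has no ANY item or is too long).
 */
static size_t any_expand(const enum item_type *in,
			 const enum item_type *l3, size_t l3_count,
			 enum item_type out[][PATTERN_MAX])
{
	size_t len = 0, any = PATTERN_MAX;

	while (in[len] != ITEM_END) {
		if (in[len] == ITEM_ANY && any == PATTERN_MAX)
			any = len;
		if (++len >= PATTERN_MAX)
			return 0;
	}
	if (any == PATTERN_MAX)
		return 0;
	for (size_t n = 0; n < l3_count; n++) {
		for (size_t i = 0; i <= len; i++) /* copies the END item too */
			out[n][i] = in[i];
		out[n][any] = l3[n];
	}
	return l3_count;
}
```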

> > Flow rules priority
> > ~~~~~~~~~~~~~~~~~~~
> > 
> > While it would naturally make sense, flow rules cannot be assumed to be
> > processed by hardware in the same order as their creation for several
> > reasons:
> > 
> > - They may be managed internally as a tree or a hash table instead of a
> >   list.
> > - Removing a flow rule before adding another one can either put the new rule
> >   at the end of the list or reuse a freed entry.
> > - Duplication may occur when packets are matched by several rules.
> > 
> > For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
> > predictable behavior is only guaranteed by using different priority levels.
> > 
> > Priority levels are not necessarily implemented in hardware, or may be
> > severely limited (e.g. a single priority bit).
> > 
> > For these reasons, priority levels may be implemented purely in software by
> > PMDs.
> > 
> > - For devices expecting flow rules to be added in the correct order, PMDs
> >   may destroy and re-create existing rules after adding a new one with
> >   a higher priority.
> > 
> > - A configurable number of dummy or empty rules can be created at
> >   initialization time to save high priority slots for later.
> > 
> > - In order to save priority levels, PMDs may evaluate whether rules are
> >   likely to collide and adjust their priority accordingly.
> If there are 3 rules, r1, r2 and r3, and the rules say the priority is r1 > r2 > r3, but the PMD can only support r1 > r3 > r2, or doesn't support r2, should the PMD apply r1 and r3, or reject them all?

Remember that the API lets applications create only one rule at a time. If
all 3 rules are not supported together but individually are, the answer
depends on what the application does:

1. r1 OK, r2 FAIL => application chooses to stop here, thus only r1 works as
  expected (may roll back and remove r1 as a result).

2. r1 OK, r2 FAIL, r3 OK => application chooses to ignore the fact r2 failed
  and added r3 anyway, so it should end up with r1 > r3.

Applications should do as described in 1, they need to check for errors if
they want consistency.

This document describes only the basic functions, but may be extended later
with methods to add several flow rules at once, so rules that depend on
others can be added together and a single failure is returned without the
need for a rollback at the application level.
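
Until such a method exists, the rollback from case 1 can be done by the
application along these lines. This is only a sketch: the callback types and
``rules_create_all()`` are hypothetical, and the toy device exists purely to
exercise the helper.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for rule creation and destruction. */
typedef int (*rule_create_fn)(void *dev, int rule);
typedef void (*rule_destroy_fn)(void *dev, int rule);

/*
 * Create rules in order; on the first failure destroy the ones already
 * created, newest first, and report the error. Returns 0 on success.
 */
static int rules_create_all(void *dev, const int *rules, size_t n,
			    rule_create_fn create, rule_destroy_fn destroy)
{
	for (size_t i = 0; i < n; i++) {
		int ret = create(dev, rules[i]);

		if (ret != 0) {
			while (i-- > 0)
				destroy(dev, rules[i]);
			return ret;
		}
	}
	return 0;
}

/* Toy device: rejects one rule id and counts active rules. */
struct toy_dev {
	int reject;
	int active;
};

static int toy_create(void *dev, int rule)
{
	struct toy_dev *d = dev;

	if (rule == d->reject)
		return -1;
	d->active++;
	return 0;
}

static void toy_destroy(void *dev, int rule)
{
	struct toy_dev *d = dev;

	(void)rule;
	d->active--;
}
```

If r2 is rejected, r1 is removed again and the application ends up with no
rules instead of a partial, inconsistent set.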

> A generic question: is the parsing supposed to be done by rte or by the PMD?

Actually, a bit of both. EAL will certainly at least provide helpers to
assist PMDs. This specification defines only the public-facing API for now,
but our goal is really to have something that is not too difficult to
implement both for applications and PMDs.

These helpers can be defined later with the first implementation.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
  2016-07-07  7:14 ` Lu, Wenzhuo
@ 2016-07-07 23:15 ` Chandran, Sugesh
  2016-07-08 13:03   ` Adrien Mazarguil
  2016-07-08 11:11 ` Liang, Cunming
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-07 23:15 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Adrien,

Thank you for proposing this. It would be really useful for applications such as OVS-DPDK.
Please find my comments and questions inline below, prefixed with [Sugesh]. Most of them are from the perspective of enabling these APIs in applications such as OVS-DPDK.

Regards
_Sugesh


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, July 5, 2016 7:17 PM
> To: dev@dpdk.org
> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; Zhang, Helin
> <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Rasesh
> Mody <rasesh.mody@qlogic.com>; Ajit Khaparde
> <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> Subject: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
> 
> Hi All,
> 
> First, forgive me for this large message, I know our mailboxes already
> suffer quite a bit from the amount of traffic on this ML.
> 
> This is not exactly yet another thread about how flow director should be
> extended, rather about a brand new API to handle filtering and
> classification for incoming packets in the most PMD-generic and
> application-friendly fashion we can come up with. Reasons described below.
> 
> I think this topic is important enough to include both the users of this API
> as well as PMD maintainers. So far I have CC'ed librte_ether (especially
> rte_eth_ctrl.h contributors), testpmd and PMD maintainers (with and without
> a .filter_ctrl implementation), but if you know application maintainers
> other than testpmd who use FDIR or might be interested in this discussion,
> feel free to add them.
> 
> The issues we found with the current approach are already summarized in the
> following document, but here is a quick summary for TL;DR folks:
> 
> - PMDs do not expose a common set of filter types and even when they do,
>   their behavior more or less differs.
> 
> - Applications need to determine and adapt to device-specific limitations
>   and quirks on their own, without help from PMDs.
> 
> - Writing an application that creates flow rules targeting all devices
>   supported by DPDK is thus difficult, if not impossible.
> 
> - The current API has too many unspecified areas (particularly regarding
>   side effects of flow rules) that make PMD implementation tricky.
> 
> This RFC API handles everything currently supported by .filter_ctrl, the
> idea being to reimplement all of these to make them fully usable by
> applications in a more generic and well defined fashion. It has a very small
> set of mandatory features and an easy method to let applications probe for
> supported capabilities.
> 
> The only downside is more work for the software control side of PMDs because
> they have to adapt to the API instead of the reverse. I think helpers can be
> added to EAL to assist with this.
> 
> HTML version:
> 
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.html
> 
> PDF version:
> 
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf
> 
> Related draft header file (for reference while reading the specification):
> 
>  https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h
> 
> Git tree for completeness (latest .rst version can be retrieved from here):
> 
>  https://github.com/6WIND/rte_flow
> 
> What follows is the ReST source of the above, for inline comments and
> discussion. I intend to update that specification accordingly.
> 
> ========================
> Generic filter interface
> ========================
> 
> .. footer::
> 
>    v0.6
> 
> .. contents::
> .. sectnum::
> .. raw:: pdf
> 
>    PageBreak
> 
> Overview
> ========
> 
> DPDK provides several competing interfaces added over time to perform packet
> matching and related actions such as filtering and classification.
> 
> They must be extended to implement the features supported by newer devices
> in order to expose them to applications, however the current design has
> several drawbacks:
> 
> - Complicated filter combinations which have not been hard-coded cannot be
>   expressed.
> - Prone to API/ABI breakage when new features must be added to an existing
>   filter type, which frequently happens.
> 
> From an application point of view:
> 
> - Having disparate interfaces, all optional and lacking in features does not
>   make this API easy to use.
> - Seemingly arbitrary built-in limitations of filter types based on the
>   device they were initially designed for.
> - Undefined relationship between different filter types.
> - High complexity, considerable undocumented and/or undefined behavior.
> 
> Considering the growing number of devices supported by DPDK, adding a new
> filter type each time a new feature must be implemented is not sustainable
> in the long term. Applications not written to target a specific device
> cannot really benefit from such an API.
> 
> For these reasons, this document defines an extensible unified API that
> encompasses and supersedes these legacy filter types.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Current API
> ===========
> 
> Rationale
> ---------
> 
> The reason several competing (and mostly overlapping) filtering APIs are
> present in DPDK is due to its nature as a thin layer between hardware and
> software.
> 
> Each subsequent interface has been added to better match the capabilities
> and limitations of the latest supported device, which usually happened to
> need an incompatible configuration approach. Because of this, many ended up
> device-centric and not usable by applications that were not written for that
> particular device.
> 
> This document is not the first attempt to address this proliferation issue,
> in fact a lot of work has already been done both to create a more generic
> interface while somewhat keeping compatibility with legacy ones through a
> common call interface (``rte_eth_dev_filter_ctrl()`` with the
> ``.filter_ctrl`` PMD callback in ``rte_ethdev.h``).
> 
> Today, these previously incompatible interfaces are known as filter types
> (``RTE_ETH_FILTER_*`` from ``enum rte_filter_type`` in ``rte_eth_ctrl.h``).
> 
> However while trivial to extend with new types, it only shifted the
> underlying problem as applications still need to be written for one kind of
> filter type, which, as described in the following sections, is not
> necessarily implemented by all PMDs that support filtering.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Filter types
> ------------
> 
> This section summarizes the capabilities of each filter type.
> 
> Although the following list is exhaustive, the description of individual
> types may contain inaccuracies due to the lack of documentation or usage
> examples.
> 
> Note: names are prefixed with ``RTE_ETH_FILTER_``.
> 
> ``MACVLAN``
> ~~~~~~~~~~~
> 
> Matching:
> 
> - L2 source/destination addresses.
> - Optional 802.1Q VLAN ID.
> - Masking individual fields on a rule basis is not supported.
> 
> Action:
> 
> - Packets are redirected either to a given VF device using its ID or to the
>   PF.
> 
> ``ETHERTYPE``
> ~~~~~~~~~~~~~
> 
> Matching:
> 
> - L2 source/destination addresses (optional).
> - Ethertype (no VLAN ID?).
> - Masking individual fields on a rule basis is not supported.
> 
> Action:
> 
> - Receive packets on a given queue.
> - Drop packets.
> 
> ``FLEXIBLE``
> ~~~~~~~~~~~~
> 
> Matching:
> 
> - At most 128 consecutive bytes anywhere in packets.
> - Masking is supported with byte granularity.
> - Priorities are supported (relative to this filter type, undefined
>   otherwise).
> 
> Action:
> 
> - Receive packets on a given queue.
> 
> ``SYN``
> ~~~~~~~
> 
> Matching:
> 
> - TCP SYN packets only.
> - One high priority bit can be set to give the highest possible priority to
>   this type when other filters with different types are configured.
> 
> Action:
> 
> - Receive packets on a given queue.
> 
> ``NTUPLE``
> ~~~~~~~~~~
> 
> Matching:
> 
> - Source/destination IPv4 addresses (optional in 2-tuple mode).
> - Source/destination TCP/UDP port (mandatory in 2 and 5-tuple modes).
> - L4 protocol (2 and 5-tuple modes).
> - Masking individual fields is supported.
> - TCP flags.
> - Up to 7 levels of priority relative to this filter type, undefined
>   otherwise.
> - No IPv6.
> 
> Action:
> 
> - Receive packets on a given queue.
> 
> ``TUNNEL``
> ~~~~~~~~~~
> 
> Matching:
> 
> - Outer L2 source/destination addresses.
> - Inner L2 source/destination addresses.
> - Inner VLAN ID.
> - IPv4/IPv6 source (destination?) address.
> - Tunnel type to match (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE, 802.1BR
>   E-Tag).
> - Tenant ID for tunneling protocols that have one.
> - Any combination of the above can be specified.
> - Masking individual fields on a rule basis is not supported.
> 
> Action:
> 
> - Receive packets on a given queue.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``FDIR``
> ~~~~~~~~
> 
> Queries:
> 
> - Device capabilities and limitations.
> - Device statistics about configured filters (resource usage, collisions).
> - Device configuration (matching input set and masks)
> 
> Matching:
> 
> - Device mode of operation: none (to disable filtering), signature
>   (hash-based dispatching from masked fields) or perfect (either MAC VLAN or
>   tunnel).
> - L2 Ethertype.
> - Outer L2 destination address (MAC VLAN mode).
> - Inner L2 destination address, tunnel type (NVGRE, VXLAN) and tunnel ID
>   (tunnel mode).
> - IPv4 source/destination addresses, ToS, TTL and protocol fields.
> - IPv6 source/destination addresses, TC, protocol and hop limits fields.
> - UDP source/destination IPv4/IPv6 and ports.
> - TCP source/destination IPv4/IPv6 and ports.
> - SCTP source/destination IPv4/IPv6, ports and verification tag field.
> - Note, only one protocol type at once (either only L2 Ethertype, basic
>   IPv6, IPv4+UDP, IPv4+TCP and so on).
> - VLAN TCI (extended API).
> - At most 16 bytes to match in payload (extended API). A global device
>   look-up table specifies for each possible protocol layer (unknown, raw,
>   L2, L3, L4) the offset to use for each byte (they do not need to be
>   contiguous) and the related bitmask.
> - Whether packet is addressed to PF or VF, in that case its ID can be
>   matched as well (extended API).
> - Masking most of the above fields is supported, but simultaneously affects
>   all filters configured on a device.
> - Input set can be modified in a similar fashion for a given device to
>   ignore individual fields of filters (i.e. do not match the destination
>   address in an IPv4 filter, refer to **RTE_ETH_INPUT_SET_**
>   macros). Configuring this also affects RSS processing on **i40e**.
> - Filters can also provide 32 bits of arbitrary data to return as part of
>   matched packets.
> 
> Action:
> 
> - **RTE_ETH_FDIR_ACCEPT**: receive (accept) packet on a given queue.
> - **RTE_ETH_FDIR_REJECT**: drop packet immediately.
> - **RTE_ETH_FDIR_PASSTHRU**: similar to accept for the last filter in list,
>   otherwise process it with subsequent filters.
> - For accepted packets and if requested by filter, either 32 bits of
>   arbitrary data and four bytes of matched payload (only in case of flex
>   bytes matching), or eight bytes of matched payload (flex also) are added
>   to meta data.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``HASH``
> ~~~~~~~~
> 
> Not an actual filter type. Provides and retrieves the global device
> configuration (per port or entire NIC) for hash functions and their
> properties.
> 
> Hash function selection: "default" (keep current), XOR or Toeplitz.
> 
> This function can be configured per flow type (**RTE_ETH_FLOW_**
> definitions), supported types are:
> 
> - Unknown.
> - Raw.
> - Fragmented or non-fragmented IPv4.
> - Non-fragmented IPv4 with L4 (TCP, UDP, SCTP or other).
> - Fragmented or non-fragmented IPv6.
> - Non-fragmented IPv6 with L4 (TCP, UDP, SCTP or other).
> - L2 payload.
> - IPv6 with extensions.
> - IPv6 with L4 (TCP, UDP) and extensions.
> 
> ``L2_TUNNEL``
> ~~~~~~~~~~~~~
> 
> Matching:
> 
> - All packets received on a given port.
> 
> Action:
> 
> - Add tunnel encapsulation (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE,
>   802.1BR E-Tag) using the provided Ethertype and tunnel ID (only E-Tag
>   is implemented at the moment).
> - VF ID to use for tag insertion (currently unused).
> - Destination pool for tag based forwarding (pools are IDs that can be
>   assigned to ports, duplication occurs if the same ID is shared by several
>   ports of the same NIC).
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Driver support
> --------------
> 
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
> Driver   MACVLAN ETHERTYPE FLEXIBLE SYN NTUPLE TUNNEL FDIR HASH L2_TUNNEL
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
> bnx2x
> cxgbe
> e1000            yes       yes      yes yes
> ena
> enic                                                  yes
> fm10k
> i40e     yes     yes                           yes    yes  yes
> ixgbe            yes                yes yes           yes       yes
> mlx4
> mlx5                                                  yes
> szedata2
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
> 
> Flow director
> -------------
> 
> Flow director (FDIR) is the name of the most capable filter type, which
> covers most features offered by others. As such, it is the most widespread
> in PMDs that support filtering (i.e. all of them besides **e1000**).
> 
> It is also the only type that allows an arbitrary 32-bit value provided by
> applications to be attached to a filter and returned with matching packets
> instead of relying on the destination queue to recognize flows.
> 
> Unfortunately, even FDIR requires applications to be aware of low-level
> capabilities and limitations (most of which come directly from **ixgbe** and
> **i40e**):
> 
> - Bitmasks are set globally per device (port?), not per filter.
[Sugesh] Does this mean the application cannot define filters that match on arbitrary, different offsets?
If that’s the case, I assume the application has to program the bitmask in advance. Otherwise, how
does the API framework deduce this bitmask information from the rules? It’s not very clear to me
how the application passes down the bitmask information for multiple filters on the same port.
> - Configuration state is not expected to be saved by the driver, and
>   stopping/restarting a port requires the application to perform it again
>   (API documentation is also unclear about this).
> - Monolithic approach with ABI issues as soon as a new kind of flow or
>   combination needs to be supported.
> - Cryptic global statistics/counters.
> - Unclear about how priorities are managed; filters seem to be arranged as a
>   linked list in hardware (possibly related to configuration order).
> 
> Packet alteration
> -----------------
> 
> One interesting feature is that the L2 tunnel filter type implements the
> ability to alter incoming packets through a filter (in this case to
> encapsulate them), thus the **mlx5** flow encap/decap features are not a
> foreign concept.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Proposed API
> ============
> 
> Terminology
> -----------
> 
> - **Filtering API**: overall framework affecting the fate of selected
>   packets, covers everything described in this document.
> - **Matching pattern**: properties to look for in received packets, a
>   combination of any number of items.
> - **Pattern item**: part of a pattern that either matches packet data
>   (protocol header, payload or derived information), or specifies properties
>   of the pattern itself.
> - **Actions**: what needs to be done when a packet matches a pattern.
> - **Flow rule**: this is the result of combining a *matching pattern* with
>   *actions*.
> - **Filter rule**: a less generic term than *flow rule*, can otherwise be
>   used interchangeably.
> - **Hit**: a flow rule is said to be *hit* when processing a matching
>   packet.
> 
> Requirements
> ------------
> 
> As described in the previous section, there is a growing need for a common
> method to configure filtering and related actions in a hardware independent
> fashion.
> 
> The filtering API should not disallow any filter combination by design and
> must remain as simple as possible to use. It can simply be defined as a
> method to perform one or several actions on selected packets.
> 
> PMDs are aware of the capabilities of the device they manage and should be
> responsible for preventing unsupported or conflicting combinations.
> 
> This approach is fundamentally different as it places most of the burden on
> the software side of the PMD instead of having device capabilities directly
> mapped to API functions, then expecting applications to work around
> ensuing compatibility issues.
> 
> Requirements for a new API:
> 
> - Flexible and extensible without causing API/ABI problems for existing
>   applications.
> - Should be unambiguous and easy to use.
> - Support existing filtering features and actions listed in `Filter types`_.
> - Support packet alteration.
> - In case of overlapping filters, their priority should be well documented.
> - Support filter queries (for example to retrieve counters).
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> High level design
> -----------------
> 
> The chosen approach to make filtering as generic as possible is by
> expressing matching patterns through lists of items instead of the flat
> structures used in DPDK today, enabling combinations that are not
> predefined and thus being more versatile.
> 
> Flow rules can have several distinct actions (such as counting,
> encapsulating, decapsulating before redirecting packets to a particular
> queue, etc.), instead of relying on several rules to achieve this and having
> applications deal with hardware implementation details regarding their
> order.
> 
> Support for different priority levels on a rule basis is provided, for
> example in order to force a more specific rule to come before a more
> generic one for packets matched by both. However, hardware support for
> more than a single priority level cannot be guaranteed. When supported,
> the number of available priority levels is usually low, which is why they
> can also be implemented in software by PMDs (e.g. to simulate missing
> priority levels by reordering rules).
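
A minimal sketch of how a PMD might emulate missing priority levels in software by reordering rules, assuming the hardware applies rules in table order. `struct sw_rule` and `sort_rules_by_priority()` are hypothetical names that do not appear in the proposal; priority 0 is treated as the highest level, as stated in the Priorities section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of a rule; not part of the proposed API. */
struct sw_rule {
	uint32_t priority; /* 0 is the highest priority level */
	int id;            /* stand-in for the rest of the rule */
};

/*
 * Stable insertion sort: lower priority values come first, and rules that
 * share a level keep their creation order.
 */
static void
sort_rules_by_priority(struct sw_rule *rules, size_t n)
{
	size_t i;

	assert(rules != NULL || n == 0);
	for (i = 1; i < n; ++i) {
		struct sw_rule cur = rules[i];
		size_t j = i;

		while (j > 0 && rules[j - 1].priority > cur.priority) {
			rules[j] = rules[j - 1];
			--j;
		}
		rules[j] = cur;
	}
}
```

A stable sort is used so that rules sharing a level are not reshuffled, although the text leaves their relative order undefined anyway.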
> 
> In order to remain as hardware agnostic as possible, by default all rules
> are considered to have the same priority, which means that the order
> between overlapping rules (when a packet is matched by several filters) is
> undefined, packet duplication may even occur as a result.
> 
> PMDs may refuse to create overlapping rules at a given priority level when
> they can be detected (e.g. if a pattern matches an existing filter).
> 
> Thus predictable results for a given priority level can only be achieved
> with non-overlapping rules, using perfect matching on all protocol layers.
> 
> Support for multiple actions per rule may be implemented internally on top
> of non-default hardware priorities, as a result both features may not be
> simultaneously available to applications.
> 
> Considering that allowed pattern/actions combinations cannot be known in
> advance and would result in an impractically large number of capabilities to
> expose, a method is provided to validate a given rule from the current
> device configuration state without actually adding it (akin to a "dry run"
> mode).
> 
> This enables applications to check if the rule types they need are
> supported at initialization time, before starting their data path. This
> method can be used anytime, its only requirement being that the resources
> needed by a rule must exist (e.g. a target RX queue must be configured
> first).
> 
> Each defined rule is associated with an opaque handle managed by the PMD,
> applications are responsible for keeping it. These can be used for queries
> and rules management, such as retrieving counters or other data and
> destroying them.
> 
> Handles must be destroyed before releasing associated resources such as
> queues.
> 
> Integration
> -----------
> 
> To avoid ABI breakage, this new interface will be implemented through the
> existing filtering control framework (``rte_eth_dev_filter_ctrl()``) using
> **RTE_ETH_FILTER_GENERIC** as a new filter type.
> 
> However a public front-end API described in `Rules management`_ will
> be added as the preferred method to use it.
> 
> Once discussions with the community have converged to a definite API,
> legacy filter types should be deprecated and a deadline defined to remove
> their support entirely.
> 
> PMDs will have to be gradually converted to **RTE_ETH_FILTER_GENERIC**
> or drop filtering support entirely. Less maintained PMDs for older
> hardware may lose support at this point.
> 
> The notion of filter type will then be deprecated and subsequently dropped
> to avoid confusion between both frameworks.
> 
> Implementation details
> ======================
> 
> Flow rule
> ---------
> 
> A flow rule is the combination of a matching pattern with a list of actions,
> and is the basis of this API.
> 
> Priorities
> ~~~~~~~~~~
> 
> A priority can be assigned to a matching pattern.
> 
> The default priority level is 0 and is also the highest. Support for more
> than a single priority level in hardware is not guaranteed.
> 
> If a packet is matched by several filters at a given priority level, the
> outcome is undefined. It can take any path and can even be duplicated.
> 
> Matching pattern
> ~~~~~~~~~~~~~~~~
> 
> A matching pattern comprises any number of items of various types.
> 
> Items are arranged in a list to form a matching pattern for packets. They
> fall in two categories:
> 
> - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, VXLAN and
>   so on), usually associated with a specification structure. These must be
>   stacked in the same order as the protocol layers to match, starting from
>   L2.
> 
> - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF,
>   SIGNATURE and so on), often without a specification structure. Since they
>   are meta data that does not match packet contents, these can be specified
>   anywhere within item lists without affecting the protocol matching items.
> 
> Most item specifications can be optionally paired with a mask to narrow the
> specific fields or bits to be matched.
> 
> - Items are defined with ``struct rte_flow_item``.
> - Patterns are defined with ``struct rte_flow_pattern``.
> 
> Example of an item specification matching an Ethernet header:
> 
> +-----------------------------------------+
> | Ethernet                                |
> +==========+=========+====================+
> | ``spec`` | ``src`` | ``00:01:02:03:04`` |
> |          +---------+--------------------+
> |          | ``dst`` | ``00:2a:66:00:01`` |
> +----------+---------+--------------------+
> | ``mask`` | ``src`` | ``00:ff:ff:ff:00`` |
> |          +---------+--------------------+
> |          | ``dst`` | ``00:00:00:00:ff`` |
> +----------+---------+--------------------+
> 
> Non-masked bits stand for any value, Ethernet headers with the following
> properties are thus matched:
> 
> - ``src``: ``??:01:02:03:??``
> - ``dst``: ``??:??:??:??:01``
> 
> Except for meta types that do not need one, ``spec`` must be a valid pointer
> to a structure of the related item type. A ``mask`` of the same type can be
> provided to tell which bits in ``spec`` are to be matched.
> 
> A mask is normally only needed for ``spec`` fields matching packet data,
> ignored otherwise. See individual item types for more information.
> 
> A ``NULL`` mask pointer is allowed and is similar to matching with a full
> mask (all ones) on the ``spec`` fields supported by hardware. The
> remaining fields are ignored (as if masked with all zeroes), so there is
> no error checking for unsupported fields.
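
The spec/mask semantics above can be sketched as a byte-wise comparison. This is only an illustration: `field_matches()` is a hypothetical helper, and treating a `NULL` mask as all ones assumes every byte of the field is supported by the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare a packet field against spec under mask. Bits cleared in mask
 * match any value. A NULL mask is treated as all ones here (assumption:
 * every byte of the field is supported by the device).
 */
static bool
field_matches(const uint8_t *pkt, const uint8_t *spec, const uint8_t *mask,
	      size_t len)
{
	size_t i;

	assert(pkt != NULL && spec != NULL);
	for (i = 0; i < len; ++i) {
		uint8_t m = mask ? mask[i] : 0xff;

		if ((pkt[i] & m) != (spec[i] & m))
			return false;
	}
	return true;
}
```

With the ``src`` values from the Ethernet table above, any field whose middle three bytes are ``01:02:03`` matches, whatever the first and last bytes are.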
> 
> Matching pattern items for packet data must be naturally stacked (ordered
> from lowest to highest protocol layer), as in the following examples:
> 
> +--------------+
> | TCPv4 as L4  |
> +===+==========+
> | 0 | Ethernet |
> +---+----------+
> | 1 | IPv4     |
> +---+----------+
> | 2 | TCP      |
> +---+----------+
> 
> +----------------+
> | TCPv6 in VXLAN |
> +===+============+
> | 0 | Ethernet   |
> +---+------------+
> | 1 | IPv4       |
> +---+------------+
> | 2 | UDP        |
> +---+------------+
> | 3 | VXLAN      |
> +---+------------+
> | 4 | Ethernet   |
> +---+------------+
> | 5 | IPv6       |
> +---+------------+
> | 6 | TCP        |
> +---+------------+
> 
> +-----------------------------+
> | TCPv4 as L4 with meta items |
> +===+=========================+
> | 0 | VOID                    |
> +---+-------------------------+
> | 1 | Ethernet                |
> +---+-------------------------+
> | 2 | VOID                    |
> +---+-------------------------+
> | 3 | IPv4                    |
> +---+-------------------------+
> | 4 | TCP                     |
> +---+-------------------------+
> | 5 | VOID                    |
> +---+-------------------------+
> | 6 | VOID                    |
> +---+-------------------------+
> 
> The above example shows how meta items do not affect packet data
> matching items, as long as those remain stacked properly. The resulting
> matching pattern is identical to "TCPv4 as L4".
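
As a rough illustration of that equivalence, the sketch below compares two END-terminated item lists while skipping VOID entries. The enum values and helper names are hypothetical (only END being 0 comes from the text), and items are compared by type only, ignoring ``spec`` and ``mask``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item types; only END = 0 is stated by the text. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_TCP,
};

/* Return the next item that is not VOID (possibly the END marker). */
static const enum item_type *
skip_void(const enum item_type *item)
{
	while (*item == ITEM_VOID)
		++item;
	return item;
}

/* Two END-terminated lists are equivalent if they differ only by VOIDs. */
static bool
patterns_equivalent(const enum item_type *a, const enum item_type *b)
{
	assert(a != NULL && b != NULL);
	for (;;) {
		a = skip_void(a);
		b = skip_void(b);
		if (*a != *b)
			return false;
		if (*a == ITEM_END)
			return true;
		++a;
		++b;
	}
}
```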
> 
> +----------------+
> | UDPv6 anywhere |
> +===+============+
> | 0 | IPv6       |
> +---+------------+
> | 1 | UDP        |
> +---+------------+
> 
> If supported by the PMD, omitting one or several protocol layers at the
> bottom of the stack as in the above example (missing an Ethernet
> specification) enables hardware to look anywhere in packets.
> 
> It is unspecified whether the payload of supported encapsulations
> (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> inner, outer or both packets.
> 
> +---------------------+
> | Invalid, missing L3 |
> +===+=================+
> | 0 | Ethernet        |
> +---+-----------------+
> | 1 | UDP             |
> +---+-----------------+
> 
> The above pattern is invalid due to a missing L3 specification between L2
> and L4. Omitting layers is only allowed at the bottom and at the top of
> the stack.
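
A simplified sketch of the stacking check a PMD could perform. The layer numbers assigned to each item type and the names `proto_item`, `item_layer()` and `pattern_is_stacked()` are assumptions made for illustration; tunnels, ANY and RAW items are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item types; only END = 0 is stated by the text. */
enum proto_item {
	PROTO_END = 0,
	PROTO_VOID,
	PROTO_ETH,
	PROTO_IPV4,
	PROTO_IPV6,
	PROTO_UDP,
	PROTO_TCP,
};

/* Map an item to its protocol layer, 0 for items that match no data. */
static int
item_layer(enum proto_item type)
{
	switch (type) {
	case PROTO_ETH:
		return 2;
	case PROTO_IPV4:
	case PROTO_IPV6:
		return 3;
	case PROTO_UDP:
	case PROTO_TCP:
		return 4;
	default:
		return 0;
	}
}

/* Layers may be missing at the bottom or top, never in the middle. */
static bool
pattern_is_stacked(const enum proto_item *items)
{
	int prev = 0;

	assert(items != NULL);
	for (; *items != PROTO_END; ++items) {
		int layer = item_layer(*items);

		if (layer == 0)
			continue; /* VOID does not affect stacking */
		if (prev != 0 && layer != prev + 1)
			return false;
		prev = layer;
	}
	return true;
}
```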
> 
> Meta item types
> ~~~~~~~~~~~~~~~
> 
> These do not match packet data but affect how the pattern is processed,
> most of them do not need a specification structure. This particularity
> allows them to be specified anywhere without affecting other item types.
> 
> ``END``
> ^^^^^^^
> 
> End marker for item lists. Prevents further processing of items, thereby
> ending the pattern.
> 
> - Its numeric value is **0** for convenience.
> - PMD support is mandatory.
> - Both ``spec`` and ``mask`` are ignored.
> 
> +--------------------+
> | END                |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 
> ``VOID``
> ^^^^^^^^
> 
> Used as a placeholder for convenience. It is ignored and simply discarded by
> PMDs.
> 
> - PMD support is mandatory.
> - Both ``spec`` and ``mask`` are ignored.
> 
> +--------------------+
> | VOID               |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 
> One usage example for this type is generating rules that share a common
> prefix quickly without reallocating memory, only by updating item types:
> 
> +------------------------+
> | TCP, UDP or ICMP as L4 |
> +===+====================+
> | 0 | Ethernet           |
> +---+--------------------+
> | 1 | IPv4               |
> +---+------+------+------+
> | 2 | UDP  | VOID | VOID |
> +---+------+------+------+
> | 3 | VOID | TCP  | VOID |
> +---+------+------+------+
> | 4 | VOID | VOID | ICMP |
> +---+------+------+------+
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``INVERT``
> ^^^^^^^^^^
> 
> Inverted matching, i.e. process packets that do not match the pattern.
> 
> - Both ``spec`` and ``mask`` are ignored.
> 
> +--------------------+
> | INVERT             |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 
> Usage example in order to match non-TCPv4 packets only:
> 
> +--------------------+
> | Anything but TCPv4 |
> +===+================+
> | 0 | INVERT         |
> +---+----------------+
> | 1 | Ethernet       |
> +---+----------------+
> | 2 | IPv4           |
> +---+----------------+
> | 3 | TCP            |
> +---+----------------+
> 
> ``PF``
> ^^^^^^
> 
> Matches packets addressed to the physical function of the device.
> 
> - Both ``spec`` and ``mask`` are ignored.
> 
> +--------------------+
> | PF                 |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 
> ``VF``
> ^^^^^^
> 
> Matches packets addressed to the given virtual function ID of the device.
> 
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
> 
> +----------------------------------------+
> | VF                                     |
> +==========+=========+===================+
> | ``spec`` | ``vf``  | destination VF ID |
> +----------+---------+-------------------+
> | ``mask`` | ignored                     |
> +----------+-----------------------------+
> 
> ``SIGNATURE``
> ^^^^^^^^^^^^^
> 
> Requests hash-based signature dispatching for this rule.
> 
> Considering this is a global setting on devices that support it, all
> subsequent filter rules may have to be created with it as well.
> 
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
> 
> +--------------------+
> | SIGNATURE          |
> +==========+=========+
> | ``spec`` | TBD     |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Data matching item types
> ~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Most of these are basically protocol header definitions with associated
> bitmasks. They must be specified (stacked) from lowest to highest protocol
> layer.
> 
> The following list is not exhaustive as new protocols will be added in the
> future.
> 
> ``ANY``
> ^^^^^^^
> 
> Matches any protocol in place of the current layer, a single ANY may also
> stand for several protocol layers.
> 
> This is usually specified as the first pattern item when looking for a
> protocol anywhere in a packet.
> 
> - A maximum value of **0** requests matching any number of protocol
>   layers above or equal to the minimum value, a maximum value lower than
>   the minimum one is otherwise invalid.
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
> 
> +-----------------------------------------------------------------------+
> | ANY                                                                   |
> +==========+=========+==================================================+
> | ``spec`` | ``min`` | minimum number of layers covered                 |
> |          +---------+--------------------------------------------------+
> |          | ``max`` | maximum number of layers covered, 0 for infinity |
> +----------+---------+--------------------------------------------------+
> | ``mask`` | ignored                                                    |
> +----------+------------------------------------------------------------+
> 
> Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or
> IPv6) and L4 (UDP) both matched by the first ANY specification, and inner
> L3 (IPv4 or IPv6) matched by the second ANY specification:
> 
> +----------------------------------+
> | TCP in VXLAN with wildcards      |
> +===+==============================+
> | 0 | Ethernet                     |
> +---+-----+----------+---------+---+
> | 1 | ANY | ``spec`` | ``min`` | 2 |
> |   |     |          +---------+---+
> |   |     |          | ``max`` | 2 |
> +---+-----+----------+---------+---+
> | 2 | VXLAN                        |
> +---+------------------------------+
> | 3 | Ethernet                     |
> +---+-----+----------+---------+---+
> | 4 | ANY | ``spec`` | ``min`` | 1 |
> |   |     |          +---------+---+
> |   |     |          | ``max`` | 1 |
> +---+-----+----------+---------+---+
> | 5 | TCP                          |
> +---+------------------------------+
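
The ``min``/``max`` rules for ANY can be expressed as two small predicates. This is a sketch under assumptions: `struct any_spec` and its 32-bit field widths are hypothetical, since the text only names the two fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for the ANY specification described above. */
struct any_spec {
	uint32_t min; /* minimum number of layers covered */
	uint32_t max; /* maximum number of layers covered, 0 for infinity */
};

/* A nonzero maximum lower than the minimum is invalid. */
static bool
any_spec_is_valid(const struct any_spec *spec)
{
	return spec->max == 0 || spec->max >= spec->min;
}

/* Tell whether an ANY item may stand for the given number of layers. */
static bool
any_spec_covers(const struct any_spec *spec, uint32_t layers)
{
	assert(any_spec_is_valid(spec));
	if (layers < spec->min)
		return false;
	return spec->max == 0 || layers <= spec->max;
}
```

In the table above, the first ANY item (``min`` 2, ``max`` 2) covers exactly the outer L3 and L4 layers.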
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``RAW``
> ^^^^^^^
> 
> Matches a string of a given length at a given offset (in bytes), or anywhere
> in the payload of the current protocol layer (including L2 header if used as
> the first item in the stack).
> 
> This does not increment the protocol layer count as it is not a protocol
> definition. Subsequent RAW items modulate the first absolute one with
> relative offsets.
> 
> - Using **-1** as the ``offset`` of the first RAW item makes its absolute
>   offset not fixed, i.e. the pattern is searched everywhere.
> - ``mask`` only affects the pattern.
> 
> +--------------------------------------------------------------+
> | RAW                                                          |
> +==========+=============+=====================================+
> | ``spec`` | ``offset``  | absolute or relative pattern offset |
> |          +-------------+-------------------------------------+
> |          | ``length``  | pattern length                      |
> |          +-------------+-------------------------------------+
> |          | ``pattern`` | byte string of the above length     |
> +----------+-------------+-------------------------------------+
> | ``mask`` | ``offset``  | ignored                             |
> |          +-------------+-------------------------------------+
> |          | ``length``  | ignored                             |
> |          +-------------+-------------------------------------+
> |          | ``pattern`` | bitmask with the same byte length   |
> +----------+-------------+-------------------------------------+
> 
> Example pattern looking for several strings at various offsets of a UDP
> payload, using combined RAW items:
> 
> +------------------------------------------+
> | UDP payload matching                     |
> +===+======================================+
> | 0 | Ethernet                             |
> +---+--------------------------------------+
> | 1 | IPv4                                 |
> +---+--------------------------------------+
> | 2 | UDP                                  |
> +---+-----+----------+-------------+-------+
> | 3 | RAW | ``spec`` | ``offset``  | -1    |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "foo" |
> +---+-----+----------+-------------+-------+
> | 4 | RAW | ``spec`` | ``offset``  | 20    |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "bar" |
> +---+-----+----------+-------------+-------+
> | 5 | RAW | ``spec`` | ``offset``  | -30   |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "baz" |
> +---+-----+----------+-------------+-------+
> 
> This translates to:
> 
> - Locate "foo" in UDP payload, remember its offset.
> - Check "bar" at "foo"'s offset plus 20 bytes.
> - Check "baz" at "foo"'s offset minus 30 bytes.
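
The three steps above could be emulated in software roughly as follows. `struct raw_spec`, `raw_at()` and `raw_match()` are hypothetical names, masks are ignored for brevity, and the sketch tries every occurrence of the first pattern rather than only the first one found, which is an assumption about the intended behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RAW specification, following the table above. */
struct raw_spec {
	int32_t offset;         /* -1 on the first item: search everywhere */
	uint32_t length;        /* pattern length */
	const uint8_t *pattern; /* byte string of the above length */
};

/* Check one pattern at an absolute position, with bounds checking. */
static bool
raw_at(const uint8_t *buf, size_t len, long pos, const struct raw_spec *raw)
{
	if (pos < 0 || (size_t)pos + raw->length > len)
		return false;
	return memcmp(buf + pos, raw->pattern, raw->length) == 0;
}

/*
 * Match n RAW items. The first one is either searched everywhere
 * (offset -1) or checked at its absolute offset; the following ones are
 * checked relative to the position where the first one was found.
 */
static bool
raw_match(const uint8_t *buf, size_t len, const struct raw_spec *raw,
	  size_t n)
{
	long base, first, last;
	size_t i;

	assert(buf != NULL && raw != NULL && n > 0);
	first = raw[0].offset < 0 ? 0 : raw[0].offset;
	last = raw[0].offset < 0 ? (long)len : raw[0].offset;
	for (base = first; base <= last; ++base) {
		if (!raw_at(buf, len, base, &raw[0]))
			continue;
		for (i = 1; i < n; ++i)
			if (!raw_at(buf, len, base + raw[i].offset, &raw[i]))
				break;
		if (i == n)
			return true;
	}
	return false;
}
```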
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``ETH``
> ^^^^^^^
> 
> Matches an Ethernet header.
> 
> - ``dst``: destination MAC.
> - ``src``: source MAC.
> - ``type``: EtherType.
> - ``tags``: number of 802.1Q/ad tags defined.
> - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> 
>  - ``tpid``: Tag protocol identifier.
>  - ``tci``: Tag control information.
> 
> ``IPV4``
> ^^^^^^^^
> 
> Matches an IPv4 header.
> 
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tos``: ToS/DSCP field.
> - ``ttl``: TTL field.
> - ``proto``: protocol number for the next layer.
> 
> ``IPV6``
> ^^^^^^^^
> 
> Matches an IPv6 header.
> 
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tc``: traffic class field.
> - ``nh``: Next header field (protocol).
> - ``hop_limit``: hop limit field (TTL).
> 
> ``ICMP``
> ^^^^^^^^
> 
> Matches an ICMP header.
> 
> - TBD.
> 
> ``UDP``
> ^^^^^^^
> 
> Matches a UDP header.
> 
> - ``sport``: source port.
> - ``dport``: destination port.
> - ``length``: UDP length.
> - ``checksum``: UDP checksum.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``TCP``
> ^^^^^^^
> 
> Matches a TCP header.
> 
> - ``sport``: source port.
> - ``dport``: destination port.
> - All other TCP fields and bits.
> 
> ``VXLAN``
> ^^^^^^^^^
> 
> Matches a VXLAN header.
> 
> - TBD.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Actions
> ~~~~~~~
> 
> Each possible action is represented by a type. Some have associated
> configuration structures. Several actions combined in a list can be
> assigned to a flow rule. That list is not ordered.
> 
> At least one action must be defined in a filter rule in order to do
> something with matched packets.
> 
> - Actions are defined with ``struct rte_flow_action``.
> - A list of actions is defined with ``struct rte_flow_actions``.
> 
> They fall in three categories:
> 
> - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
>   processing matched packets by subsequent flow rules, unless overridden
>   with PASSTHRU.
> 
> - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
>   for additional processing by subsequent flow rules.
> 
> - Other non terminating meta actions that do not affect the fate of packets
>   (END, VOID, ID, COUNT).
> 
> When several actions are combined in a flow rule, they should all have
> different types (e.g. dropping a packet twice is not possible). The VOID
> type is an exception to this rule. When a type is nevertheless repeated,
> the defined behavior is for PMDs to only take into account the last action
> of that type found in the list. PMDs still perform error checking on the
> entire list.
> 
> *Note that PASSTHRU is the only action able to override a terminating rule.*
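
The three categories can be summarized by a helper that decides whether a rule is terminating. This is a sketch only: the enum values other than END being 0 are invented for illustration, and `rule_is_terminating()` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action types; only END = 0 is stated by the text. */
enum action_type {
	ACTION_END = 0,
	ACTION_VOID,
	ACTION_PASSTHRU,
	ACTION_ID,
	ACTION_COUNT,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_DUP,
	ACTION_RSS,
	ACTION_PF,
	ACTION_VF,
};

/*
 * A rule is terminating when it contains at least one terminating action
 * and no PASSTHRU. List order does not matter.
 */
static bool
rule_is_terminating(const enum action_type *actions)
{
	bool terminating = false;

	assert(actions != NULL);
	for (; *actions != ACTION_END; ++actions) {
		switch (*actions) {
		case ACTION_PASSTHRU:
			return false; /* overrides any terminating action */
		case ACTION_QUEUE:
		case ACTION_DROP:
		case ACTION_RSS:
		case ACTION_PF:
		case ACTION_VF:
			terminating = true;
			break;
		default:
			break; /* DUP and meta actions are non-terminating */
		}
	}
	return terminating;
}
```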
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Example of an action that redirects packets to queue index 10:
> 
> +----------------+
> | QUEUE          |
> +===========+====+
> | ``queue`` | 10 |
> +-----------+----+
> 
> Action lists examples, their order is not significant, applications must
> consider all actions to be performed simultaneously:
> 
> +----------------+
> | Count and drop |
> +=======+========+
> | COUNT |        |
> +-------+--------+
> | DROP  |        |
> +-------+--------+
> 
> +--------------------------+
> | Tag, count and redirect  |
> +=======+===========+======+
> | ID    | ``id``    | 0x2a |
> +-------+-----------+------+
> | COUNT |                  |
> +-------+-----------+------+
> | QUEUE | ``queue`` | 10   |
> +-------+-----------+------+
> 
> +-----------------------+
> | Redirect to queue 5   |
> +=======+===============+
> | DROP  |               |
> +-------+-----------+---+
> | QUEUE | ``queue`` | 5 |
> +-------+-----------+---+
> 
> In the above example, considering both actions are performed
> simultaneously, its end result is that only QUEUE has any effect.
> 
> +-----------------------+
> | Redirect to queue 3   |
> +=======+===========+===+
> | QUEUE | ``queue`` | 5 |
> +-------+-----------+---+
> | VOID  |               |
> +-------+-----------+---+
> | QUEUE | ``queue`` | 3 |
> +-------+-----------+---+
> 
> As previously described, only the last action of a given type found in the
> list is taken into account. The above example also shows that VOID is
> ignored.
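
A minimal sketch of the "last action of a given type wins" rule, restricted to QUEUE for brevity. `struct sketch_action` and `effective_queue()` are hypothetical and do not reflect the actual layout of ``struct rte_flow_action``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action kinds; only END = 0 is stated by the text. */
enum act_kind {
	ACT_END = 0,
	ACT_VOID,
	ACT_DROP,
	ACT_QUEUE,
};

/* Hypothetical flattened action, with QUEUE's property inlined. */
struct sketch_action {
	enum act_kind kind;
	uint16_t queue; /* only meaningful for ACT_QUEUE */
};

/* Return the queue index of the effective QUEUE action, or -1 if none. */
static int
effective_queue(const struct sketch_action *actions)
{
	int queue = -1;

	assert(actions != NULL);
	for (; actions->kind != ACT_END; ++actions)
		if (actions->kind == ACT_QUEUE)
			queue = actions->queue; /* later entries override */
	return queue;
}
```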
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Action types
> ~~~~~~~~~~~~
> 
> Common action types are described in this section. Like pattern item types,
> this list is not exhaustive as new actions will be added in the future.
> 
> ``END`` (action)
> ^^^^^^^^^^^^^^^^
> 
> End marker for action lists. Prevents further processing of actions, thereby
> ending the list.
> 
> - Its numeric value is **0** for convenience.
> - PMD support is mandatory.
> - No configurable property.
> 
> +---------------+
> | END           |
> +===============+
> | no properties |
> +---------------+
> 
> ``VOID`` (action)
> ^^^^^^^^^^^^^^^^^
> 
> Used as a placeholder for convenience. It is ignored and simply discarded by
> PMDs.
> 
> - PMD support is mandatory.
> - No configurable property.
> 
> +---------------+
> | VOID          |
> +===============+
> | no properties |
> +---------------+
> 
> ``PASSTHRU``
> ^^^^^^^^^^^^
> 
> Leaves packets up for additional processing by subsequent flow rules. This
> is the default when a rule does not contain a terminating action, but can be
> specified to force a rule to become non-terminating.
> 
> - No configurable property.
> 
> +---------------+
> | PASSTHRU      |
> +===============+
> | no properties |
> +---------------+
> 
> Example to copy a packet to a queue and continue processing by subsequent
> flow rules:
[Sugesh] If a packet gets copied to a queue, that is a terminating action.
How is it possible to perform subsequent actions after the packet has already
been moved to the queue? How does it differ from the DUP action?
Am I missing anything here?
> 
> +--------------------------+
> | Copy to queue 8          |
> +==========+===============+
> | PASSTHRU |               |
> +----------+-----------+---+
> | QUEUE    | ``queue`` | 8 |
> +----------+-----------+---+
> 
> ``ID``
> ^^^^^^
> 
> Attaches a 32-bit value to packets.
> 
> +----------------------------------------------+
> | ID                                           |
> +========+=====================================+
> | ``id`` | 32 bit value to return with packets |
> +--------+-------------------------------------+
> 
[Sugesh] I assume the application has to program the flow
with a unique ID and matching packets are stamped with this ID
when reported to the software. The uniqueness of the ID is NOT
guaranteed by the API framework. Correct me if I am wrong here.

[Sugesh] Is it a limitation to use only a 32-bit ID? Is it possible to have a
64-bit ID, so that the application can use the control plane flow pointer
itself as an ID? Does that make sense?


> .. raw:: pdf
> 
>    PageBreak
> 
> ``QUEUE``
> ^^^^^^^^^
> 
> Assigns packets to a given queue index.
> 
> - Terminating by default.
> 
> +--------------------------------+
> | QUEUE                          |
> +===========+====================+
> | ``queue`` | queue index to use |
> +-----------+--------------------+
> 
> ``DROP``
> ^^^^^^^^
> 
> Drop packets.
> 
> - No configurable property.
> - Terminating by default.
> - PASSTHRU overrides this action if both are specified.
> 
> +---------------+
> | DROP          |
> +===============+
> | no properties |
> +---------------+
> 
> ``COUNT``
> ^^^^^^^^^
> 
[Sugesh] Do we really have to set the COUNT action explicitly for every rule?
IMHO it would be great for it to be an implicit action. Most applications would be
interested in the stats of almost all the filters/flows.
> Enables hits counter for this rule.
> 
> This counter can be retrieved and reset through ``rte_flow_query()``, see
> ``struct rte_flow_query_count``.
> 
> - Counters can be retrieved with ``rte_flow_query()``.
> - No configurable property.
> 
> +---------------+
> | COUNT         |
> +===============+
> | no properties |
> +---------------+
> 
> Query structure to retrieve and reset the flow rule hits counter:
> 
> +------------------------------------------------+
> | COUNT query                                    |
> +===========+=====+==============================+
> | ``reset`` | in  | reset counter after query    |
> +-----------+-----+------------------------------+
> | ``hits``  | out | number of hits for this flow |
> +-----------+-----+------------------------------+
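
The in/out roles of the two query fields might look like this in software. The field types in `struct count_query` and the `counter_query()` helper are assumptions made for illustration; this is not the ``rte_flow_query()`` implementation, whose prototype is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the COUNT query structure described above. */
struct count_query {
	bool reset;    /* in: reset counter after query */
	uint64_t hits; /* out: number of hits for this flow */
};

/* Report the current hit count, then clear it if requested. */
static void
counter_query(uint64_t *counter, struct count_query *query)
{
	assert(counter != NULL && query != NULL);
	query->hits = *counter;
	if (query->reset)
		*counter = 0;
}
```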
> 
> ``DUP``
> ^^^^^^^
> 
> Duplicates packets to a given queue index.
> 
> This is normally combined with QUEUE, however when used alone, it is
> actually similar to QUEUE + PASSTHRU.
> 
> - Non-terminating by default.
> 
> +------------------------------------------------+
> | DUP                                            |
> +===========+====================================+
> | ``queue`` | queue index to duplicate packet to |
> +-----------+------------------------------------+
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``RSS``
> ^^^^^^^
> 
> Similar to QUEUE, except RSS is additionally performed on packets to spread
> them among several queues according to the provided parameters.
> 
> - Terminating by default.
> 
> +---------------------------------------------+
> | RSS                                         |
> +==============+==============================+
> | ``rss_conf`` | RSS parameters               |
> +--------------+------------------------------+
> | ``queues``   | number of entries in queue[] |
> +--------------+------------------------------+
> | ``queue[]``  | queue indices to use         |
> +--------------+------------------------------+
> 
> ``PF`` (action)
> ^^^^^^^^^^^^^^^
> 
> Redirects packets to the physical function (PF) of the current device.
> 
> - No configurable property.
> - Terminating by default.
> 
> +---------------+
> | PF            |
> +===============+
> | no properties |
> +---------------+
> 
> ``VF`` (action)
> ^^^^^^^^^^^^^^^
> 
> Redirects packets to the virtual function (VF) of the current device with
> the specified ID.
> 
> - Terminating by default.
> 
> +---------------------------------------+
> | VF                                    |
> +========+==============================+
> | ``id`` | VF ID to redirect packets to |
> +--------+------------------------------+
> 
> Planned types
> ~~~~~~~~~~~~~
> 
> Other action types are planned but not defined yet. These actions will add
> the ability to alter matching packets in several ways, such as performing
> encapsulation/decapsulation of tunnel headers on specific flows.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Rules management
> ----------------
> 
> A simple API with only four functions is provided to fully manage flows.
> 
> Each created flow rule is associated with an opaque, PMD-specific handle
> pointer. The application is responsible for keeping it until the rule is
> destroyed.
> 
> Flow rules are defined with ``struct rte_flow``.
> 
> Validation
> ~~~~~~~~~~
> 
> Given that expressing a definite set of device capabilities with this API is
> not practical, a dedicated function is provided to check if a flow rule is
> supported and can be created.
> 
> ::
> 
>  int
>  rte_flow_validate(uint8_t port_id,
>                    const struct rte_flow_pattern *pattern,
>                    const struct rte_flow_actions *actions);
> 
> While this function has no effect on the target device, the flow rule is
> validated against its current configuration state and the returned value
> should be considered valid by the caller for that state only.
> 
> The returned value is guaranteed to remain valid only as long as no
> successful calls to rte_flow_create() or rte_flow_destroy() are made in the
> meantime and no device parameter affecting flow rules in any way is
> modified, due to possible collisions or resource limitations (although in
> such cases ``EINVAL`` should not be returned).
> 
> Arguments:
> 
> - ``port_id``: port identifier of Ethernet device.
> - ``pattern``: pattern specification to check.
> - ``actions``: actions associated with the flow definition.
> 
> Return value:
> 
> - **0** if flow rule is valid and can be created, a negative errno value
>   otherwise (``rte_errno`` is also set). The following errors are defined:
> - ``-EINVAL``: unknown or invalid rule specification.
> - ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial masks
>   are unsupported).
> - ``-EEXIST``: collision with an existing rule.
> - ``-ENOMEM``: not enough resources.
> 
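A minimal sketch of how an application might interpret the return values listed above. The helper name ``flow_validate_hint()`` and the message strings are hypothetical; only the error codes come from the specification:

```c
#include <errno.h>

/*
 * Hypothetical helper: turn the result of a validation call into a
 * short hint telling the application what to do next.
 */
static const char *
flow_validate_hint(int ret)
{
	switch (ret) {
	case 0:
		return "rule can be created";
	case -EINVAL:
		return "unknown or invalid rule specification";
	case -ENOTSUP:
		return "valid rule, but unsupported by this device";
	case -EEXIST:
		return "collision with an existing rule";
	case -ENOMEM:
		return "not enough resources";
	default:
		return "unexpected error";
	}
}
```

Distinguishing ``-ENOTSUP`` from ``-EINVAL`` lets an application fall back to a simpler rule (for instance one without partial masks) instead of treating the request as a programming error.
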
> .. raw:: pdf
> 
>    PageBreak
> 
> Creation
> ~~~~~~~~
> 
> Creating a flow rule is similar to validating one, except the rule is
> actually created.
> 
> ::
> 
>  struct rte_flow *
>  rte_flow_create(uint8_t port_id,
>                  const struct rte_flow_pattern *pattern,
>                  const struct rte_flow_actions *actions);
> 
> Arguments:
> 
> - ``port_id``: port identifier of Ethernet device.
> - ``pattern``: pattern specification to add.
> - ``actions``: actions associated with the flow definition.
> 
> Return value:
> 
> A valid flow pointer in case of success, NULL otherwise and ``rte_errno`` is
> set to the positive version of one of the error codes defined for
> ``rte_flow_validate()``.
[Sugesh] This is a somewhat implementation-specific question: what if the
application tries to add duplicate rules? Does the API create a new flow entry
for every call?

[Sugesh] Another concern is the cost and time of installing these rules
in the hardware. Can we make these APIs time-bound (or at least provide an
option to set a time limit for their execution), so that the application does
not have to wait too long when installing and deleting flows with slow
hardware/NICs? What do you think? Most datapath flow installations are
dynamic and triggered only when there is ingress traffic. Delays in flow
insertion/deletion have unpredictable consequences.

[Sugesh] Another question is about synchronization. What if the same rules are
handled from different threads? Is the application responsible for handling
concurrent hardware programming?

> 
> Destruction
> ~~~~~~~~~~~
> 
> Flow rules destruction is not automatic, and a queue should not be released
> if any are still attached to it. Applications must take care of performing
> this step before releasing resources.
> 
> ::
> 
>  int
>  rte_flow_destroy(uint8_t port_id,
>                   struct rte_flow *flow);
> 
> 
[Sugesh] I would suggest adding a clean-up API, since releasing a queue
(does this apply to releasing a port too?) does not guarantee automatic flow
destruction. This way an application can initialize the port, clean up all
existing rules and create new rules on a clean slate.

> Failure to destroy a flow rule may occur when other flow rules depend on it,
> and destroying it would result in an inconsistent state.
> 
> This function is only guaranteed to succeed if flow rules are destroyed in
> reverse order of their creation.
> 
> Arguments:
> 
> - ``port_id``: port identifier of Ethernet device.
> - ``flow``: flow rule to destroy.
> 
> Return value:
> 
> - **0** on success, a negative errno value otherwise and ``rte_errno`` is
>   set.
> 
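Since destruction is only guaranteed to succeed in reverse order of creation, an application can keep its handles on a stack. The following is a minimal sketch under that assumption; ``struct flow_stack``, its helpers and the ``record_destroy()`` stub are hypothetical, and the callback merely stands in for a call to ``rte_flow_destroy()``:

```c
#include <stddef.h>

#define MAX_FLOWS 64

/* Hypothetical application-side bookkeeping: a stack of opaque handles. */
struct flow_stack {
	void *handle[MAX_FLOWS];
	size_t count;
};

/* Remember a handle right after the rule has been created. */
static int
flow_stack_push(struct flow_stack *s, void *flow)
{
	if (s->count == MAX_FLOWS)
		return -1;
	s->handle[s->count++] = flow;
	return 0;
}

/*
 * Destroy every recorded rule, newest first. destroy_cb stands in for
 * the real destruction call; stop at the first failure so that the
 * remaining handles stay on the stack.
 */
static int
flow_stack_flush(struct flow_stack *s, int (*destroy_cb)(void *flow))
{
	while (s->count > 0) {
		int ret = destroy_cb(s->handle[s->count - 1]);

		if (ret != 0)
			return ret;
		s->count--;
	}
	return 0;
}

/* Stub callback recording the order in which rules are destroyed. */
static int destroyed[MAX_FLOWS];
static size_t destroyed_count;

static int
record_destroy(void *flow)
{
	destroyed[destroyed_count++] = *(int *)flow;
	return 0;
}
```

Flushing such a stack before releasing queues is one way for an application to start again from a clean slate without a dedicated clean-up function.
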
> .. raw:: pdf
> 
>    PageBreak
> 
> Query
> ~~~~~
> 
> Query an existing flow rule.
> 
> This function allows retrieving flow-specific data such as counters. Data
> is gathered by special actions which must be present in the flow rule
> definition.
> 
> ::
> 
>  int
>  rte_flow_query(uint8_t port_id,
>                 struct rte_flow *flow,
>                 enum rte_flow_action_type action,
>                 void *data);
> 
> Arguments:
> 
> - ``port_id``: port identifier of Ethernet device.
> - ``flow``: flow rule to query.
> - ``action``: action type to query.
> - ``data``: pointer to storage for the associated query data type.
> 
> Return value:
> 
> - **0** on success, a negative errno value otherwise and ``rte_errno`` is
>   set.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> Behavior
> --------
> 
> - API operations are synchronous and blocking (``EAGAIN`` cannot be
>   returned).
> 
> - There is no provision for reentrancy/multi-thread safety, although nothing
>   should prevent different devices from being configured at the same
>   time. PMDs may protect their control path functions accordingly.
> 
> - Stopping the data path (TX/RX) should not be necessary when managing flow
>   rules. If this cannot be achieved naturally or with workarounds (such as
>   temporarily replacing the burst function pointers), an appropriate error
>   code must be returned (``EBUSY``).
> 
> - PMDs, not applications, are responsible for maintaining flow rules
>   configuration when stopping and restarting a port or performing other
>   actions which may affect them. They can only be destroyed explicitly.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
[Sugesh] How about querying all the rules for a specific port/queue? This would
be useful when adding and deleting ports and queues dynamically as needed. I am
not sure what the other use cases for such an API would be, but I feel it would
make managing flows from the application much easier. What do you think?

> Compatibility
> -------------
> 
> No known hardware implementation supports all the features described in this
> document.
> 
> Unsupported features or combinations are not expected to be fully emulated
> in software by PMDs for performance reasons. Partially supported features
> may be completed in software as long as hardware performs most of the work
> (such as queue redirection and packet recognition).
> 
> However PMDs are expected to do their best to satisfy application requests
> by working around hardware limitations as long as doing so does not affect
> the behavior of existing flow rules.
> 
> The following sections provide a few examples of such cases; they are based
> on limitations built into the previous APIs.
> 
> Global bitmasks
> ~~~~~~~~~~~~~~~
> 
> Each flow rule comes with its own, per-layer bitmasks, while hardware may
> support only a single, device-wide bitmask for a given layer type, so that
> two IPv4 rules cannot use different bitmasks.
> 
> The expected behavior in this case is that PMDs automatically configure
> global bitmasks according to the needs of the first created flow rule.
> 
> Subsequent rules are allowed only if their bitmasks match those; the
> ``EEXIST`` error code should be returned otherwise.
> 
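The global bitmask behavior described above can be sketched as follows. This is only an illustration under stated assumptions: ``struct global_mask`` and ``global_mask_get()`` are hypothetical names, and a real PMD would track one such state per layer type and release it when rules are destroyed:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-device state for a single layer type. */
struct global_mask {
	uint32_t mask;       /* device-wide bitmask currently programmed */
	unsigned int refcnt; /* number of flow rules relying on it */
};

/*
 * Called when a rule is created: the first rule configures the global
 * bitmask, later rules are accepted only if their bitmask is identical.
 */
static int
global_mask_get(struct global_mask *g, uint32_t rule_mask)
{
	if (g->refcnt == 0) {
		g->mask = rule_mask;
		g->refcnt = 1;
		return 0;
	}
	if (g->mask != rule_mask)
		return -EEXIST;
	g->refcnt++;
	return 0;
}
```
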
> Unsupported layer types
> ~~~~~~~~~~~~~~~~~~~~~~~
> 
> Many protocols can be simulated by crafting patterns with the `RAW`_ type.
> 
> PMDs can rely on this capability to simulate support for protocols with
> fixed headers not directly recognized by hardware.
> 
> ``ANY`` pattern item
> ~~~~~~~~~~~~~~~~~~~~
> 
> This pattern item stands for anything, which can be difficult to translate
> to something hardware would understand, particularly if followed by more
> specific types.
> 
> Consider the following pattern:
> 
> +---+--------------------------------+
> | 0 | ETHER                          |
> +---+--------------------------------+
> | 1 | ANY (``min`` = 1, ``max`` = 1) |
> +---+--------------------------------+
> | 2 | TCP                            |
> +---+--------------------------------+
> 
> Knowing that TCP does not make sense with something other than IPv4 and IPv6
> as L3, such a pattern may be translated to two flow rules instead:
> 
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV4 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
> 
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV6 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
> 
> Note that as soon as an ANY rule covers several layers, this approach may
> yield a large number of hidden flow rules. It is thus suggested to only
> support the most common scenarios (anything as L2 and/or L3).
> 
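A minimal sketch of the ANY expansion described above, for the simple case of a single-layer ANY item. The ``enum item_type`` values and ``expand_any()`` are hypothetical and much simpler than real pattern items, which also carry ``spec`` and ``mask`` data:

```c
#include <stddef.h>

/* Hypothetical, simplified item types; ITEM_END terminates a pattern. */
enum item_type {
	ITEM_END,
	ITEM_ETH,
	ITEM_ANY,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP,
};

/*
 * Copy pattern "in" to "out", substituting every ANY item with the
 * given L3 type. "out" must have room for the same number of entries
 * as "in", terminator included. Returns the number of items copied,
 * terminator excluded.
 */
static size_t
expand_any(const enum item_type *in, enum item_type l3, enum item_type *out)
{
	size_t i;

	for (i = 0; in[i] != ITEM_END; ++i)
		out[i] = (in[i] == ITEM_ANY) ? l3 : in[i];
	out[i] = ITEM_END;
	return i;
}
```

Calling it once with ``ITEM_IPV4`` and once with ``ITEM_IPV6`` yields the two hidden patterns from the tables; an ANY item spanning several layers would need one call per combination, which is where the number of hidden rules grows quickly.
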
> .. raw:: pdf
> 
>    PageBreak
> 
> Unsupported actions
> ~~~~~~~~~~~~~~~~~~~
> 
> - When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
>   tagging (`ID`_) may be implemented in software as long as the target queue
>   is used by a single rule.
> 
> - A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
>   rules combining `QUEUE`_ and `PASSTHRU`_.
> 
> - When a single target queue is provided, `RSS`_ can also be implemented
>   through `QUEUE`_.
> 
> Flow rules priority
> ~~~~~~~~~~~~~~~~~~~
> 
> While it would naturally make sense, flow rules cannot be assumed to be
> processed by hardware in the same order as their creation for several
> reasons:
> 
> - They may be managed internally as a tree or a hash table instead of a
>   list.
> - Removing a flow rule before adding another one can either put the new rule
>   at the end of the list or reuse a freed entry.
> - Duplication may occur when packets are matched by several rules.
> 
> For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
> predictable behavior is only guaranteed by using different priority levels.
> 
> Priority levels are not necessarily implemented in hardware, or may be
> severely limited (e.g. a single priority bit).
> 
> For these reasons, priority levels may be implemented purely in software by
> PMDs.
> 
> - For devices expecting flow rules to be added in the correct order, PMDs
>   may destroy and re-create existing rules after adding a new one with
>   a higher priority.
> 
> - A configurable number of dummy or empty rules can be created at
>   initialization time to save high priority slots for later.
> 
> - In order to save priority levels, PMDs may evaluate whether rules are
>   likely to collide and adjust their priority accordingly.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> API migration
> =============
> 
> Exhaustive list of deprecated filter types and how to convert them to
> generic flow rules.
> 
> ``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
> ---------------------------------------
> 
> `MACVLAN`_ can be translated to a basic `ETH`_ flow rule with a `VF
> (action)`_ or `PF (action)`_ terminating action.
> 
> +------------------------------------+
> | MACVLAN                            |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | ETH | ``spec`` | any | VF,     |
> |   |     +----------+-----+ PF      |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
> 
> ``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
> ----------------------------------------------
> 
> `ETHERTYPE`_ is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as
> a terminating action.
> 
> +------------------------------------+
> | ETHERTYPE                          |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | ETH | ``spec`` | any | QUEUE,  |
> |   |     +----------+-----+ DROP    |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
> 
> ``FLEXIBLE`` to ``RAW`` → ``QUEUE``
> -----------------------------------
> 
> `FLEXIBLE`_ can be translated to one `RAW`_ pattern with `QUEUE`_ as the
> terminating action and a defined priority level.
> 
> +------------------------------------+
> | FLEXIBLE                           |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | RAW | ``spec`` | any | QUEUE   |
> |   |     +----------+-----+         |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
> 
> ``SYN`` to ``TCP`` → ``QUEUE``
> ------------------------------
> 
> `SYN`_ is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
> `QUEUE`_ as the terminating action.
> 
> Priority level can be set to simulate the high priority bit.
> 
> +---------------------------------------------+
> | SYN                                         |
> +-----------------------------------+---------+
> | Pattern                           | Actions |
> +===+======+==========+=============+=========+
> | 0 | ETH  | ``spec`` | N/A         | QUEUE   |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | empty       |         |
> +---+------+----------+-------------+         |
> | 1 | IPV4 | ``spec`` | N/A         |         |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | empty       |         |
> +---+------+----------+-------------+         |
> | 2 | TCP  | ``spec`` | ``syn`` = 1 |         |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | ``syn`` = 1 |         |
> +---+------+----------+-------------+---------+
> 
> ``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
> ----------------------------------------------------
> 
> `NTUPLE`_ is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
> `UDP`_ as L4 and `QUEUE`_ as the terminating action.
> 
> A priority level can be specified as well.
> 
> +---------------------------------------+
> | NTUPLE                                |
> +-----------------------------+---------+
> | Pattern                     | Actions |
> +===+======+==========+=======+=========+
> | 0 | ETH  | ``spec`` | N/A   | QUEUE   |
> |   |      +----------+-------+         |
> |   |      | ``mask`` | empty |         |
> +---+------+----------+-------+         |
> | 1 | IPV4 | ``spec`` | any   |         |
> |   |      +----------+-------+         |
> |   |      | ``mask`` | any   |         |
> +---+------+----------+-------+         |
> | 2 | TCP, | ``spec`` | any   |         |
> |   | UDP  +----------+-------+         |
> |   |      | ``mask`` | any   |         |
> +---+------+----------+-------+---------+
> 
> ``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
> ---------------------------------------------------------------------------
> 
> `TUNNEL`_ matches common IPv4 and IPv6 L3/L4-based tunnel types.
> 
> In the following table, `ANY`_ is used to cover the optional L4.
> 
> +------------------------------------------------+
> | TUNNEL                                         |
> +--------------------------------------+---------+
> | Pattern                              | Actions |
> +===+=========+==========+=============+=========+
> | 0 | ETH     | ``spec`` | any         | QUEUE   |
> |   |         +----------+-------------+         |
> |   |         | ``mask`` | any         |         |
> +---+---------+----------+-------------+         |
> | 1 | IPV4,   | ``spec`` | any         |         |
> |   | IPV6    +----------+-------------+         |
> |   |         | ``mask`` | any         |         |
> +---+---------+----------+-------------+         |
> | 2 | ANY     | ``spec`` | ``min`` = 0 |         |
> |   |         |          +-------------+         |
> |   |         |          | ``max`` = 0 |         |
> |   |         +----------+-------------+         |
> |   |         | ``mask`` | N/A         |         |
> +---+---------+----------+-------------+         |
> | 3 | VXLAN,  | ``spec`` | any         |         |
> |   | GENEVE, +----------+-------------+         |
> |   | TEREDO, | ``mask`` | any         |         |
> |   | NVGRE,  |          |             |         |
> |   | GRE,    |          |             |         |
> |   | ...     |          |             |         |
> +---+---------+----------+-------------+---------+
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
> ---------------------------------------------------------------
> 
> `FDIR`_ is more complex than any other type; there are several methods to
> emulate its functionality. It is summarized for the most part in the table
> below.
> 
> A few features are intentionally not supported:
> 
> - The ability to configure the matching input set and masks for the entire
>   device, PMDs should take care of it automatically according to flow rules.
> 
> - Returning four or eight bytes of matched data when using flex bytes
>   filtering. Although a specific action could implement it, it conflicts
>   with the much more useful 32 bits tagging on devices that support it.
> 
> - Side effects on RSS processing of the entire device. Flow rules that
>   conflict with the current device configuration should not be
>   allowed. Similarly, device configuration should not be allowed when it
>   affects existing flow rules.
> 
> - Device modes of operation. "none" is unsupported since filtering cannot be
>   disabled as long as a flow rule is present.
> 
> - "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
>   according to the created flow rules.
> 
> +----------------------------------------------+
> | FDIR                                         |
> +---------------------------------+------------+
> | Pattern                         | Actions    |
> +===+============+==========+=====+============+
> | 0 | ETH,       | ``spec`` | any | QUEUE,     |
> |   | RAW        +----------+-----+ DROP,      |
> |   |            | ``mask`` | any | PASSTHRU   |
> +---+------------+----------+-----+------------+
> | 1 | IPV4,      | ``spec`` | any | ID         |
> |   | IPV6       +----------+-----+ (optional) |
> |   |            | ``mask`` | any |            |
> +---+------------+----------+-----+            |
> | 2 | TCP,       | ``spec`` | any |            |
> |   | UDP,       +----------+-----+            |
> |   | SCTP       | ``mask`` | any |            |
> +---+------------+----------+-----+            |
> | 3 | VF,        | ``spec`` | any |            |
> |   | PF,        +----------+-----+            |
> |   | SIGNATURE  | ``mask`` | any |            |
> |   | (optional) |          |     |            |
> +---+------------+----------+-----+------------+
> 
> ``HASH``
> ~~~~~~~~
> 
> Hashing configuration is set per rule through the `SIGNATURE`_ item.
> 
> Since it is usually a global device setting, all flow rules created with
> this item may have to share the same specification.
> 
> ``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> All packets are matched. This type alters incoming packets to encapsulate
> them in a chosen tunnel type, optionally redirect them to a VF as well.
> 
> The destination pool for tag based forwarding can be emulated with other
> flow rules using `DUP`_ as the action.
> 
> +----------------------------------------+
> | L2_TUNNEL                              |
> +---------------------------+------------+
> | Pattern                   | Actions    |
> +===+======+==========+=====+============+
> | 0 | VOID | ``spec`` | N/A | VXLAN,     |
> |   |      |          |     | GENEVE,    |
> |   |      |          |     | ...        |
> |   |      +----------+-----+------------+
> |   |      | ``mask`` | N/A | VF         |
> |   |      |          |     | (optional) |
> +---+------+----------+-----+------------+
> 
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
  2016-07-07  7:14 ` Lu, Wenzhuo
  2016-07-07 23:15 ` Chandran, Sugesh
@ 2016-07-08 11:11 ` Liang, Cunming
  2016-07-08 12:38   ` Bruce Richardson
  2016-07-08 13:25   ` Adrien Mazarguil
  2016-07-11 10:41 ` Jerin Jacob
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 262+ messages in thread
From: Liang, Cunming @ 2016-07-08 11:11 UTC (permalink / raw)
  To: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, Pablo de Lara,
	Olga Shern

Hi Adrien,

On 7/6/2016 2:16 AM, Adrien Mazarguil wrote:
> Hi All,
>
> First, forgive me for this large message, I know our mailboxes already
> suffer quite a bit from the amount of traffic on this ML.
>
> This is not exactly yet another thread about how flow director should be
> extended, rather about a brand new API to handle filtering and
> classification for incoming packets in the most PMD-generic and
> application-friendly fashion we can come up with. Reasons described below.
>
> I think this topic is important enough to include both the users of this API
> as well as PMD maintainers. So far I have CC'ed librte_ether (especially
> rte_eth_ctrl.h contributors), testpmd and PMD maintainers (with and without
> a .filter_ctrl implementation), but if you know application maintainers
> other than testpmd who use FDIR or might be interested in this discussion,
> feel free to add them.
>
> The issues we found with the current approach are already summarized in the
> following document, but here is a quick summary for TL;DR folks:
>
> - PMDs do not expose a common set of filter types and even when they do,
>    their behavior more or less differs.
>
> - Applications need to determine and adapt to device-specific limitations
>    and quirks on their own, without help from PMDs.
>
> - Writing an application that creates flow rules targeting all devices
>    supported by DPDK is thus difficult, if not impossible.
>
> - The current API has too many unspecified areas (particularly regarding
>    side effects of flow rules) that make PMD implementation tricky.
>
> This RFC API handles everything currently supported by .filter_ctrl, the
> idea being to reimplement all of these to make them fully usable by
> applications in a more generic and well defined fashion. It has a very small
> set of mandatory features and an easy method to let applications probe for
> supported capabilities.
>
> The only downside is more work for the software control side of PMDs because
> they have to adapt to the API instead of the reverse. I think helpers can be
> added to EAL to assist with this.
>
> HTML version:
>
>   https://rawgit.com/6WIND/rte_flow/master/rte_flow.html
>
> PDF version:
>
>   https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf
>
> Related draft header file (for reference while reading the specification):
>
>   https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h
>
> Git tree for completeness (latest .rst version can be retrieved from here):
>
>   https://github.com/6WIND/rte_flow
>
> What follows is the ReST source of the above, for inline comments and
> discussion. I intend to update that specification accordingly.
>
> ========================
> Generic filter interface
> ========================
>
> .. footer::
>
>     v0.6
>
> .. contents::
> .. sectnum::
> .. raw:: pdf
>
>     PageBreak
>
> Overview
> ========
>
> DPDK provides several competing interfaces added over time to perform packet
> matching and related actions such as filtering and classification.
>
> They must be extended to implement the features supported by newer devices
> in order to expose them to applications, however the current design has
> several drawbacks:
>
> - Complicated filter combinations which have not been hard-coded cannot be
>    expressed.
> - Prone to API/ABI breakage when new features must be added to an existing
>    filter type, which frequently happens.
>
> From an application point of view:
>
> - Having disparate interfaces, all optional and lacking in features does not
>    make this API easy to use.
> - Seemingly arbitrary built-in limitations of filter types based on the
>    device they were initially designed for.
> - Undefined relationship between different filter types.
> - High complexity, considerable undocumented and/or undefined behavior.
>
> Considering the growing number of devices supported by DPDK, adding a new
> filter type each time a new feature must be implemented is not sustainable
> in the long term. Applications not written to target a specific device
> cannot really benefit from such an API.
>
> For these reasons, this document defines an extensible unified API that
> encompasses and supersedes these legacy filter types.
>
> .. raw:: pdf
>
>     PageBreak
>
> Current API
> ===========
>
> Rationale
> ---------
>
> The reason several competing (and mostly overlapping) filtering APIs are
> present in DPDK is due to its nature as a thin layer between hardware and
> software.
>
> Each subsequent interface has been added to better match the capabilities
> and limitations of the latest supported device, which usually happened to
> need an incompatible configuration approach. Because of this, many ended up
> device-centric and not usable by applications that were not written for that
> particular device.
>
> This document is not the first attempt to address this proliferation issue,
> in fact a lot of work has already been done both to create a more generic
> interface while somewhat keeping compatibility with legacy ones through a
> common call interface (``rte_eth_dev_filter_ctrl()`` with the
> ``.filter_ctrl`` PMD callback in ``rte_ethdev.h``).
>
> Today, these previously incompatible interfaces are known as filter types
> (``RTE_ETH_FILTER_*`` from ``enum rte_filter_type`` in ``rte_eth_ctrl.h``).
>
> However while trivial to extend with new types, it only shifted the
> underlying problem as applications still need to be written for one kind of
> filter type, which, as described in the following sections, is not
> necessarily implemented by all PMDs that support filtering.
>
> .. raw:: pdf
>
>     PageBreak
>
> Filter types
> ------------
>
> This section summarizes the capabilities of each filter type.
>
> Although the following list is exhaustive, the description of individual
> types may contain inaccuracies due to the lack of documentation or usage
> examples.
>
> Note: names are prefixed with ``RTE_ETH_FILTER_``.
>
> ``MACVLAN``
> ~~~~~~~~~~~
>
> Matching:
>
> - L2 source/destination addresses.
> - Optional 802.1Q VLAN ID.
> - Masking individual fields on a rule basis is not supported.
>
> Action:
>
> - Packets are redirected either to a given VF device using its ID or to the
>    PF.
>
> ``ETHERTYPE``
> ~~~~~~~~~~~~~
>
> Matching:
>
> - L2 source/destination addresses (optional).
> - Ethertype (no VLAN ID?).
> - Masking individual fields on a rule basis is not supported.
>
> Action:
>
> - Receive packets on a given queue.
> - Drop packets.
>
> ``FLEXIBLE``
> ~~~~~~~~~~~~
>
> Matching:
>
> - At most 128 consecutive bytes anywhere in packets.
> - Masking is supported with byte granularity.
> - Priorities are supported (relative to this filter type, undefined
>    otherwise).
>
> Action:
>
> - Receive packets on a given queue.
>
> ``SYN``
> ~~~~~~~
>
> Matching:
>
> - TCP SYN packets only.
> - One high priority bit can be set to give the highest possible priority to
>    this type when other filters with different types are configured.
>
> Action:
>
> - Receive packets on a given queue.
>
> ``NTUPLE``
> ~~~~~~~~~~
>
> Matching:
>
> - Source/destination IPv4 addresses (optional in 2-tuple mode).
> - Source/destination TCP/UDP port (mandatory in 2 and 5-tuple modes).
> - L4 protocol (2 and 5-tuple modes).
> - Masking individual fields is supported.
> - TCP flags.
> - Up to 7 levels of priority relative to this filter type, undefined
>    otherwise.
> - No IPv6.
>
> Action:
>
> - Receive packets on a given queue.
>
> ``TUNNEL``
> ~~~~~~~~~~
>
> Matching:
>
> - Outer L2 source/destination addresses.
> - Inner L2 source/destination addresses.
> - Inner VLAN ID.
> - IPv4/IPv6 source (destination?) address.
> - Tunnel type to match (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE, 802.1BR
>    E-Tag).
> - Tenant ID for tunneling protocols that have one.
> - Any combination of the above can be specified.
> - Masking individual fields on a rule basis is not supported.
>
> Action:
>
> - Receive packets on a given queue.
>
> .. raw:: pdf
>
>     PageBreak
>
> ``FDIR``
> ~~~~~~~~
>
> Queries:
>
> - Device capabilities and limitations.
> - Device statistics about configured filters (resource usage, collisions).
> - Device configuration (matching input set and masks)
>
> Matching:
>
> - Device mode of operation: none (to disable filtering), signature
>    (hash-based dispatching from masked fields) or perfect (either MAC VLAN or
>    tunnel).
> - L2 Ethertype.
> - Outer L2 destination address (MAC VLAN mode).
> - Inner L2 destination address, tunnel type (NVGRE, VXLAN) and tunnel ID
>    (tunnel mode).
> - IPv4 source/destination addresses, ToS, TTL and protocol fields.
> - IPv6 source/destination addresses, TC, protocol and hop limits fields.
> - UDP source/destination IPv4/IPv6 and ports.
> - TCP source/destination IPv4/IPv6 and ports.
> - SCTP source/destination IPv4/IPv6, ports and verification tag field.
> - Note, only one protocol type at once (either only L2 Ethertype, basic
>    IPv6, IPv4+UDP, IPv4+TCP and so on).
> - VLAN TCI (extended API).
> - At most 16 bytes to match in payload (extended API). A global device
>    look-up table specifies for each possible protocol layer (unknown, raw,
>    L2, L3, L4) the offset to use for each byte (they do not need to be
>    contiguous) and the related bitmask.
> - Whether packet is addressed to PF or VF, in that case its ID can be
>    matched as well (extended API).
> - Masking most of the above fields is supported, but simultaneously affects
>    all filters configured on a device.
> - Input set can be modified in a similar fashion for a given device to
>    ignore individual fields of filters (e.g. do not match the destination
>    address in an IPv4 filter, refer to **RTE_ETH_INPUT_SET_**
>    macros). Configuring this also affects RSS processing on **i40e**.
> - Filters can also provide 32 bits of arbitrary data to return as part of
>    matched packets.
>
> Action:
>
> - **RTE_ETH_FDIR_ACCEPT**: receive (accept) packet on a given queue.
> - **RTE_ETH_FDIR_REJECT**: drop packet immediately.
> - **RTE_ETH_FDIR_PASSTHRU**: similar to accept for the last filter in list,
>    otherwise process it with subsequent filters.
> - For accepted packets and if requested by filter, either 32 bits of
>    arbitrary data and four bytes of matched payload (only in case of flex
>    bytes matching), or eight bytes of matched payload (flex also) are added
>    to meta data.
>
> .. raw:: pdf
>
>     PageBreak
>
> ``HASH``
> ~~~~~~~~
>
> Not an actual filter type. Provides and retrieves the global device
> configuration (per port or entire NIC) for hash functions and their
> properties.
>
> Hash function selection: "default" (keep current), XOR or Toeplitz.
>
> This function can be configured per flow type (**RTE_ETH_FLOW_**
> definitions), supported types are:
>
> - Unknown.
> - Raw.
> - Fragmented or non-fragmented IPv4.
> - Non-fragmented IPv4 with L4 (TCP, UDP, SCTP or other).
> - Fragmented or non-fragmented IPv6.
> - Non-fragmented IPv6 with L4 (TCP, UDP, SCTP or other).
> - L2 payload.
> - IPv6 with extensions.
> - IPv6 with L4 (TCP, UDP) and extensions.
>
> ``L2_TUNNEL``
> ~~~~~~~~~~~~~
>
> Matching:
>
> - All packets received on a given port.
>
> Action:
>
> - Add tunnel encapsulation (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE,
>    802.1BR E-Tag) using the provided Ethertype and tunnel ID (only E-Tag
>    is implemented at the moment).
> - VF ID to use for tag insertion (currently unused).
> - Destination pool for tag based forwarding (pools are IDs that can be
>    assigned to ports, duplication occurs if the same ID is shared by several
>    ports of the same NIC).
>
> .. raw:: pdf
>
>     PageBreak
>
> Driver support
> --------------
>
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
> Driver   MACVLAN ETHERTYPE FLEXIBLE SYN NTUPLE TUNNEL FDIR HASH L2_TUNNEL
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
> bnx2x
> cxgbe
> e1000            yes       yes      yes yes
> ena
> enic                                                  yes
> fm10k
> i40e     yes     yes                           yes    yes  yes
> ixgbe            yes                yes yes           yes       yes
> mlx4
> mlx5                                                  yes
> szedata2
> ======== ======= ========= ======== === ====== ====== ==== ==== =========
>
> Flow director
> -------------
>
> Flow director (FDIR) is the name of the most capable filter type, which
> covers most features offered by others. As such, it is the most widespread
> in PMDs that support filtering (i.e. all of them besides **e1000**).
>
> It is also the only type that allows an arbitrary 32 bits value provided by
> applications to be attached to a filter and returned with matching packets
> instead of relying on the destination queue to recognize flows.
>
> Unfortunately, even FDIR requires applications to be aware of low-level
> capabilities and limitations (most of which come directly from **ixgbe** and
> **i40e**):
>
> - Bitmasks are set globally per device (port?), not per filter.
> - Configuration state is not expected to be saved by the driver, and
>    stopping/restarting a port requires the application to perform it again
>    (API documentation is also unclear about this).
> - Monolithic approach with ABI issues as soon as a new kind of flow or
>    combination needs to be supported.
> - Cryptic global statistics/counters.
> - Unclear about how priorities are managed; filters seem to be arranged as a
>    linked list in hardware (possibly related to configuration order).
>
> Packet alteration
> -----------------
>
> One interesting feature is that the L2 tunnel filter type implements the
> ability to alter incoming packets through a filter (in this case to
> encapsulate them), thus the **mlx5** flow encap/decap features are not a
> foreign concept.
>
> .. raw:: pdf
>
>     PageBreak
>
> Proposed API
> ============
>
> Terminology
> -----------
>
> - **Filtering API**: overall framework affecting the fate of selected
>    packets, covers everything described in this document.
> - **Matching pattern**: properties to look for in received packets, a
>    combination of any number of items.
> - **Pattern item**: part of a pattern that either matches packet data
>    (protocol header, payload or derived information), or specifies properties
>    of the pattern itself.
> - **Actions**: what needs to be done when a packet matches a pattern.
> - **Flow rule**: this is the result of combining a *matching pattern* with
>    *actions*.
> - **Filter rule**: a less generic term than *flow rule*, can otherwise be
>    used interchangeably.
> - **Hit**: a flow rule is said to be *hit* when processing a matching
>    packet.
>
> Requirements
> ------------
>
> As described in the previous section, there is a growing need for a common
> method to configure filtering and related actions in a hardware independent
> fashion.
>
> The filtering API should not disallow any filter combination by design and
> must remain as simple as possible to use. It can simply be defined as a
> method to perform one or several actions on selected packets.
>
> PMDs are aware of the capabilities of the device they manage and should be
> responsible for preventing unsupported or conflicting combinations.
>
> This approach is fundamentally different as it places most of the burden on
> the software side of the PMD instead of having device capabilities directly
> mapped to API functions, then expecting applications to work around ensuing
> compatibility issues.
>
> Requirements for a new API:
>
> - Flexible and extensible without causing API/ABI problems for existing
>    applications.
> - Should be unambiguous and easy to use.
> - Support existing filtering features and actions listed in `Filter types`_.
> - Support packet alteration.
> - In case of overlapping filters, their priority should be well documented.
> - Support filter queries (for example to retrieve counters).
>
> .. raw:: pdf
>
>     PageBreak
>
> High level design
> -----------------
>
> The chosen approach to make filtering as generic as possible is by
> expressing matching patterns through lists of items instead of the flat
> structures used in DPDK today, enabling combinations that are not predefined
> and thus being more versatile.
>
> Flow rules can have several distinct actions (such as counting,
> encapsulating, decapsulating before redirecting packets to a particular
> queue, etc.), instead of relying on several rules to achieve this and having
> applications deal with hardware implementation details regarding their
> order.
>
> Support for different priority levels on a rule basis is provided, for
> example in order to force a more specific rule to come before a more generic
> one for packets matched by both. However, hardware support for more than a
> single priority level cannot be guaranteed. When supported, the number of
> available priority levels is usually low, which is why they can also be
> implemented in software by PMDs (e.g. to simulate missing priority levels by
> reordering rules).
>
> In order to remain as hardware agnostic as possible, by default all rules
> are considered to have the same priority, which means that the order between
> overlapping rules (when a packet is matched by several filters) is
> undefined; packet duplication may even occur as a result.
>
> PMDs may refuse to create overlapping rules at a given priority level when
> they can be detected (e.g. if a pattern matches an existing filter).
>
> Thus predictable results for a given priority level can only be achieved
> with non-overlapping rules, using perfect matching on all protocol layers.
>
> Support for multiple actions per rule may be implemented internally on top
> of non-default hardware priorities, as a result both features may not be
> simultaneously available to applications.
>
> Considering that allowed pattern/actions combinations cannot be known in
> advance and would result in an impractically large number of capabilities to
> expose, a method is provided to validate a given rule from the current
> device configuration state without actually adding it (akin to a "dry run"
> mode).
>
> This enables applications to check if the rule types they need are supported
> at initialization time, before starting their data path. This method can be
> used anytime, its only requirement being that the resources needed by a rule
> must exist (e.g. a target RX queue must be configured first).
>
> Each defined rule is associated with an opaque handle managed by the PMD;
> applications are responsible for keeping it. Handles can be used for queries
> and rules management, such as retrieving counters or other data and
> destroying rules.
>
> Handles must be destroyed before releasing associated resources such as
> queues.
>
> Integration
> -----------
>
> To avoid ABI breakage, this new interface will be implemented through the
> existing filtering control framework (``rte_eth_dev_filter_ctrl()``) using
> **RTE_ETH_FILTER_GENERIC** as a new filter type.
>
> However a public front-end API described in `Rules management`_ will
> be added as the preferred method to use it.
>
> Once discussions with the community have converged to a definite API, legacy
> filter types should be deprecated and a deadline defined to remove their
> support entirely.
>
> PMDs will have to be gradually converted to **RTE_ETH_FILTER_GENERIC** or
> drop filtering support entirely. Less maintained PMDs for older hardware may
> lose support at this point.
>
> The notion of filter type will then be deprecated and subsequently dropped
> to avoid confusion between both frameworks.
>
> Implementation details
> ======================
>
> Flow rule
> ---------
>
> A flow rule is the combination of a matching pattern with a list of actions,
> and is the basis of this API.
>
> Priorities
> ~~~~~~~~~~
>
> A priority can be assigned to a matching pattern.
>
> The default priority level is 0 and is also the highest. Support for more
> than a single priority level in hardware is not guaranteed.
>
> If a packet is matched by several filters at a given priority level, the
> outcome is undefined. It can take any path and can even be duplicated.
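To illustrate the priority semantics above (level 0 is the highest), here is a
minimal sketch of how a PMD could order rules in software when hardware lacks
priority levels. ``struct rule``, ``rule_cmp()`` and ``order_rules()`` are
hypothetical names that are not part of the proposal. The relative order of
rules sharing a priority level is left unspecified here, which mirrors the
undefined outcome described above.

```c
#include <stdlib.h>

/* Hypothetical rule descriptor, for illustration only. */
struct rule {
    unsigned int priority; /* 0 is the highest priority level */
    int id;                /* application-side identifier */
};

/* qsort() comparator: lower priority value sorts first. */
static int
rule_cmp(const void *a, const void *b)
{
    const struct rule *ra = a;
    const struct rule *rb = b;

    return (ra->priority > rb->priority) - (ra->priority < rb->priority);
}

/* Reorder rules so that higher priority ones are programmed first. */
static void
order_rules(struct rule *rules, size_t count)
{
    qsort(rules, count, sizeof(*rules), rule_cmp);
}
```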
>
> Matching pattern
> ~~~~~~~~~~~~~~~~
>
> A matching pattern comprises any number of items of various types.
>
> Items are arranged in a list to form a matching pattern for packets. They
> fall in two categories:
>
> - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, VXLAN and so
>    on), usually associated with a specification structure. These must be
>    stacked in the same order as the protocol layers to match, starting from
>    L2.
>
> - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF,
>    SIGNATURE and so on), often without a specification structure. Since they
>    are meta data that does not match packet contents, these can be specified
>    anywhere within item lists without affecting the protocol matching items.
>
> Most item specifications can be optionally paired with a mask to narrow the
> specific fields or bits to be matched.
>
> - Items are defined with ``struct rte_flow_item``.
> - Patterns are defined with ``struct rte_flow_pattern``.
>
> Example of an item specification matching an Ethernet header:
>
> +-----------------------------------------+
> | Ethernet                                |
> +==========+=========+====================+
> | ``spec`` | ``src`` | ``00:01:02:03:04`` |
> |          +---------+--------------------+
> |          | ``dst`` | ``00:2a:66:00:01`` |
> +----------+---------+--------------------+
> | ``mask`` | ``src`` | ``00:ff:ff:ff:00`` |
> |          +---------+--------------------+
> |          | ``dst`` | ``00:00:00:00:ff`` |
> +----------+---------+--------------------+
>
> Non-masked bits stand for any value; Ethernet headers with the following
> properties are thus matched:
>
> - ``src``: ``??:01:02:03:??``
> - ``dst``: ``??:??:??:??:01``
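The spec/mask semantics above can be sketched as a byte-wise masked
comparison. ``masked_equal()`` is a hypothetical helper, not a function of
the proposed API; it takes an explicit length so it works with the address
values exactly as written in the table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when every bit selected by mask has
 * the same value in the packet field and in spec. Bits cleared in mask
 * stand for any value.
 */
static bool
masked_equal(const uint8_t *field, const uint8_t *spec,
             const uint8_t *mask, size_t len)
{
    size_t i;

    for (i = 0; i != len; ++i)
        if ((field[i] ^ spec[i]) & mask[i])
            return false;
    return true;
}
```

With the ``src`` values from the table, a field of ``aa:01:02:03:bb`` matches
since only the three middle bytes are compared.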
>
> Except for meta types that do not need one, ``spec`` must be a valid pointer
> to a structure of the related item type. A ``mask`` of the same type can be
> provided to tell which bits in ``spec`` are to be matched.
>
> A mask is normally only needed for ``spec`` fields matching packet data,
> ignored otherwise. See individual item types for more information.
>
> A ``NULL`` mask pointer is allowed and is similar to matching with a full
> mask (all ones) on the ``spec`` fields supported by hardware. The remaining
> fields are ignored (all zeroes), so there is no error checking for
> unsupported fields.
>
> Matching pattern items for packet data must be naturally stacked (ordered
> from lowest to highest protocol layer), as in the following examples:
>
> +--------------+
> | TCPv4 as L4  |
> +===+==========+
> | 0 | Ethernet |
> +---+----------+
> | 1 | IPv4     |
> +---+----------+
> | 2 | TCP      |
> +---+----------+
>
> +----------------+
> | TCPv6 in VXLAN |
> +===+============+
> | 0 | Ethernet   |
> +---+------------+
> | 1 | IPv4       |
> +---+------------+
> | 2 | UDP        |
> +---+------------+
> | 3 | VXLAN      |
> +---+------------+
> | 4 | Ethernet   |
> +---+------------+
> | 5 | IPv6       |
> +---+------------+
> | 6 | TCP        |
> +---+------------+
>
> +-----------------------------+
> | TCPv4 as L4 with meta items |
> +===+=========================+
> | 0 | VOID                    |
> +---+-------------------------+
> | 1 | Ethernet                |
> +---+-------------------------+
> | 2 | VOID                    |
> +---+-------------------------+
> | 3 | IPv4                    |
> +---+-------------------------+
> | 4 | TCP                     |
> +---+-------------------------+
> | 5 | VOID                    |
> +---+-------------------------+
> | 6 | VOID                    |
> +---+-------------------------+
>
> The above example shows how meta items do not affect packet data matching
> items, as long as those remain stacked properly. The resulting matching
> pattern is identical to "TCPv4 as L4".
>
> +----------------+
> | UDPv6 anywhere |
> +===+============+
> | 0 | IPv6       |
> +---+------------+
> | 1 | UDP        |
> +---+------------+
>
> If supported by the PMD, omitting one or several protocol layers at the
> bottom of the stack as in the above example (missing an Ethernet
> specification) enables hardware to look anywhere in packets.
>
> It is unspecified whether the payload of supported encapsulations
> (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> inner, outer or both packets.
>
> +---------------------+
> | Invalid, missing L3 |
> +===+=================+
> | 0 | Ethernet        |
> +---+-----------------+
> | 1 | UDP             |
> +---+-----------------+
>
> The above pattern is invalid due to a missing L3 specification between L2
> and L4. Omitting layers is only allowed at the bottom and at the top of the
> stack.
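The stacking rule illustrated by the examples above could be checked along
these lines. This is only a sketch under simplifying assumptions: the
``ITEM_*`` names and their values are hypothetical (except END being 0),
tunnels such as VXLAN that restart the stack are not handled, and VOID is
the only meta item considered.

```c
#include <stdbool.h>

/* Hypothetical item types, for illustration only. */
enum item_type {
    ITEM_END = 0,
    ITEM_VOID,
    ITEM_ETH,
    ITEM_IPV4,
    ITEM_IPV6,
    ITEM_UDP,
    ITEM_TCP,
};

/* Protocol layer of a data matching item, 0 for meta items. */
static int
item_layer(enum item_type type)
{
    switch (type) {
    case ITEM_ETH:
        return 2;
    case ITEM_IPV4:
    case ITEM_IPV6:
        return 3;
    case ITEM_UDP:
    case ITEM_TCP:
        return 4;
    default:
        return 0;
    }
}

/*
 * Returns true when data matching items are stacked without a gap.
 * Layers may be missing at the bottom or at the top of the stack only.
 */
static bool
pattern_is_stacked(const enum item_type *items)
{
    int prev = 0;

    for (; *items != ITEM_END; ++items) {
        int layer = item_layer(*items);

        if (layer == 0)
            continue; /* meta items do not affect stacking */
        if (prev != 0 && layer != prev + 1)
            return false;
        prev = layer;
    }
    return true;
}
```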
>
> Meta item types
> ~~~~~~~~~~~~~~~
>
> These do not match packet data but affect how the pattern is processed, most
> of them do not need a specification structure. This particularity allows
> them to be specified anywhere without affecting other item types.
[LC] For the meta items (END, VOID, INVERT) and some data matching types
like ANY and RAW, is the PMD entirely responsible for understanding their
key characteristics and for parsing the header graph?
>
> ``END``
> ^^^^^^^
>
> End marker for item lists. Prevents further processing of items, thereby
> ending the pattern.
>
> - Its numeric value is **0** for convenience.
> - PMD support is mandatory.
> - Both ``spec`` and ``mask`` are ignored.
>
> +--------------------+
> | END                |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
>
> ``VOID``
> ^^^^^^^^
>
> Used as a placeholder for convenience. It is ignored and simply discarded by
> PMDs.
>
> - PMD support is mandatory.
> - Both ``spec`` and ``mask`` are ignored.
>
> +--------------------+
> | VOID               |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
>
> One usage example for this type is generating rules that share a common
> prefix quickly without reallocating memory, only by updating item types:
>
> +------------------------+
> | TCP, UDP or ICMP as L4 |
> +===+====================+
> | 0 | Ethernet           |
> +---+--------------------+
> | 1 | IPv4               |
> +---+------+------+------+
> | 2 | UDP  | VOID | VOID |
> +---+------+------+------+
> | 3 | VOID | TCP  | VOID |
> +---+------+------+------+
> | 4 | VOID | VOID | ICMP |
> +---+------+------+------+
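A possible way to use VOID as in the table above is to keep one preallocated
item array and only rewrite the types of the L4 slots. ``struct item`` and
``select_l4()`` are hypothetical and only carry the item type; the actual
item structure is expected to hold ``spec`` and ``mask`` pointers as well.

```c
/* Hypothetical item types, for illustration only. */
enum item_type {
    ITEM_END = 0,
    ITEM_VOID,
    ITEM_ETH,
    ITEM_IPV4,
    ITEM_UDP,
    ITEM_TCP,
    ITEM_ICMP,
};

struct item {
    enum item_type type;
};

/*
 * pattern[] is assumed to hold { ETH, IPV4, slot, slot, slot, END }.
 * Slots 2, 3 and 4 are reserved for UDP, TCP and ICMP respectively;
 * the ones not selected become VOID, no memory is reallocated.
 */
static void
select_l4(struct item *pattern, enum item_type l4)
{
    pattern[2].type = (l4 == ITEM_UDP) ? ITEM_UDP : ITEM_VOID;
    pattern[3].type = (l4 == ITEM_TCP) ? ITEM_TCP : ITEM_VOID;
    pattern[4].type = (l4 == ITEM_ICMP) ? ITEM_ICMP : ITEM_VOID;
}
```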
>
> .. raw:: pdf
>
>     PageBreak
>
> ``INVERT``
> ^^^^^^^^^^
>
> Inverted matching, i.e. process packets that do not match the pattern.
>
> - Both ``spec`` and ``mask`` are ignored.
>
> +--------------------+
> | INVERT             |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
>
> Usage example in order to match non-TCPv4 packets only:
>
> +--------------------+
> | Anything but TCPv4 |
> +===+================+
> | 0 | INVERT         |
> +---+----------------+
> | 1 | Ethernet       |
> +---+----------------+
> | 2 | IPv4           |
> +---+----------------+
> | 3 | TCP            |
> +---+----------------+
>
> ``PF``
> ^^^^^^
>
> Matches packets addressed to the physical function of the device.
>
> - Both ``spec`` and ``mask`` are ignored.
>
> +--------------------+
> | PF                 |
> +==========+=========+
> | ``spec`` | ignored |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
>
> ``VF``
> ^^^^^^
>
> Matches packets addressed to the given virtual function ID of the device.
>
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
>
> +----------------------------------------+
> | VF                                     |
> +==========+=========+===================+
> | ``spec`` | ``vf``  | destination VF ID |
> +----------+---------+-------------------+
> | ``mask`` | ignored                     |
> +----------+-----------------------------+
>
> ``SIGNATURE``
> ^^^^^^^^^^^^^
>
> Requests hash-based signature dispatching for this rule.
>
> Considering this is a global setting on devices that support it, all
> subsequent filter rules may have to be created with it as well.
>
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
>
> +--------------------+
> | SIGNATURE          |
> +==========+=========+
> | ``spec`` | TBD     |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
>
> .. raw:: pdf
>
>     PageBreak
>
> Data matching item types
> ~~~~~~~~~~~~~~~~~~~~~~~~
>
> Most of these are basically protocol header definitions with associated
> bitmasks. They must be specified (stacked) from lowest to highest protocol
> layer.
>
> The following list is not exhaustive as new protocols will be added in the
> future.
>
> ``ANY``
> ^^^^^^^
>
> Matches any protocol in place of the current layer, a single ANY may also
> stand for several protocol layers.
>
> This is usually specified as the first pattern item when looking for a
> protocol anywhere in a packet.
>
> - A maximum value of **0** requests matching any number of protocol layers
>    above or equal to the minimum value, a maximum value lower than the
>    minimum one is otherwise invalid.
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
>
> +-----------------------------------------------------------------------+
> | ANY                                                                   |
> +==========+=========+==================================================+
> | ``spec`` | ``min`` | minimum number of layers covered                 |
> |          +---------+--------------------------------------------------+
> |          | ``max`` | maximum number of layers covered, 0 for infinity |
> +----------+---------+--------------------------------------------------+
> | ``mask`` | ignored                                                    |
> +----------+------------------------------------------------------------+
>
> Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
> and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
> or IPv6) matched by the second ANY specification:
>
> +----------------------------------+
> | TCP in VXLAN with wildcards      |
> +===+==============================+
> | 0 | Ethernet                     |
> +---+-----+----------+---------+---+
> | 1 | ANY | ``spec`` | ``min`` | 2 |
> |   |     |          +---------+---+
> |   |     |          | ``max`` | 2 |
> +---+-----+----------+---------+---+
> | 2 | VXLAN                        |
> +---+------------------------------+
> | 3 | Ethernet                     |
> +---+-----+----------+---------+---+
> | 4 | ANY | ``spec`` | ``min`` | 1 |
> |   |     |          +---------+---+
> |   |     |          | ``max`` | 1 |
> +---+-----+----------+---------+---+
> | 5 | TCP                          |
> +---+------------------------------+
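The ``min``/``max`` semantics of ANY described above can be sketched as
follows. The layout of ``struct any_spec`` and both helper names are
assumptions made for this example, the RFC only names the two fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for the ANY item specification. */
struct any_spec {
    uint32_t min; /* minimum number of layers covered */
    uint32_t max; /* maximum number of layers covered, 0 for infinity */
};

/* A non-zero maximum lower than the minimum is invalid. */
static bool
any_spec_is_valid(const struct any_spec *spec)
{
    return spec->max == 0 || spec->max >= spec->min;
}

/* Tells whether ANY may stand for the given number of protocol layers. */
static bool
any_spec_covers(const struct any_spec *spec, uint32_t layers)
{
    if (layers < spec->min)
        return false;
    return spec->max == 0 || layers <= spec->max;
}
```

In the example table, the first ANY item (``min`` and ``max`` both 2) covers
exactly the outer L3 and L4 layers.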
>
> .. raw:: pdf
>
>     PageBreak
>
> ``RAW``
> ^^^^^^^
>
> Matches a string of a given length at a given offset (in bytes), or anywhere
> in the payload of the current protocol layer (including L2 header if used as
> the first item in the stack).
>
> This does not increment the protocol layer count as it is not a protocol
> definition. Subsequent RAW items modulate the first absolute one with
> relative offsets.
>
> - Using **-1** as the ``offset`` of the first RAW item makes its absolute
>    offset not fixed, i.e. the pattern is searched everywhere.
> - ``mask`` only affects the pattern.
The RAW matching type allows an offset and a length, which support anchor
setting and string matching.
It is not defined for a user-defined packet layout. Sometimes, comparing
raw payload data after a header requires {offset, length}. One typical case
is 5-tuple matching: the 'PORT' of the transport layer is at an offset
relative to the IP header.
It can't be addressed by IP/ANY, as it requires extracting the key from the
field in ANY.

>
> +--------------------------------------------------------------+
> | RAW                                                          |
> +==========+=============+=====================================+
> | ``spec`` | ``offset``  | absolute or relative pattern offset |
> |          +-------------+-------------------------------------+
> |          | ``length``  | pattern length                      |
> |          +-------------+-------------------------------------+
> |          | ``pattern`` | byte string of the above length     |
> +----------+-------------+-------------------------------------+
> | ``mask`` | ``offset``  | ignored                             |
> |          +-------------+-------------------------------------+
> |          | ``length``  | ignored                             |
> |          +-------------+-------------------------------------+
> |          | ``pattern`` | bitmask with the same byte length   |
> +----------+-------------+-------------------------------------+
>
> Example pattern looking for several strings at various offsets of a UDP
> payload, using combined RAW items:
>
> +------------------------------------------+
> | UDP payload matching                     |
> +===+======================================+
> | 0 | Ethernet                             |
> +---+--------------------------------------+
> | 1 | IPv4                                 |
> +---+--------------------------------------+
> | 2 | UDP                                  |
> +---+-----+----------+-------------+-------+
> | 3 | RAW | ``spec`` | ``offset``  | -1    |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "foo" |
> +---+-----+----------+-------------+-------+
> | 4 | RAW | ``spec`` | ``offset``  | 20    |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "bar" |
> +---+-----+----------+-------------+-------+
> | 5 | RAW | ``spec`` | ``offset``  | -30   |
> |   |     |          +-------------+-------+
> |   |     |          | ``length``  | 3     |
> |   |     |          +-------------+-------+
> |   |     |          | ``pattern`` | "baz" |
> +---+-----+----------+-------------+-------+
>
> This translates to:
>
> - Locate "foo" in UDP payload, remember its offset.
> - Check "bar" at "foo"'s offset plus 20 bytes.
> - Check "baz" at "foo"'s offset minus 30 bytes.
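For reference, the three steps above can be emulated in software roughly as
follows. This is a sketch and not how a PMD or device is expected to
implement RAW items; both function names are hypothetical. It also assumes
that every occurrence of "foo" is tried in turn, which the RFC does not
specify.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bounds-checked comparison of pat at offset off within the payload. */
static bool
check_at(const unsigned char *payload, size_t len, long off,
         const char *pat, size_t pat_len)
{
    if (off < 0 || (size_t)off > len || pat_len > len - (size_t)off)
        return false;
    return memcmp(payload + off, pat, pat_len) == 0;
}

/* Software equivalent of the three combined RAW items. */
static bool
match_foo_bar_baz(const unsigned char *payload, size_t len)
{
    size_t i;

    for (i = 0; i + 3 <= len; ++i) {
        /* First item, offset -1: search "foo" anywhere. */
        if (memcmp(payload + i, "foo", 3) != 0)
            continue;
        /* Next items are relative to the offset of "foo". */
        if (check_at(payload, len, (long)i + 20, "bar", 3) &&
            check_at(payload, len, (long)i - 30, "baz", 3))
            return true;
    }
    return false;
}
```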
>
> .. raw:: pdf
>
>     PageBreak
>
> ``ETH``
> ^^^^^^^
>
> Matches an Ethernet header.
>
> - ``dst``: destination MAC.
> - ``src``: source MAC.
> - ``type``: EtherType.
> - ``tags``: number of 802.1Q/ad tags defined.
> - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
>
>   - ``tpid``: Tag protocol identifier.
>   - ``tci``: Tag control information.
>
> ``IPV4``
> ^^^^^^^^
>
> Matches an IPv4 header.
>
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tos``: ToS/DSCP field.
> - ``ttl``: TTL field.
> - ``proto``: protocol number for the next layer.
>
> ``IPV6``
> ^^^^^^^^
>
> Matches an IPv6 header.
>
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tc``: traffic class field.
> - ``nh``: Next header field (protocol).
> - ``hop_limit``: hop limit field (TTL).
>
> ``ICMP``
> ^^^^^^^^
>
> Matches an ICMP header.
>
> - TBD.
>
> ``UDP``
> ^^^^^^^
>
> Matches a UDP header.
>
> - ``sport``: source port.
> - ``dport``: destination port.
> - ``length``: UDP length.
> - ``checksum``: UDP checksum.
>
> .. raw:: pdf
>
>     PageBreak
>
> ``TCP``
> ^^^^^^^
>
> Matches a TCP header.
>
> - ``sport``: source port.
> - ``dport``: destination port.
> - All other TCP fields and bits.
>
> ``VXLAN``
> ^^^^^^^^^
>
> Matches a VXLAN header.
>
> - TBD.
>
> .. raw:: pdf
>
>     PageBreak
>
> Actions
> ~~~~~~~
>
> Each possible action is represented by a type. Some have associated
> configuration structures. Several actions combined in a list can be assigned
> to a flow rule. That list is not ordered.
>
> At least one action must be defined in a filter rule in order to do
> something with matched packets.
>
> - Actions are defined with ``struct rte_flow_action``.
> - A list of actions is defined with ``struct rte_flow_actions``.
>
> They fall in three categories:
>
> - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
>    processing matched packets by subsequent flow rules, unless overridden
>    with PASSTHRU.
>
> - Non terminating actions (PASSTHRU, DUP) that leave matched packets up for
>    additional processing by subsequent flow rules.
>
> - Other non terminating meta actions that do not affect the fate of packets
>    (END, VOID, ID, COUNT).
>
> When several actions are combined in a flow rule, they should all have
> different types (e.g. dropping a packet twice is not possible). The VOID
> type is an exception to this rule. When another type appears more than
> once, the defined behavior is for PMDs to only take into account the last
> action of that type found in the list. PMDs still perform error checking on
> the entire list.
>
> *Note that PASSTHRU is the only action able to override a terminating rule.*
[LC] I'm wondering how to address the metadata carried by the mbuf, it is
not mentioned here.
For packets that hit one specific flow, there is usually something for the
CPU to identify the flow.
FDIR and RSS, for example, have an ID or key in the mbuf. In addition, some
metadata may be pointed to by userdata in the mbuf.
Any view on it?

>
> .. raw:: pdf
>
>     PageBreak
>
> Example of an action that redirects packets to queue index 10:
>
> +----------------+
> | QUEUE          |
> +===========+====+
> | ``queue`` | 10 |
> +-----------+----+
>
> Action list examples. Their order is not significant; applications must
> consider all actions to be performed simultaneously:
>
> +----------------+
> | Count and drop |
> +=======+========+
> | COUNT |        |
> +-------+--------+
> | DROP  |        |
> +-------+--------+
>
> +--------------------------+
> | Tag, count and redirect  |
> +=======+===========+======+
> | ID    | ``id``    | 0x2a |
> +-------+-----------+------+
> | COUNT |                  |
> +-------+-----------+------+
> | QUEUE | ``queue`` | 10   |
> +-------+-----------+------+
>
> +-----------------------+
> | Redirect to queue 5   |
> +=======+===============+
> | DROP  |               |
> +-------+-----------+---+
> | QUEUE | ``queue`` | 5 |
> +-------+-----------+---+
>
> In the above example, considering both actions are performed simultaneously,
> the end result is that only QUEUE has any effect.
>
> +-----------------------+
> | Redirect to queue 3   |
> +=======+===========+===+
> | QUEUE | ``queue`` | 5 |
> +-------+-----------+---+
> | VOID  |               |
> +-------+-----------+---+
> | QUEUE | ``queue`` | 3 |
> +-------+-----------+---+
>
> As previously described, only the last action of a given type found in the
> list is taken into account. The above example also shows that VOID is
> ignored.
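The "last action of a given type wins" behavior can be sketched like this
for QUEUE. ``struct action`` is a simplified, hypothetical stand-in for the
proposed action structure, and ``effective_queue()`` is not part of the API.

```c
/* Hypothetical action types, for illustration only. */
enum action_type {
    ACTION_END = 0,
    ACTION_VOID,
    ACTION_QUEUE,
    ACTION_DROP,
};

struct action {
    enum action_type type;
    unsigned int queue; /* only meaningful for ACTION_QUEUE */
};

/*
 * Walks the whole list and keeps the last QUEUE action found.
 * Returns 1 and stores the index in *queue, or 0 if there is none.
 * VOID and other types are skipped.
 */
static int
effective_queue(const struct action *list, unsigned int *queue)
{
    int found = 0;

    for (; list->type != ACTION_END; ++list) {
        if (list->type != ACTION_QUEUE)
            continue;
        *queue = list->queue;
        found = 1;
    }
    return found;
}
```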
>
> .. raw:: pdf
>
>     PageBreak
>
> Action types
> ~~~~~~~~~~~~
>
> Common action types are described in this section. Like pattern item types,
> this list is not exhaustive as new actions will be added in the future.
>
> ``END`` (action)
> ^^^^^^^^^^^^^^^^
>
> End marker for action lists. Prevents further processing of actions, thereby
> ending the list.
>
> - Its numeric value is **0** for convenience.
> - PMD support is mandatory.
> - No configurable property.
>
> +---------------+
> | END           |
> +===============+
> | no properties |
> +---------------+
>
> ``VOID`` (action)
> ^^^^^^^^^^^^^^^^^
>
> Used as a placeholder for convenience. It is ignored and simply discarded by
> PMDs.
>
> - PMD support is mandatory.
> - No configurable property.
>
> +---------------+
> | VOID          |
> +===============+
> | no properties |
> +---------------+
>
> ``PASSTHRU``
> ^^^^^^^^^^^^
>
> Leaves packets up for additional processing by subsequent flow rules. This
> is the default when a rule does not contain a terminating action, but can be
> specified to force a rule to become non-terminating.
>
> - No configurable property.
>
> +---------------+
> | PASSTHRU      |
> +===============+
> | no properties |
> +---------------+
>
> Example to copy a packet to a queue and continue processing by subsequent
> flow rules:
>
> +--------------------------+
> | Copy to queue 8          |
> +==========+===============+
> | PASSTHRU |               |
> +----------+-----------+---+
> | QUEUE    | ``queue`` | 8 |
> +----------+-----------+---+
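As a sketch of the rule described in `Actions`_, a flow rule can be
classified as terminating or not from its action list alone. The ``ACTION_*``
names and their values are hypothetical (except END being 0), and
``rule_is_terminating()`` is not part of the proposed API.

```c
#include <stdbool.h>

/* Hypothetical action types, for illustration only. */
enum action_type {
    ACTION_END = 0,
    ACTION_VOID,
    ACTION_PASSTHRU,
    ACTION_ID,
    ACTION_QUEUE,
    ACTION_DROP,
    ACTION_COUNT,
    ACTION_DUP,
    ACTION_RSS,
    ACTION_PF,
    ACTION_VF,
};

/*
 * A rule is terminating when it contains at least one terminating
 * action and no PASSTHRU, which overrides terminating actions.
 */
static bool
rule_is_terminating(const enum action_type *actions)
{
    bool terminating = false;

    for (; *actions != ACTION_END; ++actions) {
        switch (*actions) {
        case ACTION_PASSTHRU:
            return false;
        case ACTION_QUEUE:
        case ACTION_DROP:
        case ACTION_RSS:
        case ACTION_PF:
        case ACTION_VF:
            terminating = true;
            break;
        default:
            break;
        }
    }
    return terminating;
}
```

With the "Copy to queue 8" list above, the function returns false since
PASSTHRU is present.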
>
> ``ID``
> ^^^^^^
>
> Attaches a 32 bit value to packets.
>
> +----------------------------------------------+
> | ID                                           |
> +========+=====================================+
> | ``id`` | 32 bit value to return with packets |
> +--------+-------------------------------------+
>
> .. raw:: pdf
>
>     PageBreak
>
> ``QUEUE``
> ^^^^^^^^^
>
> Assigns packets to a given queue index.
>
> - Terminating by default.
>
> +--------------------------------+
> | QUEUE                          |
> +===========+====================+
> | ``queue`` | queue index to use |
> +-----------+--------------------+
>
> ``DROP``
> ^^^^^^^^
>
> Drop packets.
>
> - No configurable property.
> - Terminating by default.
> - PASSTHRU overrides this action if both are specified.
>
> +---------------+
> | DROP          |
> +===============+
> | no properties |
> +---------------+
>
> ``COUNT``
> ^^^^^^^^^
>
> Enables a hit counter for this rule.
>
> This counter can be retrieved and reset through ``rte_flow_query()``, see
> ``struct rte_flow_query_count``.
>
> - No configurable property.
>
> +---------------+
> | COUNT         |
> +===============+
> | no properties |
> +---------------+
>
> Query structure to retrieve and reset the flow rule hits counter:
>
> +------------------------------------------------+
> | COUNT query                                    |
> +===========+=====+==============================+
> | ``reset`` | in  | reset counter after query    |
> +-----------+-----+------------------------------+
> | ``hits``  | out | number of hits for this flow |
> +-----------+-----+------------------------------+
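
To make the in/out semantics of the COUNT query concrete, here is a minimal sketch of how a software counter could serve such a query. The field types, the ``struct flow_counter`` bookkeeping and the ``flow_counter_query()`` helper are assumptions made for illustration; only the ``reset`` and ``hits`` field names come from the table above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the COUNT query data; the field types are guesses. */
struct flow_query_count {
	uint32_t reset; /* in: reset counter after query */
	uint64_t hits;  /* out: number of hits for this flow */
};

/* Hypothetical software counter attached to a single flow rule. */
struct flow_counter {
	uint64_t hits;
};

/* Copy the current hit count out, then clear it if the caller asked to. */
static int
flow_counter_query(struct flow_counter *counter,
		   struct flow_query_count *query)
{
	assert(counter != NULL && query != NULL);
	query->hits = counter->hits;
	if (query->reset)
		counter->hits = 0;
	return 0;
}
```

An application would fill ``reset`` before the call and read ``hits`` afterwards, which is the direction of each field given in the table.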
>
> ``DUP``
> ^^^^^^^
>
> Duplicates packets to a given queue index.
>
> This is normally combined with QUEUE. When used alone, however, it is
> similar to QUEUE + PASSTHRU.
>
> - Non-terminating by default.
>
> +------------------------------------------------+
> | DUP                                            |
> +===========+====================================+
> | ``queue`` | queue index to duplicate packet to |
> +-----------+------------------------------------+
>
> .. raw:: pdf
>
>     PageBreak
>
> ``RSS``
> ^^^^^^^
>
> Similar to QUEUE, except RSS is additionally performed on packets to spread
> them among several queues according to the provided parameters.
>
> - Terminating by default.
>
> +---------------------------------------------+
> | RSS                                         |
> +==============+==============================+
> | ``rss_conf`` | RSS parameters               |
> +--------------+------------------------------+
> | ``queues``   | number of entries in queue[] |
> +--------------+------------------------------+
> | ``queue[]``  | queue indices to use         |
> +--------------+------------------------------+
>
> ``PF`` (action)
> ^^^^^^^^^^^^^^^
>
> Redirects packets to the physical function (PF) of the current device.
>
> - No configurable property.
> - Terminating by default.
>
> +---------------+
> | PF            |
> +===============+
> | no properties |
> +---------------+
>
> ``VF`` (action)
> ^^^^^^^^^^^^^^^
>
> Redirects packets to the virtual function (VF) of the current device with
> the specified ID.
>
> - Terminating by default.
>
> +---------------------------------------+
> | VF                                    |
> +========+==============================+
> | ``id`` | VF ID to redirect packets to |
> +--------+------------------------------+
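
As an illustration only, the terminating and non-terminating defaults listed for the actions above could be checked by walking an END-terminated action list. The enum values other than END = 0, the ``struct action`` layout and the ``rule_is_terminating()`` helper are assumptions made for this sketch and are not taken from the draft header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action types; only END = 0 is stated by the specification. */
enum action_type {
	ACTION_END = 0,
	ACTION_VOID,
	ACTION_PASSTHRU,
	ACTION_ID,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_COUNT,
	ACTION_DUP,
	ACTION_RSS,
	ACTION_PF,
	ACTION_VF,
};

/* Hypothetical list entry: a type plus an optional configuration pointer. */
struct action {
	enum action_type type;
	const void *conf;
};

/* Return true if the END-terminated list makes its rule terminating. */
static bool
rule_is_terminating(const struct action *list)
{
	bool terminating = false;

	assert(list != NULL);
	for (; list->type != ACTION_END; ++list) {
		switch (list->type) {
		case ACTION_PASSTHRU:
			/* PASSTHRU forces the rule to be non-terminating. */
			return false;
		case ACTION_QUEUE:
		case ACTION_DROP:
		case ACTION_RSS:
		case ACTION_PF:
		case ACTION_VF:
			/* Terminating by default. */
			terminating = true;
			break;
		default:
			/* VOID, ID, COUNT and DUP do not terminate a rule. */
			break;
		}
	}
	return terminating;
}
```

With this helper, the "copy to queue 8" example (PASSTHRU + QUEUE) is reported as non-terminating, while a list containing only QUEUE is terminating.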
>
> Planned types
> ~~~~~~~~~~~~~
>
> Other action types are planned but not defined yet. These actions will add
> the ability to alter matching packets in several ways, such as performing
> encapsulation/decapsulation of tunnel headers on specific flows.
>
> .. raw:: pdf
>
>     PageBreak
>
> Rules management
> ----------------
>
> A simple API with only four functions is provided to fully manage flows.
>
> Each created flow rule is associated with an opaque, PMD-specific handle
> pointer. The application is responsible for keeping it until the rule is
> destroyed.
>
> Flow rules are defined with ``struct rte_flow``.
>
> Validation
> ~~~~~~~~~~
>
> Given that expressing a definite set of device capabilities with this API is
> not practical, a dedicated function is provided to check if a flow rule is
> supported and can be created.
>
> ::
>
>   int
>   rte_flow_validate(uint8_t port_id,
>                     const struct rte_flow_pattern *pattern,
>                     const struct rte_flow_actions *actions);
>
> While this function has no effect on the target device, the flow rule is
> validated against its current configuration state and the returned value
> should be considered valid by the caller for that state only.
>
> The returned value is guaranteed to remain valid only as long as no
> successful calls to ``rte_flow_create()`` or ``rte_flow_destroy()`` are made
> in the meantime and no device parameter affecting flow rules in any way is
> modified, due to possible collisions or resource limitations (although in
> such cases ``EINVAL`` should not be returned).
>
> Arguments:
>
> - ``port_id``: port identifier of Ethernet device.
> - ``pattern``: pattern specification to check.
> - ``actions``: actions associated with the flow definition.
>
> Return value:
>
> - **0** if the flow rule is valid and can be created, a negative errno value
>    otherwise (``rte_errno`` is also set). The following errors are defined:
> - ``-EINVAL``: unknown or invalid rule specification.
> - ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial masks
>    are unsupported).
> - ``-EEXIST``: collision with an existing rule.
> - ``-ENOMEM``: not enough resources.
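
A caller will typically want to turn these return values into something readable. The helper below is a small sketch of that mapping; ``flow_validate_strerror()`` is a hypothetical name, and the strings simply restate the list above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map the value returned by validation (0 or a negative errno) to text. */
static const char *
flow_validate_strerror(int ret)
{
	assert(ret <= 0);
	switch (-ret) {
	case 0:
		return "valid, can be created";
	case EINVAL:
		return "unknown or invalid rule specification";
	case ENOTSUP:
		return "valid but unsupported rule specification";
	case EEXIST:
		return "collision with an existing rule";
	case ENOMEM:
		return "not enough resources";
	default:
		/* Not one of the errors defined by the specification. */
		return "unspecified error";
	}
}
```

Since the result only holds for the current device state, an application would presumably call the validation function again after creating or destroying other rules rather than cache this outcome.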
>
> .. raw:: pdf
>
>     PageBreak
>
> Creation
> ~~~~~~~~
>
> Creating a flow rule is similar to validating one, except the rule is
> actually created.
>
> ::
>
>   struct rte_flow *
>   rte_flow_create(uint8_t port_id,
>                   const struct rte_flow_pattern *pattern,
>                   const struct rte_flow_actions *actions);
>
> Arguments:
>
> - ``port_id``: port identifier of Ethernet device.
> - ``pattern``: pattern specification to add.
> - ``actions``: actions associated with the flow definition.
>
> Return value:
>
> A valid flow pointer in case of success, NULL otherwise and ``rte_errno`` is
> set to the positive version of one of the error codes defined for
> ``rte_flow_validate()``.
>
> Destruction
> ~~~~~~~~~~~
>
> Flow rule destruction is not automatic, and a queue should not be released
> while any flow rule is still attached to it. Applications must take care of
> performing this step before releasing resources.
>
> ::
>
>   int
>   rte_flow_destroy(uint8_t port_id,
>                    struct rte_flow *flow);
>
>
> Failure to destroy a flow rule may occur when other flow rules depend on it,
> and destroying it would result in an inconsistent state.
>
> This function is only guaranteed to succeed if flow rules are destroyed in
> reverse order of their creation.
>
> Arguments:
>
> - ``port_id``: port identifier of Ethernet device.
> - ``flow``: flow rule to destroy.
>
> Return value:
>
> - **0** on success, a negative errno value otherwise and ``rte_errno`` is
>    set.
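
Because destruction is only guaranteed to succeed in reverse order of creation, one simple approach is for the application to keep its handles on a stack. The sketch below assumes a fixed-size stack and uses a stub in place of the PMD; ``struct flow``, ``flow_stack_push()``, ``flow_stack_flush()`` and ``stub_destroy()`` are all hypothetical names, with ``stub_destroy()`` merely mirroring the shape of the destroy function shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RULES 64

/* Stub handle; the real handle is opaque and PMD-specific. */
struct flow {
	int id;
};

/* Callback with the same shape as the destroy function. */
typedef int (*flow_destroy_fn)(uint8_t port_id, struct flow *flow);

/* Handles kept by the application, in creation order. */
struct flow_stack {
	struct flow *handle[MAX_RULES];
	size_t count;
};

/* Stand-in for a PMD: records the order in which handles are destroyed. */
static int destroyed[MAX_RULES];
static size_t destroyed_count;

static int
stub_destroy(uint8_t port_id, struct flow *flow)
{
	(void)port_id;
	if (destroyed_count < MAX_RULES)
		destroyed[destroyed_count++] = flow->id;
	return 0;
}

/* Remember a newly created rule; returns -1 when the stack is full. */
static int
flow_stack_push(struct flow_stack *stack, struct flow *flow)
{
	assert(stack != NULL && flow != NULL);
	if (stack->count == MAX_RULES)
		return -1;
	stack->handle[stack->count++] = flow;
	return 0;
}

/* Destroy all rules, newest first; stop at the first failure. */
static int
flow_stack_flush(struct flow_stack *stack, uint8_t port_id,
		 flow_destroy_fn destroy)
{
	assert(stack != NULL && destroy != NULL);
	while (stack->count > 0) {
		int ret = destroy(port_id, stack->handle[stack->count - 1]);

		if (ret != 0)
			return ret;
		--stack->count;
	}
	return 0;
}
```

Flushing the stack this way before releasing queues follows the ordering guarantee stated above; on failure, the remaining handles stay on the stack so the caller can retry.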
>
> .. raw:: pdf
>
>     PageBreak
>
> Query
> ~~~~~
>
> Query an existing flow rule.
>
> This function allows retrieving flow-specific data such as counters. Data
> is gathered by special actions which must be present in the flow rule
> definition.
>
> ::
>
>   int
>   rte_flow_query(uint8_t port_id,
>                  struct rte_flow *flow,
>                  enum rte_flow_action_type action,
>                  void *data);
>
> Arguments:
>
> - ``port_id``: port identifier of Ethernet device.
> - ``flow``: flow rule to query.
> - ``action``: action type to query.
> - ``data``: pointer to storage for the associated query data type.
>
> Return value:
>
> - **0** on success, a negative errno value otherwise and ``rte_errno`` is
>    set.
>
> .. raw:: pdf
>
>     PageBreak
>
> Behavior
> --------
>
> - API operations are synchronous and blocking (``EAGAIN`` cannot be
>    returned).
>
> - There is no provision for reentrancy/multi-thread safety, although nothing
>    should prevent different devices from being configured at the same
>    time. PMDs may protect their control path functions accordingly.
>
> - Stopping the data path (TX/RX) should not be necessary when managing flow
>    rules. If this cannot be achieved naturally or with workarounds (such as
>    temporarily replacing the burst function pointers), an appropriate error
>    code must be returned (``EBUSY``).
>
> - PMDs, not applications, are responsible for maintaining flow rules
>    configuration when stopping and restarting a port or performing other
>    actions which may affect them. They can only be destroyed explicitly.
>
> .. raw:: pdf
>
>     PageBreak
>
> Compatibility
> -------------
>
> No known hardware implementation supports all the features described in this
> document.
>
> Unsupported features or combinations are not expected to be fully emulated
> in software by PMDs for performance reasons. Partially supported features
> may be completed in software as long as hardware performs most of the work
> (such as queue redirection and packet recognition).
>
> However PMDs are expected to do their best to satisfy application requests
> by working around hardware limitations as long as doing so does not affect
> the behavior of existing flow rules.
>
> The following sections provide a few examples of such cases. They are based
> on limitations built into the previous APIs.
>
> Global bitmasks
> ~~~~~~~~~~~~~~~
>
> Each flow rule comes with its own, per-layer bitmasks, while hardware may
> support only a single, device-wide bitmask for a given layer type, so that
> two IPv4 rules cannot use different bitmasks.
>
> The expected behavior in this case is that PMDs automatically configure
> global bitmasks according to the needs of the first created flow rule.
>
> Subsequent rules are allowed only if their bitmasks match those. Otherwise,
> the ``EEXIST`` error code should be returned.
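
The first-rule-wins behavior described here can be sketched in a few lines. The ``struct global_mask`` state and the ``global_mask_claim()`` helper are hypothetical, and a single 32-bit mask stands in for whatever per-layer mask a real device exposes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-wide mask state for one layer type. */
struct global_mask {
	bool configured;
	uint32_t mask;
};

/*
 * Called for each new rule: the first rule configures the global mask,
 * later rules are accepted only if they use the very same mask.
 */
static int
global_mask_claim(struct global_mask *global, uint32_t rule_mask)
{
	assert(global != NULL);
	if (!global->configured) {
		global->mask = rule_mask;
		global->configured = true;
		return 0;
	}
	return global->mask == rule_mask ? 0 : -EEXIST;
}
```

A complete implementation would probably also count the rules relying on the mask so that it can be reconfigured once the last of them is destroyed; that part is left out of this sketch.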
>
> Unsupported layer types
> ~~~~~~~~~~~~~~~~~~~~~~~
>
> Many protocols can be simulated by crafting patterns with the `RAW`_ type.
>
> PMDs can rely on this capability to simulate support for protocols with
> fixed headers not directly recognized by hardware.
>
> ``ANY`` pattern item
> ~~~~~~~~~~~~~~~~~~~~
>
> This pattern item stands for anything, which can be difficult to translate
> to something hardware would understand, particularly if followed by more
> specific types.
>
> Consider the following pattern:
>
> +---+--------------------------------+
> | 0 | ETHER                          |
> +---+--------------------------------+
> | 1 | ANY (``min`` = 1, ``max`` = 1) |
> +---+--------------------------------+
> | 2 | TCP                            |
> +---+--------------------------------+
>
> Knowing that TCP does not make sense with something other than IPv4 and IPv6
> as L3, such a pattern may be translated to two flow rules instead:
>
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV4 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
>
> +---+--------------------+
> | 0 | ETHER              |
> +---+--------------------+
> | 1 | IPV6 (zeroed mask) |
> +---+--------------------+
> | 2 | TCP                |
> +---+--------------------+
>
> Note that as soon as an ANY rule covers several layers, this approach may
> yield a large number of hidden flow rules. It is thus suggested to only
> support the most common scenarios (anything as L2 and/or L3).
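
The translation of a single-layer ANY item into two hidden patterns could look like the sketch below. It assumes an END-terminated array of item types, an ANY item that stands for exactly one L3 layer (``min`` = ``max`` = 1), and ignores spec/mask data entirely; ``enum item_type``, ``PATTERN_MAX`` and ``expand_any_l3()`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types, END-terminated like action lists. */
enum item_type {
	ITEM_END = 0,
	ITEM_ANY,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP,
};

#define PATTERN_MAX 8 /* maximum number of entries, END included */

/*
 * Copy pattern into out[0] and out[1], replacing the first ANY item with
 * IPV4 and IPV6 respectively. Returns the number of patterns produced:
 * 2 when an ANY item was expanded, 1 when there was none (only out[0] is
 * then meaningful), 0 when the pattern does not fit.
 */
static size_t
expand_any_l3(const enum item_type *pattern,
	      enum item_type out[2][PATTERN_MAX])
{
	static const enum item_type l3[2] = { ITEM_IPV4, ITEM_IPV6 };
	size_t i, n;
	int found = 0;

	assert(pattern != NULL && out != NULL);
	for (n = 0; pattern[n] != ITEM_END; ++n)
		if (n + 1 >= PATTERN_MAX)
			return 0;
	/* Copy n items plus the END marker. */
	for (i = 0; i <= n; ++i) {
		int replace = !found && pattern[i] == ITEM_ANY;

		out[0][i] = replace ? l3[0] : pattern[i];
		out[1][i] = replace ? l3[1] : pattern[i];
		found |= replace;
	}
	return found ? 2 : 1;
}
```

Each additional ANY layer would multiply the number of generated patterns, which is the growth in hidden rules that the note above warns about.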
>
> .. raw:: pdf
>
>     PageBreak
>
> Unsupported actions
> ~~~~~~~~~~~~~~~~~~~
>
> - When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
>    tagging (`ID`_) may be implemented in software as long as the target queue
>    is used by a single rule.
>
> - A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
>    rules combining `QUEUE`_ and `PASSTHRU`_.
>
> - When a single target queue is provided, `RSS`_ can also be implemented
>    through `QUEUE`_.
>
> Flow rules priority
> ~~~~~~~~~~~~~~~~~~~
>
> While it would naturally make sense, flow rules cannot be assumed to be
> processed by hardware in the same order as their creation for several
> reasons:
>
> - They may be managed internally as a tree or a hash table instead of a
>    list.
> - Removing a flow rule before adding another one can either put the new rule
>    at the end of the list or reuse a freed entry.
> - Duplication may occur when packets are matched by several rules.
>
> For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
> predictable behavior is only guaranteed by using different priority levels.
>
> Priority levels are not necessarily implemented in hardware, or may be
> severely limited (e.g. a single priority bit).
>
> For these reasons, priority levels may be implemented purely in software by
> PMDs.
>
> - For devices expecting flow rules to be added in the correct order, PMDs
>    may destroy and re-create existing rules after adding a new one with
>    a higher priority.
>
> - A configurable number of dummy or empty rules can be created at
>    initialization time to save high priority slots for later.
>
> - In order to save priority levels, PMDs may evaluate whether rules are
>    likely to collide and adjust their priority accordingly.
>
> .. raw:: pdf
>
>     PageBreak
>
> API migration
> =============
>
> Exhaustive list of deprecated filter types and how to convert them to
> generic flow rules.
>
> ``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
> ---------------------------------------
>
> `MACVLAN`_ can be translated to a basic `ETH`_ flow rule with a `VF
> (action)`_ or `PF (action)`_ terminating action.
>
> +------------------------------------+
> | MACVLAN                            |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | ETH | ``spec`` | any | VF,     |
> |   |     +----------+-----+ PF      |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
>
> ``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
> ----------------------------------------------
>
> `ETHERTYPE`_ is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as
> a terminating action.
>
> +------------------------------------+
> | ETHERTYPE                          |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | ETH | ``spec`` | any | QUEUE,  |
> |   |     +----------+-----+ DROP    |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
>
> ``FLEXIBLE`` to ``RAW`` → ``QUEUE``
> -----------------------------------
>
> `FLEXIBLE`_ can be translated to one `RAW`_ pattern with `QUEUE`_ as the
> terminating action and a defined priority level.
>
> +------------------------------------+
> | FLEXIBLE                           |
> +--------------------------+---------+
> | Pattern                  | Actions |
> +===+=====+==========+=====+=========+
> | 0 | RAW | ``spec`` | any | QUEUE   |
> |   |     +----------+-----+         |
> |   |     | ``mask`` | any |         |
> +---+-----+----------+-----+---------+
>
> ``SYN`` to ``TCP`` → ``QUEUE``
> ------------------------------
>
> `SYN`_ is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
> `QUEUE`_ as the terminating action.
>
> Priority level can be set to simulate the high priority bit.
>
> +---------------------------------------------+
> | SYN                                         |
> +-----------------------------------+---------+
> | Pattern                           | Actions |
> +===+======+==========+=============+=========+
> | 0 | ETH  | ``spec`` | N/A         | QUEUE   |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | empty       |         |
> +---+------+----------+-------------+         |
> | 1 | IPV4 | ``spec`` | N/A         |         |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | empty       |         |
> +---+------+----------+-------------+         |
> | 2 | TCP  | ``spec`` | ``syn`` = 1 |         |
> |   |      +----------+-------------+         |
> |   |      | ``mask`` | ``syn`` = 1 |         |
> +---+------+----------+-------------+---------+
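
The effect of setting ``syn`` in both ``spec`` and ``mask`` can be shown with a small matching helper. It assumes the usual convention that only the bits set in ``mask`` are compared against ``spec``; ``struct tcp_flags_rule`` and ``tcp_flags_match()`` are hypothetical, while 0x02 is the position of SYN in the TCP flags byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_SYN 0x02 /* SYN bit of the TCP flags byte */
#define TCP_FLAG_ACK 0x10 /* ACK bit of the TCP flags byte */

/* Hypothetical spec/mask pair restricted to the TCP flags byte. */
struct tcp_flags_rule {
	uint8_t spec;
	uint8_t mask;
};

/* A packet matches when its masked flags equal the masked spec. */
static bool
tcp_flags_match(const struct tcp_flags_rule *rule, uint8_t flags)
{
	assert(rule != NULL);
	return (flags & rule->mask) == (rule->spec & rule->mask);
}
```

Note that with only the SYN bit masked, a SYN+ACK segment matches as well; a rule meant to catch initial SYN segments only would also have to mask the ACK bit with a zero in ``spec``.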
>
> ``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
> ----------------------------------------------------
>
> `NTUPLE`_ is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
> `UDP`_ as L4 and `QUEUE`_ as the terminating action.
>
> A priority level can be specified as well.
>
> +---------------------------------------+
> | NTUPLE                                |
> +-----------------------------+---------+
> | Pattern                     | Actions |
> +===+======+==========+=======+=========+
> | 0 | ETH  | ``spec`` | N/A   | QUEUE   |
> |   |      +----------+-------+         |
> |   |      | ``mask`` | empty |         |
> +---+------+----------+-------+         |
> | 1 | IPV4 | ``spec`` | any   |         |
> |   |      +----------+-------+         |
> |   |      | ``mask`` | any   |         |
> +---+------+----------+-------+         |
> | 2 | TCP, | ``spec`` | any   |         |
> |   | UDP  +----------+-------+         |
> |   |      | ``mask`` | any   |         |
> +---+------+----------+-------+---------+
>
> ``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
> ---------------------------------------------------------------------------
>
> `TUNNEL`_ matches common IPv4 and IPv6 L3/L4-based tunnel types.
>
> In the following table, `ANY`_ is used to cover the optional L4.
>
> +------------------------------------------------+
> | TUNNEL                                         |
> +--------------------------------------+---------+
> | Pattern                              | Actions |
> +===+=========+==========+=============+=========+
> | 0 | ETH     | ``spec`` | any         | QUEUE   |
> |   |         +----------+-------------+         |
> |   |         | ``mask`` | any         |         |
> +---+---------+----------+-------------+         |
> | 1 | IPV4,   | ``spec`` | any         |         |
> |   | IPV6    +----------+-------------+         |
> |   |         | ``mask`` | any         |         |
> +---+---------+----------+-------------+         |
> | 2 | ANY     | ``spec`` | ``min`` = 0 |         |
> |   |         |          +-------------+         |
> |   |         |          | ``max`` = 0 |         |
> |   |         +----------+-------------+         |
> |   |         | ``mask`` | N/A         |         |
> +---+---------+----------+-------------+         |
> | 3 | VXLAN,  | ``spec`` | any         |         |
> |   | GENEVE, +----------+-------------+         |
> |   | TEREDO, | ``mask`` | any         |         |
> |   | NVGRE,  |          |             |         |
> |   | GRE,    |          |             |         |
> |   | ...     |          |             |         |
> +---+---------+----------+-------------+---------+
>
> .. raw:: pdf
>
>     PageBreak
>
> ``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
> ---------------------------------------------------------------
>
> `FDIR`_ is more complex than any other type, and there are several methods
> to emulate its functionality. It is summarized for the most part in the
> table below.
>
> A few features are intentionally not supported:
>
> - The ability to configure the matching input set and masks for the entire
>    device. PMDs should take care of it automatically according to flow rules.
>
> - Returning four or eight bytes of matched data when using flex bytes
>    filtering. Although a specific action could implement it, it conflicts
>    with the much more useful 32-bit tagging on devices that support it.
>
> - Side effects on RSS processing of the entire device. Flow rules that
>    conflict with the current device configuration should not be
>    allowed. Similarly, device configuration should not be allowed when it
>    affects existing flow rules.
>
> - Device modes of operation. "none" is unsupported since filtering cannot be
>    disabled as long as a flow rule is present.
>
> - "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
>    according to the created flow rules.
>
> +----------------------------------------------+
> | FDIR                                         |
> +---------------------------------+------------+
> | Pattern                         | Actions    |
> +===+============+==========+=====+============+
> | 0 | ETH,       | ``spec`` | any | QUEUE,     |
> |   | RAW        +----------+-----+ DROP,      |
> |   |            | ``mask`` | any | PASSTHRU   |
> +---+------------+----------+-----+------------+
> | 1 | IPV4,      | ``spec`` | any | ID         |
> |   | IPV6       +----------+-----+ (optional) |
> |   |            | ``mask`` | any |            |
> +---+------------+----------+-----+            |
> | 2 | TCP,       | ``spec`` | any |            |
> |   | UDP,       +----------+-----+            |
> |   | SCTP       | ``mask`` | any |            |
> +---+------------+----------+-----+            |
> | 3 | VF,        | ``spec`` | any |            |
> |   | PF,        +----------+-----+            |
> |   | SIGNATURE  | ``mask`` | any |            |
> |   | (optional) |          |     |            |
> +---+------------+----------+-----+------------+
>
> ``HASH``
> ~~~~~~~~
>
> Hashing configuration is set per rule through the `SIGNATURE`_ item.
>
> Since it is usually a global device setting, all flow rules created with
> this item may have to share the same specification.
>
> ``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> All packets are matched. This type alters incoming packets to encapsulate
> them in a chosen tunnel type, and optionally redirects them to a VF as well.
>
> The destination pool for tag based forwarding can be emulated with other
> flow rules using `DUP`_ as the action.
>
> +----------------------------------------+
> | L2_TUNNEL                              |
> +---------------------------+------------+
> | Pattern                   | Actions    |
> +===+======+==========+=====+============+
> | 0 | VOID | ``spec`` | N/A | VXLAN,     |
> |   |      |          |     | GENEVE,    |
> |   |      |          |     | ...        |
> |   |      +----------+-----+------------+
> |   |      | ``mask`` | N/A | VF         |
> |   |      |          |     | (optional) |
> +---+------+----------+-----+------------+
>

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-08 11:11 ` Liang, Cunming
@ 2016-07-08 12:38   ` Bruce Richardson
  2016-07-08 13:25   ` Adrien Mazarguil
  1 sibling, 0 replies; 262+ messages in thread
From: Bruce Richardson @ 2016-07-08 12:38 UTC (permalink / raw)
  To: Liang, Cunming
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, Pablo de Lara,
	Olga Shern

On Fri, Jul 08, 2016 at 07:11:28PM +0800, Liang, Cunming wrote:

Please Cunming, when replying in the middle of a long email - of which this
is a perfect example - delete the large chunks of content you are not replying
to. I had to page down multiple times to find the new comments you wrote, and
even then I wasn't sure that I hadn't missed something along the way.

/Bruce

> Hi Adrien,
> 
> On 7/6/2016 2:16 AM, Adrien Mazarguil wrote:
> >Hi All,
> >
<snip>

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-07 23:15 ` Chandran, Sugesh
@ 2016-07-08 13:03   ` Adrien Mazarguil
  2016-07-11 10:42     ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-08 13:03 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Sugesh,

On Thu, Jul 07, 2016 at 11:15:07PM +0000, Chandran, Sugesh wrote:
> Hi Adrien,
> 
> Thank you for proposing this. It would be really useful for applications such as OVS-DPDK.
> Please find my comments and questions inline below prefixed with [Sugesh]. Most of them are from the perspective of enabling these APIs in applications such as OVS-DPDK.

Thanks, I'm replying below.

> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Tuesday, July 5, 2016 7:17 PM
> > To: dev@dpdk.org
> > Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; Zhang, Helin
> > <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Rasesh
> > Mody <rasesh.mody@qlogic.com>; Ajit Khaparde
> > <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> > <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> > Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> > Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> > Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> > <sony.chacko@qlogic.com>; Jerin Jacob
> > <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> > Subject: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
> > 
> > Hi All,
> > 
> > First, forgive me for this large message, I know our mailboxes already
> > suffer quite a bit from the amount of traffic on this ML.
> > 
> > This is not exactly yet another thread about how flow director should be
> > extended, rather about a brand new API to handle filtering and
> > classification for incoming packets in the most PMD-generic and
> > application-friendly fashion we can come up with. Reasons described below.
> > 
> > I think this topic is important enough to include both the users of this API
> > as well as PMD maintainers. So far I have CC'ed librte_ether (especially
> > rte_eth_ctrl.h contributors), testpmd and PMD maintainers (with and
> > without
> > a .filter_ctrl implementation), but if you know application maintainers
> > other than testpmd who use FDIR or might be interested in this discussion,
> > feel free to add them.
> > 
> > The issues we found with the current approach are already summarized in
> > the
> > following document, but here is a quick summary for TL;DR folks:
> > 
> > - PMDs do not expose a common set of filter types and even when they do,
> >   their behavior more or less differs.
> > 
> > - Applications need to determine and adapt to device-specific limitations
> >   and quirks on their own, without help from PMDs.
> > 
> > - Writing an application that creates flow rules targeting all devices
> >   supported by DPDK is thus difficult, if not impossible.
> > 
> > - The current API has too many unspecified areas (particularly regarding
> >   side effects of flow rules) that make PMD implementation tricky.
> > 
> > This RFC API handles everything currently supported by .filter_ctrl, the
> > idea being to reimplement all of these to make them fully usable by
> > applications in a more generic and well defined fashion. It has a very small
> > set of mandatory features and an easy method to let applications probe for
> > supported capabilities.
> > 
> > The only downside is more work for the software control side of PMDs
> > because
> > they have to adapt to the API instead of the reverse. I think helpers can be
> > added to EAL to assist with this.
> > 
> > HTML version:
> > 
> >  https://rawgit.com/6WIND/rte_flow/master/rte_flow.html
> > 
> > PDF version:
> > 
> >  https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf
> > 
> > Related draft header file (for reference while reading the specification):
> > 
> >  https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h
> > 
> > Git tree for completeness (latest .rst version can be retrieved from here):
> > 
> >  https://github.com/6WIND/rte_flow
> > 
> > What follows is the ReST source of the above, for inline comments and
> > discussion. I intend to update that specification accordingly.
> > 
> > ========================
> > Generic filter interface
> > ========================
> > 
> > .. footer::
> > 
> >    v0.6
> > 
> > .. contents::
> > .. sectnum::
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Overview
> > ========
> > 
> > DPDK provides several competing interfaces added over time to perform
> > packet
> > matching and related actions such as filtering and classification.
> > 
> > They must be extended to implement the features supported by newer
> > devices
> > in order to expose them to applications, however the current design has
> > several drawbacks:
> > 
> > - Complicated filter combinations which have not been hard-coded cannot be
> >   expressed.
> > - Prone to API/ABI breakage when new features must be added to an
> > existing
> >   filter type, which frequently happens.
> > 
> > From an application point of view:
> > 
> > - Having disparate interfaces, all optional and lacking in features does not
> >   make this API easy to use.
> > - Seemingly arbitrary built-in limitations of filter types based on the
> >   device they were initially designed for.
> > - Undefined relationship between different filter types.
> > - High complexity, considerable undocumented and/or undefined behavior.
> > 
> > Considering the growing number of devices supported by DPDK, adding a
> > new
> > filter type each time a new feature must be implemented is not sustainable
> > in the long term. Applications not written to target a specific device
> > cannot really benefit from such an API.
> > 
> > For these reasons, this document defines an extensible unified API that
> > encompasses and supersedes these legacy filter types.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Current API
> > ===========
> > 
> > Rationale
> > ---------
> > 
> > The reason several competing (and mostly overlapping) filtering APIs are
> > present in DPDK is due to its nature as a thin layer between hardware and
> > software.
> > 
> > Each subsequent interface has been added to better match the capabilities
> > and limitations of the latest supported device, which usually happened to
> > need an incompatible configuration approach. Because of this, many ended
> > up
> > device-centric and not usable by applications that were not written for that
> > particular device.
> > 
> > This document is not the first attempt to address this proliferation issue,
> > in fact a lot of work has already been done both to create a more generic
> > interface while somewhat keeping compatibility with legacy ones through a
> > common call interface (``rte_eth_dev_filter_ctrl()`` with the
> > ``.filter_ctrl`` PMD callback in ``rte_ethdev.h``).
> > 
> > Today, these previously incompatible interfaces are known as filter types
> > (``RTE_ETH_FILTER_*`` from ``enum rte_filter_type`` in ``rte_eth_ctrl.h``).
> > 
> > However while trivial to extend with new types, it only shifted the
> > underlying problem as applications still need to be written for one kind of
> > filter type, which, as described in the following sections, is not
> > necessarily implemented by all PMDs that support filtering.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Filter types
> > ------------
> > 
> > This section summarizes the capabilities of each filter type.
> > 
> > Although the following list is exhaustive, the description of individual
> > types may contain inaccuracies due to the lack of documentation or usage
> > examples.
> > 
> > Note: names are prefixed with ``RTE_ETH_FILTER_``.
> > 
> > ``MACVLAN``
> > ~~~~~~~~~~~
> > 
> > Matching:
> > 
> > - L2 source/destination addresses.
> > - Optional 802.1Q VLAN ID.
> > - Masking individual fields on a rule basis is not supported.
> > 
> > Action:
> > 
> > - Packets are redirected either to a given VF device using its ID or to the
> >   PF.
> > 
> > ``ETHERTYPE``
> > ~~~~~~~~~~~~~
> > 
> > Matching:
> > 
> > - L2 source/destination addresses (optional).
> > - Ethertype (no VLAN ID?).
> > - Masking individual fields on a rule basis is not supported.
> > 
> > Action:
> > 
> > - Receive packets on a given queue.
> > - Drop packets.
> > 
> > ``FLEXIBLE``
> > ~~~~~~~~~~~~
> > 
> > Matching:
> > 
> > - At most 128 consecutive bytes anywhere in packets.
> > - Masking is supported with byte granularity.
> > - Priorities are supported (relative to this filter type, undefined
> >   otherwise).
> > 
> > Action:
> > 
> > - Receive packets on a given queue.
> > 
> > ``SYN``
> > ~~~~~~~
> > 
> > Matching:
> > 
> > - TCP SYN packets only.
> > - One high priority bit can be set to give the highest possible priority to
> >   this type when other filters with different types are configured.
> > 
> > Action:
> > 
> > - Receive packets on a given queue.
> > 
> > ``NTUPLE``
> > ~~~~~~~~~~
> > 
> > Matching:
> > 
> > - Source/destination IPv4 addresses (optional in 2-tuple mode).
> > - Source/destination TCP/UDP port (mandatory in 2 and 5-tuple modes).
> > - L4 protocol (2 and 5-tuple modes).
> > - Masking individual fields is supported.
> > - TCP flags.
> > - Up to 7 levels of priority relative to this filter type, undefined
> >   otherwise.
> > - No IPv6.
> > 
> > Action:
> > 
> > - Receive packets on a given queue.
> > 
> > ``TUNNEL``
> > ~~~~~~~~~~
> > 
> > Matching:
> > 
> > - Outer L2 source/destination addresses.
> > - Inner L2 source/destination addresses.
> > - Inner VLAN ID.
> > - IPv4/IPv6 source (destination?) address.
> > - Tunnel type to match (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE, 802.1BR
> >   E-Tag).
> > - Tenant ID for tunneling protocols that have one.
> > - Any combination of the above can be specified.
> > - Masking individual fields on a rule basis is not supported.
> > 
> > Action:
> > 
> > - Receive packets on a given queue.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``FDIR``
> > ~~~~~~~~
> > 
> > Queries:
> > 
> > - Device capabilities and limitations.
> > - Device statistics about configured filters (resource usage, collisions).
> > - Device configuration (matching input set and masks)
> > 
> > Matching:
> > 
> > - Device mode of operation: none (to disable filtering), signature
> >   (hash-based dispatching from masked fields) or perfect (either MAC VLAN or
> >   tunnel).
> > - L2 Ethertype.
> > - Outer L2 destination address (MAC VLAN mode).
> > - Inner L2 destination address, tunnel type (NVGRE, VXLAN) and tunnel ID
> >   (tunnel mode).
> > - IPv4 source/destination addresses, ToS, TTL and protocol fields.
> > - IPv6 source/destination addresses, TC, protocol and hop limits fields.
> > - UDP source/destination IPv4/IPv6 and ports.
> > - TCP source/destination IPv4/IPv6 and ports.
> > - SCTP source/destination IPv4/IPv6, ports and verification tag field.
> > - Note, only one protocol type at once (either only L2 Ethertype, basic
> >   IPv6, IPv4+UDP, IPv4+TCP and so on).
> > - VLAN TCI (extended API).
> > - At most 16 bytes to match in payload (extended API). A global device
> >   look-up table specifies for each possible protocol layer (unknown, raw,
> >   L2, L3, L4) the offset to use for each byte (they do not need to be
> >   contiguous) and the related bitmask.
> > - Whether packet is addressed to PF or VF, in that case its ID can be
> >   matched as well (extended API).
> > - Masking most of the above fields is supported, but simultaneously affects
> >   all filters configured on a device.
> > - Input set can be modified in a similar fashion for a given device to
> >   ignore individual fields of filters (e.g. do not match the destination
> >   address in an IPv4 filter, refer to **RTE_ETH_INPUT_SET_**
> >   macros). Configuring this also affects RSS processing on **i40e**.
> > - Filters can also provide 32 bits of arbitrary data to return as part of
> >   matched packets.
> > 
> > Action:
> > 
> > - **RTE_ETH_FDIR_ACCEPT**: receive (accept) packet on a given queue.
> > - **RTE_ETH_FDIR_REJECT**: drop packet immediately.
> > - **RTE_ETH_FDIR_PASSTHRU**: similar to accept for the last filter in list,
> >   otherwise process it with subsequent filters.
> > - For accepted packets and if requested by filter, either 32 bits of
> >   arbitrary data and four bytes of matched payload (only in case of flex
> >   bytes matching), or eight bytes of matched payload (flex also) are added
> >   to meta data.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``HASH``
> > ~~~~~~~~
> > 
> > Not an actual filter type. Provides and retrieves the global device
> > configuration (per port or entire NIC) for hash functions and their
> > properties.
> > 
> > Hash function selection: "default" (keep current), XOR or Toeplitz.
> > 
> > This function can be configured per flow type (**RTE_ETH_FLOW_**
> > definitions), supported types are:
> > 
> > - Unknown.
> > - Raw.
> > - Fragmented or non-fragmented IPv4.
> > - Non-fragmented IPv4 with L4 (TCP, UDP, SCTP or other).
> > - Fragmented or non-fragmented IPv6.
> > - Non-fragmented IPv6 with L4 (TCP, UDP, SCTP or other).
> > - L2 payload.
> > - IPv6 with extensions.
> > - IPv6 with L4 (TCP, UDP) and extensions.
> > 
> > ``L2_TUNNEL``
> > ~~~~~~~~~~~~~
> > 
> > Matching:
> > 
> > - All packets received on a given port.
> > 
> > Action:
> > 
> > - Add tunnel encapsulation (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE,
> >   802.1BR E-Tag) using the provided Ethertype and tunnel ID (only E-Tag
> >   is implemented at the moment).
> > - VF ID to use for tag insertion (currently unused).
> > - Destination pool for tag based forwarding (pools are IDs that can be
> >   assigned to ports, duplication occurs if the same ID is shared by several
> >   ports of the same NIC).
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Driver support
> > --------------
> > 
> > ======== ======= ========= ======== === ====== ====== ==== ==== =========
> > Driver   MACVLAN ETHERTYPE FLEXIBLE SYN NTUPLE TUNNEL FDIR HASH L2_TUNNEL
> > ======== ======= ========= ======== === ====== ====== ==== ==== =========
> > bnx2x
> > cxgbe
> > e1000            yes       yes      yes yes
> > ena
> > enic                                                  yes
> > fm10k
> > i40e     yes     yes                           yes    yes  yes
> > ixgbe            yes                yes yes           yes       yes
> > mlx4
> > mlx5                                                  yes
> > szedata2
> > ======== ======= ========= ======== === ====== ====== ==== ==== =========
> > 
> > Flow director
> > -------------
> > 
> > Flow director (FDIR) is the name of the most capable filter type, which
> > covers most features offered by others. As such, it is the most widespread
> > in PMDs that support filtering (i.e. all of them besides **e1000**).
> > 
> > It is also the only type that allows an arbitrary 32 bits value provided by
> > applications to be attached to a filter and returned with matching packets
> > instead of relying on the destination queue to recognize flows.
> > 
> > Unfortunately, even FDIR requires applications to be aware of low-level
> > capabilities and limitations (most of which come directly from **ixgbe** and
> > **i40e**):
> > 
> > - Bitmasks are set globally per device (port?), not per filter.
> [Sugesh] This means an application cannot define filters that match on arbitrary different offsets?
> If that’s the case, I assume the application has to program the bitmask in advance. Otherwise how does
> the API framework deduce this bitmask information from the rules? It's not very clear to me
> how the application passes down the bitmask information for multiple filters on the same port.

This is my understanding of how flow director currently works, perhaps
someone more familiar with it can answer this question better than I could.

Let me take an example: if a particular device can only handle a single IPv4
mask common to all flow rules (say only to match destination addresses),
updating that mask to also match the source address affects all defined and
future flow rules simultaneously.

That is how FDIR currently works and I think it is wrong, as it penalizes
devices that do support individual bit-masks per rule, and is a little
awkward from an application point of view.

What I suggest for the new API instead is the ability to specify one
bit-mask per rule, and let the PMD deal with HW limitations by automatically
configuring global bitmasks from the first added rule, then refusing to add
subsequent rules if they specify a conflicting bit-mask. Existing rules
remain unaffected that way, and applications do not have to be extra
cautious.
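
To make the suggestion above more concrete, here is a rough sketch of the
bookkeeping a PMD could do for a device with a single global IPv4 mask. All
names (``port_mask_state``, ``rule_mask``, ``port_add_rule()``) are made up
for illustration and are not part of DPDK or of the proposed API; the choice
of ``-EEXIST`` as the error code is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port state kept by the PMD (not an existing DPDK type). */
struct port_mask_state {
	uint32_t ipv4_src_mask; /* global source mask programmed in hardware */
	uint32_t ipv4_dst_mask; /* global destination mask programmed in hardware */
	unsigned int rules;     /* number of rules relying on the masks above */
};

/* Hypothetical per-rule mask as provided by the application. */
struct rule_mask {
	uint32_t ipv4_src_mask;
	uint32_t ipv4_dst_mask;
};

/*
 * The first rule defines the global masks. Later rules are accepted only if
 * they request exactly the same masks, so existing rules are never affected.
 * Returns 0 on success, -EEXIST on a conflicting mask.
 */
static int
port_add_rule(struct port_mask_state *st, const struct rule_mask *m)
{
	assert(st != NULL && m != NULL);
	if (st->rules == 0) {
		st->ipv4_src_mask = m->ipv4_src_mask;
		st->ipv4_dst_mask = m->ipv4_dst_mask;
	} else if (st->ipv4_src_mask != m->ipv4_src_mask ||
		   st->ipv4_dst_mask != m->ipv4_dst_mask) {
		return -EEXIST;
	}
	++st->rules;
	return 0;
}
```

With this, adding two rules that only match destination addresses succeeds,
while a third one that also wants to match source addresses is refused
instead of silently changing the behavior of the first two.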

> > - Configuration state is not expected to be saved by the driver, and
> >   stopping/restarting a port requires the application to perform it again
> >   (API documentation is also unclear about this).
> > - Monolithic approach with ABI issues as soon as a new kind of flow or
> >   combination needs to be supported.
> > - Cryptic global statistics/counters.
> > - Unclear about how priorities are managed; filters seem to be arranged as a
> >   linked list in hardware (possibly related to configuration order).
> > 
> > Packet alteration
> > -----------------
> > 
> > One interesting feature is that the L2 tunnel filter type implements the
> > ability to alter incoming packets through a filter (in this case to
> > encapsulate them), thus the **mlx5** flow encap/decap features are not a
> > foreign concept.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Proposed API
> > ============
> > 
> > Terminology
> > -----------
> > 
> > - **Filtering API**: overall framework affecting the fate of selected
> >   packets, covers everything described in this document.
> > - **Matching pattern**: properties to look for in received packets, a
> >   combination of any number of items.
> > - **Pattern item**: part of a pattern that either matches packet data
> >   (protocol header, payload or derived information), or specifies properties
> >   of the pattern itself.
> > - **Actions**: what needs to be done when a packet matches a pattern.
> > - **Flow rule**: this is the result of combining a *matching pattern* with
> >   *actions*.
> > - **Filter rule**: a less generic term than *flow rule*, can otherwise be
> >   used interchangeably.
> > - **Hit**: a flow rule is said to be *hit* when processing a matching
> >   packet.
> > 
> > Requirements
> > ------------
> > 
> > As described in the previous section, there is a growing need for a common
> > method to configure filtering and related actions in a hardware independent
> > fashion.
> > 
> > The filtering API should not disallow any filter combination by design and
> > must remain as simple as possible to use. It can simply be defined as a
> > method to perform one or several actions on selected packets.
> > 
> > PMDs are aware of the capabilities of the device they manage and should be
> > responsible for preventing unsupported or conflicting combinations.
> > 
> > This approach is fundamentally different as it places most of the burden on
> > the software side of the PMD instead of having device capabilities directly
> > mapped to API functions, then expecting applications to work around ensuing
> > compatibility issues.
> > 
> > Requirements for a new API:
> > 
> > - Flexible and extensible without causing API/ABI problems for existing
> >   applications.
> > - Should be unambiguous and easy to use.
> > - Support existing filtering features and actions listed in `Filter types`_.
> > - Support packet alteration.
> > - In case of overlapping filters, their priority should be well documented.
> > - Support filter queries (for example to retrieve counters).
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > High level design
> > -----------------
> > 
> > The chosen approach to make filtering as generic as possible is by
> > expressing matching patterns through lists of items instead of the flat
> > structures used in DPDK today, enabling combinations that are not predefined
> > and thus being more versatile.
> > 
> > Flow rules can have several distinct actions (such as counting,
> > encapsulating, decapsulating before redirecting packets to a particular
> > queue, etc.), instead of relying on several rules to achieve this and having
> > applications deal with hardware implementation details regarding their
> > order.
> > 
> > Support for different priority levels on a rule basis is provided, for
> > example in order to force a more specific rule to come before a more generic
> > one for packets matched by both. However, hardware support for more than a
> > single priority level cannot be guaranteed. When supported, the number of
> > available priority levels is usually low, which is why they can also be
> > implemented in software by PMDs (e.g. to simulate missing priority levels by
> > reordering rules).
> > 
> > In order to remain as hardware agnostic as possible, by default all rules
> > are considered to have the same priority, which means that the order between
> > overlapping rules (when a packet is matched by several filters) is
> > undefined; packet duplication may even occur as a result.
> > 
> > PMDs may refuse to create overlapping rules at a given priority level when
> > they can be detected (e.g. if a pattern matches an existing filter).
> > 
> > Thus predictable results for a given priority level can only be achieved
> > with non-overlapping rules, using perfect matching on all protocol layers.
> > 
> > Support for multiple actions per rule may be implemented internally on top
> > of non-default hardware priorities, as a result both features may not be
> > simultaneously available to applications.
> > 
> > Considering that allowed pattern/actions combinations cannot be known in
> > advance and would result in an impractically large number of capabilities to
> > expose, a method is provided to validate a given rule from the current
> > device configuration state without actually adding it (akin to a "dry run"
> > mode).
> > 
> > This enables applications to check if the rule types they need are supported
> > at initialization time, before starting their data path. This method can be
> > used anytime, its only requirement being that the resources needed by a rule
> > must exist (e.g. a target RX queue must be configured first).
> > 
> > Each defined rule is associated with an opaque handle managed by the PMD,
> > applications are responsible for keeping it. These can be used for queries
> > and rules management, such as retrieving counters or other data and
> > destroying them.
> > 
> > Handles must be destroyed before releasing associated resources such as
> > queues.
> > 
> > Integration
> > -----------
> > 
> > To avoid ABI breakage, this new interface will be implemented through the
> > existing filtering control framework (``rte_eth_dev_filter_ctrl()``) using
> > **RTE_ETH_FILTER_GENERIC** as a new filter type.
> > 
> > However a public front-end API described in `Rules management`_ will
> > be added as the preferred method to use it.
> > 
> > Once discussions with the community have converged to a definite API, legacy
> > filter types should be deprecated and a deadline defined to remove their
> > support entirely.
> > 
> > PMDs will have to be gradually converted to **RTE_ETH_FILTER_GENERIC** or
> > drop filtering support entirely. Less maintained PMDs for older hardware may
> > lose support at this point.
> > 
> > The notion of filter type will then be deprecated and subsequently dropped
> > to avoid confusion between both frameworks.
> > 
> > Implementation details
> > ======================
> > 
> > Flow rule
> > ---------
> > 
> > A flow rule is the combination of a matching pattern with a list of actions,
> > and is the basis of this API.
> > 
> > Priorities
> > ~~~~~~~~~~
> > 
> > A priority can be assigned to a matching pattern.
> > 
> > The default priority level is 0 and is also the highest. Support for more
> > than a single priority level in hardware is not guaranteed.
> > 
> > If a packet is matched by several filters at a given priority level, the
> > outcome is undefined. It can take any path and can even be duplicated.
> > 
> > Matching pattern
> > ~~~~~~~~~~~~~~~~
> > 
> > A matching pattern comprises any number of items of various types.
> > 
> > Items are arranged in a list to form a matching pattern for packets. They
> > fall in two categories:
> > 
> > - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, VXLAN and so
> >   on), usually associated with a specification structure. These must be
> >   stacked in the same order as the protocol layers to match, starting from
> >   L2.
> > 
> > - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF,
> >   SIGNATURE and so on), often without a specification structure. Since they
> >   are meta data that does not match packet contents, these can be specified
> >   anywhere within item lists without affecting the protocol matching items.
> > 
> > Most item specifications can be optionally paired with a mask to narrow the
> > specific fields or bits to be matched.
> > 
> > - Items are defined with ``struct rte_flow_item``.
> > - Patterns are defined with ``struct rte_flow_pattern``.
> > 
> > Example of an item specification matching an Ethernet header:
> > 
> > +-----------------------------------------+
> > | Ethernet                                |
> > +==========+=========+====================+
> > | ``spec`` | ``src`` | ``00:01:02:03:04`` |
> > |          +---------+--------------------+
> > |          | ``dst`` | ``00:2a:66:00:01`` |
> > +----------+---------+--------------------+
> > | ``mask`` | ``src`` | ``00:ff:ff:ff:00`` |
> > |          +---------+--------------------+
> > |          | ``dst`` | ``00:00:00:00:ff`` |
> > +----------+---------+--------------------+
> > 
> > Non-masked bits stand for any value, Ethernet headers with the following
> > properties are thus matched:
> > 
> > - ``src``: ``??:01:02:03:??``
> > - ``dst``: ``??:??:??:??:01``
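
For what it's worth, the spec/mask semantics described above boil down to a
byte-wise masked comparison. A minimal sketch follows, where
``masked_match()`` is a hypothetical helper (not part of the proposed API)
and a ``NULL`` mask is assumed to behave like an all-ones mask:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if data matches spec on every bit set in mask, 0 otherwise.
 * Bits cleared in mask stand for any value. A NULL mask is treated as a
 * full mask (all ones) in this sketch.
 */
static int
masked_match(const uint8_t *data, const uint8_t *spec, const uint8_t *mask,
	     size_t len)
{
	size_t i;

	assert(data != NULL && spec != NULL);
	for (i = 0; i < len; ++i) {
		uint8_t m = mask ? mask[i] : 0xff;

		/* Any differing bit that is also set in the mask is a miss. */
		if ((data[i] ^ spec[i]) & m)
			return 0;
	}
	return 1;
}
```

The same helper would be applied to each field (``src``, ``dst`` and so on)
of an item with the corresponding field of its mask.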
> > 
> > Except for meta types that do not need one, ``spec`` must be a valid pointer
> > to a structure of the related item type. A ``mask`` of the same type can be
> > provided to tell which bits in ``spec`` are to be matched.
> > 
> > A mask is normally only needed for ``spec`` fields matching packet data,
> > ignored otherwise. See individual item types for more information.
> > 
> > A ``NULL`` mask pointer is allowed and is similar to matching the ``spec``
> > fields supported by hardware with a full mask (all ones); the remaining
> > fields are ignored (all zeroes). There is thus no error checking for
> > unsupported fields.
> > 
> > Matching pattern items for packet data must be naturally stacked (ordered
> > from lowest to highest protocol layer), as in the following examples:
> > 
> > +--------------+
> > | TCPv4 as L4  |
> > +===+==========+
> > | 0 | Ethernet |
> > +---+----------+
> > | 1 | IPv4     |
> > +---+----------+
> > | 2 | TCP      |
> > +---+----------+
> > 
> > +----------------+
> > | TCPv6 in VXLAN |
> > +===+============+
> > | 0 | Ethernet   |
> > +---+------------+
> > | 1 | IPv4       |
> > +---+------------+
> > | 2 | UDP        |
> > +---+------------+
> > | 3 | VXLAN      |
> > +---+------------+
> > | 4 | Ethernet   |
> > +---+------------+
> > | 5 | IPv6       |
> > +---+------------+
> > | 6 | TCP        |
> > +---+------------+
> > 
> > +-----------------------------+
> > | TCPv4 as L4 with meta items |
> > +===+=========================+
> > | 0 | VOID                    |
> > +---+-------------------------+
> > | 1 | Ethernet                |
> > +---+-------------------------+
> > | 2 | VOID                    |
> > +---+-------------------------+
> > | 3 | IPv4                    |
> > +---+-------------------------+
> > | 4 | TCP                     |
> > +---+-------------------------+
> > | 5 | VOID                    |
> > +---+-------------------------+
> > | 6 | VOID                    |
> > +---+-------------------------+
> > 
> > The above example shows how meta items do not affect packet data matching
> > items, as long as those remain stacked properly. The resulting matching
> > pattern is identical to "TCPv4 as L4".
> > 
> > +----------------+
> > | UDPv6 anywhere |
> > +===+============+
> > | 0 | IPv6       |
> > +---+------------+
> > | 1 | UDP        |
> > +---+------------+
> > 
> > If supported by the PMD, omitting one or several protocol layers at the
> > bottom of the stack as in the above example (missing an Ethernet
> > specification) enables hardware to look anywhere in packets.
> > 
> > It is unspecified whether the payload of supported encapsulations
> > (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> > inner, outer or both packets.
> > 
> > +---------------------+
> > | Invalid, missing L3 |
> > +===+=================+
> > | 0 | Ethernet        |
> > +---+-----------------+
> > | 1 | UDP             |
> > +---+-----------------+
> > 
> > The above pattern is invalid due to a missing L3 specification between L2
> > and L4. Omitting layers is only allowed at the bottom and at the top of the
> > stack.
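
The stacking rule illustrated by these examples could be checked by a PMD
along the following lines. This is only a sketch restricted to non-tunneled
patterns; the ``item_type`` enum and both functions are hypothetical
stand-ins, not the definitions of the proposed API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types; END is 0 as in the document. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_UDP,
	ITEM_TCP,
};

/* Protocol layer of an item, or 0 for meta items such as VOID. */
static int
item_layer(enum item_type type)
{
	switch (type) {
	case ITEM_ETH:
		return 2;
	case ITEM_IPV4:
	case ITEM_IPV6:
		return 3;
	case ITEM_UDP:
	case ITEM_TCP:
		return 4;
	default:
		return 0;
	}
}

/*
 * Return 1 if protocol items are stacked from lowest to highest layer
 * without holes. Layers may be omitted at the bottom or at the top only.
 */
static int
pattern_is_stacked(const enum item_type *items)
{
	int prev = 0;

	assert(items != NULL);
	for (; *items != ITEM_END; ++items) {
		int layer = item_layer(*items);

		if (layer == 0)
			continue; /* meta items do not affect stacking */
		if (prev != 0 && layer != prev + 1)
			return 0; /* hole or wrong order */
		prev = layer;
	}
	return 1;
}
```

Tunnel items such as VXLAN would need the expected layer to restart from L2
for the inner packet, which is left out here.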
> > 
> > Meta item types
> > ~~~~~~~~~~~~~~~
> > 
> > These do not match packet data but affect how the pattern is processed, most
> > of them do not need a specification structure. This particularity allows
> > them to be specified anywhere without affecting other item types.
> > 
> > ``END``
> > ^^^^^^^
> > 
> > End marker for item lists. Prevents further processing of items, thereby
> > ending the pattern.
> > 
> > - Its numeric value is **0** for convenience.
> > - PMD support is mandatory.
> > - Both ``spec`` and ``mask`` are ignored.
> > 
> > +--------------------+
> > | END                |
> > +==========+=========+
> > | ``spec`` | ignored |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> > ``VOID``
> > ^^^^^^^^
> > 
> > Used as a placeholder for convenience. It is ignored and simply discarded by
> > PMDs.
> > 
> > - PMD support is mandatory.
> > - Both ``spec`` and ``mask`` are ignored.
> > 
> > +--------------------+
> > | VOID               |
> > +==========+=========+
> > | ``spec`` | ignored |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> > One usage example for this type is generating rules that share a common
> > prefix quickly without reallocating memory, only by updating item types:
> > 
> > +------------------------+
> > | TCP, UDP or ICMP as L4 |
> > +===+====================+
> > | 0 | Ethernet           |
> > +---+--------------------+
> > | 1 | IPv4               |
> > +---+------+------+------+
> > | 2 | UDP  | VOID | VOID |
> > +---+------+------+------+
> > | 3 | VOID | TCP  | VOID |
> > +---+------+------+------+
> > | 4 | VOID | VOID | ICMP |
> > +---+------+------+------+
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``INVERT``
> > ^^^^^^^^^^
> > 
> > Inverted matching, i.e. process packets that do not match the pattern.
> > 
> > - Both ``spec`` and ``mask`` are ignored.
> > 
> > +--------------------+
> > | INVERT             |
> > +==========+=========+
> > | ``spec`` | ignored |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> > Usage example in order to match non-TCPv4 packets only:
> > 
> > +--------------------+
> > | Anything but TCPv4 |
> > +===+================+
> > | 0 | INVERT         |
> > +---+----------------+
> > | 1 | Ethernet       |
> > +---+----------------+
> > | 2 | IPv4           |
> > +---+----------------+
> > | 3 | TCP            |
> > +---+----------------+
> > 
> > ``PF``
> > ^^^^^^
> > 
> > Matches packets addressed to the physical function of the device.
> > 
> > - Both ``spec`` and ``mask`` are ignored.
> > 
> > +--------------------+
> > | PF                 |
> > +==========+=========+
> > | ``spec`` | ignored |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> > ``VF``
> > ^^^^^^
> > 
> > Matches packets addressed to the given virtual function ID of the device.
> > 
> > - Only ``spec`` needs to be defined, ``mask`` is ignored.
> > 
> > +----------------------------------------+
> > | VF                                     |
> > +==========+=========+===================+
> > | ``spec`` | ``vf``  | destination VF ID |
> > +----------+---------+-------------------+
> > | ``mask`` | ignored                     |
> > +----------+-----------------------------+
> > 
> > ``SIGNATURE``
> > ^^^^^^^^^^^^^
> > 
> > Requests hash-based signature dispatching for this rule.
> > 
> > Considering this is a global setting on devices that support it, all
> > subsequent filter rules may have to be created with it as well.
> > 
> > - Only ``spec`` needs to be defined, ``mask`` is ignored.
> > 
> > +--------------------+
> > | SIGNATURE          |
> > +==========+=========+
> > | ``spec`` | TBD     |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Data matching item types
> > ~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > Most of these are basically protocol header definitions with associated
> > bitmasks. They must be specified (stacked) from lowest to highest protocol
> > layer.
> > 
> > The following list is not exhaustive as new protocols will be added in the
> > future.
> > 
> > ``ANY``
> > ^^^^^^^
> > 
> > Matches any protocol in place of the current layer, a single ANY may also
> > stand for several protocol layers.
> > 
> > This is usually specified as the first pattern item when looking for a
> > protocol anywhere in a packet.
> > 
> > - A maximum value of **0** requests matching any number of protocol layers
> >   above or equal to the minimum value, a maximum value lower than the
> >   minimum one is otherwise invalid.
> > - Only ``spec`` needs to be defined, ``mask`` is ignored.
> > 
> > +-----------------------------------------------------------------------+
> > | ANY                                                                   |
> > +==========+=========+==================================================+
> > | ``spec`` | ``min`` | minimum number of layers covered                 |
> > |          +---------+--------------------------------------------------+
> > |          | ``max`` | maximum number of layers covered, 0 for infinity |
> > +----------+---------+--------------------------------------------------+
> > | ``mask`` | ignored                                                    |
> > +----------+------------------------------------------------------------+
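
To spell out the ``min``/``max`` semantics, here is a small sketch. The
``any_spec`` structure and helper names are hypothetical since the actual
specification structure is not defined yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical specification structure for the ANY item. */
struct any_spec {
	unsigned int min; /* minimum number of layers covered */
	unsigned int max; /* maximum number of layers covered, 0 for infinity */
};

/* A maximum of 0 means "no upper bound", otherwise max must be >= min. */
static int
any_spec_is_valid(const struct any_spec *s)
{
	assert(s != NULL);
	return s->max == 0 || s->max >= s->min;
}

/* Return 1 if this ANY item can stand for the given number of layers. */
static int
any_spec_covers(const struct any_spec *s, unsigned int layers)
{
	if (!any_spec_is_valid(s))
		return 0;
	if (layers < s->min)
		return 0;
	return s->max == 0 || layers <= s->max;
}
```

For instance ``{ .min = 2, .max = 2 }`` covers exactly two layers (such as
an outer L3 followed by L4), while ``{ .min = 1, .max = 0 }`` covers one or
more.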
> > 
> > Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
> > and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
> > or IPv6) matched by the second ANY specification:
> > 
> > +----------------------------------+
> > | TCP in VXLAN with wildcards      |
> > +===+==============================+
> > | 0 | Ethernet                     |
> > +---+-----+----------+---------+---+
> > | 1 | ANY | ``spec`` | ``min`` | 2 |
> > |   |     |          +---------+---+
> > |   |     |          | ``max`` | 2 |
> > +---+-----+----------+---------+---+
> > | 2 | VXLAN                        |
> > +---+------------------------------+
> > | 3 | Ethernet                     |
> > +---+-----+----------+---------+---+
> > | 4 | ANY | ``spec`` | ``min`` | 1 |
> > |   |     |          +---------+---+
> > |   |     |          | ``max`` | 1 |
> > +---+-----+----------+---------+---+
> > | 5 | TCP                          |
> > +---+------------------------------+
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``RAW``
> > ^^^^^^^
> > 
> > Matches a string of a given length at a given offset (in bytes), or anywhere
> > in the payload of the current protocol layer (including L2 header if used as
> > the first item in the stack).
> > 
> > This does not increment the protocol layer count as it is not a protocol
> > definition. Subsequent RAW items modulate the first absolute one with
> > relative offsets.
> > 
> > - Using **-1** as the ``offset`` of the first RAW item makes its absolute
> >   offset not fixed, i.e. the pattern is searched everywhere.
> > - ``mask`` only affects the pattern.
> > 
> > +--------------------------------------------------------------+
> > | RAW                                                          |
> > +==========+=============+=====================================+
> > | ``spec`` | ``offset``  | absolute or relative pattern offset |
> > |          +-------------+-------------------------------------+
> > |          | ``length``  | pattern length                      |
> > |          +-------------+-------------------------------------+
> > |          | ``pattern`` | byte string of the above length     |
> > +----------+-------------+-------------------------------------+
> > | ``mask`` | ``offset``  | ignored                             |
> > |          +-------------+-------------------------------------+
> > |          | ``length``  | ignored                             |
> > |          +-------------+-------------------------------------+
> > |          | ``pattern`` | bitmask with the same byte length   |
> > +----------+-------------+-------------------------------------+
> > 
> > Example pattern looking for several strings at various offsets of a UDP
> > payload, using combined RAW items:
> > 
> > +------------------------------------------+
> > | UDP payload matching                     |
> > +===+======================================+
> > | 0 | Ethernet                             |
> > +---+--------------------------------------+
> > | 1 | IPv4                                 |
> > +---+--------------------------------------+
> > | 2 | UDP                                  |
> > +---+-----+----------+-------------+-------+
> > | 3 | RAW | ``spec`` | ``offset``  | -1    |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``length``  | 3     |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``pattern`` | "foo" |
> > +---+-----+----------+-------------+-------+
> > | 4 | RAW | ``spec`` | ``offset``  | 20    |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``length``  | 3     |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``pattern`` | "bar" |
> > +---+-----+----------+-------------+-------+
> > | 5 | RAW | ``spec`` | ``offset``  | -30   |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``length``  | 3     |
> > |   |     |          +-------------+-------+
> > |   |     |          | ``pattern`` | "baz" |
> > +---+-----+----------+-------------+-------+
> > 
> > This translates to:
> > 
> > - Locate "foo" in UDP payload, remember its offset.
> > - Check "bar" at "foo"'s offset plus 20 bytes.
> > - Check "baz" at "foo"'s offset minus 30 bytes.
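
In software terms, the three steps above could look like the sketch below.
It is only meant to illustrate the relative offset semantics; function names
are made up, and trying every occurrence of "foo" (rather than only the
first one) is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if pat is found at offset off within buf[0..len). */
static int
bytes_at(const unsigned char *buf, size_t len, long off, const char *pat,
	 size_t pat_len)
{
	if (off < 0 || (size_t)off > len || pat_len > len - (size_t)off)
		return 0;
	return memcmp(buf + off, pat, pat_len) == 0;
}

/*
 * Locate "foo" anywhere in the payload (offset -1 in the first RAW item),
 * then check "bar" 20 bytes after it and "baz" 30 bytes before it.
 */
static int
match_foo_bar_baz(const unsigned char *payload, size_t len)
{
	long base;

	assert(payload != NULL);
	for (base = 0; (size_t)base + 3 <= len; ++base) {
		if (!bytes_at(payload, len, base, "foo", 3))
			continue;
		if (bytes_at(payload, len, base + 20, "bar", 3) &&
		    bytes_at(payload, len, base - 30, "baz", 3))
			return 1;
	}
	return 0;
}
```

Note that a negative relative offset implies "foo" cannot match within the
first 30 bytes of the payload in this example.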
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``ETH``
> > ^^^^^^^
> > 
> > Matches an Ethernet header.
> > 
> > - ``dst``: destination MAC.
> > - ``src``: source MAC.
> > - ``type``: EtherType.
> > - ``tags``: number of 802.1Q/ad tags defined.
> > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > 
> >  - ``tpid``: Tag protocol identifier.
> >  - ``tci``: Tag control information.
> > 
> > ``IPV4``
> > ^^^^^^^^
> > 
> > Matches an IPv4 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tos``: ToS/DSCP field.
> > - ``ttl``: TTL field.
> > - ``proto``: protocol number for the next layer.
> > 
> > ``IPV6``
> > ^^^^^^^^
> > 
> > Matches an IPv6 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tc``: traffic class field.
> > - ``nh``: Next header field (protocol).
> > - ``hop_limit``: hop limit field (TTL).
> > 
> > ``ICMP``
> > ^^^^^^^^
> > 
> > Matches an ICMP header.
> > 
> > - TBD.
> > 
> > ``UDP``
> > ^^^^^^^
> > 
> > Matches a UDP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - ``length``: UDP length.
> > - ``checksum``: UDP checksum.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``TCP``
> > ^^^^^^^
> > 
> > Matches a TCP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - All other TCP fields and bits.
> > 
> > ``VXLAN``
> > ^^^^^^^^^
> > 
> > Matches a VXLAN header.
> > 
> > - TBD.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Actions
> > ~~~~~~~
> > 
> > Each possible action is represented by a type. Some have associated
> > configuration structures. Several actions combined in a list can be assigned
> > to a flow rule. That list is not ordered.
> > 
> > At least one action must be defined in a filter rule in order to do
> > something with matched packets.
> > 
> > - Actions are defined with ``struct rte_flow_action``.
> > - A list of actions is defined with ``struct rte_flow_actions``.
> > 
> > They fall in three categories:
> > 
> > - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> >   processing matched packets by subsequent flow rules, unless overridden
> >   with PASSTHRU.
> > 
> > - Non terminating actions (PASSTHRU, DUP) that leave matched packets up for
> >   additional processing by subsequent flow rules.
> > 
> > - Other non terminating meta actions that do not affect the fate of packets
> >   (END, VOID, ID, COUNT).
> > 
> > When several actions are combined in a flow rule, they should all have
> > different types (e.g. dropping a packet twice is not possible). However
> > considering the VOID type is an exception to this rule, the defined behavior
> > is for PMDs to only take into account the last action of a given type found
> > in the list. PMDs still perform error checking on the entire list.
> > 
> > *Note that PASSTHRU is the only action able to override a terminating rule.*
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Example of an action that redirects packets to queue index 10:
> > 
> > +----------------+
> > | QUEUE          |
> > +===========+====+
> > | ``queue`` | 10 |
> > +-----------+----+
> > 
> > Examples of action lists follow. Their order is not significant;
> > applications must consider all actions to be performed simultaneously:
> > 
> > +----------------+
> > | Count and drop |
> > +=======+========+
> > | COUNT |        |
> > +-------+--------+
> > | DROP  |        |
> > +-------+--------+
> > 
> > +--------------------------+
> > | Tag, count and redirect  |
> > +=======+===========+======+
> > | ID    | ``id``    | 0x2a |
> > +-------+-----------+------+
> > | COUNT |                  |
> > +-------+-----------+------+
> > | QUEUE | ``queue`` | 10   |
> > +-------+-----------+------+
> > 
> > +-----------------------+
> > | Redirect to queue 5   |
> > +=======+===============+
> > | DROP  |               |
> > +-------+-----------+---+
> > | QUEUE | ``queue`` | 5 |
> > +-------+-----------+---+
> > 
> > In the above example, since both actions are performed simultaneously,
> > the end result is that only QUEUE has any effect.
> > 
> > +-----------------------+
> > | Redirect to queue 3   |
> > +=======+===========+===+
> > | QUEUE | ``queue`` | 5 |
> > +-------+-----------+---+
> > | VOID  |               |
> > +-------+-----------+---+
> > | QUEUE | ``queue`` | 3 |
> > +-------+-----------+---+
> > 
> > As previously described, only the last action of a given type found in the
> > list is taken into account. The above example also shows that VOID is
> > ignored.
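
The "last action of a given type wins" rule can be sketched as follows. This is only an illustration under stated assumptions: ``enum action_type``, ``struct action`` and ``resolve_actions()`` are stand-ins I made up, not the definitions proposed by this draft (only END being 0 comes from the text). The error checking PMDs must still perform on the entire list is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in action types; only END == 0 is stated by the draft. */
enum action_type {
	ACT_END = 0,
	ACT_VOID,
	ACT_QUEUE,
	ACT_DROP,
	ACT_COUNT,
	ACT_MAX,
};

struct action {
	enum action_type type;
	unsigned int queue; /* only meaningful for ACT_QUEUE */
};

/*
 * For each type, record the last action of that type found before END.
 * VOID entries are skipped, so last[ACT_VOID] always ends up NULL.
 */
void
resolve_actions(const struct action *list, const struct action *last[ACT_MAX])
{
	size_t i;

	assert(list != NULL);
	for (i = 0; i < ACT_MAX; ++i)
		last[i] = NULL;
	for (i = 0; list[i].type != ACT_END; ++i) {
		if (list[i].type == ACT_VOID)
			continue;
		last[list[i].type] = &list[i];
	}
}
```

Applied to the "Redirect to queue 3" list above (QUEUE 5, VOID, QUEUE 3, END), the only entry left for QUEUE is the one targeting queue 3.
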
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Action types
> > ~~~~~~~~~~~~
> > 
> > Common action types are described in this section. Like pattern item types,
> > this list is not exhaustive as new actions will be added in the future.
> > 
> > ``END`` (action)
> > ^^^^^^^^^^^^^^^^
> > 
> > End marker for action lists. Prevents further processing of actions, thereby
> > ending the list.
> > 
> > - Its numeric value is **0** for convenience.
> > - PMD support is mandatory.
> > - No configurable property.
> > 
> > +---------------+
> > | END           |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > ``VOID`` (action)
> > ^^^^^^^^^^^^^^^^^
> > 
> > Used as a placeholder for convenience. It is ignored and simply discarded by
> > PMDs.
> > 
> > - PMD support is mandatory.
> > - No configurable property.
> > 
> > +---------------+
> > | VOID          |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > ``PASSTHRU``
> > ^^^^^^^^^^^^
> > 
> > Leaves packets up for additional processing by subsequent flow rules. This
> > is the default when a rule does not contain a terminating action, but can be
> > specified to force a rule to become non-terminating.
> > 
> > - No configurable property.
> > 
> > +---------------+
> > | PASSTHRU      |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > Example to copy a packet to a queue and continue processing by subsequent
> > flow rules:
> [Sugesh] If a packet gets copied to a queue, it’s a terminating action.
> How is it possible to perform subsequent actions after the packet has
> already been moved to the queue? How does it differ from the DUP action?
> Am I missing anything here?

Devices may not support the combination of QUEUE + PASSTHRU (i.e. making
QUEUE non-terminating). However these same devices may expose the ability to
copy a packet to another (sniffer) queue all while keeping the rule
terminating (QUEUE + DUP but no PASSTHRU).

DUP with two rules, assuming priorities and PASSTHRU are supported:

- pattern X, priority 0; actions: QUEUE 5, PASSTHRU (non-terminating)

- pattern X, priority 1; actions: QUEUE 6 (terminating)

DUP with two actions on a single rule and a single priority:

- pattern X, priority 0; actions: DUP 5, QUEUE 6 (terminating)

If supported, from an application point of view the end result is similar in
both cases (note the second case may be implemented by the PMD using two HW
rules internally).

However the second case does not waste a priority level and clearly states
the intent to the PMD, which makes it more likely to be supported. If HW
supports DUP directly it is even faster since there is a single rule. That
is why I thought having DUP as an action would be useful.

> > +--------------------------+
> > | Copy to queue 8          |
> > +==========+===============+
> > | PASSTHRU |               |
> > +----------+-----------+---+
> > | QUEUE    | ``queue`` | 8 |
> > +----------+-----------+---+
> > 
> > ``ID``
> > ^^^^^^
> > 
> > Attaches a 32 bit value to packets.
> > 
> > +----------------------------------------------+
> > | ID                                           |
> > +========+=====================================+
> > | ``id`` | 32 bit value to return with packets |
> > +--------+-------------------------------------+
> > 
> [Sugesh] I assume the application has to program the flow 
> with a unique ID and matching packets are stamped with this ID
> when reporting to the software. The uniqueness of ID is NOT 
> guaranteed by the API framework. Correct me if I am wrong here.

You are right, if the way I wrote it is not clear enough, I'm open to
suggestions to improve it.

> [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible to have a
> 64 bit ID? So that application can use the control plane flow pointer
> Itself as an ID. Does it make sense? 

I've specified a 32 bit ID for now because this is what FDIR supports and
also what existing devices can report today AFAIK (i40e and mlx5).

We could use 64 bit for future-proofness in a separate action like "ID64"
when at least one device supports it.

To PMD maintainers: please comment if you know devices that support tagging
matching packets with more than 32 bits of user-provided data!

> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``QUEUE``
> > ^^^^^^^^^
> > 
> > Assigns packets to a given queue index.
> > 
> > - Terminating by default.
> > 
> > +--------------------------------+
> > | QUEUE                          |
> > +===========+====================+
> > | ``queue`` | queue index to use |
> > +-----------+--------------------+
> > 
> > ``DROP``
> > ^^^^^^^^
> > 
> > Drop packets.
> > 
> > - No configurable property.
> > - Terminating by default.
> > - PASSTHRU overrides this action if both are specified.
> > 
> > +---------------+
> > | DROP          |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > ``COUNT``
> > ^^^^^^^^^
> > 
> [Sugesh] Should we really have to set the COUNT action explicitly for every rule?
> IMHO it would be great for it to be an implicit action. Most applications would be
> interested in the stats of almost all the filters/flows.

I can see why, but no, it must be explicitly requested because you may want
to know in advance when it is not supported. Also considering it is
something else to be done by HW (a separate action), we can assume enabling
this may slow things down a bit.

HW limitations may also prevent you from having as many flow counters as you
want, in which case you probably want to carefully pick which rules have
them.

I think this action is most useful with DROP, VF and PF actions since
those are currently the only ones where SW may not see the related packets.

> > Enables hits counter for this rule.
> > 
> > This counter can be retrieved and reset through ``rte_flow_query()``, see
> > ``struct rte_flow_query_count``.
> > 
> > - Counters can be retrieved with ``rte_flow_query()``.
> > - No configurable property.
> > 
> > +---------------+
> > | COUNT         |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > Query structure to retrieve and reset the flow rule hits counter:
> > 
> > +------------------------------------------------+
> > | COUNT query                                    |
> > +===========+=====+==============================+
> > | ``reset`` | in  | reset counter after query    |
> > +-----------+-----+------------------------------+
> > | ``hits``  | out | number of hits for this flow |
> > +-----------+-----+------------------------------+
> > 
> > ``DUP``
> > ^^^^^^^
> > 
> > Duplicates packets to a given queue index.
> > 
> > This is normally combined with QUEUE, however when used alone, it is
> > actually similar to QUEUE + PASSTHRU.
> > 
> > - Non-terminating by default.
> > 
> > +------------------------------------------------+
> > | DUP                                            |
> > +===========+====================================+
> > | ``queue`` | queue index to duplicate packet to |
> > +-----------+------------------------------------+
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``RSS``
> > ^^^^^^^
> > 
> > Similar to QUEUE, except RSS is additionally performed on packets to spread
> > them among several queues according to the provided parameters.
> > 
> > - Terminating by default.
> > 
> > +---------------------------------------------+
> > | RSS                                         |
> > +==============+==============================+
> > | ``rss_conf`` | RSS parameters               |
> > +--------------+------------------------------+
> > | ``queues``   | number of entries in queue[] |
> > +--------------+------------------------------+
> > | ``queue[]``  | queue indices to use         |
> > +--------------+------------------------------+
> > 
> > ``PF`` (action)
> > ^^^^^^^^^^^^^^^
> > 
> > Redirects packets to the physical function (PF) of the current device.
> > 
> > - No configurable property.
> > - Terminating by default.
> > 
> > +---------------+
> > | PF            |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > ``VF`` (action)
> > ^^^^^^^^^^^^^^^
> > 
> > Redirects packets to the virtual function (VF) of the current device with
> > the specified ID.
> > 
> > - Terminating by default.
> > 
> > +---------------------------------------+
> > | VF                                    |
> > +========+==============================+
> > | ``id`` | VF ID to redirect packets to |
> > +--------+------------------------------+
> > 
> > Planned types
> > ~~~~~~~~~~~~~
> > 
> > Other action types are planned but not defined yet. These actions will add
> > the ability to alter matching packets in several ways, such as performing
> > encapsulation/decapsulation of tunnel headers on specific flows.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Rules management
> > ----------------
> > 
> > A simple API with only four functions is provided to fully manage flows.
> > 
> > Each created flow rule is associated with an opaque, PMD-specific handle
> > pointer. The application is responsible for keeping it until the rule is
> > destroyed.
> > 
> > Flow rules are defined with ``struct rte_flow``.
> > 
> > Validation
> > ~~~~~~~~~~
> > 
> > Given that expressing a definite set of device capabilities with this API is
> > not practical, a dedicated function is provided to check if a flow rule is
> > supported and can be created.
> > 
> > ::
> > 
> >  int
> >  rte_flow_validate(uint8_t port_id,
> >                    const struct rte_flow_pattern *pattern,
> >                    const struct rte_flow_actions *actions);
> > 
> > While this function has no effect on the target device, the flow rule is
> > validated against its current configuration state and the returned value
> > should be considered valid by the caller for that state only.
> > 
> > The returned value is guaranteed to remain valid only as long as no
> > successful calls to rte_flow_create() or rte_flow_destroy() are made in the
> > meantime and no device parameter affecting flow rules in any way is
> > modified, due to possible collisions or resource limitations (although in
> > such cases ``EINVAL`` should not be returned).
> > 
> > Arguments:
> > 
> > - ``port_id``: port identifier of Ethernet device.
> > - ``pattern``: pattern specification to check.
> > - ``actions``: actions associated with the flow definition.
> > 
> > Return value:
> > 
> > - **0** if flow rule is valid and can be created. A negative errno value
> >   otherwise (``rte_errno`` is also set), the following errors are defined.
> > - ``-EINVAL``: unknown or invalid rule specification.
> > - ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial masks
> >   are unsupported).
> > - ``-EEXIST``: collision with an existing rule.
> > - ``-ENOMEM``: not enough resources.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Creation
> > ~~~~~~~~
> > 
> > Creating a flow rule is similar to validating one, except the rule is
> > actually created.
> > 
> > ::
> > 
> >  struct rte_flow *
> >  rte_flow_create(uint8_t port_id,
> >                  const struct rte_flow_pattern *pattern,
> >                  const struct rte_flow_actions *actions);
> > 
> > Arguments:
> > 
> > - ``port_id``: port identifier of Ethernet device.
> > - ``pattern``: pattern specification to add.
> > - ``actions``: actions associated with the flow definition.
> > 
> > Return value:
> > 
> > A valid flow pointer in case of success, NULL otherwise and ``rte_errno`` is
> > set to the positive version of one of the error codes defined for
> > ``rte_flow_validate()``.
> [Sugesh] : Kind of an implementation-specific query. What if the application
> tries to add duplicate rules? Does the API create a new flow entry for every
> API call?

If an application adds duplicate rules at a given priority level, the second
one may return an error depending on the PMD. Collisions are sometimes
trivial to detect (such as the same pattern twice), others not so much (one
matching an Ethernet header only, the other one matching an IP header only).

Either way if a packet is matched by two rules at a given priority level,
what happens is described in 3.3 (High level design) and 4.4.1 (Priorities).

Applications must not rely on the PMD to detect these; otherwise they
should give each rule its own priority level to make things clear.

However since the number of HW priority levels is finite and possibly small,
they must also make sure not to waste them. My advice is to only use
priority levels when it cannot be proven that rules do not collide.

If all you have is perfect matching rules without wildcards and all of them
match the same number of layers, a single priority level is fine.

> [Sugesh] Another concern is the cost and time of installing these rules
> in the hardware. Can we make these APIs time bound (or at least provide an
> option to set a time limit for executing them), so that the
> application doesn’t have to wait too long when installing and deleting flows with
> slow hardware/NIC? What do you think? Most datapath flow installations are
> dynamic and triggered only when there is
> ingress traffic. Delays in flow insertion/deletion have unpredictable consequences.

This API is (currently) aimed at the control path only, and must indeed be
assumed to be slow. Creating millions of rules may take quite long as it may
involve syscalls and other time-consuming synchronization things on the PMD
side.

So currently there is no plan to have rules added from the data path with
time constraints. I think it would be implemented through a different set of
functions anyway.

I do not think adding time limits is practical, even specifying in the API
that creating a single flow rule must take less than a maximum number of
seconds in order to be effective is too much of a constraint (applications
that create all flows during init may not care after all).

You should consider in any case that modifying flow rules will always be
slower than receiving packets, there is no way around that. Applications
have to live with it and provide a software fallback for incoming packets
while managing flow rules.

Moreover, think about what happens when you hit the maximum number of flow
rules and cannot create any more. Applications need to implement some kind
of fallback in their data path.
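
Such a fallback could be structured along these lines. This is a sketch against a mock device, not the proposed API: ``struct mock_port``, ``mock_flow_create()`` and ``offload_or_fallback()`` are hypothetical, and the error is returned through an out parameter here instead of ``rte_errno`` to keep the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_flow {
	int id;
};

/* Mock device with a small, fixed number of rule slots. */
struct mock_port {
	unsigned int max_rules;
	unsigned int nb_rules;
	struct mock_flow rules[4];
};

/* Stand-in for rule creation: NULL and *err set when out of resources. */
struct mock_flow *
mock_flow_create(struct mock_port *port, int *err)
{
	assert(port != NULL && err != NULL);
	if (port->nb_rules >= port->max_rules ||
	    port->nb_rules >= sizeof(port->rules) / sizeof(port->rules[0])) {
		*err = ENOMEM;
		return NULL;
	}
	port->rules[port->nb_rules].id = (int)port->nb_rules;
	return &port->rules[port->nb_rules++];
}

enum handling {
	HANDLED_IN_HW,
	HANDLED_IN_SW,
};

/* Try to offload; tell the data path to classify in SW on failure. */
enum handling
offload_or_fallback(struct mock_port *port, struct mock_flow **flow)
{
	int err = 0;

	assert(flow != NULL);
	*flow = mock_flow_create(port, &err);
	return *flow != NULL ? HANDLED_IN_HW : HANDLED_IN_SW;
}
```

The data path keeps working either way; only the place where classification happens changes.
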

Offloading flows in HW is also only useful if they live much longer than the
time taken to create and delete them. Perhaps applications may choose to do
so after detecting long lived flows such as TCP sessions.

You may have one separate control thread dedicated to managing flows and
keep your normal control thread unaffected by delays. Several threads can
even be dedicated, one per device.

> [Sugesh] Another query is on the synchronization part. What if the same rules are
> handled from different threads? Is the application responsible for handling concurrent
> hardware programming?

Like most (if not all) DPDK APIs, applications are responsible for managing
locking issues as described in 4.3 (Behavior). Since this is a control path
API and applications usually have a single control thread, locking should
not be necessary in most cases.

Regarding my above comment about using several control threads to manage
different devices, section 4.3 says:
 
 "There is no provision for reentrancy/multi-thread safety, although nothing
 should prevent different devices from being configured at the same
 time. PMDs may protect their control path functions accordingly."

I'd like to emphasize it is not "per port" but "per device", since in a few
cases a configurable resource is shared by several ports. It may be
difficult for applications to determine which ports share a given
device but this falls outside the scope of this API.

Do you think adding the guarantee that it is always safe to configure two
different ports simultaneously without locking from the application side is
necessary? In which case the PMD would be responsible for locking shared
resources.

> > Destruction
> > ~~~~~~~~~~~
> > 
> > Flow rules destruction is not automatic, and a queue should not be released
> > if any are still attached to it. Applications must take care of performing
> > this step before releasing resources.
> > 
> > ::
> > 
> >  int
> >  rte_flow_destroy(uint8_t port_id,
> >                   struct rte_flow *flow);
> > 
> > 
> [Sugesh] I would suggest that a clean-up API would be really useful, as releasing a
> queue (is it applicable to releasing a port too?) does not guarantee automatic flow
> destruction.

Would something like rte_flow_flush(port_id) do the trick? I wanted to
emphasize in this first draft that applications should really keep the flow
pointers around in order to manage/destroy them. It is their responsibility,
not PMD's.
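
On the application side, keeping the pointers around and flushing them could be as simple as the sketch below. All names (``struct app_flows``, ``app_flows_track()``, ``app_flows_flush()``, ``mock_flow_destroy()``) are hypothetical, and the mock destroy function merely records the order of destruction. Rules are destroyed in reverse order of their creation since this is the only order guaranteed to succeed according to the draft.

```c
#include <assert.h>
#include <stddef.h>

#define APP_MAX_FLOWS 8

/* Stand-in for an opaque flow handle. */
struct mock_flow {
	int destroyed;
	unsigned int order; /* position in the destruction sequence */
};

/* Application-side bookkeeping, in creation order. */
struct app_flows {
	struct mock_flow *flow[APP_MAX_FLOWS];
	size_t count;
};

static unsigned int destroy_seq;

/* Stand-in for the destroy function; always succeeds in this mock. */
static int
mock_flow_destroy(struct mock_flow *flow)
{
	flow->destroyed = 1;
	flow->order = destroy_seq++;
	return 0;
}

/* Remember a newly created flow rule. */
int
app_flows_track(struct app_flows *af, struct mock_flow *flow)
{
	assert(af != NULL && flow != NULL);
	if (af->count == APP_MAX_FLOWS)
		return -1;
	af->flow[af->count++] = flow;
	return 0;
}

/* Destroy all tracked rules, most recently created first. */
int
app_flows_flush(struct app_flows *af)
{
	assert(af != NULL);
	while (af->count > 0) {
		int ret = mock_flow_destroy(af->flow[af->count - 1]);

		if (ret != 0)
			return ret;
		--af->count;
	}
	return 0;
}
```
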

> This way application can initialize the port,
> clean-up all the existing rules and create new rules  on a clean slate.

No resource can be released as long as a flow rule is using it (bad things
may happen otherwise), all flow rules must be destroyed first, thus none can
possibly remain after initializing a port. It is assumed that PMDs do
automatic clean up during init if necessary to ensure this.

> > Failure to destroy a flow rule may occur when other flow rules depend on it,
> > and destroying it would result in an inconsistent state.
> > 
> > This function is only guaranteed to succeed if flow rules are destroyed in
> > reverse order of their creation.
> > 
> > Arguments:
> > 
> > - ``port_id``: port identifier of Ethernet device.
> > - ``flow``: flow rule to destroy.
> > 
> > Return value:
> > 
> > - **0** on success, a negative errno value otherwise and ``rte_errno`` is
> >   set.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Query
> > ~~~~~
> > 
> > Query an existing flow rule.
> > 
> > This function allows retrieving flow-specific data such as counters. Data
> > is gathered by special actions which must be present in the flow rule
> > definition.
> > 
> > ::
> > 
> >  int
> >  rte_flow_query(uint8_t port_id,
> >                 struct rte_flow *flow,
> >                 enum rte_flow_action_type action,
> >                 void *data);
> > 
> > Arguments:
> > 
> > - ``port_id``: port identifier of Ethernet device.
> > - ``flow``: flow rule to query.
> > - ``action``: action type to query.
> > - ``data``: pointer to storage for the associated query data type.
> > 
> > Return value:
> > 
> > - **0** on success, a negative errno value otherwise and ``rte_errno`` is
> >   set.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Behavior
> > --------
> > 
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> > 
> > - There is no provision for reentrancy/multi-thread safety, although nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> > 
> > - Stopping the data path (TX/RX) should not be necessary when managing flow
> >   rules. If this cannot be achieved naturally or with workarounds (such as
> >   temporarily replacing the burst function pointers), an appropriate error
> >   code must be returned (``EBUSY``).
> > 
> > - PMDs, not applications, are responsible for maintaining flow rules
> >   configuration when stopping and restarting a port or performing other
> >   actions which may affect them. They can only be destroyed explicitly.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> [Sugesh] What about querying all the rules for a specific port/queue? It would be
> useful when adding and deleting ports and queues dynamically according to need.
> I am not sure what the other use cases for these APIs are, but I feel it would make it
> much easier to manage flows from the application. What do you think?

Not sure, that seems to fall out of the scope of this API. As described,
applications already store the related rte_flow pointers. Accordingly, they
know how many rules are associated with a given port. They need both a port
ID and a flow rule pointer to destroy them after all.

Now perhaps something to convert an existing rte_flow back to a pattern and
a list of actions could be added, however I cannot see an immediate use case
for it.

What you describe seems to be doable through a front-end API, I think
keeping this one as low-level as possible with only basic actions is better
right now. I'll keep your suggestion in mind.

> > Compatibility
> > -------------
> > 
> > No known hardware implementation supports all the features described in this
> > document.
> > 
> > Unsupported features or combinations are not expected to be fully emulated
> > in software by PMDs for performance reasons. Partially supported features
> > may be completed in software as long as hardware performs most of the work
> > (such as queue redirection and packet recognition).
> > 
> > However PMDs are expected to do their best to satisfy application requests
> > by working around hardware limitations as long as doing so does not affect
> > the behavior of existing flow rules.
> > 
> > The following sections provide a few examples of such cases. They are based
> > on limitations built into the previous APIs.
> > 
> > Global bitmasks
> > ~~~~~~~~~~~~~~~
> > 
> > Each flow rule comes with its own, per-layer bitmasks, while hardware may
> > support only a single, device-wide bitmask for a given layer type, so that
> > two IPv4 rules cannot use different bitmasks.
> > 
> > The expected behavior in this case is that PMDs automatically configure
> > global bitmasks according to the needs of the first created flow rule.
> > 
> > Subsequent rules are allowed only if their bitmasks match those, the
> > ``EEXIST`` error code should be returned otherwise.
> > 
> > Unsupported layer types
> > ~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > Many protocols can be simulated by crafting patterns with the `RAW`_ type.
> > 
> > PMDs can rely on this capability to simulate support for protocols with
> > fixed headers not directly recognized by hardware.
> > 
> > ``ANY`` pattern item
> > ~~~~~~~~~~~~~~~~~~~~
> > 
> > This pattern item stands for anything, which can be difficult to translate
> > to something hardware would understand, particularly if followed by more
> > specific types.
> > 
> > Consider the following pattern:
> > 
> > +---+--------------------------------+
> > | 0 | ETH                            |
> > +---+--------------------------------+
> > | 1 | ANY (``min`` = 1, ``max`` = 1) |
> > +---+--------------------------------+
> > | 2 | TCP                            |
> > +---+--------------------------------+
> > 
> > Knowing that TCP does not make sense with something other than IPv4 and IPv6
> > as L3, such a pattern may be translated to two flow rules instead:
> > 
> > +---+--------------------+
> > | 0 | ETH                |
> > +---+--------------------+
> > | 1 | IPV4 (zeroed mask) |
> > +---+--------------------+
> > | 2 | TCP                |
> > +---+--------------------+
> > 
> > +---+--------------------+
> > | 0 | ETH                |
> > +---+--------------------+
> > | 1 | IPV6 (zeroed mask) |
> > +---+--------------------+
> > | 2 | TCP                |
> > +---+--------------------+
> > 
> > Note that as soon as an ANY rule covers several layers, this approach may
> > yield a large number of hidden flow rules. It is thus suggested to only
> > support the most common scenarios (anything as L2 and/or L3).
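
The translation of a single-layer ANY at L3 could be sketched like this. ``enum item_type``, ``MAX_ITEMS`` and ``expand_any_l3()`` are made up for the example, only the first ANY item is expanded, and masks are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in pattern item types. */
enum item_type {
	ITEM_END = 0,
	ITEM_ETH,
	ITEM_ANY,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP,
};

#define MAX_ITEMS 8 /* arbitrary limit, END marker included */

/*
 * Copy pattern to out[], replacing its first ANY item with IPV4 in the
 * first copy and IPV6 in the second one. Returns the number of patterns
 * written: 2 if an ANY item was found, 1 otherwise.
 */
int
expand_any_l3(const enum item_type *pattern,
	      enum item_type out[2][MAX_ITEMS])
{
	static const enum item_type l3[2] = { ITEM_IPV4, ITEM_IPV6 };
	size_t i, j, any = MAX_ITEMS;
	int v, n;

	assert(pattern != NULL);
	for (i = 0; pattern[i] != ITEM_END; ++i) {
		assert(i + 1 < MAX_ITEMS);
		if (pattern[i] == ITEM_ANY && any == MAX_ITEMS)
			any = i;
	}
	n = (any == MAX_ITEMS) ? 1 : 2;
	for (v = 0; v < n; ++v) {
		/* Copy everything including the END marker at index i. */
		for (j = 0; j <= i; ++j)
			out[v][j] = pattern[j];
		if (any != MAX_ITEMS)
			out[v][any] = l3[v];
	}
	return n;
}
```

Each additional layer covered by ANY multiplies the number of generated patterns, which is why limiting support to the most common scenarios is suggested above.
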
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > Unsupported actions
> > ~~~~~~~~~~~~~~~~~~~
> > 
> > - When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
> >   tagging (`ID`_) may be implemented in software as long as the target queue
> >   is used by a single rule.
> > 
> > - A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
> >   rules combining `QUEUE`_ and `PASSTHRU`_.
> > 
> > - When a single target queue is provided, `RSS`_ can also be implemented
> >   through `QUEUE`_.
> > 
> > Flow rules priority
> > ~~~~~~~~~~~~~~~~~~~
> > 
> > While it would naturally make sense, flow rules cannot be assumed to be
> > processed by hardware in the same order as their creation for several
> > reasons:
> > 
> > - They may be managed internally as a tree or a hash table instead of a
> >   list.
> > - Removing a flow rule before adding another one can either put the new rule
> >   at the end of the list or reuse a freed entry.
> > - Duplication may occur when packets are matched by several rules.
> > 
> > For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
> > predictable behavior is only guaranteed by using different priority levels.
> > 
> > Priority levels are not necessarily implemented in hardware, or may be
> > severely limited (e.g. a single priority bit).
> > 
> > For these reasons, priority levels may be implemented purely in software by
> > PMDs.
> > 
> > - For devices expecting flow rules to be added in the correct order, PMDs
> >   may destroy and re-create existing rules after adding a new one with
> >   a higher priority.
> > 
> > - A configurable number of dummy or empty rules can be created at
> >   initialization time to save high priority slots for later.
> > 
> > - In order to save priority levels, PMDs may evaluate whether rules are
> >   likely to collide and adjust their priority accordingly.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > API migration
> > =============
> > 
> > Exhaustive list of deprecated filter types and how to convert them to
> > generic flow rules.
> > 
> > ``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
> > ---------------------------------------
> > 
> > `MACVLAN`_ can be translated to a basic `ETH`_ flow rule with a `VF
> > (action)`_ or `PF (action)`_ terminating action.
> > 
> > +------------------------------------+
> > | MACVLAN                            |
> > +--------------------------+---------+
> > | Pattern                  | Actions |
> > +===+=====+==========+=====+=========+
> > | 0 | ETH | ``spec`` | any | VF,     |
> > |   |     +----------+-----+ PF      |
> > |   |     | ``mask`` | any |         |
> > +---+-----+----------+-----+---------+
> > 
> > ``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
> > ----------------------------------------------
> > 
> > `ETHERTYPE`_ is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as
> > a terminating action.
> > 
> > +------------------------------------+
> > | ETHERTYPE                          |
> > +--------------------------+---------+
> > | Pattern                  | Actions |
> > +===+=====+==========+=====+=========+
> > | 0 | ETH | ``spec`` | any | QUEUE,  |
> > |   |     +----------+-----+ DROP    |
> > |   |     | ``mask`` | any |         |
> > +---+-----+----------+-----+---------+
> > 
> > ``FLEXIBLE`` to ``RAW`` → ``QUEUE``
> > -----------------------------------
> > 
> > `FLEXIBLE`_ can be translated to one `RAW`_ pattern with `QUEUE`_ as the
> > terminating action and a defined priority level.
> > 
> > +------------------------------------+
> > | FLEXIBLE                           |
> > +--------------------------+---------+
> > | Pattern                  | Actions |
> > +===+=====+==========+=====+=========+
> > | 0 | RAW | ``spec`` | any | QUEUE   |
> > |   |     +----------+-----+         |
> > |   |     | ``mask`` | any |         |
> > +---+-----+----------+-----+---------+
> > 
> > ``SYN`` to ``TCP`` → ``QUEUE``
> > ------------------------------
> > 
> > `SYN`_ is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
> > `QUEUE`_ as the terminating action.
> > 
> > Priority level can be set to simulate the high priority bit.
> > 
> > +---------------------------------------------+
> > | SYN                                         |
> > +-----------------------------------+---------+
> > | Pattern                           | Actions |
> > +===+======+==========+=============+=========+
> > | 0 | ETH  | ``spec`` | N/A         | QUEUE   |
> > |   |      +----------+-------------+         |
> > |   |      | ``mask`` | empty       |         |
> > +---+------+----------+-------------+         |
> > | 1 | IPV4 | ``spec`` | N/A         |         |
> > |   |      +----------+-------------+         |
> > |   |      | ``mask`` | empty       |         |
> > +---+------+----------+-------------+         |
> > | 2 | TCP  | ``spec`` | ``syn`` = 1 |         |
> > |   |      +----------+-------------+         |
> > |   |      | ``mask`` | ``syn`` = 1 |         |
> > +---+------+----------+-------------+---------+
> > 
> > ``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
> > ----------------------------------------------------
> > 
> > `NTUPLE`_ is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
> > `UDP`_ as L4 and `QUEUE`_ as the terminating action.
> > 
> > A priority level can be specified as well.
> > 
> > +---------------------------------------+
> > | NTUPLE                                |
> > +-----------------------------+---------+
> > | Pattern                     | Actions |
> > +===+======+==========+=======+=========+
> > | 0 | ETH  | ``spec`` | N/A   | QUEUE   |
> > |   |      +----------+-------+         |
> > |   |      | ``mask`` | empty |         |
> > +---+------+----------+-------+         |
> > | 1 | IPV4 | ``spec`` | any   |         |
> > |   |      +----------+-------+         |
> > |   |      | ``mask`` | any   |         |
> > +---+------+----------+-------+         |
> > | 2 | TCP, | ``spec`` | any   |         |
> > |   | UDP  +----------+-------+         |
> > |   |      | ``mask`` | any   |         |
> > +---+------+----------+-------+---------+
> > 
> > ``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
> > ---------------------------------------------------------------------------
> > 
> > `TUNNEL`_ matches common IPv4 and IPv6 L3/L4-based tunnel types.
> > 
> > In the following table, `ANY`_ is used to cover the optional L4.
> > 
> > +------------------------------------------------+
> > | TUNNEL                                         |
> > +--------------------------------------+---------+
> > | Pattern                              | Actions |
> > +===+=========+==========+=============+=========+
> > | 0 | ETH     | ``spec`` | any         | QUEUE   |
> > |   |         +----------+-------------+         |
> > |   |         | ``mask`` | any         |         |
> > +---+---------+----------+-------------+         |
> > | 1 | IPV4,   | ``spec`` | any         |         |
> > |   | IPV6    +----------+-------------+         |
> > |   |         | ``mask`` | any         |         |
> > +---+---------+----------+-------------+         |
> > | 2 | ANY     | ``spec`` | ``min`` = 0 |         |
> > |   |         |          +-------------+         |
> > |   |         |          | ``max`` = 0 |         |
> > |   |         +----------+-------------+         |
> > |   |         | ``mask`` | N/A         |         |
> > +---+---------+----------+-------------+         |
> > | 3 | VXLAN,  | ``spec`` | any         |         |
> > |   | GENEVE, +----------+-------------+         |
> > |   | TEREDO, | ``mask`` | any         |         |
> > |   | NVGRE,  |          |             |         |
> > |   | GRE,    |          |             |         |
> > |   | ...     |          |             |         |
> > +---+---------+----------+-------------+---------+
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
> > ---------------------------------------------------------------
> > 
> > `FDIR`_ is more complex than any other type, there are several methods to
> > emulate its functionality. It is summarized for the most part in the table
> > below.
> > 
> > A few features are intentionally not supported:
> > 
> > - The ability to configure the matching input set and masks for the entire
> >   device, PMDs should take care of it automatically according to flow rules.
> > 
> > - Returning four or eight bytes of matched data when using flex bytes
> >   filtering. Although a specific action could implement it, it conflicts
> >   with the much more useful 32 bits tagging on devices that support it.
> > 
> > - Side effects on RSS processing of the entire device. Flow rules that
> >   conflict with the current device configuration should not be
> >   allowed. Similarly, device configuration should not be allowed when it
> >   affects existing flow rules.
> > 
> > - Device modes of operation. "none" is unsupported since filtering cannot be
> >   disabled as long as a flow rule is present.
> > 
> > - "MAC VLAN" or "tunnel" perfect matching modes should be automatically
> >   set according to the created flow rules.
> > 
> > +----------------------------------------------+
> > | FDIR                                         |
> > +---------------------------------+------------+
> > | Pattern                         | Actions    |
> > +===+============+==========+=====+============+
> > | 0 | ETH,       | ``spec`` | any | QUEUE,     |
> > |   | RAW        +----------+-----+ DROP,      |
> > |   |            | ``mask`` | any | PASSTHRU   |
> > +---+------------+----------+-----+------------+
> > | 1 | IPV4,      | ``spec`` | any | ID         |
> > |   | IPV6       +----------+-----+ (optional) |
> > |   |            | ``mask`` | any |            |
> > +---+------------+----------+-----+            |
> > | 2 | TCP,       | ``spec`` | any |            |
> > |   | UDP,       +----------+-----+            |
> > |   | SCTP       | ``mask`` | any |            |
> > +---+------------+----------+-----+            |
> > | 3 | VF,        | ``spec`` | any |            |
> > |   | PF,        +----------+-----+            |
> > |   | SIGNATURE  | ``mask`` | any |            |
> > |   | (optional) |          |     |            |
> > +---+------------+----------+-----+------------+
> > 
> > ``HASH``
> > ~~~~~~~~
> > 
> > Hashing configuration is set per rule through the `SIGNATURE`_ item.
> > 
> > Since it is usually a global device setting, all flow rules created with
> > this item may have to share the same specification.
> > 
> > ``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > All packets are matched. This type alters incoming packets to encapsulate
> > them in a chosen tunnel type, optionally redirect them to a VF as well.
> > 
> > The destination pool for tag based forwarding can be emulated with other
> > flow rules using `DUP`_ as the action.
> > 
> > +----------------------------------------+
> > | L2_TUNNEL                              |
> > +---------------------------+------------+
> > | Pattern                   | Actions    |
> > +===+======+==========+=====+============+
> > | 0 | VOID | ``spec`` | N/A | VXLAN,     |
> > |   |      |          |     | GENEVE,    |
> > |   |      |          |     | ...        |
> > |   |      +----------+-----+------------+
> > |   |      | ``mask`` | N/A | VF         |
> > |   |      |          |     | (optional) |
> > +---+------+----------+-----+------------+
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-08 11:11 ` Liang, Cunming
  2016-07-08 12:38   ` Bruce Richardson
@ 2016-07-08 13:25   ` Adrien Mazarguil
  2016-07-11  3:18     ` Liang, Cunming
  1 sibling, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-08 13:25 UTC (permalink / raw)
  To: Liang, Cunming
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, Pablo de Lara,
	Olga Shern

Hi Cunming,

I agree with Bruce, I'll start snipping non relevant parts considering the
size of this message. Please see below.

On Fri, Jul 08, 2016 at 07:11:28PM +0800, Liang, Cunming wrote:
[...]
> >Meta item types
> >~~~~~~~~~~~~~~~
> >
> >These do not match packet data but affect how the pattern is processed, most
> >of them do not need a specification structure. This particularity allows
> >them to be specified anywhere without affecting other item types.
> [LC] For the meta items (END, VOID, INVERT) and some data matching types like
> ANY and RAW,
> is it entirely the PMD's responsibility to understand the key character and
> to parse the header graph?

We haven't started on the PMD side of things yet (the public API described
here does not discuss it), but I think PMDs will see the pattern as-is,
untranslated (to answer your question, yes, it will be the case for END,
VOID, INVERT, ANY and RAW like all others).

However I intend to add private helper functions as needed in librte_ether
(or anywhere else in EAL) that PMDs can call to ease parsing and validation
of flow rules, otherwise most of them will end up implementing redundant
functions.
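
To give an idea of what such a helper could look like, here is a minimal
sketch. It is not part of the proposed API: the item type names, the layer
numbering and pattern_check() itself are hypothetical, and tunnels are left
out on purpose. It skips VOID items and verifies that packet data items are
stacked without gaps, while still accepting a pattern that starts above L2:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types, loosely named after the RFC's pattern items. */
enum item_type {
	ITEM_END, ITEM_VOID, ITEM_ETH, ITEM_IPV4, ITEM_IPV6, ITEM_TCP, ITEM_UDP
};

/* Protocol layer matched by an item, 0 for meta items. */
int item_layer(enum item_type type)
{
	switch (type) {
	case ITEM_ETH:
		return 2;
	case ITEM_IPV4:
	case ITEM_IPV6:
		return 3;
	case ITEM_TCP:
	case ITEM_UDP:
		return 4;
	default:
		return 0;
	}
}

/*
 * Return the number of packet data items in an END-terminated pattern,
 * or -1 if a protocol layer is missing in the middle of the stack.
 */
int pattern_check(const enum item_type *pattern)
{
	int prev = 0;
	int count = 0;

	assert(pattern != NULL);
	for (; *pattern != ITEM_END; ++pattern) {
		int layer = item_layer(*pattern);

		if (layer == 0)
			continue; /* meta items do not affect stacking */
		if (prev != 0 && layer != prev + 1)
			return -1; /* e.g. Ethernet directly followed by UDP */
		prev = layer;
		++count;
	}
	return count;
}
```

With this sketch, "TCPv4 as L4" with or without VOID items yields 3 data
items, "UDPv6 anywhere" yields 2, and Ethernet followed by UDP is rejected.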

[...]
> >When several actions are combined in a flow rule, they should all have
> >different types (e.g. dropping a packet twice is not possible). However
> >considering the VOID type is an exception to this rule, the defined behavior
> >is for PMDs to only take into account the last action of a given type found
> >in the list. PMDs still perform error checking on the entire list.
> >
> >*Note that PASSTHRU is the only action able to override a terminating rule.*
> [LC] I'm wondering how to address the meta data carried by the mbuf; it is
> not mentioned here.
> For packets hitting one specific flow, there is usually something for the
> CPU to identify the flow.
> FDIR and RSS, for example, have an id or key in the mbuf. In addition, some
> meta data may be pointed to by userdata in the mbuf.
> Any view on it?

Yes, this is defined as the ID action. It is described in 4.1.6.4 (ID) and
there is an example of how a FDIR rule would be converted to use it in 5.7
(FDIR to most item types → QUEUE, DROP, PASSTHRU).

It is basically described as a flow rule with two actions: queue redirection
and packet tagging.

Does it answer your question?

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-08 13:25   ` Adrien Mazarguil
@ 2016-07-11  3:18     ` Liang, Cunming
  2016-07-11 10:06       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Liang, Cunming @ 2016-07-11  3:18 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Adrien,

Thanks so much for the explanation.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, July 08, 2016 9:26 PM
> To: Liang, Cunming <cunming.liang@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>; Zhang,
> Helin <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Rasesh
> Mody <rasesh.mody@qlogic.com>; Ajit Khaparde
> <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Jan
> Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen, Jing D
> <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob <jerin.jacob@caviumnetworks.com>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olga Shern
> <olgas@mellanox.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
> 
> Hi Cunming,
> 
> I agree with Bruce, I'll start snipping non relevant parts considering the
> size of this message. Please see below.
> 
> On Fri, Jul 08, 2016 at 07:11:28PM +0800, Liang, Cunming wrote:
> [...]
> > >Meta item types
> > >~~~~~~~~~~~~~~~
> > >
> > >These do not match packet data but affect how the pattern is processed, most
> > >of them do not need a specification structure. This particularity allows
> > >them to be specified anywhere without affecting other item types.
> > [LC] For the meta item(END, VOID, INVERT) and some data matching type like
> > ANY and RAW,
> > it's all PMD responsible to understand the key character and to parse the
> > header graph?
> 
> We haven't started on the PMD side of things yet (the public API described
> here does not discuss it), but I think PMDs will see the pattern as-is,
> untranslated (to answer your question, yes, it will be the case for END,
> VOID, INVERT, ANY and RAW like all others).
> 
> However I intend to add private helper functions as needed in librte_ether
> (or anywhere else in EAL) that PMDs can call to ease parsing and validation
> of flow rules, otherwise most of them will end up implementing redundant
> functions.
[LC] Agree, that's very helpful.

> 
> [...]
> > >When several actions are combined in a flow rule, they should all have
> > >different types (e.g. dropping a packet twice is not possible). However
> > >considering the VOID type is an exception to this rule, the defined behavior
> > >is for PMDs to only take into account the last action of a given type found
> > >in the list. PMDs still perform error checking on the entire list.
> > >
> > >*Note that PASSTHRU is the only action able to override a terminating rule.*
> > [LC] I'm wondering how to address the meta data carried by mbuf, there's no
> > mentioned here.
> > For packets hit one specific flow, usually there's something for CPU to
> > identify the flow.
> > FDIR and RSS as an example, has id or key in mbuf. In addition, some meta
> > may pointed by userdata in mbuf.
> > Any view on it ?
> 
> Yes, this is defined as the ID action. It is described in 4.1.6.4 (ID) and
> there is an example how a FDIR rule would be converted to use it in 5.7
> (FDIR to most item types → QUEUE, DROP, PASSTHRU).
[LC] So in RSS cases, the actions would be {RSS, ID}, in which ID represents the RSS key, right?

> 
> It is basically described as a flow rule with two actions: queue redirection
> and packet tagging.
> 
> Does it answer your question?
[LC] I think so, Thanks.
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-11  3:18     ` Liang, Cunming
@ 2016-07-11 10:06       ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-11 10:06 UTC (permalink / raw)
  To: Liang, Cunming
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

On Mon, Jul 11, 2016 at 03:18:19AM +0000, Liang, Cunming wrote:
[...]
> > > >When several actions are combined in a flow rule, they should all have
> > > >different types (e.g. dropping a packet twice is not possible). However
> > > >considering the VOID type is an exception to this rule, the defined behavior
> > > >is for PMDs to only take into account the last action of a given type found
> > > >in the list. PMDs still perform error checking on the entire list.
> > > >
> > > >*Note that PASSTHRU is the only action able to override a terminating rule.*
> > > [LC] I'm wondering how to address the meta data carried by mbuf, there's no
> > > mentioned here.
> > > For packets hit one specific flow, usually there's something for CPU to
> > > identify the flow.
> > > FDIR and RSS as an example, has id or key in mbuf. In addition, some meta
> > > may pointed by userdata in mbuf.
> > > Any view on it ?
> > 
> > Yes, this is defined as the ID action. It is described in 4.1.6.4 (ID) and
> > there is an example how a FDIR rule would be converted to use it in 5.7
> > (FDIR to most item types → QUEUE, DROP, PASSTHRU).
> [LC] So in RSS cases, the actions would be {RSS, ID}, in which ID represents the RSS key, right?

Well, the ID action is always the same regardless of other actions: it
takes an arbitrary 32 bit value to be returned as meta data with matched
packets. No side effect on RSS is expected.

If you are talking about the RSS action (4.1.6.9), RSS configuration (key
and algorithm to use) is provided in its specific structure along with
the list of target queues (see struct rte_flow_action_rss in [1]).

Note the RSS action is independent, it is unrelated to the port-wide RSS
configuration. Devices may not be able to support both simultaneously, for
instance creating multiple queues with RSS enabled globally may prevent
requesting a flow rule with a RSS action later. Likewise, such a rule may
possibly be defined only once depending on capabilities.

For devices supporting both, think of it as multiple level RSS. Flow rules
perform RSS on selected packets first, then the default global RSS
configuration takes care of packets that haven't hit a terminating flow
rule. This is the same as the QUEUE action except RSS is additionally
performed to spread packets among several queues.

Thus applications can request RSS with ID to get both RSS _and_ their
arbitrary 32 bit value as meta data. Once again, HW support for this
combination is not mandatory.
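
A simplified model of that two-level behavior, for illustration only. All
names are hypothetical, the rule matches on a destination port just to keep
the example short, and picking a queue with "hash modulo number of queues"
is an assumption made here (real devices typically go through an
indirection table):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rss_conf {
	const uint16_t *queues; /* target queues */
	size_t num;             /* number of target queues, must not be 0 */
};

/* Hypothetical flow rule: match on destination port, actions RSS + ID. */
struct flow_rule {
	uint16_t dst_port;
	struct rss_conf rss;
	uint32_t id;
};

struct packet {
	uint16_t dst_port;
	uint32_t hash; /* assumed to be computed by hardware */
};

struct rx_result {
	uint16_t queue;
	int has_id;
	uint32_t id;
};

/* Flow rules are tried first, global RSS handles whatever is left. */
struct rx_result dispatch(const struct packet *pkt,
			  const struct flow_rule *rules, size_t num_rules,
			  const struct rss_conf *global)
{
	struct rx_result res = { 0, 0, 0 };
	size_t i;

	assert(pkt != NULL && global != NULL && global->num != 0);
	for (i = 0; i < num_rules; ++i) {
		if (rules[i].dst_port != pkt->dst_port)
			continue;
		assert(rules[i].rss.num != 0);
		/* RSS action: spread among the rule's own queues. */
		res.queue = rules[i].rss.queues[pkt->hash % rules[i].rss.num];
		/* ID action: arbitrary value returned with the packet. */
		res.has_id = 1;
		res.id = rules[i].id;
		return res;
	}
	/* No terminating rule was hit: default port-wide RSS. */
	res.queue = global->queues[pkt->hash % global->num];
	return res;
}
```

Packets hitting the rule are spread over the rule's queues and carry its ID;
all others fall back to the port-wide configuration and carry no ID.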

PMDs can assist HW to work around such limitations sometimes as described in
4.4.4 (Unsupported actions) as long as the software cost is kept minimal.

[1] https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
                   ` (2 preceding siblings ...)
  2016-07-08 11:11 ` Liang, Cunming
@ 2016-07-11 10:41 ` Jerin Jacob
  2016-07-21 19:20   ` Adrien Mazarguil
  2016-07-21  8:13 ` Rahul Lakkireddy
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
  5 siblings, 1 reply; 262+ messages in thread
From: Jerin Jacob @ 2016-07-11 10:41 UTC (permalink / raw)
  To: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Pablo de Lara, Olga Shern

On Tue, Jul 05, 2016 at 08:16:46PM +0200, Adrien Mazarguil wrote:

Hi Adrien,

Overall this proposal looks very good. I could easily map to the
classification hardware engines I am familiar with.


> Priorities
> ~~~~~~~~~~
> 
> A priority can be assigned to a matching pattern.
> 
> The default priority level is 0 and is also the highest. Support for more
> than a single priority level in hardware is not guaranteed.
> 
> If a packet is matched by several filters at a given priority level, the
> outcome is undefined. It can take any path and can even be duplicated.

In some cases it can also result in a fatal, unrecoverable error.

> Matching pattern items for packet data must be naturally stacked (ordered
> from lowest to highest protocol layer), as in the following examples:
> 
> +--------------+
> | TCPv4 as L4  |
> +===+==========+
> | 0 | Ethernet |
> +---+----------+
> | 1 | IPv4     |
> +---+----------+
> | 2 | TCP      |
> +---+----------+
> 
> +----------------+
> | TCPv6 in VXLAN |
> +===+============+
> | 0 | Ethernet   |
> +---+------------+
> | 1 | IPv4       |
> +---+------------+
> | 2 | UDP        |
> +---+------------+
> | 3 | VXLAN      |
> +---+------------+
> | 4 | Ethernet   |
> +---+------------+
> | 5 | IPv6       |
> +---+------------+

How about enumerating this as an "Inner-IPV6" flow type to avoid any
confusion? The spec can still be the same for both IPv6 and Inner-IPV6.

> | 6 | TCP        |
> +---+------------+
> 
> +-----------------------------+
> | TCPv4 as L4 with meta items |
> +===+=========================+
> | 0 | VOID                    |
> +---+-------------------------+
> | 1 | Ethernet                |
> +---+-------------------------+
> | 2 | VOID                    |
> +---+-------------------------+
> | 3 | IPv4                    |
> +---+-------------------------+
> | 4 | TCP                     |
> +---+-------------------------+
> | 5 | VOID                    |
> +---+-------------------------+
> | 6 | VOID                    |
> +---+-------------------------+
> 
> The above example shows how meta items do not affect packet data matching
> items, as long as those remain stacked properly. The resulting matching
> pattern is identical to "TCPv4 as L4".
> 
> +----------------+
> | UDPv6 anywhere |
> +===+============+
> | 0 | IPv6       |
> +---+------------+
> | 1 | UDP        |
> +---+------------+
> 
> If supported by the PMD, omitting one or several protocol layers at the
> bottom of the stack as in the above example (missing an Ethernet
> specification) enables hardware to look anywhere in packets.

It would be good if the common code could pass it to the PMD as Ethernet,
IPV6, UDP (to avoid code duplication across PMDs).

> 
> It is unspecified whether the payload of supported encapsulations
> (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> inner, outer or both packets.

A separate flow type enumeration may fix that problem, like the "Inner-IPV6"
mentioned above.

> 
> +---------------------+
> | Invalid, missing L3 |
> +===+=================+
> | 0 | Ethernet        |
> +---+-----------------+
> | 1 | UDP             |
> +---+-----------------+
> 
> The above pattern is invalid due to a missing L3 specification between L2
> and L4. It is only allowed at the bottom and at the top of the stack.
> 

> ``SIGNATURE``
> ^^^^^^^^^^^^^
> 
> Requests hash-based signature dispatching for this rule.
> 
> Considering this is a global setting on devices that support it, all
> subsequent filter rules may have to be created with it as well.

Can you describe the use case for this and how it differs from the
existing rte_ethdev RSS settings?

> 
> - Only ``spec`` needs to be defined, ``mask`` is ignored.
> 
> +--------------------+
> | SIGNATURE          |
> +==========+=========+
> | ``spec`` | TBD     |
> +----------+---------+
> | ``mask`` | ignored |
> +----------+---------+
> 

> 
> ``ETH``
> ^^^^^^^
> 
> Matches an Ethernet header.
> 
> - ``dst``: destination MAC.
> - ``src``: source MAC.
> - ``type``: EtherType.
> - ``tags``: number of 802.1Q/ad tags defined.
> - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> 
>  - ``tpid``: Tag protocol identifier.
>  - ``tci``: Tag control information.

Find below other L2 layer attributes that are useful in HW classification:

- HiGig headers
- DSA Headers
- MPLS

Maybe we need to introduce a separate flow type with its own spec to add this support. Right?

Jerin


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-08 13:03   ` Adrien Mazarguil
@ 2016-07-11 10:42     ` Chandran, Sugesh
  2016-07-13 20:03       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-11 10:42 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Adrien,

Thank you for your response,
Please see my comments inline.

Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, July 8, 2016 2:03 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> On Thu, Jul 07, 2016 at 11:15:07PM +0000, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for proposing this. It would be really useful for application such
> as OVS-DPDK.
> > Please find my comments and questions inline below prefixed with
> [Sugesh]. Most of them are from the perspective of enabling these APIs in
> application such as OVS-DPDK.
> 
> Thanks, I'm replying below.
> 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> Mazarguil
> > > Sent: Tuesday, July 5, 2016 7:17 PM
> > > To: dev@dpdk.org
> > > Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; Zhang, Helin
> > > <helin.zhang@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Rasesh
> > > Mody <rasesh.mody@qlogic.com>; Ajit Khaparde
> > > <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> > > <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>;
> > > Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>;
> Chen,
> > > Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> > > Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> > > <sony.chacko@qlogic.com>; Jerin Jacob
> > > <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> > > <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> > > Subject: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> > >

<<<<<----Snipped out ---->>>>>
> > > Flow director
> > > -------------
> > >
> > > Flow director (FDIR) is the name of the most capable filter type, which
> > > covers most features offered by others. As such, it is the most
> widespread
> > > in PMDs that support filtering (i.e. all of them besides **e1000**).
> > >
> > > It is also the only type that allows an arbitrary 32 bits value provided by
> > > applications to be attached to a filter and returned with matching packets
> > > instead of relying on the destination queue to recognize flows.
> > >
> > > Unfortunately, even FDIR requires applications to be aware of low-level
> > > capabilities and limitations (most of which come directly from **ixgbe**
> and
> > > **i40e**):
> > >
> > > - Bitmasks are set globally per device (port?), not per filter.
> > [Sugesh] This means application cannot define filters that matches on
> arbitrary different offsets?
> > If that’s the case, I assume the application has to program bitmask in
> advance. Otherwise how
> > the API framework deduce this bitmask information from the rules?? Its
> not very clear to me
> > that how application pass down the bitmask information for multiple filters
> on same port?
> 
> This is my understanding of how flow director currently works, perhaps
> someone more familiar with it can answer this question better than I could.
> 
> Let me take an example, if particular device can only handle a single IPv4
> mask common to all flow rules (say only to match destination addresses),
> updating that mask to also match the source address affects all defined and
> future flow rules simultaneously.
> 
> That is how FDIR currently works and I think it is wrong, as it penalizes
> devices that do support individual bit-masks per rule, and is a little
> awkward from an application point of view.
> 
> What I suggest for the new API instead is the ability to specify one
> bit-mask per rule, and let the PMD deal with HW limitations by automatically
> configuring global bitmasks from the first added rule, then refusing to add
> subsequent rules if they specify a conflicting bit-mask. Existing rules
> remain unaffected that way, and applications do not have to be extra
> cautious.
> 
[Sugesh] The issue with that approach is that a rule is simply discarded
when it is a superset of the first one, even though the hardware is capable of
handling it. How is it guaranteed that the first rule sets the bit-mask for all
the subsequent rules?
How about having a CLASSIFER_TYPE for the classifier? Every port can have a
set of supported flow types (e.g. L3_TYPE, L4_TYPE, L4_TYPE_8BYTE_FLEX,
L4_TYPE_16BYTE_FLEX) based on the underlying FDIR support. The application can
query this and set the type accordingly while initializing the port. This way
the first rule need not set all the bits that may be needed by future rules.
 
> > > ``PASSTHRU``
> > > ^^^^^^^^^^^^
> > >
> > > Leaves packets up for additional processing by subsequent flow rules.
> This
> > > is the default when a rule does not contain a terminating action, but can
> be
> > > specified to force a rule to become non-terminating.
> > >
> > > - No configurable property.
> > >
> > > +---------------+
> > > | PASSTHRU      |
> > > +===============+
> > > | no properties |
> > > +---------------+
> > >
> > > Example to copy a packet to a queue and continue processing by
> subsequent
> > > flow rules:
> > [Sugesh] If a packet get copied to a queue, it’s a termination action.
> > How can its possible to do subsequent action after the packet already
> > moved to the queue. ?How it differs from DUP action?
> >  Am I missing anything here?
> 
> Devices may not support the combination of QUEUE + PASSTHRU (i.e. making
> QUEUE non-terminating). However these same devices may expose the
> ability to
> copy a packet to another (sniffer) queue all while keeping the rule
> terminating (QUEUE + DUP but no PASSTHRU).
> 
> DUP with two rules, assuming priorities and PASSTHRU are supported:
> 
> - pattern X, priority 0; actions: QUEUE 5, PASSTHRU (non-terminating)
> 
> - pattern X, priority 1; actions: QUEUE 6 (terminating)
> 
> DUP with two actions on a single rule and a single priority:
> 
> - pattern X, priority 0; actions: DUP 5, QUEUE 6 (terminating)
> 
> If supported, from an application point of view the end result is similar in
> both cases (note the second case may be implemented by the PMD using
> two HW
> rules internally).
> 
> However the second case does not waste a priority level and clearly states
> the intent to the PMD which is more likely to be supported. If HW supports
> DUP directly it is even faster since there is a single rule. That is why I
> thought having DUP as an action would be useful.
[Sugesh] Thank you for the clarification. It makes sense to me now.
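
For the record, my understanding of the two variants can be modeled as
below. This is only a sketch with hypothetical names: every rule is assumed
to match pattern X, rules are assumed to be already sorted by priority
(0 first), and queue indexes are limited to 0-31 so that the result fits in
a bit-mask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum action_type { ACT_END, ACT_QUEUE, ACT_DUP, ACT_PASSTHRU };

struct action {
	enum action_type type;
	uint16_t queue; /* used by QUEUE and DUP only */
};

struct rule {
	unsigned int priority;        /* informational, rules come sorted */
	const struct action *actions; /* END-terminated list */
};

/* Return a bit-mask of the queues that receive a copy of the packet. */
uint32_t deliver(const struct rule *rules, size_t num_rules)
{
	uint32_t queues = 0;
	size_t i;

	for (i = 0; i < num_rules; ++i) {
		const struct action *act;
		bool terminating = false;
		bool passthru = false;

		for (act = rules[i].actions; act->type != ACT_END; ++act) {
			switch (act->type) {
			case ACT_QUEUE:
				assert(act->queue < 32);
				queues |= UINT32_C(1) << act->queue;
				terminating = true;
				break;
			case ACT_DUP:
				assert(act->queue < 32);
				queues |= UINT32_C(1) << act->queue;
				break;
			case ACT_PASSTHRU:
				passthru = true;
				break;
			default:
				break;
			}
		}
		/* PASSTHRU overrides the terminating behavior of QUEUE. */
		if (terminating && !passthru)
			break;
	}
	return queues;
}
```

In this model both { QUEUE 5, PASSTHRU } followed by { QUEUE 6 } and the
single rule { DUP 5, QUEUE 6 } deliver the packet to queues 5 and 6.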
> 
> > > +--------------------------+
> > > | Copy to queue 8          |
> > > +==========+===============+
> > > | PASSTHRU |               |
> > > +----------+-----------+---+
> > > | QUEUE    | ``queue`` | 8 |
> > > +----------+-----------+---+
> > >
> > > ``ID``
> > > ^^^^^^
> > >
> > > Attaches a 32 bit value to packets.
> > >
> > > +----------------------------------------------+
> > > | ID                                           |
> > > +========+=====================================+
> > > | ``id`` | 32 bit value to return with packets |
> > > +--------+-------------------------------------+
> > >
> > [Sugesh] I assume the application has to program the flow
> > with a unique ID and matching packets are stamped with this ID
> > when reporting to the software. The uniqueness of ID is NOT
> > guaranteed by the API framework. Correct me if I am wrong here.
> 
> You are right, if the way I wrote it is not clear enough, I'm open to
> suggestions to improve it.
[Sugesh] I guess it's fine, I just wanted to confirm. Perhaps
it would be nice to mention that the IDs are application defined.

> 
> > [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible to have a
> > 64 bit ID? So that application can use the control plane flow pointer
> > Itself as an ID. Does it make sense?
> 
> I've specified a 32 bit ID for now because this is what FDIR supports and
> also what existing devices can report today AFAIK (i40e and mlx5).
> 
> We could use 64 bit for future-proofness in a separate action like "ID64"
> when at least one device supports it.
> 
> To PMD maintainers: please comment if you know devices that support tagging
> matching packets with more than 32 bits of user-provided data!
[Sugesh] I guess the flow director ID is 64 bit; the XL710 datasheet says so.
And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with the RSS hash. This may be
a software driver limitation that exposes only 32 bits, possibly because of cache
alignment issues? Since the hardware can support 64 bit, I feel it makes sense
to support 64 bit as well.
> 
> > > .. raw:: pdf
> > >
> > >    PageBreak
> > >
> > > ``QUEUE``
> > > ^^^^^^^^^
> > >
> > > Assigns packets to a given queue index.
> > >
> > > - Terminating by default.
> > >
> > > +--------------------------------+
> > > | QUEUE                          |
> > > +===========+====================+
> > > | ``queue`` | queue index to use |
> > > +-----------+--------------------+
> > >
> > > ``DROP``
> > > ^^^^^^^^
> > >
> > > Drop packets.
> > >
> > > - No configurable property.
> > > - Terminating by default.
> > > - PASSTHRU overrides this action if both are specified.
> > >
> > > +---------------+
> > > | DROP          |
> > > +===============+
> > > | no properties |
> > > +---------------+
> > >
> > > ``COUNT``
> > > ^^^^^^^^^
> > >
> > [Sugesh] Do we really have to set the count action explicitly for every rule?
> > IMHO it would be great for it to be an implicit action. Most applications would be
> > interested in the stats of almost all the filters/flows.
> 
> I can see why, but no, it must be explicitly requested because you may want
> to know in advance when it is not supported. Also considering it is
> something else to be done by HW (a separate action), we can assume enabling
> this may slow things down a bit.
> 
> HW limitations may also prevent you from having as many flow counters as you
> want, in which case you probably want to carefully pick which rules have
> them.
> 
> I think this target is most useful with DROP, VF and PF actions since
> those are currently the only ones where SW may not see the related
> packets.
> 
[Sugesh] Agreed and thanks for the clarification.

> > > Enables hits counter for this rule.
> > >
> > > This counter can be retrieved and reset through ``rte_flow_query()``, see
> > > ``struct rte_flow_query_count``.
> > >
> > > - Counters can be retrieved with ``rte_flow_query()``.
> > > - No configurable property.
> > >
> > > +---------------+
> > > | COUNT         |
> > > +===============+
> > > | no properties |
> > > +---------------+
> > >
> > > Query structure to retrieve and reset the flow rule hits counter:
> > >
> > > +------------------------------------------------+
> > > | COUNT query                                    |
> > > +===========+=====+==============================+
> > > | ``reset`` | in  | reset counter after query    |
> > > +-----------+-----+------------------------------+
> > > | ``hits``  | out | number of hits for this flow |
> > > +-----------+-----+------------------------------+
> > >
<<<<<<<<Snipped out >>>>
> > > ::
> > >
> > >  struct rte_flow *
> > >  rte_flow_create(uint8_t port_id,
> > >                  const struct rte_flow_pattern *pattern,
> > >                  const struct rte_flow_actions *actions);
> > >
> > > Arguments:
> > >
> > > - ``port_id``: port identifier of Ethernet device.
> > > - ``pattern``: pattern specification to add.
> > > - ``actions``: actions associated with the flow definition.
> > >
> > > Return value:
> > >
> > > A valid flow pointer in case of success, NULL otherwise and ``rte_errno`` is
> > > set to the positive version of one of the error codes defined for
> > > ``rte_flow_validate()``.
> > [Sugesh] : Kind of implementation specific query. What if application
> > try to add duplicate rules? Does the API create new flow entry for every
> > API call?
> 
> If an application adds duplicate rules at a given priority level, the second
> one may return an error depending on the PMD. Collisions are sometimes
> trivial to detect (such as the same pattern twice), others not so much (one
> matching an Ethernet header only, the other one matching an IP header
> only).
> 
> Either way if a packet is matched by two rules at a given priority level,
> what happens is described in 3.3 (High level design) and 4.4.1 (Priorities).
> 
> Applications are responsible for not relying on the PMD to detect these, or
> should use a single priority level for each rule to make things clear.
> 
> However since the number of HW priority levels is finite and possibly small,
> they must also make sure not to waste them. My advice is to only use
> priority levels when it cannot be proven that rules do not collide.
> 
> If all you have is perfect matching rules without wildcards and all of them
> match the same number of layers, a single priority level is fine.
> 
[Sugesh] Makes sense. It's fine from my perspective.
> > [Sugesh] Another concern is the cost and time of installing these rules
> > in the hardware. Can we make these APIs time bound (or at least add an option
> > to set the time limit to execute these APIs), so that the
> > application doesn’t have to wait so long when installing and deleting flows
> > with slow hardware/NIC? What do you think? Most of the datapath flow
> > installations are dynamic and triggered only when there is
> > ingress traffic. Delays in flow insertion/deletion have unpredictable
> > consequences.
> 
> This API is (currently) aimed at the control path only, and must indeed be
> assumed to be slow. Creating millions of rules may take quite long as it may
> involve syscalls and other time-consuming synchronization things on the PMD
> side.
> 
> So currently there is no plan to have rules added from the data path with
> time constraints. I think it would be implemented through a different set of
> functions anyway.
> 
> I do not think adding time limits is practical, even specifying in the API
> that creating a single flow rule must take less than a maximum number of
> seconds in order to be effective is too much of a constraint (applications
> that create all flows during init may not care after all).
> 
> You should consider in any case that modifying flow rules will always be
> slower than receiving packets, there is no way around that. Applications
> have to live with it and provide a software fallback for incoming packets
> while managing flow rules.
> 
> Moreover, think about what happens when you hit the maximum number of flow
> rules and cannot create any more. Applications need to implement some kind
> of fallback in their data path.
> 
> Offloading flows in HW is also only useful if they live much longer than the
> time taken to create and delete them. Perhaps applications may choose to do
> so after detecting long lived flows such as TCP sessions.
> 
> You may have one separate control thread dedicated to manage flows and
> keep your normal control thread unaffected by delays. Several threads can
> even be dedicated, one per device.
[Sugesh] I agree that flow insertion cannot be as fast as the packet receiving
rate. From an application point of view, the problem arises when hardware flow
insertion takes longer than software flow insertion. At the least, the application has to know
the cost of inserting/deleting a rule in hardware beforehand. Otherwise, how can the application
choose the right flow candidate for hardware? My point here is that the application expects
deterministic behavior from a classifier while inserting and deleting rules.
> 
> > [Sugesh] Another query is on the synchronization part. What if the same rules are
> > handled from different threads? Is the application responsible for handling the
> > concurrent hardware programming?
> 
> Like most (if not all) DPDK APIs, applications are responsible for managing
> locking issues as described in 4.3 (Behavior). Since this is a control path
> API and applications usually have a single control thread, locking should
> not be necessary in most cases.
> 
> Regarding my above comment about using several control threads to manage
> different devices, section 4.3 says:
> 
>  "There is no provision for reentrancy/multi-thread safety, although nothing
>  should prevent different devices from being configured at the same
>  time. PMDs may protect their control path functions accordingly."
> 
> I'd like to emphasize it is not "per port" but "per device", since in a few
> cases a configurable resource is shared by several ports. It may be
> difficult for applications to determine which ports are shared by a given
> device but this falls outside the scope of this API.
> 
> Do you think adding the guarantee that it is always safe to configure two
> different ports simultaneously without locking from the application side is
> necessary? In which case the PMD would be responsible for locking shared
> resources.
[Sugesh] This would be a little bit complicated when some of the ports are not under
DPDK itself (what if one port is managed by the kernel?) or ports are tied to
different applications. Locking in the PMD helps when the ports are accessed by
multiple DPDK applications. However, what if the port itself is not under DPDK?
> 
> > > Destruction
> > > ~~~~~~~~~~~
> > >
> > > Flow rules destruction is not automatic, and a queue should not be released
> > > if any are still attached to it. Applications must take care of performing
> > > this step before releasing resources.
> > >
> > > ::
> > >
> > >  int
> > >  rte_flow_destroy(uint8_t port_id,
> > >                   struct rte_flow *flow);
> > >
> > >
> > [Sugesh] I would suggest that having a clean-up API would be really useful, as
> > releasing a queue (is this applicable to releasing a port too?) does not guarantee
> > automatic flow destruction.
> 
> Would something like rte_flow_flush(port_id) do the trick? I wanted to
> emphasize in this first draft that applications should really keep the flow
> pointers around in order to manage/destroy them. It is their responsibility,
> not PMD's.
[Sugesh] Thanks, I think the flush call will do.
> 
> > This way application can initialize the port,
> > clean-up all the existing rules and create new rules  on a clean slate.
> 
> No resource can be released as long as a flow rule is using it (bad things
> may happen otherwise), all flow rules must be destroyed first, thus none can
> possibly remain after initializing a port. It is assumed that PMDs do
> automatic clean up during init if necessary to ensure this.
[Sugesh] That will do.
> 
> > > Failure to destroy a flow rule may occur when other flow rules depend on it,
> > > and destroying it would result in an inconsistent state.
> > >
> > > This function is only guaranteed to succeed if flow rules are destroyed in
> > > reverse order of their creation.
> > >
> > > Arguments:
> > >
> > > - ``port_id``: port identifier of Ethernet device.
> > > - ``flow``: flow rule to destroy.
> > >
> > > Return value:
> > >
> > > - **0** on success, a negative errno value otherwise and ``rte_errno`` is
> > >   set.
> > >
> > > .. raw:: pdf
> > >
> > >    PageBreak
> > >
> > > Query
> > > ~~~~~
> > >
> > > Query an existing flow rule.
> > >
> > > This function allows retrieving flow-specific data such as counters. Data
> > > is gathered by special actions which must be present in the flow rule
> > > definition.
> > >
> > > ::
> > >
> > >  int
> > >  rte_flow_query(uint8_t port_id,
> > >                 struct rte_flow *flow,
> > >                 enum rte_flow_action_type action,
> > >                 void *data);
> > >
> > > Arguments:
> > >
> > > - ``port_id``: port identifier of Ethernet device.
> > > - ``flow``: flow rule to query.
> > > - ``action``: action type to query.
> > > - ``data``: pointer to storage for the associated query data type.
> > >
> > > Return value:
> > >
> > > - **0** on success, a negative errno value otherwise and ``rte_errno`` is
> > >   set.
> > >
> > > .. raw:: pdf
> > >
> > >    PageBreak
> > >
> > > Behavior
> > > --------
> > >
> > > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> > >   returned).
> > >
> > > - There is no provision for reentrancy/multi-thread safety, although nothing
> > >   should prevent different devices from being configured at the same
> > >   time. PMDs may protect their control path functions accordingly.
> > >
> > > - Stopping the data path (TX/RX) should not be necessary when managing flow
> > >   rules. If this cannot be achieved naturally or with workarounds (such as
> > >   temporarily replacing the burst function pointers), an appropriate error
> > >   code must be returned (``EBUSY``).
> > >
> > > - PMDs, not applications, are responsible for maintaining flow rules
> > >   configuration when stopping and restarting a port or performing other
> > >   actions which may affect them. They can only be destroyed explicitly.
> > >
> > > .. raw:: pdf
> > >
> > >    PageBreak
> > >
> > [Sugesh] What about querying all the rules for a specific port/queue? This would be useful when
> > adding and deleting ports and queues dynamically according to need. I am not sure
> > what the other use cases for these APIs are, but I feel it would make it much easier to
> > manage flows from the application. What do you think?
> 
> Not sure, that seems to fall out of the scope of this API. As described,
> applications already store the related rte_flow pointers. Accordingly, they
> know how many rules are associated to a given port. They need both a port ID
> and a flow rule pointer to destroy them after all.
> 
> Now perhaps something to convert back an existing rte_flow to a pattern and
> a list of actions, however I cannot see an immediate use case for it.
> 
> What you describe seems to be doable through a front-end API, I think
> keeping this one as low-level as possible with only basic actions is better
> right now. I'll keep your suggestion in mind.
[Sugesh] Sure, That will be fine.
> 
> > > Compatibility
> > > -------------
> > >
> > > No known hardware implementation supports all the features described in
> > > this document.
> > >
> > > Unsupported features or combinations are not expected to be fully emulated
> > > in software by PMDs for performance reasons. Partially supported features
> > > may be completed in software as long as hardware performs most of the work
> > > (such as queue redirection and packet recognition).
> > >
> > > However PMDs are expected to do their best to satisfy application requests
> > > by working around hardware limitations as long as doing so does not affect
> > > the behavior of existing flow rules.
> > >
> > > The following sections provide a few examples of such cases, they are
> based
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-11 10:42     ` Chandran, Sugesh
@ 2016-07-13 20:03       ` Adrien Mazarguil
  2016-07-15  9:23         ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-13 20:03 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

On Mon, Jul 11, 2016 at 10:42:36AM +0000, Chandran, Sugesh wrote:
> Hi Adrien,
> 
> Thank you for your response,
> Please see my comments inline.

Hi Sugesh,

Sorry for the delay, please see my answers inline as well.

[...]
> > > > Flow director
> > > > -------------
> > > >
> > > > Flow director (FDIR) is the name of the most capable filter type, which
> > > > covers most features offered by others. As such, it is the most widespread
> > > > in PMDs that support filtering (i.e. all of them besides **e1000**).
> > > >
> > > > It is also the only type that allows an arbitrary 32 bits value provided by
> > > > applications to be attached to a filter and returned with matching packets
> > > > instead of relying on the destination queue to recognize flows.
> > > >
> > > > Unfortunately, even FDIR requires applications to be aware of low-level
> > > > capabilities and limitations (most of which come directly from **ixgbe** and
> > > > **i40e**):
> > > >
> > > > - Bitmasks are set globally per device (port?), not per filter.
> > > [Sugesh] Does this mean the application cannot define filters that match on arbitrary different offsets?
> > > If that’s the case, I assume the application has to program the bitmask in advance. Otherwise how would
> > > the API framework deduce this bitmask information from the rules? It's not very clear to me
> > > how the application passes down the bitmask information for multiple filters on the same port.
> > 
> > This is my understanding of how flow director currently works, perhaps
> > someone more familiar with it can answer this question better than I could.
> > 
> > Let me take an example, if a particular device can only handle a single IPv4
> > mask common to all flow rules (say only to match destination addresses),
> > updating that mask to also match the source address affects all defined and
> > future flow rules simultaneously.
> > 
> > That is how FDIR currently works and I think it is wrong, as it penalizes
> > devices that do support individual bit-masks per rule, and is a little
> > awkward from an application point of view.
> > 
> > What I suggest for the new API instead is the ability to specify one
> > bit-mask per rule, and let the PMD deal with HW limitations by automatically
> > configuring global bitmasks from the first added rule, then refusing to add
> > subsequent rules if they specify a conflicting bit-mask. Existing rules
> > remain unaffected that way, and applications do not have to be extra
> > cautious.
> > 
> [Sugesh] The issue with that approach is that the hardware simply discards the rule
> when it is a superset of the first one, even though the hardware is capable of
> handling it. How is it guaranteed that the first rule will set the bitmask for all the
> subsequent rules?

Just to clarify, the API only says that new rules cannot affect existing
ones (which I think makes sense from a user's perspective), so as long as
the PMD does whatever is needed to make all rules work together, there
should not be any problem with this approach.

Even if the PMD has to temporarily remove an existing rule and reconfigure
global masks in order to add subsequent rules, it is fine as long as packets
aren't misdirected in the meantime (they may be dropped if there is no other
choice).

> How about having a CLASSIFER_TYPE for the classifier? Every port can have a
> set of supported flow types (e.g. L3_TYPE, L4_TYPE, L4_TYPE_8BYTE_FLEX,
> L4_TYPE_16BYTE_FLEX) based on the underlying FDIR support. The application can query
> this and set the type accordingly while initializing the port. This way the first rule need
> not set all the bits that may be needed in future rules.

Again from a user's POV, I think doing so would add unwanted HW-specific
complexity. 

However this concern can be handled through a different approach. Let's say
a user creates a pattern that only specifies an IP header with a given
bit-mask.

In FDIR language this translates to:

- Set global mask for IPv4 accordingly, remaining global masks all zeroed
  (assumed default value).

- Create an IPv4 flow.

From now on, all rules specifying an IPv4 header must have this exact
bitmask (implicitly or explicitly), otherwise they cannot be created,
i.e. the global bitmask for IPv4 becomes immutable.

Now the user creates a TCPv4 rule (as long as it uses the same IPv4 mask); to
handle this, FDIR would:

- Keep global immutable mask for IPv4 unchanged, set global TCP mask
  according to the flow rule.

- Create a TCPv4 flow.

From this point on, like IPv4, subsequent TCP rules must have this exact
bitmask and so on as the global bitmask becomes immutable.

Basically, only protocol bit-masks affected by existing flow rules are
immutable, others can be changed later. Global flow masks for protocols
become mutable again when no existing flow rule uses them.

Does it look fine for you?

[...]
> > > > +--------------------------+
> > > > | Copy to queue 8          |
> > > > +==========+===============+
> > > > | PASSTHRU |               |
> > > > +----------+-----------+---+
> > > > | QUEUE    | ``queue`` | 8 |
> > > > +----------+-----------+---+
> > > >
> > > > ``ID``
> > > > ^^^^^^
> > > >
> > > > Attaches a 32 bit value to packets.
> > > >
> > > > +----------------------------------------------+
> > > > | ID                                           |
> > > > +========+=====================================+
> > > > | ``id`` | 32 bit value to return with packets |
> > > > +--------+-------------------------------------+
> > > >
> > > [Sugesh] I assume the application has to program the flow
> > > with a unique ID and matching packets are stamped with this ID
> > > when reporting to the software. The uniqueness of ID is NOT
> > > guaranteed by the API framework. Correct me if I am wrong here.
> > 
> > You are right, if the way I wrote it is not clear enough, I'm open to
> > suggestions to improve it.
> [Sugesh] I guess it's fine; I just wanted to confirm. Perhaps
> it would be nice to mention that the IDs are application defined.

OK, I will make it clearer.

> > > [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible to have a
> > > 64 bit ID? So that application can use the control plane flow pointer
> > > Itself as an ID. Does it make sense?
> > 
> > I've specified a 32 bit ID for now because this is what FDIR supports and
> > also what existing devices can report today AFAIK (i40e and mlx5).
> > 
> > We could use 64 bit for future-proofness in a separate action like "ID64"
> > when at least one device supports it.
> > 
> > To PMD maintainers: please comment if you know devices that support tagging
> > matching packets with more than 32 bits of user-provided data!
> [Sugesh] I guess the flow director ID is 64 bit; the XL710 datasheet says so.
> And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with the RSS hash. This may be
> a software driver limitation that exposes only 32 bits, possibly because of cache
> alignment issues? Since the hardware can support 64 bit, I feel it makes sense
> to support 64 bit as well.

I agree we need 64 bit support, but then we also need a solution for devices
that support only 32 bit. Possible methods I can think of:

- A separate "ID64" action (or a "ID32" one, perhaps with a better name).

- A single ID action with an unlimited number of bytes to return with
  packets (would actually be a string). PMDs can then refuse to create flow
  rules requesting an unsupported number of bytes. Devices supporting fewer
  than 32 bits are also included this way without the need for yet another
  action.

Thoughts?
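
To make the second option more concrete, here is a rough sketch of what a
variable-size ID action and the corresponding PMD-side check might look
like. Everything in it is an assumption for discussion purposes: struct
flow_action_id, id_action_validate() and the hw_max parameter do not exist
in the current draft.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variable-size ID action: `size` bytes of
 * application-defined data to return with matching packets. */
struct flow_action_id {
	size_t size;         /* number of valid bytes in data[] */
	const uint8_t *data; /* application-defined ID bytes */
};

/* Hypothetical PMD-side check. `hw_max` is the number of ID bytes the
 * device can report (e.g. 4 for a 32 bit device, 8 for a 64 bit one).
 * Returns 0 if the action can be honored, a negative errno otherwise. */
int
id_action_validate(const struct flow_action_id *id, size_t hw_max)
{
	assert(id != NULL);
	if (id->size == 0 || id->data == NULL)
		return -EINVAL;
	if (id->size > hw_max)
		return -ENOTSUP; /* device cannot return that many bytes */
	return 0;
}
```

An application asking for 8 bytes on a device limited to 4 would get an
error at rule creation time and could fall back to a shorter ID.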

[...]
> > > [Sugesh] Another concern is the cost and time of installing these rules
> > > in the hardware. Can we make these APIs time bound (or at least add an option
> > > to set the time limit to execute these APIs), so that the
> > > application doesn’t have to wait so long when installing and deleting flows
> > > with slow hardware/NIC? What do you think? Most of the datapath flow
> > > installations are dynamic and triggered only when there is
> > > ingress traffic. Delays in flow insertion/deletion have unpredictable
> > > consequences.
> > 
> > This API is (currently) aimed at the control path only, and must indeed be
> > assumed to be slow. Creating millions of rules may take quite long as it may
> > involve syscalls and other time-consuming synchronization things on the PMD
> > side.
> > 
> > So currently there is no plan to have rules added from the data path with
> > time constraints. I think it would be implemented through a different set of
> > functions anyway.
> > 
> > I do not think adding time limits is practical, even specifying in the API
> > that creating a single flow rule must take less than a maximum number of
> > seconds in order to be effective is too much of a constraint (applications
> > that create all flows during init may not care after all).
> > 
> > You should consider in any case that modifying flow rules will always be
> > slower than receiving packets, there is no way around that. Applications
> > have to live with it and provide a software fallback for incoming packets
> > while managing flow rules.
> > 
> > Moreover, think about what happens when you hit the maximum number of flow
> > rules and cannot create any more. Applications need to implement some kind
> > of fallback in their data path.
> > 
> > Offloading flows in HW is also only useful if they live much longer than the
> > time taken to create and delete them. Perhaps applications may choose to do
> > so after detecting long lived flows such as TCP sessions.
> > 
> > You may have one separate control thread dedicated to manage flows and
> > keep your normal control thread unaffected by delays. Several threads can
> > even be dedicated, one per device.
> [Sugesh] I agree that flow insertion cannot be as fast as the packet receiving
> rate. From an application point of view, the problem arises when hardware flow
> insertion takes longer than software flow insertion. At the least, the application has to know
> the cost of inserting/deleting a rule in hardware beforehand. Otherwise, how can the application
> choose the right flow candidate for hardware? My point here is that the application expects
> deterministic behavior from a classifier while inserting and deleting rules.

Understood, however it will be difficult to estimate, particularly if a PMD
must rearrange flow rules to make room for a new one due to priority level
collisions or some other HW-related reason. I mean, the time spent cannot be
assumed to be constant; even PMDs cannot know it in advance because it also
depends on the performance of the host CPU.

Such applications may find it easier to measure elapsed time for the rules
they create, make statistics and extrapolate from this information for
future rules. I do not think the PMD can help much here.
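
As an illustration of that application-side approach, the following sketch
accumulates the elapsed time of past rule creations and uses the mean as
an estimate for the next one. All names (struct rule_timing, elapsed_ns(),
rule_timing_record(), rule_timing_estimate()) are hypothetical helpers
living in the application, not part of the proposed API.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical application-side statistics on rule creation time. */
struct rule_timing {
	uint64_t count;    /* number of samples */
	uint64_t total_ns; /* sum of elapsed times */
	uint64_t max_ns;   /* worst case seen so far */
};

/* Elapsed nanoseconds between two timestamps; `end` must not precede
 * `start`. */
uint64_t
elapsed_ns(const struct timespec *start, const struct timespec *end)
{
	int64_t sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
	int64_t nsec = (int64_t)end->tv_nsec - (int64_t)start->tv_nsec;

	return (uint64_t)(sec * 1000000000LL + nsec);
}

/* Record the time one rule creation took. */
void
rule_timing_record(struct rule_timing *rt, uint64_t ns)
{
	assert(rt != NULL);
	rt->count++;
	rt->total_ns += ns;
	if (ns > rt->max_ns)
		rt->max_ns = ns;
}

/* Mean creation time so far, used as the estimate for the next rule;
 * 0 when no sample has been recorded yet. */
uint64_t
rule_timing_estimate(const struct rule_timing *rt)
{
	assert(rt != NULL);
	return rt->count ? rt->total_ns / rt->count : 0;
}
```

The application would take one timestamp before and one after each rule
creation call (for instance with clock_gettime() and CLOCK_MONOTONIC) and
feed the difference to rule_timing_record(). The worst case is kept
separately since the mean alone hides the occasional slow insertion.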

> > > [Sugesh] Another query is on the synchronization part. What if the same rules are
> > > handled from different threads? Is the application responsible for handling the
> > > concurrent hardware programming?
> > 
> > Like most (if not all) DPDK APIs, applications are responsible for managing
> > locking issues as described in 4.3 (Behavior). Since this is a control path
> > API and applications usually have a single control thread, locking should
> > not be necessary in most cases.
> > 
> > Regarding my above comment about using several control threads to manage
> > different devices, section 4.3 says:
> > 
> >  "There is no provision for reentrancy/multi-thread safety, although nothing
> >  should prevent different devices from being configured at the same
> >  time. PMDs may protect their control path functions accordingly."
> > 
> > I'd like to emphasize it is not "per port" but "per device", since in a few
> > cases a configurable resource is shared by several ports. It may be
> > difficult for applications to determine which ports are shared by a given
> > device but this falls outside the scope of this API.
> > 
> > Do you think adding the guarantee that it is always safe to configure two
> > different ports simultaneously without locking from the application side is
> > necessary? In which case the PMD would be responsible for locking shared
> > resources.
> [Sugesh] This would be a little bit complicated when some of the ports are not under
> DPDK itself (what if one port is managed by the kernel?) or ports are tied to
> different applications. Locking in the PMD helps when the ports are accessed by
> multiple DPDK applications. However, what if the port itself is not under DPDK?

Well, either we do not care about what happens outside of the DPDK context,
or PMDs must find a way to satisfy everyone. I'm not a fan of locking either
but it would be nice if flow rules configuration could be attempted on
different ports simultaneously without the risk of wrecking anything, so
that applications do not need to care.

Possible cases for a dual port device with global flow rule settings
affecting both ports:

1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
   to alter a global setting necessary for an existing rule on any port is
   not allowed (EEXIST). PMD must maintain a device context common to both
   ports in order for this to work. This context is either under lock, or
   the first port on which a flow rule is created owns all future flow
   rules.

2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
   it and knows that port 2 may modify the global context: no flow rules can
   be created from the DPDK application due to safety issues (EBUSY?).

3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
   it and knows that port 2 will not modify flow rules: PMD should not care,
   no lock necessary.

4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
   aware of it: either flow rules cannot be created ever at all, or we say
   it is the user's responsibility to make sure this does not happen.

Considering that most control operations performed by DPDK affect the device
regardless of other applications, I think 1) is the only case that should be
defined, otherwise 4), defined as user's responsibility.
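
For reference, the four cases listed above could translate into a PMD-side
check along these lines. This is a sketch only: the enum, the function name
and the choice of EBUSY for case 2 are assumptions (the error code for that
case is still an open question above), and cases 3 and 4 are simply treated
like case 1 here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical view a PMD has of the other port on a dual port device
 * whose global flow settings affect both ports. */
enum sibling_state {
	SIBLING_DPDK,            /* case 1: also managed by DPDK */
	SIBLING_MAY_MODIFY,      /* case 2: foreign, may alter global context */
	SIBLING_WILL_NOT_MODIFY, /* case 3: foreign, leaves flow rules alone */
	SIBLING_UNKNOWN,         /* case 4: PMD is not aware of the sharing */
};

/* Decide whether a rule that needs a given global setting may be created.
 * `conflicts_with_existing` tells whether that setting would alter one
 * required by an existing rule on any DPDK-managed port.
 * Returns 0 if allowed, a negative errno value otherwise. */
int
global_setting_check(enum sibling_state sibling, bool conflicts_with_existing)
{
	assert(sibling <= SIBLING_UNKNOWN);
	if (sibling == SIBLING_MAY_MODIFY)
		return -EBUSY; /* unsafe: context may change under us */
	/* Cases 1, 3 and 4: only existing DPDK rules matter. */
	return conflicts_with_existing ? -EEXIST : 0;
}
```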

> > > > Destruction
> > > > ~~~~~~~~~~~
> > > >
> > > > Flow rules destruction is not automatic, and a queue should not be released
> > > > if any are still attached to it. Applications must take care of performing
> > > > this step before releasing resources.
> > > >
> > > > ::
> > > >
> > > >  int
> > > >  rte_flow_destroy(uint8_t port_id,
> > > >                   struct rte_flow *flow);
> > > >
> > > >
> > > [Sugesh] I would suggest that having a clean-up API would be really useful, as
> > > releasing a queue (is this applicable to releasing a port too?) does not guarantee
> > > automatic flow destruction.
> > 
> > Would something like rte_flow_flush(port_id) do the trick? I wanted to
> > emphasize in this first draft that applications should really keep the flow
> > pointers around in order to manage/destroy them. It is their responsibility,
> > not PMD's.
> [Sugesh] Thanks, I think the flush call will do.

Noted, will add it.
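
Until such a call exists, an application can get the same effect by
keeping its flow pointers in creation order and destroying them in reverse,
which is the only order guaranteed to succeed according to the draft. A
minimal sketch follows, assuming hypothetical application-side helpers
(struct flow_list, flow_list_push(), flow_list_flush()); destroy_stub()
merely stands in for the proposed rte_flow_destroy() so that the example
is self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define FLOW_LIST_MAX 64 /* arbitrary capacity for this sketch */

/* Hypothetical application-side list of flow rule handles, stored in
 * creation order. */
struct flow_list {
	void *flow[FLOW_LIST_MAX];
	size_t count;
};

/* Stand-in signature for the proposed rte_flow_destroy(). */
typedef int (*flow_destroy_t)(void *flow);

/* Remember a newly created flow rule. Returns 0, or -1 if the list is
 * full. */
int
flow_list_push(struct flow_list *fl, void *flow)
{
	assert(fl != NULL);
	if (fl->count == FLOW_LIST_MAX)
		return -1;
	fl->flow[fl->count++] = flow;
	return 0;
}

/* Destroy all rules in reverse order of their creation. Stops at the
 * first failure and returns its error code, leaving the remaining rules
 * in the list. */
int
flow_list_flush(struct flow_list *fl, flow_destroy_t destroy)
{
	assert(fl != NULL && destroy != NULL);
	while (fl->count > 0) {
		int ret = destroy(fl->flow[fl->count - 1]);

		if (ret != 0)
			return ret;
		fl->count--;
	}
	return 0;
}

/* Stub used for illustration only: records the order of the calls. */
static size_t destroy_calls;
static void *destroy_log[FLOW_LIST_MAX];

int
destroy_stub(void *flow)
{
	if (destroy_calls < FLOW_LIST_MAX)
		destroy_log[destroy_calls] = flow;
	destroy_calls++;
	return 0;
}
```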

> > > This way application can initialize the port,
> > > clean-up all the existing rules and create new rules  on a clean slate.
> > 
> > No resource can be released as long as a flow rule is using it (bad things
> > may happen otherwise), all flow rules must be destroyed first, thus none can
> > possibly remain after initializing a port. It is assumed that PMDs do
> > automatic clean up during init if necessary to ensure this.
> [Sugesh] That will do.

I will make it more explicit as well.

[...]

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-13 20:03       ` Adrien Mazarguil
@ 2016-07-15  9:23         ` Chandran, Sugesh
  2016-07-15 10:02           ` Chilikin, Andrey
  2016-07-15 15:04           ` Adrien Mazarguil
  0 siblings, 2 replies; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-15  9:23 UTC (permalink / raw)
  To: 'Adrien Mazarguil'
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Thank you Adrien,
Please find some more comments/inputs below.

Let me know your thoughts on this.


Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, July 13, 2016 9:03 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 11, 2016 at 10:42:36AM +0000, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for your response,
> > Please see my comments inline.
> 
> Hi Sugesh,
> 
> Sorry for the delay, please see my answers inline as well.
> 
> [...]
> > > > > Flow director
> > > > > -------------
> > > > >
> > > > > Flow director (FDIR) is the name of the most capable filter
> > > > > type, which covers most features offered by others. As such, it
> > > > > is the most widespread in PMDs that support filtering (i.e. all
> > > > > of them besides **e1000**).
> > > > >
> > > > > It is also the only type that allows an arbitrary 32 bits value
> > > > > provided by applications to be attached to a filter and returned
> > > > > with matching packets instead of relying on the destination queue
> > > > > to recognize flows.
> > > > >
> > > > > Unfortunately, even FDIR requires applications to be aware of
> > > > > low-level capabilities and limitations (most of which come
> > > > > directly from **ixgbe** and **i40e**):
> > > > >
> > > > > - Bitmasks are set globally per device (port?), not per filter.
> > > > [Sugesh] Does this mean an application cannot define filters that
> > > > match on arbitrary different offsets? If that's the case, I assume
> > > > the application has to program the bitmask in advance. Otherwise,
> > > > how does the API framework deduce this bitmask information from the
> > > > rules? It's not very clear to me how an application passes down the
> > > > bitmask information for multiple filters on the same port.
> > >
> > > This is my understanding of how flow director currently works,
> > > perhaps someone more familiar with it can answer this question better
> > > than I could.
> > >
> > > Let me take an example, if particular device can only handle a
> > > single IPv4 mask common to all flow rules (say only to match
> > > destination addresses), updating that mask to also match the source
> > > address affects all defined and future flow rules simultaneously.
> > >
> > > That is how FDIR currently works and I think it is wrong, as it
> > > penalizes devices that do support individual bit-masks per rule, and
> > > is a little awkward from an application point of view.
> > >
> > > What I suggest for the new API instead is the ability to specify one
> > > bit-mask per rule, and let the PMD deal with HW limitations by
> > > automatically configuring global bitmasks from the first added rule,
> > > then refusing to add subsequent rules if they specify a conflicting
> > > bit-mask. Existing rules remain unaffected that way, and
> > > applications do not have to be extra cautious.
> > >
> > [Sugesh] The issue with that approach is that the hardware simply
> > discards a rule when it is a superset of the first one, even though the
> > hardware is capable of handling it. How is it guaranteed that the first
> > rule will set the bitmask for all the subsequent rules?
> 
> Just to clarify, the API only says that new rules cannot affect existing ones
> (which I think makes sense from a user's perspective), so as long as the PMD
> does whatever is needed to make all rules work together, there should not
> be any problem with this approach.
> 
> Even if the PMD has to temporarily remove an existing rule and reconfigure
> global masks in order to add subsequent rules, it is fine as long as packets
> aren't misdirected in the meantime (they may be dropped if there is no
> other choice).
[Sugesh] I feel this is fine. Thank you for confirming.
> 
> > How about having a CLASSIFER_TYPE for the classifier. Every port can
> > have set of supported flow types(for eg: L3_TYPE, L4_TYPE,
> > L4_TYPE_8BYTE_FLEX,
> > L4_TYPE_16BYTE_FLEX) based on the underlying FDIR support. Application
> > can query this and set the type accordingly while initializing the
> > port. This way the first rule need not set all the bits that may be
> > needed in future rules.
> 
> Again from a user's POV, I think doing so would add unwanted HW-specific
> complexity.
> 
> However this concern can be handled through a different approach. Let's say
> user creates a pattern that only specifies a IP header with a given bit-mask.
> 
> In FDIR language this translates to:
> 
> - Set global mask for IPv4 accordingly, remaining global masks all zeroed
>   (assumed default value).
> 
> - Create an IPv4 flow.
> 
> From now on, all rules specifying a IPv4 header must have this exact bitmask
> (implicitly or explicitly), otherwise they cannot be created, i.e. the global
> bitmask for IPv4 becomes immutable.
> 
> Now user creates a TCPv4 rule (as long as it uses the same IPv4 mask), to
> handle this FDIR would:
> 
> - Keep global immutable mask for IPv4 unchanged, set global TCP mask
>   according to the flow rule.
> 
> - Create a TCPv4 flow.
> 
> From this point on, like IPv4, subsequent TCP rules must have this exact
> bitmask and so on as the global bitmask becomes immutable.
> 
> Basically, only protocol bit-masks affected by existing flow rules are
> immutable, others can be changed later. Global flow masks for protocols
> become mutable again when no existing flow rule uses them.
> 
> Does it look fine for you?
[Sugesh] This looks fine for me. 
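
A minimal sketch of the bookkeeping described above, assuming a PMD keeps one
usage counter per protocol-level global mask. The names `global_mask`,
`global_mask_get()` and `global_mask_put()` are hypothetical and only
illustrate the "immutable while in use" rule; EEXIST is borrowed from the
error code suggested elsewhere in this thread for conflicting global settings.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-protocol state a PMD could keep for one global mask. */
struct global_mask {
	uint64_t mask;      /* mask currently programmed in HW */
	unsigned int users; /* number of flow rules relying on it */
};

/*
 * Called when a rule using this protocol is created. The mask can only be
 * (re)programmed while no existing rule depends on it.
 */
static int
global_mask_get(struct global_mask *gm, uint64_t rule_mask)
{
	if (gm->users != 0 && gm->mask != rule_mask)
		return -EEXIST; /* would affect existing rules */
	gm->mask = rule_mask;
	gm->users++;
	return 0;
}

/* Called when a rule is destroyed; the mask is mutable again at zero users. */
static void
global_mask_put(struct global_mask *gm)
{
	if (gm->users != 0)
		gm->users--;
}
```

With this scheme a second IPv4 rule with a different mask is refused until
every rule using the first mask has been destroyed.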
> 
> [...]
> > > > > +--------------------------+
> > > > > | Copy to queue 8          |
> > > > > +==========+===============+
> > > > > | PASSTHRU |               |
> > > > > +----------+-----------+---+
> > > > > | QUEUE    | ``queue`` | 8 |
> > > > > +----------+-----------+---+
> > > > >
> > > > > ``ID``
> > > > > ^^^^^^
> > > > >
> > > > > Attaches a 32 bit value to packets.
> > > > >
> > > > > +----------------------------------------------+
> > > > > | ID                                           |
> > > > > +========+=====================================+
> > > > > | ``id`` | 32 bit value to return with packets |
> > > > > +--------+-------------------------------------+
> > > > >
> > > > [Sugesh] I assume the application has to program the flow with a
> > > > unique ID and matching packets are stamped with this ID when
> > > > reporting to the software. The uniqueness of ID is NOT guaranteed
> > > > by the API framework. Correct me if I am wrong here.
> > >
> > > You are right, if the way I wrote it is not clear enough, I'm open
> > > to suggestions to improve it.
> > [Sugesh] I guess its fine and would like to confirm the same. Perhaps
> > it would be nice to mention that the IDs are application defined.
> 
> OK, I will make it clearer.
> 
> > > > [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible
> > > > to have a
> > > > 64 bit ID? So that application can use the control plane flow
> > > > pointer Itself as an ID. Does it make sense?
> > >
> > > I've specified a 32 bit ID for now because this is what FDIR
> > > supports and also what existing devices can report today AFAIK (i40e
> > > and mlx5).
> > >
> > > We could use 64 bit for future-proofness in a separate action like "ID64"
> > > when at least one device supports it.
> > >
> > > To PMD maintainers: please comment if you know devices that support
> > > tagging matching packets with more than 32 bits of user-provided
> > > data!
> > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet says so.
> > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with rss
> > hash. This can be a software driver limitation that expose only 32
> > bit. Possibly because of cache alignment issues? Since the hardware
> > can support 64 bit, I feel it make sense to support 64 bit as well.
> 
> I agree we need 64 bit support, but then we also need a solution for devices
> that support only 32 bit. Possible methods I can think of:
> 
> - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> 
> - A single ID action with an unlimited number of bytes to return with
>   packets (would actually be a string). PMDs can then refuse to create flow
>   rules requesting an unsupported number of bytes. Devices supporting fewer
>   than 32 bits are also included this way without the need for yet another
>   action.
> 
> Thoughts?
[Sugesh] I feel the single ID approach is much better. But I would say a fixed-size ID
is easier to handle at upper layers. Say the PMD returns a 64-bit ID in which the MSBs
are masked out, based on how many bits the hardware can support.
The PMD can refuse an unsupported number of bytes when requested. So the size
of the ID is going to be a parameter when programming the flow.
What do you think?
> 
> [...]
> > > > [Sugesh] Another concern is the cost and time of installing these
> > > > rules in the hardware. Can we make these APIs time bound(or at
> > > > least an option
> > > to
> > > > set the time limit to execute these APIs), so that Application
> > > > doesn’t have to wait so long when installing and deleting flows
> > > with
> > > > slow hardware/NIC. What do you think? Most of the datapath flow
> > > installations are
> > > > dynamic and triggered only when there is an ingress traffic. Delay
> > > > in flow insertion/deletion have unpredictable
> > > consequences.
> > >
> > > This API is (currently) aimed at the control path only, and must
> > > indeed be assumed to be slow. Creating million of rules may take
> > > quite long as it may involve syscalls and other time-consuming
> > > synchronization things on the PMD side.
> > >
> > > So currently there is no plan to have rules added from the data path
> > > with time constraints. I think it would be implemented through a
> > > different set of functions anyway.
> > >
> > > I do not think adding time limits is practical, even specifying in
> > > the API that creating a single flow rule must take less than a
> > > maximum number of seconds in order to be effective is too much of a
> > > constraint (applications that create all flows during init may not care
> > > after all).
> > >
> > > You should consider in any case that modifying flow rules will
> > > always be slower than receiving packets, there is no way around
> > > that. Applications have to live with it and provide a software
> > > fallback for incoming packets while managing flow rules.
> > >
> > > Moreover, think about what happens when you hit the maximum number
> > > of flow rules and cannot create any more. Applications need to
> > > implement some kind of fallback in their data path.
> > >
> > > Offloading flows in HW is also only useful if they live much longer
> > > than the time taken to create and delete them. Perhaps applications
> > > may choose to do so after detecting long lived flows such as TCP
> > > sessions.
> > >
> > > You may have one separate control thread dedicated to manage flows
> > > and keep your normal control thread unaffected by delays. Several
> > > threads can even be dedicated, one per device.
> > [Sugesh] I agree that the flow insertion cannot be as fast as the
> > packet receiving rate.  From application point of view the problem
> > will be when hardware flow insertion takes longer than software flow
> > insertion. At least application has to know the cost of
> > inserting/deleting a rule in hardware beforehand. Otherwise how
> > application can choose the right flow candidate for hardware. My point
> > here is application is expecting a deterministic behavior from a
> > classifier while inserting and deleting rules.
> 
> Understood, however it will be difficult to estimate, particularly if a PMD
> must rearrange flow rules to make room for a new one due to priority levels
> collision or some other HW-related reason. I mean, spent time cannot be
> assumed to be constant, even PMDs cannot know in advance because it also
> depends on the performance of the host CPU.
> 
> Such applications may find it easier to measure elapsed time for the rules
> they create, make statistics and extrapolate from this information for future
> rules. I do not think the PMD can help much here.
[Sugesh] From an application point of view this can be an issue.
There is even a security concern when we program a short-lived flow. Let's consider this case:

1) The control plane programs the hardware with a queue termination flow.
2) The software dataplane is programmed to treat packets from that specific queue accordingly.
3) The flow is removed from the hardware. (Let's assume this is a long wait.)
   There is even a chance that the hardware takes more time to report the status than to
   remove the flow physically. Now the packets in the queue should no longer be considered
   matched/flow hits, but the software dataplane still treats them as such because its
   update is yet to happen.

We need a way to sync the software datapath with the classifier APIs even though
they are programmed from different control threads.

Are we saying these APIs are only meant for user-defined static flows?


> 
> > > > [Sugesh] Another query is on the synchronization part. What if
> > > > same rules
> > > are
> > > > handled from different threads? Is application responsible for
> > > > handling the
> > > concurrent
> > > > hardware programming?
> > >
> > > Like most (if not all) DPDK APIs, applications are responsible for
> > > managing locking issues as described in 4.3 (Behavior). Since this is
> > > a control path API and applications usually have a single control
> > > thread, locking should not be necessary in most cases.
> > >
> > > Regarding my above comment about using several control threads to
> > > manage different devices, section 4.3 says:
> > >
> > >  "There is no provision for reentrancy/multi-thread safety, although
> > > nothing  should prevent different devices from being configured at
> > > the same  time. PMDs may protect their control path functions
> accordingly."
> > >
> > > I'd like to emphasize it is not "per port" but "per device", since
> > > in a few cases a configurable resource is shared by several ports.
> > > It may be difficult for applications to determine which ports are
> > > shared by a given device but this falls outside the scope of this API.
> > >
> > > Do you think adding the guarantee that it is always safe to
> > > configure two different ports simultaneously without locking from
> > > the application side is necessary? In which case the PMD would be
> > > responsible for locking shared resources.
> > [Sugesh] This would be a little bit complicated when some of the ports
> > are not under DPDK itself (what if one port is managed by the kernel?) or
> > ports are tied to different applications. Locking in the PMD helps when
> > the ports are accessed by multiple DPDK applications. However, what if
> > the port itself is not under DPDK?
> 
> Well, either we do not care about what happens outside of the DPDK
> context, or PMDs must find a way to satisfy everyone. I'm not a fan of locking
> either but it would be nice if flow rules configuration could be attempted on
> different ports simultaneously without the risk of wrecking anything, so that
> applications do not need to care.
> 
> Possible cases for a dual port device with global flow rule settings affecting
> both ports:
> 
> 1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
>    to alter a global setting necessary for an existing rule on any port is
>    not allowed (EEXIST). PMD must maintain a device context common to both
>    ports in order for this to work. This context is either under lock, or
>    the first port on which a flow rule is created owns all future flow
>    rules.
> 
> 2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
>    it and knows that port 2 may modify the global context: no flow rules can
>    be created from the DPDK application due to safety issues (EBUSY?).
> 
> 3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
>    it and knows that port 2 will not modify flow rules: PMD should not care,
>    no lock necessary.
> 
> 4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
>    aware of it: either flow rules cannot be created ever at all, or we say
>    it is the user's responsibility to make sure this does not happen.
> 
> Considering that most control operations performed by DPDK affect the
> device regardless of other applications, I think 1) is the only case that should
> be defined, otherwise 4), defined as user's responsibility.
> 
> > > > > Destruction
> > > > > ~~~~~~~~~~~
> > > > >
> > > > > Flow rules destruction is not automatic, and a queue should not
> > > > > be
> > > released
> > > > > if any are still attached to it. Applications must take care of
> > > > > performing this step before releasing resources.
> > > > >
> > > > > ::
> > > > >
> > > > >  int
> > > > >  rte_flow_destroy(uint8_t port_id,
> > > > >                   struct rte_flow *flow);
> > > > >
> > > > >
> > > > [Sugesh] I would suggest having a clean-up API is really useful as
> > > > the
> > > releasing of
> > > > Queue(is it applicable for releasing of port too?) is not
> > > > guaranteeing the
> > > automatic flow
> > > > destruction.
> > >
> > > Would something like rte_flow_flush(port_id) do the trick? I wanted
> > > to emphasize in this first draft that applications should really
> > > keep the flow pointers around in order to manage/destroy them. It is
> > > their responsibility, not PMD's.
> > [Sugesh] Thanks, I think the flush call will do.
> 
> Noted, will add it.
> 
> > > > This way application can initialize the port, clean-up all the
> > > > existing rules and create new rules  on a clean slate.
> > >
> > > No resource can be released as long as a flow rule is using it (bad
> > > things may happen otherwise), all flow rules must be destroyed
> > > first, thus none can possibly remain after initializing a port. It
> > > is assumed that PMDs do automatic clean up during init if necessary to
> > > ensure this.
> > [Sugesh] That will do.
> 
> I will make it more explicit as well.
> 
> [...]
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-15  9:23         ` Chandran, Sugesh
@ 2016-07-15 10:02           ` Chilikin, Andrey
  2016-07-18 13:26             ` Chandran, Sugesh
  2016-07-15 15:04           ` Adrien Mazarguil
  1 sibling, 1 reply; 262+ messages in thread
From: Chilikin, Andrey @ 2016-07-15 10:02 UTC (permalink / raw)
  To: Chandran, Sugesh, 'Adrien Mazarguil'
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Sugesh,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chandran, Sugesh
> Sent: Friday, July 15, 2016 10:23 AM
> To: 'Adrien Mazarguil' <adrien.mazarguil@6wind.com>

<snip>

> > > > To PMD maintainers: please comment if you know devices that
> > > > support tagging matching packets with more than 32 bits of
> > > > user-provided data!
> > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet says so.
> > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with
> > > rss hash. This can be a software driver limitation that expose only
> > > 32 bit. Possibly because of cache alignment issues? Since the
> > > hardware can support 64 bit, I feel it make sense to support 64 bit as well.

XL710 supports a 32-bit FDIR ID only; I believe you are mixing it up with flexible payload data, which can take 0, 4 or 8 bytes of the RX descriptor.

Regards,
Andrey


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-15  9:23         ` Chandran, Sugesh
  2016-07-15 10:02           ` Chilikin, Andrey
@ 2016-07-15 15:04           ` Adrien Mazarguil
  2016-07-18 13:26             ` Chandran, Sugesh
  1 sibling, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-15 15:04 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

On Fri, Jul 15, 2016 at 09:23:26AM +0000, Chandran, Sugesh wrote:
> Thank you Adrien,
> Please find below for some more comments/inputs
> 
> Let me know your thoughts on this.

Thanks, stripping non-relevant parts again.

[...]
> > > > > [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible
> > > > > to have a
> > > > > 64 bit ID? So that application can use the control plane flow
> > > > > pointer Itself as an ID. Does it make sense?
> > > >
> > > > I've specified a 32 bit ID for now because this is what FDIR
> > > > supports and also what existing devices can report today AFAIK (i40e
> > > > and mlx5).
> > > >
> > > > We could use 64 bit for future-proofness in a separate action like "ID64"
> > > > when at least one device supports it.
> > > >
> > > > To PMD maintainers: please comment if you know devices that support
> > > > tagging matching packets with more than 32 bits of user-provided
> > > > data!
> > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet says so.
> > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with rss
> > > hash. This can be a software driver limitation that expose only 32
> > > bit. Possibly because of cache alignment issues? Since the hardware
> > > can support 64 bit, I feel it make sense to support 64 bit as well.
> > 
> > I agree we need 64 bit support, but then we also need a solution for devices
> > that support only 32 bit. Possible methods I can think of:
> > 
> > - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> > 
> > - A single ID action with an unlimited number of bytes to return with
> >   packets (would actually be a string). PMDs can then refuse to create flow
> >   rules requesting an unsupported number of bytes. Devices supporting fewer
> >   than 32 bits are also included this way without the need for yet another
> >   action.
> > 
> > Thoughts?
> [Sugesh] I feel the single ID approach is much better. But I would say a fixed size ID
> is easy to handle at upper layers. Say PMD returns 64bit ID in which MSBs 
> are masked out, based on how many bits the hardware can support. 
> PMD can refuse the unsupported number of bytes when requested. So the size
> of ID going to be a parameter to program the flow.
> What do you think?

What you suggest if I am not mistaken is:

 struct rte_flow_action_id {
     uint64_t id;
     uint64_t mask; /* either a bit-mask or a prefix/suffix length? */
 };

I think in this case a mask is more versatile than a prefix/suffix length as
the value itself comes in an unknown endianness (from the PMD's POV). It also
allows specific bits to be taken into account: when HW only supports 32 bits,
with some black magic the full original 64 bit value can be restored as long
as the application only cares about at most 32 bits anywhere.
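
To make the "black magic" a bit more concrete, here is a rough sketch under
my own assumptions, not a proposal for the actual implementation: the PMD
accepts the action only if the mask selects no more bits than HW can carry,
packs the selected bits before programming HW, and spreads them back on the
RX side. All helper names below (`mask_bit_count()`, `id_mask_fits()`,
`id_pack()`, `id_unpack()`) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout discussed above; not part of any released API. */
struct rte_flow_action_id {
	uint64_t id;
	uint64_t mask; /* bits of id the application cares about */
};

/* Count the bits set in mask. */
static unsigned int
mask_bit_count(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask != 0; mask &= mask - 1)
		n++;
	return n;
}

/* True if HW able to carry hw_bits bits of user data can honor conf. */
static bool
id_mask_fits(const struct rte_flow_action_id *conf, unsigned int hw_bits)
{
	return mask_bit_count(conf->mask) <= hw_bits;
}

/* Gather the bits of value selected by mask into the low-order bits. */
static uint64_t
id_pack(uint64_t value, uint64_t mask)
{
	uint64_t out = 0;
	unsigned int pos = 0;

	for (unsigned int i = 0; i < 64; i++) {
		if (!(mask & (UINT64_C(1) << i)))
			continue;
		if (value & (UINT64_C(1) << i))
			out |= UINT64_C(1) << pos;
		pos++;
	}
	return out;
}

/* Inverse of id_pack(): spread low-order bits back to the mask positions. */
static uint64_t
id_unpack(uint64_t packed, uint64_t mask)
{
	uint64_t out = 0;
	unsigned int pos = 0;

	for (unsigned int i = 0; i < 64; i++) {
		if (!(mask & (UINT64_C(1) << i)))
			continue;
		if (packed & (UINT64_C(1) << pos))
			out |= UINT64_C(1) << i;
		pos++;
	}
	return out;
}
```

Only the masked bits survive the round trip, which is the whole point: an
application filling the mask with ones gets refused on 32 bit HW.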

However, I think few applications can afford not to care about specific bits
in a given value, and having to provide a properly crafted mask will be a
hassle; they will just fill it with ones and hope for the best. As a result
they won't take advantage of this feature or will stick to 32 bits all the
time, or whatever happens to be the least common denominator.

My previous suggestion was:

 struct rte_flow_action_id {
     uint8_t size; /* number of bytes in id[] */
     uint8_t id[];
 };

It does not solve the issue if an application requests more bytes than
supported, however as a string, there is no endianness ambiguity and these
bytes are copied as-is to the related mbuf field as if done through memcpy()
possibly with some padding to fill the entire 64 bit field (copied bytes
thus starting from MSB for big-endian machines, LSB for little-endian
ones). The value itself remains opaque to the PMD.
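
A small sketch of that copy, assuming the destination is a plain 64 bit
field and that the PMD knows how many bytes its HW can return. The helper
name `id_bytes_to_field()` and its `hw_max` parameter are hypothetical, and
returning ENOTSUP for oversized requests is only my assumption here.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy size bytes of an opaque ID into a zero-padded 64 bit field, exactly
 * as memcpy() would lay them out in memory. hw_max is the number of bytes
 * the device is able to return with packets.
 */
static int
id_bytes_to_field(uint64_t *field, const uint8_t *id, uint8_t size,
		  uint8_t hw_max)
{
	if (size > hw_max || size > sizeof(*field))
		return -ENOTSUP; /* more bytes requested than supported */
	*field = 0; /* padding for the unused bytes */
	memcpy(field, id, size);
	return 0;
}
```

Because bytes are copied in memory order, the application reads them back
with memcpy() as well and never has to reason about endianness.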

One issue is the flexible array approach makes static initialization more
complicated. Maybe it is not worth the trouble since according to Andrey,
even XL710 reports at most 32 bits of user data.

So what should we do? Fixed 32 bits ID for now to keep things simple, then
another action for 64 bits later when necessary?

> > [...]
> > > > > [Sugesh] Another concern is the cost and time of installing these
> > > > > rules in the hardware. Can we make these APIs time bound(or at
> > > > > least an option
> > > > to
> > > > > set the time limit to execute these APIs), so that Application
> > > > > doesn’t have to wait so long when installing and deleting flows
> > > > with
> > > > > slow hardware/NIC. What do you think? Most of the datapath flow
> > > > installations are
> > > > > dynamic and triggered only when there is an ingress traffic. Delay
> > > > > in flow insertion/deletion have unpredictable
> > > > consequences.
> > > >
> > > > This API is (currently) aimed at the control path only, and must
> > > > indeed be assumed to be slow. Creating million of rules may take
> > > > quite long as it may involve syscalls and other time-consuming
> > > > synchronization things on the PMD side.
> > > >
> > > > So currently there is no plan to have rules added from the data path
> > > > with time constraints. I think it would be implemented through a
> > > > different set of functions anyway.
> > > >
> > > > I do not think adding time limits is practical, even specifying in
> > > > the API that creating a single flow rule must take less than a
> > > > maximum number of seconds in order to be effective is too much of a
> > > > constraint (applications that create all flows during init may not care after
> > all).
> > > >
> > > > You should consider in any case that modifying flow rules will
> > > > always be slower than receiving packets, there is no way around
> > > > that. Applications have to live with it and provide a software
> > > > fallback for incoming packets while managing flow rules.
> > > >
> > > > Moreover, think about what happens when you hit the maximum
> > number
> > > > of flow rules and cannot create any more. Applications need to
> > > > implement some kind of fallback in their data path.
> > > >
> > > > Offloading flows in HW is also only useful if they live much longer
> > > > than the time taken to create and delete them. Perhaps applications
> > > > may choose to do so after detecting long lived flows such as TCP
> > > > sessions.
> > > >
> > > > You may have one separate control thread dedicated to manage flows
> > > > and keep your normal control thread unaffected by delays. Several
> > > > threads can even be dedicated, one per device.
> > > [Sugesh] I agree that the flow insertion cannot be as fast as the
> > > packet receiving rate.  From application point of view the problem
> > > will be when hardware flow insertion takes longer than software flow
> > > insertion. At least application has to know the cost of
> > > inserting/deleting a rule in hardware beforehand. Otherwise how
> > > application can choose the right flow candidate for hardware. My point
> > here is application is expecting a deterministic behavior from a classifier while
> > inserting and deleting rules.
> > 
> > Understood, however it will be difficult to estimate, particularly if a PMD
> > must rearrange flow rules to make room for a new one due to priority levels
> > collision or some other HW-related reason. I mean, spent time cannot be
> > assumed to be constant, even PMDs cannot know in advance because it also
> > depends on the performance of the host CPU.
> > 
> > Such applications may find it easier to measure elapsed time for the rules
> > they create, make statistics and extrapolate from this information for future
> > rules. I do not think the PMD can help much here.
> [Sugesh] From an application point of view this can be an issue. 
> Even there is a security concern when we program a short lived flow. Lets consider the case, 
> 
> 1) Control plane programs the hardware with Queue termination flow.
> 2) Software dataplane programmed to treat the packets from the specific queue accordingly.
> 3) Remove the flow from the hardware. (Lets consider this is a long wait process..). 
> Or even there is a chance that hardware take more time to report the status than removing it 
> physically . Now the packets in the queue no longer consider as matched/flow hit.
> . This is due to the software dataplane update is yet to happen.
> We must need a way to sync between software datapath and classifier APIs even though 
> they are both programmed from a different control thread.
> 
> Are we saying these APIs are only meant for user defined static flows??

No, that is definitely not the intent. These are good points.

With the specified API, applications may have to adapt their logic and take
extra precautions in order to remain on the safe side at all times.

For your above example, the application cannot assume a rule is
added/deleted as long as the PMD has not completed the related operation,
which means keeping the SW rule/fallback in place in the meantime. This should
handle security concerns as long as, after removing a rule, packets end up in
a default queue entirely processed by SW. Obviously this may worsen response
time.
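
The ordering I have in mind can be sketched as follows. This is application
side logic under my own assumptions: `struct app_flow`, `hw_destroy_fn` and
`app_flow_unoffload()` are hypothetical names, the callback stands in for
rte_flow_destroy() from the draft, and `stub_destroy_ok()` only exists so
the sketch can be exercised without a device.

```c
#include <stdbool.h>
#include <stddef.h>

struct rte_flow; /* opaque handle, as in the draft */

/* Hypothetical application-side record for one offloaded flow. */
struct app_flow {
	struct rte_flow *hw; /* NULL while no HW rule exists */
	bool sw_rule_active; /* SW classification entry is in place */
};

/* Stand-in for rte_flow_destroy(); returns 0 on success. */
typedef int (*hw_destroy_fn)(unsigned int port_id, struct rte_flow *flow);

/* Stub pretending the PMD removed the rule successfully. */
static int
stub_destroy_ok(unsigned int port_id, struct rte_flow *flow)
{
	(void)port_id;
	(void)flow;
	return 0;
}

/*
 * Remove a flow from HW without ever leaving packets unclassified: the SW
 * rule is (re)enabled first and the HW handle is only forgotten once the
 * PMD reports completion.
 */
static int
app_flow_unoffload(struct app_flow *f, unsigned int port_id,
		   hw_destroy_fn destroy)
{
	int ret;

	f->sw_rule_active = true; /* fallback must exist before touching HW */
	if (f->hw == NULL)
		return 0;
	ret = destroy(port_id, f->hw);
	if (ret != 0)
		return ret; /* HW rule may still be live, keep the handle */
	f->hw = NULL;
	return 0;
}
```

The same ordering applies in reverse when offloading: the SW rule stays
active until the PMD has confirmed the HW rule was created.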

The ID action can help with this. By knowing which rule a received packet is
associated with, processing can be temporarily offloaded to another thread
without much complexity.

I think applications have to implement SW fallbacks all the time, as even
some sort of guarantee on the flow rule processing time may not be enough to
avoid misdirected packets and related security issues.

Let's wait for applications to start using this API and then consider an
extra set of asynchronous / real-time functions when the need arises. It
should not impact the way rules are specified.

> > > > > [Sugesh] Another query is on the synchronization part. What if
> > > > > same rules
> > > > are
> > > > > handled from different threads? Is application responsible for
> > > > > handling the
> > > > concurrent
> > > > > hardware programming?
> > > >
> > > > Like most (if not all) DPDK APIs, applications are responsible for
> > > > managing locking issues as described in 4.3 (Behavior). Since this is
> > > > a control path API and applications usually have a single control
> > > > thread, locking should not be necessary in most cases.
> > > >
> > > > Regarding my above comment about using several control threads to
> > > > manage different devices, section 4.3 says:
> > > >
> > > >  "There is no provision for reentrancy/multi-thread safety, although
> > > > nothing  should prevent different devices from being configured at
> > > > the same  time. PMDs may protect their control path functions
> > accordingly."
> > > >
> > > > I'd like to emphasize it is not "per port" but "per device", since
> > > > in a few cases a configurable resource is shared by several ports.
> > > > It may be difficult for applications to determine which ports are
> > > > shared by a given device but this falls outside the scope of this API.
> > > >
> > > > Do you think adding the guarantee that it is always safe to
> > > > configure two different ports simultaneously without locking from
> > > > the application side is necessary? In which case the PMD would be
> > > > responsible for locking shared resources.
> > > [Sugesh] This would be little bit complicated when some of ports are
> > > not under DPDK itself(what if one port is managed by Kernel) Or ports
> > > are tied by different application. Locking in PMD helps when the ports
> > > are accessed by multiple DPDK application. However what if the port itself
> > not under DPDK?
> > 
> > Well, either we do not care about what happens outside of the DPDK
> > context, or PMDs must find a way to satisfy everyone. I'm not a fan of locking
> > either but it would be nice if flow rules configuration could be attempted on
> > different ports simultaneously without the risk of wrecking anything, so that
> > applications do not need to care.
> > 
> > Possible cases for a dual port device with global flow rule settings affecting
> > both ports:
> > 
> > 1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that needs
> >    to alter a global setting necessary for an existing rule on any port is
> >    not allowed (EEXIST). PMD must maintain a device context common to both
> >    ports in order for this to work. This context is either under lock, or
> >    the first port on which a flow rule is created owns all future flow
> >    rules.
> > 
> > 2) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
> >    it and knows that port 2 may modify the global context: no flow rules can
> >    be created from the DPDK application due to safety issues (EBUSY?).
> > 
> > 3) port 1 is managed by DPDK, port 2 by something else, the PMD is aware of
> >    it and knows that port 2 will not modify flow rules: PMD should not care,
> >    no lock necessary.
> > 
> > 4) port 1 is managed by DPDK, port 2 by something else and the PMD is not
> >    aware of it: either flow rules cannot be created ever at all, or we say
> >    it is user's responsibility to make sure this does not happen.
> > 
> > Considering that most control operations performed by DPDK affect the
> > device regardless of other applications, I think 1) is the only case that should
> > be defined, otherwise 4), defined as user's responsibility.

No more comments on this part? What do you suggest?
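
To make the cases above easier to discuss, here is how a PMD-side check
could look. This is only a sketch: the enum, the function and its
parameters are made up for illustration, the negative errno return
convention is an assumption, and EBUSY for case 2 is still an open
question as noted above.

```c
#include <assert.h>
#include <errno.h>

/* What the PMD knows about the other port sharing the device.
 * Case 4 (PMD unaware that the other port is managed elsewhere) cannot
 * be detected by definition, hence it has no entry here and remains the
 * user's responsibility. */
enum peer_port {
    PEER_DPDK,            /* case 1: both ports managed by DPDK */
    PEER_FOREIGN_WRITER,  /* case 2: other owner may modify global context */
    PEER_FOREIGN_PASSIVE, /* case 3: other owner will not modify flow rules */
};

/* Return 0 if the rule may be created, a negative errno value otherwise.
 * alters_global: the new rule needs to change a device-wide setting.
 * global_in_use: an existing rule on any port depends on that setting. */
static int flow_create_check(enum peer_port peer, int alters_global,
                             int global_in_use)
{
    switch (peer) {
    case PEER_FOREIGN_WRITER:
        /* Global context may change under us, refuse all rules. */
        return -EBUSY;
    case PEER_FOREIGN_PASSIVE:
        /* Nothing else touches flow rules, no restriction. */
        return 0;
    case PEER_DPDK:
    default:
        /* Shared device context: refuse conflicting global changes. */
        return (alters_global && global_in_use) ? -EEXIST : 0;
    }
}
```

In case 1 this check would run against the device context common to both
ports, either under a lock or from the single port that owns flow rules.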

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-15 10:02           ` Chilikin, Andrey
@ 2016-07-18 13:26             ` Chandran, Sugesh
  0 siblings, 0 replies; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-18 13:26 UTC (permalink / raw)
  To: Chilikin, Andrey, 'Adrien Mazarguil'
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern

Hi Andrey,

Regards
_Sugesh


> -----Original Message-----
> From: Chilikin, Andrey
> Sent: Friday, July 15, 2016 11:02 AM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>; 'Adrien Mazarguil'
> <adrien.mazarguil@6wind.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>
> Subject: RE: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Chandran, Sugesh
> > Sent: Friday, July 15, 2016 10:23 AM
> > To: 'Adrien Mazarguil' <adrien.mazarguil@6wind.com>
> 
> <snip>
> 
> > > > > To PMD maintainers: please comment if you know devices that
> > > > > support tagging matching packets with more than 32 bits of
> > > > > user-provided data!
> > > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet says
> so.
> > > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with
> > > > rss hash. This can be a software driver limitation that expose
> > > > only
> > > > 32 bit. Possibly because of cache alignment issues? Since the
> > > > hardware can support 64 bit, I feel it make sense to support 64 bit as
> well.
> 
> XL710 supports 32bit FDIR ID only, I believe you mix it up with flexible payload
> data which can take 0, 4 or 8 bytes of the RX descriptor.
[Sugesh] Thank you for the correction, Andrey.
It was my mistake; I mixed it up with the flexible payload data. The FDIR ID is only 32 bits.

> 
> Regards,
> Andrey

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-15 15:04           ` Adrien Mazarguil
@ 2016-07-18 13:26             ` Chandran, Sugesh
  2016-07-18 15:00               ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-18 13:26 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

Hi Adrien,
Thank you for getting back on this.
Please find my comments below.

Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, July 15, 2016 4:04 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>;
> Chilikin, Andrey <andrey.chilikin@intel.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Fri, Jul 15, 2016 at 09:23:26AM +0000, Chandran, Sugesh wrote:
> > Thank you Adrien,
> > Please find below for some more comments/inputs
> >
> > Let me know your thoughts on this.
> 
> Thanks, stripping again non relevant parts.
> 
> [...]
> > > > > > [Sugesh] Is it a limitation to use only 32 bit ID? Is it
> > > > > > possible to have a
> > > > > > 64 bit ID? So that application can use the control plane flow
> > > > > > pointer Itself as an ID. Does it make sense?
> > > > >
> > > > > I've specified a 32 bit ID for now because this is what FDIR
> > > > > supports and also what existing devices can report today AFAIK
> > > > > (i40e and
> > > mlx5).
> > > > >
> > > > > We could use 64 bit for future-proofness in a separate action like
> "ID64"
> > > > > when at least one device supports it.
> > > > >
> > > > > To PMD maintainers: please comment if you know devices that
> > > > > support tagging matching packets with more than 32 bits of
> > > > > user-provided data!
> > > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet says
> so.
> > > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with
> > > > rss hash. This can be a software driver limitation that expose
> > > > only 32 bit. Possibly because of cache alignment issues? Since the
> > > > hardware can support 64 bit, I feel it make sense to support 64 bit as
> well.
> > >
> > > I agree we need 64 bit support, but then we also need a solution for
> > > devices that support only 32 bit. Possible methods I can think of:
> > >
> > > - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> > >
> > > - A single ID action with an unlimited number of bytes to return with
> > >   packets (would actually be a string). PMDs can then refuse to create
> flow
> > >   rules requesting an unsupported number of bytes. Devices
> > > supporting fewer
> > >   than 32 bits are also included this way without the need for yet another
> > >   action.
> > >
> > > Thoughts?
> > [Sugesh] I feel the single ID approach is much better. But I would say
> > a fixed size ID is easy to handle at upper layers. Say PMD returns
> > 64bit ID in which MSBs are masked out, based on how many bits the
> hardware can support.
> > PMD can refuse the unsupported number of bytes when requested. So
> the
> > size of ID going to be a parameter to program the flow.
> > What do you think?
> 
> What you suggest if I am not mistaken is:
> 
>  struct rte_flow_action_id {
>      uint64_t id;
>      uint64_t mask; /* either a bit-mask or a prefix/suffix length? */  };
> 
> I think in this case a mask is more versatile than a prefix/suffix length as the
> value itself comes in an unknown endian (from PMD's POV). It also allows
> specific bits to be taken into account, like when HW only supports 32 bit, with
> some black magic the full original 64 bit value can be restored as long as the
> application only cares about at most 32 bits anywhere.
> 
> However I do not think many applications "won't care" about specific bits in a
> given value and having to provide a properly crafted mask will be a hassle,
> they will just fill it with ones and hope for the best. As a result they won't
> take advantage of this feature or will stick to 32 bits all the time, or whatever
> happens to be the least common denominator.
> 
> My previous suggestion was:
> 
>  struct rte_flow_action_id {
>      uint8_t size; /* number of bytes in id[] */
>      uint8_t id[];
>  };
> 
> It does not solve the issue if an application requests more bytes than
> supported, however as a string, there is no endianness ambiguity and these
> bytes are copied as-is to the related mbuf field as if done through memcpy()
> possibly with some padding to fill the entire 64 bit field (copied bytes thus
> starting from MSB for big-endian machines, LSB for little-endian ones). The
> value itself remains opaque to the PMD.
> 
> One issue is the flexible array approach makes static initialization more
> complicated. Maybe it is not worth the trouble since according to Andrey,
> even X710 reports at most 32 bits of user data.
> 
> So what should we do? Fixed 32 bits ID for now to keep things simple, then
> another action for 64 bits later when necessary?
[Sugesh] I agree with you. We could keep things simple by having a 32 bit ID for now.
I mixed up the size of the ID with the flexible payload size. Sorry about that.
In the future, we could add an action for 64 bits if necessary.

> 
> > > [...]
> > > > > > [Sugesh] Another concern is the cost and time of installing
> > > > > > these rules in the hardware. Can we make these APIs time
> > > > > > bound(or at least an option
> > > > > to
> > > > > > set the time limit to execute these APIs), so that Application
> > > > > > doesn’t have to wait so long when installing and deleting
> > > > > > flows
> > > > > with
> > > > > > slow hardware/NIC. What do you think? Most of the datapath
> > > > > > flow
> > > > > installations are
> > > > > > dynamic and triggered only when there is an ingress traffic.
> > > > > > Delay in flow insertion/deletion have unpredictable
> > > > > consequences.
> > > > >
> > > > > This API is (currently) aimed at the control path only, and must
> > > > > indeed be assumed to be slow. Creating millions of rules may take
> > > > > quite long as it may involve syscalls and other time-consuming
> > > > > synchronization things on the PMD side.
> > > > >
> > > > > So currently there is no plan to have rules added from the data
> > > > > path with time constraints. I think it would be implemented
> > > > > through a different set of functions anyway.
> > > > >
> > > > > I do not think adding time limits is practical, even specifying
> > > > > in the API that creating a single flow rule must take less than
> > > > > a maximum number of seconds in order to be effective is too much
> > > > > of a constraint (applications that create all flows during init
> > > > > may not care after
> > > all).
> > > > >
> > > > > You should consider in any case that modifying flow rules will
> > > > > always be slower than receiving packets, there is no way around
> > > > > that. Applications have to live with it and provide a software
> > > > > fallback for incoming packets while managing flow rules.
> > > > >
> > > > > Moreover, think about what happens when you hit the maximum
> > > number
> > > > > of flow rules and cannot create any more. Applications need to
> > > > > implement some kind of fallback in their data path.
> > > > >
> > > > > Offloading flows in HW is also only useful if they live much
> > > > > longer than the time taken to create and delete them. Perhaps
> > > > > applications may choose to do so after detecting long lived
> > > > > flows such as TCP sessions.
> > > > >
> > > > > You may have one separate control thread dedicated to manage
> > > > > flows and keep your normal control thread unaffected by delays.
> > > > > Several threads can even be dedicated, one per device.
> > > > [Sugesh] I agree that the flow insertion cannot be as fast as the
> > > > packet receiving rate.  From application point of view the problem
> > > > will be when hardware flow insertion takes longer than software
> > > > flow insertion. At least application has to know the cost of
> > > > inserting/deleting a rule in hardware beforehand. Otherwise how
> > > > application can choose the right flow candidate for hardware. My
> > > > point
> > > here is application is expecting a deterministic behavior from a
> > > classifier while inserting and deleting rules.
> > >
> > > Understood, however it will be difficult to estimate, particularly
> > > if a PMD must rearrange flow rules to make room for a new one due to
> > > priority levels collision or some other HW-related reason. I mean,
> > > spent time cannot be assumed to be constant, even PMDs cannot know
> > > in advance because it also depends on the performance of the host CPU.
> > >
> > > Such applications may find it easier to measure elapsed time for the
> > > rules they create, make statistics and extrapolate from this
> > > information for future rules. I do not think the PMD can help much here.
> > [Sugesh] From an application point of view this can be an issue.
> > Even there is a security concern when we program a short lived flow.
> > Lets consider the case,
> >
> > 1) Control plane programs the hardware with Queue termination flow.
> > 2) Software dataplane programmed to treat the packets from the specific
> queue accordingly.
> > 3) Remove the flow from the hardware. (Lets consider this is a long wait
> process..).
> > Or even there is a chance that hardware take more time to report the
> > status than removing it physically . Now the packets in the queue no longer
> consider as matched/flow hit.
> > . This is due to the software dataplane update is yet to happen.
> > We must need a way to sync between software datapath and classifier
> > APIs even though they are both programmed from a different control
> thread.
> >
> > Are we saying these APIs are only meant for user defined static flows??
> 
> No, that is definitely not the intent. These are good points.
> 
> With the specified API, applications may have to adapt their logic and take
> extra precautions in order to remain on the safe side at all times.
> 
> For your above example, the application cannot assume a rule is
> added/deleted as long as the PMD has not completed the related operation,
> which means keeping the SW rule/fallback in place in the meantime. Should
> handle security concerns as long as after removing a rule, packets end up in a
> default queue entirely processed by SW. Obviously this may worsen
> response time.
> 
> The ID action can help with this. By knowing which rule a received packet is
> associated with, processing can be temporarily offloaded by another thread
> without much complexity.
[Sugesh] Setting an ID for every flow may not be viable, especially when the size of the ID
is small (only 8 bits, say). I am not sure whether this is a valid case though.

How about a hardware flow flag in the packet descriptor that is set when a
packet hits any hardware rule? This way software is not blocked by a
hardware rule. Even though validating this flag adds some overhead, the
software datapath can easily identify hardware-processed packets.
Packets then traverse the software fallback path until the rule configuration is
complete. This flag avoids setting an ID action for every hardware flow being configured.

> 
> I think applications have to implement SW fallbacks all the time, as even
> some sort of guarantee on the flow rule processing time may not be enough
> to avoid misdirected packets and related security issues.
[Sugesh] Software fallback will always be there. However I am a little confused about
how software is going to identify packets that have already been processed by hardware. I feel we need some
notification in the packet itself when a hardware rule hits. ID/flag/any other options?
> 
> Let's wait for applications to start using this API and then consider an extra
> set of asynchronous / real-time functions when the need arises. It should not
> impact the way rules are specified.
[Sugesh] Sure. I think this should not affect the rule definition.
> 
> > > > > > [Sugesh] Another query is on the synchronization part. What if
> > > > > > same rules
> > > > > are
> > > > > > handled from different threads? Is application responsible for
> > > > > > handling the
> > > > > concurrent
> > > > > > hardware programming?
> > > > >
> > > > > Like most (if not all) DPDK APIs, applications are responsible
> > > > > for managing locking issues as described in 4.3 (Behavior). Since
> > > > > this is a control path API and applications usually have a
> > > > > single control thread, locking should not be necessary in most cases.
> > > > >
> > > > > Regarding my above comment about using several control threads
> > > > > to manage different devices, section 4.3 says:
> > > > >
> > > > >  "There is no provision for reentrancy/multi-thread safety,
> > > > > although nothing  should prevent different devices from being
> > > > > configured at the same  time. PMDs may protect their control
> > > > > path functions
> > > accordingly."
> > > > >
> > > > > I'd like to emphasize it is not "per port" but "per device",
> > > > > since in a few cases a configurable resource is shared by several ports.
> > > > > It may be difficult for applications to determine which ports
> > > > > are shared by a given device but this falls outside the scope of this
> API.
> > > > >
> > > > > Do you think adding the guarantee that it is always safe to
> > > > > configure two different ports simultaneously without locking
> > > > > from the application side is necessary? In which case the PMD
> > > > > would be responsible for locking shared resources.
> > > > [Sugesh] This would be little bit complicated when some of ports
> > > > are not under DPDK itself(what if one port is managed by Kernel)
> > > > Or ports are tied by different application. Locking in PMD helps
> > > > when the ports are accessed by multiple DPDK application. However
> > > > what if the port itself
> > > not under DPDK?
> > >
> > > Well, either we do not care about what happens outside of the DPDK
> > > context, or PMDs must find a way to satisfy everyone. I'm not a fan
> > > of locking either but it would be nice if flow rules configuration
> > > could be attempted on different ports simultaneously without the
> > > risk of wrecking anything, so that applications do not need to care.
> > >
> > > Possible cases for a dual port device with global flow rule settings
> > > affecting both ports:
> > >
> > > 1) ports 1 & 2 are managed by DPDK: this is the easy case, a rule that
> needs
> > >    to alter a global setting necessary for an existing rule on any port is
> > >    not allowed (EEXIST). PMD must maintain a device context common to
> both
> > >    ports in order for this to work. This context is either under lock, or
> > >    the first port on which a flow rule is created owns all future flow
> > >    rules.
> > >
> > > 2) port 1 is managed by DPDK, port 2 by something else, the PMD is
> aware of
> > >    it and knows that port 2 may modify the global context: no flow rules
> can
> > >    be created from the DPDK application due to safety issues (EBUSY?).
> > >
> > > 3) port 1 is managed by DPDK, port 2 by something else, the PMD is
> aware of
> > >    it and knows that port 2 will not modify flow rules: PMD should not care,
> > >    no lock necessary.
> > >
> > > 4) port 1 is managed by DPDK, port 2 by something else and the PMD is
> not
> > >    aware of it: either flow rules cannot be created ever at all, or we say
> > >    it is user's responsibility to make sure this does not happen.
> > >
> > > Considering that most control operations performed by DPDK affect
> > > the device regardless of other applications, I think 1) is the only
> > > case that should be defined, otherwise 4), defined as user's
> responsibility.
> 
> No more comments on this part? What do you suggest?
[Sugesh] I agree with your suggestions. I feel this is the best we can offer.

> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-18 13:26             ` Chandran, Sugesh
@ 2016-07-18 15:00               ` Adrien Mazarguil
  2016-07-20 16:32                 ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-18 15:00 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

On Mon, Jul 18, 2016 at 01:26:09PM +0000, Chandran, Sugesh wrote:
> Hi Adrien,
> Thank you for getting back on this.
> Please find my comments below.

Hi Sugesh,

Same here, I have again removed the parts we agree on.

[...]
> > > > > > > [Sugesh] Another concern is the cost and time of installing
> > > > > > > these rules in the hardware. Can we make these APIs time
> > > > > > > bound(or at least an option
> > > > > > to
> > > > > > > set the time limit to execute these APIs), so that Application
> > > > > > > doesn’t have to wait so long when installing and deleting
> > > > > > > flows
> > > > > > with
> > > > > > > slow hardware/NIC. What do you think? Most of the datapath
> > > > > > > flow
> > > > > > installations are
> > > > > > > dynamic and triggered only when there is an ingress traffic.
> > > > > > > Delay in flow insertion/deletion have unpredictable
> > > > > > consequences.
> > > > > >
> > > > > > This API is (currently) aimed at the control path only, and must
> > > > > > indeed be assumed to be slow. Creating millions of rules may take
> > > > > > quite long as it may involve syscalls and other time-consuming
> > > > > > synchronization things on the PMD side.
> > > > > >
> > > > > > So currently there is no plan to have rules added from the data
> > > > > > path with time constraints. I think it would be implemented
> > > > > > through a different set of functions anyway.
> > > > > >
> > > > > > I do not think adding time limits is practical, even specifying
> > > > > > in the API that creating a single flow rule must take less than
> > > > > > a maximum number of seconds in order to be effective is too much
> > > > > > of a constraint (applications that create all flows during init
> > > > > > may not care after
> > > > all).
> > > > > >
> > > > > > You should consider in any case that modifying flow rules will
> > > > > > always be slower than receiving packets, there is no way around
> > > > > > that. Applications have to live with it and provide a software
> > > > > > fallback for incoming packets while managing flow rules.
> > > > > >
> > > > > > Moreover, think about what happens when you hit the maximum
> > > > number
> > > > > > of flow rules and cannot create any more. Applications need to
> > > > > > implement some kind of fallback in their data path.
> > > > > >
> > > > > > Offloading flows in HW is also only useful if they live much
> > > > > > longer than the time taken to create and delete them. Perhaps
> > > > > > applications may choose to do so after detecting long lived
> > > > > > flows such as TCP sessions.
> > > > > >
> > > > > > You may have one separate control thread dedicated to manage
> > > > > > flows and keep your normal control thread unaffected by delays.
> > > > > > Several threads can even be dedicated, one per device.
> > > > > [Sugesh] I agree that the flow insertion cannot be as fast as the
> > > > > packet receiving rate.  From application point of view the problem
> > > > > will be when hardware flow insertion takes longer than software
> > > > > flow insertion. At least application has to know the cost of
> > > > > inserting/deleting a rule in hardware beforehand. Otherwise how
> > > > > application can choose the right flow candidate for hardware. My
> > > > > point
> > > > here is application is expecting a deterministic behavior from a
> > > > classifier while inserting and deleting rules.
> > > >
> > > > Understood, however it will be difficult to estimate, particularly
> > > > if a PMD must rearrange flow rules to make room for a new one due to
> > > > priority levels collision or some other HW-related reason. I mean,
> > > > spent time cannot be assumed to be constant, even PMDs cannot know
> > > > in advance because it also depends on the performance of the host CPU.
> > > >
> > > > Such applications may find it easier to measure elapsed time for the
> > > > rules they create, make statistics and extrapolate from this
> > > > information for future rules. I do not think the PMD can help much here.
> > > [Sugesh] From an application point of view this can be an issue.
> > > Even there is a security concern when we program a short lived flow.
> > > Lets consider the case,
> > >
> > > 1) Control plane programs the hardware with Queue termination flow.
> > > 2) Software dataplane programmed to treat the packets from the specific
> > queue accordingly.
> > > 3) Remove the flow from the hardware. (Lets consider this is a long wait
> > process..).
> > > Or even there is a chance that hardware take more time to report the
> > > status than removing it physically . Now the packets in the queue no longer
> > consider as matched/flow hit.
> > > . This is due to the software dataplane update is yet to happen.
> > > We must need a way to sync between software datapath and classifier
> > > APIs even though they are both programmed from a different control
> > thread.
> > >
> > > Are we saying these APIs are only meant for user defined static flows??
> > 
> > No, that is definitely not the intent. These are good points.
> > 
> > With the specified API, applications may have to adapt their logic and take
> > extra precautions in order to remain on the safe side at all times.
> > 
> > For your above example, the application cannot assume a rule is
> > added/deleted as long as the PMD has not completed the related operation,
> > which means keeping the SW rule/fallback in place in the meantime. Should
> > handle security concerns as long as after removing a rule, packets end up in a
> > default queue entirely processed by SW. Obviously this may worsen
> > response time.
> > 
> > The ID action can help with this. By knowing which rule a received packet is
> > associated with, processing can be temporarily offloaded by another thread
> > without much complexity.
> [Sugesh] Setting ID for every flow may not viable especially when the size of ID
> is small(just only 8 bits). I am not sure is this a valid case though.

Agreed, I'm not saying this solution works for all devices, particularly
those that do not support ID at all.

> How about a hardware flow flag in packet descriptor that set when the
> packets hits any hardware rule. This way software doesn’t worry /blocked by a
> hardware rule . Even though there is an additional overhead of validating this flag,
> software datapath can identify the hardware processed packets easily.
> This way the packets traverses the software fallback path until the rule configuration is
> complete. This flag avoids setting ID action for every hardware flow that are configuring.

That makes sense. I see it as a sort of single bit ID but it could be
implemented through a different action for less capable devices. PMDs that
support 32 bit IDs could reuse the same code for both features.

I understand you'd prefer having this feature always present, however we
already know that not all PMDs/devices support it, and like everything else
this is a kind of offload that needs to be explicitly requested by the
application as it may not be needed.

If we go with the separate action, then perhaps it would make sense to
rename "ID" to "MARK" to make things clearer:

 RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */

 RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */

I guess the result of the FLAG action would be reported through ol_flags.

Thoughts?
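
To illustrate how an application could consume both actions on the RX side,
here is a sketch. Everything in it is hypothetical: the structure stands in
for whatever the mbuf ends up exposing, and the two flag bits are made up
for this example, they are not existing ol_flags values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up flag bits for this example only. */
#define APP_RX_FLOW_FLAG (1u << 0) /* packet hit a rule with FLAG action */
#define APP_RX_FLOW_MARK (1u << 1) /* packet carries a 32 bit MARK value */

/* Hypothetical RX metadata, standing in for the related mbuf fields. */
struct app_rx_meta {
    uint32_t ol_flags;
    uint32_t mark; /* only meaningful when APP_RX_FLOW_MARK is set */
};

enum app_rx_path {
    APP_PATH_SW_FALLBACK, /* no HW rule matched, full SW classification */
    APP_PATH_HW_FLAGGED,  /* some HW rule matched, rule is not identified */
    APP_PATH_HW_MARKED,   /* HW rule matched and is identified by *mark */
};

/* Decide how a received packet should be processed. MARK takes
 * precedence over FLAG since it carries strictly more information. */
static enum app_rx_path app_classify(const struct app_rx_meta *m,
                                     uint32_t *mark)
{
    assert(m != NULL && mark != NULL);
    if (m->ol_flags & APP_RX_FLOW_MARK) {
        *mark = m->mark;
        return APP_PATH_HW_MARKED;
    }
    if (m->ol_flags & APP_RX_FLOW_FLAG)
        return APP_PATH_HW_FLAGGED;
    return APP_PATH_SW_FALLBACK;
}
```

A PMD supporting 32 bit values could implement FLAG on top of the same
mechanism, while a less capable device would only ever report the single
bit.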

> > I think applications have to implement SW fallbacks all the time, as even
> > some sort of guarantee on the flow rule processing time may not be enough
> > to avoid misdirected packets and related security issues.
> [Sugesh] Software fallback will be there always. However I am little bit confused on
> the way software going to identify the packets that are already hardware processed . I feel we need some
> notification in the packet itself, when a hardware rule hits. ID/flag/any other options?

Yeah I think so too, as long as it is optional because we cannot assume all
PMDs will support it.

> > Let's wait for applications to start using this API and then consider an extra
> > set of asynchronous / real-time functions when the need arises. It should not
> > impact the way rules are specified
> [Sugesh] Sure. I think the rule definition may not impact with this.

Thanks for your comments.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-07 10:26   ` Adrien Mazarguil
@ 2016-07-19  8:11     ` Lu, Wenzhuo
  2016-07-19 13:12       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-07-19  8:11 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Adrien,
Thanks for your clarification. Most of my questions are answered, but a few points may still need discussion; comments below.


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Thursday, July 7, 2016 6:27 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody;
> Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: Re: [RFC] Generic flow director/filtering/classification API
> 
> Hi Lu Wenzhuo,
> 
> Thanks for your feedback, I'm replying below as well.
> 
> On Thu, Jul 07, 2016 at 07:14:18AM +0000, Lu, Wenzhuo wrote:
> > Hi Adrien,
> > I have some questions, please see inline, thanks.
> >
> > > -----Original Message-----
> > > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > > Sent: Wednesday, July 6, 2016 2:17 AM
> > > To: dev@dpdk.org
> > > Cc: Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody; Ajit
> > > Khaparde; Rahul Lakkireddy; Lu, Wenzhuo; Jan Medala; John Daley;
> > > Chen, Jing D; Ananyev, Konstantin; Matej Vido; Alejandro Lucero;
> > > Sony Chacko; Jerin Jacob; De Lara Guarch, Pablo; Olga Shern
> > > Subject: [RFC] Generic flow director/filtering/classification API
> > >
> > >
> > > Requirements for a new API:
> > >
> > > - Flexible and extensible without causing API/ABI problems for existing
> > >   applications.
> > > - Should be unambiguous and easy to use.
> > > - Support existing filtering features and actions listed in `Filter types`_.
> > > - Support packet alteration.
> > > - In case of overlapping filters, their priority should be well documented.
> > Does that mean we don't guarantee the consistent of priority? The priority can
> be different on different NICs. So the behavior of the actions  can be different.
> Right?
> 
> No, the intent is precisely to define what happens in order to get a consistent
> result across different devices, and document cases with undefined behavior.
> There must be no room left for interpretation.
> 
> For example, the API must describe what happens when two overlapping filters
> (e.g. one matching an Ethernet header, another one matching an IP header)
> match a given packet at a given priority level.
> 
> It is documented in section 4.1.1 (priorities) as "undefined behavior".
> Applications remain free to do it and deal with consequences, at least they know
> they cannot expect a consistent outcome, unless they use different priority
> levels for both rules, see also 4.4.5 (flow rules priority).
> 
> > Seems the users still need to aware the some details of the HW? Do we need
> to add the negotiation for the priority?
> 
> Priorities as defined in this document may not be directly mappable to HW
> capabilities (e.g. HW does not support enough priorities, or that some corner
> case make them not work as described), in which case the PMD may choose to
> simulate priorities (again 4.4.5), as long as the end result follows the
> specification.
> 
> So users must not be aware of some HW details, the PMD does and must
> perform the needed workarounds to suit their expectations. Users may only be
> impacted by errors while attempting to create rules that are either unsupported
> or would cause them (or existing rules) to diverge from the spec.
The problem is that sometimes the priority of the filters is fixed by the HW implementation. For example, on ixgbe, n-tuple has a higher priority than flow director. What is the right behavior for the PMD if the APP wants to create a flow director rule with a priority higher than, or even equal to, that of an existing n-tuple rule? Should the PMD return failure?
If so, do we need more failure reasons? According to this RFC, I think we need to return "EEXIST: collision with an existing rule.", but that is not very clear: the APP doesn't know the problem is the priority, so a more detailed reason may be helpful.


> > > Behavior
> > > --------
> > >
> > > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> > >   returned).
> > >
> > > - There is no provision for reentrancy/multi-thread safety, although nothing
> > >   should prevent different devices from being configured at the same
> > >   time. PMDs may protect their control path functions accordingly.
> > >
> > > - Stopping the data path (TX/RX) should not be necessary when managing
> flow
> > >   rules. If this cannot be achieved naturally or with workarounds (such as
> > >   temporarily replacing the burst function pointers), an appropriate error
> > >   code must be returned (``EBUSY``).
> > PMD cannot stop the data path without adding lock. So I think if some rules
> cannot be applied without stopping rx/tx, PMD has to return fail.
> > Or let the APP to stop the data path.
> 
> Agreed, that is the intent. If the PMD cannot touch flow rules for some reason
> even after trying really hard, then it just returns EBUSY.
> 
> Perhaps we should write down that applications may get a different outcome
> after stopping the data path if they get EBUSY?
Agreed, it's better to describe more of what the APP should do. BTW, I checked the behavior of ixgbe/igb, and I think we can add/delete filters at runtime. Hopefully we'll not hit too many EBUSY problems on other NICs :)

> 
> > > - PMDs, not applications, are responsible for maintaining flow rules
> > >   configuration when stopping and restarting a port or performing other
> > >   actions which may affect them. They can only be destroyed explicitly.
> > Don’t understand " They can only be destroyed explicitly."
> 
> This part says that as long as an application has not called
> rte_flow_destroy() on a flow rule, it never disappears, whatever happens to the
> port (stopped, restarted). The application is not responsible for re-creating rules
> after that.
> 
> Note that according to the specification, this may translate to not being able to
> stop a port as long as a flow rule is present, depending on how nice the PMD
> intends to be with applications. Implementation can be done in small steps with
> minimal amount of code on the PMD side.
Does it mean the PMD should store and maintain all the rules? Why not let rte do that? I think if the PMD maintains all the rules, every kind of NIC needs its own copy of the code for the rules. But if rte does that, only one copy of the code needs to be maintained, right?
When the port is stopped and restarted, rte can reconfigure the rules. Is the concern that the PMD may adjust the sequence of the rules according to the priority, so every NIC has a different list of rules? But the PMD can adjust them again when rte reconfigures the rules.

> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-19  8:11     ` Lu, Wenzhuo
@ 2016-07-19 13:12       ` Adrien Mazarguil
  2016-07-20  2:16         ` Lu, Wenzhuo
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-19 13:12 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

On Tue, Jul 19, 2016 at 08:11:48AM +0000, Lu, Wenzhuo wrote:
> Hi Adrien,
> Thanks for your clarification.  Most of my questions are clear, but still something may need to be discussed, comment below.

Hi Wenzhuo,

Please see below.

[...]
> > > > Requirements for a new API:
> > > >
> > > > - Flexible and extensible without causing API/ABI problems for existing
> > > >   applications.
> > > > - Should be unambiguous and easy to use.
> > > > - Support existing filtering features and actions listed in `Filter types`_.
> > > > - Support packet alteration.
> > > > - In case of overlapping filters, their priority should be well documented.
> > > Does that mean we don't guarantee the consistent of priority? The priority can
> > be different on different NICs. So the behavior of the actions  can be different.
> > Right?
> > 
> > No, the intent is precisely to define what happens in order to get a consistent
> > result across different devices, and document cases with undefined behavior.
> > There must be no room left for interpretation.
> > 
> > For example, the API must describe what happens when two overlapping filters
> > (e.g. one matching an Ethernet header, another one matching an IP header)
> > match a given packet at a given priority level.
> > 
> > It is documented in section 4.1.1 (priorities) as "undefined behavior".
> > Applications remain free to do it and deal with consequences, at least they know
> > they cannot expect a consistent outcome, unless they use different priority
> > levels for both rules, see also 4.4.5 (flow rules priority).
> > 
> > > Seems the users still need to aware the some details of the HW? Do we need
> > to add the negotiation for the priority?
> > 
> > Priorities as defined in this document may not be directly mappable to HW
> > capabilities (e.g. HW does not support enough priorities, or that some corner
> > case make them not work as described), in which case the PMD may choose to
> > simulate priorities (again 4.4.5), as long as the end result follows the
> > specification.
> > 
> > So users must not be aware of some HW details, the PMD does and must
> > perform the needed workarounds to suit their expectations. Users may only be
> > impacted by errors while attempting to create rules that are either unsupported
> > or would cause them (or existing rules) to diverge from the spec.
> The problem is sometime the priority of the filters is fixed according to
> HW's implementation. For example, on ixgbe, n-tuple has a higher
> priority than flow director.

As a side note, I did not know that N-tuple had a higher priority than flow
director on ixgbe; priorities among filter types do not seem to be
documented at all in DPDK. This is one of the reasons I think we need a
generic API to handle flow configuration.

So, today an application cannot combine N-tuple and FDIR flow rules and get
a reliable outcome, unless it is designed for specific devices with a known
behavior.

> What's the right behavior of PMD if APP want to create a flow director rule which has a higher or even equal priority than an existing n-tuple rule? Should PMD return fail? 

First, remember that applications only deal with the generic API; PMDs are
responsible for choosing the most appropriate HW implementation to use
according to the requested flow rules (FDIR, N-tuple or anything else).

For the specific case of FDIR vs N-tuple, if the underlying HW supports both
I do not see why the PMD would create an N-tuple rule. Doesn't FDIR support
everything N-tuple can do and much more?

Assuming such a thing happened anyway, that the PMD had to create a rule
using a high priority filter type and that the application requests the
creation of a rule that can only be done using a lower priority filter type,
but also requested a higher priority for that rule, then yes, it should
obviously fail.

That is, unless the PMD can perform some kind of workaround to have both.
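
To illustrate the failure case, here is a minimal sketch of the check a PMD might perform before accepting such a rule. The names are hypothetical, and two things are assumed for the sake of the example: that API priority 0 is the highest, and that rules using the same HW filter type can be ordered freely by the PMD.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical HW filter types, in fixed HW precedence order (lower wins). */
enum hw_filter {
    HW_NTUPLE = 0,
    HW_FDIR = 1,
};

struct hw_rule {
    enum hw_filter type;   /* filter type the PMD has to use for this rule */
    unsigned int priority; /* API-level priority, 0 is highest (assumption) */
};

/*
 * Check whether a new rule can coexist with an existing overlapping one.
 * Return 0 when HW precedence agrees with the requested priorities and
 * -EEXIST when the existing rule prevents the new one from behaving as
 * requested.
 */
static int
check_priority(const struct hw_rule *existing, const struct hw_rule *new_rule)
{
    assert(existing != NULL && new_rule != NULL);

    /* Assumption: same-type rules can be reordered by the PMD. */
    if (existing->type == new_rule->type)
        return 0;

    int hw_new_wins = new_rule->type < existing->type;
    int api_new_wins = new_rule->priority < existing->priority;
    int api_old_wins = new_rule->priority > existing->priority;

    /* Equal priorities fall under "undefined behavior" and are accepted. */
    if ((hw_new_wins && api_old_wins) || (!hw_new_wins && api_new_wins))
        return -EEXIST;
    return 0;
}
```

A PMD with a workaround at hand (for instance moving the existing rule to another filter type) would attempt it before returning the error.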

> If so, do we need more fail reasons? According to this RFC, I think we need return " EEXIST: collision with an existing rule. ", but it's not very clear, APP doesn't know the problem is priority, maybe more detailed reason is helpful.

Possibly, I've defined a basic set of errors, there are quite a number of
errno values to choose from. However I think we should not define too many
values. In my opinion the basic set covers every possible failure:

- EINVAL: invalid format, rule is broken or cannot be understood by the PMD
  anyhow.

- ENOTSUP: pattern/actions look fine but something in the requested rule is
  not supported and thus cannot be applied.

- EEXIST: pattern/actions are fine and could have been applied if only some
  other rule did not prevent the PMD from doing it (I see it as the closest
  thing to "ETOOBAD" which unfortunately does not exist).

- ENOMEM: like EEXIST, except it is due to the lack of resources, not because
  of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
  settled on ENOMEM as it is well known. Still open to debate.

Errno values are only useful to get a rough idea of the reason, and another
mechanism is needed to pinpoint the exact problem for debugging/reporting
purposes, something like:

 enum rte_flow_error_type {
     RTE_FLOW_ERROR_TYPE_NONE,
     RTE_FLOW_ERROR_TYPE_UNKNOWN,
     RTE_FLOW_ERROR_TYPE_PRIORITY,
     RTE_FLOW_ERROR_TYPE_PATTERN,
     RTE_FLOW_ERROR_TYPE_ACTION,
 };

 struct rte_flow_error {
     enum rte_flow_error_type type;
     void *offset; /* Points to the exact pattern item or action. */
     const char *message;
 };

Then either provide an optional struct rte_flow_error pointer to
rte_flow_validate(), or a separate function (rte_flow_analyze()?), since
processing this may be quite expensive and applications may not care about
the exact reason.
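
As a rough sketch of the first option (an optional error pointer), the following shows how a validation function could fill the structure only when the caller asks for details. The enum and struct are the ones proposed above; struct sketch_rule, SKETCH_MAX_PRIORITY and the sketch_*() functions are hypothetical placeholders, not part of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNKNOWN,
    RTE_FLOW_ERROR_TYPE_PRIORITY,
    RTE_FLOW_ERROR_TYPE_PATTERN,
    RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    void *offset; /* Points to the exact pattern item or action. */
    const char *message;
};

/* Hypothetical, heavily simplified rule: a priority and an action code. */
struct sketch_rule {
    unsigned int priority;
    int action; /* only 0 and 1 are "supported" in this sketch */
};

#define SKETCH_MAX_PRIORITY 7 /* made-up device limit */

/* Record details only if the caller provided somewhere to put them. */
static int
sketch_fail(struct rte_flow_error *error, enum rte_flow_error_type type,
            void *offset, const char *message, int err)
{
    if (error != NULL) {
        error->type = type;
        error->offset = offset;
        error->message = message;
    }
    return -err;
}

/* Return 0 if the rule could be applied, a negative errno value otherwise. */
static int
sketch_validate(struct sketch_rule *rule, struct rte_flow_error *error)
{
    assert(rule != NULL);
    if (rule->priority > SKETCH_MAX_PRIORITY)
        return sketch_fail(error, RTE_FLOW_ERROR_TYPE_PRIORITY,
                           &rule->priority, "priority level not supported",
                           ENOTSUP);
    if (rule->action < 0 || rule->action > 1)
        return sketch_fail(error, RTE_FLOW_ERROR_TYPE_ACTION,
                           &rule->action, "unsupported action", ENOTSUP);
    sketch_fail(error, RTE_FLOW_ERROR_TYPE_NONE, NULL, NULL, 0);
    return 0;
}
```

Applications that only care about the errno value pass NULL and pay nothing extra; a CLI such as testpmd could pass a structure and print the message.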

What do you suggest?

> > > > Behavior
> > > > --------
> > > >
> > > > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> > > >   returned).
> > > >
> > > > - There is no provision for reentrancy/multi-thread safety, although nothing
> > > >   should prevent different devices from being configured at the same
> > > >   time. PMDs may protect their control path functions accordingly.
> > > >
> > > > - Stopping the data path (TX/RX) should not be necessary when managing
> > flow
> > > >   rules. If this cannot be achieved naturally or with workarounds (such as
> > > >   temporarily replacing the burst function pointers), an appropriate error
> > > >   code must be returned (``EBUSY``).
> > > PMD cannot stop the data path without adding lock. So I think if some rules
> > cannot be applied without stopping rx/tx, PMD has to return fail.
> > > Or let the APP to stop the data path.
> > 
> > Agreed, that is the intent. If the PMD cannot touch flow rules for some reason
> > even after trying really hard, then it just returns EBUSY.
> > 
> > Perhaps we should write down that applications may get a different outcome
> > after stopping the data path if they get EBUSY?
> Agree, it's better to describe more about the APP. BTW, I checked the behavior of ixgbe/igb, I think we can add/delete filters during runtime. Hopefully we'll not hit too many EBUSY problems on other NICs :)

OK, I will add it.
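
For reference, the application-side behavior discussed here could look like the sketch below: try to create the rule, and only if the PMD answers EBUSY, stop the data path and try again. struct port_ops and the fake_*() device are hypothetical stand-ins used to keep the example self-contained; a real application would call the ethdev and flow API functions instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical port operations; each returns 0 or a negative errno value. */
struct port_ops {
    int (*flow_create)(void *port);
    int (*stop)(void *port);
    int (*start)(void *port);
};

/* Create a rule, stopping the data path only if the PMD reports EBUSY. */
static int
create_rule_with_retry(const struct port_ops *ops, void *port)
{
    assert(ops != NULL);

    int ret = ops->flow_create(port);
    if (ret != -EBUSY)
        return ret;

    /* The PMD could not touch rules while traffic is flowing. */
    ret = ops->stop(port);
    if (ret < 0)
        return ret;
    ret = ops->flow_create(port);

    /* Restart the port whatever the outcome of the second attempt. */
    int start_ret = ops->start(port);
    return ret < 0 ? ret : start_ret;
}

/* Fake device that refuses rule changes while it is running. */
struct fake_port {
    int running;
    int rules;
};

static int
fake_flow_create(void *port)
{
    struct fake_port *fp = port;

    if (fp->running)
        return -EBUSY;
    fp->rules++;
    return 0;
}

static int
fake_stop(void *port)
{
    ((struct fake_port *)port)->running = 0;
    return 0;
}

static int
fake_start(void *port)
{
    ((struct fake_port *)port)->running = 1;
    return 0;
}
```

On devices like ixgbe/igb, where filters can be changed at runtime, the first attempt would succeed and the port would never be stopped.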

> > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > >   configuration when stopping and restarting a port or performing other
> > > >   actions which may affect them. They can only be destroyed explicitly.
> > > Don’t understand " They can only be destroyed explicitly."
> > 
> > This part says that as long as an application has not called
> > rte_flow_destroy() on a flow rule, it never disappears, whatever happens to the
> > port (stopped, restarted). The application is not responsible for re-creating rules
> > after that.
> > 
> > Note that according to the specification, this may translate to not being able to
> > stop a port as long as a flow rule is present, depending on how nice the PMD
> > intends to be with applications. Implementation can be done in small steps with
> > minimal amount of code on the PMD side.
> Does it mean PMD should store and maintain all the rules? Why not let rte do that? I think if PMD maintain all the rules, it means every kind of NIC should have a copy of code for the rules. But if rte do that, only one copy of code need to be maintained, right?

I've considered having rules stored in a common format understood at the RTE
level and not specific to each PMD and decided that the opaque rte_flow
pointer was a better choice for the following reasons: 

- Even though flow rules management is done in the control path, processing
  must be as fast as possible. Letting PMDs store flow rules using their own
  internal representation gives them the chance to achieve better
  performance.

- An opaque context managed by PMDs would probably have to be stored
  somewhere as well anyway.

- PMDs may not need to allocate/store anything at all if they exclusively
  rely on HW state for everything. In my opinion, the generic API has enough
  constraints for this to work and maintain consistency between flow
  rules. Note this is currently how most PMDs implement FDIR and other
  filter types.

- RTE can (and will) provide helpers to avoid most of the code redundancy,
  PMDs are free to use them or manage everything by themselves.

- Given that the opaque rte_flow pointer associated with a flow rule is to
  be stored by the application, PMDs do not even have to keep references to
  them.

- The flow rules format described in this specification (pattern / actions)
  will be used by applications directly, and they will be free to arrange
  rules in lists, trees or in any other way if they need to keep flow
  specifications around for further processing.

> When the port is stopped and restarted, rte can reconfigure the rules. Is the concern that PMD may adjust the sequence of the rules according to the priority, so every NIC has a different list of rules? But PMD can adjust them again when rte reconfiguring the rules.

What about PMDs able to stop and restart ports without destroying their own
flow rules? If we assume flow rules must be destroyed when stopping a port,
these PMDs are needlessly penalized with slower stop/start cycles. Think
about it assuming thousands of flow rules.

Thus from an application point of view, whatever happens when stopping and
restarting a port should not matter. If a flow rule was present before, it
must still be present afterwards. If the PMD had to destroy flow rules and
re-create them, it does not actually matter if they differ slightly at the
HW level, as long as:

- Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
  and refer to the same rules.

- The overall behavior of all rules is the same.

The list of rules you think of (patterns / actions) is maintained by
applications (not RTE), and only if they need them. RTE would needlessly
duplicate this.
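
Here is a small sketch of what this means in practice, using a fake PMD that loses its HW state on stop and silently re-programs rules on start. The contents of struct rte_flow and all fake_*() names are hypothetical; the only point is that the pointer held by the application stays valid and refers to the same rule across a stop/start cycle.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RULES 4

/* Opaque to applications; the fields below are a made-up PMD layout. */
struct rte_flow {
    int in_use;         /* rule exists from the application's viewpoint */
    int programmed;     /* rule is currently present in (simulated) HW */
    unsigned int queue; /* simplified action: target queue */
};

struct fake_port {
    int started;
    struct rte_flow rules[MAX_RULES];
};

/* Return an opaque handle, or NULL when out of resources. */
static struct rte_flow *
fake_flow_create(struct fake_port *port, unsigned int queue)
{
    assert(port != NULL);
    for (size_t i = 0; i < MAX_RULES; i++) {
        struct rte_flow *flow = &port->rules[i];

        if (!flow->in_use) {
            flow->in_use = 1;
            flow->queue = queue;
            flow->programmed = port->started;
            return flow;
        }
    }
    return NULL;
}

/* The only way for a rule to disappear. */
static void
fake_flow_destroy(struct fake_port *port, struct rte_flow *flow)
{
    (void)port;
    flow->in_use = 0;
    flow->programmed = 0;
}

/* Stopping the port loses HW state but not the rules themselves. */
static void
fake_port_stop(struct fake_port *port)
{
    port->started = 0;
    for (size_t i = 0; i < MAX_RULES; i++)
        port->rules[i].programmed = 0;
}

/* Starting the port re-programs every rule that was not destroyed. */
static void
fake_port_start(struct fake_port *port)
{
    port->started = 1;
    for (size_t i = 0; i < MAX_RULES; i++)
        if (port->rules[i].in_use)
            port->rules[i].programmed = 1;
}
```

A PMD whose hardware keeps rules across stop/start would skip the re-programming loop entirely, which is exactly why this is left to PMDs.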

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-19 13:12       ` Adrien Mazarguil
@ 2016-07-20  2:16         ` Lu, Wenzhuo
  2016-07-20 10:41           ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-07-20  2:16 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Adrien,


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Tuesday, July 19, 2016 9:12 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody;
> Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: Re: [RFC] Generic flow director/filtering/classification API
> 
> On Tue, Jul 19, 2016 at 08:11:48AM +0000, Lu, Wenzhuo wrote:
> > Hi Adrien,
> > Thanks for your clarification.  Most of my questions are clear, but still
> something may need to be discussed, comment below.
> 
> Hi Wenzhuo,
> 
> Please see below.
> 
> [...]
> > > > > Requirements for a new API:
> > > > >
> > > > > - Flexible and extensible without causing API/ABI problems for existing
> > > > >   applications.
> > > > > - Should be unambiguous and easy to use.
> > > > > - Support existing filtering features and actions listed in `Filter types`_.
> > > > > - Support packet alteration.
> > > > > - In case of overlapping filters, their priority should be well documented.
> > > > Does that mean we don't guarantee the consistent of priority? The
> > > > priority can
> > > be different on different NICs. So the behavior of the actions  can be
> different.
> > > Right?
> > >
> > > No, the intent is precisely to define what happens in order to get a
> > > consistent result across different devices, and document cases with
> undefined behavior.
> > > There must be no room left for interpretation.
> > >
> > > For example, the API must describe what happens when two overlapping
> > > filters (e.g. one matching an Ethernet header, another one matching
> > > an IP header) match a given packet at a given priority level.
> > >
> > > It is documented in section 4.1.1 (priorities) as "undefined behavior".
> > > Applications remain free to do it and deal with consequences, at
> > > least they know they cannot expect a consistent outcome, unless they
> > > use different priority levels for both rules, see also 4.4.5 (flow rules priority).
> > >
> > > > Seems the users still need to aware the some details of the HW? Do
> > > > we need
> > > to add the negotiation for the priority?
> > >
> > > Priorities as defined in this document may not be directly mappable
> > > to HW capabilities (e.g. HW does not support enough priorities, or
> > > that some corner case make them not work as described), in which
> > > case the PMD may choose to simulate priorities (again 4.4.5), as
> > > long as the end result follows the specification.
> > >
> > > So users must not be aware of some HW details, the PMD does and must
> > > perform the needed workarounds to suit their expectations. Users may
> > > only be impacted by errors while attempting to create rules that are
> > > either unsupported or would cause them (or existing rules) to diverge from
> the spec.
> > The problem is sometime the priority of the filters is fixed according
> > to HW's implementation. For example, on ixgbe, n-tuple has a higher
> > priority than flow director.
> 
> As a side note I did not know that N-tuple had a higher priority than flow
> director on ixgbe, priorities among filter types do not seem to be documented at
> all in DPDK. This is one of the reasons I think we need a generic API to handle
> flow configuration.
Totally agree with you. We haven't documented the info well enough. And even if we did, users would still have to study the details of every NIC, which makes the filters very hard to use. I believe a generic API is very helpful here :)

> 
> 
> So, today an application cannot combine N-tuple and FDIR flow rules and get a
> reliable outcome, unless it is designed for specific devices with a known
> behavior.
> 
> > What's the right behavior of PMD if APP want to create a flow director rule
> which has a higher or even equal priority than an existing n-tuple rule? Should
> PMD return fail?
> 
> First remember applications only deal with the generic API, PMDs are
> responsible for choosing the most appropriate HW implementation to use
> according to the requested flow rules (FDIR, N-tuple or anything else).
> 
> For the specific case of FDIR vs N-tuple, if the underlying HW supports both I do
> not see why the PMD would create a N-tuple rule. Doesn't FDIR support
> everything N-tuple can do and much more?
Talking about the filters, fdir can cover n-tuple. I think that's why i40e only supports fdir but not n-tuple. But n-tuple has its own advantage. As we know, at least on Intel NICs, fdir only supports a per-device mask, while n-tuple supports a per-rule mask.
As every pattern has both a spec and a mask, we cannot guarantee that the masks are the same. I think ixgbe will try to use n-tuple first if it can, because then we can support all the rules even if their masks differ.
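
A rough sketch of that selection logic is below. It is not the ixgbe implementation; the names are hypothetical, the mask is reduced to a single 32-bit field, and it only assumes what is said above: n-tuple takes a per-rule mask but has limited slots, fdir shares one mask for the whole device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum filter_kind {
    FILTER_KIND_NONE,   /* rule cannot be offloaded */
    FILTER_KIND_NTUPLE, /* per-rule mask */
    FILTER_KIND_FDIR,   /* per-device mask */
};

struct dev_state {
    unsigned int ntuple_free; /* remaining n-tuple slots */
    int fdir_mask_set;        /* a device-wide fdir mask is programmed */
    uint32_t fdir_mask;       /* that mask, valid if fdir_mask_set */
};

/* Pick a filter type for a rule with the given mask and reserve it. */
static enum filter_kind
pick_filter(struct dev_state *dev, uint32_t rule_mask)
{
    assert(dev != NULL);

    /* Prefer n-tuple: any mask is acceptable while slots remain. */
    if (dev->ntuple_free > 0) {
        dev->ntuple_free--;
        return FILTER_KIND_NTUPLE;
    }
    /* The first fdir rule decides the device-wide mask. */
    if (!dev->fdir_mask_set) {
        dev->fdir_mask_set = 1;
        dev->fdir_mask = rule_mask;
        return FILTER_KIND_FDIR;
    }
    /* Later fdir rules must use exactly the same mask. */
    if (dev->fdir_mask == rule_mask)
        return FILTER_KIND_FDIR;
    return FILTER_KIND_NONE;
}
```

With the generic API, the FILTER_KIND_NONE case is where the PMD would have to report an error to the application.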

> 
> Assuming such a thing happened anyway, that the PMD had to create a rule
> using a high priority filter type and that the application requests the creation of a
> rule that can only be done using a lower priority filter type, but also requested a
> higher priority for that rule, then yes, it should obviously fail.
> 
> That is, unless the PMD can perform some kind of workaround to have both.
> 
> > If so, do we need more fail reasons? According to this RFC, I think we need
> return " EEXIST: collision with an existing rule. ", but it's not very clear, APP
> doesn't know the problem is priority, maybe more detailed reason is helpful.
> 
> Possibly, I've defined a basic set of errors, there are quite a number of errno
> values to choose from. However I think we should not define too many values.
> In my opinion the basic set covers every possible failure:
> 
> - EINVAL: invalid format, rule is broken or cannot be understood by the PMD
>   anyhow.
> 
> - ENOTSUP: pattern/actions look fine but something in the requested rule is
>   not supported and thus cannot be applied.
> 
> - EEXIST: pattern/actions are fine and could have been applied if only some
>   other rule did not prevent the PMD to do it (I see it as the closest thing
>   to "ETOOBAD" which unfortunately does not exist).
> 
> - ENOMEM: like EEXIST, except it is due to the lack of resources not because
>   of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
>   settled on ENOMEM as it is well known. Still open to debate.
> 
> Errno values are only useful to get a rough idea of the reason, and another
> mechanism is needed to pinpoint the exact problem for debugging/reporting
> purposes, something like:
> 
>  enum rte_flow_error_type {
>      RTE_FLOW_ERROR_TYPE_NONE,
>      RTE_FLOW_ERROR_TYPE_UNKNOWN,
>      RTE_FLOW_ERROR_TYPE_PRIORITY,
>      RTE_FLOW_ERROR_TYPE_PATTERN,
>      RTE_FLOW_ERROR_TYPE_ACTION,
>  };
> 
>  struct rte_flow_error {
>      enum rte_flow_error_type type;
>      void *offset; /* Points to the exact pattern item or action. */
>      const char *message;
>  };
When we are using a CLI and it fails, normally it will let us know which parameter is not appropriate. So, I think it’s a good idea to have this error structure :)

> 
> Then either provide an optional struct rte_flow_error pointer to
> rte_flow_validate(), or a separate function (rte_flow_analyze()?), since
> processing this may be quite expensive and applications may not care about the
> exact reason.
Agreed, the processing may be too expensive. Maybe we can say that returning error details is optional. It's also a good question what the APP should do if creating the rule fails. I believe normally it will choose to handle the rule by itself. But I think more feedback would not hurt: even if the APP wants to adjust its rules, it cannot do so without enough info.

> 
> What do you suggest?
> 
> > > > > Behavior
> > > > > --------
> > > > >
> > > > > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> > > > >   returned).
> > > > >
> > > > > - There is no provision for reentrancy/multi-thread safety, although
> nothing
> > > > >   should prevent different devices from being configured at the same
> > > > >   time. PMDs may protect their control path functions accordingly.
> > > > >
> > > > > - Stopping the data path (TX/RX) should not be necessary when
> > > > > managing
> > > flow
> > > > >   rules. If this cannot be achieved naturally or with workarounds (such as
> > > > >   temporarily replacing the burst function pointers), an appropriate error
> > > > >   code must be returned (``EBUSY``).
> > > > PMD cannot stop the data path without adding lock. So I think if
> > > > some rules
> > > cannot be applied without stopping rx/tx, PMD has to return fail.
> > > > Or let the APP to stop the data path.
> > >
> > > Agreed, that is the intent. If the PMD cannot touch flow rules for
> > > some reason even after trying really hard, then it just returns EBUSY.
> > >
> > > Perhaps we should write down that applications may get a different
> > > outcome after stopping the data path if they get EBUSY?
> > Agree, it's better to describe more about the APP. BTW, I checked the
> > behavior of ixgbe/igb, I think we can add/delete filters during
> > runtime. Hopefully we'll not hit too many EBUSY problems on other NICs
> > :)
> 
> OK, I will add it.
> 
> > > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > > >   configuration when stopping and restarting a port or performing other
> > > > >   actions which may affect them. They can only be destroyed explicitly.
> > > > Don’t understand " They can only be destroyed explicitly."
> > >
> > > This part says that as long as an application has not called
> > > rte_flow_destroy() on a flow rule, it never disappears, whatever
> > > happens to the port (stopped, restarted). The application is not
> > > responsible for re-creating rules after that.
> > >
> > > Note that according to the specification, this may translate to not
> > > being able to stop a port as long as a flow rule is present,
> > > depending on how nice the PMD intends to be with applications.
> > > Implementation can be done in small steps with minimal amount of code on
> the PMD side.
> > Does it mean PMD should store and maintain all the rules? Why not let rte do
> that? I think if PMD maintain all the rules, it means every kind of NIC should have
> a copy of code for the rules. But if rte do that, only one copy of code need to be
> maintained, right?
> 
> I've considered having rules stored in a common format understood at the RTE
> level and not specific to each PMD and decided that the opaque rte_flow pointer
> was a better choice for the following reasons:
> 
> - Even though flow rules management is done in the control path, processing
>   must be as fast as possible. Letting PMDs store flow rules using their own
>   internal representation gives them the chance to achieve better
>   performance.
I don't quite understand. I think we're talking about maintaining the rules in SW. I don't think anything there needs to be optimized for specific NICs. If we need to optimize the code, I think we need to consider the CPU, the OS ... and other common means. Am I wrong?

> 
> - An opaque context managed by PMDs would probably have to be stored
>   somewhere as well anyway.
> 
> - PMDs may not need to allocate/store anything at all if they exclusively
>   rely on HW state for everything. In my opinion, the generic API has enough
>   constraints for this to work and maintain consistency between flow
>   rules. Note this is currently how most PMDs implement FDIR and other
>   filter types.
Yes, the rules are stored by the HW. But when the device is stopped and started, the rules in HW are lost, so we have to store the rules in SW and re-program them when restarting the device.
In the existing code, we do store the filters in SW, at least on Intel NICs. But I think we cannot reuse that storage: considering the priority and which category of filter should be chosen, I think we need a whole new table for the generic API. I think that's what is designed now, right?

> 
> - RTE can (and will) provide helpers to avoid most of the code redundancy,
>   PMDs are free to use them or manage everything by themselves.
> 
> - Given that the opaque rte_flow pointer associated with a flow rule is to
>   be stored by the application, PMDs do not even have to keep references to
>   them.
I don't understand this. Could you give more details?

> 
> - The flow rules format described in this specification (pattern / actions)
>   will be used by applications directly, and will be free to arrange them in
>   lists, trees or in any other way if they need to keep flow specifications
>   around for further processing.
Who will create the lists, trees or anything else? According to the previous discussion, I think the APP will program the rules one by one. So if the APP organizes the rules into lists, trees..., the PMD doesn't know about it.
And you said "Given that the opaque rte_flow pointer associated with a flow rule is to be stored by the application". I'm lost here.

> 
> > When the port is stopped and restarted, rte can reconfigure the rules. Is the
> concern that PMD may adjust the sequence of the rules according to the priority,
> so every NIC has a different list of rules? But PMD can adjust them again when
> rte reconfiguring the rules.
> 
> What about PMDs able to stop and restart ports without destroying their own
> flow rules? If we assume flow rules must be destroyed when stopping a port,
> these PMDs are needlessly penalized with slower stop/start cycles. Think about
> it assuming thousands of flow rules.
I believe the rules maintained by SW should not be destroyed, because they are used to re-program the device when it starts again.

> 
> Thus from an application point of view, whatever happens when stopping and
> restarting a port should not matter. If a flow rule was present before, it must
> still be present afterwards. If the PMD had to destroy flow rules and re-create
> them, it does not actually matter if they differ slightly at the HW level, as long as:
> 
> - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
>   and refer to the same rules.
> 
> - The overall behavior of all rules is the same.
> 
> The list of rules you think of (patterns / actions) is maintained by applications
> (not RTE), and only if they need them. RTE would needlessly duplicate this.
As said before, I need more details to understand this. Maybe an example would be better :)

> 
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-20  2:16         ` Lu, Wenzhuo
@ 2016-07-20 10:41           ` Adrien Mazarguil
  2016-07-21  3:18             ` Lu, Wenzhuo
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-20 10:41 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Wenzhuo,

On Wed, Jul 20, 2016 at 02:16:51AM +0000, Lu, Wenzhuo wrote:
[...]
> > So, today an application cannot combine N-tuple and FDIR flow rules and get a
> > reliable outcome, unless it is designed for specific devices with a known
> > behavior.
> > 
> > > What's the right behavior of PMD if APP want to create a flow director rule
> > which has a higher or even equal priority than an existing n-tuple rule? Should
> > PMD return fail?
> > 
> > First remember applications only deal with the generic API, PMDs are
> > responsible for choosing the most appropriate HW implementation to use
> > according to the requested flow rules (FDIR, N-tuple or anything else).
> > 
> > For the specific case of FDIR vs N-tuple, if the underlying HW supports both I do
> > not see why the PMD would create a N-tuple rule. Doesn't FDIR support
> > everything N-tuple can do and much more?
> Talking about the filters, fdir can cover n-tuple. I think that's why i40e only supports fdir but not n-tuple. But n-tuple has its own highlight. As we know, at least on intel NICs, fdir only supports per device mask. But n-tuple can support per rule mask.
> As every pattern has spec and mask both, we cannot guarantee the masks are same. I think ixgbe will try to use n-tuple first if can. Because even the masks are different, we can support them all.

OK, makes sense. In that case existing rules may indeed prevent subsequent
ones from getting created if their priority is wrong. I do not think there
is a way around that if the application needs this exact ordering.

> > Assuming such a thing happened anyway, that the PMD had to create a rule
> > using a high priority filter type and that the application requests the creation of a
> > rule that can only be done using a lower priority filter type, but also requested a
> > higher priority for that rule, then yes, it should obviously fail.
> > 
> > That is, unless the PMD can perform some kind of workaround to have both.
> > 
> > > If so, do we need more fail reasons? According to this RFC, I think we need
> > return " EEXIST: collision with an existing rule. ", but it's not very clear, APP
> > doesn't know the problem is priority, maybe more detailed reason is helpful.
> > 
> > Possibly, I've defined a basic set of errors, there are quite a number of errno
> > values to choose from. However I think we should not define too many values.
> > In my opinion the basic set covers every possible failure:
> > 
> > - EINVAL: invalid format, rule is broken or cannot be understood by the PMD
> >   anyhow.
> > 
> > - ENOTSUP: pattern/actions look fine but something in the requested rule is
> >   not supported and thus cannot be applied.
> > 
> > - EEXIST: pattern/actions are fine and could have been applied if only some
> >   other rule did not prevent the PMD to do it (I see it as the closest thing
> >   to "ETOOBAD" which unfortunately does not exist).
> > 
> > - ENOMEM: like EEXIST, except it is due to the lack of resources not because
> >   of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
> >   settled on ENOMEM as it is well known. Still open to debate.
> > 
> > Errno values are only useful to get a rough idea of the reason, and another
> > mechanism is needed to pinpoint the exact problem for debugging/reporting
> > purposes, something like:
> > 
> >  enum rte_flow_error_type {
> >      RTE_FLOW_ERROR_TYPE_NONE,
> >      RTE_FLOW_ERROR_TYPE_UNKNOWN,
> >      RTE_FLOW_ERROR_TYPE_PRIORITY,
> >      RTE_FLOW_ERROR_TYPE_PATTERN,
> >      RTE_FLOW_ERROR_TYPE_ACTION,
> >  };
> > 
> >  struct rte_flow_error {
> >      enum rte_flow_error_type type;
> >      void *offset; /* Points to the exact pattern item or action. */
> >      const char *message;
> >  };
> When we are using a CLI and it fails, normally it will let us know which parameter is not appropriate. So, I think it’s a good idea to have this error structure :)

Agreed.

> > Then either provide an optional struct rte_flow_error pointer to
> > rte_flow_validate(), or a separate function (rte_flow_analyze()?), since
> > processing this may be quite expensive and applications may not care about the
> > exact reason.
> Agree the processing may be too expensive. Maybe we can say it's optional to return error details. And that's a good question that what APP should do if creating the rule fails. I believe normally it will choose handle the rule by itself. But I think it's not bad to feedback more. Or even the APP want to adjust the rules, it cannot be an option for lack of info.

All right then, I'll add it to the specification.

 int
 rte_flow_validate(uint8_t port_id,
                   const struct rte_flow_pattern *pattern,
                   const struct rte_flow_actions *actions,
                   struct rte_flow_error *error);

With error possibly NULL if the application does not care. Is that fine
with you?
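
To make this more concrete, here is a minimal sketch of how an application
might consume the error details. The enum and struct are the ones proposed
above; example_validate() and example_describe() are hypothetical stand-ins
(not part of the proposal), and returning negative errno values is an
assumption, not something decided in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Error types and structure as proposed earlier in this thread. */
enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNKNOWN,
    RTE_FLOW_ERROR_TYPE_PRIORITY,
    RTE_FLOW_ERROR_TYPE_PATTERN,
    RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    void *offset; /* Points to the exact pattern item or action. */
    const char *message;
};

/*
 * Hypothetical stand-in for a PMD check: any non-zero priority is treated
 * as conflicting with an existing rule. Details are filled in only when
 * the caller provides a non-NULL error pointer.
 */
static int
example_validate(unsigned int priority, struct rte_flow_error *error)
{
    if (priority == 0)
        return 0;
    if (error != NULL) {
        error->type = RTE_FLOW_ERROR_TYPE_PRIORITY;
        error->offset = NULL;
        error->message = "priority conflicts with an existing rule";
    }
    return -EEXIST; /* assumed convention: negative errno value */
}

/*
 * Hypothetical helper: prefer the detailed message when available, fall
 * back to the rough errno meaning otherwise.
 */
static const char *
example_describe(int ret, const struct rte_flow_error *error)
{
    if (ret == 0)
        return "rule is valid";
    if (error != NULL && error->message != NULL)
        return error->message;
    switch (-ret) {
    case EINVAL:
        return "invalid rule";
    case ENOTSUP:
        return "unsupported rule";
    case EEXIST:
        return "conflict with an existing rule";
    case ENOMEM:
        return "not enough resources";
    default:
        return "unknown error";
    }
}
```

An application that does not care about details passes NULL and only looks
at the return value; a CLI would pass a structure and print the message.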

[...]
> > > > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > > > >   configuration when stopping and restarting a port or performing other
> > > > > >   actions which may affect them. They can only be destroyed explicitly.
> > > > > Don’t understand " They can only be destroyed explicitly."
> > > >
> > > > This part says that as long as an application has not called
> > > > rte_flow_destroy() on a flow rule, it never disappears, whatever
> > > > happens to the port (stopped, restarted). The application is not
> > > > responsible for re-creating rules after that.
> > > >
> > > > Note that according to the specification, this may translate to not
> > > > being able to stop a port as long as a flow rule is present,
> > > > depending on how nice the PMD intends to be with applications.
> > > > Implementation can be done in small steps with minimal amount of code on
> > the PMD side.
> > > Does it mean PMD should store and maintain all the rules? Why not let rte do
> > that? I think if PMD maintain all the rules, it means every kind of NIC should have
> > a copy of code for the rules. But if rte do that, only one copy of code need to be
> > maintained, right?
> > 
> > I've considered having rules stored in a common format understood at the RTE
> > level and not specific to each PMD and decided that the opaque rte_flow pointer
> > was a better choice for the following reasons:
> > 
> > - Even though flow rules management is done in the control path, processing
> >   must be as fast as possible. Letting PMDs store flow rules using their own
> >   internal representation gives them the chance to achieve better
> >   performance.
> Not quite understand. I think we're talking about maintain the rules by SW. I don’t think there's something need to be optimized according to specific NICs. If we need to optimize the code, I think we need to consider the CPU, OS ... and some common means. I'm wrong?

Perhaps we were talking about different things. Here I was only explaining
why rte_flow (the result of creating a flow rule) should be opaque and fully
managed by the PMD. More on the SW side of things below.

> > - An opaque context managed by PMDs would probably have to be stored
> >   somewhere as well anyway.
> > 
> > - PMDs may not need to allocate/store anything at all if they exclusively
> >   rely on HW state for everything. In my opinion, the generic API has enough
> >   constraints for this to work and maintain consistency between flow
> >   rules. Note this is currently how most PMDs implement FDIR and other
> >   filter types.
> Yes, the rules are stored by HW. But considering stop/start the device, the rules in HW will lose. we have to store the rules by SW and re-program them when restarting the device.

Assume HW capable of keeping flow rules programmed even during a
stop/start cycle (e.g. mlx4/mlx5 may be able to do it from DPDK's point of
view). Don't you think it is more efficient to standardize on this behavior
and let PMDs restore flow rules for HW that does not support it, instead of
having this done by RTE or the application (SW)?

> And in existing code, we store the filters by SW at least on Intel NICs. But I think we cannot reuse them, because considering the priority and which category of filter should be chosen, I think we need a whole new table for generic API. I think it’s what's designed now, right?

So I understand you'd want RTE to help your PMD keep track of the flow rules
it created?

Nothing wrong with that, all I'm saying is that it should be entirely
optional. RTE should not automatically maintain a list. PMDs have to call
RTE helpers if they need help to maintain a context. These helpers are not
defined in this API yet because it is difficult to know what will be useful
in advance.

> > - RTE can (and will) provide helpers to avoid most of the code redundancy,
> >   PMDs are free to use them or manage everything by themselves.
> > 
> > - Given that the opaque rte_flow pointer associated with a flow rule is to
> >   be stored by the application, PMDs do not even have to keep references to
> >   them.
> Don’t understand. More details?

In an application:

 struct rte_flow *foo = rte_flow_create(...);

In the above example, foo cannot be dereferenced by either the application
or RTE; only the PMD is aware of its contents. This object can only be used
with rte_flow_*() functions.

PMDs are thus free to make this object grow as needed when adding internal
features without breaking any kind of public API/ABI.

What I meant is, given that the application is supposed to store foo
somewhere in order to destroy it later, the PMD does not have to keep track
of that pointer assuming it does not need to access it later on its own for
some reason.
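
A minimal sketch of the opaque handle idea, shown in a single file for
brevity. In practice the forward declaration would live in a public header
and the full definition in PMD code. The hw_index field and both
example_flow_*() functions are hypothetical and only illustrate the
principle.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * What the application sees (normally in a public header): an incomplete
 * type, so the handle can be stored and passed around but not dereferenced.
 */
struct rte_flow;

/*
 * What the PMD sees (normally in its own source file). The field below is
 * invented for illustration; a PMD may add more without breaking the
 * public API/ABI since applications never see this layout.
 */
struct rte_flow {
    unsigned int hw_index; /* hypothetical HW filter slot */
};

/* Hypothetical stand-in for the PMD side of rte_flow_create(). */
static struct rte_flow *
example_flow_create(unsigned int hw_index)
{
    struct rte_flow *flow = malloc(sizeof(*flow));

    if (flow != NULL)
        flow->hw_index = hw_index;
    return flow;
}

/* Hypothetical stand-in for the PMD side of rte_flow_destroy(). */
static int
example_flow_destroy(struct rte_flow *flow)
{
    if (flow == NULL)
        return -EINVAL;
    free(flow);
    return 0;
}
```

The application keeps the returned pointer and hands it back for
destruction; the PMD does not need its own reference unless it wants one.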

> > - The flow rules format described in this specification (pattern / actions)
> >   will be used by applications directly, and will be free to arrange them in
> >   lists, trees or in any other way if they need to keep flow specifications
> >   around for further processing.
> Who will create the lists, trees or something else? According to previous discussion, I think the APP will program the rules one by one. So if APP organize the rules to lists, trees..., PMD doesn’t know that. 
> And you said " Given that the opaque rte_flow pointer associated with a flow rule is to be stored by the application ". I'm lost here.

I guess that's because we're discussing two different things, flow rule
specifications and flow rule objects. Let me sum it up:

- Flow rule specifications are the patterns/actions combinations provided by
  applications to rte_flow_create(). Applications can store those as needed
  and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
  will do it for them.

- Flow rule objects (struct rte_flow *) are generated when a flow rule is
  created. Applications must keep these around if they want to manipulate
  them later (i.e. destroy or query existing rules).

Then PMDs *may* need to keep and arrange flow rule objects internally for
management purposes. This could be because HW requires it, or to detect
conflicting rules, manage priorities and so on. Possible reasons are not
described in this API because they are considered PMD-specific needs.

> > > When the port is stopped and restarted, rte can reconfigure the rules. Is the
> > concern that PMD may adjust the sequence of the rules according to the priority,
> > so every NIC has a different list of rules? But PMD can adjust them again when
> > rte reconfiguring the rules.
> > 
> > What about PMDs able to stop and restart ports without destroying their own
> > flow rules? If we assume flow rules must be destroyed when stopping a port,
> > these PMDs are needlessly penalized with slower stop/start cycles. Think about
> > it assuming thousands of flow rules.
> I believe the rules maintained by SW should not be destroyed, because they're used to be re-programed when the device starts again.

Do we agree that applications should not care? Flow rules configured before
stopping a port must still be there after restarting it.

Where we seem to disagree is that you think RTE should be responsible
for restoring the flow rules of devices that lose them when stopped. I think
doing so is unfair to devices for which this is not the case and not really
nice to applications, so my opinion is that the PMD is responsible for
restoring flow rules however it wants. It is free to use RTE helpers to keep
track of them, as long as it's all managed internally.
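
As a rough sketch of a PMD whose HW loses rules on stop: it could keep a
private list and replay it on start, so that handles held by the
application stay valid. Everything below (struct layout, example_port,
the simulated HW counter) is hypothetical and only illustrates the idea.

```c
#include <stddef.h>

/* PMD-private rule object; the layout is invented for illustration. */
struct rte_flow {
    unsigned int queue;    /* where matching packets go */
    struct rte_flow *next; /* PMD-internal list link */
};

/* Hypothetical per-port PMD state. */
struct example_port {
    struct rte_flow *rules; /* SW copy kept by the PMD */
    unsigned int hw_count;  /* rules currently programmed in simulated HW */
};

/* Simulate programming one rule into HW. */
static void
example_hw_program(struct example_port *port, const struct rte_flow *flow)
{
    (void)flow; /* a real PMD would translate the rule to HW registers */
    port->hw_count++;
}

/* Rule creation: remember the rule in SW, then program it. */
static void
example_port_add(struct example_port *port, struct rte_flow *flow)
{
    flow->next = port->rules;
    port->rules = flow;
    example_hw_program(port, flow);
}

/* Stopping the port: this HW forgets everything, the SW list is kept. */
static void
example_port_stop(struct example_port *port)
{
    port->hw_count = 0;
}

/* Starting the port: replay the SW list; return the number restored. */
static unsigned int
example_port_start(struct example_port *port)
{
    const struct rte_flow *flow;
    unsigned int restored = 0;

    for (flow = port->rules; flow != NULL; flow = flow->next) {
        example_hw_program(port, flow);
        restored++;
    }
    return restored;
}
```

A PMD whose HW keeps rules across stop/start would skip all of this, which
is why the bookkeeping is left to PMDs rather than imposed by RTE.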

> > Thus from an application point of view, whatever happens when stopping and
> > restarting a port should not matter. If a flow rule was present before, it must
> > still be present afterwards. If the PMD had to destroy flow rules and re-create
> > them, it does not actually matter if they differ slightly at the HW level, as long as:
> > 
> > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> >   and refer to the same rules.
> > 
> > - The overall behavior of all rules is the same.
> > 
> > The list of rules you think of (patterns / actions) is maintained by applications
> > (not RTE), and only if they need them. RTE would needlessly duplicate this.
> As said before, need more details to understand this. Maybe an example is better :)

The generic format both RTE and applications might understand is the one
described in this API (struct rte_flow_pattern and struct
rte_flow_actions).

If we wanted RTE to maintain some sort of per-port state for flow rule
specifications, it would have to be a copy of these structures arranged
somehow (list or something else).

If we consider that PMDs need to keep a context object associated with a
flow rule (the opaque struct rte_flow *), then RTE would most likely have to
store it along with the flow specification.

Such a list may not be useful to applications (list lookups take time), so
they would implement their own redundant method. They might also require
extra room to attach some application context to flow rules. A generic list
cannot plan for it.

Applications know what they want to do with flow rules and are responsible
for managing them efficiently with RTE out of the way.
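
For illustration, an application could keep handles in a structure of its
own choosing, with whatever context it needs next to them. The table below
is a hypothetical sketch (fixed-size array, slot index as the application's
own rule ID), not something this API would provide.

```c
#include <stddef.h>

struct rte_flow; /* opaque handle owned by the PMD */

#define EXAMPLE_MAX_RULES 64

/* One application-side entry: the handle plus application context. */
struct example_rule_entry {
    struct rte_flow *flow; /* NULL when the slot is free */
    void *app_ctx;         /* anything the application wants to attach */
};

struct example_rule_table {
    struct example_rule_entry entries[EXAMPLE_MAX_RULES];
};

/* Store a handle in the first free slot; return its index, -1 if full. */
static int
example_table_add(struct example_rule_table *table, struct rte_flow *flow,
                  void *app_ctx)
{
    int i;

    if (flow == NULL)
        return -1;
    for (i = 0; i < EXAMPLE_MAX_RULES; i++) {
        if (table->entries[i].flow == NULL) {
            table->entries[i].flow = flow;
            table->entries[i].app_ctx = app_ctx;
            return i;
        }
    }
    return -1;
}

/* Free a slot and hand back the handle so the caller can destroy it. */
static struct rte_flow *
example_table_remove(struct example_rule_table *table, int slot)
{
    struct rte_flow *flow;

    if (slot < 0 || slot >= EXAMPLE_MAX_RULES)
        return NULL;
    flow = table->entries[slot].flow;
    table->entries[slot].flow = NULL;
    table->entries[slot].app_ctx = NULL;
    return flow;
}
```

Lookups by slot are direct array accesses, and the app_ctx pointer is the
kind of extra room a generic RTE-level list could not plan for.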

I'm not sure this answered your question. If not, please describe a
scenario where an RTE-managed list of flow rules would be mandatory.

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-18 15:00               ` Adrien Mazarguil
@ 2016-07-20 16:32                 ` Chandran, Sugesh
  2016-07-20 17:10                   ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-20 16:32 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

Hi Adrien,

Sorry for the late reply.
I snipped out the parts we agreed on.

Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Monday, July 18, 2016 4:00 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>;
> Chilikin, Andrey <andrey.chilikin@intel.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 18, 2016 at 01:26:09PM +0000, Chandran, Sugesh wrote:
> > Hi Adrien,
> > Thank you for getting back on this.
> > Please find my comments below.
> 
> Hi Sugesh,
> 
> Same for me, removed again the parts we agree on.
> 
> [...]
> > > For your above example, the application cannot assume a rule is
> > > added/deleted as long as the PMD has not completed the related
> > > operation, which means keeping the SW rule/fallback in place in the
> > > meantime. Should handle security concerns as long as after removing
> > > a rule, packets end up in a default queue entirely processed by SW.
> > > Obviously this may worsen response time.
> > >
> > > The ID action can help with this. By knowing which rule a received
> > > packet is associated with, processing can be temporarily offloaded
> > > by another thread without much complexity.
> > [Sugesh] Setting ID for every flow may not viable especially when the
> > size of ID is small(just only 8 bits). I am not sure is this a valid case though.
> 
> Agreed, I'm not saying this solution works for all devices, particularly those
> that do not support ID at all.
> 
> > How about a hardware flow flag in packet descriptor that set when the
> > packets hits any hardware rule. This way software doesn’t worry
> > /blocked by a hardware rule . Even though there is an additional
> > overhead of validating this flag, software datapath can identify the
> hardware processed packets easily.
> > This way the packets traverses the software fallback path until the
> > rule configuration is complete. This flag avoids setting ID action for every
> hardware flow that are configuring.
> 
> That makes sense. I see it as a sort of single bit ID but it could be
> implemented through a different action for less capable devices. PMDs that
> support 32 bit IDs could reuse the same code for both features.
> 
> I understand you'd prefer having this feature always present, however we
> already know that not all PMDs/devices support it, and like everything else
> this is a kind of offload that needs to be explicitly requested by the
> application as it may not be needed.
> 
> If we go with the separate action, then perhaps it would make sense to
> rename "ID" to "MARK" to make things clearer:
> 
>  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */
> 
>  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> 
> I guess the result of the FLAG action would be something in ol_flag.
> 
[Sugesh] This looks fine to me.
> Thoughts?
> 
[Sugesh] Two more queries that I missed in my earlier comments:
Support for PTYPE: Intel NICs can report the packet type in the mbuf.
Software can use this for packet processing. Is the generic API
capable of handling that as well?
RSS hashing support: just to confirm, the RSS flow action allows the
application to decide which header fields are used to produce the hash.
This gives programmability for load sharing across different queues. With
it, the application can program the NIC to calculate the RSS hash using
only MAC, MAC + IP, or IP only.


> > > I think applications have to implement SW fallbacks all the time, as
> > > even some sort of guarantee on the flow rule processing time may not
> > > be enough to avoid misdirected packets and related security issues.
> > [Sugesh] Software fallback will be there always. However I am little
> > bit confused on the way software going to identify the packets that
> > are already hardware processed . I feel we need some notification in the
> packet itself, when a hardware rule hits. ID/flag/any other options?
> 
> Yeah I think so too, as long as it is optional because we cannot assume all
> PMDs will support it.
> 
> > > Let's wait for applications to start using this API and then
> > > consider an extra set of asynchronous / real-time functions when the
> > > need arises. It should not impact the way rules are specified
> > [Sugesh] Sure. I think the rule definition may not impact with this.
> 
> Thanks for your comments.
> 
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-20 16:32                 ` Chandran, Sugesh
@ 2016-07-20 17:10                   ` Adrien Mazarguil
  2016-07-21 11:06                     ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-20 17:10 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

Hi Sugesh,

Please see below.

On Wed, Jul 20, 2016 at 04:32:50PM +0000, Chandran, Sugesh wrote:
[...]
> > > How about a hardware flow flag in packet descriptor that set when the
> > > packets hits any hardware rule. This way software doesn’t worry
> > > /blocked by a hardware rule . Even though there is an additional
> > > overhead of validating this flag, software datapath can identify the
> > hardware processed packets easily.
> > > This way the packets traverses the software fallback path until the
> > > rule configuration is complete. This flag avoids setting ID action for every
> > hardware flow that are configuring.
> > 
> > That makes sense. I see it as a sort of single bit ID but it could be
> > implemented through a different action for less capable devices. PMDs that
> > support 32 bit IDs could reuse the same code for both features.
> > 
> > I understand you'd prefer having this feature always present, however we
> > already know that not all PMDs/devices support it, and like everything else
> > this is a kind of offload that needs to be explicitly requested by the
> > application as it may not be needed.
> > 
> > If we go with the separate action, then perhaps it would make sense to
> > rename "ID" to "MARK" to make things clearer:
> > 
> >  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */
> > 
> >  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> > 
> > I guess the result of the FLAG action would be something in ol_flag.
> > 
> [Sugesh] This looks fine for me.

Great, I will update the specification accordingly.
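
To illustrate how a software datapath might use these two actions, here is
a minimal sketch. The mbuf layout, the flag bits and all example_* names
are hypothetical stand-ins; where exactly the flag and the 32 bit value end
up in the real mbuf is not settled in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload flag bits set by the PMD on received packets. */
#define EXAMPLE_RX_FLOW_FLAG (1ULL << 0) /* packet hit a FLAG rule */
#define EXAMPLE_RX_FLOW_MARK (1ULL << 1) /* packet carries a MARK value */

/* Hypothetical, heavily reduced packet descriptor. */
struct example_mbuf {
    uint64_t ol_flags;
    uint32_t mark; /* only meaningful with EXAMPLE_RX_FLOW_MARK */
};

enum example_path {
    EXAMPLE_PATH_SW_FALLBACK, /* no HW rule matched, classify in SW */
    EXAMPLE_PATH_HW_FLAGGED,  /* some HW rule matched, identity unknown */
    EXAMPLE_PATH_HW_MARKED,   /* HW rule matched and provided a value */
};

/*
 * Decide how to process a packet. When a mark is present and the caller
 * passed a non-NULL pointer, the value is copied out.
 */
static enum example_path
example_classify(const struct example_mbuf *m, uint32_t *mark)
{
    if (m->ol_flags & EXAMPLE_RX_FLOW_MARK) {
        if (mark != NULL)
            *mark = m->mark;
        return EXAMPLE_PATH_HW_MARKED;
    }
    if (m->ol_flags & EXAMPLE_RX_FLOW_FLAG)
        return EXAMPLE_PATH_HW_FLAGGED;
    return EXAMPLE_PATH_SW_FALLBACK;
}
```

Packets received while a rule is still being configured simply carry
neither bit and go through the software fallback path.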

> > Thoughts?
> > 
> [Sugesh] Two more queries that I missed out in the earlier comments are,
> Support for PTYPE :- Intel NICs can report packet type in mbuf.
> This can be used by software for the packet processing. Is generic API
> capable of handling that as well? 

Yes, however no PTYPE action has been defined for this (yet). It is only a
matter of adding one.

Currently packet type recognition is enabled per port using a separate API,
so correct me if I'm wrong but I am not aware of any adapter with the
ability to enable it per flow rule, so I do not think such an action needs
to be defined from the start. We may add it later.

> RSS hashing support :- Just to confirm, the RSS flow action allows application
> to decide the header fields to produce the hash. This gives
> programmability on load sharing across different queues. The
> application can program the NIC to calculate the RSS hash only using mac or mac+ ip or 
> ip only using this.

I'd say yes but from your summary, I'm not sure we share the same idea of
what the RSS action is supposed to do, so here is mine.

As with all flow rules, the pattern part of a rule using the RSS action only
selects the packets on which the action will be performed.

The rss_conf parameter (struct rte_eth_rss_conf) only provides a key and an
RSS hash function to use (ETH_RSS_IPV4, ETH_RSS_NONFRAG_IPV6_UDP, etc).

Nothing prevents the RSS hash function from being applied to protocol
headers which are not necessarily present in the flow rule pattern. These
are two independent things, e.g. you could have a pattern matching IPv4
packets yet perform RSS hashing only on UDP headers.

Finally, the RSS action configuration only affects packets coming from this
flow rule. It is not performed on the device globally so packets which are
not matched are not affected by RSS processing. As a result it might not be
possible to configure two flow rules specifying incompatible RSS actions
simultaneously if the underlying device supports only a single global RSS
context.
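
A sketch of that last constraint, assuming a device with one global RSS
context. The example_rss_conf structure is a simplified stand-in for
struct rte_eth_rss_conf (key, key length, hash function bits), and
example_rss_attach() as well as the choice of EEXIST for the conflict are
assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct rte_eth_rss_conf. */
struct example_rss_conf {
    const uint8_t *key;
    uint8_t key_len;
    uint64_t hf; /* hash function bits */
};

/* Hypothetical PMD state for a device with one global RSS context. */
struct example_rss_ctx {
    int users; /* number of flow rules relying on this context */
    uint8_t key[40];
    uint8_t key_len;
    uint64_t hf;
};

/*
 * Attach a new RSS flow rule to the global context. The first rule
 * programs it; later rules are accepted only if they request the exact
 * same configuration.
 */
static int
example_rss_attach(struct example_rss_ctx *ctx,
                   const struct example_rss_conf *conf)
{
    if (conf->key_len > sizeof(ctx->key))
        return -EINVAL;
    if (conf->key == NULL && conf->key_len != 0)
        return -EINVAL;
    if (ctx->users > 0) {
        if (ctx->hf != conf->hf || ctx->key_len != conf->key_len)
            return -EEXIST;
        if (conf->key_len != 0 &&
            memcmp(ctx->key, conf->key, conf->key_len) != 0)
            return -EEXIST;
    } else {
        if (conf->key_len != 0)
            memcpy(ctx->key, conf->key, conf->key_len);
        ctx->key_len = conf->key_len;
        ctx->hf = conf->hf;
    }
    ctx->users++;
    return 0;
}
```

Devices with several independent RSS contexts would not need this check
and could accept both rules.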

Are we on the same page?

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-20 10:41           ` Adrien Mazarguil
@ 2016-07-21  3:18             ` Lu, Wenzhuo
  2016-07-21 12:47               ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-07-21  3:18 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Adrien,

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, July 20, 2016 6:41 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody;
> Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: Re: [RFC] Generic flow director/filtering/classification API
> 
> Hi Wenzhuo,
> 
> On Wed, Jul 20, 2016 at 02:16:51AM +0000, Lu, Wenzhuo wrote:
> [...]
> > > So, today an application cannot combine N-tuple and FDIR flow rules
> > > and get a reliable outcome, unless it is designed for specific
> > > devices with a known behavior.
> > >
> > > > What's the right behavior of PMD if APP want to create a flow
> > > > director rule
> > > which has a higher or even equal priority than an existing n-tuple
> > > rule? Should PMD return fail?
> > >
> > > First remember applications only deal with the generic API, PMDs are
> > > responsible for choosing the most appropriate HW implementation to
> > > use according to the requested flow rules (FDIR, N-tuple or anything else).
> > >
> > > For the specific case of FDIR vs N-tuple, if the underlying HW
> > > supports both I do not see why the PMD would create a N-tuple rule.
> > > Doesn't FDIR support everything N-tuple can do and much more?
> > Talking about the filters, fdir can cover n-tuple. I think that's why i40e only
> supports fdir but not n-tuple. But n-tuple has its own highlight. As we know, at
> least on intel NICs, fdir only supports per device mask. But n-tuple can support
> per rule mask.
> > As every pattern has spec and mask both, we cannot guarantee the masks are
> same. I think ixgbe will try to use n-tuple first if can. Because even the masks are
> different, we can support them all.
> 
> OK, makes sense. In that case existing rules may indeed prevent subsequent
> ones from getting created if their priority is wrong. I do not think there is a way
> around that if the application needs this exact ordering.
Agreed. I don't see any workaround either. The PMD has to return a failure sometimes.

> 
> > > Assuming such a thing happened anyway, that the PMD had to create a
> > > rule using a high priority filter type and that the application
> > > requests the creation of a rule that can only be done using a lower
> > > priority filter type, but also requested a higher priority for that rule, then yes,
> it should obviously fail.
> > >
> > > That is, unless the PMD can perform some kind of workaround to have both.
> > >
> > > > If so, do we need more fail reasons? According to this RFC, I
> > > > think we need
> > > return " EEXIST: collision with an existing rule. ", but it's not
> > > very clear, APP doesn't know the problem is priority, maybe more detailed
> reason is helpful.
> > >
> > > Possibly, I've defined a basic set of errors, there are quite a
> > > number of errno values to choose from. However I think we should not
> define too many values.
> > > In my opinion the basic set covers every possible failure:
> > >
> > > - EINVAL: invalid format, rule is broken or cannot be understood by the PMD
> > >   anyhow.
> > >
> > > - ENOTSUP: pattern/actions look fine but something in the requested rule is
> > >   not supported and thus cannot be applied.
> > >
> > > - EEXIST: pattern/actions are fine and could have been applied if only some
> > >   other rule did not prevent the PMD to do it (I see it as the closest thing
> > >   to "ETOOBAD" which unfortunately does not exist).
> > >
> > > - ENOMEM: like EEXIST, except it is due to the lack of resources not because
> > >   of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
> > >   settled on ENOMEM as it is well known. Still open to debate.
> > >
> > > Errno values are only useful to get a rough idea of the reason, and
> > > another mechanism is needed to pinpoint the exact problem for
> > > debugging/reporting purposes, something like:
> > >
> > >  enum rte_flow_error_type {
> > >      RTE_FLOW_ERROR_TYPE_NONE,
> > >      RTE_FLOW_ERROR_TYPE_UNKNOWN,
> > >      RTE_FLOW_ERROR_TYPE_PRIORITY,
> > >      RTE_FLOW_ERROR_TYPE_PATTERN,
> > >      RTE_FLOW_ERROR_TYPE_ACTION,
> > >  };
> > >
> > >  struct rte_flow_error {
> > >      enum rte_flow_error_type type;
> > >      void *offset; /* Points to the exact pattern item or action. */
> > >      const char *message;
> > >  };
> > When we are using a CLI and it fails, normally it will let us know
> > which parameter is not appropriate. So, I think it’s a good idea to
> > have this error structure :)
> 
> Agreed.
> 
> > > Then either provide an optional struct rte_flow_error pointer to
> > > rte_flow_validate(), or a separate function (rte_flow_analyze()?),
> > > since processing this may be quite expensive and applications may
> > > not care about the exact reason.
> > Agree the processing may be too expensive. Maybe we can say it's optional to
> > return error details. And that's a good question: what should the APP do if
> > creating the rule fails? I believe normally it will choose to handle the rule
> > by itself. But I think it's not bad to give more feedback. Otherwise, even if
> > the APP wants to adjust the rules, that cannot be an option for lack of info.
> 
> All right then, I'll add it to the specification.
> 
>  int
>  rte_flow_validate(uint8_t port_id,
>                    const struct rte_flow_pattern *pattern,
>                    const struct rte_flow_actions *actions,
>                    struct rte_flow_error *error);
> 
> With error possibly NULL if the application does not care. Is it fine for you?
Yes, it looks good to me. Thanks for that :)
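
For illustration, here is a minimal sketch of how the optional error argument
discussed above might be consumed. The enum and structure are the ones proposed
in this thread; everything prefixed with demo_ (the simplified rule, the
priority limit, the mock validation function) is hypothetical, and returning a
negative errno value is an assumption rather than something the RFC settles.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Error types and structure as proposed in this thread. */
enum rte_flow_error_type {
	RTE_FLOW_ERROR_TYPE_NONE,
	RTE_FLOW_ERROR_TYPE_UNKNOWN,
	RTE_FLOW_ERROR_TYPE_PRIORITY,
	RTE_FLOW_ERROR_TYPE_PATTERN,
	RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
	enum rte_flow_error_type type;
	void *offset; /* Points to the exact pattern item or action. */
	const char *message;
};

/* Hypothetical stand-in for a flow rule; NOT part of the RFC. */
struct demo_rule {
	unsigned int priority;
	int action; /* 0 means "no action defined" */
};

#define DEMO_MAX_PRIORITY 7 /* hypothetical device limit */

/* Fill *error only when the caller provided one, return -err. */
static int
demo_fail(struct rte_flow_error *error, int err,
	  enum rte_flow_error_type type, void *offset, const char *message)
{
	if (error != NULL) {
		error->type = type;
		error->offset = offset;
		error->message = message;
	}
	return -err;
}

/* Mock validation: 0 on success, negative errno value otherwise. */
static int
demo_flow_validate(struct demo_rule *rule, struct rte_flow_error *error)
{
	assert(rule != NULL);
	if (rule->action == 0)
		return demo_fail(error, EINVAL, RTE_FLOW_ERROR_TYPE_ACTION,
				 &rule->action, "no action defined");
	if (rule->priority > DEMO_MAX_PRIORITY)
		return demo_fail(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_PRIORITY,
				 &rule->priority, "priority not supported");
	if (error != NULL)
		error->type = RTE_FLOW_ERROR_TYPE_NONE;
	return 0;
}
```

An application that does not care about details passes NULL and only looks at
the return value; one that wants to report the offending field inspects type,
offset and message after a failure.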

> 
> [...]
> > > > > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > > > > >   configuration when stopping and restarting a port or performing
> > > > > > >   other actions which may affect them. They can only be destroyed
> > > > > > >   explicitly.
> > > > > > Don’t understand " They can only be destroyed explicitly."
> > > > >
> > > > > This part says that as long as an application has not called
> > > > > rte_flow_destroy() on a flow rule, it never disappears, whatever
> > > > > happens to the port (stopped, restarted). The application is not
> > > > > responsible for re-creating rules after that.
> > > > >
> > > > > Note that according to the specification, this may translate to
> > > > > not being able to stop a port as long as a flow rule is present,
> > > > > depending on how nice the PMD intends to be with applications.
> > > > > Implementation can be done in small steps with minimal amount of
> > > > > code on the PMD side.
> > > > Does it mean the PMD should store and maintain all the rules? Why not
> > > > let RTE do that? I think if PMDs maintain all the rules, it means every
> > > > kind of NIC should have a copy of the code for the rules. But if RTE
> > > > does that, only one copy of the code needs to be maintained, right?
> > >
> > > I've considered having rules stored in a common format understood at
> > > the RTE level and not specific to each PMD and decided that the
> > > opaque rte_flow pointer was a better choice for the following reasons:
> > >
> > > - Even though flow rules management is done in the control path, processing
> > >   must be as fast as possible. Letting PMDs store flow rules using their own
> > >   internal representation gives them the chance to achieve better
> > >   performance.
> > I don't quite understand. I think we're talking about maintaining the rules
> > in SW. I don’t think there's anything that needs to be optimized for specific
> > NICs. If we need to optimize the code, I think we need to consider the CPU,
> > OS ... and some common means. Am I wrong?
> 
> Perhaps we were talking about different things, here I was only explaining why
> rte_flow (the result of creating a flow rule) should be opaque and fully managed
> by the PMD. More on the SW side of things below.
> 
> > > - An opaque context managed by PMDs would probably have to be stored
> > >   somewhere as well anyway.
> > >
> > > - PMDs may not need to allocate/store anything at all if they exclusively
> > >   rely on HW state for everything. In my opinion, the generic API has enough
> > >   constraints for this to work and maintain consistency between flow
> > >   rules. Note this is currently how most PMDs implement FDIR and other
> > >   filter types.
> > Yes, the rules are stored by HW. But when stopping/starting the device, the
> > rules in HW will be lost. We have to store the rules in SW and re-program
> > them when restarting the device.
> 
> Assume a HW capable of keeping flow rules programmed even during a
> stop/start cycle (e.g. mlx4/mlx5 may be able to do it from DPDK point of view),
> don't you think it is more efficient to standardize on this behavior and let PMDs
> restore flow rules for HW that do not support it regardless of whether it would
> be done by RTE or the application (SW)?
Didn’t know that. Since some NICs already have the ability to keep the rules during a stop/start cycle, maybe it could become a trend :)

> 
> > And in existing code, we store the filters in SW, at least on Intel NICs. But
> > I think we cannot reuse them because, considering the priority and which
> > category of filter should be chosen, I think we need a whole new table for
> > the generic API. I think that's what is designed now, right?
> 
> So I understand you'd want RTE to help your PMD keep track of the flow rules it
> created?
Yes. But as you said before, it’s not a good idea for mlx4/mlx5, because their HW doesn't need SW to re-program the rules after stopping/starting. If we make it a common mechanism, it just wastes time for mlx4/mlx5.

> 
> Nothing wrong with that, all I'm saying is that it should be entirely optional. RTE
> should not automatically maintain a list. PMDs have to call RTE helpers if they
> need help to maintain a context. These helpers are not defined in this API yet
> because it is difficult to know what will be useful in advance.
> 
> > > - RTE can (and will) provide helpers to avoid most of the code redundancy,
> > >   PMDs are free to use them or manage everything by themselves.
> > >
> > > - Given that the opaque rte_flow pointer associated with a flow rule is to
> > >   be stored by the application, PMDs do not even have to keep references to
> > >   them.
> > Don’t understand. More details?
> 
> In an application:
> 
>  struct rte_flow *foo = rte_flow_create(...);
> 
> In the above example, foo cannot be dereferenced by the application nor RTE,
> only the PMD is aware of its contents. This object can only be used with
> rte_flow*() functions.
> 
> PMDs are thus free to make this object grow as needed when adding internal
> features without breaking any kind of public API/ABI.
> 
> What I meant is, given that the application is supposed to store foo somewhere
> in order to destroy it later, the PMD does not have to keep track of that pointer
> assuming it does not need to access it later on its own for some reason.
> 
> > > - The flow rules format described in this specification (pattern / actions)
> > >   will be used by applications directly, and will be free to arrange them in
> > >   lists, trees or in any other way if they need to keep flow specifications
> > >   around for further processing.
> > Who will create the lists, trees or something else? According to the previous
> > discussion, I think the APP will program the rules one by one. So if the APP
> > organizes the rules into lists, trees..., the PMD doesn’t know that.
> > And you said "Given that the opaque rte_flow pointer associated with a flow
> > rule is to be stored by the application". I'm lost here.
> 
> I guess that's because we're discussing two different things, flow rule
> specifications and flow rule objects. Let me sum it up:
> 
> - Flow rule specifications are the patterns/actions combinations provided by
>   applications to rte_flow_create(). Applications can store those as needed
>   and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
>   will do it for them.
> 
> - Flow rule objects (struct rte_flow *) are generated when a flow rule is
>   created. Applications must keep these around if they want to manipulate
>   them later (i.e. destroy or query existing rules).
Thanks for this clarification. So the specifications can be different from the objects, right? The specifications are what the APP wants, the objects are what the APP really gets, since rte_flow_create can fail. Right?

> 
> Then PMDs *may* need to keep and arrange flow rule objects internally for
> management purposes. Could be because HW requires it, detecting conflicting
> rules, managing priorities and so on. Possible reasons are not described in this
> API because these are thought as PMD-specific needs.
Got it.

> 
> > > > When the port is stopped and restarted, RTE can reconfigure the
> > > > rules. Is the concern that the PMD may adjust the sequence of the
> > > > rules according to the priority, so every NIC has a different list of
> > > > rules? But the PMD can adjust them again when RTE reconfigures the rules.
> > >
> > > What about PMDs able to stop and restart ports without destroying
> > > their own flow rules? If we assume flow rules must be destroyed when
> > > stopping a port, these PMDs are needlessly penalized with slower
> > > stop/start cycles. Think about it assuming thousands of flow rules.
> > I believe the rules maintained by SW should not be destroyed, because they
> > are used to re-program the device when it starts again.
> 
> Do we agree that applications should not care? Flow rules configured before
> stopping a port must still be there after restarting it.
Yes, agree.

> 
> What we seem to not agree about is that you think RTE should be responsible
> for restoring flow rules of devices that lose them when stopped. I think doing so
> is unfair to devices for which it is not the case and not really nice to applications,
> so my opinion is that the PMD is responsible for restoring flow rules however it
> wants. It is free to use RTE helpers to keep their track, as long as it's all managed
> internally.
What I think is that RTE can store the flow rules and recreate them after restarting, in a function like rte_dev_start, so the APP knows nothing about it. But according to the discussion above, I think the design doesn't support it, right?
RTE doesn't store the flow rule objects, and even if it stored them, there is no mechanism designed to re-program them. Also, some HW doesn't need to be re-programmed. I think it's OK to let the PMD maintain the rules, as re-programming is a NIC-specific requirement.

> 
> > > Thus from an application point of view, whatever happens when
> > > stopping and restarting a port should not matter. If a flow rule was
> > > present before, it must still be present afterwards. If the PMD had
> > > to destroy flow rules and re-create them, it does not actually matter if
> > > they differ slightly at the HW level, as long as:
> > >
> > > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> > >   and refer to the same rules.
> > >
> > > - The overall behavior of all rules is the same.
> > >
> > > The list of rules you think of (patterns / actions) is maintained by
> > > applications (not RTE), and only if they need them. RTE would needlessly
> > > duplicate this.
> > As said before, need more details to understand this. Maybe an example
> > is better :)
> 
> The generic format both RTE and applications might understand is the one
> described in this API (struct rte_flow_pattern and struct rte_flow_actions).
> 
> If we wanted RTE to maintain some sort of per-port state for flow rule
> specifications, it would have to be a copy of these structures arranged somehow
> (list or something else).
> 
> If we consider that PMDs need to keep a context object associated to a flow
> rule (the opaque struct rte_flow *), then RTE would most likely have to store it
> along with the flow specification.
> 
> Such a list may not be useful to applications (list lookups take time), so they
> would implement their own redundant method. They might also require extra
> room to attach some application context to flow rules. A generic list cannot plan
> for it.
> 
> Applications know what they want to do with flow rules and are responsible for
> managing them efficiently with RTE out of the way.
> 
> I'm not sure if this answered your question, if not, please describe a scenario
> where a RTE-managed list of flow rules would be mandatory.
Got your point and agree :)

> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
                   ` (3 preceding siblings ...)
  2016-07-11 10:41 ` Jerin Jacob
@ 2016-07-21  8:13 ` Rahul Lakkireddy
  2016-07-21 17:07   ` Adrien Mazarguil
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
  5 siblings, 1 reply; 262+ messages in thread
From: Rahul Lakkireddy @ 2016-07-21  8:13 UTC (permalink / raw)
  To: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern
  Cc: Kumar Sanghvi, Nirranjan Kirubaharan, Indranil Choudhury

Hi Adrien,

The proposal looks very good.  It satisfies most of the features
supported by Chelsio NICs.  We are looking for suggestions on exposing
additional features supported by Chelsio NICs via this API.

Chelsio NICs have two regions in which filters can be placed:
Maskfull and Maskless regions.  As their names imply, the maskfull region
can accept masks to match a range of values, whereas the maskless region
doesn't accept any masks and hence performs stricter exact matches.
Filters without masks can also be placed in the maskfull region.  By
default, the maskless region has higher priority than the maskfull region.
However, the priority between the two regions is configurable.

Please suggest on how we can let the apps configure in which region
filters must be placed and set the corresponding priority accordingly
via this API.

More comments below.

On Tuesday, July 07/05/16, 2016 at 20:16:46 +0200, Adrien Mazarguil wrote:
> Hi All,
> 
[...]

> 
> ``ETH``
> ^^^^^^^
> 
> Matches an Ethernet header.
> 
> - ``dst``: destination MAC.
> - ``src``: source MAC.
> - ``type``: EtherType.
> - ``tags``: number of 802.1Q/ad tags defined.
> - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> 
>  - ``tpid``: Tag protocol identifier.
>  - ``tci``: Tag control information.
> 
> ``IPV4``
> ^^^^^^^^
> 
> Matches an IPv4 header.
> 
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tos``: ToS/DSCP field.
> - ``ttl``: TTL field.
> - ``proto``: protocol number for the next layer.
> 
> ``IPV6``
> ^^^^^^^^
> 
> Matches an IPv6 header.
> 
> - ``src``: source IP address.
> - ``dst``: destination IP address.
> - ``tc``: traffic class field.
> - ``nh``: Next header field (protocol).
> - ``hop_limit``: hop limit field (TTL).
> 
> ``ICMP``
> ^^^^^^^^
> 
> Matches an ICMP header.
> 
> - TBD.
> 
> ``UDP``
> ^^^^^^^
> 
> Matches a UDP header.
> 
> - ``sport``: source port.
> - ``dport``: destination port.
> - ``length``: UDP length.
> - ``checksum``: UDP checksum.
> 
> .. raw:: pdf
> 
>    PageBreak
> 
> ``TCP``
> ^^^^^^^
> 
> Matches a TCP header.
> 
> - ``sport``: source port.
> - ``dport``: destination port.
> - All other TCP fields and bits.
> 
> ``VXLAN``
> ^^^^^^^^^
> 
> Matches a VXLAN header.
> 
> - TBD.
> 

In addition to above matches, Chelsio NICs have some additional
features:

- Match based on unicast DST-MAC, multicast DST-MAC, broadcast DST-MAC.
  Also, there is a match criteria available called 'promisc' - which
  matches packets that are not destined for the interface, but had
  been received by the hardware due to interface being in promiscuous
  mode.

- Match FCoE packets.

- Match IP Fragmented packets.

- Match range of physical ports on the NIC in a single rule via masks.
  For ex: match all UDP packets coming on ports 3 and 4 out of 4
  ports available on the NIC.

- Match range of Physical Functions (PFs) on the NIC in a single rule
  via masks. For ex: match all traffic coming on several PFs.

Please suggest on how we can expose the above features to DPDK apps via
this API.

[...]

> 
> Actions
> ~~~~~~~
> 
> Each possible action is represented by a type. Some have associated
> configuration structures. Several actions combined in a list can be affected
> to a flow rule. That list is not ordered.
> 
> At least one action must be defined in a filter rule in order to do
> something with matched packets.
> 
> - Actions are defined with ``struct rte_flow_action``.
> - A list of actions is defined with ``struct rte_flow_actions``.
> 
> They fall in three categories:
> 
> - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
>   processing matched packets by subsequent flow rules, unless overridden
>   with PASSTHRU.
> 
> - Non terminating actions (PASSTHRU, DUP) that leave matched packets up for
>   additional processing by subsequent flow rules.
> 
> - Other non terminating meta actions that do not affect the fate of packets
>   (END, VOID, ID, COUNT).
> 
> When several actions are combined in a flow rule, they should all have
> different types (e.g. dropping a packet twice is not possible). However
> considering the VOID type is an exception to this rule, the defined behavior
> is for PMDs to only take into account the last action of a given type found
> in the list. PMDs still perform error checking on the entire list.
> 
> *Note that PASSTHRU is the only action able to override a terminating rule.*
> 
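
As an aside, the "last action of a given type wins, VOID excepted" rule quoted
above can be sketched as follows. This is only an illustration: the demo_
types are hypothetical, only a subset of action types is listed, and an
END-terminated array is an assumption about how the list is laid out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of action types, for illustration only. */
enum demo_action_type {
	DEMO_ACTION_END,   /* terminates the list */
	DEMO_ACTION_VOID,  /* placeholder, may appear several times */
	DEMO_ACTION_QUEUE,
	DEMO_ACTION_DROP,
	DEMO_ACTION_COUNT,
	DEMO_ACTION_MAX,
};

struct demo_action {
	enum demo_action_type type;
	const void *conf; /* type-specific configuration, may be NULL */
};

/*
 * Walk an END-terminated list and record, for each type, the last action
 * of that type. VOID entries are skipped. Returns 0 on success or -1 as
 * soon as an unknown type is found (error checking covers every entry
 * up to END).
 */
static int
demo_resolve_actions(const struct demo_action *list,
		     const struct demo_action *resolved[DEMO_ACTION_MAX])
{
	size_t i;

	assert(list != NULL && resolved != NULL);
	for (i = 0; i < DEMO_ACTION_MAX; ++i)
		resolved[i] = NULL;
	for (i = 0; list[i].type != DEMO_ACTION_END; ++i) {
		if ((unsigned int)list[i].type >= DEMO_ACTION_MAX)
			return -1;
		if (list[i].type == DEMO_ACTION_VOID)
			continue;
		/* A later action of the same type overrides an earlier one. */
		resolved[list[i].type] = &list[i];
	}
	return 0;
}
```

With two QUEUE actions in the list, only the second one ends up in the
resolved table, which matches the behavior described for PMDs.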

Chelsio NICs can support an action 'switch' which can re-direct
matched packets from one port to another port in hardware.  In addition,
it can also optionally:

1. Perform header rewrites (src-mac/dst-mac rewrite, src-mac/dst-mac
   swap, vlan add/remove/rewrite).

2. Perform NAT'ing in hardware (4-tuple rewrite).

before sending it out on the wire [1].

To meet the above requirements, we'd need a way to pass sub-actions
to action 'switch' and a way to pass extra info (such as new
src-mac/dst-mac, new vlan, new 4-tuple for NAT) to rewrite
corresponding fields.

We're looking for suggestions on how we can achieve action 'switch'
in this new API.

From our understanding of this API, we could just expand
rte_flow_action_type with an additional action type
RTE_FLOW_ACTION_TYPE_SWITCH and define several sub-actions such as:

enum rte_flow_action_switch_type {
	RTE_FLOW_ACTION_SWITCH_TYPE_NONE,
	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_REWRITE,
	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_SWAP,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_INSERT,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_DELETE,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_REWRITE,
	RTE_FLOW_ACTION_SWITCH_TYPE_NAT_REWRITE,
};

and then define an rte_flow_action_switch as follows:

struct rte_flow_action_switch {
	enum rte_flow_action_switch_type type; /* Sub-actions to perform. */
	uint16_t port; /* Destination physical port to switch packet to. */
	enum rte_flow_item_type item_type; /* Fields to rewrite. */
	const void *switch_spec; /* Holds info to rewrite matched flows. */
};

Does the above approach sound right with respect to this new API?
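
To make the question concrete, here is a rough sketch of how an application
might fill such a switch action for a MAC rewrite. It is an illustration under
assumptions, not part of the RFC: the second member is named item_type here so
that the structure compiles (two members cannot share the name "type"), the
one-entry rte_flow_item_type enum is a stub, and demo_switch_mac_rewrite is a
hypothetical payload layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sub-actions as listed in the proposal above. */
enum rte_flow_action_switch_type {
	RTE_FLOW_ACTION_SWITCH_TYPE_NONE,
	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_REWRITE,
	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_SWAP,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_INSERT,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_DELETE,
	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_REWRITE,
	RTE_FLOW_ACTION_SWITCH_TYPE_NAT_REWRITE,
};

/* Stub: only the entry needed by this sketch. */
enum rte_flow_item_type {
	RTE_FLOW_ITEM_TYPE_ETH,
};

struct rte_flow_action_switch {
	enum rte_flow_action_switch_type type; /* sub-action to perform */
	uint16_t port;                         /* destination physical port */
	enum rte_flow_item_type item_type;     /* header whose fields change */
	const void *switch_spec;               /* rewrite data, per sub-action */
};

/* Hypothetical payload for MAC_REWRITE; not defined in this thread. */
struct demo_switch_mac_rewrite {
	uint8_t new_src[6];
	uint8_t new_dst[6];
};

/* Prepare a "switch to port + rewrite MAC addresses" action. */
static void
demo_switch_mac_rewrite_init(struct rte_flow_action_switch *sw, uint16_t port,
			     const struct demo_switch_mac_rewrite *spec)
{
	assert(sw != NULL && spec != NULL);
	memset(sw, 0, sizeof(*sw));
	sw->type = RTE_FLOW_ACTION_SWITCH_TYPE_MAC_REWRITE;
	sw->port = port;
	sw->item_type = RTE_FLOW_ITEM_TYPE_ETH;
	sw->switch_spec = spec; /* must outlive the action */
}
```

Note that a single sub-action per structure would not cover combinations such
as "VLAN insert plus NAT"; whether sub-actions should be a bit-field or
separate actions in the list is left open here.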

[...]

> 
> ``COUNT``
> ^^^^^^^^^
> 
> Enables hits counter for this rule.
> 
> This counter can be retrieved and reset through ``rte_flow_query()``, see
> ``struct rte_flow_query_count``.
> 
> - Counters can be retrieved with ``rte_flow_query()``.
> - No configurable property.
> 
> +---------------+
> | COUNT         |
> +===============+
> | no properties |
> +---------------+
> 
> Query structure to retrieve and reset the flow rule hits counter:
> 
> +------------------------------------------------+
> | COUNT query                                    |
> +===========+=====+==============================+
> | ``reset`` | in  | reset counter after query    |
> +-----------+-----+------------------------------+
> | ``hits``  | out | number of hits for this flow |
> +-----------+-----+------------------------------+
> 

Chelsio NICs can also count the number of bytes that hit the rule.
So, need a counter "bytes".

[...]

[1]  http://www.dpdk.org/ml/archives/dev/2016-February/032605.html

Thanks,
Rahul


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-20 17:10                   ` Adrien Mazarguil
@ 2016-07-21 11:06                     ` Chandran, Sugesh
  2016-07-21 13:37                       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-21 11:06 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey


Hi Adrien,
Please find my comments below

Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, July 20, 2016 6:11 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>;
> Zhang, Helin <helin.zhang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Rasesh Mody <rasesh.mody@qlogic.com>; Ajit
> Khaparde <ajit.khaparde@broadcom.com>; Rahul Lakkireddy
> <rahul.lakkireddy@chelsio.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>;
> Jan Medala <jan@semihalf.com>; John Daley <johndale@cisco.com>; Chen,
> Jing D <jing.d.chen@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Matej Vido <matejvido@gmail.com>;
> Alejandro Lucero <alejandro.lucero@netronome.com>; Sony Chacko
> <sony.chacko@qlogic.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olga Shern <olgas@mellanox.com>;
> Chilikin, Andrey <andrey.chilikin@intel.com>
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> Hi Sugesh,
> 
> Please see below.
> 
> On Wed, Jul 20, 2016 at 04:32:50PM +0000, Chandran, Sugesh wrote:
> [...]
> > > > How about a hardware flow flag in the packet descriptor that is set
> > > > when the packet hits any hardware rule? This way software doesn’t
> > > > have to worry about, or be blocked by, a hardware rule. Even though
> > > > there is an additional overhead of validating this flag, the software
> > > > datapath can identify the hardware processed packets easily.
> > > > This way the packets traverse the software fallback path until
> > > > the rule configuration is complete. This flag avoids setting the ID
> > > > action for every hardware flow that is being configured.
> > >
> > > That makes sense. I see it as a sort of single bit ID but it could
> > > be implemented through a different action for less capable devices.
> > > PMDs that support 32 bit IDs could reuse the same code for both
> > > features.
> > >
> > > I understand you'd prefer having this feature always present,
> > > however we already know that not all PMDs/devices support it, and
> > > like everything else this is a kind of offload that needs to be
> > > explicitly requested by the application as it may not be needed.
> > >
> > > If we go with the separate action, then perhaps it would make sense
> > > to rename "ID" to "MARK" to make things clearer:
> > >
> > >  RTE_FLOW_ACTION_TYPE_FLAG /* Flag packets processed by flow rule. */
> > >
> > >  RTE_FLOW_ACTION_TYPE_MARK /* Attach a 32 bit value to a packet. */
> > >
> > > I guess the result of the FLAG action would be something in ol_flag.
> > >
> > [Sugesh] This looks fine for me.
> 
> Great, I will update the specification accordingly.
[Sugesh] Thank you!
> 
> > > Thoughts?
> > >
> > [Sugesh] Two more queries that I missed out in the earlier comments
> > are, Support for PTYPE :- Intel NICs can report packet type in mbuf.
> > This can be used by software for the packet processing. Is generic API
> > capable of handling that as well?
> 
> Yes, however no PTYPE action has been defined for this (yet). It is only a
> matter of adding one.
[Sugesh] Thank you for confirming. Its fine for me
> 
> Currently packet type recognition is enabled per port using a separate API, so
> correct me if I'm wrong but I am not aware of any adapter with the ability to
> enable it per flow rule, so I do not think such an action needs to be defined
> from the start. We may add it later.
> 
> > RSS hashing support :- Just to confirm, the RSS flow action allows
> > application to decide the header fields to produce the hash. This
> > gives programmability on load sharing across different queues. The
> > application can program the NIC to calculate the RSS hash only using
> > mac or mac+ ip or ip only using this.
> 
> I'd say yes but from your summary, I'm not sure we share the same idea of
> what the RSS action is supposed to do, so here is mine.
> 
> Like all flow rules, the pattern part of the RSS action only filters the packets
> on which the action will be performed.
> 
> The rss_conf parameter (struct rte_eth_rss_conf) only provides a key and a
> RSS hash function to use (ETH_RSS_IPV4, ETH_RSS_NONFRAG_IPV6_UDP,
> etc).
> 
> Nothing prevents the RSS hash function from being applied to protocol
> headers which are not necessarily present in the flow rule pattern. These are
> two independent things, e.g. you could have a pattern matching IPv4 packets
> yet perform RSS hashing only on UDP headers.
> 
> Finally, the RSS action configuration only affects packets coming from this
> flow rule. It is not performed on the device globally so packets which are not
> matched are not affected by RSS processing. As a result it might not be
> possible to configure two flow rules specifying incompatible RSS actions
> simultaneously if the underlying device supports only a single global RSS
> context.
> 
[Sugesh] Thank you for the explanation. Does this mean I can have a rule that matches
every incoming packet (an all-fields wildcard rule) and computes the RSS hash on selected fields,
MAC only, IP only or IP & MAC? This can be useful for doing a packet lookup in software using
only the hash.
> Are we on the same page?
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21  3:18             ` Lu, Wenzhuo
@ 2016-07-21 12:47               ` Adrien Mazarguil
  2016-07-22  1:38                 ` Lu, Wenzhuo
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-21 12:47 UTC (permalink / raw)
  To: Lu, Wenzhuo
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Wenzhuo,

It seems that we agree on about everything now, just a few more comments
below after snipping the now irrelevant parts.

On Thu, Jul 21, 2016 at 03:18:11AM +0000, Lu, Wenzhuo wrote:
[...]
> > > > > Does it mean the PMD should store and maintain all the rules? Why not
> > > > > let RTE do that? I think if PMDs maintain all the rules, it means every
> > > > > kind of NIC should have a copy of the code for the rules. But if RTE
> > > > > does that, only one copy of the code needs to be maintained, right?
> > > >
> > > > I've considered having rules stored in a common format understood at
> > > > the RTE level and not specific to each PMD and decided that the
> > > > opaque rte_flow pointer was a better choice for the following reasons:
> > > >
> > > > - Even though flow rules management is done in the control path, processing
> > > >   must be as fast as possible. Letting PMDs store flow rules using their own
> > > >   internal representation gives them the chance to achieve better
> > > >   performance.
> > > I don't quite understand. I think we're talking about maintaining the rules
> > > in SW. I don’t think there's anything that needs to be optimized for specific
> > > NICs. If we need to optimize the code, I think we need to consider the CPU,
> > > OS ... and some common means. Am I wrong?
> > 
> > Perhaps we were talking about different things, here I was only explaining why
> > rte_flow (the result of creating a flow rule) should be opaque and fully managed
> > by the PMD. More on the SW side of things below.
> > 
> > > > - An opaque context managed by PMDs would probably have to be stored
> > > >   somewhere as well anyway.
> > > >
> > > > - PMDs may not need to allocate/store anything at all if they exclusively
> > > >   rely on HW state for everything. In my opinion, the generic API has enough
> > > >   constraints for this to work and maintain consistency between flow
> > > >   rules. Note this is currently how most PMDs implement FDIR and other
> > > >   filter types.
> > > Yes, the rules are stored by HW. But when stopping/starting the device, the
> > > rules in HW will be lost. We have to store the rules in SW and re-program
> > > them when restarting the device.
> > 
> > Assume a HW capable of keeping flow rules programmed even during a
> > stop/start cycle (e.g. mlx4/mlx5 may be able to do it from DPDK point of view),
> > don't you think it is more efficient to standardize on this behavior and let PMDs
> > restore flow rules for HW that do not support it regardless of whether it would
> > be done by RTE or the application (SW)?
> Didn’t know that. Since some NICs already have the ability to keep the rules during a stop/start cycle, maybe it could become a trend :)

Well yeah, if you are wondering about that, these PMDs do not have the same
definition for port stop/start as lower level PMDs like ixgbe and i40e. In
the mlx4/mlx5 cases, most control path operations (queue creation,
destruction and general management) end up performed by kernel
drivers. Stopping a port does not really shut it down as the kernel still
manages its own netdevice independently.

[...]
> > > > - The flow rules format described in this specification (pattern / actions)
> > > >   will be used by applications directly, and will be free to arrange them in
> > > >   lists, trees or in any other way if they need to keep flow specifications
> > > >   around for further processing.
> > > Who will create the lists, trees or something else? According to the previous
> > > discussion, I think the APP will program the rules one by one. So if the APP
> > > organizes the rules into lists, trees..., the PMD doesn’t know that.
> > > And you said "Given that the opaque rte_flow pointer associated with a flow
> > > rule is to be stored by the application". I'm lost here.
> > 
> > I guess that's because we're discussing two different things, flow rule
> > specifications and flow rule objects. Let me sum it up:
> > 
> > - Flow rule specifications are the patterns/actions combinations provided by
> >   applications to rte_flow_create(). Applications can store those as needed
> >   and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
> >   will do it for them.
> > 
> > - Flow rule objects (struct rte_flow *) are generated when a flow rule is
> >   created. Applications must keep these around if they want to manipulate
> >   them later (i.e. destroy or query existing rules).
> Thanks for this clarification. So the specifications can be different from the objects, right? The specifications are what the APP wants, the objects are what the APP really gets, since rte_flow_create can fail. Right?

Yes, precisely. Apps are also free to keep specifications around even in the
event of a flow creation failure. I think a generic software fallback will
be provided at some point.
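
To illustrate the distinction, an application could keep each specification
next to the handle it got back, and fall back to software when there is none.
This is only a sketch: the simplified specification, the creation callback
standing in for rte_flow_create() and the mock PMD at the bottom are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct rte_flow; /* opaque to the application */

/* Hypothetical, heavily simplified flow rule specification. */
struct demo_flow_spec {
	unsigned int dst_port; /* pattern: UDP destination port */
	unsigned int queue;    /* action: target queue */
};

/* Application-side entry: what was asked for and what was obtained. */
struct demo_app_flow {
	struct demo_flow_spec spec; /* kept even if creation failed */
	struct rte_flow *handle;    /* NULL when the rule is not offloaded */
};

/* Callback type standing in for rte_flow_create(). */
typedef struct rte_flow *(*demo_create_t)(const struct demo_flow_spec *spec);

/* Returns 1 if offloaded, 0 if the application must handle it in SW. */
static int
demo_app_add_flow(struct demo_app_flow *entry,
		  const struct demo_flow_spec *spec, demo_create_t create)
{
	assert(entry != NULL && spec != NULL && create != NULL);
	entry->spec = *spec;
	entry->handle = create(spec);
	return entry->handle != NULL;
}

/* Mock PMD side, normally hidden inside the driver. */
struct rte_flow {
	unsigned int hw_index;
};

/* Accepts at most 4 rules and only queues 0 to 3. */
static struct rte_flow *
demo_pmd_create(const struct demo_flow_spec *spec)
{
	static struct rte_flow slots[4];
	static unsigned int used;

	if (spec->queue >= 4 || used == 4)
		return NULL;
	slots[used].hw_index = used;
	return &slots[used++];
}
```

How entries are organized (hash, tree, list) and what extra context is attached
to them is entirely up to the application.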

[...]
> > What we seem to not agree about is that you think RTE should be responsible
> > for restoring flow rules of devices that lose them when stopped. I think doing so
> > is unfair to devices for which it is not the case and not really nice to applications,
> > so my opinion is that the PMD is responsible for restoring flow rules however it
> > wants. It is free to use RTE helpers to keep their track, as long as it's all managed
> > internally.
> What I think is that RTE can store the flow rules and recreate them after restarting, in a function like rte_dev_start, so the APP knows nothing about it. But according to the discussion above, I think the design doesn't support it, right?

Yes. Right now the design explicitly states that PMDs are on their own
regarding this (4.3 Behavior). While it could be modified, I really think it
would be less efficient for the reasons stated above.

> RTE doesn't store the flow rule objects, and even if it stored them, there's no way designed to re-program the objects. Also, some HW doesn't need to be re-programmed. I think it's OK to let the PMD maintain the rules, as re-programming is a NIC-specific requirement.

Great to finally agree on this point.

> > > > Thus from an application point of view, whatever happens when
> > > > stopping and restarting a port should not matter. If a flow rule was
> > > > present before, it must still be present afterwards. If the PMD had
> > > > to destroy flow rules and re-create them, it does not actually matter if they
> > differ slightly at the HW level, as long as:
> > > >
> > > > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> > > >   and refer to the same rules.
> > > >
> > > > - The overall behavior of all rules is the same.
> > > >
> > > > The list of rules you think of (patterns / actions) is maintained by
> > > > applications (not RTE), and only if they need them. RTE would needlessly
> > duplicate this.
> > > As said before, need more details to understand this. Maybe an example
> > > is better :)
> > 
> > The generic format both RTE and applications might understand is the one
> > described in this API (struct rte_flow_pattern and struct rte_flow_actions).
> > 
> > If we wanted RTE to maintain some sort of per-port state for flow rule
> > specifications, it would have to be a copy of these structures arranged somehow
> > (list or something else).
> > 
> > If we consider that PMDs need to keep a context object associated to a flow
> > rule (the opaque struct rte_flow *), then RTE would most likely have to store it
> > along with the flow specification.
> > 
> > Such a list may not be useful to applications (list lookups take time), so they
> > would implement their own redundant method. They might also require extra
> > room to attach some application context to flow rules. A generic list cannot plan
> > for it.
> > 
> > Applications know what they want to do with flow rules and are responsible for
> > managing them efficiently with RTE out of the way.
> > 
> > I'm not sure if this answered your question, if not, please describe a scenario
> > where a RTE-managed list of flow rules would be mandatory.
> Got your point and agree :)

Thanks.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21 11:06                     ` Chandran, Sugesh
@ 2016-07-21 13:37                       ` Adrien Mazarguil
  2016-07-22 16:32                         ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-21 13:37 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

Hi Sugesh,

I do not have much to add, please see below.

On Thu, Jul 21, 2016 at 11:06:52AM +0000, Chandran, Sugesh wrote:
[...]
> > > RSS hashing support :- Just to confirm, the RSS flow action allows
> > > application to decide the header fields to produce the hash. This
> > > gives programmability on load sharing across different queues. The
> > > application can program the NIC to calculate the RSS hash only using
> > > mac or mac+ ip or ip only using this.
> > 
> > I'd say yes but from your summary, I'm not sure we share the same idea of
> > what the RSS action is supposed to do, so here is mine.
> > 
> > Like all flow rules, the pattern part of the RSS action only filters the packets
> > on which the action will be performed.
> > 
> > The rss_conf parameter (struct rte_eth_rss_conf) only provides a key and a
> > RSS hash function to use (ETH_RSS_IPV4, ETH_RSS_NONFRAG_IPV6_UDP,
> > etc).
> > 
> > Nothing prevents the RSS hash function from being applied to protocol
> > headers which are not necessarily present in the flow rule pattern. These are
> > two independent things, e.g. you could have a pattern matching IPv4 packets
> > yet perform RSS hashing only on UDP headers.
> > 
> > Finally, the RSS action configuration only affects packets coming from this
> > flow rule. It is not performed on the device globally so packets which are not
> > matched are not affected by RSS processing. As a result it might not be
> > possible to configure two flow rules specifying incompatible RSS actions
> > simultaneously if the underlying device supports only a single global RSS
> > context.
> > 
> [Sugesh] Thank you for the explanation. This means I can have a rule that matches
> every incoming packet (an all-fields wildcard rule) and does RSS hashing on selected fields,
> MAC only, IP only or IP & MAC?

Yes, I guess it could even replace the current method for configuring RSS on
a device in a more versatile fashion, but this is a topic for another
debate.
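
Regarding the constraint mentioned earlier (a device with a single global RSS
context cannot hold two flow rules with incompatible RSS actions), here is a
rough sketch of the check a PMD might perform. Everything in it is an
assumption made for illustration: rss_conf_sketch is a simplified stand-in for
struct rte_eth_rss_conf, and rss_dev_sketch and rss_rule_attach() are
hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct rte_eth_rss_conf. */
struct rss_conf_sketch {
	uint64_t hash_fields; /* bit flags selecting the hashed headers */
	uint8_t key[40];      /* hash key */
};

/* Assumed device state for a NIC with one global RSS context. */
struct rss_dev_sketch {
	unsigned int users;            /* flow rules currently using RSS */
	struct rss_conf_sketch active; /* configuration they all share */
};

/*
 * Returns 0 if a new rule may use this RSS configuration, -1 if it is
 * incompatible with the configuration already used by other rules.
 */
static int
rss_rule_attach(struct rss_dev_sketch *dev, const struct rss_conf_sketch *conf)
{
	if (dev->users != 0 &&
	    (dev->active.hash_fields != conf->hash_fields ||
	     memcmp(dev->active.key, conf->key, sizeof(conf->key)) != 0))
		return -1; /* would change RSS behavior of existing rules */
	dev->active = *conf;
	dev->users++;
	return 0;
}
```

A device with several independent RSS contexts would not need this
restriction; the sketch only covers the single-context case.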

Let's implement this API first!

> This can be useful to do a packet lookup in software by just using
> only the hash.

Not sure I fully understand your idea, but I'm positive it could be done
somehow :)

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21  8:13 ` Rahul Lakkireddy
@ 2016-07-21 17:07   ` Adrien Mazarguil
  2016-07-25 11:32     ` Rahul Lakkireddy
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-21 17:07 UTC (permalink / raw)
  To: Rahul Lakkireddy
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern, Kumar Sanghvi,
	Nirranjan Kirubaharan, Indranil Choudhury

Hi Rahul,

Please see below.

On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> Hi Adrien,
> 
> The proposal looks very good.  It satisfies most of the features
> supported by Chelsio NICs.  We are looking for suggestions on exposing
> more additional features supported by Chelsio NICs via this API.
> 
> Chelsio NICs have two regions in which filters can be placed -
> Maskfull and Maskless regions.  As their names imply, the maskfull region
> can accept masks to match a range of values, whereas the maskless region
> doesn't accept any masks and hence performs stricter exact matches.
> Filters without masks can also be placed in the maskfull region.  By
> default, the maskless region has higher priority than the maskfull region.
> However, the priority between the two regions is configurable.

I understand this configuration affects the entire device. Just to be clear,
assuming some filters are already configured, are they affected by a change
of region priority later?

> Please suggest on how we can let the apps configure in which region
> filters must be placed and set the corresponding priority accordingly
> via this API.

Okay. Applications, like customers, are always right.

With this in mind, PMDs are not allowed to break existing flow rules, and
face two options when applications provide a flow specification that would
break an existing rule:

- Refuse to create it (easiest).

- Find a workaround to apply it anyway (possibly quite complicated).

The reason you have these two regions is performance, right? Otherwise I'd
just say put everything in the maskfull region.

PMDs are allowed to rearrange existing rules or change device parameters as
long as existing constraints are satisfied. In my opinion it does not matter
which region has the highest default priority. Applications always want the
best performance so the first created rule should be in the most appropriate
region.

If a subsequent rule requires it to be in the other region but the
application specified the wrong priority for this to work, then the PMD can
either choose to swap region priorities on the device (assuming it does not
affect other rules), or destroy and recreate the original rule in a way that
satisfies all constraints (i.e. moving conflicting rules from the maskless
region to the maskfull one).

Going further, when subsequent rules get destroyed the PMD should ideally
move maskfull rules back into the maskless region for better
performance.

This is only a suggestion. PMDs have the right to say "no" at some point.

More important in my opinion is to make sure applications can create a given
set of flow rules in any order. If rules a/b/c can be created, then it won't
make sense from an application point of view if c/a/b for some reason cannot
and the PMD maintainers will rightfully get a bug report.
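
As a minimal sketch of the initial placement decision, assuming that an
all-ones mask is equivalent to "no mask" and that the maskless region is the
faster one, a PMD could start from something like this. The names
region_sketch, mask_is_exact() and pick_region() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum region_sketch {
	REGION_MASKLESS, /* exact match only */
	REGION_MASKFULL, /* accepts arbitrary masks */
};

/* A mask is "exact" when every bit of the field takes part in the match. */
static int
mask_is_exact(const uint8_t *mask, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (mask[i] != 0xff)
			return 0;
	return 1;
}

/* Prefer the maskless region whenever the rule allows it. */
static enum region_sketch
pick_region(const uint8_t *mask, size_t len)
{
	return mask_is_exact(mask, len) ? REGION_MASKLESS : REGION_MASKFULL;
}
```

This only covers the first choice. Moving rules between regions later to keep
a/b/c and c/a/b equally creatable is the hard part and is not shown.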

> More comments below.
> 
> On Tuesday, July 07/05/16, 2016 at 20:16:46 +0200, Adrien Mazarguil wrote:
> > Hi All,
> > 
> [...]
> 
> > 
> > ``ETH``
> > ^^^^^^^
> > 
> > Matches an Ethernet header.
> > 
> > - ``dst``: destination MAC.
> > - ``src``: source MAC.
> > - ``type``: EtherType.
> > - ``tags``: number of 802.1Q/ad tags defined.
> > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > 
> >  - ``tpid``: Tag protocol identifier.
> >  - ``tci``: Tag control information.
> > 
> > ``IPV4``
> > ^^^^^^^^
> > 
> > Matches an IPv4 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tos``: ToS/DSCP field.
> > - ``ttl``: TTL field.
> > - ``proto``: protocol number for the next layer.
> > 
> > ``IPV6``
> > ^^^^^^^^
> > 
> > Matches an IPv6 header.
> > 
> > - ``src``: source IP address.
> > - ``dst``: destination IP address.
> > - ``tc``: traffic class field.
> > - ``nh``: Next header field (protocol).
> > - ``hop_limit``: hop limit field (TTL).
> > 
> > ``ICMP``
> > ^^^^^^^^
> > 
> > Matches an ICMP header.
> > 
> > - TBD.
> > 
> > ``UDP``
> > ^^^^^^^
> > 
> > Matches a UDP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - ``length``: UDP length.
> > - ``checksum``: UDP checksum.
> > 
> > .. raw:: pdf
> > 
> >    PageBreak
> > 
> > ``TCP``
> > ^^^^^^^
> > 
> > Matches a TCP header.
> > 
> > - ``sport``: source port.
> > - ``dport``: destination port.
> > - All other TCP fields and bits.
> > 
> > ``VXLAN``
> > ^^^^^^^^^
> > 
> > Matches a VXLAN header.
> > 
> > - TBD.
> > 
> 
> In addition to above matches, Chelsio NICs have some additional
> features:
> 
> - Match based on unicast DST-MAC, multicast DST-MAC, broadcast DST-MAC.
>   Also, there is a match criteria available called 'promisc' - which
>   matches packets that are not destined for the interface, but had
>   been received by the hardware due to interface being in promiscuous
>   mode.

There is no unicast/multicast/broadcast distinction in the ETH pattern item,
those are covered by the specified MAC address.
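
For instance, multicast is "destination MAC with the group bit set", which a
spec/mask pair on the dst field can express. The sketch below is
self-contained and uses hypothetical names (eth_addr_sketch, eth_dst_match());
it is not the item definition from the proposal, only an illustration of the
masked comparison.

```c
#include <stdint.h>

struct eth_addr_sketch {
	uint8_t b[6];
};

/* Masked comparison: only bits set in mask take part in the match. */
static int
eth_dst_match(const struct eth_addr_sketch *dst,
	      const struct eth_addr_sketch *spec,
	      const struct eth_addr_sketch *mask)
{
	for (int i = 0; i < 6; i++)
		if ((dst->b[i] & mask->b[i]) != (spec->b[i] & mask->b[i]))
			return 0;
	return 1;
}

/* The group bit is the least significant bit of the first octet. */
static const struct eth_addr_sketch group_bit = { { 0x01, 0, 0, 0, 0, 0 } };
static const struct eth_addr_sketch no_bits = { { 0, 0, 0, 0, 0, 0 } };
static const struct eth_addr_sketch all_bits = {
	{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }
};

static int
is_multicast(const struct eth_addr_sketch *dst)
{
	return eth_dst_match(dst, &group_bit, &group_bit);
}

static int
is_unicast(const struct eth_addr_sketch *dst)
{
	return eth_dst_match(dst, &no_bits, &group_bit);
}

static int
is_broadcast(const struct eth_addr_sketch *dst)
{
	return eth_dst_match(dst, &all_bits, &all_bits);
}
```

Note that the multicast rule also matches broadcast frames, since the
broadcast address has the group bit set. An application that wants to tell
them apart needs the exact-match broadcast rule as well.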

Now about this "promisc" match criterion, it can be added as a new meta
pattern item (4.1.3 Meta item types). Do you want it to be defined from the
start or add it later with the related code in your PMD?

> - Match FCoE packets.

The "oE" part is covered by the ETH item (using the proper Ethertype), but a
new pattern item should probably be added for the FC protocol itself. As
above, I suggest adding it later.

> - Match IP Fragmented packets.

It seems that I missed quite a few fields in the IPv4 item definition
(struct rte_flow_item_ipv4). It should be in there.

> - Match range of physical ports on the NIC in a single rule via masks.
>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
>   ports available on the NIC.

Applications create flow rules per port, I'd thus suggest that the PMD
should detect identical rules created on different ports and aggregate them
as a single HW rule automatically.

If you think this approach is not right, the alternative is a meta pattern
item that provides a list of ports. I'm not sure this is the right approach
considering it would most likely not be supported by most NICs. Applications
may not request it explicitly.
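
To illustrate the automatic aggregation idea, here is a rough sketch in which
the PMD keeps one HW entry per distinct rule and a bitmask of the physical
ports it applies to. All names are hypothetical, the 32-bit key stands in for
whatever digest of pattern and actions a PMD would really use, and physical
port numbers are assumed to be below 32.

```c
#include <stdint.h>

#define HW_RULES_MAX 8

struct hw_rule_sketch {
	uint32_t key;       /* hypothetical digest of pattern + actions */
	uint32_t port_mask; /* physical ports this HW entry applies to */
};

struct hw_table_sketch {
	struct hw_rule_sketch rules[HW_RULES_MAX];
	unsigned int count;
};

/*
 * Add a rule for one physical port. An identical rule already present
 * for another port is extended instead of consuming a new HW entry.
 * Returns the HW entry index, or -1 if the table is full.
 */
static int
hw_rule_add(struct hw_table_sketch *t, uint32_t key, unsigned int phys_port)
{
	for (unsigned int i = 0; i < t->count; i++) {
		if (t->rules[i].key == key) {
			t->rules[i].port_mask |= UINT32_C(1) << phys_port;
			return (int)i;
		}
	}
	if (t->count == HW_RULES_MAX)
		return -1;
	t->rules[t->count].key = key;
	t->rules[t->count].port_mask = UINT32_C(1) << phys_port;
	return (int)t->count++;
}
```

With this, "all UDP packets on ports 3 and 4" is two rte_flow_create() calls
from the application but a single HW entry with port mask 0x18.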

> - Match range of Physical Functions (PFs) on the NIC in a single rule
>   via masks. For ex: match all traffic coming on several PFs.

The PF and VF pattern items assume there is a single PF associated with a
DPDK port. VFs are identified with an ID. I basically took the same
definitions as the existing filter types, perhaps this is not enough for
Chelsio adapters.

Do you expose more than one PF for a DPDK port?

Anyway, I'd suggest the same approach as above, automatic aggregation of
rules for performance reasons, otherwise new or updated PF/VF pattern items,
in which case it would be great if you could provide ideal structure
definitions for this use case.

> Please suggest on how we can expose the above features to DPDK apps via
> this API.
> 
> [...]
> 
> > 
> > Actions
> > ~~~~~~~
> > 
> > Each possible action is represented by a type. Some have associated
> > configuration structures. Several actions combined in a list can be assigned
> > to a flow rule. That list is not ordered.
> > 
> > At least one action must be defined in a filter rule in order to do
> > something with matched packets.
> > 
> > - Actions are defined with ``struct rte_flow_action``.
> > - A list of actions is defined with ``struct rte_flow_actions``.
> > 
> > They fall in three categories:
> > 
> > - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> >   processing matched packets by subsequent flow rules, unless overridden
> >   with PASSTHRU.
> > 
> > - Non terminating actions (PASSTHRU, DUP) that leave matched packets up for
> >   additional processing by subsequent flow rules.
> > 
> > - Other non terminating meta actions that do not affect the fate of packets
> >   (END, VOID, ID, COUNT).
> > 
> > When several actions are combined in a flow rule, they should all have
> > different types (e.g. dropping a packet twice is not possible). However
> > considering the VOID type is an exception to this rule, the defined behavior
> > is for PMDs to only take into account the last action of a given type found
> > in the list. PMDs still perform error checking on the entire list.
> > 
> > *Note that PASSTHRU is the only action able to override a terminating rule.*
> > 
> 
> Chelsio NICs can support an action 'switch' which can re-direct
> matched packets from one port to another port in hardware.  In addition,
> it can also optionally:
> 
> 1. Perform header rewrites (src-mac/dst-mac rewrite, src-mac/dst-mac
>    swap, vlan add/remove/rewrite).
> 
> 2. Perform NAT'ing in hardware (4-tuple rewrite).
> 
> before sending it out on the wire [1].
> 
> To meet the above requirements, we'd need a way to pass sub-actions
> to action 'switch' and a way to pass extra info (such as new
> src-mac/dst-mac, new vlan, new 4-tuple for NAT) to rewrite
> corresponding fields.
> 
> We're looking for suggestions on how we can achieve action 'switch'
> in this new API.
> 
> From our understanding of this API, we could just expand
> rte_flow_action_type with an additional action type
> RTE_FLOW_ACTION_TYPE_SWITCH and define several sub-actions such as:
> 
> enum rte_flow_action_switch_type {
> 	RTE_FLOW_ACTION_SWITCH_TYPE_NONE,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_REWRITE,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_SWAP,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_INSERT,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_DELETE,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_REWRITE,
> 	RTE_FLOW_ACTION_SWITCH_TYPE_NAT_REWRITE,
> };
> 
> and then define an rte_flow_action_switch as follows:
> 
> struct rte_flow_action_switch {
> 	enum rte_flow_action_switch_type type; /* sub actions to perform */
> 	uint16_t port;	/* Destination physical port to switch packet */
> 	enum rte_flow_item_type item_type; /* Fields to rewrite */
> 	const void *switch_spec;
> 	/* Holds info to rewrite matched flows */
> };
> 
> Does the above approach sound right with respect to this new API?

It does. Not sure I'd go with a sublevel of actions for switching types
though. Think more generic, MAC swapping, MAC rewrite and so on could be
defined as separate actions usable on their own by all PMDs, so you'd
combine a SWITCH action with these.

Also be careful with the "port" field definition. I'm sure the Chelsio
switch cannot make a packet leave through a port of a Mellanox device. DPDK
applications are aware of a single "port namespace", so it has to be
translated by the PMD to a physical port on the same Chelsio adapter,
otherwise rule creation should fail.
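
A minimal sketch of that translation step, assuming the PMD has some table
mapping DPDK port IDs to an adapter and a physical port on it. The names
port_info_sketch and switch_translate_port() are hypothetical.

```c
#include <stdint.h>

/* Assumed per-DPDK-port information available to the PMD. */
struct port_info_sketch {
	int adapter_id; /* which physical adapter backs this DPDK port */
	int phys_port;  /* port number on that adapter */
};

/*
 * Translate the destination of a SWITCH action from the DPDK port
 * namespace to a physical port. Returns -1 when the destination is
 * unknown or sits on another adapter, in which case rule creation
 * should fail.
 */
static int
switch_translate_port(const struct port_info_sketch *map,
		      unsigned int nb_ports, unsigned int rule_port,
		      unsigned int dst_port)
{
	if (rule_port >= nb_ports || dst_port >= nb_ports)
		return -1;
	if (map[rule_port].adapter_id != map[dst_port].adapter_id)
		return -1;
	return map[dst_port].phys_port;
}
```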

> [...]
> 
> > 
> > ``COUNT``
> > ^^^^^^^^^
> > 
> > Enables hits counter for this rule.
> > 
> > This counter can be retrieved and reset through ``rte_flow_query()``, see
> > ``struct rte_flow_query_count``.
> > 
> > - Counters can be retrieved with ``rte_flow_query()``.
> > - No configurable property.
> > 
> > +---------------+
> > | COUNT         |
> > +===============+
> > | no properties |
> > +---------------+
> > 
> > Query structure to retrieve and reset the flow rule hits counter:
> > 
> > +------------------------------------------------+
> > | COUNT query                                    |
> > +===========+=====+==============================+
> > | ``reset`` | in  | reset counter after query    |
> > +-----------+-----+------------------------------+
> > | ``hits``  | out | number of hits for this flow |
> > +-----------+-----+------------------------------+
> > 
> 
> Chelsio NICs can also count the number of bytes that hit the rule.
> So, need a counter "bytes".

As well as the number of packets, right? Anyway it makes sense, I'll add
a "bytes" field.
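
Something along these lines, as a sketch only. The reset and hits fields come
from the COUNT query table in the document, bytes is the proposed addition,
and flow_counter_sketch and flow_count_query() are hypothetical names standing
in for PMD internals.

```c
#include <stdint.h>

/* Query structure: "reset" is an input, the counters are outputs. */
struct flow_query_count_sketch {
	int reset;      /* in: reset counters after query */
	uint64_t hits;  /* out: number of packets that hit this rule */
	uint64_t bytes; /* out: number of bytes that hit this rule */
};

/* Assumed per-rule counter state kept by the PMD or read from HW. */
struct flow_counter_sketch {
	uint64_t hits;
	uint64_t bytes;
};

/* Copy counters out, then clear them if the caller asked for it. */
static void
flow_count_query(struct flow_counter_sketch *counter,
		 struct flow_query_count_sketch *query)
{
	query->hits = counter->hits;
	query->bytes = counter->bytes;
	if (query->reset) {
		counter->hits = 0;
		counter->bytes = 0;
	}
}
```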

> [1]  http://www.dpdk.org/ml/archives/dev/2016-February/032605.html

Wow. I did not pay much attention to this thread at the time. Tweaking FDIR
this much was quite an achievement. Now I feel lazy with my proposal for a
brand new API instead, thanks.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-11 10:41 ` Jerin Jacob
@ 2016-07-21 19:20   ` Adrien Mazarguil
  2016-07-23 21:10     ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-07-21 19:20 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu, Jan Medala,
	John Daley, Jing Chen, Konstantin Ananyev, Matej Vido,
	Alejandro Lucero, Sony Chacko, Pablo de Lara, Olga Shern

Hi Jerin,

Sorry, looks like I missed your reply. Please see below.

On Mon, Jul 11, 2016 at 04:11:43PM +0530, Jerin Jacob wrote:
> On Tue, Jul 05, 2016 at 08:16:46PM +0200, Adrien Mazarguil wrote:
> 
> Hi Adrien,
> 
> Overall this proposal looks very good. I could easily map to the
> classification hardware engines I am familiar with.

Great, thanks.

> > Priorities
> > ~~~~~~~~~~
> > 
> > A priority can be assigned to a matching pattern.
> > 
> > The default priority level is 0 and is also the highest. Support for more
> > than a single priority level in hardware is not guaranteed.
> > 
> > If a packet is matched by several filters at a given priority level, the
> > outcome is undefined. It can take any path and can even be duplicated.
> 
> In some cases fatal unrecoverable error too

Right, do you think I need to elaborate regarding unrecoverable errors?

How much unrecoverable by the way? Like not being able to receive any more
packets?

> > Matching pattern items for packet data must be naturally stacked (ordered
> > from lowest to highest protocol layer), as in the following examples:
> > 
> > +--------------+
> > | TCPv4 as L4  |
> > +===+==========+
> > | 0 | Ethernet |
> > +---+----------+
> > | 1 | IPv4     |
> > +---+----------+
> > | 2 | TCP      |
> > +---+----------+
> > 
> > +----------------+
> > | TCPv6 in VXLAN |
> > +===+============+
> > | 0 | Ethernet   |
> > +---+------------+
> > | 1 | IPv4       |
> > +---+------------+
> > | 2 | UDP        |
> > +---+------------+
> > | 3 | VXLAN      |
> > +---+------------+
> > | 4 | Ethernet   |
> > +---+------------+
> > | 5 | IPv6       |
> > +---+------------+
> 
> How about enumerating as "Inner-IPV6" flow type to avoid any confusion. Though spec
> can be same for both IPv6 and Inner-IPV6.

I'm not sure. If we have more than two encapsulated IPv6 headers, knowing
that one of them is "inner" is not really useful. This is why I chose to
enforce the stack ordering instead, I think it makes more sense.

> > | 6 | TCP        |
> > +---+------------+
> > 
> > +-----------------------------+
> > | TCPv4 as L4 with meta items |
> > +===+=========================+
> > | 0 | VOID                    |
> > +---+-------------------------+
> > | 1 | Ethernet                |
> > +---+-------------------------+
> > | 2 | VOID                    |
> > +---+-------------------------+
> > | 3 | IPv4                    |
> > +---+-------------------------+
> > | 4 | TCP                     |
> > +---+-------------------------+
> > | 5 | VOID                    |
> > +---+-------------------------+
> > | 6 | VOID                    |
> > +---+-------------------------+
> > 
> > The above example shows how meta items do not affect packet data matching
> > items, as long as those remain stacked properly. The resulting matching
> > pattern is identical to "TCPv4 as L4".
> > 
> > +----------------+
> > | UDPv6 anywhere |
> > +===+============+
> > | 0 | IPv6       |
> > +---+------------+
> > | 1 | UDP        |
> > +---+------------+
> > 
> > If supported by the PMD, omitting one or several protocol layers at the
> > bottom of the stack as in the above example (missing an Ethernet
> > specification) enables hardware to look anywhere in packets.
> 
> It would be good if the common code can give it as Ethernet, IPV6, UDP
> to PMD(to avoid common code duplication across PMDs)

I left this mostly at PMD's discretion for now. Applications must provide
explicit rules if they need a consistent behavior. PMDs may not support this
at all, I've just documented what applications should expect when attempting
this kind of pattern.

> > It is unspecified whether the payload of supported encapsulations
> > (e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
> > inner, outer or both packets.
> 
> a separate flow type enumeration may fix that problem. like "Inner-IPV6"
> mentioned above.

Not sure about that, for the same reason as above. Which "inner" level would
be matched by such a pattern? Note that it could have started with VXLAN
followed by ETH and then IPv6 if the application cared.

This is basically the ability to remain vague about a rule. I didn't want to
forbid it outright because I'm sure there are possible use cases:

- PMD validation and debugging.

- Rough filtering according to protocols a packet might contain somewhere
  (think of the network admins who cannot stand anything other than packets
  addressed to TCP port 80).

> > +---------------------+
> > | Invalid, missing L3 |
> > +===+=================+
> > | 0 | Ethernet        |
> > +---+-----------------+
> > | 1 | UDP             |
> > +---+-----------------+
> > 
> > The above pattern is invalid due to a missing L3 specification between L2
> > and L4. It is only allowed at the bottom and at the top of the stack.
> > 
> 
> > ``SIGNATURE``
> > ^^^^^^^^^^^^^
> > 
> > Requests hash-based signature dispatching for this rule.
> > 
> > Considering this is a global setting on devices that support it, all
> > subsequent filter rules may have to be created with it as well.
> 
> Can you describe the use case for this and how it's different from
> existing rte_eth devel RSS settings.

Erm, well, this is my attempt at reimplementing RTE_FDIR_MODE_SIGNATURE
without being too sure of what it actually /does/. So far this definition
hasn't raised any eyebrows.

By default this API works like RTE_FDIR_MODE_PERFECT, where protocol headers
are matched with specific patterns. I understand this signature mode as not
giving the same results (a packet that would have matched a pattern in
perfect mode may not match in signature mode and vice versa) as there is
some signature involved at some point for some reason. OK, I really have no
idea.

I'm confident it is not related to RSS though. Perhaps people more familiar
with this mode (ixgbe maintainers) should comment. Maybe this mode is not
all that useful anymore.

> > - Only ``spec`` needs to be defined, ``mask`` is ignored.
> > 
> > +--------------------+
> > | SIGNATURE          |
> > +==========+=========+
> > | ``spec`` | TBD     |
> > +----------+---------+
> > | ``mask`` | ignored |
> > +----------+---------+
> > 
> 
> > 
> > ``ETH``
> > ^^^^^^^
> > 
> > Matches an Ethernet header.
> > 
> > - ``dst``: destination MAC.
> > - ``src``: source MAC.
> > - ``type``: EtherType.
> > - ``tags``: number of 802.1Q/ad tags defined.
> > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > 
> >  - ``tpid``: Tag protocol identifier.
> >  - ``tci``: Tag control information.
> 
> Find below other L2 layer attributes that are useful in HW classification:
> 
> - HiGig headers
> - DSA Headers
> - MPLS
> 
> Maybe we need to introduce a separate flow type with spec to add the support. Right?

Yeah, while I'm only familiar with MPLS, I think it's better to let these
so-called "2.5" protocol layers have their own specifications. I think
missing protocols will appear at the same time as PMD support.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21 12:47               ` Adrien Mazarguil
@ 2016-07-22  1:38                 ` Lu, Wenzhuo
  0 siblings, 0 replies; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-07-22  1:38 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley, Chen,
	Jing D, Ananyev, Konstantin, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, De Lara Guarch, Pablo, Olga Shern

Hi Adrien,


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Thursday, July 21, 2016 8:48 PM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody;
> Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: Re: [RFC] Generic flow director/filtering/classification API
> 
> Hi Wenzhuo,
> 
> It seems that we agree on about everything now, just a few more comments
> below after snipping the now irrelevant parts.
Yes, I think we agree with each other now. And thanks for the additional explanation :)

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21 13:37                       ` Adrien Mazarguil
@ 2016-07-22 16:32                         ` Chandran, Sugesh
  0 siblings, 0 replies; 262+ messages in thread
From: Chandran, Sugesh @ 2016-07-22 16:32 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Zhang, Helin, Wu, Jingjing, Rasesh Mody,
	Ajit Khaparde, Rahul Lakkireddy, Lu, Wenzhuo, Jan Medala,
	John Daley, Chen, Jing D, Ananyev, Konstantin, Matej Vido,
	Alejandro Lucero, Sony Chacko, Jerin Jacob, De Lara Guarch,
	Pablo, Olga Shern, Chilikin, Andrey

Hi Adrien,
Thank you for your effort and for considering the inputs and comments.
The design looks fine to me now.


Regards
_Sugesh



^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21 19:20   ` Adrien Mazarguil
@ 2016-07-23 21:10     ` John Fastabend
  2016-08-02 18:19       ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-07-23 21:10 UTC (permalink / raw)
  To: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

On 16-07-21 12:20 PM, Adrien Mazarguil wrote:
> Hi Jerin,
> 
> Sorry, looks like I missed your reply. Please see below.
> 

Hi Adrien,

Sorry for the slight delay, but I have a few comments that may be worth
considering.

To start with, I completely agree with the general problem statement and
the nice summary of all the current models. This is also a good start.

> 
> Considering that allowed pattern/actions combinations cannot be known in
> advance and would result in an unpractically large number of capabilities to
> expose, a method is provided to validate a given rule from the current
> device configuration state without actually adding it (akin to a "dry run"
> mode).

Rather than having a query/validate process, why did we jump over having
an intermediate representation of the capabilities? Here you state it is
impractical, but we know how to represent parse graphs, and the drivers
could report their supported parse graph via a single query to a middle
layer.

This would actually reduce the message chatter. Imagine many applications
at init time, or boundary cases where a large set of applications comes
online at once and starts banging on the interface all at once; that
seems less than ideal.

Worse, in my opinion, it requires all drivers to write mostly duplicated
validation code, where a common layer could easily do this if every
driver reported a common data structure representing its parse graph
instead. The nice fallout of this initial upfront effort is that the
driver no longer needs to do error handling/checking/etc. and can assume
all rules are correct and valid. It makes driver code much simpler to
support. And, IMO at least, by doing this we get some other nice benefits
described below.
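
A minimal sketch of what such a driver-reported parse graph and a
common-layer validation helper might look like. All names here
(pg_hdr, pg_node, pg_validate) are hypothetical and not part of the
proposed header; the graph only records which header may follow which,
and the rule that a stack starts with ETH is an assumption made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header identifiers; not part of any existing DPDK API. */
enum pg_hdr { PG_ETH, PG_IPV4, PG_IPV6, PG_UDP, PG_TCP, PG_HDR_MAX };

/* One node of a driver-reported parse graph. */
struct pg_node {
	bool supported;     /* the device can parse this header */
	uint32_t next_mask; /* bit N set: header N may follow this one */
};

/* Example graph: ETH -> IPV4 -> {UDP, TCP}; IPv6 is not supported. */
static const struct pg_node pg_example[PG_HDR_MAX] = {
	[PG_ETH]  = { true, 1u << PG_IPV4 },
	[PG_IPV4] = { true, (1u << PG_UDP) | (1u << PG_TCP) },
	[PG_UDP]  = { true, 0 },
	[PG_TCP]  = { true, 0 },
};

/*
 * Common-layer check: walk the requested header stack and verify every
 * header and every transition against the graph. Assumes (for this
 * sketch) that a stack always starts with ETH.
 */
static bool
pg_validate(const struct pg_node *graph, const enum pg_hdr *stack, size_t n)
{
	assert(graph != NULL && stack != NULL);
	if (n == 0 || stack[0] != PG_ETH)
		return false;
	for (size_t i = 0; i < n; i++) {
		unsigned int h = (unsigned int)stack[i];

		if (h >= PG_HDR_MAX || !graph[h].supported)
			return false;
		if (i > 0 && !(graph[stack[i - 1]].next_mask & (1u << h)))
			return false;
	}
	return true;
}
```

With a helper along these lines in the middle layer, a driver would only
ever receive header stacks that its own graph already allows.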

Another related question is about performance.

> Creation
> ~~~~~~~~
> 
> Creating a flow rule is similar to validating one, except the rule is
> actually created.
> 
> ::
> 
>  struct rte_flow *
>  rte_flow_create(uint8_t port_id,
>                  const struct rte_flow_pattern *pattern,
>                  const struct rte_flow_actions *actions);

I gather this implies that each driver must parse the pattern/action
block and map it onto the hardware. How many rules per second can this
support? I've run into systems that expect a level of service somewhere
around 50k cmds per second. Bulking will help at the message level,
but it seems like a lot of overhead to unpack the pattern/action section.

One strategy I've used in other systems that worked relatively well
is this: if the query for the parse graph above returns a key for each
node in the graph, then a single lookup can map the key to a node. It's
unambiguous, and these operations simply become a table lookup.
To be a bit more concrete, this changes the pattern structure in
rte_flow_create() into a <key,value,mask> tuple where the key is known
from the initial parse graph query. If you reserve a set of well-defined
key values for well-known protocols like Ethernet, IP, etc., then the
query model also works, but the middle layer catches errors in this case
and again the driver only gets known-good flows. So something like this,

  struct rte_flow_pattern {
	uint32_t priority;
	uint32_t key;
	uint32_t value_length;
	uint8_t *value;
  };
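
A rough sketch of the key-based lookup, assuming the middle layer keeps
a flat table indexed by key. The names (key_node, kvm_item, key_lookup)
and the key values are hypothetical, and the mask pointer is added here
only because the text mentions a <key,value,mask> tuple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key table entry, as returned by a parse graph query. */
struct key_node {
	const char *name;      /* header/field name, for diagnostics */
	uint32_t value_length; /* expected value/mask length in bytes */
};

/* Hypothetical well-known keys reserved for common protocols. */
enum { KEY_ETH_DST = 0, KEY_IPV4_SRC = 1, KEY_UDP_DPORT = 2, KEY_MAX };

static const struct key_node key_table[KEY_MAX] = {
	[KEY_ETH_DST]   = { "eth.dst", 6 },
	[KEY_IPV4_SRC]  = { "ipv4.src", 4 },
	[KEY_UDP_DPORT] = { "udp.dport", 2 },
};

/* One <key,value,mask> element of a flow pattern. */
struct kvm_item {
	uint32_t key;
	uint32_t value_length;
	const uint8_t *value;
	const uint8_t *mask; /* NULL means exact match */
};

/*
 * Middle-layer check: one indexed lookup maps the key to its node.
 * Returns the node, or NULL when the key is unknown or the value has
 * the wrong size, so the driver only receives well-formed items.
 */
static const struct key_node *
key_lookup(const struct kvm_item *item)
{
	assert(item != NULL);
	if (item->key >= KEY_MAX || item->value == NULL)
		return NULL;
	if (item->value_length != key_table[item->key].value_length)
		return NULL;
	return &key_table[item->key];
}
```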

Also, if we have multiple tables, what do you think about adding a
table_id to the signature? It is probably not needed in the first
generation but is likely useful for hardware with multiple tables, so
that it would be,

   rte_flow_create(uint8_t port_id, uint8_t table_id, ...);

Finally one other problem we've had which would be great to address
if we are doing a rewrite of the API is adding new protocols to
already deployed DPDK stacks. This is mostly a Linux distribution
problem where you can't easily update DPDK.

In the prototype header linked in this document, it seems that adding new
headers requires adding a new enum value to rte_flow_item_type, but there
is at least an attempt at a catch-all here,

> 	/**
> 	 * Matches a string of a given length at a given offset (in bytes),
> 	 * or anywhere in the payload of the current protocol layer
> 	 * (including L2 header if used as the first item in the stack).
> 	 *
> 	 * See struct rte_flow_item_raw.
> 	 */
> 	RTE_FLOW_ITEM_TYPE_RAW,

Actually this is a nice implementation because it works after the
previous item in the stack, correct? So you can put it after "known"
variable-length headers like IP. The limitation is that it can't get past
undefined variable-length headers. However, if you use the parse graph
reporting mechanism described above and the driver always reports
its largest supported graph, then we don't have this issue where a new
hardware SKU/ucode/etc. adds support for new headers but we have no
way to deploy it to existing software users without recompiling and
redeploying.
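
For illustration, here is a small software model of what a RAW item does
relative to the end of the previous item. It is only a sketch: raw_match
and its parameters are hypothetical, and treating the offset as the
starting point in search mode is an assumption, not something the
prototype header specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a RAW item: match `len` bytes of `pattern` at
 * `offset` from the start of the current layer's payload, or at any
 * position from `offset` onwards when `search` is set.
 */
static bool
raw_match(const uint8_t *payload, size_t payload_len,
	  const uint8_t *pattern, size_t len, size_t offset, bool search)
{
	assert(payload != NULL && pattern != NULL);
	/* Reject empty patterns and windows that overrun the payload. */
	if (len == 0 || len > payload_len || offset > payload_len - len)
		return false;
	if (!search)
		return memcmp(payload + offset, pattern, len) == 0;
	/* Search mode: try every position from `offset` onwards. */
	for (size_t i = offset; i + len <= payload_len; i++)
		if (memcmp(payload + i, pattern, len) == 0)
			return true;
	return false;
}
```

The payload pointer is whatever follows the previous item, which is why
this only works when the length of that previous header is known.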

I looked at the git repo but only saw the header definition. I guess
the implementation is TBD until there is enough agreement on the
interface?

Thanks,
John


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-21 17:07   ` Adrien Mazarguil
@ 2016-07-25 11:32     ` Rahul Lakkireddy
  2016-07-25 16:40       ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Rahul Lakkireddy @ 2016-07-25 11:32 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern, Kumar Sanghvi,
	Nirranjan Kirubaharan, Indranil Choudhury

Hi Adrien,

On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote:
> Hi Rahul,
> 
> Please see below.
> 
> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> > Hi Adrien,
> > 
> > The proposal looks very good.  It satisfies most of the features
> > supported by Chelsio NICs.  We are looking for suggestions on exposing
> > more additional features supported by Chelsio NICs via this API.
> > 
> > Chelsio NICs have two regions in which filters can be placed -
> > Maskfull and Maskless regions.  As their names imply, maskfull region
> > can accept masks to match a range of values; whereas, maskless region
> > don't accept any masks and hence perform a more strict exact-matches.
> > Filters without masks can also be placed in maskfull region.  By
> > default, maskless region have higher priority over the maskfull region.
> > However, the priority between the two regions is configurable.
> 
> I understand this configuration affects the entire device. Just to be clear,
> assuming some filters are already configured, are they affected by a change
> of region priority later?
> 

Both regions exist at the same time in the device.  Each filter can
belong to either the maskfull or the maskless region.

The priority is configured at filter creation time for every
individual filter and cannot be changed while the filter is still
active. If the priority needs to be changed for a particular filter,
the filter must be deleted first and then re-created.

> > Please suggest on how we can let the apps configure in which region
> > filters must be placed and set the corresponding priority accordingly
> > via this API.
> 
> Okay. Applications, like customers, are always right.
> 
> With this in mind, PMDs are not allowed to break existing flow rules, and
> face two options when applications provide a flow specification that would
> break an existing rule:
> 
> - Refuse to create it (easiest).
> 
> - Find a workaround to apply it anyway (possibly quite complicated).
> 
> The reason you have these two regions is performance right? Otherwise I'd
> just say put everything in the maskfull region.
> 

Unfortunately, our maskfull region is also extremely small compared to
the maskless region.

> PMDs are allowed to rearrange existing rules or change device parameters as
> long as existing constraints are satisfied. In my opinion it does not matter
> which region has the highest default priority. Applications always want the
> best performance so the first created rule should be in the most appropriate
> region.
> 
> If a subsequent rule requires it to be in the other region but the
> application specified the wrong priority for this to work, then the PMD can
> either choose to swap region priorities on the device (assuming it does not
> affect other rules), or destroy and recreate the original rule in a way that
> satisfies all constraints (i.e. moving conflicting rules from the maskless
> region to the maskfull one).
> 
> Going further, when subsequent rules get destroyed the PMD should ideally
> move back maskfull rules back into the maskless region for better
> performance.
> 
> This is only a suggestion. PMDs have the right to say "no" at some point.
> 

Filter search and deletion are expensive operations and they need to be
atomic in order to not affect existing traffic already hitting the
filters.  Also, Chelsio hardware can support up to ~500,000 maskless
region entries.  So, the cost of search becomes too high for the PMD
when a large number of filter entries are present.

In my opinion, rather than the PMD deciding on priority by itself, it
would be simpler if the apps had the flexibility to configure the
priority between the regions themselves.

> More important in my opinion is to make sure applications can create a given
> set of flow rules in any order. If rules a/b/c can be created, then it won't
> make sense from an application point of view if c/a/b for some reason cannot
> and the PMD maintainers will rightfully get a bug report.
> 
> > More comments below.
> > 
> > On Tuesday, July 07/05/16, 2016 at 20:16:46 +0200, Adrien Mazarguil wrote:
> > > Hi All,
> > > 
> > [...]
> > 
> > > 
> > > ``ETH``
> > > ^^^^^^^
> > > 
> > > Matches an Ethernet header.
> > > 
> > > - ``dst``: destination MAC.
> > > - ``src``: source MAC.
> > > - ``type``: EtherType.
> > > - ``tags``: number of 802.1Q/ad tags defined.
> > > - ``tag[]``: 802.1Q/ad tag definitions, innermost first. For each one:
> > > 
> > >  - ``tpid``: Tag protocol identifier.
> > >  - ``tci``: Tag control information.
> > > 
> > > ``IPV4``
> > > ^^^^^^^^
> > > 
> > > Matches an IPv4 header.
> > > 
> > > - ``src``: source IP address.
> > > - ``dst``: destination IP address.
> > > - ``tos``: ToS/DSCP field.
> > > - ``ttl``: TTL field.
> > > - ``proto``: protocol number for the next layer.
> > > 
> > > ``IPV6``
> > > ^^^^^^^^
> > > 
> > > Matches an IPv6 header.
> > > 
> > > - ``src``: source IP address.
> > > - ``dst``: destination IP address.
> > > - ``tc``: traffic class field.
> > > - ``nh``: Next header field (protocol).
> > > - ``hop_limit``: hop limit field (TTL).
> > > 
> > > ``ICMP``
> > > ^^^^^^^^
> > > 
> > > Matches an ICMP header.
> > > 
> > > - TBD.
> > > 
> > > ``UDP``
> > > ^^^^^^^
> > > 
> > > Matches a UDP header.
> > > 
> > > - ``sport``: source port.
> > > - ``dport``: destination port.
> > > - ``length``: UDP length.
> > > - ``checksum``: UDP checksum.
> > > 
> > > .. raw:: pdf
> > > 
> > >    PageBreak
> > > 
> > > ``TCP``
> > > ^^^^^^^
> > > 
> > > Matches a TCP header.
> > > 
> > > - ``sport``: source port.
> > > - ``dport``: destination port.
> > > - All other TCP fields and bits.
> > > 
> > > ``VXLAN``
> > > ^^^^^^^^^
> > > 
> > > Matches a VXLAN header.
> > > 
> > > - TBD.
> > > 
> > 
> > In addition to above matches, Chelsio NICs have some additional
> > features:
> > 
> > - Match based on unicast DST-MAC, multicast DST-MAC, broadcast DST-MAC.
> >   Also, there is a match criteria available called 'promisc' - which
> >   matches packets that are not destined for the interface, but had
> >   been received by the hardware due to interface being in promiscuous
> >   mode.
> 
> There is no unicast/multicast/broadcast distinction in the ETH pattern item,
> those are covered by the specified MAC address.
> 

Ok.  Makes sense.

> Now about this "promisc" match criteria, it can be added as a new meta
> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
> start or add it later with the related code in your PMD?
> 

It could be added as a meta item.  If there are other interested
parties, it can be added now.  Otherwise, we'll add it with our filtering
related code.

> > - Match FCoE packets.
> 
> The "oE" part is covered by the ETH item (using the proper Ethertype), but a
> new pattern item should probably be added for the FC protocol itself. As
> above, I suggest adding it later.
> 

Same here.

> > - Match IP Fragmented packets.
> 
> It seems that I missed quite a few fields in the IPv4 item definition
> (struct rte_flow_item_ipv4). It should be in there.
> 

This raises another interesting question.  What should the PMD do
if it supports only a subset of the fields in a particular item?

For example, if a rule is sent to match IP fragmentation along
with several other IPv4 fields, and the underlying hardware doesn't
support matching on IP fragmentation, should the PMD reject the
complete rule even though it could have matched on the rest of the
IPv4 fields?

> > - Match range of physical ports on the NIC in a single rule via masks.
> >   For ex: match all UDP packets coming on ports 3 and 4 out of 4
> >   ports available on the NIC.
> 
> Applications create flow rules per port, I'd thus suggest that the PMD
> should detect identical rules created on different ports and aggregate them
> as a single HW rule automatically.
> 
> If you think this approach is not right, the alternative is a meta pattern
> item that provides a list of ports. I'm not sure this is the right approach
> considering it would most likely not be supported by most NICs. Applications
> may not request it explicitly.
> 

Aggregating via the PMD would be an expensive operation since it would
involve:
- Searching existing filters.
- Deleting those filters.
- Creating a single combined filter.

All three operations would need to be atomic so as not to affect
existing traffic hitting those filters. Adding a meta item would be a
simpler solution here.

> > - Match range of Physical Functions (PFs) on the NIC in a single rule
> >   via masks. For ex: match all traffic coming on several PFs.
> 
> The PF and VF pattern items assume there is a single PF associated with a
> DPDK port. VFs are identified with an ID. I basically took the same
> definitions as the existing filter types, perhaps this is not enough for
> Chelsio adapters.
> 
> Do you expose more than one PF for a DPDK port?
> 
> Anyway, I'd suggest the same approach as above, automatic aggregation of
> rules for performance reasons, otherwise new or updated PF/VF pattern items,
> in which case it would be great if you could provide ideal structure
> definitions for this use case.
> 

In Chelsio hardware, all the ports of a device are exposed via a single
PF4. There could be many VFs attached to a PF.  Physical NIC functions
are operational on PF4, while VFs can be attached to PFs 0-3.
So, Chelsio hardware is not tied to a one-to-one PF-to-port mapping
assumption.

There already seems to be a PF meta-item, but it doesn't seem to accept
any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
seem to accept a "mask" field.  We could probably enable these fields
in the PF and VF meta-items to allow configuration.
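
A minimal sketch of how a VF meta-item carrying both a "spec" and a
"mask" could be evaluated. The structure and function names are
hypothetical; the prototype header does not define them in this form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VF meta-item payload, used for both spec and mask. */
struct vf_item {
	uint32_t id;
};

/*
 * Returns true when `vf` matches `spec` under `mask`. A NULL mask is
 * treated as "all bits significant", i.e. an exact VF ID match.
 */
static bool
vf_item_match(uint32_t vf, const struct vf_item *spec,
	      const struct vf_item *mask)
{
	uint32_t m = mask != NULL ? mask->id : UINT32_MAX;

	assert(spec != NULL);
	return (vf & m) == (spec->id & m);
}
```

With spec 4 and mask 0xfffffffc, for example, a single rule would cover
VFs 4 to 7, which is the kind of range matching described above.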

> > Please suggest on how we can expose the above features to DPDK apps via
> > this API.
> > 
> > [...]
> > 
> > > 
> > > Actions
> > > ~~~~~~~
> > > 
> > > Each possible action is represented by a type. Some have associated
> > > configuration structures. Several actions combined in a list can be affected
> > > to a flow rule. That list is not ordered.
> > > 
> > > At least one action must be defined in a filter rule in order to do
> > > something with matched packets.
> > > 
> > > - Actions are defined with ``struct rte_flow_action``.
> > > - A list of actions is defined with ``struct rte_flow_actions``.
> > > 
> > > They fall in three categories:
> > > 
> > > - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> > >   processing matched packets by subsequent flow rules, unless overridden
> > >   with PASSTHRU.
> > > 
> > > - Non terminating actions (PASSTHRU, DUP) that leave matched packets up for
> > >   additional processing by subsequent flow rules.
> > > 
> > > - Other non terminating meta actions that do not affect the fate of packets
> > >   (END, VOID, ID, COUNT).
> > > 
> > > When several actions are combined in a flow rule, they should all have
> > > different types (e.g. dropping a packet twice is not possible). However
> > > considering the VOID type is an exception to this rule, the defined behavior
> > > is for PMDs to only take into account the last action of a given type found
> > > in the list. PMDs still perform error checking on the entire list.
> > > 
> > > *Note that PASSTHRU is the only action able to override a terminating rule.*
> > > 
> > 
> > Chelsio NICs can support an action 'switch' which can re-direct
> > matched packets from one port to another port in hardware.  In addition,
> > it can also optionally:
> > 
> > 1. Perform header rewrites (src-mac/dst-mac rewrite, src-mac/dst-mac
> >    swap, vlan add/remove/rewrite).
> > 
> > 2. Perform NAT'ing in hardware (4-tuple rewrite).
> > 
> > before sending it out on the wire [1].
> > 
> > To meet the above requirements, we'd need a way to pass sub-actions
> > to action 'switch' and a way to pass extra info (such as new
> > src-mac/dst-mac, new vlan, new 4-tuple for NAT) to rewrite
> > corresponding fields.
> > 
> > We're looking for suggestions on how we can achieve action 'switch'
> > in this new API.
> > 
> > From our understanding of this API, we could just expand
> > rte_flow_action_type with an additional action type
> > RTE_FLOW_ACTION_TYPE_SWITCH and define several sub-actions such as:
> > 
> > enum rte_flow_action_switch_type {
> >         RTE_FLOW_ACTION_SWITCH_TYPE_NONE,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_REWRITE,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_MAC_SWAP,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_INSERT,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_DELETE,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_VLAN_REWRITE,
> > 	RTE_FLOW_ACTION_SWITCH_TYPE_NAT_REWRITE,
> > };
> > 
> > and then define an rte_flow_action_switch as follows:
> > 
> > struct rte_flow_action_switch {
> > 	enum rte_flow_action_switch_type type; /* sub actions to perform */
> > 	uint16_t port;	/* Destination physical port to switch packet */
> > 	enum rte_flow_item_type	 type; /* Fields to rewrite */
> > 	const void *switch_spec;
> > 	/* Holds info to rewrite matched flows */
> > };
> > 
> > Does the above approach sound right with respect to this new API?
> 
> It does. Not sure I'd go with a sublevel of actions for switching types
> though. Think more generic, MAC swapping, MAC rewrite and so on could be
> defined as separate actions usable on their own by all PMDs, so you'd
> combine a SWITCH action with these.
> 

Ok.  Separate actions seem better.
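
A minimal sketch of what combining separate actions might look like,
assuming an END-terminated action list. The action types and structures
below are hypothetical placeholders, not definitions from the prototype
header. The lookup keeps the last action of a given type, following the
rule quoted above that PMDs only take the last action of a type into
account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types; SWITCH is combined with rewrite actions. */
enum act_type {
	ACT_END,
	ACT_VOID,
	ACT_SWITCH,
	ACT_MAC_SWAP,
	ACT_VLAN_INSERT,
};

struct act {
	enum act_type type;
	const void *conf; /* type-specific configuration, may be NULL */
};

struct act_switch { uint16_t port; };     /* destination physical port */
struct act_vlan_insert { uint16_t tci; }; /* tag control information */

/* Returns the last action of `type` in an END-terminated list. */
static const struct act *
act_find(const struct act *list, enum act_type type)
{
	const struct act *found = NULL;

	assert(list != NULL);
	for (; list->type != ACT_END; list++)
		if (list->type == type)
			found = list;
	return found;
}
```

A rule that switches to another port, swaps MAC addresses and inserts a
VLAN tag would then simply list SWITCH, MAC_SWAP and VLAN_INSERT one
after the other, each usable on its own by other PMDs.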

> Also be careful with the "port" field definition. I'm sure the Chelsio
> switch cannot make a packet leave through a port of a Mellanox device. DPDK
> applications are aware of a single "port namespace", so it has to be
> translated by the PMD to a physical port on the same Chelsio adapter,
> otherwise rule creation should fail.
> 

Agreed.

> > [...]
> > 
> > > 
> > > ``COUNT``
> > > ^^^^^^^^^
> > > 
> > > Enables hits counter for this rule.
> > > 
> > > This counter can be retrieved and reset through ``rte_flow_query()``, see
> > > ``struct rte_flow_query_count``.
> > > 
> > > - Counters can be retrieved with ``rte_flow_query()``.
> > > - No configurable property.
> > > 
> > > +---------------+
> > > | COUNT         |
> > > +===============+
> > > | no properties |
> > > +---------------+
> > > 
> > > Query structure to retrieve and reset the flow rule hits counter:
> > > 
> > > +------------------------------------------------+
> > > | COUNT query                                    |
> > > +===========+=====+==============================+
> > > | ``reset`` | in  | reset counter after query    |
> > > +-----------+-----+------------------------------+
> > > | ``hits``  | out | number of hits for this flow |
> > > +-----------+-----+------------------------------+
> > > 
> > 
> > Chelsio NICs can also count the number of bytes that hit the rule.
> > So, need a counter "bytes".
> 
> As well as the number of packets, right? Anyway it makes sense, I'll add
> a "bytes" field.
> 

Yes.  I hope 'hits' is analogous to 'number of packets' right?
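
Assuming "hits" does count packets, a COUNT query extended with a
"bytes" field might behave as in the sketch below. The structure and
function names are hypothetical and only model the reset/hits/bytes
semantics from the table above in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-rule counter state kept by a PMD. */
struct flow_counter {
	uint64_t hits;  /* number of packets that hit the rule */
	uint64_t bytes; /* number of bytes in those packets */
};

/* Hypothetical query structure: `reset` is input, the rest is output. */
struct flow_query_count {
	bool reset;
	uint64_t hits;
	uint64_t bytes;
};

/* Account for one matched packet of `pkt_len` bytes. */
static void
flow_counter_update(struct flow_counter *c, uint32_t pkt_len)
{
	assert(c != NULL);
	c->hits++;
	c->bytes += pkt_len;
}

/* Copy the counters out, then clear them if a reset was requested. */
static void
flow_counter_query(struct flow_counter *c, struct flow_query_count *q)
{
	assert(c != NULL && q != NULL);
	q->hits = c->hits;
	q->bytes = c->bytes;
	if (q->reset) {
		c->hits = 0;
		c->bytes = 0;
	}
}
```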

> > [1]  http://www.dpdk.org/ml/archives/dev/2016-February/032605.html
> 
> Wow. I've not paid much attention to this thread at the time. Tweaking FDIR
> this much was quite an achievement. Now I feel lazy with my proposal for a
> brand new API instead, thanks.

Thanks.  A generic filtering infrastructure was what I was aiming for
and this proposal seems to have hit the mark.

Thanks,
Rahul


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-25 11:32     ` Rahul Lakkireddy
@ 2016-07-25 16:40       ` John Fastabend
  2016-07-26 10:07         ` Rahul Lakkireddy
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-07-25 16:40 UTC (permalink / raw)
  To: Rahul Lakkireddy, Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern, Kumar Sanghvi,
	Nirranjan Kirubaharan, Indranil Choudhury

On 16-07-25 04:32 AM, Rahul Lakkireddy wrote:
> Hi Adrien,
> 
> On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote:
>> Hi Rahul,
>>
>> Please see below.
>>
>> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
>>> Hi Adrien,
>>>
>>> The proposal looks very good.  It satisfies most of the features
>>> supported by Chelsio NICs.  We are looking for suggestions on exposing
>>> more additional features supported by Chelsio NICs via this API.
>>>
>>> Chelsio NICs have two regions in which filters can be placed -
>>> Maskfull and Maskless regions.  As their names imply, maskfull region
>>> can accept masks to match a range of values; whereas, maskless region
>>> don't accept any masks and hence perform a more strict exact-matches.
>>> Filters without masks can also be placed in maskfull region.  By
>>> default, maskless region have higher priority over the maskfull region.
>>> However, the priority between the two regions is configurable.
>>
>> I understand this configuration affects the entire device. Just to be clear,
>> assuming some filters are already configured, are they affected by a change
>> of region priority later?
>>
> 
> Both the regions exist at the same time in the device.  Each filter can
> either belong to maskfull or the maskless region.
> 
> The priority is configured at time of filter creation for every
> individual filter and cannot be changed while the filter is still
> active. If priority needs to be changed for a particular filter then,
> it needs to be deleted first and re-created.

Could you model this as two tables and add a table_id to the API? This
way user space could populate the table it chooses. We would have to add
some capability attributes to "learn" whether tables support masks or
not, though.

I don't see how the PMD can sort this out in any meaningful way; it
has to be exposed to the application, which has the intelligence to
"know" the priorities between masked and non-masked filters. I'm sure
you could come up with something, but I would guess it would be less
than ideal in many cases, and we can't have the driver getting
priorities wrong or we may not get the correct behavior.
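
A minimal sketch of the table idea, assuming each table advertises
whether it supports masks and how many entries it has left. All names
are hypothetical, and the selection policy (keep exact-match rules out
of the small masked table when possible) is just one thing an
application might choose to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table capability attributes reported by a PMD. */
struct table_caps {
	uint8_t table_id;
	bool supports_masks;
	uint32_t max_entries;
	uint32_t used;
};

/*
 * Application-side helper: returns the table_id to use for a rule, or
 * -1 if no table can hold it. Exact-match rules prefer a maskless
 * table and only fall back to a masked table when necessary.
 */
static int
table_pick(const struct table_caps *tables, size_t n, bool needs_mask)
{
	int fallback = -1;

	assert(tables != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tables[i].used >= tables[i].max_entries)
			continue; /* table is full */
		if (needs_mask) {
			if (tables[i].supports_masks)
				return tables[i].table_id;
		} else if (!tables[i].supports_masks) {
			return tables[i].table_id;
		} else if (fallback < 0) {
			fallback = tables[i].table_id;
		}
	}
	return fallback;
}
```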

> 
>>> Please suggest on how we can let the apps configure in which region
>>> filters must be placed and set the corresponding priority accordingly
>>> via this API.
>>
>> Okay. Applications, like customers, are always right.
>>
>> With this in mind, PMDs are not allowed to break existing flow rules, and
>> face two options when applications provide a flow specification that would
>> break an existing rule:
>>
>> - Refuse to create it (easiest).
>>
>> - Find a workaround to apply it anyway (possibly quite complicated).
>>
>> The reason you have these two regions is performance right? Otherwise I'd
>> just say put everything in the maskfull region.
>>
> 
> Unfortunately, our maskfull region is extremely small too compared to
> maskless region.
> 

To me this means a userspace application would want to pack it
carefully to get the full benefit. So you need some mechanism to specify
the "region", hence the table proposal above.

>> PMDs are allowed to rearrange existing rules or change device parameters as
>> long as existing constraints are satisfied. In my opinion it does not matter
>> which region has the highest default priority. Applications always want the
>> best performance so the first created rule should be in the most appropriate
>> region.
>>
>> If a subsequent rule requires it to be in the other region but the
>> application specified the wrong priority for this to work, then the PMD can
>> either choose to swap region priorities on the device (assuming it does not
>> affect other rules), or destroy and recreate the original rule in a way that
>> satisfies all constraints (i.e. moving conflicting rules from the maskless
>> region to the maskfull one).
>>
>> Going further, when subsequent rules get destroyed the PMD should ideally
>> move back maskfull rules back into the maskless region for better
>> performance.
>>
>> This is only a suggestion. PMDs have the right to say "no" at some point.
>>
> 
> Filter search and deletion are expensive operations and they need to be
> atomic in order to not affect existing traffic already hitting the
> filters.  Also, Chelsio hardware can support upto ~500,000 maskless
> region entries.  So, the cost of search becomes too high for the PMD
> when there are large number of filter entries present.
> 
> In my opinion, rather than PMD deciding on priority by itself, it would
> be more simpler if the apps have the flexibility to configure the
> priority between the regions by themselves?
> 

Agreed. I think this will be a common problem, especially as the
hardware tables get bigger and support masks, where priority becomes
important to distinguish between multiple matching rules.

+1 for having a priority field.
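
To illustrate why a priority field matters once masks are involved,
here is a small sketch in which two masked rules both match the same
packet. The names are hypothetical, and "lower value wins" is an
assumption made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masked rule; a lower `priority` value wins (assumed). */
struct prio_rule {
	uint32_t priority;
	uint32_t value;
	uint32_t mask;
	int queue; /* action: destination queue */
};

/* Returns the best-priority rule matching `field`, or NULL if none. */
static const struct prio_rule *
prio_classify(const struct prio_rule *rules, size_t n, uint32_t field)
{
	const struct prio_rule *best = NULL;

	assert(rules != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((field & rules[i].mask) != (rules[i].value & rules[i].mask))
			continue; /* rule does not match */
		if (best == NULL || rules[i].priority < best->priority)
			best = &rules[i];
	}
	return best;
}
```

Without the priority value, an address covered by both a /8 and a /16
rule would have no defined winner.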

>> More important in my opinion is to make sure applications can create a given
>> set of flow rules in any order. If rules a/b/c can be created, then it won't
>> make sense from an application point of view if c/a/b for some reason cannot
>> and the PMD maintainers will rightfully get a bug report.

[...]

> 
>> Now about this "promisc" match criteria, it can be added as a new meta
>> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
>> start or add it later with the related code in your PMD?
>>
> 
> It could be added as a meta item.  If there are other interested
> parties, it can be added now.  Otherwise, we'll add it with our filtering
> related code.
> 

Hmm, I guess by "promisc" here you mean matching packets received from
the wire before they have been switched by the silicon?

>>> - Match FCoE packets.
>>
>> The "oE" part is covered by the ETH item (using the proper Ethertype), but a
>> new pattern item should probably be added for the FC protocol itself. As
>> above, I suggest adding it later.
>>
> 
> Same here.
> 
>>> - Match IP Fragmented packets.
>>
>> It seems that I missed quite a few fields in the IPv4 item definition
>> (struct rte_flow_item_ipv4). It should be in there.
>>
> 
> This raises another interesting question.  What should the PMD do
> if it has support to only a subset of fields in the particular item?
> 
> For example, if a rule has been sent to match IP fragmentation along
> with several other IPv4 fields, and if the underlying hardware doesn't
> support matching based on IP fragmentation, does the PMD reject the
> complete rule although it could have done the matching for rest of the
> IPv4 fields?

I think it has to fail the command; otherwise user space will not have
any way to understand that the full match criteria cannot be met, and
we will get different behavior for the same applications on different
NICs depending on the hardware feature set. This will most likely break
applications, so we need the error IMO.
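
A minimal sketch of that "fail the whole rule" behavior, assuming the
PMD knows which IPv4 fields its hardware can match. The flags, the
capability structure and the function name are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-field flags for an IPv4 pattern item. */
#define IPV4_F_SRC   (1u << 0)
#define IPV4_F_DST   (1u << 1)
#define IPV4_F_PROTO (1u << 2)
#define IPV4_F_FRAG  (1u << 3)

/* Hypothetical device capabilities: IPv4 fields that can be matched. */
struct pmd_caps {
	uint32_t ipv4_fields;
};

/*
 * Returns 0 when every requested field is supported, -ENOTSUP as soon
 * as a single requested field is not. The rule is never partially
 * applied, so applications see the same semantics on every NIC.
 */
static int
ipv4_item_check(const struct pmd_caps *caps, uint32_t requested)
{
	assert(caps != NULL);
	if (requested & ~caps->ipv4_fields)
		return -ENOTSUP;
	return 0;
}
```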

> 
>>> - Match range of physical ports on the NIC in a single rule via masks.
>>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
>>>   ports available on the NIC.
>>
>> Applications create flow rules per port, I'd thus suggest that the PMD
>> should detect identical rules created on different ports and aggregate them
>> as a single HW rule automatically.
>>
>> If you think this approach is not right, the alternative is a meta pattern
>> item that provides a list of ports. I'm not sure this is the right approach
>> considering it would most likely not be supported by most NICs. Applications
>> may not request it explicitly.
>>
> 
> Aggregating via PMD will be expensive operation since it would involve:
> - Search of existing filters.
> - Deleting those filters.
> - Creating a single combined filter.
> 
> And all of above 3 operations would need to be atomic so as not to
> affect existing traffic which is hitting above filters. Adding a
> meta item would be a simpler solution here.
> 

For this, adding a meta-data item seems simplest to me. And if you make
the default a single port, that would maybe make it easier for existing
apps to port from flow director. Then, if an application cares, it can
create a list of ports as needed.
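
A minimal sketch of such a meta item, assuming physical ports are
expressed as a bitmask and that a missing item means "only the port the
rule was created on". The structure and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical meta item: bit N set means physical port N matches. */
struct ports_item {
	uint32_t port_mask;
};

/*
 * Returns true when a packet received on `rx_port` matches. When no
 * item is given, the rule only applies to `default_port`, which keeps
 * the single-port behavior existing applications expect.
 */
static bool
ports_item_match(const struct ports_item *item, unsigned int rx_port,
		 unsigned int default_port)
{
	uint32_t mask;

	assert(default_port < 32);
	mask = item != NULL ? item->port_mask : (1u << default_port);
	return rx_port < 32 && ((mask >> rx_port) & 1u) != 0;
}
```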

>>> - Match range of Physical Functions (PFs) on the NIC in a single rule
>>>   via masks. For ex: match all traffic coming on several PFs.
>>
>> The PF and VF pattern items assume there is a single PF associated with a
>> DPDK port. VFs are identified with an ID. I basically took the same
>> definitions as the existing filter types, perhaps this is not enough for
>> Chelsio adapters.
>>
>> Do you expose more than one PF for a DPDK port?
>>
>> Anyway, I'd suggest the same approach as above, automatic aggregation of
>> rules for performance reasons, otherwise new or updated PF/VF pattern items,
>> in which case it would be great if you could provide ideal structure
>> definitions for this use case.
>>
> 
> In Chelsio hardware, all the ports of a device are exposed via single
> PF4. There could be many VFs attached to a PF.  Physical NIC functions
> are operational on PF4, while VFs can be attached to PFs 0-3.
> So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
> mapping assumption.
> 
> There already seems to be a PF meta-item, but it doesn't seem to accept
> any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
> seem to accept a "mask" field.  We could probably enable these fields
> in the PF and VF meta-items to allow configuration.

Maybe a range field would help here as well, so you could specify a VF
range? It might be one of the things to consider adding later, though,
if there is no clear use for it now.

> 
>>> Please suggest on how we can expose the above features to DPDK apps via
>>> this API.
>>>

[...]

Thanks,
John


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-25 16:40       ` John Fastabend
@ 2016-07-26 10:07         ` Rahul Lakkireddy
  2016-08-03 16:44           ` Adrien Mazarguil
  2016-08-19 21:13           ` John Daley (johndale)
  0 siblings, 2 replies; 262+ messages in thread
From: Rahul Lakkireddy @ 2016-07-26 10:07 UTC (permalink / raw)
  To: John Fastabend, Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

On Monday, July 07/25/16, 2016 at 09:40:02 -0700, John Fastabend wrote:
> On 16-07-25 04:32 AM, Rahul Lakkireddy wrote:
> > Hi Adrien,
> > 
> > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote:
> >> Hi Rahul,
> >>
> >> Please see below.
> >>
> >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> >>> Hi Adrien,
> >>>
> >>> The proposal looks very good.  It satisfies most of the features
> >>> supported by Chelsio NICs.  We are looking for suggestions on exposing
> >>> more additional features supported by Chelsio NICs via this API.
> >>>
> >>> Chelsio NICs have two regions in which filters can be placed -
> >>> Maskfull and Maskless regions.  As their names imply, maskfull region
> >>> can accept masks to match a range of values; whereas, maskless region
> >>> don't accept any masks and hence perform a more strict exact-matches.
> >>> Filters without masks can also be placed in maskfull region.  By
> >>> default, maskless region have higher priority over the maskfull region.
> >>> However, the priority between the two regions is configurable.
> >>
> >> I understand this configuration affects the entire device. Just to be clear,
> >> assuming some filters are already configured, are they affected by a change
> >> of region priority later?
> >>
> > 
> > Both the regions exist at the same time in the device.  Each filter can
> > either belong to maskfull or the maskless region.
> > 
> > The priority is configured at time of filter creation for every
> > individual filter and cannot be changed while the filter is still
> > active. If priority needs to be changed for a particular filter then,
> > it needs to be deleted first and re-created.
> 
> Could you model this as two tables and add a table_id to the API? This
> way user space could populate the table it chooses. We would have to add
> some capabilities attributes to "learn" if tables support masks or not
> though.
> 

This approach sounds interesting.

> I don't see how the PMD can sort this out in any meaningful way and it
> has to be exposed to the application that has the intelligence to 'know'
> priorities between masks and non-masks filters. I'm sure you could come
> up with something but it would be less than ideal in many cases I would
> guess and we can't have the driver getting priorities wrong or we may
> not get the correct behavior.
> 
> > 
> >>> Please suggest on how we can let the apps configure in which region
> >>> filters must be placed and set the corresponding priority accordingly
> >>> via this API.
> >>
> >> Okay. Applications, like customers, are always right.
> >>
> >> With this in mind, PMDs are not allowed to break existing flow rules, and
> >> face two options when applications provide a flow specification that would
> >> break an existing rule:
> >>
> >> - Refuse to create it (easiest).
> >>
> >> - Find a workaround to apply it anyway (possibly quite complicated).
> >>
> >> The reason you have these two regions is performance right? Otherwise I'd
> >> just say put everything in the maskfull region.
> >>
> > 
> > Unfortunately, our maskfull region is extremely small too compared to
> > maskless region.
> > 
> 
> To me this means a userspace application would want to pack it
> carefully to get the full benefit. So you need some mechanism to specify
> the "region" hence the above table proposal.
> 

Right. Makes sense.
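
To illustrate the table idea, here is a minimal sketch of how an
application might choose a table once per-table capabilities are exposed.
Everything in it is an assumption made for illustration: the
`flow_table_caps` structure, its fields and `pick_table()` are not part of
the proposed API. It only encodes what was said above: rules with masks
need the maskfull region, while rules without masks may go in either
region and are better kept out of the much smaller maskfull one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table capabilities an application could query. */
struct flow_table_caps {
	uint8_t table_id;
	bool supports_masks;  /* true for a maskfull region */
	uint32_t free_slots;  /* remaining rule capacity */
};

/*
 * Return the table_id to use for a rule, or -1 if no table can hold it.
 * Rules without masks prefer a maskless table and only fall back to a
 * maskfull one when no maskless table has room left.
 */
static int
pick_table(const struct flow_table_caps *tables, size_t n, bool rule_has_mask)
{
	int fallback = -1;
	size_t i;

	assert(tables != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (tables[i].free_slots == 0)
			continue;
		if (rule_has_mask) {
			if (tables[i].supports_masks)
				return tables[i].table_id;
		} else if (!tables[i].supports_masks) {
			return tables[i].table_id;
		} else if (fallback < 0) {
			fallback = tables[i].table_id;
		}
	}
	return fallback;
}
```

The returned value would then be passed as the table_id argument suggested
earlier in the thread.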

[...]
> >> Now about this "promisc" match criteria, it can be added as a new meta
> >> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
> >> start or add it later with the related code in your PMD?
> >>
> > 
> > It could be added as a meta item.  If there are other interested
> > parties, it can be added now.  Otherwise, we'll add it with our filtering
> > related code.
> > 
> 
> hmm I guess by "promisc" here you mean match packets received from the
> wire before they have been switched by the silicon?
> 

Match packets received from the wire before they have been switched by
the silicon. This also includes packets not destined for the DUT that were
still received because the interface is in promiscuous mode.

> >>> - Match FCoE packets.
> >>
> >> The "oE" part is covered by the ETH item (using the proper Ethertype), but a
> >> new pattern item should probably be added for the FC protocol itself. As
> >> above, I suggest adding it later.
> >>
> > 
> > Same here.
> > 
> >>> - Match IP Fragmented packets.
> >>
> >> It seems that I missed quite a few fields in the IPv4 item definition
> >> (struct rte_flow_item_ipv4). It should be in there.
> >>
> > 
> > This raises another interesting question.  What should the PMD do
> > if it has support to only a subset of fields in the particular item?
> > 
> > For example, if a rule has been sent to match IP fragmentation along
> > with several other IPv4 fields, and if the underlying hardware doesn't
> > support matching based on IP fragmentation, does the PMD reject the
> > complete rule although it could have done the matching for rest of the
> > IPv4 fields?
> 
> I think it has to fail the command other wise user space will not have
> any way to understand that the full match criteria can not be met and
> we will get different behavior for the same applications on different
> nics depending on hardware feature set. This will most likely break
> applications so we need the error IMO.
> 

Ok. Makes sense.

> > 
> >>> - Match range of physical ports on the NIC in a single rule via masks.
> >>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
> >>>   ports available on the NIC.
> >>
> >> Applications create flow rules per port, I'd thus suggest that the PMD
> >> should detect identical rules created on different ports and aggregate them
> >> as a single HW rule automatically.
> >>
> >> If you think this approach is not right, the alternative is a meta pattern
> >> item that provides a list of ports. I'm not sure this is the right approach
> >> considering it would most likely not be supported by most NICs. Applications
> >> may not request it explicitly.
> >>
> > 
> > Aggregating via PMD will be expensive operation since it would involve:
> > - Search of existing filters.
> > - Deleting those filters.
> > - Creating a single combined filter.
> > 
> > And all of above 3 operations would need to be atomic so as not to
> > affect existing traffic which is hitting above filters. Adding a
> > meta item would be a simpler solution here.
> > 
> 
> For this adding a meta-data item seems simplest to me. And if you want
> to make the default to be only a single port that would maybe make it
> easier for existing apps to port from flow director. Then if an
> application cares it can create a list of ports if needed.
> 

Agreed.

> >>> - Match range of Physical Functions (PFs) on the NIC in a single rule
> >>>   via masks. For ex: match all traffic coming on several PFs.
> >>
> >> The PF and VF pattern items assume there is a single PF associated with a
> >> DPDK port. VFs are identified with an ID. I basically took the same
> >> definitions as the existing filter types, perhaps this is not enough for
> >> Chelsio adapters.
> >>
> >> Do you expose more than one PF for a DPDK port?
> >>
> >> Anyway, I'd suggest the same approach as above, automatic aggregation of
> >> rules for performance reasons, otherwise new or updated PF/VF pattern items,
> >> in which case it would be great if you could provide ideal structure
> >> definitions for this use case.
> >>
> > 
> > In Chelsio hardware, all the ports of a device are exposed via single
> > PF4. There could be many VFs attached to a PF.  Physical NIC functions
> > are operational on PF4, while VFs can be attached to PFs 0-3.
> > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
> > mapping assumption.
> > 
> > There already seems to be a PF meta-item, but it doesn't seem to accept
> > any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
> > seem to accept a "mask" field.  We could probably enable these fields
> > in the PF and VF meta-items to allow configuration.
> 
> Maybe a range field would help here as well? So you could specify a VF
> range. It might be one of the things to consider adding later though if
> there is no clear use for it now.
> 

A VF-value and VF-mask pair would help to achieve the desired filter.
The VF-mask would also make it possible to specify a range of VF values.
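
As a rough sketch of the value/mask semantics (the `vf_match` structure and
`vf_match_hit()` are hypothetical names used for illustration, not part of
the proposal), a VF ID matches when it equals the value on every bit set in
the mask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VF match criterion. */
struct vf_match {
	uint32_t value; /* VF ID bits to compare against */
	uint32_t mask;  /* bits of the VF ID that take part in the comparison */
};

/* Return true when vf_id equals value on every bit set in mask. */
static bool
vf_match_hit(const struct vf_match *m, uint32_t vf_id)
{
	assert(m != NULL);
	return (vf_id & m->mask) == (m->value & m->mask);
}
```

With value 4 and the two low bits cleared in the mask, VFs 4 to 7 match and
everything else is rejected. Note that a mask can only describe ranges
aligned on a power of two; an arbitrary range such as VFs 3 to 5 would
still need several rules or a dedicated range field.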

Thanks,
Rahul


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-23 21:10     ` John Fastabend
@ 2016-08-02 18:19       ` John Fastabend
  2016-08-03 14:30         ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-08-02 18:19 UTC (permalink / raw)
  To: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

On 16-07-23 02:10 PM, John Fastabend wrote:
> On 16-07-21 12:20 PM, Adrien Mazarguil wrote:
>> Hi Jerin,
>>
>> Sorry, looks like I missed your reply. Please see below.
>>
> 
> Hi Adrian,
> 
> Sorry for a bit delay but a few comments that may be worth considering.
> 
> To start with completely agree on the general problem statement and the
> nice summary of all the current models. Also good start on this.
> 
>>
>> Considering that allowed pattern/actions combinations cannot be known in
>> advance and would result in an unpractically large number of capabilities to
>> expose, a method is provided to validate a given rule from the current
>> device configuration state without actually adding it (akin to a "dry run"
>> mode).
> 
> Rather than have a query/validate process why did we jump over having an
> intermediate representation of the capabilities? Here you state it is
> unpractical but we know how to represent parse graphs and the drivers
> could report their supported parse graph via a single query to a middle
> layer.
> 
> This will actually reduce the msg chatter imagine many applications at
> init time or in boundary cases where a large set of applications come
> online at once and start banging on the interface all at once seems less
> than ideal.
> 

A few more details on a possible interface for the capabilities query:

One way I've used to describe these graphs from driver to software
stacks is to use a set of structures to build the graph. For fixed
graphs this could just be a *.h file. For programmable hardware (where
the parser typically comes from a fw update on NICs) the driver can read
the parser details out of firmware and render the structures.

I've done this two ways: one is to define all the fields in their
own structures using something like,

struct field {
	char *name;
	u32 uid;
	u32 bitwidth;
};

This gives a unique id (uid) for each field along with its
width and a user friendly name. The fields are organized into
headers via a header structure,

struct header_node {
	char *name;
	u32 uid;
	u32 *fields;
	struct parse_graph *jump;
};

Each node has a unique id and then a list of fields, where 'fields'
is a list of field uids. It's also easy enough to embed the field
struct in the header_node if that is simpler; it's really a style
question.

The 'struct parse_graph' gives the list of edges from this header node
to other header nodes, using a parse graph structure defined as:

struct parse_graph {
	struct field_reference ref;
	__u32 jump_uid;
};

Again as a matter of style you can embed the parse graph in the header
node as I did above or do it as its own object.

The field_reference noted below gives the id of the field and the value,
e.g. for the tuple (ipv4.protocol, 6), jump_uid would be the uid of TCP.

struct field_reference {
	__u32 header_uid;
	__u32 field_uid;
	__u32 mask_type;
	__u32 type;
	__u8  *value;
	__u8  *mask;
};

The cost of doing all this is some additional overhead at init time, but
building generic functions over this and having a set of predefined
uids for well-known protocols such as ip, udp, tcp, etc. helps. What you
get for the cost is a few things that I think are worth it:

(i) new protocols can be added/removed without recompiling DPDK;

(ii) a software package can use the capability query to verify that the
required protocols are off-loadable, instead of issuing a possibly large
set of test queries;

(iii) when we program the device we can provide a tuple
(table-uid, header-uid, field-uid, value, mask, priority) and the
middle layer, "knowing" the above graph, can verify the command so
drivers only ever see "good" commands;

(iv) it should be faster in terms of cmds per second because the
drivers can map the tuple (table, header, field, priority) to a slot
efficiently instead of parsing.

IMO points (iii) and (iv) will in practice make the code much simpler
because we can maintain a common middle layer and not require parsing
by drivers, making each driver simpler by abstracting into a common
layer.
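
To make the structures above more concrete, here is a minimal sketch of
one header node with two outgoing edges and a walk over them. It is
simplified on purpose and rests on assumptions: the uid values are made
up, value and mask are single bytes rather than pointers, the 'fields'
list, 'mask_type' and 'type' members are left out, and the edge list is
terminated by a zero jump_uid. `next_header()` is a hypothetical helper
that a middle layer might provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up uids; a real stack would reserve well-known values. */
enum { HDR_IPV4 = 1, HDR_TCP = 2, HDR_UDP = 3 };
enum { FIELD_IPV4_PROTOCOL = 10 };

/* Simplified: one-byte value/mask instead of pointers. */
struct field_reference {
	uint32_t header_uid;
	uint32_t field_uid;
	uint8_t value;
	uint8_t mask;
};

/* One edge of the graph: "if ref matches, the next header is jump_uid". */
struct parse_graph {
	struct field_reference ref;
	uint32_t jump_uid; /* 0 terminates the edge list (assumption) */
};

struct header_node {
	const char *name;
	uint32_t uid;
	const struct parse_graph *jump;
};

/* Edges leaving IPv4: protocol 6 leads to TCP, protocol 17 to UDP. */
static const struct parse_graph ipv4_edges[] = {
	{ { HDR_IPV4, FIELD_IPV4_PROTOCOL, 6, 0xff }, HDR_TCP },
	{ { HDR_IPV4, FIELD_IPV4_PROTOCOL, 17, 0xff }, HDR_UDP },
	{ { 0, 0, 0, 0 }, 0 },
};

static const struct header_node ipv4_node = { "ipv4", HDR_IPV4, ipv4_edges };

/* Return the uid of the header that follows node, or 0 if no edge matches. */
static uint32_t
next_header(const struct header_node *node, uint32_t field_uid, uint8_t value)
{
	const struct parse_graph *e;

	assert(node != NULL);
	for (e = node->jump; e != NULL && e->jump_uid != 0; e++) {
		if (e->ref.field_uid == field_uid &&
		    (value & e->ref.mask) == (e->ref.value & e->ref.mask))
			return e->jump_uid;
	}
	return 0;
}
```

A middle layer could use this kind of walk to reject a command whose
header sequence is not in the reported graph before the driver sees it.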

> Worse in my opinion it requires all drivers to write mostly duplicating
> validation code where a common layer could easily do this if every
> driver reported a common data structure representing its parse graph
> instead. The nice fallout of this initial effort upfront is the driver
> no longer needs to do error handling/checking/etc and can assume all
> rules are correct and valid. It makes driver code much simpler to
> support. And IMO at least by doing this we get some other nice benefits
> described below.
> 
> Another related question is about performance.
> 
>> Creation
>> ~~~~~~~~
>>
>> Creating a flow rule is similar to validating one, except the rule is
>> actually created.
>>
>> ::
>>
>>  struct rte_flow *
>>  rte_flow_create(uint8_t port_id,
>>                  const struct rte_flow_pattern *pattern,
>>                  const struct rte_flow_actions *actions);
> 
> I gather this implies that each driver must parse the pattern/action
> block and map this onto the hardware. How many rules per second can this
> support? I've run into systems that expect a level of service somewhere
> around 50k cmds per second. So bulking will help at the message level
> but it seems like a lot of overhead to unpack the pattern/action section.
> 
> One strategy I've used in other systems that worked relatively well
> is if the query for the parse graph above returns a key for each node
> in the graph then a single lookup can map the key to a node. Its
> unambiguous and then these operations simply become a table lookup.
> So to be a bit more concrete this changes the pattern structure in
> rte_flow_create() into a  <key,value,mask> tuple where the key is known
> by the initial parse graph query. If you reserve a set of well-defined
> key values for well known protocols like ethernet, ip, etc. then the
> query model also works but the middle layer catches errors in this case
> and again the driver only gets known good flows. So something like this,
> 
>   struct rte_flow_pattern {
> 	uint32_t priority;
> 	uint32_t key;
> 	uint32_t value_length;
> 	u8 *value;
>   }
> 
> Also if we have multiple tables what do you think about adding a
> table_id to the signature. Probably not needed in the first generation
> but is likely useful for hardware with multiple tables so that it
> would be,
> 
>    rte_flow_create(uint8_t port_id, uint8_t table_id, ...);
> 
> Finally one other problem we've had which would be great to address
> if we are doing a rewrite of the API is adding new protocols to
> already deployed DPDK stacks. This is mostly a Linux distribution
> problem where you can't easily update DPDK.
> 
> In the prototype header linked in this document it seems to add new
> headers requires adding a new enum in the rte_flow_item_type but there
> is at least an attempt at a catch all here,
> 
>> 	/**
>> 	 * Matches a string of a given length at a given offset (in bytes),
>> 	 * or anywhere in the payload of the current protocol layer
>> 	 * (including L2 header if used as the first item in the stack).
>> 	 *
>> 	 * See struct rte_flow_item_raw.
>> 	 */
>> 	RTE_FLOW_ITEM_TYPE_RAW,
> 
> Actually this is a nice implementation because it works after the
> previous item in the stack correct? So you can put it after "known"
> variable length headers like IP. The limitation is it can't get past
> undefined variable length headers. However if you use the above parse
> graph reporting from the driver mechanism and the driver always reports
> its largest supported graph then we don't have this issue where a new
> hardware sku/ucode/etc added support for new headers but we have no
> way to deploy it to existing software users without recompiling and
> redeploying.
> 
> I looked at the git repo but I only saw the header definition I guess
> the implementation is TBD after there is enough agreement on the
> interface?
> 
> Thanks,
> John
> 


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-02 18:19       ` John Fastabend
@ 2016-08-03 14:30         ` Adrien Mazarguil
  2016-08-03 18:10           ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-03 14:30 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

Hi John,

I'm replying below to both messages.

On Tue, Aug 02, 2016 at 11:19:15AM -0700, John Fastabend wrote:
> On 16-07-23 02:10 PM, John Fastabend wrote:
> > On 16-07-21 12:20 PM, Adrien Mazarguil wrote:
> >> Hi Jerin,
> >>
> >> Sorry, looks like I missed your reply. Please see below.
> >>
> > 
> > Hi Adrian,
> > 
> > Sorry for a bit delay but a few comments that may be worth considering.
> > 
> > To start with completely agree on the general problem statement and the
> > nice summary of all the current models. Also good start on this.

Thanks.

> >> Considering that allowed pattern/actions combinations cannot be known in
> >> advance and would result in an unpractically large number of capabilities to
> >> expose, a method is provided to validate a given rule from the current
> >> device configuration state without actually adding it (akin to a "dry run"
> >> mode).
> > 
> > Rather than have a query/validate process why did we jump over having an
> > intermediate representation of the capabilities? Here you state it is
> > unpractical but we know how to represent parse graphs and the drivers
> > could report their supported parse graph via a single query to a middle
> > layer.
> > 
> > This will actually reduce the msg chatter imagine many applications at
> > init time or in boundary cases where a large set of applications come
> > online at once and start banging on the interface all at once seems less
> > than ideal.

Well, I also thought about a kind of graph to represent capabilities but
feared the extra complexity would not be worth the trouble, thus settled on
the query idea. A couple more reasons:

- Capabilities evolve at the same time as devices are configured. For
  example, if a device supports a single RSS context, then a single rule
  with an RSS action may be created. The graph would have to be rewritten
  accordingly and thus queried/parsed again by the application.

- Expressing capabilities at bit granularity (say, for a matching pattern
  item mask) is complex. There is no way to simplify the representation of
  capabilities without either losing information or making the graph more
  complex to parse than simply providing a flow rule from an application
  point of view.

With that in mind, I am not opposed to the idea; both methods could even
coexist, with the query function eventually evolving to become a front-end
to a capability graph. Just remember that I am only defining the
fundamentals for the initial implementation, i.e. how rules are expressed as
patterns/actions and the basic functions to manage them, ideally without
having to redefine them ever.

> A bit more details on possible interface for capabilities query,
> 
> One way I've used to describe these graphs from driver to software
> stacks is to use a set of structures to build the graph. For fixed
> graphs this could just be *.h file for programmable hardware (typically
> coming from fw update on nics) the driver can read the parser details
> out of firmware and render the structures.

I understand, however I think this approach may be too low-level to express
all the possible combinations. This graph would have to include possible
actions for each possible pattern, all while considering that some actions
are not possible with some patterns and that there are exclusive actions.

Also while memory consumption is not really an issue, such a graph may be
huge. It could take a while for the PMD to update it when adding a rule
impacting capabilities.

> I've done this two ways: one is to define all the fields in their
> own structures using something like,
> 
> struct field {
> 	char *name;
> 	u32 uid;
> 	u32 bitwidth;
> };
> 
> This gives a unique id (uid) for each field along with its
> width and a user friendly name. The fields are organized into
> headers via a header structure,
> 
> struct header_node {
> 	char *name;
> 	u32 uid;
> 	u32 *fields;
> 	struct parse_graph *jump;
> };
> 
> Each node has a unique id and then a list of fields. Where 'fields'
> is a list of uid's of fields its also easy enough to embed the field
> struct in the header_node if that is simpler its really a style
> question.
> 
> The 'struct parse_graph' gives the list of edges from this header node
> to other header nodes. Using a parse graph structure defined
> 
> struct parse_graph {
> 	struct field_reference ref;
> 	__u32 jump_uid;
> };
> 
> Again as a matter of style you can embed the parse graph in the header
> node as I did above or do it as its own object.
> 
> The field_reference noted below gives the id of the field and the value
> e.g. the tuple (ipv4.protocol, 6) then jump_uid would be the uid of TCP.
> 
> struct field_reference {
> 	__u32 header_uid;
> 	__u32 field_uid;
> 	__u32 mask_type;
> 	__u32 type;
> 	__u8  *value;
> 	__u8  *mask;
> };
> 
> The cost doing all this is some additional overhead at init time. But
> building generic function over this and having a set of predefined
> uids for well-known protocols such ip, udp, tcp, etc helps. What you
> get for the cost is a few things that I think are worth it. (i) Now
> new protocols can be added/removed without recompiling DPDK (ii) a
> software package can use the capability query to verify the required
> protocols are off-loadable vs a possibly large set of test queries and
> (iii) when we do the programming of the device we can provide a tuple
> (table-uid, header-uid, field-uid, value, mask, priority) and the
> middle layer "knowing" the above graph can verify the command so
> drivers only ever see "good"  commands, (iv) finally it should be
> faster in terms of cmds per second because the drivers can map the
> tuple (table, header, field, priority) to a slot efficiently vs
> parsing.
> 
> IMO point (iii) and (iv) will in practice make the code much simpler
> because we can maintain common middle layer and not require parsing
> by drivers. Making each driver simpler by abstracting into common
> layer.

Before answering your points, let's consider how applications are going to
be written. Not only do devices not support all possible pattern/actions
combinations, they also have memory constraints. Whichever method
applications use to determine if a flow rule is supported, at some point
they won't be able to add any more due to device limitations.

Sane applications designed to work regardless of the underlying device won't
simply call abort() at this point but provide a software fallback
instead. My bet is that applications will provide one every time a rule
cannot be added for any reason. They won't even bother to query capabilities
except perhaps for a very small subset, as in "does this device support the
ID action at all?".
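
A minimal sketch of that fallback pattern follows. It does not use the
proposed API: `hw_rule_create()` is a stand-in for whatever creation call a
PMD ends up exposing, and the two-rule device limit and the `app_rule`
structure are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device limit: pretend the NIC only has room for two rules. */
#define HW_RULE_SLOTS 2

/* Hypothetical application-side rule record. */
struct app_rule {
	int id;
	bool in_hw; /* false: this rule is handled by the software path */
};

static size_t hw_rules_used;

/* Stand-in for the PMD rule creation call; fails once the device is full. */
static bool
hw_rule_create(const struct app_rule *rule)
{
	assert(rule != NULL);
	if (hw_rules_used >= HW_RULE_SLOTS)
		return false;
	hw_rules_used++;
	return true;
}

/* Try hardware first; on any failure keep the rule for the software path. */
static void
app_rule_install(struct app_rule *rule)
{
	rule->in_hw = hw_rule_create(rule);
}
```

The application does not need to know why creation failed: unsupported
pattern, unsupported action or exhausted resources all lead to the same
software path.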

Applications that really want/need to know at init time whether all the
rules they may want to possibly create are supported will spend about the
same time in both cases (query or graph). For queries, by iterating on a
list of typical rules. For a graph, by walking through it. Either way, it
won't be done later from the data path.

I think that for an application maintainer, writing or even generating a set
of typical rules will also be easier than walking through a graph. It should
also be easier on the PMD side.

For individual points:

(i) should be doable with the query API without recompiling DPDK as well,
since avoiding API/ABI breakage is part of the requirements. If
you think there is a problem regarding this, can you provide a specific
example?

(ii) as described above, I think this use case won't be very common in the
wild, except for applications designed for a specific device and then they
will probably know enough about it to skip the query step entirely. If time
must be spent anyway, it will be in the control path at initialization
time.

(iii) misses the fact that capabilities evolve as flow rules get added.
There is no way for PMDs to only see "valid" rules, because device
limitations may prevent adding an otherwise valid rule.

(iv) could be true if not for the same reason as (iii). The graph would have
to be verified again before adding another rule. Note that PMD maintainers
are encouraged to make their query function as fast as possible; they may
rely on static data internally for this as well.

> > Worse in my opinion it requires all drivers to write mostly duplicating
> > validation code where a common layer could easily do this if every
> > driver reported a common data structure representing its parse graph
> > instead. The nice fallout of this initial effort upfront is the driver
> > no longer needs to do error handling/checking/etc and can assume all
> > rules are correct and valid. It makes driver code much simpler to
> > support. And IMO at least by doing this we get some other nice benefits
> > described below.

About duplicated code, my usual reply is that DPDK will provide internal
helper methods to assist PMDs with rules management/parsing/etc. These are
not discussed in the specification because I wanted everyone to agree to the
application side of things first, and it is difficult to know how much
assistance PMDs might need without an initial implementation.

I think this private API will be built at the same time as support is added
to PMDs and maintainers notice generic code that can be shared.
Documentation may be written later once things start to settle down.

> > Another related question is about performance.
> > 
> >> Creation
> >> ~~~~~~~~
> >>
> >> Creating a flow rule is similar to validating one, except the rule is
> >> actually created.
> >>
> >> ::
> >>
> >>  struct rte_flow *
> >>  rte_flow_create(uint8_t port_id,
> >>                  const struct rte_flow_pattern *pattern,
> >>                  const struct rte_flow_actions *actions);
> > 
> > I gather this implies that each driver must parse the pattern/action
> > block and map this onto the hardware. How many rules per second can this
> > support? I've run into systems that expect a level of service somewhere
> > around 50k cmds per second. So bulking will help at the message level
> > but it seems like a lot of overhead to unpack the pattern/action section.

There is indeed no guarantee on the time taken to create a flow rule, as
debated with Sugesh (see the full thread):

 http://dpdk.org/ml/archives/dev/2016-July/043958.html

I will update the specification accordingly.

Consider that even 50k cmds per second may not be fast enough. Applications
always need to have some kind of fallback ready, and the ability to know
whether a packet has been matched by a rule is a way to help with that.

In any case, flow rules must be managed from the control path, the data path
must only handle consequences.

> > One strategy I've used in other systems that worked relatively well
> > is if the query for the parse graph above returns a key for each node
> > in the graph then a single lookup can map the key to a node. Its
> > unambiguous and then these operations simply become a table lookup.
> > So to be a bit more concrete this changes the pattern structure in
> > rte_flow_create() into a  <key,value,mask> tuple where the key is known
> > by the initial parse graph query. If you reserve a set of well-defined
> > key values for well known protocols like ethernet, ip, etc. then the
> > query model also works but the middle layer catches errors in this case
> > and again the driver only gets known good flows. So something like this,
> > 
> >   struct rte_flow_pattern {
> > 	uint32_t priority;
> > 	uint32_t key;
> > 	uint32_t value_length;
> > 	u8 *value;
> >   }

I agree that having an integer representing an entire pattern/actions combo
would be great, however how do you tell whether you want matched packets to
be duplicated to queue 6 and redirected to queue 3? This method can be used
to check if a type of rule is allowed but not whether it is actually
applicable. You still need to provide the entire pattern/actions description
to create a flow rule.

> > Also if we have multiple tables what do you think about adding a
> > table_id to the signature. Probably not needed in the first generation
> > but is likely useful for hardware with multiple tables so that it
> > would be,
> > 
> >    rte_flow_create(uint8_t port_id, uint8_t table_id, ...);

Not sure if I understand the table ID concept, do you mean in case a device
supports entirely different sets of features depending on something? (What?)

> > Finally one other problem we've had which would be great to address
> > if we are doing a rewrite of the API is adding new protocols to
> > already deployed DPDK stacks. This is mostly a Linux distribution
> > problem where you can't easily update DPDK.
> > 
> > In the prototype header linked in this document it seems to add new
> > headers requires adding a new enum in the rte_flow_item_type but there
> > is at least an attempt at a catch all here,
> > 
> >> 	/**
> >> 	 * Matches a string of a given length at a given offset (in bytes),
> >> 	 * or anywhere in the payload of the current protocol layer
> >> 	 * (including L2 header if used as the first item in the stack).
> >> 	 *
> >> 	 * See struct rte_flow_item_raw.
> >> 	 */
> >> 	RTE_FLOW_ITEM_TYPE_RAW,
> > 
> > Actually this is a nice implementation because it works after the
> > previous item in the stack correct?

Yes, this is correct.

> > So you can put it after "known"
> > variable length headers like IP. The limitation is it can't get past
> > undefined variable length headers.

RTE_FLOW_ITEM_TYPE_ANY is made for that purpose. Is that what you are
looking for?

> > However if you use the above parse
> > graph reporting from the driver mechanism and the driver always reports
> > its largest supported graph then we don't have this issue where a new
> > hardware sku/ucode/etc added support for new headers but we have no
> > way to deploy it to existing software users without recompiling and
> > redeploying.

I really would like to understand if you see a limitation regarding this
with the specified API, even assuming DPDK is compiled as a shared library
and thus not part of the user application.

> > I looked at the git repo but I only saw the header definition I guess
> > the implementation is TBD after there is enough agreement on the
> > interface?

Precisely, I intend to update the tree and send a v2 soon (unfortunately did
not have much time these past few days to work on this).

Now what if, instead of a seemingly complex parse graph and still in
addition to the query method, enum values were defined for PMDs to report
an array of supported items, typical patterns and actions so applications
can get a quick idea of what devices are capable of without being too
specific. Something like:

 enum rte_flow_capability {
     RTE_FLOW_CAPABILITY_ITEM_ETH,
     RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
     RTE_FLOW_CAPABILITY_ACTION_ID,
     ...
 };

Although I'm not convinced about the usefulness of this because it would
have to be maintained separately, but that would be easier than building a
dummy flow rule for simple query purposes.
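
To illustrate how an application might consume such an array, here is a
minimal sketch. It assumes the PMD hands back a plain array of the enum
values proposed above; the helper name flow_capability_present() is
hypothetical and not part of any existing or proposed API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability identifiers, following the enum proposed above. */
enum rte_flow_capability {
	RTE_FLOW_CAPABILITY_ITEM_ETH,
	RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
	RTE_FLOW_CAPABILITY_ACTION_ID,
};

/* Return true if "cap" appears in the array reported by a PMD. */
static bool
flow_capability_present(const enum rte_flow_capability *caps, size_t n,
			enum rte_flow_capability cap)
{
	size_t i;

	for (i = 0; i != n; ++i)
		if (caps[i] == cap)
			return true;
	return false;
}
```

Such a check only gives a coarse hint; whether a specific rule can be
created would still have to go through the query method.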

The main question I have for you is, do you think the core of the specified
API is adequate enough assuming it can be extended later with new methods?

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-26 10:07         ` Rahul Lakkireddy
@ 2016-08-03 16:44           ` Adrien Mazarguil
  2016-08-03 19:11             ` John Fastabend
  2016-08-19 21:13           ` John Daley (johndale)
  1 sibling, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-03 16:44 UTC (permalink / raw)
  To: Rahul Lakkireddy
  Cc: John Fastabend, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

Replying to everything at once, please see below.

On Tue, Jul 26, 2016 at 03:37:35PM +0530, Rahul Lakkireddy wrote:
> On Monday, July 07/25/16, 2016 at 09:40:02 -0700, John Fastabend wrote:
> > On 16-07-25 04:32 AM, Rahul Lakkireddy wrote:
> > > Hi Adrien,
> > > 
> > > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote:
> > >> Hi Rahul,
> > >>
> > >> Please see below.
> > >>
> > >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> > >>> Hi Adrien,
> > >>>
> > >>> The proposal looks very good.  It satisfies most of the features
> > >>> supported by Chelsio NICs.  We are looking for suggestions on exposing
> > >>> more additional features supported by Chelsio NICs via this API.
> > >>>
> > >>> Chelsio NICs have two regions in which filters can be placed -
> > >>> Maskfull and Maskless regions.  As their names imply, maskfull region
> > >>> can accept masks to match a range of values; whereas, maskless region
> > >>> don't accept any masks and hence perform a more strict exact-matches.
> > >>> Filters without masks can also be placed in maskfull region.  By
> > >>> default, maskless region have higher priority over the maskfull region.
> > >>> However, the priority between the two regions is configurable.
> > >>
> > >> I understand this configuration affects the entire device. Just to be clear,
> > >> assuming some filters are already configured, are they affected by a change
> > >> of region priority later?
> > >>
> > > 
> > > Both the regions exist at the same time in the device.  Each filter can
> > > either belong to maskfull or the maskless region.
> > > 
> > > The priority is configured at time of filter creation for every
> > > individual filter and cannot be changed while the filter is still
> > > active. If priority needs to be changed for a particular filter then,
> > > it needs to be deleted first and re-created.
> > 
> > Could you model this as two tables and add a table_id to the API? This
> > way user space could populate the table it chooses. We would have to add
> > some capabilities attributes to "learn" if tables support masks or not
> > though.
> > 
> 
> This approach sounds interesting.

Now I understand the idea behind these tables, however from an application
point of view I still think it's better if the PMD could take care of flow
rules optimizations automatically. Think about it, PMDs have exactly a
single kind of device they know perfectly well to manage, while applications
want the best possible performance out of any device in the most generic
fashion.

> > I don't see how the PMD can sort this out in any meaningful way and it
> > has to be exposed to the application that has the intelligence to 'know'
> > priorities between masks and non-masks filters. I'm sure you could come
> > up with something but it would be less than ideal in many cases I would
> > guess and we can't have the driver getting priorities wrong or we may
> > not get the correct behavior.

It may be solved by having the PMD maintain a SW state to quickly know which
rules are currently created and in what state the device is so basically the
application doesn't have to perform this work.

This API allows applications to express basic needs such as "redirect
packets matching this pattern to that queue". It must not deal with HW
details and limitations in my opinion. If a request cannot be satisfied,
then the rule cannot be created. No help from the application must be
expected by PMDs, otherwise it opens the door to the same issues as the
legacy filtering APIs.

[...]
> > > Unfortunately, our maskfull region is extremely small too compared to
> > > maskless region.
> > > 
> > 
> > To me this means a userspace application would want to pack it
> > carefully to get the full benefit. So you need some mechanism to specify
> > the "region" hence the above table proposal.
> > 
> 
> Right. Makes sense.

I do not agree, applications should not be aware of it. Note this case can
be handled differently, so that rules do not have to be moved back and forth
between both tables. If the first created rule requires a maskfull entry,
then all subsequent rules will be entered into that table. Otherwise no
maskfull entry can be created as long as there is one maskless entry. When
either table is full, no more rules may be added. Would that work for you?
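
The policy described above can be sketched as follows. This is only an
illustration under stated assumptions: struct filter_state,
filter_slot_reserve() and the region sizes are hypothetical, and a real
PMD would track considerably more state (including rule removal).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical PMD-side bookkeeping; names are illustrative only. */
enum region { REGION_NONE, REGION_MASKLESS, REGION_MASKFULL };

struct filter_state {
	enum region active;         /* region selected by the first rule */
	unsigned int used;          /* entries in use in the active region */
	unsigned int maskless_size; /* capacity of the maskless region */
	unsigned int maskfull_size; /* capacity of the maskfull region */
};

/*
 * Reserve one entry for a new rule. The first rule selects the region,
 * all subsequent rules go to the same one. Returns 0 on success,
 * -ENOTSUP if a masked rule is requested while the maskless region is
 * in use, -ENOSPC if the selected region is full.
 */
static int
filter_slot_reserve(struct filter_state *st, bool needs_mask)
{
	enum region target = st->active;
	unsigned int size;

	if (target == REGION_NONE)
		target = needs_mask ? REGION_MASKFULL : REGION_MASKLESS;
	if (needs_mask && target == REGION_MASKLESS)
		return -ENOTSUP;
	size = target == REGION_MASKFULL ?
	       st->maskfull_size : st->maskless_size;
	if (st->used >= size)
		return -ENOSPC;
	st->active = target;
	++st->used;
	return 0;
}
```

With this approach the application only sees rule creation succeed or
fail; it never has to know which region was selected.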

> [...]
> > >> Now about this "promisc" match criteria, it can be added as a new meta
> > >> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
> > >> start or add it later with the related code in your PMD?
> > >>
> > > 
> > > It could be added as a meta item.  If there are other interested
> > > parties, it can be added now.  Otherwise, we'll add it with our filtering
> > > related code.
> > > 
> > 
> > hmm I guess by "promisc" here you mean match packets received from the
> > wire before they have been switched by the silicon?
> > 
> 
> Match packets received from wire before they have been switched by
> silicon, and which also includes packets not destined for DUT and were
> still received due to interface being in promisc mode.

I think it's fine, but we'll have to precisely define what happens when a
packet matched with such pattern is part of a terminating rule. For instance
if it is duplicated by HW, then the rule cannot be terminating.

[...]
> > > This raises another interesting question.  What should the PMD do
> > > if it has support to only a subset of fields in the particular item?
> > > 
> > > For example, if a rule has been sent to match IP fragmentation along
> > > with several other IPv4 fields, and if the underlying hardware doesn't
> > > support matching based on IP fragmentation, does the PMD reject the
> > > complete rule although it could have done the matching for rest of the
> > > IPv4 fields?
> > 
> > I think it has to fail the command other wise user space will not have
> > any way to understand that the full match criteria can not be met and
> > we will get different behavior for the same applications on different
> > nics depending on hardware feature set. This will most likely break
> > applications so we need the error IMO.
> > 
> 
> Ok. Makes sense.

Yes, I fully agree with this.
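
As an illustration of the "reject the whole rule" behavior, a PMD-side
check could look like the sketch below. The MATCH_IPV4_* bits and
ipv4_item_validate() are hypothetical names made up for this example;
the specification does not define per-field bits like these.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical IPv4 match fields, one bit per field. */
#define MATCH_IPV4_SRC   (UINT32_C(1) << 0)
#define MATCH_IPV4_DST   (UINT32_C(1) << 1)
#define MATCH_IPV4_PROTO (UINT32_C(1) << 2)
#define MATCH_IPV4_FRAG  (UINT32_C(1) << 3)

/*
 * Fail the entire item if the application asks for any field the
 * hardware cannot match on, instead of silently ignoring that field.
 */
static int
ipv4_item_validate(uint32_t requested, uint32_t supported)
{
	return (requested & ~supported) ? -ENOTSUP : 0;
}
```

A device that cannot match on fragmentation would thus refuse a rule
requesting it, even though the remaining fields are supported.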

> > >>> - Match range of physical ports on the NIC in a single rule via masks.
> > >>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
> > >>>   ports available on the NIC.
> > >>
> > >> Applications create flow rules per port, I'd thus suggest that the PMD
> > >> should detect identical rules created on different ports and aggregate them
> > >> as a single HW rule automatically.
> > >>
> > >> If you think this approach is not right, the alternative is a meta pattern
> > >> item that provides a list of ports. I'm not sure this is the right approach
> > >> considering it would most likely not be supported by most NICs. Applications
> > >> may not request it explicitly.
> > >>
> > > 
> > > Aggregating via PMD will be expensive operation since it would involve:
> > > - Search of existing filters.
> > > - Deleting those filters.
> > > - Creating a single combined filter.
> > > 
> > > And all of above 3 operations would need to be atomic so as not to
> > > affect existing traffic which is hitting above filters.

Atomicity may not be a problem if the PMD makes sure the new combined rule
is inserted before the others, so they do not need to be removed either.

> > > Adding a
> > > meta item would be a simpler solution here.

Yes, clearly.

> > For this adding a meta-data item seems simplest to me. And if you want
> > to make the default to be only a single port that would maybe make it
> > easier for existing apps to port from flow director. Then if an
> > application cares it can create a list of ports if needed.
> > 
> 
> Agreed.

However, although I'm not opposed to adding dedicated meta items, remember
that applications will not automatically benefit from the increased
performance if only a single PMD implements this feature; their maintainers
will probably not bother with it.

> > >>> - Match range of Physical Functions (PFs) on the NIC in a single rule
> > >>>   via masks. For ex: match all traffic coming on several PFs.
> > >>
> > >> The PF and VF pattern items assume there is a single PF associated with a
> > >> DPDK port. VFs are identified with an ID. I basically took the same
> > >> definitions as the existing filter types, perhaps this is not enough for
> > >> Chelsio adapters.
> > >>
> > >> Do you expose more than one PF for a DPDK port?
> > >>
> > >> Anyway, I'd suggest the same approach as above, automatic aggregation of
> > >> rules for performance reasons, otherwise new or updated PF/VF pattern items,
> > >> in which case it would be great if you could provide ideal structure
> > >> definitions for this use case.
> > >>
> > > 
> > > In Chelsio hardware, all the ports of a device are exposed via single
> > > PF4. There could be many VFs attached to a PF.  Physical NIC functions
> > > are operational on PF4, while VFs can be attached to PFs 0-3.
> > > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
> > > mapping assumption.
> > > 
> > > There already seems to be a PF meta-item, but it doesn't seem to accept
> > > any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
> > > seem to accept a "mask" field.  We could probably enable these fields
> > > in the PF and VF meta-items to allow configuration.
> > 
> > Maybe a range field would help here as well? So you could specify a VF
> > range. It might be one of the things to consider adding later though if
> > there is no clear use for it now.
> > 
> 
> VF-value and VF-mask would help to achieve the desired filter.
> VF-mask would also enable to specify a range of VF values.

Like John, I think a range or even a list instead of a mask would be better,
the PMD can easily create a mask from that if necessary. Reason is that
we've always had bad experiences with bit-fields, they're always too short
at some point and we would like to avoid having to break the ABI to update
existing pattern items later.
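
For instance, a PMD whose hardware only understands value/mask pairs
could derive one from a range with something like the sketch below. The
function name and the 16-bit ID width are assumptions made for this
example. A range maps to a single value/mask pair only when its size is
a power of two and its first ID is aligned on that size; otherwise the
PMD would need several hardware entries or would have to reject the rule.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert the inclusive VF ID range [first, last] to a value/mask pair
 * such that an ID is in range when (id & mask) == value.
 * Returns false if the range cannot be expressed by a single pair.
 */
static bool
vf_range_to_mask(uint16_t first, uint16_t last,
		 uint16_t *value, uint16_t *mask)
{
	uint32_t count;

	if (first > last)
		return false;
	count = (uint32_t)last - first + 1;
	if (count & (count - 1)) /* size is not a power of two */
		return false;
	if (first & (count - 1)) /* start is not aligned on the size */
		return false;
	*value = first;
	*mask = (uint16_t)~(count - 1);
	return true;
}
```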

Also while I don't think this is the case yet, perhaps it will be a good
idea for PFs/VFs to have global unique IDs, just like DPDK ports.

Thanks.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-03 14:30         ` Adrien Mazarguil
@ 2016-08-03 18:10           ` John Fastabend
  2016-08-04 13:05             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-08-03 18:10 UTC (permalink / raw)
  To: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

[...]

>>>> Considering that allowed pattern/actions combinations cannot be known in
>>>> advance and would result in an unpractically large number of capabilities to
>>>> expose, a method is provided to validate a given rule from the current
>>>> device configuration state without actually adding it (akin to a "dry run"
>>>> mode).
>>>
>>> Rather than have a query/validate process why did we jump over having an
>>> intermediate representation of the capabilities? Here you state it is
>>> unpractical but we know how to represent parse graphs and the drivers
>>> could report their supported parse graph via a single query to a middle
>>> layer.
>>>
>>> This will actually reduce the msg chatter imagine many applications at
>>> init time or in boundary cases where a large set of applications come
>>> online at once and start banging on the interface all at once seems less
>>> than ideal.
> 
> Well, I also thought about a kind of graph to represent capabilities but
> feared the extra complexity would not be worth the trouble, thus settled on
> the query idea. A couple more reasons:
> 
> - Capabilities evolve at the same time as devices are configured. For
>   example, if a device supports a single RSS context, then a single rule
>   with a RSS action may be created. The graph would have to be rewritten
>   accordingly and thus queried/parsed again by the application.

The graph would not help here because this is an action
restriction not a parsing restriction. This is yet another query to see
what actions are supported and how many of each action are supported.

   get_parse_graph - report the parsable fields
   get_actions - report the supported actions and possible num of each

> 
> - Expressing capabilities at bit granularity (say, for a matching pattern
>   item mask) is complex, there is no way to simplify the representation of
>   capabilities without either losing information or making the graph more
>   complex to parse than simply providing a flow rule from an application
>   point of view.
> 

I'm not sure I understand 'bit granularity' here. I would say we have
devices now that have rather strange restrictions due to hardware
implementation. Going forward we should get better hardware and a lot
of this will go away in my view. Yes this is a long term view and
doesn't help the current state. The overall point you are making is
that the sum of all these strange/odd bits in the hardware implementation
means capability queries are very difficult to guarantee on existing
hardware, and I think you've convinced me. Thanks ;)

> With that in mind, I am not opposed to the idea, both methods could even
> coexist, with the query function eventually evolving to become a front-end
> to a capability graph. Just remember that I am only defining the
> fundamentals for the initial implementation, i.e. how rules are expressed as
> patterns/actions and the basic functions to manage them, ideally without
> having to redefine them ever.
> 

Agreed they should be able to coexist. So I can get my capabilities
queries as a layer on top of the API here.

>> A bit more details on possible interface for capabilities query,
>>
>> One way I've used to describe these graphs from driver to software
>> stacks is to use a set of structures to build the graph. For fixed
>> graphs this could just be *.h file for programmable hardware (typically
>> coming from fw update on nics) the driver can read the parser details
>> out of firmware and render the structures.
> 
> I understand, however I think this approach may be too low-level to express
> all the possible combinations. This graph would have to include possible
> actions for each possible pattern, all while considering that some actions
> are not possible with some patterns and that there are exclusive actions.
> 

Really? You have hardware that has dependencies between the parser and
the supported actions? Ugh...

If the hardware has separate tables then we shouldn't try to have the
PMD flatten those into a single table because we will have no way of
knowing how to do that. (I'll respond to the other thread on this in
an attempt to not get too scattered).

> Also while memory consumption is not really an issue, such a graph may be
> huge. It could take a while for the PMD to update it when adding a rule
> impacting capabilities.

Ugh... I wouldn't suggest updating the capabilities at runtime like
this. But I see your point: if the graph has to _guarantee_ correctness,
how does it represent a limited number of masks and other strange hw?
It's unfortunate the hardware isn't more regular.

You have convinced me that guaranteed correctness via capabilities
is going to be difficult for many types of devices, although not all.

[...]

>>
>> The cost doing all this is some additional overhead at init time. But
>> building generic function over this and having a set of predefined
>> uids for well-known protocols such ip, udp, tcp, etc helps. What you
>> get for the cost is a few things that I think are worth it. (i) Now
>> new protocols can be added/removed without recompiling DPDK (ii) a
>> software package can use the capability query to verify the required
>> protocols are off-loadable vs a possibly large set of test queries and
>> (iii) when we do the programming of the device we can provide a tuple
>> (table-uid, header-uid, field-uid, value, mask, priority) and the
>> middle layer "knowing" the above graph can verify the command so
>> drivers only ever see "good"  commands, (iv) finally it should be
>> faster in terms of cmds per second because the drivers can map the
>> tuple (table, header, field, priority) to a slot efficiently vs
>> parsing.
>>
>> IMO point (iii) and (iv) will in practice make the code much simpler
>> because we can maintain common middle layer and not require parsing
>> by drivers. Making each driver simpler by abstracting into common
>> layer.
> 
> Before answering your points, let's consider how applications are going to
> be written. Not only devices do not support all possible pattern/actions
> combinations, they also have memory constraints. Whichever method
> applications use to determine if a flow rule is supported, at some point
> they won't be able to add any more due to device limitations.
> 
> Sane applications designed to work regardless of the underlying device won't
> simply call abort() at this point but provide a software fallback
> instead. My bet is that applications will provide one every time a rule
> cannot be added for any reason, they won't even bother to query capabilities
> except perhaps for a very small subset, as in "does this device support the
> ID action at all?".
> 
> Applications that really want/need to know at init time whether all the
> rules they may want to possibly create are supported will spend about the
> same time in both cases (query or graph). For queries, by iterating on a
> list of typical rules. For a graph, by walking through it. Either way, it
> won't be done later from the data path.

The queries and graph suffer from the same problems you noted above if
actually instantiating the rules will impact what rules are allowed. So
in both cases we may run into corner cases, but it seems that this
is a result of hardware deficiencies and can't be solved easily, at least
with software.

My concern is this non-determinism will create performance issues in
the network because when a flow may or may not be offloaded this can
have a rather significant impact on its performance. This can make
debugging network wide performance miserable when at time X I get
performance X and then for whatever reason something degrades to
software and at time Y I get some performance Y << X. I suspect that
in general applications will bind tightly with hardware they know
works.

> 
> I think that for an application maintainer, writing or even generating a set
> of typical rules will also be easier than walking through a graph. It should
> also be easier on the PMD side.
> 

I tend to think getting a graph and doing operations on graphs is easier
myself but I can see this is a matter of opinion/style.

> For individual points:
> 
> (i) should be doable with the query API without recompiling DPDK as well,
> the fact API/ABI breakage must be avoided being part of the requirements. If
> you think there is a problem regarding this, can you provide a specific
> example?

What I was after you noted yourself in the doc here,

"PMDs can rely on this capability to simulate support for protocols with
fixed headers not directly recognized by hardware."

I was trying to get variable header support with the RAW capabilities. A
parse graph supports this, for example; the proposed query API does not.

> 
> (ii) as described above, I think this use case won't be very common in the
> wild, except for applications designed for a specific device and then they
> will probably know enough about it to skip the query step entirely. If time
> must be spent anyway, it will be in the control path at initialization
> time.
> 

OK.

> (iii) misses the fact that capabilities evolve as flow rules get added,
> there is no way for PMDs to only see "valid" rules also because device
> limitations may prevent adding an otherwise valid rule.

OK, I agree: for devices with this evolving characteristic we are lost.

> 
> (iv) could be true if not for the same reason as (iii). The graph would have
> to be verfied again before adding another rule. Note that PMDs maintainers
> are encouraged to make their query function as fast as possible, they may
> rely on static data internally for this as well.
> 

OK I'm not going to get hung up on this because I think it's an
implementation detail and not an API problem. I would prefer to be
pragmatic and see how fast the API is before I bikeshed it to death for
no good reason.

>>> Worse in my opinion it requires all drivers to write mostly duplicating
>>> validation code where a common layer could easily do this if every
>>> driver reported a common data structure representing its parse graph
>>> instead. The nice fallout of this initial effort upfront is the driver
>>> no longer needs to do error handling/checking/etc and can assume all
>>> rules are correct and valid. It makes driver code much simpler to
>>> support. And IMO at least by doing this we get some other nice benefits
>>> described below.
> 
> About duplicated code, my usual reply is that DPDK will provide internal
> helper methods to assist PMDs with rules management/parsing/etc. These are
> not discussed in the specification because I wanted everyone to agree to the
> application side of things first, and it is difficult to know how much
> assistance PMDs might need without an initial implementation.
> 
> I think this private API will be built at the same time as support is added
> to PMDs and maintainers notice generic code that can be shared.
> Documentation may be written later once things start to settle down.

OK, let's see.

> 
>>> Another related question is about performance.
>>>
>>>> Creation
>>>> ~~~~~~~~
>>>>
>>>> Creating a flow rule is similar to validating one, except the rule is
>>>> actually created.
>>>>
>>>> ::
>>>>
>>>>  struct rte_flow *
>>>>  rte_flow_create(uint8_t port_id,
>>>>                  const struct rte_flow_pattern *pattern,
>>>>                  const struct rte_flow_actions *actions);
>>>
>>> I gather this implies that each driver must parse the pattern/action
>>> block and map this onto the hardware. How many rules per second can this
>>> support? I've run into systems that expect a level of service somewhere
>>> around 50k cmds per second. So bulking will help at the message level
>>> but it seems like a lot of overhead to unpack the pattern/action section.
> 
> There is indeed no guarantee on the time taken to create a flow rule, as
> debated with Sugesh (see the full thread):
> 
>  http://dpdk.org/ml/archives/dev/2016-July/043958.html
> 
> I will update the specification accordingly.
> 
> Consider that even 50k cmds per second may not be fast enough. Applications
> always need to have some kind of fallback ready, and the ability to know
> whether a packet has been matched by a rule is a way to help with that.
> 
> In any case, flow rules must be managed from the control path, the data path
> must only handle consequences.

Same as above, let's see. I think it can probably be made fast enough.

> 
>>> One strategy I've used in other systems that worked relatively well
>>> is if the query for the parse graph above returns a key for each node
>>> in the graph then a single lookup can map the key to a node. Its
>>> unambiguous and then these operations simply become a table lookup.
>>> So to be a bit more concrete this changes the pattern structure in
>>> rte_flow_create() into a  <key,value,mask> tuple where the key is known
>>> by the initial parse graph query. If you reserve a set of well-defined
>>> key values for well known protocols like ethernet, ip, etc. then the
>>> query model also works but the middle layer catches errors in this case
>>> and again the driver only gets known good flows. So something like this,
>>>
>>>   struct rte_flow_pattern {
>>> 	uint32_t priority;
>>> 	uint32_t key;
>>> 	uint32_t value_length;
>>> 	u8 *value;
>>>   }
> 
> I agree that having an integer representing an entire pattern/actions combo
> would be great, however how do you tell whether you want matched packets to
> be duplicated to queue 6 and redirected to queue 3? This method can be used
> to check if a type of rule is allowed but not whether it is actually
> applicable. You still need to provide the entire pattern/actions description
> to create a flow rule.

In reality it's almost the same as your proposal, it just took me a moment
to see it. The only difference I can see is that adding new headers via the
RAW type only supports fixed length headers.

To answer your question, the flow_pattern would have to include an action
set as well to give a list of actions to perform. I just didn't include
it here.

> 
>>> Also if we have multiple tables what do you think about adding a
>>> table_id to the signature. Probably not needed in the first generation
>>> but is likely useful for hardware with multiple tables so that it
>>> would be,
>>>
>>>    rte_flow_create(uint8_t port_id, uint8_t table_id, ...);
> 
> Not sure if I understand the table ID concept, do you mean in case a device
> supports entirely different sets of features depending on something? (What?)
> 

In many devices we support multiple tables each with their own size,
match fields and action set. This is useful for building routers for
example along with lots of other constructs. The basic idea is
smashing everything into a single table creates a Cartesian product
problem.

>>> Finally one other problem we've had which would be great to address
>>> if we are doing a rewrite of the API is adding new protocols to
>>> already deployed DPDK stacks. This is mostly a Linux distribution
>>> problem where you can't easily update DPDK.
>>>
>>> In the prototype header linked in this document it seems to add new
>>> headers requires adding a new enum in the rte_flow_item_type but there
>>> is at least an attempt at a catch all here,
>>>
>>>> 	/**
>>>> 	 * Matches a string of a given length at a given offset (in bytes),
>>>> 	 * or anywhere in the payload of the current protocol layer
>>>> 	 * (including L2 header if used as the first item in the stack).
>>>> 	 *
>>>> 	 * See struct rte_flow_item_raw.
>>>> 	 */
>>>> 	RTE_FLOW_ITEM_TYPE_RAW,
>>>
>>> Actually this is a nice implementation because it works after the
>>> previous item in the stack correct?
> 
> Yes, this is correct.

Great.

> 
>>> So you can put it after "known"
>>> variable length headers like IP. The limitation is it can't get past
>>> undefined variable length headers.
> 
> RTE_FLOW_ITEM_TYPE_ANY is made for that purpose. Is that what you are
> looking for?
> 

But my understanding is that FLOW_ITEM_TYPE_ANY skips "any" header type.
If we have a new variable length header in the future, we will have to add
a new type, RTE_FLOW_ITEM_TYPE_FOO for example. The RAW type will work
for fixed headers as noted above.

>>> However if you use the above parse
>>> graph reporting from the driver mechanism and the driver always reports
>>> its largest supported graph then we don't have this issue where a new
>>> hardware sku/ucode/etc added support for new headers but we have no
>>> way to deploy it to existing software users without recompiling and
>>> redeploying.
> 
> I really would like to understand if you see a limitation regarding this
> with the specified API, even assuming DPDK is compiled as a shared library
> and thus not part of the user application.
> 

Thanks, this thread was very helpful for me at least. So the summary
for me is: capability queries can be built on top of this API no
problem, and for many existing devices capability queries will not be
able to guarantee a flow insertion success due to hardware
quirks/limitations.

The two open items from me are: do we need to support adding new variable
length headers? And how do we handle multiple tables? I'll take that up
in the other thread.

>>> I looked at the git repo but I only saw the header definition I guess
>>> the implementation is TBD after there is enough agreement on the
>>> interface?
> 
> Precisely, I intend to update the tree and send a v2 soon (unfortunately did
> not have much time these past few days to work on this).
> 
> Now what if, instead of a seemingly complex parse graph and still in
> addition to the query method, enum values were defined for PMDs to report
> an array of supported items, typical patterns and actions so applications
> can get a quick idea of what devices are capable of without being too
> specific. Something like:
> 
>  enum rte_flow_capability {
>      RTE_FLOW_CAPABILITY_ITEM_ETH,
>      RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
>      RTE_FLOW_CAPABILITY_ACTION_ID,
>      ...
>  };
> 
> Although I'm not convinced about the usefulness of this because it would
> have to be maintained separately, but that would be easier than building a
> dummy flow rule for simple query purposes.

I'm not sure it's necessary either at first.

> 
> The main question I have for you is, do you think the core of the specified
> API is adequate enough assuming it can be extended later with new methods?
> 

The above two items are my only open items at this point. I agree with your
summary of my capabilities proposal, namely that it can be added.

.John


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-03 16:44           ` Adrien Mazarguil
@ 2016-08-03 19:11             ` John Fastabend
  2016-08-04 13:24               ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-08-03 19:11 UTC (permalink / raw)
  To: Rahul Lakkireddy, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

[...]

>>>>>> The proposal looks very good.  It satisfies most of the features
>>>>>> supported by Chelsio NICs.  We are looking for suggestions on exposing
>>>>>> more additional features supported by Chelsio NICs via this API.
>>>>>>
>>>>>> Chelsio NICs have two regions in which filters can be placed -
>>>>>> Maskfull and Maskless regions.  As their names imply, maskfull region
>>>>>> can accept masks to match a range of values; whereas, maskless region
>>>>>> don't accept any masks and hence perform a more strict exact-matches.
>>>>>> Filters without masks can also be placed in maskfull region.  By
>>>>>> default, maskless region have higher priority over the maskfull region.
>>>>>> However, the priority between the two regions is configurable.
>>>>>
>>>>> I understand this configuration affects the entire device. Just to be clear,
>>>>> assuming some filters are already configured, are they affected by a change
>>>>> of region priority later?
>>>>>
>>>>
>>>> Both the regions exist at the same time in the device.  Each filter can
>>>> either belong to maskfull or the maskless region.
>>>>
>>>> The priority is configured at time of filter creation for every
>>>> individual filter and cannot be changed while the filter is still
>>>> active. If priority needs to be changed for a particular filter then,
>>>> it needs to be deleted first and re-created.
>>>
>>> Could you model this as two tables and add a table_id to the API? This
>>> way user space could populate the table it chooses. We would have to add
>>> some capabilities attributes to "learn" if tables support masks or not
>>> though.
>>>
>>
>> This approach sounds interesting.
> 
> Now I understand the idea behind these tables, however from an application
> point of view I still think it's better if the PMD could take care of flow
> rules optimizations automatically. Think about it, PMDs have exactly a
> single kind of device they know perfectly well to manage, while applications
> want the best possible performance out of any device in the most generic
> fashion.

The problem is that keeping priorities in order and/or possibly breaking
rules apart (e.g. you have an L2 table and an L3 table) becomes very
complex to manage at driver level. I think it's easier for the
application, which has some context, to do this. An application that
"knows" it's a router, for example, will likely be able to pack rules
better than a PMD will.

> 
>>> I don't see how the PMD can sort this out in any meaningful way and it
>>> has to be exposed to the application that has the intelligence to 'know'
>>> priorities between masks and non-masks filters. I'm sure you could come
>>> up with something but it would be less than ideal in many cases I would
>>> guess and we can't have the driver getting priorities wrong or we may
>>> not get the correct behavior.
> 
> It may be solved by having the PMD maintain a SW state to quickly know which
> rules are currently created and in what state the device is so basically the
> application doesn't have to perform this work.
> 
> This API allows applications to express basic needs such as "redirect
> packets matching this pattern to that queue". It must not deal with HW
> details and limitations in my opinion. If a request cannot be satisfied,
> then the rule cannot be created. No help from the application must be
> expected by PMDs, otherwise it opens the door to the same issues as the
> legacy filtering APIs.

This depends on the application and what/how it wants to manage the
device. If the application manages a pipeline with some set of tables,
then mapping this down to a single table, which the PMD then has to
unwind back to a multi-table topology, seems like a waste to me.

> 
> [...]
>>>> Unfortunately, our maskfull region is extremely small too compared to
>>>> maskless region.
>>>>
>>>
>>> To me this means a userspace application would want to pack it
>>> carefully to get the full benefit. So you need some mechanism to specify
>>> the "region" hence the above table proposal.
>>>
>>
>> Right. Makes sense.
> 
> I do not agree, applications should not be aware of it. Note this case can
> be handled differently, so that rules do not have to be moved back and forth
> between both tables. If the first created rule requires a maskfull entry,
> then all subsequent rules will be entered into that table. Otherwise no
> maskfull entry can be created as long as there is one maskless entry. When
> either table is full, no more rules may be added. Would that work for you?
> 

It's not about mask vs no mask. The devices with multiple tables that I
have don't have these mask limitations. It's about how to optimally pack
the rules and who implements that logic. I think it's best done in the
application where I have the context.

Is there a way to omit the table field if the PMD is expected to do
a best effort, and add the table field if the user wants explicit
control over table management? This would support both models. I at least
would like to have explicit control over rule population in my pipeline
for use cases where I'm building a pipeline on top of the hardware.
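
As a rough illustration of supporting both models, the sketch below uses a
sentinel table ID meaning "let the PMD decide". The names flow_attr_sketch,
FLOW_TABLE_ANY and flow_resolve_table() are hypothetical and not part of the
proposed API; pmd_choice stands for whatever table the PMD would pick on its
own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the application leaves table selection to the PMD. */
#define FLOW_TABLE_ANY UINT32_MAX

/* Hypothetical rule attributes, for illustration only. */
struct flow_attr_sketch {
	uint32_t priority;
	uint32_t table_id; /* FLOW_TABLE_ANY or an explicit table number */
};

/*
 * Return the table a rule ends up in: the explicit table if the
 * application asked for one, otherwise the PMD's own best-effort choice.
 */
static uint32_t
flow_resolve_table(const struct flow_attr_sketch *attr, uint32_t pmd_choice)
{
	assert(attr != NULL);
	return attr->table_id == FLOW_TABLE_ANY ? pmd_choice : attr->table_id;
}
```

Applications that do not care about tables would always pass FLOW_TABLE_ANY,
so the simple "redirect this pattern to that queue" usage stays unchanged.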

>> [...]
>>>>> Now about this "promisc" match criteria, it can be added as a new meta
>>>>> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
>>>>> start or add it later with the related code in your PMD?
>>>>>
>>>>
>>>> It could be added as a meta item.  If there are other interested
>>>> parties, it can be added now.  Otherwise, we'll add it with our filtering
>>>> related code.
>>>>
>>>
>>> hmm I guess by "promisc" here you mean match packets received from the
>>> wire before they have been switched by the silicon?
>>>
>>
>> Match packets received from wire before they have been switched by
>> silicon, and which also includes packets not destined for DUT and were
>> still received due to interface being in promisc mode.
> 
> I think it's fine, but we'll have to precisely define what happens when a
> packet matched with such pattern is part of a terminating rule. For instance
> if it is duplicated by HW, then the rule cannot be terminating.
> 
> [...]
>>>> This raises another interesting question.  What should the PMD do
>>>> if it has support to only a subset of fields in the particular item?
>>>>
>>>> For example, if a rule has been sent to match IP fragmentation along
>>>> with several other IPv4 fields, and if the underlying hardware doesn't
>>>> support matching based on IP fragmentation, does the PMD reject the
>>>> complete rule although it could have done the matching for rest of the
>>>> IPv4 fields?
>>>
>>> I think it has to fail the command other wise user space will not have
>>> any way to understand that the full match criteria can not be met and
>>> we will get different behavior for the same applications on different
>>> nics depending on hardware feature set. This will most likely break
>>> applications so we need the error IMO.
>>>
>>
>> Ok. Makes sense.
> 
> Yes, I fully agree with this.
> 
>>>>>> - Match range of physical ports on the NIC in a single rule via masks.
>>>>>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
>>>>>>   ports available on the NIC.
>>>>>
>>>>> Applications create flow rules per port, I'd thus suggest that the PMD
>>>>> should detect identical rules created on different ports and aggregate them
>>>>> as a single HW rule automatically.
>>>>>
>>>>> If you think this approach is not right, the alternative is a meta pattern
>>>>> item that provides a list of ports. I'm not sure this is the right approach
>>>>> considering it would most likely not be supported by most NICs. Applications
>>>>> may not request it explicitly.
>>>>>
>>>>
>>>> Aggregating via PMD will be expensive operation since it would involve:
>>>> - Search of existing filters.
>>>> - Deleting those filters.
>>>> - Creating a single combined filter.
>>>>
>>>> And all of above 3 operations would need to be atomic so as not to
>>>> affect existing traffic which is hitting above filters.
> 
> Atomicity may not be a problem if the PMD makes sure the new combined rule
> is inserted before the others, so they do not need to be removed either.
> 
>>>> Adding a
>>>> meta item would be a simpler solution here.
> 
> Yes, clearly.
> 
>>> For this adding a meta-data item seems simplest to me. And if you want
>>> to make the default to be only a single port that would maybe make it
>>> easier for existing apps to port from flow director. Then if an
>>> application cares it can create a list of ports if needed.
>>>
>>
>> Agreed.
> 
> However although I'm not opposed to adding dedicated meta items, remember
> applications will not automatically benefit from the increased performance
> if a single PMD implements this feature, their maintainers will probably not
> bother with it.
> 

Unless, as we noted in the other thread, the application is closely bound to
its hardware for capability reasons. In this case it would make sense
to implement.

>>>>>> - Match range of Physical Functions (PFs) on the NIC in a single rule
>>>>>>   via masks. For ex: match all traffic coming on several PFs.
>>>>>
>>>>> The PF and VF pattern items assume there is a single PF associated with a
>>>>> DPDK port. VFs are identified with an ID. I basically took the same
>>>>> definitions as the existing filter types, perhaps this is not enough for
>>>>> Chelsio adapters.
>>>>>
>>>>> Do you expose more than one PF for a DPDK port?
>>>>>
>>>>> Anyway, I'd suggest the same approach as above, automatic aggregation of
>>>>> rules for performance reasons, otherwise new or updated PF/VF pattern items,
>>>>> in which case it would be great if you could provide ideal structure
>>>>> definitions for this use case.
>>>>>
>>>>
>>>> In Chelsio hardware, all the ports of a device are exposed via single
>>>> PF4. There could be many VFs attached to a PF.  Physical NIC functions
>>>> are operational on PF4, while VFs can be attached to PFs 0-3.
>>>> So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
>>>> mapping assumption.
>>>>
>>>> There already seems to be a PF meta-item, but it doesn't seem to accept
>>>> any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
>>>> seem to accept a "mask" field.  We could probably enable these fields
>>>> in the PF and VF meta-items to allow configuration.
>>>
>>> Maybe a range field would help here as well? So you could specify a VF
>>> range. It might be one of the things to consider adding later though if
>>> there is no clear use for it now.
>>>
>>
>> VF-value and VF-mask would help to achieve the desired filter.
>> VF-mask would also enable to specify a range of VF values.
> 
> Like John, I think a range or even a list instead of a mask would be better,
> the PMD can easily create a mask from that if necessary. Reason is that
> we've always had bad experiences with bit-fields, they're always too short
> at some point and we would like to avoid having to break the ABI to update
> existing pattern items later.

Agreed, avoiding bit-fields is a good idea.
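
For reference, deriving masks from a range is mechanical. A minimal sketch,
assuming 32-bit VF IDs and a hypothetical helper range_to_masks() that splits
an inclusive range into aligned value/mask pairs (none of these names exist
in DPDK):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One value/mask pair: matches any id where (id & mask) == value. */
struct value_mask {
	uint32_t value;
	uint32_t mask;
};

/*
 * Split the inclusive range [lo, hi] into aligned power-of-two blocks,
 * each expressed as a value/mask pair. Returns the number of pairs
 * written, or 0 if lo > hi or if "out" has fewer than the needed entries.
 */
static size_t
range_to_masks(uint32_t lo, uint32_t hi, struct value_mask *out, size_t max)
{
	size_t n = 0;
	uint64_t cur = lo; /* 64-bit so that cur can step past UINT32_MAX */

	assert(out != NULL);
	while (cur <= hi) {
		uint64_t size = 1;

		/* Grow the block while it stays aligned and inside the range. */
		while ((cur & ((size << 1) - 1)) == 0 &&
		       cur + (size << 1) - 1 <= hi)
			size <<= 1;
		if (n == max)
			return 0;
		out[n].value = (uint32_t)cur;
		out[n].mask = ~(uint32_t)(size - 1);
		n++;
		cur += size;
	}
	return n;
}
```

For example, the range 4..7 yields a single pair (value 4, mask 0xfffffffc),
while 3..4 needs two exact-match pairs since 3 and 4 share no aligned block.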

> 
> Also while I don't think this is the case yet, perhaps it will be a good
> idea for PFs/VFs to have global unique IDs, just like DPDK ports.
> 
> Thanks.
> 


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-03 18:10           ` John Fastabend
@ 2016-08-04 13:05             ` Adrien Mazarguil
  2016-08-09 21:24               ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-04 13:05 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

On Wed, Aug 03, 2016 at 11:10:49AM -0700, John Fastabend wrote:
> [...]
> 
> >>>> Considering that allowed pattern/actions combinations cannot be known in
> >>>> advance and would result in an unpractically large number of capabilities to
> >>>> expose, a method is provided to validate a given rule from the current
> >>>> device configuration state without actually adding it (akin to a "dry run"
> >>>> mode).
> >>>
> >>> Rather than have a query/validate process why did we jump over having an
> >>> intermediate representation of the capabilities? Here you state it is
> >>> unpractical but we know how to represent parse graphs and the drivers
> >>> could report their supported parse graph via a single query to a middle
> >>> layer.
> >>>
> >>> This will actually reduce the msg chatter imagine many applications at
> >>> init time or in boundary cases where a large set of applications come
> >>> online at once and start banging on the interface all at once seems less
> >>> than ideal.
> > 
> > Well, I also thought about a kind of graph to represent capabilities but
> > feared the extra complexity would not be worth the trouble, thus settled on
> > the query idea. A couple more reasons:
> > 
> > - Capabilities evolve at the same time as devices are configured. For
> >   example, if a device supports a single RSS context, then a single rule
> >   with a RSS action may be created. The graph would have to be rewritten
> >   accordingly and thus queried/parsed again by the application.
> 
> The graph would not help here because this is an action
> restriction not a parsing restriction. This is yet another query to see
> what actions are supported and how many of each action are supported.
> 
>    get_parse_graph - report the parsable fields
>    get_actions - report the supported actions and possible num of each

OK, now I understand your idea. In my mind the graph was indeed supposed to
represent complete flow rules.

> > - Expressing capabilities at bit granularity (say, for a matching pattern
> >   item mask) is complex, there is no way to simplify the representation of
> >   capabilities without either losing information or making the graph more
> >   complex to parse than simply providing a flow rule from an application
> >   point of view.
> > 
> 
> I'm not sure I understand 'bit granularity' here. I would say we have
> devices now that have rather strange restrictions due to hardware
> implementation. Going forward we should get better hardware and a lot
> of this will go away in my view. Yes this is a long term view and
> doesn't help the current state. The overall point you are making is
> the sum off all these strange/odd bits in the hardware implementation
> means capabilities queries are very difficult to guarantee. On existing
> hardware and I think you've convinced me. Thanks ;)

Precisely. By "bit granularity" I meant that while it is fairly easy to
report whether bit-masking is supported on protocol fields such as MAC
addresses at all, devices may have restrictions on the possible bit-masks:
they may only have an effect at byte level (0xff), may not allow
specific bits (broadcast), or may even offer only a fixed set of bit-masks to
choose from.
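
As an example of such a restriction, a PMD for a device whose masks only work
at byte level could reject anything else with a check like the following.
This is only a sketch; mask_is_byte_granular() is a hypothetical helper, not
an existing DPDK function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if every byte of the mask is either fully set (0xff) or
 * fully clear (0x00), i.e. the mask can be honored by hardware that
 * only masks whole bytes.
 */
static bool
mask_is_byte_granular(const uint8_t *mask, size_t len)
{
	assert(mask != NULL);
	for (size_t i = 0; i < len; i++)
		if (mask[i] != 0x00 && mask[i] != 0xff)
			return false;
	return true;
}
```

A capability graph would have to encode this rule (and every other
device-specific variant of it) explicitly, whereas with the query approach
the PMD simply refuses the rule.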

[...]
> > I understand, however I think this approach may be too low-level to express
> > all the possible combinations. This graph would have to include possible
> > actions for each possible pattern, all while considering that some actions
> > are not possible with some patterns and that there are exclusive actions.
> > 
> 
> Really? You have hardware that has dependencies between the parser and
> the supported actions? Ugh...

Not that I know of actually, even though we cannot rule out this
possibility.

Here are the possible cases I have in mind with existing HW:

- Too many actions specified for a single rule, even though each of them is
  otherwise supported.

- Performing several encap/decap actions. None are defined in the initial
  specification but these are already planned.

- Assuming there is a single table from the application point of view
  (separate discussion for the other thread), some actions may only be
  possible with the right pattern item or meta item. Asking HW to perform
  tunnel decap may only be safe if the pattern specifically matches that
  protocol.

> If the hardware has separate tables then we shouldn't try to have the
> PMD flatten those into a single table because we will have no way of
> knowing how to do that. (I'll respond to the other thread on this in
> an attempt to not get to scattered).

OK, will reply there as well.

> > Also while memory consumption is not really an issue, such a graph may be
> > huge. It could take a while for the PMD to update it when adding a rule
> > impacting capabilities.
> 
> Ugh... I wouldn't suggest updating the capabilities at runtime like
> this. But I see your point if the graph has to _guarantee_ correctness
> how does it represent limited number of masks and other strange hw,
> its unfortunate the hardware isn't more regular.
> 
> You have convinced me that guaranteed correctness via capabilities
> is going to difficult for many types of devices although not all.

I'll just add that these capabilities also depend on side effects of
configuration performed outside the scope of this API. The way queues are
(re)initialized or offloads configured may affect them. RSS configuration is
the most obvious example.

> [...]
> 
> >>
> >> The cost doing all this is some additional overhead at init time. But
> >> building generic function over this and having a set of predefined
> >> uids for well-known protocols such ip, udp, tcp, etc helps. What you
> >> get for the cost is a few things that I think are worth it. (i) Now
> >> new protocols can be added/removed without recompiling DPDK (ii) a
> >> software package can use the capability query to verify the required
> >> protocols are off-loadable vs a possibly large set of test queries and
> >> (iii) when we do the programming of the device we can provide a tuple
> >> (table-uid, header-uid, field-uid, value, mask, priority) and the
> >> middle layer "knowing" the above graph can verify the command so
> >> drivers only ever see "good"  commands, (iv) finally it should be
> >> faster in terms of cmds per second because the drivers can map the
> >> tuple (table, header, field, priority) to a slot efficiently vs
> >> parsing.
> >>
> >> IMO point (iii) and (iv) will in practice make the code much simpler
> >> because we can maintain common middle layer and not require parsing
> >> by drivers. Making each driver simpler by abstracting into common
> >> layer.
> > 
> > Before answering your points, let's consider how applications are going to
> > be written. Not only devices do not support all possible pattern/actions
> > combinations, they also have memory constraints. Whichever method
> > applications use to determine if a flow rule is supported, at some point
> > they won't be able to add any more due to device limitations.
> > 
> > Sane applications designed to work regardless of the underlying device won't
> > simply call abort() at this point but provide a software fallback
> > instead. My bet is that applications will provide one every time a rule
> > cannot be added for any reason, they won't even bother to query capabilities
> > except perhaps for a very small subset, as in "does this device support the
> > ID action at all?".
> > 
> > Applications that really want/need to know at init time whether all the
> > rules they may want to possibly create are supported will spend about the
> > same time in both cases (query or graph). For queries, by iterating on a
> > list of typical rules. For a graph, by walking through it. Either way, it
> > won't be done later from the data path.
> 
> The queries and graph suffer from the same problems you noted above if
> actually instantiating the rules will impact what rules are allowed. So
> that in both cases we may run into corner cases but it seems that this
> is a result of hardware deficiencies and can't be solved easily at least
> with software.
> 
> My concern is this non-determinism will create performance issues in
> the network because when a flow may or may not be offloaded this can
> have a rather significant impact on its performance. This can make
> debugging network wide performance miserable when at time X I get
> performance X and then for whatever reason something degrades to
> software and at time Y I get some performance Y << X. I suspect that
> in general applications will bind tightly with hardware they know
> works.

You are right, performance determinism is not taken into account at all, at
least not yet. It should not be an issue at the beginning as long as the
API has the ability to evolve later for applications that need it.

Just an idea, could some kind of meta pattern items specifying time
constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
may take to query/apply/delete the rule. If it cannot be guaranteed, the
rule cannot be created. Applications could maintain statistic counters about
failed rules to determine if performance issues are caused by the inability
to create them.
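
Such counters need very little on the application side. A minimal sketch,
assuming rule creation reports success as a zero return code; the structure
and function names below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-port accounting of where flow rules ended up. */
struct offload_stats {
	uint64_t hw_rules;     /* rules accepted by the device */
	uint64_t sw_fallbacks; /* rules handled in software instead */
};

/* rc: result of the rule creation attempt, 0 meaning success (assumed). */
static void
offload_stats_record(struct offload_stats *st, int rc)
{
	assert(st != NULL);
	if (rc == 0)
		st->hw_rules++;
	else
		st->sw_fallbacks++;
}

/* Share of rules that fell back to software, in percent (0 if none seen). */
static unsigned int
offload_stats_fallback_pct(const struct offload_stats *st)
{
	uint64_t total;

	assert(st != NULL);
	total = st->hw_rules + st->sw_fallbacks;
	return total ? (unsigned int)(st->sw_fallbacks * 100 / total) : 0;
}
```

A sudden rise in the fallback percentage would then point at rule creation
failures rather than at the network itself.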

[...]
> > For individual points:
> > 
> > (i) should be doable with the query API without recompiling DPDK as well,
> > the fact API/ABI breakage must be avoided being part of the requirements. If
> > you think there is a problem regarding this, can you provide a specific
> > example?
> 
> What I was after you noted yourself in the doc here,
> 
> "PMDs can rely on this capability to simulate support for protocols with
> fixed headers not directly recognized by hardware."
> 
> I was trying to get variable header support with the RAW capabilities. A
> parse graph supports this for example the proposed query API does not.

OK, I see, however the RAW capability itself may not be supported everywhere
in patterns. What I described is that PMDs, not applications, could leverage
the RAW abilities of underlying devices to implement otherwise unsupported
but fixed patterns.

So basically you would like to expose the ability to describe fixed protocol
definitions following RAW patterns, as in:

 ETH / RAW / IP / UDP / ...

However, with such a pattern the current specification makes RAW (4.1.4.2)
and IP start matching from the same offset as two different branches, so in
effect you cannot specify a fixed protocol following a RAW item.

It is defined that way because I do not see how HW could parse higher level
protocols after having given up due to a RAW pattern, however assuming the
entire stack is described only using RAW patterns I guess it could be done.

Such a pattern could be generated from a separate function before feeding it
to rte_flow_create(), or translated by the PMD afterwards assuming a
separate meta item such as RAW_END exists to signal the end of a RAW layer.
Of course processing this would be more expensive.

[...]
> >>> One strategy I've used in other systems that worked relatively well
> >>> is if the query for the parse graph above returns a key for each node
> >>> in the graph then a single lookup can map the key to a node. Its
> >>> unambiguous and then these operations simply become a table lookup.
> >>> So to be a bit more concrete this changes the pattern structure in
> >>> rte_flow_create() into a  <key,value,mask> tuple where the key is known
> >>> by the initial parse graph query. If you reserve a set of well-defined
> >>> key values for well known protocols like ethernet, ip, etc. then the
> >>> query model also works but the middle layer catches errors in this case
> >>> and again the driver only gets known good flows. So something like this,
> >>>
> >>>   struct rte_flow_pattern {
> >>> 	uint32_t priority;
> >>> 	uint32_t key;
> >>> 	uint32_t value_length;
> >>> 	u8 *value;
> >>>   }
> > 
> > I agree that having an integer representing an entire pattern/actions combo
> > would be great, however how do you tell whether you want matched packets to
> > be duplicated to queue 6 and redirected to queue 3? This method can be used
> > to check if a type of rule is allowed but not whether it is actually
> > applicable. You still need to provide the entire pattern/actions description
> > to create a flow rule.
> 
> In reality its almost the same as your proposal it just took me a moment
> to see it. The only difference I can see is adding new headers via RAW
> type only supports fixed length headers.
> 
> To answer your question the flow_pattern would have to include a action
> set as well to give a list of actions to perform. I just didn't include
> it here.

OK.

> >>> Also if we have multiple tables what do you think about adding a
> >>> table_id to the signature. Probably not needed in the first generation
> >>> but is likely useful for hardware with multiple tables so that it
> >>> would be,
> >>>
> >>>    rte_flow_create(uint8_t port_id, uint8_t table_id, ...);
> > 
> > Not sure if I understand the table ID concept, do you mean in case a device
> > supports entirely different sets of features depending on something? (What?)
> > 
> 
> In many devices we support multiple tables each with their own size,
> match fields and action set. This is useful for building routers for
> example along with lots of other constructs. The basic idea is
> smashing everything into a single table creates a Cartesian product
> problem.

Right, so I understand we'd need a method to express table capabilities as
well as you described (a topic for the other thread then).

[...]
> >>> So you can put it after "known"
> >>> variable length headers like IP. The limitation is it can't get past
> >>> undefined variable length headers.
> > 
> > RTE_FLOW_ITEM_TYPE_ANY is made for that purpose. Is that what you are
> > looking for?
> > 
> 
> But FLOW_ITEM_TYPE_ANY skips "any" header type is my understanding if
> we have new variable length header in the future we will have to add
> a new type RTE_FLOW_ITEM_TYPE_FOO for example. The RAW type will work
> for fixed headers as noted above.

I'm (slowly) starting to get it. How about the suggestion I made above for
RAW items then?

[...]
> The two open items from me are do we need to support adding new variable
> length headers? And how do we handle multiple tables I'll take that up
> in the other thread.

I think variable length headers may eventually be supported through pattern
tricks or a separate conversion layer.

> >>> I looked at the git repo but I only saw the header definition I guess
> >>> the implementation is TBD after there is enough agreement on the
> >>> interface?
> > 
> > Precisely, I intend to update the tree and send a v2 soon (unfortunately did
> > not have much time these past few days to work on this).
> > 
> > Now what if, instead of a seemingly complex parse graph and still in
> > addition to the query method, enum values were defined for PMDs to report
> > an array of supported items, typical patterns and actions so applications
> > can get a quick idea of what devices are capable of without being too
> > specific. Something like:
> > 
> >  enum rte_flow_capability {
> >      RTE_FLOW_CAPABILITY_ITEM_ETH,
> >      RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
> >      RTE_FLOW_CAPABILITY_ACTION_ID,
> >      ...
> >  };
> > 
> > Although I'm not convinced about the usefulness of this because it would
> > have to be maintained separately, but that would be easier than building a
> > dummy flow rule for simple query purposes.
> 
> I'm not sure its necessary either at first.

Then I'll discard this idea.

> > The main question I have for you is, do you think the core of the specified
> > API is adequate enough assuming it can be extended later with new methods?
> > 
> 
> The above two items are my only opens at this point, I agree with your
> summary of my capabilities proposal namely it can be added.

Thanks, see you in the other thread.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-03 19:11             ` John Fastabend
@ 2016-08-04 13:24               ` Adrien Mazarguil
  2016-08-09 21:47                 ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-04 13:24 UTC (permalink / raw)
  To: John Fastabend
  Cc: Rahul Lakkireddy, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
> [...]
> 
> >>>>>> The proposal looks very good.  It satisfies most of the features
> >>>>>> supported by Chelsio NICs.  We are looking for suggestions on exposing
> >>>>>> more additional features supported by Chelsio NICs via this API.
> >>>>>>
> >>>>>> Chelsio NICs have two regions in which filters can be placed -
> >>>>>> Maskfull and Maskless regions.  As their names imply, maskfull region
> >>>>>> can accept masks to match a range of values; whereas, maskless region
> >>>>>> don't accept any masks and hence perform a more strict exact-matches.
> >>>>>> Filters without masks can also be placed in maskfull region.  By
> >>>>>> default, maskless region have higher priority over the maskfull region.
> >>>>>> However, the priority between the two regions is configurable.
> >>>>>
> >>>>> I understand this configuration affects the entire device. Just to be clear,
> >>>>> assuming some filters are already configured, are they affected by a change
> >>>>> of region priority later?
> >>>>>
> >>>>
> >>>> Both the regions exist at the same time in the device.  Each filter can
> >>>> either belong to maskfull or the maskless region.
> >>>>
> >>>> The priority is configured at time of filter creation for every
> >>>> individual filter and cannot be changed while the filter is still
> >>>> active. If priority needs to be changed for a particular filter then,
> >>>> it needs to be deleted first and re-created.
> >>>
> >>> Could you model this as two tables and add a table_id to the API? This
> >>> way user space could populate the table it chooses. We would have to add
> >>> some capabilities attributes to "learn" if tables support masks or not
> >>> though.
> >>>
> >>
> >> This approach sounds interesting.
> > 
> > Now I understand the idea behind these tables, however from an application
> > point of view I still think it's better if the PMD could take care of flow
> > rules optimizations automatically. Think about it, PMDs have exactly a
> > single kind of device they know perfectly well to manage, while applications
> > want the best possible performance out of any device in the most generic
> > fashion.
> 
> The problem is keeping priorities in order and/or possibly breaking
> rules apart (e.g. you have an L2 table and an L3 table) becomes very
> complex to manage at driver level. I think its easier for the
> application which has some context to do this. The application "knows"
> if its a router for example will likely be able to pack rules better
> than a PMD will.

I don't think most applications know they are L2 or L3 routers. They may not
know more than the pattern provided to the PMD, which may indeed end at an L2
or L3 protocol. If the application simply chooses a table based on this
information, then the PMD could have easily done the same.

I understand the issue is what happens when applications really want to
define e.g. L2/L3/L2 rules in this specific order (or any ordering that
cannot be satisfied by HW due to table constraints).

With tables exposed, applications would in such a case have to move all
rules from the L2 to an L3 table themselves (assuming this is even
supported) to guarantee ordering between rules, or fail to add them. This is
basically what the PMD could have done, possibly in a more efficient manner
in my opinion.

Let's assume two opposite scenarios for this discussion:

- App #1 is a command-line interface directly mapped to flow rules, which
  basically gets slow random input from users depending on how they want to
  configure their traffic. All rules differ considerably (L2, L3, L4, some
  with incomplete bit-masks, etc). All in all, few but complex rules with
  specific priorities.

- App #2 is something like OVS, creating and deleting a large number of very
  specific (without incomplete bit-masks) and mostly identical
  single-priority rules automatically and very frequently.

Actual applications will certainly be a mix of both.

For app #1, users would have to be aware of these tables and base their
filtering decisions on them. Reporting table capabilities and making sure
priorities between tables are properly configured will be their
responsibility. Obviously applications may take care of these details for
them, but the end result will be the same. At some point, some combination
won't be possible. Getting there will only have been more complicated from
the user's or application's point of view.

For app #2 if the first rule can be created then subsequent rules shouldn't
be a problem until their number reaches device limits. Selecting the proper
table to use for these can easily be done by the PMD.
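
To illustrate how a PMD could do this on its own, here is a rough C sketch
of the placement policy suggested earlier in this thread for a device with
maskfull and maskless regions: the first rule picks the region, masked rules
are refused while maskless entries exist, and a full region refuses new
rules. All names (sketch_regions, sketch_region_add, ...) and the error
codes are assumptions made for this example, not part of the proposed API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region identifiers, also used to index capacity[]. */
enum sketch_region {
    SKETCH_REGION_NONE,
    SKETCH_REGION_MASKFULL,
    SKETCH_REGION_MASKLESS,
};

/* Software state a PMD could keep to track where rules currently live. */
struct sketch_regions {
    enum sketch_region active;  /* region picked by the first rule */
    unsigned int count;         /* rules currently stored in it */
    unsigned int capacity[3];   /* per-region limits */
};

/* Returns 0 on success, -EINVAL when a masked rule is requested while
 * maskless entries exist, -ENOSPC when the active region is full. */
static int sketch_region_add(struct sketch_regions *st, bool needs_mask)
{
    enum sketch_region target;

    assert(st != NULL);
    target = st->active;
    if (target == SKETCH_REGION_NONE)
        target = needs_mask ? SKETCH_REGION_MASKFULL : SKETCH_REGION_MASKLESS;
    if (target == SKETCH_REGION_MASKLESS && needs_mask)
        return -EINVAL;
    if (st->count >= st->capacity[target])
        return -ENOSPC;
    st->active = target;
    st->count++;
    return 0;
}

/* Once the last rule is gone, the next rule may pick a region again. */
static void sketch_region_del(struct sketch_regions *st)
{
    assert(st != NULL && st->count > 0);
    if (--st->count == 0)
        st->active = SKETCH_REGION_NONE;
}
```

Rules never move between regions in this sketch, which keeps it simple at
the cost of refusing some combinations the hardware could otherwise hold.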

> >>> I don't see how the PMD can sort this out in any meaningful way and it
> >>> has to be exposed to the application that has the intelligence to 'know'
> >>> priorities between masks and non-masks filters. I'm sure you could come
> >>> up with something but it would be less than ideal in many cases I would
> >>> guess and we can't have the driver getting priorities wrong or we may
> >>> not get the correct behavior.
> > 
> > It may be solved by having the PMD maintain a SW state to quickly know which
> > rules are currently created and in what state the device is so basically the
> > application doesn't have to perform this work.
> > 
> > This API allows applications to express basic needs such as "redirect
> > packets matching this pattern to that queue". It must not deal with HW
> > details and limitations in my opinion. If a request cannot be satisfied,
> > then the rule cannot be created. No help from the application must be
> > expected by PMDs, otherwise it opens the door to the same issues as the
> > legacy filtering APIs.
> 
> This depends on the application and what/how it wants to manage the
> device. If the application manages a pipeline with some set of tables,
> then mapping this down to a single table, which then the PMD has to
> unwind back to a multi-table topology to me seems like a waste.

Of course, only I am not sure applications will behave differently if they
are aware of HW tables. I fear it will make things more complicated for
them and they will just stick with the most capable table all the time, but
I agree it should be easier for PMDs.

> > [...]
> >>>> Unfortunately, our maskfull region is extremely small too compared to
> >>>> maskless region.
> >>>>
> >>>
> >>> To me this means a userspace application would want to pack it
> >>> carefully to get the full benefit. So you need some mechanism to specify
> >>> the "region" hence the above table proposal.
> >>>
> >>
> >> Right. Makes sense.
> > 
> > I do not agree, applications should not be aware of it. Note this case can
> > be handled differently, so that rules do not have to be moved back and forth
> > between both tables. If the first created rule requires a maskfull entry,
> > then all subsequent rules will be entered into that table. Otherwise no
> > maskfull entry can be created as long as there is one maskless entry. When
> > either table is full, no more rules may be added. Would that work for you?
> > 
> 
> It's not about mask vs no mask. The devices with multiple tables that I
> have don't have these mask limitations. It's about how to optimally pack
> the rules and who implements that logic. I think it's best done in the
> application where I have the context.
> 
> Is there a way to omit the table field if the PMD is expected to do
> a best effort and add the table field if the user wants explicit
> control over table mgmt. This would support both models. I at least
> would like to have explicit control over rule population in my pipeline
> for use cases where I'm building a pipeline on top of the hardware.

Yes that's a possibility. Perhaps the table ID to use could be specified as
a meta pattern item? We'd still need methods to report how many tables exist
and perhaps some way to report their limitations; these could be added later
through a separate set of functions.

[...]
> >>> For this adding a meta-data item seems simplest to me. And if you want
> >>> to make the default to be only a single port that would maybe make it
> >>> easier for existing apps to port from flow director. Then if an
> >>> application cares it can create a list of ports if needed.
> >>>
> >>
> >> Agreed.
> > 
> > However although I'm not opposed to adding dedicated meta items, remember
> > applications will not automatically benefit from the increased performance
> > if a single PMD implements this feature, their maintainers will probably not
> > bother with it.
> > 
> 
> Unless as we noted in other thread the application is closely bound to
> its hardware for capability reasons. In this case it would make sense
> to implement.

Sure.

[...]

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-04 13:05             ` Adrien Mazarguil
@ 2016-08-09 21:24               ` John Fastabend
  2016-08-10 11:02                 ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-08-09 21:24 UTC (permalink / raw)
  To: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

[...]

>> I'm not sure I understand 'bit granularity' here. I would say we have
>> devices now that have rather strange restrictions due to hardware
>> implementation. Going forward we should get better hardware and a lot
>> of this will go away in my view. Yes this is a long term view and
>> doesn't help the current state. The overall point you are making is
>> the sum of all these strange/odd bits in the hardware implementation
>> means capabilities queries are very difficult to guarantee on existing
>> hardware, and I think you've convinced me. Thanks ;)
> 
> Precisely. By "bit granularity" I meant that while it is fairly easy to
> report whether bit-masking is supported on protocol fields such as MAC
> addresses at all, devices may have restrictions on the possible bit-masks,
> like they may only have an effect at byte level (0xff), may not allow
> specific bits (broadcast) or there even may be a fixed set of bit-masks to
> choose from.

Yep lots of strange hardware implementation voodoo here.
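
To make these restrictions concrete, here is a rough C sketch of the kind of
checks a driver might have to run on a MAC address bit-mask. The three rules
shown (byte-level masks only, group bit not matchable, fixed set of masks)
are examples of the quirks described above, not the behavior of any specific
device, and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical restriction: masks only take effect at byte level,
 * so every mask byte must be either 0x00 or 0xff. */
static bool mac_mask_is_byte_granular(const uint8_t mask[MAC_LEN])
{
    for (size_t i = 0; i < MAC_LEN; i++)
        if (mask[i] != 0x00 && mask[i] != 0xff)
            return false;
    return true;
}

/* Hypothetical restriction: the group (multicast/broadcast) bit, which
 * is the least significant bit of the first byte, cannot be matched. */
static bool mac_mask_avoids_group_bit(const uint8_t mask[MAC_LEN])
{
    return (mask[0] & 0x01) == 0;
}

/* Hypothetical restriction: the mask must be one of a fixed set. */
static bool mac_mask_in_fixed_set(const uint8_t mask[MAC_LEN],
                                  const uint8_t (*set)[MAC_LEN], size_t n)
{
    assert(set != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (memcmp(mask, set[i], MAC_LEN) == 0)
            return true;
    return false;
}
```

A single "bit-masking supported: yes/no" capability flag cannot express any
of these three checks, which is the point being made above.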

> 
> [...]
>>> I understand, however I think this approach may be too low-level to express
>>> all the possible combinations. This graph would have to include possible
>>> actions for each possible pattern, all while considering that some actions
>>> are not possible with some patterns and that there are exclusive actions.
>>>
>>
>> Really? You have hardware that has dependencies between the parser and
>> the supported actions? Ugh...
> 
> Not that I know of actually, even though we cannot rule out this
> possibility.
> 
> Here are the possible cases I have in mind with existing HW:
> 
> - Too many actions specified for a single rule, even though each of them is
>   otherwise supported.

Yep most hardware will have this restriction.

> 
> - Performing several encap/decap actions. None are defined in the initial
>   specification but these are already planned.
> 

Great this is certainly needed.

> - Assuming there is a single table from the application point of view
>   (separate discussion for the other thread), some actions may only be
>   possible with the right pattern item or meta item. Asking HW to perform
>   tunnel decap may only be safe if the pattern specifically matches that
>   protocol.
> 

Yep continue in other thread.

>> If the hardware has separate tables then we shouldn't try to have the
>> PMD flatten those into a single table because we will have no way of
>> knowing how to do that. (I'll respond to the other thread on this in
>> an attempt to not get too scattered).
> 
> OK, will reply there as well.
> 
>>> Also while memory consumption is not really an issue, such a graph may be
>>> huge. It could take a while for the PMD to update it when adding a rule
>>> impacting capabilities.
>>
>> Ugh... I wouldn't suggest updating the capabilities at runtime like
>> this. But I see your point if the graph has to _guarantee_ correctness
>> how does it represent limited number of masks and other strange hw,
>> its unfortunate the hardware isn't more regular.
>>
>> You have convinced me that guaranteed correctness via capabilities
>> is going to be difficult for many types of devices although not all.
> 
> I'll just add that these capabilities also depend on side effects of
> configuration performed outside the scope of this API. The way queues are
> (re)initialized or offloads configured may affect them. RSS configuration is
> the most obvious example.
> 

OK.

[...]

>>
>> My concern is this non-determinism will create performance issues in
>> the network because when a flow may or may not be offloaded this can
>> have a rather significant impact on its performance. This can make
>> debugging network wide performance miserable when at time X I get
>> performance X and then for whatever reason something degrades to
>> software and at time Y I get some performance Y << X. I suspect that
>> in general applications will bind tightly with hardware they know
>> works.
> 
> You are right, performance determinism is not taken into account at all, at
> least not yet. It should not be an issue at the beginning as long as the
> API has the ability to evolve later for applications that need it.
> 
> Just an idea, could some kind of meta pattern items specifying time
> constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
> may take to query/apply/delete the rule. If it cannot be guaranteed, the
> rule cannot be created. Applications could maintain statistic counters about
> failed rules to determine if performance issues are caused by the inability
> to create them.

It seems a bit heavy to me to have each PMD driver implementing
something like this. But it would be interesting to explore probably
after the basic support is implemented though.

> 
> [...]
>>> For individual points:
>>>
>>> (i) should be doable with the query API without recompiling DPDK as well,
>>> the fact API/ABI breakage must be avoided being part of the requirements. If
>>> you think there is a problem regarding this, can you provide a specific
>>> example?
>>
>> What I was after you noted yourself in the doc here,
>>
>> "PMDs can rely on this capability to simulate support for protocols with
>> fixed headers not directly recognized by hardware."
>>
>> I was trying to get variable header support with the RAW capabilities. A
>> parse graph supports this for example the proposed query API does not.
> 
> OK, I see, however the RAW capability itself may not be supported everywhere
> in patterns. What I described is that PMDs, not applications, could leverage
> the RAW abilities of underlying devices to implement otherwise unsupported
> but fixed patterns.
> 
> So basically you would like to expose the ability to describe fixed protocol
> definitions following RAW patterns, as in:

Correct for say some new tunnel metadata or something.

> 
>  ETH / RAW / IP / UDP / ...
> 
> While with such a pattern the current specification makes RAW (4.1.4.2) and
> IP start matching from the same offset as two different branches, in effect
> you cannot specify a fixed protocol following a RAW item.

What this means though is for every new protocol we will need to rebuild
drivers and dpdk. For a shared lib DPDK environment or a Linux
distribution this can be painful. It would be best to avoid this.

> 
> It is defined that way because I do not see how HW could parse higher level
> protocols after having given up due to a RAW pattern, however assuming the
> entire stack is described only using RAW patterns I guess it could be done.
> 
> Such a pattern could be generated from a separate function before feeding it
> to rte_flow_create(), or translated by the PMD afterwards assuming a
> separate meta item such as RAW_END exists to signal the end of a RAW layer.
> Of course processing this would be more expensive.
> 

Or the supported parse graph could be fetched from the hardware with the
values for each protocol so that the programming interface is the same.
The well-known protocols could keep their values in the
rte_flow_item_type enum, so users would not need to query the parse
graph for them. For new or experimental protocols we could query the
parse graph and get the pattern matching ID to program them with.

The normal flow would be unchanged but we don't get stuck upgrading
everything to add our own protocol. So the flow would be,

 rte_get_parse_graph(graph);
 flow_item_proto = is_my_proto_supported(graph);

 pattern = build_flow_match(flow_item_proto, value, mask);
 action = build_action();
 rte_flow_create(my_port, pattern, action);

The only change to the API proposed to support this would be to allow
unsupported RTE_FLOW_ values to be pushed to the hardware and define
a range of values that are reserved for use by parse graph discovery.

This would not have to be any more expensive.
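
As a rough sketch of the lookup step in the flow above, the C below searches
a parse graph reported by the device for a protocol by name and returns the
item type to use in a pattern. The structures, the reserved base value and
every sketch_* name are assumptions made for this example; none of them
exist in DPDK or in the proposed header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: item types at or above this value are reserved for
 * protocols learned through parse graph discovery. */
#define SKETCH_ITEM_DYNAMIC_BASE 0x10000

/* One protocol the device reports it can parse. */
struct sketch_graph_node {
    const char *name;  /* protocol name reported by the device */
    int item_type;     /* pattern matching ID to use in flow rules */
};

struct sketch_parse_graph {
    const struct sketch_graph_node *nodes;
    size_t count;
};

/* Look up a protocol by name. On success, store its item type in
 * *item_type and return true; return false if the device does not
 * report it. */
static bool sketch_graph_find(const struct sketch_parse_graph *graph,
                              const char *name, int *item_type)
{
    assert(graph != NULL && name != NULL && item_type != NULL);
    for (size_t i = 0; i < graph->count; i++) {
        if (strcmp(graph->nodes[i].name, name) == 0) {
            *item_type = graph->nodes[i].item_type;
            return true;
        }
    }
    return false;
}

/* True if the ID comes from the reserved discovery range rather than
 * from the well-known enum values. */
static bool sketch_item_is_dynamic(int item_type)
{
    return item_type >= SKETCH_ITEM_DYNAMIC_BASE;
}
```

Well-known protocols would skip this lookup entirely and keep using their
enum values directly.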

[...]

>>>>> So you can put it after "known"
>>>>> variable length headers like IP. The limitation is it can't get past
>>>>> undefined variable length headers.
>>>
>>> RTE_FLOW_ITEM_TYPE_ANY is made for that purpose. Is that what you are
>>> looking for?
>>>
>>
>> But FLOW_ITEM_TYPE_ANY skips "any" header type is my understanding if
>> we have new variable length header in the future we will have to add
>> a new type RTE_FLOW_ITEM_TYPE_FOO for example. The RAW type will work
>> for fixed headers as noted above.
> 
> I'm (slowly) starting to get it. How about the suggestion I made above for
> RAW items then?

Hmm, for performance reasons building an entire graph up using RAW items
seems to be a bit heavy. Another alternative to the above parse graph
notion would be to allow users to add RAW node definitions at init time
and have the PMD give an ID back for those. Then the new node could be
used just like any other RTE_FLOW_ITEM_TYPE in a pattern.

Something like,

	ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern);
	ret_flow_item_type_bar = rte_create_raw_node(bar_raw_pattern);

then allow ret_flow_item_type_{foo|bar} to be used in subsequent
pattern matching items. And if the hardware can not support this return
an error from the initial rte_create_raw_node() API call.

Do either of those proposals sound like reasonable extensions?

> 
> [...]
>> The two open items from me are do we need to support adding new variable
>> length headers? And how do we handle multiple tables I'll take that up
>> in the other thread.
> 
> I think variable length headers may be eventually supported through pattern
> tricks or eventually a separate conversion layer.
> 

A parse graph notion would support this naturally though without pattern
tricks hence my above suggestions.

Also in the current scheme how would I match an ipv6 option or specific
nsh option or mpls tag?

>>>>> I looked at the git repo but I only saw the header definition I guess
>>>>> the implementation is TBD after there is enough agreement on the
>>>>> interface?
>>>
>>> Precisely, I intend to update the tree and send a v2 soon (unfortunately did
>>> not have much time these past few days to work on this).
>>>
>>> Now what if, instead of a seemingly complex parse graph and still in
>>> addition to the query method, enum values were defined for PMDs to report
>>> an array of supported items, typical patterns and actions so applications
>>> can get a quick idea of what devices are capable of without being too
>>> specific. Something like:
>>>
>>>  enum rte_flow_capability {
>>>      RTE_FLOW_CAPABILITY_ITEM_ETH,
>>>      RTE_FLOW_CAPABILITY_PATTERN_ETH_IP_TCP,
>>>      RTE_FLOW_CAPABILITY_ACTION_ID,
>>>      ...
>>>  };
>>>
>>> Although I'm not convinced about the usefulness of this because it would
>>> have to be maintained separately, but that would be easier than building a
>>> dummy flow rule for simple query purposes.
>>
>> I'm not sure its necessary either at first.
> 
> Then I'll discard this idea.
> 
>>> The main question I have for you is, do you think the core of the specified
>>> API is adequate enough assuming it can be extended later with new methods?
>>>
>>
>> The above two items are my only opens at this point, I agree with your
>> summary of my capabilities proposal namely it can be added.
> 
> Thanks, see you in the other thread.
> 

Thanks,
John


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-04 13:24               ` Adrien Mazarguil
@ 2016-08-09 21:47                 ` John Fastabend
  2016-08-10 13:37                   ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: John Fastabend @ 2016-08-09 21:47 UTC (permalink / raw)
  To: Rahul Lakkireddy, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
> On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
>> [...]
>>
>>>>>>>> The proposal looks very good.  It satisfies most of the features
>>>>>>>> supported by Chelsio NICs.  We are looking for suggestions on exposing
>>>>>>>> more additional features supported by Chelsio NICs via this API.
>>>>>>>>
>>>>>>>> Chelsio NICs have two regions in which filters can be placed -
>>>>>>>> Maskfull and Maskless regions.  As their names imply, maskfull region
>>>>>>>> can accept masks to match a range of values; whereas, maskless region
>>>>>>>> don't accept any masks and hence perform a more strict exact-matches.
>>>>>>>> Filters without masks can also be placed in maskfull region.  By
>>>>>>>> default, maskless region have higher priority over the maskfull region.
>>>>>>>> However, the priority between the two regions is configurable.
>>>>>>>
>>>>>>> I understand this configuration affects the entire device. Just to be clear,
>>>>>>> assuming some filters are already configured, are they affected by a change
>>>>>>> of region priority later?
>>>>>>>
>>>>>>
>>>>>> Both the regions exist at the same time in the device.  Each filter can
>>>>>> either belong to maskfull or the maskless region.
>>>>>>
>>>>>> The priority is configured at time of filter creation for every
>>>>>> individual filter and cannot be changed while the filter is still
>>>>>> active. If priority needs to be changed for a particular filter then,
>>>>>> it needs to be deleted first and re-created.
>>>>>
>>>>> Could you model this as two tables and add a table_id to the API? This
>>>>> way user space could populate the table it chooses. We would have to add
>>>>> some capabilities attributes to "learn" if tables support masks or not
>>>>> though.
>>>>>
>>>>
>>>> This approach sounds interesting.
>>>
>>> Now I understand the idea behind these tables, however from an application
>>> point of view I still think it's better if the PMD could take care of flow
>>> rules optimizations automatically. Think about it, PMDs have exactly a
>>> single kind of device they know perfectly well to manage, while applications
>>> want the best possible performance out of any device in the most generic
>>> fashion.
>>
>> The problem is keeping priorities in order and/or possibly breaking
>> rules apart (e.g. you have an L2 table and an L3 table) becomes very
>> complex to manage at driver level. I think it's easier for the
>> application which has some context to do this. The application "knows"
>> if it's a router, for example, and will likely be able to pack rules
>> better than a PMD will.
> 
> I don't think most applications know they are L2 or L3 routers. They may not
> know more than the pattern provided to the PMD, which may indeed end at a L2
> or L3 protocol. If the application simply chooses a table based on this
> information, then the PMD could have easily done the same.
> 

But when we start thinking about encap/decap then it's natural to start
using this interface to implement various forwarding dataplanes. And one
common way to organize a switch is into a TEP, router, switch
(mac/vlan), ACL tables, etc. In fact we see this topology starting to
show up in the NICs now.

Further each table may be "managed" by a different entity. In which
case the software will want to manage the physical and virtual networks
separately.

It doesn't make sense to me to require a software aggregator object to
marshal the rules into a flat table then for a PMD to split them apart
again.

> I understand the issue is what happens when applications really want to
> define e.g. L2/L3/L2 rules in this specific order (or any ordering that
> cannot be satisfied by HW due to table constraints).
> 
> By exposing tables, in such a case applications should move all rules from
> L2 to a L3 table themselves (assuming this is even supported) to guarantee
> ordering between rules, or fail to add them. This is basically what the PMD
> could have done, possibly in a more efficient manner in my opinion.

I disagree with the more efficient comment :)

If the software layer is working on L2/TEP/ACL/router layers merging
them just to pull them back apart is not going to be more efficient.

> 
> Let's assume two opposite scenarios for this discussion:
> 
> - App #1 is a command-line interface directly mapped to flow rules, which
>   basically gets slow random input from users depending on how they want to
>   configure their traffic. All rules differ considerably (L2, L3, L4, some
>   with incomplete bit-masks, etc). All in all, few but complex rules with
>   specific priorities.
> 

Agree with this and in this case the application should be behind any
network physical/virtual and not giving rules like encap/decap/etc. This
application either sits on the physical function and "owns" the hardware
resource or sits behind a virtual switch.


> - App #2 is something like OVS, creating and deleting a large number of very
>   specific (without incomplete bit-masks) and mostly identical
>   single-priority rules automatically and very frequently.
> 

Maybe for OVS but not all virtual switches are built with flat tables
at the bottom like this. Nor is it necessarily optimal.

Another application (the one I'm concerned about :) would be built as
a pipeline, something like

	ACL -> TEP -> ACL -> VEB -> ACL

If I have hardware that supports a TEP hardware block, an ACL hardware
block and a VEB block, for example, I don't want to merge my control
plane into a single table. The merging in this case is just pure
overhead/complexity for no gain.

> Actual applications will certainly be a mix of both.
> 
> For app #1, users would have to be aware of these tables and base their
> filtering decisions according to them. Reporting tables capabilities, making
> sure priorities between tables are well configured will be their
> responsibility. Obviously applications may take care of these details for
> them, but the end result will be the same. At some point, some combination
> won't be possible. Getting there was only more complicated from
> users/applications point of view.
> 
> For app #2 if the first rule can be created then subsequent rules shouldn't
> be a problem until their number reaches device limits. Selecting the proper
> table to use for these can easily be done by the PMD.
> 

But it requires rewriting my pipeline software to be useful and this I
want to avoid. Using my TEP example again I'll need something in
software to catch every VEB/ACL rule and append the rest of the rule,
creating wide rules. For my use cases it's not a very user-friendly API.

>>>>> I don't see how the PMD can sort this out in any meaningful way and it
>>>>> has to be exposed to the application that has the intelligence to 'know'
>>>>> priorities between masks and non-masks filters. I'm sure you could come
>>>>> up with something but it would be less than ideal in many cases I would
>>>>> guess and we can't have the driver getting priorities wrong or we may
>>>>> not get the correct behavior.
>>>
>>> It may be solved by having the PMD maintain a SW state to quickly know which
>>> rules are currently created and in what state the device is so basically the
>>> application doesn't have to perform this work.
>>>
>>> This API allows applications to express basic needs such as "redirect
>>> packets matching this pattern to that queue". It must not deal with HW
>>> details and limitations in my opinion. If a request cannot be satisfied,
>>> then the rule cannot be created. No help from the application must be
>>> expected by PMDs, otherwise it opens the door to the same issues as the
>>> legacy filtering APIs.
>>
>> This depends on the application and what/how it wants to manage the
>> device. If the application manages a pipeline with some set of tables,
>> then mapping this down to a single table, which then the PMD has to
>> unwind back to a multi-table topology to me seems like a waste.
> 
> Of course, only I am not sure applications will behave differently if they
> are aware of HW tables. I fear it will make things more complicated for
> them and they will just stick with the most capable table all the time, but
> I agree it should be easier for PMDs.
> 

On the other side if the API doesn't match my software pipeline the
complexity/overhead of merging it just to tear it apart again may
prohibit use of the API in these cases.

>>> [...]
>>>>>> Unfortunately, our maskfull region is extremely small too compared to
>>>>>> maskless region.
>>>>>>
>>>>>
>>>>> To me this means a userspace application would want to pack it
>>>>> carefully to get the full benefit. So you need some mechanism to specify
>>>>> the "region" hence the above table proposal.
>>>>>
>>>>
>>>> Right. Makes sense.
>>>
>>> I do not agree, applications should not be aware of it. Note this case can
>>> be handled differently, so that rules do not have to be moved back and forth
>>> between both tables. If the first created rule requires a maskfull entry,
>>> then all subsequent rules will be entered into that table. Otherwise no
>>> maskfull entry can be created as long as there is one maskless entry. When
>>> either table is full, no more rules may be added. Would that work for you?
>>>
>>
>> It's not about mask vs no mask. The devices with multiple tables that I
>> have don't have these mask limitations. It's about how to optimally pack
>> the rules and who implements that logic. I think it's best done in the
>> application where I have the context.
>>
>> Is there a way to omit the table field if the PMD is expected to do
>> a best effort and add the table field if the user wants explicit
>> control over table mgmt. This would support both models. I at least
>> would like to have explicit control over rule population in my pipeline
>> for use cases where I'm building a pipeline on top of the hardware.
> 
> Yes that's a possibility. Perhaps the table ID to use could be specified as
> a meta pattern item? We'd still need methods to report how many tables exist
> and perhaps some way to report their limitations, these could be later
> through a separate set of functions.

Sure I think a meta pattern item would be fine or put it in the API call
directly, something like

  rte_flow_create(port_id, pattern, actions);
  rte_flow_create_table(port_id, table_id, pattern, actions);
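
A rough C sketch of how the two calls could share one implementation,
assuming a sentinel table ID that means "let the PMD choose". The sketch_*
names, the sentinel value, the table count and the return convention are all
hypothetical and only mirror the two prototypes above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: the PMD places the rule on a best-effort basis. */
#define SKETCH_TABLE_ANY UINT32_MAX
/* Hypothetical number of tables exposed by the device. */
#define SKETCH_NB_TABLES 3u

/* Returns the index of the table the rule went into, or -EINVAL.
 * pattern and actions are opaque here; a real PMD would parse them. */
static int sketch_flow_create_table(uint16_t port_id, uint32_t table_id,
                                    const void *pattern, const void *actions)
{
    /* The sentinel must never collide with a valid table index. */
    assert(SKETCH_NB_TABLES < SKETCH_TABLE_ANY);
    (void)port_id;
    if (pattern == NULL || actions == NULL)
        return -EINVAL;
    if (table_id == SKETCH_TABLE_ANY)
        return 0;  /* stand-in for the PMD's own placement logic */
    if (table_id >= SKETCH_NB_TABLES)
        return -EINVAL;
    return (int)table_id;
}

/* Table-less variant for applications that do not care about tables. */
static int sketch_flow_create(uint16_t port_id, const void *pattern,
                              const void *actions)
{
    return sketch_flow_create_table(port_id, SKETCH_TABLE_ANY,
                                    pattern, actions);
}
```

Applications that build a pipeline on top of the hardware would call the
table variant, everyone else the plain one.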


> 
> [...]
>>>>> For this adding a meta-data item seems simplest to me. And if you want
>>>>> to make the default to be only a single port that would maybe make it
>>>>> easier for existing apps to port from flow director. Then if an
>>>>> application cares it can create a list of ports if needed.
>>>>>
>>>>
>>>> Agreed.
>>>
>>> However although I'm not opposed to adding dedicated meta items, remember
>>> applications will not automatically benefit from the increased performance
>>> if a single PMD implements this feature, their maintainers will probably not
>>> bother with it.
>>>
>>
>> Unless as we noted in other thread the application is closely bound to
>> its hardware for capability reasons. In this case it would make sense
>> to implement.
> 
> Sure.
> 
> [...]
> 


* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-09 21:24               ` John Fastabend
@ 2016-08-10 11:02                 ` Adrien Mazarguil
  2016-08-10 16:35                   ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-10 11:02 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

On Tue, Aug 09, 2016 at 02:24:26PM -0700, John Fastabend wrote:
[...]
> > Just an idea, could some kind of meta pattern items specifying time
> > constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
> > may take to query/apply/delete the rule. If it cannot be guaranteed, the
> > rule cannot be created. Applications could maintain statistic counters about
> > failed rules to determine if performance issues are caused by the inability
> > to create them.
> 
> It seems a bit heavy to me to have each PMD driver implementing
> something like this. But it would be interesting to explore probably
> after the basic support is implemented though.

OK, let's keep this for later.

[...]
> > Such a pattern could be generated from a separate function before feeding it
> > to rte_flow_create(), or translated by the PMD afterwards assuming a
> > separate meta item such as RAW_END exists to signal the end of a RAW layer.
> > Of course processing this would be more expensive.
> > 
> 
> Or the supported parse graph could be fetched from the hardware with the
> values for each protocol so that the programming interface is the same.
> The well known protocols could keep the 'enum values' in the header
> rte_flow_item_type enum so that users would not be required to do
> the parse graph but for new or experimental protocols we could query
> the parse graph and get the programming pattern matching id for them.
> 
> The normal flow would be unchanged but we don't get stuck upgrading
> everything to add our own protocol. So the flow would be,
> 
>  rte_get_parse_graph(graph);
>  flow_item_proto = is_my_proto_supported(graph);
> 
>  pattern = build_flow_match(flow_item_proto, value, mask);
>  action = build_action();
>  rte_flow_create(my_port, pattern, action);
> 
> The only change to the API proposed to support this would be to allow
> unsupported RTE_FLOW_ values to be pushed to the hardware and define
> a range of values that are reserved for use by the parse graph discover.
> 
> This would not have to be any more expensive.

Makes sense. Unless made entirely out of RAW items however the ensuing
pattern would not be portable across DPDK ports, instances and versions if
dumped in binary form for later use.

Since those would have to be recognized by PMDs and applications regardless of
the API version, I suggest making generated item types negative (enums are
signed, let's use that).

DPDK would have to maintain a list of expended values to avoid collisions
between PMDs. A method should be provided to release them.
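
A rough C sketch of what such bookkeeping could look like, assuming
generated item types are handed out as negative integers from a small fixed
pool. The pool size, the sketch_* names and the convention of returning 0 on
exhaustion are assumptions for this example only, and the code is not
thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary pool size chosen for the sketch. */
#define SKETCH_DYN_ITEM_MAX 32

/* Slot i tracks whether generated item type -(i + 1) is in use. */
static bool sketch_dyn_item_used[SKETCH_DYN_ITEM_MAX];

/* Reserve a generated item type. Returns a negative ID on success or
 * 0 when the pool is exhausted (0 is never a generated type here). */
static int sketch_item_type_alloc(void)
{
    for (int i = 0; i < SKETCH_DYN_ITEM_MAX; i++) {
        if (!sketch_dyn_item_used[i]) {
            int type = -(i + 1);

            assert(type < 0);
            sketch_dyn_item_used[i] = true;
            return type;
        }
    }
    return 0;
}

/* Release a previously reserved item type. Returns false if the ID is
 * not a generated type or is not currently reserved. */
static bool sketch_item_type_release(int type)
{
    int slot;

    if (type >= 0 || type < -SKETCH_DYN_ITEM_MAX)
        return false;
    slot = -type - 1;
    if (!sketch_dyn_item_used[slot])
        return false;
    sketch_dyn_item_used[slot] = false;
    return true;
}
```

Because released IDs get reused, a pattern dumped in binary form with such
an ID in it is only meaningful while the reservation is alive.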

[...]
> hmm for performance reasons building an entire graph up using RAW items
> seems to be a bit heavy. Another alternative to the above parse graph
> notion would be to allow users to add RAW node definitions at init time
> and have the PMD give a ID back for those. Then the new node could be
> used just like any other RTE_FLOW_ITEM_TYPE in a pattern.
> 
> Something like,
> 
> 	ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern)
> 	ret_flow_item_type_bar = rte_create_raw_node(bar_raw_pattern)
> 
> then allow ret_flow_item_type_{foo|bar} to be used in subsequent
> pattern matching items. And if the hardware can not support this return
> an error from the initial rte_create_raw_node() API call.
> 
> Do any either of those proposals sound like reasonable extensions?

Both seem acceptable in my opinion as they fit in the described API. However
I think it would be better for this function to spit out a pattern made of
any number of items instead of a single new item type. That way, existing
fixed items can be reused as well and the entire pattern may even become
portable as a result. It could be considered a method to optimize a
RAW pattern.

The RAW approach has the advantage of not requiring much additional code in
the public API besides a couple of function declarations. A proper full
blown graph would require a lot more as described in your original
reply. Not sure which is better.

Either way they won't be part of the initial specification but it looks like
they can be added later without affecting the basics.

> > [...]
> >> The two open items from me are do we need to support adding new variable
> >> length headers? And how do we handle multiple tables I'll take that up
> >> in the other thread.
> > 
> > I think variable length headers may be eventually supported through pattern
> > tricks or eventually a separate conversion layer.
> > 
> 
> A parse graph notion would support this naturally though without pattern
> tricks hence my above suggestions.

All right, I agree a method to let applications precisely define what they
want to match can be useful now I understand what you mean by
"dynamically".

> Also in the current scheme how would I match an ipv6 option or specific
> nsh option or mpls tag?

Ideally through specific pattern items defined for this purpose, which is
how I thought the API would evolve. Of course it wouldn't be fully dynamic
and you'd have to wait for a DPDK release that implements them.

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-09 21:47                 ` John Fastabend
@ 2016-08-10 13:37                   ` Adrien Mazarguil
  2016-08-10 16:46                     ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-10 13:37 UTC (permalink / raw)
  To: John Fastabend
  Cc: Rahul Lakkireddy, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

On Tue, Aug 09, 2016 at 02:47:44PM -0700, John Fastabend wrote:
> On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
> > On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
[...]
> >> The problem is keeping priorities in order and/or possibly breaking
> >> rules apart (e.g. you have an L2 table and an L3 table) becomes very
> >> complex to manage at driver level. I think its easier for the
> >> application which has some context to do this. The application "knows"
> >> if its a router for example will likely be able to pack rules better
> >> than a PMD will.
> > 
> > I don't think most applications know they are L2 or L3 routers. They may not
> > know more than the pattern provided to the PMD, which may indeed end at a L2
> > or L3 protocol. If the application simply chooses a table based on this
> > information, then the PMD could have easily done the same.
> > 
> 
> But when we start thinking about encap/decap then its natural to start
> using this interface to implement various forwarding dataplanes. And one
> common way to organize a switch is into a TEP, router, switch
> (mac/vlan), ACL tables, etc. In fact we see this topology starting to
> show up in the NICs now.
> 
> Further each table may be "managed" by a different entity. In which
> case the software will want to manage the physical and virtual networks
> separately.
> 
> It doesn't make sense to me to require a software aggregator object to
> marshal the rules into a flat table then for a PMD to split them apart
> again.

OK, my point was mostly about handling basic cases easily and making sure
applications do not have to bother with petty HW details when they do not
want to, yet still get maximum performance by having the PMD make the most
appropriate choices automatically.

You've convinced me that in many cases PMDs won't be able to optimize
efficiently and that conscious applications will know better. The API has to
provide the ability to do so. I think it's fine as long as it is not
mandatory.

> > I understand the issue is what happens when applications really want to
> > define e.g. L2/L3/L2 rules in this specific order (or any ordering that
> > cannot be satisfied by HW due to table constraints).
> > 
> > By exposing tables, in such a case applications should move all rules from
> > L2 to a L3 table themselves (assuming this is even supported) to guarantee
> > ordering between rules, or fail to add them. This is basically what the PMD
> > could have done, possibly in a more efficient manner in my opinion.
> 
> I disagree with the more efficient comment :)
> 
> If the software layer is working on L2/TEP/ACL/router layers merging
> them just to pull them back apart is not going to be more efficient.

Moving flow rules around cannot be efficient by definition, however I think
that attempting to describe table capabilities may be as complicated as
describing HW bit-masking features. Applications may get it wrong as a
result while a PMD would not make any mistake.

Your use case is valid though, if the application already groups rules, then
sharing this information with the PMD would make sense from a performance
standpoint.

> > Let's assume two opposite scenarios for this discussion:
> > 
> > - App #1 is a command-line interface directly mapped to flow rules, which
> >   basically gets slow random input from users depending on how they want to
> >   configure their traffic. All rules differ considerably (L2, L3, L4, some
> >   with incomplete bit-masks, etc). All in all, few but complex rules with
> >   specific priorities.
> > 
> 
> Agree with this and in this case the application should be behind any
> network physical/virtual and not giving rules like encap/decap/etc. This
> application either sits on the physical function and "owns" the hardware
> resource or sits behind a virtual switch.
> 
> 
> > - App #2 is something like OVS, creating and deleting a large number of very
> >   specific (without incomplete bit-masks) and mostly identical
> >   single-priority rules automatically and very frequently.
> > 
> 
> Maybe for OVS but not all virtual switches are built with flat tables
> at the bottom like this. Nor is it optimal it necessarily optimal.
> 
> Another application (the one I'm concerned about :) would be build as
> a pipeline, something like
> 
> 	ACL -> TEP -> ACL -> VEB -> ACL
> 
> If I have hardware that supports a TEP hardware block an ACL hardware
> block and a VEB  block for example I don't want to merge my control
> plane into a single table. The merging in this case is just pure
> overhead/complexity for no gain.

It could be done by dedicating priority ranges for each item in the
pipeline but then it would be clunky. OK then, let's discuss the best
approach to implement this.

[...]
> >> Its not about mask vs no mask. The devices with multiple tables that I
> >> have don't have this mask limitations. Its about how to optimally pack
> >> the rules and who implements that logic. I think its best done in the
> >> application where I have the context.
> >>
> >> Is there a way to omit the table field if the PMD is expected to do
> >> a best effort and add the table field if the user wants explicit
> >> control over table mgmt. This would support both models. I at least
> >> would like to have explicit control over rule population in my pipeline
> >> for use cases where I'm building a pipeline on top of the hardware.
> > 
> > Yes that's a possibility. Perhaps the table ID to use could be specified as
> > a meta pattern item? We'd still need methods to report how many tables exist
> > and perhaps some way to report their limitations, these could be later
> > through a separate set of functions.
> 
> Sure I think a meta pattern item would be fine or put it in the API call
> directly, something like
> 
>   rte_flow_create(port_id, pattern, actions);
>   rte_flow_create_table(port_id, table_id, pattern, actions);

I suggest using a common method for both cases, either seems fine to me, as
long as a default table value can be provided (zero) when applications do
not care.

Now about table management, I think there is no need to expose table
capabilities (in case they have different capabilities) but instead provide
guidelines as part of the specification to encourage application writers to
group similar rules in tables. As previously discussed, flow rule priorities
would be specific to the table they are assigned to.

Like flow rules, table priorities could be handled through their index with
index 0 having the highest priority. Like flow rule priorities, table
indices wouldn't have to be contiguous.

If this works for you, how about renaming "tables" to "groups"?
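
A minimal sketch of the resulting ordering, assuming each rule carries a
group index and a priority where 0 is the highest in both cases. The
structure and function names are hypothetical and not part of any existing
header:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical rule attributes; field names are assumptions. */
struct flow_attr_sketch {
	uint32_t group;    /* 0 is the default, highest priority group. */
	uint32_t priority; /* 0 is the highest priority inside a group. */
};

/* Order by group first, then by priority within the same group. */
static int
flow_attr_cmp(const void *a, const void *b)
{
	const struct flow_attr_sketch *x = a;
	const struct flow_attr_sketch *y = b;

	if (x->group != y->group)
		return x->group < y->group ? -1 : 1;
	if (x->priority != y->priority)
		return x->priority < y->priority ? -1 : 1;
	return 0;
}

/* Sort rules in the order they would be considered. */
static void
flow_attr_sort(struct flow_attr_sketch *rules, size_t n)
{
	qsort(rules, n, sizeof(*rules), flow_attr_cmp);
}
```

Indices only need to be comparable, not contiguous, so groups 0, 3 and 7
are as valid as 0, 1 and 2. Priorities are never compared across groups.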

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-10 11:02                 ` Adrien Mazarguil
@ 2016-08-10 16:35                   ` John Fastabend
  0 siblings, 0 replies; 262+ messages in thread
From: John Fastabend @ 2016-08-10 16:35 UTC (permalink / raw)
  To: Jerin Jacob, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Wenzhuo Lu,
	Jan Medala, John Daley, Jing Chen, Konstantin Ananyev,
	Matej Vido, Alejandro Lucero, Sony Chacko, Pablo de Lara,
	Olga Shern

On 16-08-10 04:02 AM, Adrien Mazarguil wrote:
> On Tue, Aug 09, 2016 at 02:24:26PM -0700, John Fastabend wrote:
> [...]
>>> Just an idea, could some kind of meta pattern items specifying time
>>> constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
>>> may take to query/apply/delete the rule. If it cannot be guaranteed, the
>>> rule cannot be created. Applications could mantain statistic counters about
>>> failed rules to determine if performance issues are caused by the inability
>>> to create them.
>>
>> It seems a bit heavy to me to have each PMD driver implementing
>> something like this. But it would be interesting to explore probably
>> after the basic support is implemented though.
> 
> OK, let's keep this for later.
> 
> [...]
>>> Such a pattern could be generated from a separate function before feeding it
>>> to rte_flow_create(), or translated by the PMD afterwards assuming a
>>> separate meta item such as RAW_END exists to signal the end of a RAW layer.
>>> Of course processing this would be more expensive.
>>>
>>
>> Or the supported parse graph could be fetched from the hardware with the
>> values for each protocol so that the programming interface is the same.
>> The well known protocols could keep the 'enum values' in the header
>> rte_flow_item_type enum so that users would not be required to do
>> the parse graph but for new or experimental protocols we could query
>> the parse graph and get the programming pattern matching id for them.
>>
>> The normal flow would be unchanged but we don't get stuck upgrading
>> everything to add our own protocol. So the flow would be,
>>
>>  rte_get_parse_graph(graph);
>>  flow_item_proto = is_my_proto_supported(graph);
>>
>>  pattern = build_flow_match(flow_item_proto, value, mask);
>>  action = build_action();
>>  rte_flow_create(my_port, pattern, action);
>>
>> The only change to the API proposed to support this would be to allow
>> unsupported RTE_FLOW_ values to be pushed to the hardware and define
>> a range of values that are reserved for use by the parse graph discover.
>>
>> This would not have to be any more expensive.
> 
> Makes sense. Unless made entirely out of RAW items however the ensuing
> pattern would not be portable across DPDK ports, instances and versions if
> dumped in binary form for later use.
> 

Right.

> Since those would have be recognized by PMDs and applications regardless of
> the API version, I suggest making generated item types negative (enums are
> signed, let's use that).

That works then the normal positive enums maintain the list of
known/accepted protocols.

> 
> DPDK would have to maintain a list of expended values to avoid collisions
> between PMDs. A method should be provided to release them.

The 'middle layer' could have a non-public API for PMDs to get new
values call it get_flow_type_item_id() or something.

> 
> [...]
>> hmm for performance reasons building an entire graph up using RAW items
>> seems to be a bit heavy. Another alternative to the above parse graph
>> notion would be to allow users to add RAW node definitions at init time
>> and have the PMD give a ID back for those. Then the new node could be
>> used just like any other RTE_FLOW_ITEM_TYPE in a pattern.
>>
>> Something like,
>>
>> 	ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern)
>> 	ret_flow_item_type_bar = rte_create_raw_node(bar_raw_pattern)
>>
>> then allow ret_flow_item_type_{foo|bar} to be used in subsequent
>> pattern matching items. And if the hardware can not support this return
>> an error from the initial rte_create_raw_node() API call.
>>
>> Do any either of those proposals sound like reasonable extensions?
> 
> Both seem acceptable in my opinion as they fit in the described API. However
> I think it would be better for this function to spit out a pattern made of
> any number of items instead of a single new item type. That way, existing
> fixed items can be reused as well, the entire pattern may even become
> portable as a result, it could be considered as a method to optimize a
> RAW pattern.
> 
> The RAW approach has the advantage of not requiring much additional code in
> the public API besides a couple of function declarations. A proper full
> blown graph would require a lot more as described in your original
> reply. Not sure which is better.
> 
> Either way they won't be part of the initial specification but it looks like
> they can be added later without affecting the basics.
> 

Right, it's not needed in the initial spec as long as we have a path to get
there, and it looks like we have two usable possibilities, so that works
for me.


>>> [...]
>>>> The two open items from me are do we need to support adding new variable
>>>> length headers? And how do we handle multiple tables I'll take that up
>>>> in the other thread.
>>>
>>> I think variable length headers may be eventually supported through pattern
>>> tricks or eventually a separate conversion layer.
>>>
>>
>> A parse graph notion would support this naturally though without pattern
>> tricks hence my above suggestions.
> 
> All right, I agree a method to let applications precisely define what they
> want to match can be useful now I understand what you mean by
> "dynamically".
> 
>> Also in the current scheme how would I match an ipv6 option or specific
>> nsh option or mpls tag?
> 
> Ideally through specific pattern items defined for this purpose, which is
> how I thought the API would evolve. Of course it wouldn't be fully dynamic
> and you'd have to wait for a DPDK release that implements them.
> 

The only trouble is that if you don't know exactly where the option is in the
list of options (which you won't in general), it's a bit hard to get right
with the existing spec as best I can tell. RAW patterns
would require you to know where the option is in the list, and an ANY
pattern wouldn't guarantee that a match is in the correct header with stacked
headers. At least if I'm reading the spec correctly it seems to be
an issue.
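
To make the problem concrete, here is a rough software sketch of locating
an IPv4 option, assuming the standard option encoding (End of Option List
and No-Operation are single bytes, anything else is type/length/value).
The function name is made up for illustration; the point is only that the
offset of a given option depends on what precedes it, so a fixed-offset
RAW match cannot find it in general.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV4_OPT_END 0 /* End of Option List, single byte. */
#define IPV4_OPT_NOP 1 /* No-Operation, single byte. */

/*
 * Walk the options area of an IPv4 header and return the byte offset of
 * the first option of the requested type, or -1 if it is absent or the
 * option list is malformed.
 */
static int
ipv4_find_option(const uint8_t *opts, size_t len, uint8_t type)
{
	size_t off = 0;

	while (off < len) {
		uint8_t cur = opts[off];
		uint8_t optlen;

		if (cur == IPV4_OPT_END)
			return -1;
		if (cur == IPV4_OPT_NOP) {
			++off;
			continue;
		}
		/* Remaining options carry a length byte covering the whole option. */
		if (off + 1 >= len)
			return -1;
		optlen = opts[off + 1];
		if (optlen < 2 || optlen > len - off)
			return -1;
		if (cur == type)
			return (int)off;
		off += optlen;
	}
	return -1;
}
```

The same Router Alert option (type 148, length 4) ends up at offset 0 or
offset 2 depending on whether padding comes after or before it.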

.John

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-10 13:37                   ` Adrien Mazarguil
@ 2016-08-10 16:46                     ` John Fastabend
  0 siblings, 0 replies; 262+ messages in thread
From: John Fastabend @ 2016-08-10 16:46 UTC (permalink / raw)
  To: Rahul Lakkireddy, dev, Thomas Monjalon, Helin Zhang, Jingjing Wu,
	Rasesh Mody, Ajit Khaparde, Wenzhuo Lu, Jan Medala, John Daley,
	Jing Chen, Konstantin Ananyev, Matej Vido, Alejandro Lucero,
	Sony Chacko, Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

On 16-08-10 06:37 AM, Adrien Mazarguil wrote:
> On Tue, Aug 09, 2016 at 02:47:44PM -0700, John Fastabend wrote:
>> On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
>>> On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
> [...]
>>>> The problem is keeping priorities in order and/or possibly breaking
>>>> rules apart (e.g. you have an L2 table and an L3 table) becomes very
>>>> complex to manage at driver level. I think its easier for the
>>>> application which has some context to do this. The application "knows"
>>>> if its a router for example will likely be able to pack rules better
>>>> than a PMD will.
>>>
>>> I don't think most applications know they are L2 or L3 routers. They may not
>>> know more than the pattern provided to the PMD, which may indeed end at a L2
>>> or L3 protocol. If the application simply chooses a table based on this
>>> information, then the PMD could have easily done the same.
>>>
>>
>> But when we start thinking about encap/decap then its natural to start
>> using this interface to implement various forwarding dataplanes. And one
>> common way to organize a switch is into a TEP, router, switch
>> (mac/vlan), ACL tables, etc. In fact we see this topology starting to
>> show up in the NICs now.
>>
>> Further each table may be "managed" by a different entity. In which
>> case the software will want to manage the physical and virtual networks
>> separately.
>>
>> It doesn't make sense to me to require a software aggregator object to
>> marshal the rules into a flat table then for a PMD to split them apart
>> again.
> 
> OK, my point was mostly about handling basic cases easily and making sure
> applications do not have to bother with petty HW details when they do not
> want to, yet still get maximum performance by having the PMD make the most
> appropriate choices automatically.
> 
> You've convinced me that in many cases PMDs won't be able to optimize
> efficiently and that conscious applications will know better. The API has to
> provide the ability to do so. I think it's fine as long as it is not
> mandatory.
> 

Great. I also agree that making the table feature _not_ mandatory will be
helpful for many use cases. I'm just making sure we get all the use cases I
know of covered.

>>> I understand the issue is what happens when applications really want to
>>> define e.g. L2/L3/L2 rules in this specific order (or any ordering that
>>> cannot be satisfied by HW due to table constraints).
>>>
>>> By exposing tables, in such a case applications should move all rules from
>>> L2 to a L3 table themselves (assuming this is even supported) to guarantee
>>> ordering between rules, or fail to add them. This is basically what the PMD
>>> could have done, possibly in a more efficient manner in my opinion.
>>
>> I disagree with the more efficient comment :)
>>
>> If the software layer is working on L2/TEP/ACL/router layers merging
>> them just to pull them back apart is not going to be more efficient.
> 
> Moving flow rules around cannot be efficient by definition, however I think
> that attempting to describe table capabilities may be as complicated as
> describing HW bit-masking features. Applications may get it wrong as a
> result while a PMD would not make any mistake.
> 
> Your use case is valid though, if the application already groups rules, then
> sharing this information with the PMD would make sense from a performance
> standpoint.
> 
>>> Let's assume two opposite scenarios for this discussion:
>>>
>>> - App #1 is a command-line interface directly mapped to flow rules, which
>>>   basically gets slow random input from users depending on how they want to
>>>   configure their traffic. All rules differ considerably (L2, L3, L4, some
>>>   with incomplete bit-masks, etc). All in all, few but complex rules with
>>>   specific priorities.
>>>
>>
>> Agree with this and in this case the application should be behind any
>> network physical/virtual and not giving rules like encap/decap/etc. This
>> application either sits on the physical function and "owns" the hardware
>> resource or sits behind a virtual switch.
>>
>>
>>> - App #2 is something like OVS, creating and deleting a large number of very
>>>   specific (without incomplete bit-masks) and mostly identical
>>>   single-priority rules automatically and very frequently.
>>>
>>
>> Maybe for OVS but not all virtual switches are built with flat tables
>> at the bottom like this. Nor is it optimal it necessarily optimal.
>>
>> Another application (the one I'm concerned about :) would be build as
>> a pipeline, something like
>>
>> 	ACL -> TEP -> ACL -> VEB -> ACL
>>
>> If I have hardware that supports a TEP hardware block an ACL hardware
>> block and a VEB  block for example I don't want to merge my control
>> plane into a single table. The merging in this case is just pure
>> overhead/complexity for no gain.
> 
> It could be done by dedicating priority ranges for each item in the
> pipeline but then it would be clunky. OK then, let's discuss the best
> approach to implement this.
> 
> [...]
>>>> Its not about mask vs no mask. The devices with multiple tables that I
>>>> have don't have this mask limitations. Its about how to optimally pack
>>>> the rules and who implements that logic. I think its best done in the
>>>> application where I have the context.
>>>>
>>>> Is there a way to omit the table field if the PMD is expected to do
>>>> a best effort and add the table field if the user wants explicit
>>>> control over table mgmt. This would support both models. I at least
>>>> would like to have explicit control over rule population in my pipeline
>>>> for use cases where I'm building a pipeline on top of the hardware.
>>>
>>> Yes that's a possibility. Perhaps the table ID to use could be specified as
>>> a meta pattern item? We'd still need methods to report how many tables exist
>>> and perhaps some way to report their limitations, these could be later
>>> through a separate set of functions.
>>
>> Sure I think a meta pattern item would be fine or put it in the API call
>> directly, something like
>>
>>   rte_flow_create(port_id, pattern, actions);
>>   rte_flow_create_table(port_id, table_id, pattern, actions);
> 
> I suggest using a common method for both cases, either seems fine to me, as
> long as a default table value can be provided (zero) when applications do
> not care.
> 

Works for me, just use zero as the default when the application has no
preference and expects the PMD to do the table mapping.

> Now about tables management, I think there is no need to not expose table
> capabilities (in case they have different capabilities) but instead provide
> guidelines as part of the specification to encourage applications writers to
> group similar rules in tables. A previously discussed, flow rules priorities
> would be specific to the table they are affected to.

This seems sufficient to me.

> 
> Like flow rules, table priorities could be handled through their index with
> index 0 having the highest priority. Like flow rule priorities, table
> indices wouldn't have to be contiguous.
> 
> If this works for you, how about renaming "tables" to "groups"?
> 

Works for me. And actually I like renaming them "groups" as this seems
more neutral to how the hardware actually implements a group. For
example I've worked on hardware with multiple Tunnel Endpoint engines
but we exposed it as a single "group" to simplify the user interface.

.John

* [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
                   ` (4 preceding siblings ...)
  2016-07-21  8:13 ` Rahul Lakkireddy
@ 2016-08-19 19:32 ` Adrien Mazarguil
  2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
                     ` (3 more replies)
  5 siblings, 4 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-19 19:32 UTC (permalink / raw)
  To: dev

Hi All,

Thanks to many for the positive and constructive feedback I've received so
far. Here is the updated specification (v0.7) at last.

I've attempted to address as many comments as possible but could not
process them all just yet. A new section "Future evolutions" has been
added for the remaining topics.

This series adds rte_flow.h to the DPDK tree. Next time I will attempt to
convert the specification into a documentation commit as part of the
patchset and actually implement API functions.

I think including the entire document here makes it easier to annotate on
the ML, apologies in advance for the resulting traffic.

Finally I'm off for the next two weeks, do not expect replies from me in
the meantime.

Updates are also available online:

HTML version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.html

PDF version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf

Related draft header file (also in the next patch):
 https://raw.githubusercontent.com/6WIND/rte_flow/master/rte_flow.h

Git tree:
 https://github.com/6WIND/rte_flow

Changes from v1:

 Specification:

 - Settled on [generic] "flow interface" / "flow API" as the name of this
   framework, matches the rte_flow prefix better.
 - Minor wording changes in several places.
 - Partially added egress (TX) support.
 - Added "unrecoverable errors" as another consequence of overlapping
   rules.
 - Described flow rules groups and their interaction with flow rule
   priorities.
 - Fully described PF and VF meta pattern items so they are not open to
   interpretation anymore.
 - Removed the SIGNATURE meta pattern item as its description was too
   vague, may be re-added later if necessary.
 - Added the PORT pattern item to apply rules to non-default physical
   ports.
 - Entirely redefined the RAW pattern item.
 - Fixed tag error in the ETH item definition.
 - Updated protocol definitions (IPV4, IPV6, ICMP, UDP).
 - Added missing protocols (SCTP, VXLAN).
 - Converted ID action to MARK and FLAG actions, described interaction
   with the RSS hash result in mbufs.
 - Updated COUNT query structure to retrieve the number of bytes.
 - Updated VF action.
 - Documented negative item and action types, those will be used for
   dynamic types generated at run-time.
 - Added blurb about IPv4 options and IPv6 extension headers matching.
 - Updated function definitions.
 - Documented a flush method to remove all rules on a given port at once.
 - Documented the verbose error reporting interface.
 - Documented how the private interface for PMD use will work.
 - Documented expected behavior between successive port initializations.
 - Documented expected behavior for ports not under DPDK control.
 - Updated API migration section.
 - Added future evolutions section.

 Header file:
 
 - Not a draft anymore and can be used as-is for preliminary
   implementations.
 - Flow rule attributes (group, priority, etc) now have their own
   structure provided separately to API functions (struct rte_flow_attr).
 - Group and priority interactions have been documented.
 - Added PORT item.
 - Removed SIGNATURE item.
 - Defined ICMP, SCTP and VXLAN items.
 - Redefined PF, VF, RAW, IPV4, IPV6, UDP and TCP items.
 - Fixed tag error in the ETH item definition.
 - Converted ID action to MARK and FLAG actions.
 - Updated COUNT query structure.
 - Updated VF action.
 - Added verbose errors interface.
 - Updated function prototypes according to the above.
 - Defined rte_flow_flush().

--------

======================
Generic flow interface
======================

.. footer::

   v0.7

.. contents::
.. sectnum::
.. raw:: pdf

   PageBreak

Overview
========

DPDK provides several competing interfaces added over time to perform packet
matching and related actions such as filtering and classification.

They must be extended to implement the features supported by newer devices
in order to expose them to applications. However, the current design has
several drawbacks:

- Complicated filter combinations which have not been hard-coded cannot be
  expressed.
- Prone to API/ABI breakage when new features must be added to an existing
  filter type, which frequently happens.

From an application point of view:

- Having disparate interfaces, all optional and lacking in features does not
  make this API easy to use.
- Seemingly arbitrary built-in limitations of filter types based on the
  device they were initially designed for.
- Undefined relationship between different filter types.
- High complexity, considerable undocumented and/or undefined behavior.

Considering the growing number of devices supported by DPDK, adding a new
filter type each time a new feature must be implemented is not sustainable
in the long term. Applications not written to target a specific device
cannot really benefit from such an API.

For these reasons, this document defines an extensible unified API that
encompasses and supersedes these legacy filter types.

.. raw:: pdf

   PageBreak

Current API
===========

Rationale
---------

Several competing (and mostly overlapping) filtering APIs are present in
DPDK due to its nature as a thin layer between hardware and software.

Each subsequent interface has been added to better match the capabilities
and limitations of the latest supported device, which usually happened to
need an incompatible configuration approach. Because of this, many ended up
device-centric and not usable by applications that were not written for that
particular device.

This document is not the first attempt to address this proliferation issue.
In fact, a lot of work has already been done both to create a more generic
interface and to somewhat keep compatibility with legacy ones through a
common call interface (``rte_eth_dev_filter_ctrl()`` with the
``.filter_ctrl`` PMD callback in ``rte_ethdev.h``).

Today, these previously incompatible interfaces are known as filter types
(``RTE_ETH_FILTER_*`` from ``enum rte_filter_type`` in ``rte_eth_ctrl.h``).

However, while trivial to extend with new types, this common interface only
shifted the underlying problem as applications still need to be written for
one kind of filter type, which, as described in the following sections, is
not necessarily implemented by all PMDs that support filtering.

.. raw:: pdf

   PageBreak

Filter types
------------

This section summarizes the capabilities of each filter type.

Although the following list is exhaustive, the description of individual
types may contain inaccuracies due to the lack of documentation or usage
examples.

Note: names are prefixed with ``RTE_ETH_FILTER_``.

``MACVLAN``
~~~~~~~~~~~

Matching:

- L2 source/destination addresses.
- Optional 802.1Q VLAN ID.
- Masking individual fields on a rule basis is not supported.

Action:

- Packets are redirected either to a given VF device using its ID or to the
  PF.

``ETHERTYPE``
~~~~~~~~~~~~~

Matching:

- L2 source/destination addresses (optional).
- Ethertype (no VLAN ID?).
- Masking individual fields on a rule basis is not supported.

Action:

- Receive packets on a given queue.
- Drop packets.

``FLEXIBLE``
~~~~~~~~~~~~

Matching:

- At most 128 consecutive bytes anywhere in packets.
- Masking is supported with byte granularity.
- Priorities are supported (relative to this filter type, undefined
  otherwise).

Action:

- Receive packets on a given queue.

``SYN``
~~~~~~~

Matching:

- TCP SYN packets only.
- One high priority bit can be set to give the highest possible priority to
  this type when other filters with different types are configured.

Action:

- Receive packets on a given queue.

``NTUPLE``
~~~~~~~~~~

Matching:

- Source/destination IPv4 addresses (optional in 2-tuple mode).
- Source/destination TCP/UDP port (mandatory in 2 and 5-tuple modes).
- L4 protocol (2 and 5-tuple modes).
- Masking individual fields is supported.
- TCP flags.
- Up to 7 levels of priority relative to this filter type, undefined
  otherwise.
- No IPv6.

Action:

- Receive packets on a given queue.

``TUNNEL``
~~~~~~~~~~

Matching:

- Outer L2 source/destination addresses.
- Inner L2 source/destination addresses.
- Inner VLAN ID.
- IPv4/IPv6 source (destination?) address.
- Tunnel type to match (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE, 802.1BR
  E-Tag).
- Tenant ID for tunneling protocols that have one.
- Any combination of the above can be specified.
- Masking individual fields on a rule basis is not supported.

Action:

- Receive packets on a given queue.

.. raw:: pdf

   PageBreak

``FDIR``
~~~~~~~~

Queries:

- Device capabilities and limitations.
- Device statistics about configured filters (resource usage, collisions).
- Device configuration (matching input set and masks).

Matching:

- Device mode of operation: none (to disable filtering), signature
  (hash-based dispatching from masked fields) or perfect (either MAC VLAN or
  tunnel).
- L2 Ethertype.
- Outer L2 destination address (MAC VLAN mode).
- Inner L2 destination address, tunnel type (NVGRE, VXLAN) and tunnel ID
  (tunnel mode).
- IPv4 source/destination addresses, ToS, TTL and protocol fields.
- IPv6 source/destination addresses, TC, protocol and hop limits fields.
- UDP source/destination IPv4/IPv6 and ports.
- TCP source/destination IPv4/IPv6 and ports.
- SCTP source/destination IPv4/IPv6, ports and verification tag field.
- Note, only one protocol type at once (either only L2 Ethertype, basic
  IPv6, IPv4+UDP, IPv4+TCP and so on).
- VLAN TCI (extended API).
- At most 16 bytes to match in payload (extended API). A global device
  look-up table specifies for each possible protocol layer (unknown, raw,
  L2, L3, L4) the offset to use for each byte (they do not need to be
  contiguous) and the related bit-mask.
- Whether packet is addressed to PF or VF, in that case its ID can be
  matched as well (extended API).
- Masking most of the above fields is supported, but simultaneously affects
  all filters configured on a device.
- Input set can be modified in a similar fashion for a given device to
  ignore individual fields of filters (e.g. do not match the destination
  address in an IPv4 filter, refer to **RTE_ETH_INPUT_SET_**
  macros). Configuring this also affects RSS processing on **i40e**.
- Filters can also provide 32 bits of arbitrary data to return as part of
  matched packets.

Action:

- **RTE_ETH_FDIR_ACCEPT**: receive (accept) packet on a given queue.
- **RTE_ETH_FDIR_REJECT**: drop packet immediately.
- **RTE_ETH_FDIR_PASSTHRU**: similar to accept for the last filter in list,
  otherwise process it with subsequent filters.
- For accepted packets and if requested by filter, either 32 bits of
  arbitrary data and four bytes of matched payload (only in case of flex
  bytes matching), or eight bytes of matched payload (flex also) are added
  to meta data.

.. raw:: pdf

   PageBreak

``HASH``
~~~~~~~~

Not an actual filter type. Provides and retrieves the global device
configuration (per port or entire NIC) for hash functions and their
properties.

Hash function selection: "default" (keep current), XOR or Toeplitz.

This function can be configured per flow type (**RTE_ETH_FLOW_**
definitions), supported types are:

- Unknown.
- Raw.
- Fragmented or non-fragmented IPv4.
- Non-fragmented IPv4 with L4 (TCP, UDP, SCTP or other).
- Fragmented or non-fragmented IPv6.
- Non-fragmented IPv6 with L4 (TCP, UDP, SCTP or other).
- L2 payload.
- IPv6 with extensions.
- IPv6 with L4 (TCP, UDP) and extensions.

``L2_TUNNEL``
~~~~~~~~~~~~~

Matching:

- All packets received on a given port.

Action:

- Add tunnel encapsulation (VXLAN, GENEVE, TEREDO, NVGRE, IP over GRE,
  802.1BR E-Tag) using the provided Ethertype and tunnel ID (only E-Tag
  is implemented at the moment).
- VF ID to use for tag insertion (currently unused).
- Destination pool for tag based forwarding (pools are IDs that can be
  assigned to ports, duplication occurs if the same ID is shared by several
  ports of the same NIC).

.. raw:: pdf

   PageBreak

Driver support
--------------

======== ======= ========= ======== === ====== ====== ==== ==== =========
Driver   MACVLAN ETHERTYPE FLEXIBLE SYN NTUPLE TUNNEL FDIR HASH L2_TUNNEL
======== ======= ========= ======== === ====== ====== ==== ==== =========
bnx2x
cxgbe
e1000            yes       yes      yes yes
ena
enic                                                  yes
fm10k
i40e     yes     yes                           yes    yes  yes
ixgbe            yes                yes yes           yes       yes
mlx4
mlx5                                                  yes
szedata2
======== ======= ========= ======== === ====== ====== ==== ==== =========

Flow director
-------------

Flow director (FDIR) is the name of the most capable filter type, which
covers most features offered by others. As such, it is the most widespread
in PMDs that support filtering (i.e. all of them besides **e1000**).

It is also the only type that allows an arbitrary 32-bit value provided by
applications to be attached to a filter and returned with matching packets
instead of relying on the destination queue to recognize flows.

Unfortunately, even FDIR requires applications to be aware of low-level
capabilities and limitations (most of which come directly from **ixgbe** and
**i40e**):

- Bit-masks are set globally per device (port?), not per filter.
- Configuration state is not expected to be saved by the driver, and
  stopping/restarting a port requires the application to perform it again
  (API documentation is also unclear about this).
- Monolithic approach with ABI issues as soon as a new kind of flow or
  combination needs to be supported.
- Cryptic global statistics/counters.
- Unclear about how priorities are managed; filters seem to be arranged as a
  linked list in hardware (possibly related to configuration order).

Packet alteration
-----------------

One interesting feature is that the L2 tunnel filter type implements the
ability to alter incoming packets through a filter (in this case to
encapsulate them), thus the **mlx5** flow encap/decap features are not a
foreign concept.

.. raw:: pdf

   PageBreak

Proposed API
============

Terminology
-----------

- **Flow API**: overall framework affecting the fate of selected packets,
  covers everything described in this document.
- **Filtering API**: an alias for *Flow API*.
- **Matching pattern**: properties to look for in packets, a combination of
  any number of items.
- **Pattern item**: part of a pattern that either matches packet data
  (protocol header, payload or derived information), or specifies properties
  of the pattern itself.
- **Actions**: what needs to be done when a packet is matched by a pattern.
- **Flow rule**: this is the result of combining a *matching pattern* with
  *actions*.
- **Filter rule**: a less generic term than *flow rule*, can otherwise be
  used interchangeably.
- **Hit**: a flow rule is said to be *hit* when processing a matching
  packet.

Requirements
------------

As described in the previous section, there is a growing need for a common
method to configure filtering and related actions in a hardware independent
fashion.

The flow API should not disallow any filter combination by design and must
remain as simple as possible to use. It can simply be defined as a method to
perform one or several actions on selected packets.

PMDs are aware of the capabilities of the device they manage and should be
responsible for preventing unsupported or conflicting combinations.

This approach is fundamentally different as it places most of the burden on
the software side of the PMD instead of having device capabilities directly
mapped to API functions, then expecting applications to work around ensuing
compatibility issues.

Requirements for a new API:

- Flexible and extensible without causing API/ABI problems for existing
  applications.
- Should be unambiguous and easy to use.
- Support existing filtering features and actions listed in `Filter types`_.
- Support packet alteration.
- In case of overlapping filters, their priority should be well documented.
- Support filter queries (for example to retrieve counters).
- Support egress (TX) matching and specific actions.

.. raw:: pdf

   PageBreak

High level design
-----------------

The chosen approach to make filtering as generic as possible is to express
matching patterns through lists of items instead of the flat structures used
in DPDK today. This enables combinations that are not predefined and is thus
more versatile.

Flow rules can have several distinct actions (such as counting,
encapsulating, decapsulating before redirecting packets to a particular
queue, etc.), instead of relying on several rules to achieve this and having
applications deal with hardware implementation details regarding their
order.

Support for different priority levels on a rule basis is provided, for
example in order to force a more specific rule to come before a more generic
one for packets matched by both. However, hardware support for more than a
single priority level cannot be guaranteed. When supported, the number of
available priority levels is usually low, which is why they can also be
implemented in software by PMDs (e.g. missing priority levels may be
emulated by reordering rules).

In order to remain as hardware agnostic as possible, by default all rules
are considered to have the same priority, which means that the order between
overlapping rules (when a packet is matched by several filters) is
undefined. Packet duplication or unrecoverable errors may even occur as a
result.

PMDs may refuse to create overlapping rules at a given priority level when
they can be detected (e.g. if a pattern matches an existing filter).

Thus predictable results for a given priority level can only be achieved
with non-overlapping rules, using perfect matching on all protocol layers.

Flow rules can also be grouped, in which case the flow rule priority is
specific to the group they belong to. All flow rules in a given group are
thus processed either before or after those of another group.

Support for multiple actions per rule may be implemented internally on top
of non-default hardware priorities, as a result both features may not be
simultaneously available to applications.

Considering that allowed pattern/actions combinations cannot be known in
advance and would result in an impractically large number of capabilities to
expose, a method is provided to validate a given rule from the current
device configuration state without actually adding it (akin to a "dry run"
mode).

This enables applications to check if the rule types they need are supported
at initialization time, before starting their data path. This method can be
used anytime, its only requirement being that the resources needed by a rule
must exist (e.g. a target RX queue must be configured first).

Each defined rule is associated with an opaque handle managed by the PMD,
which applications are responsible for keeping. Handles can be used for
queries and rules management, such as retrieving counters or other data and
destroying rules.

To avoid resource leaks on the PMD side, handles must be explicitly
destroyed by the application before releasing associated resources such as
queues and ports.

Integration
-----------

To avoid ABI breakage, this new interface will be implemented through the
existing filtering control framework (``rte_eth_dev_filter_ctrl()``) using
**RTE_ETH_FILTER_GENERIC** as a new filter type.

However a public front-end API described in `Rules management`_ will
be added as the preferred method to use it.

Once discussions with the community have converged to a definite API, legacy
filter types should be deprecated and a deadline defined to remove their
support entirely.

PMDs will have to be gradually converted to **RTE_ETH_FILTER_GENERIC** or
drop filtering support entirely. Less maintained PMDs for older hardware may
lose support at this point.

The notion of filter type will then be deprecated and subsequently dropped
to avoid confusion between both frameworks.

Implementation details
======================

Flow rule
---------

A flow rule is the combination of a matching pattern with a list of actions,
and is the basis of this API.

They also have several other attributes described in the following sections.

Groups
~~~~~~

Flow rules can be grouped by assigning them a common group number. Lower
values have higher priority. Group 0 has the highest priority.

Although optional, applications are encouraged to group similar rules as
much as possible to fully take advantage of hardware capabilities
(e.g. optimized matching) and work around limitations (e.g. a single pattern
type possibly allowed in a given group).

Note that support for more than a single group is not guaranteed.

Priorities
~~~~~~~~~~

A priority level can be assigned to a flow rule. Like groups, lower values
denote higher priority, with 0 as the maximum.

A rule with priority 0 in group 8 is always matched after a rule with
priority 8 in group 0.

Group and priority levels are arbitrary and up to the application. They do
not need to be contiguous nor start from 0. However, the maximum number
varies between devices and may be affected by existing flow rules.

If a packet is matched by several rules of a given group for a given
priority level, the outcome is undefined. It can take any path, may be
duplicated or even cause unrecoverable errors.

Note that support for more than a single priority level is not guaranteed.

Traffic direction
~~~~~~~~~~~~~~~~~

Flow rules can apply to inbound and/or outbound traffic (ingress/egress).

Several pattern items and actions are valid and can be used in both
directions. Those valid for only one direction are described as such.

Specifying both directions at once is not recommended but may be valid in
some cases, such as incrementing the same counter twice.

Not specifying any direction is currently an error.
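
As an illustration of the group, priority and direction attributes described
above, the following sketch orders two rules by group first and priority
second, and rejects a rule that specifies no direction. The
``struct flow_attr_sketch`` type, its field widths and both helpers are
hypothetical names made up for this example; they are not part of the
proposed API.

.. code-block:: c

   #include <assert.h>
   #include <errno.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical attribute set: group, priority and traffic direction. */
   struct flow_attr_sketch {
       uint32_t group;         /* lower value: higher priority, 0 is highest */
       uint32_t priority;      /* same convention, relative to the group */
       unsigned int ingress:1; /* rule applies to inbound traffic */
       unsigned int egress:1;  /* rule applies to outbound traffic */
   };

   /*
    * Return a negative value if rule a is processed before rule b, a
    * positive value if it is processed after, and 0 when both share the
    * same group and priority level (their order is then undefined).
    */
   static int
   flow_attr_cmp(const struct flow_attr_sketch *a,
                 const struct flow_attr_sketch *b)
   {
       assert(a != NULL && b != NULL);
       if (a->group != b->group)
           return a->group < b->group ? -1 : 1;
       if (a->priority != b->priority)
           return a->priority < b->priority ? -1 : 1;
       return 0;
   }

   /* Not specifying any direction is an error. */
   static int
   flow_attr_check_direction(const struct flow_attr_sketch *attr)
   {
       assert(attr != NULL);
       return (attr->ingress || attr->egress) ? 0 : -EINVAL;
   }

With this ordering, a rule with priority 8 in group 0 compares as processed
before a rule with priority 0 in group 8, since groups are compared first.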

.. raw:: pdf

   PageBreak

Matching pattern
~~~~~~~~~~~~~~~~

A matching pattern comprises any number of items of various types.

Items are arranged in a list to form a matching pattern for packets. They
fall in two categories:

- Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, SCTP, VXLAN
  and so on), usually associated with a specification structure. These must
  be stacked in the same order as the protocol layers to match, starting
  from L2.

- Affecting how the pattern is processed (END, VOID, INVERT, PF, VF, PORT
  and so on), often without a specification structure. Since they are meta
  data that does not match packet contents, these can be specified anywhere
  within item lists without affecting the protocol matching items.

Most item specifications can be optionally paired with a mask to narrow the
specific fields or bits to be matched.

- Items are defined with ``struct rte_flow_item``.
- Patterns are defined with ``struct rte_flow_pattern``.

Example of an item specification matching an Ethernet header:

+-----------------------------------------+
| Ethernet                                |
+==========+=========+====================+
| ``spec`` | ``src`` | ``00:01:02:03:04`` |
|          +---------+--------------------+
|          | ``dst`` | ``00:2a:66:00:01`` |
+----------+---------+--------------------+
| ``mask`` | ``src`` | ``00:ff:ff:ff:00`` |
|          +---------+--------------------+
|          | ``dst`` | ``00:00:00:00:ff`` |
+----------+---------+--------------------+

Non-masked bits stand for any value, Ethernet headers with the following
properties are thus matched:

- ``src``: ``??:01:02:03:??``
- ``dst``: ``??:??:??:??:01``

Except for meta types that do not need one, ``spec`` must be a valid pointer
to a structure of the related item type. A ``mask`` of the same type can be
provided to tell which bits in ``spec`` are to be matched.

A mask is normally only needed for ``spec`` fields matching packet data,
ignored otherwise. See individual item types for more information.

A ``NULL`` mask pointer is allowed. It is similar to matching with a full
mask (all ones) on the ``spec`` fields supported by hardware, while the
remaining fields are ignored (all zeroes). There is thus no error checking
for unsupported fields.
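
The ``spec``/``mask`` semantics can be modeled in software as a bit-wise
comparison restricted to the bits set in the mask. The helper below is only
a sketch: ``masked_match()`` is a hypothetical name, and it treats a
``NULL`` mask as all ones for every byte, which simplifies the rule above
(the text limits the full mask to fields supported by hardware).

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /*
    * Compare len bytes of packet data against spec, looking only at the
    * bits set in mask. A NULL mask is treated as all ones (assumption made
    * for this sketch).
    */
   static bool
   masked_match(const uint8_t *data, const uint8_t *spec,
                const uint8_t *mask, size_t len)
   {
       size_t i;

       assert(data != NULL && spec != NULL);
       for (i = 0; i < len; ++i) {
           uint8_t m = mask != NULL ? mask[i] : 0xff;

           /* Any differing bit covered by the mask is a mismatch. */
           if ((data[i] ^ spec[i]) & m)
               return false;
       }
       return true;
   }

Bytes whose mask is zero are wildcards, so the corresponding packet bytes
may hold any value without affecting the result.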

.. raw:: pdf

   PageBreak

Matching pattern items for packet data must be naturally stacked (ordered
from lowest to highest protocol layer), as in the following examples:

+--------------+
| TCPv4 as L4  |
+===+==========+
| 0 | Ethernet |
+---+----------+
| 1 | IPv4     |
+---+----------+
| 2 | TCP      |
+---+----------+

+----------------+
| TCPv6 in VXLAN |
+===+============+
| 0 | Ethernet   |
+---+------------+
| 1 | IPv4       |
+---+------------+
| 2 | UDP        |
+---+------------+
| 3 | VXLAN      |
+---+------------+
| 4 | Ethernet   |
+---+------------+
| 5 | IPv6       |
+---+------------+
| 6 | TCP        |
+---+------------+

+-----------------------------+
| TCPv4 as L4 with meta items |
+===+=========================+
| 0 | VOID                    |
+---+-------------------------+
| 1 | Ethernet                |
+---+-------------------------+
| 2 | VOID                    |
+---+-------------------------+
| 3 | IPv4                    |
+---+-------------------------+
| 4 | TCP                     |
+---+-------------------------+
| 5 | VOID                    |
+---+-------------------------+
| 6 | VOID                    |
+---+-------------------------+

The above example shows how meta items do not affect packet data matching
items, as long as those remain stacked properly. The resulting matching
pattern is identical to "TCPv4 as L4".

+----------------+
| UDPv6 anywhere |
+===+============+
| 0 | IPv6       |
+---+------------+
| 1 | UDP        |
+---+------------+

If supported by the PMD, omitting one or several protocol layers at the
bottom of the stack as in the above example (missing an Ethernet
specification) enables hardware to look anywhere in packets.

This is an alias for specifying `ANY`_ with ``min = 0`` and ``max = 0``
properties as the first item.

It is unspecified whether the payload of supported encapsulations
(e.g. VXLAN inner packet) is matched by such a pattern, which may apply to
inner, outer or both packets.

+---------------------+
| Invalid, missing L3 |
+===+=================+
| 0 | Ethernet        |
+---+-----------------+
| 1 | UDP             |
+---+-----------------+

The above pattern is invalid due to a missing L3 specification between L2
and L4. Omitting protocol layers is only allowed at the bottom and at the
top of the stack.
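
The stacking rules illustrated by the tables above can be checked with a few
lines of code. This is a rough sketch under several assumptions: the
``enum item_type_sketch`` values and both functions are hypothetical, only a
handful of item types are covered, `ANY`_ is not handled, and a VXLAN item
is assumed to be followed by an inner Ethernet header.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>

   /* Hypothetical item types, END being 0 as in the text. */
   enum item_type_sketch {
       ITEM_END = 0,
       ITEM_VOID,
       ITEM_ETH,
       ITEM_IPV4,
       ITEM_IPV6,
       ITEM_UDP,
       ITEM_TCP,
       ITEM_VXLAN,
   };

   /* Layer of a protocol item, 0 for meta items (5 stands for a tunnel). */
   static int
   item_layer(enum item_type_sketch type)
   {
       switch (type) {
       case ITEM_ETH:
           return 2;
       case ITEM_IPV4:
       case ITEM_IPV6:
           return 3;
       case ITEM_UDP:
       case ITEM_TCP:
           return 4;
       case ITEM_VXLAN:
           return 5;
       default:
           return 0;
       }
   }

   /*
    * Check that protocol items are stacked from lowest to highest layer
    * without gaps. Meta items are skipped. Layers may be omitted at the
    * bottom (first protocol item) and at the top (list ends early).
    */
   static bool
   pattern_is_stacked(const enum item_type_sketch *items)
   {
       int prev = 0; /* 0: no protocol item seen yet */

       assert(items != NULL);
       for (; *items != ITEM_END; ++items) {
           int layer = item_layer(*items);

           if (layer == 0)
               continue;
           if (prev != 0 && layer != prev + 1)
               return false;
           /* A tunnel restarts the stack: inner L2 is expected next. */
           prev = *items == ITEM_VXLAN ? 1 : layer;
       }
       return true;
   }

Applied to the examples of this section, the function accepts "TCPv4 as L4"
(with or without VOID items), "TCPv6 in VXLAN" and "UDPv6 anywhere", and
rejects the Ethernet/UDP pattern that lacks an L3 item.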

Meta item types
~~~~~~~~~~~~~~~

These do not match packet data but affect how the pattern is processed, most
of them do not need a specification structure. This particularity allows
them to be specified anywhere without affecting other item types.

``END``
^^^^^^^

End marker for item lists. Prevents further processing of items, thereby
ending the pattern.

- Its numeric value is **0** for convenience.
- PMD support is mandatory.
- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| END                |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

``VOID``
^^^^^^^^

Used as a placeholder for convenience. It is ignored and simply discarded by
PMDs.

- PMD support is mandatory.
- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| VOID               |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

One usage example for this type is generating rules that share a common
prefix quickly without reallocating memory, only by updating item types:

+------------------------+
| TCP, UDP or ICMP as L4 |
+===+====================+
| 0 | Ethernet           |
+---+--------------------+
| 1 | IPv4               |
+---+------+------+------+
| 2 | UDP  | VOID | VOID |
+---+------+------+------+
| 3 | VOID | TCP  | VOID |
+---+------+------+------+
| 4 | VOID | VOID | ICMP |
+---+------+------+------+
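
The table above can be read as a single item array whose L4 slots are
toggled between a protocol item and VOID. The sketch below assumes a
fixed six-entry array (Ethernet, IPv4, three L4 slots, END); the enum
values and ``pattern_select_l4()`` are hypothetical names used for
illustration only.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical item types, END being 0 as in the text. */
   enum item_type_sketch {
       ITEM_END = 0,
       ITEM_VOID,
       ITEM_ETH,
       ITEM_IPV4,
       ITEM_UDP,
       ITEM_TCP,
       ITEM_ICMP,
   };

   /*
    * Enable exactly one L4 item in a pattern laid out as
    * { ETH, IPV4, <UDP slot>, <TCP slot>, <ICMP slot>, END }.
    * Only item types are updated, no memory is reallocated.
    * Return 0 on success, -1 if l4 is not one of the three L4 types.
    */
   static int
   pattern_select_l4(enum item_type_sketch *pattern,
                     enum item_type_sketch l4)
   {
       assert(pattern != NULL);
       if (l4 != ITEM_UDP && l4 != ITEM_TCP && l4 != ITEM_ICMP)
           return -1;
       pattern[2] = l4 == ITEM_UDP ? ITEM_UDP : ITEM_VOID;
       pattern[3] = l4 == ITEM_TCP ? ITEM_TCP : ITEM_VOID;
       pattern[4] = l4 == ITEM_ICMP ? ITEM_ICMP : ITEM_VOID;
       return 0;
   }

An application would call this helper once per rule to generate, reusing the
same array and its Ethernet/IPv4 prefix each time.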

.. raw:: pdf

   PageBreak

``INVERT``
^^^^^^^^^^

Inverted matching, i.e. process packets that do not match the pattern.

- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| INVERT             |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

Usage example in order to match non-TCPv4 packets only:

+--------------------+
| Anything but TCPv4 |
+===+================+
| 0 | INVERT         |
+---+----------------+
| 1 | Ethernet       |
+---+----------------+
| 2 | IPv4           |
+---+----------------+
| 3 | TCP            |
+---+----------------+

``PF``
^^^^^^

Matches packets addressed to the physical function of the device.

If the underlying device function differs from the one that would normally
receive the matched traffic, specifying this item prevents it from reaching
that device unless the flow rule contains a `PF (action)`_. Packets are not
duplicated between device instances by default.

- Likely to return an error or never match any traffic if applied to a VF
  device.
- Can be combined with any number of `VF`_ items to match both PF and VF
  traffic.
- Both ``spec`` and ``mask`` are ignored.

+--------------------+
| PF                 |
+==========+=========+
| ``spec`` | ignored |
+----------+---------+
| ``mask`` | ignored |
+----------+---------+

``VF``
^^^^^^

Matches packets addressed to a virtual function ID of the device.

If the underlying device function differs from the one that would normally
receive the matched traffic, specifying this item prevents it from reaching
that device unless the flow rule contains a `VF (action)`_. Packets are not
duplicated between device instances by default.

- Likely to return an error or never match any traffic if this causes a VF
  device to match traffic addressed to a different VF.
- Can be specified multiple times to match traffic addressed to several VFs.
- Can be combined with a `PF`_ item to match both PF and VF traffic.
- Only ``spec`` needs to be defined, ``mask`` is ignored.

+-------------------------------------------------+
| VF                                              |
+==========+=========+============================+
| ``spec`` | ``any`` | ignore the specified VF ID |
|          +---------+----------------------------+
|          | ``vf``  | destination VF ID          |
+----------+---------+----------------------------+
| ``mask`` | ignored                              |
+----------+--------------------------------------+

``PORT``
^^^^^^^^

Matches packets coming from the specified physical port of the underlying
device.

The first PORT item overrides the physical port normally associated with the
specified DPDK input port (port_id). This item can be provided several times
to match additional physical ports.

Note that physical ports are not necessarily tied to DPDK input ports
(port_id) when those are not under DPDK control. Possible values are
specific to each device, they are not necessarily indexed from zero and may
not be contiguous.

As a device property, the list of allowed values as well as the value
associated with a port_id should be retrieved by other means.

- Only ``spec`` needs to be defined, ``mask`` is ignored.

+--------------------------------------------+
| PORT                                       |
+==========+===========+=====================+
| ``spec`` | ``index`` | physical port index |
+----------+-----------+---------------------+
| ``mask`` | ignored                         |
+----------+---------------------------------+

.. raw:: pdf

   PageBreak

Data matching item types
~~~~~~~~~~~~~~~~~~~~~~~~

Most of these are basically protocol header definitions with associated
bit-masks. They must be specified (stacked) from lowest to highest protocol
layer.

The following list is not exhaustive as new protocols will be added in the
future.

``ANY``
^^^^^^^

Matches any protocol in place of the current layer, a single ANY may also
stand for several protocol layers.

This is usually specified as the first pattern item when looking for a
protocol anywhere in a packet.

- A maximum value of **0** requests matching any number of protocol layers
  above or equal to the minimum value, a maximum value lower than the
  minimum one is otherwise invalid.
- Only ``spec`` needs to be defined, ``mask`` is ignored.

+-----------------------------------------------------------------------+
| ANY                                                                   |
+==========+=========+==================================================+
| ``spec`` | ``min`` | minimum number of layers covered                 |
|          +---------+--------------------------------------------------+
|          | ``max`` | maximum number of layers covered, 0 for infinity |
+----------+---------+--------------------------------------------------+
| ``mask`` | ignored                                                    |
+----------+------------------------------------------------------------+

Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
or IPv6) matched by the second ANY specification:

+----------------------------------+
| TCP in VXLAN with wildcards      |
+===+==============================+
| 0 | Ethernet                     |
+---+-----+----------+---------+---+
| 1 | ANY | ``spec`` | ``min`` | 2 |
|   |     |          +---------+---+
|   |     |          | ``max`` | 2 |
+---+-----+----------+---------+---+
| 2 | VXLAN                        |
+---+------------------------------+
| 3 | Ethernet                     |
+---+-----+----------+---------+---+
| 4 | ANY | ``spec`` | ``min`` | 1 |
|   |     |          +---------+---+
|   |     |          | ``max`` | 1 |
+---+-----+----------+---------+---+
| 5 | TCP                          |
+---+------------------------------+
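
The ``min``/``max`` semantics of ANY can be summarized by the following
sketch. ``struct item_any_sketch``, its 16-bit field widths and both helpers
are assumptions made for this example; only the field names and their
meaning come from the table above.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical ANY specification (field widths are assumed). */
   struct item_any_sketch {
       uint16_t min; /* minimum number of layers covered */
       uint16_t max; /* maximum number of layers covered, 0 for infinity */
   };

   /* A nonzero maximum lower than the minimum is invalid. */
   static bool
   item_any_is_valid(const struct item_any_sketch *spec)
   {
       assert(spec != NULL);
       return spec->max == 0 || spec->max >= spec->min;
   }

   /* Tell whether an ANY item may stand for the given number of layers. */
   static bool
   item_any_covers(const struct item_any_sketch *spec, unsigned int layers)
   {
       if (!item_any_is_valid(spec))
           return false;
       if (layers < spec->min)
           return false;
       return spec->max == 0 || layers <= spec->max;
   }

In the VXLAN example, the first ANY item (``min = 2``, ``max = 2``) covers
exactly two layers (outer L3 and L4), while ``min = 0`` with ``max = 0``
covers any number of layers, including none.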

.. raw:: pdf

   PageBreak

``RAW``
^^^^^^^

Matches a byte string of a given length at a given offset.

Offset is either absolute (using the start of the packet) or relative to the
end of the previous matched item in the stack, in which case negative values
are allowed.

If search is enabled, offset is used as the starting point. The search area
can be delimited by setting limit to a nonzero value, which is the maximum
number of bytes after offset where the pattern may start.

Matching a zero-length pattern is allowed, doing so resets the relative
offset for subsequent items.

- ``mask`` only affects the pattern field.

+---------------------------------------------------------------------------+
| RAW                                                                       |
+==========+==============+=================================================+
| ``spec`` | ``relative`` | look for pattern after the previous item        |
|          +--------------+-------------------------------------------------+
|          | ``search``   | search pattern from offset (see also ``limit``) |
|          +--------------+-------------------------------------------------+
|          | ``reserved`` | reserved, must be set to zero                   |
|          +--------------+-------------------------------------------------+
|          | ``offset``   | absolute or relative offset for ``pattern``     |
|          +--------------+-------------------------------------------------+
|          | ``limit``    | search area limit for start of ``pattern``      |
|          +--------------+-------------------------------------------------+
|          | ``length``   | ``pattern`` length                              |
|          +--------------+-------------------------------------------------+
|          | ``pattern``  | byte string to look for                         |
+----------+--------------+-------------------------------------------------+
| ``mask`` | ``relative`` | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``search``   | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``reserved`` | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``offset``   | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``limit``    | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``length``   | ignored                                         |
|          +--------------+-------------------------------------------------+
|          | ``pattern``  | bit-mask of the same byte length as ``pattern`` |
+----------+--------------+-------------------------------------------------+

Example pattern looking for several strings at various offsets of a UDP
payload, using combined RAW items:

.. raw:: pdf

   PageBreak

+-------------------------------------------+
| UDP payload matching                      |
+===+=======================================+
| 0 | Ethernet                              |
+---+---------------------------------------+
| 1 | IPv4                                  |
+---+---------------------------------------+
| 2 | UDP                                   |
+---+-----+----------+--------------+-------+
| 3 | RAW | ``spec`` | ``relative`` | 1     |
|   |     |          +--------------+-------+
|   |     |          | ``search``   | 1     |
|   |     |          +--------------+-------+
|   |     |          | ``offset``   | 10    |
|   |     |          +--------------+-------+
|   |     |          | ``limit``    | 0     |
|   |     |          +--------------+-------+
|   |     |          | ``length``   | 3     |
|   |     |          +--------------+-------+
|   |     |          | ``pattern``  | "foo" |
+---+-----+----------+--------------+-------+
| 4 | RAW | ``spec`` | ``relative`` | 1     |
|   |     |          +--------------+-------+
|   |     |          | ``search``   | 0     |
|   |     |          +--------------+-------+
|   |     |          | ``offset``   | 20    |
|   |     |          +--------------+-------+
|   |     |          | ``limit``    | 0     |
|   |     |          +--------------+-------+
|   |     |          | ``length``   | 3     |
|   |     |          +--------------+-------+
|   |     |          | ``pattern``  | "bar" |
+---+-----+----------+--------------+-------+
| 5 | RAW | ``spec`` | ``relative`` | 1     |
|   |     |          +--------------+-------+
|   |     |          | ``search``   | 0     |
|   |     |          +--------------+-------+
|   |     |          | ``offset``   | -29   |
|   |     |          +--------------+-------+
|   |     |          | ``limit``    | 0     |
|   |     |          +--------------+-------+
|   |     |          | ``length``   | 3     |
|   |     |          +--------------+-------+
|   |     |          | ``pattern``  | "baz" |
+---+-----+----------+--------------+-------+

This translates to:

- Locate "foo" at least 10 bytes deep inside UDP payload.
- Locate "bar" after "foo" plus 20 bytes.
- Locate "baz" after "bar" minus 29 bytes.

Such a packet may be represented as follows (not to scale)::

 0                     >= 10 B           == 20 B
 |                  |<--------->|     |<--------->|
 |                  |           |     |           |
 |-----|------|-----|-----|-----|-----|-----------|-----|------|
 | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
 |-----|------|-----|-----|-----|-----|-----------|-----|------|
                          |                             |
                          |<--------------------------->|
                                      == 29 B

Note that matching subsequent pattern items would resume after "baz", not
"bar" since matching is always performed after the previous item of the
stack.
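
The three RAW items above can be sketched as a small matcher over a UDP
payload. This is only an illustration: ``struct raw_spec``, ``raw_match()``
and ``raw_match_all()`` are hypothetical names, and treating ``limit`` = 0 as
"no limit" is an assumption, not something this specification states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the RAW item properties listed in the table. */
struct raw_spec {
    int relative;        /* offset counts from the end of the previous item */
    int search;          /* scan forward from offset instead of matching in place */
    long offset;         /* may be negative when relative is set */
    size_t limit;        /* search window for the pattern start; 0 assumed unlimited */
    size_t length;       /* pattern length in bytes */
    const char *pattern;
};

/*
 * Match one RAW item against buf. prev_end is the position right after the
 * previous item. Returns the position right after the match, or -1.
 */
static long
raw_match(const unsigned char *buf, size_t len, long prev_end,
          const struct raw_spec *s)
{
    long start = (s->relative ? prev_end : 0) + s->offset;
    long last;

    if (start < 0 || s->length > len || (size_t)start > len - s->length)
        return -1;
    /* Without search, only the exact position is tried. */
    last = s->search ? (long)(len - s->length) : start;
    if (s->search && s->limit != 0 && start + (long)s->limit < last)
        last = start + (long)s->limit;
    for (long pos = start; pos <= last; ++pos)
        if (memcmp(buf + pos, s->pattern, s->length) == 0)
            return pos + (long)s->length;
    return -1;
}

/* Apply RAW items in order, starting at the beginning of the payload. */
static long
raw_match_all(const unsigned char *buf, size_t len,
              const struct raw_spec *items, size_t n)
{
    long pos = 0;

    for (size_t i = 0; i < n && pos >= 0; ++i)
        pos = raw_match(buf, len, pos, &items[i]);
    return pos;
}
```

With "baz" placed immediately before "foo" and "bar" starting 20 bytes after
the end of "foo", the final position returned is the byte following "baz",
which is where matching of a subsequent item would resume.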

.. raw:: pdf

   PageBreak

``ETH``
^^^^^^^

Matches an Ethernet header.

- ``dst``: destination MAC.
- ``src``: source MAC.
- ``type``: EtherType.
- ``tags``: number of 802.1Q/ad tags defined.
- ``tag[]``: 802.1Q/ad tag definitions, outermost first. For each one:

 - ``tpid``: Tag protocol identifier.
 - ``tci``: Tag control information.

``IPV4``
^^^^^^^^

Matches an IPv4 header.

Note: IPv4 options are handled by dedicated pattern items.

- ``hdr``: IPv4 header definition (``rte_ip.h``).

``IPV6``
^^^^^^^^

Matches an IPv6 header.

Note: IPv6 options are handled by dedicated pattern items.

- ``hdr``: IPv6 header definition (``rte_ip.h``).

``ICMP``
^^^^^^^^

Matches an ICMP header.

- ``hdr``: ICMP header definition (``rte_icmp.h``).

``UDP``
^^^^^^^

Matches a UDP header.

- ``hdr``: UDP header definition (``rte_udp.h``).

``TCP``
^^^^^^^

Matches a TCP header.

- ``hdr``: TCP header definition (``rte_tcp.h``).

``SCTP``
^^^^^^^^

Matches an SCTP header.

- ``hdr``: SCTP header definition (``rte_sctp.h``).

``VXLAN``
^^^^^^^^^

Matches a VXLAN header (RFC 7348).

- ``flags``: normally 0x08 (I flag).
- ``rsvd0``: reserved, normally 0x000000.
- ``vni``: VXLAN network identifier.
- ``rsvd1``: reserved, normally 0x00.

.. raw:: pdf

   PageBreak

Actions
~~~~~~~

Each possible action is represented by a type. Some have associated
configuration structures. Several actions combined in a list can be assigned
to a flow rule. That list is not ordered.

At least one action must be defined in a flow rule in order to do
something with matched packets.

- Actions are defined with ``struct rte_flow_action``.
- A list of actions is defined with ``struct rte_flow_actions``.

They fall in three categories:

- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
  processing matched packets by subsequent flow rules, unless overridden
  with PASSTHRU.

- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
  additional processing by subsequent flow rules.

- Other non-terminating meta actions that do not affect the fate of packets
  (END, VOID, MARK, FLAG, COUNT).

When several actions are combined in a flow rule, they should all have
different types (e.g. dropping a packet twice is not possible). The defined
behavior is for PMDs to only take into account the last action of a given
type found in the list. PMDs still perform error checking on the entire
list.

*Note that PASSTHRU is the only action having the ability to override a
terminating rule.*

.. raw:: pdf

   PageBreak

Example of an action that redirects packets to queue index 10:

+----------------+
| QUEUE          |
+===========+====+
| ``queue`` | 10 |
+-----------+----+

Examples of action lists. Their order is not significant; applications must
consider all actions to be performed simultaneously:

+----------------+
| Count and drop |
+=======+========+
| COUNT |        |
+-------+--------+
| DROP  |        |
+-------+--------+

+--------------------------+
| Tag, count and redirect  |
+=======+===========+======+
| MARK  | ``mark``  | 0x2a |
+-------+-----------+------+
| COUNT |                  |
+-------+-----------+------+
| QUEUE | ``queue`` | 10   |
+-------+-----------+------+

+-----------------------+
| Redirect to queue 5   |
+=======+===============+
| DROP  |               |
+-------+-----------+---+
| QUEUE | ``queue`` | 5 |
+-------+-----------+---+

In the above example, since both actions are performed simultaneously, the
end result is that only QUEUE has any effect.

+-----------------------+
| Redirect to queue 3   |
+=======+===========+===+
| QUEUE | ``queue`` | 5 |
+-------+-----------+---+
| VOID  |               |
+-------+-----------+---+
| QUEUE | ``queue`` | 3 |
+-------+-----------+---+

As previously described, only the last action of a given type found in the
list is taken into account. The above example also shows that VOID is
ignored.
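
The "last action of a given type wins" rule can be sketched as follows. The
``enum act_type`` and ``struct act`` definitions are hypothetical stand-ins
for ``enum rte_flow_action_type`` and ``struct rte_flow_action``, whose exact
layout is not given in this section.

```c
#include <stddef.h>

/* Hypothetical subset of the action types described in this document. */
enum act_type {
    ACT_END = 0, ACT_VOID, ACT_PASSTHRU, ACT_MARK, ACT_QUEUE,
    ACT_DROP, ACT_COUNT, ACT_TYPE_MAX
};

struct act {
    enum act_type type;
    unsigned int arg; /* queue index or mark value, unused otherwise */
};

/*
 * Fill last[] with a pointer to the last action of each type found before
 * END. Entries stay NULL for absent types. VOID entries are skipped.
 */
static void
resolve_actions(const struct act *list, const struct act *last[ACT_TYPE_MAX])
{
    for (size_t i = 0; i < ACT_TYPE_MAX; ++i)
        last[i] = NULL;
    for (; list->type != ACT_END; ++list)
        if (list->type != ACT_VOID)
            last[list->type] = list;
}
```

A PMD would still have to check every entry of the list for errors, as stated
earlier; this sketch only shows which entry ends up being effective.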

.. raw:: pdf

   PageBreak

Action types
~~~~~~~~~~~~

Common action types are described in this section. Like pattern item types,
this list is not exhaustive as new actions will be added in the future.

``END`` (action)
^^^^^^^^^^^^^^^^

End marker for action lists. Prevents further processing of actions, thereby
ending the list.

- Its numeric value is **0** for convenience.
- PMD support is mandatory.
- No configurable property.

+---------------+
| END           |
+===============+
| no properties |
+---------------+

``VOID`` (action)
^^^^^^^^^^^^^^^^^

Used as a placeholder for convenience. It is ignored and simply discarded by
PMDs.

- PMD support is mandatory.
- No configurable property.

+---------------+
| VOID          |
+===============+
| no properties |
+---------------+

``PASSTHRU``
^^^^^^^^^^^^

Leaves packets up for additional processing by subsequent flow rules. This
is the default when a rule does not contain a terminating action, but can be
specified to force a rule to become non-terminating.

- No configurable property.

+---------------+
| PASSTHRU      |
+===============+
| no properties |
+---------------+

Example to copy a packet to a queue and continue processing by subsequent
flow rules:

+--------------------------+
| Copy to queue 8          |
+==========+===============+
| PASSTHRU |               |
+----------+-----------+---+
| QUEUE    | ``queue`` | 8 |
+----------+-----------+---+
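
Whether a rule ends up terminating can be derived from its action list. The
following is a minimal sketch under the categories defined in the Actions
section; the enum is a hypothetical stand-in for
``enum rte_flow_action_type``.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the action types listed in this document. */
enum act_type {
    ACT_END = 0, ACT_VOID, ACT_PASSTHRU, ACT_MARK, ACT_FLAG, ACT_QUEUE,
    ACT_DROP, ACT_COUNT, ACT_DUP, ACT_RSS, ACT_PF, ACT_VF
};

/*
 * A rule is terminating when it contains at least one terminating action
 * (QUEUE, DROP, RSS, PF, VF) and no PASSTHRU to override it.
 */
static bool
rule_is_terminating(const enum act_type *list)
{
    bool terminating = false;
    bool passthru = false;

    for (; *list != ACT_END; ++list) {
        switch (*list) {
        case ACT_QUEUE:
        case ACT_DROP:
        case ACT_RSS:
        case ACT_PF:
        case ACT_VF:
            terminating = true;
            break;
        case ACT_PASSTHRU:
            passthru = true;
            break;
        default:
            break; /* meta actions and DUP do not decide the outcome */
        }
    }
    return terminating && !passthru;
}
```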

.. raw:: pdf

   PageBreak

``MARK``
^^^^^^^^

Attaches a 32-bit value to packets.

This value is arbitrary and application-defined. For compatibility with FDIR
it is returned in the ``hash.fdir.hi`` mbuf field. ``PKT_RX_FDIR_ID`` is
also set in ``ol_flags``.

+------------------------------------------------+
| MARK                                           |
+==========+=====================================+
| ``mark`` | 32-bit value to return with packets |
+----------+-------------------------------------+

``FLAG``
^^^^^^^^

Flag packets. Similar to `MARK`_ but only affects ``ol_flags``.

Note: a distinctive flag must be defined for it.

+---------------+
| FLAG          |
+===============+
| no properties |
+---------------+

``QUEUE``
^^^^^^^^^

Assigns packets to a given queue index.

- Terminating by default.

+--------------------------------+
| QUEUE                          |
+===========+====================+
| ``queue`` | queue index to use |
+-----------+--------------------+

``DROP``
^^^^^^^^

Drop packets.

- No configurable property.
- Terminating by default.
- PASSTHRU overrides this action if both are specified.

+---------------+
| DROP          |
+===============+
| no properties |
+---------------+

.. raw:: pdf

   PageBreak

``COUNT``
^^^^^^^^^

Enables counters for this rule.

These counters can be retrieved and reset through ``rte_flow_query()``, see
``struct rte_flow_query_count``.

- No configurable property.

+---------------+
| COUNT         |
+===============+
| no properties |
+---------------+

Query structure to retrieve and reset flow rule counters:

+---------------------------------------------------------+
| COUNT query                                             |
+===============+=====+===================================+
| ``reset``     | in  | reset counter after query         |
+---------------+-----+-----------------------------------+
| ``hits_set``  | out | ``hits`` field is set             |
+---------------+-----+-----------------------------------+
| ``bytes_set`` | out | ``bytes`` field is set            |
+---------------+-----+-----------------------------------+
| ``hits``      | out | number of hits for this rule      |
+---------------+-----+-----------------------------------+
| ``bytes``     | out | number of bytes through this rule |
+---------------+-----+-----------------------------------+
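
An application consuming this query structure has to honor the ``hits_set``
and ``bytes_set`` flags, since a device may provide only one of the two
counters. A minimal sketch follows; ``struct demo_query_count`` is a
hypothetical layout derived from the table above, and the final definition of
``struct rte_flow_query_count`` may differ.

```c
#include <stdint.h>

/* Hypothetical layout following the COUNT query table. */
struct demo_query_count {
    uint32_t reset:1;     /* in: reset counter after query */
    uint32_t hits_set:1;  /* out: hits field is set */
    uint32_t bytes_set:1; /* out: bytes field is set */
    uint64_t hits;        /* out: number of hits for this rule */
    uint64_t bytes;       /* out: number of bytes through this rule */
};

struct demo_totals {
    uint64_t hits;
    uint64_t bytes;
};

/* Add one query result to running totals, ignoring fields that are not set. */
static void
demo_accumulate(struct demo_totals *t, const struct demo_query_count *q)
{
    if (q->hits_set)
        t->hits += q->hits;
    if (q->bytes_set)
        t->bytes += q->bytes;
}
```

Accumulating in the application like this is mostly useful when ``reset`` is
requested on every query.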

``DUP``
^^^^^^^

Duplicates packets to a given queue index.

This is normally combined with QUEUE. When used alone, it is similar to
QUEUE + PASSTHRU.

- Non-terminating by default.

+------------------------------------------------+
| DUP                                            |
+===========+====================================+
| ``queue`` | queue index to duplicate packet to |
+-----------+------------------------------------+

``RSS``
^^^^^^^

Similar to QUEUE, except RSS is additionally performed on packets to spread
them among several queues according to the provided parameters.

Note: RSS hash result is normally stored in the ``hash.rss`` mbuf field,
however it conflicts with the `MARK`_ action as they share the same
space. When both actions are specified, the RSS hash is discarded and
``PKT_RX_RSS_HASH`` is not set in ``ol_flags``. MARK has priority. The mbuf
structure should eventually evolve to store both.

- Terminating by default.

+---------------------------------------------+
| RSS                                         |
+==============+==============================+
| ``rss_conf`` | RSS parameters               |
+--------------+------------------------------+
| ``queues``   | number of entries in queue[] |
+--------------+------------------------------+
| ``queue[]``  | queue indices to use         |
+--------------+------------------------------+

.. raw:: pdf

   PageBreak

``PF`` (action)
^^^^^^^^^^^^^^^

Redirects packets to the physical function (PF) of the current device.

- No configurable property.
- Terminating by default.

+---------------+
| PF            |
+===============+
| no properties |
+---------------+

``VF`` (action)
^^^^^^^^^^^^^^^

Redirects packets to a virtual function (VF) of the current device.

Packets matched by a VF pattern item can be redirected to their original VF
ID instead of the specified one. This parameter may not be available and is
not guaranteed to work properly if the VF part is matched by a prior flow
rule or if packets are not addressed to a VF in the first place.

- Terminating by default.

+-----------------------------------------------+
| VF                                            |
+==============+================================+
| ``original`` | use original VF ID if possible |
+--------------+--------------------------------+
| ``vf``       | VF ID to redirect packets to   |
+--------------+--------------------------------+

Negative types
~~~~~~~~~~~~~~

All specified pattern items (``enum rte_flow_item_type``) and actions
(``enum rte_flow_action_type``) use positive identifiers.

The negative space is reserved for dynamic types generated by PMDs during
run-time. PMDs may encounter them as a result, but do not have to accept
negative types they did not generate.

The method to generate them has not been specified yet.

Planned types
~~~~~~~~~~~~~

Pattern item types will be added as new protocols are implemented.

Variable header support will be provided through dedicated pattern items,
for example in order to match specific IPv4 options and IPv6 extension
headers. These would be stacked behind IPv4/IPv6 items.

Other action types are planned but not defined yet. These actions will add
the ability to alter matched packets in several ways, such as performing
encapsulation/decapsulation of tunnel headers on specific flows.

.. raw:: pdf

   PageBreak

Rules management
----------------

A simple API with few functions is provided to fully manage flows.

Each created flow rule is associated with an opaque, PMD-specific handle
pointer. The application is responsible for keeping it until the rule is
destroyed.

Flow rules are represented by ``struct rte_flow`` objects.

Validation
~~~~~~~~~~

Given that expressing a definite set of device capabilities with this API is
not practical, a dedicated function is provided to check if a flow rule is
supported and can be created.

::

 int
 rte_flow_validate(uint8_t port_id,
                   const struct rte_flow_attr *attr,
                   const struct rte_flow_pattern *pattern,
                   const struct rte_flow_actions *actions,
                   struct rte_flow_error *error);

While this function has no effect on the target device, the flow rule is
validated against its current configuration state and the returned value
should be considered valid by the caller for that state only.

The returned value is guaranteed to remain valid only as long as no
successful calls to rte_flow_create() or rte_flow_destroy() are made in the
meantime and no device parameter affecting flow rules in any way is
modified, due to possible collisions or resource limitations (although in
such cases ``EINVAL`` should not be returned).

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``attr``: flow rule attributes.
- ``pattern``: pattern specification.
- ``actions``: actions associated with the flow definition.
- ``error``: perform verbose error reporting if not NULL.

Return value:

- **0** if the flow rule is valid and can be created, a negative errno value
  otherwise (``rte_errno`` is also set). The following errors are defined:
- ``-ENOSYS``: underlying device does not support this functionality.
- ``-EINVAL``: unknown or invalid rule specification.
- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial
  bit-masks are unsupported).
- ``-EEXIST``: collision with an existing rule.
- ``-ENOMEM``: not enough resources.
- ``-EBUSY``: action cannot be performed due to busy device resources, may
  succeed if the affected queues or even the entire port are in a stopped
  state (see ``rte_eth_dev_rx_queue_stop()`` and ``rte_eth_dev_stop()``).
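
For logging purposes, an application could map these return values to the
descriptions given above. The helper name ``flow_validate_strerror()`` is
hypothetical; only the errno values and their meaning come from this
section.

```c
#include <errno.h>

/* Describe a return value of rte_flow_validate() as documented above. */
static const char *
flow_validate_strerror(int ret)
{
    switch (-ret) {
    case 0:
        return "flow rule is valid and can be created";
    case ENOSYS:
        return "underlying device does not support this functionality";
    case EINVAL:
        return "unknown or invalid rule specification";
    case ENOTSUP:
        return "valid but unsupported rule specification";
    case EEXIST:
        return "collision with an existing rule";
    case ENOMEM:
        return "not enough resources";
    case EBUSY:
        return "busy device resources, stopping queues or port may help";
    default:
        return "unspecified error";
    }
}
```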

.. raw:: pdf

   PageBreak

Creation
~~~~~~~~

Creating a flow rule is similar to validating one, except the rule is
actually created and a handle returned.

::

 struct rte_flow *
 rte_flow_create(uint8_t port_id,
                 const struct rte_flow_attr *attr,
                 const struct rte_flow_pattern *pattern,
                 const struct rte_flow_actions *actions,
                 struct rte_flow_error *error);

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``attr``: flow rule attributes.
- ``pattern``: pattern specification.
- ``actions``: actions associated with the flow definition.
- ``error``: perform verbose error reporting if not NULL.

Return value:

A valid handle in case of success, NULL otherwise and ``rte_errno`` is set
to the positive version of one of the error codes defined for
``rte_flow_validate()``.

Destruction
~~~~~~~~~~~

Flow rule destruction is not automatic, and a queue or a port should not be
released while flow rules are still attached to it. Applications must take
care of performing this step before releasing resources.

::

 int
 rte_flow_destroy(uint8_t port_id,
                  struct rte_flow *flow,
                  struct rte_flow_error *error);


Failure to destroy a flow rule handle may occur when other flow rules depend
on it, and destroying it would result in an inconsistent state.

This function is only guaranteed to succeed if handles are destroyed in
reverse order of their creation.

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``flow``: flow rule handle to destroy.
- ``error``: perform verbose error reporting if not NULL.

Return value:

- **0** on success, a negative errno value otherwise and ``rte_errno`` is
  set.
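
Since destruction is only guaranteed to succeed in reverse order of creation,
an application may keep its handles on a stack. The sketch below is
self-contained and therefore uses a hypothetical callback type instead of
calling ``rte_flow_destroy()`` directly; ``demo_record()`` is a stub that only
logs the order in which handles are released.

```c
#include <stddef.h>

#define DEMO_MAX_FLOWS 64

/* Handles in creation order; the newest one is at index count - 1. */
struct demo_flow_stack {
    void *handle[DEMO_MAX_FLOWS];
    size_t count;
};

/* Hypothetical callback; an application would wrap rte_flow_destroy() here. */
typedef int (*demo_destroy_t)(void *handle);

/* Remember a newly created handle. Returns -1 when the stack is full. */
static int
demo_push(struct demo_flow_stack *s, void *handle)
{
    if (s->count == DEMO_MAX_FLOWS)
        return -1;
    s->handle[s->count++] = handle;
    return 0;
}

/* Destroy newest first; stop on failure so remaining handles stay tracked. */
static int
demo_destroy_all(struct demo_flow_stack *s, demo_destroy_t destroy)
{
    while (s->count > 0) {
        int ret = destroy(s->handle[s->count - 1]);

        if (ret != 0)
            return ret;
        --s->count;
    }
    return 0;
}

/* Stub for illustration only: records the order of destruction. */
static void *demo_log[DEMO_MAX_FLOWS];
static size_t demo_log_len;

static int
demo_record(void *handle)
{
    demo_log[demo_log_len++] = handle;
    return 0;
}
```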

.. raw:: pdf

   PageBreak

Flush
~~~~~

Convenience function to destroy all flow rule handles associated with a
port. They are released as with successive calls to ``rte_flow_destroy()``.

::

 int
 rte_flow_flush(uint8_t port_id,
                struct rte_flow_error *error);

In the unlikely event of failure, handles are still considered destroyed and
no longer valid but the port must be assumed to be in an inconsistent state.

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``error``: perform verbose error reporting if not NULL.

Return value:

- **0** on success, a negative errno value otherwise and ``rte_errno`` is
  set.

Query
~~~~~

Query an existing flow rule.

This function allows retrieving flow-specific data such as counters. Data
is gathered by special actions which must be present in the flow rule
definition.

::

 int
 rte_flow_query(uint8_t port_id,
                struct rte_flow *flow,
                enum rte_flow_action_type action,
                void *data,
                struct rte_flow_error *error);

Arguments:

- ``port_id``: port identifier of Ethernet device.
- ``flow``: flow rule handle to query.
- ``action``: action type to query.
- ``data``: pointer to storage for the associated query data type.
- ``error``: perform verbose error reporting if not NULL.

Return value:

- **0** on success, a negative errno value otherwise and ``rte_errno`` is
  set.

.. raw:: pdf

   PageBreak

Verbose error reporting
~~~~~~~~~~~~~~~~~~~~~~~

The defined *errno* values may not be accurate enough for users or
application developers who want to investigate issues related to flow rules
management. A dedicated error object is defined for this purpose::

 enum rte_flow_error_type {
     RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
     RTE_FLOW_ERROR_TYPE_UNDEFINED, /**< Cause is undefined. */
     RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
     RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
     RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
     RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
     RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
     RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure itself. */
     RTE_FLOW_ERROR_TYPE_PATTERN_MAX, /**< Pattern length (max field). */
     RTE_FLOW_ERROR_TYPE_PATTERN_ITEM, /**< Specific pattern item. */
     RTE_FLOW_ERROR_TYPE_PATTERN, /**< Pattern structure itself. */
     RTE_FLOW_ERROR_TYPE_ACTION_MAX, /**< Number of actions (max field). */
     RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
     RTE_FLOW_ERROR_TYPE_ACTIONS, /**< Actions structure itself. */
 };

 struct rte_flow_error {
     enum rte_flow_error_type type; /**< Cause field and error types. */
     void *cause; /**< Object responsible for the error. */
     const char *message; /**< Human-readable error message. */
 };

Error type ``RTE_FLOW_ERROR_TYPE_NONE`` stands for no error, in which case
the remaining fields can be ignored. Other error types describe the object
type pointed to by ``cause``.

If non-NULL, ``cause`` points to the object responsible for the error. For a
flow rule, this may be a pattern item or an individual action.

If non-NULL, ``message`` provides a human-readable error message.

This object is normally allocated by applications and set by PMDs. The
message points to a constant string which does not need to be freed by the
application; however, its pointer can be considered valid only as long as its
associated DPDK port remains configured. Closing the underlying device or
unloading the PMD invalidates it.
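
An application might turn such an error object into a log line as sketched
below. The enum and structure are copied from the definitions above so that
the example is self-contained; ``flow_error_format()`` is a hypothetical
helper, not part of the proposed API.

```c
#include <stdio.h>

enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNDEFINED,
    RTE_FLOW_ERROR_TYPE_HANDLE,
    RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
    RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
    RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR,
    RTE_FLOW_ERROR_TYPE_PATTERN_MAX,
    RTE_FLOW_ERROR_TYPE_PATTERN_ITEM,
    RTE_FLOW_ERROR_TYPE_PATTERN,
    RTE_FLOW_ERROR_TYPE_ACTION_MAX,
    RTE_FLOW_ERROR_TYPE_ACTION,
    RTE_FLOW_ERROR_TYPE_ACTIONS,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    void *cause;
    const char *message;
};

/*
 * Write a one-line description of e into buf. Both cause and message may be
 * NULL, and all fields are ignored when type is NONE. Returns the value of
 * snprintf().
 */
static int
flow_error_format(char *buf, size_t size, const struct rte_flow_error *e)
{
    if (e == NULL || e->type == RTE_FLOW_ERROR_TYPE_NONE)
        return snprintf(buf, size, "no error");
    return snprintf(buf, size, "type %d, cause %s: %s", (int)e->type,
                    e->cause != NULL ? "set" : "unset",
                    e->message != NULL ? e->message : "(no message)");
}
```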

.. raw:: pdf

   PageBreak

PMD interface
~~~~~~~~~~~~~

This specification focuses on the public-facing interface, which must be
fully defined from the start to avoid a re-design later as it is subject to
API and ABI versioning constraints.

No such issue exists with the internal interface for use by poll-mode
drivers which can evolve independently, hence this section only outlines how
requests are processed by PMDs.

Public functions are mapped more or less directly to PMD operation
callbacks, thus:

- Public API functions do not process flow rules definitions at all before
  calling PMD callbacks (no basic error checking, no validation
  whatsoever). They only make sure these callbacks are non-NULL or return
  the ``ENOSYS`` (function not supported) error.

- DPDK does not keep track of flow rules definitions or flow rule objects
  automatically. Applications may keep track of the former and must keep
  track of the latter. PMDs may also do it for internal needs, however this
  cannot be relied on by applications.
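
The thin public layer described above could look like the following sketch.
Since the internal interface is not defined yet, ``struct demo_flow_ops``,
``struct demo_dev`` and ``demo_flow_flush()`` are hypothetical names used for
illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical PMD callbacks table; only one operation is shown. */
struct demo_flow_ops {
    int (*flush)(void *priv);
};

/* Hypothetical device object holding the PMD callbacks and private data. */
struct demo_dev {
    const struct demo_flow_ops *ops;
    void *priv;
};

/*
 * Public entry point: no processing of its own, it only checks that the
 * callback exists and otherwise reports that the function is not supported.
 */
static int
demo_flow_flush(struct demo_dev *dev)
{
    if (dev->ops == NULL || dev->ops->flush == NULL)
        return -ENOSYS;
    return dev->ops->flush(dev->priv);
}

/* Stub PMD callback used to exercise the dispatcher. */
static int
demo_flush_ok(void *priv)
{
    (void)priv;
    return 0;
}
```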

The private interface will provide helper functions to perform common tasks
such as parsing, validating and keeping track of flow rule specifications to
avoid redundant code in PMDs and ease implementation.

Its contents are currently largely undefined since at least one PMD
implementation is necessary first. PMD maintainers are encouraged to share
as much generic code as possible.

.. raw:: pdf

   PageBreak

Caveats
-------

- Flow rules are not maintained between successive port initializations. An
  application exiting without releasing them and restarting must re-create
  them from scratch.

- API operations are synchronous and blocking (``EAGAIN`` cannot be
  returned).

- There is no provision for reentrancy/multi-thread safety, although nothing
  should prevent different devices from being configured at the same
  time. PMDs may protect their control path functions accordingly.

- Stopping the data path (TX/RX) should not be necessary when managing flow
  rules. If this cannot be achieved naturally or with workarounds (such as
  temporarily replacing the burst function pointers), an appropriate error
  code must be returned (``EBUSY``).

- PMDs, not applications, are responsible for maintaining flow rules
  configuration when stopping and restarting a port or performing other
  actions which may affect them. They can only be destroyed explicitly.

For devices exposing multiple ports sharing global settings affected by flow
rules:

- All ports under DPDK control must behave consistently. PMDs are
  responsible for making sure that existing flow rules on a port are not
  affected by other ports.

- Ports not under DPDK control (unaffected or handled by other applications)
  are the user's responsibility. They may affect existing flow rules and
  cause undefined behavior. PMDs aware of this may prevent flow rules
  creation altogether in such cases.

.. raw:: pdf

   PageBreak

Compatibility
-------------

No known hardware implementation supports all the features described in this
document.

Unsupported features or combinations are not expected to be fully emulated
in software by PMDs for performance reasons. Partially supported features
may be completed in software as long as hardware performs most of the work
(such as queue redirection and packet recognition).

However PMDs are expected to do their best to satisfy application requests
by working around hardware limitations as long as doing so does not affect
the behavior of existing flow rules.

The following sections provide a few examples of such cases. They are based
on limitations built into the previous APIs.

Global bit-masks
~~~~~~~~~~~~~~~~

Each flow rule comes with its own, per-layer bit-masks, while hardware may
support only a single, device-wide bit-mask for a given layer type, so that
two IPv4 rules cannot use different bit-masks.

The expected behavior in this case is that PMDs automatically configure
global bit-masks according to the needs of the first created flow rule.

Subsequent rules are allowed only if their bit-masks match those; the
``EEXIST`` error code should be returned otherwise.
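
A PMD could track such a device-wide bit-mask with a simple usage counter, as
in this sketch. The names are hypothetical, and releasing the mask once no
rule uses it is an assumption based on the statement made in the FDIR
migration section that bit-masks become mutable again after the related flow
rules are destroyed.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-device state for one layer type (e.g. an IPv4 field). */
struct demo_global_mask {
    uint32_t mask;      /* device-wide bit-mask currently programmed */
    unsigned int users; /* number of flow rules relying on it */
};

/* Called on rule creation: the first rule decides, others must agree. */
static int
demo_mask_acquire(struct demo_global_mask *g, uint32_t mask)
{
    if (g->users == 0)
        g->mask = mask;
    else if (g->mask != mask)
        return -EEXIST;
    ++g->users;
    return 0;
}

/* Called on rule destruction: the mask is free again once unused. */
static void
demo_mask_release(struct demo_global_mask *g)
{
    if (g->users > 0)
        --g->users;
}
```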

Unsupported layer types
~~~~~~~~~~~~~~~~~~~~~~~

Many protocols can be simulated by crafting patterns with the `RAW`_ type.

PMDs can rely on this capability to simulate support for protocols with
fixed headers not directly recognized by hardware.

``ANY`` pattern item
~~~~~~~~~~~~~~~~~~~~

This pattern item stands for anything, which can be difficult to translate
to something hardware would understand, particularly if followed by more
specific types.

Consider the following pattern:

+---+--------------------------------+
| 0 | ETH                            |
+---+--------------------------------+
| 1 | ANY (``min`` = 1, ``max`` = 1) |
+---+--------------------------------+
| 2 | TCP                            |
+---+--------------------------------+

Knowing that TCP does not make sense with something other than IPv4 and IPv6
as L3, such a pattern may be translated to two flow rules instead:

+---+--------------------+
| 0 | ETH                |
+---+--------------------+
| 1 | IPV4 (zeroed mask) |
+---+--------------------+
| 2 | TCP                |
+---+--------------------+

+---+--------------------+
| 0 | ETH                |
+---+--------------------+
| 1 | IPV6 (zeroed mask) |
+---+--------------------+
| 2 | TCP                |
+---+--------------------+

Note that as soon as an ANY rule covers several layers, this approach may
yield a large number of hidden flow rules. It is thus suggested to only
support the most common scenarios (anything as L2 and/or L3).
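
The translation of a single-layer ANY item into one hidden rule per candidate
L3 protocol can be sketched as below. Item types are reduced to a
hypothetical enum; a real PMD would also have to copy specs and masks, which
is left out here.

```c
#include <stddef.h>

/* Hypothetical stand-in for pattern item types. */
enum demo_item {
    ITEM_END = 0, ITEM_ANY, ITEM_ETH, ITEM_IPV4, ITEM_IPV6, ITEM_TCP
};

/*
 * Copy pattern into out, replacing the first ANY item with l3.
 * Returns the number of items written, END included, or 0 when out is too
 * small to hold the whole pattern.
 */
static size_t
demo_expand_any(const enum demo_item *pattern, enum demo_item l3,
                enum demo_item *out, size_t out_len)
{
    size_t n = 0;
    int replaced = 0;

    for (;; ++pattern) {
        if (n == out_len)
            return 0;
        if (*pattern == ITEM_ANY && !replaced) {
            out[n++] = l3;
            replaced = 1;
        } else {
            out[n++] = *pattern;
        }
        if (*pattern == ITEM_END)
            return n;
    }
}
```

Calling it once with ``ITEM_IPV4`` and once with ``ITEM_IPV6`` on the
ETH / ANY / TCP pattern yields the two hidden rules shown in the tables
above.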

.. raw:: pdf

   PageBreak

Unsupported actions
~~~~~~~~~~~~~~~~~~~

- When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
  tagging (`MARK`_ or `FLAG`_) may be implemented in software as long as the
  target queue is used by a single rule.

- A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
  rules combining `QUEUE`_ and `PASSTHRU`_.

- When a single target queue is provided, `RSS`_ can also be implemented
  through `QUEUE`_.

Flow rules priority
~~~~~~~~~~~~~~~~~~~

While it would naturally make sense, flow rules cannot be assumed to be
processed by hardware in the same order as their creation for several
reasons:

- They may be managed internally as a tree or a hash table instead of a
  list.
- Removing a flow rule before adding another one can either put the new rule
  at the end of the list or reuse a freed entry.
- Duplication may occur when packets are matched by several rules.

For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
predictable behavior is only guaranteed by using different priority levels.

Priority levels are not necessarily implemented in hardware, or may be
severely limited (e.g. a single priority bit).

For these reasons, priority levels may be implemented purely in software by
PMDs.

- For devices expecting flow rules to be added in the correct order, PMDs
  may destroy and re-create existing rules after adding a new one with
  a higher priority.

- A configurable number of dummy or empty rules can be created at
  initialization time to save high priority slots for later.

- In order to save priority levels, PMDs may evaluate whether rules are
  likely to collide and adjust their priority accordingly.

.. raw:: pdf

   PageBreak

API migration
=============

Exhaustive list of deprecated filter types and how to convert them to
generic flow rules.

``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
---------------------------------------

`MACVLAN`_ can be translated to a basic `ETH`_ flow rule with a `VF
(action)`_ or `PF (action)`_ terminating action.

+------------------------------------+
| MACVLAN                            |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | ETH | ``spec`` | any | VF,     |
|   |     +----------+-----+ PF      |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
----------------------------------------------

`ETHERTYPE`_ is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as
a terminating action.

+------------------------------------+
| ETHERTYPE                          |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | ETH | ``spec`` | any | QUEUE,  |
|   |     +----------+-----+ DROP    |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``FLEXIBLE`` to ``RAW`` → ``QUEUE``
-----------------------------------

`FLEXIBLE`_ can be translated to one `RAW`_ pattern with `QUEUE`_ as the
terminating action and a defined priority level.

+------------------------------------+
| FLEXIBLE                           |
+--------------------------+---------+
| Pattern                  | Actions |
+===+=====+==========+=====+=========+
| 0 | RAW | ``spec`` | any | QUEUE   |
|   |     +----------+-----+         |
|   |     | ``mask`` | any |         |
+---+-----+----------+-----+---------+

``SYN`` to ``TCP`` → ``QUEUE``
------------------------------

`SYN`_ is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
`QUEUE`_ as the terminating action.

Priority level can be set to simulate the high priority bit.

+---------------------------------------------+
| SYN                                         |
+-----------------------------------+---------+
| Pattern                           | Actions |
+===+======+==========+=============+=========+
| 0 | ETH  | ``spec`` | empty       | QUEUE   |
|   |      +----------+-------------+         |
|   |      | ``mask`` | empty       |         |
+---+------+----------+-------------+         |
| 1 | IPV4 | ``spec`` | empty       |         |
|   |      +----------+-------------+         |
|   |      | ``mask`` | empty       |         |
+---+------+----------+-------------+         |
| 2 | TCP  | ``spec`` | ``syn`` = 1 |         |
|   |      +----------+-------------+         |
|   |      | ``mask`` | ``syn`` = 1 |         |
+---+------+----------+-------------+---------+
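
The spec/mask pair of the TCP item above boils down to a masked comparison of
the TCP flags. In this sketch ``demo_masked_match()`` and ``DEMO_TCP_SYN``
are hypothetical names; the bit value is the SYN flag position in the TCP
header flags byte.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TCP_SYN 0x02 /* SYN bit in the TCP flags byte */

/* Only bits set in mask take part in the comparison. */
static bool
demo_masked_match(uint8_t value, uint8_t spec, uint8_t mask)
{
    return (value & mask) == (spec & mask);
}
```

Because only the ``syn`` bit is masked, a SYN+ACK segment matches as well; an
application wanting pure SYN segments would have to extend the mask to the
other flags.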

``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
----------------------------------------------------

`NTUPLE`_ is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
`UDP`_ as L4 and `QUEUE`_ as the terminating action.

A priority level can be specified as well.

+---------------------------------------+
| NTUPLE                                |
+-----------------------------+---------+
| Pattern                     | Actions |
+===+======+==========+=======+=========+
| 0 | ETH  | ``spec`` | empty | QUEUE   |
|   |      +----------+-------+         |
|   |      | ``mask`` | empty |         |
+---+------+----------+-------+         |
| 1 | IPV4 | ``spec`` | any   |         |
|   |      +----------+-------+         |
|   |      | ``mask`` | any   |         |
+---+------+----------+-------+         |
| 2 | TCP, | ``spec`` | any   |         |
|   | UDP  +----------+-------+         |
|   |      | ``mask`` | any   |         |
+---+------+----------+-------+---------+

``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
---------------------------------------------------------------------------

`TUNNEL`_ matches common IPv4 and IPv6 L3/L4-based tunnel types.

In the following table, `ANY`_ is used to cover the optional L4.

+------------------------------------------------+
| TUNNEL                                         |
+--------------------------------------+---------+
| Pattern                              | Actions |
+===+=========+==========+=============+=========+
| 0 | ETH     | ``spec`` | any         | QUEUE   |
|   |         +----------+-------------+         |
|   |         | ``mask`` | any         |         |
+---+---------+----------+-------------+         |
| 1 | IPV4,   | ``spec`` | any         |         |
|   | IPV6    +----------+-------------+         |
|   |         | ``mask`` | any         |         |
+---+---------+----------+-------------+         |
| 2 | ANY     | ``spec`` | ``min`` = 0 |         |
|   |         |          +-------------+         |
|   |         |          | ``max`` = 0 |         |
|   |         +----------+-------------+         |
|   |         | ``mask`` | N/A         |         |
+---+---------+----------+-------------+         |
| 3 | VXLAN,  | ``spec`` | any         |         |
|   | GENEVE, +----------+-------------+         |
|   | TEREDO, | ``mask`` | any         |         |
|   | NVGRE,  |          |             |         |
|   | GRE,    |          |             |         |
|   | ...     |          |             |         |
+---+---------+----------+-------------+---------+
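
To make the ``min`` = 0 / ``max`` = 0 entry above more concrete, here is a
rough sketch of how the two fields could be interpreted, following the
description of ``struct rte_flow_item_any`` (a maximum of 0 means no upper
limit, a nonzero maximum lower than the minimum is invalid). The
``ex_item_any`` structure and ``ex_any_covers()`` helper are hypothetical
names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the ANY item specification (min/max layer counts). */
struct ex_item_any {
	uint16_t min; /* minimum number of layers covered */
	uint16_t max; /* maximum number of layers covered, 0 for no limit */
};

/*
 * Returns 1 if the ANY item may stand for `layers` protocol layers,
 * 0 if it may not, and -1 if the min/max combination is invalid.
 */
int
ex_any_covers(const struct ex_item_any *any, unsigned int layers)
{
	assert(any != NULL);
	if (any->max != 0 && any->max < any->min)
		return -1;
	if (layers < any->min)
		return 0;
	if (any->max != 0 && layers > any->max)
		return 0;
	return 1;
}
```

With ``min`` and ``max`` both set to 0 as in the table, the item covers
zero layers (no L4 present) as well as one or more, which is what makes the
L4 optional.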

.. raw:: pdf

   PageBreak

``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
---------------------------------------------------------------

`FDIR`_ is more complex than any other type and there are several methods
to emulate its functionality. Most of it is summarized in the table
below.

A few features are intentionally not supported:

- The ability to configure the matching input set and masks for the entire
  device. PMDs should take care of it automatically according to the
  requested flow rules.

  For example, if a device supports only one bit-mask per protocol type,
  source/destination IPv4 address bit-masks can be made immutable by the
  first created rule. Subsequent IPv4 or TCPv4 rules can only be created if
  they are compatible.

  Note that only protocol bit-masks affected by existing flow rules are
  immutable, others can be changed later. They become mutable again after
  the related flow rules are destroyed.

- Returning four or eight bytes of matched data when using flex bytes
  filtering. Although a specific action could implement it, it conflicts
  with the much more useful 32-bit tagging on devices that support it.

- Side effects on RSS processing of the entire device. Flow rules that
  conflict with the current device configuration should not be
  allowed. Similarly, device configuration should not be allowed when it
  affects existing flow rules.

- Device modes of operation. "none" is unsupported since filtering cannot be
  disabled as long as a flow rule is present.

- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
  according to the created flow rules.

- Signature mode of operation is not defined but could be handled through a
  specific item type if needed.

+----------------------------------------------+
| FDIR                                         |
+---------------------------------+------------+
| Pattern                         | Actions    |
+===+============+==========+=====+============+
| 0 | ETH,       | ``spec`` | any | QUEUE,     |
|   | RAW        +----------+-----+ DROP,      |
|   |            | ``mask`` | any | PASSTHRU   |
+---+------------+----------+-----+------------+
| 1 | IPV4,      | ``spec`` | any | MARK       |
|   | IPV6       +----------+-----+ (optional) |
|   |            | ``mask`` | any |            |
+---+------------+----------+-----+            |
| 2 | TCP,       | ``spec`` | any |            |
|   | UDP,       +----------+-----+            |
|   | SCTP       | ``mask`` | any |            |
+---+------------+----------+-----+            |
| 3 | VF,        | ``spec`` | any |            |
|   | PF         +----------+-----+            |
|   | (optional) | ``mask`` | any |            |
+---+------------+----------+-----+------------+
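
The "one bit-mask per protocol type" case mentioned in the list above could
be handled inside a PMD roughly as follows. This is a minimal sketch under
several assumptions: ``ex_mask_slot`` and both helpers are hypothetical,
"compatible" is reduced to "identical mask", and the ``-1`` return value
stands in for whatever error code a real PMD would report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PMD-side state: one shared bit-mask per protocol type. */
struct ex_mask_slot {
	uint32_t mask;      /* mask currently programmed in the device */
	unsigned int users; /* number of flow rules relying on it */
};

/*
 * Called when a rule is created. The first rule sets the mask, later
 * rules must request the same one. Returns 0 on success, -1 otherwise.
 */
int
ex_mask_acquire(struct ex_mask_slot *slot, uint32_t mask)
{
	if (slot->users == 0)
		slot->mask = mask; /* unused slot: mask is still mutable */
	else if (slot->mask != mask)
		return -1; /* immutable while other rules depend on it */
	slot->users++;
	return 0;
}

/* Called when a rule is destroyed. */
void
ex_mask_release(struct ex_mask_slot *slot)
{
	assert(slot->users > 0);
	slot->users--; /* mask becomes mutable again once this reaches 0 */
}
```

Keeping a per-slot user count is what lets the mask become mutable again
after the related flow rules are destroyed, without any device-wide
configuration step exposed to applications.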

.. raw:: pdf

   PageBreak

``HASH``
~~~~~~~~

There is no counterpart to this filter type because it translates to a
global device setting instead of a pattern item. Device settings are
automatically set according to the created flow rules.

``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All packets are matched. This type alters incoming packets to encapsulate
them in a chosen tunnel type and can optionally redirect them to a VF as
well.

The destination pool for tag based forwarding can be emulated with other
flow rules using `DUP`_ as the action.

+----------------------------------------+
| L2_TUNNEL                              |
+---------------------------+------------+
| Pattern                   | Actions    |
+===+======+==========+=====+============+
| 0 | VOID | ``spec`` | N/A | VXLAN,     |
|   |      |          |     | GENEVE,    |
|   |      |          |     | ...        |
|   |      +----------+-----+------------+
|   |      | ``mask`` | N/A | VF         |
|   |      |          |     | (optional) |
+---+------+----------+-----+------------+

.. raw:: pdf

   PageBreak

Future evolutions
=================

- Describing dedicated testpmd commands to control and validate this API.

- A method to optimize generic flow rules with specific pattern items and
  action types generated on the fly by PMDs. DPDK will assign negative
  numbers to these in order to not collide with the existing types. See
  `Negative types`_.

- Adding specific egress pattern items and actions as described in `Traffic
  direction`_.

- Optional software fallback when PMDs are unable to handle requested flow
  rules so applications do not have to implement their own.

- Ranges in addition to bit-masks. Ranges are more generic in many ways as
  they interpret values. For instance only ranges make sense to cover
  several TCP or UDP ports. These will probably be defined on a pattern item
  basis.
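
As a rough illustration of why bit-masks alone are awkward for port ranges,
the following sketch counts how many value/bit-mask pairs are needed to
cover an inclusive range of 16-bit ports, where each pair can only describe
an aligned power-of-two block of values. The function name is hypothetical
and the decomposition is the usual greedy range-to-prefix split, not
something defined by this API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the value/bit-mask pairs needed to cover the inclusive range
 * [lo, hi] of 16-bit port numbers. Each pair covers one aligned
 * power-of-two block of values.
 */
unsigned int
ex_range_to_mask_count(uint16_t lo, uint16_t hi)
{
	uint32_t cur = lo; /* 32 bits so that cur can step past 65535 */
	unsigned int count = 0;

	assert(lo <= hi);
	while (cur <= hi) {
		uint32_t size = 1;

		/* Grow the block while it stays aligned and in range. */
		while (size < 0x10000 &&
		       (cur & ((size << 1) - 1)) == 0 &&
		       cur + (size << 1) - 1 <= hi)
			size <<= 1;
		cur += size;
		count++;
	}
	return count;
}
```

Under these assumptions, ports 1000 to 1999 need seven value/mask pairs,
whereas a range would express them with a single pattern item.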

--------

Adrien Mazarguil (1):
  ethdev: introduce generic flow API

 lib/librte_ether/Makefile   |   2 +
 lib/librte_ether/rte_flow.h | 941 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 943 insertions(+)
 create mode 100644 lib/librte_ether/rte_flow.h

-- 
2.1.4

* [dpdk-dev] [RFC v2] ethdev: introduce generic flow API
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
@ 2016-08-19 19:32   ` Adrien Mazarguil
  2016-08-20  7:00     ` Lu, Wenzhuo
  2016-08-22 18:20     ` John Fastabend
  2016-08-22 18:30   ` [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API John Fastabend
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-19 19:32 UTC (permalink / raw)
  To: dev

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

It has the following benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

The existing filter types will be deprecated and removed in the near
future.

Note that it is not complete yet. This commit only provides the header
file. The specification is provided separately, see below.

HTML version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.html

PDF version:
 https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf

Git tree:
 https://github.com/6WIND/rte_flow

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 lib/librte_ether/Makefile   |   2 +
 lib/librte_ether/rte_flow.h | 941 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 943 insertions(+)

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 0bb5dc9..a6f7cd5 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -52,8 +52,10 @@ SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
+DEPDIRS-y += lib/librte_net
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..0aa6094
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,941 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * Specifying both directions at once is not recommended but may be valid in
+ * some cases, such as incrementing the same counter twice.
+ *
+ * Not specifying any direction is currently an error.
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Items are arranged in a list to form a matching pattern for packets.
+ * They fall in two categories:
+ *
+ * - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, SCTP,
+ *   VXLAN and so on), usually associated with a specification
+ *   structure. These must be stacked in the same order as the protocol
+ *   layers to match, starting from L2.
+ *
+ * - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF, PORT
+ *   and so on), often without a specification structure. Since they are
+ *   meta data that does not match packet contents, these can be specified
+ *   anywhere within item lists without affecting the protocol matching
+ *   items.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer, a single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches a SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer, a single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A maximum value of 0 requests matching any number of protocol layers
+ * above or equal to the minimum value, a maximum value lower than the
+ * minimum one is otherwise invalid.
+ *
+ * Layer mask is ignored.
+ */
+struct rte_flow_item_any {
+	uint16_t min; /**< Minimum number of layers covered. */
+	uint16_t max; /**< Maximum number of layers covered, 0 for infinity. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   specific VFs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * Layer mask is ignored.
+ */
+struct rte_flow_item_vf {
+	uint32_t any:1; /**< Ignore the specified VF ID. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t vf; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Layer mask is ignored.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device, they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed, doing so resets the relative
+ * offset for subsequent items.
+ *
+ * The mask only affects the pattern field.
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	unsigned int type; /**< EtherType. */
+	unsigned int tags; /**< Number of 802.1Q/ad tags defined. */
+	struct {
+		uint16_t tpid; /**< Tag protocol identifier. */
+		uint16_t tci; /**< Tag control information. */
+	} tag[]; /**< 802.1Q/ad tag definitions, outermost first. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6.
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP.
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP.
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP.
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP.
+ *
+ * Matches a SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN.
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint32_t flags:8; /**< Normally 0x08 (I flag). */
+	uint32_t rsvd0:24; /**< Reserved, normally 0x000000. */
+	uint32_t vni:24; /**< VXLAN network identifier. */
+	uint32_t rsvd1:8; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * Except for meta types that do not need one, spec must be a valid pointer
+ * to a structure of the related item type. A mask of the same type can be
+ * provided to tell which bits in spec are to be matched.
+ *
+ * A mask is normally only needed for spec fields matching packet data,
+ * ignored otherwise. See individual item types for more information.
+ *
+ * A NULL mask pointer is allowed. It is similar to applying a full mask
+ * (all ones) to the spec fields supported by hardware while ignoring the
+ * remaining fields (all zero), so there is no error checking for
+ * unsupported fields.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *mask; /**< Mask for item specification. */
+};
+
+/**
+ * Matching pattern definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack with no effect on the meaning
+ * of the resulting pattern.
+ *
+ * The end of the item[] stack is detected either by reaching max or an END
+ * item, whichever comes first.
+ */
+struct rte_flow_pattern {
+	uint32_t max; /**< Maximum number of entries in item[]. */
+	struct rte_flow_item item[]; /**< Stacked items. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible). The
+ * defined behavior is for PMDs to only take into account the last action of
+ * a given type found in the list. PMDs still perform error checking on the
+ * entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32 bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE, however when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32 bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32 bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t queue; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE, however when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t queue; /**< Queue index to duplicate packet to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t queues; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queues indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint16_t vf; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * List of actions to associate with a flow.
+ *
+ * The end of the action[] list is detected either by reaching max or an END
+ * action, whichever comes first.
+ */
+struct rte_flow_actions {
+	uint32_t max; /**< Maximum number of entries in action[]. */
+	struct rte_flow_action action[]; /**< Actions to perform. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNDEFINED, /**< Cause is undefined. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure itself. */
+	RTE_FLOW_ERROR_TYPE_PATTERN_MAX, /**< Pattern length (max field). */
+	RTE_FLOW_ERROR_TYPE_PATTERN_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_PATTERN, /**< Pattern structure itself. */
+	RTE_FLOW_ERROR_TYPE_ACTION_MAX, /**< Number of actions (max field). */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+	RTE_FLOW_ERROR_TYPE_ACTIONS, /**< Actions structure itself. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameter affecting flow rules in any way is
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification.
+ * @param[in] actions
+ *   Actions associated with the flow definition.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_pattern *pattern,
+		  const struct rte_flow_actions *actions,
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification.
+ * @param[in] actions
+ *   Actions associated with the flow definition.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_pattern *pattern,
+		const struct rte_flow_actions *actions,
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
-- 
2.1.4
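
For reviewers, here is a minimal sketch of how the end of an action list is meant to be detected under this RFC (stop at max or at the first END action, whichever comes first). The types below are trimmed stand-ins with made-up `sketch_` names, not the definitions from the patch, and the helper is not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-ins for the RFC types; all names are hypothetical. */
enum sketch_action_type {
	SKETCH_ACTION_END,   /* terminates the list */
	SKETCH_ACTION_QUEUE,
	SKETCH_ACTION_DROP,
};

struct sketch_action {
	enum sketch_action_type type; /* action type */
	const void *conf;             /* action configuration, may be NULL */
};

/*
 * Count the entries that precede the terminator: stop when max entries
 * have been visited or when an END action is found, whichever comes first.
 */
static uint32_t
sketch_action_count(const struct sketch_action *action, uint32_t max)
{
	uint32_t n = 0;

	assert(action != NULL || max == 0);
	while (n < max && action[n].type != SKETCH_ACTION_END)
		++n;
	return n;
}
```

With a three-entry array holding QUEUE, END, DROP and max set to 3, the helper returns 1: the DROP entry after END is never looked at.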

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-07-26 10:07         ` Rahul Lakkireddy
  2016-08-03 16:44           ` Adrien Mazarguil
@ 2016-08-19 21:13           ` John Daley (johndale)
  1 sibling, 0 replies; 262+ messages in thread
From: John Daley (johndale) @ 2016-08-19 21:13 UTC (permalink / raw)
  To: Rahul Lakkireddy, John Fastabend, Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Helin Zhang, Jingjing Wu, Rasesh Mody,
	Ajit Khaparde, Wenzhuo Lu, Jan Medala, Jing Chen,
	Konstantin Ananyev, Matej Vido, Alejandro Lucero, Sony Chacko,
	Jerin Jacob, Pablo de Lara, Olga Shern, Kumar A S,
	Nirranjan Kirubaharan, Indranil Choudhury

Hi, this is an old thread, but I'll reply to this instead of the RFC v2 since there is more context here.
Thanks for pushing the new api forward Adrien.
-john daley

> > >>> - Match range of Physical Functions (PFs) on the NIC in a single rule
> > >>>   via masks. For ex: match all traffic coming on several PFs.
> > >>
> > >> The PF and VF pattern items assume there is a single PF associated
> > >> with a DPDK port. VFs are identified with an ID. I basically took
> > >> the same definitions as the existing filter types, perhaps this is
> > >> not enough for Chelsio adapters.
> > >>
> > >> Do you expose more than one PF for a DPDK port?

The Cisco VIC can support multiple PFs per Ethernet port.  These are called virtual-nics (VNICs). It would be nice to be able to redirect matched Rx packets to another queue on another VNIC.

> > >>
> > >> Anyway, I'd suggest the same approach as above, automatic
> > >> aggregation of rules for performance reasons, otherwise new or
> > >> updated PF/VF pattern items, in which case it would be great if you
> > >> could provide ideal structure definitions for this use case.
> > >>
> > >
> > > In Chelsio hardware, all the ports of a device are exposed via
> > > single PF4. There could be many VFs attached to a PF.  Physical NIC
> > > functions are operational on PF4, while VFs can be attached to PFs 0-3.
> > > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
> > > mapping assumption.
> > >
> > > There already seems to be a PF meta-item, but it doesn't seem to
> > > accept any "spec" and "mask" field.  Similarly, the VF meta-item
> > > doesn't seem to accept a "mask" field.  We could probably enable
> > > these fields in the PF and VF meta-items to allow configuration.
> >
I would like to see an ID property added to the PF action meta-item, where perhaps a BDF can be specified. This would potentially allow matched Rx packets to be redirected to another VNIC and could be paired with the QUEUE action meta-item to redirect to a specific queue on a VNIC. The PF ID property set to 0 would have the currently specified behavior of redirecting to the current PF. Is something like this possible?

* Re: [dpdk-dev] [RFC v2] ethdev: introduce generic flow API
  2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-08-20  7:00     ` Lu, Wenzhuo
  2016-08-22 18:20     ` John Fastabend
  1 sibling, 0 replies; 262+ messages in thread
From: Lu, Wenzhuo @ 2016-08-20  7:00 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi  Adrien,
Thanks for the V2. 
May I ask a question that may be a little out of scope here? As we currently don't store all the flow rules in the driver of Intel NICs, we're trying to fill this gap. Considering we need to order the flow rules by priority, I think it's better to introduce an AVL tree, an RB tree or something like that. We can port the AVL tree code from FreeBSD, but it doesn't make sense to put it in the PMD. As you mentioned you'll provide some common code in the lib, will you provide an AVL tree or something similar in the common code? If you have already done it, we need not waste time doing the same thing again :)
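
To make the ordering concrete, here is a minimal sketch of the comparison that such a tree (or any ordered container) would use, assuming rules are keyed by group first and then by priority, with lower values matched first as described in the RFC. It uses a plain array and qsort() rather than a tree, and `struct sketch_rule` and the field widths are assumptions made for the example, not existing DPDK code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical PMD-side record keeping only the two ordering keys. */
struct sketch_rule {
	uint32_t group;    /* lower value: matched first */
	uint32_t priority; /* lower value: matched first within a group */
};

/* qsort() comparator: order by group, then by priority within a group. */
static int
sketch_rule_cmp(const void *a, const void *b)
{
	const struct sketch_rule *ra = a;
	const struct sketch_rule *rb = b;

	if (ra->group != rb->group)
		return ra->group < rb->group ? -1 : 1;
	if (ra->priority != rb->priority)
		return ra->priority < rb->priority ? -1 : 1;
	return 0;
}

/* Sort rules in place into matching order. */
static void
sketch_rule_sort(struct sketch_rule *rules, size_t n)
{
	assert(rules != NULL || n == 0);
	if (n > 1)
		qsort(rules, n, sizeof(*rules), sketch_rule_cmp);
}
```

The same comparator could serve as the key function of an AVL or RB tree if insertions and removals need to stay cheap; the sorted array is only the simplest way to show the key.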

* Re: [dpdk-dev] [RFC v2] ethdev: introduce generic flow API
  2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
  2016-08-20  7:00     ` Lu, Wenzhuo
@ 2016-08-22 18:20     ` John Fastabend
  1 sibling, 0 replies; 262+ messages in thread
From: John Fastabend @ 2016-08-22 18:20 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

On 16-08-19 12:32 PM, Adrien Mazarguil wrote:
> This new API supersedes all the legacy filter types described in
> rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> PMDs to process and validate flow rules.
> 
> It has the following benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> The existing filter types will be deprecated and removed in the near
> future.
> 
> Note that it is not complete yet. This commit only provides the header
> file. The specification is provided separately, see below.
> 
> HTML version:
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.html
> 
> PDF version:
>  https://rawgit.com/6WIND/rte_flow/master/rte_flow.pdf
> 
> Git tree:
>  https://github.com/6WIND/rte_flow
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---

Hi Adrien,

[...]

> +
> +/**
> + * Flow rule attributes.
> + *
> + * Priorities are set on two levels: per group and per rule within groups.
> + *
> + * Lower values denote higher priority, the highest priority for both levels
> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> + * rule with priority 8 in group 0.
> + *
> + * Although optional, applications are encouraged to group similar rules as
> + * much as possible to fully take advantage of hardware capabilities
> + * (e.g. optimized matching) and work around limitations (e.g. a single
> + * pattern type possibly allowed in a given group).
> + *
> + * Group and priority levels are arbitrary and up to the application, they
> + * do not need to be contiguous nor start from 0, however the maximum number
> + * varies between devices and may be affected by existing flow rules.
> + *

Another pattern I just want to note, which I think can be covered, is
mapping rules between groups.

The idea is that if we build a "tunnel-endpoint" group, a rule in that
group can map traffic onto a "switch" group. In this case the "switch"
group match should depend on a rule in the "tunnel-endpoint" group,
meaning the TEP selects the switch. I believe this can be done with a
metadata action.

My idea is to create a rule in "tunnel-endpoint" group that has a
match based on TEP address and then an action "send-to-switch group" +
"metadata set 0x1234". Then in the "switch group" add a match "metadata
eq 0x1234" this allows linking groups together.
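
As a rough sketch of that linkage, assuming a per-packet metadata value that survives the jump between groups: the first function plays the "tunnel-endpoint" rule (match on TEP address, set metadata, send to the switch group) and the second plays the "switch" rule (match on metadata). Every name here is hypothetical; nothing like this exists in the proposed rte_flow.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet context shared by both groups. */
struct sketch_pkt {
	uint32_t tep_addr; /* outer tunnel endpoint address */
	uint32_t metadata; /* value carried from one group to the next */
};

/* "tunnel-endpoint" group rule: match TEP address, then "metadata set". */
static bool
sketch_tep_group(struct sketch_pkt *pkt, uint32_t tep_addr, uint32_t set_tag)
{
	if (pkt->tep_addr != tep_addr)
		return false; /* no match: packet is not sent to the switch group */
	pkt->metadata = set_tag;
	return true; /* "send-to-switch group" */
}

/* "switch" group rule: match "metadata eq match_tag". */
static bool
sketch_switch_group(const struct sketch_pkt *pkt, uint32_t match_tag)
{
	return pkt->metadata == match_tag;
}

/* Chain both groups; true only if the TEP rule selected this switch rule. */
static bool
sketch_classify(struct sketch_pkt *pkt, uint32_t tep_addr,
		uint32_t set_tag, uint32_t match_tag)
{
	assert(pkt != NULL);
	return sketch_tep_group(pkt, tep_addr, set_tag) &&
	       sketch_switch_group(pkt, match_tag);
}
```

The switch rule never looks at the TEP address itself, only at the tag, which is what lets several tunnel endpoints share or select switch groups independently.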

It certainly doesn't all need to be in the first iteration of this
series, but do you think this is reasonable as a TODO/future extension?
And if we standardize around group-ids the semantics should be
consistent for at least the set of NICs that support tunnel endpoint and
multiple switches.

Any thoughts?


.John

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
  2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-08-22 18:30   ` John Fastabend
  2016-09-29 17:10   ` Adrien Mazarguil
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
  3 siblings, 0 replies; 262+ messages in thread
From: John Fastabend @ 2016-08-22 18:30 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

On 16-08-19 12:32 PM, Adrien Mazarguil wrote:
> Hi All,
> 
> Thanks to many for the positive and constructive feedback I've received so
> far. Here is the updated specification (v0.7) at last.
> 
> I've attempted to address as many comments as possible but could not
> process them all just yet. A new section "Future evolutions" has been
> added for the remaining topics.
> 
> This series adds rte_flow.h to the DPDK tree. Next time I will attempt to
> convert the specification as a documentation commit part of the patchset
> and actually implement API functions.
> 
> I think including the entire document here makes it easier to annotate on
> the ML, apologies in advance for the resulting traffic.
> 
> Finally I'm off for the next two weeks, do not expect replies from me in
> the meantime.
> 

Hopefully on vacation :)

[...]


> .. raw:: pdf
> 
>    PageBreak
> 
> +-------------------------------------------+
> | UDP payload matching                      |
> +===+=======================================+
> | 0 | Ethernet                              |
> +---+---------------------------------------+
> | 1 | IPv4                                  |
> +---+---------------------------------------+
> | 2 | UDP                                   |
> +---+-----+----------+--------------+-------+
> | 3 | RAW | ``spec`` | ``relative`` | 1     |
> |   |     |          +--------------+-------+
> |   |     |          | ``search``   | 1     |
> |   |     |          +--------------+-------+
> |   |     |          | ``offset``   | 10    |
> |   |     |          +--------------+-------+
> |   |     |          | ``limit``    | 0     |
> |   |     |          +--------------+-------+
> |   |     |          | ``length``   | 3     |
> |   |     |          +--------------+-------+
> |   |     |          | ``pattern``  | "foo" |
> +---+-----+----------+--------------+-------+
> | 4 | RAW | ``spec`` | ``relative`` | 1     |
> |   |     |          +--------------+-------+
> |   |     |          | ``search``   | 0     |
> |   |     |          +--------------+-------+
> |   |     |          | ``offset``   | 20    |
> |   |     |          +--------------+-------+
> |   |     |          | ``limit``    | 0     |
> |   |     |          +--------------+-------+
> |   |     |          | ``length``   | 3     |
> |   |     |          +--------------+-------+
> |   |     |          | ``pattern``  | "bar" |
> +---+-----+----------+--------------+-------+
> | 5 | RAW | ``spec`` | ``relative`` | 1     |
> |   |     |          +--------------+-------+
> |   |     |          | ``search``   | 0     |
> |   |     |          +--------------+-------+
> |   |     |          | ``offset``   | -29   |
> |   |     |          +--------------+-------+
> |   |     |          | ``limit``    | 0     |
> |   |     |          +--------------+-------+
> |   |     |          | ``length``   | 3     |
> |   |     |          +--------------+-------+
> |   |     |          | ``pattern``  | "baz" |
> +---+-----+----------+--------------+-------+
> 

Just an observation: if you made 'offset' specified as an embedded RAW
field so that the offset could point at a header length, this would be
fully generic. Although I guess it's not practical; as far as I know no
hardware would support the most general case.

> This translates to:
> 
> - Locate "foo" at least 10 bytes deep inside UDP payload.
> - Locate "bar" after "foo" plus 20 bytes.
> - Locate "baz" after "bar" minus 29 bytes.
> 
> Such a packet may be represented as follows (not to scale)::
> 
>  0                     >= 10 B           == 20 B
>  |                  |<--------->|     |<--------->|
>  |                  |           |     |           |
>  |-----|------|-----|-----|-----|-----|-----------|-----|------|
>  | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
>  |-----|------|-----|-----|-----|-----|-----------|-----|------|
>                           |                             |
>                           |<--------------------------->|
>                                       == 29 B
> 
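
To check my reading of the table, here is a small software sketch of how the three RAW items chain, under simplifying assumptions: every item is relative to the end of the previous match (or to the start of the UDP payload for the first one), `limit` is ignored, and a searching item takes the first occurrence without backtracking. The `sketch_` names are made up for this example and are not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a RAW item spec (relative = 1, limit = 0). */
struct sketch_raw {
	int search;          /* 1: scan forward from offset, 0: match at offset only */
	int32_t offset;      /* relative to the end of the previous match */
	const char *pattern; /* NUL-terminated here; length is strlen(pattern) */
};

/* The three RAW items from the table above. */
static const struct sketch_raw sketch_udp_rules[3] = {
	{ 1, 10, "foo" },
	{ 0, 20, "bar" },
	{ 0, -29, "baz" },
};

/* Return the offset just past the match, or -1 if the item does not match. */
static long
sketch_raw_match(const uint8_t *buf, size_t len, long base,
		 const struct sketch_raw *raw)
{
	size_t plen = strlen(raw->pattern);
	long pos = base + raw->offset;

	if (pos < 0)
		return -1;
	for (; (size_t)pos + plen <= len; ++pos) {
		if (memcmp(buf + pos, raw->pattern, plen) == 0)
			return pos + (long)plen;
		if (!raw->search)
			break; /* exact offset requested, no scanning */
	}
	return -1;
}

/* Apply all items in order against a UDP payload; 1 if every item matches. */
static int
sketch_raw_chain(const uint8_t *payload, size_t len,
		 const struct sketch_raw *raw, size_t n)
{
	long base = 0; /* start of the UDP payload */
	size_t i;

	assert(payload != NULL && raw != NULL);
	for (i = 0; i < n; ++i) {
		base = sketch_raw_match(payload, len, base, &raw[i]);
		if (base < 0)
			return 0;
	}
	return 1;
}
```

With this reading, "baz" ends up immediately before "foo" (23 bytes from the end of "foo" to the end of "bar", minus 29, lands 3 bytes before "foo" starts), which agrees with the diagram.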

[...]

> 
> Future evolutions
> =================
> 
> - Describing dedicated testpmd commands to control and validate this API.
> 
> - A method to optimize generic flow rules with specific pattern items and
>   action types generated on the fly by PMDs. DPDK will assign negative
>   numbers to these in order to not collide with the existing types. See
>   `Negative types`_.

Great thanks. As long as we build the core layer to support this then it
looks good to me.

> 
> - Adding specific egress pattern items and actions as described in `Traffic
>   direction`_.
> 
> - Optional software fallback when PMDs are unable to handle requested flow
>   rules so applications do not have to implement their own.

This is an interesting block. Would you presumably build this using the
existing support in DPDK or propose something else?

> 
> - Ranges in addition to bit-masks. Ranges are more generic in many ways as
>   they interpret values. For instance only ranges make sense to cover
>   several TCP or UDP ports. These will probably be defined on a pattern item
>   basis.
> 

Yep not needed at first but hardware does support this.




Thanks for doing this work, I'll look it over in a bit more detail over
the next few days but it looks like a reasonable base implementation to
me.

.John

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
  2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
  2016-08-22 18:30   ` [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API John Fastabend
@ 2016-09-29 17:10   ` Adrien Mazarguil
  2016-10-31  7:19     ` Zhang, Helin
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
  3 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-09-29 17:10 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon

On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> Hi All,
> 
> Thanks to many for the positive and constructive feedback I've received so
> far. Here is the updated specification (v0.7) at last.
> 
> I've attempted to address as many comments as possible but could not
> process them all just yet. A new section "Future evolutions" has been
> added for the remaining topics.
> 
> This series adds rte_flow.h to the DPDK tree. Next time I will attempt to
> convert the specification as a documentation commit part of the patchset
> and actually implement API functions.
[...]

A quick update, we initially targeted 16.11 as the DPDK release this API
would be available for, turns out this goal was somewhat too optimistic as
September is ending and we are about to overshoot the deadline for
integration (basically everything took longer than expected, big surprise).

So instead of rushing things now to include a botched API in 16.11 with no
PMD support, we simply modified the target, now set to 17.02. On the plus
side this should leave developers more time to refine and test the API
before applications and PMDs start to use it.

I intend to send the patchset for the first non-draft version mid-October
worst case (ASAP in fact). I still haven't replied to several comments but
did take them into account, thanks for your feedback.

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-09-29 17:10   ` Adrien Mazarguil
@ 2016-10-31  7:19     ` Zhang, Helin
  2016-11-02 11:13       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhang, Helin @ 2016-10-31  7:19 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi Adrien

Just to double check, do you have any update on the v1 patch set, as it is now the end of October?
We are extremely eager to see the v1 patch set for development.
I don't think we need full validation of the v1 patch set for the API. It should come together with the PMD and an example application.
If we can see the v1 API patch set earlier, we can help validate it with our code changes. That should be more efficient and helpful.
Any comments on my personal understanding?

Thank you very much for the hard work and kind help!

Regards,
Helin

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Friday, September 30, 2016 1:11 AM
> To: dev@dpdk.org
> Cc: Thomas Monjalon
> Subject: Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification
> API
> 
> On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> > Hi All,
> >
> > Thanks to many for the positive and constructive feedback I've
> > received so far. Here is the updated specification (v0.7) at last.
> >
> > I've attempted to address as many comments as possible but could not
> > process them all just yet. A new section "Future evolutions" has been
> > added for the remaining topics.
> >
> > This series adds rte_flow.h to the DPDK tree. Next time I will attempt
> > to convert the specification as a documentation commit part of the
> > patchset and actually implement API functions.
> [...]
> 
> A quick update, we initially targeted 16.11 as the DPDK release this API would
> be available for, turns out this goal was somewhat too optimistic as
> September is ending and we are about to overshoot the deadline for
> integration (basically everything took longer than expected, big surprise).
> 
> So instead of rushing things now to include a botched API in 16.11 with no
> PMD support, we simply modified the target, now set to 17.02. On the plus
> side this should leave developers more time to refine and test the API before
> applications and PMDs start to use it.
> 
> I intend to send the patchset for the first non-draft version mid-October
> worst case (ASAP in fact). I still haven't replied to several comments but did
> take them into account, thanks for your feedback.
> 
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-10-31  7:19     ` Zhang, Helin
@ 2016-11-02 11:13       ` Adrien Mazarguil
  2016-11-08  1:31         ` Zhang, Helin
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-02 11:13 UTC (permalink / raw)
  To: Zhang, Helin; +Cc: dev

Hi Helin,

On Mon, Oct 31, 2016 at 07:19:18AM +0000, Zhang, Helin wrote:
> Hi Adrien
> 
> Just a double check, do you have any update on the v1 patch set, as now it is the end of October?
> We are extremely eager to see the v1 patch set for development.
> I don't think we need full validation on the v1 patch set for API. It should be together with PMD and example application.
> If we can see the v1 API patch set earlier, we can help to validate it with our code changes. That should be more efficient and helpful.
> Any comments on my personal understanding?
> 
> Thank you very much for the hard work and kind helps!

I intend to send it shortly, likely this week. For the record, a large part
of this task was also dedicated to implementing it on the client side (I've
just read Wei's RFC for a client-side application, to which I will reply
separately) in order to validate it from a usability standpoint, which led
me to make a few necessary adjustments to the API.

My next submission will include both the updated API with several changes
discussed on this ML and testpmd code (not a separate application) that uses
it. Just hang on a bit longer!

> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Friday, September 30, 2016 1:11 AM
> > To: dev@dpdk.org
> > Cc: Thomas Monjalon
> > Subject: Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification
> > API
> > 
> > On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> > > Hi All,
> > >
> > > Thanks to many for the positive and constructive feedback I've
> > > received so far. Here is the updated specification (v0.7) at last.
> > >
> > > I've attempted to address as many comments as possible but could not
> > > process them all just yet. A new section "Future evolutions" has been
> > > added for the remaining topics.
> > >
> > > This series adds rte_flow.h to the DPDK tree. Next time I will attempt
> > > to convert the specification as a documentation commit part of the
> > > patchset and actually implement API functions.
> > [...]
> > 
> > A quick update, we initially targeted 16.11 as the DPDK release this API would
> > be available for, turns out this goal was somewhat too optimistic as
> > September is ending and we are about to overshoot the deadline for
> > integration (basically everything took longer than expected, big surprise).
> > 
> > So instead of rushing things now to include a botched API in 16.11 with no
> > PMD support, we simply modified the target, now set to 17.02. On the plus
> > side this should leave developers more time to refine and test the API before
> > applications and PMDs start to use it.
> > 
> > I intend to send the patchset for the first non-draft version mid-October
> > worst case (ASAP in fact). I still haven't replied to several comments but did
> > take them into account, thanks for your feedback.
> > 
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-11-02 11:13       ` Adrien Mazarguil
@ 2016-11-08  1:31         ` Zhang, Helin
  2016-11-09 11:07           ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhang, Helin @ 2016-11-08  1:31 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Thomas Monjalon, Lu, Wenzhuo, Zhao1, Wei

Hi Adrien

Any update on the v1 APIs? We are struggling with that, as we need it for our development.
May I bring up another idea to remove the blocker?
Can we send out the APIs with PMD changes based on our understanding of the RFC we discussed recently in the community? Then you can just apply any modifications on top of it, or ask the submitters to make changes based on your review comments.
Any comments on this idea? If not, then we may go this way. I guess this might be the most efficient way. Thank you very much!

Regards,
Helin

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, November 2, 2016 7:13 PM
> To: Zhang, Helin
> Cc: dev@dpdk.org; Thomas Monjalon; Lu, Wenzhuo
> Subject: Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification
> API
> 
> Hi Helin,
> 
> On Mon, Oct 31, 2016 at 07:19:18AM +0000, Zhang, Helin wrote:
> > Hi Adrien
> >
> > Just a double check, do you have any update on the v1 patch set, as now it
> is the end of October?
> > We are extremely eager to see the v1 patch set for development.
> > I don't think we need full validation on the v1 patch set for API. It should be
> together with PMD and example application.
> > If we can see the v1 API patch set earlier, we can help to validate it with
> our code changes. That should be more efficient and helpful.
> > Any comments on my personal understanding?
> >
> > Thank you very much for the hard work and kind helps!
> 
> I intend to send it shortly, likely this week. For the record, a large part of this
> task was also dedicated to implement it on the client side (I've just read Wei's
> RFC for a client-side application to which I will reply separately), in order to
> validate it from a usability standpoint that led me to make a few necessary
> adjustments to the API.
> 
> My next submission will include both the updated API with several changes
> discussed on this ML and testpmd code (not a separate application) that uses
> it. Just hang on a bit longer!
> 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> > > Mazarguil
> > > Sent: Friday, September 30, 2016 1:11 AM
> > > To: dev@dpdk.org
> > > Cc: Thomas Monjalon
> > > Subject: Re: [dpdk-dev] [RFC v2] Generic flow
> > > director/filtering/classification API
> > >
> > > On Fri, Aug 19, 2016 at 08:50:44PM +0200, Adrien Mazarguil wrote:
> > > > Hi All,
> > > >
> > > > Thanks to many for the positive and constructive feedback I've
> > > > received so far. Here is the updated specification (v0.7) at last.
> > > >
> > > > I've attempted to address as many comments as possible but could
> > > > not process them all just yet. A new section "Future evolutions"
> > > > has been added for the remaining topics.
> > > >
> > > > This series adds rte_flow.h to the DPDK tree. Next time I will
> > > > attempt to convert the specification as a documentation commit
> > > > part of the patchset and actually implement API functions.
> > > [...]
> > >
> > > A quick update, we initially targeted 16.11 as the DPDK release this
> > > API would be available for, turns out this goal was somewhat too
> > > optimistic as September is ending and we are about to overshoot the
> > > deadline for integration (basically everything took longer than expected,
> big surprise).
> > >
> > > So instead of rushing things now to include a botched API in 16.11
> > > with no PMD support, we simply modified the target, now set to
> > > 17.02. On the plus side this should leave developers more time to
> > > refine and test the API before applications and PMDs start to use it.
> > >
> > > I intend to send the patchset for the first non-draft version
> > > mid-October worst case (ASAP in fact). I still haven't replied to
> > > several comments but did take them into account, thanks for your
> feedback.
> > >
> > > --
> > > Adrien Mazarguil
> > > 6WIND
> 
> --
> Adrien Mazarguil
> 6WIND

* Re: [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API
  2016-11-08  1:31         ` Zhang, Helin
@ 2016-11-09 11:07           ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-09 11:07 UTC (permalink / raw)
  To: Zhang, Helin, Zhao1, Wei, Ajit Khaparde, Alejandro Lucero,
	Evgeny Schemeilin, Jing Chen, Jingjing Wu, Konstantin Ananyev,
	Maciej Czekaj, Matej Vido, Nelson Escobar, Rahul Lakkireddy,
	Rasesh Mody, Sony Chacko, Wenzhuo Lu, Yong Wang, Yuanhan Liu,
	Nelio Laranjeiro
  Cc: dev, Thomas Monjalon

Hi Helin and PMD maintainers,

On Tue, Nov 08, 2016 at 01:31:05AM +0000, Zhang, Helin wrote:
> Hi Adrien
> 
> Any update on the v1 APIs? We are struggling on that, as we need that for our development.
> May I bring another idea to remove the blocking?
> Can we send out the APIs with PMD changes based on our understanding of the RFC we discussed recently in the community? Then you can just apply any modifications on top of it, or ask the submitters to make changes based on your review comments?
> Any comments on this idea? If not, then we may go this way. I guess this might be the most efficient way. Thank you very much!

Not wanting to hold back anyone's progress anymore (not that I was doing it
on purpose), here's my work tree with the updated and functional API
(rte_flow branch based on top of v16.11-rc3) while I'm preparing the
patchset for official submission:

 https://github.com/am6/dpdk.org/tree/rte_flow

As a work in progress, this branch is subject to change.

API changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect struct
  rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
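
The inclusive range defined by spec, last and mask can be sketched for a
single 32-bit field as follows. This is only an illustration under stated
assumptions, not code from the branch: field_matches() and
field_matches_example() are hypothetical names, values are taken in host
byte order, a missing mask is assumed to mean "all bits relevant" and a
missing spec is assumed to match anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of rte_flow): check one 32-bit field
 * against optional spec/last/mask pointers.
 *
 * The inclusive range is [spec & mask, last & mask]. Without "last",
 * it collapses to the single value (spec & mask).
 */
static bool
field_matches(uint32_t value, const uint32_t *spec, const uint32_t *last,
	      const uint32_t *mask)
{
	uint32_t m = mask ? *mask : UINT32_MAX; /* assumed default mask */
	uint32_t lo;
	uint32_t hi;

	if (!spec)
		return true; /* assumption: no spec matches anything */
	lo = *spec & m;
	hi = last ? (*last & m) : lo;
	value &= m;
	return value >= lo && value <= hi;
}

/* Usage: addresses 10.0.0.1 through 10.0.0.255, default (full) mask. */
static void
field_matches_example(void)
{
	const uint32_t spec = 0x0a000001;
	const uint32_t last = 0x0a0000ff;

	assert(field_matches(0x0a000080, &spec, &last, NULL));
	assert(!field_matches(0x0a000100, &spec, &last, NULL));
}
```

Real PMDs are expected to translate such ranges into whatever the hardware
supports, or reject them; the comparison above only models the intended
semantics.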

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
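
The relationship between the public wrappers and the PMD callback table can
be modeled with the stand-alone sketch below. It is a simplified stand-in
rather than the contents of rte_flow.c: struct demo_dev, struct
demo_flow_ops and all demo_*() functions are hypothetical, and error
reporting is reduced to negative errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_dev; /* hypothetical stand-in for struct rte_eth_dev */

/* Hypothetical, reduced version of a PMD callback table. */
struct demo_flow_ops {
	int (*flush)(struct demo_dev *dev);
};

struct demo_dev {
	const struct demo_flow_ops *ops; /* NULL: no generic flow support */
	unsigned int nb_rules; /* number of rules currently installed */
};

/* Wrapper: dispatch to the PMD callback or report a negative errno. */
static int
demo_flow_flush(struct demo_dev *dev)
{
	if (!dev)
		return -ENODEV;
	if (!dev->ops || !dev->ops->flush)
		return -ENOTSUP;
	return dev->ops->flush(dev);
}

/* Example PMD callback: forget all rules. */
static int
demo_pmd_flush(struct demo_dev *dev)
{
	dev->nb_rules = 0;
	return 0;
}

static const struct demo_flow_ops demo_pmd_ops = {
	.flush = demo_pmd_flush,
};

/* Usage: one port with a flush callback, one without flow support. */
static void
demo_flow_flush_example(void)
{
	struct demo_dev with_ops = { .ops = &demo_pmd_ops, .nb_rules = 3 };
	struct demo_dev without_ops = { .ops = NULL, .nb_rules = 0 };

	assert(demo_flow_flush(&with_ops) == 0 && with_ops.nb_rules == 0);
	assert(demo_flow_flush(&without_ops) == -ENOTSUP);
}
```

The point of the indirection is that applications only ever call the
wrappers, while each PMD fills in the callbacks it can actually support.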

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.
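
On the endianness points above, storing a 24-bit VXLAN identifier as three
plain bytes can be sketched as follows. The helper names demo_vni_encode()
and demo_vni_decode() are hypothetical, and the most-significant-byte-first
layout is an assumption based on the on-wire VXLAN header format.

```c
#include <assert.h>
#include <stdint.h>

/* Store a host-order 24-bit VNI as three bytes, most significant first. */
static void
demo_vni_encode(uint32_t vni, uint8_t out[3])
{
	assert(vni <= 0xffffff); /* a VNI is only 24 bits wide */
	out[0] = (uint8_t)(vni >> 16);
	out[1] = (uint8_t)(vni >> 8);
	out[2] = (uint8_t)vni;
}

/* Rebuild a host-order VNI from three bytes, most significant first. */
static uint32_t
demo_vni_decode(const uint8_t in[3])
{
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

Because each byte is addressed individually, the result does not depend on
host byte order or on how a compiler lays out bit-fields.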

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
                     ` (2 preceding siblings ...)
  2016-09-29 17:10   ` Adrien Mazarguil
@ 2016-11-16 16:23   ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
                       ` (25 more replies)
  3 siblings, 26 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague: it may be
called "Generic flow API", "Generic flow interface" (possibly shortened as
"GFI") to refer to the name of the new filter type, or "rte_flow" from the
prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect struct
  rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
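
Since patterns and action lists are now plain arrays terminated by END
entries, walking them is a simple loop. The sketch below uses a reduced,
hypothetical enum and structure (demo_item_type, demo_item) instead of the
real definitions from rte_flow.h, and skips VOID items the way PMDs are
expected to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of pattern item types; END is the terminator. */
enum demo_item_type {
	DEMO_ITEM_TYPE_END,
	DEMO_ITEM_TYPE_VOID,
	DEMO_ITEM_TYPE_ETH,
	DEMO_ITEM_TYPE_IPV4,
	DEMO_ITEM_TYPE_UDP,
};

/* Hypothetical, reduced counterpart of struct rte_flow_item. */
struct demo_item {
	enum demo_item_type type;
	const void *spec; /* value to match, may be NULL */
	const void *last; /* upper bound of a range, may be NULL */
	const void *mask; /* relevant bits, may be NULL */
};

/* Count items that take part in matching: stop at END, skip VOID. */
static size_t
demo_pattern_length(const struct demo_item pattern[])
{
	size_t n = 0;

	assert(pattern != NULL);
	for (; pattern->type != DEMO_ITEM_TYPE_END; ++pattern)
		if (pattern->type != DEMO_ITEM_TYPE_VOID)
			++n;
	return n;
}

/* Example pattern: eth / void / ipv4 / udp / end. */
static const struct demo_item demo_pattern[] = {
	{ .type = DEMO_ITEM_TYPE_ETH },
	{ .type = DEMO_ITEM_TYPE_VOID },
	{ .type = DEMO_ITEM_TYPE_IPV4 },
	{ .type = DEMO_ITEM_TYPE_UDP },
	{ .type = DEMO_ITEM_TYPE_END },
};
```

No separate length field is needed, which is what made the former
container structures redundant.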

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (22):
  ethdev: introduce generic flow API
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command

 MAINTAINERS                            |    4 +
 app/test-pmd/Makefile                  |    1 +
 app/test-pmd/cmdline.c                 |   32 +
 app/test-pmd/cmdline_flow.c            | 2581 +++++++++++++++++++++++++++
 app/test-pmd/config.c                  |  484 +++++
 app/test-pmd/csumonly.c                |    1 +
 app/test-pmd/flowgen.c                 |    1 +
 app/test-pmd/icmpecho.c                |    1 +
 app/test-pmd/ieee1588fwd.c             |    1 +
 app/test-pmd/iofwd.c                   |    1 +
 app/test-pmd/macfwd.c                  |    1 +
 app/test-pmd/macswap.c                 |    1 +
 app/test-pmd/parameters.c              |    1 +
 app/test-pmd/rxonly.c                  |    1 +
 app/test-pmd/testpmd.c                 |    6 +
 app/test-pmd/testpmd.h                 |   27 +
 app/test-pmd/txonly.c                  |    1 +
 lib/librte_cmdline/cmdline_parse.c     |   67 +-
 lib/librte_cmdline/cmdline_parse.h     |   21 +
 lib/librte_ether/Makefile              |    3 +
 lib/librte_ether/rte_eth_ctrl.h        |    1 +
 lib/librte_ether/rte_ether_version.map |   10 +
 lib/librte_ether/rte_flow.c            |  159 ++
 lib/librte_ether/rte_flow.h            |  947 ++++++++++
 lib/librte_ether/rte_flow_driver.h     |  177 ++
 25 files changed, 4521 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

-- 
2.1.4


* [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-18  6:36       ` Xing, Beilei
                         ` (2 more replies)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 02/22] cmdline: add support for dynamic tokens Adrien Mazarguil
                       ` (24 subsequent siblings)
  25 siblings, 3 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 MAINTAINERS                            |   4 +
 lib/librte_ether/Makefile              |   3 +
 lib/librte_ether/rte_eth_ctrl.h        |   1 +
 lib/librte_ether/rte_ether_version.map |  10 +
 lib/librte_ether/rte_flow.c            | 159 +++++
 lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h     | 177 ++++++
 7 files changed, 1301 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6bb8f8..3b46630 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
 F: scripts/test-null.sh
 
+Generic flow API
+M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5
 
 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c
 
 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
 	RTE_ETH_FILTER_L2_TUNNEL,
+	RTE_ETH_FILTER_GENERIC,
 	RTE_ETH_FILTER_MAX
 };
 
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..b5d2547 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,13 @@ DPDK_16.11 {
 	rte_eth_dev_pci_remove;
 
 } DPDK_16.07;
+
+DPDK_17.02 {
+	global:
+
+	rte_flow_validate;
+	rte_flow_create;
+	rte_flow_destroy;
+	rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 0000000..064963d
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+/* Get generic flow operations structure from a port. */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops;
+	int code;
+
+	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
+		code = ENODEV;
+	else if (unlikely(!dev->dev_ops->filter_ctrl ||
+			  dev->dev_ops->filter_ctrl(dev,
+						    RTE_ETH_FILTER_GENERIC,
+						    RTE_ETH_FILTER_GET,
+						    &ops) ||
+			  !ops))
+		code = ENOTSUP;
+	else
+		return ops;
+	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(code));
+	return NULL;
+}
+
+/* Check whether a flow rule can be created on a given port. */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error)
+{
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->validate))
+		return ops->validate(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Create a flow rule on a given port. */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return NULL;
+	if (likely(!!ops->create))
+		return ops->create(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return NULL;
+}
+
+/* Destroy a flow rule on a given port. */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->destroy))
+		return ops->destroy(dev, flow, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Destroy all flow rules associated with a port. */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->flush))
+		return ops->flush(dev, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Query an existing flow rule. */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (!ops)
+		return -rte_errno;
+	if (likely(!!ops->query))
+		return ops->query(dev, flow, action, data, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..211f307
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,947 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * Specifying both directions at once is not recommended but may be valid in
+ * some cases, such as incrementing the same counter twice.
+ *
+ * Not specifying any direction is currently an error.
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Items are arranged in a list to form a matching pattern for packets.
+ * They fall in two categories:
+ *
+ * - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, SCTP,
+ *   VXLAN and so on), usually associated with a specification
+ *   structure. These must be stacked in the same order as the protocol
+ *   layers to match, starting from L2.
+ *
+ * - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF, PORT
+ *   and so on), often without a specification structure. Since they are
+ *   meta data that does not match packet contents, these can be specified
+ *   anywhere within item lists without affecting the protocol matching
+ *   items.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer, a single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an 802.1Q/ad VLAN tag.
+	 *
+	 * See struct rte_flow_item_vlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VLAN,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches a SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer, a single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A maximum value of 0 requests matching any number of protocol layers
+ * above or equal to the minimum value, a maximum value lower than the
+ * minimum one is otherwise invalid.
+ *
+ * This type does not work with a range (struct rte_flow_item.last).
+ */
+struct rte_flow_item_any {
+	uint16_t min; /**< Minimum number of layers covered. */
+	uint16_t max; /**< Maximum number of layers covered, 0 for infinity. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   specific VFs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * A zeroed mask can be used to match any VF.
+ */
+struct rte_flow_item_vf {
+	uint32_t id; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device, they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ *
+ * A zeroed mask can be used to match any port index.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed, doing so resets the relative
+ * offset for subsequent items.
+ *
+ * This type does not work with a range (struct rte_flow_item.last).
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	unsigned int type; /**< EtherType. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VLAN
+ *
+ * Matches an 802.1Q/ad VLAN tag.
+ *
+ * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
+ * RTE_FLOW_ITEM_TYPE_VLAN.
+ */
+struct rte_flow_item_vlan {
+	uint16_t tpid; /**< Tag protocol identifier. */
+	uint16_t tci; /**< Tag control information. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6.
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP.
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP.
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP.
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP.
+ *
+ * Matches an SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN.
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint8_t flags; /**< Normally 0x08 (I flag). */
+	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
+	uint8_t vni[3]; /**< VXLAN identifier. */
+	uint8_t rsvd1; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack with no effect on the meaning
+ * of the resulting pattern.
+ *
+ * A stack is terminated by an END item.
+ *
+ * The spec field should be a valid pointer to a structure of the related
+ * item type. It may be set to NULL in many cases to use default values.
+ *
+ * Optionally, last can point to a structure of the same type to define an
+ * inclusive range. This is mostly supported by integer and address fields
+ * and may cause errors otherwise. Fields that do not support ranges must be
+ * set to the same value as their spec counterparts.
+ *
+ * By default all fields present in spec are considered relevant.* This
+ * behavior can be altered by providing a mask structure of the same type
+ * with applicable bits set to one. It can also be used to partially filter
+ * out specific fields (e.g. as an alternative means to match ranges of IP
+ * addresses).
+ *
+ * Note this is a simple bit-mask applied before interpreting the contents
+ * of spec and last, which may yield unexpected results if not used
+ * carefully. For example, if for an IPv4 address field, spec provides
+ * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
+ * effective range is 10.1.0.0 to 10.3.255.255.
+ *
+ * * The defaults for data-matching items such as IPv4 when mask is not
+ *   specified actually depend on the underlying implementation since only
+ *   recognized fields can be taken into account.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *last; /**< Defines an inclusive range (spec to last). */
+	const void *mask; /**< Bit-mask applied to spec and last. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non-terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non-terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible). The
+ * defined behavior is for PMDs to only take into account the last action of
+ * a given type found in the list. PMDs still perform error checking on the
+ * entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32 bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE, however when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32 bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32 bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t index; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE, however when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t index; /**< Queue index to duplicate packets to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t queues; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queues indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t id; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * A list of actions is terminated by an END action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs. The
+ * message points to a constant string which does not need to be freed by
+ * the application; however, its pointer can be considered valid only as
+ * long as its associated DPDK port remains configured. Closing the
+ * underlying device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameter affecting flow rules in any way is
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
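
The bit-mask note in the struct rte_flow_item documentation (spec 10.1.2.3,
last 10.3.4.5, mask 255.255.0.0 yielding 10.1.0.0 to 10.3.255.255) can be
checked with a few lines of standalone C. This is only a sketch of the
arithmetic; ipv4(), struct ipv4_range and effective_range() are hypothetical
names that are not part of the patch, and addresses are kept in host order
for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four dotted-quad bytes. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Hypothetical helper type: inclusive range of matched addresses. */
struct ipv4_range {
	uint32_t first;
	uint32_t last;
};

/*
 * The mask is applied to both spec and last before they are interpreted,
 * so masked-out bits become "don't care" bits.
 */
static struct ipv4_range
effective_range(uint32_t spec, uint32_t last, uint32_t mask)
{
	struct ipv4_range r;

	assert((spec & mask) <= (last & mask));
	/* Lowest matching address: masked-out bits all cleared. */
	r.first = spec & mask;
	/* Highest matching address: masked-out bits all set. */
	r.last = (last & mask) | ~mask;
	return r;
}
```

Calling effective_range(ipv4(10, 1, 2, 3), ipv4(10, 3, 4, 5),
ipv4(255, 255, 0, 0)) gives the range quoted in the comment, which is wider
than the spec/last pair alone suggests.
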
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
new file mode 100644
index 0000000..a88c621
--- /dev/null
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -0,0 +1,177 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_DRIVER_H_
+#define RTE_FLOW_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic flow API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs; they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic flow operations structure implemented and returned by PMDs.
+ *
+ * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
+ * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
+ * as the RTE_ETH_FILTER_GET filter operation.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific
+ * struct rte_flow_ops written to the argument address as described below:
+ *
+ *  // PMD filter_ctrl callback
+ *
+ *  static const struct rte_flow_ops pmd_flow_ops = { ... };
+ *
+ *  switch (filter_type) {
+ *  case RTE_ETH_FILTER_GENERIC:
+ *      if (filter_op != RTE_ETH_FILTER_GET)
+ *          return -EINVAL;
+ *      *(const void **)arg = &pmd_flow_ops;
+ *      return 0;
+ *  }
+ *
+ * See also rte_flow_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_flow.h.
+ *
+ * Public-facing wrapper functions perform a few consistency checks so that
+ * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
+ * callbacks otherwise only differ by their first argument (with port ID
+ * already resolved to a pointer to struct rte_eth_dev).
+ */
+struct rte_flow_ops {
+	/** See rte_flow_validate(). */
+	int (*validate)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_create(). */
+	struct rte_flow *(*create)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_destroy(). */
+	int (*destroy)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 struct rte_flow_error *);
+	/** See rte_flow_flush(). */
+	int (*flush)
+		(struct rte_eth_dev *,
+		 struct rte_flow_error *);
+	/** See rte_flow_query(). */
+	int (*query)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 enum rte_flow_action_type,
+		 void *,
+		 struct rte_flow_error *);
+};
+
+/**
+ * Initialize generic flow error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param[out] error
+ *   Pointer to flow error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error types.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Pointer to flow error structure.
+ */
+static inline struct rte_flow_error *
+rte_flow_error_set(struct rte_flow_error *error,
+		   int code,
+		   enum rte_flow_error_type type,
+		   void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_flow_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return error;
+}
+
+/**
+ * Get generic flow operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_DRIVER_H_ */
-- 
2.1.4
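
For readers skimming the driver-side header, the wrapper behavior described
for struct rte_flow_ops (unimplemented, i.e. NULL, callbacks return -ENOTSUP)
might look roughly like the sketch below. It does not reproduce the actual
rte_flow.c from this series: the sketch_* types and functions are stand-ins
invented for illustration, and mapping a missing ops structure to -ENOSYS is
an assumption based on the error list documented for rte_flow_validate().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct rte_eth_dev; only what the sketch needs. */
struct sketch_dev {
	int flushed; /* set by the example PMD callback */
};

/* Stand-in for struct rte_flow_ops, reduced to a single callback. */
struct sketch_flow_ops {
	int (*flush)(struct sketch_dev *dev);
};

/* Example PMD callback: pretend all flow rules were destroyed. */
static int
sketch_pmd_flush(struct sketch_dev *dev)
{
	dev->flushed = 1;
	return 0;
}

/*
 * Public-facing wrapper: no ops structure means the device lacks the
 * functionality (-ENOSYS, assumed), a NULL callback means this particular
 * operation is not implemented (-ENOTSUP).
 */
static int
sketch_flow_flush(struct sketch_dev *dev, const struct sketch_flow_ops *ops)
{
	assert(dev != NULL);
	if (ops == NULL)
		return -ENOSYS;
	if (ops->flush == NULL)
		return -ENOTSUP;
	return ops->flush(dev);
}
```

The point of the design is that a PMD only fills in the callbacks it
supports, and applications get a consistent error code for the rest.
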

* [dpdk-dev] [PATCH 02/22] cmdline: add support for dynamic tokens
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 03/22] cmdline: add alignment constraint Adrien Mazarguil
                       ` (23 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 lib/librte_cmdline/cmdline_parse.c | 60 +++++++++++++++++++++++++++++----
 lib/librte_cmdline/cmdline_parse.h | 21 ++++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)
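
As an illustration of the calling convention described in the commit message,
a dynamic token callback could be shaped as in the sketch below. It is
standalone and does not use librte_cmdline: struct sketch_token stands in for
cmdline_parse_token_hdr_t, SKETCH_DYNAMIC_TOKENS mirrors
CMDLINE_PARSE_DYNAMIC_TOKENS, and the typed prototype is an assumption (the
real f() takes void pointers). The token names are made up.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DYNAMIC_TOKENS 128 /* mirrors CMDLINE_PARSE_DYNAMIC_TOKENS */

/* Stand-in for cmdline_parse_token_hdr_t. */
struct sketch_token {
	const char *name;
};

static struct sketch_token tok_flow = { "flow" };
static struct sketch_token tok_create = { "create" };
static struct sketch_token tok_port = { "port_id" };

/*
 * Dynamic token callback: store the next token at the location given by
 * the first argument, or NULL to end the list.
 */
static void
sketch_dyn_tokens(struct sketch_token **hdr, void *cl,
		  struct sketch_token *(*tokens)[SKETCH_DYNAMIC_TOKENS])
{
	static struct sketch_token *const table[] = {
		&tok_flow, &tok_create, &tok_port,
	};
	/* hdr points into *tokens, so its index is a pointer difference. */
	ptrdiff_t index = hdr - &(*tokens)[0];

	(void)cl; /* always NULL when called to retrieve tokens */
	assert(index >= 0 && index < SKETCH_DYNAMIC_TOKENS);
	if ((size_t)index < sizeof(table) / sizeof(table[0]))
		*hdr = table[index];
	else
		*hdr = NULL;
}

/* Walk the list the way the parser does: fill each empty slot on demand. */
static unsigned int
sketch_count_tokens(void)
{
	struct sketch_token *dyn[SKETCH_DYNAMIC_TOKENS] = { NULL };
	unsigned int n = 0;

	while (n < SKETCH_DYNAMIC_TOKENS - 1) {
		if (dyn[n] == NULL)
			sketch_dyn_tokens(&dyn[n], NULL, &dyn);
		if (dyn[n] == NULL)
			break;
		n++;
	}
	return n;
}
```

A real callback would typically inspect the tokens already stored in the list
to decide what comes next, which is what makes context-dependent grammars
possible.
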

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+	   cmdline_parse_token_hdr_t
+		*(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 	struct cmdline_token_hdr token_hdr;
 
 	token_p = inst->tokens[token_num];
+	if (!token_p && dyn_tokens && inst->f) {
+		if (!(*dyn_tokens)[0])
+			inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+		token_p = (*dyn_tokens)[0];
+	}
 	if (token_p)
 		memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		buf += n;
 
 		token_num ++;
-		token_p = inst->tokens[token_num];
+		if (!inst->tokens[0]) {
+			if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!(*dyn_tokens)[token_num])
+					inst->f(&(*dyn_tokens)[token_num],
+						NULL,
+						dyn_tokens);
+				token_p = (*dyn_tokens)[token_num];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[token_num];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 	}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
 	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		return CMDLINE_PARSE_BAD_ARGS;
 
 	ctx = cl->ctx;
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/*
 	 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 	cmdline_parse_token_hdr_t *token_p;
 	struct cmdline_token_hdr token_hdr;
 	char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	unsigned int partial_tok_len;
 	int comp_len = -1;
 	int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 
 	debug_printf("%s called\n", __func__);
 	memset(&token_hdr, 0, sizeof(token_hdr));
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/* count the number of complete token to parse */
 	for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+			if (nb_token &&
+			    match_inst(inst, buf, nb_token, NULL, 0,
+				       &dyn_tokens))
 				goto next;
 
 			debug_printf("instruction match\n");
-			token_p = inst->tokens[nb_token];
+			if (!inst->tokens[0]) {
+				if (nb_token <
+				    (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+					if (!dyn_tokens[nb_token])
+						inst->f(&dyn_tokens[nb_token],
+							NULL,
+							&dyn_tokens);
+					token_p = dyn_tokens[nb_token];
+				} else
+					token_p = NULL;
+			} else
+				token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -490,10 +525,21 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+		if (nb_token &&
+		    match_inst(inst, buf, nb_token, NULL, 0, &dyn_tokens))
 			goto next2;
 
-		token_p = inst->tokens[nb_token];
+		if (!inst->tokens[0]) {
+			if (nb_token < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!dyn_tokens[nb_token])
+					inst->f(&dyn_tokens[nb_token],
+						NULL,
+						&dyn_tokens);
+				token_p = dyn_tokens[nb_token];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[nb_token];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index 4ac05d6..65b18d4 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -83,6 +83,9 @@ extern "C" {
 /* maximum buffer size for parsed result */
 #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
 
+/* maximum number of dynamic tokens */
+#define CMDLINE_PARSE_DYNAMIC_TOKENS 128
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -130,6 +133,24 @@ struct cmdline;
  * Store a instruction, which is a pointer to a callback function and
  * its parameter that is called when the instruction is parsed, a help
  * string, and a list of token composing this instruction.
+ *
+ * When no tokens are defined (tokens[0] == NULL), they are retrieved
+ * dynamically by calling f() as follows:
+ *
+ *  f((struct cmdline_token_hdr **)&token_hdr,
+ *    NULL,
+ *    (struct cmdline_token_hdr *[])tokens);
+ *
+ * The address of the resulting token is expected at the location pointed to
+ * by the first argument. It can be set to NULL to end the list.
+ *
+ * The cmdline argument (struct cmdline *) is always NULL.
+ *
+ * The last argument points to the NULL-terminated list of dynamic tokens
+ * defined so far. Since token_hdr points to an index of that list, the
+ * current index can be derived as follows:
+ *
+ *  int index = token_hdr - &(*tokens)[0];
  */
 struct cmdline_inst {
 	/* f(parsed_struct, data) */
-- 
2.1.4

* [dpdk-dev] [PATCH 03/22] cmdline: add alignment constraint
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 02/22] cmdline: add support for dynamic tokens Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 04/22] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
                       ` (22 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

This prevents SIGBUS errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 lib/librte_cmdline/cmdline_parse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
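
The effect of wrapping the result buffer in a union can be observed with a
small standalone check. This is a sketch rather than code from the patch:
union sketch_result copies the shape of the union added to cmdline_parse(),
SKETCH_RESULT_BUFSIZE mirrors CMDLINE_PARSE_RESULT_BUFSIZE, and
sketch_is_aligned() is a hypothetical helper. It relies on the C11 _Alignof
operator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RESULT_BUFSIZE 8192 /* mirrors CMDLINE_PARSE_RESULT_BUFSIZE */

/* Same shape as the union introduced in cmdline_parse(). */
union sketch_result {
	char buf[SKETCH_RESULT_BUFSIZE];
	long double align; /* forces the union's alignment onto buf */
};

/* Return nonzero when ptr satisfies the given power-of-two alignment. */
static int
sketch_is_aligned(const void *ptr, size_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

A plain char array only guarantees an alignment of 1, so a callback casting
it to a structure with wider members may fault on strict-alignment
architectures. All union members start at the same address, so buf inherits
the alignment of long double.
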

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	union {
+		char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+		long double align; /* strong alignment constraint for buf */
+	} result;
 	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+		tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 
 	/* call func */
 	if (f) {
-		f(result_buf, cl, data);
+		f(result.buf, cl, data);
 	}
 
 	/* no match */
-- 
2.1.4

* [dpdk-dev] [PATCH 04/22] app/testpmd: implement basic support for rte_flow
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (2 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 03/22] cmdline: add alignment constraint Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 05/22] app/testpmd: add flow command Adrien Mazarguil
                       ` (21 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.
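
The per-port list handling can be pictured with the simplified sketch below.
It is only an approximation under stated assumptions: struct rule, rule_add()
and rule_del() are hypothetical names, and the real code stores the full
attributes/pattern/actions alongside each entry. What it keeps is the ID
scheme, where new rules are pushed at the head with an ID one above the
current head.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a per-port flow rule entry. */
struct rule {
	struct rule *next; /* next rule in list */
	uint32_t id; /* rule ID, highest ID is always at the head */
};

/* Push a new rule at the head; return 0 on success, -1 on failure. */
static int
rule_add(struct rule **list, uint32_t *id)
{
	struct rule *r;
	uint32_t new_id = 0;

	assert(list != NULL);
	if (*list) {
		if ((*list)->id == UINT32_MAX)
			return -1; /* highest ID already assigned */
		new_id = (*list)->id + 1;
	}
	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;
	r->id = new_id;
	r->next = *list;
	*list = r;
	if (id)
		*id = new_id;
	return 0;
}

/* Unlink and free the rule with the given ID; return 0 if found. */
static int
rule_del(struct rule **list, uint32_t id)
{
	struct rule **tmp;

	assert(list != NULL);
	for (tmp = list; *tmp; tmp = &(*tmp)->next) {
		if ((*tmp)->id == id) {
			struct rule *r = *tmp;

			*tmp = r->next;
			free(r);
			return 0;
		}
	}
	return -1;
}
```

Walking the list through a pointer to the "next" field avoids a special case
for removing the head entry. IDs of destroyed rules are not reused unless the
head itself is destroyed.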

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c     |   1 +
 app/test-pmd/config.c      | 484 ++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/csumonly.c    |   1 +
 app/test-pmd/flowgen.c     |   1 +
 app/test-pmd/icmpecho.c    |   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c       |   1 +
 app/test-pmd/macfwd.c      |   1 +
 app/test-pmd/macswap.c     |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c      |   1 +
 app/test-pmd/testpmd.c     |   6 +
 app/test-pmd/testpmd.h     |  27 +++
 app/test-pmd/txonly.c      |   1 +
 14 files changed, 528 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 63b55dc..c5b015c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include <rte_string_fns.h>
 #include <rte_devargs.h>
 #include <rte_eth_ctrl.h>
+#include <rte_flow.h>
 
 #include <cmdline_rdline.h>
 #include <cmdline_parse.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 36c47ab..c9dc872 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
 #include <rte_cycles.h>
+#include <rte_flow.h>
+#include <rte_errno.h>
 
 #include "testpmd.h"
 
@@ -750,6 +752,488 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
 	printf("Set MTU failed. diag=%d\n", diag);
 }
 
+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow pattern items. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_item[] = {
+	MK_FLOW_ITEM(END, 0),
+	MK_FLOW_ITEM(VOID, 0),
+	MK_FLOW_ITEM(INVERT, 0),
+	MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+	MK_FLOW_ITEM(PF, 0),
+	MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+	MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+	MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+	MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+	MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+	MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+	MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+	MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+	MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+	MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+	MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+	MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+		    size_t *size, size_t *pad)
+{
+	if (!item->spec)
+		goto empty;
+	switch (item->type) {
+		union {
+			const struct rte_flow_item_raw *raw;
+		} spec;
+
+	case RTE_FLOW_ITEM_TYPE_RAW:
+		spec.raw = item->spec;
+		*size = offsetof(struct rte_flow_item_raw, pattern) +
+			spec.raw->length * sizeof(*spec.raw->pattern);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow actions. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_action[] = {
+	MK_FLOW_ACTION(END, 0),
+	MK_FLOW_ACTION(VOID, 0),
+	MK_FLOW_ACTION(PASSTHRU, 0),
+	MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+	MK_FLOW_ACTION(FLAG, 0),
+	MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+	MK_FLOW_ACTION(DROP, 0),
+	MK_FLOW_ACTION(COUNT, 0),
+	MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+	MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+	MK_FLOW_ACTION(PF, 0),
+	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+		      size_t *size, size_t *pad)
+{
+	if (!action->conf)
+		goto empty;
+	switch (action->type) {
+		union {
+			const struct rte_flow_action_rss *rss;
+		} conf;
+
+	case RTE_FLOW_ACTION_TYPE_RSS:
+		conf.rss = action->conf;
+		*size = offsetof(struct rte_flow_action_rss, queue) +
+			conf.rss->queues * sizeof(*conf.rss->queue);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate a port_flow entry from attributes/pattern/actions. */
+static struct port_flow *
+port_flow_new(const struct rte_flow_attr *attr,
+	      const struct rte_flow_item *pattern,
+	      const struct rte_flow_action *actions)
+{
+	const struct rte_flow_item *item;
+	const struct rte_flow_action *action;
+	struct port_flow *pf = NULL;
+	size_t tmp;
+	size_t pad;
+	size_t off1 = 0;
+	size_t off2 = 0;
+	int err = ENOTSUP;
+
+store:
+	item = pattern;
+	if (pf)
+		pf->pattern = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_item *dst = NULL;
+
+		if ((unsigned int)item->type >= RTE_DIM(flow_item) ||
+		    !flow_item[item->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, item, sizeof(*item));
+		off1 += sizeof(*item);
+		flow_item_spec_size(item, &tmp, &pad);
+		if (item->spec) {
+			if (pf)
+				dst->spec = memcpy(pf->data + off2,
+						   item->spec, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->last) {
+			if (pf)
+				dst->last = memcpy(pf->data + off2,
+						   item->last, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->mask) {
+			if (pf)
+				dst->mask = memcpy(pf->data + off2,
+						   item->mask, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((item++)->type != RTE_FLOW_ITEM_TYPE_END);
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	action = actions;
+	if (pf)
+		pf->actions = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_action *dst = NULL;
+
+		if ((unsigned int)action->type >= RTE_DIM(flow_action) ||
+		    !flow_action[action->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, action, sizeof(*action));
+		off1 += sizeof(*action);
+		flow_action_conf_size(action, &tmp, &pad);
+		if (action->conf) {
+			if (pf)
+				dst->conf = memcpy(pf->data + off2,
+						   action->conf, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((action++)->type != RTE_FLOW_ACTION_TYPE_END);
+	if (pf != NULL)
+		return pf;
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	tmp = RTE_ALIGN_CEIL(offsetof(struct port_flow, data), sizeof(double));
+	pf = calloc(1, tmp + off1 + off2);
+	if (pf == NULL)
+		err = errno;
+	else {
+		*pf = (const struct port_flow){
+			.size = tmp + off1 + off2,
+			.attr = *attr,
+		};
+		tmp -= offsetof(struct port_flow, data);
+		off2 = tmp + off1;
+		off1 = tmp;
+		goto store;
+	}
+notsup:
+	rte_errno = err;
+	return NULL;
+}
+
+/** Print a message out of a flow error. */
+static int
+port_flow_complain(struct rte_flow_error *error)
+{
+	static const char *const errstrlist[] = {
+		[RTE_FLOW_ERROR_TYPE_NONE] = "no error",
+		[RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
+		[RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
+		[RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
+		[RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
+		[RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
+		[RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
+		[RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
+	};
+	const char *errstr;
+	char buf[32];
+	int err = rte_errno;
+
+	if ((unsigned int)error->type >= RTE_DIM(errstrlist) ||
+	    !errstrlist[error->type])
+		errstr = "unknown type";
+	else
+		errstr = errstrlist[error->type];
+	printf("Caught error type %d (%s): %s%s\n",
+	       error->type, errstr,
+	       error->cause ? (snprintf(buf, sizeof(buf), "cause: %p, ",
+					error->cause), buf) : "",
+	       error->message ? error->message : "(no stated reason)");
+	return -err;
+}
+
+/** Validate flow rule. */
+int
+port_flow_validate(portid_t port_id,
+		   const struct rte_flow_attr *attr,
+		   const struct rte_flow_item *pattern,
+		   const struct rte_flow_action *actions)
+{
+	struct rte_flow_error error;
+
+	if (rte_flow_validate(port_id, attr, pattern, actions, &error))
+		return port_flow_complain(&error);
+	printf("Flow rule validated\n");
+	return 0;
+}
+
+/** Create flow rule. */
+int
+port_flow_create(portid_t port_id,
+		 const struct rte_flow_attr *attr,
+		 const struct rte_flow_item *pattern,
+		 const struct rte_flow_action *actions)
+{
+	struct rte_flow *flow;
+	struct rte_port *port;
+	struct port_flow *pf;
+	uint32_t id;
+	struct rte_flow_error error;
+
+	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
+	if (!flow)
+		return port_flow_complain(&error);
+	port = &ports[port_id];
+	if (port->flow_list) {
+		if (port->flow_list->id == UINT32_MAX) {
+			printf("Highest rule ID is already assigned, delete"
+			       " it first\n");
+			rte_flow_destroy(port_id, flow, NULL);
+			return -ENOMEM;
+		}
+		id = port->flow_list->id + 1;
+	} else
+		id = 0;
+	pf = port_flow_new(attr, pattern, actions);
+	if (!pf) {
+		int err = rte_errno;
+
+		printf("Cannot allocate flow: %s\n", rte_strerror(err));
+		rte_flow_destroy(port_id, flow, NULL);
+		return -err;
+	}
+	pf->next = port->flow_list;
+	pf->id = id;
+	port->flow_list = pf;
+	printf("Flow rule #%u created\n", pf->id);
+	return 0;
+}
+
+/** Destroy a number of flow rules. */
+int
+port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule)
+{
+	struct rte_port *port;
+	struct port_flow **tmp;
+	uint32_t c = 0;
+	int ret = 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	tmp = &port->flow_list;
+	while (*tmp) {
+		uint32_t i;
+
+		for (i = 0; i != n; ++i) {
+			struct rte_flow_error error;
+			struct port_flow *pf = *tmp;
+
+			if (rule[i] != pf->id)
+				continue;
+			if (rte_flow_destroy(port_id, pf->flow, &error)) {
+				ret = port_flow_complain(&error);
+				continue;
+			}
+			printf("Flow rule #%u destroyed\n", pf->id);
+			*tmp = pf->next;
+			free(pf);
+			break;
+		}
+		if (i == n)
+			tmp = &(*tmp)->next;
+		++c;
+	}
+	return ret;
+}
+
+/** Remove all flow rules. */
+int
+port_flow_flush(portid_t port_id)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	int ret = 0;
+
+	if (rte_flow_flush(port_id, &error)) {
+		ret = port_flow_complain(&error);
+		if (port_id_is_invalid(port_id, DISABLED_WARN) ||
+		    port_id == (portid_t)RTE_PORT_ALL)
+			return ret;
+	}
+	port = &ports[port_id];
+	while (port->flow_list) {
+		struct port_flow *pf = port->flow_list->next;
+
+		free(port->flow_list);
+		port->flow_list = pf;
+	}
+	return ret;
+}
+
+/** Query a flow rule. */
+int
+port_flow_query(portid_t port_id, uint32_t rule,
+		enum rte_flow_action_type action)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	struct port_flow *pf;
+	const char *name;
+	union {
+		struct rte_flow_query_count count;
+	} query;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	for (pf = port->flow_list; pf; pf = pf->next)
+		if (pf->id == rule)
+			break;
+	if (!pf) {
+		printf("Flow rule #%u not found\n", rule);
+		return -ENOENT;
+	}
+	if ((unsigned int)action >= RTE_DIM(flow_action) ||
+	    !flow_action[action].name)
+		name = "unknown";
+	else
+		name = flow_action[action].name;
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		break;
+	default:
+		printf("Cannot query action type %d (%s)\n", action, name);
+		return -ENOTSUP;
+	}
+	memset(&query, 0, sizeof(query));
+	if (rte_flow_query(port_id, pf->flow, action, &query, &error))
+		return port_flow_complain(&error);
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		printf("%s:\n"
+		       " hits_set: %u\n"
+		       " bytes_set: %u\n"
+		       " hits: %" PRIu64 "\n"
+		       " bytes: %" PRIu64 "\n",
+		       name,
+		       query.count.hits_set,
+		       query.count.bytes_set,
+		       query.count.hits,
+		       query.count.bytes);
+		break;
+	default:
+		printf("Cannot display result for action type %d (%s).\n",
+		       action, name);
+		break;
+	}
+	return 0;
+}
+
+/** List flow rules. */
+void
+port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n])
+{
+	struct rte_port *port;
+	struct port_flow *pf;
+	struct port_flow *list = NULL;
+	uint32_t i;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return;
+	port = &ports[port_id];
+	if (!port->flow_list)
+		return;
+	/* Sort flows by group, priority and ID. */
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		struct port_flow **tmp;
+
+		if (n) {
+			/* Filter out unwanted groups. */
+			for (i = 0; i != n; ++i)
+				if (pf->attr.group == group[i])
+					break;
+			if (i == n)
+				continue;
+		}
+		tmp = &list;
+		while (*tmp &&
+		       (pf->attr.group > (*tmp)->attr.group ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority > (*tmp)->attr.priority) ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority == (*tmp)->attr.priority &&
+			 pf->id > (*tmp)->id)))
+			tmp = &(*tmp)->tmp;
+		pf->tmp = *tmp;
+		*tmp = pf;
+	}
+	printf("ID\tGroup\tPrio\tAttr\tRule\n");
+	for (pf = list; pf != NULL; pf = pf->tmp) {
+		const struct rte_flow_item *item = pf->pattern;
+		const struct rte_flow_action *action = pf->actions;
+
+		printf("%" PRIu32 "\t%" PRIu32 "\t%" PRIu32 "\t%c%c\t",
+		       pf->id,
+		       pf->attr.group,
+		       pf->attr.priority,
+		       pf->attr.ingress ? 'i' : '-',
+		       pf->attr.egress ? 'e' : '-');
+		while (item->type != RTE_FLOW_ITEM_TYPE_END) {
+			if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
+				printf("%s ", flow_item[item->type].name);
+			++item;
+		}
+		printf("=>");
+		while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+			if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
+				printf(" %s", flow_action[action->type].name);
+			++action;
+		}
+		printf("\n");
+	}
+}
+
 /*
  * RX/TX ring descriptors display functions.
  */
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..dd67ebf 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_sctp.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index b13ff89..13b4f90 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 6a4e750..f25a8f5 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include <rte_ip.h>
 #include <rte_icmp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
index 0d3b37a..51170ee 100644
--- a/app/test-pmd/ieee1588fwd.c
+++ b/app/test-pmd/ieee1588fwd.c
@@ -34,6 +34,7 @@
 
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 26936b7..15cb4a2 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -64,6 +64,7 @@
 #include <rte_ether.h>
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index 86e01de..d361db1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 36e139f..f996039 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 08e5a76..28db8cd 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index fff815c..cf00576 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -67,6 +67,7 @@
 #include <rte_ip.h>
 #include <rte_udp.h>
 #include <rte_net.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..bfb2f8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
@@ -1545,6 +1546,8 @@ close_port(portid_t pid)
 			continue;
 		}
 
+		if (port->flow_list)
+			port_flow_flush(pi);
 		rte_eth_dev_close(pi);
 
 		if (rte_atomic16_cmpset(&(port->port_status),
@@ -1599,6 +1602,9 @@ detach_port(uint8_t port_id)
 		return;
 	}
 
+	if (ports[port_id].flow_list)
+		port_flow_flush(port_id);
+
 	if (rte_eth_dev_detach(port_id, name))
 		return;
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..22ce2d6 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,19 @@ struct fwd_stream {
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
 
+/** Descriptor for a single flow. */
+struct port_flow {
+	size_t size; /**< Allocated space including data[]. */
+	struct port_flow *next; /**< Next flow in list. */
+	struct port_flow *tmp; /**< Temporary linking. */
+	uint32_t id; /**< Flow rule ID. */
+	struct rte_flow *flow; /**< Opaque flow object returned by PMD. */
+	struct rte_flow_attr attr; /**< Attributes. */
+	struct rte_flow_item *pattern; /**< Pattern. */
+	struct rte_flow_action *actions; /**< Actions. */
+	uint8_t data[]; /**< Storage for pattern/actions. */
+};
+
 /**
  * The data structure associated with each port.
  */
@@ -177,6 +190,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	struct port_flow        *flow_list; /**< Associated flows. */
 };
 
 extern portid_t __rte_unused
@@ -504,6 +518,19 @@ void port_reg_bit_field_set(portid_t port_id, uint32_t reg_off,
 			    uint8_t bit1_pos, uint8_t bit2_pos, uint32_t value);
 void port_reg_display(portid_t port_id, uint32_t reg_off);
 void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value);
+int port_flow_validate(portid_t port_id,
+		       const struct rte_flow_attr *attr,
+		       const struct rte_flow_item *pattern,
+		       const struct rte_flow_action *actions);
+int port_flow_create(portid_t port_id,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item *pattern,
+		     const struct rte_flow_action *actions);
+int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule);
+int port_flow_flush(portid_t port_id);
+int port_flow_query(portid_t port_id, uint32_t rule,
+		    enum rte_flow_action_type action);
+void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group);
 
 void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id);
 void tx_ring_desc_display(portid_t port_id, queueid_t txq_id, uint16_t txd_id);
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 8513a06..e996f35 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
-- 
2.1.4

* [dpdk-dev] [PATCH 05/22] app/testpmd: add flow command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (3 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 04/22] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 06/22] app/testpmd: add rte_flow integer support Adrien Mazarguil
                       ` (20 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Managing generic flow API functions from the command line requires the use
of dynamic tokens for convenience, as flow rules are not fixed and cannot be
defined statically.

This commit adds specific flexible parser code and objects for a new "flow"
command in a separate file.
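
To give an idea of how such a parser can work, here is a much reduced sketch
of the approach: each token names the candidates that may follow it, and the
parser keeps a stack of candidate lists, consuming the top one for every
word. All identifiers (enum tok, struct tok_def, struct ctx, ctx_init(),
parse_word()) are hypothetical and simplified; the real code works on
partial strings, supports callbacks and integrates with the cmdline API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token indices; 0 doubles as the list terminator. */
enum tok { T_ZERO = 0, T_FLOW, T_CREATE, T_DESTROY };

/* Token definition: name to match and candidates that may follow. */
struct tok_def {
	const char *name;
	const enum tok *next; /* 0-terminated list, NULL if none */
};

static const enum tok after_zero[] = { T_FLOW, 0 };
static const enum tok after_flow[] = { T_CREATE, T_DESTROY, 0 };

static const struct tok_def defs[] = {
	[T_ZERO] = { "", after_zero },
	[T_FLOW] = { "flow", after_flow },
	[T_CREATE] = { "create", NULL },
	[T_DESTROY] = { "destroy", NULL },
};

#define STACK_SIZE 8

/* Parser context: stack of candidate lists still to be processed. */
struct ctx {
	const enum tok *next[STACK_SIZE];
	int next_num;
};

/* Start over with the entry point's candidate list on the stack. */
static void
ctx_init(struct ctx *c)
{
	c->next_num = 0;
	c->next[c->next_num++] = defs[T_ZERO].next;
}

/* Match one word against the top list; return its token or -1. */
static int
parse_word(struct ctx *c, const char *word)
{
	const enum tok *list;
	int i;

	assert(c != NULL && word != NULL);
	if (!c->next_num)
		return -1;
	list = c->next[c->next_num - 1];
	for (i = 0; list[i]; ++i) {
		const struct tok_def *d = &defs[list[i]];

		if (strcmp(word, d->name))
			continue;
		/* Consume the current list, then push what follows. */
		--c->next_num;
		if (d->next)
			c->next[c->next_num++] = d->next;
		return list[i];
	}
	return -1;
}
```

Because the grammar lives in a table rather than in static cmdline token
arrays, adding a sub-command only requires a new enum entry and a new
tok_def entry in this sketch.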

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/Makefile       |   1 +
 app/test-pmd/cmdline.c      |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index c5b015c..b7d10b3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9520,6 +9520,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
 	},
 };
 
+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11557,6 +11560,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
 	(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
 	(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+	(cmdline_parse_inst_t *)&cmd_flow,
 	(cmdline_parse_inst_t *)&cmd_mcast_addr,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 0000000..7dbda84
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+	/* Special tokens. */
+	ZERO = 0,
+	END,
+
+	/* Top-level command. */
+	FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+	/** Stack of subsequent token lists to process. */
+	const enum index *next[CTX_STACK_SIZE];
+	enum index curr; /**< Current token index. */
+	enum index prev; /**< Index of the last token seen. */
+	int next_num; /**< Number of entries in next[]. */
+	uint32_t reparse:1; /**< Start over from the beginning. */
+	uint32_t eol:1; /**< EOL has been detected. */
+	uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+	/** Type displayed during completion (defaults to "TOKEN"). */
+	const char *type;
+	/** Help displayed during completion (defaults to token name). */
+	const char *help;
+	/**
+	 * Lists of subsequent tokens to push on the stack. Each call to the
+	 * parser consumes the last entry of that stack.
+	 */
+	const enum index *const *next;
+	/**
+	 * Token-processing callback, returns -1 in case of error, the
+	 * length of the matched string otherwise. If NULL, attempts to
+	 * match the token name.
+	 *
+	 * If buf is not NULL, the result should be stored in it according
+	 * to context. An error is returned if not large enough.
+	 */
+	int (*call)(struct context *ctx, const struct token *token,
+		    const char *str, unsigned int len,
+		    void *buf, unsigned int size);
+	/**
+	 * Callback that provides possible values for this token, used for
+	 * completion. Returns -1 in case of error, the number of possible
+	 * values otherwise. If NULL, the token name is used.
+	 *
+	 * If buf is not NULL, entry index ent is written to buf and the
+	 * full length of the entry is returned (same behavior as
+	 * snprintf()).
+	 */
+	int (*comp)(struct context *ctx, const struct token *token,
+		    unsigned int ent, char *buf, unsigned int size);
+	/** Mandatory token name, no default value. */
+	const char *name;
+};
+
+/** Static initializer for the next field. */
+#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for a NEXT() entry. */
+#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, 0, }
+
+/** Parser output buffer layout expected by cmd_flow_parsed(). */
+struct buffer {
+	enum index command; /**< Flow command. */
+	uint16_t port; /**< Affected port ID. */
+};
+
+static int parse_init(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
+
+/** Token definitions. */
+static const struct token token_list[] = {
+	/* Special tokens. */
+	[ZERO] = {
+		.name = "ZERO",
+		.help = "null entry, abused as the entry point",
+		.next = NEXT(NEXT_ENTRY(FLOW)),
+	},
+	[END] = {
+		.name = "",
+		.type = "RETURN",
+		.help = "command may end here",
+	},
+	/* Top-level command. */
+	[FLOW] = {
+		.name = "flow",
+		.type = "{command} {port_id} [{arg} [...]]",
+		.help = "manage ingress/egress flow rules",
+		.call = parse_init,
+	},
+};
+
+/** Default parsing function for token name matching. */
+static int
+parse_default(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)buf;
+	(void)size;
+	if (strncmp(str, token->name, len))
+		return -1;
+	return len;
+}
+
+/** Parse flow command, initialize output buffer for subsequent tokens. */
+static int
+parse_init(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	/* Initialize buffer. */
+	memset(out, 0x00, sizeof(*out));
+	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	return len;
+}
+
+/** Internal context. */
+static struct context cmd_flow_context;
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow;
+
+/** Initialize context. */
+static void
+cmd_flow_context_init(struct context *ctx)
+{
+	/* A full memset() is not necessary. */
+	ctx->curr = 0;
+	ctx->prev = 0;
+	ctx->next_num = 0;
+	ctx->reparse = 0;
+	ctx->eol = 0;
+	ctx->last = 0;
+}
+
+/** Parse a token (cmdline API). */
+static int
+cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
+	       unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token;
+	const enum index *list;
+	int len;
+	int i;
+
+	(void)hdr;
+	/* Restart as requested. */
+	if (ctx->reparse)
+		cmd_flow_context_init(ctx);
+	token = &token_list[ctx->curr];
+	/* Check argument length. */
+	ctx->eol = 0;
+	ctx->last = 1;
+	for (len = 0; src[len]; ++len)
+		if (src[len] == '#' || isspace(src[len]))
+			break;
+	if (!len)
+		return -1;
+	/* Last argument and EOL detection. */
+	for (i = len; src[i]; ++i)
+		if (src[i] == '#' || src[i] == '\r' || src[i] == '\n')
+			break;
+		else if (!isspace(src[i])) {
+			ctx->last = 0;
+			break;
+		}
+	for (; src[i]; ++i)
+		if (src[i] == '\r' || src[i] == '\n') {
+			ctx->eol = 1;
+			break;
+		}
+	/* Initialize context if necessary. */
+	if (!ctx->next_num) {
+		if (!token->next)
+			return 0;
+		ctx->next[ctx->next_num++] = token->next[0];
+	}
+	/* Process argument through candidates. */
+	ctx->prev = ctx->curr;
+	list = ctx->next[ctx->next_num - 1];
+	for (i = 0; list[i]; ++i) {
+		const struct token *next = &token_list[list[i]];
+		int tmp;
+
+		ctx->curr = list[i];
+		if (next->call)
+			tmp = next->call(ctx, next, src, len, result, size);
+		else
+			tmp = parse_default(ctx, next, src, len, result, size);
+		if (tmp == -1 || tmp != len)
+			continue;
+		token = next;
+		break;
+	}
+	if (!list[i])
+		return -1;
+	--ctx->next_num;
+	/* Push subsequent tokens if any. */
+	if (token->next)
+		for (i = 0; token->next[i]; ++i) {
+			if (ctx->next_num == RTE_DIM(ctx->next))
+				return -1;
+			ctx->next[ctx->next_num++] = token->next[i];
+		}
+	return len;
+}
+
+/** Return number of completion entries (cmdline API). */
+static int
+cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return 0;
+	/*
+	 * If there is a single token, use its completion callback, otherwise
+	 * return the number of entries.
+	 */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, 0, NULL, 0);
+	}
+	return i;
+}
+
+/** Return a completion entry (cmdline API). */
+static int
+cmd_flow_complete_get_elt(cmdline_parse_token_hdr_t *hdr, int index,
+			  char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return -1;
+	/* If there is a single token, use its completion callback. */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, index, dst, size) < 0 ? -1 : 0;
+	}
+	/* Otherwise make sure the index is valid and use defaults. */
+	if (index >= i)
+		return -1;
+	token = &token_list[list[index]];
+	snprintf(dst, size, "%s", token->name);
+	/* Save index for cmd_flow_get_help(). */
+	ctx->prev = list[index];
+	return 0;
+}
+
+/** Populate help strings for current token (cmdline API). */
+static int
+cmd_flow_get_help(cmdline_parse_token_hdr_t *hdr, char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->prev];
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	if (!size)
+		return -1;
+	/* Set token type and update global help with details. */
+	snprintf(dst, size, "%s", (token->type ? token->type : "TOKEN"));
+	if (token->help)
+		cmd_flow.help_str = token->help;
+	else
+		cmd_flow.help_str = token->name;
+	return 0;
+}
+
+/** Token definition template (cmdline API). */
+static struct cmdline_token_hdr cmd_flow_token_hdr = {
+	.ops = &(struct cmdline_token_ops){
+		.parse = cmd_flow_parse,
+		.complete_get_nb = cmd_flow_complete_get_nb,
+		.complete_get_elt = cmd_flow_complete_get_elt,
+		.get_help = cmd_flow_get_help,
+	},
+	.offset = 0,
+};
+
+/** Populate the next dynamic token. */
+static void
+cmd_flow_tok(cmdline_parse_token_hdr_t **hdr,
+	     cmdline_parse_token_hdr_t *(*hdrs)[])
+{
+	struct context *ctx = &cmd_flow_context;
+
+	/* Always reinitialize context before requesting the first token. */
+	if (!(hdr - *hdrs))
+		cmd_flow_context_init(ctx);
+	/* Return NULL when no more tokens are expected. */
+	if (!ctx->next_num && ctx->curr) {
+		*hdr = NULL;
+		return;
+	}
+	/* Determine if command should end here. */
+	if (ctx->eol && ctx->last && ctx->next_num) {
+		const enum index *list = ctx->next[ctx->next_num - 1];
+		int i;
+
+		for (i = 0; list[i]; ++i) {
+			if (list[i] != END)
+				continue;
+			*hdr = NULL;
+			return;
+		}
+	}
+	*hdr = &cmd_flow_token_hdr;
+}
+
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_flow_parsed(const struct buffer *in)
+{
+	switch (in->command) {
+	default:
+		break;
+	}
+}
+
+/** Token generator and output processing callback (cmdline API). */
+static void
+cmd_flow_cb(void *arg0, struct cmdline *cl, void *arg2)
+{
+	if (cl == NULL)
+		cmd_flow_tok(arg0, arg2);
+	else
+		cmd_flow_parsed(arg0);
+}
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow = {
+	.f = cmd_flow_cb,
+	.data = NULL, /**< Unused. */
+	.help_str = NULL, /**< Updated by cmd_flow_get_help(). */
+	.tokens = {
+		NULL,
+	}, /**< Tokens are returned by cmd_flow_tok(). */
+};
-- 
2.1.4


* [dpdk-dev] [PATCH 06/22] app/testpmd: add rte_flow integer support
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (4 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 05/22] app/testpmd: add flow command Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 07/22] app/testpmd: add flow list command Adrien Mazarguil
                       ` (19 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Parse all integer types and handle conversion to network byte order in a
single function.
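
For readers who want to try the idea outside testpmd, here is a minimal
standalone sketch of the same technique: parse a token of known length,
then store it into a field described by offset and size, optionally in
network byte order. The names `struct field_desc` and `store_integer()` are
hypothetical and not part of this patch, and big-endian output is written
byte by byte instead of using the rte_cpu_to_be_*() helpers.

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct arg. */
struct field_desc {
	unsigned int hton:1; /* Store in network byte order. */
	unsigned int sign:1; /* Parse as a signed value. */
	size_t offset; /* Offset of the field inside the object. */
	size_t size; /* Field size in bytes: 1, 2, 4 or 8. */
};

/* Parse str[0..len) and store it; return len on success, -1 otherwise. */
static int
store_integer(void *object, const struct field_desc *fd,
	      const char *str, size_t len)
{
	uint8_t *dst = (uint8_t *)object + fd->offset;
	uintmax_t u;
	char *end;
	size_t i;

	errno = 0;
	u = fd->sign ?
		(uintmax_t)strtoimax(str, &end, 0) :
		strtoumax(str, &end, 0);
	/* The whole token must have been consumed. */
	if (errno || (size_t)(end - str) != len)
		return -1;
	if (fd->size != 1 && fd->size != 2 && fd->size != 4 && fd->size != 8)
		return -1;
	if (fd->hton) {
		/* Most significant byte first, regardless of host order. */
		for (i = 0; i < fd->size; ++i)
			dst[i] = (uint8_t)(u >> (8 * (fd->size - 1 - i)));
		return (int)len;
	}
	switch (fd->size) {
	case 1: { uint8_t v = (uint8_t)u; memcpy(dst, &v, sizeof(v)); break; }
	case 2: { uint16_t v = (uint16_t)u; memcpy(dst, &v, sizeof(v)); break; }
	case 4: { uint32_t v = (uint32_t)u; memcpy(dst, &v, sizeof(v)); break; }
	default: { uint64_t v = (uint64_t)u; memcpy(dst, &v, sizeof(v)); break; }
	}
	return (int)len;
}
```

As in parse_int(), the string does not need to be NUL-terminated at len;
the conversion stops at the first character that is not part of the number
and the consumed length is compared against len.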

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 148 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7dbda84..7078f80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
+#include <inttypes.h>
+#include <errno.h>
 #include <ctype.h>
 #include <string.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
+#include <rte_byteorder.h>
 #include <cmdline_parse.h>
 #include <rte_flow.h>
 
@@ -50,6 +53,10 @@ enum index {
 	ZERO = 0,
 	END,
 
+	/* Common tokens. */
+	INTEGER,
+	UNSIGNED,
+
 	/* Top-level command. */
 	FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
 	/** Stack of subsequent token lists to process. */
 	const enum index *next[CTX_STACK_SIZE];
+	/** Arguments for stacked tokens. */
+	const void *args[CTX_STACK_SIZE];
 	enum index curr; /**< Current token index. */
 	enum index prev; /**< Index of the last token seen. */
 	int next_num; /**< Number of entries in next[]. */
+	int args_num; /**< Number of entries in args[]. */
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+	uint32_t hton:1; /**< Use network byte ordering. */
+	uint32_t sign:1; /**< Value is signed. */
+	uint32_t offset; /**< Relative offset from ctx->object. */
+	uint32_t size; /**< Field size. */
 };
 
 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 	 * parser consumes the last entry of that stack.
 	 */
 	const enum index *const *next;
+	/** Arguments stack for subsequent tokens that need them. */
+	const struct arg *const *args;
 	/**
 	 * Token-processing callback, returns -1 in case of error, the
 	 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, 0, }
 
+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(*((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+		     const char *, unsigned int,
+		     void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
 		.type = "RETURN",
 		.help = "command may end here",
 	},
+	/* Common tokens. */
+	[INTEGER] = {
+		.name = "{int}",
+		.type = "INTEGER",
+		.help = "integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[UNSIGNED] = {
+		.name = "{unsigned}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
 	},
 };
 
+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+	return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+	if (ctx->args_num == CTX_STACK_SIZE)
+		return -1;
+	ctx->args[ctx->args_num++] = arg;
+	return 0;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -178,9 +252,74 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->object = out;
 	return len;
 }
 
+/**
+ * Parse signed/unsigned integers 8 to 64-bit long.
+ *
+ * Last argument (ctx->args) is retrieved to determine integer type and
+ * storage location.
+ */
+static int
+parse_int(struct context *ctx, const struct token *token,
+	  const char *str, unsigned int len,
+	  void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	uintmax_t u;
+	char *end;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = arg->sign ?
+		(uintmax_t)strtoimax(str, &end, 0) :
+		strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	size = arg->size;
+	switch (size) {
+	case sizeof(uint8_t):
+		*(uint8_t *)buf = u;
+		break;
+	case sizeof(uint16_t):
+		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
+		break;
+	case sizeof(uint32_t):
+		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
+		break;
+	case sizeof(uint64_t):
+		*(uint64_t *)buf = arg->hton ? rte_cpu_to_be_64(u) : u;
+		break;
+	default:
+		goto error;
+	}
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/** No completion. */
+static int
+comp_none(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)token;
+	(void)ent;
+	(void)buf;
+	(void)size;
+	return 0;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -195,9 +334,11 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->curr = 0;
 	ctx->prev = 0;
 	ctx->next_num = 0;
+	ctx->args_num = 0;
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->object = NULL;
 }
 
 /** Parse a token (cmdline API). */
@@ -270,6 +411,13 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 				return -1;
 			ctx->next[ctx->next_num++] = token->next[i];
 		}
+	/* Push arguments if any. */
+	if (token->args)
+		for (i = 0; token->args[i]; ++i) {
+			if (ctx->args_num == RTE_DIM(ctx->args))
+				return -1;
+			ctx->args[ctx->args_num++] = token->args[i];
+		}
 	return len;
 }
 
-- 
2.1.4


* [dpdk-dev] [PATCH 07/22] app/testpmd: add flow list command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (5 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 06/22] app/testpmd: add rte_flow integer support Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 08/22] app/testpmd: add flow flush command Adrien Mazarguil
                       ` (18 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.
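
The group identifiers are stored in the free space that follows the fixed
part of the output buffer. The following standalone sketch illustrates that
layout and its bounds check under simplified assumptions; `struct
list_buffer`, `list_buffer_init()` and `list_buffer_add()` are hypothetical
names, and the alignment is computed by hand instead of using
RTE_ALIGN_CEIL().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced version of the parser output buffer. */
struct list_buffer {
	uint16_t port; /* Affected port ID. */
	uint32_t *group; /* Points into the space after this structure. */
	uint32_t group_n; /* Number of stored group IDs. */
};

/* Set up the header and point group[] at the aligned trailing space. */
static struct list_buffer *
list_buffer_init(void *mem, size_t size)
{
	struct list_buffer *out = mem;
	uintptr_t tail;

	if (size < sizeof(*out))
		return NULL;
	memset(out, 0, sizeof(*out));
	/* Round the end of the header up to a multiple of sizeof(double). */
	tail = (uintptr_t)(out + 1);
	tail = (tail + sizeof(double) - 1) & ~(uintptr_t)(sizeof(double) - 1);
	out->group = (uint32_t *)tail;
	return out;
}

/* Append one group ID; fail when it would not fit in the buffer. */
static int
list_buffer_add(struct list_buffer *out, size_t size, uint32_t id)
{
	uintptr_t limit = (uintptr_t)out + size;
	uintptr_t slot_end = (uintptr_t)out->group +
		(out->group_n + 1) * sizeof(*out->group);

	if (slot_end > limit)
		return -1;
	out->group[out->group_n++] = id;
	return 0;
}
```

Keeping the array inside the same buffer means the parsed result can be
handed over as a single block, at the cost of a hard limit on the number of
groups that depends on the buffer size.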

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b7d10b3..09357c0 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
+
+			"flow list {port_id} [group {group_id}] [...]\n"
+			"    List existing flow rules sorted by priority,"
+			" filtered by group identifiers.\n\n"
 		);
 	}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7078f80..727fe78 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PORT_ID,
+	GROUP_ID,
 
 	/* Top-level command. */
 	FLOW,
+
+	/* Sub-level commands. */
+	LIST,
+
+	/* List arguments. */
+	LIST_GROUP,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	uint16_t port; /**< Current port ID (for completions). */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -153,16 +162,36 @@ struct token {
 struct buffer {
 	enum index command; /**< Flow command. */
 	uint16_t port; /**< Affected port ID. */
+	union {
+		struct {
+			uint32_t *group;
+			uint32_t group_n;
+		} list; /**< List arguments. */
+	} args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+	LIST_GROUP,
+	END,
+	0,
 };
 
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PORT_ID] = {
+		.name = "{port_id}",
+		.type = "PORT ID",
+		.help = "port identifier",
+		.call = parse_port,
+		.comp = comp_port,
+	},
+	[GROUP_ID] = {
+		.name = "{group_id}",
+		.type = "GROUP ID",
+		.help = "group identifier",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
+		.next = NEXT(NEXT_ENTRY(LIST)),
 		.call = parse_init,
 	},
+	/* Sub-level commands. */
+	[LIST] = {
+		.name = "list",
+		.help = "list existing flow rules",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_list,
+	},
+	/* List arguments. */
+	[LIST_GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+		.call = parse_list,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -256,6 +316,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for list command. */
+static int
+parse_list(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != LIST)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.list.group =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
+	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.list.group + out->args.list.group_n++;
+	return len;
+}
+
 /**
  * Parse signed/unsigned integers 8 to 64-bit long.
  *
@@ -307,6 +400,29 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/** Parse port and update context. */
+static int
+parse_port(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = &(struct buffer){ .port = 0 };
+	int ret;
+
+	if (buf)
+		out = buf;
+	else {
+		ctx->object = out;
+		size = sizeof(*out);
+	}
+	ret = parse_int(ctx, token, str, len, out, size);
+	if (ret >= 0)
+		ctx->port = out->port;
+	if (!buf)
+		ctx->object = NULL;
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -320,6 +436,26 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete available ports. */
+static int
+comp_port(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	portid_t p;
+
+	(void)ctx;
+	(void)token;
+	FOREACH_PORT(p, ports) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", p);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -338,6 +474,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->port = 0;
 	ctx->object = NULL;
 }
 
@@ -561,6 +698,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case LIST:
+		port_flow_list(in->port, in->args.list.group_n,
+			       in->args.list.group);
+		break;
 	default:
 		break;
 	}
-- 
2.1.4


* [dpdk-dev] [PATCH 08/22] app/testpmd: add flow flush command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (6 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 07/22] app/testpmd: add flow list command Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 09/22] app/testpmd: add flow destroy command Adrien Mazarguil
                       ` (17 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 09357c0..9f124fc 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow flush {port_id}\n"
+			"    Destroy all flow rules.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 727fe78..414bacc 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	FLUSH,
 	LIST,
 
 	/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
-		.next = NEXT(NEXT_ENTRY(LIST)),
+		.next = NEXT(NEXT_ENTRY
+			     (FLUSH,
+			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[FLUSH] = {
+		.name = "flush",
+		.help = "destroy all flow rules",
+		.next = NEXT(NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_flush,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != FLUSH)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+	}
+	return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case FLUSH:
+		port_flow_flush(in->port);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH 09/22] app/testpmd: add flow destroy command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (7 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 08/22] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 10/22] app/testpmd: add flow validate/create commands Adrien Mazarguil
                       ` (16 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.
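
As a rough illustration of what destroying a set of rules by identifier
involves, here is a standalone sketch that removes matching entries from a
singly linked list. It only assumes a list shaped like the one walked by
comp_rule_id() (a next pointer and an id per entry); `struct rule_node`,
`rule_add()` and `rules_destroy()` are hypothetical names and this is not
the implementation of port_flow_destroy().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-port rule list entry. */
struct rule_node {
	struct rule_node *next;
	uint32_t id;
};

/* Insert a rule at the head of the list; return 0 on success. */
static int
rule_add(struct rule_node **head, uint32_t id)
{
	struct rule_node *r = malloc(sizeof(*r));

	if (r == NULL)
		return -1;
	r->id = id;
	r->next = *head;
	*head = r;
	return 0;
}

/* Remove every rule whose ID is listed; return how many were removed. */
static unsigned int
rules_destroy(struct rule_node **head, const uint32_t *ids, uint32_t n)
{
	unsigned int removed = 0;
	uint32_t i;

	for (i = 0; i != n; ++i) {
		struct rule_node **link = head;

		while (*link != NULL) {
			struct rule_node *r = *link;

			if (r->id != ids[i]) {
				link = &r->next;
				continue;
			}
			/* Unlink and release the matching entry. */
			*link = r->next;
			free(r);
			++removed;
			break;
		}
	}
	return removed;
}
```

Unknown identifiers are silently skipped in this sketch; whether the real
command should report them is a separate design choice.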

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9f124fc..20a64b6 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow destroy {port_id} rule {rule_id} [...]\n"
+			"    Destroy specific flow rules.\n\n"
+
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 414bacc..5a8980c 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
 
@@ -63,9 +64,13 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	DESTROY,
 	FLUSH,
 	LIST,
 
+	/* Destroy arguments. */
+	DESTROY_RULE,
+
 	/* List arguments. */
 	LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			uint32_t *rule;
+			uint32_t rule_n;
+		} destroy; /**< Destroy arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
 	} args; /**< Command arguments. */
 };
 
+static const enum index next_destroy_attr[] = {
+	DESTROY_RULE,
+	END,
+	0,
+};
+
 static const enum index next_list_attr[] = {
 	LIST_GROUP,
 	END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[RULE_ID] = {
+		.name = "{rule_id}",
+		.type = "RULE ID",
+		.help = "rule identifier",
+		.call = parse_int,
+		.comp = comp_rule_id,
+	},
 	[PORT_ID] = {
 		.name = "{port_id}",
 		.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (FLUSH,
+			     (DESTROY,
+			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[DESTROY] = {
+		.name = "destroy",
+		.help = "destroy specific flow rules",
+		.next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_destroy,
+	},
 	[FLUSH] = {
 		.name = "flush",
 		.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_list,
 	},
+	/* Destroy arguments. */
+	[DESTROY_RULE] = {
+		.name = "rule",
+		.help = "specify a rule identifier",
+		.next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
+		.call = parse_destroy,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -329,6 +372,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for destroy command. */
+static int
+parse_destroy(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != DESTROY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.destroy.rule =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
+	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	return len;
+}
+
 /** Parse tokens for flush command. */
 static int
 parse_flush(struct context *ctx, const struct token *token,
@@ -494,6 +570,30 @@ comp_port(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete available rule IDs. */
+static int
+comp_rule_id(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	struct rte_port *port;
+	struct port_flow *pf;
+
+	(void)token;
+	if (port_id_is_invalid(ctx->port, DISABLED_WARN) ||
+	    ctx->port == (uint16_t)RTE_PORT_ALL)
+		return -1;
+	port = &ports[ctx->port];
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", pf->id);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -736,6 +836,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case DESTROY:
+		port_flow_destroy(in->port, in->args.destroy.rule_n,
+				  in->args.destroy.rule);
+		break;
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH 10/22] app/testpmd: add flow validate/create commands
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (8 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 09/22] app/testpmd: add flow destroy command Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 11/22] app/testpmd: add flow query command Adrien Mazarguil
                       ` (15 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Syntax:

 flow (validate|create) {port_id}
    [group {group_id}] [priority {level}] [ingress] [egress]
    pattern {item} [/ {item} [...]] / end
    actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates pattern items and actions lists.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, process packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.
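
To make the list syntax above concrete, the following standalone sketch
checks a pattern or actions list made of single-word entries separated by
"/" and terminated by "end", and counts the entries. It is only an
illustration of the grammar and not the testpmd parser; `count_entries()`
is a hypothetical name, and accepting a lone "end" as an empty list is an
assumption.

```c
#include <string.h>

/*
 * Count entries in "{entry} / {entry} / end".
 * Return the number of entries before "end", or -1 when malformed.
 */
static int
count_entries(const char *s)
{
	int count = 0;
	int expect_sep = 0;

	for (;;) {
		size_t n;

		/* Skip blanks, then measure the next word. */
		s += strspn(s, " \t");
		n = strcspn(s, " \t");
		if (!n)
			return -1; /* Input ended before "end". */
		if (expect_sep) {
			if (n != 1 || s[0] != '/')
				return -1;
			expect_sep = 0;
		} else if (n == 3 && !strncmp(s, "end", 3)) {
			/* Nothing may follow the terminator. */
			s += n;
			s += strspn(s, " \t");
			return *s ? -1 : count;
		} else if (n == 1 && s[0] == '/') {
			return -1; /* Separator without an entry. */
		} else {
			++count;
			expect_sep = 1;
		}
		s += n;
	}
}
```

With this reading, "void / invert / end" holds two entries, while
"void invert / end" is rejected because the separator is missing.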

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 20a64b6..851cc16 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow validate {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Check whether a flow rule can be created.\n\n"
+
+			"flow create {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Create a flow rule.\n\n"
+
 			"flow destroy {port_id} rule {rule_id} [...]\n"
 			"    Destroy specific flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5a8980c..1874849 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
+	PRIORITY_LEVEL,
 
 	/* Top-level command. */
 	FLOW,
 
 	/* Sub-level commands. */
+	VALIDATE,
+	CREATE,
 	DESTROY,
 	FLUSH,
 	LIST,
@@ -73,6 +76,26 @@ enum index {
 
 	/* List arguments. */
 	LIST_GROUP,
+
+	/* Validate/create arguments. */
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+
+	/* Validate/create pattern. */
+	PATTERN,
+	ITEM_NEXT,
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+
+	/* Validate/create actions. */
+	ACTIONS,
+	ACTION_NEXT,
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
 	uint16_t port; /**< Current port ID (for completions). */
+	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -109,6 +133,8 @@ struct token {
 	const char *type;
 	/** Help displayed during completion (defaults to token name). */
 	const char *help;
+	/** Private data used by parser functions. */
+	const void *priv;
 	/**
 	 * Lists of subsequent tokens to push on the stack. Each call to the
 	 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			struct rte_flow_attr attr;
+			struct rte_flow_item *pattern;
+			struct rte_flow_action *actions;
+			uint32_t pattern_n;
+			uint32_t actions_n;
+			uint8_t *data;
+		} vc; /**< Validate/create arguments. */
+		struct {
 			uint32_t *rule;
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
 	} args; /**< Command arguments. */
 };
 
+/** Private data for pattern items. */
+struct parse_item_priv {
+	enum rte_flow_item_type type; /**< Item type. */
+	uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+	(&(const struct parse_item_priv){ \
+		.type = RTE_FLOW_ITEM_TYPE_ ## t, \
+		.size = s, \
+	})
+
+/** Private data for actions. */
+struct parse_action_priv {
+	enum rte_flow_action_type type; /**< Action type. */
+	uint32_t size; /**< Size of action configuration structure. */
+};
+
+#define PRIV_ACTION(t, s) \
+	(&(const struct parse_action_priv){ \
+		.type = RTE_FLOW_ACTION_TYPE_ ## t, \
+		.size = s, \
+	})
+
+static const enum index next_vc_attr[] = {
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+	PATTERN,
+	0,
+};
+
 static const enum index next_destroy_attr[] = {
 	DESTROY_RULE,
 	END,
@@ -192,9 +259,26 @@ static const enum index next_list_attr[] = {
 	0,
 };
 
+static const enum index next_item[] = {
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+	0,
+};
+
+static const enum index next_action[] = {
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
+	0,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_vc(struct context *, const struct token *,
+		    const char *, unsigned int,
+		    void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -266,18 +350,41 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PRIORITY_LEVEL] = {
+		.name = "{level}",
+		.type = "PRIORITY",
+		.help = "priority level",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (DESTROY,
+			     (VALIDATE,
+			      CREATE,
+			      DESTROY,
 			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[VALIDATE] = {
+		.name = "validate",
+		.help = "check whether a flow rule can be created",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
+	[CREATE] = {
+		.name = "create",
+		.help = "create a flow rule",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
 	[DESTROY] = {
 		.name = "destroy",
 		.help = "destroy specific flow rules",
@@ -315,6 +422,98 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
 		.call = parse_list,
 	},
+	/* Validate/create attributes. */
+	[GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, group)),
+		.call = parse_vc,
+	},
+	[PRIORITY] = {
+		.name = "priority",
+		.help = "specify a priority level",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, priority)),
+		.call = parse_vc,
+	},
+	[INGRESS] = {
+		.name = "ingress",
+		.help = "affect rule to ingress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	[EGRESS] = {
+		.name = "egress",
+		.help = "affect rule to egress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	/* Validate/create pattern. */
+	[PATTERN] = {
+		.name = "pattern",
+		.help = "submit a list of pattern items",
+		.next = NEXT(next_item),
+		.call = parse_vc,
+	},
+	[ITEM_NEXT] = {
+		.name = "/",
+		.help = "specify next pattern item",
+		.next = NEXT(next_item),
+	},
+	[ITEM_END] = {
+		.name = "end",
+		.help = "end list of pattern items",
+		.priv = PRIV_ITEM(END, 0),
+		.next = NEXT(NEXT_ENTRY(ACTIONS)),
+		.call = parse_vc,
+	},
+	[ITEM_VOID] = {
+		.name = "void",
+		.help = "no-op pattern item",
+		.priv = PRIV_ITEM(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_INVERT] = {
+		.name = "invert",
+		.help = "perform actions when pattern does not match",
+		.priv = PRIV_ITEM(INVERT, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	/* Validate/create actions. */
+	[ACTIONS] = {
+		.name = "actions",
+		.help = "submit a list of associated actions",
+		.next = NEXT(next_action),
+		.call = parse_vc,
+	},
+	[ACTION_NEXT] = {
+		.name = "/",
+		.help = "specify next action",
+		.next = NEXT(next_action),
+	},
+	[ACTION_END] = {
+		.name = "end",
+		.help = "end list of actions",
+		.priv = PRIV_ACTION(END, 0),
+		.call = parse_vc,
+	},
+	[ACTION_VOID] = {
+		.name = "void",
+		.help = "no-op action",
+		.priv = PRIV_ACTION(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PASSTHRU] = {
+		.name = "passthru",
+		.help = "let subsequent rule process matched packets",
+		.priv = PRIV_ACTION(PASSTHRU, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -368,10 +567,108 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->objdata = 0;
 	ctx->object = out;
 	return len;
 }
 
+/** Parse tokens for validate/create commands. */
+static int
+parse_vc(struct context *ctx, const struct token *token,
+	 const char *str, unsigned int len,
+	 void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	uint8_t *data;
+	uint32_t data_size;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != VALIDATE && ctx->curr != CREATE)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		out->args.vc.data = (uint8_t *)out + size;
+		return len;
+	}
+	ctx->objdata = 0;
+	ctx->object = &out->args.vc.attr;
+	switch (ctx->curr) {
+	case GROUP:
+	case PRIORITY:
+		return len;
+	case INGRESS:
+		out->args.vc.attr.ingress = 1;
+		return len;
+	case EGRESS:
+		out->args.vc.attr.egress = 1;
+		return len;
+	case PATTERN:
+		out->args.vc.pattern =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		ctx->object = out->args.vc.pattern;
+		return len;
+	case ACTIONS:
+		out->args.vc.actions =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)
+					       (out->args.vc.pattern +
+						out->args.vc.pattern_n),
+					       sizeof(double));
+		ctx->object = out->args.vc.actions;
+		return len;
+	default:
+		if (!token->priv)
+			return -1;
+		break;
+	}
+	if (!out->args.vc.actions) {
+		const struct parse_item_priv *priv = token->priv;
+		struct rte_flow_item *item =
+			out->args.vc.pattern + out->args.vc.pattern_n;
+
+		data_size = priv->size * 3; /* spec, last, mask */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)item + sizeof(*item) > data)
+			return -1;
+		*item = (struct rte_flow_item){
+			.type = priv->type,
+		};
+		++out->args.vc.pattern_n;
+		ctx->object = item;
+	} else {
+		const struct parse_action_priv *priv = token->priv;
+		struct rte_flow_action *action =
+			out->args.vc.actions + out->args.vc.actions_n;
+
+		data_size = priv->size; /* configuration */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)action + sizeof(*action) > data)
+			return -1;
+		*action = (struct rte_flow_action){
+			.type = priv->type,
+		};
+		++out->args.vc.actions_n;
+		ctx->object = action;
+	}
+	memset(data, 0, data_size);
+	out->args.vc.data = data;
+	ctx->objdata = data_size;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -392,6 +689,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -401,6 +699,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
 	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
 	return len;
 }
@@ -425,6 +724,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 	}
 	return len;
@@ -450,6 +750,7 @@ parse_list(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -459,6 +760,7 @@ parse_list(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
 	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
 	return len;
 }
@@ -526,6 +828,7 @@ parse_port(struct context *ctx, const struct token *token,
 	if (buf)
 		out = buf;
 	else {
+		ctx->objdata = 0;
 		ctx->object = out;
 		size = sizeof(*out);
 	}
@@ -613,6 +916,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->eol = 0;
 	ctx->last = 0;
 	ctx->port = 0;
+	ctx->objdata = 0;
 	ctx->object = NULL;
 }
 
@@ -836,6 +1140,14 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case VALIDATE:
+		port_flow_validate(in->port, &in->args.vc.attr,
+				   in->args.vc.pattern, in->args.vc.actions);
+		break;
+	case CREATE:
+		port_flow_create(in->port, &in->args.vc.attr,
+				 in->args.vc.pattern, in->args.vc.actions);
+		break;
 	case DESTROY:
 		port_flow_destroy(in->port, in->args.destroy.rule_n,
 				  in->args.destroy.rule);
-- 
2.1.4

* [dpdk-dev] [PATCH 11/22] app/testpmd: add flow query command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (9 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 10/22] app/testpmd: add flow validate/create commands Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
                       ` (14 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.
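
The {action} argument is resolved by looking its name up among known
actions and storing the matching type. A minimal sketch of such a lookup
follows; enum action_type, action_names[], lookup_action() and
lookup_demo() are hypothetical names for this example. Note that
parse_action() in this patch compares with strncmp(token->name, str,
len), which appears to accept a leading fragment such as "pass" as well;
the sketch deliberately requires an exact match instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for action types. */
enum action_type {
	ACTION_END,
	ACTION_VOID,
	ACTION_PASSTHRU,
};

struct action_name {
	const char *name;
	enum action_type type;
};

static const struct action_name action_names[] = {
	{ "end", ACTION_END },
	{ "void", ACTION_VOID },
	{ "passthru", ACTION_PASSTHRU },
};

/* Return 0 and store the type when str[0..len) is exactly a known name. */
static int
lookup_action(const char *str, size_t len, enum action_type *type)
{
	size_t i;

	for (i = 0; i < sizeof(action_names) / sizeof(action_names[0]); ++i) {
		const char *name = action_names[i].name;

		if (strlen(name) != len || memcmp(name, str, len) != 0)
			continue;
		*type = action_names[i].type;
		return 0;
	}
	return -1; /* unknown action name */
}

/* Self-check: the token is the first len bytes, not the whole string. */
static void
lookup_demo(void)
{
	enum action_type t = ACTION_END;

	assert(lookup_action("passthru / end", 8, &t) == 0);
	assert(t == ACTION_PASSTHRU);
	assert(lookup_action("pass", 4, &t) == -1);
}
```

Passing the token length separately mirrors how the cmdline parser hands
over unterminated tokens.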

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 851cc16..edd1ee3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
+			"flow query {port_id} {rule_id} {action}\n"
+			"    Query an existing flow rule.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1874849..e70e8e2 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
 	CREATE,
 	DESTROY,
 	FLUSH,
+	QUERY,
 	LIST,
 
 	/* Destroy arguments. */
 	DESTROY_RULE,
 
+	/* Query arguments. */
+	QUERY_ACTION,
+
 	/* List arguments. */
 	LIST_GROUP,
 
@@ -208,6 +212,10 @@ struct buffer {
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
 		struct {
+			uint32_t rule;
+			enum rte_flow_action_type action;
+		} query; /**< Query arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
 			      CREATE,
 			      DESTROY,
 			      FLUSH,
-			      LIST)),
+			      LIST,
+			      QUERY)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_flush,
 	},
+	[QUERY] = {
+		.name = "query",
+		.help = "query an existing flow rule",
+		.next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+			     NEXT_ENTRY(RULE_ID),
+			     NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+			     ARGS_ENTRY(struct buffer, args.query.rule),
+			     ARGS_ENTRY(struct buffer, port)),
+		.call = parse_query,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
 		.call = parse_destroy,
 	},
+	/* Query arguments. */
+	[QUERY_ACTION] = {
+		.name = "{action}",
+		.type = "ACTION",
+		.help = "action to query, must be part of the rule",
+		.call = parse_action,
+		.comp = comp_action,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -730,6 +766,67 @@ parse_flush(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for query command. */
+static int
+parse_query(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != QUERY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+	}
+	return len;
+}
+
+/** Parse action names. */
+static int
+parse_action(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+
+	(void)size;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	/* Parse action name. */
+	for (i = 0; next_action[i]; ++i) {
+		const struct parse_action_priv *priv;
+
+		token = &token_list[next_action[i]];
+		if (strncmp(token->name, str, len))
+			continue;
+		priv = token->priv;
+		if (!priv)
+			goto error;
+		if (out)
+			memcpy((uint8_t *)ctx->object + arg->offset,
+			       &priv->type,
+			       arg->size);
+		return len;
+	}
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -853,6 +950,24 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete action names. */
+static int
+comp_action(struct context *ctx, const struct token *token,
+	    unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; next_action[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s",
+					token_list[next_action[i]].name);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete available ports. */
 static int
 comp_port(struct context *ctx, const struct token *token,
@@ -1155,6 +1270,10 @@ cmd_flow_parsed(const struct buffer *in)
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
+	case QUERY:
+		port_flow_query(in->port, in->args.query.rule,
+				in->args.query.action);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4

* [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (10 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 11/22] app/testpmd: add flow query command Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-12-16  3:01       ` Pei, Yulong
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 13/22] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
                       ` (13 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- fix: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets the upper bound of the spec => last range.
- mask: sets the bit-mask applied to both spec and last from an
  arbitrary value.
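
To make the roles of these operators concrete, the sketch below shows
one plausible reading of how spec, last and mask combine when a single
32-bit field is compared in host byte order. struct field_rule,
rule_fix(), field_matches() and field_demo() are hypothetical names for
this example; actual matching is performed by the PMD or the device and
may differ, in particular for ranges combined with partial masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one matched field. */
struct field_rule {
	uint32_t spec; /* value to match, or lower bound of a range */
	uint32_t last; /* upper bound, only used when has_last is set */
	uint32_t mask; /* bits that take part in the comparison */
	bool has_last;
};

/* "fix": set the value and apply a full bit-mask for perfect matching. */
static struct field_rule
rule_fix(uint32_t value)
{
	struct field_rule r = { value, 0, UINT32_MAX, false };

	return r;
}

/* Compare a field value against spec (and last) under mask. */
static bool
field_matches(const struct field_rule *r, uint32_t value)
{
	uint32_t v = value & r->mask;
	uint32_t lo = r->spec & r->mask;
	uint32_t hi;

	if (!r->has_last)
		return v == lo;
	hi = r->last & r->mask;
	return v >= lo && v <= hi;
}

/* Self-check covering fix, spec + mask and spec => last. */
static void
field_demo(void)
{
	struct field_rule fix = rule_fix(80);
	struct field_rule net = { 0x0a000000, 0, 0xff000000, false };
	struct field_rule range = { 1000, 2000, UINT32_MAX, true };

	assert(field_matches(&fix, 80) && !field_matches(&fix, 81));
	assert(field_matches(&net, 0x0a010203));
	assert(!field_matches(&net, 0x0b000000));
	assert(field_matches(&range, 1500) && !field_matches(&range, 2001));
}
```

In this reading, "fix" is simply "spec" plus an all-ones "mask", which
matches how the parser writes a full mask through ctx->objmask.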

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 110 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e70e8e2..790b4b8 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_FIX,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,14 @@ static const enum index next_list_attr[] = {
 	0,
 };
 
+static const enum index item_param[] = {
+	ITEM_PARAM_FIX,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	0,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +300,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +507,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_FIX] = {
+		.name = "fix",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +640,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +668,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +690,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +699,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +722,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +739,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +747,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_FIX:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +823,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +834,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +860,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +887,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +949,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +960,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +993,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1010,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1035,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1142,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
-- 
2.1.4

* [dpdk-dev] [PATCH 13/22] app/testpmd: add rte_flow item spec prefix length
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (11 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 14/22] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
                       ` (12 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Generating bit-masks from prefix lengths is often more convenient than
spelling them out in full (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
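
As an illustration of the conversion, the sketch below fills a byte
array with the requested number of leading one bits in network byte
order, the same layout the big-endian path of parse_prefix() produces.
prefix_to_mask() and prefix_demo() are hypothetical helper names for
this example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill mask[0..size) with `prefix` leading one bits (network byte order).
 * Returns 0 on success, -1 when the prefix does not fit in size bytes.
 */
static int
prefix_to_mask(uint8_t *mask, size_t size, unsigned int prefix)
{
	/* Partial byte values for 0 to 7 leading one bits. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	if (bytes + !!extra > size)
		return -1;
	memset(mask, 0xff, bytes);                /* full bytes */
	memset(mask + bytes, 0x00, size - bytes); /* clear the rest */
	if (extra)
		mask[bytes] = conv[extra];        /* partial byte */
	return 0;
}

/* Self-check: a /20 prefix on a 4-byte (IPv4-sized) field. */
static void
prefix_demo(void)
{
	uint8_t m[4];

	assert(prefix_to_mask(m, sizeof(m), 20) == 0);
	assert(m[0] == 0xff && m[1] == 0xff && m[2] == 0xf0 && m[3] == 0x00);
	assert(prefix_to_mask(m, sizeof(m), 33) == -1);
}
```

The patch additionally handles host-order fields on little-endian
systems by filling the bytes from the other end when arg->hton is not
set; the sketch leaves that case out.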

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 80 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 790b4b8..89307cb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PREFIX,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -277,6 +279,7 @@ static const enum index item_param[] = {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	0,
 };
 
@@ -320,6 +323,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -360,6 +366,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PREFIX] = {
+		.name = "{prefix}",
+		.type = "PREFIX",
+		.help = "prefix length for bit-mask",
+		.call = parse_prefix,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -527,6 +540,11 @@ static const struct token token_list[] = {
 		.help = "specify bit-mask with relevant bits set to one",
 		.call = parse_vc_spec,
 	},
+	[ITEM_PARAM_PREFIX] = {
+		.name = "prefix",
+		.help = "generate bit-mask from a prefix length",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -604,6 +622,62 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+	char *end;
+	uintmax_t u;
+	unsigned int bytes;
+	unsigned int extra;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	bytes = u / 8;
+	extra = u % 8;
+	size = arg->size;
+	if (bytes > size || bytes + !!extra > size)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+		memset(buf, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+	} else
+#endif
+	{
+		memset(buf, 0xff, bytes);
+		memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[bytes] = conv[extra];
+	}
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -775,6 +849,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	case ITEM_PARAM_LAST:
 		index = 1;
 		break;
+	case ITEM_PARAM_PREFIX:
+		/* Modify next token to expect a prefix. */
+		if (ctx->next_num < 2)
+			return -1;
+		ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
+		/* Fall through. */
 	case ITEM_PARAM_MASK:
 		index = 2;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH 14/22] app/testpmd: add rte_flow bit-field support
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
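
To illustrate the bit-mask approach described above, here is a minimal
stand-alone sketch (not part of this patch) that spreads the low-order bits
of a value over the positions set in a byte-wise mask, in the spirit of
arg_entry_bf_fill(). spread_bits() and its signature are assumptions made
for the example; the real function takes a struct arg instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Spread the low-order bits of val into the bit positions set in mask,
 * lowest byte and lowest bit first. Returns the number of bits set in
 * mask. When dst is NULL, only that count is computed.
 * Hypothetical helper for illustration only.
 */
static size_t
spread_bits(uint8_t *dst, const uint8_t *mask, size_t size, uintmax_t val)
{
	size_t count = 0;
	size_t i;

	assert(mask != NULL);
	for (i = 0; i != size; ++i) {
		unsigned int shift;

		for (shift = 0; shift != 8; ++shift) {
			uint8_t bit = (uint8_t)(1u << shift);

			if (!(mask[i] & bit))
				continue;
			++count;
			if (!dst)
				continue;
			/* Replace the target bit with the next bit of val. */
			dst[i] = (uint8_t)((dst[i] & ~bit) |
					   ((val & 1) ? bit : 0));
			val >>= 1;
		}
	}
	return count;
}
```

Calling it with a NULL destination gives the width of the bit-field, which
is how a caller can reject values or prefix lengths that do not fit.
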
 app/test-pmd/cmdline_flow.c | 59 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 89307cb..81930e1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
 	uint32_t sign:1; /**< Value is signed. */
 	uint32_t offset; /**< Relative offset from ctx->object. */
 	uint32_t size; /**< Field size. */
+	const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };
 
 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
 		.size = sizeof(((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(s), \
+		.mask = (const void *)&(const s){ .f = -1 }, \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -622,6 +630,34 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+	uint32_t i;
+	size_t len = 0;
+
+	/* Endian conversion is not supported on bit-fields. */
+	if (!arg->mask || arg->hton)
+		return 0;
+	for (i = 0; i != arg->size; ++i) {
+		unsigned int shift = 0;
+		uint8_t *buf = (uint8_t *)dst + i;
+
+		for (shift = 0; arg->mask[i] >> shift; ++shift) {
+			if (!(arg->mask[i] & (1 << shift)))
+				continue;
+			++len;
+			if (!dst)
+				continue;
+			*buf &= ~(1 << shift);
+			*buf |= (val & 1) << shift;
+			val >>= 1;
+		}
+	}
+	return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -648,6 +684,23 @@ parse_prefix(struct context *ctx, const struct token *token,
 	u = strtoumax(str, &end, 0);
 	if (errno || (size_t)(end - str) != len)
 		goto error;
+	if (arg->mask) {
+		uintmax_t v = 0;
+
+		extra = arg_entry_bf_fill(NULL, 0, arg);
+		if (u > extra)
+			goto error;
+		if (!ctx->object)
+			return len;
+		extra -= u;
+		while (u--)
+			(v <<= 1, v |= 1);
+		v <<= extra;
+		if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	bytes = u / 8;
 	extra = u % 8;
 	size = arg->size;
@@ -1071,6 +1124,12 @@ parse_int(struct context *ctx, const struct token *token,
 		goto error;
 	if (!ctx->object)
 		return len;
+	if (arg->mask) {
+		if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
 objmask:
-- 
2.1.4

* [dpdk-dev] [PATCH 15/22] app/testpmd: add item any to flow command
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

This pattern item matches any protocol in place of the current layer and
has two properties:

- min: minimum number of layers covered (0 or more).
- max: maximum number of layers covered (0 means infinity).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 81930e1..5816be4 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,9 @@ enum index {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ITEM_ANY_MIN,
+	ITEM_ANY_MAX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -295,6 +298,14 @@ static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	0,
+};
+
+static const enum index item_any[] = {
+	ITEM_ANY_MIN,
+	ITEM_ANY_MAX,
+	ITEM_NEXT,
 	0,
 };
 
@@ -579,6 +590,25 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
 		.call = parse_vc,
 	},
+	[ITEM_ANY] = {
+		.name = "any",
+		.help = "match any protocol for the current layer",
+		.priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+		.next = NEXT(item_any),
+		.call = parse_vc,
+	},
+	[ITEM_ANY_MIN] = {
+		.name = "min",
+		.help = "minimum number of layers covered",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, min)),
+	},
+	[ITEM_ANY_MAX] = {
+		.name = "max",
+		.help = "maximum number of layers covered, 0 for infinity",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, max)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH 16/22] app/testpmd: add various items to flow command
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Add the following pattern items:

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 53 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5816be4..c61e31e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -102,6 +102,11 @@ enum index {
 	ITEM_ANY,
 	ITEM_ANY_MIN,
 	ITEM_ANY_MAX,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_VF_ID,
+	ITEM_PORT,
+	ITEM_PORT_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -299,6 +304,9 @@ static const enum index next_item[] = {
 	ITEM_VOID,
 	ITEM_INVERT,
 	ITEM_ANY,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_PORT,
 	0,
 };
 
@@ -309,6 +317,18 @@ static const enum index item_any[] = {
 	0,
 };
 
+static const enum index item_vf[] = {
+	ITEM_VF_ID,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_port[] = {
+	ITEM_PORT_INDEX,
+	ITEM_NEXT,
+	0,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -609,6 +629,39 @@ static const struct token token_list[] = {
 		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, max)),
 	},
+	[ITEM_PF] = {
+		.name = "pf",
+		.help = "match packets addressed to the physical function",
+		.priv = PRIV_ITEM(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_VF] = {
+		.name = "vf",
+		.help = "match packets addressed to a virtual function ID",
+		.priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+		.next = NEXT(item_vf),
+		.call = parse_vc,
+	},
+	[ITEM_VF_ID] = {
+		.name = "id",
+		.help = "destination VF ID",
+		.next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+	},
+	[ITEM_PORT] = {
+		.name = "port",
+		.help = "device-specific physical port index to use",
+		.priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+		.next = NEXT(item_port),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_INDEX] = {
+		.name = "index",
+		.help = "physical port index",
+		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH 17/22] app/testpmd: add item raw to flow command
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

This pattern item matches arbitrary byte strings and has the following
properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length (set from the pattern string, there is no
  separate token for it).
- pattern: byte string to look for.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
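
The "relative" and "search" properties take boolean values. As a minimal
sketch (not part of this patch) of the name table idea used by
parse_boolean(), where even indices stand for false and odd ones for true,
consider the following. bool_from_name() is a hypothetical helper, and
unlike the strncmp() comparison in the patch it only accepts exact matches,
which is a deliberate difference in this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Boolean names; even indices stand for false, odd ones for true. */
static const char *const bool_name[] = {
	"0", "1",
	"false", "true",
	"no", "yes",
	"N", "Y",
	NULL,
};

/*
 * Convert a token of len bytes (not necessarily NUL-terminated) to
 * 0 or 1. Returns -1 when the token is not a known boolean name.
 */
static int
bool_from_name(const char *str, size_t len)
{
	size_t i;

	assert(str != NULL);
	for (i = 0; bool_name[i]; ++i)
		if (strlen(bool_name[i]) == len &&
		    !memcmp(str, bool_name[i], len))
			return (int)(i & 1);
	return -1;
}
```
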
 app/test-pmd/cmdline_flow.c | 206 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 206 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c61e31e..6f2f26c 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
 	INTEGER,
 	UNSIGNED,
 	PREFIX,
+	BOOLEAN,
+	STRING,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -107,6 +109,12 @@ enum index {
 	ITEM_VF_ID,
 	ITEM_PORT,
 	ITEM_PORT_INDEX,
+	ITEM_RAW,
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -116,6 +124,13 @@ enum index {
 	ACTION_PASSTHRU,
 };
 
+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -217,6 +232,13 @@ struct token {
 		.size = sizeof(*((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = (sz), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -307,6 +329,7 @@ static const enum index next_item[] = {
 	ITEM_PF,
 	ITEM_VF,
 	ITEM_PORT,
+	ITEM_RAW,
 	0,
 };
 
@@ -329,6 +352,16 @@ static const enum index item_port[] = {
 	0,
 };
 
+static const enum index item_raw[] = {
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
+	ITEM_NEXT,
+	0,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -365,11 +398,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
 		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -412,6 +453,20 @@ static const struct token token_list[] = {
 		.call = parse_prefix,
 		.comp = comp_none,
 	},
+	[BOOLEAN] = {
+		.name = "{boolean}",
+		.type = "BOOLEAN",
+		.help = "any boolean value",
+		.call = parse_boolean,
+		.comp = comp_boolean,
+	},
+	[STRING] = {
+		.name = "{string}",
+		.type = "STRING",
+		.help = "fixed string",
+		.call = parse_string,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -662,6 +717,50 @@ static const struct token token_list[] = {
 		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
 	},
+	[ITEM_RAW] = {
+		.name = "raw",
+		.help = "match an arbitrary byte string",
+		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+		.next = NEXT(item_raw),
+		.call = parse_vc,
+	},
+	[ITEM_RAW_RELATIVE] = {
+		.name = "relative",
+		.help = "look for pattern after the previous item",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw, relative)),
+	},
+	[ITEM_RAW_SEARCH] = {
+		.name = "search",
+		.help = "search pattern from offset (see also limit)",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw, search)),
+	},
+	[ITEM_RAW_OFFSET] = {
+		.name = "offset",
+		.help = "absolute or relative offset for pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
+	},
+	[ITEM_RAW_LIMIT] = {
+		.name = "limit",
+		.help = "search area limit for start of pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
+	},
+	[ITEM_RAW_PATTERN] = {
+		.name = "pattern",
+		.help = "byte string to look for",
+		.next = NEXT(item_raw,
+			     NEXT_ENTRY(STRING),
+			     NEXT_ENTRY(ITEM_PARAM_FIX,
+					ITEM_PARAM_SPEC,
+					ITEM_PARAM_MASK)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
+			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
+					    pattern,
+					    ITEM_RAW_PATTERN_SIZE)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1243,6 +1342,96 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a string.
+ *
+ * Two arguments (ctx->args) are retrieved from the stack to store data and
+ * its length (in that order).
+ */
+static int
+parse_string(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg_data = pop_args(ctx);
+	const struct arg *arg_len = pop_args(ctx);
+	char tmp[16]; /* Ought to be enough. */
+	int ret;
+
+	/* Arguments are expected. */
+	if (!arg_data)
+		return -1;
+	if (!arg_len) {
+		push_args(ctx, arg_data);
+		return -1;
+	}
+	size = arg_data->size;
+	/* Bit-mask fill is not supported. */
+	if (arg_data->mask || size < len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	/* Let parse_int() fill length information first. */
+	ret = snprintf(tmp, sizeof(tmp), "%u", len);
+	if (ret < 0)
+		goto error;
+	push_args(ctx, arg_len);
+	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		goto error;
+	}
+	buf = (uint8_t *)ctx->object + arg_data->offset;
+	/* Output buffer is not necessarily NUL-terminated. */
+	memcpy(buf, str, len);
+	memset((uint8_t *)buf + len, 0x55, size - len);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
+	return len;
+error:
+	push_args(ctx, arg_len);
+	push_args(ctx, arg_data);
+	return -1;
+}
+
+/** Boolean values (even indices stand for false). */
+static const char *const boolean_name[] = {
+	"0", "1",
+	"false", "true",
+	"no", "yes",
+	"N", "Y",
+	NULL,
+};
+
+/**
+ * Parse a boolean value.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_boolean(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	for (i = 0; boolean_name[i]; ++i)
+		if (!strncmp(str, boolean_name[i], len))
+			break;
+	/* Process token as integer. */
+	if (boolean_name[i])
+		str = i & 1 ? "1" : "0";
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, strlen(str), buf, size);
+	return ret > 0 ? (int)len : ret;
+}
+
 /** Parse port and update context. */
 static int
 parse_port(struct context *ctx, const struct token *token,
@@ -1281,6 +1470,23 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete boolean values. */
+static int
+comp_boolean(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; boolean_name[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", boolean_name[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete action names. */
 static int
 comp_action(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH 18/22] app/testpmd: add items eth/vlan to flow command
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 126 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6f2f26c..f2bd405 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include <rte_ethdev.h>
 #include <rte_byteorder.h>
 #include <cmdline_parse.h>
+#include <cmdline_parse_etheraddr.h>
 #include <rte_flow.h>
 
 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
 	PREFIX,
 	BOOLEAN,
 	STRING,
+	MAC_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -115,6 +117,13 @@ enum index {
 	ITEM_RAW_OFFSET,
 	ITEM_RAW_LIMIT,
 	ITEM_RAW_PATTERN,
+	ITEM_ETH,
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_VLAN,
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -239,6 +248,14 @@ struct token {
 		.size = (sz), \
 	})
 
+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -330,6 +347,8 @@ static const enum index next_item[] = {
 	ITEM_VF,
 	ITEM_PORT,
 	ITEM_RAW,
+	ITEM_ETH,
+	ITEM_VLAN,
 	0,
 };
 
@@ -362,6 +381,21 @@ static const enum index item_raw[] = {
 	0,
 };
 
+static const enum index item_eth[] = {
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_vlan[] = {
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
+	ITEM_NEXT,
+	0,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -404,6 +438,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+			  const char *, unsigned int,
+			  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -467,6 +504,13 @@ static const struct token token_list[] = {
 		.call = parse_string,
 		.comp = comp_none,
 	},
+	[MAC_ADDR] = {
+		.name = "{MAC address}",
+		.type = "MAC-48",
+		.help = "standard MAC address notation",
+		.call = parse_mac_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -761,6 +805,50 @@ static const struct token token_list[] = {
 					    pattern,
 					    ITEM_RAW_PATTERN_SIZE)),
 	},
+	[ITEM_ETH] = {
+		.name = "eth",
+		.help = "match Ethernet header",
+		.priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+		.next = NEXT(item_eth),
+		.call = parse_vc,
+	},
+	[ITEM_ETH_DST] = {
+		.name = "dst",
+		.help = "destination MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+	},
+	[ITEM_ETH_SRC] = {
+		.name = "src",
+		.help = "source MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+	},
+	[ITEM_ETH_TYPE] = {
+		.name = "type",
+		.help = "EtherType",
+		.next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+	},
+	[ITEM_VLAN] = {
+		.name = "vlan",
+		.help = "match 802.1Q/ad VLAN tag",
+		.priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+		.next = NEXT(item_vlan),
+		.call = parse_vc,
+	},
+	[ITEM_VLAN_TPID] = {
+		.name = "tpid",
+		.help = "tag protocol identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
+	},
+	[ITEM_VLAN_TCI] = {
+		.name = "tci",
+		.help = "tag control information",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1394,6 +1482,44 @@ parse_string(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a MAC address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_mac_addr(struct context *ctx, const struct token *token,
+	       const char *str, unsigned int len,
+	       void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	struct ether_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	ret = cmdline_parse_etheraddr(NULL, str, &tmp, size);
+	if (ret < 0 || (unsigned int)ret != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH 19/22] app/testpmd: add items ipv4/ipv6 to flow command
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
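
As an illustration of the address parsing added here, the following minimal
sketch (not part of this patch) converts a length-delimited token to an
IPv4 address with inet_pton(). ipv4_from_token() is a hypothetical helper;
it uses a fixed-size buffer instead of the variable-length array found in
parse_ipv4_addr() and does not fall back to integer parsing on failure.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a token of len bytes (not necessarily NUL-terminated) to an
 * IPv4 address in network byte order. Returns 0 on success, -1 otherwise.
 */
static int
ipv4_from_token(const char *str, size_t len, struct in_addr *out)
{
	char tmp[INET_ADDRSTRLEN];

	assert(str != NULL && out != NULL);
	/* Reject tokens too long to be a dotted-quad address. */
	if (len >= sizeof(tmp))
		return -1;
	/* inet_pton() needs a NUL-terminated string. */
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	return inet_pton(AF_INET, tmp, out) == 1 ? 0 : -1;
}
```

The same pattern applies to IPv6 with AF_INET6, struct in6_addr and
INET6_ADDRSTRLEN.
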
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f2bd405..75096df 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -124,6 +127,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -349,6 +358,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	0,
 };
 
@@ -396,6 +407,20 @@ static const enum index item_vlan[] = {
 	0,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	0,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -441,6 +466,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -511,6 +542,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -849,6 +894,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1520,6 +1607,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4


* [dpdk-dev] [PATCH 20/22] app/testpmd: add L4 items to flow command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (18 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 19/22] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 21/22] app/testpmd: add various actions " Adrien Mazarguil
                       ` (5 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 75096df..892f300 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -133,6 +133,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -360,6 +374,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	0,
 };
 
@@ -421,6 +440,40 @@ static const enum index item_ipv6[] = {
 	0,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	0,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	0,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -936,6 +989,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1497,6 +1647,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH 21/22] app/testpmd: add various actions to flow command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (19 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 20/22] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 22/22] app/testpmd: add queue " Adrien Mazarguil
                       ` (4 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 121 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 892f300..e166045 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -154,6 +154,15 @@ enum index {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_MARK_ID,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
 };
 
 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -478,6 +487,25 @@ static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	0,
+};
+
+static const enum index action_mark[] = {
+	ACTION_MARK_ID,
+	ACTION_NEXT,
+	0,
+};
+
+static const enum index action_vf[] = {
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
+	ACTION_NEXT,
 	0,
 };
 
@@ -489,6 +517,8 @@ static int parse_vc(struct context *, const struct token *,
 		    void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1118,6 +1148,70 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_MARK] = {
+		.name = "mark",
+		.help = "attach 32 bit value to packets",
+		.priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+		.next = NEXT(action_mark),
+		.call = parse_vc,
+	},
+	[ACTION_MARK_ID] = {
+		.name = "id",
+		.help = "32 bit value to return with packets",
+		.next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_FLAG] = {
+		.name = "flag",
+		.help = "flag packets",
+		.priv = PRIV_ACTION(FLAG, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_DROP] = {
+		.name = "drop",
+		.help = "drop packets (note: passthru has priority)",
+		.priv = PRIV_ACTION(DROP, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_COUNT] = {
+		.name = "count",
+		.help = "enable counters for this rule",
+		.priv = PRIV_ACTION(COUNT, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PF] = {
+		.name = "pf",
+		.help = "redirect packets to physical device function",
+		.priv = PRIV_ACTION(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_VF] = {
+		.name = "vf",
+		.help = "redirect packets to virtual device function",
+		.priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+		.next = NEXT(action_vf),
+		.call = parse_vc,
+	},
+	[ACTION_VF_ORIGINAL] = {
+		.name = "original",
+		.help = "use original VF ID if possible",
+		.next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+					   original)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_VF_ID] = {
+		.name = "id",
+		.help = "VF ID to redirect packets to",
+		.next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -1441,6 +1535,33 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse action configuration field. */
+static int
+parse_vc_conf(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Update configuration pointer. */
+	action->conf = ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
-- 
2.1.4


* [dpdk-dev] [PATCH 22/22] app/testpmd: add queue actions to flow command
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (20 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 21/22] app/testpmd: add various actions " Adrien Mazarguil
@ 2016-11-16 16:23     ` Adrien Mazarguil
  2016-11-21  9:23     ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Nélio Laranjeiro
                       ` (3 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-16 16:23 UTC (permalink / raw)
  To: dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 152 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e166045..70e2b76 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -157,8 +157,15 @@ enum index {
 	ACTION_MARK,
 	ACTION_MARK_ID,
 	ACTION_FLAG,
+	ACTION_QUEUE,
+	ACTION_QUEUE_INDEX,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_DUP_INDEX,
+	ACTION_RSS,
+	ACTION_RSS_QUEUES,
+	ACTION_RSS_QUEUE,
 	ACTION_PF,
 	ACTION_VF,
 	ACTION_VF_ORIGINAL,
@@ -172,6 +179,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
 
+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+	(offsetof(struct rte_flow_action_rss, queue) + \
+	 sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -489,8 +504,11 @@ static const enum index next_action[] = {
 	ACTION_PASSTHRU,
 	ACTION_MARK,
 	ACTION_FLAG,
+	ACTION_QUEUE,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_RSS,
 	ACTION_PF,
 	ACTION_VF,
 	0,
@@ -502,6 +520,24 @@ static const enum index action_mark[] = {
 	0,
 };
 
+static const enum index action_queue[] = {
+	ACTION_QUEUE_INDEX,
+	ACTION_NEXT,
+	0,
+};
+
+static const enum index action_dup[] = {
+	ACTION_DUP_INDEX,
+	ACTION_NEXT,
+	0,
+};
+
+static const enum index action_rss[] = {
+	ACTION_RSS_QUEUES,
+	ACTION_NEXT,
+	0,
+};
+
 static const enum index action_vf[] = {
 	ACTION_VF_ORIGINAL,
 	ACTION_VF_ID,
@@ -519,6 +555,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+				     const char *, unsigned int, void *,
+				     unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -568,6 +607,8 @@ static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
 			unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+				    unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1169,6 +1210,21 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_QUEUE] = {
+		.name = "queue",
+		.help = "assign packets to a given queue index",
+		.priv = PRIV_ACTION(QUEUE,
+				    sizeof(struct rte_flow_action_queue)),
+		.next = NEXT(action_queue),
+		.call = parse_vc,
+	},
+	[ACTION_QUEUE_INDEX] = {
+		.name = "index",
+		.help = "queue index to use",
+		.next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_DROP] = {
 		.name = "drop",
 		.help = "drop packets (note: passthru has priority)",
@@ -1183,6 +1239,39 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_DUP] = {
+		.name = "dup",
+		.help = "duplicate packets to a given queue index",
+		.priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+		.next = NEXT(action_dup),
+		.call = parse_vc,
+	},
+	[ACTION_DUP_INDEX] = {
+		.name = "index",
+		.help = "queue index to duplicate packets to",
+		.next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_dup, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS] = {
+		.name = "rss",
+		.help = "spread packets among several queues",
+		.priv = PRIV_ACTION(RSS, ACTION_RSS_SIZE),
+		.next = NEXT(action_rss),
+		.call = parse_vc,
+	},
+	[ACTION_RSS_QUEUES] = {
+		.name = "queues",
+		.help = "queue indices to use",
+		.next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS_QUEUE] = {
+		.name = "{queue}",
+		.help = "queue index",
+		.call = parse_vc_action_rss_queue,
+		.comp = comp_vc_action_rss_queue,
+	},
 	[ACTION_PF] = {
 		.name = "pf",
 		.help = "redirect packets to physical device function",
@@ -1562,6 +1651,51 @@ parse_vc_conf(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/**
+ * Parse queue field for RSS action.
+ *
+ * Valid tokens are queue indices and the "end" token.
+ */
+static int
+parse_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	static const enum index next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
+	int ret;
+	int i;
+
+	(void)token;
+	(void)buf;
+	(void)size;
+	if (ctx->curr != ACTION_RSS_QUEUE)
+		return -1;
+	i = ctx->objdata >> 16;
+	if (!strncmp(str, "end", len)) {
+		ctx->objdata &= 0xffff;
+		return len;
+	}
+	if (i >= ACTION_RSS_NUM)
+		return -1;
+	if (push_args(ctx, ARGS_ENTRY(struct rte_flow_action_rss, queue[i])))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	++i;
+	ctx->objdata = i << 16 | (ctx->objdata & 0xffff);
+	/* Repeat token. */
+	if (ctx->next_num == RTE_DIM(ctx->next))
+		return -1;
+	ctx->next[ctx->next_num++] = next;
+	if (!ctx->object)
+		return len;
+	((struct rte_flow_action_rss *)ctx->object)->queues = i;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -2136,6 +2270,24 @@ comp_rule_id(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete queue field for RSS action. */
+static int
+comp_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			 unsigned int ent, char *buf, unsigned int size)
+{
+	static const char *const str[] = { "", "end", NULL };
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; str[i] != NULL; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", str[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
-- 
2.1.4


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-11-18  6:36       ` Xing, Beilei
  2016-11-18 10:28         ` Adrien Mazarguil
  2016-11-30 17:47       ` Kevin Traynor
  2016-12-08  9:00       ` Xing, Beilei
  2 siblings, 1 reply; 262+ messages in thread
From: Xing, Beilei @ 2016-11-18  6:36 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Adrien,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Thursday, November 17, 2016 12:23 AM
> To: dev@dpdk.org
> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch,
> Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> <olivier.matz@6wind.com>
> Subject: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
> This new API supersedes all the legacy filter types described in rte_eth_ctrl.h.
> It is slightly higher level and as a result relies more on PMDs to process and
> validate flow rules.
> 
> Benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> Existing filter types will be deprecated and removed in the near future.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>


> +
> +/**
> + * Opaque type returned after successfully creating a flow.
> + *
> + * This handle can be used to manage and query the related flow (e.g. to
> + * destroy it or retrieve counters).
> + */
> +struct rte_flow;
> +

As we discussed before, we use attr/pattern/actions to create and destroy a flow in the PMD,
but I don't think it is easy to clone the user-provided parameters and return the result
to the application as a rte_flow pointer. As you suggested:
/* PMD-specific code. */
 struct rte_flow {
    struct rte_flow_attr attr;
    struct rte_flow_item *pattern;
    struct rte_flow_action *actions;
 };

Both pattern and actions are pointers, and struct rte_flow_item and struct
rte_flow_action contain pointers as well. We would need to allocate iteratively during
clone and free iteratively during destroy, so the code ends up somewhat ugly, right?

I think the application saves this information when creating a flow rule, so why not have the
application provide the attr/pattern/actions info to the PMD before calling the PMD API?

Thanks,
Beilei Xing


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-18  6:36       ` Xing, Beilei
@ 2016-11-18 10:28         ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-11-18 10:28 UTC (permalink / raw)
  To: Xing, Beilei; +Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Beilei,

On Fri, Nov 18, 2016 at 06:36:31AM +0000, Xing, Beilei wrote:
> Hi Adrien,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Thursday, November 17, 2016 12:23 AM
> > To: dev@dpdk.org
> > Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch,
> > Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> > <olivier.matz@6wind.com>
> > Subject: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> > 
> > This new API supersedes all the legacy filter types described in rte_eth_ctrl.h.
> > It is slightly higher level and as a result relies more on PMDs to process and
> > validate flow rules.
> > 
> > Benefits:
> > 
> > - A unified API is easier to program for, applications do not have to be
> >   written for a specific filter type which may or may not be supported by
> >   the underlying device.
> > 
> > - The behavior of a flow rule is the same regardless of the underlying
> >   device, applications do not need to be aware of hardware quirks.
> > 
> > - Extensible by design, API/ABI breakage should rarely occur if at all.
> > 
> > - Documentation is self-standing, no need to look up elsewhere.
> > 
> > Existing filter types will be deprecated and removed in the near future.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> 
> 
> > +
> > +/**
> > + * Opaque type returned after successfully creating a flow.
> > + *
> > + * This handle can be used to manage and query the related flow (e.g. to
> > + * destroy it or retrieve counters).
> > + */
> > +struct rte_flow;
> > +
> 
> As we discussed before, we use attr/pattern/actions to create and destroy a flow in the PMD,
> but I don't think it is easy to clone the user-provided parameters and return the result
> to the application as a rte_flow pointer. As you suggested:
> /* PMD-specific code. */
>  struct rte_flow {
>     struct rte_flow_attr attr;
>     struct rte_flow_item *pattern;
>     struct rte_flow_action *actions;
>  };

Just to provide some context to the community since the above snippet comes
from private exchanges, I've suggested the above structure as a means to
create and remove rules in the same fashion as FDIR, by providing the rule
used for creation to the destroy callback.

As an opaque type, each PMD currently needs to implement its own version of
struct rte_flow. The above definition may ease transition from FDIR to
rte_flow for some PMDs, however they need to clone the entire
application-provided rule to do so because there is no requirement for it to
be kept allocated.

I've implemented such a function in testpmd (port_flow_new() in commit [1])
as an example.

 [1] http://dpdk.org/ml/archives/dev/2016-November/050266.html

However my suggestion is for PMDs to use their own HW-specific structure
that only contains relevant information instead of being forced to drag
large, non-native data around that lacks useful context and requires
parsing every time. This is one benefit of using an opaque type in the first
place, the other being ABI breakage avoidance.
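
To illustrate the point above, here is a minimal sketch of what a
PMD-private handle could look like, assuming an imaginary device whose
filters boil down to a UDP destination port and a target queue. Every name
prefixed with hypo_ is made up for this example and does not come from any
real PMD; hypo_flow stands in for the PMD's own definition of struct
rte_flow so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified view of an application-provided rule. */
struct hypo_rule {
	uint16_t udp_dst_port; /* Host byte order. */
	uint16_t queue_index;
};

/*
 * Hypothetical PMD-private flow handle (what a PMD would define as its own
 * struct rte_flow). It stores only what is needed to find and remove the
 * filter later, not a copy of the generic rule.
 */
struct hypo_flow {
	uint32_t hw_filter_id; /* Slot in the imaginary HW filter table. */
	uint16_t queue_index;
};

/* Next free slot in the imaginary HW filter table. */
static uint32_t hypo_next_filter_id;

/* Convert the generic rule once, at creation time. */
static struct hypo_flow *
hypo_flow_create(const struct hypo_rule *rule)
{
	struct hypo_flow *flow;

	assert(rule != NULL);
	flow = malloc(sizeof(*flow));
	if (flow == NULL)
		return NULL;
	/* A real PMD would program rule->udp_dst_port into hardware here. */
	flow->hw_filter_id = hypo_next_filter_id++;
	flow->queue_index = rule->queue_index;
	return flow;
}

/* Destruction only needs the handle; returns the released filter slot. */
static uint32_t
hypo_flow_destroy(struct hypo_flow *flow)
{
	uint32_t id = flow->hw_filter_id;

	free(flow);
	return id;
}
```

With this layout the destroy path never looks at attr/pattern/actions again,
which is why nothing has to be cloned.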

> Both pattern and actions are pointers, and struct rte_flow_item and struct
> rte_flow_action contain pointers as well. We would need to allocate iteratively during
> clone and free iteratively during destroy, so the code ends up somewhat ugly, right?

Well since I wrote that code, I won't easily admit it's ugly. I think PMDs
should not require the duplication of generic rules actually, which are only
defined as a common language between applications and PMDs. Both are free to
store rules in their own preferred and efficient format internally.

> I think the application saves this information when creating a flow rule, so why not have the
> application provide the attr/pattern/actions info to the PMD before calling the PMD API?

They have to do so temporarily (e.g. allocated on the stack) while calling
rte_flow_create() and rte_flow_validate(), that's it. Once a rule is
created, there's no requirement for applications to keep anything around.
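
As a rough sketch of that usage model, the snippet below builds a rule on
the stack, hands it to a creation hook and lets it go out of scope. The
hypo_ types are reduced stand-ins invented for this example (the real
pattern items carry more fields), and hypo_create() only mimics a PMD that
copies what it needs before returning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the generic types, reduced to what this sketch needs. */
enum hypo_item_type { HYPO_ITEM_END, HYPO_ITEM_ETH, HYPO_ITEM_UDP };

struct hypo_item {
	enum hypo_item_type type;
	const void *spec; /* May be NULL. */
};

struct hypo_item_udp {
	uint16_t dst_port;
};

/*
 * Hypothetical creation hook: reads the rule and keeps no pointer to it.
 * Returns the number of pattern items found before the END item.
 */
static int
hypo_create(const struct hypo_item pattern[], uint16_t *dst_port_out)
{
	int n;

	assert(pattern != NULL);
	for (n = 0; pattern[n].type != HYPO_ITEM_END; ++n) {
		const struct hypo_item_udp *udp = pattern[n].spec;

		if (pattern[n].type == HYPO_ITEM_UDP && udp != NULL)
			*dst_port_out = udp->dst_port;
	}
	return n;
}

/* The application builds the rule on the stack and drops it afterwards. */
static int
hypo_app_add_rule(uint16_t port, uint16_t *stored)
{
	const struct hypo_item_udp udp = { .dst_port = port };
	const struct hypo_item pattern[] = {
		{ .type = HYPO_ITEM_ETH, .spec = NULL },
		{ .type = HYPO_ITEM_UDP, .spec = &udp },
		{ .type = HYPO_ITEM_END, .spec = NULL },
	};

	return hypo_create(pattern, stored);
}
```

Once hypo_app_add_rule() returns, udp and pattern[] are gone, yet the value
copied by the hook remains valid.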

For simple applications such as testpmd, the generic format is probably
enough. More complex and existing applications such as ovs-dpdk may rather
choose to keep using their internal format that already fits their needs,
partially duplicating this information in rte_flow_attr and
rte_flow_item/rte_flow_action lists would waste memory. The conversion in
this case should only be performed when creating/validating flow rules.

In short, I fail to see any downside with maintaining struct rte_flow opaque
to applications.

Best regards,

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (21 preceding siblings ...)
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 22/22] app/testpmd: add queue " Adrien Mazarguil
@ 2016-11-21  9:23     ` Nélio Laranjeiro
  2016-11-28 10:03     ` Pei, Yulong
                       ` (2 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Nélio Laranjeiro @ 2016-11-21  9:23 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz

Hi,

I found some small issues:

 - In rte_flow_error_set(), *cause argument should be const to avoid a
   compilation error when we implement it.

 - In port_flow_create(), the flow is not stored in the pf structure,
   which ends up passing a null pointer to the port_flow_destroy()
   function.
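
The first point can be sketched as follows, using a reduced stand-in for the
error structure. The hypo_ names are invented for this example, and
returning the negated code is an assumption of this sketch, not necessarily
what the helper in the series does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the error structure, reduced to two fields. */
struct hypo_flow_error {
	const void *cause;   /* Object responsible for the error. */
	const char *message; /* Human-readable description. */
};

/*
 * With a const-qualified cause, a PMD can point at an item of the
 * (const) application-provided rule without casting const away.
 */
static int
hypo_flow_error_set(struct hypo_flow_error *error, int code,
		    const void *cause, const char *message)
{
	assert(message != NULL);
	if (error != NULL) {
		error->cause = cause;
		error->message = message;
	}
	return -code;
}
```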

Regards,

-- 
Nélio Laranjeiro
6WIND


* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (22 preceding siblings ...)
  2016-11-21  9:23     ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Nélio Laranjeiro
@ 2016-11-28 10:03     ` Pei, Yulong
  2016-12-01  8:39       ` Adrien Mazarguil
  2016-12-02 16:58     ` Ferruh Yigit
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
  25 siblings, 1 reply; 262+ messages in thread
From: Pei, Yulong @ 2016-11-28 10:03 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz, Xing, Beilei

Hi Adrien,

I assume you have already tested your patchset. Do you have any automated test scripts that can be shared for validation, since there is no testpmd flow command documentation yet?

Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
Sent: Thursday, November 17, 2016 12:23 AM
To: dev@dpdk.org
Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz <olivier.matz@6wind.com>
Subject: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)

As previously discussed in RFC v1 [1], RFC v2 [2], with changes described in [3] (also pasted below), here is the first non-draft series for this new API.

Its capabilities are so generic that its name had to be vague. It may be called "Generic flow API" or "Generic flow interface" (possibly shortened as "GFI") to refer to the name of the new filter type, or "rte_flow" from the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for all PMDs to expose a common filtering/classification interface, it may eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect
  struct rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
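To illustrate the range semantics of the new "last" field mentioned above, here is a minimal standalone sketch. It is not code from the patch: range_match(), ipv4() and range_match_example() are hypothetical helpers, and addresses are assumed to be host-order 32-bit integers for readability. A PMD is free to implement ranges differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of rte_flow): returns 1 when the masked
 * value lies in the inclusive interval [(spec & mask), (last & mask)]. */
static int
range_match(uint32_t value, uint32_t spec, uint32_t last, uint32_t mask)
{
	uint32_t lo = spec & mask;
	uint32_t hi = last & mask;
	uint32_t v = value & mask;

	return v >= lo && v <= hi;
}

/* Build a host-order IPv4 address from its four bytes. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Usage: spec 10.1.2.3, last 10.3.4.5, mask 255.255.0.0 (the example
 * given in the struct rte_flow_item documentation of patch 01/22). */
void
range_match_example(void)
{
	uint32_t spec = ipv4(10, 1, 2, 3);
	uint32_t last = ipv4(10, 3, 4, 5);
	uint32_t mask = ipv4(255, 255, 0, 0);

	/* Both ends of the effective range 10.1.0.0 - 10.3.255.255 match. */
	assert(range_match(ipv4(10, 1, 0, 0), spec, last, mask));
	assert(range_match(ipv4(10, 3, 255, 255), spec, last, mask));
	/* Addresses just outside the range do not. */
	assert(!range_match(ipv4(10, 0, 255, 255), spec, last, mask));
	assert(!range_match(ipv4(10, 4, 0, 0), spec, last, mask));
}
```

Because the mask is applied before the comparison, the low 16 bits of spec and last play no part in the result, which is why the effective range is wider than the two addresses suggest.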

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
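The split between public wrappers and PMD callbacks listed above can be sketched as follows. This is a standalone illustration only: struct demo_flow_ops, demo_flow_flush() and demo_pmd_flush() are hypothetical stand-ins, not the definitions from rte_flow.c or rte_flow_driver.h, and error reporting through struct rte_flow_error is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_flow_ops: a table of optional
 * driver callbacks. */
struct demo_flow_ops {
	int (*flush)(void *dev);
};

/* Hypothetical public wrapper: dispatch to the driver callback when it
 * exists, otherwise report that the operation is not supported. */
static int
demo_flow_flush(const struct demo_flow_ops *ops, void *dev)
{
	if (ops == NULL || ops->flush == NULL)
		return -ENOTSUP;
	return ops->flush(dev);
}

/* Example driver callback; dev points to an int counting the calls. */
static int
demo_pmd_flush(void *dev)
{
	++*(int *)dev;
	return 0;
}

/* Usage: one driver implements flush, another leaves it unset. */
void
demo_flow_ops_example(void)
{
	struct demo_flow_ops with_flush = { .flush = demo_pmd_flush };
	struct demo_flow_ops without_flush = { .flush = NULL };
	int calls = 0;

	assert(demo_flow_flush(&with_flush, &calls) == 0);
	assert(calls == 1);
	assert(demo_flow_flush(&without_flush, &calls) == -ENOTSUP);
}
```

Keeping the NULL checks in the wrapper means a driver only has to fill in the callbacks it actually supports.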

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (22):
  ethdev: introduce generic flow API
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command

 MAINTAINERS                            |    4 +
 app/test-pmd/Makefile                  |    1 +
 app/test-pmd/cmdline.c                 |   32 +
 app/test-pmd/cmdline_flow.c            | 2581 +++++++++++++++++++++++++++
 app/test-pmd/config.c                  |  484 +++++
 app/test-pmd/csumonly.c                |    1 +
 app/test-pmd/flowgen.c                 |    1 +
 app/test-pmd/icmpecho.c                |    1 +
 app/test-pmd/ieee1588fwd.c             |    1 +
 app/test-pmd/iofwd.c                   |    1 +
 app/test-pmd/macfwd.c                  |    1 +
 app/test-pmd/macswap.c                 |    1 +
 app/test-pmd/parameters.c              |    1 +
 app/test-pmd/rxonly.c                  |    1 +
 app/test-pmd/testpmd.c                 |    6 +
 app/test-pmd/testpmd.h                 |   27 +
 app/test-pmd/txonly.c                  |    1 +
 lib/librte_cmdline/cmdline_parse.c     |   67 +-
 lib/librte_cmdline/cmdline_parse.h     |   21 +
 lib/librte_ether/Makefile              |    3 +
 lib/librte_ether/rte_eth_ctrl.h        |    1 +
 lib/librte_ether/rte_ether_version.map |   10 +
 lib/librte_ether/rte_flow.c            |  159 ++
 lib/librte_ether/rte_flow.h            |  947 ++++++++++
 lib/librte_ether/rte_flow_driver.h     |  177 ++
 25 files changed, 4521 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

--
2.1.4


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
  2016-11-18  6:36       ` Xing, Beilei
@ 2016-11-30 17:47       ` Kevin Traynor
  2016-12-01  8:36         ` Adrien Mazarguil
  2016-12-08  9:00       ` Xing, Beilei
  2 siblings, 1 reply; 262+ messages in thread
From: Kevin Traynor @ 2016-11-30 17:47 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandra

Hi Adrien,

On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
> This new API supersedes all the legacy filter types described in
> rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> PMDs to process and validate flow rules.
> 
> Benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> Existing filter types will be deprecated and removed in the near future.

I'd suggest adding a deprecation notice to deprecation.rst, ideally with
a target release.

> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  MAINTAINERS                            |   4 +
>  lib/librte_ether/Makefile              |   3 +
>  lib/librte_ether/rte_eth_ctrl.h        |   1 +
>  lib/librte_ether/rte_ether_version.map |  10 +
>  lib/librte_ether/rte_flow.c            | 159 +++++
>  lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_flow_driver.h     | 177 ++++++
>  7 files changed, 1301 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d6bb8f8..3b46630 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
>  F: lib/librte_ether/
>  F: scripts/test-null.sh
>  
> +Generic flow API
> +M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> +F: lib/librte_ether/rte_flow*
> +
>  Crypto API
>  M: Declan Doherty <declan.doherty@intel.com>
>  F: lib/librte_cryptodev/
> diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
> index efe1e5f..9335361 100644
> --- a/lib/librte_ether/Makefile
> +++ b/lib/librte_ether/Makefile
> @@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
>  LIBABIVER := 5
>  
>  SRCS-y += rte_ethdev.c
> +SRCS-y += rte_flow.c
>  
>  #
>  # Export include files
> @@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
>  SYMLINK-y-include += rte_ethdev.h
>  SYMLINK-y-include += rte_eth_ctrl.h
>  SYMLINK-y-include += rte_dev_info.h
> +SYMLINK-y-include += rte_flow.h
> +SYMLINK-y-include += rte_flow_driver.h
>  
>  # this lib depends upon:
>  DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
> diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
> index fe80eb0..8386904 100644
> --- a/lib/librte_ether/rte_eth_ctrl.h
> +++ b/lib/librte_ether/rte_eth_ctrl.h
> @@ -99,6 +99,7 @@ enum rte_filter_type {
>  	RTE_ETH_FILTER_FDIR,
>  	RTE_ETH_FILTER_HASH,
>  	RTE_ETH_FILTER_L2_TUNNEL,
> +	RTE_ETH_FILTER_GENERIC,
>  	RTE_ETH_FILTER_MAX
>  };
>  
> diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
> index 72be66d..b5d2547 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -147,3 +147,13 @@ DPDK_16.11 {
>  	rte_eth_dev_pci_remove;
>  
>  } DPDK_16.07;
> +
> +DPDK_17.02 {
> +	global:
> +
> +	rte_flow_validate;
> +	rte_flow_create;
> +	rte_flow_destroy;
> +	rte_flow_query;
> +
> +} DPDK_16.11;
> diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> new file mode 100644
> index 0000000..064963d
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow.c
> @@ -0,0 +1,159 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.

There's Mellanox copyright but you are the only signed-off-by - is that
right?

> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include "rte_ethdev.h"
> +#include "rte_flow_driver.h"
> +#include "rte_flow.h"
> +
> +/* Get generic flow operations structure from a port. */
> +const struct rte_flow_ops *
> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops;
> +	int code;
> +
> +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> +		code = ENODEV;
> +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> +			  dev->dev_ops->filter_ctrl(dev,
> +						    RTE_ETH_FILTER_GENERIC,
> +						    RTE_ETH_FILTER_GET,
> +						    &ops) ||
> +			  !ops))
> +		code = ENOTSUP;
> +	else
> +		return ops;
> +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(code));
> +	return NULL;
> +}
> +

Is it expected that the application or pmd will provide locking between
these functions if required? I think it's going to have to be the app.
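If it does fall to the application, the simplest form would probably be a lock around every flow call. A minimal sketch, assuming nothing about the final API: demo_flow_create() is a hypothetical stand-in for a flow operation that is not thread-safe, and a single global mutex is used to keep the example short (an application might prefer one lock per port).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a flow operation that is not thread-safe;
 * it only bumps a counter and returns the new rule count. */
static int rules_created;

static int
demo_flow_create(void)
{
	return ++rules_created;
}

/* Lock owned by the application, not by the library or the PMD. */
static pthread_mutex_t flow_lock = PTHREAD_MUTEX_INITIALIZER;

/* Application-side wrapper serializing access to the flow operation. */
static int
locked_flow_create(void)
{
	int ret;

	pthread_mutex_lock(&flow_lock);
	ret = demo_flow_create();
	pthread_mutex_unlock(&flow_lock);
	return ret;
}

/* Usage: every caller goes through the wrapper, never the raw call. */
void
locked_flow_example(void)
{
	int first = locked_flow_create();

	assert(locked_flow_create() == first + 1);
}
```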

> +/* Check whether a flow rule can be created on a given port. */
> +int
> +rte_flow_validate(uint8_t port_id,
> +		  const struct rte_flow_attr *attr,
> +		  const struct rte_flow_item pattern[],
> +		  const struct rte_flow_action actions[],
> +		  struct rte_flow_error *error)
> +{
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->validate))
> +		return ops->validate(dev, attr, pattern, actions, error);
> +	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOTSUP));
> +	return -rte_errno;
> +}
> +
> +/* Create a flow rule on a given port. */
> +struct rte_flow *
> +rte_flow_create(uint8_t port_id,
> +		const struct rte_flow_attr *attr,
> +		const struct rte_flow_item pattern[],
> +		const struct rte_flow_action actions[],
> +		struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return NULL;
> +	if (likely(!!ops->create))
> +		return ops->create(dev, attr, pattern, actions, error);
> +	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOTSUP));
> +	return NULL;
> +}
> +
> +/* Destroy a flow rule on a given port. */
> +int
> +rte_flow_destroy(uint8_t port_id,
> +		 struct rte_flow *flow,
> +		 struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->destroy))
> +		return ops->destroy(dev, flow, error);
> +	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOTSUP));
> +	return -rte_errno;
> +}
> +
> +/* Destroy all flow rules associated with a port. */
> +int
> +rte_flow_flush(uint8_t port_id,
> +	       struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->flush))
> +		return ops->flush(dev, error);
> +	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOTSUP));
> +	return -rte_errno;
> +}
> +
> +/* Query an existing flow rule. */
> +int
> +rte_flow_query(uint8_t port_id,
> +	       struct rte_flow *flow,
> +	       enum rte_flow_action_type action,
> +	       void *data,
> +	       struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (!ops)
> +		return -rte_errno;
> +	if (likely(!!ops->query))
> +		return ops->query(dev, flow, action, data, error);
> +	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOTSUP));
> +	return -rte_errno;
> +}
> diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
> new file mode 100644
> index 0000000..211f307
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow.h
> @@ -0,0 +1,947 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef RTE_FLOW_H_
> +#define RTE_FLOW_H_
> +
> +/**
> + * @file
> + * RTE generic flow API
> + *
> + * This interface provides the ability to program packet matching and
> + * associated actions in hardware through flow rules.
> + */
> +
> +#include <rte_arp.h>
> +#include <rte_ether.h>
> +#include <rte_icmp.h>
> +#include <rte_ip.h>
> +#include <rte_sctp.h>
> +#include <rte_tcp.h>
> +#include <rte_udp.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Flow rule attributes.
> + *
> + * Priorities are set on two levels: per group and per rule within groups.
> + *
> + * Lower values denote higher priority, the highest priority for both levels
> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> + * rule with priority 8 in group 0.
> + *
> + * Although optional, applications are encouraged to group similar rules as
> + * much as possible to fully take advantage of hardware capabilities
> + * (e.g. optimized matching) and work around limitations (e.g. a single
> + * pattern type possibly allowed in a given group).
> + *
> + * Group and priority levels are arbitrary and up to the application, they
> + * do not need to be contiguous nor start from 0, however the maximum number
> + * varies between devices and may be affected by existing flow rules.
> + *
> + * If a packet is matched by several rules of a given group for a given
> + * priority level, the outcome is undefined. It can take any path, may be
> + * duplicated or even cause unrecoverable errors.

I get what you are trying to do here wrt supporting multiple
pmds/hardware implementations and it's a good idea to keep it flexible.

Given that the outcome is undefined, it would be nice that the
application has a way of finding the specific effects for verification
and debugging.
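For reference, the ordering described in the quoted comment (group first, then priority within the group, lower values first) can be written as a comparison function. This is only a sketch: struct demo_attr and demo_attr_cmp() are hypothetical and mirror just the two ordering fields of struct rte_flow_attr.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the ordering fields of struct rte_flow_attr. */
struct demo_attr {
	uint32_t group;
	uint32_t priority;
};

/* Returns a negative value when a is matched before b, a positive value
 * when it is matched after, and 0 when both rules share the same group
 * and priority level. */
static int
demo_attr_cmp(const struct demo_attr *a, const struct demo_attr *b)
{
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}

/* Usage: the case spelled out in the documentation comment. */
void
demo_attr_example(void)
{
	struct demo_attr g8p0 = { .group = 8, .priority = 0 };
	struct demo_attr g0p8 = { .group = 0, .priority = 8 };

	/* Priority 0 in group 8 is matched after priority 8 in group 0. */
	assert(demo_attr_cmp(&g8p0, &g0p8) > 0);
	assert(demo_attr_cmp(&g0p8, &g8p0) < 0);
}
```

A result of 0 marks exactly the situation where overlapping patterns lead to the undefined outcome, so an application could use such a check to flag rule pairs that need a closer look.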

> + *
> + * Note that support for more than a single group and priority level is not
> + * guaranteed.
> + *
> + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
> + *
> + * Several pattern items and actions are valid and can be used in both
> + * directions. Those valid for only one direction are described as such.
> + *
> + * Specifying both directions at once is not recommended but may be valid in
> + * some cases, such as incrementing the same counter twice.
> + *
> + * Not specifying any direction is currently an error.
> + */
> +struct rte_flow_attr {
> +	uint32_t group; /**< Priority group. */
> +	uint32_t priority; /**< Priority level within group. */
> +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
> +	uint32_t egress:1; /**< Rule applies to egress traffic. */
> +	uint32_t reserved:30; /**< Reserved, must be zero. */
> +};
> +
> +/**
> + * Matching pattern item types.
> + *
> + * Items are arranged in a list to form a matching pattern for packets.
> + * They fall in two categories:
> + *
> + * - Protocol matching (ANY, RAW, ETH, IPV4, IPV6, ICMP, UDP, TCP, SCTP,
> + *   VXLAN and so on), usually associated with a specification
> + *   structure. These must be stacked in the same order as the protocol
> + *   layers to match, starting from L2.
> + *
> + * - Affecting how the pattern is processed (END, VOID, INVERT, PF, VF, PORT
> + *   and so on), often without a specification structure. Since they are
> + *   meta data that does not match packet contents, these can be specified
> + *   anywhere within item lists without affecting the protocol matching
> + *   items.
> + *
> + * See the description of individual types for more information. Those
> + * marked with [META] fall into the second category.
> + */
> +enum rte_flow_item_type {
> +	/**
> +	 * [META]
> +	 *
> +	 * End marker for item lists. Prevents further processing of items,
> +	 * thereby ending the pattern.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_END,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Used as a placeholder for convenience. It is ignored and simply
> +	 * discarded by PMDs.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VOID,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Inverted matching, i.e. process packets that do not match the
> +	 * pattern.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_INVERT,
> +
> +	/**
> +	 * Matches any protocol in place of the current layer, a single ANY
> +	 * may also stand for several protocol layers.
> +	 *
> +	 * See struct rte_flow_item_any.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ANY,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets addressed to the physical function of the device.
> +	 *
> +	 * If the underlying device function differs from the one that would
> +	 * normally receive the matched traffic, specifying this item
> +	 * prevents it from reaching that device unless the flow rule
> +	 * contains a PF action. Packets are not duplicated between device
> +	 * instances by default.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_PF,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets addressed to a virtual function ID of the device.
> +	 *
> +	 * If the underlying device function differs from the one that would
> +	 * normally receive the matched traffic, specifying this item
> +	 * prevents it from reaching that device unless the flow rule
> +	 * contains a VF action. Packets are not duplicated between device
> +	 * instances by default.
> +	 *
> +	 * See struct rte_flow_item_vf.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VF,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets coming from the specified physical port of the
> +	 * underlying device.
> +	 *
> +	 * The first PORT item overrides the physical port normally
> +	 * associated with the specified DPDK input port (port_id). This
> +	 * item can be provided several times to match additional physical
> +	 * ports.
> +	 *
> +	 * See struct rte_flow_item_port.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_PORT,
> +
> +	/**
> +	 * Matches a byte string of a given length at a given offset.
> +	 *
> +	 * See struct rte_flow_item_raw.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_RAW,
> +
> +	/**
> +	 * Matches an Ethernet header.
> +	 *
> +	 * See struct rte_flow_item_eth.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ETH,
> +
> +	/**
> +	 * Matches an 802.1Q/ad VLAN tag.
> +	 *
> +	 * See struct rte_flow_item_vlan.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VLAN,
> +
> +	/**
> +	 * Matches an IPv4 header.
> +	 *
> +	 * See struct rte_flow_item_ipv4.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_IPV4,
> +
> +	/**
> +	 * Matches an IPv6 header.
> +	 *
> +	 * See struct rte_flow_item_ipv6.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_IPV6,
> +
> +	/**
> +	 * Matches an ICMP header.
> +	 *
> +	 * See struct rte_flow_item_icmp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ICMP,
> +
> +	/**
> +	 * Matches a UDP header.
> +	 *
> +	 * See struct rte_flow_item_udp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_UDP,
> +
> +	/**
> +	 * Matches a TCP header.
> +	 *
> +	 * See struct rte_flow_item_tcp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_TCP,
> +
> +	/**
> +	 * Matches a SCTP header.
> +	 *
> +	 * See struct rte_flow_item_sctp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_SCTP,
> +
> +	/**
> +	 * Matches a VXLAN header.
> +	 *
> +	 * See struct rte_flow_item_vxlan.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VXLAN,
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ANY
> + *
> + * Matches any protocol in place of the current layer, a single ANY may also
> + * stand for several protocol layers.
> + *
> + * This is usually specified as the first pattern item when looking for a
> + * protocol anywhere in a packet.
> + *
> + * A maximum value of 0 requests matching any number of protocol layers
> + * above or equal to the minimum value, a maximum value lower than the
> + * minimum one is otherwise invalid.
> + *
> + * This type does not work with a range (struct rte_flow_item.last).
> + */
> +struct rte_flow_item_any {
> +	uint16_t min; /**< Minimum number of layers covered. */
> +	uint16_t max; /**< Maximum number of layers covered, 0 for infinity. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VF
> + *
> + * Matches packets addressed to a virtual function ID of the device.
> + *
> + * If the underlying device function differs from the one that would
> + * normally receive the matched traffic, specifying this item prevents it
> + * from reaching that device unless the flow rule contains a VF
> + * action. Packets are not duplicated between device instances by default.
> + *
> + * - Likely to return an error or never match any traffic if this causes a
> + *   VF device to match traffic addressed to a different VF.
> + * - Can be specified multiple times to match traffic addressed to several
> + *   specific VFs.
> + * - Can be combined with a PF item to match both PF and VF traffic.
> + *
> + * A zeroed mask can be used to match any VF.

Can you refer explicitly to the id field here?
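What a zeroed mask means for the id field can be shown with a small standalone sketch. demo_vf_match() is a hypothetical helper, and treating the mask as a plain bit-mask over id is an assumption based on the struct rte_flow_item description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compare the VF ID of a packet against the id
 * field of the spec, keeping only the bits selected by the mask. */
static int
demo_vf_match(uint32_t pkt_vf, uint32_t spec_id, uint32_t mask_id)
{
	return (pkt_vf & mask_id) == (spec_id & mask_id);
}

/* Usage: full mask matches one VF, zeroed mask matches any VF. */
void
demo_vf_example(void)
{
	assert(demo_vf_match(3, 3, UINT32_MAX));
	assert(!demo_vf_match(4, 3, UINT32_MAX));
	assert(demo_vf_match(4, 3, 0));
}
```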

> + */
> +struct rte_flow_item_vf {
> +	uint32_t id; /**< Destination VF ID. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_PORT
> + *
> + * Matches packets coming from the specified physical port of the underlying
> + * device.
> + *
> + * The first PORT item overrides the physical port normally associated with
> + * the specified DPDK input port (port_id). This item can be provided
> + * several times to match additional physical ports.
> + *
> + * Note that physical ports are not necessarily tied to DPDK input ports
> + * (port_id) when those are not under DPDK control. Possible values are
> + * specific to each device, they are not necessarily indexed from zero and
> + * may not be contiguous.
> + *
> + * As a device property, the list of allowed values as well as the value
> + * associated with a port_id should be retrieved by other means.
> + *
> + * A zeroed mask can be used to match any port index.
> + */
> +struct rte_flow_item_port {
> +	uint32_t index; /**< Physical port index. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_RAW
> + *
> + * Matches a byte string of a given length at a given offset.
> + *
> + * Offset is either absolute (using the start of the packet) or relative to
> + * the end of the previous matched item in the stack, in which case negative
> + * values are allowed.
> + *
> + * If search is enabled, offset is used as the starting point. The search
> + * area can be delimited by setting limit to a nonzero value, which is the
> + * maximum number of bytes after offset where the pattern may start.
> + *
> + * Matching a zero-length pattern is allowed, doing so resets the relative
> + * offset for subsequent items.
> + *
> + * This type does not work with a range (struct rte_flow_item.last).
> + */
> +struct rte_flow_item_raw {
> +	uint32_t relative:1; /**< Look for pattern after the previous item. */
> +	uint32_t search:1; /**< Search pattern from offset (see also limit). */
> +	uint32_t reserved:30; /**< Reserved, must be set to zero. */
> +	int32_t offset; /**< Absolute or relative offset for pattern. */
> +	uint16_t limit; /**< Search area limit for start of pattern. */
> +	uint16_t length; /**< Pattern length. */
> +	uint8_t pattern[]; /**< Byte string to look for. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ETH
> + *
> + * Matches an Ethernet header.
> + */
> +struct rte_flow_item_eth {
> +	struct ether_addr dst; /**< Destination MAC. */
> +	struct ether_addr src; /**< Source MAC. */
> +	unsigned int type; /**< EtherType. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VLAN
> + *
> + * Matches an 802.1Q/ad VLAN tag.
> + *
> + * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
> + * RTE_FLOW_ITEM_TYPE_VLAN.
> + */
> +struct rte_flow_item_vlan {
> +	uint16_t tpid; /**< Tag protocol identifier. */
> +	uint16_t tci; /**< Tag control information. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_IPV4
> + *
> + * Matches an IPv4 header.
> + *
> + * Note: IPv4 options are handled by dedicated pattern items.
> + */
> +struct rte_flow_item_ipv4 {
> +	struct ipv4_hdr hdr; /**< IPv4 header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_IPV6.
> + *
> + * Matches an IPv6 header.
> + *
> + * Note: IPv6 options are handled by dedicated pattern items.
> + */
> +struct rte_flow_item_ipv6 {
> +	struct ipv6_hdr hdr; /**< IPv6 header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ICMP.
> + *
> + * Matches an ICMP header.
> + */
> +struct rte_flow_item_icmp {
> +	struct icmp_hdr hdr; /**< ICMP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_UDP.
> + *
> + * Matches a UDP header.
> + */
> +struct rte_flow_item_udp {
> +	struct udp_hdr hdr; /**< UDP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_TCP.
> + *
> + * Matches a TCP header.
> + */
> +struct rte_flow_item_tcp {
> +	struct tcp_hdr hdr; /**< TCP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_SCTP.
> + *
> + * Matches a SCTP header.
> + */
> +struct rte_flow_item_sctp {
> +	struct sctp_hdr hdr; /**< SCTP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VXLAN.
> + *
> + * Matches a VXLAN header (RFC 7348).
> + */
> +struct rte_flow_item_vxlan {
> +	uint8_t flags; /**< Normally 0x08 (I flag). */
> +	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
> +	uint8_t vni[3]; /**< VXLAN identifier. */
> +	uint8_t rsvd1; /**< Reserved, normally 0x00. */
> +};
> +
> +/**
> + * Matching pattern item definition.
> + *
> + * A pattern is formed by stacking items starting from the lowest protocol
> + * layer to match. This stacking restriction does not apply to meta items
> + * which can be placed anywhere in the stack with no effect on the meaning
> + * of the resulting pattern.
> + *
> + * A stack is terminated by an END item.
> + *
> + * The spec field should be a valid pointer to a structure of the related
> + * item type. It may be set to NULL in many cases to use default values.
> + *
> + * Optionally, last can point to a structure of the same type to define an
> + * inclusive range. This is mostly supported by integer and address fields,
> + * may cause errors otherwise. Fields that do not support ranges must be set
> + * to the same value as their spec counterparts.
> + *
> + * By default all fields present in spec are considered relevant.* This

typo "*"

> + * behavior can be altered by providing a mask structure of the same type
> + * with applicable bits set to one. It can also be used to partially filter
> + * out specific fields (e.g. as an alternate mean to match ranges of IP
> + * addresses).
> + *
> + * Note this is a simple bit-mask applied before interpreting the contents
> + * of spec and last, which may yield unexpected results if not used
> + * carefully. For example, if for an IPv4 address field, spec provides
> + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
> + * effective range is 10.1.0.0 to 10.3.255.255.
> + *
> + * * The defaults for data-matching items such as IPv4 when mask is not
> + *   specified actually depend on the underlying implementation since only
> + *   recognized fields can be taken into account.
> + */
> +struct rte_flow_item {
> +	enum rte_flow_item_type type; /**< Item type. */
> +	const void *spec; /**< Pointer to item specification structure. */
> +	const void *last; /**< Defines an inclusive range (spec to last). */
> +	const void *mask; /**< Bit-mask applied to spec and last. */
> +};
> +
> +/**
> + * Action types.
> + *
> + * Each possible action is represented by a type. Some have associated
> + * configuration structures. Several actions combined in a list can be
> + * affected to a flow rule. That list is not ordered.
> + *
> + * They fall in three categories:
> + *
> + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> + *   processing matched packets by subsequent flow rules, unless overridden
> + *   with PASSTHRU.
> + *
> + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
> + *   for additional processing by subsequent flow rules.
> + *
> + * - Other non terminating meta actions that do not affect the fate of
> + *   packets (END, VOID, MARK, FLAG, COUNT).
> + *
> + * When several actions are combined in a flow rule, they should all have
> + * different types (e.g. dropping a packet twice is not possible). The
> + * defined behavior is for PMDs to only take into account the last action of
> + * a given type found in the list. PMDs still perform error checking on the
> + * entire list.

why do you define that the pmd will interpret multiple same type rules
in this way...would it not make more sense for the pmd to just return
EINVAL for an invalid set of rules? It seems more transparent for the
application.
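
To illustrate what I mean, here is a rough sketch of such a check. The
ex_* enum, struct and function names are made up for this example and are
not part of the patch; a real PMD would walk struct rte_flow_action
entries instead.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's action types, illustration only. */
enum ex_action_type {
	EX_ACTION_END,
	EX_ACTION_VOID,
	EX_ACTION_QUEUE,
	EX_ACTION_DROP,
	EX_ACTION_MAX,
};

struct ex_action {
	enum ex_action_type type;
	const void *conf;
};

/*
 * Walk an END-terminated list and reject it as soon as a type other than
 * VOID shows up twice, instead of silently keeping the last occurrence.
 */
static int
ex_check_duplicates(const struct ex_action actions[])
{
	unsigned char seen[EX_ACTION_MAX] = { 0 };
	size_t i;

	for (i = 0; actions[i].type != EX_ACTION_END; ++i) {
		unsigned int t = (unsigned int)actions[i].type;

		if (t >= EX_ACTION_MAX)
			return -EINVAL; /* unknown action type */
		if (t == EX_ACTION_VOID)
			continue; /* placeholders may repeat freely */
		if (seen[t])
			return -EINVAL; /* same type listed twice */
		seen[t] = 1;
	}
	return 0;
}
```

With something like this the application gets the error at validation
time rather than having to guess which of the duplicates was applied.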

> + *
> + * Note that PASSTHRU is the only action able to override a terminating
> + * rule.
> + */
> +enum rte_flow_action_type {
> +	/**
> +	 * [META]
> +	 *
> +	 * End marker for action lists. Prevents further processing of
> +	 * actions, thereby ending the list.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_END,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Used as a placeholder for convenience. It is ignored and simply
> +	 * discarded by PMDs.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_VOID,
> +
> +	/**
> +	 * Leaves packets up for additional processing by subsequent flow
> +	 * rules. This is the default when a rule does not contain a
> +	 * terminating action, but can be specified to force a rule to
> +	 * become non-terminating.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_PASSTHRU,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Attaches a 32 bit value to packets.
> +	 *
> +	 * See struct rte_flow_action_mark.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_MARK,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Flag packets. Similar to MARK but only affects ol_flags.
> +	 *
> +	 * Note: a distinctive flag must be defined for it.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_FLAG,
> +
> +	/**
> +	 * Assigns packets to a given queue index.
> +	 *
> +	 * See struct rte_flow_action_queue.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_QUEUE,
> +
> +	/**
> +	 * Drops packets.
> +	 *
> +	 * PASSTHRU overrides this action if both are specified.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_DROP,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Enables counters for this rule.
> +	 *
> +	 * These counters can be retrieved and reset through rte_flow_query(),
> +	 * see struct rte_flow_query_count.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_COUNT,
> +
> +	/**
> +	 * Duplicates packets to a given queue index.
> +	 *
> +	 * This is normally combined with QUEUE, however when used alone, it
> +	 * is actually similar to QUEUE + PASSTHRU.
> +	 *
> +	 * See struct rte_flow_action_dup.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_DUP,
> +
> +	/**
> +	 * Similar to QUEUE, except RSS is additionally performed on packets
> +	 * to spread them among several queues according to the provided
> +	 * parameters.
> +	 *
> +	 * See struct rte_flow_action_rss.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_RSS,
> +
> +	/**
> +	 * Redirects packets to the physical function (PF) of the current
> +	 * device.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_PF,
> +
> +	/**
> +	 * Redirects packets to the virtual function (VF) of the current
> +	 * device with the specified ID.
> +	 *
> +	 * See struct rte_flow_action_vf.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_VF,
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_MARK
> + *
> + * Attaches a 32 bit value to packets.
> + *
> + * This value is arbitrary and application-defined. For compatibility with
> + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
> + * also set in ol_flags.
> + */
> +struct rte_flow_action_mark {
> +	uint32_t id; /**< 32 bit value to return with packets. */
> +};

One use case I thought we would be able to do for OVS is classification
in hardware and the unique flow id is sent with the packet to software.
But in OVS the ufid is 128 bits, so it means we can't and there is still
the miniflow extract overhead. I'm not sure if there is a practical way
around this.

Sugesh (cc'd) has looked at this before and may be able to comment or
correct me.

> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_QUEUE
> + *
> + * Assign packets to a given queue index.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_queue {
> +	uint16_t index; /**< Queue index to use. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_COUNT (query)
> + *
> + * Query structure to retrieve and reset flow rule counters.
> + */
> +struct rte_flow_query_count {
> +	uint32_t reset:1; /**< Reset counters after query [in]. */
> +	uint32_t hits_set:1; /**< hits field is set [out]. */
> +	uint32_t bytes_set:1; /**< bytes field is set [out]. */
> +	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
> +	uint64_t hits; /**< Number of hits for this rule [out]. */
> +	uint64_t bytes; /**< Number of bytes through this rule [out]. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_DUP
> + *
> + * Duplicates packets to a given queue index.
> + *
> + * This is normally combined with QUEUE, however when used alone, it is
> + * actually similar to QUEUE + PASSTHRU.
> + *
> + * Non-terminating by default.
> + */
> +struct rte_flow_action_dup {
> +	uint16_t index; /**< Queue index to duplicate packets to. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_RSS
> + *
> + * Similar to QUEUE, except RSS is additionally performed on packets to
> + * spread them among several queues according to the provided parameters.
> + *
> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> + * however it conflicts with the MARK action as they share the same
> + * space. When both actions are specified, the RSS hash is discarded and
> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> + * structure should eventually evolve to store both.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_rss {
> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> +	uint16_t queues; /**< Number of entries in queue[]. */
> +	uint16_t queue[]; /**< Queues indices to use. */

I'd try and avoid queue and queues - someone will say "huh?" when
reading code. s/queues/num ?

> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_VF
> + *
> + * Redirects packets to a virtual function (VF) of the current device.
> + *
> + * Packets matched by a VF pattern item can be redirected to their original
> + * VF ID instead of the specified one. This parameter may not be available
> + * and is not guaranteed to work properly if the VF part is matched by a
> + * prior flow rule or if packets are not addressed to a VF in the first
> + * place.

Not clear what you mean by "not guaranteed to work if...". Please return
an error when this action is used if it is not going to work.

> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_vf {
> +	uint32_t original:1; /**< Use original VF ID if possible. */
> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> +	uint32_t id; /**< VF ID to redirect packets to. */
> +};
> +
> +/**
> + * Definition of a single action.
> + *
> + * A list of actions is terminated by a END action.
> + *
> + * For simple actions without a configuration structure, conf remains NULL.
> + */
> +struct rte_flow_action {
> +	enum rte_flow_action_type type; /**< Action type. */
> +	const void *conf; /**< Pointer to action configuration structure. */
> +};
> +
> +/**
> + * Opaque type returned after successfully creating a flow.
> + *
> + * This handle can be used to manage and query the related flow (e.g. to
> + * destroy it or retrieve counters).
> + */
> +struct rte_flow;
> +
> +/**
> + * Verbose error types.
> + *
> + * Most of them provide the type of the object referenced by struct
> + * rte_flow_error.cause.
> + */
> +enum rte_flow_error_type {
> +	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
> +	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
> +	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
> +	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
> +	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
> +	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
> +	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
> +	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
> +};
> +
> +/**
> + * Verbose error structure definition.
> + *
> + * This object is normally allocated by applications and set by PMDs, the
> + * message points to a constant string which does not need to be freed by
> + * the application, however its pointer can be considered valid only as long
> + * as its associated DPDK port remains configured. Closing the underlying
> + * device or unloading the PMD invalidates it.
> + *
> + * Both cause and message may be NULL regardless of the error type.
> + */
> +struct rte_flow_error {
> +	enum rte_flow_error_type type; /**< Cause field and error types. */
> +	const void *cause; /**< Object responsible for the error. */
> +	const char *message; /**< Human-readable error message. */
> +};
> +
> +/**
> + * Check whether a flow rule can be created on a given port.
> + *
> + * While this function has no effect on the target device, the flow rule is
> + * validated against its current configuration state and the returned value
> + * should be considered valid by the caller for that state only.
> + *
> + * The returned value is guaranteed to remain valid only as long as no
> + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
> + * the meantime and no device parameter affecting flow rules in any way are
> + * modified, due to possible collisions or resource limitations (although in
> + * such cases EINVAL should not be returned).
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 if flow rule is valid and can be created. A negative errno value
> + *   otherwise (rte_errno is also set), the following errors are defined:
> + *
> + *   -ENOSYS: underlying device does not support this functionality.
> + *
> + *   -EINVAL: unknown or invalid rule specification.
> + *
> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> + *   bit-masks are unsupported).
> + *
> + *   -EEXIST: collision with an existing rule.
> + *
> + *   -ENOMEM: not enough resources.
> + *
> + *   -EBUSY: action cannot be performed due to busy device resources, may
> + *   succeed if the affected queues or even the entire port are in a stopped
> + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
> + */
> +int
> +rte_flow_validate(uint8_t port_id,
> +		  const struct rte_flow_attr *attr,
> +		  const struct rte_flow_item pattern[],
> +		  const struct rte_flow_action actions[],
> +		  struct rte_flow_error *error);

Why not just use rte_flow_create() and get an error? Is it less
disruptive to do a validate and find the rule cannot be created, than
using a create directly?

> +
> +/**
> + * Create a flow rule on a given port.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   A valid handle in case of success, NULL otherwise and rte_errno is set
> + *   to the positive version of one of the error codes defined for
> + *   rte_flow_validate().
> + */
> +struct rte_flow *
> +rte_flow_create(uint8_t port_id,
> +		const struct rte_flow_attr *attr,
> +		const struct rte_flow_item pattern[],
> +		const struct rte_flow_action actions[],
> +		struct rte_flow_error *error);

General question - are these functions threadsafe? In the OVS example
you could have several threads wanting to create flow rules at the same
time for same or different ports.

> +
> +/**
> + * Destroy a flow rule on a given port.
> + *
> + * Failure to destroy a flow rule handle may occur when other flow rules
> + * depend on it, and destroying it would result in an inconsistent state.
> + *
> + * This function is only guaranteed to succeed if handles are destroyed in
> + * reverse order of their creation.

How can the application find this information out on error?

> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param flow
> + *   Flow rule handle to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_destroy(uint8_t port_id,
> +		 struct rte_flow *flow,
> +		 struct rte_flow_error *error);
> +
> +/**
> + * Destroy all flow rules associated with a port.
> + *
> + * In the unlikely event of failure, handles are still considered destroyed
> + * and no longer valid but the port must be assumed to be in an inconsistent
> + * state.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_flush(uint8_t port_id,
> +	       struct rte_flow_error *error);

rte_flow_destroy_all() would be more descriptive (but breaks your style)

> +
> +/**
> + * Query an existing flow rule.
> + *
> + * This function allows retrieving flow-specific data such as counters.
> + * Data is gathered by special actions which must be present in the flow
> + * rule definition.

re last sentence, it would be good if you can put a link to
RTE_FLOW_ACTION_TYPE_COUNT

> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param flow
> + *   Flow rule handle to query.
> + * @param action
> + *   Action type to query.
> + * @param[in, out] data
> + *   Pointer to storage for the associated query data type.

can this be anything other than rte_flow_query_count?

> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_query(uint8_t port_id,
> +	       struct rte_flow *flow,
> +	       enum rte_flow_action_type action,
> +	       void *data,
> +	       struct rte_flow_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif

I don't see a way to dump all the rules for a port. I think this is
necessary for debugging. You could have a look through dpif.h in OVS
and see how dpif_flow_dump_next() is used, it might be a good reference.

Also, it would be nice if there were an api that would allow a test
packet to be injected and traced for debugging - although I'm not
exactly sure how well it could be traced. For reference:
http://developers.redhat.com/blog/2016/10/12/tracing-packets-inside-open-vswitch/

thanks,
Kevin.

> +
> +#endif /* RTE_FLOW_H_ */
> diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
> new file mode 100644
> index 0000000..a88c621
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow_driver.h
> @@ -0,0 +1,177 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef RTE_FLOW_DRIVER_H_
> +#define RTE_FLOW_DRIVER_H_
> +
> +/**
> + * @file
> + * RTE generic flow API (driver side)
> + *
> + * This file provides implementation helpers for internal use by PMDs, they
> + * are not intended to be exposed to applications and are not subject to ABI
> + * versioning.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_errno.h>
> +#include "rte_flow.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Generic flow operations structure implemented and returned by PMDs.
> + *
> + * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
> + * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
> + * as the RTE_ETH_FILTER_GET filter operation.
> + *
> + * If successful, this operation must result in a pointer to a PMD-specific
> + * struct rte_flow_ops written to the argument address as described below:
> + *
> + *  // PMD filter_ctrl callback
> + *
> + *  static const struct rte_flow_ops pmd_flow_ops = { ... };
> + *
> + *  switch (filter_type) {
> + *  case RTE_ETH_FILTER_GENERIC:
> + *      if (filter_op != RTE_ETH_FILTER_GET)
> + *          return -EINVAL;
> + *      *(const void **)arg = &pmd_flow_ops;
> + *      return 0;
> + *  }
> + *
> + * See also rte_flow_ops_get().
> + *
> + * These callback functions are not supposed to be used by applications
> + * directly, which must rely on the API defined in rte_flow.h.
> + *
> + * Public-facing wrapper functions perform a few consistency checks so that
> + * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
> + * callbacks otherwise only differ by their first argument (with port ID
> + * already resolved to a pointer to struct rte_eth_dev).
> + */
> +struct rte_flow_ops {
> +	/** See rte_flow_validate(). */
> +	int (*validate)
> +		(struct rte_eth_dev *,
> +		 const struct rte_flow_attr *,
> +		 const struct rte_flow_item [],
> +		 const struct rte_flow_action [],
> +		 struct rte_flow_error *);
> +	/** See rte_flow_create(). */
> +	struct rte_flow *(*create)
> +		(struct rte_eth_dev *,
> +		 const struct rte_flow_attr *,
> +		 const struct rte_flow_item [],
> +		 const struct rte_flow_action [],
> +		 struct rte_flow_error *);
> +	/** See rte_flow_destroy(). */
> +	int (*destroy)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow *,
> +		 struct rte_flow_error *);
> +	/** See rte_flow_flush(). */
> +	int (*flush)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow_error *);
> +	/** See rte_flow_query(). */
> +	int (*query)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow *,
> +		 enum rte_flow_action_type,
> +		 void *,
> +		 struct rte_flow_error *);
> +};
> +
> +/**
> + * Initialize generic flow error structure.
> + *
> + * This function also sets rte_errno to a given value.
> + *
> + * @param[out] error
> + *   Pointer to flow error structure (may be NULL).
> + * @param code
> + *   Related error code (rte_errno).
> + * @param type
> + *   Cause field and error types.
> + * @param cause
> + *   Object responsible for the error.
> + * @param message
> + *   Human-readable error message.
> + *
> + * @return
> + *   Pointer to flow error structure.
> + */
> +static inline struct rte_flow_error *
> +rte_flow_error_set(struct rte_flow_error *error,
> +		   int code,
> +		   enum rte_flow_error_type type,
> +		   void *cause,
> +		   const char *message)
> +{
> +	if (error) {
> +		*error = (struct rte_flow_error){
> +			.type = type,
> +			.cause = cause,
> +			.message = message,
> +		};
> +	}
> +	rte_errno = code;
> +	return error;
> +}
> +
> +/**
> + * Get generic flow operations structure from a port.
> + *
> + * @param port_id
> + *   Port identifier to query.
> + * @param[out] error
> + *   Pointer to flow error structure.
> + *
> + * @return
> + *   The flow operations structure associated with port_id, NULL in case of
> + *   error, in which case rte_errno is set and the error structure contains
> + *   additional details.
> + */
> +const struct rte_flow_ops *
> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_FLOW_DRIVER_H_ */
> 


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-30 17:47       ` Kevin Traynor
@ 2016-12-01  8:36         ` Adrien Mazarguil
  2016-12-02 21:06           ` Kevin Traynor
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-01  8:36 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandra

Hi Kevin,

On Wed, Nov 30, 2016 at 05:47:17PM +0000, Kevin Traynor wrote:
> Hi Adrien,
> 
> On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
> > This new API supersedes all the legacy filter types described in
> > rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> > PMDs to process and validate flow rules.
> > 
> > Benefits:
> > 
> > - A unified API is easier to program for, applications do not have to be
> >   written for a specific filter type which may or may not be supported by
> >   the underlying device.
> > 
> > - The behavior of a flow rule is the same regardless of the underlying
> >   device, applications do not need to be aware of hardware quirks.
> > 
> > - Extensible by design, API/ABI breakage should rarely occur if at all.
> > 
> > - Documentation is self-standing, no need to look up elsewhere.
> > 
> > Existing filter types will be deprecated and removed in the near future.
> 
> I'd suggest to add a deprecation notice to deprecation.rst, ideally with
> a target release.

Will do, not sure about the target release though. It seems a bit early
since no PMD really supports this API yet.

[...]
> > diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> > new file mode 100644
> > index 0000000..064963d
> > --- /dev/null
> > +++ b/lib/librte_ether/rte_flow.c
> > @@ -0,0 +1,159 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright 2016 6WIND S.A.
> > + *   Copyright 2016 Mellanox.
> 
> There's Mellanox copyright but you are the only signed-off-by - is that
> right?

Yes, I'm the primary maintainer for Mellanox PMDs and this API was designed
on their behalf to expose several features from mlx4/mlx5 as the existing
filter types had too many limitations.

[...]
> > +/* Get generic flow operations structure from a port. */
> > +const struct rte_flow_ops *
> > +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> > +{
> > +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > +	const struct rte_flow_ops *ops;
> > +	int code;
> > +
> > +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> > +		code = ENODEV;
> > +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> > +			  dev->dev_ops->filter_ctrl(dev,
> > +						    RTE_ETH_FILTER_GENERIC,
> > +						    RTE_ETH_FILTER_GET,
> > +						    &ops) ||
> > +			  !ops))
> > +		code = ENOTSUP;
> > +	else
> > +		return ops;
> > +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> > +			   NULL, rte_strerror(code));
> > +	return NULL;
> > +}
> > +
> 
> Is it expected that the application or pmd will provide locking between
> these functions if required? I think it's going to have to be the app.

Locking is indeed expected to be performed by applications. This API only
documents places where locking would make sense if necessary and expected
behavior.

Like all control path APIs, this one assumes a single control thread.
Applications must take the necessary precautions.
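
For applications with several control threads, a minimal sketch of what
"necessary precautions" could look like is below. The ex_* names are
hypothetical, and ex_flow_create_unlocked() is only a stand-in for a
control path call such as rte_flow_create(); a single lock shared by all
ports is assumed for simplicity.

```c
#include <pthread.h>

/* Hypothetical stand-in for a control path call, illustration only. */
static int ex_rules_created;

static int
ex_flow_create_unlocked(int port_id)
{
	(void)port_id;
	return ++ex_rules_created; /* pretend handle: running rule count */
}

/* Lock owned by the application, not by the PMD or the flow API. */
static pthread_mutex_t ex_flow_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize every flow rule operation issued by the application. */
static int
ex_flow_create(int port_id)
{
	int ret;

	pthread_mutex_lock(&ex_flow_lock);
	ret = ex_flow_create_unlocked(port_id);
	pthread_mutex_unlock(&ex_flow_lock);
	return ret;
}
```

A per-port lock would work as well if rules on different ports are known
not to interact; that choice is left to the application.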

[...]
> > +/**
> > + * Flow rule attributes.
> > + *
> > + * Priorities are set on two levels: per group and per rule within groups.
> > + *
> > + * Lower values denote higher priority, the highest priority for both levels
> > + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> > + * rule with priority 8 in group 0.
> > + *
> > + * Although optional, applications are encouraged to group similar rules as
> > + * much as possible to fully take advantage of hardware capabilities
> > + * (e.g. optimized matching) and work around limitations (e.g. a single
> > + * pattern type possibly allowed in a given group).
> > + *
> > + * Group and priority levels are arbitrary and up to the application, they
> > + * do not need to be contiguous nor start from 0, however the maximum number
> > + * varies between devices and may be affected by existing flow rules.
> > + *
> > + * If a packet is matched by several rules of a given group for a given
> > + * priority level, the outcome is undefined. It can take any path, may be
> > + * duplicated or even cause unrecoverable errors.
> 
> I get what you are trying to do here wrt supporting multiple
> pmds/hardware implementations and it's a good idea to keep it flexible.
> 
> Given that the outcome is undefined, it would be nice that the
> application has a way of finding the specific effects for verification
> and debugging.

Right, however it was deemed a bit difficult to manage in many cases hence
the vagueness.

For example, suppose two rules with the same group and priority, one
matching any IPv4 header, the other one any UDP header:

- TCPv4 packets => rule #1.
- UDPv6 packets => rule #2.
- UDPv4 packets => both?

That last one is perhaps invalid. However, checking that unspecified
protocol combinations do not overlap is expensive and may miss corner cases.
Even assuming this is not an issue, what if the application guarantees that
no UDPv4 packets can ever hit that rule?

Suggestions are welcome though, perhaps we can refine the description.
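
To make the overlap concrete, the example above can be modeled as follows.
This is only an illustration with made-up ex_* types; it says nothing about
how a PMD would detect or resolve the overlap.

```c
#include <stdbool.h>

/* Hypothetical, simplified packet description for illustration only. */
enum ex_l3 { EX_L3_IPV4, EX_L3_IPV6 };
enum ex_l4 { EX_L4_TCP, EX_L4_UDP };

struct ex_pkt {
	enum ex_l3 l3;
	enum ex_l4 l4;
};

/* Rule #1: any IPv4 header. */
static bool
ex_rule1_matches(const struct ex_pkt *p)
{
	return p->l3 == EX_L3_IPV4;
}

/* Rule #2: any UDP header. */
static bool
ex_rule2_matches(const struct ex_pkt *p)
{
	return p->l4 == EX_L4_UDP;
}

/*
 * Number of same-group, same-priority rules hit by a packet. Anything
 * above 1 is the case whose outcome is left undefined.
 */
static int
ex_hits(const struct ex_pkt *p)
{
	return (int)ex_rule1_matches(p) + (int)ex_rule2_matches(p);
}
```

Only the UDPv4 packet hits both rules here, and whether such packets ever
show up depends on the traffic, which is something only the application
can know.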

> > + *
> > + * Note that support for more than a single group and priority level is not
> > + * guaranteed.
> > + *
> > + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
> > + *
> > + * Several pattern items and actions are valid and can be used in both
> > + * directions. Those valid for only one direction are described as such.
> > + *
> > + * Specifying both directions at once is not recommended but may be valid in
> > + * some cases, such as incrementing the same counter twice.
> > + *
> > + * Not specifying any direction is currently an error.
> > + */
> > +struct rte_flow_attr {
> > +	uint32_t group; /**< Priority group. */
> > +	uint32_t priority; /**< Priority level within group. */
> > +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
> > +	uint32_t egress:1; /**< Rule applies to egress traffic. */
> > +	uint32_t reserved:30; /**< Reserved, must be zero. */
> > +};
[...]
> > +/**
> > + * RTE_FLOW_ITEM_TYPE_VF
> > + *
> > + * Matches packets addressed to a virtual function ID of the device.
> > + *
> > + * If the underlying device function differs from the one that would
> > + * normally receive the matched traffic, specifying this item prevents it
> > + * from reaching that device unless the flow rule contains a VF
> > + * action. Packets are not duplicated between device instances by default.
> > + *
> > + * - Likely to return an error or never match any traffic if this causes a
> > + *   VF device to match traffic addressed to a different VF.
> > + * - Can be specified multiple times to match traffic addressed to several
> > + *   specific VFs.
> > + * - Can be combined with a PF item to match both PF and VF traffic.
> > + *
> > + * A zeroed mask can be used to match any VF.
> 
> can you refer explicitly to id

If you mean changing "VF" to "VF ID", then yes, will do it for v2.

> > + */
> > +struct rte_flow_item_vf {
> > +	uint32_t id; /**< Destination VF ID. */
> > +};
[...]
> > +/**
> > + * Matching pattern item definition.
> > + *
> > + * A pattern is formed by stacking items starting from the lowest protocol
> > + * layer to match. This stacking restriction does not apply to meta items
> > + * which can be placed anywhere in the stack with no effect on the meaning
> > + * of the resulting pattern.
> > + *
> > + * A stack is terminated by a END item.
> > + *
> > + * The spec field should be a valid pointer to a structure of the related
> > + * item type. It may be set to NULL in many cases to use default values.
> > + *
> > + * Optionally, last can point to a structure of the same type to define an
> > + * inclusive range. This is mostly supported by integer and address fields,
> > + * may cause errors otherwise. Fields that do not support ranges must be set
> > + * to the same value as their spec counterparts.
> > + *
> > + * By default all fields present in spec are considered relevant.* This
> 
> typo "*"

No, that's an asterisk for a footnote below. Perhaps it is a bit unusual,
would something like "[1]" look better?

> > + * behavior can be altered by providing a mask structure of the same type
> > + * with applicable bits set to one. It can also be used to partially filter
> > + * out specific fields (e.g. as an alternate mean to match ranges of IP
> > + * addresses).
> > + *
> > + * Note this is a simple bit-mask applied before interpreting the contents
> > + * of spec and last, which may yield unexpected results if not used
> > + * carefully. For example, if for an IPv4 address field, spec provides
> > + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
> > + * effective range is 10.1.0.0 to 10.3.255.255.
> > + *

See footnote below:

> > + * * The defaults for data-matching items such as IPv4 when mask is not
> > + *   specified actually depend on the underlying implementation since only
> > + *   recognized fields can be taken into account.
> > + */
> > +struct rte_flow_item {
> > +	enum rte_flow_item_type type; /**< Item type. */
> > +	const void *spec; /**< Pointer to item specification structure. */
> > +	const void *last; /**< Defines an inclusive range (spec to last). */
> > +	const void *mask; /**< Bit-mask applied to spec and last. */
> > +};
> > +
> > +/**
> > + * Action types.
> > + *
> > + * Each possible action is represented by a type. Some have associated
> > + * configuration structures. Several actions combined in a list can be
> > + * assigned to a flow rule. That list is not ordered.
> > + *
> > + * They fall in three categories:
> > + *
> > + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> > + *   processing matched packets by subsequent flow rules, unless overridden
> > + *   with PASSTHRU.
> > + *
> > + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
> > + *   for additional processing by subsequent flow rules.
> > + *
> > + * - Other non terminating meta actions that do not affect the fate of
> > + *   packets (END, VOID, MARK, FLAG, COUNT).
> > + *
> > + * When several actions are combined in a flow rule, they should all have
> > + * different types (e.g. dropping a packet twice is not possible). The
> > + * defined behavior is for PMDs to only take into account the last action of
> > + * a given type found in the list. PMDs still perform error checking on the
> > + * entire list.
> 
> why do you define that the pmd will interpret multiple same type rules
> in this way...would it not make more sense for the pmd to just return
> EINVAL for an invalid set of rules? It seems more transparent for the
> application.

Well, I had to define something as a default. The reason is that any number
of VOID actions may be specified and I did not want that to be a special
case, in order to keep PMD parsers as simple as possible. I'll settle for
EINVAL (or some other error) if at least one PMD maintainer other than Nelio
who intends to implement this API is not convinced by this explanation, all
right?
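
To illustrate why "last action of a given type wins" keeps parsers simple,
here is a rough sketch. The enum and struct below are hypothetical stand-ins
for enum rte_flow_action_type and struct rte_flow_action, not the real
definitions, and an actual PMD would still validate each entry:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action types standing in for enum rte_flow_action_type. */
enum action_type {
	ACTION_END,
	ACTION_VOID,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_MAX,
};

/* Hypothetical stand-in for struct rte_flow_action. */
struct action {
	enum action_type type;
	const void *conf;
};

/*
 * Walk an END-terminated list and remember the last action seen for each
 * type. Repeated VOID entries simply overwrite each other, so they need
 * no dedicated code path.
 */
static void
collect_last_actions(const struct action list[],
		     const struct action *last[ACTION_MAX])
{
	size_t i;

	assert(list != NULL && last != NULL);
	for (i = 0; i < ACTION_MAX; ++i)
		last[i] = NULL;
	for (i = 0; list[i].type != ACTION_END; ++i) {
		assert((size_t)list[i].type < ACTION_MAX);
		last[list[i].type] = &list[i];
	}
}
```

Returning EINVAL on duplicates instead would require an extra "already
seen" check with an exception for VOID.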

[...]
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_MARK
> > + *
> > + * Attaches a 32 bit value to packets.
> > + *
> > + * This value is arbitrary and application-defined. For compatibility with
> > + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
> > + * also set in ol_flags.
> > + */
> > +struct rte_flow_action_mark {
> > +	uint32_t id; /**< 32 bit value to return with packets. */
> > +};
> 
> One use case I thought we would be able to do for OVS is classification
> in hardware and the unique flow id is sent with the packet to software.
> But in OVS the ufid is 128 bits, so it means we can't and there is still
> the miniflow extract overhead. I'm not sure if there is a practical way
> around this.
> 
> Sugesh (cc'd) has looked at this before and may be able to comment or
> correct me.

Yes, we settled on 32 bit because currently no known hardware implementation
supports more than this. If that changes, another action with a larger type
shall be provided (no ABI breakage).

Also since even 64 bit would not be enough for the use case you mention,
there is no choice but use this as an indirect value (such as an array or
hash table index/value).

[...]
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_RSS
> > + *
> > + * Similar to QUEUE, except RSS is additionally performed on packets to
> > + * spread them among several queues according to the provided parameters.
> > + *
> > + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> > + * however it conflicts with the MARK action as they share the same
> > + * space. When both actions are specified, the RSS hash is discarded and
> > + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> > + * structure should eventually evolve to store both.
> > + *
> > + * Terminating by default.
> > + */
> > +struct rte_flow_action_rss {
> > +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> > +	uint16_t queues; /**< Number of entries in queue[]. */
> > +	uint16_t queue[]; /**< Queues indices to use. */
> 
> I'd try and avoid queue and queues - someone will say "huh?" when
> reading code. s/queues/num ?

Agreed, will update for v2.

> > +};
> > +
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_VF
> > + *
> > + * Redirects packets to a virtual function (VF) of the current device.
> > + *
> > + * Packets matched by a VF pattern item can be redirected to their original
> > + * VF ID instead of the specified one. This parameter may not be available
> > + * and is not guaranteed to work properly if the VF part is matched by a
> > + * prior flow rule or if packets are not addressed to a VF in the first
> > + * place.
> 
> Not clear what you mean by "not guaranteed to work if...". Please return
> fail when this action is used if this is not going to work.

Again, this is a case where it is difficult for a PMD to determine if the
entire list of flow rules makes sense. Perhaps it does, perhaps whatever
goes through has already been filtered out of possible issues.

Here the documentation states the precautions an application should take to
guarantee it will work as intended. Perhaps it can be reworded (any
suggestion?), but a PMD can certainly not provide any strong guarantee.

> > + *
> > + * Terminating by default.
> > + */
> > +struct rte_flow_action_vf {
> > +	uint32_t original:1; /**< Use original VF ID if possible. */
> > +	uint32_t reserved:31; /**< Reserved, must be zero. */
> > +	uint32_t id; /**< VF ID to redirect packets to. */
> > +};
[...]
> > +/**
> > + * Check whether a flow rule can be created on a given port.
> > + *
> > + * While this function has no effect on the target device, the flow rule is
> > + * validated against its current configuration state and the returned value
> > + * should be considered valid by the caller for that state only.
> > + *
> > + * The returned value is guaranteed to remain valid only as long as no
> > + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
> > + * the meantime and no device parameter affecting flow rules in any way are
> > + * modified, due to possible collisions or resource limitations (although in
> > + * such cases EINVAL should not be returned).
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in] attr
> > + *   Flow rule attributes.
> > + * @param[in] pattern
> > + *   Pattern specification (list terminated by the END pattern item).
> > + * @param[in] actions
> > + *   Associated actions (list terminated by the END action).
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *
> > + * @return
> > + *   0 if flow rule is valid and can be created. A negative errno value
> > + *   otherwise (rte_errno is also set), the following errors are defined:
> > + *
> > + *   -ENOSYS: underlying device does not support this functionality.
> > + *
> > + *   -EINVAL: unknown or invalid rule specification.
> > + *
> > + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> > + *   bit-masks are unsupported).
> > + *
> > + *   -EEXIST: collision with an existing rule.
> > + *
> > + *   -ENOMEM: not enough resources.
> > + *
> > + *   -EBUSY: action cannot be performed due to busy device resources, may
> > + *   succeed if the affected queues or even the entire port are in a stopped
> > + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
> > + */
> > +int
> > +rte_flow_validate(uint8_t port_id,
> > +		  const struct rte_flow_attr *attr,
> > +		  const struct rte_flow_item pattern[],
> > +		  const struct rte_flow_action actions[],
> > +		  struct rte_flow_error *error);
> 
> Why not just use rte_flow_create() and get an error? Is it less
> disruptive to do a validate and find the rule cannot be created, than
> using a create directly?

The rationale can be found in the original RFC, which I'll convert to actual
documentation in v2. In short:

- Calling rte_flow_validate() before rte_flow_create() is useless since
  rte_flow_create() also performs validation.

- We cannot possibly express a full static set of allowed flow rules, even
  if we could, it usually depends on the current hardware configuration
  therefore would not be static.

- rte_flow_validate() is thus provided as a replacement for capability
  flags. It can be used to determine during initialization if the underlying
  device can support the typical flow rules an application might want to
  provide later and do something useful with that information (e.g. always
  use software fallback due to HW limitations).

- rte_flow_validate() being a subset of rte_flow_create(), it is essentially
  free to expose.
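
The capability-probing use can be sketched as follows. To keep it
self-contained, struct rule_template and fake_validate() are hypothetical
stand-ins: a real application would pass attr/pattern/actions to
rte_flow_validate() instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a rule description (attr, pattern, actions). */
struct rule_template {
	const char *name;
	int needs_partial_mask; /* something a device may not support */
};

/* Modeled loosely on rte_flow_validate(): 0 if OK, negative errno if not. */
typedef int (*validate_fn)(int port_id, const struct rule_template *rule);

/* Hypothetical device that rejects partial bit-masks. */
static int
fake_validate(int port_id, const struct rule_template *rule)
{
	(void)port_id;
	return rule->needs_partial_mask ? -ENOTSUP : 0;
}

/*
 * Probe every template once during initialization. hw_ok[i] tells the
 * application whether template i can be offloaded or needs a software
 * fallback. Returns the number of supported templates.
 */
static size_t
probe_templates(int port_id, validate_fn validate,
		const struct rule_template tmpl[], size_t n, bool hw_ok[])
{
	size_t i, ok = 0;

	assert(validate != NULL && tmpl != NULL && hw_ok != NULL);
	for (i = 0; i < n; ++i) {
		hw_ok[i] = (validate(port_id, &tmpl[i]) == 0);
		ok += hw_ok[i];
	}
	return ok;
}
```

The result is only meaningful for the device configuration in effect when
the probe ran, as stated in the rte_flow_validate() documentation.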

> > +
> > +/**
> > + * Create a flow rule on a given port.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in] attr
> > + *   Flow rule attributes.
> > + * @param[in] pattern
> > + *   Pattern specification (list terminated by the END pattern item).
> > + * @param[in] actions
> > + *   Associated actions (list terminated by the END action).
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *
> > + * @return
> > + *   A valid handle in case of success, NULL otherwise and rte_errno is set
> > + *   to the positive version of one of the error codes defined for
> > + *   rte_flow_validate().
> > + */
> > +struct rte_flow *
> > +rte_flow_create(uint8_t port_id,
> > +		const struct rte_flow_attr *attr,
> > +		const struct rte_flow_item pattern[],
> > +		const struct rte_flow_action actions[],
> > +		struct rte_flow_error *error);
> 
> General question - are these functions threadsafe? In the OVS example
> you could have several threads wanting to create flow rules at the same
> time for same or different ports.

No they aren't, applications have to perform their own locking. The RFC (to
be converted to actual documentation in v2) says that:

- API operations are synchronous and blocking (``EAGAIN`` cannot be
  returned).

- There is no provision for reentrancy/multi-thread safety, although nothing
  should prevent different devices from being configured at the same
  time. PMDs may protect their control path functions accordingly.
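
For applications such as the OVS example, one possible approach is a lock
per port around every flow call. This is only a sketch: MAX_PORTS,
flow_op_fn and locked_flow_op() are hypothetical names, and count_op() is a
dummy standing in for a call such as rte_flow_create().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PORTS 32

/* Hypothetical callback type wrapping one flow API call on a port. */
typedef void *(*flow_op_fn)(int port_id, void *arg);

/* One lock per port, so different ports can be configured in parallel. */
static pthread_mutex_t port_lock[MAX_PORTS];

/* Must be called once before any thread uses locked_flow_op(). */
static void
port_locks_init(void)
{
	int i;

	for (i = 0; i < MAX_PORTS; ++i)
		pthread_mutex_init(&port_lock[i], NULL);
}

/* Serialize flow operations targeting the same port. */
static void *
locked_flow_op(int port_id, flow_op_fn op, void *arg)
{
	void *ret;

	assert(port_id >= 0 && port_id < MAX_PORTS && op != NULL);
	pthread_mutex_lock(&port_lock[port_id]);
	ret = op(port_id, arg);
	pthread_mutex_unlock(&port_lock[port_id]);
	return ret;
}

/* Dummy operation for illustration: counts how often it was called. */
static void *
count_op(int port_id, void *arg)
{
	(void)port_id;
	++*(int *)arg;
	return arg;
}
```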

> > +
> > +/**
> > + * Destroy a flow rule on a given port.
> > + *
> > + * Failure to destroy a flow rule handle may occur when other flow rules
> > + * depend on it, and destroying it would result in an inconsistent state.
> > + *
> > + * This function is only guaranteed to succeed if handles are destroyed in
> > + * reverse order of their creation.
> 
> How can the application find this information out on error?

Without maintaining a list, they cannot. The specified case is the only
possible guarantee. That does not mean PMDs should not do their best to
destroy flow rules, only that ordering must remain consistent in case of
inability to destroy one.

What do you suggest?
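
For reference, the kind of list an application could maintain to benefit
from the reverse-order guarantee might look like this. struct flow_stack,
destroy_fn and record_destroy() are hypothetical; a real destroy_fn would
wrap rte_flow_destroy().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RULES 64

/* Hypothetical application-side stack of flow rule handles. */
struct flow_stack {
	void *handle[MAX_RULES];
	size_t count;
};

/* Hypothetical callback: 0 on success, negative errno otherwise. */
typedef int (*destroy_fn)(void *handle);

/* Record a handle right after it has been created. */
static int
flow_stack_push(struct flow_stack *s, void *handle)
{
	assert(s != NULL);
	if (s->count == MAX_RULES)
		return -1;
	s->handle[s->count++] = handle;
	return 0;
}

/*
 * Destroy the most recent handle first. Stop at the first failure so the
 * remaining handles stay valid and in creation order.
 */
static int
flow_stack_destroy_all(struct flow_stack *s, destroy_fn destroy)
{
	assert(s != NULL && destroy != NULL);
	while (s->count > 0) {
		int ret = destroy(s->handle[s->count - 1]);

		if (ret != 0)
			return ret;
		--s->count;
	}
	return 0;
}

/* Dummy destroy callback recording the order in which it is called. */
static int destroyed[MAX_RULES];
static size_t destroyed_n;

static int
record_destroy(void *handle)
{
	destroyed[destroyed_n++] = *(int *)handle;
	return 0;
}
```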

> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param flow
> > + *   Flow rule handle to destroy.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +int
> > +rte_flow_destroy(uint8_t port_id,
> > +		 struct rte_flow *flow,
> > +		 struct rte_flow_error *error);
> > +
> > +/**
> > + * Destroy all flow rules associated with a port.
> > + *
> > + * In the unlikely event of failure, handles are still considered destroyed
> > + * and no longer valid but the port must be assumed to be in an inconsistent
> > + * state.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +int
> > +rte_flow_flush(uint8_t port_id,
> > +	       struct rte_flow_error *error);
> 
> rte_flow_destroy_all() would be more descriptive (but breaks your style)

There are enough underscores as it is. I like flush, if enough people
complain we'll change it but it has to occur before the first public
release.

> > +
> > +/**
> > + * Query an existing flow rule.
> > + *
> > + * This function allows retrieving flow-specific data such as counters.
> > + * Data is gathered by special actions which must be present in the flow
> > + * rule definition.
> 
> re last sentence, it would be good if you can put a link to
> RTE_FLOW_ACTION_TYPE_COUNT

Will do, I did not know how until very recently.

> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param flow
> > + *   Flow rule handle to query.
> > + * @param action
> > + *   Action type to query.
> > + * @param[in, out] data
> > + *   Pointer to storage for the associated query data type.
> 
> can this be anything other than rte_flow_query_count?

Likely in the future. I've only defined this one as a counterpart for
existing API functionality and because we wanted to expose it in mlx5.

> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +int
> > +rte_flow_query(uint8_t port_id,
> > +	       struct rte_flow *flow,
> > +	       enum rte_flow_action_type action,
> > +	       void *data,
> > +	       struct rte_flow_error *error);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> 
> I don't see a way to dump all the rules for a port out. I think this is
> necessary for debugging. You could have a look through dpif.h in OVS
> and see how dpif_flow_dump_next() is used, it might be a good reference.

DPDK does not maintain flow rules and, depending on hardware capabilities
and level of compliance, PMDs do not necessarily do it either, particularly
since it requires space and applications probably have a better method to
store these pointers for their own needs.

What you see here is only a PMD interface. Depending on application needs,
generic helper functions built on top of these may be added to manage flow
rules in the future.

> Also, it would be nice if there were an api that would allow a test
> packet to be injected and traced for debugging - although I'm not
> exactly sure how well it could be traced. For reference:
> http://developers.redhat.com/blog/2016/10/12/tracing-packets-inside-open-vswitch/

Thanks for the link, I'm not sure how you'd do this either. Remember, as
generic as it looks, this interface is only meant to configure the
underlying device. You need to see it as one big offload, everything else
is left to applications.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-11-28 10:03     ` Pei, Yulong
@ 2016-12-01  8:39       ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-01  8:39 UTC (permalink / raw)
  To: Pei, Yulong
  Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz, Xing, Beilei

Hi Yulong,

On Mon, Nov 28, 2016 at 10:03:53AM +0000, Pei, Yulong wrote:
> Hi Adrien,
> 
> I think you have already tested your patchset. Do you have any automated test scripts that can be shared for validation, since there is no testpmd flow command documentation yet?

No automated script, at least not yet. I intend to submit v2 with extra
API documentation, testpmd commands with examples of expected behavior and
output, as well as fixes for the issues pointed out by Nelio.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (23 preceding siblings ...)
  2016-11-28 10:03     ` Pei, Yulong
@ 2016-12-02 16:58     ` Ferruh Yigit
  2016-12-08 15:19       ` Adrien Mazarguil
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
  25 siblings, 1 reply; 262+ messages in thread
From: Ferruh Yigit @ 2016-12-02 16:58 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Thomas Monjalon, Pablo de Lara, Olivier Matz

Hi Adrien,

On 11/16/2016 4:23 PM, Adrien Mazarguil wrote:
> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> described in [3] (also pasted below), here is the first non-draft series
> for this new API.
> 
> Its capabilities are so generic that its name had to be vague, it may be
> called "Generic flow API", "Generic flow interface" (possibly shortened
> as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> the prefix used for its public symbols. I personally favor the latter.
> 
> While it is currently meant to supersede existing filter types in order for
> all PMDs to expose a common filtering/classification interface, it may
> eventually evolve to cover the following ideas as well:
> 
> - Rx/Tx offloads configuration through automatic offloads for specific
>   packets, e.g. performing checksum on TCP packets could be expressed with
>   an egress rule with a TCP pattern and a kind of checksum action.
> 
> - RSS configuration (already defined actually). Could be global or per rule
>   depending on hardware capabilities.
> 
> - Switching configuration for devices with many physical ports; rules doing
>   both ingress and egress could even be used to completely bypass software
>   if supported by hardware.
> 
>  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
>  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
>  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
> 
> Changes since RFC v2:
> 
> - New separate VLAN pattern item (previously part of the ETH definition),
>   found to be much more convenient.
> 
> - Removed useless "any" field from VF pattern item, the same effect can be
>   achieved by not providing a specification structure.
> 
> - Replaced bit-fields from the VXLAN pattern item to avoid endianness
>   conversion issues on 24-bit fields.
> 
> - Updated struct rte_flow_item with a new "last" field to create inclusive
>   ranges. They are defined as the interval between (spec & mask) and
>   (last & mask). All three parameters are optional.
> 
> - Renamed ID action MARK.
> 
> - Renamed "queue" fields in actions QUEUE and DUP to "index".
> 
> - "rss_conf" field in RSS action is now const.
> 
> - VF action now uses a 32 bit ID like its pattern item counterpart.
> 
> - Removed redundant struct rte_flow_pattern, API functions now expect
>   struct
>   rte_flow_item lists terminated by END items.
> 
> - Replaced struct rte_flow_actions for the same reason, with struct
>   rte_flow_action lists terminated by END actions.
> 
> - Error types (enum rte_flow_error_type) have been updated and the cause
>   pointer in struct rte_flow_error is now const.
> 
> - Function prototypes (rte_flow_create, rte_flow_validate) have also been
>   updated for clarity.
> 
> Additions:
> 
> - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
>   are now implemented in rte_flow.c, with their symbols exported and
>   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
> 
> - A separate header (rte_flow_driver.h) has been added for driver-side
>   functionality, in particular struct rte_flow_ops which contains PMD
>   callbacks returned by RTE_ETH_FILTER_GENERIC query.
> 
> - testpmd now exposes most of this API through the new "flow" command.
> 
> What remains to be done:
> 
> - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
> 
> - API documentation (based on RFC).
> 
> - testpmd flow command documentation (although context-aware command
>   completion should already help quite a bit in this regard).
> 
> - A few pattern item / action properties cannot be configured yet
>   (e.g. rss_conf parameter for RSS action) and a few completions
>   (e.g. possible queue IDs) should be added.
> 

<...>

I was trying to check driver filter API patches, but hit a few compiler
errors with this patchset.

[1] clang complains that a bitfield constant is implicitly truncated,
changing its value from -1 to 1. The warning is correct, but I guess the
assignment is intentional, and I don't know how to tell this to clang.

[2] shared library compilation error, because of missing rte_flow_flush
in rte_ether_version.map file

[3] bunch of icc compilation errors, almost all are same type:
error #188: enumerated type mixed with another type
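
For [3], my guess is that icc objects to a literal 0 used as terminator in
arrays of enum values. One possible fix, shown here as a sketch only, is a
named zero enumerator. The enum below is a reduced, hypothetical version of
the one in cmdline_flow.c, and ZERO and next_count() are names I made up:

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical token enum with an explicit zero terminator. */
enum index {
	ZERO = 0,
	FLOW,
	PORT_ID,
	RULE_ID,
};

/* Terminated by ZERO instead of a literal 0, so every entry is an enum. */
static const enum index next_destroy[] = { RULE_ID, PORT_ID, ZERO };

/* Count the entries before the ZERO terminator. */
static size_t
next_count(const enum index *list)
{
	size_t n = 0;

	assert(list != NULL);
	while (list[n] != ZERO)
		++n;
	return n;
}
```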


Thanks,
ferruh



[1]
=============================
.../app/test-pmd/cmdline_flow.c:944:16: error: implicit truncation from
'int' to bitfield changes value from -1 to 1
[-Werror,-Wbitfield-constant-conversion]
                .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
relative)),
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
'ARGS_ENTRY_BF'
                .mask = (const void *)&(const s){ .f = -1 }, \
                                                       ^~
.../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
                                                ^~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:950:16: error: implicit truncation from
'int' to bitfield changes value from -1 to 1
[-Werror,-Wbitfield-constant-conversion]
                .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
search)),
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
'ARGS_ENTRY_BF'
                .mask = (const void *)&(const s){ .f = -1 }, \
                                                       ^~
.../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
                                                ^~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:1293:16: error: implicit truncation from
'int' to bitfield changes value from -1 to 1
[-Werror,-Wbitfield-constant-conversion]
                .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
'ARGS_ENTRY_BF'
                .mask = (const void *)&(const s){ .f = -1 }, \
                                                       ^~
.../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
                                                ^~~~~~~~~~~
.../app/test-pmd/cmdline_flow.c:1664:26: error: duplicate 'const'
declaration specifier [-Werror,-Wduplicate-decl-specifier]
        static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
                                ^
4 errors generated.
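
One way the bitfield warnings might be avoided is to build the all-ones
value from the field width instead of assigning -1. This is only a sketch:
BF_ONES() is a hypothetical macro, it assumes the width is known at the
call site and lies between 1 and 31, and I have not checked how it would
fit into ARGS_ENTRY_BF(). The struct mirrors rte_flow_action_vf from the
patch:

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as struct rte_flow_action_vf in the patch. */
struct action_vf {
	uint32_t original:1;
	uint32_t reserved:31;
	uint32_t id;
};

/* Hypothetical: all-ones value for a bit-field of width 1 to 31. */
#define BF_ONES(width) ((UINT32_C(1) << (width)) - 1)

/* Mask selecting only the "original" bit, without truncating -1. */
static const struct action_vf vf_original_mask = {
	.original = BF_ONES(1),
};

/* Sanity check that the mask holds what the comment above claims. */
static void
check_vf_mask(void)
{
	assert(vf_original_mask.original == 1);
	assert(vf_original_mask.reserved == 0);
}
```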



[2]
=============================
  LD testpmd
config.o: In function `port_flow_flush':
config.c:(.text+0x2231): undefined reference to `rte_flow_flush'
collect2: error: ld returned 1 exit status


[3]
=============================
.../app/test-pmd/cmdline_flow.c(364): error #188: enumerated type mixed
with another type

        0,
        ^

.../app/test-pmd/cmdline_flow.c(370): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(376): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(385): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(406): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(413): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(419): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(425): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(435): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(443): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(450): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(457): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(464): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(471): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(478): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(485): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(492): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(498): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(514): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(520): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(526): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(532): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(538): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(545): error #188: enumerated type mixed
with another type
        0,
        ^

.../app/test-pmd/cmdline_flow.c(619): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(FLOW)),
                        ^

.../app/test-pmd/cmdline_flow.c(716): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY
                        ^

.../app/test-pmd/cmdline_flow.c(729): error #188: enumerated type mixed
with another type
                .next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(736): error #188: enumerated type mixed
with another type
                .next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(743): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(743): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(750): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
                        ^

.../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
                        ^

.../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
                        ^

.../app/test-pmd/cmdline_flow.c(768): error #188: enumerated type mixed
with another type
                .next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(776): error #188: enumerated type mixed
with another type
                .next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(792): error #188: enumerated type mixed
with another type
                .next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(800): error #188: enumerated type mixed
with another type
                .next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
                        ^

.../app/test-pmd/cmdline_flow.c(807): error #188: enumerated type mixed
with another type
                .next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
                        ^

.../app/test-pmd/cmdline_flow.c(864): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTIONS)),
                        ^

.../app/test-pmd/cmdline_flow.c(871): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(878): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(891): error #188: enumerated type mixed
with another type
                .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(897): error #188: enumerated type mixed
with another type
                .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(904): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(917): error #188: enumerated type mixed
with another type
                .next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(930): error #188: enumerated type mixed
with another type
                .next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(943): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(949): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(955): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(961): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(967): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw,
                        ^

.../app/test-pmd/cmdline_flow.c(967): error #188: enumerated type mixed
with another type
                .next = NEXT(item_raw,
                        ^

.../app/test-pmd/cmdline_flow.c(987): error #188: enumerated type mixed
with another type
                .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(993): error #188: enumerated type mixed
with another type
                .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(999): error #188: enumerated type mixed
with another type
                .next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1012): error #188: enumerated type mixed
with another type
                .next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1018): error #188: enumerated type mixed
with another type
                .next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1031): error #188: enumerated type mixed
with another type
                .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1038): error #188: enumerated type mixed
with another type
                .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1052): error #188: enumerated type mixed
with another type
                .next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1059): error #188: enumerated type mixed
with another type
                .next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1073): error #188: enumerated type mixed
with another type
                .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1080): error #188: enumerated type mixed
with another type
                .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1094): error #188: enumerated type mixed
with another type
                .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1101): error #188: enumerated type mixed
with another type
                .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1115): error #188: enumerated type mixed
with another type
                .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1122): error #188: enumerated type mixed
with another type
                .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1136): error #188: enumerated type mixed
with another type
                .next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1143): error #188: enumerated type mixed
with another type
                .next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1157): error #188: enumerated type mixed
with another type
                .next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
                        ^

.../app/test-pmd/cmdline_flow.c(1182): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1189): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1202): error #188: enumerated type mixed
with another type
                .next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
                        ^

.../app/test-pmd/cmdline_flow.c(1210): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1224): error #188: enumerated type mixed
with another type
                .next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
                        ^

.../app/test-pmd/cmdline_flow.c(1232): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1239): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1252): error #188: enumerated type mixed
with another type
                .next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
                        ^

.../app/test-pmd/cmdline_flow.c(1266): error #188: enumerated type mixed
with another type
                .next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
                        ^

.../app/test-pmd/cmdline_flow.c(1279): error #188: enumerated type mixed
with another type
                .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
                        ^

.../app/test-pmd/cmdline_flow.c(1292): error #188: enumerated type mixed
with another type
                .next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
                        ^

.../app/test-pmd/cmdline_flow.c(1300): error #188: enumerated type mixed
with another type
                .next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
                        ^

.../app/test-pmd/cmdline_flow.c(1599): error #188: enumerated type mixed
with another type
                ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
                                               ^

.../app/test-pmd/cmdline_flow.c(1664): error #83: type qualifier
specified more than once
        static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
                                ^

.../app/test-pmd/cmdline_flow.c(1664): error #188: enumerated type mixed
with another type
        static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
                                               ^

.../app/test-pmd/cmdline_flow.c(2302): error #188: enumerated type mixed
with another type
        ctx->curr = 0;
                  ^

.../app/test-pmd/cmdline_flow.c(2303): error #188: enumerated type mixed
with another type
        ctx->prev = 0;
                  ^


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-01  8:36         ` Adrien Mazarguil
@ 2016-12-02 21:06           ` Kevin Traynor
  2016-12-06 18:11             ` Chandran, Sugesh
  2016-12-08 17:07             ` Adrien Mazarguil
  0 siblings, 2 replies; 262+ messages in thread
From: Kevin Traynor @ 2016-12-02 21:06 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandran

On 12/01/2016 08:36 AM, Adrien Mazarguil wrote:
> Hi Kevin,
> 
> On Wed, Nov 30, 2016 at 05:47:17PM +0000, Kevin Traynor wrote:
>> Hi Adrien,
>>
>> On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
>>> This new API supersedes all the legacy filter types described in
>>> rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
>>> PMDs to process and validate flow rules.
>>>
>>> Benefits:
>>>
>>> - A unified API is easier to program for, applications do not have to be
>>>   written for a specific filter type which may or may not be supported by
>>>   the underlying device.
>>>
>>> - The behavior of a flow rule is the same regardless of the underlying
>>>   device, applications do not need to be aware of hardware quirks.
>>>
>>> - Extensible by design, API/ABI breakage should rarely occur if at all.
>>>
>>> - Documentation is self-standing, no need to look up elsewhere.
>>>
>>> Existing filter types will be deprecated and removed in the near future.
>>
>> I'd suggest to add a deprecation notice to deprecation.rst, ideally with
>> a target release.
> 
> Will do, not sure about the target release though. It seems a bit early
> since no PMD really supports this API yet.
> 
> [...]
>>> diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
>>> new file mode 100644
>>> index 0000000..064963d
>>> --- /dev/null
>>> +++ b/lib/librte_ether/rte_flow.c
>>> @@ -0,0 +1,159 @@
>>> +/*-
>>> + *   BSD LICENSE
>>> + *
>>> + *   Copyright 2016 6WIND S.A.
>>> + *   Copyright 2016 Mellanox.
>>
>> There's Mellanox copyright but you are the only signed-off-by - is that
>> right?
> 
> Yes, I'm the primary maintainer for Mellanox PMDs and this API was designed
> on their behalf to expose several features from mlx4/mlx5 as the existing
> filter types had too many limitations.
> 
> [...]
>>> +/* Get generic flow operations structure from a port. */
>>> +const struct rte_flow_ops *
>>> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
>>> +{
>>> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>> +	const struct rte_flow_ops *ops;
>>> +	int code;
>>> +
>>> +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
>>> +		code = ENODEV;
>>> +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
>>> +			  dev->dev_ops->filter_ctrl(dev,
>>> +						    RTE_ETH_FILTER_GENERIC,
>>> +						    RTE_ETH_FILTER_GET,
>>> +						    &ops) ||
>>> +			  !ops))
>>> +		code = ENOTSUP;
>>> +	else
>>> +		return ops;
>>> +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
>>> +			   NULL, rte_strerror(code));
>>> +	return NULL;
>>> +}
>>> +
>>
>> Is it expected that the application or pmd will provide locking between
>> these functions if required? I think it's going to have to be the app.
> 
> Locking is indeed expected to be performed by applications. This API only
> documents places where locking would make sense if necessary and expected
> behavior.
> 
> Like all control path APIs, this one assumes a single control thread.
> Applications must take the necessary precautions.

If you look at OVS now it's quite possible that you have 2 rx queues
serviced by different threads, that would also install the flow rules in
the software flow caches - possibly that could extend to adding hardware
flows. There could also be another thread that is querying for stats. So
anything that can be done to minimise the locking would be helpful -
maybe query() could be atomic and not require any locking?
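
As an illustration of the application-side locking discussed here, a minimal
sketch of one lock per port, so that threads working on different ports do
not contend. To stay self-contained it uses C11 atomic_flag rather than any
DPDK primitive. MAX_PORTS, flow_locks_init(), flow_op_locked() and
bump_counter() are made-up names for this sketch and not part of the proposed
API; bump_counter() only stands in for a real rule creation or query call.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_PORTS 32 /* Arbitrary limit for this sketch. */

/* Hypothetical control-path operation performed while holding the lock. */
typedef int (*flow_op_t)(unsigned int port_id, void *arg);

/* One spinlock per port, so that different ports do not contend. */
static atomic_flag port_lock[MAX_PORTS];

/* Put every per-port lock in the released state; call once at startup. */
static void
flow_locks_init(void)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; ++i)
		atomic_flag_clear(&port_lock[i]);
}

/* Run op() on port_id with that port's lock held and return its result. */
static int
flow_op_locked(unsigned int port_id, flow_op_t op, void *arg)
{
	int ret;

	assert(port_id < MAX_PORTS && op);
	while (atomic_flag_test_and_set_explicit(&port_lock[port_id],
						 memory_order_acquire))
		; /* Spin until the current holder releases the lock. */
	ret = op(port_id, arg);
	atomic_flag_clear_explicit(&port_lock[port_id], memory_order_release);
	return ret;
}

/* Stand-in for a rule creation/query call: increments a shared counter. */
static int
bump_counter(unsigned int port_id, void *arg)
{
	(void)port_id;
	return ++*(int *)arg;
}
```

Whether a query could skip the lock entirely depends on the PMD, which this
sketch does not try to model.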

> 
> [...]
>>> +/**
>>> + * Flow rule attributes.
>>> + *
>>> + * Priorities are set on two levels: per group and per rule within groups.
>>> + *
>>> + * Lower values denote higher priority, the highest priority for both levels
>>> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
>>> + * rule with priority 8 in group 0.
>>> + *
>>> + * Although optional, applications are encouraged to group similar rules as
>>> + * much as possible to fully take advantage of hardware capabilities
>>> + * (e.g. optimized matching) and work around limitations (e.g. a single
>>> + * pattern type possibly allowed in a given group).
>>> + *
>>> + * Group and priority levels are arbitrary and up to the application, they
>>> + * do not need to be contiguous nor start from 0, however the maximum number
>>> + * varies between devices and may be affected by existing flow rules.
>>> + *
>>> + * If a packet is matched by several rules of a given group for a given
>>> + * priority level, the outcome is undefined. It can take any path, may be
>>> + * duplicated or even cause unrecoverable errors.
>>
>> I get what you are trying to do here wrt supporting multiple
>> pmds/hardware implementations and it's a good idea to keep it flexible.
>>
>> Given that the outcome is undefined, it would be nice that the
>> application has a way of finding the specific effects for verification
>> and debugging.
> 
> Right, however it was deemed a bit difficult to manage in many cases hence
> the vagueness.
> 
> For example, suppose two rules with the same group and priority, one
> matching any IPv4 header, the other one any UDP header:
> 
> - TCPv4 packets => rule #1.
> - UDPv6 packets => rule #2.
> - UDPv4 packets => both?
> 
> That last one is perhaps invalid, checking that some unspecified protocol
> combination does not overlap is expensive and may miss corner cases, even
> assuming this is not an issue, what if the application guarantees that no
> UDPv4 packets can ever hit that rule?

that's fine - I don't expect the software to be able to know what the
hardware will do with those rules. It's more about trying to get a dump
from the hardware if something goes wrong. Anyway covered in comment later.
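
For reference, the group/priority ordering from the quoted comment can be
sketched as a comparison function. This is only an illustration: struct
rule_key and rule_key_cmp() are made-up names, not part of the proposed API,
and the sketch says nothing about what hardware does with rules that compare
equal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the group/priority pair of rte_flow_attr. */
struct rule_key {
	uint32_t group;    /* Priority group, 0 is the highest. */
	uint32_t priority; /* Priority level within group, 0 is the highest. */
};

/*
 * Return a negative value if rule a is matched before rule b, a positive
 * value if it is matched after b, and 0 if both share the same group and
 * priority (the case where the outcome of overlapping rules is undefined).
 */
static int
rule_key_cmp(const struct rule_key *a, const struct rule_key *b)
{
	assert(a && b);
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```

With this ordering, priority 0 in group 8 compares after priority 8 in
group 0, as stated in the comment.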

> 
> Suggestions are welcome though, perhaps we can refine the description.
> 
>>> + *
>>> + * Note that support for more than a single group and priority level is not
>>> + * guaranteed.
>>> + *
>>> + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
>>> + *
>>> + * Several pattern items and actions are valid and can be used in both
>>> + * directions. Those valid for only one direction are described as such.
>>> + *
>>> + * Specifying both directions at once is not recommended but may be valid in
>>> + * some cases, such as incrementing the same counter twice.
>>> + *
>>> + * Not specifying any direction is currently an error.
>>> + */
>>> +struct rte_flow_attr {
>>> +	uint32_t group; /**< Priority group. */
>>> +	uint32_t priority; /**< Priority level within group. */
>>> +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
>>> +	uint32_t egress:1; /**< Rule applies to egress traffic. */
>>> +	uint32_t reserved:30; /**< Reserved, must be zero. */
>>> +};
> [...]
>>> +/**
>>> + * RTE_FLOW_ITEM_TYPE_VF
>>> + *
>>> + * Matches packets addressed to a virtual function ID of the device.
>>> + *
>>> + * If the underlying device function differs from the one that would
>>> + * normally receive the matched traffic, specifying this item prevents it
>>> + * from reaching that device unless the flow rule contains a VF
>>> + * action. Packets are not duplicated between device instances by default.
>>> + *
>>> + * - Likely to return an error or never match any traffic if this causes a
>>> + *   VF device to match traffic addressed to a different VF.
>>> + * - Can be specified multiple times to match traffic addressed to several
>>> + *   specific VFs.
>>> + * - Can be combined with a PF item to match both PF and VF traffic.
>>> + *
>>> + * A zeroed mask can be used to match any VF.
>>
>> can you refer explicitly to id
> 
> If you mean "VF" to "VF ID" then yes, will do it for v2.
> 
>>> + */
>>> +struct rte_flow_item_vf {
>>> +	uint32_t id; /**< Destination VF ID. */
>>> +};
> [...]
>>> +/**
>>> + * Matching pattern item definition.
>>> + *
>>> + * A pattern is formed by stacking items starting from the lowest protocol
>>> + * layer to match. This stacking restriction does not apply to meta items
>>> + * which can be placed anywhere in the stack with no effect on the meaning
>>> + * of the resulting pattern.
>>> + *
>>> + * A stack is terminated by an END item.
>>> + *
>>> + * The spec field should be a valid pointer to a structure of the related
>>> + * item type. It may be set to NULL in many cases to use default values.
>>> + *
>>> + * Optionally, last can point to a structure of the same type to define an
>>> + * inclusive range. This is mostly supported by integer and address fields,
>>> + * may cause errors otherwise. Fields that do not support ranges must be set
>>> + * to the same value as their spec counterparts.
>>> + *
>>> + * By default all fields present in spec are considered relevant.* This
>>
>> typo "*"
> 
> No, that's an asterisk for a footnote below. Perhaps it is a bit unusual,
> would something like "[1]" look better?

oh, I thought it was the start of a comment line gone astray. Maybe "See
note below", no big deal though.
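
While on this comment block, the spec/last/mask example quoted just below
(10.1.2.3, 10.3.4.5, 255.255.0.0) can be checked with a small sketch. It
assumes that bits cleared in the mask are wildcards, which is my reading of
the comment rather than something the patch spells out, and it only makes
sense for prefix-style masks. ipv4() and masked_range() are made-up helpers
working on host-order values.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four dotted-quad bytes. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Compute the effective inclusive range [*low, *high] obtained when mask is
 * applied to both spec and last, treating cleared mask bits as wildcards.
 */
static void
masked_range(uint32_t spec, uint32_t last, uint32_t mask,
	     uint32_t *low, uint32_t *high)
{
	assert(low && high);
	*low = spec & mask;            /* Wildcard bits at their minimum. */
	*high = (last & mask) | ~mask; /* Wildcard bits at their maximum. */
}
```

Feeding it the three addresses from the comment gives 10.1.0.0 as the low
bound and 10.3.255.255 as the high bound.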

> 
>>> + * behavior can be altered by providing a mask structure of the same type
>>> + * with applicable bits set to one. It can also be used to partially filter
>>> + * out specific fields (e.g. as an alternate means to match ranges of IP
>>> + * addresses).
>>> + *
>>> + * Note this is a simple bit-mask applied before interpreting the contents
>>> + * of spec and last, which may yield unexpected results if not used
>>> + * carefully. For example, if for an IPv4 address field, spec provides
>>> + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
>>> + * effective range is 10.1.0.0 to 10.3.255.255.
>>> + *
> 
> See footnote below:
> 
>>> + * * The defaults for data-matching items such as IPv4 when mask is not
>>> + *   specified actually depend on the underlying implementation since only
>>> + *   recognized fields can be taken into account.
>>> + */
>>> +struct rte_flow_item {
>>> +	enum rte_flow_item_type type; /**< Item type. */
>>> +	const void *spec; /**< Pointer to item specification structure. */
>>> +	const void *last; /**< Defines an inclusive range (spec to last). */
>>> +	const void *mask; /**< Bit-mask applied to spec and last. */
>>> +};
>>> +
>>> +/**
>>> + * Action types.
>>> + *
>>> + * Each possible action is represented by a type. Some have associated
>>> + * configuration structures. Several actions combined in a list can be
>>> + * affected to a flow rule. That list is not ordered.
>>> + *
>>> + * They fall in three categories:
>>> + *
>>> + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
>>> + *   processing matched packets by subsequent flow rules, unless overridden
>>> + *   with PASSTHRU.
>>> + *
>>> + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
>>> + *   for additional processing by subsequent flow rules.
>>> + *
>>> + * - Other non terminating meta actions that do not affect the fate of
>>> + *   packets (END, VOID, MARK, FLAG, COUNT).
>>> + *
>>> + * When several actions are combined in a flow rule, they should all have
>>> + * different types (e.g. dropping a packet twice is not possible). The
>>> + * defined behavior is for PMDs to only take into account the last action of
>>> + * a given type found in the list. PMDs still perform error checking on the
>>> + * entire list.
>>
>> why do you define that the pmd will interpret multiple same type rules
>> in this way...would it not make more sense for the pmd to just return
>> EINVAL for an invalid set of rules? It seems more transparent for the
>> application.
> 
> Well, I had to define something as a default. The reason is that any number
> of VOID actions may be specified and I did not want that to be a special case in
> order to keep PMD parsers as simple as possible. I'll settle for EINVAL (or
> some other error) if at least one PMD maintainer other than Nelio who
> intends to implement this API is not convinced by this explanation, all
> right?

From an API perspective I think it's cleaner to pass or fail with the
input rather than change it. But yes, please take pmd maintainers input
as to what is reasonable to check also.
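
To make the "pass or fail" option concrete, here is a minimal sketch of a
check that rejects a list with two actions of the same type while still
allowing any number of VOID actions. The enum is a reduced, hypothetical
stand-in for the real action types and check_unique_actions() is a made-up
name; this is not what the patch implements.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced set of action types; not the real rte_flow enum. */
enum action_type {
	ACTION_END,
	ACTION_VOID,
	ACTION_MARK,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_TYPE_MAX, /* Number of types above, not an action itself. */
};

/*
 * Walk an END-terminated action list and return -EINVAL as soon as an
 * unknown type is found or a non-VOID type appears twice, 0 otherwise.
 */
static int
check_unique_actions(const enum action_type *actions)
{
	unsigned char seen[ACTION_TYPE_MAX] = { 0 };
	size_t i;

	assert(actions);
	for (i = 0; actions[i] != ACTION_END; ++i) {
		enum action_type t = actions[i];

		if ((unsigned int)t >= ACTION_TYPE_MAX)
			return -EINVAL; /* Unknown action type. */
		if (t == ACTION_VOID)
			continue; /* Any number of VOID actions is fine. */
		if (seen[t])
			return -EINVAL; /* Same type specified twice. */
		seen[t] = 1;
	}
	return 0;
}
```

VOID does not need to become a special case for the rest of the parser, it is
simply skipped by the check.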

> 
> [...]
>>> +/**
>>> + * RTE_FLOW_ACTION_TYPE_MARK
>>> + *
>>> + * Attaches a 32 bit value to packets.
>>> + *
>>> + * This value is arbitrary and application-defined. For compatibility with
>>> + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
>>> + * also set in ol_flags.
>>> + */
>>> +struct rte_flow_action_mark {
>>> +	uint32_t id; /**< 32 bit value to return with packets. */
>>> +};
>>
>> One use case I thought we would be able to do for OVS is classification
>> in hardware and the unique flow id is sent with the packet to software.
>> But in OVS the ufid is 128 bits, so it means we can't and there is still
>> the miniflow extract overhead. I'm not sure if there is a practical way
>> around this.
>>
>> Sugesh (cc'd) has looked at this before and may be able to comment or
>> correct me.
> 
> Yes, we settled on 32 bit because currently no known hardware implementation
> supports more than this. If that changes, another action with a larger type
> shall be provided (no ABI breakage).
> 
> Also since even 64 bit would not be enough for the use case you mention,
> there is no choice but use this as an indirect value (such as an array or
> hash table index/value).

ok, cool. I think Sugesh has other ideas anyway!
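
The indirect use of MARK mentioned above could look like the following
sketch, where the 32 bit value is an index into an application-side table of
128 bit IDs. Everything here (struct ufid, struct ufid_table, the table size
and both helpers) is made up for illustration and does not come from OVS or
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UFID_TABLE_SIZE 1024 /* Arbitrary size chosen for this sketch. */

/* 128 bit unique flow ID, too large to fit in a MARK value. */
struct ufid {
	uint8_t bytes[16];
};

/* Application-side table indexed by the 32 bit MARK value. */
struct ufid_table {
	struct ufid entry[UFID_TABLE_SIZE];
	uint8_t used[UFID_TABLE_SIZE];
};

/*
 * Store ufid in the first free slot and return its index, to be used as the
 * MARK id of the matching flow rule. Return UINT32_MAX if the table is full.
 */
static uint32_t
ufid_table_add(struct ufid_table *t, const struct ufid *ufid)
{
	uint32_t i;

	assert(t && ufid);
	for (i = 0; i < UFID_TABLE_SIZE; ++i) {
		if (!t->used[i]) {
			t->entry[i] = *ufid;
			t->used[i] = 1;
			return i;
		}
	}
	return UINT32_MAX;
}

/* Map a MARK value read from a packet back to its ufid, NULL if unknown. */
static const struct ufid *
ufid_table_lookup(const struct ufid_table *t, uint32_t mark)
{
	assert(t);
	if (mark >= UFID_TABLE_SIZE || !t->used[mark])
		return NULL;
	return &t->entry[mark];
}
```

The lookup replaces a direct 128 bit ID in the packet with one bounds check
and one array access on the receive path.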

> 
> [...]
>>> +/**
>>> + * RTE_FLOW_ACTION_TYPE_RSS
>>> + *
>>> + * Similar to QUEUE, except RSS is additionally performed on packets to
>>> + * spread them among several queues according to the provided parameters.
>>> + *
>>> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
>>> + * however it conflicts with the MARK action as they share the same
>>> + * space. When both actions are specified, the RSS hash is discarded and
>>> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
>>> + * structure should eventually evolve to store both.
>>> + *
>>> + * Terminating by default.
>>> + */
>>> +struct rte_flow_action_rss {
>>> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
>>> +	uint16_t queues; /**< Number of entries in queue[]. */
>>> +	uint16_t queue[]; /**< Queues indices to use. */
>>
>> I'd try and avoid queue and queues - someone will say "huh?" when
>> reading code. s/queues/num ?
> 
> Agreed, will update for v2.
> 
>>> +};
>>> +
>>> +/**
>>> + * RTE_FLOW_ACTION_TYPE_VF
>>> + *
>>> + * Redirects packets to a virtual function (VF) of the current device.
>>> + *
>>> + * Packets matched by a VF pattern item can be redirected to their original
>>> + * VF ID instead of the specified one. This parameter may not be available
>>> + * and is not guaranteed to work properly if the VF part is matched by a
>>> + * prior flow rule or if packets are not addressed to a VF in the first
>>> + * place.
>>
>> Not clear what you mean by "not guaranteed to work if...". Please return
>> fail when this action is used if this is not going to work.
> 
> Again, this is a case where it is difficult for a PMD to determine if the
> entire list of flow rules makes sense. Perhaps it does, perhaps whatever
> goes through has already been filtered out of possible issues.
> 
> Here the documentation states the precautions an application should take to
> guarantee it will work as intended. Perhaps it can be reworded (any
> suggestion?), but a PMD can certainly not provide any strong guarantee.

I see your point. Maybe for things that are easy to check the pmd would
return failure, but for more complex cases I agree it's too difficult.

> 
>>> + *
>>> + * Terminating by default.
>>> + */
>>> +struct rte_flow_action_vf {
>>> +	uint32_t original:1; /**< Use original VF ID if possible. */
>>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
>>> +	uint32_t id; /**< VF ID to redirect packets to. */
>>> +};
> [...]
>>> +/**
>>> + * Check whether a flow rule can be created on a given port.
>>> + *
>>> + * While this function has no effect on the target device, the flow rule is
>>> + * validated against its current configuration state and the returned value
>>> + * should be considered valid by the caller for that state only.
>>> + *
>>> + * The returned value is guaranteed to remain valid only as long as no
>>> + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
>>> + * the meantime and no device parameter affecting flow rules in any way are
>>> + * modified, due to possible collisions or resource limitations (although in
>>> + * such cases EINVAL should not be returned).
>>> + *
>>> + * @param port_id
>>> + *   Port identifier of Ethernet device.
>>> + * @param[in] attr
>>> + *   Flow rule attributes.
>>> + * @param[in] pattern
>>> + *   Pattern specification (list terminated by the END pattern item).
>>> + * @param[in] actions
>>> + *   Associated actions (list terminated by the END action).
>>> + * @param[out] error
>>> + *   Perform verbose error reporting if not NULL.
>>> + *
>>> + * @return
>>> + *   0 if flow rule is valid and can be created. A negative errno value
>>> + *   otherwise (rte_errno is also set), the following errors are defined:
>>> + *
>>> + *   -ENOSYS: underlying device does not support this functionality.
>>> + *
>>> + *   -EINVAL: unknown or invalid rule specification.
>>> + *
>>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
>>> + *   bit-masks are unsupported).
>>> + *
>>> + *   -EEXIST: collision with an existing rule.
>>> + *
>>> + *   -ENOMEM: not enough resources.
>>> + *
>>> + *   -EBUSY: action cannot be performed due to busy device resources, may
>>> + *   succeed if the affected queues or even the entire port are in a stopped
>>> + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
>>> + */
>>> +int
>>> +rte_flow_validate(uint8_t port_id,
>>> +		  const struct rte_flow_attr *attr,
>>> +		  const struct rte_flow_item pattern[],
>>> +		  const struct rte_flow_action actions[],
>>> +		  struct rte_flow_error *error);
>>
>> Why not just use rte_flow_create() and get an error? Is it less
>> disruptive to do a validate and find the rule cannot be created, than
>> using a create directly?
> 
> The rationale can be found in the original RFC, which I'll convert to actual
> documentation in v2. In short:
> 
> - Calling rte_flow_validate() before rte_flow_create() is useless since
>   rte_flow_create() also performs validation.
> 
> - We cannot possibly express a full static set of allowed flow rules, even
>   if we could, it usually depends on the current hardware configuration
>   therefore would not be static.
> 
> - rte_flow_validate() is thus provided as a replacement for capability
>   flags. It can be used to determine during initialization if the underlying
>   device can support the typical flow rules an application might want to
>   provide later and do something useful with that information (e.g. always
>   use software fallback due to HW limitations).
> 
> - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
>   free to expose.

Makes sense now, thanks.

> 
>>> +
>>> +/**
>>> + * Create a flow rule on a given port.
>>> + *
>>> + * @param port_id
>>> + *   Port identifier of Ethernet device.
>>> + * @param[in] attr
>>> + *   Flow rule attributes.
>>> + * @param[in] pattern
>>> + *   Pattern specification (list terminated by the END pattern item).
>>> + * @param[in] actions
>>> + *   Associated actions (list terminated by the END action).
>>> + * @param[out] error
>>> + *   Perform verbose error reporting if not NULL.
>>> + *
>>> + * @return
>>> + *   A valid handle in case of success, NULL otherwise and rte_errno is set
>>> + *   to the positive version of one of the error codes defined for
>>> + *   rte_flow_validate().
>>> + */
>>> +struct rte_flow *
>>> +rte_flow_create(uint8_t port_id,
>>> +		const struct rte_flow_attr *attr,
>>> +		const struct rte_flow_item pattern[],
>>> +		const struct rte_flow_action actions[],
>>> +		struct rte_flow_error *error);
>>
>> General question - are these functions threadsafe? In the OVS example
>> you could have several threads wanting to create flow rules at the same
>> time for same or different ports.
> 
> No they aren't, applications have to perform their own locking. The RFC (to
> be converted to actual documentation in v2) says that:
> 
> - API operations are synchronous and blocking (``EAGAIN`` cannot be
>   returned).
> 
> - There is no provision for reentrancy/multi-thread safety, although nothing
>   should prevent different devices from being configured at the same
>   time. PMDs may protect their control path functions accordingly.

other comment above wrt locking.

> 
>>> +
>>> +/**
>>> + * Destroy a flow rule on a given port.
>>> + *
>>> + * Failure to destroy a flow rule handle may occur when other flow rules
>>> + * depend on it, and destroying it would result in an inconsistent state.
>>> + *
>>> + * This function is only guaranteed to succeed if handles are destroyed in
>>> + * reverse order of their creation.
>>
>> How can the application find this information out on error?
> 
> Without maintaining a list, they cannot. The specified case is the only
> possible guarantee. That does not mean PMDs should not do their best to
> destroy flow rules, only that ordering must remain consistent in case of
> inability to destroy one.
> 
> What do you suggest?

I think if the app cannot remove a specific rule it may want to remove
all rules and deal with flows in software for a time. So once the app
knows it fails that should be enough.
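
A minimal sketch of what that could look like on the application side: keep
handles in creation order, destroy them newest first (the only order the
comment guarantees), and stop at the first failure so the caller can decide
to flush and fall back to software. struct rule_stack and its helpers are
made-up names, and log_destroy() is only a stand-in for the real destroy
call that records the order it was invoked in.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RULES 64 /* Arbitrary limit for this sketch. */

/* Opaque flow rule handles kept in creation order, oldest first. */
struct rule_stack {
	void *handle[MAX_RULES];
	size_t num;
};

/* Remember a newly created handle; return -1 if the stack is full. */
static int
rule_stack_push(struct rule_stack *s, void *handle)
{
	assert(s && handle);
	if (s->num == MAX_RULES)
		return -1;
	s->handle[s->num++] = handle;
	return 0;
}

/*
 * Destroy handles in reverse order of creation and stop at the first
 * failure. Return the number of handles still tracked, 0 on full success.
 */
static size_t
rule_stack_unwind(struct rule_stack *s, int (*destroy)(void *handle))
{
	assert(s && destroy);
	while (s->num > 0) {
		if (destroy(s->handle[s->num - 1]) != 0)
			break;
		--s->num;
	}
	return s->num;
}

/* Stand-in destroy callback: logs the int each handle points to. */
static int destroy_log[MAX_RULES];
static size_t destroy_log_num;

static int
log_destroy(void *handle)
{
	destroy_log[destroy_log_num++] = *(int *)handle;
	return 0;
}
```

A nonzero return from rule_stack_unwind() is the point where the application
would flush the port and handle flows in software for a while.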

> 
>>> + *
>>> + * @param port_id
>>> + *   Port identifier of Ethernet device.
>>> + * @param flow
>>> + *   Flow rule handle to destroy.
>>> + * @param[out] error
>>> + *   Perform verbose error reporting if not NULL.
>>> + *
>>> + * @return
>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>>> + */
>>> +int
>>> +rte_flow_destroy(uint8_t port_id,
>>> +		 struct rte_flow *flow,
>>> +		 struct rte_flow_error *error);
>>> +
>>> +/**
>>> + * Destroy all flow rules associated with a port.
>>> + *
>>> + * In the unlikely event of failure, handles are still considered destroyed
>>> + * and no longer valid but the port must be assumed to be in an inconsistent
>>> + * state.
>>> + *
>>> + * @param port_id
>>> + *   Port identifier of Ethernet device.
>>> + * @param[out] error
>>> + *   Perform verbose error reporting if not NULL.
>>> + *
>>> + * @return
>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>>> + */
>>> +int
>>> +rte_flow_flush(uint8_t port_id,
>>> +	       struct rte_flow_error *error);
>>
>> rte_flow_destroy_all() would be more descriptive (but breaks your style)
> 
> There are enough underscores as it is. I like flush, if enough people
> complain we'll change it but it has to occur before the first public
> release.
> 
>>> +
>>> +/**
>>> + * Query an existing flow rule.
>>> + *
>>> + * This function allows retrieving flow-specific data such as counters.
>>> + * Data is gathered by special actions which must be present in the flow
>>> + * rule definition.
>>
>> re last sentence, it would be good if you can put a link to
>> RTE_FLOW_ACTION_TYPE_COUNT
> 
> Will do, I did not know how until very recently.
> 
>>> + *
>>> + * @param port_id
>>> + *   Port identifier of Ethernet device.
>>> + * @param flow
>>> + *   Flow rule handle to query.
>>> + * @param action
>>> + *   Action type to query.
>>> + * @param[in, out] data
>>> + *   Pointer to storage for the associated query data type.
>>
>> can this be anything other than rte_flow_query_count?
> 
> Likely in the future. I've only defined this one as a counterpart for
> existing API functionality and because we wanted to expose it in mlx5.
> 
>>> + * @param[out] error
>>> + *   Perform verbose error reporting if not NULL.
>>> + *
>>> + * @return
>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>>> + */
>>> +int
>>> +rte_flow_query(uint8_t port_id,
>>> +	       struct rte_flow *flow,
>>> +	       enum rte_flow_action_type action,
>>> +	       void *data,
>>> +	       struct rte_flow_error *error);
>>> +
>>> +#ifdef __cplusplus
>>> +}
>>> +#endif
>>
>> I don't see a way to dump all the rules for a port out. I think this is
>> necessary for debugging. You could have a look through dpif.h in OVS
>> and see how dpif_flow_dump_next() is used, it might be a good reference.
> 
> DPDK does not maintain flow rules and, depending on hardware capabilities
> and level of compliance, PMDs do not necessarily do it either, particularly
> since it requires space and applications probably have a better method to
> store these pointers for their own needs.

understood

> 
> What you see here is only a PMD interface. Depending on applications needs,
> generic helper functions built on top of these may be added to manage flow
> rules in the future.

I'm thinking of the case where something goes wrong and I want to get a
dump of all the flow rules from hardware, not query the rules I think I
have. I don't see a way to do it or something to build a helper on top of?

> 
>> Also, it would be nice if there were an api that would allow a test
>> packet to be injected and traced for debugging - although I'm not
>> exactly sure how well it could be traced. For reference:
>> http://developers.redhat.com/blog/2016/10/12/tracing-packets-inside-open-vswitch/
> 
> Thanks for the link, I'm not sure how you'd do this either. Remember, as
> generic as it looks, this interface is only meant to configure the
> underlying device. You need to see it as one big offload, everything else
> is left to applications.
> 

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-02 21:06           ` Kevin Traynor
@ 2016-12-06 18:11             ` Chandran, Sugesh
  2016-12-08 15:09               ` Adrien Mazarguil
  2016-12-08 17:07             ` Adrien Mazarguil
  1 sibling, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-12-06 18:11 UTC (permalink / raw)
  To: Kevin Traynor, Adrien Mazarguil
  Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz,
	sugesh.chandran

Hi Adrien,
Thanks for sending out the patches,

Please find a few comments below.


Regards
_Sugesh


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Kevin Traynor
> Sent: Friday, December 2, 2016 9:07 PM
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> <olivier.matz@6wind.com>; sugesh.chandran@intel.comn
> Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
>>>>>>Snipp
> >>> + *
> >>> + * Attaches a 32 bit value to packets.
> >>> + *
> >>> + * This value is arbitrary and application-defined. For
> >>> +compatibility with
> >>> + * FDIR it is returned in the hash.fdir.hi mbuf field.
> >>> +PKT_RX_FDIR_ID is
> >>> + * also set in ol_flags.
> >>> + */
> >>> +struct rte_flow_action_mark {
> >>> +	uint32_t id; /**< 32 bit value to return with packets. */ };
> >>
> >> One use case I thought we would be able to do for OVS is
> >> classification in hardware and the unique flow id is sent with the packet to
> software.
> >> But in OVS the ufid is 128 bits, so it means we can't and there is
> >> still the miniflow extract overhead. I'm not sure if there is a
> >> practical way around this.
> >>
> >> Sugesh (cc'd) has looked at this before and may be able to comment or
> >> correct me.
> >
> > Yes, we settled on 32 bit because currently no known hardware
> > implementation supports more than this. If that changes, another
> > action with a larger type shall be provided (no ABI breakage).
> >
> > Also since even 64 bit would not be enough for the use case you
> > mention, there is no choice but use this as an indirect value (such as
> > an array or hash table index/value).
> 
> ok, cool. I think Sugesh has other ideas anyway!
[Sugesh] It should be fine with 32 bits. We can manage it in OVS accordingly.
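
As a rough illustration of the indirect-value approach mentioned above, the
32-bit MARK value can serve as an index into an application-side table that
holds the wider identifier. This is only a sketch under stated assumptions:
struct ufid128, struct mark_table, MARK_TABLE_SIZE and both helpers are
hypothetical names, not part of rte_flow or OVS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit flow identifier, standing in for the OVS ufid. */
struct ufid128 {
	uint32_t u32[4];
};

#define MARK_TABLE_SIZE 1024u /* illustrative capacity only */

/* Table indexed by the 32-bit MARK value returned with packets. */
struct mark_table {
	struct ufid128 ufid[MARK_TABLE_SIZE];
	uint8_t used[MARK_TABLE_SIZE];
};

/* Store ufid in the first free slot; return its index (the MARK id) or -1. */
static int64_t
mark_table_add(struct mark_table *t, const struct ufid128 *ufid)
{
	uint32_t i;

	assert(t != NULL && ufid != NULL);
	for (i = 0; i < MARK_TABLE_SIZE; i++) {
		if (!t->used[i]) {
			t->ufid[i] = *ufid;
			t->used[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Resolve a MARK id read from a received packet back to its ufid. */
static const struct ufid128 *
mark_table_lookup(const struct mark_table *t, uint32_t mark)
{
	assert(t != NULL);
	if (mark >= MARK_TABLE_SIZE || !t->used[mark])
		return NULL;
	return &t->ufid[mark];
}
```

The index returned by mark_table_add() would be the value placed in the
MARK action; on receive, the value found in the mbuf is passed to
mark_table_lookup(). A linear scan keeps the sketch short; a free list or
hash table would be the more likely choice in practice.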
> 
> >
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_RSS
> >>> + *
> >>> +
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_vf {
> >>> +	uint32_t original:1; /**< Use original VF ID if possible. */
> >>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> >>> +	uint32_t id; /**< VF ID to redirect packets to. */ };
> > [...]
> >>> +/**
> >>> + * Check whether a flow rule can be created on a given port.
> >>> + *
> >>> + * While this function has no effect on the target device, the flow
> >>> +rule is
> >>> + * validated against its current configuration state and the
> >>> +returned value
> >>> + * should be considered valid by the caller for that state only.
> >>> + *
> >>> + * The returned value is guaranteed to remain valid only as long as
> >>> +no
> >>> + * successful calls to rte_flow_create() or rte_flow_destroy() are
> >>> +made in
> >>> + * the meantime and no device parameter affecting flow rules in any
> >>> +way are
> >>> + * modified, due to possible collisions or resource limitations
> >>> +(although in
> >>> + * such cases EINVAL should not be returned).
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 if flow rule is valid and can be created. A negative errno value
> >>> + *   otherwise (rte_errno is also set), the following errors are defined:
> >>> + *
> >>> + *   -ENOSYS: underlying device does not support this functionality.
> >>> + *
> >>> + *   -EINVAL: unknown or invalid rule specification.
> >>> + *
> >>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> >>> + *   bit-masks are unsupported).
> >>> + *
> >>> + *   -EEXIST: collision with an existing rule.
> >>> + *
> >>> + *   -ENOMEM: not enough resources.
> >>> + *
> >>> + *   -EBUSY: action cannot be performed due to busy device resources,
> may
> >>> + *   succeed if the affected queues or even the entire port are in a
> stopped
> >>> + *   state (see rte_eth_dev_rx_queue_stop() and
> rte_eth_dev_stop()).
> >>> + */
> >>> +int
> >>> +rte_flow_validate(uint8_t port_id,
> >>> +		  const struct rte_flow_attr *attr,
> >>> +		  const struct rte_flow_item pattern[],
> >>> +		  const struct rte_flow_action actions[],
> >>> +		  struct rte_flow_error *error);
> >>
> >> Why not just use rte_flow_create() and get an error? Is it less
> >> disruptive to do a validate and find the rule cannot be created, than
> >> using a create directly?
> >
> > The rationale can be found in the original RFC, which I'll convert to
> > actual documentation in v2. In short:
> >
> > - Calling rte_flow_validate() before rte_flow_create() is useless since
> >   rte_flow_create() also performs validation.
> >
> > - We cannot possibly express a full static set of allowed flow rules, even
> >   if we could, it usually depends on the current hardware configuration
> >   therefore would not be static.
> >
> > - rte_flow_validate() is thus provided as a replacement for capability
> >   flags. It can be used to determine during initialization if the underlying
> >   device can support the typical flow rules an application might want to
> >   provide later and do something useful with that information (e.g. always
> >   use software fallback due to HW limitations).
> >
> > - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
> >   free to expose.
> 
> make sense now, thanks.
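
The "replacement for capability flags" idea above can be sketched as an
initialization-time probe: try each rule the application expects to need
and record which ones the device accepts. This is a minimal sketch, not
the actual API: validate_fn stands in for a call to rte_flow_validate() on
one port, and struct rule_template, probe_supported() and fake_validate()
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of one "typical" rule the application needs. */
struct rule_template {
	const char *name;
	int needs_partial_mask; /* illustrative property only */
};

/*
 * Stand-in for rte_flow_validate() bound to a port: returns 0 if the rule
 * could be created, a negative errno value otherwise.
 */
typedef int (*validate_fn)(const struct rule_template *rule);

/*
 * Probe every template once and fill supported[] with 1 or 0.
 * Returns the number of templates the device accepted; the others would
 * have to be handled by a software fallback.
 */
static size_t
probe_supported(validate_fn validate, const struct rule_template *rules,
		size_t n, int *supported)
{
	size_t i, ok = 0;

	assert(validate != NULL);
	for (i = 0; i < n; i++) {
		supported[i] = (validate(&rules[i]) == 0);
		ok += (size_t)supported[i];
	}
	return ok;
}

/* Example device model: rejects partial bit-masks with -ENOTSUP. */
static int
fake_validate(const struct rule_template *rule)
{
	return rule->needs_partial_mask ? -ENOTSUP : 0;
}
```

As noted in the quoted documentation, such a result only holds for the
device configuration at the time of the probe, so a later create can
still fail and has to be checked as well.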
[Sugesh]: We had this discussion earlier, at the design stage, about the time taken to program the hardware
and how to make it deterministic. How about adding a timeout parameter to the rte_flow_* functions as well?
If the hardware flow insertion times out, it would return an error rather than wait indefinitely, so that the application has some control over
the time taken to program the flow. This could be another set of APIs, something like rte_flow_create_timeout().

Are you going to provide any control over the initialization of the NIC to define the capability matrices?
For example, to operate in an L3 router mode, software may want to initialize the NIC port to consider only the L2 and L3 fields.
I assume the initialization is done based on the first rules that are programmed into the NIC?
> 
> >
> >>> +
> >>> +/**
> >>> + * Create a flow rule on a given port.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   A valid handle in case of success, NULL otherwise and rte_errno is
> set
> >>> + *   to the positive version of one of the error codes defined for
> >>> + *   rte_flow_validate().
> >>> + */
> >>> +struct rte_flow *
> >>> +rte_flow_create(uint8_t port_id,
> >>> +		const struct rte_flow_attr *attr,
> >>> +		const struct rte_flow_item pattern[],
> >>> +		const struct rte_flow_action actions[],
> >>> +		struct rte_flow_error *error);
> >>
> >> General question - are these functions threadsafe? In the OVS example
> >> you could have several threads wanting to create flow rules at the
> >> same time for same or different ports.
> >
> > No they aren't, applications have to perform their own locking. The
> > RFC (to be converted to actual documentation in v2) says that:
> >
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> >
> > - There is no provision for reentrancy/multi-thread safety, although
> nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> 
> other comment above wrt locking.
> 
> >
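
Since the API leaves locking to the application, one possible arrangement
is a per-port mutex around control-path calls, so that rule creation on
different ports can still proceed in parallel. The following is a sketch
under assumptions: struct port_ctrl, create_fn, port_flow_create_locked()
and fake_create() are hypothetical, and create_fn merely stands in for the
non-thread-safe rte_flow_create() call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-port control-path state owned by the application. */
struct port_ctrl {
	pthread_mutex_t lock;
	unsigned int nb_rules; /* rules created through this wrapper */
};

/* Stand-in for the non-thread-safe call; returns 0 on success. */
typedef int (*create_fn)(void *arg);

/* Serialize rule creation on one port; each port has its own lock. */
static int
port_flow_create_locked(struct port_ctrl *pc, create_fn create, void *arg)
{
	int ret;

	assert(pc != NULL && create != NULL);
	pthread_mutex_lock(&pc->lock);
	ret = create(arg);
	if (ret == 0)
		pc->nb_rules++;
	pthread_mutex_unlock(&pc->lock);
	return ret;
}

/* Trivial creation callback used to exercise the wrapper. */
static int
fake_create(void *arg)
{
	(void)arg;
	return 0;
}
```

Because the calls are described as synchronous and blocking, a thread
holding the lock may stall the others for as long as the PMD takes, which
is worth keeping in mind when choosing the lock granularity.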

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
  2016-11-18  6:36       ` Xing, Beilei
  2016-11-30 17:47       ` Kevin Traynor
@ 2016-12-08  9:00       ` Xing, Beilei
  2016-12-08 14:50         ` Adrien Mazarguil
  2 siblings, 1 reply; 262+ messages in thread
From: Xing, Beilei @ 2016-12-08  9:00 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Thursday, November 17, 2016 12:23 AM
> To: dev@dpdk.org
> Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch,
> Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> <olivier.matz@6wind.com>
> Subject: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
> This new API supersedes all the legacy filter types described in rte_eth_ctrl.h.
> It is slightly higher level and as a result relies more on PMDs to process and
> validate flow rules.
> 
> Benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> Existing filter types will be deprecated and removed in the near future.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  MAINTAINERS                            |   4 +
>  lib/librte_ether/Makefile              |   3 +
>  lib/librte_ether/rte_eth_ctrl.h        |   1 +
>  lib/librte_ether/rte_ether_version.map |  10 +
>  lib/librte_ether/rte_flow.c            | 159 +++++
>  lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_flow_driver.h     | 177 ++++++
>  7 files changed, 1301 insertions(+)
> 
> +/**
> + * RTE_FLOW_ITEM_TYPE_ETH
> + *
> + * Matches an Ethernet header.
> + */
> +struct rte_flow_item_eth {
> +	struct ether_addr dst; /**< Destination MAC. */
> +	struct ether_addr src; /**< Source MAC. */
> +	unsigned int type; /**< EtherType. */
Hi Adrien,

ETHERTYPE in ether header is 2 bytes, so I think "uint16_t type" is more appropriate here, what do you think?

Thanks,
Beilei Xing
> +};
> +

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-08  9:00       ` Xing, Beilei
@ 2016-12-08 14:50         ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-08 14:50 UTC (permalink / raw)
  To: Xing, Beilei; +Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Beilei,

On Thu, Dec 08, 2016 at 09:00:05AM +0000, Xing, Beilei wrote:
[...]
> > +/**
> > + * RTE_FLOW_ITEM_TYPE_ETH
> > + *
> > + * Matches an Ethernet header.
> > + */
> > +struct rte_flow_item_eth {
> > +	struct ether_addr dst; /**< Destination MAC. */
> > +	struct ether_addr src; /**< Source MAC. */
> > +	unsigned int type; /**< EtherType. */
> Hi Adrien,
> 
> ETHERTYPE in ether header is 2 bytes, so I think "uint16_t type" is more appropriate here, what do you think?

You're right, thanks for catching this. I'll update it in v2 (soon).
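
For illustration, a 16-bit EtherType field could look like the sketch
below. It is not the v2 definition: struct mac_addr and struct
flow_item_eth_sketch are hypothetical stand-ins, and storing the field in
network byte order is an assumption made here, not something stated in
this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h> /* htons() */

/* Hypothetical stand-in for struct ether_addr. */
struct mac_addr {
	uint8_t bytes[6];
};

/* Sketch of the item with a 2-byte EtherType, as suggested in the review. */
struct flow_item_eth_sketch {
	struct mac_addr dst; /* Destination MAC. */
	struct mac_addr src; /* Source MAC. */
	uint16_t type;       /* EtherType, assumed network byte order. */
};

/* Build a spec matching one EtherType given in host byte order. */
static struct flow_item_eth_sketch
eth_spec_for_type(uint16_t host_type)
{
	struct flow_item_eth_sketch spec = { .type = htons(host_type) };

	return spec;
}

/* Return 1 if the on-wire EtherType matches spec under mask, 0 otherwise. */
static int
eth_type_matches(uint16_t wire_type, const struct flow_item_eth_sketch *spec,
		 const struct flow_item_eth_sketch *mask)
{
	assert(spec != NULL && mask != NULL);
	return (wire_type & mask->type) == (spec->type & mask->type);
}
```

With a 2-byte field, a full mask is simply 0xffff and no bits beyond the
header field can be set by mistake.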

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-06 18:11             ` Chandran, Sugesh
@ 2016-12-08 15:09               ` Adrien Mazarguil
  2016-12-09 12:18                 ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-08 15:09 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: Kevin Traynor, dev, Thomas Monjalon, De Lara Guarch, Pablo,
	Olivier Matz, sugesh.chandran

Hi Sugesh,

On Tue, Dec 06, 2016 at 06:11:38PM +0000, Chandran, Sugesh wrote:
[...]
> > >>> +int
> > >>> +rte_flow_validate(uint8_t port_id,
> > >>> +		  const struct rte_flow_attr *attr,
> > >>> +		  const struct rte_flow_item pattern[],
> > >>> +		  const struct rte_flow_action actions[],
> > >>> +		  struct rte_flow_error *error);
> > >>
> > >> Why not just use rte_flow_create() and get an error? Is it less
> > >> disruptive to do a validate and find the rule cannot be created, than
> > >> using a create directly?
> > >
> > > The rationale can be found in the original RFC, which I'll convert to
> > > actual documentation in v2. In short:
> > >
> > > - Calling rte_flow_validate() before rte_flow_create() is useless since
> > >   rte_flow_create() also performs validation.
> > >
> > > - We cannot possibly express a full static set of allowed flow rules, even
> > >   if we could, it usually depends on the current hardware configuration
> > >   therefore would not be static.
> > >
> > > - rte_flow_validate() is thus provided as a replacement for capability
> > >   flags. It can be used to determine during initialization if the underlying
> > >   device can support the typical flow rules an application might want to
> > >   provide later and do something useful with that information (e.g. always
> > >   use software fallback due to HW limitations).
> > >
> > > - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
> > >   free to expose.
> > 
> > make sense now, thanks.
> [Sugesh]: We had this discussion earlier, at the design stage, about the time taken to program the hardware
> and how to make it deterministic. How about adding a timeout parameter to the rte_flow_* functions as well?
> If the hardware flow insertion times out, it would return an error rather than wait indefinitely, so that the application has some control over
> the time taken to program the flow. This could be another set of APIs, something like rte_flow_create_timeout().

Yes, as discussed, the existing API does not impose any timing constraints
on PMDs; validate() and create() may take forever to complete, although PMDs
are strongly encouraged to take as little time as possible.

Like you suggested, this could be done through distinct API calls. The
validate() function would also have its _timeout() counterpart since the set
of possible rules could be restricted in that mode.

> Are you going to provide any control over the initialization of the NIC to define the capability matrices?
> For example, to operate in an L3 router mode, software may want to initialize the NIC port to consider only the L2 and L3 fields.
> I assume the initialization is done based on the first rules that are programmed into the NIC?

Precisely, PMDs are supposed to determine the most appropriate device mode
to use in order to handle the requested rules. They may even switch to
another mode if necessary assuming this does not break existing constraints.

I think we've discussed an atomic (commit-based) mode of operation through
separate functions as well, where the application would attempt to create a
bunch of rules at once, possibly making it easier for PMDs to determine the
most appropriate mode of operation for the device.

All of these may be added later according to user feedback once the basic
API has settled.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-12-02 16:58     ` Ferruh Yigit
@ 2016-12-08 15:19       ` Adrien Mazarguil
  2016-12-08 17:56         ` Ferruh Yigit
  2016-12-15 12:20         ` Ferruh Yigit
  0 siblings, 2 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-08 15:19 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz

Hi Ferruh,

On Fri, Dec 02, 2016 at 04:58:53PM +0000, Ferruh Yigit wrote:
> Hi Adrien,
> 
> On 11/16/2016 4:23 PM, Adrien Mazarguil wrote:
> > As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> > described in [3] (also pasted below), here is the first non-draft series
> > for this new API.
> > 
> > Its capabilities are so generic that its name had to be vague, it may be
> > called "Generic flow API", "Generic flow interface" (possibly shortened
> > as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> > the prefix used for its public symbols. I personally favor the latter.
> > 
> > While it is currently meant to supersede existing filter types in order for
> > all PMDs to expose a common filtering/classification interface, it may
> > eventually evolve to cover the following ideas as well:
> > 
> > - Rx/Tx offloads configuration through automatic offloads for specific
> >   packets, e.g. performing checksum on TCP packets could be expressed with
> >   an egress rule with a TCP pattern and a kind of checksum action.
> > 
> > - RSS configuration (already defined actually). Could be global or per rule
> >   depending on hardware capabilities.
> > 
> > - Switching configuration for devices with many physical ports; rules doing
> >   both ingress and egress could even be used to completely bypass software
> >   if supported by hardware.
> > 
> >  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
> >  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
> >  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
> > 
> > Changes since RFC v2:
> > 
> > - New separate VLAN pattern item (previously part of the ETH definition),
> >   found to be much more convenient.
> > 
> > - Removed useless "any" field from VF pattern item, the same effect can be
> >   achieved by not providing a specification structure.
> > 
> > - Replaced bit-fields from the VXLAN pattern item to avoid endianness
> >   conversion issues on 24-bit fields.
> > 
> > - Updated struct rte_flow_item with a new "last" field to create inclusive
> >   ranges. They are defined as the interval between (spec & mask) and
> >   (last & mask). All three parameters are optional.
> > 
> > - Renamed ID action MARK.
> > 
> > - Renamed "queue" fields in actions QUEUE and DUP to "index".
> > 
> > - "rss_conf" field in RSS action is now const.
> > 
> > - VF action now uses a 32 bit ID like its pattern item counterpart.
> > 
> > - Removed redundant struct rte_flow_pattern, API functions now expect
> >   struct
> >   rte_flow_item lists terminated by END items.
> > 
> > - Replaced struct rte_flow_actions for the same reason, with struct
> >   rte_flow_action lists terminated by END actions.
> > 
> > - Error types (enum rte_flow_error_type) have been updated and the cause
> >   pointer in struct rte_flow_error is now const.
> > 
> > - Function prototypes (rte_flow_create, rte_flow_validate) have also been
> >   updated for clarity.
> > 
> > Additions:
> > 
> > - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
> >   are now implemented in rte_flow.c, with their symbols exported and
> >   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
> > 
> > - A separate header (rte_flow_driver.h) has been added for driver-side
> >   functionality, in particular struct rte_flow_ops which contains PMD
> >   callbacks returned by RTE_ETH_FILTER_GENERIC query.
> > 
> > - testpmd now exposes most of this API through the new "flow" command.
> > 
> > What remains to be done:
> > 
> > - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
> > 
> > - API documentation (based on RFC).
> > 
> > - testpmd flow command documentation (although context-aware command
> >   completion should already help quite a bit in this regard).
> > 
> > - A few pattern item / action properties cannot be configured yet
> >   (e.g. rss_conf parameter for RSS action) and a few completions
> >   (e.g. possible queue IDs) should be added.
> > 
> 
> <...>
> 
> I was trying to check driver filter API patches, but hit a few compiler
> errors with this patchset.
> 
> [1] clang complains that a bitfield value changes from -1 to 1.
> The warning is correct, but I guess the assignment is intentional and I
> don't know how to tell this to clang.
> 
> [2] shared library compilation error, because of missing rte_flow_flush
> in rte_ether_version.map file
> 
> [3] bunch of icc compilation errors, almost all are same type:
> error #188: enumerated type mixed with another type

Thanks for the report, I'll attempt to address them all in v2. However icc
error #188 looks like a pain, I think I can work around it but do we really
not tolerate the use of normal integers inside enum fields in DPDK?
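
One possible way around reports [1] and [3] is sketched below. It is not
the fix planned for v2: every name is hypothetical, the bit-field
initializer only covers 1-bit fields (the flagged macro is generic over
field widths), and whether icc accepts the named terminator has not been
verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a 1-bit field, similar to the flagged ones. */
struct flags_sketch {
	uint32_t relative:1;
	uint32_t reserved:31;
};

/* All-ones mask for the 1-bit field, written as 1 instead of -1. */
static struct flags_sketch
relative_mask(void)
{
	struct flags_sketch m = { .relative = 1, .reserved = 0 };

	return m;
}

/* Hypothetical token enum with an explicit zero-valued terminator. */
enum index_sketch {
	ZERO_SKETCH = 0, /* list terminator, avoids a bare 0 in enum arrays */
	FLOW_SKETCH,
	PORT_ID_SKETCH,
};

/* A single const qualifier, with the list closed by the named terminator. */
static const enum index_sketch next_sketch[] = {
	FLOW_SKETCH,
	PORT_ID_SKETCH,
	ZERO_SKETCH,
};

/* Count entries before the terminator. */
static unsigned int
next_count(const enum index_sketch *list)
{
	unsigned int n = 0;

	assert(list != NULL);
	while (list[n] != ZERO_SKETCH)
		n++;
	return n;
}
```

Writing the terminator as an enumerator keeps every array element of the
enumerated type, which is what error #188 appears to be about, and a
single const on the array declaration avoids the duplicate-specifier
error from [1].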

> [1]
> =============================
> .../app/test-pmd/cmdline_flow.c:944:16: error: implicit truncation from
> 'int' to bitfield changes value from -1 to 1
> [-Werror,-Wbitfield-constant-conversion]
>                 .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
> relative)),
>                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
> 'ARGS_ENTRY_BF'
>                 .mask = (const void *)&(const s){ .f = -1 }, \
>                                                        ^~
> .../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
> #define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
>                                                 ^~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:950:16: error: implicit truncation from
> 'int' to bitfield changes value from -1 to 1
> [-Werror,-Wbitfield-constant-conversion]
>                 .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
> search)),
>                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
> 'ARGS_ENTRY_BF'
>                 .mask = (const void *)&(const s){ .f = -1 }, \
>                                                        ^~
> .../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
> #define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
>                                                 ^~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:1293:16: error: implicit truncation from
> 'int' to bitfield changes value from -1 to 1
> [-Werror,-Wbitfield-constant-conversion]
>                 .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
>                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:282:42: note: expanded from macro
> 'ARGS_ENTRY_BF'
>                 .mask = (const void *)&(const s){ .f = -1 }, \
>                                                        ^~
> .../app/test-pmd/cmdline_flow.c:269:49: note: expanded from macro 'ARGS'
> #define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
>                                                 ^~~~~~~~~~~
> .../app/test-pmd/cmdline_flow.c:1664:26: error: duplicate 'const'
> declaration specifier [-Werror,-Wduplicate-decl-specifier]
>         static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
>                                 ^
> 4 errors generated.
> 
> 
> 
> [2]
> =============================
>   LD testpmd
> config.o: In function `port_flow_flush':
> config.c:(.text+0x2231): undefined reference to `rte_flow_flush'
> collect2: error: ld returned 1 exit status
> 
> 
> [3]
> =============================
> .../app/test-pmd/cmdline_flow.c(364): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(370): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(376): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(385): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(406): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(413): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(419): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(425): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(435): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(443): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(450): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(457): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(464): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(471): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(478): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(485): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(492): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(498): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(514): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(520): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(526): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(532): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(538): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(545): error #188: enumerated type mixed
> with another type
>         0,
>         ^
> 
> .../app/test-pmd/cmdline_flow.c(619): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(FLOW)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(716): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(729): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(736): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(743): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(743): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(750): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(757): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(QUERY_ACTION),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(768): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(776): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(792): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(800): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(807): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(864): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTIONS)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(871): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(878): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(891): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(897): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(904): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(917): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(930): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(943): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(949): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(955): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(961): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(967): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw,
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(967): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_raw,
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(987): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(993): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(999): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1012): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1018): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1031): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1038): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1052): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1059): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1073): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1080): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1094): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1101): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1115): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1122): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1136): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1143): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1157): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1182): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1189): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1202): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1210): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1224): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1232): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1239): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1252): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1266): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1279): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1292): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1300): error #188: enumerated type mixed
> with another type
>                 .next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
>                         ^
> 
> .../app/test-pmd/cmdline_flow.c(1599): error #188: enumerated type mixed
> with another type
>                 ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
>                                                ^
> 
> .../app/test-pmd/cmdline_flow.c(1664): error #83: type qualifier
> specified more than once
>         static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
>                                 ^
> 
> .../app/test-pmd/cmdline_flow.c(1664): error #188: enumerated type mixed
> with another type
>         static const enum index const next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
>                                                ^
> 
> .../app/test-pmd/cmdline_flow.c(2302): error #188: enumerated type mixed
> with another type
>         ctx->curr = 0;
>                   ^
> 
> .../app/test-pmd/cmdline_flow.c(2303): error #188: enumerated type mixed
> with another type
>         ctx->prev = 0;
>                   ^
> 
> 

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-02 21:06           ` Kevin Traynor
  2016-12-06 18:11             ` Chandran, Sugesh
@ 2016-12-08 17:07             ` Adrien Mazarguil
  2016-12-14 11:48               ` Kevin Traynor
  1 sibling, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-08 17:07 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandran

On Fri, Dec 02, 2016 at 09:06:42PM +0000, Kevin Traynor wrote:
> On 12/01/2016 08:36 AM, Adrien Mazarguil wrote:
> > Hi Kevin,
> > 
> > On Wed, Nov 30, 2016 at 05:47:17PM +0000, Kevin Traynor wrote:
> >> Hi Adrien,
> >>
> >> On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
> >>> This new API supersedes all the legacy filter types described in
> >>> rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> >>> PMDs to process and validate flow rules.
> >>>
> >>> Benefits:
> >>>
> >>> - A unified API is easier to program for, applications do not have to be
> >>>   written for a specific filter type which may or may not be supported by
> >>>   the underlying device.
> >>>
> >>> - The behavior of a flow rule is the same regardless of the underlying
> >>>   device, applications do not need to be aware of hardware quirks.
> >>>
> >>> - Extensible by design, API/ABI breakage should rarely occur if at all.
> >>>
> >>> - Documentation is self-standing, no need to look up elsewhere.
> >>>
> >>> Existing filter types will be deprecated and removed in the near future.
> >>
> >> I'd suggest to add a deprecation notice to deprecation.rst, ideally with
> >> a target release.
> > 
> > Will do, not sure about the target release though. It seems a bit early
> > since no PMD really supports this API yet.
> > 
> > [...]
> >>> diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> >>> new file mode 100644
> >>> index 0000000..064963d
> >>> --- /dev/null
> >>> +++ b/lib/librte_ether/rte_flow.c
> >>> @@ -0,0 +1,159 @@
> >>> +/*-
> >>> + *   BSD LICENSE
> >>> + *
> >>> + *   Copyright 2016 6WIND S.A.
> >>> + *   Copyright 2016 Mellanox.
> >>
> >> There's Mellanox copyright but you are the only signed-off-by - is that
> >> right?
> > 
> > Yes, I'm the primary maintainer for Mellanox PMDs and this API was designed
> > on their behalf to expose several features from mlx4/mlx5 as the existing
> > filter types had too many limitations.
> > 
> > [...]
> >>> +/* Get generic flow operations structure from a port. */
> >>> +const struct rte_flow_ops *
> >>> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> >>> +{
> >>> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >>> +	const struct rte_flow_ops *ops;
> >>> +	int code;
> >>> +
> >>> +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> >>> +		code = ENODEV;
> >>> +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> >>> +			  dev->dev_ops->filter_ctrl(dev,
> >>> +						    RTE_ETH_FILTER_GENERIC,
> >>> +						    RTE_ETH_FILTER_GET,
> >>> +						    &ops) ||
> >>> +			  !ops))
> >>> +		code = ENOTSUP;
> >>> +	else
> >>> +		return ops;
> >>> +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> >>> +			   NULL, rte_strerror(code));
> >>> +	return NULL;
> >>> +}
> >>> +
> >>
> >> Is it expected that the application or pmd will provide locking between
> >> these functions if required? I think it's going to have to be the app.
> > 
> > Locking is indeed expected to be performed by applications. This API only
> > documents places where locking would make sense if necessary and expected
> > behavior.
> > 
> > Like all control path APIs, this one assumes a single control thread.
> > Applications must take the necessary precautions.
> 
> If you look at OVS now it's quite possible that you have 2 rx queues
> serviced by different threads, that would also install the flow rules in
> the software flow caches - possibly that could extend to adding hardware
> flows. There could also be another thread that is querying for stats. So
> anything that can be done to minimise the locking would be helpful -
> maybe query() could be atomic and not require any locking?

I think we need basic functions with as few constraints as possible on PMDs
first, this API being somewhat complex to implement on their side. That
covers the common use case where applications have a single control thread
or otherwise perform locking on their own.

Once the basics are there for most PMDs, we may add new functions, items,
properties and actions that provide additional constraints (timing,
multi-threading and so on), which remain to be defined according to
feedback. It is designed to be extended without causing ABI breakage.

As for query(), let's see how PMDs handle it first. A race between query()
and create() on a given device is almost unavoidable without locking, same
for queries that reset counters in a given flow rule. Basic parallel queries
should not cause any harm otherwise, although this cannot be guaranteed yet.
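
To illustrate the application-side locking mentioned above, here is a
minimal sketch. It assumes one mutex per port serializing every control path
call. struct app_flow and driver_flow_create() are hypothetical stand-ins for
a flow rule handle and for the PMD-backed call; they are not part of the
proposed API.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a flow rule handle (struct rte_flow in a real
 * application). */
struct app_flow { int id; };

static struct app_flow flow_pool[16];
static size_t flow_count;

/* Hypothetical stand-in for the driver call. Like the control path
 * functions discussed here, it is not thread-safe on its own. */
static struct app_flow *
driver_flow_create(void)
{
	if (flow_count == sizeof(flow_pool) / sizeof(flow_pool[0]))
		return NULL;
	flow_pool[flow_count].id = (int)flow_count;
	return &flow_pool[flow_count++];
}

/* One lock per port serializes all control path calls for that port. */
static pthread_mutex_t port_lock = PTHREAD_MUTEX_INITIALIZER;

/* Rule creation as called from any thread. */
struct app_flow *
app_flow_create_locked(void)
{
	struct app_flow *flow;

	pthread_mutex_lock(&port_lock);
	flow = driver_flow_create();
	pthread_mutex_unlock(&port_lock);
	return flow;
}

/* A statistics thread takes the same lock, so it never observes a
 * half-updated state. */
size_t
app_flow_count_locked(void)
{
	size_t n;

	pthread_mutex_lock(&port_lock);
	n = flow_count;
	pthread_mutex_unlock(&port_lock);
	return n;
}
```

Whether a single per-port lock is acceptable for a given application is a
separate question; the sketch only shows where the locking has to live while
the API itself makes no thread-safety guarantee.
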

> > [...]
> >>> +/**
> >>> + * Flow rule attributes.
> >>> + *
> >>> + * Priorities are set on two levels: per group and per rule within groups.
> >>> + *
> >>> + * Lower values denote higher priority, the highest priority for both levels
> >>> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> >>> + * rule with priority 8 in group 0.
> >>> + *
> >>> + * Although optional, applications are encouraged to group similar rules as
> >>> + * much as possible to fully take advantage of hardware capabilities
> >>> + * (e.g. optimized matching) and work around limitations (e.g. a single
> >>> + * pattern type possibly allowed in a given group).
> >>> + *
> >>> + * Group and priority levels are arbitrary and up to the application, they
> >>> + * do not need to be contiguous nor start from 0, however the maximum number
> >>> + * varies between devices and may be affected by existing flow rules.
> >>> + *
> >>> + * If a packet is matched by several rules of a given group for a given
> >>> + * priority level, the outcome is undefined. It can take any path, may be
> >>> + * duplicated or even cause unrecoverable errors.
> >>
> >> I get what you are trying to do here wrt supporting multiple
> >> pmds/hardware implementations and it's a good idea to keep it flexible.
> >>
> >> Given that the outcome is undefined, it would be nice that the
> >> application has a way of finding the specific effects for verification
> >> and debugging.
> > 
> > Right, however it was deemed a bit difficult to manage in many cases hence
> > the vagueness.
> > 
> > For example, suppose two rules with the same group and priority, one
> > matching any IPv4 header, the other one any UDP header:
> > 
> > - TCPv4 packets => rule #1.
> > - UDPv6 packets => rule #2.
> > - UDPv4 packets => both?
> > 
> > That last one is perhaps invalid, checking that some unspecified protocol
> > combination does not overlap is expensive and may miss corner cases, even
> > assuming this is not an issue, what if the application guarantees that no
> > UDPv4 packets can ever hit that rule?
> 
> that's fine - I don't expect the software to be able to know what the
> hardware will do with those rules. It's more about trying to get a dump
> from the hardware if something goes wrong. Anyway covered in comment later.
> 
> > 
> > Suggestions are welcome though, perhaps we can refine the description.
> > 
> >>> + *
> >>> + * Note that support for more than a single group and priority level is not
> >>> + * guaranteed.
> >>> + *
> >>> + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
> >>> + *
> >>> + * Several pattern items and actions are valid and can be used in both
> >>> + * directions. Those valid for only one direction are described as such.
> >>> + *
> >>> + * Specifying both directions at once is not recommended but may be valid in
> >>> + * some cases, such as incrementing the same counter twice.
> >>> + *
> >>> + * Not specifying any direction is currently an error.
> >>> + */
> >>> +struct rte_flow_attr {
> >>> +	uint32_t group; /**< Priority group. */
> >>> +	uint32_t priority; /**< Priority level within group. */
> >>> +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
> >>> +	uint32_t egress:1; /**< Rule applies to egress traffic. */
> >>> +	uint32_t reserved:30; /**< Reserved, must be zero. */
> >>> +};
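
The two-level ordering documented above can be sketched as a comparison
function. struct attr_sketch and attr_cmp() are hypothetical names used for
illustration only; the sketch assumes nothing beyond "lower value means
higher priority, group first, then priority within the group".

```c
#include <stdint.h>

/* Hypothetical subset of the flow rule attributes. */
struct attr_sketch {
	uint32_t group;    /* Priority group, 0 is the highest. */
	uint32_t priority; /* Priority level within group, 0 is the highest. */
};

/*
 * Return a negative value if rule a is matched before rule b, a positive
 * value if it is matched after, and 0 if both sit at the same level (in
 * which case the outcome for overlapping patterns is undefined).
 */
int
attr_cmp(const struct attr_sketch *a, const struct attr_sketch *b)
{
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```

With this ordering, a rule with priority 8 in group 0 compares as matched
before a rule with priority 0 in group 8, as stated in the comment.
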
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ITEM_TYPE_VF
> >>> + *
> >>> + * Matches packets addressed to a virtual function ID of the device.
> >>> + *
> >>> + * If the underlying device function differs from the one that would
> >>> + * normally receive the matched traffic, specifying this item prevents it
> >>> + * from reaching that device unless the flow rule contains a VF
> >>> + * action. Packets are not duplicated between device instances by default.
> >>> + *
> >>> + * - Likely to return an error or never match any traffic if this causes a
> >>> + *   VF device to match traffic addressed to a different VF.
> >>> + * - Can be specified multiple times to match traffic addressed to several
> >>> + *   specific VFs.
> >>> + * - Can be combined with a PF item to match both PF and VF traffic.
> >>> + *
> >>> + * A zeroed mask can be used to match any VF.
> >>
> >> can you refer explicitly to id
> > 
> > If you mean "VF" to "VF ID" then yes, will do it for v2.
> > 
> >>> + */
> >>> +struct rte_flow_item_vf {
> >>> +	uint32_t id; /**< Destination VF ID. */
> >>> +};
> > [...]
> >>> +/**
> >>> + * Matching pattern item definition.
> >>> + *
> >>> + * A pattern is formed by stacking items starting from the lowest protocol
> >>> + * layer to match. This stacking restriction does not apply to meta items
> >>> + * which can be placed anywhere in the stack with no effect on the meaning
> >>> + * of the resulting pattern.
> >>> + *
> >>> + * A stack is terminated by an END item.
> >>> + *
> >>> + * The spec field should be a valid pointer to a structure of the related
> >>> + * item type. It may be set to NULL in many cases to use default values.
> >>> + *
> >>> + * Optionally, last can point to a structure of the same type to define an
> >>> + * inclusive range. This is mostly supported by integer and address fields,
> >>> + * may cause errors otherwise. Fields that do not support ranges must be set
> >>> + * to the same value as their spec counterparts.
> >>> + *
> >>> + * By default all fields present in spec are considered relevant.* This
> >>
> >> typo "*"
> > 
> > No, that's an asterisk for a footnote below. Perhaps it is a bit unusual,
> > would something like "[1]" look better?
> 
> oh, I thought it was the start of a comment line gone astray. Maybe "See
> note below", no big deal though.

OK, will change it anyway for clarity.

> >>> + * behavior can be altered by providing a mask structure of the same type
> >>> + * with applicable bits set to one. It can also be used to partially filter
> >>> + * out specific fields (e.g. as an alternate means to match ranges of IP
> >>> + * addresses).
> >>> + *
> >>> + * Note this is a simple bit-mask applied before interpreting the contents
> >>> + * of spec and last, which may yield unexpected results if not used
> >>> + * carefully. For example, if for an IPv4 address field, spec provides
> >>> + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
> >>> + * effective range is 10.1.0.0 to 10.3.255.255.
> >>> + *
> > 
> > See footnote below:
> > 
> >>> + * * The defaults for data-matching items such as IPv4 when mask is not
> >>> + *   specified actually depend on the underlying implementation since only
> >>> + *   recognized fields can be taken into account.
> >>> + */
> >>> +struct rte_flow_item {
> >>> +	enum rte_flow_item_type type; /**< Item type. */
> >>> +	const void *spec; /**< Pointer to item specification structure. */
> >>> +	const void *last; /**< Defines an inclusive range (spec to last). */
> >>> +	const void *mask; /**< Bit-mask applied to spec and last. */
> >>> +};
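
The spec/last/mask example in the comment above can be worked through in
code. This is a sketch only, assuming host-order addresses and a contiguous
prefix mask; IPV4(), struct ipv4_range, effective_range() and ipv4_matches()
are hypothetical helpers, not part of the API.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from four octets. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Inclusive address range. */
struct ipv4_range {
	uint32_t first;
	uint32_t last;
};

/*
 * Effective range once the mask has been applied to both spec and last:
 * masked-out bits are cleared in the lower bound and may take any value
 * in the upper bound.
 */
struct ipv4_range
effective_range(uint32_t spec, uint32_t last, uint32_t mask)
{
	struct ipv4_range r;

	r.first = spec & mask;
	r.last = (last & mask) | ~mask;
	return r;
}

/* The mask is applied before the range comparison, not after. */
int
ipv4_matches(uint32_t addr, uint32_t spec, uint32_t last, uint32_t mask)
{
	return (addr & mask) >= (spec & mask) &&
	       (addr & mask) <= (last & mask);
}
```

For spec 10.1.2.3, last 10.3.4.5 and mask 255.255.0.0, effective_range()
yields 10.1.0.0 to 10.3.255.255, so 10.1.0.1 matches even though it is
numerically below spec.
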
> >>> +
> >>> +/**
> >>> + * Action types.
> >>> + *
> >>> + * Each possible action is represented by a type. Some have associated
> >>> + * configuration structures. Several actions combined in a list can be
> >>> + * assigned to a flow rule. That list is not ordered.
> >>> + *
> >>> + * They fall in three categories:
> >>> + *
> >>> + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> >>> + *   processing matched packets by subsequent flow rules, unless overridden
> >>> + *   with PASSTHRU.
> >>> + *
> >>> + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
> >>> + *   for additional processing by subsequent flow rules.
> >>> + *
> >>> + * - Other non terminating meta actions that do not affect the fate of
> >>> + *   packets (END, VOID, MARK, FLAG, COUNT).
> >>> + *
> >>> + * When several actions are combined in a flow rule, they should all have
> >>> + * different types (e.g. dropping a packet twice is not possible). The
> >>> + * defined behavior is for PMDs to only take into account the last action of
> >>> + * a given type found in the list. PMDs still perform error checking on the
> >>> + * entire list.
> >>
> >> why do you define that the pmd will interpret multiple same type rules
> >> in this way...would it not make more sense for the pmd to just return
> >> EINVAL for an invalid set of rules? It seems more transparent for the
> >> application.
> > 
> > Well, I had to define something as a default. The reason is that any number
> > of VOID actions may be specified and I did not want that to be a special case in
> > order to keep PMD parsers as simple as possible. I'll settle for EINVAL (or
> > some other error) if at least one PMD maintainer other than Nelio who
> > intends to implement this API is not convinced by this explanation, all
> > right?
> 
> From an API perspective I think it's cleaner to pass or fail with the
> input rather than change it. But yes, please take pmd maintainers input
> as to what is reasonable to check also.
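
The two behaviors under discussion can be compared side by side. This is a
sketch with hypothetical action types and helper names, not actual PMD code:
act_resolve() keeps the last action of each type as currently documented,
while act_check_strict() rejects duplicates as suggested. Both skip VOID, so
it needs no special case in either variant.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical action types and action entry. */
enum act_type { ACT_END, ACT_VOID, ACT_QUEUE, ACT_MARK, ACT_TYPE_MAX };

struct act {
	enum act_type type;
	unsigned int value;
};

/*
 * Currently documented behavior: walk an END-terminated list and remember
 * the last occurrence of each type. Returns the number of distinct
 * effective actions.
 */
size_t
act_resolve(const struct act list[], const struct act *effective[ACT_TYPE_MAX])
{
	size_t i, n = 0;

	for (i = 0; i != ACT_TYPE_MAX; ++i)
		effective[i] = NULL;
	for (i = 0; list[i].type != ACT_END; ++i) {
		if (list[i].type == ACT_VOID)
			continue;
		if (!effective[list[i].type])
			++n;
		effective[list[i].type] = &list[i];
	}
	return n;
}

/* Suggested alternative: fail with -EINVAL when a type appears twice. */
int
act_check_strict(const struct act list[])
{
	int seen[ACT_TYPE_MAX] = { 0 };
	size_t i;

	for (i = 0; list[i].type != ACT_END; ++i) {
		if (list[i].type == ACT_VOID)
			continue;
		if (seen[list[i].type]++)
			return -EINVAL;
	}
	return 0;
}
```

Either way the parser makes a single pass over the list; the difference is
only whether a repeated type silently overrides the earlier one or is
reported to the application.
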
> 
> > 
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_MARK
> >>> + *
> >>> + * Attaches a 32 bit value to packets.
> >>> + *
> >>> + * This value is arbitrary and application-defined. For compatibility with
> >>> + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
> >>> + * also set in ol_flags.
> >>> + */
> >>> +struct rte_flow_action_mark {
> >>> +	uint32_t id; /**< 32 bit value to return with packets. */
> >>> +};
> >>
> >> One use case I thought we would be able to do for OVS is classification
> >> in hardware and the unique flow id is sent with the packet to software.
> >> But in OVS the ufid is 128 bits, so it means we can't and there is still
> >> the miniflow extract overhead. I'm not sure if there is a practical way
> >> around this.
> >>
> >> Sugesh (cc'd) has looked at this before and may be able to comment or
> >> correct me.
> > 
> > Yes, we settled on 32 bit because currently no known hardware implementation
> > supports more than this. If that changes, another action with a larger type
> > shall be provided (no ABI breakage).
> > 
> > Also since even 64 bit would not be enough for the use case you mention,
> > there is no choice but use this as an indirect value (such as an array or
> > hash table index/value).
> 
> ok, cool. I think Sugesh has other ideas anyway!
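
Using the 32 bit MARK value as an indirect index, as described above, could
look like the following sketch. struct ufid, the table size and both helper
functions are hypothetical; a real application would also need to recycle
entries when flows are removed.

```c
#include <stdint.h>
#include <string.h>

#define UFID_TABLE_SIZE 1024

/* Hypothetical 128-bit unique flow ID. */
struct ufid {
	uint8_t bytes[16];
};

static struct ufid ufid_table[UFID_TABLE_SIZE];
static uint32_t ufid_next;

/*
 * Store a 128-bit ID and return the 32-bit value to program through the
 * MARK action. Returns UINT32_MAX when the table is full.
 */
uint32_t
ufid_to_mark(const struct ufid *id)
{
	if (ufid_next == UFID_TABLE_SIZE)
		return UINT32_MAX;
	ufid_table[ufid_next] = *id;
	return ufid_next++;
}

/*
 * Recover the 128-bit ID from the mark returned with a packet (per the
 * comment above, hash.fdir.hi when PKT_RX_FDIR_ID is set in ol_flags).
 */
const struct ufid *
mark_to_ufid(uint32_t mark)
{
	return mark < ufid_next ? &ufid_table[mark] : NULL;
}
```
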
> 
> > 
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_RSS
> >>> + *
> >>> + * Similar to QUEUE, except RSS is additionally performed on packets to
> >>> + * spread them among several queues according to the provided parameters.
> >>> + *
> >>> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> >>> + * however it conflicts with the MARK action as they share the same
> >>> + * space. When both actions are specified, the RSS hash is discarded and
> >>> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> >>> + * structure should eventually evolve to store both.
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_rss {
> >>> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> >>> +	uint16_t queues; /**< Number of entries in queue[]. */
> >>> +	uint16_t queue[]; /**< Queues indices to use. */
> >>
> >> I'd try and avoid queue and queues - someone will say "huh?" when
> >> reading code. s/queues/num ?
> > 
> > Agreed, will update for v2.
> > 
> >>> +};
> >>> +
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_VF
> >>> + *
> >>> + * Redirects packets to a virtual function (VF) of the current device.
> >>> + *
> >>> + * Packets matched by a VF pattern item can be redirected to their original
> >>> + * VF ID instead of the specified one. This parameter may not be available
> >>> + * and is not guaranteed to work properly if the VF part is matched by a
> >>> + * prior flow rule or if packets are not addressed to a VF in the first
> >>> + * place.
> >>
> >> Not clear what you mean by "not guaranteed to work if...". Please return
> >> fail when this action is used if this is not going to work.
> > 
> > Again, this is a case where it is difficult for a PMD to determine if the
> > entire list of flow rules makes sense. Perhaps it does, perhaps whatever
> > goes through has already been filtered out of possible issues.
> > 
> > Here the documentation states the precautions an application should take to
> > guarantee it will work as intended. Perhaps it can be reworded (any
> > suggestion?), but a PMD can certainly not provide any strong guarantee.
> 
> I see your point. Maybe for easy check things the pmd would return fail,
> but for more complex I agree it's too difficult.
> 
> > 
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_vf {
> >>> +	uint32_t original:1; /**< Use original VF ID if possible. */
> >>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> >>> +	uint32_t id; /**< VF ID to redirect packets to. */
> >>> +};
> > [...]
> >>> +/**
> >>> + * Check whether a flow rule can be created on a given port.
> >>> + *
> >>> + * While this function has no effect on the target device, the flow rule is
> >>> + * validated against its current configuration state and the returned value
> >>> + * should be considered valid by the caller for that state only.
> >>> + *
> >>> + * The returned value is guaranteed to remain valid only as long as no
> >>> + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
> >>> + * the meantime and no device parameters affecting flow rules in any way are
> >>> + * modified, due to possible collisions or resource limitations (although in
> >>> + * such cases EINVAL should not be returned).
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 if flow rule is valid and can be created. A negative errno value
> >>> + *   otherwise (rte_errno is also set), the following errors are defined:
> >>> + *
> >>> + *   -ENOSYS: underlying device does not support this functionality.
> >>> + *
> >>> + *   -EINVAL: unknown or invalid rule specification.
> >>> + *
> >>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> >>> + *   bit-masks are unsupported).
> >>> + *
> >>> + *   -EEXIST: collision with an existing rule.
> >>> + *
> >>> + *   -ENOMEM: not enough resources.
> >>> + *
> >>> + *   -EBUSY: action cannot be performed due to busy device resources, may
> >>> + *   succeed if the affected queues or even the entire port are in a stopped
> >>> + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
> >>> + */
> >>> +int
> >>> +rte_flow_validate(uint8_t port_id,
> >>> +		  const struct rte_flow_attr *attr,
> >>> +		  const struct rte_flow_item pattern[],
> >>> +		  const struct rte_flow_action actions[],
> >>> +		  struct rte_flow_error *error);
> >>
> >> Why not just use rte_flow_create() and get an error? Is it less
> >> disruptive to do a validate and find the rule cannot be created, than
> >> using a create directly?
> > 
> > The rationale can be found in the original RFC, which I'll convert to actual
> > documentation in v2. In short:
> > 
> > - Calling rte_flow_validate() before rte_flow_create() is useless since
> >   rte_flow_create() also performs validation.
> > 
> > - We cannot possibly express a full static set of allowed flow rules, even
> >   if we could, it usually depends on the current hardware configuration
> >   therefore would not be static.
> > 
> > - rte_flow_validate() is thus provided as a replacement for capability
> >   flags. It can be used to determine during initialization if the underlying
> >   device can support the typical flow rules an application might want to
> >   provide later and do something useful with that information (e.g. always
> >   use software fallback due to HW limitations).
> > 
> > - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
> >   free to expose.
> 
> makes sense now, thanks.
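
The "replacement for capability flags" use can be sketched as an
initialization-time probe. struct rule_template and validate_stub() are
hypothetical stand-ins (the stub plays the role of rte_flow_validate() on a
device without partial bit-mask support); only the overall pattern is the
point here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical rule template; real code would carry the attributes,
 * pattern and actions the application intends to use later. */
struct rule_template {
	int needs_partial_mask;
};

/* Stand-in for rte_flow_validate() on a device that rejects partial
 * bit-masks with -ENOTSUP. */
static int
validate_stub(const struct rule_template *t)
{
	return t->needs_partial_mask ? -ENOTSUP : 0;
}

/*
 * Probe every typical rule once during initialization. Returns 1 if all of
 * them can be offloaded, 0 if the application should always use its
 * software fallback.
 */
int
hw_offload_usable(const struct rule_template tmpl[], size_t n)
{
	size_t i;

	for (i = 0; i != n; ++i)
		if (validate_stub(&tmpl[i]))
			return 0;
	return 1;
}
```

As noted in the function documentation, the result only holds for the device
state at the time of the probe, so later rte_flow_create() calls must still
check for errors.
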
> 
> > 
> >>> +
> >>> +/**
> >>> + * Create a flow rule on a given port.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   A valid handle in case of success, NULL otherwise and rte_errno is set
> >>> + *   to the positive version of one of the error codes defined for
> >>> + *   rte_flow_validate().
> >>> + */
> >>> +struct rte_flow *
> >>> +rte_flow_create(uint8_t port_id,
> >>> +		const struct rte_flow_attr *attr,
> >>> +		const struct rte_flow_item pattern[],
> >>> +		const struct rte_flow_action actions[],
> >>> +		struct rte_flow_error *error);
> >>
> >> General question - are these functions threadsafe? In the OVS example
> >> you could have several threads wanting to create flow rules at the same
> >> time for same or different ports.
> > 
> > No they aren't, applications have to perform their own locking. The RFC (to
> > be converted to actual documentation in v2) says that:
> > 
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> > 
> > - There is no provision for reentrancy/multi-thread safety, although nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> 
> other comment above wrt locking.
> 
> > 
> >>> +
> >>> +/**
> >>> + * Destroy a flow rule on a given port.
> >>> + *
> >>> + * Failure to destroy a flow rule handle may occur when other flow rules
> >>> + * depend on it, and destroying it would result in an inconsistent state.
> >>> + *
> >>> + * This function is only guaranteed to succeed if handles are destroyed in
> >>> + * reverse order of their creation.
> >>
> >> How can the application find this information out on error?
> > 
> > Without maintaining a list, they cannot. The specified case is the only
> > possible guarantee. That does not mean PMDs should not do their best to
> > destroy flow rules, only that ordering must remain consistent in case of
> > inability to destroy one.
> > 
> > What do you suggest?
> 
> I think if the app cannot remove a specific rule it may want to remove
> all rules and deal with flows in software for a time. So once the app
> knows it fails that should be enough.

OK, then since destruction may return an error already, is it fine?
Applications may call rte_flow_flush() (not supposed to fail unless there is
a serious issue, abort() in that case) and switch to SW fallback.
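
A minimal sketch of that fallback path, for illustration only. It does not
use the real DPDK headers: struct flow, struct port_state and the stub_*()
functions are hypothetical stand-ins for rte_flow_destroy() and
rte_flow_flush() with simplified signatures, so that the control flow can be
read and compiled on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real DPDK types. */
struct flow {
	int id;
};

struct port_state {
	bool destroy_fails;    /* simulate a PMD refusing to destroy a rule */
	bool flush_fails;      /* simulate the "serious issue" case */
	bool sw_fallback;      /* set once flows are handled in software */
	unsigned int hw_rules; /* number of rules currently in hardware */
};

/* Stand-in for rte_flow_destroy(): 0 on success, negative errno otherwise. */
static int
stub_flow_destroy(struct port_state *port, struct flow *flow)
{
	(void)flow;
	if (port->destroy_fails)
		return -EBUSY;
	if (port->hw_rules > 0)
		port->hw_rules--;
	return 0;
}

/* Stand-in for rte_flow_flush(): 0 on success, negative errno otherwise. */
static int
stub_flow_flush(struct port_state *port)
{
	if (port->flush_fails)
		return -EIO;
	port->hw_rules = 0;
	return 0;
}

/*
 * Remove one rule. Returns 0 if the rule alone was destroyed, 1 if all
 * rules had to be flushed and the application switched to SW fallback.
 */
static int
app_remove_rule(struct port_state *port, struct flow *flow)
{
	if (stub_flow_destroy(port, flow) == 0)
		return 0;
	/* Destruction failed: drop every rule rather than guess at order. */
	if (stub_flow_flush(port) != 0)
		abort(); /* flush is not supposed to fail */
	port->sw_fallback = true;
	return 1;
}
```

The application only needs to know that destruction failed; it does not have
to work out which other rule caused the failure.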

> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param flow
> >>> + *   Flow rule handle to destroy.
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_destroy(uint8_t port_id,
> >>> +		 struct rte_flow *flow,
> >>> +		 struct rte_flow_error *error);
> >>> +
> >>> +/**
> >>> + * Destroy all flow rules associated with a port.
> >>> + *
> >>> + * In the unlikely event of failure, handles are still considered destroyed
> >>> + * and no longer valid but the port must be assumed to be in an inconsistent
> >>> + * state.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_flush(uint8_t port_id,
> >>> +	       struct rte_flow_error *error);
> >>
> >> rte_flow_destroy_all() would be more descriptive (but breaks your style)
> > 
> > There are enough underscores as it is. I like flush, if enough people
> > complain we'll change it but it has to occur before the first public
> > release.
> > 
> >>> +
> >>> +/**
> >>> + * Query an existing flow rule.
> >>> + *
> >>> + * This function allows retrieving flow-specific data such as counters.
> >>> + * Data is gathered by special actions which must be present in the flow
> >>> + * rule definition.
> >>
> >> re last sentence, it would be good if you can put a link to
> >> RTE_FLOW_ACTION_TYPE_COUNT
> > 
> > Will do, I did not know how until very recently.
> > 
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param flow
> >>> + *   Flow rule handle to query.
> >>> + * @param action
> >>> + *   Action type to query.
> >>> + * @param[in, out] data
> >>> + *   Pointer to storage for the associated query data type.
> >>
> >> can this be anything other than rte_flow_query_count?
> > 
> > Likely in the future. I've only defined this one as a counterpart for
> > existing API functionality and because we wanted to expose it in mlx5.
> > 
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_query(uint8_t port_id,
> >>> +	       struct rte_flow *flow,
> >>> +	       enum rte_flow_action_type action,
> >>> +	       void *data,
> >>> +	       struct rte_flow_error *error);
> >>> +
> >>> +#ifdef __cplusplus
> >>> +}
> >>> +#endif
> >>
> >> I don't see a way to dump all the rules for a port out. I think this is
> >> necessary for debugging. You could have a look through dpif.h in OVS
> >> and see how dpif_flow_dump_next() is used, it might be a good reference.
> > 
> > DPDK does not maintain flow rules and, depending on hardware capabilities
> > and level of compliance, PMDs do not necessarily do it either, particularly
> > since it requires space and application probably have a better method to
> > store these pointers for their own needs.
> 
> understood
> 
> > 
> > What you see here is only a PMD interface. Depending on applications needs,
> > generic helper functions built on top of these may be added to manage flow
> > rules in the future.
> 
> I'm thinking of the case where something goes wrong and I want to get a
> dump of all the flow rules from hardware, not query the rules I think I
> have. I don't see a way to do it or something to build a helper on top of?

Generic helper functions would exist on top of this API and would likely
maintain a list of flow rules themselves. The dump in that case would be
entirely implemented in software. I think that recovering flow rules from HW
may be complicated in many cases (even without taking storage allocation and
rules conversion issues into account), therefore if there is really a need
for it, we could perhaps add a dump() function that PMDs are free to
implement later.
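
To make the software-only dump concrete, here is a rough sketch of what such
a helper could look like. None of these names exist in DPDK: struct
flow_list and the flow_list_*() functions are hypothetical, and the handle is
kept as an opaque pointer (for instance whatever rte_flow_create() returned)
next to a free-form description supplied by the application.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical application-side bookkeeping, not part of DPDK. */
struct flow_entry {
	void *handle;   /* opaque flow rule handle */
	char desc[64];  /* human-readable description kept by the app */
	struct flow_entry *next;
};

struct flow_list {
	struct flow_entry *head; /* most recently created rule first */
	unsigned int count;
};

/* Record a newly created rule. Returns 0 on success, -1 if out of memory. */
static int
flow_list_add(struct flow_list *list, void *handle, const char *desc)
{
	struct flow_entry *e = malloc(sizeof(*e));

	if (e == NULL)
		return -1;
	e->handle = handle;
	snprintf(e->desc, sizeof(e->desc), "%s", desc);
	e->next = list->head;
	list->head = e;
	list->count++;
	return 0;
}

/* Dump every recorded rule to out. Returns the number of rules printed. */
static unsigned int
flow_list_dump(const struct flow_list *list, FILE *out)
{
	unsigned int n = 0;

	for (const struct flow_entry *e = list->head; e != NULL; e = e->next) {
		fprintf(out, "flow %p: %s\n", e->handle, e->desc);
		n++;
	}
	return n;
}

/* Forget all recorded rules (does not touch the hardware). */
static void
flow_list_free(struct flow_list *list)
{
	while (list->head != NULL) {
		struct flow_entry *e = list->head;

		list->head = e->next;
		free(e);
	}
	list->count = 0;
}
```

Inserting at the head means a walk from the head visits rules in reverse
order of creation, which is also the one order in which destruction is
guaranteed to succeed.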

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-12-08 15:19       ` Adrien Mazarguil
@ 2016-12-08 17:56         ` Ferruh Yigit
  2016-12-15 12:20         ` Ferruh Yigit
  1 sibling, 0 replies; 262+ messages in thread
From: Ferruh Yigit @ 2016-12-08 17:56 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz

On 12/8/2016 3:19 PM, Adrien Mazarguil wrote:
> Hi Ferruh,
> 
> On Fri, Dec 02, 2016 at 04:58:53PM +0000, Ferruh Yigit wrote:
>> Hi Adrien,
>>
>> On 11/16/2016 4:23 PM, Adrien Mazarguil wrote:
>>> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
>>> described in [3] (also pasted below), here is the first non-draft series
>>> for this new API.
>>>
>>> Its capabilities are so generic that its name had to be vague, it may be
>>> called "Generic flow API", "Generic flow interface" (possibly shortened
>>> as "GFI") to refer to the name of the new filter type, or "rte_flow" from
>>> the prefix used for its public symbols. I personally favor the latter.
>>>
>>> While it is currently meant to supersede existing filter types in order for
>>> all PMDs to expose a common filtering/classification interface, it may
>>> eventually evolve to cover the following ideas as well:
>>>
>>> - Rx/Tx offloads configuration through automatic offloads for specific
>>>   packets, e.g. performing checksum on TCP packets could be expressed with
>>>   an egress rule with a TCP pattern and a kind of checksum action.
>>>
>>> - RSS configuration (already defined actually). Could be global or per rule
>>>   depending on hardware capabilities.
>>>
>>> - Switching configuration for devices with many physical ports; rules doing
>>>   both ingress and egress could even be used to completely bypass software
>>>   if supported by hardware.
>>>
>>>  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
>>>  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
>>>  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
>>>
>>> Changes since RFC v2:
>>>
>>> - New separate VLAN pattern item (previously part of the ETH definition),
>>>   found to be much more convenient.
>>>
>>> - Removed useless "any" field from VF pattern item, the same effect can be
>>>   achieved by not providing a specification structure.
>>>
>>> - Replaced bit-fields from the VXLAN pattern item to avoid endianness
>>>   conversion issues on 24-bit fields.
>>>
>>> - Updated struct rte_flow_item with a new "last" field to create inclusive
>>>   ranges. They are defined as the interval between (spec & mask) and
>>>   (last & mask). All three parameters are optional.
>>>
>>> - Renamed ID action MARK.
>>>
>>> - Renamed "queue" fields in actions QUEUE and DUP to "index".
>>>
>>> - "rss_conf" field in RSS action is now const.
>>>
>>> - VF action now uses a 32 bit ID like its pattern item counterpart.
>>>
>>> - Removed redundant struct rte_flow_pattern, API functions now expect
>>>   struct
>>>   rte_flow_item lists terminated by END items.
>>>
>>> - Replaced struct rte_flow_actions for the same reason, with struct
>>>   rte_flow_action lists terminated by END actions.
>>>
>>> - Error types (enum rte_flow_error_type) have been updated and the cause
>>>   pointer in struct rte_flow_error is now const.
>>>
>>> - Function prototypes (rte_flow_create, rte_flow_validate) have also been
>>>   updated for clarity.
>>>
>>> Additions:
>>>
>>> - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
>>>   are now implemented in rte_flow.c, with their symbols exported and
>>>   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
>>>
>>> - A separate header (rte_flow_driver.h) has been added for driver-side
>>>   functionality, in particular struct rte_flow_ops which contains PMD
>>>   callbacks returned by RTE_ETH_FILTER_GENERIC query.
>>>
>>> - testpmd now exposes most of this API through the new "flow" command.
>>>
>>> What remains to be done:
>>>
>>> - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
>>>
>>> - API documentation (based on RFC).
>>>
>>> - testpmd flow command documentation (although context-aware command
>>>   completion should already help quite a bit in this regard).
>>>
>>> - A few pattern item / action properties cannot be configured yet
>>>   (e.g. rss_conf parameter for RSS action) and a few completions
>>>   (e.g. possible queue IDs) should be added.
>>>
>>
>> <...>
>>
>> I was trying to check driver filter API patches, but hit a few compiler
>> errors with this patchset.
>>
>> [1] clang complains about variable bitfield value changed from -1 to 1.
>> Which is correct, but I guess that is intentional, but I don't know how
>> to tell this to clang?
>>
>> [2] shared library compilation error, because of missing rte_flow_flush
>> in rte_ether_version.map file
>>
>> [3] bunch of icc compilation errors, almost all are same type:
>> error #188: enumerated type mixed with another type
> 
> Thanks for the report, I'll attempt to address them all in v2. However icc
> error #188 looks like a pain, I think I can work around it but do we really
> not tolerate the use of normal integers inside enum fields in DPDK?

If this warning does not improve the code and the community agrees, it
is possible to disable it by adding "-wd188" to the test-pmd Makefile
for the ICC compiler.
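
For reference, a small sketch of the kind of construct under discussion,
assuming (as Adrien describes) that the diagnostic is about plain integers
stored into enum fields. The enum and struct are hypothetical and only mimic
the shape of a pattern item type field; the explicit cast is the usual
source-level alternative to disabling the warning, not something verified
against ICC here.

```c
/* Hypothetical types, loosely shaped like a pattern item type field. */
enum item_type {
	ITEM_TYPE_END = 0,
	ITEM_TYPE_ETH = 1,
	ITEM_TYPE_IPV4 = 2,
};

struct item {
	enum item_type type;
};

/* Build an item from a plain integer, e.g. one parsed from a command line. */
static struct item
item_from_int(int raw)
{
	struct item it;

	/*
	 * The implicit form would be "it.type = raw;", which mixes an int
	 * with an enumerated type. The cast states the conversion explicitly.
	 */
	it.type = (enum item_type)raw;
	return it;
}
```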

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-08 15:09               ` Adrien Mazarguil
@ 2016-12-09 12:18                 ` Chandran, Sugesh
  2016-12-09 16:38                   ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-12-09 12:18 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Kevin Traynor, dev, Thomas Monjalon, De Lara Guarch, Pablo,
	Olivier Matz, sugesh.chandran

Hi Adrien,
Thank you for your comments,
Please see the reply below.

Regards
_Sugesh


> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Thursday, December 8, 2016 3:09 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: Kevin Traynor <ktraynor@redhat.com>; dev@dpdk.org; Thomas
> Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olivier Matz <olivier.matz@6wind.com>;
> sugesh.chandran@intel.comn
> Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
> Hi Sugesh,
> 
> On Tue, Dec 06, 2016 at 06:11:38PM +0000, Chandran, Sugesh wrote:
> [...]
> > > >>> +int
> > > >>> +rte_flow_validate(uint8_t port_id,
> > > >>> +		  const struct rte_flow_attr *attr,
> > > >>> +		  const struct rte_flow_item pattern[],
> > > >>> +		  const struct rte_flow_action actions[],
> > > >>> +		  struct rte_flow_error *error);
> > > >>
> > > >> Why not just use rte_flow_create() and get an error? Is it less
> > > >> disruptive to do a validate and find the rule cannot be created,
> > > >> than using a create directly?
> > > >
> > > > The rationale can be found in the original RFC, which I'll convert
> > > > to actual documentation in v2. In short:
> > > >
> > > > - Calling rte_flow_validate() before rte_flow_create() is useless since
> > > >   rte_flow_create() also performs validation.
> > > >
> > > > - We cannot possibly express a full static set of allowed flow rules, even
> > > >   if we could, it usually depends on the current hardware configuration
> > > >   therefore would not be static.
> > > >
> > > > - rte_flow_validate() is thus provided as a replacement for capability
> > > >   flags. It can be used to determine during initialization if the underlying
> > > >   device can support the typical flow rules an application might want to
> > > >   provide later and do something useful with that information (e.g.
> always
> > > >   use software fallback due to HW limitations).
> > > >
> > > > - rte_flow_validate() being a subset of rte_flow_create(), it is
> essentially
> > > >   free to expose.
> > >
> > > make sense now, thanks.
> > [Sugesh] : We had this discussion earlier at the design stage about
> > the time taken for programming the hardware, and how to make it
> > deterministic. How about having a timeout parameter as well for the
> > rte_flow_* If the hardware flow insert is timed out, error out than
> > waiting indefinitely, so that application have some control over The
> > time to program the flow. It can be another set of APIs something
> > like, rte_flow_create_timeout()
> 
> Yes as discussed the existing API does not provide any timing constraints to
> PMDs, validate() and create() may take forever to complete, although PMDs
> are strongly encouraged to take as little time as possible.
> 
> Like you suggested, this could be done through distinct API calls. The
> validate() function would also have its _timeout() counterpart since the set
> of possible rules could be restricted in that mode.
[Sugesh] Thanks! Looking forward to seeing an API set with that implementation
in the future as well :). I feel it's a must from the user application's point of view.
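
The use of rte_flow_validate() as a replacement for capability flags, as
described in the rationale quoted above, can be sketched roughly as follows.
This is only an illustration: the probe_rule enum and the validate callback
type are hypothetical stand-ins with a simplified signature, and the stub_*()
functions simulate two devices instead of calling a real PMD.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical list of rule shapes the application plans to use later. */
enum probe_rule {
	PROBE_RULE_L2,
	PROBE_RULE_L3,
	PROBE_RULE_L4,
	PROBE_RULE_COUNT,
};

/* Stand-in for rte_flow_validate(): 0 if supported, negative errno if not. */
typedef int (*validate_fn)(int port_id, enum probe_rule rule);

/*
 * Probe once during initialization: if any typical rule is rejected,
 * decide to always use the software path for this port.
 */
static bool
port_needs_sw_fallback(int port_id, validate_fn validate)
{
	for (int r = 0; r < PROBE_RULE_COUNT; r++) {
		if (validate(port_id, (enum probe_rule)r) != 0)
			return true;
	}
	return false;
}

/* Simulated device that cannot match on L4 fields. */
static int
stub_validate_no_l4(int port_id, enum probe_rule rule)
{
	(void)port_id;
	return rule == PROBE_RULE_L4 ? -ENOTSUP : 0;
}

/* Simulated device that accepts every probed rule. */
static int
stub_validate_all(int port_id, enum probe_rule rule)
{
	(void)port_id;
	(void)rule;
	return 0;
}
```

A _timeout() variant as suggested would presumably follow the same pattern
with one more parameter, but its shape is not defined anywhere yet.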
> 
> > Are you going to provide any control over the initialization of NIC
> > to define the capability matrices For eg; To operate in a L3 router mode,
> software wanted to initialize the NIC port only to consider the L2 and L3
> fields.
> > I assume the initialization is done based on the first rules that are
> programmed into the NIC.?
> 
> Precisely, PMDs are supposed to determine the most appropriate device
> mode to use in order to handle the requested rules. They may even switch
> to another mode if necessary assuming this does not break existing
> constraints.
> 
> I think we've discussed an atomic (commit-based) mode of operation
> through separate functions as well, where the application would attempt to
> create a bunch of rules at once, possibly making it easier for PMDs to
> determine the most appropriate mode of operation for the device.
> 
> All of these may be added later according to users feedback once the basic
> API has settled.
[Sugesh] Yes, we discussed this before. However, I feel it makes sense
to give the user/application some flexibility to define a profile/mode for the device.
This way, the complexity of determining the mode is taken away from the PMD.
Looking at the P4 enablement patches in OVS, the mode definition APIs could be used in conjunction
with the P4 behavioral model.
For example, a P4 model for an L2 switch makes OVS operate as an L2 switch. Using the mode definition APIs,
it is possible to impose the same behavioral model on the hardware too.
This approach is simple, clean and very predictable, though it requires defining additional profile_define APIs.
I am sorry to raise this comment at this stage; however, the adoption of eBPF and P4 makes me
think this way.
What do you think?
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-09 12:18                 ` Chandran, Sugesh
@ 2016-12-09 16:38                   ` Adrien Mazarguil
  2016-12-12 10:20                     ` Chandran, Sugesh
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-09 16:38 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: Kevin Traynor, dev, Thomas Monjalon, De Lara Guarch, Pablo,
	Olivier Matz, sugesh.chandran

Hi Sugesh,

On Fri, Dec 09, 2016 at 12:18:03PM +0000, Chandran, Sugesh wrote:
[...]
> > > Are you going to provide any control over the initialization of NIC
> > > to define the capability matrices For eg; To operate in a L3 router mode,
> > software wanted to initialize the NIC port only to consider the L2 and L3
> > fields.
> > > I assume the initialization is done based on the first rules that are
> > programmed into the NIC.?
> > 
> > Precisely, PMDs are supposed to determine the most appropriate device
> > mode to use in order to handle the requested rules. They may even switch
> > to another mode if necessary assuming this does not break existing
> > constraints.
> > 
> > I think we've discussed an atomic (commit-based) mode of operation
> > through separate functions as well, where the application would attempt to
> > create a bunch of rules at once, possibly making it easier for PMDs to
> > determine the most appropriate mode of operation for the device.
> > 
> > All of these may be added later according to users feedback once the basic
> > API has settled.
> [Sugesh] Yes , we discussed about this before. However I feel that, it make sense
> to provide some flexibility to the user/application to define a profile/mode of the device.
> This way the complexity of determining the mode by itself will be taken away from PMD.
> Looking at the P4 enablement patches in OVS, the mode definition APIs can be used in conjunction
> P4 behavioral model. 
> For eg: A P4 model for a L2 switch operate OVS as a L2 switch. Using the mode definition APIs
> Its possible to impose the same behavioral model in the hardware too. 
> This way its simple, clean and very predictive though it needs to define an additional profile_define APIs.
> I am sorry to provide the comment at this stage,  However looking at the adoption of ebpf, P4 make me
> to think this way.
> What do you think?

What you suggest (device profile configuration) would be done by a separate
function in any case, so as long as everyone agrees on a generic method to
do so, no problem with extending rte_flow. By default in the meantime we'll
have to rely on PMDs to make the right decision.

Do you think it has to be defined from the beginning?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-09 16:38                   ` Adrien Mazarguil
@ 2016-12-12 10:20                     ` Chandran, Sugesh
  2016-12-12 11:17                       ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Chandran, Sugesh @ 2016-12-12 10:20 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Kevin Traynor, dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Adrien,

Regards
_Sugesh

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, December 9, 2016 4:39 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>
> Cc: Kevin Traynor <ktraynor@redhat.com>; dev@dpdk.org; Thomas
> Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>; Olivier Matz <olivier.matz@6wind.com>;
> sugesh.chandran@intel.comn
> Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> 
> Hi Sugesh,
> 
> On Fri, Dec 09, 2016 at 12:18:03PM +0000, Chandran, Sugesh wrote:
> [...]
> > > > Are you going to provide any control over the initialization of
> > > > NIC to define the capability matrices For eg; To operate in a L3
> > > > router mode,
> > > software wanted to initialize the NIC port only to consider the L2
> > > and L3 fields.
> > > > I assume the initialization is done based on the first rules that
> > > > are
> > > programmed into the NIC.?
> > >
> > > Precisely, PMDs are supposed to determine the most appropriate
> > > device mode to use in order to handle the requested rules. They may
> > > even switch to another mode if necessary assuming this does not
> > > break existing constraints.
> > >
> > > I think we've discussed an atomic (commit-based) mode of operation
> > > through separate functions as well, where the application would
> > > attempt to create a bunch of rules at once, possibly making it
> > > easier for PMDs to determine the most appropriate mode of operation
> for the device.
> > >
> > > All of these may be added later according to users feedback once the
> > > basic API has settled.
> > [Sugesh] Yes , we discussed about this before. However I feel that, it
> > make sense to provide some flexibility to the user/application to define a
> profile/mode of the device.
> > This way the complexity of determining the mode by itself will be taken
> away from PMD.
> > Looking at the P4 enablement patches in OVS, the mode definition APIs
> > can be used in conjunction
> > P4 behavioral model.
> > For eg: A P4 model for a L2 switch operate OVS as a L2 switch. Using
> > the mode definition APIs Its possible to impose the same behavioral model
> in the hardware too.
> > This way its simple, clean and very predictive though it needs to define an
> additional profile_define APIs.
> > I am sorry to provide the comment at this stage,  However looking at
> > the adoption of ebpf, P4 make me to think this way.
> > What do you think?
> 
> What you suggest (device profile configuration) would be done by a separate
> function in any case, so as long as everyone agrees on a generic method to
> do so, no problem with extending rte_flow. By default in the meantime we'll
> have to rely on PMDs to make the right decision.
[Sugesh] I am fine with the PMD deciding on the profile/mode selection in the
default case. However, we must provide an option for the application to define a mode,
and the PMD must honor it to avoid making an invalid mode change.
> 
> Do you think it has to be defined from the beginning?
[Sugesh] I feel that deciding what the proposed mode implementation will look like,
which modes should be available and so on is going to be another big topic. So I am OK with treating it as
out of scope for this flow API definition for now.
However, it would be good to mention it in the API comments section so that people are aware. Do you agree?

> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-12 10:20                     ` Chandran, Sugesh
@ 2016-12-12 11:17                       ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-12 11:17 UTC (permalink / raw)
  To: Chandran, Sugesh
  Cc: Kevin Traynor, dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Sugesh,

On Mon, Dec 12, 2016 at 10:20:18AM +0000, Chandran, Sugesh wrote:
> Hi Adrien,
> 
> Regards
> _Sugesh
> 
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Friday, December 9, 2016 4:39 PM
> > To: Chandran, Sugesh <sugesh.chandran@intel.com>
> > Cc: Kevin Traynor <ktraynor@redhat.com>; dev@dpdk.org; Thomas
> > Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>; Olivier Matz <olivier.matz@6wind.com>;
> > sugesh.chandran@intel.comn
> > Subject: Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
> > 
> > Hi Sugesh,
> > 
> > On Fri, Dec 09, 2016 at 12:18:03PM +0000, Chandran, Sugesh wrote:
> > [...]
> > > > > Are you going to provide any control over the initialization of
> > > > > NIC to define the capability matrices For eg; To operate in a L3
> > > > > router mode,
> > > > software wanted to initialize the NIC port only to consider the L2
> > > > and L3 fields.
> > > > > I assume the initialization is done based on the first rules that
> > > > > are
> > > > programmed into the NIC.?
> > > >
> > > > Precisely, PMDs are supposed to determine the most appropriate
> > > > device mode to use in order to handle the requested rules. They may
> > > > even switch to another mode if necessary assuming this does not
> > > > break existing constraints.
> > > >
> > > > I think we've discussed an atomic (commit-based) mode of operation
> > > > through separate functions as well, where the application would
> > > > attempt to create a bunch of rules at once, possibly making it
> > > > easier for PMDs to determine the most appropriate mode of operation
> > for the device.
> > > >
> > > > All of these may be added later according to users feedback once the
> > > > basic API has settled.
> > > [Sugesh] Yes , we discussed about this before. However I feel that, it
> > > make sense to provide some flexibility to the user/application to define a
> > profile/mode of the device.
> > > This way the complexity of determining the mode by itself will be taken
> > away from PMD.
> > > Looking at the P4 enablement patches in OVS, the mode definition APIs
> > > can be used in conjunction
> > > P4 behavioral model.
> > > For eg: A P4 model for a L2 switch operate OVS as a L2 switch. Using
> > > the mode definition APIs Its possible to impose the same behavioral model
> > in the hardware too.
> > > This way its simple, clean and very predictive though it needs to define an
> > additional profile_define APIs.
> > > I am sorry to provide the comment at this stage,  However looking at
> > > the adoption of ebpf, P4 make me to think this way.
> > > What do you think?
> > 
> > What you suggest (device profile configuration) would be done by a separate
> > function in any case, so as long as everyone agrees on a generic method to
> > do so, no problem with extending rte_flow. By default in the meantime we'll
> > have to rely on PMDs to make the right decision.
> [Sugesh] I am fine with PMD is making the decision on profile/mode selection in
> Default case. However we must provide an option for the application to define a mode
> and PMD must honor with it to avoid making an invalid mode change.
> > 
> > Do you think it has to be defined from the beginning?
> [Sugesh] I feel it's going to be another big topic to decide how proposed mode implementation will be looks like,
> What should be available modes and etc.  So I am OK to consider as its not part of this flow API definition for now.
> However its good to mention that in the API comments section to be aware. Do you agree that?

Will do, I'll mention it in the "future evolutions" section.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-08 17:07             ` Adrien Mazarguil
@ 2016-12-14 11:48               ` Kevin Traynor
  2016-12-14 13:54                 ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Kevin Traynor @ 2016-12-14 11:48 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandran

hi Adrien, sorry for the delay

<...>

>>>>
>>>> Is it expected that the application or pmd will provide locking between
>>>> these functions if required? I think it's going to have to be the app.
>>>
>>> Locking is indeed expected to be performed by applications. This API only
>>> documents places where locking would make sense if necessary and expected
>>> behavior.
>>>
>>> Like all control path APIs, this one assumes a single control thread.
>>> Applications must take the necessary precautions.
>>
>> If you look at OVS now it's quite possible that you have 2 rx queues
>> serviced by different threads, that would also install the flow rules in
>> the software flow caches - possibly that could extend to adding hardware
>> flows. There could also be another thread that is querying for stats. So
>> anything that can be done to minimise the locking would be helpful -
>> maybe query() could be atomic and not require any locking?
> 
> I think we need basic functions with as few constraints as possible on PMDs
> first, this API being somewhat complex to implement on their side. That
> covers the common use case where applications have a single control thread
> or otherwise perform locking on their own.
> 
> Once the basics are there for most PMDs, we may add new functions, items,
> properties and actions that provide additional constraints (timing,
> multi-threading and so on), which remain to be defined according to
> feedback. It is designed to be extended without causing ABI breakage.

I think Sugesh and I are trying to foresee some of the issues that may
arise when integrating with something like OVS. OTOH it's
hard/impossible to say what will be needed exactly in the API right now
to make it suitable for OVS.

So, I'm ok with the approach you are taking by exposing a basic API
but I think there should be an expectation that it may not be sufficient
for a project like OVS to integrate in and may take several
iterations/extensions - don't go anywhere!

> 
> As for query(), let's see how PMDs handle it first. A race between query()
> and create() on a given device is almost unavoidable without locking, same
> for queries that reset counters in a given flow rule. Basic parallel queries
> should not cause any harm otherwise, although this cannot be guaranteed yet.

You still have a race if there is locking, except it is for the lock,
but it has the same effect. The downside of my suggestion is that all
the PMDs would need to guarantee they could get stats atomically - I'm
not sure if they can or it's too restrictive.
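
As a rough software-only illustration of what "atomic stats" could mean, here
is a sketch using C11 atomics. It is not how any PMD works: struct
flow_counters and the counters_*() helpers are hypothetical, and the counters
stand in for whatever a device would expose through query().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-rule counters, updated by one thread, read by another. */
struct flow_counters {
	_Atomic uint64_t hits;
	_Atomic uint64_t bytes;
};

static void
counters_init(struct flow_counters *c)
{
	atomic_init(&c->hits, 0);
	atomic_init(&c->bytes, 0);
}

/* Account for one matched packet of the given size. */
static void
counters_add(struct flow_counters *c, uint64_t bytes)
{
	atomic_fetch_add_explicit(&c->hits, 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&c->bytes, bytes, memory_order_relaxed);
}

/* Lock-free reads: each value is read atomically on its own. */
static uint64_t
counters_hits(struct flow_counters *c)
{
	return atomic_load_explicit(&c->hits, memory_order_relaxed);
}

static uint64_t
counters_bytes(struct flow_counters *c)
{
	return atomic_load_explicit(&c->bytes, memory_order_relaxed);
}
```

Each counter can be read without a lock, but hits and bytes are not read as
one consistent snapshot, and a query that also resets the counters would need
more than this. That is part of why such a guarantee might be too
restrictive.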

> 

<...>

>>
>>>
>>>>> +
>>>>> +/**
>>>>> + * Destroy a flow rule on a given port.
>>>>> + *
>>>>> + * Failure to destroy a flow rule handle may occur when other flow rules
>>>>> + * depend on it, and destroying it would result in an inconsistent state.
>>>>> + *
>>>>> + * This function is only guaranteed to succeed if handles are destroyed in
>>>>> + * reverse order of their creation.
>>>>
>>>> How can the application find this information out on error?
>>>
>>> Without maintaining a list, they cannot. The specified case is the only
>>> possible guarantee. That does not mean PMDs should not do their best to
>>> destroy flow rules, only that ordering must remain consistent in case of
>>> inability to destroy one.
>>>
>>> What do you suggest?
>>
>> I think if the app cannot remove a specific rule it may want to remove
>> all rules and deal with flows in software for a time. So once the app
>> knows it fails that should be enough.
> 
> OK, then since destruction may return an error already, is it fine?
> Applications may call rte_flow_flush() (not supposed to fail unless there is
> a serious issue, abort() in that case) and switch to SW fallback.

yes, it's fine.

> 

<...>

>>>>> + * @param[out] error
>>>>> + *   Perform verbose error reporting if not NULL.
>>>>> + *
>>>>> + * @return
>>>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>>>>> + */
>>>>> +int
>>>>> +rte_flow_query(uint8_t port_id,
>>>>> +	       struct rte_flow *flow,
>>>>> +	       enum rte_flow_action_type action,
>>>>> +	       void *data,
>>>>> +	       struct rte_flow_error *error);
>>>>> +
>>>>> +#ifdef __cplusplus
>>>>> +}
>>>>> +#endif
>>>>
>>>> I don't see a way to dump all the rules for a port out. I think this is
>>>> necessary for debugging. You could have a look through dpif.h in OVS
>>>> and see how dpif_flow_dump_next() is used, it might be a good reference.
>>>
>>> DPDK does not maintain flow rules and, depending on hardware capabilities
>>> and level of compliance, PMDs do not necessarily do it either, particularly
>>> since it requires space and application probably have a better method to
>>> store these pointers for their own needs.
>>
>> understood
>>
>>>
>>> What you see here is only a PMD interface. Depending on applications needs,
>>> generic helper functions built on top of these may be added to manage flow
>>> rules in the future.
>>
>> I'm thinking of the case where something goes wrong and I want to get a
>> dump of all the flow rules from hardware, not query the rules I think I
>> have. I don't see a way to do it or something to build a helper on top of?
> 
> Generic helper functions would exist on top of this API and would likely
> maintain a list of flow rules themselves. The dump in that case would be
> entirely implemented in software. I think that recovering flow rules from HW
> may be complicated in many cases (even without taking storage allocation and
> rules conversion issues into account), therefore if there is really a need
> for it, we could perhaps add a dump() function that PMDs are free to
> implement later.
> 

ok. Maybe some more generic stats obtainable from the hardware would
suffice for debugging, like total flow rule hits/misses (i.e. not on a
per flow rule basis).

You can get this from the software flow caches and it's widely used for
debugging. e.g.

pmd thread numa_id 0 core_id 3:
	emc hits:0
	megaflow hits:0
	avg. subtable lookups per hit:0.00
	miss:0


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-14 11:48               ` Kevin Traynor
@ 2016-12-14 13:54                 ` Adrien Mazarguil
  2016-12-14 16:11                   ` Kevin Traynor
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-14 13:54 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandran

Hi Kevin,

On Wed, Dec 14, 2016 at 11:48:04AM +0000, Kevin Traynor wrote:
> hi Adrien, sorry for the delay
> 
> <...>
> 
> >>>>
> >>>> Is it expected that the application or pmd will provide locking between
> >>>> these functions if required? I think it's going to have to be the app.
> >>>
> >>> Locking is indeed expected to be performed by applications. This API only
> >>> documents places where locking would make sense if necessary and expected
> >>> behavior.
> >>>
> >>> Like all control path APIs, this one assumes a single control thread.
> >>> Applications must take the necessary precautions.
> >>
> >> If you look at OVS now it's quite possible that you have 2 rx queues
> >> serviced by different threads, that would also install the flow rules in
> >> the software flow caches - possibly that could extend to adding hardware
> >> flows. There could also be another thread that is querying for stats. So
> >> anything that can be done to minimise the locking would be helpful -
> >> maybe query() could be atomic and not require any locking?
> > 
> > I think we need basic functions with as few constraints as possible on PMDs
> > first, this API being somewhat complex to implement on their side. That
> > covers the common use case where applications have a single control thread
> > or otherwise perform locking on their own.
> > 
> > Once the basics are there for most PMDs, we may add new functions, items,
> > properties and actions that provide additional constraints (timing,
> > multi-threading and so on), which remain to be defined according to
> > feedback. It is designed to be extended without causing ABI breakage.
> 
> I think Sugesh and I are trying to foresee some of the issues that may
> arise when integrating with something like OVS. OTOH it's
> hard/impossible to say what will be needed exactly in the API right now
> to make it suitable for OVS.
> 
> So, I'm ok with the approach you are taking by exposing a basic API
> but I think there should be an expectation that it may not be sufficient
> for a project like OVS to integrate in and may take several
> iterations/extensions - don't go anywhere!
> 
> > 
> > As for query(), let's see how PMDs handle it first. A race between query()
> > and create() on a given device is almost unavoidable without locking, same
> > for queries that reset counters in a given flow rule. Basic parallel queries
> > should not cause any harm otherwise, although this cannot be guaranteed yet.
> 
> You still have a race if there is locking, except it is for the lock,
> but it has the same effect. The downside of my suggestion is that all
> the PMDs would need to guarantee they could get stats atomically - I'm
> not sure if they can or it's too restrictive.
> 
> > 
> 
> <...>
> 
> >>
> >>>
> >>>>> +
> >>>>> +/**
> >>>>> + * Destroy a flow rule on a given port.
> >>>>> + *
> >>>>> + * Failure to destroy a flow rule handle may occur when other flow rules
> >>>>> + * depend on it, and destroying it would result in an inconsistent state.
> >>>>> + *
> >>>>> + * This function is only guaranteed to succeed if handles are destroyed in
> >>>>> + * reverse order of their creation.
> >>>>
> >>>> How can the application find this information out on error?
> >>>
> >>> Without maintaining a list, they cannot. The specified case is the only
> >>> possible guarantee. That does not mean PMDs should not do their best to
> >>> destroy flow rules, only that ordering must remain consistent in case of
> >>> inability to destroy one.
> >>>
> >>> What do you suggest?
> >>
> >> I think if the app cannot remove a specific rule it may want to remove
> >> all rules and deal with flows in software for a time. So once the app
> >> knows it fails that should be enough.
> > 
> > OK, then since destruction may return an error already, is it fine?
> > Applications may call rte_flow_flush() (not supposed to fail unless there is
> > a serious issue, abort() in that case) and switch to SW fallback.
> 
> yes, it's fine.
> 
> > 
> 
> <...>
> 
> >>>>> + * @param[out] error
> >>>>> + *   Perform verbose error reporting if not NULL.
> >>>>> + *
> >>>>> + * @return
> >>>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>>>> + */
> >>>>> +int
> >>>>> +rte_flow_query(uint8_t port_id,
> >>>>> +	       struct rte_flow *flow,
> >>>>> +	       enum rte_flow_action_type action,
> >>>>> +	       void *data,
> >>>>> +	       struct rte_flow_error *error);
> >>>>> +
> >>>>> +#ifdef __cplusplus
> >>>>> +}
> >>>>> +#endif
> >>>>
> >>>> I don't see a way to dump all the rules for a port out. I think this is
> >>>> necessary for debugging. You could have a look through dpif.h in OVS
> >>>> and see how dpif_flow_dump_next() is used, it might be a good reference.
> >>>
> >>> DPDK does not maintain flow rules and, depending on hardware capabilities
> >>> and level of compliance, PMDs do not necessarily do it either, particularly
> >>> since it requires space and application probably have a better method to
> >>> store these pointers for their own needs.
> >>
> >> understood
> >>
> >>>
> >>> What you see here is only a PMD interface. Depending on applications needs,
> >>> generic helper functions built on top of these may be added to manage flow
> >>> rules in the future.
> >>
> >> I'm thinking of the case where something goes wrong and I want to get a
> >> dump of all the flow rules from hardware, not query the rules I think I
> >> have. I don't see a way to do it or something to build a helper on top of?
> > 
> > Generic helper functions would exist on top of this API and would likely
> > maintain a list of flow rules themselves. The dump in that case would be
> > entirely implemented in software. I think that recovering flow rules from HW
> > may be complicated in many cases (even without taking storage allocation and
> > rules conversion issues into account), therefore if there is really a need
> > for it, we could perhaps add a dump() function that PMDs are free to
> > implement later.
> > 
> 
> ok. Maybe there are some more generic stats that can be got from the
> hardware that would help debugging that would suffice, like total flow
> rule hits/misses (i.e. not on a per flow rule basis).
> 
> You can get this from the software flow caches and it's widely used for
> debugging. e.g.
> 
> pmd thread numa_id 0 core_id 3:
> 	emc hits:0
> 	megaflow hits:0
> 	avg. subtable lookups per hit:0.00
> 	miss:0
> 

Perhaps a rule such as the following could do the trick:

 group: 42 (or priority 42)
 pattern: void
 actions: count / passthru

Assuming useful flow rules are defined with higher priorities (using lower
group ID or priority level) and provide a terminating action, this one would
count all packets that were not caught by them.

That is one example to illustrate how "global" counters can be requested by
applications.

Otherwise you could just make sure all rules contain mark / flag actions, in
which case mbufs would tell directly if they went through them or need
additional SW processing.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API
  2016-12-14 13:54                 ` Adrien Mazarguil
@ 2016-12-14 16:11                   ` Kevin Traynor
  0 siblings, 0 replies; 262+ messages in thread
From: Kevin Traynor @ 2016-12-14 16:11 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz, sugesh.chandran

On 12/14/2016 01:54 PM, Adrien Mazarguil wrote:

>>
>>>>>>> + * @param[out] error
>>>>>>> + *   Perform verbose error reporting if not NULL.
>>>>>>> + *
>>>>>>> + * @return
>>>>>>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +rte_flow_query(uint8_t port_id,
>>>>>>> +	       struct rte_flow *flow,
>>>>>>> +	       enum rte_flow_action_type action,
>>>>>>> +	       void *data,
>>>>>>> +	       struct rte_flow_error *error);
>>>>>>> +
>>>>>>> +#ifdef __cplusplus
>>>>>>> +}
>>>>>>> +#endif
>>>>>>
>>>>>> I don't see a way to dump all the rules for a port out. I think this is
>>>>>> necessary for debugging. You could have a look through dpif.h in OVS
>>>>>> and see how dpif_flow_dump_next() is used, it might be a good reference.
>>>>>
>>>>> DPDK does not maintain flow rules and, depending on hardware capabilities
>>>>> and level of compliance, PMDs do not necessarily do it either, particularly
>>>>> since it requires space and application probably have a better method to
>>>>> store these pointers for their own needs.
>>>>
>>>> understood
>>>>
>>>>>
>>>>> What you see here is only a PMD interface. Depending on applications needs,
>>>>> generic helper functions built on top of these may be added to manage flow
>>>>> rules in the future.
>>>>
>>>> I'm thinking of the case where something goes wrong and I want to get a
>>>> dump of all the flow rules from hardware, not query the rules I think I
>>>> have. I don't see a way to do it or something to build a helper on top of?
>>>
>>> Generic helper functions would exist on top of this API and would likely
>>> maintain a list of flow rules themselves. The dump in that case would be
>>> entirely implemented in software. I think that recovering flow rules from HW
>>> may be complicated in many cases (even without taking storage allocation and
>>> rules conversion issues into account), therefore if there is really a need
>>> for it, we could perhaps add a dump() function that PMDs are free to
>>> implement later.
>>>
>>
>> ok. Maybe there are some more generic stats that can be got from the
>> hardware that would help debugging that would suffice, like total flow
>> rule hits/misses (i.e. not on a per flow rule basis).
>>
>> You can get this from the software flow caches and it's widely used for
>> debugging. e.g.
>>
>> pmd thread numa_id 0 core_id 3:
>> 	emc hits:0
>> 	megaflow hits:0
>> 	avg. subtable lookups per hit:0.00
>> 	miss:0
>>
> 
> Perhaps a rule such as the following could do the trick:
> 
>  group: 42 (or priority 42)
>  pattern: void
>  actions: count / passthru
> 
> Assuming useful flow rules are defined with higher priorities (using lower
> group ID or priority level) and provide a terminating action, this one would
> count all packets that were not caught by them.
> 
> That is one example to illustrate how "global" counters can be requested by
> applications.
> 
> Otherwise you could just make sure all rules contain mark / flag actions, in
> which case mbufs would tell directly if they went through them or need
> additional SW processing.
> 

ok, sounds like there's some options at least to work with on this which
is good. thanks.


* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-12-08 15:19       ` Adrien Mazarguil
  2016-12-08 17:56         ` Ferruh Yigit
@ 2016-12-15 12:20         ` Ferruh Yigit
  2016-12-16  8:22           ` Adrien Mazarguil
  1 sibling, 1 reply; 262+ messages in thread
From: Ferruh Yigit @ 2016-12-15 12:20 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz

On 12/8/2016 3:19 PM, Adrien Mazarguil wrote:
> Hi Ferruh,
> 
> On Fri, Dec 02, 2016 at 04:58:53PM +0000, Ferruh Yigit wrote:
>> Hi Adrien,
>>
>> On 11/16/2016 4:23 PM, Adrien Mazarguil wrote:
>>> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
>>> described in [3] (also pasted below), here is the first non-draft series
>>> for this new API.
>>>
>>> Its capabilities are so generic that its name had to be vague, it may be
>>> called "Generic flow API", "Generic flow interface" (possibly shortened
>>> as "GFI") to refer to the name of the new filter type, or "rte_flow" from
>>> the prefix used for its public symbols. I personally favor the latter.
>>>
>>> While it is currently meant to supersede existing filter types in order for
>>> all PMDs to expose a common filtering/classification interface, it may
>>> eventually evolve to cover the following ideas as well:
>>>
>>> - Rx/Tx offloads configuration through automatic offloads for specific
>>>   packets, e.g. performing checksum on TCP packets could be expressed with
>>>   an egress rule with a TCP pattern and a kind of checksum action.
>>>
>>> - RSS configuration (already defined actually). Could be global or per rule
>>>   depending on hardware capabilities.
>>>
>>> - Switching configuration for devices with many physical ports; rules doing
>>>   both ingress and egress could even be used to completely bypass software
>>>   if supported by hardware.
>>>
>>>  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
>>>  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
>>>  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
>>>
>>> Changes since RFC v2:
>>>
>>> - New separate VLAN pattern item (previously part of the ETH definition),
>>>   found to be much more convenient.
>>>
>>> - Removed useless "any" field from VF pattern item, the same effect can be
>>>   achieved by not providing a specification structure.
>>>
>>> - Replaced bit-fields from the VXLAN pattern item to avoid endianness
>>>   conversion issues on 24-bit fields.
>>>
>>> - Updated struct rte_flow_item with a new "last" field to create inclusive
>>>   ranges. They are defined as the interval between (spec & mask) and
>>>   (last & mask). All three parameters are optional.
>>>
>>> - Renamed ID action MARK.
>>>
>>> - Renamed "queue" fields in actions QUEUE and DUP to "index".
>>>
>>> - "rss_conf" field in RSS action is now const.
>>>
>>> - VF action now uses a 32 bit ID like its pattern item counterpart.
>>>
>>> - Removed redundant struct rte_flow_pattern, API functions now expect
>>>   struct rte_flow_item lists terminated by END items.
>>>
>>> - Replaced struct rte_flow_actions for the same reason, with struct
>>>   rte_flow_action lists terminated by END actions.
>>>
>>> - Error types (enum rte_flow_error_type) have been updated and the cause
>>>   pointer in struct rte_flow_error is now const.
>>>
>>> - Function prototypes (rte_flow_create, rte_flow_validate) have also been
>>>   updated for clarity.
>>>
>>> Additions:
>>>
>>> - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
>>>   are now implemented in rte_flow.c, with their symbols exported and
>>>   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
>>>
>>> - A separate header (rte_flow_driver.h) has been added for driver-side
>>>   functionality, in particular struct rte_flow_ops which contains PMD
>>>   callbacks returned by RTE_ETH_FILTER_GENERIC query.
>>>
>>> - testpmd now exposes most of this API through the new "flow" command.
>>>
>>> What remains to be done:
>>>
>>> - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
>>>
>>> - API documentation (based on RFC).
>>>
>>> - testpmd flow command documentation (although context-aware command
>>>   completion should already help quite a bit in this regard).
>>>
>>> - A few pattern item / action properties cannot be configured yet
>>>   (e.g. rss_conf parameter for RSS action) and a few completions
>>>   (e.g. possible queue IDs) should be added.
>>>
>>
>> <...>
>>
>> I was trying to check driver filter API patches, but hit a few compiler
>> errors with this patchset.
>>
>> [1] clang complains about variable bitfield value changed from -1 to 1.
>> Which is correct, but I guess that is intentional, but I don't know how
>> to tell this to clang?
>>
>> [2] shared library compilation error, because of missing rte_flow_flush
>> in rte_ether_version.map file
>>
>> [3] bunch of icc compilation errors, almost all are same type:
>> error #188: enumerated type mixed with another type
> 
> Thanks for the report, I'll attempt to address them all in v2. 

Hi Adrien,

I would like to remind you that there are driver patch sets that depend on
this patch.

The new version of this patch should leave driver maintainers some time to
redo their patch sets (if required) before the integration deadline.


Thanks,
ferruh



> However icc
> error #188 looks like a pain, I think I can work around it but do we really
> not tolerate the use of normal integers inside enum fields in DPDK?
> 

<...>


* Re: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler
  2016-11-16 16:23     ` [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-12-16  3:01       ` Pei, Yulong
  2016-12-16  9:17         ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Pei, Yulong @ 2016-12-16  3:01 UTC (permalink / raw)
  To: Adrien Mazarguil, dev
  Cc: Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz, Xing, Beilei

Hi Adrien,

I tried to set up the following rule, but it seems that after setting the 'spec' param, I cannot set the 'mask' param. Is this an issue, or am I using it wrong?

testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00
 dst [TOKEN]: destination MAC
 src [TOKEN]: source MAC
 type [TOKEN]: EtherType
 / [TOKEN]: specify next pattern item


Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
Sent: Thursday, November 17, 2016 12:24 AM
To: dev@dpdk.org
Cc: Thomas Monjalon <thomas.monjalon@6wind.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz <olivier.matz@6wind.com>
Subject: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler

Add parser code to fully set individual fields of pattern item specification structures, using the following operators:

- fix: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets upper value of the spec => last range.
- mask: sets bit-mask affecting both spec and last from arbitrary value.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 110 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e70e8e2..790b4b8 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_FIX,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,14 @@ static const enum index next_list_attr[] = {
 	0,
 };
 
+static const enum index item_param[] = {
+	ITEM_PARAM_FIX,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	0,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +300,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +507,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_FIX] = {
+		.name = "fix",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +640,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +668,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +690,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +699,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +722,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +739,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +747,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_FIX:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +823,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +834,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +860,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +887,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +949,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +960,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +993,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1010,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1035,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1142,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
--
2.1.4


* Re: [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow)
  2016-12-15 12:20         ` Ferruh Yigit
@ 2016-12-16  8:22           ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16  8:22 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, Thomas Monjalon, Pablo de Lara, Olivier Matz

On Thu, Dec 15, 2016 at 12:20:36PM +0000, Ferruh Yigit wrote:
> On 12/8/2016 3:19 PM, Adrien Mazarguil wrote:
> > Hi Ferruh,
> > 
> > On Fri, Dec 02, 2016 at 04:58:53PM +0000, Ferruh Yigit wrote:
> >> Hi Adrien,
> >>
> >> On 11/16/2016 4:23 PM, Adrien Mazarguil wrote:
> >>> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> >>> described in [3] (also pasted below), here is the first non-draft series
> >>> for this new API.
> >>>
> >>> Its capabilities are so generic that its name had to be vague, it may be
> >>> called "Generic flow API", "Generic flow interface" (possibly shortened
> >>> as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> >>> the prefix used for its public symbols. I personally favor the latter.
> >>>
> >>> While it is currently meant to supersede existing filter types in order for
> >>> all PMDs to expose a common filtering/classification interface, it may
> >>> eventually evolve to cover the following ideas as well:
> >>>
> >>> - Rx/Tx offloads configuration through automatic offloads for specific
> >>>   packets, e.g. performing checksum on TCP packets could be expressed with
> >>>   an egress rule with a TCP pattern and a kind of checksum action.
> >>>
> >>> - RSS configuration (already defined actually). Could be global or per rule
> >>>   depending on hardware capabilities.
> >>>
> >>> - Switching configuration for devices with many physical ports; rules doing
> >>>   both ingress and egress could even be used to completely bypass software
> >>>   if supported by hardware.
> >>>
> >>>  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
> >>>  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
> >>>  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
> >>>
> >>> Changes since RFC v2:
> >>>
> >>> - New separate VLAN pattern item (previously part of the ETH definition),
> >>>   found to be much more convenient.
> >>>
> >>> - Removed useless "any" field from VF pattern item, the same effect can be
> >>>   achieved by not providing a specification structure.
> >>>
> >>> - Replaced bit-fields from the VXLAN pattern item to avoid endianness
> >>>   conversion issues on 24-bit fields.
> >>>
> >>> - Updated struct rte_flow_item with a new "last" field to create inclusive
> >>>   ranges. They are defined as the interval between (spec & mask) and
> >>>   (last & mask). All three parameters are optional.
> >>>
> >>> - Renamed ID action MARK.
> >>>
> >>> - Renamed "queue" fields in actions QUEUE and DUP to "index".
> >>>
> >>> - "rss_conf" field in RSS action is now const.
> >>>
> >>> - VF action now uses a 32 bit ID like its pattern item counterpart.
> >>>
> >>> - Removed redundant struct rte_flow_pattern, API functions now expect
> >>>   struct rte_flow_item lists terminated by END items.
> >>>
> >>> - Replaced struct rte_flow_actions for the same reason, with struct
> >>>   rte_flow_action lists terminated by END actions.
> >>>
> >>> - Error types (enum rte_flow_error_type) have been updated and the cause
> >>>   pointer in struct rte_flow_error is now const.
> >>>
> >>> - Function prototypes (rte_flow_create, rte_flow_validate) have also been
> >>>   updated for clarity.
> >>>
> >>> Additions:
> >>>
> >>> - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
> >>>   are now implemented in rte_flow.c, with their symbols exported and
> >>>   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
> >>>
> >>> - A separate header (rte_flow_driver.h) has been added for driver-side
> >>>   functionality, in particular struct rte_flow_ops which contains PMD
> >>>   callbacks returned by RTE_ETH_FILTER_GENERIC query.
> >>>
> >>> - testpmd now exposes most of this API through the new "flow" command.
> >>>
> >>> What remains to be done:
> >>>
> >>> - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
> >>>
> >>> - API documentation (based on RFC).
> >>>
> >>> - testpmd flow command documentation (although context-aware command
> >>>   completion should already help quite a bit in this regard).
> >>>
> >>> - A few pattern item / action properties cannot be configured yet
> >>>   (e.g. rss_conf parameter for RSS action) and a few completions
> >>>   (e.g. possible queue IDs) should be added.
> >>>
> >>
> >> <...>
> >>
> >> I was trying to check driver filter API patches, but hit a few compiler
> >> errors with this patchset.
> >>
> >> [1] clang complains about variable bitfield value changed from -1 to 1.
> >> Which is correct, but I guess that is intentional, but I don't know how
> >> to tell this to clang?
> >>
> >> [2] shared library compilation error, because of missing rte_flow_flush
> >> in rte_ether_version.map file
> >>
> >> [3] bunch of icc compilation errors, almost all are same type:
> >> error #188: enumerated type mixed with another type
> > 
> > Thanks for the report, I'll attempt to address them all in v2. 
> 
> Hi Adrien,
> 
> I would like to remind you that there are driver patch sets that depend on
> this patch.
> 
> New version of this patch should give some time to drivers to re-do (if
> required) the patchsets before integration deadline.

Hi Ferruh,

I intend to send v2 (including all the requested changes and fixes) shortly,
most likely today.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler
  2016-12-16  3:01       ` Pei, Yulong
@ 2016-12-16  9:17         ` Adrien Mazarguil
  2016-12-16 12:22           ` Xing, Beilei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16  9:17 UTC (permalink / raw)
  To: Pei, Yulong
  Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz, Xing, Beilei

Hi Yulong,

On Fri, Dec 16, 2016 at 03:01:15AM +0000, Pei, Yulong wrote:
> Hi Adrien,
> 
> I tried to set up the following rule, but after setting the 'spec' parameter it seems I cannot set the 'mask' parameter. Is this an issue, or am I using it wrong?
> 
> testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00
>  dst [TOKEN]: destination MAC
>  src [TOKEN]: source MAC
>  type [TOKEN]: EtherType
>  / [TOKEN]: specify next pattern item

You need to re-specify dst with "mask" instead of "spec". You can specify it
as many times as you like to update each structure in turn, e.g.:

 testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst mask 00:00:00:00:ff:ff

To set both spec and mask at once with a full mask, any of these commands
yields the same result:

 testpmd> flow create 0 ingress pattern eth dst fix 00:00:00:00:09:00
 testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst mask ff:ff:ff:ff:ff:ff
 testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst prefix 48

You are even allowed to change your mind:

 testpmd> flow create 0 ingress pattern eth dst fix 00:00:2a:2a:2a:2a dst fix 00:00:00:00:09:00

All these will be properly documented in the v2 patchset. Note, this version
will replace the "fix" keyword with "is" ("fix" made no sense according to
feedback).
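
For reference, here is a minimal C sketch of what "prefix" and "mask" mean
for a MAC address field. It is not testpmd code: mac_prefix_to_mask() and
mac_matches() are hypothetical helpers written only to illustrate that
"prefix 48" expands to ff:ff:ff:ff:ff:ff and that only masked bits take
part in the comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical helper: fill a 6-byte mask with `prefix` leading one bits. */
static void
mac_prefix_to_mask(unsigned int prefix, uint8_t mask[MAC_LEN])
{
	unsigned int i;

	assert(prefix <= MAC_LEN * 8);
	memset(mask, 0, MAC_LEN);
	/* Whole bytes covered by the prefix. */
	for (i = 0; i < prefix / 8; ++i)
		mask[i] = 0xff;
	/* Remaining bits, if the prefix does not end on a byte boundary. */
	if (prefix % 8)
		mask[i] = (uint8_t)(0xff << (8 - prefix % 8));
}

/* Hypothetical helper: compare only the bits selected by the mask. */
static int
mac_matches(const uint8_t pkt[MAC_LEN], const uint8_t spec[MAC_LEN],
	    const uint8_t mask[MAC_LEN])
{
	unsigned int i;

	for (i = 0; i < MAC_LEN; ++i)
		if ((pkt[i] & mask[i]) != (spec[i] & mask[i]))
			return 0;
	return 1;
}
```

With the 00:00:00:00:ff:ff mask from the first example, only the last two
bytes of the destination address are compared against spec.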

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler
  2016-12-16  9:17         ` Adrien Mazarguil
@ 2016-12-16 12:22           ` Xing, Beilei
  2016-12-16 15:25             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Xing, Beilei @ 2016-12-16 12:22 UTC (permalink / raw)
  To: Adrien Mazarguil, Pei, Yulong
  Cc: dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Thanks Adrien.

I have two questions:
1.  When I set " / vlan tci fix 10" with testpmd, I find the mask of tci is 0xFFFF.
     Actually tci includes PRI, CFI, and Vlan_id, which holds 12 bits, so is it possible
     to set the mask to 0xFFF?
     Our driver will check that the mask only covers vlan_id instead of the whole tci.

2. When we test the destroy function, we find the pointer provided to the PMD is NULL
    instead of the pointer the PMD returned to RTE when creating the flow. Could you
    please double check? Thanks.

Best Regards
Beilei

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, December 16, 2016 5:18 PM
> To: Pei, Yulong <yulong.pei@intel.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas.monjalon@6wind.com>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; Olivier Matz
> <olivier.matz@6wind.com>; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec
> handler
> 
> Hi Yulong,
> 
> On Fri, Dec 16, 2016 at 03:01:15AM +0000, Pei, Yulong wrote:
> > Hi Adrien,
> >
> > I tried to set up the following rule, but after setting the 'spec' parameter it seems I cannot set the 'mask' parameter. Is this an issue, or am I using it wrong?
> >
> > testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00
> >  dst [TOKEN]: destination MAC
> >  src [TOKEN]: source MAC
> >  type [TOKEN]: EtherType
> >  / [TOKEN]: specify next pattern item
> 
> You need to re-specify dst with "mask" instead of "spec". You can specify it
> as many times as you like to update each structure in turn, e.g.:
> 
>  testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst mask 00:00:00:00:ff:ff
> 
> To set both spec and mask at once with a full mask, any of these commands
> yields the same result:
> 
>  testpmd> flow create 0 ingress pattern eth dst fix 00:00:00:00:09:00
>  testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst mask ff:ff:ff:ff:ff:ff
>  testpmd> flow create 0 ingress pattern eth dst spec 00:00:00:00:09:00 dst prefix 48
> 
> You are even allowed to change your mind:
> 
>  testpmd> flow create 0 ingress pattern eth dst fix 00:00:2a:2a:2a:2a dst fix 00:00:00:00:09:00
> 
> All these will be properly documented in the v2 patchset. Note, this version
> will replace the "fix" keyword with "is" ("fix" made no sense according to
> feedback).
> 
> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler
  2016-12-16 12:22           ` Xing, Beilei
@ 2016-12-16 15:25             ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 15:25 UTC (permalink / raw)
  To: Xing, Beilei
  Cc: Pei, Yulong, dev, Thomas Monjalon, De Lara Guarch, Pablo, Olivier Matz

Hi Beilei,

On Fri, Dec 16, 2016 at 12:22:52PM +0000, Xing, Beilei wrote:
> Thanks Adrien.
> 
> I have two questions:
> 1.  When I set " / vlan tci fix 10" with testpmd, I find the mask of tci is 0xFFFF.
>      Actually tci includes PRI, CFI, and Vlan_id, which holds 12 bits, so is it possible
>      to set the mask to 0xFFF?
>      Our driver will check that the mask only covers vlan_id instead of the whole tci.

Right, I'll work on a method to do that. TCI remains 16 bit either way so
the current approach remains accurate, although not convenient because a
12 bit mask must be specified manually.

This change won't be included in v2 though.
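
To illustrate the point, here is a small sketch of the 802.1Q TCI layout
(3 bit priority, 1 bit CFI/DEI, 12 bit VLAN ID) and of a masked match
restricted to the VLAN ID. The macro and function names are made up for
this example, and values are in host byte order for simplicity, whereas
the pattern item itself carries TCI in network order.

```c
#include <stdint.h>

/* 802.1Q TCI layout: PCP (bits 15-13), CFI/DEI (bit 12), VID (bits 11-0). */
#define TCI_PCP_MASK 0xe000u
#define TCI_DEI_MASK 0x1000u
#define TCI_VID_MASK 0x0fffu

/* Extract the 12-bit VLAN ID from a host-order TCI. */
static uint16_t
tci_vid(uint16_t tci)
{
	return (uint16_t)(tci & TCI_VID_MASK);
}

/* Extract the 3-bit priority from a host-order TCI. */
static uint8_t
tci_pcp(uint16_t tci)
{
	return (uint8_t)((tci & TCI_PCP_MASK) >> 13);
}

/* Masked comparison: only bits set in mask are compared against spec. */
static int
tci_matches(uint16_t pkt_tci, uint16_t spec, uint16_t mask)
{
	return (pkt_tci & mask) == (spec & mask);
}
```

A packet tagged with VLAN ID 10 and a non-zero priority matches spec 10
under mask 0x0fff, but not under the full 0xffff mask.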

> 2. When we test destroy function, we find the pointer provided to PMD is NULL
>     instead of the pointer PMD returned to RTE during creating flow. Could you
>     please have double check? Thanks.

There is indeed a bug [1]. It is fixed in v2.

Thanks.

 [1] http://dpdk.org/ml/archives/dev/2016-November/050435.html

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
                       ` (24 preceding siblings ...)
  2016-12-02 16:58     ` Ferruh Yigit
@ 2016-12-16 16:24     ` Adrien Mazarguil
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API Adrien Mazarguil
                         ` (27 more replies)
  25 siblings, 28 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:24 UTC (permalink / raw)
  To: dev

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague: it may be
called "Generic flow API", "Generic flow interface" (possibly shortened
as "GFI") to refer to the name of the new filter type, or "rte_flow" from
the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since v1 series:

- Added programmer's guide documentation for rte_flow.

- Added deprecation notice for the legacy API.

- Documented testpmd flow command.

- Fixed missing rte_flow_flush symbol in rte_ether_version.map.

- Cleaned up API documentation in rte_flow.h.

- Replaced "min/max" parameters with "num" in struct rte_flow_item_any, to
  align behavior with other item definitions.

- Fixed "type" (EtherType) size in struct rte_flow_item_eth.

- Renamed "queues" to "num" in struct rte_flow_action_rss.

- Fixed missing const in rte_flow_error_set() prototype definition.

- Fixed testpmd flow create command that did not save the rte_flow object
  pointer, causing crashes.

- Hopefully fixed all the remaining ICC/clang errors.

- Replaced testpmd flow command's "fix" token with "is" for clarity.

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect
  struct rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
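
As an illustration of the spec/last/mask range semantics mentioned above,
here is a minimal sketch for a single 16-bit field. struct field_item and
field_item_match() are hypothetical and much simpler than the real
struct rte_flow_item; treating a missing mask as "all bits" and a missing
"last" as "same as spec" are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-field item, not the real struct rte_flow_item. */
struct field_item {
	const uint16_t *spec; /* Lower bound, NULL matches anything. */
	const uint16_t *last; /* Upper bound, NULL assumed same as spec. */
	const uint16_t *mask; /* Bits to compare, NULL assumed all bits. */
};

/* Inclusive range check between (spec & mask) and (last & mask). */
static int
field_item_match(const struct field_item *item, uint16_t value)
{
	uint16_t mask = item->mask ? *item->mask : UINT16_MAX;
	uint16_t lo, hi, v;

	if (!item->spec)
		return 1;
	lo = *item->spec & mask;
	hi = (item->last ? *item->last : *item->spec) & mask;
	v = value & mask;
	return v >= lo && v <= hi;
}
```

With spec 1000 and last 2000, values 1000 through 2000 match; without
"last", only the value equal to spec (under the mask) matches.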

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
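
The wrapper/callback split described above can be sketched as follows.
This is a toy model only: struct toy_dev, struct toy_flow_ops and the
toy_* functions are invented for illustration and do not match the real
rte_flow.c or rte_flow_driver.h definitions. It shows a generic wrapper
that dispatches to a driver callback and reports ENOTSUP when the driver
provides none.

```c
#include <errno.h>
#include <stddef.h>

struct toy_dev;

/* Hypothetical callback table, loosely modeled on struct rte_flow_ops. */
struct toy_flow_ops {
	int (*flush)(struct toy_dev *dev);
};

/* Hypothetical device: ops is NULL when the flow API is unsupported. */
struct toy_dev {
	const struct toy_flow_ops *ops;
	unsigned int nb_rules;
};

/* Generic wrapper: validate, then dispatch to the driver callback. */
static int
toy_flow_flush(struct toy_dev *dev)
{
	if (!dev)
		return -ENODEV;
	if (!dev->ops || !dev->ops->flush)
		return -ENOTSUP;
	return dev->ops->flush(dev);
}

/* Example driver-side implementation of the flush callback. */
static int
toy_pmd_flush(struct toy_dev *dev)
{
	dev->nb_rules = 0;
	return 0;
}

static const struct toy_flow_ops toy_pmd_ops = {
	.flush = toy_pmd_flush,
};
```

Keeping the callback table in a driver-side header means applications only
ever see the wrappers, so callbacks can be added without touching
application code.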

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (25):
  ethdev: introduce generic flow API
  doc: add rte_flow prog guide
  doc: announce depreciation of legacy filter types
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command
  doc: describe testpmd flow command

 MAINTAINERS                                 |    4 +
 app/test-pmd/Makefile                       |    1 +
 app/test-pmd/cmdline.c                      |   32 +
 app/test-pmd/cmdline_flow.c                 | 2575 ++++++++++++++++++++++
 app/test-pmd/config.c                       |  485 ++++
 app/test-pmd/csumonly.c                     |    1 +
 app/test-pmd/flowgen.c                      |    1 +
 app/test-pmd/icmpecho.c                     |    1 +
 app/test-pmd/ieee1588fwd.c                  |    1 +
 app/test-pmd/iofwd.c                        |    1 +
 app/test-pmd/macfwd.c                       |    1 +
 app/test-pmd/macswap.c                      |    1 +
 app/test-pmd/parameters.c                   |    1 +
 app/test-pmd/rxonly.c                       |    1 +
 app/test-pmd/testpmd.c                      |    6 +
 app/test-pmd/testpmd.h                      |   27 +
 app/test-pmd/txonly.c                       |    1 +
 doc/api/doxy-api-index.md                   |    2 +
 doc/guides/prog_guide/index.rst             |    1 +
 doc/guides/prog_guide/rte_flow.rst          | 1853 ++++++++++++++++
 doc/guides/rel_notes/deprecation.rst        |    7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  612 +++++
 lib/librte_cmdline/cmdline_parse.c          |   67 +-
 lib/librte_cmdline/cmdline_parse.h          |   21 +
 lib/librte_ether/Makefile                   |    3 +
 lib/librte_ether/rte_eth_ctrl.h             |    1 +
 lib/librte_ether/rte_ether_version.map      |   11 +
 lib/librte_ether/rte_flow.c                 |  159 ++
 lib/librte_ether/rte_flow.h                 |  942 ++++++++
 lib/librte_ether/rte_flow_driver.h          |  181 ++
 30 files changed, 6991 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 doc/guides/prog_guide/rte_flow.rst
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

-- 
2.1.4


* [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
@ 2016-12-16 16:24       ` Adrien Mazarguil
  2017-10-23  8:53         ` Zhao1, Wei
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide Adrien Mazarguil
                         ` (26 subsequent siblings)
  27 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:24 UTC (permalink / raw)
  To: dev

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 MAINTAINERS                            |   4 +
 doc/api/doxy-api-index.md              |   2 +
 lib/librte_ether/Makefile              |   3 +
 lib/librte_ether/rte_eth_ctrl.h        |   1 +
 lib/librte_ether/rte_ether_version.map |  11 +
 lib/librte_ether/rte_flow.c            | 159 +++++
 lib/librte_ether/rte_flow.h            | 942 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h     | 181 ++++++
 8 files changed, 1303 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 26d9590..5975cff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
 F: scripts/test-null.sh
 
+Generic flow API
+M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de65b4c..4951552 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -39,6 +39,8 @@ There are many libraries, so their headers may be grouped by topics:
   [dev]                (@ref rte_dev.h),
   [ethdev]             (@ref rte_ethdev.h),
   [ethctrl]            (@ref rte_eth_ctrl.h),
+  [rte_flow]           (@ref rte_flow.h),
+  [rte_flow_driver]    (@ref rte_flow_driver.h),
   [cryptodev]          (@ref rte_cryptodev.h),
   [devargs]            (@ref rte_devargs.h),
   [bond]               (@ref rte_eth_bond.h),
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5
 
 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c
 
 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
 	RTE_ETH_FILTER_L2_TUNNEL,
+	RTE_ETH_FILTER_GENERIC,
 	RTE_ETH_FILTER_MAX
 };
 
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..384cdee 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,14 @@ DPDK_16.11 {
 	rte_eth_dev_pci_remove;
 
 } DPDK_16.07;
+
+DPDK_17.02 {
+	global:
+
+	rte_flow_validate;
+	rte_flow_create;
+	rte_flow_destroy;
+	rte_flow_flush;
+	rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 0000000..064963d
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+/* Get generic flow operations structure from a port. */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops;
+	int code;
+
+	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
+		code = ENODEV;
+	else if (unlikely(!dev->dev_ops->filter_ctrl ||
+			  dev->dev_ops->filter_ctrl(dev,
+						    RTE_ETH_FILTER_GENERIC,
+						    RTE_ETH_FILTER_GET,
+						    &ops) ||
+			  !ops))
+		code = ENOTSUP;
+	else
+		return ops;
+	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(code));
+	return NULL;
+}
+
+/* Check whether a flow rule can be created on a given port. */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error)
+{
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->validate))
+		return ops->validate(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Create a flow rule on a given port. */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return NULL;
+	if (likely(!!ops->create))
+		return ops->create(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return NULL;
+}
+
+/* Destroy a flow rule on a given port. */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->destroy))
+		return ops->destroy(dev, flow, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Destroy all flow rules associated with a port. */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->flush))
+		return ops->flush(dev, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
+
+/* Query an existing flow rule. */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->query))
+		return ops->query(dev, flow, action, data, error);
+	rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOTSUP));
+	return -rte_errno;
+}
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..0bd5957
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,942 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * At least one direction must be specified.
+ *
+ * Specifying both directions at once for a given rule is not recommended
+ * but may be valid in a few cases (e.g. shared counter).
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Pattern items fall in two categories:
+ *
+ * - Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+ *   IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+ *   specification structure. These must be stacked in the same order as the
+ *   protocol layers to match, starting from the lowest.
+ *
+ * - Matching meta-data or affecting pattern processing (END, VOID, INVERT,
+ *   PF, VF, PORT and so on), often without a specification structure. Since
+ *   they do not match packet contents, these can be specified anywhere
+ *   within item lists without affecting others.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer, a single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an 802.1Q/ad VLAN tag.
+	 *
+	 * See struct rte_flow_item_vlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VLAN,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches an SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer; a single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A zeroed mask stands for any number of layers.
+ */
+struct rte_flow_item_any {
+	uint32_t num; /**< Number of layers covered. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   VF IDs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * A zeroed mask can be used to match any VF ID.
+ */
+struct rte_flow_item_vf {
+	uint32_t id; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device; they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ *
+ * A zeroed mask can be used to match any port index.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed; doing so resets the relative
+ * offset for subsequent items.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	uint16_t type; /**< EtherType. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VLAN
+ *
+ * Matches an 802.1Q/ad VLAN tag.
+ *
+ * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
+ * RTE_FLOW_ITEM_TYPE_VLAN.
+ */
+struct rte_flow_item_vlan {
+	uint16_t tpid; /**< Tag protocol identifier. */
+	uint16_t tci; /**< Tag control information. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP
+ *
+ * Matches an SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint8_t flags; /**< Normally 0x08 (I flag). */
+	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
+	uint8_t vni[3]; /**< VXLAN identifier. */
+	uint8_t rsvd1; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack without affecting the meaning
+ * of the resulting pattern.
+ *
+ * Patterns are terminated by END items.
+ *
+ * The spec field should be a valid pointer to a structure of the related
+ * item type. It may be set to NULL in many cases to use default values.
+ *
+ * Optionally, last can point to a structure of the same type to define an
+ * inclusive range. This is mostly supported by integer and address fields
+ * and may cause errors otherwise. Fields that do not support ranges must be
+ * set to 0 or to the same value as the corresponding fields in spec.
+ *
+ * By default all fields present in spec are considered relevant (see note
+ * below). This behavior can be altered by providing a mask structure of the
+ * same type with applicable bits set to one. It can also be used to
+ * partially filter out specific fields (e.g. as an alternate means to match
+ * ranges of IP addresses).
+ *
+ * Mask is a simple bit-mask applied before interpreting the contents of
+ * spec and last, which may yield unexpected results if not used
+ * carefully. For example, if for an IPv4 address field, spec provides
+ * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
+ * effective range becomes 10.1.0.0 to 10.3.255.255.
+ *
+ * Note: the defaults for data-matching items such as IPv4 when mask is not
+ * specified actually depend on the underlying implementation since only
+ * recognized fields can be taken into account.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *last; /**< Defines an inclusive range (spec to last). */
+	const void *mask; /**< Bit-mask applied to spec and last. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non-terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non-terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible).
+ *
+ * Only the last action of a given type is taken into account. PMDs still
+ * perform error checking on the entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32-bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE; however, when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32-bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32-bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t index; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE; however, when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t index; /**< Queue index to duplicate packets to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t num; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queue indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t id; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * A list of actions is terminated by an END action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs. The
+ * message points to a constant string which does not need to be freed by
+ * the application; however, its pointer can be considered valid only as
+ * long as its associated DPDK port remains configured. Closing the
+ * underlying device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameter affecting flow rules in any way is
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * \see RTE_FLOW_ACTION_TYPE_COUNT
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
new file mode 100644
index 0000000..b75cfdd
--- /dev/null
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -0,0 +1,181 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_DRIVER_H_
+#define RTE_FLOW_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic flow API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs; they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic flow operations structure implemented and returned by PMDs.
+ *
+ * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
+ * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
+ * as the RTE_ETH_FILTER_GET filter operation.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific
+ * struct rte_flow_ops written to the argument address as described below:
+ *
+ * \code
+ *
+ * // PMD filter_ctrl callback
+ *
+ * static const struct rte_flow_ops pmd_flow_ops = { ... };
+ *
+ * switch (filter_type) {
+ * case RTE_ETH_FILTER_GENERIC:
+ *     if (filter_op != RTE_ETH_FILTER_GET)
+ *         return -EINVAL;
+ *     *(const void **)arg = &pmd_flow_ops;
+ *     return 0;
+ * }
+ *
+ * \endcode
+ *
+ * See also rte_flow_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_flow.h.
+ *
+ * Public-facing wrapper functions perform a few consistency checks so that
+ * unimplemented (i.e. NULL) callbacks simply return -ENOSYS. These
+ * callbacks otherwise only differ by their first argument (with port ID
+ * already resolved to a pointer to struct rte_eth_dev).
+ */
+struct rte_flow_ops {
+	/** See rte_flow_validate(). */
+	int (*validate)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_create(). */
+	struct rte_flow *(*create)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_destroy(). */
+	int (*destroy)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 struct rte_flow_error *);
+	/** See rte_flow_flush(). */
+	int (*flush)
+		(struct rte_eth_dev *,
+		 struct rte_flow_error *);
+	/** See rte_flow_query(). */
+	int (*query)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 enum rte_flow_action_type,
+		 void *,
+		 struct rte_flow_error *);
+};
+
+/**
+ * Initialize generic flow error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param[out] error
+ *   Pointer to flow error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error types.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Pointer to flow error structure.
+ */
+static inline struct rte_flow_error *
+rte_flow_error_set(struct rte_flow_error *error,
+		   int code,
+		   enum rte_flow_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_flow_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return error;
+}
+
+/**
+ * Get generic flow operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_DRIVER_H_ */
-- 
2.1.4
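
A note on the spec/last/mask semantics documented for struct rte_flow_item
in the patch above: the effective range can be worked out by hand. The
following is only a minimal standalone sketch (plain C, no DPDK headers).
The helper names ipv4() and effective_range() are hypothetical and not part
of the proposed API. It assumes that bits cleared in mask are simply ignored,
so they read as zero in the lower bound and are unconstrained in the upper
bound, which reproduces the 10.1.0.0 to 10.3.255.255 example from the
comment.

```c
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four bytes. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Hypothetical helper: compute the effective inclusive range once mask has
 * been applied to both spec and last. Bits cleared in mask are ignored, so
 * they are zero in the lower bound and may take any value in the upper one.
 */
static void
effective_range(uint32_t spec, uint32_t last, uint32_t mask,
		uint32_t *low, uint32_t *high)
{
	*low = spec & mask;
	*high = (last & mask) | ~mask;
}
```

Calling effective_range() with spec 10.1.2.3, last 10.3.4.5 and mask
255.255.0.0 yields the bounds quoted in the comment, which is why a partial
mask combined with last can match more traffic than spec and last alone
suggest.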

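Since struct rte_flow_item_raw in the patch above ends with a flexible array
member (pattern[]), a specification cannot be declared as a plain local
variable when it carries pattern bytes. One possible way to build it is
sketched below. The structure is a local copy so that the example compiles
without DPDK, and raw_spec_alloc() is a hypothetical helper, not something
the patch provides.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the structure from the patch, so this compiles alone. */
struct rte_flow_item_raw {
	uint32_t relative:1; /* Look for pattern after the previous item. */
	uint32_t search:1; /* Search pattern from offset (see also limit). */
	uint32_t reserved:30; /* Reserved, must be set to zero. */
	int32_t offset; /* Absolute or relative offset for pattern. */
	uint16_t limit; /* Search area limit for start of pattern. */
	uint16_t length; /* Pattern length. */
	uint8_t pattern[]; /* Byte string to look for. */
};

/*
 * Hypothetical helper: allocate a RAW item specification large enough to
 * hold len pattern bytes. Returns NULL on allocation failure; the caller
 * releases the result with free().
 */
static struct rte_flow_item_raw *
raw_spec_alloc(int relative, int32_t offset, const void *bytes, uint16_t len)
{
	struct rte_flow_item_raw *raw;

	/* calloc() leaves search, reserved and limit set to zero. */
	raw = calloc(1, sizeof(*raw) + len);
	if (raw == NULL)
		return NULL;
	raw->relative = !!relative;
	raw->offset = offset;
	raw->length = len;
	if (len)
		memcpy(raw->pattern, bytes, len);
	return raw;
}
```

The returned pointer would then be used as the spec field of a pattern item
of type RTE_FLOW_ITEM_TYPE_RAW. Zeroing the whole allocation keeps the
reserved bits at zero as the structure comment requires.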

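struct rte_flow_item_vxlan in the patch above stores the VXLAN identifier as
three separate bytes, so a 24-bit VNI has to be split before it can be used
in a specification. The sketch below shows one way to do that. It assumes
the bytes are laid out in network order as in the RFC 7348 header, which the
patch does not state explicitly. The structure is a local copy and both
helper functions are hypothetical.

```c
#include <stdint.h>

/* Local copy of the structure from the patch, so this compiles alone. */
struct rte_flow_item_vxlan {
	uint8_t flags; /* Normally 0x08 (I flag). */
	uint8_t rsvd0[3]; /* Reserved, normally 0x000000. */
	uint8_t vni[3]; /* VXLAN identifier. */
	uint8_t rsvd1; /* Reserved, normally 0x00. */
};

/*
 * Hypothetical helper: fill spec with the low 24 bits of vni, most
 * significant byte first (assumed network order), and set the I flag.
 */
static void
vxlan_spec_set_vni(struct rte_flow_item_vxlan *spec, uint32_t vni)
{
	spec->flags = 0x08;
	spec->rsvd0[0] = 0;
	spec->rsvd0[1] = 0;
	spec->rsvd0[2] = 0;
	spec->vni[0] = (vni >> 16) & 0xff;
	spec->vni[1] = (vni >> 8) & 0xff;
	spec->vni[2] = vni & 0xff;
	spec->rsvd1 = 0;
}

/* Hypothetical helper: read the 24-bit VNI back as a host-order integer. */
static uint32_t
vxlan_spec_get_vni(const struct rte_flow_item_vxlan *spec)
{
	return ((uint32_t)spec->vni[0] << 16) |
	       ((uint32_t)spec->vni[1] << 8) |
	       (uint32_t)spec->vni[2];
}
```

Bits above the 24th in the argument are silently dropped by the setter, so
callers that care would need to range-check the value first.
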
* [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-12-16 16:24       ` Adrien Mazarguil
  2016-12-19 10:45         ` Mcnamara, John
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy filter types Adrien Mazarguil
                         ` (25 subsequent siblings)
  27 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:24 UTC (permalink / raw)
  To: dev

This documentation is based on the latest RFC submission, subsequently
updated according to feedback from the community.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/prog_guide/index.rst    |    1 +
 doc/guides/prog_guide/rte_flow.rst | 1853 +++++++++++++++++++++++++++++++
 2 files changed, 1854 insertions(+)

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index e5a50a8..ed7f770 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -42,6 +42,7 @@ Programmer's Guide
     mempool_lib
     mbuf_lib
     poll_mode_drv
+    rte_flow
     cryptodev_lib
     link_bonding_poll_mode_drv_lib
     timer_lib
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
new file mode 100644
index 0000000..63413d1
--- /dev/null
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -0,0 +1,1853 @@
+..  BSD LICENSE
+    Copyright 2016 6WIND S.A.
+    Copyright 2016 Mellanox.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of 6WIND S.A. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Generic_flow_API:
+
+Generic flow API (rte_flow)
+===========================
+
+Overview
+--------
+
+This API provides a generic means to configure hardware to match specific
+ingress or egress traffic, alter its fate and query related counters
+according to any number of user-defined rules.
+
+It is named *rte_flow* after the prefix used for all its symbols, and is
+defined in ``rte_flow.h``.
+
+- Matching can be performed on packet data (protocol headers, payload) and
+  properties (e.g. associated physical port, virtual device function ID).
+
+- Possible operations include dropping traffic, diverting it to specific
+  queues, to virtual/physical device functions or ports, performing tunnel
+  offloads, adding marks and so on.
+
+It is slightly higher-level than the legacy filtering framework which it
+encompasses and supersedes (including all functions and filter types) in
+order to expose a single interface with an unambiguous behavior that is
+common to all poll-mode drivers (PMDs).
+
+Several methods to migrate existing applications are described in `API
+migration`_.
+
+Flow rule
+---------
+
+Description
+~~~~~~~~~~~
+
+A flow rule is the combination of attributes with a matching pattern and a
+list of actions. Flow rules form the basis of this API.
+
+Flow rules can have several distinct actions (such as counting,
+encapsulating, decapsulating before redirecting packets to a particular
+queue, etc.), instead of relying on several rules to achieve this and having
+applications deal with hardware implementation details regarding their
+order.
+
+Support for different priority levels on a rule basis is provided, for
+example in order to force a more specific rule to come before a more generic
+one for packets matched by both. However hardware support for more than a
+single priority level cannot be guaranteed. When supported, the number of
+available priority levels is usually low, which is why they can also be
+implemented in software by PMDs (e.g. missing priority levels may be
+emulated by reordering rules).
+
+In order to remain as hardware-agnostic as possible, by default all rules
+are considered to have the same priority, which means that the order between
+overlapping rules (when a packet is matched by several filters) is
+undefined.
+
+PMDs may refuse to create overlapping rules at a given priority level when
+they can be detected (e.g. if a pattern matches an existing filter).
+
+Thus predictable results for a given priority level can only be achieved
+with non-overlapping rules, using perfect matching on all protocol layers.
+
+Flow rules can also be grouped; the priority of a flow rule is specific to
+the group it belongs to. All flow rules in a given group are thus processed
+either before or after those of another group.
+
+Support for multiple actions per rule may be implemented internally on top
+of non-default hardware priorities, as a result both features may not be
+simultaneously available to applications.
+
+Considering that allowed pattern/actions combinations cannot be known in
+advance and would result in an impractically large number of capabilities to
+expose, a method is provided to validate a given rule from the current
+device configuration state.
+
+This enables applications to check if the rule types they need are supported
+at initialization time, before starting their data path. This method can be
+used anytime, its only requirement being that the resources needed by a rule
+should exist (e.g. a target RX queue should be configured first).
+
+Each defined rule is associated with an opaque handle managed by the PMD;
+applications are responsible for keeping it. Handles can be used for queries
+and rule management, such as retrieving counters or other data and
+destroying rules.
+
+To avoid resource leaks on the PMD side, handles must be explicitly
+destroyed by the application before releasing associated resources such as
+queues and ports.
+
+The following sections cover:
+
+- **Attributes** (represented by ``struct rte_flow_attr``): properties of a
+  flow rule such as its direction (ingress or egress) and priority.
+
+- **Pattern item** (represented by ``struct rte_flow_item``): part of a
+  matching pattern that either matches specific packet data or traffic
+  properties. It can also describe properties of the pattern itself, such as
+  inverted matching.
+
+- **Matching pattern**: traffic properties to look for, a combination of any
+  number of items.
+
+- **Actions** (represented by ``struct rte_flow_action``): operations to
+  perform whenever a packet is matched by a pattern.
+
+Attributes
+~~~~~~~~~~
+
+Group
+^^^^^
+
+Flow rules can be grouped by assigning them a common group number. Lower
+values have higher priority. Group 0 has the highest priority.
+
+Although optional, applications are encouraged to group similar rules as
+much as possible to fully take advantage of hardware capabilities
+(e.g. optimized matching) and work around limitations (e.g. a single pattern
+type possibly allowed in a given group).
+
+Note that support for more than a single group is not guaranteed.
+
+Priority
+^^^^^^^^
+
+A priority level can be assigned to a flow rule. Like groups, lower values
+denote higher priority, with 0 as the maximum.
+
+A rule with priority 0 in group 8 is always matched after a rule with
+priority 8 in group 0.
+
+Group and priority levels are arbitrary and up to the application. They do
+not need to be contiguous nor start from 0; however, the maximum number
+varies between devices and may be affected by existing flow rules.
+
+If a packet is matched by several rules of a given group for a given
+priority level, the outcome is undefined. It can take any path, may be
+duplicated or even cause unrecoverable errors.
+
+Note that support for more than a single priority level is not guaranteed.
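+
+As a rough illustration of the ordering described in the Group and Priority
+sections, the following C sketch compares two rules by group first, then by
+priority. ``struct flow_order`` and ``flow_order_cmp()`` are hypothetical
+names used for this example only; they are not part of ``rte_flow.h``.
+

```c
#include <stdint.h>

/* Hypothetical stand-in for the group and priority attributes of a rule. */
struct flow_order {
	uint32_t group;    /* lower value = higher priority, 0 is highest */
	uint32_t priority; /* only meaningful within the same group */
};

/*
 * Return a negative value if rule a is matched before rule b, a positive
 * value if it is matched after, and 0 when both share the same group and
 * priority level (in which case their relative order is undefined).
 */
int
flow_order_cmp(const struct flow_order *a, const struct flow_order *b)
{
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```

+
+With this comparison, a rule with priority 8 in group 0 sorts before a rule
+with priority 0 in group 8, as stated above.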
+
+Traffic direction
+^^^^^^^^^^^^^^^^^
+
+Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+
+Several pattern items and actions are valid and can be used in both
+directions. At least one direction must be specified.
+
+Specifying both directions at once for a given rule is not recommended but
+may be valid in a few cases (e.g. shared counters).
+
+Pattern item
+~~~~~~~~~~~~
+
+Pattern items fall in two categories:
+
+- Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+  IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+  specification structure.
+
+- Matching meta-data or affecting pattern processing (END, VOID, INVERT, PF,
+  VF, PORT and so on), often without a specification structure.
+
+Item specification structures are used to match specific values among
+protocol fields (or item properties). The documentation of each item states
+whether it is associated with one and, if so, its type name.
+
+Up to three structures of the same type can be set for a given item:
+
+- ``spec``: values to match (e.g. a given IPv4 address).
+
+- ``last``: upper bound for an inclusive range with corresponding fields in
+  ``spec``.
+
+- ``mask``: bit-mask applied to both ``spec`` and ``last`` whose purpose is
+  to distinguish the values to take into account and/or partially mask them
+  out (e.g. in order to match an IPv4 address prefix).
+
+Usage restrictions and expected behavior:
+
+- Setting either ``mask`` or ``last`` without ``spec`` is an error.
+
+- Field values in ``last`` which are either 0 or equal to the corresponding
+  values in ``spec`` are ignored; they do not generate a range. Nonzero
+  values lower than those in ``spec`` are not supported.
+
+- Setting ``spec`` and optionally ``last`` without ``mask`` causes the PMD
+  to only take the fields it can recognize into account. There is no error
+  checking for unsupported fields.
+
+- Not setting any of them (assuming item type allows it) uses default
+  parameters that depend on the item type. Most of the time, particularly
+  for protocol header items, it is equivalent to providing an empty (zeroed)
+  ``mask``.
+
+- ``mask`` is a simple bit-mask applied before interpreting the contents of
+  ``spec`` and ``last``, which may yield unexpected results if not used
+  carefully. For example, if for an IPv4 address field, ``spec`` provides
+  *10.1.2.3*, ``last`` provides *10.3.4.5* and ``mask`` provides
+  *255.255.0.0*, the effective range becomes *10.1.0.0* to *10.3.255.255*.
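+
+The interaction between ``spec``, ``last`` and ``mask`` listed above can be
+sketched for a single 32-bit field as follows. This is only an illustration
+of the documented behavior, not PMD code: ``ipv4()`` and ``field_matches()``
+are hypothetical helpers, and values are kept in host byte order for
+readability.
+

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order (hypothetical helper). */
uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Check one field value against spec/last/mask. The mask is applied to
 * all operands before they are interpreted.
 */
bool
field_matches(uint32_t value, uint32_t spec, uint32_t last, uint32_t mask)
{
	uint32_t v = value & mask;
	uint32_t lo = spec & mask;
	uint32_t hi = last & mask;

	/* last is 0 or equal to spec: no range, match masked spec only. */
	if (last == 0 || last == spec)
		return v == lo;
	/* Inclusive range; values of last below spec never match here. */
	return v >= lo && v <= hi;
}
```

+
+With ``spec`` *10.1.2.3*, ``last`` *10.3.4.5* and ``mask`` *255.255.0.0*,
+this accepts any address from *10.1.0.0* to *10.3.255.255*.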
+
+Example of an item specification matching an Ethernet header:
+
++------------------------------------------+
+| Ethernet                                 |
++==========+==========+====================+
+| ``spec`` | ``src``  | ``00:01:02:03:04`` |
+|          +----------+--------------------+
+|          | ``dst``  | ``00:2a:66:00:01`` |
+|          +----------+--------------------+
+|          | ``type`` | ``0x22aa``         |
++----------+----------+--------------------+
+| ``last`` | unspecified                   |
++----------+----------+--------------------+
+| ``mask`` | ``src``  | ``00:ff:ff:ff:00`` |
+|          +----------+--------------------+
+|          | ``dst``  | ``00:00:00:00:ff`` |
+|          +----------+--------------------+
+|          | ``type`` | ``0x0000``         |
++----------+----------+--------------------+
+
+Non-masked bits stand for any value (shown as ``?`` below), Ethernet headers
+with the following properties are thus matched:
+
+- ``src``: ``??:01:02:03:??``
+- ``dst``: ``??:??:??:??:01``
+- ``type``: ``0x????``
+
+Matching pattern
+~~~~~~~~~~~~~~~~
+
+A pattern is formed by stacking items starting from the lowest protocol
+layer to match. This stacking restriction does not apply to meta items which
+can be placed anywhere in the stack without affecting the meaning of the
+resulting pattern.
+
+Patterns are terminated by END items.
+
+Examples:
+
++--------------+
+| TCPv4 as L4  |
++===+==========+
+| 0 | Ethernet |
++---+----------+
+| 1 | IPv4     |
++---+----------+
+| 2 | TCP      |
++---+----------+
+| 3 | END      |
++---+----------+
+
+|
+
++----------------+
+| TCPv6 in VXLAN |
++===+============+
+| 0 | Ethernet   |
++---+------------+
+| 1 | IPv4       |
++---+------------+
+| 2 | UDP        |
++---+------------+
+| 3 | VXLAN      |
++---+------------+
+| 4 | Ethernet   |
++---+------------+
+| 5 | IPv6       |
++---+------------+
+| 6 | TCP        |
++---+------------+
+| 7 | END        |
++---+------------+
+
+|
+
++-----------------------------+
+| TCPv4 as L4 with meta items |
++===+=========================+
+| 0 | VOID                    |
++---+-------------------------+
+| 1 | Ethernet                |
++---+-------------------------+
+| 2 | VOID                    |
++---+-------------------------+
+| 3 | IPv4                    |
++---+-------------------------+
+| 4 | TCP                     |
++---+-------------------------+
+| 5 | VOID                    |
++---+-------------------------+
+| 6 | VOID                    |
++---+-------------------------+
+| 7 | END                     |
++---+-------------------------+
+
+The above example shows how meta items do not affect packet data matching
+items, as long as those remain stacked properly. The resulting matching
+pattern is identical to "TCPv4 as L4".
+
++----------------+
+| UDPv6 anywhere |
++===+============+
+| 0 | IPv6       |
++---+------------+
+| 1 | UDP        |
++---+------------+
+| 2 | END        |
++---+------------+
+
+If supported by the PMD, omitting one or several protocol layers at the
+bottom of the stack as in the above example (missing an Ethernet
+specification) enables looking up anywhere in packets.
+
+It is unspecified whether the payload of supported encapsulations
+(e.g. VXLAN payload) is matched by such a pattern, which may apply to inner,
+outer or both packets.
+
++---------------------+
+| Invalid, missing L3 |
++===+=================+
+| 0 | Ethernet        |
++---+-----------------+
+| 1 | UDP             |
++---+-----------------+
+| 2 | END             |
++---+-----------------+
+
+The above pattern is invalid due to a missing L3 specification between L2
+(Ethernet) and L4 (UDP). Omitting layers is only allowed at the bottom and
+at the top of the stack.
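+
+The stacking rule illustrated by the tables above can be approximated with
+a short C sketch. The ``enum item_type`` values and both functions are
+hypothetical and cover only a few non-tunneled protocols; encapsulations
+such as VXLAN, which restart the stack, are deliberately left out.
+

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical item types, a small subset of those described here. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_UDP,
	ITEM_TCP,
};

/* Protocol layer of a data matching item, 0 for meta items. */
int
item_layer(enum item_type type)
{
	switch (type) {
	case ITEM_ETH:
		return 2;
	case ITEM_IPV4:
	case ITEM_IPV6:
		return 3;
	case ITEM_UDP:
	case ITEM_TCP:
		return 4;
	default:
		return 0;
	}
}

/*
 * Return true if data matching items are stacked from lowest to highest
 * layer without gaps. Layers may be omitted at the bottom or at the top
 * of the stack, and meta items are skipped.
 */
bool
pattern_is_stacked(const enum item_type *pattern)
{
	int prev = 0;
	size_t i;

	for (i = 0; pattern[i] != ITEM_END; ++i) {
		int layer = item_layer(pattern[i]);

		if (layer == 0)
			continue; /* meta item, no effect on stacking */
		if (prev != 0 && layer != prev + 1)
			return false;
		prev = layer;
	}
	return true;
}
```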
+
+Meta item types
+~~~~~~~~~~~~~~~
+
+They match meta-data or affect pattern processing instead of matching packet
+data directly; most of them do not need a specification structure. This
+particularity allows them to be specified anywhere in the stack without
+causing any side effect.
+
+``END``
+^^^^^^^
+
+End marker for item lists. Prevents further processing of items, thereby
+ending the pattern.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
++--------------------+
+| END                |
++==========+=========+
+| ``spec`` | ignored |
++----------+---------+
+| ``last`` | ignored |
++----------+---------+
+| ``mask`` | ignored |
++----------+---------+
+
+``VOID``
+^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
++--------------------+
+| VOID               |
++==========+=========+
+| ``spec`` | ignored |
++----------+---------+
+| ``last`` | ignored |
++----------+---------+
+| ``mask`` | ignored |
++----------+---------+
+
+One usage example for this type is generating rules that share a common
+prefix quickly without reallocating memory, only by updating item types:
+
++------------------------+
+| TCP, UDP or ICMP as L4 |
++===+====================+
+| 0 | Ethernet           |
++---+--------------------+
+| 1 | IPv4               |
++---+------+------+------+
+| 2 | UDP  | VOID | VOID |
++---+------+------+------+
+| 3 | VOID | TCP  | VOID |
++---+------+------+------+
+| 4 | VOID | VOID | ICMP |
++---+------+------+------+
+| 5 | END                |
++---+--------------------+
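+
+A minimal sketch of this technique, assuming a hypothetical ``struct item``
+that only carries a type (the real item structure also holds ``spec``,
+``last`` and ``mask``): the six-entry array is allocated once and only the
+types of slots 2 to 4 change between rules.
+

```c
/* Hypothetical item types used by this example only. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_UDP,
	ITEM_TCP,
	ITEM_ICMP,
};

/* Hypothetical, simplified pattern item. */
struct item {
	enum item_type type;
};

/*
 * Enable exactly one L4 item among slots 2 (UDP), 3 (TCP) and 4 (ICMP),
 * turning the other two into VOID placeholders. Any other value of l4
 * leaves all three slots as VOID.
 */
void
pattern_select_l4(struct item pattern[6], enum item_type l4)
{
	pattern[2].type = l4 == ITEM_UDP ? ITEM_UDP : ITEM_VOID;
	pattern[3].type = l4 == ITEM_TCP ? ITEM_TCP : ITEM_VOID;
	pattern[4].type = l4 == ITEM_ICMP ? ITEM_ICMP : ITEM_VOID;
}
```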
+
+``INVERT``
+^^^^^^^^^^
+
+Inverted matching, i.e. process packets that do not match the pattern.
+
+- ``spec``, ``last`` and ``mask`` are ignored.
+
++--------------------+
+| INVERT             |
++==========+=========+
+| ``spec`` | ignored |
++----------+---------+
+| ``last`` | ignored |
++----------+---------+
+| ``mask`` | ignored |
++----------+---------+
+
+Usage example, matching non-TCPv4 packets only:
+
++--------------------+
+| Anything but TCPv4 |
++===+================+
+| 0 | INVERT         |
++---+----------------+
+| 1 | Ethernet       |
++---+----------------+
+| 2 | IPv4           |
++---+----------------+
+| 3 | TCP            |
++---+----------------+
+| 4 | END            |
++---+----------------+
+
+``PF``
+^^^^^^
+
+Matches packets addressed to the physical function of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains a `PF (action)`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if applied to a VF
+  device.
+- Can be combined with any number of `VF`_ items to match both PF and VF
+  traffic.
+- ``spec``, ``last`` and ``mask`` must not be set.
+
++------------------+
+| PF               |
++==========+=======+
+| ``spec`` | unset |
++----------+-------+
+| ``last`` | unset |
++----------+-------+
+| ``mask`` | unset |
++----------+-------+
+
+``VF``
+^^^^^^
+
+Matches packets addressed to a virtual function ID of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains a `VF (action)`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if this causes a VF
+  device to match traffic addressed to a different VF.
+- Can be specified multiple times to match traffic addressed to several VF
+  IDs.
+- Can be combined with a PF item to match both PF and VF traffic.
+
++------------------------------------------------+
+| VF                                             |
++==========+=========+===========================+
+| ``spec`` | ``id``  | destination VF ID         |
++----------+---------+---------------------------+
+| ``last`` | ``id``  | upper range value         |
++----------+---------+---------------------------+
+| ``mask`` | ``id``  | zeroed to match any VF ID |
++----------+---------+---------------------------+
+
+``PORT``
+^^^^^^^^
+
+Matches packets coming from the specified physical port of the underlying
+device.
+
+The first PORT item overrides the physical port normally associated with the
+specified DPDK input port (port_id). This item can be provided several times
+to match additional physical ports.
+
+Note that physical ports are not necessarily tied to DPDK input ports
+(port_id) when those are not under DPDK control. Possible values are
+specific to each device; they are not necessarily indexed from zero and may
+not be contiguous.
+
+As a device property, the list of allowed values as well as the value
+associated with a port_id should be retrieved by other means.
+
++-------------------------------------------------------+
+| PORT                                                  |
++==========+===========+================================+
+| ``spec`` | ``index`` | physical port index            |
++----------+-----------+--------------------------------+
+| ``last`` | ``index`` | upper range value              |
++----------+-----------+--------------------------------+
+| ``mask`` | ``index`` | zeroed to match any port index |
++----------+-----------+--------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Most of these are basically protocol header definitions with associated
+bit-masks. They must be specified (stacked) from lowest to highest protocol
+layer to form a matching pattern.
+
+The following list is not exhaustive, new protocols will be added in the
+future.
+
+``ANY``
+^^^^^^^
+
+Matches any protocol in place of the current layer; a single ANY may also
+stand for several protocol layers.
+
+This is usually specified as the first pattern item when looking for a
+protocol anywhere in a packet.
+
++-----------------------------------------------------------+
+| ANY                                                       |
++==========+=========+======================================+
+| ``spec`` | ``num`` | number of layers covered             |
++----------+---------+--------------------------------------+
+| ``last`` | ``num`` | upper range value                    |
++----------+---------+--------------------------------------+
+| ``mask`` | ``num`` | zeroed to cover any number of layers |
++----------+---------+--------------------------------------+
+
+Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
+and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
+or IPv6) matched by the second ANY specification:
+
++----------------------------------+
+| TCP in VXLAN with wildcards      |
++===+==============================+
+| 0 | Ethernet                     |
++---+-----+----------+---------+---+
+| 1 | ANY | ``spec`` | ``num`` | 2 |
++---+-----+----------+---------+---+
+| 2 | VXLAN                        |
++---+------------------------------+
+| 3 | Ethernet                     |
++---+-----+----------+---------+---+
+| 4 | ANY | ``spec`` | ``num`` | 1 |
++---+-----+----------+---------+---+
+| 5 | TCP                          |
++---+------------------------------+
+| 6 | END                          |
++---+------------------------------+
+
+``RAW``
+^^^^^^^
+
+Matches a byte string of a given length at a given offset.
+
+Offset is either absolute (using the start of the packet) or relative to the
+end of the previous matched item in the stack, in which case negative values
+are allowed.
+
+If search is enabled, offset is used as the starting point. The search area
+can be delimited by setting limit to a nonzero value, which is the maximum
+number of bytes after offset where the pattern may start.
+
+Matching a zero-length pattern is allowed; doing so resets the relative
+offset for subsequent items.
+
+- This type does not support ranges (``last`` field).
+
++---------------------------------------------------------------------------+
+| RAW                                                                       |
++==========+==============+=================================================+
+| ``spec`` | ``relative`` | look for pattern after the previous item        |
+|          +--------------+-------------------------------------------------+
+|          | ``search``   | search pattern from offset (see also ``limit``) |
+|          +--------------+-------------------------------------------------+
+|          | ``reserved`` | reserved, must be set to zero                   |
+|          +--------------+-------------------------------------------------+
+|          | ``offset``   | absolute or relative offset for ``pattern``     |
+|          +--------------+-------------------------------------------------+
+|          | ``limit``    | search area limit for start of ``pattern``      |
+|          +--------------+-------------------------------------------------+
+|          | ``length``   | ``pattern`` length                              |
+|          +--------------+-------------------------------------------------+
+|          | ``pattern``  | byte string to look for                         |
++----------+--------------+-------------------------------------------------+
+| ``last`` | if specified, either all 0 or with the same values as ``spec`` |
++----------+----------------------------------------------------------------+
+| ``mask`` | bit-mask applied to ``spec`` values with usual behavior        |
++----------+----------------------------------------------------------------+
+
+Example pattern looking for several strings at various offsets of a UDP
+payload, using combined RAW items:
+
++-------------------------------------------+
+| UDP payload matching                      |
++===+=======================================+
+| 0 | Ethernet                              |
++---+---------------------------------------+
+| 1 | IPv4                                  |
++---+---------------------------------------+
+| 2 | UDP                                   |
++---+-----+----------+--------------+-------+
+| 3 | RAW | ``spec`` | ``relative`` | 1     |
+|   |     |          +--------------+-------+
+|   |     |          | ``search``   | 1     |
+|   |     |          +--------------+-------+
+|   |     |          | ``offset``   | 10    |
+|   |     |          +--------------+-------+
+|   |     |          | ``limit``    | 0     |
+|   |     |          +--------------+-------+
+|   |     |          | ``length``   | 3     |
+|   |     |          +--------------+-------+
+|   |     |          | ``pattern``  | "foo" |
++---+-----+----------+--------------+-------+
+| 4 | RAW | ``spec`` | ``relative`` | 1     |
+|   |     |          +--------------+-------+
+|   |     |          | ``search``   | 0     |
+|   |     |          +--------------+-------+
+|   |     |          | ``offset``   | 20    |
+|   |     |          +--------------+-------+
+|   |     |          | ``limit``    | 0     |
+|   |     |          +--------------+-------+
+|   |     |          | ``length``   | 3     |
+|   |     |          +--------------+-------+
+|   |     |          | ``pattern``  | "bar" |
++---+-----+----------+--------------+-------+
+| 5 | RAW | ``spec`` | ``relative`` | 1     |
+|   |     |          +--------------+-------+
+|   |     |          | ``search``   | 0     |
+|   |     |          +--------------+-------+
+|   |     |          | ``offset``   | -29   |
+|   |     |          +--------------+-------+
+|   |     |          | ``limit``    | 0     |
+|   |     |          +--------------+-------+
+|   |     |          | ``length``   | 3     |
+|   |     |          +--------------+-------+
+|   |     |          | ``pattern``  | "baz" |
++---+-----+----------+--------------+-------+
+| 6 | END                                   |
++---+---------------------------------------+
+
+This translates to:
+
+- Locate "foo" at least 10 bytes deep inside UDP payload.
+- Locate "bar" after "foo" plus 20 bytes.
+- Locate "baz" after "bar" minus 29 bytes.
+
+Such a packet may be represented as follows (not to scale)::
+
+ 0                     >= 10 B           == 20 B
+ |                  |<--------->|     |<--------->|
+ |                  |           |     |           |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+ | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+                          |                             |
+                          |<--------------------------->|
+                                      == 29 B
+
+Note that matching subsequent pattern items would resume after "baz", not
+"bar" since matching is always performed after the previous item of the
+stack.
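+
+To make the offset arithmetic of the example above concrete, here is a
+software sketch of how such RAW items could be evaluated against a UDP
+payload. ``struct raw_spec``, ``raw_match()`` and ``raw_items_match()`` are
+hypothetical and simplified: ``mask`` is not handled, and a search stops at
+the first occurrence without backtracking, which an actual implementation
+may do differently.
+

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the RAW item specification. */
struct raw_spec {
	bool relative;       /* offset is relative to end of previous item */
	bool search;         /* search for pattern starting from offset */
	int32_t offset;      /* absolute or relative offset */
	uint16_t limit;      /* search area limit, 0 for none */
	uint16_t length;     /* pattern length */
	const char *pattern; /* byte string to look for */
};

/*
 * Match one RAW item against len bytes of data. prev_end is the offset
 * where the previous item ended. Return the offset right after the
 * matched pattern, or -1 if there is no match.
 */
long
raw_match(const uint8_t *data, size_t len, long prev_end,
	  const struct raw_spec *spec)
{
	long start = (spec->relative ? prev_end : 0) + spec->offset;
	long stop = start; /* last position where the pattern may start */

	if (spec->search)
		stop = spec->limit ? start + spec->limit : (long)len;
	for (; start <= stop; ++start) {
		if (start < 0 || (size_t)start + spec->length > len)
			continue;
		if (memcmp(data + start, spec->pattern, spec->length) == 0)
			return start + spec->length;
	}
	return -1;
}

/* Apply RAW items in order; data starts right after the UDP header. */
bool
raw_items_match(const uint8_t *data, size_t len,
		const struct raw_spec *items, size_t num)
{
	long pos = 0;
	size_t i;

	for (i = 0; i < num; ++i) {
		pos = raw_match(data, len, pos, &items[i]);
		if (pos < 0)
			return false;
	}
	return true;
}
```

+
+With "foo" found at payload offset 10, "bar" is expected at offset 33
+(13 + 20) and "baz" at offset 7 (36 - 29), immediately before "foo".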
+
+``ETH``
+^^^^^^^
+
+Matches an Ethernet header.
+
+- ``dst``: destination MAC.
+- ``src``: source MAC.
+- ``type``: EtherType.
+
+``VLAN``
+^^^^^^^^
+
+Matches an 802.1Q/ad VLAN tag.
+
+- ``tpid``: tag protocol identifier.
+- ``tci``: tag control information.
+
+``IPV4``
+^^^^^^^^
+
+Matches an IPv4 header.
+
+Note: IPv4 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv4 header definition (``rte_ip.h``).
+
+``IPV6``
+^^^^^^^^
+
+Matches an IPv6 header.
+
+Note: IPv6 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv6 header definition (``rte_ip.h``).
+
+``ICMP``
+^^^^^^^^
+
+Matches an ICMP header.
+
+- ``hdr``: ICMP header definition (``rte_icmp.h``).
+
+``UDP``
+^^^^^^^
+
+Matches a UDP header.
+
+- ``hdr``: UDP header definition (``rte_udp.h``).
+
+``TCP``
+^^^^^^^
+
+Matches a TCP header.
+
+- ``hdr``: TCP header definition (``rte_tcp.h``).
+
+``SCTP``
+^^^^^^^^
+
+Matches an SCTP header.
+
+- ``hdr``: SCTP header definition (``rte_sctp.h``).
+
+``VXLAN``
+^^^^^^^^^
+
+Matches a VXLAN header (RFC 7348).
+
+- ``flags``: normally 0x08 (I flag).
+- ``rsvd0``: reserved, normally 0x000000.
+- ``vni``: VXLAN network identifier.
+- ``rsvd1``: reserved, normally 0x00.
+
+Actions
+~~~~~~~
+
+Each possible action is represented by a type. Some have associated
+configuration structures. Several actions combined in a list can be assigned
+to a flow rule. That list is not ordered.
+
+They fall in three categories:
+
+- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+  processing matched packets by subsequent flow rules, unless overridden
+  with PASSTHRU.
+
+- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
+  additional processing by subsequent flow rules.
+
+- Other non-terminating meta actions that do not affect the fate of packets
+  (END, VOID, MARK, FLAG, COUNT).
+
+When several actions are combined in a flow rule, they should all have
+different types (e.g. dropping a packet twice is not possible).
+
+Only the last action of a given type is taken into account. PMDs still
+perform error checking on the entire list.
+
+Like matching patterns, action lists are terminated by END items.
+
+*Note that PASSTHRU is the only action able to override a terminating rule.*
+
+Example of action that redirects packets to queue index 10:
+
++----------------+
+| QUEUE          |
++===========+====+
+| ``index`` | 10 |
++-----------+----+
+
+Action list examples. Their order is not significant; applications must
+consider all actions to be performed simultaneously:
+
++----------------+
+| Count and drop |
++================+
+| COUNT          |
++----------------+
+| DROP           |
++----------------+
+| END            |
++----------------+
+
+|
+
++--------------------------+
+| Mark, count and redirect |
++=======+===========+======+
+| MARK  | ``id``    | 0x2a |
++-------+-----------+------+
+| COUNT                    |
++-------+-----------+------+
+| QUEUE | ``index`` | 10   |
++-------+-----------+------+
+| END                      |
++--------------------------+
+
+|
+
++-----------------------+
+| Redirect to queue 5   |
++=======================+
+| DROP                  |
++-------+-----------+---+
+| QUEUE | ``index`` | 5 |
++-------+-----------+---+
+| END                   |
++-----------------------+
+
+In the above example, considering both actions are performed simultaneously,
+the end result is that only QUEUE has any effect.
+
++-----------------------+
+| Redirect to queue 3   |
++=======+===========+===+
+| QUEUE | ``index`` | 5 |
++-------+-----------+---+
+| VOID                  |
++-------+-----------+---+
+| QUEUE | ``index`` | 3 |
++-------+-----------+---+
+| END                   |
++-----------------------+
+
+As previously described, only the last action of a given type found in the
+list is taken into account. The above example also shows that VOID is
+ignored.
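+
+The "last action of a given type wins" rule can be sketched as a simple
+scan of the list. The ``enum action_type``, ``struct action`` and
+``effective_queue()`` names below are hypothetical simplifications made for
+this example; they do not reproduce the definitions of ``rte_flow.h``.
+

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types, a subset of those described here. */
enum action_type {
	ACTION_END = 0,
	ACTION_VOID,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_COUNT,
};

/* Hypothetical, simplified action with an inline QUEUE property. */
struct action {
	enum action_type type;
	uint16_t index; /* queue index, QUEUE only */
};

/*
 * Find the QUEUE action that takes effect in an END-terminated list,
 * i.e. the last one. Return true and store its index if there is one.
 */
bool
effective_queue(const struct action *actions, uint16_t *index)
{
	bool found = false;
	size_t i;

	for (i = 0; actions[i].type != ACTION_END; ++i) {
		if (actions[i].type != ACTION_QUEUE)
			continue; /* VOID and other types are skipped */
		*index = actions[i].index;
		found = true;
	}
	return found;
}
```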
+
+Action types
+~~~~~~~~~~~~
+
+Common action types are described in this section. Like pattern item types,
+this list is not exhaustive as new actions will be added in the future.
+
+``END`` (action)
+^^^^^^^^^^^^^^^^
+
+End marker for action lists. Prevents further processing of actions, thereby
+ending the list.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- No configurable properties.
+
++---------------+
+| END           |
++===============+
+| no properties |
++---------------+
+
+``VOID`` (action)
+^^^^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- No configurable properties.
+
++---------------+
+| VOID          |
++===============+
+| no properties |
++---------------+
+
+``PASSTHRU``
+^^^^^^^^^^^^
+
+Leaves packets up for additional processing by subsequent flow rules. This
+is the default when a rule does not contain a terminating action, but can be
+specified to force a rule to become non-terminating.
+
+- No configurable properties.
+
++---------------+
+| PASSTHRU      |
++===============+
+| no properties |
++---------------+
+
+Example to copy a packet to a queue and continue processing by subsequent
+flow rules:
+
++--------------------------+
+| Copy to queue 8          |
++==========================+
+| PASSTHRU                 |
++----------+-----------+---+
+| QUEUE    | ``index`` | 8 |
++----------+-----------+---+
+| END                      |
++--------------------------+
+
+``MARK``
+^^^^^^^^
+
+Attaches a 32 bit value to packets.
+
+This value is arbitrary and application-defined. For compatibility with FDIR
+it is returned in the ``hash.fdir.hi`` mbuf field. ``PKT_RX_FDIR_ID`` is
+also set in ``ol_flags``.
+
++----------------------------------------------+
+| MARK                                         |
++========+=====================================+
+| ``id`` | 32 bit value to return with packets |
++--------+-------------------------------------+
+
+``FLAG``
+^^^^^^^^
+
+Flag packets. Similar to `MARK`_ but only affects ``ol_flags``.
+
+- No configurable properties.
+
+Note: a distinctive flag must be defined for it.
+
++---------------+
+| FLAG          |
++===============+
+| no properties |
++---------------+
+
+``QUEUE``
+^^^^^^^^^
+
+Assigns packets to a given queue index.
+
+- Terminating by default.
+
++--------------------------------+
+| QUEUE                          |
++===========+====================+
+| ``index`` | queue index to use |
++-----------+--------------------+
+
+``DROP``
+^^^^^^^^
+
+Drop packets.
+
+- No configurable properties.
+- Terminating by default.
+- PASSTHRU overrides this action if both are specified.
+
++---------------+
+| DROP          |
++===============+
+| no properties |
++---------------+
+
+``COUNT``
+^^^^^^^^^
+
+Enables counters for this rule.
+
+These counters can be retrieved and reset through ``rte_flow_query()``, see
+``struct rte_flow_query_count``.
+
+- No configurable properties.
+
++---------------+
+| COUNT         |
++===============+
+| no properties |
++---------------+
+
+Query structure to retrieve and reset flow rule counters:
+
++---------------------------------------------------------+
+| COUNT query                                             |
++===============+=====+===================================+
+| ``reset``     | in  | reset counter after query         |
++---------------+-----+-----------------------------------+
+| ``hits_set``  | out | ``hits`` field is set             |
++---------------+-----+-----------------------------------+
+| ``bytes_set`` | out | ``bytes`` field is set            |
++---------------+-----+-----------------------------------+
+| ``hits``      | out | number of hits for this rule      |
++---------------+-----+-----------------------------------+
+| ``bytes``     | out | number of bytes through this rule |
++---------------+-----+-----------------------------------+
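
As a rough illustration of how an application might consume these fields, the sketch below declares a stand-in structure, ``struct count_query``, whose field names follow the table above. The field types and the helpers ``count_hits_or()`` and ``count_bytes_or()`` are assumptions made for this example; the real definition is ``struct rte_flow_query_count``.

```c
#include <stdint.h>

/* Stand-in for struct rte_flow_query_count; field types are assumed. */
struct count_query {
    uint32_t reset:1;     /* in: reset counter after query */
    uint32_t hits_set:1;  /* out: hits field is set */
    uint32_t bytes_set:1; /* out: bytes field is set */
    uint64_t hits;        /* out: number of hits for this rule */
    uint64_t bytes;       /* out: number of bytes through this rule */
};

/* Return the hit count if it was reported, the fallback otherwise. */
static uint64_t
count_hits_or(const struct count_query *q, uint64_t fallback)
{
    return q->hits_set ? q->hits : fallback;
}

/* Return the byte count if it was reported, the fallback otherwise. */
static uint64_t
count_bytes_or(const struct count_query *q, uint64_t fallback)
{
    return q->bytes_set ? q->bytes : fallback;
}
```

Checking ``hits_set`` and ``bytes_set`` before reading the values matters because a device may report only one of the two counters.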
+
+``DUP``
+^^^^^^^
+
+Duplicates packets to a given queue index.
+
+This is normally combined with QUEUE, however when used alone, it is
+actually similar to QUEUE + PASSTHRU.
+
+- Non-terminating by default.
+
++------------------------------------------------+
+| DUP                                            |
++===========+====================================+
+| ``index`` | queue index to duplicate packet to |
++-----------+------------------------------------+
+
+``RSS``
+^^^^^^^
+
+Similar to QUEUE, except RSS is additionally performed on packets to spread
+them among several queues according to the provided parameters.
+
+Note: RSS hash result is normally stored in the ``hash.rss`` mbuf field,
+however it conflicts with the `MARK`_ action as they share the same
+space. When both actions are specified, the RSS hash is discarded and
+``PKT_RX_RSS_HASH`` is not set in ``ol_flags``. MARK has priority. The mbuf
+structure should eventually evolve to store both.
+
+- Terminating by default.
+
++---------------------------------------------+
+| RSS                                         |
++==============+==============================+
+| ``rss_conf`` | RSS parameters               |
++--------------+------------------------------+
+| ``num``      | number of entries in queue[] |
++--------------+------------------------------+
+| ``queue[]``  | queue indices to use         |
++--------------+------------------------------+
+
+``PF`` (action)
+^^^^^^^^^^^^^^^
+
+Redirects packets to the physical function (PF) of the current device.
+
+- No configurable properties.
+- Terminating by default.
+
++---------------+
+| PF            |
++===============+
+| no properties |
++---------------+
+
+``VF`` (action)
+^^^^^^^^^^^^^^^
+
+Redirects packets to a virtual function (VF) of the current device.
+
+Packets matched by a VF pattern item can be redirected to their original VF
+ID instead of the specified one. This parameter may not be available and is
+not guaranteed to work properly if the VF part is matched by a prior flow
+rule or if packets are not addressed to a VF in the first place.
+
+- Terminating by default.
+
++-----------------------------------------------+
+| VF                                            |
++==============+================================+
+| ``original`` | use original VF ID if possible |
++--------------+--------------------------------+
+| ``vf``       | VF ID to redirect packets to   |
++--------------+--------------------------------+
+
+Negative types
+~~~~~~~~~~~~~~
+
+All specified pattern items (``enum rte_flow_item_type``) and actions
+(``enum rte_flow_action_type``) use positive identifiers.
+
+The negative space is reserved for dynamic types generated by PMDs during
+run-time. PMDs may encounter them as a result but must not accept negative
+identifiers they are not aware of.
+
+A method to generate them remains to be defined.
+
+Planned types
+~~~~~~~~~~~~~
+
+Pattern item types will be added as new protocols are implemented.
+
+Support for variable headers is planned through dedicated pattern items. For
+example, items matching specific IPv4 options and IPv6 extension headers
+would be stacked after IPv4/IPv6 items.
+
+Other action types are planned but are not defined yet. These include the
+ability to alter packet data in several ways, such as performing
+encapsulation/decapsulation of tunnel headers.
+
+Rules management
+----------------
+
+A rather simple API with few functions is provided to fully manage flow
+rules.
+
+Each created flow rule is associated with an opaque, PMD-specific handle
+pointer. The application is responsible for keeping it until the rule is
+destroyed.
+
+Flow rules are represented by ``struct rte_flow`` objects.
+
+Validation
+~~~~~~~~~~
+
+Given that expressing a definite set of device capabilities is not
+practical, a dedicated function is provided to check if a flow rule is
+supported and can be created.
+
+::
+
+ int
+ rte_flow_validate(uint8_t port_id,
+                   const struct rte_flow_attr *attr,
+                   const struct rte_flow_item pattern[],
+                   const struct rte_flow_action actions[],
+                   struct rte_flow_error *error);
+
+While this function has no effect on the target device, the flow rule is
+validated against its current configuration state and the returned value
+should be considered valid by the caller for that state only.
+
+The returned value is guaranteed to remain valid only as long as no
+successful calls to ``rte_flow_create()`` or ``rte_flow_destroy()`` are made
+in the meantime and no device parameter affecting flow rules in any way is
+modified, due to possible collisions or resource limitations (although in
+such cases ``EINVAL`` should not be returned).
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL.
+
+Return values:
+
+- 0 if flow rule is valid and can be created, a negative errno value
+  otherwise (``rte_errno`` is also set). The following errors are defined:
+- ``-ENOSYS``: underlying device does not support this functionality.
+- ``-EINVAL``: unknown or invalid rule specification.
+- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial
+  bit-masks are unsupported).
+- ``-EEXIST``: collision with an existing rule.
+- ``-ENOMEM``: not enough resources.
+- ``-EBUSY``: action cannot be performed due to busy device resources, may
+  succeed if the affected queues or even the entire port are in a stopped
+  state (see ``rte_eth_dev_rx_queue_stop()`` and ``rte_eth_dev_stop()``).
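
The return values listed above could be turned into log-friendly text along these lines. ``flow_validate_reason()`` is a hypothetical helper, not part of the API; it only restates the list and relies on the standard ``errno.h`` constants.

```c
#include <errno.h>

/* Hypothetical helper: describe a value returned by rte_flow_validate(). */
static const char *
flow_validate_reason(int ret)
{
    switch (-ret) {
    case 0:
        return "rule is valid and can be created";
    case ENOSYS:
        return "device does not support this functionality";
    case EINVAL:
        return "unknown or invalid rule specification";
    case ENOTSUP:
        return "valid but unsupported rule specification";
    case EEXIST:
        return "collision with an existing rule";
    case ENOMEM:
        return "not enough resources";
    case EBUSY:
        return "device resources busy, stopping queues or port may help";
    default:
        return "error not listed for rte_flow_validate()";
    }
}
```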
+
+Creation
+~~~~~~~~
+
+Creating a flow rule is similar to validating one, except the rule is
+actually created and a handle returned.
+
+::
+
+ struct rte_flow *
+ rte_flow_create(uint8_t port_id,
+                 const struct rte_flow_attr *attr,
+                 const struct rte_flow_item pattern[],
+                 const struct rte_flow_action actions[],
+                 struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL.
+
+Return values:
+
+A valid handle in case of success, NULL otherwise and ``rte_errno`` is set
+to the positive version of one of the error codes defined for
+``rte_flow_validate()``.
+
+Destruction
+~~~~~~~~~~~
+
+Flow rule destruction is not automatic, and a queue or a port should not be
+released while flow rules are still attached to it. Applications must take
+care of destroying them before releasing resources.
+
+::
+
+ int
+ rte_flow_destroy(uint8_t port_id,
+                  struct rte_flow *flow,
+                  struct rte_flow_error *error);
+
+
+Failure to destroy a flow rule handle may occur when other flow rules depend
+on it, and destroying it would result in an inconsistent state.
+
+This function is only guaranteed to succeed if handles are destroyed in
+reverse order of their creation.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to destroy.
+- ``error``: perform verbose error reporting if not NULL.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
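
Since destruction is only guaranteed to succeed in reverse order of creation, an application may want to keep its handles on a stack. The following is a minimal sketch under that assumption; ``struct rule_stack``, ``rule_stack_push()``, ``rule_stack_pop()`` and ``MAX_RULES`` are hypothetical names, and handles are stored as ``void *`` so that the example compiles without DPDK headers.

```c
#include <stddef.h>

#define MAX_RULES 64 /* arbitrary capacity chosen for this sketch */

/* Hypothetical per-port bookkeeping: handles kept in creation order. */
struct rule_stack {
    void *handle[MAX_RULES]; /* struct rte_flow * in a real application */
    size_t count;            /* number of handles currently stored */
};

/* Remember a newly created handle; returns -1 when the stack is full. */
static int
rule_stack_push(struct rule_stack *s, void *handle)
{
    if (s->count == MAX_RULES)
        return -1;
    s->handle[s->count++] = handle;
    return 0;
}

/* Return the most recently created handle, or NULL when none is left. */
static void *
rule_stack_pop(struct rule_stack *s)
{
    if (s->count == 0)
        return NULL;
    return s->handle[--s->count];
}
```

An application would then pass each popped handle to ``rte_flow_destroy()`` until ``rule_stack_pop()`` returns NULL.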
+
+Flush
+~~~~~
+
+Convenience function to destroy all flow rule handles associated with a
+port. They are released as with successive calls to ``rte_flow_destroy()``.
+
+::
+
+ int
+ rte_flow_flush(uint8_t port_id,
+                struct rte_flow_error *error);
+
+In the unlikely event of failure, handles are still considered destroyed and
+no longer valid but the port must be assumed to be in an inconsistent state.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``error``: perform verbose error reporting if not NULL.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Query
+~~~~~
+
+Query an existing flow rule.
+
+This function allows retrieving flow-specific data such as counters. Data
+is gathered by special actions which must be present in the flow rule
+definition.
+
+::
+
+ int
+ rte_flow_query(uint8_t port_id,
+                struct rte_flow *flow,
+                enum rte_flow_action_type action,
+                void *data,
+                struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to query.
+- ``action``: action type to query.
+- ``data``: pointer to storage for the associated query data type.
+- ``error``: perform verbose error reporting if not NULL.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Verbose error reporting
+-----------------------
+
+The defined *errno* values may not be accurate enough for users or
+application developers who want to investigate issues related to flow rules
+management. A dedicated error object is defined for this purpose::
+
+ enum rte_flow_error_type {
+     RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+     RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+     RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+     RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+     RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+     RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+     RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+     RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+     RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+     RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+     RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+     RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+ };
+
+ struct rte_flow_error {
+     enum rte_flow_error_type type; /**< Cause field and error types. */
+     const void *cause; /**< Object responsible for the error. */
+     const char *message; /**< Human-readable error message. */
+ };
+
+Error type ``RTE_FLOW_ERROR_TYPE_NONE`` stands for no error, in which case
+remaining fields can be ignored. Other error types describe the type of the
+object pointed by ``cause``.
+
+If non-NULL, ``cause`` points to the object responsible for the error. For a
+flow rule, this may be a pattern item or an individual action.
+
+If non-NULL, ``message`` provides a human-readable error message.
+
+This object is normally allocated by applications and set by PMDs. The
+message points to a constant string which does not need to be freed by the
+application, however its pointer can be considered valid only as long as its
+associated DPDK port remains configured. Closing the underlying device or
+unloading the PMD invalidates it.
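
To show how an application might report such an error, the sketch below formats one into a caller-provided buffer. The enum and structure are local copies of the definitions above so that the example builds on its own (a real application would include the public header instead); ``flow_error_type_name()`` and ``flow_error_format()`` are hypothetical helpers.

```c
#include <stdio.h>

/* Local copies of the definitions above, for a self-contained sketch. */
enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
    RTE_FLOW_ERROR_TYPE_HANDLE,
    RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
    RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
    RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR,
    RTE_FLOW_ERROR_TYPE_ITEM_NUM,
    RTE_FLOW_ERROR_TYPE_ITEM,
    RTE_FLOW_ERROR_TYPE_ACTION_NUM,
    RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    const void *cause;
    const char *message;
};

/* Hypothetical helper: describe the kind of object designated by type. */
static const char *
flow_error_type_name(enum rte_flow_error_type type)
{
    static const char *const names[] = {
        [RTE_FLOW_ERROR_TYPE_NONE] = "no error",
        [RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
        [RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
        [RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
        [RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
        [RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
        [RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
        [RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
        [RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
        [RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
        [RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
        [RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
    };

    if ((unsigned int)type >= sizeof(names) / sizeof(names[0]))
        return "unknown";
    return names[type];
}

/* Write "<type>: <message>" into buf; message is optional (may be NULL). */
static int
flow_error_format(const struct rte_flow_error *e, char *buf, size_t len)
{
    return snprintf(buf, len, "%s: %s", flow_error_type_name(e->type),
                    e->message ? e->message : "(no message)");
}
```

The ``cause`` pointer is deliberately left out of the output, since interpreting it depends on ``type`` and on the arrays the application passed in.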
+
+Caveats
+-------
+
+- DPDK does not keep track of flow rules definitions or flow rule objects
+  automatically. Applications may keep track of the former and must keep
+  track of the latter. PMDs may also do it for internal needs, however this
+  must not be relied on by applications.
+
+- Flow rules are not maintained between successive port initializations. An
+  application exiting without releasing them and restarting must re-create
+  them from scratch.
+
+- API operations are synchronous and blocking (``EAGAIN`` cannot be
+  returned).
+
+- There is no provision for reentrancy/multi-thread safety, although nothing
+  should prevent different devices from being configured at the same
+  time. PMDs may protect their control path functions accordingly.
+
+- Stopping the data path (TX/RX) should not be necessary when managing flow
+  rules. If this cannot be achieved naturally or with workarounds (such as
+  temporarily replacing the burst function pointers), an appropriate error
+  code must be returned (``EBUSY``).
+
+- PMDs, not applications, are responsible for maintaining flow rules
+  configuration when stopping and restarting a port or performing other
+  actions which may affect them. They can only be destroyed explicitly by
+  applications.
+
+For devices exposing multiple ports sharing global settings affected by flow
+rules:
+
+- All ports under DPDK control must behave consistently. PMDs are
+  responsible for making sure that existing flow rules on a port are not
+  affected by other ports.
+
+- Ports not under DPDK control (unaffected or handled by other applications)
+  are the user's responsibility. They may affect existing flow rules and cause
+  undefined behavior. PMDs aware of this may prevent flow rules creation
+  altogether in such cases.
+
+PMD interface
+-------------
+
+The PMD interface is defined in ``rte_flow_driver.h``. It is not subject to
+API/ABI versioning constraints as it is not exposed to applications and may
+evolve independently.
+
+It is currently implemented on top of the legacy filtering framework through
+filter type *RTE_ETH_FILTER_GENERIC* that accepts the single operation
+*RTE_ETH_FILTER_GET* to return PMD-specific *rte_flow* callbacks wrapped
+inside ``struct rte_flow_ops``.
+
+This overhead is temporarily necessary in order to keep compatibility with
+the legacy filtering framework, which should eventually disappear.
+
+- PMD callbacks implement exactly the interface described in `Rules
+  management`_, except for the port ID argument which has already been
+  converted to a pointer to the underlying ``struct rte_eth_dev``.
+
+- Public API functions do not process flow rules definitions at all before
+  calling PMD functions (no basic error checking, no validation
+  whatsoever). They only make sure these callbacks are non-NULL or return
+  the ``ENOSYS`` (function not supported) error.
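
The dispatch behavior described in the last item can be pictured as follows. This is only a sketch: ``struct dev``, ``struct flow`` and ``struct flow_ops`` are stand-ins for ``struct rte_eth_dev``, ``struct rte_flow`` and ``struct rte_flow_ops``, reduced to a single callback, and the error object and ``rte_errno`` handling are omitted.

```c
#include <errno.h>
#include <stddef.h>

struct dev;  /* stand-in for struct rte_eth_dev */
struct flow; /* stand-in for struct rte_flow */

/* Stand-in for struct rte_flow_ops, reduced to one callback. */
struct flow_ops {
    int (*destroy)(struct dev *dev, struct flow *flow);
};

/*
 * Public-side wrapper: the rule is not inspected at all, the only check
 * is that the PMD provides the callback.
 */
static int
flow_destroy(struct dev *dev, const struct flow_ops *ops, struct flow *flow)
{
    if (ops == NULL || ops->destroy == NULL)
        return -ENOSYS;
    return ops->destroy(dev, flow);
}
```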
+
+This interface additionally defines the following helper functions:
+
+- ``rte_flow_ops_get()``: get generic flow operations structure from a
+  port.
+
+- ``rte_flow_error_set()``: initialize generic flow error structure.
+
+More will be added over time.
+
+Device compatibility
+--------------------
+
+No known implementation supports all the described features.
+
+Unsupported features or combinations are not expected to be fully emulated
+in software by PMDs for performance reasons. Partially supported features
+may be completed in software as long as hardware performs most of the work
+(such as queue redirection and packet recognition).
+
+However PMDs are expected to do their best to satisfy application requests
+by working around hardware limitations as long as doing so does not affect
+the behavior of existing flow rules.
+
+The following sections provide a few examples of such cases and describe how
+PMDs should handle them. They are based on limitations built into the
+previous APIs.
+
+Global bit-masks
+~~~~~~~~~~~~~~~~
+
+Each flow rule comes with its own, per-layer bit-masks, while hardware may
+support only a single, device-wide bit-mask for a given layer type, so that
+two IPv4 rules cannot use different bit-masks.
+
+The expected behavior in this case is that PMDs automatically configure
+global bit-masks according to the needs of the first flow rule created.
+
+Subsequent rules are allowed only if their bit-masks match those; the
+``EEXIST`` error code should be returned otherwise.
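
One way a PMD could track such a device-wide bit-mask is sketched below, assuming a single 32-bit mask per layer and a reference count of the rules relying on it. ``struct global_mask``, ``global_mask_acquire()`` and ``global_mask_release()`` are hypothetical names, not part of any PMD.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-layer state for hardware with one device-wide mask. */
struct global_mask {
    uint32_t mask;     /* mask currently programmed in hardware */
    unsigned int refs; /* number of flow rules relying on it */
};

/*
 * Called on rule creation: the first rule decides the mask, later rules
 * must use the same one or are rejected with -EEXIST.
 */
static int
global_mask_acquire(struct global_mask *g, uint32_t rule_mask)
{
    if (g->refs == 0)
        g->mask = rule_mask;
    else if (g->mask != rule_mask)
        return -EEXIST;
    g->refs++;
    return 0;
}

/* Called on rule destruction: at zero references the mask can change. */
static void
global_mask_release(struct global_mask *g)
{
    if (g->refs > 0)
        g->refs--;
}
```

With this bookkeeping the mask becomes configurable again once every rule using it has been destroyed.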
+
+Unsupported layer types
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Many protocols can be simulated by crafting patterns with the `RAW`_ type.
+
+PMDs can rely on this capability to simulate support for protocols with
+headers not directly recognized by hardware.
+
+``ANY`` pattern item
+~~~~~~~~~~~~~~~~~~~~
+
+This pattern item stands for anything, which can be difficult to translate
+to something hardware would understand, particularly if followed by more
+specific types.
+
+Consider the following pattern:
+
++---+-------------------------+
+| 0 | ETHER                   |
++---+-------+---------+-------+
+| 1 | ANY   | ``num`` | ``1`` |
++---+-------+---------+-------+
+| 2 | TCP                     |
++---+-------------------------+
+| 3 | END                     |
++---+-------------------------+
+
+Knowing that TCP does not make sense with something other than IPv4 and IPv6
+as L3, such a pattern may be translated to two flow rules instead:
+
++---+--------------------+
+| 0 | ETHER              |
++---+--------------------+
+| 1 | IPV4 (zeroed mask) |
++---+--------------------+
+| 2 | TCP                |
++---+--------------------+
+| 3 | END                |
++---+--------------------+
+
+..
+
++---+--------------------+
+| 0 | ETHER              |
++---+--------------------+
+| 1 | IPV6 (zeroed mask) |
++---+--------------------+
+| 2 | TCP                |
++---+--------------------+
+| 3 | END                |
++---+--------------------+
+
+Note that as soon as an ANY rule covers several layers, this approach may
+yield a large number of hidden flow rules. It is thus suggested to only
+support the most common scenarios (anything as L2 and/or L3).
+
+Unsupported actions
+~~~~~~~~~~~~~~~~~~~
+
+- When combined with a `QUEUE`_ action, packet counting (`COUNT`_) and
+  tagging (`MARK`_ or `FLAG`_) may be implemented in software as long as the
+  target queue is used by a single rule.
+
+- A rule specifying both `DUP`_ + `QUEUE`_ may be translated to two hidden
+  rules combining `QUEUE`_ and `PASSTHRU`_.
+
+- When a single target queue is provided, `RSS`_ can also be implemented
+  through `QUEUE`_.
+
+Flow rules priority
+~~~~~~~~~~~~~~~~~~~
+
+While it would naturally make sense, flow rules cannot be assumed to be
+processed by hardware in the same order as their creation for several
+reasons:
+
+- They may be managed internally as a tree or a hash table instead of a
+  list.
+- Removing a flow rule before adding another one can either put the new rule
+  at the end of the list or reuse a freed entry.
+- Duplication may occur when packets are matched by several rules.
+
+For overlapping rules (particularly in order to use the `PASSTHRU`_ action)
+predictable behavior is only guaranteed by using different priority levels.
+
+Priority levels are not necessarily implemented in hardware, or may be
+severely limited (e.g. a single priority bit).
+
+For these reasons, priority levels may be implemented purely in software by
+PMDs.
+
+- For devices expecting flow rules to be added in the correct order, PMDs
+  may destroy and re-create existing rules after adding a new one with
+  a higher priority.
+
+- A configurable number of dummy or empty rules can be created at
+  initialization time to save high priority slots for later.
+
+- In order to save priority levels, PMDs may evaluate whether rules are
+  likely to collide and adjust their priority accordingly.
+
+Future evolutions
+-----------------
+
+- A device profile selection function which could be used to force a
+  permanent profile instead of relying on its automatic configuration based
+  on existing flow rules.
+
+- A method to optimize *rte_flow* rules with specific pattern items and
+  action types generated on the fly by PMDs. DPDK should assign negative
+  numbers to these in order to not collide with the existing types. See
+  `Negative types`_.
+
+- Adding specific egress pattern items and actions as described in `Traffic
+  direction`_.
+
+- Optional software fallback when PMDs are unable to handle requested flow
+  rules so applications do not have to implement their own.
+
+API migration
+-------------
+
+Exhaustive list of deprecated filter types (normally prefixed with
+*RTE_ETH_FILTER_*) found in ``rte_eth_ctrl.h`` and methods to convert them
+to *rte_flow* rules.
+
+``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*MACVLAN* can be translated to a basic `ETH`_ flow rule with a `VF
+(action)`_ or `PF (action)`_ terminating action.
+
++------------------------------------+
+| MACVLAN                            |
++--------------------------+---------+
+| Pattern                  | Actions |
++===+=====+==========+=====+=========+
+| 0 | ETH | ``spec`` | any | VF,     |
+|   |     +----------+-----+ PF      |
+|   |     | ``last`` | N/A |         |
+|   |     +----------+-----+         |
+|   |     | ``mask`` | any |         |
++---+-----+----------+-----+---------+
+| 1 | END                  | END     |
++---+----------------------+---------+
+
+``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*ETHERTYPE* is basically an `ETH`_ flow rule with `QUEUE`_ or `DROP`_ as a
+terminating action.
+
++------------------------------------+
+| ETHERTYPE                          |
++--------------------------+---------+
+| Pattern                  | Actions |
++===+=====+==========+=====+=========+
+| 0 | ETH | ``spec`` | any | QUEUE,  |
+|   |     +----------+-----+ DROP    |
+|   |     | ``last`` | N/A |         |
+|   |     +----------+-----+         |
+|   |     | ``mask`` | any |         |
++---+-----+----------+-----+---------+
+| 1 | END                  | END     |
++---+----------------------+---------+
+
+``FLEXIBLE`` to ``RAW`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FLEXIBLE* can be translated to one `RAW`_ pattern with `QUEUE`_ as the
+terminating action and a defined priority level.
+
++------------------------------------+
+| FLEXIBLE                           |
++--------------------------+---------+
+| Pattern                  | Actions |
++===+=====+==========+=====+=========+
+| 0 | RAW | ``spec`` | any | QUEUE   |
+|   |     +----------+-----+         |
+|   |     | ``last`` | N/A |         |
+|   |     +----------+-----+         |
+|   |     | ``mask`` | any |         |
++---+-----+----------+-----+---------+
+| 1 | END                  | END     |
++---+----------------------+---------+
+
+``SYN`` to ``TCP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*SYN* is a `TCP`_ rule with only the ``syn`` bit enabled and masked, and
+`QUEUE`_ as the terminating action.
+
+Priority level can be set to simulate the high priority bit.
+
++---------------------------------------------+
+| SYN                                         |
++-----------------------------------+---------+
+| Pattern                           | Actions |
++===+======+==========+=============+=========+
+| 0 | ETH  | ``spec`` | unset       | QUEUE   |
+|   |      +----------+-------------+         |
+|   |      | ``last`` | unset       |         |
+|   |      +----------+-------------+         |
+|   |      | ``mask`` | unset       |         |
++---+------+----------+-------------+         |
+| 1 | IPV4 | ``spec`` | unset       |         |
+|   |      +----------+-------------+         |
+|   |      | ``last`` | unset       |         |
+|   |      +----------+-------------+         |
+|   |      | ``mask`` | unset       |         |
++---+------+----------+---------+---+         |
+| 2 | TCP  | ``spec`` | ``syn`` | 1 |         |
+|   |      +----------+---------+---+         |
+|   |      | ``mask`` | ``syn`` | 1 |         |
++---+------+----------+---------+---+---------+
+| 3 | END                           | END     |
++---+-------------------------------+---------+
+
+``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*NTUPLE* is similar to specifying an empty L2, `IPV4`_ as L3 with `TCP`_ or
+`UDP`_ as L4 and `QUEUE`_ as the terminating action.
+
+A priority level can be specified as well.
+
++---------------------------------------+
+| NTUPLE                                |
++-----------------------------+---------+
+| Pattern                     | Actions |
++===+======+==========+=======+=========+
+| 0 | ETH  | ``spec`` | unset | QUEUE   |
+|   |      +----------+-------+         |
+|   |      | ``last`` | unset |         |
+|   |      +----------+-------+         |
+|   |      | ``mask`` | unset |         |
++---+------+----------+-------+         |
+| 1 | IPV4 | ``spec`` | any   |         |
+|   |      +----------+-------+         |
+|   |      | ``last`` | unset |         |
+|   |      +----------+-------+         |
+|   |      | ``mask`` | any   |         |
++---+------+----------+-------+         |
+| 2 | TCP, | ``spec`` | any   |         |
+|   | UDP  +----------+-------+         |
+|   |      | ``last`` | unset |         |
+|   |      +----------+-------+         |
+|   |      | ``mask`` | any   |         |
++---+------+----------+-------+---------+
+| 3 | END                     | END     |
++---+-------------------------+---------+
+
+``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*TUNNEL* matches common IPv4 and IPv6 L3/L4-based tunnel types.
+
+In the following table, `ANY`_ is used to cover the optional L4.
+
++------------------------------------------------+
+| TUNNEL                                         |
++--------------------------------------+---------+
+| Pattern                              | Actions |
++===+=========+==========+=============+=========+
+| 0 | ETH     | ``spec`` | any         | QUEUE   |
+|   |         +----------+-------------+         |
+|   |         | ``last`` | unset       |         |
+|   |         +----------+-------------+         |
+|   |         | ``mask`` | any         |         |
++---+---------+----------+-------------+         |
+| 1 | IPV4,   | ``spec`` | any         |         |
+|   | IPV6    +----------+-------------+         |
+|   |         | ``last`` | unset       |         |
+|   |         +----------+-------------+         |
+|   |         | ``mask`` | any         |         |
++---+---------+----------+-------------+         |
+| 2 | ANY     | ``spec`` | any         |         |
+|   |         +----------+-------------+         |
+|   |         | ``last`` | unset       |         |
+|   |         +----------+---------+---+         |
+|   |         | ``mask`` | ``num`` | 0 |         |
++---+---------+----------+---------+---+         |
+| 3 | VXLAN,  | ``spec`` | any         |         |
+|   | GENEVE, +----------+-------------+         |
+|   | TEREDO, | ``last`` | unset       |         |
+|   | NVGRE,  +----------+-------------+         |
+|   | GRE,    | ``mask`` | any         |         |
+|   | ...     |          |             |         |
+|   |         |          |             |         |
+|   |         |          |             |         |
++---+---------+----------+-------------+---------+
+| 4 | END                              | END     |
++---+----------------------------------+---------+
+
+``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FDIR* is more complex than any other type and there are several methods to
+emulate its functionality. It is summarized for the most part in the table
+below.
+
+A few features are intentionally not supported:
+
+- The ability to configure the matching input set and masks for the entire
+  device, PMDs should take care of it automatically according to the
+  requested flow rules.
+
+  For example if a device supports only one bit-mask per protocol type,
+  source/destination IPv4 address bit-masks can be made immutable by the
+  first created
+  rule. Subsequent IPv4 or TCPv4 rules can only be created if they are
+  compatible.
+
+  Note that only protocol bit-masks affected by existing flow rules are
+  immutable, others can be changed later. They become mutable again after
+  the related flow rules are destroyed.
+
+- Returning four or eight bytes of matched data when using flex bytes
+  filtering. Although a specific action could implement it, it conflicts
+  with the much more useful 32 bits tagging on devices that support it.
+
+- Side effects on RSS processing of the entire device. Flow rules that
+  conflict with the current device configuration should not be
+  allowed. Similarly, device configuration should not be allowed when it
+  affects existing flow rules.
+
+- Device modes of operation. "none" is unsupported since filtering cannot be
+  disabled as long as a flow rule is present.
+
+- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
+  according to the created flow rules.
+
+- Signature mode of operation is not defined but could be handled through a
+  specific item type if needed.
+
++----------------------------------------------+
+| FDIR                                         |
++---------------------------------+------------+
+| Pattern                         | Actions    |
++===+============+==========+=====+============+
+| 0 | ETH,       | ``spec`` | any | QUEUE,     |
+|   | RAW        +----------+-----+ DROP,      |
+|   |            | ``last`` | N/A | PASSTHRU   |
+|   |            +----------+-----+            |
+|   |            | ``mask`` | any |            |
++---+------------+----------+-----+------------+
+| 1 | IPV4,      | ``spec`` | any | MARK       |
+|   | IPV6       +----------+-----+            |
+|   |            | ``last`` | N/A |            |
+|   |            +----------+-----+            |
+|   |            | ``mask`` | any |            |
++---+------------+----------+-----+            |
+| 2 | TCP,       | ``spec`` | any |            |
+|   | UDP,       +----------+-----+            |
+|   | SCTP       | ``last`` | N/A |            |
+|   |            +----------+-----+            |
+|   |            | ``mask`` | any |            |
++---+------------+----------+-----+            |
+| 3 | VF,        | ``spec`` | any |            |
+|   | PF         +----------+-----+            |
+|   | (optional) | ``last`` | N/A |            |
+|   |            +----------+-----+            |
+|   |            | ``mask`` | any |            |
++---+------------+----------+-----+------------+
+| 4 | END                         | END        |
++---+-----------------------------+------------+
+
+
+``HASH``
+~~~~~~~~
+
+There is no counterpart to this filter type because it translates to a
+global device setting instead of a pattern item. Device settings are
+automatically set according to the created flow rules.
+
+``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All packets are matched. This type alters incoming packets to encapsulate
+them in a chosen tunnel type and optionally redirects them to a VF as well.
+
+The destination pool for tag based forwarding can be emulated with other
+flow rules using `DUP`_ as the action.
+
++----------------------------------------+
+| L2_TUNNEL                              |
++---------------------------+------------+
+| Pattern                   | Actions    |
++===+======+==========+=====+============+
+| 0 | VOID | ``spec`` | N/A | VXLAN,     |
+|   |      |          |     | GENEVE,    |
+|   |      |          |     | ...        |
+|   |      +----------+-----+------------+
+|   |      | ``last`` | N/A | VF         |
+|   |      +----------+-----+ (optional) |
+|   |      | ``mask`` | N/A |            |
+|   |      |          |     |            |
++---+------+----------+-----+------------+
+| 1 | END                   | END        |
++---+-----------------------+------------+
-- 
2.1.4


* [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy filter types
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API Adrien Mazarguil
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-19 10:47         ` Mcnamara, John
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
                         ` (24 subsequent siblings)
  27 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

They are superseded by the generic flow API (rte_flow). Target release is
not defined yet.

Suggested-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2d17bc6..4819078 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -71,3 +71,10 @@ Deprecation Notices
 * mempool: The functions for single/multi producer/consumer are deprecated
   and will be removed in 17.02.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
+
+* ethdev: the legacy filter API, including rte_eth_dev_filter_supported(),
+  rte_eth_dev_filter_ctrl() as well as filter types MACVLAN, ETHERTYPE,
+  FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR, HASH and L2_TUNNEL, is superseded by
+  the generic flow API (rte_flow) in PMDs that implement the latter.
+  Target release for removal of the legacy API will be defined once most
+  PMDs have switched to rte_flow.
-- 
2.1.4

* [dpdk-dev] [PATCH v2 04/25] cmdline: add support for dynamic tokens
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (2 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 05/25] cmdline: add alignment constraint Adrien Mazarguil
                         ` (23 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot currently be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.
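
As a rough illustration of the callback contract described above, here is a
minimal standalone sketch. It does not use librte_cmdline: struct token,
MAX_DYN_TOKENS, pool[], next_token() and resolve_all() are hypothetical
stand-ins for cmdline_parse_token_hdr_t, CMDLINE_PARSE_DYNAMIC_TOKENS and a
real f() callback. It only shows how a callback can derive the current index
from the slot pointer it is given, and how storing NULL ends the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the librte_cmdline types. */
#define MAX_DYN_TOKENS 8

struct token {
	const char *name;
};

/* Static pool the callback hands tokens out from. */
static struct token pool[] = {
	{ "flow" }, { "create" }, { "port" },
};

/*
 * Fill *slot with the next token, or NULL to end the list.
 * slot points inside *list, so the index is a pointer difference.
 */
static void
next_token(struct token **slot, struct token *(*list)[MAX_DYN_TOKENS])
{
	ptrdiff_t index = slot - &(*list)[0];

	assert(index >= 0 && index < MAX_DYN_TOKENS);
	if ((size_t)index < sizeof(pool) / sizeof(pool[0]))
		*slot = &pool[index];
	else
		*slot = NULL;
}

/* Resolve tokens lazily until the callback ends the list; return count. */
static unsigned int
resolve_all(struct token *(*list)[MAX_DYN_TOKENS])
{
	unsigned int n = 0;

	while (n < MAX_DYN_TOKENS - 1) {
		if (!(*list)[n])
			next_token(&(*list)[n], list);
		if (!(*list)[n])
			break;
		++n;
	}
	return n;
}
```

The list must be zeroed before the first lookup, as the patch does with
memset(), so that unresolved slots can be told apart from resolved ones.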

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 lib/librte_cmdline/cmdline_parse.c | 60 +++++++++++++++++++++++++++++----
 lib/librte_cmdline/cmdline_parse.h | 21 ++++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+	   cmdline_parse_token_hdr_t
+		*(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 	struct cmdline_token_hdr token_hdr;
 
 	token_p = inst->tokens[token_num];
+	if (!token_p && dyn_tokens && inst->f) {
+		if (!(*dyn_tokens)[0])
+			inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+		token_p = (*dyn_tokens)[0];
+	}
 	if (token_p)
 		memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		buf += n;
 
 		token_num ++;
-		token_p = inst->tokens[token_num];
+		if (!inst->tokens[0]) {
+			if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!(*dyn_tokens)[token_num])
+					inst->f(&(*dyn_tokens)[token_num],
+						NULL,
+						dyn_tokens);
+				token_p = (*dyn_tokens)[token_num];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[token_num];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 	}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
 	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		return CMDLINE_PARSE_BAD_ARGS;
 
 	ctx = cl->ctx;
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/*
 	 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 	cmdline_parse_token_hdr_t *token_p;
 	struct cmdline_token_hdr token_hdr;
 	char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	unsigned int partial_tok_len;
 	int comp_len = -1;
 	int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 
 	debug_printf("%s called\n", __func__);
 	memset(&token_hdr, 0, sizeof(token_hdr));
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/* count the number of complete token to parse */
 	for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+			if (nb_token &&
+			    match_inst(inst, buf, nb_token, NULL, 0,
+				       &dyn_tokens))
 				goto next;
 
 			debug_printf("instruction match\n");
-			token_p = inst->tokens[nb_token];
+			if (!inst->tokens[0]) {
+				if (nb_token <
+				    (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+					if (!dyn_tokens[nb_token])
+						inst->f(&dyn_tokens[nb_token],
+							NULL,
+							&dyn_tokens);
+					token_p = dyn_tokens[nb_token];
+				} else
+					token_p = NULL;
+			} else
+				token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -490,10 +525,21 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+		if (nb_token &&
+		    match_inst(inst, buf, nb_token, NULL, 0, &dyn_tokens))
 			goto next2;
 
-		token_p = inst->tokens[nb_token];
+		if (!inst->tokens[0]) {
+			if (nb_token < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!dyn_tokens[nb_token])
+					inst->f(&dyn_tokens[nb_token],
+						NULL,
+						&dyn_tokens);
+				token_p = dyn_tokens[nb_token];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[nb_token];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index 4ac05d6..65b18d4 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -83,6 +83,9 @@ extern "C" {
 /* maximum buffer size for parsed result */
 #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
 
+/* maximum number of dynamic tokens */
+#define CMDLINE_PARSE_DYNAMIC_TOKENS 128
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -130,6 +133,24 @@ struct cmdline;
  * Store a instruction, which is a pointer to a callback function and
  * its parameter that is called when the instruction is parsed, a help
  * string, and a list of token composing this instruction.
+ *
+ * When no tokens are defined (tokens[0] == NULL), they are retrieved
+ * dynamically by calling f() as follows:
+ *
+ *  f((struct cmdline_token_hdr **)&token_hdr,
+ *    NULL,
+ *    (struct cmdline_token_hdr *[])tokens);
+ *
+ * The address of the resulting token is expected at the location pointed by
+ * the first argument. Can be set to NULL to end the list.
+ *
+ * The cmdline argument (struct cmdline *) is always NULL.
+ *
+ * The last argument points to the NULL-terminated list of dynamic tokens
+ * defined so far. Since token_hdr points to an index of that list, the
+ * current index can be derived as follows:
+ *
+ *  int index = token_hdr - &(*tokens)[0];
  */
 struct cmdline_inst {
 	/* f(parsed_struct, data) */
-- 
2.1.4

* [dpdk-dev] [PATCH v2 05/25] cmdline: add alignment constraint
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (3 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
                         ` (22 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

This prevents SIGBUS errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.
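
The alignment trick can be tried in isolation with the following sketch. The
names union result_storage, RESULT_BUFSIZE, is_aligned() and
result_buf_is_aligned() are made up for this example, and C11 _Alignof is
assumed to be available; only the union layout itself mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RESULT_BUFSIZE 64 /* hypothetical size, smaller than the real one */

/* The most strictly aligned member dictates the union's alignment. */
union result_storage {
	char buf[RESULT_BUFSIZE];
	long double align; /* never read; only forces strong alignment */
};

/* Return nonzero when ptr is a multiple of align bytes. */
static int
is_aligned(const void *ptr, size_t align)
{
	assert(align != 0);
	return ((uintptr_t)ptr % align) == 0;
}

/* Check that buf starts on a boundary suitable for long double. */
static int
result_buf_is_aligned(const union result_storage *res)
{
	return is_aligned(res->buf, _Alignof(long double));
}
```

A plain char array only guarantees byte alignment, so a parser that stores a
wider type at the start of it may fault on strict-alignment targets; wrapping
the array in a union lifts that restriction without changing its size in
bytes available to callers.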

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 lib/librte_cmdline/cmdline_parse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	union {
+		char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+		long double align; /* strong alignment constraint for buf */
+	} result;
 	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+		tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 
 	/* call func */
 	if (f) {
-		f(result_buf, cl, data);
+		f(result.buf, cl, data);
 	}
 
 	/* no match */
-- 
2.1.4

* [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (4 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 05/25] cmdline: add alignment constraint Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-19  8:37         ` Xing, Beilei
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 07/25] app/testpmd: add flow command Adrien Mazarguil
                         ` (21 subsequent siblings)
  27 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.
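
For readers unfamiliar with the list handling, the sketch below shows the
same idea in a standalone form: the newest rule is prepended, its ID is the
previous head's ID plus one, and removal walks the list through a pointer to
pointer. struct rule, rule_add() and rule_del() are simplified, hypothetical
counterparts of struct port_flow, port_flow_create() and port_flow_destroy();
no rte_flow calls are made.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down counterpart of struct port_flow. */
struct rule {
	struct rule *next; /* next (older) rule */
	uint32_t id;       /* rule ID, unique within the list */
};

/* Prepend a rule; its ID is the head's ID plus one (0 for an empty list). */
static int
rule_add(struct rule **head, uint32_t *id)
{
	struct rule *r;

	assert(head != NULL);
	if (*head && (*head)->id == UINT32_MAX)
		return -ENOMEM; /* highest ID already in use */
	r = calloc(1, sizeof(*r));
	if (!r)
		return -ENOMEM;
	r->id = *head ? (*head)->id + 1 : 0;
	r->next = *head;
	*head = r;
	if (id)
		*id = r->id;
	return 0;
}

/* Unlink and free the rule with the given ID via a pointer-to-pointer walk. */
static int
rule_del(struct rule **head, uint32_t id)
{
	struct rule **tmp;

	for (tmp = head; *tmp; tmp = &(*tmp)->next) {
		struct rule *r = *tmp;

		if (r->id != id)
			continue;
		*tmp = r->next;
		free(r);
		return 0;
	}
	return -ENOENT;
}
```

Because the head always carries the highest ID, IDs freed in the middle of
the list are not reused until the rules above them are destroyed as well.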

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c     |   1 +
 app/test-pmd/config.c      | 485 ++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/csumonly.c    |   1 +
 app/test-pmd/flowgen.c     |   1 +
 app/test-pmd/icmpecho.c    |   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c       |   1 +
 app/test-pmd/macfwd.c      |   1 +
 app/test-pmd/macswap.c     |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c      |   1 +
 app/test-pmd/testpmd.c     |   6 +
 app/test-pmd/testpmd.h     |  27 +++
 app/test-pmd/txonly.c      |   1 +
 14 files changed, 529 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index d03a592..5d1c0dd 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include <rte_string_fns.h>
 #include <rte_devargs.h>
 #include <rte_eth_ctrl.h>
+#include <rte_flow.h>
 
 #include <cmdline_rdline.h>
 #include <cmdline_parse.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 8cf537d..e965930 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
 #include <rte_cycles.h>
+#include <rte_flow.h>
+#include <rte_errno.h>
 
 #include "testpmd.h"
 
@@ -751,6 +753,489 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
 	printf("Set MTU failed. diag=%d\n", diag);
 }
 
+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow pattern items. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_item[] = {
+	MK_FLOW_ITEM(END, 0),
+	MK_FLOW_ITEM(VOID, 0),
+	MK_FLOW_ITEM(INVERT, 0),
+	MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+	MK_FLOW_ITEM(PF, 0),
+	MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+	MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+	MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+	MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+	MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+	MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+	MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+	MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+	MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+	MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+	MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+	MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+		    size_t *size, size_t *pad)
+{
+	if (!item->spec)
+		goto empty;
+	switch (item->type) {
+		union {
+			const struct rte_flow_item_raw *raw;
+		} spec;
+
+	case RTE_FLOW_ITEM_TYPE_RAW:
+		spec.raw = item->spec;
+		*size = offsetof(struct rte_flow_item_raw, pattern) +
+			spec.raw->length * sizeof(*spec.raw->pattern);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow actions. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_action[] = {
+	MK_FLOW_ACTION(END, 0),
+	MK_FLOW_ACTION(VOID, 0),
+	MK_FLOW_ACTION(PASSTHRU, 0),
+	MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+	MK_FLOW_ACTION(FLAG, 0),
+	MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+	MK_FLOW_ACTION(DROP, 0),
+	MK_FLOW_ACTION(COUNT, 0),
+	MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+	MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+	MK_FLOW_ACTION(PF, 0),
+	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+		      size_t *size, size_t *pad)
+{
+	if (!action->conf)
+		goto empty;
+	switch (action->type) {
+		union {
+			const struct rte_flow_action_rss *rss;
+		} conf;
+
+	case RTE_FLOW_ACTION_TYPE_RSS:
+		conf.rss = action->conf;
+		*size = offsetof(struct rte_flow_action_rss, queue) +
+			conf.rss->num * sizeof(*conf.rss->queue);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate a port_flow entry from attributes/pattern/actions. */
+static struct port_flow *
+port_flow_new(const struct rte_flow_attr *attr,
+	      const struct rte_flow_item *pattern,
+	      const struct rte_flow_action *actions)
+{
+	const struct rte_flow_item *item;
+	const struct rte_flow_action *action;
+	struct port_flow *pf = NULL;
+	size_t tmp;
+	size_t pad;
+	size_t off1 = 0;
+	size_t off2 = 0;
+	int err = ENOTSUP;
+
+store:
+	item = pattern;
+	if (pf)
+		pf->pattern = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_item *dst = NULL;
+
+		if ((unsigned int)item->type >= RTE_DIM(flow_item) ||
+		    !flow_item[item->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, item, sizeof(*item));
+		off1 += sizeof(*item);
+		flow_item_spec_size(item, &tmp, &pad);
+		if (item->spec) {
+			if (pf)
+				dst->spec = memcpy(pf->data + off2,
+						   item->spec, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->last) {
+			if (pf)
+				dst->last = memcpy(pf->data + off2,
+						   item->last, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->mask) {
+			if (pf)
+				dst->mask = memcpy(pf->data + off2,
+						   item->mask, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((item++)->type != RTE_FLOW_ITEM_TYPE_END);
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	action = actions;
+	if (pf)
+		pf->actions = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_action *dst = NULL;
+
+		if ((unsigned int)action->type >= RTE_DIM(flow_action) ||
+		    !flow_action[action->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, action, sizeof(*action));
+		off1 += sizeof(*action);
+		flow_action_conf_size(action, &tmp, &pad);
+		if (action->conf) {
+			if (pf)
+				dst->conf = memcpy(pf->data + off2,
+						   action->conf, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((action++)->type != RTE_FLOW_ACTION_TYPE_END);
+	if (pf != NULL)
+		return pf;
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	tmp = RTE_ALIGN_CEIL(offsetof(struct port_flow, data), sizeof(double));
+	pf = calloc(1, tmp + off1 + off2);
+	if (pf == NULL)
+		err = errno;
+	else {
+		*pf = (const struct port_flow){
+			.size = tmp + off1 + off2,
+			.attr = *attr,
+		};
+		tmp -= offsetof(struct port_flow, data);
+		off2 = tmp + off1;
+		off1 = tmp;
+		goto store;
+	}
+notsup:
+	rte_errno = err;
+	return NULL;
+}
+
+/** Print a message out of a flow error. */
+static int
+port_flow_complain(struct rte_flow_error *error)
+{
+	static const char *const errstrlist[] = {
+		[RTE_FLOW_ERROR_TYPE_NONE] = "no error",
+		[RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
+		[RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
+		[RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
+		[RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
+		[RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
+		[RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
+		[RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
+	};
+	const char *errstr;
+	char buf[32];
+	int err = rte_errno;
+
+	if ((unsigned int)error->type >= RTE_DIM(errstrlist) ||
+	    !errstrlist[error->type])
+		errstr = "unknown type";
+	else
+		errstr = errstrlist[error->type];
+	printf("Caught error type %d (%s): %s%s\n",
+	       error->type, errstr,
+	       error->cause ? (snprintf(buf, sizeof(buf), "cause: %p, ",
+					error->cause), buf) : "",
+	       error->message ? error->message : "(no stated reason)");
+	return -err;
+}
+
+/** Validate flow rule. */
+int
+port_flow_validate(portid_t port_id,
+		   const struct rte_flow_attr *attr,
+		   const struct rte_flow_item *pattern,
+		   const struct rte_flow_action *actions)
+{
+	struct rte_flow_error error;
+
+	if (rte_flow_validate(port_id, attr, pattern, actions, &error))
+		return port_flow_complain(&error);
+	printf("Flow rule validated\n");
+	return 0;
+}
+
+/** Create flow rule. */
+int
+port_flow_create(portid_t port_id,
+		 const struct rte_flow_attr *attr,
+		 const struct rte_flow_item *pattern,
+		 const struct rte_flow_action *actions)
+{
+	struct rte_flow *flow;
+	struct rte_port *port;
+	struct port_flow *pf;
+	uint32_t id;
+	struct rte_flow_error error;
+
+	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
+	if (!flow)
+		return port_flow_complain(&error);
+	port = &ports[port_id];
+	if (port->flow_list) {
+		if (port->flow_list->id == UINT32_MAX) {
+			printf("Highest rule ID is already assigned, delete"
+			       " it first\n");
+			rte_flow_destroy(port_id, flow, NULL);
+			return -ENOMEM;
+		}
+		id = port->flow_list->id + 1;
+	} else
+		id = 0;
+	pf = port_flow_new(attr, pattern, actions);
+	if (!pf) {
+		int err = rte_errno;
+
+		printf("Cannot allocate flow: %s\n", rte_strerror(err));
+		rte_flow_destroy(port_id, flow, NULL);
+		return -err;
+	}
+	pf->next = port->flow_list;
+	pf->id = id;
+	pf->flow = flow;
+	port->flow_list = pf;
+	printf("Flow rule #%u created\n", pf->id);
+	return 0;
+}
+
+/** Destroy a number of flow rules. */
+int
+port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule)
+{
+	struct rte_port *port;
+	struct port_flow **tmp;
+	uint32_t c = 0;
+	int ret = 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	tmp = &port->flow_list;
+	while (*tmp) {
+		uint32_t i;
+
+		for (i = 0; i != n; ++i) {
+			struct rte_flow_error error;
+			struct port_flow *pf = *tmp;
+
+			if (rule[i] != pf->id)
+				continue;
+			if (rte_flow_destroy(port_id, pf->flow, &error)) {
+				ret = port_flow_complain(&error);
+				continue;
+			}
+			printf("Flow rule #%u destroyed\n", pf->id);
+			*tmp = pf->next;
+			free(pf);
+			break;
+		}
+		if (i == n)
+			tmp = &(*tmp)->next;
+		++c;
+	}
+	return ret;
+}
+
+/** Remove all flow rules. */
+int
+port_flow_flush(portid_t port_id)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	int ret = 0;
+
+	if (rte_flow_flush(port_id, &error)) {
+		ret = port_flow_complain(&error);
+		if (port_id_is_invalid(port_id, DISABLED_WARN) ||
+		    port_id == (portid_t)RTE_PORT_ALL)
+			return ret;
+	}
+	port = &ports[port_id];
+	while (port->flow_list) {
+		struct port_flow *pf = port->flow_list->next;
+
+		free(port->flow_list);
+		port->flow_list = pf;
+	}
+	return ret;
+}
+
+/** Query a flow rule. */
+int
+port_flow_query(portid_t port_id, uint32_t rule,
+		enum rte_flow_action_type action)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	struct port_flow *pf;
+	const char *name;
+	union {
+		struct rte_flow_query_count count;
+	} query;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	for (pf = port->flow_list; pf; pf = pf->next)
+		if (pf->id == rule)
+			break;
+	if (!pf) {
+		printf("Flow rule #%u not found\n", rule);
+		return -ENOENT;
+	}
+	if ((unsigned int)action >= RTE_DIM(flow_action) ||
+	    !flow_action[action].name)
+		name = "unknown";
+	else
+		name = flow_action[action].name;
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		break;
+	default:
+		printf("Cannot query action type %d (%s)\n", action, name);
+		return -ENOTSUP;
+	}
+	memset(&query, 0, sizeof(query));
+	if (rte_flow_query(port_id, pf->flow, action, &query, &error))
+		return port_flow_complain(&error);
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		printf("%s:\n"
+		       " hits_set: %u\n"
+		       " bytes_set: %u\n"
+		       " hits: %" PRIu64 "\n"
+		       " bytes: %" PRIu64 "\n",
+		       name,
+		       query.count.hits_set,
+		       query.count.bytes_set,
+		       query.count.hits,
+		       query.count.bytes);
+		break;
+	default:
+		printf("Cannot display result for action type %d (%s)\n",
+		       action, name);
+		break;
+	}
+	return 0;
+}
+
+/** List flow rules. */
+void
+port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n])
+{
+	struct rte_port *port;
+	struct port_flow *pf;
+	struct port_flow *list = NULL;
+	uint32_t i;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return;
+	port = &ports[port_id];
+	if (!port->flow_list)
+		return;
+	/* Sort flows by group, priority and ID. */
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		struct port_flow **tmp;
+
+		if (n) {
+			/* Filter out unwanted groups. */
+			for (i = 0; i != n; ++i)
+				if (pf->attr.group == group[i])
+					break;
+			if (i == n)
+				continue;
+		}
+		tmp = &list;
+		while (*tmp &&
+		       (pf->attr.group > (*tmp)->attr.group ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority > (*tmp)->attr.priority) ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority == (*tmp)->attr.priority &&
+			 pf->id > (*tmp)->id)))
+			tmp = &(*tmp)->tmp;
+		pf->tmp = *tmp;
+		*tmp = pf;
+	}
+	printf("ID\tGroup\tPrio\tAttr\tRule\n");
+	for (pf = list; pf != NULL; pf = pf->tmp) {
+		const struct rte_flow_item *item = pf->pattern;
+		const struct rte_flow_action *action = pf->actions;
+
+		printf("%" PRIu32 "\t%" PRIu32 "\t%" PRIu32 "\t%c%c\t",
+		       pf->id,
+		       pf->attr.group,
+		       pf->attr.priority,
+		       pf->attr.ingress ? 'i' : '-',
+		       pf->attr.egress ? 'e' : '-');
+		while (item->type != RTE_FLOW_ITEM_TYPE_END) {
+			if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
+				printf("%s ", flow_item[item->type].name);
+			++item;
+		}
+		printf("=>");
+		while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+			if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
+				printf(" %s", flow_action[action->type].name);
+			++action;
+		}
+		printf("\n");
+	}
+}
+
 /*
  * RX/TX ring descriptors display functions.
  */
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..dd67ebf 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_sctp.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index b13ff89..13b4f90 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 6a4e750..f25a8f5 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include <rte_ip.h>
 #include <rte_icmp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
index 0d3b37a..51170ee 100644
--- a/app/test-pmd/ieee1588fwd.c
+++ b/app/test-pmd/ieee1588fwd.c
@@ -34,6 +34,7 @@
 
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 26936b7..15cb4a2 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -64,6 +64,7 @@
 #include <rte_ether.h>
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index 86e01de..d361db1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 36e139f..f996039 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 08e5a76..28db8cd 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index fff815c..cf00576 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -67,6 +67,7 @@
 #include <rte_ip.h>
 #include <rte_udp.h>
 #include <rte_net.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..bfb2f8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
@@ -1545,6 +1546,8 @@ close_port(portid_t pid)
 			continue;
 		}
 
+		if (port->flow_list)
+			port_flow_flush(pi);
 		rte_eth_dev_close(pi);
 
 		if (rte_atomic16_cmpset(&(port->port_status),
@@ -1599,6 +1602,9 @@ detach_port(uint8_t port_id)
 		return;
 	}
 
+	if (ports[port_id].flow_list)
+		port_flow_flush(port_id);
+
 	if (rte_eth_dev_detach(port_id, name))
 		return;
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..22ce2d6 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,19 @@ struct fwd_stream {
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
 
+/** Descriptor for a single flow. */
+struct port_flow {
+	size_t size; /**< Allocated space including data[]. */
+	struct port_flow *next; /**< Next flow in list. */
+	struct port_flow *tmp; /**< Temporary linking. */
+	uint32_t id; /**< Flow rule ID. */
+	struct rte_flow *flow; /**< Opaque flow object returned by PMD. */
+	struct rte_flow_attr attr; /**< Attributes. */
+	struct rte_flow_item *pattern; /**< Pattern. */
+	struct rte_flow_action *actions; /**< Actions. */
+	uint8_t data[]; /**< Storage for pattern/actions. */
+};
+
 /**
  * The data structure associated with each port.
  */
@@ -177,6 +190,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	struct port_flow        *flow_list; /**< Associated flows. */
 };
 
 extern portid_t __rte_unused
@@ -504,6 +518,19 @@ void port_reg_bit_field_set(portid_t port_id, uint32_t reg_off,
 			    uint8_t bit1_pos, uint8_t bit2_pos, uint32_t value);
 void port_reg_display(portid_t port_id, uint32_t reg_off);
 void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value);
+int port_flow_validate(portid_t port_id,
+		       const struct rte_flow_attr *attr,
+		       const struct rte_flow_item *pattern,
+		       const struct rte_flow_action *actions);
+int port_flow_create(portid_t port_id,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item *pattern,
+		     const struct rte_flow_action *actions);
+int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule);
+int port_flow_flush(portid_t port_id);
+int port_flow_query(portid_t port_id, uint32_t rule,
+		    enum rte_flow_action_type action);
+void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group);
 
 void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id);
 void tx_ring_desc_display(portid_t port_id, queueid_t txq_id, uint16_t txd_id);
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 8513a06..e996f35 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
-- 
2.1.4

* [dpdk-dev] [PATCH v2 07/25] app/testpmd: add flow command
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Managing generic flow API functions from the command line requires dynamic
tokens, since flow rules are not fixed and cannot be defined statically.

This commit adds a dedicated, flexible parser and its command object for a
new "flow" command in a separate file.
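
The dynamic-token idea can be illustrated outside of testpmd. The following
is a minimal sketch, not the code from this patch: the token set, the names
(tok_def, feed, may_end) and the "any word" value token are assumptions made
for illustration. It shows a stack of candidate token lists where each
matched token pushes its own follow-up lists, the last pushed list being
consumed first.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token set, reduced to what "flow list" needs. */
enum tok { T_ZERO = 0, T_END, T_FLOW, T_LIST, T_GROUP, T_VALUE };

#define STACK_SIZE 16

struct tok_def {
	const char *name;            /* literal to match, NULL matches any word */
	const enum tok *const *next; /* NULL-terminated array of token lists */
};

/* Token lists, each terminated by T_ZERO. */
static const enum tok l_entry[] = { T_FLOW, T_ZERO };
static const enum tok l_cmds[] = { T_LIST, T_ZERO };
static const enum tok l_value[] = { T_VALUE, T_ZERO };
static const enum tok l_attr[] = { T_GROUP, T_END, T_ZERO };

/* Lists are pushed in order, so the last one is consumed first. */
static const enum tok *const n_flow[] = { l_cmds, NULL };
static const enum tok *const n_attr[] = { l_attr, l_value, NULL };

static const struct tok_def defs[] = {
	[T_END] = { "", NULL },
	[T_FLOW] = { "flow", n_flow },
	[T_LIST] = { "list", n_attr },
	[T_GROUP] = { "group", n_attr },
	[T_VALUE] = { NULL, NULL },
};

struct ctx {
	const enum tok *stack[STACK_SIZE];
	int depth;
};

static void
ctx_init(struct ctx *c)
{
	c->stack[0] = l_entry;
	c->depth = 1;
}

/* Match one word against the list on top of the stack, -1 on failure. */
static int
feed(struct ctx *c, const char *word)
{
	const struct tok_def *d = NULL;
	const enum tok *list;
	int matched;
	int i;

	if (!c->depth)
		return -1;
	list = c->stack[c->depth - 1];
	for (i = 0; list[i] != T_ZERO; ++i) {
		d = &defs[list[i]];
		if (!d->name || !strcmp(d->name, word))
			break;
	}
	if (list[i] == T_ZERO)
		return -1;
	matched = list[i];
	/* The list is consumed only once a candidate has matched. */
	--c->depth;
	if (d->next)
		for (i = 0; d->next[i]; ++i) {
			if (c->depth == STACK_SIZE)
				return -1;
			c->stack[c->depth++] = d->next[i];
		}
	return matched;
}

/* A command may end when nothing is left or T_END is a candidate. */
static int
may_end(const struct ctx *c)
{
	const enum tok *list;
	int i;

	if (!c->depth)
		return 1;
	list = c->stack[c->depth - 1];
	for (i = 0; list[i] != T_ZERO; ++i)
		if (list[i] == T_END)
			return 1;
	return 0;
}
```

Feeding "flow", "list", "0" in sequence leaves the attribute list on top of
the stack, so may_end() reports that the command is complete while "group"
remains an acceptable continuation.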

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/Makefile       |   1 +
 app/test-pmd/cmdline.c      |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 5d1c0dd..b124412 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9567,6 +9567,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
 	},
 };
 
+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11605,6 +11608,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
 	(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
 	(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+	(cmdline_parse_inst_t *)&cmd_flow,
 	(cmdline_parse_inst_t *)&cmd_mcast_addr,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 0000000..a36da06
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+	/* Special tokens. */
+	ZERO = 0,
+	END,
+
+	/* Top-level command. */
+	FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+	/** Stack of subsequent token lists to process. */
+	const enum index *next[CTX_STACK_SIZE];
+	enum index curr; /**< Current token index. */
+	enum index prev; /**< Index of the last token seen. */
+	int next_num; /**< Number of entries in next[]. */
+	uint32_t reparse:1; /**< Start over from the beginning. */
+	uint32_t eol:1; /**< EOL has been detected. */
+	uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+	/** Type displayed during completion (defaults to "TOKEN"). */
+	const char *type;
+	/** Help displayed during completion (defaults to token name). */
+	const char *help;
+	/**
+	 * Lists of subsequent tokens to push on the stack. Each call to the
+	 * parser consumes the last entry of that stack.
+	 */
+	const enum index *const *next;
+	/**
+	 * Token-processing callback, returns -1 in case of error, the
+	 * length of the matched string otherwise. If NULL, attempts to
+	 * match the token name.
+	 *
+	 * If buf is not NULL, the result should be stored in it according
+	 * to context. An error is returned if not large enough.
+	 */
+	int (*call)(struct context *ctx, const struct token *token,
+		    const char *str, unsigned int len,
+		    void *buf, unsigned int size);
+	/**
+	 * Callback that provides possible values for this token, used for
+	 * completion. Returns -1 in case of error, the number of possible
+	 * values otherwise. If NULL, the token name is used.
+	 *
+	 * If buf is not NULL, entry index ent is written to buf and the
+	 * full length of the entry is returned (same behavior as
+	 * snprintf()).
+	 */
+	int (*comp)(struct context *ctx, const struct token *token,
+		    unsigned int ent, char *buf, unsigned int size);
+	/** Mandatory token name, no default value. */
+	const char *name;
+};
+
+/** Static initializer for the next field. */
+#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for a NEXT() entry. */
+#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
+
+/** Parser output buffer layout expected by cmd_flow_parsed(). */
+struct buffer {
+	enum index command; /**< Flow command. */
+	uint16_t port; /**< Affected port ID. */
+};
+
+static int parse_init(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
+
+/** Token definitions. */
+static const struct token token_list[] = {
+	/* Special tokens. */
+	[ZERO] = {
+		.name = "ZERO",
+		.help = "null entry, abused as the entry point",
+		.next = NEXT(NEXT_ENTRY(FLOW)),
+	},
+	[END] = {
+		.name = "",
+		.type = "RETURN",
+		.help = "command may end here",
+	},
+	/* Top-level command. */
+	[FLOW] = {
+		.name = "flow",
+		.type = "{command} {port_id} [{arg} [...]]",
+		.help = "manage ingress/egress flow rules",
+		.call = parse_init,
+	},
+};
+
+/** Default parsing function for token name matching. */
+static int
+parse_default(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)buf;
+	(void)size;
+	if (strncmp(str, token->name, len))
+		return -1;
+	return len;
+}
+
+/** Parse flow command, initialize output buffer for subsequent tokens. */
+static int
+parse_init(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	/* Initialize buffer. */
+	memset(out, 0x00, sizeof(*out));
+	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	return len;
+}
+
+/** Internal context. */
+static struct context cmd_flow_context;
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow;
+
+/** Initialize context. */
+static void
+cmd_flow_context_init(struct context *ctx)
+{
+	/* A full memset() is not necessary. */
+	ctx->curr = 0;
+	ctx->prev = 0;
+	ctx->next_num = 0;
+	ctx->reparse = 0;
+	ctx->eol = 0;
+	ctx->last = 0;
+}
+
+/** Parse a token (cmdline API). */
+static int
+cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
+	       unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token;
+	const enum index *list;
+	int len;
+	int i;
+
+	(void)hdr;
+	/* Restart as requested. */
+	if (ctx->reparse)
+		cmd_flow_context_init(ctx);
+	token = &token_list[ctx->curr];
+	/* Check argument length. */
+	ctx->eol = 0;
+	ctx->last = 1;
+	for (len = 0; src[len]; ++len)
+		if (src[len] == '#' || isspace(src[len]))
+			break;
+	if (!len)
+		return -1;
+	/* Last argument and EOL detection. */
+	for (i = len; src[i]; ++i)
+		if (src[i] == '#' || src[i] == '\r' || src[i] == '\n')
+			break;
+		else if (!isspace(src[i])) {
+			ctx->last = 0;
+			break;
+		}
+	for (; src[i]; ++i)
+		if (src[i] == '\r' || src[i] == '\n') {
+			ctx->eol = 1;
+			break;
+		}
+	/* Initialize context if necessary. */
+	if (!ctx->next_num) {
+		if (!token->next)
+			return 0;
+		ctx->next[ctx->next_num++] = token->next[0];
+	}
+	/* Process argument through candidates. */
+	ctx->prev = ctx->curr;
+	list = ctx->next[ctx->next_num - 1];
+	for (i = 0; list[i]; ++i) {
+		const struct token *next = &token_list[list[i]];
+		int tmp;
+
+		ctx->curr = list[i];
+		if (next->call)
+			tmp = next->call(ctx, next, src, len, result, size);
+		else
+			tmp = parse_default(ctx, next, src, len, result, size);
+		if (tmp == -1 || tmp != len)
+			continue;
+		token = next;
+		break;
+	}
+	if (!list[i])
+		return -1;
+	--ctx->next_num;
+	/* Push subsequent tokens if any. */
+	if (token->next)
+		for (i = 0; token->next[i]; ++i) {
+			if (ctx->next_num == RTE_DIM(ctx->next))
+				return -1;
+			ctx->next[ctx->next_num++] = token->next[i];
+		}
+	return len;
+}
+
+/** Return number of completion entries (cmdline API). */
+static int
+cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return 0;
+	/*
+	 * If there is a single token, use its completion callback, otherwise
+	 * return the number of entries.
+	 */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, 0, NULL, 0);
+	}
+	return i;
+}
+
+/** Return a completion entry (cmdline API). */
+static int
+cmd_flow_complete_get_elt(cmdline_parse_token_hdr_t *hdr, int index,
+			  char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return -1;
+	/* If there is a single token, use its completion callback. */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, index, dst, size) < 0 ? -1 : 0;
+	}
+	/* Otherwise make sure the index is valid and use defaults. */
+	if (index >= i)
+		return -1;
+	token = &token_list[list[index]];
+	snprintf(dst, size, "%s", token->name);
+	/* Save index for cmd_flow_get_help(). */
+	ctx->prev = list[index];
+	return 0;
+}
+
+/** Populate help strings for current token (cmdline API). */
+static int
+cmd_flow_get_help(cmdline_parse_token_hdr_t *hdr, char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->prev];
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	if (!size)
+		return -1;
+	/* Set token type and update global help with details. */
+	snprintf(dst, size, "%s", (token->type ? token->type : "TOKEN"));
+	if (token->help)
+		cmd_flow.help_str = token->help;
+	else
+		cmd_flow.help_str = token->name;
+	return 0;
+}
+
+/** Token definition template (cmdline API). */
+static struct cmdline_token_hdr cmd_flow_token_hdr = {
+	.ops = &(struct cmdline_token_ops){
+		.parse = cmd_flow_parse,
+		.complete_get_nb = cmd_flow_complete_get_nb,
+		.complete_get_elt = cmd_flow_complete_get_elt,
+		.get_help = cmd_flow_get_help,
+	},
+	.offset = 0,
+};
+
+/** Populate the next dynamic token. */
+static void
+cmd_flow_tok(cmdline_parse_token_hdr_t **hdr,
+	     cmdline_parse_token_hdr_t *(*hdrs)[])
+{
+	struct context *ctx = &cmd_flow_context;
+
+	/* Always reinitialize context before requesting the first token. */
+	if (!(hdr - *hdrs))
+		cmd_flow_context_init(ctx);
+	/* Return NULL when no more tokens are expected. */
+	if (!ctx->next_num && ctx->curr) {
+		*hdr = NULL;
+		return;
+	}
+	/* Determine if command should end here. */
+	if (ctx->eol && ctx->last && ctx->next_num) {
+		const enum index *list = ctx->next[ctx->next_num - 1];
+		int i;
+
+		for (i = 0; list[i]; ++i) {
+			if (list[i] != END)
+				continue;
+			*hdr = NULL;
+			return;
+		}
+	}
+	*hdr = &cmd_flow_token_hdr;
+}
+
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_flow_parsed(const struct buffer *in)
+{
+	switch (in->command) {
+	default:
+		break;
+	}
+}
+
+/** Token generator and output processing callback (cmdline API). */
+static void
+cmd_flow_cb(void *arg0, struct cmdline *cl, void *arg2)
+{
+	if (cl == NULL)
+		cmd_flow_tok(arg0, arg2);
+	else
+		cmd_flow_parsed(arg0);
+}
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow = {
+	.f = cmd_flow_cb,
+	.data = NULL, /**< Unused. */
+	.help_str = NULL, /**< Updated by cmd_flow_get_help(). */
+	.tokens = {
+		NULL,
+	}, /**< Tokens are returned by cmd_flow_tok(). */
+};
-- 
2.1.4

* [dpdk-dev] [PATCH v2 08/25] app/testpmd: add rte_flow integer support
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Parse all integer types and handle conversion to network byte order in a
single function.
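
As a rough illustration of the approach, here is a standalone sketch that
does not use DPDK helpers. The names (struct field, store_uint) are
hypothetical, only unsigned values are handled, and unlike the patch it
rejects values that do not fit the target field. It parses one word, checks
that the whole word was consumed, and stores the result at an offset inside
a caller-provided object, optionally in network byte order.

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field descriptor: where and how to store the value. */
struct field {
	size_t offset; /* offset from the start of the object */
	size_t size;   /* 1, 2, 4 or 8 bytes */
	int hton;      /* nonzero to store in network byte order */
};

/* Returns len on success, -1 on error (object is left untouched). */
static int
store_uint(void *object, const struct field *f, const char *str, size_t len)
{
	uint8_t *dst = (uint8_t *)object + f->offset;
	uintmax_t u;
	char *end;
	size_t i;

	if (f->size != 1 && f->size != 2 && f->size != 4 && f->size != 8)
		return -1;
	errno = 0;
	u = strtoumax(str, &end, 0);
	/* The number must span exactly the len characters of the word. */
	if (errno || (size_t)(end - str) != len)
		return -1;
	/* Assumption of this sketch: reject values too large for the field. */
	if (f->size < sizeof(u) && (u >> (8 * f->size)))
		return -1;
	if (f->hton) {
		/* Most significant byte first, regardless of host order. */
		for (i = 0; i < f->size; ++i)
			dst[i] = (uint8_t)(u >> (8 * (f->size - 1 - i)));
		return (int)len;
	}
	switch (f->size) {
	case sizeof(uint8_t): {
		uint8_t v = (uint8_t)u;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	case sizeof(uint16_t): {
		uint16_t v = (uint16_t)u;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	case sizeof(uint32_t): {
		uint32_t v = (uint32_t)u;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	default: {
		uint64_t v = (uint64_t)u;
		memcpy(dst, &v, sizeof(v));
		break;
	}
	}
	return (int)len;
}
```

Describing the destination by offset and size is what lets a single
function serve every integer field; the caller only has to supply
offsetof() and sizeof() for the member it wants filled.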

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 148 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a36da06..81281f9 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
+#include <inttypes.h>
+#include <errno.h>
 #include <ctype.h>
 #include <string.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
+#include <rte_byteorder.h>
 #include <cmdline_parse.h>
 #include <rte_flow.h>
 
@@ -50,6 +53,10 @@ enum index {
 	ZERO = 0,
 	END,
 
+	/* Common tokens. */
+	INTEGER,
+	UNSIGNED,
+
 	/* Top-level command. */
 	FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
 	/** Stack of subsequent token lists to process. */
 	const enum index *next[CTX_STACK_SIZE];
+	/** Arguments for stacked tokens. */
+	const void *args[CTX_STACK_SIZE];
 	enum index curr; /**< Current token index. */
 	enum index prev; /**< Index of the last token seen. */
 	int next_num; /**< Number of entries in next[]. */
+	int args_num; /**< Number of entries in args[]. */
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+	uint32_t hton:1; /**< Use network byte ordering. */
+	uint32_t sign:1; /**< Value is signed. */
+	uint32_t offset; /**< Relative offset from ctx->object. */
+	uint32_t size; /**< Field size. */
 };
 
 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 	 * parser consumes the last entry of that stack.
 	 */
 	const enum index *const *next;
+	/** Arguments stack for subsequent tokens that need them. */
+	const struct arg *const *args;
 	/**
 	 * Token-processing callback, returns -1 in case of error, the
 	 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
 
+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(*((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+		     const char *, unsigned int,
+		     void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
 		.type = "RETURN",
 		.help = "command may end here",
 	},
+	/* Common tokens. */
+	[INTEGER] = {
+		.name = "{int}",
+		.type = "INTEGER",
+		.help = "integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[UNSIGNED] = {
+		.name = "{unsigned}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
 	},
 };
 
+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+	return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+	if (ctx->args_num == CTX_STACK_SIZE)
+		return -1;
+	ctx->args[ctx->args_num++] = arg;
+	return 0;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -178,9 +252,74 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->object = out;
 	return len;
 }
 
+/**
+ * Parse signed/unsigned integers 8 to 64-bit long.
+ *
+ * Last argument (ctx->args) is retrieved to determine integer type and
+ * storage location.
+ */
+static int
+parse_int(struct context *ctx, const struct token *token,
+	  const char *str, unsigned int len,
+	  void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	uintmax_t u;
+	char *end;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = arg->sign ?
+		(uintmax_t)strtoimax(str, &end, 0) :
+		strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	size = arg->size;
+	switch (size) {
+	case sizeof(uint8_t):
+		*(uint8_t *)buf = u;
+		break;
+	case sizeof(uint16_t):
+		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
+		break;
+	case sizeof(uint32_t):
+		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
+		break;
+	case sizeof(uint64_t):
+		*(uint64_t *)buf = arg->hton ? rte_cpu_to_be_64(u) : u;
+		break;
+	default:
+		goto error;
+	}
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/** No completion. */
+static int
+comp_none(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)token;
+	(void)ent;
+	(void)buf;
+	(void)size;
+	return 0;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -195,9 +334,11 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->curr = 0;
 	ctx->prev = 0;
 	ctx->next_num = 0;
+	ctx->args_num = 0;
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->object = NULL;
 }
 
 /** Parse a token (cmdline API). */
@@ -270,6 +411,13 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 				return -1;
 			ctx->next[ctx->next_num++] = token->next[i];
 		}
+	/* Push arguments if any. */
+	if (token->args)
+		for (i = 0; token->args[i]; ++i) {
+			if (ctx->args_num == RTE_DIM(ctx->args))
+				return -1;
+			ctx->args[ctx->args_num++] = token->args[i];
+		}
 	return len;
 }
 
-- 
2.1.4

* [dpdk-dev] [PATCH v2 09/25] app/testpmd: add flow list command
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.
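
The optional group filter can be pictured as follows. This is a simplified
sketch rather than the testpmd implementation: struct flow_entry,
group_selected() and collect_ids() are hypothetical names, and sorting is
left out. The assumption is that an empty group set selects every rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a stored flow rule. */
struct flow_entry {
	uint32_t id;             /* rule ID */
	uint32_t group;          /* group the rule belongs to */
	struct flow_entry *next; /* next rule in the list */
};

/* A rule is selected when no filter is given or its group is listed. */
static int
group_selected(uint32_t group, uint32_t n, const uint32_t *groups)
{
	uint32_t i;

	if (!n)
		return 1;
	for (i = 0; i != n; ++i)
		if (groups[i] == group)
			return 1;
	return 0;
}

/* Store up to max selected rule IDs in ids[], return how many were stored. */
static size_t
collect_ids(const struct flow_entry *head, uint32_t n, const uint32_t *groups,
	    uint32_t *ids, size_t max)
{
	size_t count = 0;

	for (; head && count < max; head = head->next)
		if (group_selected(head->group, n, groups))
			ids[count++] = head->id;
	return count;
}
```

With this convention, "flow list 0" corresponds to n == 0 and shows all
rules, while each additional "group {group_id}" argument appends one entry
to the filter array.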

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b124412..0dc6c63 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
+
+			"flow list {port_id} [group {group_id}] [...]\n"
+			"    List existing flow rules sorted by priority,"
+			" filtered by group identifiers.\n\n"
 		);
 	}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 81281f9..bd3da38 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PORT_ID,
+	GROUP_ID,
 
 	/* Top-level command. */
 	FLOW,
+
+	/* Sub-level commands. */
+	LIST,
+
+	/* List arguments. */
+	LIST_GROUP,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	uint16_t port; /**< Current port ID (for completions). */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -153,16 +162,36 @@ struct token {
 struct buffer {
 	enum index command; /**< Flow command. */
 	uint16_t port; /**< Affected port ID. */
+	union {
+		struct {
+			uint32_t *group;
+			uint32_t group_n;
+		} list; /**< List arguments. */
+	} args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+	LIST_GROUP,
+	END,
+	ZERO,
 };
 
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PORT_ID] = {
+		.name = "{port_id}",
+		.type = "PORT ID",
+		.help = "port identifier",
+		.call = parse_port,
+		.comp = comp_port,
+	},
+	[GROUP_ID] = {
+		.name = "{group_id}",
+		.type = "GROUP ID",
+		.help = "group identifier",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
+		.next = NEXT(NEXT_ENTRY(LIST)),
 		.call = parse_init,
 	},
+	/* Sub-level commands. */
+	[LIST] = {
+		.name = "list",
+		.help = "list existing flow rules",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_list,
+	},
+	/* List arguments. */
+	[LIST_GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+		.call = parse_list,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -256,6 +316,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for list command. */
+static int
+parse_list(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != LIST)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.list.group =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
+	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.list.group + out->args.list.group_n++;
+	return len;
+}
+
 /**
  * Parse signed/unsigned integers 8 to 64-bit long.
  *
@@ -307,6 +400,29 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/** Parse port and update context. */
+static int
+parse_port(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = &(struct buffer){ .port = 0 };
+	int ret;
+
+	if (buf)
+		out = buf;
+	else {
+		ctx->object = out;
+		size = sizeof(*out);
+	}
+	ret = parse_int(ctx, token, str, len, out, size);
+	if (ret >= 0)
+		ctx->port = out->port;
+	if (!buf)
+		ctx->object = NULL;
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -320,6 +436,26 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete available ports. */
+static int
+comp_port(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	portid_t p;
+
+	(void)ctx;
+	(void)token;
+	FOREACH_PORT(p, ports) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", p);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -338,6 +474,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->port = 0;
 	ctx->object = NULL;
 }
 
@@ -561,6 +698,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case LIST:
+		port_flow_list(in->port, in->args.list.group_n,
+			       in->args.list.group);
+		break;
 	default:
 		break;
 	}
-- 
2.1.4

* [dpdk-dev] [PATCH v2 10/25] app/testpmd: add flow flush command
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.
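
On the application side, flushing amounts to emptying the per-port list of
rule descriptors. The sketch below only covers that bookkeeping and makes
several assumptions: struct flow_node, flow_add() and flow_flush() are
hypothetical names, and the request asking the device to drop its rules is
deliberately omitted.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced rule descriptor kept in a singly linked list. */
struct flow_node {
	uint32_t id;            /* rule ID */
	struct flow_node *next; /* next rule in the list */
};

/* Insert a rule at the head of the list, -1 if allocation fails. */
static int
flow_add(struct flow_node **head, uint32_t id)
{
	struct flow_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->id = id;
	node->next = *head;
	*head = node;
	return 0;
}

/* Free every descriptor, leave the list empty, return the number freed. */
static unsigned int
flow_flush(struct flow_node **head)
{
	unsigned int freed = 0;

	while (*head) {
		struct flow_node *node = *head;

		*head = node->next;
		free(node);
		++freed;
	}
	return freed;
}
```

Updating the head pointer before freeing each node keeps the list valid at
every step, so flushing an already empty list is simply a no-op.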

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0dc6c63..6e2b289 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow flush {port_id}\n"
+			"    Destroy all flow rules.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bd3da38..49578eb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	FLUSH,
 	LIST,
 
 	/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
-		.next = NEXT(NEXT_ENTRY(LIST)),
+		.next = NEXT(NEXT_ENTRY
+			     (FLUSH,
+			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[FLUSH] = {
+		.name = "flush",
+		.help = "destroy all flow rules",
+		.next = NEXT(NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_flush,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != FLUSH)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+	}
+	return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case FLUSH:
+		port_flow_flush(in->port);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH v2 11/25] app/testpmd: add flow destroy command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (9 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 10/25] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
                         ` (16 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.
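
For readers following the parser code, the rule IDs are stored directly after
the output header, inside the same fixed-size buffer, and each append is
bounds-checked first. The sketch below is a simplified, standalone
illustration of that layout, not the testpmd code itself: struct id_list,
align_ceil(), id_list_init() and id_list_push() are hypothetical names, and
the bounds check is expressed with byte offsets rather than pointer
comparisons.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parser's output header. */
struct id_list {
	uint32_t *ids; /* Points just past the header, in the same buffer. */
	uint32_t n;    /* Number of IDs stored so far. */
};

/* Round addr up to a multiple of align (align must be a power of two). */
static uintptr_t
align_ceil(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Place the header at the start of buf; NULL if it does not fit. */
static struct id_list *
id_list_init(void *buf, size_t size)
{
	struct id_list *out = buf;

	if (sizeof(*out) > size)
		return NULL;
	out->ids = (void *)align_ceil((uintptr_t)(out + 1), sizeof(double));
	out->n = 0;
	return out;
}

/* Append one ID; 0 on success, -1 when the buffer has no room left. */
static int
id_list_push(struct id_list *out, size_t size, uint32_t id)
{
	size_t used;

	assert(out != NULL);
	/* Bytes consumed from the start of the buffer up to the next slot. */
	used = (size_t)((uint8_t *)(out->ids + out->n) - (uint8_t *)out);
	if (used + sizeof(*out->ids) > size)
		return -1;
	out->ids[out->n++] = id;
	return 0;
}
```

The caller is assumed to provide a buffer that is suitably aligned for the
header; the array itself is then aligned on sizeof(double), as in the patch.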

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6e2b289..80ddda2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow destroy {port_id} rule {rule_id} [...]\n"
+			"    Destroy specific flow rules.\n\n"
+
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 49578eb..786b718 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
 
@@ -63,9 +64,13 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	DESTROY,
 	FLUSH,
 	LIST,
 
+	/* Destroy arguments. */
+	DESTROY_RULE,
+
 	/* List arguments. */
 	LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			uint32_t *rule;
+			uint32_t rule_n;
+		} destroy; /**< Destroy arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
 	} args; /**< Command arguments. */
 };
 
+static const enum index next_destroy_attr[] = {
+	DESTROY_RULE,
+	END,
+	ZERO,
+};
+
 static const enum index next_list_attr[] = {
 	LIST_GROUP,
 	END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[RULE_ID] = {
+		.name = "{rule id}",
+		.type = "RULE ID",
+		.help = "rule identifier",
+		.call = parse_int,
+		.comp = comp_rule_id,
+	},
 	[PORT_ID] = {
 		.name = "{port_id}",
 		.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (FLUSH,
+			     (DESTROY,
+			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[DESTROY] = {
+		.name = "destroy",
+		.help = "destroy specific flow rules",
+		.next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_destroy,
+	},
 	[FLUSH] = {
 		.name = "flush",
 		.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_list,
 	},
+	/* Destroy arguments. */
+	[DESTROY_RULE] = {
+		.name = "rule",
+		.help = "specify a rule identifier",
+		.next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
+		.call = parse_destroy,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -329,6 +372,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for destroy command. */
+static int
+parse_destroy(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != DESTROY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.destroy.rule =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
+	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	return len;
+}
+
 /** Parse tokens for flush command. */
 static int
 parse_flush(struct context *ctx, const struct token *token,
@@ -494,6 +570,30 @@ comp_port(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete available rule IDs. */
+static int
+comp_rule_id(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	struct rte_port *port;
+	struct port_flow *pf;
+
+	(void)token;
+	if (port_id_is_invalid(ctx->port, DISABLED_WARN) ||
+	    ctx->port == (uint16_t)RTE_PORT_ALL)
+		return -1;
+	port = &ports[ctx->port];
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", pf->id);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -736,6 +836,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case DESTROY:
+		port_flow_destroy(in->port, in->args.destroy.rule_n,
+				  in->args.destroy.rule);
+		break;
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH v2 12/25] app/testpmd: add flow validate/create commands
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (10 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 13/25] app/testpmd: add flow query command Adrien Mazarguil
                         ` (15 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Syntax:

 flow (validate|create) {port_id}
    [group {group_id}] [priority {level}] [ingress] [egress]
    pattern {item} [/ {item} [...]] / end
    actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates pattern items and actions lists.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, process packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.
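
As a reading aid for parse_vc(), the output buffer is used from both ends:
the arrays of items and actions grow upward from just after the header, while
the per-entry data (spec, last and mask for items, configuration for actions)
is carved downward from the end of the buffer. An entry is rejected when the
two regions would collide. The following is a minimal standalone sketch of
that idea under simplifying assumptions; struct item, struct arena,
arena_init() and arena_add() are hypothetical names, and a single array
stands in for both patterns and actions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified entry; the real code uses struct rte_flow_item. */
struct item {
	int type;
	void *data; /* Zeroed storage taken from the end of the buffer. */
};

struct arena {
	struct item *items; /* Grows upward, starts right after the header. */
	uint32_t items_n;
	uint8_t *data;      /* Grows downward, starts at the buffer end. */
};

static uintptr_t
align_ceil(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

static uintptr_t
align_floor(uintptr_t addr, uintptr_t align)
{
	return addr & ~(align - 1);
}

static struct arena *
arena_init(void *buf, size_t size)
{
	struct arena *a = buf;

	if (sizeof(*a) > size)
		return NULL;
	a->items = (void *)align_ceil((uintptr_t)(a + 1), sizeof(double));
	a->items_n = 0;
	a->data = (uint8_t *)a + size;
	return a;
}

/* Add one entry with data_size bytes of zeroed data; NULL if it won't fit. */
static struct item *
arena_add(struct arena *a, int type, size_t data_size)
{
	struct item *it;
	uint8_t *data;

	assert(a != NULL);
	/* Refuse requests larger than what is left below a->data. */
	if (data_size > (size_t)(a->data - (uint8_t *)a))
		return NULL;
	it = a->items + a->items_n;
	data = (void *)align_floor((uintptr_t)(a->data - data_size),
				   sizeof(double));
	/* The new entry must end before the data region begins. */
	if ((uintptr_t)it + sizeof(*it) > (uintptr_t)data)
		return NULL;
	memset(data, 0, data_size);
	it->type = type;
	it->data = data_size ? data : NULL;
	++a->items_n;
	a->data = data;
	return it;
}
```

Since all item and action types added by this commit have a size of 0, no
data is actually reserved yet; the downward region only starts to matter once
types with a specification structure are introduced.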

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 80ddda2..23f4b48 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow validate {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Check whether a flow rule can be created.\n\n"
+
+			"flow create {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Create a flow rule.\n\n"
+
 			"flow destroy {port_id} rule {rule_id} [...]\n"
 			"    Destroy specific flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 786b718..2fd3a5d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
+	PRIORITY_LEVEL,
 
 	/* Top-level command. */
 	FLOW,
 
 	/* Sub-level commands. */
+	VALIDATE,
+	CREATE,
 	DESTROY,
 	FLUSH,
 	LIST,
@@ -73,6 +76,26 @@ enum index {
 
 	/* List arguments. */
 	LIST_GROUP,
+
+	/* Validate/create arguments. */
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+
+	/* Validate/create pattern. */
+	PATTERN,
+	ITEM_NEXT,
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+
+	/* Validate/create actions. */
+	ACTIONS,
+	ACTION_NEXT,
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
 	uint16_t port; /**< Current port ID (for completions). */
+	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -109,6 +133,8 @@ struct token {
 	const char *type;
 	/** Help displayed during completion (defaults to token name). */
 	const char *help;
+	/** Private data used by parser functions. */
+	const void *priv;
 	/**
 	 * Lists of subsequent tokens to push on the stack. Each call to the
 	 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			struct rte_flow_attr attr;
+			struct rte_flow_item *pattern;
+			struct rte_flow_action *actions;
+			uint32_t pattern_n;
+			uint32_t actions_n;
+			uint8_t *data;
+		} vc; /**< Validate/create arguments. */
+		struct {
 			uint32_t *rule;
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
 	} args; /**< Command arguments. */
 };
 
+/** Private data for pattern items. */
+struct parse_item_priv {
+	enum rte_flow_item_type type; /**< Item type. */
+	uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+	(&(const struct parse_item_priv){ \
+		.type = RTE_FLOW_ITEM_TYPE_ ## t, \
+		.size = s, \
+	})
+
+/** Private data for actions. */
+struct parse_action_priv {
+	enum rte_flow_action_type type; /**< Action type. */
+	uint32_t size; /**< Size of action configuration structure. */
+};
+
+#define PRIV_ACTION(t, s) \
+	(&(const struct parse_action_priv){ \
+		.type = RTE_FLOW_ACTION_TYPE_ ## t, \
+		.size = s, \
+	})
+
+static const enum index next_vc_attr[] = {
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+	PATTERN,
+	ZERO,
+};
+
 static const enum index next_destroy_attr[] = {
 	DESTROY_RULE,
 	END,
@@ -192,9 +259,26 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+static const enum index next_item[] = {
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+	ZERO,
+};
+
+static const enum index next_action[] = {
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_vc(struct context *, const struct token *,
+		    const char *, unsigned int,
+		    void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -266,18 +350,41 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PRIORITY_LEVEL] = {
+		.name = "{level}",
+		.type = "PRIORITY",
+		.help = "priority level",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (DESTROY,
+			     (VALIDATE,
+			      CREATE,
+			      DESTROY,
 			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[VALIDATE] = {
+		.name = "validate",
+		.help = "check whether a flow rule can be created",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
+	[CREATE] = {
+		.name = "create",
+		.help = "create a flow rule",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
 	[DESTROY] = {
 		.name = "destroy",
 		.help = "destroy specific flow rules",
@@ -315,6 +422,98 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
 		.call = parse_list,
 	},
+	/* Validate/create attributes. */
+	[GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, group)),
+		.call = parse_vc,
+	},
+	[PRIORITY] = {
+		.name = "priority",
+		.help = "specify a priority level",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, priority)),
+		.call = parse_vc,
+	},
+	[INGRESS] = {
+		.name = "ingress",
+		.help = "affect rule to ingress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	[EGRESS] = {
+		.name = "egress",
+		.help = "affect rule to egress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	/* Validate/create pattern. */
+	[PATTERN] = {
+		.name = "pattern",
+		.help = "submit a list of pattern items",
+		.next = NEXT(next_item),
+		.call = parse_vc,
+	},
+	[ITEM_NEXT] = {
+		.name = "/",
+		.help = "specify next pattern item",
+		.next = NEXT(next_item),
+	},
+	[ITEM_END] = {
+		.name = "end",
+		.help = "end list of pattern items",
+		.priv = PRIV_ITEM(END, 0),
+		.next = NEXT(NEXT_ENTRY(ACTIONS)),
+		.call = parse_vc,
+	},
+	[ITEM_VOID] = {
+		.name = "void",
+		.help = "no-op pattern item",
+		.priv = PRIV_ITEM(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_INVERT] = {
+		.name = "invert",
+		.help = "perform actions when pattern does not match",
+		.priv = PRIV_ITEM(INVERT, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	/* Validate/create actions. */
+	[ACTIONS] = {
+		.name = "actions",
+		.help = "submit a list of associated actions",
+		.next = NEXT(next_action),
+		.call = parse_vc,
+	},
+	[ACTION_NEXT] = {
+		.name = "/",
+		.help = "specify next action",
+		.next = NEXT(next_action),
+	},
+	[ACTION_END] = {
+		.name = "end",
+		.help = "end list of actions",
+		.priv = PRIV_ACTION(END, 0),
+		.call = parse_vc,
+	},
+	[ACTION_VOID] = {
+		.name = "void",
+		.help = "no-op action",
+		.priv = PRIV_ACTION(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PASSTHRU] = {
+		.name = "passthru",
+		.help = "let subsequent rule process matched packets",
+		.priv = PRIV_ACTION(PASSTHRU, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -368,10 +567,108 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->objdata = 0;
 	ctx->object = out;
 	return len;
 }
 
+/** Parse tokens for validate/create commands. */
+static int
+parse_vc(struct context *ctx, const struct token *token,
+	 const char *str, unsigned int len,
+	 void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	uint8_t *data;
+	uint32_t data_size;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != VALIDATE && ctx->curr != CREATE)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		out->args.vc.data = (uint8_t *)out + size;
+		return len;
+	}
+	ctx->objdata = 0;
+	ctx->object = &out->args.vc.attr;
+	switch (ctx->curr) {
+	case GROUP:
+	case PRIORITY:
+		return len;
+	case INGRESS:
+		out->args.vc.attr.ingress = 1;
+		return len;
+	case EGRESS:
+		out->args.vc.attr.egress = 1;
+		return len;
+	case PATTERN:
+		out->args.vc.pattern =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		ctx->object = out->args.vc.pattern;
+		return len;
+	case ACTIONS:
+		out->args.vc.actions =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)
+					       (out->args.vc.pattern +
+						out->args.vc.pattern_n),
+					       sizeof(double));
+		ctx->object = out->args.vc.actions;
+		return len;
+	default:
+		if (!token->priv)
+			return -1;
+		break;
+	}
+	if (!out->args.vc.actions) {
+		const struct parse_item_priv *priv = token->priv;
+		struct rte_flow_item *item =
+			out->args.vc.pattern + out->args.vc.pattern_n;
+
+		data_size = priv->size * 3; /* spec, last, mask */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)item + sizeof(*item) > data)
+			return -1;
+		*item = (struct rte_flow_item){
+			.type = priv->type,
+		};
+		++out->args.vc.pattern_n;
+		ctx->object = item;
+	} else {
+		const struct parse_action_priv *priv = token->priv;
+		struct rte_flow_action *action =
+			out->args.vc.actions + out->args.vc.actions_n;
+
+		data_size = priv->size; /* configuration */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)action + sizeof(*action) > data)
+			return -1;
+		*action = (struct rte_flow_action){
+			.type = priv->type,
+		};
+		++out->args.vc.actions_n;
+		ctx->object = action;
+	}
+	memset(data, 0, data_size);
+	out->args.vc.data = data;
+	ctx->objdata = data_size;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -392,6 +689,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -401,6 +699,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
 	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
 	return len;
 }
@@ -425,6 +724,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 	}
 	return len;
@@ -450,6 +750,7 @@ parse_list(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -459,6 +760,7 @@ parse_list(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
 	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
 	return len;
 }
@@ -526,6 +828,7 @@ parse_port(struct context *ctx, const struct token *token,
 	if (buf)
 		out = buf;
 	else {
+		ctx->objdata = 0;
 		ctx->object = out;
 		size = sizeof(*out);
 	}
@@ -613,6 +916,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->eol = 0;
 	ctx->last = 0;
 	ctx->port = 0;
+	ctx->objdata = 0;
 	ctx->object = NULL;
 }
 
@@ -836,6 +1140,14 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case VALIDATE:
+		port_flow_validate(in->port, &in->args.vc.attr,
+				   in->args.vc.pattern, in->args.vc.actions);
+		break;
+	case CREATE:
+		port_flow_create(in->port, &in->args.vc.attr,
+				 in->args.vc.pattern, in->args.vc.actions);
+		break;
 	case DESTROY:
 		port_flow_destroy(in->port, in->args.destroy.rule_n,
 				  in->args.destroy.rule);
-- 
2.1.4


* [dpdk-dev] [PATCH v2 13/25] app/testpmd: add flow query command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (11 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
                         ` (14 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.
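
The {action} argument is given by name and translated to an action type by
walking the list of known actions. Note that parse_action() in this patch
compares only the first len characters of each name, so a prefix of an action
name is accepted as well. The standalone sketch below shows a stricter
variant that requires the whole name; it is an illustration only, and
enum action_type, action_names[] and lookup_action() are hypothetical
stand-ins for the testpmd token list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical action types standing in for enum rte_flow_action_type. */
enum action_type {
	ACT_END,
	ACT_VOID,
	ACT_PASSTHRU,
};

struct action_name {
	const char *name;
	enum action_type type;
};

static const struct action_name action_names[] = {
	{ "end", ACT_END },
	{ "void", ACT_VOID },
	{ "passthru", ACT_PASSTHRU },
};

/*
 * Match the first len characters of str against the known action names.
 * Returns len and stores the type on success, -1 if nothing matches.
 */
static int
lookup_action(const char *str, size_t len, enum action_type *type)
{
	size_t i;

	assert(str != NULL && type != NULL);
	for (i = 0; i < sizeof(action_names) / sizeof(action_names[0]); ++i) {
		const char *name = action_names[i].name;

		/* Require an exact match, not just a common prefix. */
		if (strlen(name) != len || strncmp(name, str, len))
			continue;
		*type = action_names[i].type;
		return (int)len;
	}
	return -1;
}
```

Whether prefixes should be accepted is a design choice; accepting them makes
interactive use shorter but becomes ambiguous once two action names share a
prefix.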

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline.c      |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 23f4b48..f768b6b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
+			"flow query {port_id} {rule_id} {action}\n"
+			"    Query an existing flow rule.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 2fd3a5d..8f7ec1d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
 	CREATE,
 	DESTROY,
 	FLUSH,
+	QUERY,
 	LIST,
 
 	/* Destroy arguments. */
 	DESTROY_RULE,
 
+	/* Query arguments. */
+	QUERY_ACTION,
+
 	/* List arguments. */
 	LIST_GROUP,
 
@@ -208,6 +212,10 @@ struct buffer {
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
 		struct {
+			uint32_t rule;
+			enum rte_flow_action_type action;
+		} query; /**< Query arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
 			      CREATE,
 			      DESTROY,
 			      FLUSH,
-			      LIST)),
+			      LIST,
+			      QUERY)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_flush,
 	},
+	[QUERY] = {
+		.name = "query",
+		.help = "query an existing flow rule",
+		.next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+			     NEXT_ENTRY(RULE_ID),
+			     NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+			     ARGS_ENTRY(struct buffer, args.query.rule),
+			     ARGS_ENTRY(struct buffer, port)),
+		.call = parse_query,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
 		.call = parse_destroy,
 	},
+	/* Query arguments. */
+	[QUERY_ACTION] = {
+		.name = "{action}",
+		.type = "ACTION",
+		.help = "action to query, must be part of the rule",
+		.call = parse_action,
+		.comp = comp_action,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -730,6 +766,67 @@ parse_flush(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for query command. */
+static int
+parse_query(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != QUERY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+	}
+	return len;
+}
+
+/** Parse action names. */
+static int
+parse_action(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+
+	(void)size;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	/* Parse action name. */
+	for (i = 0; next_action[i]; ++i) {
+		const struct parse_action_priv *priv;
+
+		token = &token_list[next_action[i]];
+		if (strncmp(token->name, str, len))
+			continue;
+		priv = token->priv;
+		if (!priv)
+			goto error;
+		if (out)
+			memcpy((uint8_t *)ctx->object + arg->offset,
+			       &priv->type,
+			       arg->size);
+		return len;
+	}
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -853,6 +950,24 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete action names. */
+static int
+comp_action(struct context *ctx, const struct token *token,
+	    unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; next_action[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s",
+					token_list[next_action[i]].name);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete available ports. */
 static int
 comp_port(struct context *ctx, const struct token *token,
@@ -1155,6 +1270,10 @@ cmd_flow_parsed(const struct buffer *in)
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
+	case QUERY:
+		port_flow_query(in->port, in->args.query.rule,
+				in->args.query.action);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH v2 14/25] app/testpmd: add rte_flow item spec handler
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (12 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 13/25] app/testpmd: add flow query command Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
                         ` (13 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- is: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets upper value of the spec => last range.
- mask: sets bit-mask affecting both spec and last from arbitrary value.
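
The four operators above can be illustrated with a minimal, standalone
sketch. It is not the testpmd parser: struct field16, field16_set() and
field16_match() are hypothetical names, the has_last flag is an assumption
standing in for a NULL "last" pointer, and the matching rule (mask applied
to spec, last and the packet value) is the assumed semantics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one 16-bit field of a pattern item. */
struct field16 {
	uint16_t spec; /* value to match */
	uint16_t last; /* upper bound of the spec => last range */
	uint16_t mask; /* bits that take part in the comparison */
	bool has_last; /* assumption: records whether "last" was given */
};

enum param { PARAM_IS, PARAM_SPEC, PARAM_LAST, PARAM_MASK };

/* Apply one operator to the field, as one "operator value" token pair. */
static void
field16_set(struct field16 *f, enum param p, uint16_t v)
{
	switch (p) {
	case PARAM_IS:
		f->spec = v;
		f->mask = UINT16_MAX; /* full bit-mask: perfect matching */
		break;
	case PARAM_SPEC:
		f->spec = v; /* bit-mask left untouched */
		break;
	case PARAM_LAST:
		f->last = v;
		f->has_last = true;
		break;
	case PARAM_MASK:
		f->mask = v;
		break;
	}
}

/* Assumed rule: the mask is applied to spec, last and the tested value. */
static bool
field16_match(const struct field16 *f, uint16_t value)
{
	uint16_t lo = (uint16_t)(f->spec & f->mask);
	uint16_t hi = f->has_last ? (uint16_t)(f->last & f->mask) : lo;
	uint16_t v = (uint16_t)(value & f->mask);

	return v >= lo && v <= hi;
}
```

With this sketch, "is" alone yields an exact match, while "spec" followed by
"mask" compares only the bits selected by the mask.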

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 111 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 8f7ec1d..b66fecf 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,15 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+__rte_unused
+static const enum index item_param[] = {
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	ZERO,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +301,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +508,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_IS] = {
+		.name = "is",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +641,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +669,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +691,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +700,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +723,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +740,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +748,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_IS:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +824,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +835,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +861,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +888,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +950,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +961,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +994,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1011,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1036,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1143,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
-- 
2.1.4

* [dpdk-dev] [PATCH v2 15/25] app/testpmd: add rte_flow item spec prefix length
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (13 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
                         ` (12 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Generating bit-masks from prefix lengths is often more convenient than
providing them entirely (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
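
As a rough illustration of the conversion, the sketch below builds a mask
from a prefix length, most significant byte first (the network byte order
case). prefix_to_mask() is a hypothetical helper written for this example,
not a function from the patch; it only borrows the idea of a lookup table
for the partial byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill buf[0..size) with a mask whose first `prefix` bits are set,
 * most significant byte first.
 * Returns 0 on success, -1 if the prefix does not fit in the buffer.
 */
static int
prefix_to_mask(uint8_t *buf, size_t size, unsigned int prefix)
{
	/* Partial-byte masks for 0 to 8 leading bits. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	if (bytes + !!extra > size)
		return -1;
	memset(buf, 0xff, bytes);                /* fully covered bytes */
	memset(buf + bytes, 0x00, size - bytes); /* remaining bytes */
	if (extra)
		buf[bytes] = conv[extra];        /* partially covered byte */
	return 0;
}
```

For a 4-byte IPv4 address field, a prefix length of 20 produces the bytes
ff ff f0 00, that is 255.255.240.0.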

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 80 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b66fecf..07f895e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PREFIX,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -278,6 +280,7 @@ static const enum index item_param[] = {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ZERO,
 };
 
@@ -321,6 +324,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -361,6 +367,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PREFIX] = {
+		.name = "{prefix}",
+		.type = "PREFIX",
+		.help = "prefix length for bit-mask",
+		.call = parse_prefix,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -528,6 +541,11 @@ static const struct token token_list[] = {
 		.help = "specify bit-mask with relevant bits set to one",
 		.call = parse_vc_spec,
 	},
+	[ITEM_PARAM_PREFIX] = {
+		.name = "prefix",
+		.help = "generate bit-mask from a prefix length",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +623,62 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+	char *end;
+	uintmax_t u;
+	unsigned int bytes;
+	unsigned int extra;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	bytes = u / 8;
+	extra = u % 8;
+	size = arg->size;
+	if (bytes > size || bytes + !!extra > size)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+		memset(buf, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+	} else
+#endif
+	{
+		memset(buf, 0xff, bytes);
+		memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[bytes] = conv[extra];
+	}
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -776,6 +850,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	case ITEM_PARAM_LAST:
 		index = 1;
 		break;
+	case ITEM_PARAM_PREFIX:
+		/* Modify next token to expect a prefix. */
+		if (ctx->next_num < 2)
+			return -1;
+		ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
+		/* Fall through. */
 	case ITEM_PARAM_MASK:
 		index = 2;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v2 16/25] app/testpmd: add rte_flow bit-field support
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (14 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
                         ` (11 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.
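
The idea can be sketched outside the parser as follows. bf_fill() is a
hypothetical, simplified counterpart of the helper added by this patch: it
takes the mask and its size directly instead of a struct arg, and spreads
the low bits of a value over the positions set in the mask, lowest byte and
lowest bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Spread the low-order bits of val into dst at the bit positions set in
 * mask. Returns the number of bits covered by the mask. When dst is NULL,
 * only the count is computed.
 */
static size_t
bf_fill(uint8_t *dst, uintmax_t val, const uint8_t *mask, size_t size)
{
	size_t len = 0;

	for (size_t i = 0; i != size; ++i) {
		for (unsigned int shift = 0; shift != 8; ++shift) {
			uint8_t bit = (uint8_t)(1u << shift);

			if (!(mask[i] & bit))
				continue; /* not part of the bit-field */
			++len;
			if (!dst)
				continue; /* counting only */
			/* Clear the target bit, then copy the next value bit. */
			dst[i] = (uint8_t)((dst[i] & ~bit) |
					   ((val & 1) << shift));
			val >>= 1;
		}
	}
	return len;
}
```

Calling it with a NULL destination gives the width of the bit-field, which
is how a caller can reject values or prefix lengths that do not fit.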

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 59 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 07f895e..69887fc 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
 	uint32_t sign:1; /**< Value is signed. */
 	uint32_t offset; /**< Relative offset from ctx->object. */
 	uint32_t size; /**< Field size. */
+	const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };
 
 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
 		.size = sizeof(((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f, b) \
+	(&(const struct arg){ \
+		.size = sizeof(s), \
+		.mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -623,6 +631,34 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+	uint32_t i;
+	size_t len = 0;
+
+	/* Endian conversion is not supported on bit-fields. */
+	if (!arg->mask || arg->hton)
+		return 0;
+	for (i = 0; i != arg->size; ++i) {
+		unsigned int shift = 0;
+		uint8_t *buf = (uint8_t *)dst + i;
+
+		for (shift = 0; arg->mask[i] >> shift; ++shift) {
+			if (!(arg->mask[i] & (1 << shift)))
+				continue;
+			++len;
+			if (!dst)
+				continue;
+			*buf &= ~(1 << shift);
+			*buf |= (val & 1) << shift;
+			val >>= 1;
+		}
+	}
+	return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -649,6 +685,23 @@ parse_prefix(struct context *ctx, const struct token *token,
 	u = strtoumax(str, &end, 0);
 	if (errno || (size_t)(end - str) != len)
 		goto error;
+	if (arg->mask) {
+		uintmax_t v = 0;
+
+		extra = arg_entry_bf_fill(NULL, 0, arg);
+		if (u > extra)
+			goto error;
+		if (!ctx->object)
+			return len;
+		extra -= u;
+		while (u--)
+			(v <<= 1, v |= 1);
+		v <<= extra;
+		if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	bytes = u / 8;
 	extra = u % 8;
 	size = arg->size;
@@ -1072,6 +1125,12 @@ parse_int(struct context *ctx, const struct token *token,
 		goto error;
 	if (!ctx->object)
 		return len;
+	if (arg->mask) {
+		if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
 objmask:
-- 
2.1.4

* [dpdk-dev] [PATCH v2 17/25] app/testpmd: add item any to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (15 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 18/25] app/testpmd: add various items " Adrien Mazarguil
                         ` (10 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

This pattern item matches any protocol in place of the current layer and
has a single property:

- num: number of layers covered.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 69887fc..1736954 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,8 @@ enum index {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ITEM_ANY_NUM,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -282,7 +284,6 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
-__rte_unused
 static const enum index item_param[] = {
 	ITEM_PARAM_IS,
 	ITEM_PARAM_SPEC,
@@ -296,6 +297,13 @@ static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ZERO,
+};
+
+static const enum index item_any[] = {
+	ITEM_ANY_NUM,
+	ITEM_NEXT,
 	ZERO,
 };
 
@@ -580,6 +588,19 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
 		.call = parse_vc,
 	},
+	[ITEM_ANY] = {
+		.name = "any",
+		.help = "match any protocol for the current layer",
+		.priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+		.next = NEXT(item_any),
+		.call = parse_vc,
+	},
+	[ITEM_ANY_NUM] = {
+		.name = "num",
+		.help = "number of layers covered",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v2 18/25] app/testpmd: add various items to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (16 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 19/25] app/testpmd: add item raw " Adrien Mazarguil
                         ` (9 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 53 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1736954..ac93679 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -101,6 +101,11 @@ enum index {
 	ITEM_INVERT,
 	ITEM_ANY,
 	ITEM_ANY_NUM,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_VF_ID,
+	ITEM_PORT,
+	ITEM_PORT_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -298,6 +303,9 @@ static const enum index next_item[] = {
 	ITEM_VOID,
 	ITEM_INVERT,
 	ITEM_ANY,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_PORT,
 	ZERO,
 };
 
@@ -307,6 +315,18 @@ static const enum index item_any[] = {
 	ZERO,
 };
 
+static const enum index item_vf[] = {
+	ITEM_VF_ID,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_port[] = {
+	ITEM_PORT_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -601,6 +621,39 @@ static const struct token token_list[] = {
 		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
 	},
+	[ITEM_PF] = {
+		.name = "pf",
+		.help = "match packets addressed to the physical function",
+		.priv = PRIV_ITEM(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_VF] = {
+		.name = "vf",
+		.help = "match packets addressed to a virtual function ID",
+		.priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+		.next = NEXT(item_vf),
+		.call = parse_vc,
+	},
+	[ITEM_VF_ID] = {
+		.name = "id",
+		.help = "destination VF ID",
+		.next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+	},
+	[ITEM_PORT] = {
+		.name = "port",
+		.help = "device-specific physical port index to use",
+		.priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+		.next = NEXT(item_port),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_INDEX] = {
+		.name = "index",
+		.help = "physical port index",
+		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v2 19/25] app/testpmd: add item raw to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (17 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 18/25] app/testpmd: add various items " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
                         ` (8 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Matches arbitrary byte strings with properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length (set automatically from the pattern string).
- pattern: byte string to look for.
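
The storage scheme for the pattern can be sketched as below. struct
raw_item is a simplified, hypothetical stand-in for struct
rte_flow_item_raw (only the length and the trailing byte array are kept),
and raw_item_new() and raw_item_set_pattern() are illustrative helpers, not
testpmd functions. The 36-byte capacity and the 0x55 padding mirror the
values used in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-in for struct rte_flow_item_raw. */
struct raw_item {
	uint16_t length;   /* pattern length in bytes */
	uint8_t pattern[]; /* flexible array member holding the bytes */
};

/* Capacity reserved for pattern[] and resulting storage size. */
#define RAW_PATTERN_SIZE 36
#define RAW_ITEM_SIZE \
	(offsetof(struct raw_item, pattern) + RAW_PATTERN_SIZE)

/* Allocate zeroed storage large enough for the item and its pattern. */
static struct raw_item *
raw_item_new(void)
{
	return calloc(1, RAW_ITEM_SIZE);
}

/*
 * Store a byte string and its length; the result is not NUL-terminated.
 * Returns 0 on success, -1 if the string exceeds the reserved capacity.
 */
static int
raw_item_set_pattern(struct raw_item *item, const char *str, size_t len)
{
	if (len > RAW_PATTERN_SIZE)
		return -1;
	item->length = (uint16_t)len;
	memcpy(item->pattern, str, len);
	/* Pad the unused tail with a recognizable filler byte. */
	memset(item->pattern + len, 0x55, RAW_PATTERN_SIZE - len);
	return 0;
}
```

Because the length is derived from the string itself, a user only has to
provide the pattern.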

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c | 208 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index ac93679..dafb07f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
 	INTEGER,
 	UNSIGNED,
 	PREFIX,
+	BOOLEAN,
+	STRING,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -106,6 +108,12 @@ enum index {
 	ITEM_VF_ID,
 	ITEM_PORT,
 	ITEM_PORT_INDEX,
+	ITEM_RAW,
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -115,6 +123,13 @@ enum index {
 	ACTION_PASSTHRU,
 };
 
+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -216,6 +231,13 @@ struct token {
 		.size = sizeof(*((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = (sz), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -306,6 +328,7 @@ static const enum index next_item[] = {
 	ITEM_PF,
 	ITEM_VF,
 	ITEM_PORT,
+	ITEM_RAW,
 	ZERO,
 };
 
@@ -327,6 +350,16 @@ static const enum index item_port[] = {
 	ZERO,
 };
 
+static const enum index item_raw[] = {
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -363,11 +396,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
 		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -410,6 +451,20 @@ static const struct token token_list[] = {
 		.call = parse_prefix,
 		.comp = comp_none,
 	},
+	[BOOLEAN] = {
+		.name = "{boolean}",
+		.type = "BOOLEAN",
+		.help = "any boolean value",
+		.call = parse_boolean,
+		.comp = comp_boolean,
+	},
+	[STRING] = {
+		.name = "{string}",
+		.type = "STRING",
+		.help = "fixed string",
+		.call = parse_string,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -654,6 +709,52 @@ static const struct token token_list[] = {
 		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
 	},
+	[ITEM_RAW] = {
+		.name = "raw",
+		.help = "match an arbitrary byte string",
+		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+		.next = NEXT(item_raw),
+		.call = parse_vc,
+	},
+	[ITEM_RAW_RELATIVE] = {
+		.name = "relative",
+		.help = "look for pattern after the previous item",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   relative, 1)),
+	},
+	[ITEM_RAW_SEARCH] = {
+		.name = "search",
+		.help = "search pattern from offset (see also limit)",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   search, 1)),
+	},
+	[ITEM_RAW_OFFSET] = {
+		.name = "offset",
+		.help = "absolute or relative offset for pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
+	},
+	[ITEM_RAW_LIMIT] = {
+		.name = "limit",
+		.help = "search area limit for start of pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
+	},
+	[ITEM_RAW_PATTERN] = {
+		.name = "pattern",
+		.help = "byte string to look for",
+		.next = NEXT(item_raw,
+			     NEXT_ENTRY(STRING),
+			     NEXT_ENTRY(ITEM_PARAM_IS,
+					ITEM_PARAM_SPEC,
+					ITEM_PARAM_MASK)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
+			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
+					    pattern,
+					    ITEM_RAW_PATTERN_SIZE)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1235,6 +1336,96 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a string.
+ *
+ * Two arguments (ctx->args) are retrieved from the stack to store data and
+ * its length (in that order).
+ */
+static int
+parse_string(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg_data = pop_args(ctx);
+	const struct arg *arg_len = pop_args(ctx);
+	char tmp[16]; /* Ought to be enough. */
+	int ret;
+
+	/* Arguments are expected. */
+	if (!arg_data)
+		return -1;
+	if (!arg_len) {
+		push_args(ctx, arg_data);
+		return -1;
+	}
+	size = arg_data->size;
+	/* Bit-mask fill is not supported. */
+	if (arg_data->mask || size < len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	/* Let parse_int() fill length information first. */
+	ret = snprintf(tmp, sizeof(tmp), "%u", len);
+	if (ret < 0)
+		goto error;
+	push_args(ctx, arg_len);
+	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		goto error;
+	}
+	buf = (uint8_t *)ctx->object + arg_data->offset;
+	/* Output buffer is not necessarily NUL-terminated. */
+	memcpy(buf, str, len);
+	memset((uint8_t *)buf + len, 0x55, size - len);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
+	return len;
+error:
+	push_args(ctx, arg_len);
+	push_args(ctx, arg_data);
+	return -1;
+}
+
+/** Boolean values (even indices stand for false). */
+static const char *const boolean_name[] = {
+	"0", "1",
+	"false", "true",
+	"no", "yes",
+	"N", "Y",
+	NULL,
+};
+
+/**
+ * Parse a boolean value.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_boolean(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	for (i = 0; boolean_name[i]; ++i)
+		if (!strncmp(str, boolean_name[i], len))
+			break;
+	/* Process token as integer. */
+	if (boolean_name[i])
+		str = i & 1 ? "1" : "0";
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, strlen(str), buf, size);
+	return ret > 0 ? (int)len : ret;
+}
+
 /** Parse port and update context. */
 static int
 parse_port(struct context *ctx, const struct token *token,
@@ -1273,6 +1464,23 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete boolean values. */
+static int
+comp_boolean(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; boolean_name[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", boolean_name[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete action names. */
 static int
 comp_action(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH v2 20/25] app/testpmd: add items eth/vlan to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (18 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 19/25] app/testpmd: add item raw " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
                         ` (7 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
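Note for readers: ARGS_ENTRY_HTON() describes a field by its offset and
size and flags it for network byte ordering. Below is a minimal sketch of
that idea outside testpmd. All demo_* names and DEMO_ARG_HTON are
hypothetical stand-ins, not part of this series, and the store helper only
handles 16-bit fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a pattern item structure. */
struct demo_item {
	uint8_t dst[6];
	uint8_t src[6];
	uint16_t type;
};

/* Hypothetical, reduced version of the parser's argument descriptor. */
struct demo_arg {
	unsigned int hton; /* Store in network byte order when nonzero. */
	size_t offset;     /* Field offset inside the target object. */
	size_t size;       /* Field size in bytes. */
};

/* Describe field f of structure s, flagged for network byte order. */
#define DEMO_ARG_HTON(s, f) \
	{ .hton = 1, .offset = offsetof(s, f), .size = sizeof(((s *)0)->f) }

/* Write a 16-bit value into the field described by arg; 0 on success. */
int
demo_store_u16(void *object, const struct demo_arg *arg, uint16_t value)
{
	uint8_t *field;

	assert(object != NULL && arg != NULL);
	if (arg->size != sizeof(value))
		return -1;
	field = (uint8_t *)object + arg->offset;
	if (arg->hton) {
		/* Most significant byte first, whatever the host order. */
		field[0] = (uint8_t)(value >> 8);
		field[1] = (uint8_t)value;
	} else {
		memcpy(field, &value, sizeof(value));
	}
	return 0;
}
```

With a descriptor built by DEMO_ARG_HTON(struct demo_item, type), storing
0x8100 leaves bytes 0x81 0x00 in the type field on any host.
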
 app/test-pmd/cmdline_flow.c | 126 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index dafb07f..53709fe 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include <rte_ethdev.h>
 #include <rte_byteorder.h>
 #include <cmdline_parse.h>
+#include <cmdline_parse_etheraddr.h>
 #include <rte_flow.h>
 
 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
 	PREFIX,
 	BOOLEAN,
 	STRING,
+	MAC_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -114,6 +116,13 @@ enum index {
 	ITEM_RAW_OFFSET,
 	ITEM_RAW_LIMIT,
 	ITEM_RAW_PATTERN,
+	ITEM_ETH,
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_VLAN,
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -238,6 +247,14 @@ struct token {
 		.size = (sz), \
 	})
 
+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -329,6 +346,8 @@ static const enum index next_item[] = {
 	ITEM_VF,
 	ITEM_PORT,
 	ITEM_RAW,
+	ITEM_ETH,
+	ITEM_VLAN,
 	ZERO,
 };
 
@@ -360,6 +379,21 @@ static const enum index item_raw[] = {
 	ZERO,
 };
 
+static const enum index item_eth[] = {
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vlan[] = {
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -402,6 +436,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+			  const char *, unsigned int,
+			  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -465,6 +502,13 @@ static const struct token token_list[] = {
 		.call = parse_string,
 		.comp = comp_none,
 	},
+	[MAC_ADDR] = {
+		.name = "{MAC address}",
+		.type = "MAC-48",
+		.help = "standard MAC address notation",
+		.call = parse_mac_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -755,6 +799,50 @@ static const struct token token_list[] = {
 					    pattern,
 					    ITEM_RAW_PATTERN_SIZE)),
 	},
+	[ITEM_ETH] = {
+		.name = "eth",
+		.help = "match Ethernet header",
+		.priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+		.next = NEXT(item_eth),
+		.call = parse_vc,
+	},
+	[ITEM_ETH_DST] = {
+		.name = "dst",
+		.help = "destination MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+	},
+	[ITEM_ETH_SRC] = {
+		.name = "src",
+		.help = "source MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+	},
+	[ITEM_ETH_TYPE] = {
+		.name = "type",
+		.help = "EtherType",
+		.next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+	},
+	[ITEM_VLAN] = {
+		.name = "vlan",
+		.help = "match 802.1Q/ad VLAN tag",
+		.priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+		.next = NEXT(item_vlan),
+		.call = parse_vc,
+	},
+	[ITEM_VLAN_TPID] = {
+		.name = "tpid",
+		.help = "tag protocol identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
+	},
+	[ITEM_VLAN_TCI] = {
+		.name = "tci",
+		.help = "tag control information",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1388,6 +1476,44 @@ parse_string(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a MAC address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_mac_addr(struct context *ctx, const struct token *token,
+	       const char *str, unsigned int len,
+	       void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	struct ether_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	ret = cmdline_parse_etheraddr(NULL, str, &tmp, size);
+	if (ret < 0 || (unsigned int)ret != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH v2 21/25] app/testpmd: add items ipv4/ipv6 to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (19 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 22/25] app/testpmd: add L4 items " Adrien Mazarguil
                         ` (6 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
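Note for readers: tokens handed to the parser are not NUL-terminated,
while inet_pton() expects a C string, hence the temporary copy in
parse_ipv4_addr(). A standalone sketch of that step follows. The name
token_to_ipv4 is hypothetical, the fixed-size buffer is a choice made for
this sketch (the patch uses a variable-length array), and the integer
fallback of the patch is left out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse the first len bytes of str as a dotted-quad IPv4 address.
 * Returns len on success and -1 on failure; on success out receives the
 * address in network byte order.
 */
int
token_to_ipv4(const char *str, unsigned int len, uint8_t out[4])
{
	char tmp[INET_ADDRSTRLEN];
	struct in_addr addr;

	assert(str != NULL && out != NULL);
	/* A token that cannot fit with its terminator is not an address. */
	if (len >= sizeof(tmp))
		return -1;
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	if (inet_pton(AF_INET, tmp, &addr) != 1)
		return -1;
	memcpy(out, &addr, sizeof(addr));
	return (int)len;
}
```

The same approach should carry over to IPv6 with AF_INET6, struct
in6_addr and INET6_ADDRSTRLEN.
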
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 53709fe..c2725a5 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -123,6 +126,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -348,6 +357,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	ZERO,
 };
 
@@ -394,6 +405,20 @@ static const enum index item_vlan[] = {
 	ZERO,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -439,6 +464,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -509,6 +540,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -843,6 +888,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1514,6 +1601,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH v2 22/25] app/testpmd: add L4 items to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (20 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 23/25] app/testpmd: add various actions " Adrien Mazarguil
                         ` (5 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
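Note for readers: the VXLAN network identifier is a 3-byte field, which is
why parse_int() gains a uint8_t[3] case. Here is a minimal sketch of the
network-order path only. The helper names are hypothetical, and the
host-order branch the patch keeps for little-endian targets is not
reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low 24 bits of u into out[0..2], most significant byte first. */
void
store_u24_be(uint8_t out[3], uint32_t u)
{
	assert(out != NULL);
	out[0] = (uint8_t)(u >> 16);
	out[1] = (uint8_t)(u >> 8);
	out[2] = (uint8_t)u;
}

/* Inverse of store_u24_be(): rebuild the value from three bytes. */
uint32_t
load_u24_be(const uint8_t in[3])
{
	assert(in != NULL);
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

Bits above the 24th are silently dropped by the store helper, so a caller
that cares should range-check the value first.
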
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c2725a5..a340a75 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -132,6 +132,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -359,6 +373,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	ZERO,
 };
 
@@ -419,6 +438,40 @@ static const enum index item_ipv6[] = {
 	ZERO,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -930,6 +983,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1491,6 +1641,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v2 23/25] app/testpmd: add various actions to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (21 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 22/25] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 24/25] app/testpmd: add queue " Adrien Mazarguil
                         ` (4 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
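Note for readers: the "original" field of the VF action takes a BOOLEAN
token, resolved through a name table in which even indices stand for false
and odd ones for true. Below is a rough standalone sketch of that lookup.
The demo_* names are hypothetical, and requiring the token length to match
the name exactly is an assumption of this sketch: the series compares with
strncmp() over the token length only and falls back to integer parsing
when no name matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Boolean names (even indices stand for false). */
static const char *const demo_boolean_name[] = {
	"0", "1",
	"false", "true",
	"no", "yes",
	"N", "Y",
	NULL,
};

/* Return 0 or 1 for a recognized token of len bytes, -1 otherwise. */
int
demo_parse_boolean(const char *str, size_t len)
{
	size_t i;

	assert(str != NULL);
	for (i = 0; demo_boolean_name[i] != NULL; ++i)
		if (strlen(demo_boolean_name[i]) == len &&
		    !strncmp(str, demo_boolean_name[i], len))
			return (int)(i & 1);
	return -1;
}
```
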
 app/test-pmd/cmdline_flow.c | 121 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a340a75..90712bf 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -153,6 +153,15 @@ enum index {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_MARK_ID,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
 };
 
 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -476,6 +485,25 @@ static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ZERO,
+};
+
+static const enum index action_mark[] = {
+	ACTION_MARK_ID,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_vf[] = {
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
+	ACTION_NEXT,
 	ZERO,
 };
 
@@ -487,6 +515,8 @@ static int parse_vc(struct context *, const struct token *,
 		    void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1112,6 +1142,70 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_MARK] = {
+		.name = "mark",
+		.help = "attach 32 bit value to packets",
+		.priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+		.next = NEXT(action_mark),
+		.call = parse_vc,
+	},
+	[ACTION_MARK_ID] = {
+		.name = "id",
+		.help = "32 bit value to return with packets",
+		.next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_FLAG] = {
+		.name = "flag",
+		.help = "flag packets",
+		.priv = PRIV_ACTION(FLAG, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_DROP] = {
+		.name = "drop",
+		.help = "drop packets (note: passthru has priority)",
+		.priv = PRIV_ACTION(DROP, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_COUNT] = {
+		.name = "count",
+		.help = "enable counters for this rule",
+		.priv = PRIV_ACTION(COUNT, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PF] = {
+		.name = "pf",
+		.help = "redirect packets to physical device function",
+		.priv = PRIV_ACTION(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_VF] = {
+		.name = "vf",
+		.help = "redirect packets to virtual device function",
+		.priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+		.next = NEXT(action_vf),
+		.call = parse_vc,
+	},
+	[ACTION_VF_ORIGINAL] = {
+		.name = "original",
+		.help = "use original VF ID if possible",
+		.next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+					   original, 1)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_VF_ID] = {
+		.name = "id",
+		.help = "VF ID to redirect packets to",
+		.next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -1435,6 +1529,33 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse action configuration field. */
+static int
+parse_vc_conf(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Update configuration pointer. */
+	action->conf = ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH v2 24/25] app/testpmd: add queue actions to flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (22 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 23/25] app/testpmd: add various actions " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 25/25] doc: describe testpmd " Adrien Mazarguil
                         ` (3 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
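Note for readers: parse_vc_action_rss_queue() keeps the number of queues
parsed so far in the upper 16 bits of ctx->objdata and leaves the lower 16
bits alone. A small sketch of that bookkeeping follows, assuming a 32-bit
word. The demo_* names are hypothetical, and returning the word unchanged
when the table is full is a simplification (the patch rejects the token).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors ACTION_RSS_NUM from the patch. */
#define DEMO_RSS_NUM 32

/* Number of queues recorded so far (upper 16 bits). */
unsigned int
demo_queue_count(uint32_t objdata)
{
	return objdata >> 16;
}

/* Record one more queue; the word is returned unchanged once full. */
uint32_t
demo_queue_added(uint32_t objdata)
{
	unsigned int n = demo_queue_count(objdata);

	if (n >= DEMO_RSS_NUM)
		return objdata;
	++n;
	/* The counter must fit in the upper 16 bits. */
	assert(n <= 0xffff);
	return (uint32_t)n << 16 | (objdata & 0xffff);
}

/* "end" token: drop the counter and keep the lower half. */
uint32_t
demo_queue_end(uint32_t objdata)
{
	return objdata & 0xffff;
}
```
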
 app/test-pmd/cmdline_flow.c | 152 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 90712bf..2376b8f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -156,8 +156,15 @@ enum index {
 	ACTION_MARK,
 	ACTION_MARK_ID,
 	ACTION_FLAG,
+	ACTION_QUEUE,
+	ACTION_QUEUE_INDEX,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_DUP_INDEX,
+	ACTION_RSS,
+	ACTION_RSS_QUEUES,
+	ACTION_RSS_QUEUE,
 	ACTION_PF,
 	ACTION_VF,
 	ACTION_VF_ORIGINAL,
@@ -171,6 +178,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
 
+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+	(offsetof(struct rte_flow_action_rss, queue) + \
+	 sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -487,8 +502,11 @@ static const enum index next_action[] = {
 	ACTION_PASSTHRU,
 	ACTION_MARK,
 	ACTION_FLAG,
+	ACTION_QUEUE,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_RSS,
 	ACTION_PF,
 	ACTION_VF,
 	ZERO,
@@ -500,6 +518,24 @@ static const enum index action_mark[] = {
 	ZERO,
 };
 
+static const enum index action_queue[] = {
+	ACTION_QUEUE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_dup[] = {
+	ACTION_DUP_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_rss[] = {
+	ACTION_RSS_QUEUES,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static const enum index action_vf[] = {
 	ACTION_VF_ORIGINAL,
 	ACTION_VF_ID,
@@ -517,6 +553,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+				     const char *, unsigned int, void *,
+				     unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -566,6 +605,8 @@ static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
 			unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+				    unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1163,6 +1204,21 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_QUEUE] = {
+		.name = "queue",
+		.help = "assign packets to a given queue index",
+		.priv = PRIV_ACTION(QUEUE,
+				    sizeof(struct rte_flow_action_queue)),
+		.next = NEXT(action_queue),
+		.call = parse_vc,
+	},
+	[ACTION_QUEUE_INDEX] = {
+		.name = "index",
+		.help = "queue index to use",
+		.next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_DROP] = {
 		.name = "drop",
 		.help = "drop packets (note: passthru has priority)",
@@ -1177,6 +1233,39 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_DUP] = {
+		.name = "dup",
+		.help = "duplicate packets to a given queue index",
+		.priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+		.next = NEXT(action_dup),
+		.call = parse_vc,
+	},
+	[ACTION_DUP_INDEX] = {
+		.name = "index",
+		.help = "queue index to duplicate packets to",
+		.next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_dup, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS] = {
+		.name = "rss",
+		.help = "spread packets among several queues",
+		.priv = PRIV_ACTION(RSS, ACTION_RSS_SIZE),
+		.next = NEXT(action_rss),
+		.call = parse_vc,
+	},
+	[ACTION_RSS_QUEUES] = {
+		.name = "queues",
+		.help = "queue indices to use",
+		.next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS_QUEUE] = {
+		.name = "{queue}",
+		.help = "queue index",
+		.call = parse_vc_action_rss_queue,
+		.comp = comp_vc_action_rss_queue,
+	},
 	[ACTION_PF] = {
 		.name = "pf",
 		.help = "redirect packets to physical device function",
@@ -1556,6 +1645,51 @@ parse_vc_conf(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/**
+ * Parse queue field for RSS action.
+ *
+ * Valid tokens are queue indices and the "end" token.
+ */
+static int
+parse_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	static const enum index next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
+	int ret;
+	int i;
+
+	(void)token;
+	(void)buf;
+	(void)size;
+	if (ctx->curr != ACTION_RSS_QUEUE)
+		return -1;
+	i = ctx->objdata >> 16;
+	if (!strncmp(str, "end", len)) {
+		ctx->objdata &= 0xffff;
+		return len;
+	}
+	if (i >= ACTION_RSS_NUM)
+		return -1;
+	if (push_args(ctx, ARGS_ENTRY(struct rte_flow_action_rss, queue[i])))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	++i;
+	ctx->objdata = i << 16 | (ctx->objdata & 0xffff);
+	/* Repeat token. */
+	if (ctx->next_num == RTE_DIM(ctx->next))
+		return -1;
+	ctx->next[ctx->next_num++] = next;
+	if (!ctx->object)
+		return len;
+	((struct rte_flow_action_rss *)ctx->object)->num = i;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -2130,6 +2264,24 @@ comp_rule_id(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete queue field for RSS action. */
+static int
+comp_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			 unsigned int ent, char *buf, unsigned int size)
+{
+	static const char *const str[] = { "", "end", NULL };
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; str[i] != NULL; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", str[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
-- 
2.1.4

* [dpdk-dev] [PATCH v2 25/25] doc: describe testpmd flow command
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (23 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 24/25] app/testpmd: add queue " Adrien Mazarguil
@ 2016-12-16 16:25       ` Adrien Mazarguil
  2016-12-17 22:06       ` [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow) Olga Shern
                         ` (2 subsequent siblings)
  27 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-16 16:25 UTC (permalink / raw)
  To: dev

Document syntax, interaction with rte_flow and provide usage examples.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 612 +++++++++++++++++++++++
 1 file changed, 612 insertions(+)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index f1c269a..50cba16 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1631,6 +1631,9 @@ Filter Functions
 
 This section details the available filter functions that are available.
 
+Note these functions interface with the deprecated legacy filtering framework,
+superseded by *rte_flow*. See `Flow rules management`_.
+
 ethertype_filter
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -2041,3 +2044,612 @@ Set different GRE key length for input set::
 For example to set GRE key length for input set to 4 bytes on port 0::
 
    testpmd> global_config 0 gre-key-len 4
+
+
+.. _testpmd_rte_flow:
+
+Flow rules management
+---------------------
+
+Control of the generic flow API (*rte_flow*) is fully exposed through the
+``flow`` command (validation, creation, destruction and queries).
+
+Considering *rte_flow* overlaps with all `Filter Functions`_, using both
+features simultaneously may cause undefined side-effects and is therefore
+not recommended.
+
+``flow`` syntax
+~~~~~~~~~~~~~~~
+
+Because the ``flow`` command uses dynamic tokens to handle the large number
+of possible flow rules combinations, its behavior differs slightly from
+other commands, in particular:
+
+- Pressing *?* or the *<tab>* key displays contextual help for the current
+  token, not that of the entire command.
+
+- Optional and repeated parameters are supported (provided they are listed
+  in the contextual help).
+
+The first parameter stands for the operation mode. Possible operations and
+their general syntax are described below. They are covered in detail in the
+following sections.
+
+- Check whether a flow rule can be created::
+
+   flow validate {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Create a flow rule::
+
+   flow create {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Destroy specific flow rules::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+- Destroy all flow rules::
+
+   flow flush {port_id}
+
+- Query an existing flow rule::
+
+   flow query {port_id} {rule_id} {action}
+
+- List existing flow rules sorted by priority, filtered by group
+  identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+Validating flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow validate`` reports whether a flow rule would be accepted by the
+underlying device in its current state but stops short of creating it. It is
+bound to ``rte_flow_validate()``::
+
+ flow validate {port_id}
+     [group {group_id}] [priority {level}] [ingress] [egress]
+     pattern {item} [/ {item} [...]] / end
+     actions {action} [/ {action} [...]] / end
+
+If successful, it will show::
+
+ Flow rule validated
+
+Otherwise it will show an error message of the form::
+
+ Caught error type [...] ([...]): [...]
+
+This command uses the same parameters as ``flow create``; their format is
+described in `Creating flow rules`_.
+
+Check whether redirecting any Ethernet packet received on port 0 to RX queue
+index 6 is supported::
+
+ testpmd> flow validate 0 ingress pattern eth / end
+     actions queue index 6 / end
+ Flow rule validated
+ testpmd>
+
+Port 0 does not support TCPv6 rules::
+
+ testpmd> flow validate 0 ingress pattern eth / ipv6 / tcp / end
+     actions drop / end
+ Caught error type 9 (specific pattern item): Invalid argument.
+ testpmd>
+
+Creating flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow create`` validates and creates the specified flow rule. It is bound
+to ``rte_flow_create()``::
+
+ flow create {port_id}
+     [group {group_id}] [priority {level}] [ingress] [egress]
+     pattern {item} [/ {item} [...]] / end
+     actions {action} [/ {action} [...]] / end
+
+If successful, it will return a flow rule ID usable with other commands::
+
+ Flow rule #[...] created
+
+Otherwise it will show an error message of the form::
+
+ Caught error type [...] ([...]): [...]
+
+Parameters describe the following, in order:
+
+- Attributes (*group*, *priority*, *ingress*, *egress* tokens).
+- A matching pattern, starting with the *pattern* token and terminated by an
+  *end* pattern item.
+- Actions, starting with the *actions* token and terminated by an *end*
+  action.
+
+These translate directly to *rte_flow* objects provided as-is to the
+underlying functions.
+
+The shortest valid definition only comprises mandatory tokens::
+
+ testpmd> flow create 0 pattern end actions end
+
+Note that PMDs may refuse rules that essentially do nothing such as this
+one.
+
+**All unspecified object values are automatically initialized to 0.**
+
+Attributes
+^^^^^^^^^^
+
+These tokens affect flow rule attributes (``struct rte_flow_attr``) and are
+specified before the ``pattern`` token.
+
+- ``group {group_id}``: priority group.
+- ``priority {level}``: priority level within group.
+- ``ingress``: rule applies to ingress traffic.
+- ``egress``: rule applies to egress traffic.
+
+Each instance of an attribute specified several times overrides the previous
+value as shown below (group 4 is used)::
+
+ testpmd> flow create 0 group 42 group 24 group 4 [...]
+
+Note that once enabled, ``ingress`` and ``egress`` cannot be disabled.
+
+While not specifying a direction is an error, some rules may allow both
+simultaneously.
+
+Most rules affect RX and therefore contain the ``ingress`` token::
+
+ testpmd> flow create 0 ingress pattern [...]
+
+Matching pattern
+^^^^^^^^^^^^^^^^
+
+A matching pattern starts after the ``pattern`` token. It is made of pattern
+items and is terminated by a mandatory ``end`` item.
+
+Items are named after their type (*RTE_FLOW_ITEM_TYPE_* from ``enum
+rte_flow_item_type``).
+
+The ``/`` token is used as a separator between pattern items as shown
+below::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end [...]
+
+Note that protocol items like these must be stacked from lowest to highest
+layer to make sense. For instance, the following rule is either invalid or
+unlikely to match any packet::
+
+ testpmd> flow create 0 ingress pattern eth / udp / ipv4 / end [...]
+
+More information on these restrictions can be found in the *rte_flow*
+documentation.
+
+Several items support additional specification structures, for example
+``ipv4`` allows specifying source and destination addresses as follows::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 src is 10.1.1.1
+     dst is 10.2.0.0 / end [...]
+
+This rule matches all IPv4 traffic with the specified properties.
+
+In this example, ``src`` and ``dst`` are field names of the underlying
+``struct rte_flow_item_ipv4`` object. All item properties can be specified
+in a similar fashion.
+
+The ``is`` token means that the subsequent value must be matched exactly,
+and assigns ``spec`` and ``mask`` fields in ``struct rte_flow_item``
+accordingly. Possible assignment tokens are:
+
+- ``is``: match value perfectly (with full bit-mask).
+- ``spec``: match value according to configured bit-mask.
+- ``last``: specify upper bound to establish a range.
+- ``mask``: specify bit-mask with relevant bits set to one.
+- ``prefix``: generate bit-mask from a prefix length.
+
+These yield identical results::
+
+ ipv4 src is 10.1.1.1
+
+::
+
+ ipv4 src spec 10.1.1.1 src mask 255.255.255.255
+
+::
+
+ ipv4 src spec 10.1.1.1 src prefix 32
+
+::
+
+ ipv4 src is 10.1.1.1 src last 10.1.1.1 # range with a single value
+
+::
+
+ ipv4 src is 10.1.1.1 src last 0 # 0 disables range
+
+Inclusive ranges can be defined with ``last``::
+
+ ipv4 src is 10.1.1.1 src last 10.2.3.4 # 10.1.1.1 to 10.2.3.4
+
+Note that ``mask`` affects both ``spec`` and ``last``::
+
+ ipv4 src is 10.1.1.1 src last 10.2.3.4 src mask 255.255.0.0
+    # matches 10.1.0.0 to 10.2.255.255
+
+Properties can be modified multiple times::
+
+ ipv4 src is 10.1.1.1 src is 10.1.2.3 src is 10.2.3.4 # matches 10.2.3.4
+
+::
+
+ ipv4 src is 10.1.1.1 src prefix 24 src prefix 16 # matches 10.1.0.0/16
+
+Pattern items
+^^^^^^^^^^^^^
+
+This section lists supported pattern items and their attributes, if any.
+
+- ``end``: end list of pattern items.
+
+- ``void``: no-op pattern item.
+
+- ``invert``: perform actions when pattern does not match.
+
+- ``any``: match any protocol for the current layer.
+
+  - ``num {unsigned}``: number of layers covered.
+
+- ``pf``: match packets addressed to the physical function.
+
+- ``vf``: match packets addressed to a virtual function ID.
+
+  - ``id {unsigned}``: destination VF ID.
+
+- ``port``: device-specific physical port index to use.
+
+  - ``index {unsigned}``: physical port index.
+
+- ``raw``: match an arbitrary byte string.
+
+  - ``relative {boolean}``: look for pattern after the previous item.
+  - ``search {boolean}``: search pattern from offset (see also limit).
+  - ``offset {integer}``: absolute or relative offset for pattern.
+  - ``limit {unsigned}``: search area limit for start of pattern.
+  - ``pattern {string}``: byte string to look for.
+
+- ``eth``: match Ethernet header.
+
+  - ``dst {MAC-48}``: destination MAC.
+  - ``src {MAC-48}``: source MAC.
+  - ``type {unsigned}``: EtherType.
+
+- ``vlan``: match 802.1Q/ad VLAN tag.
+
+  - ``tpid {unsigned}``: tag protocol identifier.
+  - ``tci {unsigned}``: tag control information.
+
+- ``ipv4``: match IPv4 header.
+
+  - ``src {ipv4 address}``: source address.
+  - ``dst {ipv4 address}``: destination address.
+
+- ``ipv6``: match IPv6 header.
+
+  - ``src {ipv6 address}``: source address.
+  - ``dst {ipv6 address}``: destination address.
+
+- ``icmp``: match ICMP header.
+
+  - ``type {unsigned}``: ICMP packet type.
+  - ``code {unsigned}``: ICMP packet code.
+
+- ``udp``: match UDP header.
+
+  - ``src {unsigned}``: UDP source port.
+  - ``dst {unsigned}``: UDP destination port.
+
+- ``tcp``: match TCP header.
+
+  - ``src {unsigned}``: TCP source port.
+  - ``dst {unsigned}``: TCP destination port.
+
+- ``sctp``: match SCTP header.
+
+  - ``src {unsigned}``: SCTP source port.
+  - ``dst {unsigned}``: SCTP destination port.
+
+- ``vxlan``: match VXLAN header.
+
+  - ``vni {unsigned}``: VXLAN identifier.
+
+Actions list
+^^^^^^^^^^^^
+
+A list of actions starts after the ``actions`` token in the same fashion as
+`Matching pattern`_; actions are separated by ``/`` tokens and the list is
+terminated by a mandatory ``end`` action.
+
+Actions are named after their type (*RTE_FLOW_ACTION_TYPE_* from ``enum
+rte_flow_action_type``).
+
+Dropping all incoming UDPv4 packets can be expressed as follows::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions drop / end
+
+Several actions have configurable properties which must be specified when
+there is no valid default value. For example, ``queue`` requires a target
+queue index.
+
+This rule redirects incoming UDPv4 traffic to queue index 6::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions queue index 6 / end
+
+While this one could be rejected by PMDs (unspecified queue index)::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions queue / end
+
+As defined by *rte_flow*, the list is not ordered; all actions of a given
+rule are performed simultaneously. These are equivalent::
+
+ queue index 6 / void / mark id 42 / end
+
+::
+
+ void / mark id 42 / queue index 6 / end
+
+All actions in a list should have different types, otherwise only the last
+action of a given type is taken into account::
+
+ queue index 4 / queue index 5 / queue index 6 / end # will use queue 6
+
+::
+
+ drop / drop / drop / end # drop is performed only once
+
+::
+
+ mark id 42 / queue index 3 / mark id 24 / end # mark will be 24
+
+Considering they are performed simultaneously, opposite and overlapping
+actions can sometimes be combined when the end result is unambiguous::
+
+ drop / queue index 6 / end # drop has no effect
+
+::
+
+ drop / dup index 6 / end # same as above
+
+::
+
+ queue index 6 / rss queues 6 7 8 end / end # queue has no effect
+
+::
+
+ drop / passthru / end # drop has no effect
+
+Note that PMDs may still refuse such combinations.
+
+Actions
+^^^^^^^
+
+This section lists supported actions and their attributes, if any.
+
+- ``end``: end list of actions.
+
+- ``void``: no-op action.
+
+- ``passthru``: let subsequent rule process matched packets.
+
+- ``mark``: attach 32 bit value to packets.
+
+  - ``id {unsigned}``: 32 bit value to return with packets.
+
+- ``flag``: flag packets.
+
+- ``queue``: assign packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to use.
+
+- ``drop``: drop packets (note: passthru has priority).
+
+- ``count``: enable counters for this rule.
+
+- ``dup``: duplicate packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to duplicate packets to.
+
+- ``rss``: spread packets among several queues.
+
+  - ``queues [{unsigned} [...]] end``: queue indices to use.
+
+- ``pf``: redirect packets to physical device function.
+
+- ``vf``: redirect packets to virtual device function.
+
+  - ``original {boolean}``: use original VF ID if possible.
+  - ``id {unsigned}``: VF ID to redirect packets to.
+
+Destroying flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow destroy`` destroys one or more rules from their rule ID (as returned
+by ``flow create``). This command calls ``rte_flow_destroy()`` as many
+times as necessary::
+
+ flow destroy {port_id} rule {rule_id} [...]
+
+If successful, it will show::
+
+ Flow rule #[...] destroyed
+
+It does not report anything for rule IDs that do not exist. The usual error
+message is shown when a rule cannot be destroyed::
+
+ Caught error type [...] ([...]): [...]
+
+``flow flush`` destroys all rules on a device and does not take extra
+arguments. It is bound to ``rte_flow_flush()``::
+
+ flow flush {port_id}
+
+Any errors are reported as above.
+
+Creating several rules and destroying them::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow destroy 0 rule 0 rule 1
+ Flow rule #1 destroyed
+ Flow rule #0 destroyed
+ testpmd>
+
+The same result can be achieved using ``flow flush``::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow flush 0
+ testpmd>
+
+Non-existent rule IDs are ignored::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow destroy 0 rule 42 rule 10 rule 2
+ testpmd>
+ testpmd> flow destroy 0 rule 0
+ Flow rule #0 destroyed
+ testpmd>
+
+Querying flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow query`` queries a specific action of a flow rule having that
+ability. Such actions collect information that can be reported using this
+command. It is bound to ``rte_flow_query()``::
+
+ flow query {port_id} {rule_id} {action}
+
+If successful, it will display either the retrieved data for known actions
+or the following message::
+
+ Cannot display result for action type [...] ([...])
+
+Otherwise, it will complain either that the rule does not exist or that some
+error occurred::
+
+ Flow rule #[...] not found
+
+::
+
+ Caught error type [...] ([...]): [...]
+
+Currently only the ``count`` action is supported. This action reports the
+number of packets that hit the flow rule and the total number of bytes. Its
+output has the following format::
+
+ count:
+  hits_set: [...] # whether "hits" contains a valid value
+  bytes_set: [...] # whether "bytes" contains a valid value
+  hits: [...] # number of packets
+  bytes: [...] # number of bytes
+
+Querying counters for TCPv6 packets redirected to queue 6::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / tcp / end
+     actions queue index 6 / count / end
+ Flow rule #4 created
+ testpmd> flow query 0 4 count
+ count:
+  hits_set: 1
+  bytes_set: 0
+  hits: 386446
+  bytes: 0
+ testpmd>
+
+Listing flow rules
+~~~~~~~~~~~~~~~~~~
+
+``flow list`` lists existing flow rules sorted by priority and optionally
+filtered by group identifiers::
+
+ flow list {port_id} [group {group_id}] [...]
+
+This command only fails with the following message if the device does not
+exist::
+
+ Invalid port [...]
+
+Output consists of a header line followed by a short description of each
+flow rule, one per line. There is no output at all when no flow rules are
+configured on the device::
+
+ ID      Group   Prio    Attr    Rule
+ [...]   [...]   [...]   [...]   [...]
+
+``Attr`` column flags:
+
+- ``i`` for ``ingress``.
+- ``e`` for ``egress``.
+
+Creating several flow rules and listing them::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 6 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #1 created
+ testpmd> flow create 0 priority 5 ingress pattern eth / ipv4 / udp / end
+     actions rss queues 6 7 8 end / end
+ Flow rule #2 created
+ testpmd> flow list 0
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH IPV4 => QUEUE
+ 1       0       0       i-      ETH IPV6 => QUEUE
+ 2       0       5       i-      ETH IPV4 UDP => RSS
+ testpmd>
+
+Rules are sorted by priority (i.e. group ID first, then priority level)::
+
+ testpmd> flow list 1
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH => COUNT
+ 6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+ 5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+ 1       24      0       i-      ETH IPV4 UDP => QUEUE
+ 4       24      10      i-      ETH IPV4 TCP => DROP
+ 3       24      20      i-      ETH IPV4 => DROP
+ 2       24      42      i-      ETH IPV4 UDP => QUEUE
+ 7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+ testpmd>
+
+Output can be limited to specific groups::
+
+ testpmd> flow list 1 group 0 group 63
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH => COUNT
+ 6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+ 5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+ 7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+ testpmd>
-- 
2.1.4
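
The "prefix", "mask" and "last" tokens documented in the patch above can be
illustrated in plain C. This is only a sketch of the documented semantics
(a range is the interval between spec & mask and last & mask, and a "last"
of 0 disables the range), not testpmd or PMD code. The IPV4() macro and both
function names are hypothetical, and addresses are kept in host byte order
for readability whereas real rte_flow items use network byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four bytes. */
#define IPV4(a, b, c, d) \
	((uint32_t)(a) << 24 | (uint32_t)(b) << 16 | \
	 (uint32_t)(c) << 8 | (uint32_t)(d))

/* Build a bit-mask from a prefix length (0 to 32). */
static uint32_t
prefix_to_mask(unsigned int prefix)
{
	assert(prefix <= 32);
	/* Shifting a 32-bit value by 32 is undefined, handle 0 apart. */
	return prefix ? (uint32_t)(UINT32_MAX << (32 - prefix)) : 0;
}

/*
 * Inclusive range match: (spec & mask) <= (value & mask) <= (last & mask).
 * A "last" of 0 disables the range, leaving a plain masked comparison.
 */
static bool
ipv4_match(uint32_t value, uint32_t spec, uint32_t last, uint32_t mask)
{
	uint32_t v = value & mask;

	if (!last)
		return v == (spec & mask);
	return v >= (spec & mask) && v <= (last & mask);
}
```

With spec 10.1.1.1, last 10.2.3.4 and a 16-bit prefix, this accepts anything
from 10.1.0.0 to 10.2.255.255, which agrees with the example given in the
documentation.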

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (24 preceding siblings ...)
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 25/25] doc: describe testpmd " Adrien Mazarguil
@ 2016-12-17 22:06       ` Olga Shern
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
  2016-12-21 16:19       ` [dpdk-dev] [PATCH v2 00/25] " Simon Horman
  27 siblings, 0 replies; 262+ messages in thread
From: Olga Shern @ 2016-12-17 22:06 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Friday, December 16, 2016 6:25 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
> 
> As previously discussed in RFC v1 [1], RFC v2 [2], with changes described in
> [3] (also pasted below), here is the first non-draft series for this new API.
> 
> Its capabilities are so generic that its name had to be vague, it may be called
> "Generic flow API", "Generic flow interface" (possibly shortened as "GFI") to
> refer to the name of the new filter type, or "rte_flow" from the prefix used
> for its public symbols. I personally favor the latter.
> 
> While it is currently meant to supersede existing filter types in order for all
> PMDs to expose a common filtering/classification interface, it may eventually
> evolve to cover the following ideas as well:
> 
> - Rx/Tx offloads configuration through automatic offloads for specific
>   packets, e.g. performing checksum on TCP packets could be expressed with
>   an egress rule with a TCP pattern and a kind of checksum action.
> 
> - RSS configuration (already defined actually). Could be global or per rule
>   depending on hardware capabilities.
> 
> - Switching configuration for devices with many physical ports; rules doing
>   both ingress and egress could even be used to completely bypass software
>   if supported by hardware.
> 
>  [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
>  [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
>  [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html
> 
> Changes since v1 series:
> 
> - Added programmer's guide documentation for rte_flow.
> 
> - Added depreciation notice for the legacy API.
> 
> - Documented testpmd flow command.
> 
> - Fixed missing rte_flow_flush symbol in rte_ether_version.map.
> 
> - Cleaned up API documentation in rte_flow.h.
> 
> - Replaced "min/max" parameters with "num" in struct rte_flow_item_any,
> to
>   align behavior with other item definitions.
> 
> - Fixed "type" (EtherType) size in struct rte_flow_item_eth.
> 
> - Renamed "queues" to "num" in struct rte_flow_action_rss.
> 
> - Fixed missing const in rte_flow_error_set() prototype definition.
> 
> - Fixed testpmd flow create command that did not save the rte_flow object
>   pointer, causing crashes.
> 
> - Hopefully fixed all the remaining ICC/clang errors.
> 
> - Replaced testpmd flow command's "fix" token with "is" for clarity.
> 
> Changes since RFC v2:
> 
> - New separate VLAN pattern item (previously part of the ETH definition),
>   found to be much more convenient.
> 
> - Removed useless "any" field from VF pattern item, the same effect can be
>   achieved by not providing a specification structure.
> 
> - Replaced bit-fields from the VXLAN pattern item to avoid endianness
>   conversion issues on 24-bit fields.
> 
> - Updated struct rte_flow_item with a new "last" field to create inclusive
>   ranges. They are defined as the interval between (spec & mask) and
>   (last & mask). All three parameters are optional.
> 
> - Renamed ID action MARK.
> 
> - Renamed "queue" fields in actions QUEUE and DUP to "index".
> 
> - "rss_conf" field in RSS action is now const.
> 
> - VF action now uses a 32 bit ID like its pattern item counterpart.
> 
> - Removed redundant struct rte_flow_pattern, API functions now expect
>   struct
>   rte_flow_item lists terminated by END items.
> 
> - Replaced struct rte_flow_actions for the same reason, with struct
>   rte_flow_action lists terminated by END actions.
> 
> - Error types (enum rte_flow_error_type) have been updated and the cause
>   pointer in struct rte_flow_error is now const.
> 
> - Function prototypes (rte_flow_create, rte_flow_validate) have also been
>   updated for clarity.
> 
> Additions:
> 
> - Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
>   are now implemented in rte_flow.c, with their symbols exported and
>   versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.
> 
> - A separate header (rte_flow_driver.h) has been added for driver-side
>   functionality, in particular struct rte_flow_ops which contains PMD
>   callbacks returned by RTE_ETH_FILTER_GENERIC query.
> 
> - testpmd now exposes most of this API through the new "flow" command.
> 
> What remains to be done:
> 
> - Using endian-aware integer types (rte_beX_t) where necessary for clarity.
> 
> - API documentation (based on RFC).
> 
> - testpmd flow command documentation (although context-aware command
>   completion should already help quite a bit in this regard).
> 
> - A few pattern item / action properties cannot be configured yet
>   (e.g. rss_conf parameter for RSS action) and a few completions
>   (e.g. possible queue IDs) should be added.
> 

Acked-by: Olga Shern <olgas@mellanox.com>

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
@ 2016-12-19  8:37         ` Xing, Beilei
  2016-12-19 10:19           ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Xing, Beilei @ 2016-12-19  8:37 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Pei, Yulong

Hi Adrien,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Saturday, December 17, 2016 12:25 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> support for rte_flow
> 
> Add basic management functions for the generic flow API (validate, create,
> destroy, flush, query and list). Flow rule objects and properties are arranged
> in lists associated with each port.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> +/** Create flow rule. */
> +int
> +port_flow_create(portid_t port_id,
> +		 const struct rte_flow_attr *attr,
> +		 const struct rte_flow_item *pattern,
> +		 const struct rte_flow_action *actions) {
> +	struct rte_flow *flow;
> +	struct rte_port *port;
> +	struct port_flow *pf;
> +	uint32_t id;
> +	struct rte_flow_error error;
> +

I think there should be memset for error here, e.g. memset(&error, 0, sizeof(struct rte_flow_error));
Since both cause and message may be NULL regardless of the error type, if there's no error.cause and error.message returned from PMD, Segmentation fault will happen in port_flow_complain.
PS: This issue doesn't happen if "export EXTRA_CFLAGS='-g -O0'" is added when compiling.
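
A minimal sketch of the defensive variant suggested here, assuming a
simplified stand-in structure rather than the real struct rte_flow_error.
All names below are hypothetical and the message format is shortened
compared to what port_flow_complain() prints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_flow_error, not the real layout. */
struct flow_error {
	int type;
	const void *cause;
	const char *message;
};

/* Zero the structure so unset pointers read as NULL. */
static void
flow_error_init(struct flow_error *error)
{
	assert(error != NULL);
	memset(error, 0, sizeof(*error));
}

/* Format an error without dereferencing a NULL message pointer. */
static int
flow_complain(const struct flow_error *error, char *buf, size_t size)
{
	assert(error != NULL);
	return snprintf(buf, size, "Caught error type %d: %s", error->type,
			error->message ? error->message :
			"(no stated reason)");
}
```

Zeroing alone only helps if the reporting side also tolerates NULL, hence
the fallback string in flow_complain().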

Thanks
Beilei

> +	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
> +	if (!flow)
> +		return port_flow_complain(&error);
> +	port = &ports[port_id];
> +	if (port->flow_list) {
> +		if (port->flow_list->id == UINT32_MAX) {
> +			printf("Highest rule ID is already assigned, delete"
> +			       " it first");
> +			rte_flow_destroy(port_id, flow, NULL);
> +			return -ENOMEM;
> +		}
> +		id = port->flow_list->id + 1;
> +	} else
> +		id = 0;
> +	pf = port_flow_new(attr, pattern, actions);
> +	if (!pf) {
> +		int err = rte_errno;
> +
> +		printf("Cannot allocate flow: %s\n", rte_strerror(err));
> +		rte_flow_destroy(port_id, flow, NULL);
> +		return -err;
> +	}
> +	pf->next = port->flow_list;
> +	pf->id = id;
> +	pf->flow = flow;
> +	port->flow_list = pf;
> +	printf("Flow rule #%u created\n", pf->id);
> +	return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-19  8:37         ` Xing, Beilei
@ 2016-12-19 10:19           ` Adrien Mazarguil
  2016-12-20  1:57             ` Xing, Beilei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 10:19 UTC (permalink / raw)
  To: Xing, Beilei; +Cc: dev, Pei, Yulong

Hi Beilei,

On Mon, Dec 19, 2016 at 08:37:20AM +0000, Xing, Beilei wrote:
> Hi Adrien,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Saturday, December 17, 2016 12:25 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> > support for rte_flow
> > 
> > Add basic management functions for the generic flow API (validate, create,
> > destroy, flush, query and list). Flow rule objects and properties are arranged
> > in lists associated with each port.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > +/** Create flow rule. */
> > +int
> > +port_flow_create(portid_t port_id,
> > +		 const struct rte_flow_attr *attr,
> > +		 const struct rte_flow_item *pattern,
> > +		 const struct rte_flow_action *actions) {
> > +	struct rte_flow *flow;
> > +	struct rte_port *port;
> > +	struct port_flow *pf;
> > +	uint32_t id;
> > +	struct rte_flow_error error;
> > +
> 
> I think there should be memset for error here, e.g. memset(&error, 0, sizeof(struct rte_flow_error));
> Since both cause and message may be NULL regardless of the error type, if there's no error.cause and error.message returned from PMD, Segmentation fault will happen in port_flow_complain.
> PS: This issue doesn't happen if add "export EXTRA_CFLAGS=' -g O0'" when compiling.

Actually, PMDs must fill the error structure only in case of error if the
application provides one, it's not optional. I didn't initialize this
structure for this reason.

I suggest we initialize it with a known poisoning value for debugging
purposes though, to make it fail every time. Does it sound reasonable?
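
Something along these lines, as a rough sketch only. The structure is a
simplified stand-in for struct rte_flow_error, and the poison byte and
helper names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Arbitrary poison byte, assumed for this sketch. */
#define ERROR_POISON 0x22

/* Hypothetical stand-in for struct rte_flow_error, not the real layout. */
struct flow_error {
	int type;
	const void *cause;
	const char *message;
};

/* Fill the structure with a recognizable pattern before calling a PMD. */
static void
flow_error_poison(struct flow_error *error)
{
	assert(error != NULL);
	memset(error, ERROR_POISON, sizeof(*error));
}

/* Tell whether the message pointer was left untouched by the PMD. */
static bool
flow_error_message_poisoned(const struct flow_error *error)
{
	unsigned char ref[sizeof(error->message)];

	memset(ref, ERROR_POISON, sizeof(ref));
	return !memcmp(&error->message, ref, sizeof(ref));
}
```

A PMD that reports failure without updating the structure leaves the
pattern in place, so the omission shows up on every run instead of
depending on whatever happened to be on the stack.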

> > +	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
> > +	if (!flow)
> > +		return port_flow_complain(&error);
> > +	port = &ports[port_id];
> > +	if (port->flow_list) {
> > +		if (port->flow_list->id == UINT32_MAX) {
> > +			printf("Highest rule ID is already assigned, delete"
> > +			       " it first");
> > +			rte_flow_destroy(port_id, flow, NULL);
> > +			return -ENOMEM;
> > +		}
> > +		id = port->flow_list->id + 1;
> > +	} else
> > +		id = 0;
> > +	pf = port_flow_new(attr, pattern, actions);
> > +	if (!pf) {
> > +		int err = rte_errno;
> > +
> > +		printf("Cannot allocate flow: %s\n", rte_strerror(err));
> > +		rte_flow_destroy(port_id, flow, NULL);
> > +		return -err;
> > +	}
> > +	pf->next = port->flow_list;
> > +	pf->id = id;
> > +	pf->flow = flow;
> > +	port->flow_list = pf;
> > +	printf("Flow rule #%u created\n", pf->id);
> > +	return 0;
> > +}
> > +

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-19 10:45         ` Mcnamara, John
  2016-12-19 11:10           ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Mcnamara, John @ 2016-12-19 10:45 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Friday, December 16, 2016 4:25 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide
> 
> This documentation is based on the latest RFC submission, subsequently
> updated according to feedback from the community.

Hi,

Thanks. That is a very good doc.

A few RST comments.

Section headers should use the following underline formats:

    Level 1 Heading
    ===============


    Level 2 Heading
    ---------------


    Level 3 Heading
    ~~~~~~~~~~~~~~~


    Level 4 Heading
    ^^^^^^^^^^^^^^^

See: http://dpdk.org/doc/guides/contributing/documentation.html#rst-guidelines


Also, some of the section headers for Attributes, Patterns, Match and Action
are a bit short and it isn't clear what section you are in, especially in the
PDF doc. It might be clearer to add the section name before each item like:


    Attribute: Group
    ~~~~~~~~~~~~~~~~

    Match: VOID
    ~~~~~~~~~~~


Tables should have a reference link and a caption, like this:

    .. _table_qos_pipes:

    .. table:: Sample configuration for QOS pipes.

       +----------+----------+----------+
       | Header 1 | Header 2 | Header 3 |
       |          |          |          |
       +==========+==========+==========+
       | Text     | Text     | Text     |
       +----------+----------+----------+
       | ...      | ...      | ...      |
       +----------+----------+----------+


See: http://dpdk.org/doc/guides/contributing/documentation.html#tables

This will make the tables clearer when there are several in a row and will allow
the text to refer to them with :numref:.

Also, there is one typo:

s/unpractically/impractically/


Otherwise, very good work. A good clear document.

John


* Re: [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy filter types
  2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy filter types Adrien Mazarguil
@ 2016-12-19 10:47         ` Mcnamara, John
  0 siblings, 0 replies; 262+ messages in thread
From: Mcnamara, John @ 2016-12-19 10:47 UTC (permalink / raw)
  To: Adrien Mazarguil, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Friday, December 16, 2016 4:25 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy
> filter types
> 
> They are superseded by the generic flow API (rte_flow). Target release is
> not defined yet.
> 
> Suggested-by: Kevin Traynor <ktraynor@redhat.com>
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  doc/guides/rel_notes/deprecation.rst | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 2d17bc6..4819078 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -71,3 +71,10 @@ Deprecation Notices
>  * mempool: The functions for single/multi producer/consumer are deprecated
>    and will be removed in 17.02.
>    It is replaced by ``rte_mempool_generic_get/put`` functions.
> +
> +* ethdev: the legacy filter API, including rte_eth_dev_filter_supported(),
> +  rte_eth_dev_filter_ctrl() as well as filter types MACVLAN, ETHERTYPE,
> +  FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR, HASH and L2_TUNNEL, is superseded by
> +  the generic flow API (rte_flow) in PMDs that implement the latter.
> +  Target release for removal of the legacy API will be defined once most
> +  PMDs have switched to rte_flow.


All the function names and constants should be fixed width quoted:
``rte_eth_dev_filter_supported()``.


* Re: [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide
  2016-12-19 10:45         ` Mcnamara, John
@ 2016-12-19 11:10           ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 11:10 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

Hi John,

On Mon, Dec 19, 2016 at 10:45:32AM +0000, Mcnamara, John wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Friday, December 16, 2016 4:25 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide
> > 
> > This documentation is based on the latest RFC submission, subsequently
> > updated according to feedback from the community.
> 
> Hi,
> 
> Thanks. That is a very good doc.
> 
> A few RST comments.
> 
> Section headers should use the following underline formats:
> 
>     Level 1 Heading
>     ===============
> 
> 
>     Level 2 Heading
>     ---------------
> 
> 
>     Level 3 Heading
>     ~~~~~~~~~~~~~~~
> 
> 
>     Level 4 Heading
>     ^^^^^^^^^^^^^^^
> 
> See: http://dpdk.org/doc/guides/contributing/documentation.html#rst-guidelines

OK, although I do not see any mistake regarding this?

> Also, some of the section headers for Attributes, Patterns, Match and Action
> are a bit short and it isn't clear what section you are in, especially in the
> PDF doc. It might be clearer to add the section name before each item like:
> 
> 
>     Attribute: Group
>     ~~~~~~~~~~~~~~~~
> 
>     Match: VOID
>     ~~~~~~~~~~~
> 
> 
> Tables should have a reference link and a caption, like this:
> 
>     .. _table_qos_pipes:
> 
>     .. table:: Sample configuration for QOS pipes.
> 
>        +----------+----------+----------+
>        | Header 1 | Header 2 | Header 3 |
>        |          |          |          |
>        +==========+==========+==========+
>        | Text     | Text     | Text     |
>        +----------+----------+----------+
>        | ...      | ...      | ...      |
>        +----------+----------+----------+
> 
> 
> See: http://dpdk.org/doc/guides/contributing/documentation.html#tables
> 
> This will make the tables clearer when there are several in a row and will allow
> the text to refer to them with :numref:.
> 
> Also, there is one typo:
> 
> s/unpractically/impractically/
> 
> 
> Otherwise, very good work. A good clear document.

Thanks, I will fix these and re-submit.

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH v3 00/25] Generic flow API (rte_flow)
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (25 preceding siblings ...)
  2016-12-17 22:06       ` [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow) Olga Shern
@ 2016-12-19 17:48       ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API Adrien Mazarguil
                           ` (25 more replies)
  2016-12-21 16:19       ` [dpdk-dev] [PATCH v2 00/25] " Simon Horman
  27 siblings, 26 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague. It may be
called "Generic flow API" or "Generic flow interface" (possibly shortened
to "GFI") to refer to the name of the new filter type, or "rte_flow" after
the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since v2 series:

- Replaced ENOTSUP with ENOSYS in the code (although doing so triggers
  spurious checkpatch warnings) to tell apart unimplemented callbacks from
  unsupported flow rules and match the documented behavior.

- Fixed missing include seen by check-includes.sh in rte_flow_driver.h.

- Made clearer that PMDs must initialize rte_flow_error (if non-NULL) in
  case of error, added related memory poisoning in testpmd to catch missing
  initializations.

- Fixed rte_flow programmer's guide according to John Mcnamara's comments
  (tables, section headers and typos).

- Fixed deprecation notice as well.
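
For illustration only, the ENOSYS/ENOTSUP split and the error poisoning
mentioned above can be sketched as follows. This is not the code from the
series: every demo_* name is a hypothetical stand-in for the rte_flow
equivalents, the "rule" is reduced to a plain integer, and the odd/even
acceptance policy is invented for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_flow_error (not the real layout). */
struct demo_flow_error {
	int type;            /* error category */
	const void *cause;   /* object responsible for the error */
	const char *message; /* human-readable description */
};

/* Hypothetical stand-in for struct rte_flow_ops. */
struct demo_flow_ops {
	/* Optional callback, NULL when the driver does not implement it. */
	int (*validate)(int rule, struct demo_flow_error *error);
};

/* Initialize the error structure (if non-NULL) and return -code. */
int
demo_error_set(struct demo_flow_error *error, int code, const char *message)
{
	if (error) {
		error->type = 1;
		error->cause = NULL;
		error->message = message;
	}
	return -code;
}

/* Wrapper: -ENOSYS for a missing callback, otherwise defer to the driver. */
int
demo_flow_validate(const struct demo_flow_ops *ops, int rule,
		   struct demo_flow_error *error)
{
	if (!ops || !ops->validate)
		return demo_error_set(error, ENOSYS, "not implemented");
	return ops->validate(rule, error);
}

/* Invented driver callback: even rules are supported, odd ones are not. */
int
demo_pmd_validate(int rule, struct demo_flow_error *error)
{
	if (rule % 2)
		return demo_error_set(error, ENOTSUP, "rule not supported");
	return 0;
}

/*
 * Poison the error structure, run the wrapper and report whether a failing
 * call left it untouched (returns 0) or initialized it (returns 1).
 * Successful calls always return 1 since there is nothing to check.
 */
int
demo_error_was_initialized(const struct demo_flow_ops *ops, int rule)
{
	struct demo_flow_error error;
	struct demo_flow_error poison;

	memset(&error, 0x11, sizeof(error));
	memset(&poison, 0x11, sizeof(poison));
	if (demo_flow_validate(ops, rule, &error) >= 0)
		return 1;
	return memcmp(&error, &poison, sizeof(error)) != 0;
}
```

With this split, a caller seeing -ENOSYS knows the callback is missing
altogether, while -ENOTSUP means the driver looked at the rule and rejected
it. The poisoning helper only detects a structure left completely untouched
on failure, which is the case the testpmd change is meant to catch.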

Changes since v1 series:

- Added programmer's guide documentation for rte_flow.

- Added deprecation notice for the legacy API.

- Documented testpmd flow command.

- Fixed missing rte_flow_flush symbol in rte_ether_version.map.

- Cleaned up API documentation in rte_flow.h.

- Replaced "min/max" parameters with "num" in struct rte_flow_item_any, to
  align behavior with other item definitions.

- Fixed "type" (EtherType) size in struct rte_flow_item_eth.

- Renamed "queues" to "num" in struct rte_flow_action_rss.

- Fixed missing const in rte_flow_error_set() prototype definition.

- Fixed testpmd flow create command that did not save the rte_flow object
  pointer, causing crashes.

- Hopefully fixed all the remaining ICC/clang errors.

- Replaced testpmd flow command's "fix" token with "is" for clarity.

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect
  struct rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
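
As a rough sketch of two points from the list above (END-terminated item
lists and the inclusive spec/last/mask ranges), assuming a single 32-bit
field per item: the demo_* types below are hypothetical and much simpler
than struct rte_flow_item, and treating a missing mask as "all bits" and a
missing spec as "match anything" are assumptions made for this example
only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; the real enum is rte_flow_item_type. */
enum demo_item_type {
	DEMO_ITEM_END,  /* terminates an item list */
	DEMO_ITEM_VOID, /* placeholder, carries no specification */
	DEMO_ITEM_U32,  /* matches a single 32-bit value */
};

/* Hypothetical item: spec, last and mask are all optional (may be NULL). */
struct demo_item {
	enum demo_item_type type;
	const uint32_t *spec; /* value to match, or start of the range */
	const uint32_t *last; /* end of the range, NULL for a single value */
	const uint32_t *mask; /* relevant bits, NULL assumed to mean all */
};

/* Count the items found before the END terminator. */
size_t
demo_item_count(const struct demo_item *items)
{
	size_t n = 0;

	while (items[n].type != DEMO_ITEM_END)
		++n;
	return n;
}

/* Check value against the range [(spec & mask), (last & mask)]. */
int
demo_item_match(const struct demo_item *item, uint32_t value)
{
	uint32_t mask = item->mask ? *item->mask : UINT32_MAX;
	uint32_t lo;
	uint32_t hi;

	if (!item->spec)
		return 1; /* assumption: no specification matches anything */
	lo = *item->spec & mask;
	hi = item->last ? (*item->last & mask) : lo;
	value &= mask;
	return value >= lo && value <= hi;
}
```

For example, with spec 0x1230, last 0x12ff and mask 0xfff0, the masked
bounds are 0x1230 and 0x12f0, so 0x1234 and 0x12ff are inside the range
while 0x1300 is not.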

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (25):
  ethdev: introduce generic flow API
  doc: add rte_flow prog guide
  doc: announce deprecation of legacy filter types
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command
  doc: describe testpmd flow command

 MAINTAINERS                                 |    4 +
 app/test-pmd/Makefile                       |    1 +
 app/test-pmd/cmdline.c                      |   32 +
 app/test-pmd/cmdline_flow.c                 | 2575 ++++++++++++++++++++++
 app/test-pmd/config.c                       |  498 +++++
 app/test-pmd/csumonly.c                     |    1 +
 app/test-pmd/flowgen.c                      |    1 +
 app/test-pmd/icmpecho.c                     |    1 +
 app/test-pmd/ieee1588fwd.c                  |    1 +
 app/test-pmd/iofwd.c                        |    1 +
 app/test-pmd/macfwd.c                       |    1 +
 app/test-pmd/macswap.c                      |    1 +
 app/test-pmd/parameters.c                   |    1 +
 app/test-pmd/rxonly.c                       |    1 +
 app/test-pmd/testpmd.c                      |    6 +
 app/test-pmd/testpmd.h                      |   27 +
 app/test-pmd/txonly.c                       |    1 +
 doc/api/doxy-api-index.md                   |    2 +
 doc/guides/prog_guide/index.rst             |    1 +
 doc/guides/prog_guide/rte_flow.rst          | 2042 +++++++++++++++++
 doc/guides/rel_notes/deprecation.rst        |    8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  612 +++++
 lib/librte_cmdline/cmdline_parse.c          |   67 +-
 lib/librte_cmdline/cmdline_parse.h          |   21 +
 lib/librte_ether/Makefile                   |    3 +
 lib/librte_ether/rte_eth_ctrl.h             |    1 +
 lib/librte_ether/rte_ether_version.map      |   11 +
 lib/librte_ether/rte_flow.c                 |  159 ++
 lib/librte_ether/rte_flow.h                 |  947 ++++++++
 lib/librte_ether/rte_flow_driver.h          |  182 ++
 30 files changed, 7200 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 doc/guides/prog_guide/rte_flow.rst
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

-- 
2.1.4


* [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2017-05-23  6:07           ` Zhao1, Wei
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide Adrien Mazarguil
                           ` (24 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for: applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, so applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 MAINTAINERS                            |   4 +
 doc/api/doxy-api-index.md              |   2 +
 lib/librte_ether/Makefile              |   3 +
 lib/librte_ether/rte_eth_ctrl.h        |   1 +
 lib/librte_ether/rte_ether_version.map |  11 +
 lib/librte_ether/rte_flow.c            | 159 +++++
 lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h     | 182 ++++++
 8 files changed, 1309 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 26d9590..5975cff 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
 F: scripts/test-null.sh
 
+Generic flow API
+M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de65b4c..4951552 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -39,6 +39,8 @@ There are many libraries, so their headers may be grouped by topics:
   [dev]                (@ref rte_dev.h),
   [ethdev]             (@ref rte_ethdev.h),
   [ethctrl]            (@ref rte_eth_ctrl.h),
+  [rte_flow]           (@ref rte_flow.h),
+  [rte_flow_driver]    (@ref rte_flow_driver.h),
   [cryptodev]          (@ref rte_cryptodev.h),
   [devargs]            (@ref rte_devargs.h),
   [bond]               (@ref rte_eth_bond.h),
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5
 
 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c
 
 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
 	RTE_ETH_FILTER_L2_TUNNEL,
+	RTE_ETH_FILTER_GENERIC,
 	RTE_ETH_FILTER_MAX
 };
 
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..384cdee 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,14 @@ DPDK_16.11 {
 	rte_eth_dev_pci_remove;
 
 } DPDK_16.07;
+
+DPDK_17.02 {
+	global:
+
+	rte_flow_validate;
+	rte_flow_create;
+	rte_flow_destroy;
+	rte_flow_flush;
+	rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 0000000..d98fb1b
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+/* Get generic flow operations structure from a port. */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops;
+	int code;
+
+	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
+		code = ENODEV;
+	else if (unlikely(!dev->dev_ops->filter_ctrl ||
+			  dev->dev_ops->filter_ctrl(dev,
+						    RTE_ETH_FILTER_GENERIC,
+						    RTE_ETH_FILTER_GET,
+						    &ops) ||
+			  !ops))
+		code = ENOSYS;
+	else
+		return ops;
+	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(code));
+	return NULL;
+}
+
+/* Check whether a flow rule can be created on a given port. */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error)
+{
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->validate))
+		return ops->validate(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Create a flow rule on a given port. */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return NULL;
+	if (likely(!!ops->create))
+		return ops->create(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return NULL;
+}
+
+/* Destroy a flow rule on a given port. */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->destroy))
+		return ops->destroy(dev, flow, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Destroy all flow rules associated with a port. */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->flush))
+		return ops->flush(dev, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Query an existing flow rule. */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->query))
+		return ops->query(dev, flow, action, data, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..98084ac
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,947 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * At least one direction must be specified.
+ *
+ * Specifying both directions at once for a given rule is not recommended
+ * but may be valid in a few cases (e.g. shared counter).
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Pattern items fall in two categories:
+ *
+ * - Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+ *   IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+ *   specification structure. These must be stacked in the same order as the
+ *   protocol layers to match, starting from the lowest.
+ *
+ * - Matching meta-data or affecting pattern processing (END, VOID, INVERT,
+ *   PF, VF, PORT and so on), often without a specification structure. Since
+ *   they do not match packet contents, these can be specified anywhere
+ *   within item lists without affecting others.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer, a single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an 802.1Q/ad VLAN tag.
+	 *
+	 * See struct rte_flow_item_vlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VLAN,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches an SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer. A single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A zeroed mask stands for any number of layers.
+ */
+struct rte_flow_item_any {
+	uint32_t num; /**< Number of layers covered. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   VF IDs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * A zeroed mask can be used to match any VF ID.
+ */
+struct rte_flow_item_vf {
+	uint32_t id; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device, they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ *
+ * A zeroed mask can be used to match any port index.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed, doing so resets the relative
+ * offset for subsequent items.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	uint16_t type; /**< EtherType. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VLAN
+ *
+ * Matches an 802.1Q/ad VLAN tag.
+ *
+ * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
+ * RTE_FLOW_ITEM_TYPE_VLAN.
+ */
+struct rte_flow_item_vlan {
+	uint16_t tpid; /**< Tag protocol identifier. */
+	uint16_t tci; /**< Tag control information. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6.
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP.
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP.
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP.
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP.
+ *
+ * Matches an SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN.
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint8_t flags; /**< Normally 0x08 (I flag). */
+	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
+	uint8_t vni[3]; /**< VXLAN identifier. */
+	uint8_t rsvd1; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack without affecting the meaning
+ * of the resulting pattern.
+ *
+ * Patterns are terminated by END items.
+ *
+ * The spec field should be a valid pointer to a structure of the related
+ * item type. It may be set to NULL in many cases to use default values.
+ *
+ * Optionally, last can point to a structure of the same type to define an
+ * inclusive range. This is mostly supported by integer and address fields,
+ * may cause errors otherwise. Fields that do not support ranges must be set
+ * to 0 or to the same value as the corresponding fields in spec.
+ *
+ * By default all fields present in spec are considered relevant (see note
+ * below). This behavior can be altered by providing a mask structure of the
+ * same type with applicable bits set to one. It can also be used to
+ * partially filter out specific fields (e.g. as an alternate means to match
+ * ranges of IP addresses).
+ *
+ * Mask is a simple bit-mask applied before interpreting the contents of
+ * spec and last, which may yield unexpected results if not used
+ * carefully. For example, if for an IPv4 address field, spec provides
+ * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
+ * effective range becomes 10.1.0.0 to 10.3.255.255.
+ *
+ * Note: the defaults for data-matching items such as IPv4 when mask is not
+ * specified actually depend on the underlying implementation since only
+ * recognized fields can be taken into account.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *last; /**< Defines an inclusive range (spec to last). */
+	const void *mask; /**< Bit-mask applied to spec and last. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non-terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non-terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible).
+ *
+ * Only the last action of a given type is taken into account. PMDs still
+ * perform error checking on the entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32 bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE, however when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32 bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32 bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t index; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE, however when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t index; /**< Queue index to duplicate packets to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t num; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queue indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t id; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * A list of actions is terminated by an END action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameter affecting flow rules in any way is
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * \see RTE_FLOW_ACTION_TYPE_COUNT
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
new file mode 100644
index 0000000..274562c
--- /dev/null
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -0,0 +1,182 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_DRIVER_H_
+#define RTE_FLOW_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic flow API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs, they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic flow operations structure implemented and returned by PMDs.
+ *
+ * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
+ * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
+ * as the RTE_ETH_FILTER_GET filter operation.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific
+ * struct rte_flow_ops written to the argument address as described below:
+ *
+ * \code
+ *
+ * // PMD filter_ctrl callback
+ *
+ * static const struct rte_flow_ops pmd_flow_ops = { ... };
+ *
+ * switch (filter_type) {
+ * case RTE_ETH_FILTER_GENERIC:
+ *     if (filter_op != RTE_ETH_FILTER_GET)
+ *         return -EINVAL;
+ *     *(const void **)arg = &pmd_flow_ops;
+ *     return 0;
+ * }
+ *
+ * \endcode
+ *
+ * See also rte_flow_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_flow.h.
+ *
+ * Public-facing wrapper functions perform a few consistency checks so that
+ * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
+ * callbacks otherwise only differ by their first argument (with port ID
+ * already resolved to a pointer to struct rte_eth_dev).
+ */
+struct rte_flow_ops {
+	/** See rte_flow_validate(). */
+	int (*validate)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_create(). */
+	struct rte_flow *(*create)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_destroy(). */
+	int (*destroy)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 struct rte_flow_error *);
+	/** See rte_flow_flush(). */
+	int (*flush)
+		(struct rte_eth_dev *,
+		 struct rte_flow_error *);
+	/** See rte_flow_query(). */
+	int (*query)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 enum rte_flow_action_type,
+		 void *,
+		 struct rte_flow_error *);
+};
+
+/**
+ * Initialize generic flow error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param[out] error
+ *   Pointer to flow error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error types.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Pointer to flow error structure.
+ */
+static inline struct rte_flow_error *
+rte_flow_error_set(struct rte_flow_error *error,
+		   int code,
+		   enum rte_flow_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_flow_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return error;
+}
+
+/**
+ * Get generic flow operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_DRIVER_H_ */
-- 
2.1.4
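
The spec/last/mask note in the struct rte_flow_item comment above can be
checked with plain integer arithmetic. The following is a minimal sketch,
not PMD code: sketch_ipv4_in_range() and SKETCH_IPV4() are hypothetical
names that are not part of rte_flow, addresses are assumed to be held in
host byte order, and the helper encodes one reading of the comment, namely
that mask is applied to spec, last and the packet field before the
inclusive range comparison.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from four octets (illustration only). */
#define SKETCH_IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/*
 * Hypothetical helper, not part of rte_flow: return 1 when addr falls
 * within the inclusive range [spec, last] once mask has been applied to
 * all three values, 0 otherwise.
 */
static int
sketch_ipv4_in_range(uint32_t addr, uint32_t spec, uint32_t last,
		     uint32_t mask)
{
	uint32_t lo = spec & mask;  /* Lower bound after masking. */
	uint32_t hi = last & mask;  /* Upper bound after masking. */
	uint32_t val = addr & mask; /* Packet field after masking. */

	return val >= lo && val <= hi;
}
```

With spec 10.1.2.3, last 10.3.4.5 and mask 255.255.0.0, this accepts
10.1.0.0 and 10.3.255.255 and rejects 10.0.255.255 and 10.4.0.0, which
agrees with the effective range of 10.1.0.0 to 10.3.255.255 given in the
comment.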

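The action list rules documented for enum rte_flow_action_type above (END
terminates the list, VOID is ignored, only the last action of a given type
is taken into account) can be illustrated without DPDK headers. This is a
rough sketch under stated assumptions: the sketch_* enum, structure and
function are simplified stand-ins invented for this example, they are not
the definitions from rte_flow.h, and a real PMD is free to process the
list differently.

```c
#include <stddef.h>

/* Simplified stand-in for enum rte_flow_action_type (illustration only). */
enum sketch_action_type {
	SKETCH_ACTION_END,
	SKETCH_ACTION_VOID,
	SKETCH_ACTION_MARK,
	SKETCH_ACTION_QUEUE,
	SKETCH_ACTION_MAX, /* Number of types, not an action. */
};

/* Simplified stand-in for struct rte_flow_action. */
struct sketch_action {
	enum sketch_action_type type;
	const void *conf;
};

/*
 * Hypothetical helper: walk an END-terminated list, skip VOID entries and
 * remember the last configuration seen for each action type.
 *
 * Returns the number of entries before END, or -1 on an unknown type
 * (the whole list is checked, mirroring the note about error checking).
 */
static int
sketch_collect_actions(const struct sketch_action actions[],
		       const void *conf[SKETCH_ACTION_MAX],
		       int seen[SKETCH_ACTION_MAX])
{
	int n = 0;
	int i;

	for (i = 0; i < SKETCH_ACTION_MAX; ++i) {
		conf[i] = NULL;
		seen[i] = 0;
	}
	for (; actions->type != SKETCH_ACTION_END; ++actions, ++n) {
		if ((unsigned int)actions->type >=
		    (unsigned int)SKETCH_ACTION_MAX)
			return -1;
		if (actions->type == SKETCH_ACTION_VOID)
			continue; /* Placeholder, nothing to record. */
		conf[actions->type] = actions->conf; /* Last one wins. */
		seen[actions->type] = 1;
	}
	return n;
}
```

For a list made of QUEUE, VOID, QUEUE and END, the helper reports three
entries and keeps the configuration of the second QUEUE entry only.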

* [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-20 16:30           ` Mcnamara, John
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
                           ` (23 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

This documentation is based on the latest RFC submission, subsequently
updated according to feedback from the community.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/prog_guide/index.rst    |    1 +
 doc/guides/prog_guide/rte_flow.rst | 2042 +++++++++++++++++++++++++++++++
 2 files changed, 2043 insertions(+)

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index e5a50a8..ed7f770 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -42,6 +42,7 @@ Programmer's Guide
     mempool_lib
     mbuf_lib
     poll_mode_drv
+    rte_flow
     cryptodev_lib
     link_bonding_poll_mode_drv_lib
     timer_lib
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
new file mode 100644
index 0000000..73fe809
--- /dev/null
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -0,0 +1,2042 @@
+..  BSD LICENSE
+    Copyright 2016 6WIND S.A.
+    Copyright 2016 Mellanox.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of 6WIND S.A. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Generic_flow_API:
+
+Generic flow API (rte_flow)
+===========================
+
+Overview
+--------
+
+This API provides a generic means to configure hardware to match specific
+ingress or egress traffic, alter its fate and query related counters
+according to any number of user-defined rules.
+
+It is named *rte_flow* after the prefix used for all its symbols, and is
+defined in ``rte_flow.h``.
+
+- Matching can be performed on packet data (protocol headers, payload) and
+  properties (e.g. associated physical port, virtual device function ID).
+
+- Possible operations include dropping traffic, diverting it to specific
+  queues, to virtual/physical device functions or ports, performing tunnel
+  offloads, adding marks and so on.
+
+It is slightly higher-level than the legacy filtering framework which it
+encompasses and supersedes (including all functions and filter types) in
+order to expose a single interface with an unambiguous behavior that is
+common to all poll-mode drivers (PMDs).
+
+Several methods to migrate existing applications are described in `API
+migration`_.
+
+Flow rule
+---------
+
+Description
+~~~~~~~~~~~
+
+A flow rule is the combination of attributes with a matching pattern and a
+list of actions. Flow rules form the basis of this API.
+
+Flow rules can have several distinct actions (such as counting,
+encapsulating, decapsulating before redirecting packets to a particular
+queue, etc.), instead of relying on several rules to achieve this and having
+applications deal with hardware implementation details regarding their
+order.
+
+Support for different priority levels on a rule basis is provided, for
+example in order to force a more specific rule to come before a more generic
+one for packets matched by both. However hardware support for more than a
+single priority level cannot be guaranteed. When supported, the number of
+available priority levels is usually low, which is why they can also be
+implemented in software by PMDs (e.g. missing priority levels may be
+emulated by reordering rules).
+
+In order to remain as hardware-agnostic as possible, by default all rules
+are considered to have the same priority, which means that the order between
+overlapping rules (when a packet is matched by several filters) is
+undefined.
+
+PMDs may refuse to create overlapping rules at a given priority level when
+they can be detected (e.g. if a pattern matches an existing filter).
+
+Thus predictable results for a given priority level can only be achieved
+with non-overlapping rules, using perfect matching on all protocol layers.
+
+Flow rules can also be grouped. Flow rule priority is specific to the group
+a rule belongs to, and all flow rules in a given group are processed either
+before or after those of another group.
+
+Support for multiple actions per rule may be implemented internally on top
+of non-default hardware priorities. As a result, both features may not be
+simultaneously available to applications.
+
+Considering that allowed pattern/actions combinations cannot be known in
+advance and would result in an impractically large number of capabilities to
+expose, a method is provided to validate a given rule from the current
+device configuration state.
+
+This enables applications to check whether the rule types they need are
+supported at initialization time, before starting their data path. This
+method can be used at any time; its only requirement is that the resources
+needed by a rule exist (e.g. a target RX queue should be configured first).
+
+Each defined rule is associated with an opaque handle managed by the PMD;
+applications are responsible for keeping it. Handles can be used for queries
+and rule management, such as retrieving counters or other data and
+destroying the rules.
+
+To avoid resource leaks on the PMD side, handles must be explicitly
+destroyed by the application before releasing associated resources such as
+queues and ports.
+
+The following sections cover:
+
+- **Attributes** (represented by ``struct rte_flow_attr``): properties of a
+  flow rule such as its direction (ingress or egress) and priority.
+
+- **Pattern item** (represented by ``struct rte_flow_item``): part of a
+  matching pattern that either matches specific packet data or traffic
+  properties. It can also describe properties of the pattern itself, such as
+  inverted matching.
+
+- **Matching pattern**: traffic properties to look for, a combination of any
+  number of items.
+
+- **Actions** (represented by ``struct rte_flow_action``): operations to
+  perform whenever a packet is matched by a pattern.
+
+Attributes
+~~~~~~~~~~
+
+Attribute: Group
+^^^^^^^^^^^^^^^^
+
+Flow rules can be grouped by assigning them a common group number. Lower
+values have higher priority. Group 0 has the highest priority.
+
+Although optional, applications are encouraged to group similar rules as
+much as possible to fully take advantage of hardware capabilities
+(e.g. optimized matching) and work around limitations (e.g. a single pattern
+type possibly allowed in a given group).
+
+Note that support for more than a single group is not guaranteed.
+
+Attribute: Priority
+^^^^^^^^^^^^^^^^^^^
+
+A priority level can be assigned to a flow rule. Like groups, lower values
+denote higher priority, with 0 as the maximum.
+
+A rule with priority 0 in group 8 is always matched after a rule with
+priority 8 in group 0.
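+
+The ordering described above (group first, then priority, lower values
+first) can be sketched as a comparison function. This is only an
+illustration: ``struct flow_attr_sketch`` and ``flow_attr_order()`` are
+hypothetical names that are not part of ``rte_flow.h``.

```c
#include <stdint.h>

/* Hypothetical stand-in for the group and priority attributes of a rule. */
struct flow_attr_sketch {
	uint32_t group;    /* lower value = higher priority, 0 is highest */
	uint32_t priority; /* lower value = higher priority within a group */
};

/*
 * Return a negative value if rule a is matched before rule b, a positive
 * value if it is matched after, and 0 if both share the same group and
 * priority level (in which case the order is undefined).
 */
static int
flow_attr_order(const struct flow_attr_sketch *a,
		const struct flow_attr_sketch *b)
{
	/* The group is compared first, whatever the priority levels are. */
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```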
+
+Group and priority levels are arbitrary and up to the application. They do
+not need to be contiguous nor start from 0. However, the maximum number
+varies between devices and may be affected by existing flow rules.
+
+If a packet is matched by several rules of a given group for a given
+priority level, the outcome is undefined. It can take any path, may be
+duplicated or even cause unrecoverable errors.
+
+Note that support for more than a single priority level is not guaranteed.
+
+Attribute: Traffic direction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+
+Several pattern items and actions are valid and can be used in both
+directions. At least one direction must be specified.
+
+Specifying both directions at once for a given rule is not recommended but
+may be valid in a few cases (e.g. shared counters).
+
+Pattern item
+~~~~~~~~~~~~
+
+Pattern items fall in two categories:
+
+- Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+  IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+  specification structure.
+
+- Matching meta-data or affecting pattern processing (END, VOID, INVERT, PF,
+  VF, PORT and so on), often without a specification structure.
+
+Item specification structures are used to match specific values among
+protocol fields (or item properties). The documentation of each item states
+whether it is associated with one and, if so, its type name.
+
+Up to three structures of the same type can be set for a given item:
+
+- ``spec``: values to match (e.g. a given IPv4 address).
+
+- ``last``: upper bound for an inclusive range with corresponding fields in
+  ``spec``.
+
+- ``mask``: bit-mask applied to both ``spec`` and ``last`` whose purpose is
+  to distinguish the values to take into account and/or partially mask them
+  out (e.g. in order to match an IPv4 address prefix).
+
+Usage restrictions and expected behavior:
+
+- Setting either ``mask`` or ``last`` without ``spec`` is an error.
+
+- Field values in ``last`` which are either 0 or equal to the corresponding
+  values in ``spec`` are ignored; they do not generate a range. Nonzero
+  values lower than those in ``spec`` are not supported.
+
+- Setting ``spec`` and optionally ``last`` without ``mask`` causes the PMD
+  to only take the fields it can recognize into account. There is no error
+  checking for unsupported fields.
+
+- Not setting any of them (assuming item type allows it) uses default
+  parameters that depend on the item type. Most of the time, particularly
+  for protocol header items, it is equivalent to providing an empty (zeroed)
+  ``mask``.
+
+- ``mask`` is a simple bit-mask applied before interpreting the contents of
+  ``spec`` and ``last``, which may yield unexpected results if not used
+  carefully. For example, if for an IPv4 address field, ``spec`` provides
+  *10.1.2.3*, ``last`` provides *10.3.4.5* and ``mask`` provides
+  *255.255.0.0*, the effective range becomes *10.1.0.0* to *10.3.255.255*.
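+
+The way ``mask`` widens a ``spec``/``last`` range can be sketched for a
+single 32-bit field as follows. ``struct field_range``, ``ipv4()`` and
+``effective_range()`` are hypothetical helpers written for this
+illustration, and the sketch assumes a prefix-style mask such as the one in
+the IPv4 example above.

```c
#include <stdint.h>

/* Hypothetical helper type: an inclusive range of 32-bit field values. */
struct field_range {
	uint32_t low;
	uint32_t high;
};

/* Build a host-order IPv4 address from its four dotted-decimal bytes. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Compute the range effectively matched once the mask is applied. */
static struct field_range
effective_range(uint32_t spec, uint32_t last, uint32_t mask)
{
	struct field_range r;

	/* A zero "last" value is ignored and does not generate a range. */
	if (last == 0)
		last = spec;
	/* Bits outside the mask stand for any value. */
	r.low = spec & mask;
	r.high = (last & mask) | ~mask;
	return r;
}
```

+With the values of the IPv4 example, ``effective_range()`` yields *10.1.0.0*
+as the lower bound and *10.3.255.255* as the upper bound.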
+
+Example of an item specification matching an Ethernet header:
+
+.. _table_rte_flow_pattern_item_example:
+
+.. table:: Ethernet item
+
+   +----------+----------+--------------------+
+   | Field    | Subfield | Value              |
+   +==========+==========+====================+
+   | ``spec`` | ``src``  | ``00:01:02:03:04`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:2a:66:00:01`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x22aa``         |
+   +----------+----------+--------------------+
+   | ``last`` | unspecified                   |
+   +----------+----------+--------------------+
+   | ``mask`` | ``src``  | ``00:ff:ff:ff:00`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:00:00:00:ff`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x0000``         |
+   +----------+----------+--------------------+
+
+Non-masked bits stand for any value (shown as ``?`` below). Ethernet headers
+with the following properties are thus matched:
+
+- ``src``: ``??:01:02:03:??``
+- ``dst``: ``??:??:??:??:01``
+- ``type``: ``0x????``
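+
+The masked comparison behind this result can be sketched as a byte-wise
+check. ``masked_match()`` is a hypothetical helper, not a function provided
+by this API; it works on byte strings of any length, so the values of the
+table can be passed exactly as they are listed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if "data" equals "spec" on every bit set in "mask", 0 otherwise.
 * Bits cleared in "mask" are not compared and thus stand for any value.
 */
static int
masked_match(const uint8_t *data, const uint8_t *spec, const uint8_t *mask,
	     size_t len)
{
	size_t i;

	for (i = 0; i != len; ++i)
		if ((data[i] ^ spec[i]) & mask[i])
			return 0;
	return 1;
}
```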
+
+Matching pattern
+~~~~~~~~~~~~~~~~
+
+A pattern is formed by stacking items starting from the lowest protocol
+layer to match. This stacking restriction does not apply to meta items which
+can be placed anywhere in the stack without affecting the meaning of the
+resulting pattern.
+
+Patterns are terminated by END items.
+
+Examples:
+
+.. _table_rte_flow_tcpv4_as_l4:
+
+.. table:: TCPv4 as L4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | IPv4     |
+   +-------+----------+
+   | 2     | TCP      |
+   +-------+----------+
+   | 3     | END      |
+   +-------+----------+
+
+|
+
+.. _table_rte_flow_tcpv6_in_vxlan:
+
+.. table:: TCPv6 in VXLAN
+
+   +-------+------------+
+   | Index | Item       |
+   +=======+============+
+   | 0     | Ethernet   |
+   +-------+------------+
+   | 1     | IPv4       |
+   +-------+------------+
+   | 2     | UDP        |
+   +-------+------------+
+   | 3     | VXLAN      |
+   +-------+------------+
+   | 4     | Ethernet   |
+   +-------+------------+
+   | 5     | IPv6       |
+   +-------+------------+
+   | 6     | TCP        |
+   +-------+------------+
+   | 7     | END        |
+   +-------+------------+
+
+|
+
+.. _table_rte_flow_tcpv4_as_l4_meta:
+
+.. table:: TCPv4 as L4 with meta items
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | VOID     |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | VOID     |
+   +-------+----------+
+   | 3     | IPv4     |
+   +-------+----------+
+   | 4     | TCP      |
+   +-------+----------+
+   | 5     | VOID     |
+   +-------+----------+
+   | 6     | VOID     |
+   +-------+----------+
+   | 7     | END      |
+   +-------+----------+
+
+The above example shows how meta items do not affect packet data matching
+items, as long as those remain stacked properly. The resulting matching
+pattern is identical to "TCPv4 as L4".
+
+.. _table_rte_flow_udpv6_anywhere:
+
+.. table:: UDPv6 anywhere
+
+   +-------+------+
+   | Index | Item |
+   +=======+======+
+   | 0     | IPv6 |
+   +-------+------+
+   | 1     | UDP  |
+   +-------+------+
+   | 2     | END  |
+   +-------+------+
+
+If supported by the PMD, omitting one or several protocol layers at the
+bottom of the stack as in the above example (missing an Ethernet
+specification) enables looking up anywhere in packets.
+
+It is unspecified whether the payload of supported encapsulations
+(e.g. VXLAN payload) is matched by such a pattern, which may apply to inner,
+outer or both packets.
+
+.. _table_rte_flow_invalid_l3:
+
+.. table:: Invalid, missing L3
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | UDP      |
+   +-------+----------+
+   | 2     | END      |
+   +-------+----------+
+
+The above pattern is invalid due to a missing L3 specification between L2
+(Ethernet) and L4 (UDP). Omitting protocol layers is only allowed at the
+bottom and at the top of the stack.
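+
+The stacking rule illustrated by these examples can be sketched as a small
+validity check. All names below are hypothetical, and the sketch is
+deliberately limited to the L2 to L4 items used above: it does not handle
+tunnels such as VXLAN, where stacking restarts from an inner Ethernet
+header.

```c
#include <stddef.h>

/* Hypothetical item types, limited to what the examples above use. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_IPV6,
	ITEM_TCP,
	ITEM_UDP,
};

/* Protocol layer of an item, or 0 for meta items. */
static int
item_layer(enum item_type type)
{
	switch (type) {
	case ITEM_ETH:
		return 2;
	case ITEM_IPV4:
	case ITEM_IPV6:
		return 3;
	case ITEM_TCP:
	case ITEM_UDP:
		return 4;
	default:
		return 0;
	}
}

/*
 * Return 1 if protocol items are stacked on consecutive layers, 0 if a
 * layer is skipped in the middle of the stack or items are out of order.
 * The list must be terminated by ITEM_END.
 */
static int
pattern_is_stacked(const enum item_type *pattern)
{
	int prev = 0;
	size_t i;

	for (i = 0; pattern[i] != ITEM_END; ++i) {
		int layer = item_layer(pattern[i]);

		if (layer == 0)
			continue; /* meta items do not affect stacking */
		if (prev != 0 && layer != prev + 1)
			return 0;
		prev = layer;
	}
	return 1;
}
```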
+
+Meta item types
+~~~~~~~~~~~~~~~
+
+They match meta-data or affect pattern processing instead of matching packet
+data directly. Most of them do not need a specification structure, which
+allows them to be specified anywhere in the stack without causing any side
+effect.
+
+Item: ``END``
+^^^^^^^^^^^^^
+
+End marker for item lists. Prevents further processing of items, thereby
+ending the pattern.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_end:
+
+.. table:: END
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Item: ``VOID``
+^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_void:
+
+.. table:: VOID
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+One usage example for this type is generating rules that share a common
+prefix quickly without reallocating memory, only by updating item types:
+
+.. _table_rte_flow_item_void_example:
+
+.. table:: TCP, UDP or ICMP as L4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | Ethernet           |
+   +-------+--------------------+
+   | 1     | IPv4               |
+   +-------+------+------+------+
+   | 2     | UDP  | VOID | VOID |
+   +-------+------+------+------+
+   | 3     | VOID | TCP  | VOID |
+   +-------+------+------+------+
+   | 4     | VOID | VOID | ICMP |
+   +-------+------+------+------+
+   | 5     | END                |
+   +-------+--------------------+
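+
+A possible way to reuse a single pattern array for the three variants of
+the table above is sketched below. ``struct item_sketch`` is a hypothetical,
+simplified stand-in for ``struct rte_flow_item``, and ``select_l4()`` is an
+illustrative helper rather than part of the API.

```c
#include <stddef.h>

/* Hypothetical item types, limited to those of the table above. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_UDP,
	ITEM_TCP,
	ITEM_ICMP,
};

/* Hypothetical, simplified stand-in for struct rte_flow_item. */
struct item_sketch {
	enum item_type type;
	const void *spec;
	const void *last;
	const void *mask;
};

/*
 * Enable a single L4 item in a six-entry pattern laid out as in the table
 * above (indices 2, 3 and 4), voiding the other two slots. Only item types
 * are updated, no memory is allocated or moved.
 */
static void
select_l4(struct item_sketch *pattern, enum item_type l4)
{
	pattern[2].type = (l4 == ITEM_UDP) ? ITEM_UDP : ITEM_VOID;
	pattern[3].type = (l4 == ITEM_TCP) ? ITEM_TCP : ITEM_VOID;
	pattern[4].type = (l4 == ITEM_ICMP) ? ITEM_ICMP : ITEM_VOID;
}
```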
+
+Item: ``INVERT``
+^^^^^^^^^^^^^^^^
+
+Inverted matching, i.e. process packets that do not match the pattern.
+
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_invert:
+
+.. table:: INVERT
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Usage example, matching non-TCPv4 packets only:
+
+.. _table_rte_flow_item_invert_example:
+
+.. table:: Anything but TCPv4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | INVERT   |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | IPv4     |
+   +-------+----------+
+   | 3     | TCP      |
+   +-------+----------+
+   | 4     | END      |
+   +-------+----------+
+
+Item: ``PF``
+^^^^^^^^^^^^
+
+Matches packets addressed to the physical function of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: PF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if applied to a VF
+  device.
+- Can be combined with any number of `Item: VF`_ to match both PF and VF
+  traffic.
+- ``spec``, ``last`` and ``mask`` must not be set.
+
+.. _table_rte_flow_item_pf:
+
+.. table:: PF
+
+   +----------+-------+
+   | Field    | Value |
+   +==========+=======+
+   | ``spec`` | unset |
+   +----------+-------+
+   | ``last`` | unset |
+   +----------+-------+
+   | ``mask`` | unset |
+   +----------+-------+
+
+Item: ``VF``
+^^^^^^^^^^^^
+
+Matches packets addressed to a virtual function ID of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: VF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if this causes a VF
+  device to match traffic addressed to a different VF.
+- Can be specified multiple times to match traffic addressed to several VF
+  IDs.
+- Can be combined with a PF item to match both PF and VF traffic.
+
+.. _table_rte_flow_item_vf:
+
+.. table:: VF
+
+   +----------+----------+---------------------------+
+   | Field    | Subfield | Value                     |
+   +==========+==========+===========================+
+   | ``spec`` | ``id``   | destination VF ID         |
+   +----------+----------+---------------------------+
+   | ``last`` | ``id``   | upper range value         |
+   +----------+----------+---------------------------+
+   | ``mask`` | ``id``   | zeroed to match any VF ID |
+   +----------+----------+---------------------------+
+
+Item: ``PORT``
+^^^^^^^^^^^^^^
+
+Matches packets coming from the specified physical port of the underlying
+device.
+
+The first PORT item overrides the physical port normally associated with the
+specified DPDK input port (port_id). This item can be provided several times
+to match additional physical ports.
+
+Note that physical ports are not necessarily tied to DPDK input ports
+(port_id) when those are not under DPDK control. Possible values are
+specific to each device; they are not necessarily indexed from zero and may
+not be contiguous.
+
+As a device property, the list of allowed values as well as the value
+associated with a port_id should be retrieved by other means.
+
+.. _table_rte_flow_item_port:
+
+.. table:: PORT
+
+   +----------+-----------+--------------------------------+
+   | Field    | Subfield  | Value                          |
+   +==========+===========+================================+
+   | ``spec`` | ``index`` | physical port index            |
+   +----------+-----------+--------------------------------+
+   | ``last`` | ``index`` | upper range value              |
+   +----------+-----------+--------------------------------+
+   | ``mask`` | ``index`` | zeroed to match any port index |
+   +----------+-----------+--------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Most of these are basically protocol header definitions with associated
+bit-masks. They must be specified (stacked) from lowest to highest protocol
+layer to form a matching pattern.
+
+The following list is not exhaustive; new protocols will be added in the
+future.
+
+Item: ``ANY``
+^^^^^^^^^^^^^
+
+Matches any protocol in place of the current layer. A single ANY may also
+stand for several protocol layers.
+
+This is usually specified as the first pattern item when looking for a
+protocol anywhere in a packet.
+
+.. _table_rte_flow_item_any:
+
+.. table:: ANY
+
+   +----------+----------+--------------------------------------+
+   | Field    | Subfield | Value                                |
+   +==========+==========+======================================+
+   | ``spec`` | ``num``  | number of layers covered             |
+   +----------+----------+--------------------------------------+
+   | ``last`` | ``num``  | upper range value                    |
+   +----------+----------+--------------------------------------+
+   | ``mask`` | ``num``  | zeroed to cover any number of layers |
+   +----------+----------+--------------------------------------+
+
+Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
+and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
+or IPv6) matched by the second ANY specification:
+
+.. _table_rte_flow_item_any_example:
+
+.. table:: TCP in VXLAN with wildcards
+
+   +-------+------+----------+----------+-------+
+   | Index | Item | Field    | Subfield | Value |
+   +=======+======+==========+==========+=======+
+   | 0     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 1     | ANY  | ``spec`` | ``num``  | 2     |
+   +-------+------+----------+----------+-------+
+   | 2     | VXLAN                              |
+   +-------+------------------------------------+
+   | 3     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 4     | ANY  | ``spec`` | ``num``  | 1     |
+   +-------+------+----------+----------+-------+
+   | 5     | TCP                                |
+   +-------+------------------------------------+
+   | 6     | END                                |
+   +-------+------------------------------------+
+
+Item: ``RAW``
+^^^^^^^^^^^^^
+
+Matches a byte string of a given length at a given offset.
+
+Offset is either absolute (using the start of the packet) or relative to the
+end of the previous matched item in the stack, in which case negative values
+are allowed.
+
+If search is enabled, offset is used as the starting point. The search area
+can be delimited by setting limit to a nonzero value, which is the maximum
+number of bytes after offset where the pattern may start.
+
+Matching a zero-length pattern is allowed; doing so resets the relative
+offset for subsequent items.
+
+- This type does not support ranges (``last`` field).
+
+.. _table_rte_flow_item_raw:
+
+.. table:: RAW
+
+   +----------+--------------+-------------------------------------------------+
+   | Field    | Subfield     | Value                                           |
+   +==========+==============+=================================================+
+   | ``spec`` | ``relative`` | look for pattern after the previous item        |
+   |          +--------------+-------------------------------------------------+
+   |          | ``search``   | search pattern from offset (see also ``limit``) |
+   |          +--------------+-------------------------------------------------+
+   |          | ``reserved`` | reserved, must be set to zero                   |
+   |          +--------------+-------------------------------------------------+
+   |          | ``offset``   | absolute or relative offset for ``pattern``     |
+   |          +--------------+-------------------------------------------------+
+   |          | ``limit``    | search area limit for start of ``pattern``      |
+   |          +--------------+-------------------------------------------------+
+   |          | ``length``   | ``pattern`` length                              |
+   |          +--------------+-------------------------------------------------+
+   |          | ``pattern``  | byte string to look for                         |
+   +----------+--------------+-------------------------------------------------+
+   | ``last`` | if specified, either all 0 or with the same values as ``spec`` |
+   +----------+----------------------------------------------------------------+
+   | ``mask`` | bit-mask applied to ``spec`` values with usual behavior        |
+   +----------+----------------------------------------------------------------+
+
+Example pattern looking for several strings at various offsets of a UDP
+payload, using combined RAW items:
+
+.. _table_rte_flow_item_raw_example:
+
+.. table:: UDP payload matching
+
+   +-------+------+----------+--------------+-------+
+   | Index | Item | Field    | Subfield     | Value |
+   +=======+======+==========+==============+=======+
+   | 0     | Ethernet                               |
+   +-------+----------------------------------------+
+   | 1     | IPv4                                   |
+   +-------+----------------------------------------+
+   | 2     | UDP                                    |
+   +-------+------+----------+--------------+-------+
+   | 3     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 10    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "foo" |
+   +-------+------+----------+--------------+-------+
+   | 4     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 20    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "bar" |
+   +-------+------+----------+--------------+-------+
+   | 5     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | -29   |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "baz" |
+   +-------+------+----------+--------------+-------+
+   | 6     | END                                    |
+   +-------+----------------------------------------+
+
+This translates to:
+
+- Locate "foo" at least 10 bytes deep inside UDP payload.
+- Locate "bar" after "foo" plus 20 bytes.
+- Locate "baz" after "bar" minus 29 bytes.
+
+Such a packet may be represented as follows (not to scale)::
+
+ 0                     >= 10 B           == 20 B
+ |                  |<--------->|     |<--------->|
+ |                  |           |     |           |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+ | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+                          |                             |
+                          |<--------------------------->|
+                                      == 29 B
+
+Note that matching subsequent pattern items would resume after "baz", not
+"bar" since matching is always performed after the previous item of the
+stack.
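+
+The behavior of the three RAW items above can be approximated in software
+as follows. This is only a sketch of one possible interpretation:
+``struct raw_sketch`` is a hypothetical, simplified stand-in for the RAW
+item specification and ``raw_match()`` is not part of the API. The buffer is
+assumed to start right after the UDP header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a RAW item specification. */
struct raw_sketch {
	int relative;        /* offset is relative to the previous item */
	int search;          /* search for the pattern from offset */
	int32_t offset;      /* absolute or relative offset */
	uint16_t limit;      /* search area limit, 0 means no limit */
	uint16_t length;     /* pattern length */
	const char *pattern; /* byte string to look for */
};

/*
 * Try to match one RAW item against buf. "prev_end" is the offset just
 * past the previously matched item. Return the offset just past the
 * matched bytes on success, -1 otherwise.
 */
static long
raw_match(const uint8_t *buf, size_t len, long prev_end,
	  const struct raw_sketch *raw)
{
	long start = (raw->relative ? prev_end : 0) + raw->offset;
	long stop = start; /* last offset where the pattern may start */

	if (start < 0)
		return -1;
	if (raw->search) {
		stop = (long)len - raw->length;
		if (raw->limit != 0 && start + raw->limit < stop)
			stop = start + raw->limit;
	}
	for (; start <= stop; ++start) {
		if ((size_t)start + raw->length > len)
			break;
		if (memcmp(buf + start, raw->pattern, raw->length) == 0)
			return start + raw->length;
	}
	return -1;
}
```

+Calling ``raw_match()`` once per RAW item, each time passing the value
+returned by the previous call as ``prev_end``, reproduces the chaining shown
+in the diagram.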
+
+Item: ``ETH``
+^^^^^^^^^^^^^
+
+Matches an Ethernet header.
+
+- ``dst``: destination MAC.
+- ``src``: source MAC.
+- ``type``: EtherType.
+
+Item: ``VLAN``
+^^^^^^^^^^^^^^
+
+Matches an 802.1Q/ad VLAN tag.
+
+- ``tpid``: tag protocol identifier.
+- ``tci``: tag control information.
+
+Item: ``IPV4``
+^^^^^^^^^^^^^^
+
+Matches an IPv4 header.
+
+Note: IPv4 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv4 header definition (``rte_ip.h``).
+
+Item: ``IPV6``
+^^^^^^^^^^^^^^
+
+Matches an IPv6 header.
+
+Note: IPv6 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv6 header definition (``rte_ip.h``).
+
+Item: ``ICMP``
+^^^^^^^^^^^^^^
+
+Matches an ICMP header.
+
+- ``hdr``: ICMP header definition (``rte_icmp.h``).
+
+Item: ``UDP``
+^^^^^^^^^^^^^
+
+Matches a UDP header.
+
+- ``hdr``: UDP header definition (``rte_udp.h``).
+
+Item: ``TCP``
+^^^^^^^^^^^^^
+
+Matches a TCP header.
+
+- ``hdr``: TCP header definition (``rte_tcp.h``).
+
+Item: ``SCTP``
+^^^^^^^^^^^^^^
+
+Matches an SCTP header.
+
+- ``hdr``: SCTP header definition (``rte_sctp.h``).
+
+Item: ``VXLAN``
+^^^^^^^^^^^^^^^
+
+Matches a VXLAN header (RFC 7348).
+
+- ``flags``: normally 0x08 (I flag).
+- ``rsvd0``: reserved, normally 0x000000.
+- ``vni``: VXLAN network identifier.
+- ``rsvd1``: reserved, normally 0x00.
+
+Actions
+~~~~~~~
+
+Each possible action is represented by a type. Some have associated
+configuration structures. Several actions combined in a list can be assigned
+to a flow rule. That list is not ordered.
+
+They fall in three categories:
+
+- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+  processing matched packets by subsequent flow rules, unless overridden
+  with PASSTHRU.
+
+- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
+  additional processing by subsequent flow rules.
+
+- Other non-terminating meta actions that do not affect the fate of packets
+  (END, VOID, MARK, FLAG, COUNT).
+
+When several actions are combined in a flow rule, they should all have
+different types (e.g. dropping a packet twice is not possible).
+
+Only the last action of a given type is taken into account. PMDs still
+perform error checking on the entire list.
+
+Like matching patterns, action lists are terminated by END items.
+
+*Note that PASSTHRU is the only action able to override a terminating rule.*
+
+Example of action that redirects packets to queue index 10:
+
+.. _table_rte_flow_action_example:
+
+.. table:: Queue action
+
+   +-----------+-------+
+   | Field     | Value |
+   +===========+=======+
+   | ``index`` | 10    |
+   +-----------+-------+
+
+Examples of action lists follow. Their order is not significant;
+applications must consider all actions to be performed simultaneously:
+
+.. _table_rte_flow_count_and_drop:
+
+.. table:: Count and drop
+
+   +-------+--------+
+   | Index | Action |
+   +=======+========+
+   | 0     | COUNT  |
+   +-------+--------+
+   | 1     | DROP   |
+   +-------+--------+
+   | 2     | END    |
+   +-------+--------+
+
+|
+
+.. _table_rte_flow_mark_count_redirect:
+
+.. table:: Mark, count and redirect
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | MARK   | ``id``    | 0x2a  |
+   +-------+--------+-----------+-------+
+   | 1     | COUNT                      |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 10    |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+|
+
+.. _table_rte_flow_redirect_queue_5:
+
+.. table:: Redirect to queue 5
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | DROP                       |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+In the above example, considering both actions are performed simultaneously,
+the end result is that only QUEUE has any effect.
+
+.. _table_rte_flow_redirect_queue_3:
+
+.. table:: Redirect to queue 3
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 1     | VOID                       |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 3     |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+As previously described, only the last action of a given type found in the
+list is taken into account. The above example also shows that VOID is
+ignored.
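+
+How a PMD might resolve such a list can be sketched as a scan that keeps
+the last action of a given type. ``struct action_sketch`` is a hypothetical,
+simplified stand-in for ``struct rte_flow_action`` and
+``last_action_of_type()`` is an illustrative helper, not an API function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types, limited to those used in the examples. */
enum action_type {
	ACTION_END = 0,
	ACTION_VOID,
	ACTION_QUEUE,
	ACTION_DROP,
	ACTION_COUNT,
};

/* Hypothetical, simplified stand-in for struct rte_flow_action. */
struct action_sketch {
	enum action_type type;
	uint16_t queue_index; /* only meaningful for ACTION_QUEUE */
};

/*
 * Return a pointer to the last action of the given type found before
 * ACTION_END, or NULL if there is none. Other entries, including
 * ACTION_VOID, are skipped.
 */
static const struct action_sketch *
last_action_of_type(const struct action_sketch *actions,
		    enum action_type type)
{
	const struct action_sketch *found = NULL;

	for (; actions->type != ACTION_END; ++actions)
		if (actions->type == type)
			found = actions;
	return found;
}
```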
+
+Action types
+~~~~~~~~~~~~
+
+Common action types are described in this section. Like pattern item types,
+this list is not exhaustive as new actions will be added in the future.
+
+Action: ``END``
+^^^^^^^^^^^^^^^
+
+End marker for action lists. Prevents further processing of actions, thereby
+ending the list.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_end:
+
+.. table:: END
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VOID``
+^^^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_void:
+
+.. table:: VOID
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``PASSTHRU``
+^^^^^^^^^^^^^^^^^^^^
+
+Leaves packets up for additional processing by subsequent flow rules. This
+is the default when a rule does not contain a terminating action, but can be
+specified to force a rule to become non-terminating.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_passthru:
+
+.. table:: PASSTHRU
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Example to copy a packet to a queue and continue processing by subsequent
+flow rules:
+
+.. _table_rte_flow_action_passthru_example:
+
+.. table:: Copy to queue 8
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | PASSTHRU                   |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 8     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+Action: ``MARK``
+^^^^^^^^^^^^^^^^
+
+Attaches a 32-bit value to packets.
+
+This value is arbitrary and application-defined. For compatibility with FDIR
+it is returned in the ``hash.fdir.hi`` mbuf field. ``PKT_RX_FDIR_ID`` is
+also set in ``ol_flags``.
+
+.. _table_rte_flow_action_mark:
+
+.. table:: MARK
+
+   +--------+-------------------------------------+
+   | Field  | Value                               |
+   +========+=====================================+
+   | ``id`` | 32-bit value to return with packets |
+   +--------+-------------------------------------+
+
+Action: ``FLAG``
+^^^^^^^^^^^^^^^^
+
+Flag packets. Similar to `Action: MARK`_ but only affects ``ol_flags``.
+
+- No configurable properties.
+
+Note: a distinctive flag must be defined for it.
+
+.. _table_rte_flow_action_flag:
+
+.. table:: FLAG
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``QUEUE``
+^^^^^^^^^^^^^^^^^
+
+Assigns packets to a given queue index.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_queue:
+
+.. table:: QUEUE
+
+   +-----------+--------------------+
+   | Field     | Value              |
+   +===========+====================+
+   | ``index`` | queue index to use |
+   +-----------+--------------------+
+
+Action: ``DROP``
+^^^^^^^^^^^^^^^^
+
+Drop packets.
+
+- No configurable properties.
+- Terminating by default.
+- PASSTHRU overrides this action if both are specified.
+
+.. _table_rte_flow_action_drop:
+
+.. table:: DROP
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``COUNT``
+^^^^^^^^^^^^^^^^^
+
+Enables counters for this rule.
+
+These counters can be retrieved and reset through ``rte_flow_query()``, see
+``struct rte_flow_query_count``.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_count:
+
+.. table:: COUNT
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Query structure to retrieve and reset flow rule counters:
+
+.. _table_rte_flow_query_count:
+
+.. table:: COUNT query
+
+   +---------------+-----+-----------------------------------+
+   | Field         | I/O | Value                             |
+   +===============+=====+===================================+
+   | ``reset``     | in  | reset counter after query         |
+   +---------------+-----+-----------------------------------+
+   | ``hits_set``  | out | ``hits`` field is set             |
+   +---------------+-----+-----------------------------------+
+   | ``bytes_set`` | out | ``bytes`` field is set            |
+   +---------------+-----+-----------------------------------+
+   | ``hits``      | out | number of hits for this rule      |
+   +---------------+-----+-----------------------------------+
+   | ``bytes``     | out | number of bytes through this rule |
+   +---------------+-----+-----------------------------------+
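+
+As a rough illustration of how an application might consume this query
+result, the sketch below mirrors the table in a local structure and formats
+only the fields a PMD flagged as set. The structure name, the field widths
+and the ``format_count()`` helper are assumptions made for this example.
+

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Local mirror of the COUNT query table; field widths are assumptions. */
struct flow_query_count {
    uint32_t reset:1;     /* in: reset counter after query */
    uint32_t hits_set:1;  /* out: hits field is set */
    uint32_t bytes_set:1; /* out: bytes field is set */
    uint64_t hits;        /* out: number of hits for this rule */
    uint64_t bytes;       /* out: number of bytes through this rule */
};

/* Format only the counters reported as valid; returns snprintf()'s result. */
static int
format_count(const struct flow_query_count *c, char *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    if (c->hits_set && c->bytes_set)
        return snprintf(buf, len, "hits=%" PRIu64 " bytes=%" PRIu64,
                        c->hits, c->bytes);
    if (c->hits_set)
        return snprintf(buf, len, "hits=%" PRIu64, c->hits);
    if (c->bytes_set)
        return snprintf(buf, len, "bytes=%" PRIu64, c->bytes);
    return snprintf(buf, len, "no counters");
}
```

+Checking ``hits_set`` and ``bytes_set`` first matters because a device may
+only provide one of the two counters.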
+
+Action: ``DUP``
+^^^^^^^^^^^^^^^
+
+Duplicates packets to a given queue index.
+
+This is normally combined with QUEUE. When used alone, it is similar to
+QUEUE + PASSTHRU.
+
+- Non-terminating by default.
+
+.. _table_rte_flow_action_dup:
+
+.. table:: DUP
+
+   +-----------+------------------------------------+
+   | Field     | Value                              |
+   +===========+====================================+
+   | ``index`` | queue index to duplicate packet to |
+   +-----------+------------------------------------+
+
+Action: ``RSS``
+^^^^^^^^^^^^^^^
+
+Similar to QUEUE, except RSS is additionally performed on packets to spread
+them among several queues according to the provided parameters.
+
+Note: RSS hash result is normally stored in the ``hash.rss`` mbuf field,
+however it conflicts with `Action: MARK`_ as they share the same space. When
+both actions are specified, the RSS hash is discarded and
+``PKT_RX_RSS_HASH`` is not set in ``ol_flags``. MARK has priority. The mbuf
+structure should eventually evolve to store both.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_rss:
+
+.. table:: RSS
+
+   +--------------+------------------------------+
+   | Field        | Value                        |
+   +==============+==============================+
+   | ``rss_conf`` | RSS parameters               |
+   +--------------+------------------------------+
+   | ``num``      | number of entries in queue[] |
+   +--------------+------------------------------+
+   | ``queue[]``  | queue indices to use         |
+   +--------------+------------------------------+
+
+Action: ``PF``
+^^^^^^^^^^^^^^
+
+Redirects packets to the physical function (PF) of the current device.
+
+- No configurable properties.
+- Terminating by default.
+
+.. _table_rte_flow_action_pf:
+
+.. table:: PF
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VF``
+^^^^^^^^^^^^^^
+
+Redirects packets to a virtual function (VF) of the current device.
+
+Packets matched by a VF pattern item can be redirected to their original VF
+ID instead of the specified one. This parameter may not be available and is
+not guaranteed to work properly if the VF part is matched by a prior flow
+rule or if packets are not addressed to a VF in the first place.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_vf:
+
+.. table:: VF
+
+   +--------------+--------------------------------+
+   | Field        | Value                          |
+   +==============+================================+
+   | ``original`` | use original VF ID if possible |
+   +--------------+--------------------------------+
+   | ``vf``       | VF ID to redirect packets to   |
+   +--------------+--------------------------------+
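+
+The "terminating by default" notes attached to the actions above can be
+combined into a single check. The following sketch is one possible reading
+of those notes, not a definition: the enum is a local stand-in whose values
+(other than END being 0) are assumptions, and PASSTHRU is taken to force a
+non-terminating rule whenever it is present.
+

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the action types listed in this section. */
enum action_type {
    ACT_END = 0, ACT_VOID, ACT_PASSTHRU, ACT_MARK, ACT_FLAG, ACT_QUEUE,
    ACT_DROP, ACT_COUNT, ACT_DUP, ACT_RSS, ACT_PF, ACT_VF,
};

/* Actions documented above as terminating by default. */
static bool
action_terminates(enum action_type type)
{
    switch (type) {
    case ACT_QUEUE:
    case ACT_DROP:
    case ACT_RSS:
    case ACT_PF:
    case ACT_VF:
        return true;
    default:
        return false;
    }
}

/* Return true if an END-terminated action list makes its rule terminating. */
static bool
rule_is_terminating(const enum action_type *list)
{
    bool terminating = false;

    assert(list != NULL);
    for (; *list != ACT_END; ++list) {
        if (*list == ACT_PASSTHRU)
            return false; /* PASSTHRU overrides terminating actions */
        if (action_terminates(*list))
            terminating = true;
    }
    return terminating;
}
```

+With this reading, a lone QUEUE is terminating while PASSTHRU + QUEUE and a
+lone DUP are not.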
+
+Negative types
+~~~~~~~~~~~~~~
+
+All specified pattern items (``enum rte_flow_item_type``) and actions
+(``enum rte_flow_action_type``) use positive identifiers.
+
+The negative space is reserved for dynamic types generated by PMDs during
+run-time. PMDs may encounter them as a result but must not accept negative
+identifiers they are not aware of.
+
+A method to generate them remains to be defined.
+
+Planned types
+~~~~~~~~~~~~~
+
+Pattern item types will be added as new protocols are implemented.
+
+Variable headers will be supported through dedicated pattern items. For
+example, items matching specific IPv4 options and IPv6 extension headers
+would be stacked after IPv4/IPv6 items.
+
+Other action types are planned but are not defined yet. These include the
+ability to alter packet data in several ways, such as performing
+encapsulation/decapsulation of tunnel headers.
+
+Rules management
+----------------
+
+A rather simple API with few functions is provided to fully manage flow
+rules.
+
+Each created flow rule is associated with an opaque, PMD-specific handle
+pointer. The application is responsible for keeping it until the rule is
+destroyed.
+
+Flow rules are represented by ``struct rte_flow`` objects.
+
+Validation
+~~~~~~~~~~
+
+Given that expressing a definite set of device capabilities is not
+practical, a dedicated function is provided to check if a flow rule is
+supported and can be created.
+
+.. code-block:: c
+
+   int
+   rte_flow_validate(uint8_t port_id,
+                     const struct rte_flow_attr *attr,
+                     const struct rte_flow_item pattern[],
+                     const struct rte_flow_action actions[],
+                     struct rte_flow_error *error);
+
+While this function has no effect on the target device, the flow rule is
+validated against its current configuration state and the returned value
+should be considered valid by the caller for that state only.
+
+The returned value is guaranteed to remain valid only as long as no
+successful calls to ``rte_flow_create()`` or ``rte_flow_destroy()`` are made
+in the meantime and no device parameter affecting flow rules in any way is
+modified, due to possible collisions or resource limitations (although in
+such cases ``EINVAL`` should not be returned).
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 if the flow rule is valid and can be created, a negative errno value
+  otherwise (``rte_errno`` is also set). The following errors are defined:
+- ``-ENOSYS``: underlying device does not support this functionality.
+- ``-EINVAL``: unknown or invalid rule specification.
+- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial
+  bit-masks are unsupported).
+- ``-EEXIST``: collision with an existing rule.
+- ``-ENOMEM``: not enough resources.
+- ``-EBUSY``: action cannot be performed due to busy device resources, may
+  succeed if the affected queues or even the entire port are in a stopped
+  state (see ``rte_eth_dev_rx_queue_stop()`` and ``rte_eth_dev_stop()``).
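+
+An application may want to fold these return values into a handful of
+outcomes before deciding what to do next. The sketch below shows one way to
+do so; the ``flow_verdict`` categories and the ``flow_classify()`` helper are
+hypothetical application-side names, not part of the proposed API.
+

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical application-side grouping of the documented error codes. */
enum flow_verdict {
    FLOW_OK,            /* rule is valid and can be created */
    FLOW_UNSUPPORTED,   /* -ENOSYS, -ENOTSUP: device cannot do this */
    FLOW_BAD_RULE,      /* -EINVAL: fix the rule specification */
    FLOW_NO_ROOM,       /* -EEXIST, -ENOMEM: collision or no resources */
    FLOW_RETRY_STOPPED, /* -EBUSY: may succeed with queues/port stopped */
    FLOW_UNKNOWN,       /* anything not listed above */
};

/* Map a validation return value (0 or negative errno) to a verdict. */
static enum flow_verdict
flow_classify(int ret)
{
    assert(ret <= 0);
    switch (-ret) {
    case 0:
        return FLOW_OK;
    case ENOSYS:
    case ENOTSUP:
        return FLOW_UNSUPPORTED;
    case EINVAL:
        return FLOW_BAD_RULE;
    case EEXIST:
    case ENOMEM:
        return FLOW_NO_ROOM;
    case EBUSY:
        return FLOW_RETRY_STOPPED;
    default:
        return FLOW_UNKNOWN;
    }
}
```

+Only the ``FLOW_RETRY_STOPPED`` case suggests retrying the same rule, after
+stopping the affected queues or the port.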
+
+Creation
+~~~~~~~~
+
+Creating a flow rule is similar to validating one, except the rule is
+actually created and a handle returned.
+
+.. code-block:: c
+
+   struct rte_flow *
+   rte_flow_create(uint8_t port_id,
+                   const struct rte_flow_attr *attr,
+                   const struct rte_flow_item pattern[],
+                   const struct rte_flow_action actions[],
+                   struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+A valid handle in case of success, NULL otherwise and ``rte_errno`` is set
+to the positive version of one of the error codes defined for
+``rte_flow_validate()``.
+
+Destruction
+~~~~~~~~~~~
+
+Flow rule destruction is not automatic, and a queue or a port should not be
+released while flow rules are still attached to it. Applications must take
+care of performing this step before releasing resources.
+
+.. code-block:: c
+
+   int
+   rte_flow_destroy(uint8_t port_id,
+                    struct rte_flow *flow,
+                    struct rte_flow_error *error);
+
+
+Failure to destroy a flow rule handle may occur when other flow rules depend
+on it, and destroying it would result in an inconsistent state.
+
+This function is only guaranteed to succeed if handles are destroyed in
+reverse order of their creation.
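+
+Since success is only guaranteed for reverse-order destruction, applications
+can keep their handles on a simple stack. The sketch below is a minimal
+illustration under that assumption: ``rule_stack``, ``MAX_RULES`` and the
+callback signature are hypothetical, and ``trace_destroy()`` is a stub
+standing in for a wrapper around ``rte_flow_destroy()``.
+

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RULES 64 /* arbitrary capacity for this example */

/* Handles are stored as opaque pointers in creation order. */
struct rule_stack {
    void *handle[MAX_RULES];
    size_t count;
};

/* Record a newly created handle; returns 0, or -1 if the stack is full. */
static int
rule_stack_push(struct rule_stack *s, void *handle)
{
    assert(s != NULL && handle != NULL);
    if (s->count == MAX_RULES)
        return -1;
    s->handle[s->count++] = handle;
    return 0;
}

/*
 * Destroy handles in reverse order of creation. Stops at the first failure
 * and returns its value; handles not destroyed yet remain on the stack.
 */
static int
rule_stack_destroy_all(struct rule_stack *s,
                       int (*destroy)(void *handle, void *ctx), void *ctx)
{
    assert(s != NULL && destroy != NULL);
    while (s->count > 0) {
        int ret = destroy(s->handle[s->count - 1], ctx);

        if (ret != 0)
            return ret;
        --s->count;
    }
    return 0;
}

/* Stub callback for illustration: records the order of destruction. */
struct trace {
    void *seen[MAX_RULES];
    size_t n;
};

static int
trace_destroy(void *handle, void *ctx)
{
    struct trace *t = ctx;

    if (t->n == MAX_RULES)
        return -1;
    t->seen[t->n++] = handle;
    return 0;
}
```

+A real callback would forward the handle to ``rte_flow_destroy()`` along
+with the port ID carried in ``ctx``.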
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to destroy.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Flush
+~~~~~
+
+Convenience function to destroy all flow rule handles associated with a
+port. They are released as with successive calls to ``rte_flow_destroy()``.
+
+.. code-block:: c
+
+   int
+   rte_flow_flush(uint8_t port_id,
+                  struct rte_flow_error *error);
+
+In the unlikely event of failure, handles are still considered destroyed and
+no longer valid but the port must be assumed to be in an inconsistent state.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Query
+~~~~~
+
+Query an existing flow rule.
+
+This function allows retrieving flow-specific data such as counters. Data
+is gathered by special actions which must be present in the flow rule
+definition.
+
+.. code-block:: c
+
+   int
+   rte_flow_query(uint8_t port_id,
+                  struct rte_flow *flow,
+                  enum rte_flow_action_type action,
+                  void *data,
+                  struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to query.
+- ``action``: action type to query.
+- ``data``: pointer to storage for the associated query data type.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Verbose error reporting
+-----------------------
+
+The defined *errno* values may not be accurate enough for users or
+application developers who want to investigate issues related to flow rules
+management. A dedicated error object is defined for this purpose:
+
+.. code-block:: c
+
+   enum rte_flow_error_type {
+       RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+       RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+       RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+       RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+       RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+       RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+       RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+       RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+   };
+
+   struct rte_flow_error {
+       enum rte_flow_error_type type; /**< Cause field and error types. */
+       const void *cause; /**< Object responsible for the error. */
+       const char *message; /**< Human-readable error message. */
+   };
+
+Error type ``RTE_FLOW_ERROR_TYPE_NONE`` stands for no error, in which case
+remaining fields can be ignored. Other error types describe the type of the
+object pointed to by ``cause``.
+
+If non-NULL, ``cause`` points to the object responsible for the error. For a
+flow rule, this may be a pattern item or an individual action.
+
+If non-NULL, ``message`` provides a human-readable error message.
+
+This object is normally allocated by applications and set by PMDs in case of
+error. The message points to a constant string which does not need to be
+freed by the application; however, its pointer can be considered valid only
+as long as its associated DPDK port remains configured. Closing the
+underlying device or unloading the PMD invalidates it.
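+
+The sketch below shows how an application could turn such an error object
+into a log line. The enum and structure are copied from the definitions
+above so that the example is self-contained; the ``type_name`` table and the
+``flow_error_format()`` helper are hypothetical additions made for this
+illustration.
+

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
    RTE_FLOW_ERROR_TYPE_HANDLE,
    RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
    RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
    RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR,
    RTE_FLOW_ERROR_TYPE_ITEM_NUM,
    RTE_FLOW_ERROR_TYPE_ITEM,
    RTE_FLOW_ERROR_TYPE_ACTION_NUM,
    RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    const void *cause;
    const char *message;
};

/* Hypothetical short names, indexed by error type. */
static const char *const type_name[] = {
    [RTE_FLOW_ERROR_TYPE_NONE] = "none",
    [RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "unspecified",
    [RTE_FLOW_ERROR_TYPE_HANDLE] = "handle",
    [RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group",
    [RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority",
    [RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress",
    [RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress",
    [RTE_FLOW_ERROR_TYPE_ATTR] = "attributes",
    [RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
    [RTE_FLOW_ERROR_TYPE_ITEM] = "pattern item",
    [RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
    [RTE_FLOW_ERROR_TYPE_ACTION] = "action",
};

/* Write "<type>: <message>" into buf; returns snprintf()'s result. */
static int
flow_error_format(const struct rte_flow_error *err, char *buf, size_t len)
{
    const size_t n = sizeof(type_name) / sizeof(type_name[0]);
    const char *name;

    assert(err != NULL && buf != NULL);
    if (err->type == RTE_FLOW_ERROR_TYPE_NONE)
        return snprintf(buf, len, "no error"); /* other fields ignored */
    name = (size_t)err->type < n ? type_name[err->type] : "unknown";
    return snprintf(buf, len, "%s: %s", name,
                    err->message != NULL ? err->message : "(no message)");
}
```

+Because ``message`` may be NULL, the helper substitutes a placeholder
+instead of passing a NULL pointer to ``snprintf()``.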
+
+Caveats
+-------
+
+- DPDK does not keep track of flow rules definitions or flow rule objects
+  automatically. Applications may keep track of the former and must keep
+  track of the latter. PMDs may also do it for internal needs, however this
+  must not be relied on by applications.
+
+- Flow rules are not maintained between successive port initializations. An
+  application exiting without releasing them and restarting must re-create
+  them from scratch.
+
+- API operations are synchronous and blocking (``EAGAIN`` cannot be
+  returned).
+
+- There is no provision for reentrancy/multi-thread safety, although nothing
+  should prevent different devices from being configured at the same
+  time. PMDs may protect their control path functions accordingly.
+
+- Stopping the data path (TX/RX) should not be necessary when managing flow
+  rules. If this cannot be achieved naturally or with workarounds (such as
+  temporarily replacing the burst function pointers), an appropriate error
+  code must be returned (``EBUSY``).
+
+- PMDs, not applications, are responsible for maintaining flow rules
+  configuration when stopping and restarting a port or performing other
+  actions which may affect them. They can only be destroyed explicitly by
+  applications.
+
+For devices exposing multiple ports sharing global settings affected by flow
+rules:
+
+- All ports under DPDK control must behave consistently. PMDs are
+  responsible for making sure that existing flow rules on a port are not
+  affected by other ports.
+
+- Ports not under DPDK control (unaffected or handled by other applications)
+  are the user's responsibility. They may affect existing flow rules and
+  cause undefined behavior. PMDs aware of this may prevent flow rules
+  creation altogether in such cases.
+
+PMD interface
+-------------
+
+The PMD interface is defined in ``rte_flow_driver.h``. It is not subject to
+API/ABI versioning constraints as it is not exposed to applications and may
+evolve independently.
+
+It is currently implemented on top of the legacy filtering framework through
+filter type *RTE_ETH_FILTER_GENERIC* that accepts the single operation
+*RTE_ETH_FILTER_GET* to return PMD-specific *rte_flow* callbacks wrapped
+inside ``struct rte_flow_ops``.
+
+This overhead is temporarily necessary in order to keep compatibility with
+the legacy filtering framework, which should eventually disappear.
+
+- PMD callbacks implement exactly the interface described in `Rules
+  management`_, except for the port ID argument which has already been
+  converted to a pointer to the underlying ``struct rte_eth_dev``.
+
+- Public API functions do not process flow rules definitions at all before
+  calling PMD functions (no basic error checking, no validation
+  whatsoever). They only make sure these callbacks are non-NULL or return
+  the ``ENOSYS`` (function not supported) error.
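+
+The thin dispatch layer described in the last item can be pictured as
+follows. This is only a sketch with reduced, hypothetical types
+(``struct eth_dev``, ``struct flow_ops``, ``flow_flush()``); the actual
+definitions live in ``rte_flow_driver.h`` and are not reproduced here.
+

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the device and callback table. */
struct eth_dev {
    int port_id;
};

struct flow_ops {
    int (*flush)(struct eth_dev *dev);
};

/*
 * Public-side wrapper: performs no validation besides checking that the
 * PMD provides the callback, then hands over to the PMD.
 */
static int
flow_flush(struct eth_dev *dev, const struct flow_ops *ops)
{
    assert(dev != NULL);
    if (ops == NULL || ops->flush == NULL)
        return -ENOSYS; /* function not supported */
    return ops->flush(dev);
}

/* Stub PMD callback for illustration; a real one would program hardware. */
static int
stub_flush(struct eth_dev *dev)
{
    (void)dev;
    return 0;
}
```

+A PMD that leaves a callback unset is therefore reported as not supporting
+the corresponding operation, without any other processing taking place.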
+
+This interface additionally defines the following helper functions:
+
+- ``rte_flow_ops_get()``: get generic flow operations structure from a
+  port.
+
+- ``rte_flow_error_set()``: initialize generic flow error structure.
+
+More will be added over time.
+
+Device compatibility
+--------------------
+
+No known implementation supports all the described features.
+
+Unsupported features or combinations are not expected to be fully emulated
+in software by PMDs for performance reasons. Partially supported features
+may be completed in software as long as hardware performs most of the work
+(such as queue redirection and packet recognition).
+
+However PMDs are expected to do their best to satisfy application requests
+by working around hardware limitations as long as doing so does not affect
+the behavior of existing flow rules.
+
+The following sections provide a few examples of such cases and describe how
+PMDs should handle them. They are based on limitations built into the
+previous APIs.
+
+Global bit-masks
+~~~~~~~~~~~~~~~~
+
+Each flow rule comes with its own, per-layer bit-masks, while hardware may
+support only a single, device-wide bit-mask for a given layer type, so that
+two IPv4 rules cannot use different bit-masks.
+
+The expected behavior in this case is that PMDs automatically configure
+global bit-masks according to the needs of the first flow rule created.
+
+Subsequent rules are allowed only if their bit-masks match those; the
+``EEXIST`` error code should be returned otherwise.
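+
+On the PMD side, this behavior amounts to reference-counting the global
+bit-mask. The following is a minimal sketch assuming a single 32-bit mask
+per layer; ``struct global_mask`` and both helpers are hypothetical names,
+and the actual hardware programming step is left out.
+

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-layer state kept by a PMD for a device-wide bit-mask. */
struct global_mask {
    uint32_t mask;     /* bit-mask currently configured */
    unsigned int refs; /* number of flow rules relying on it */
};

/* Called on rule creation; returns 0 on success, -EEXIST on mismatch. */
static int
global_mask_acquire(struct global_mask *g, uint32_t rule_mask)
{
    assert(g != NULL);
    if (g->refs == 0)
        g->mask = rule_mask; /* first rule decides (hardware setup omitted) */
    else if (g->mask != rule_mask)
        return -EEXIST;      /* incompatible with existing rules */
    ++g->refs;
    return 0;
}

/* Called on rule destruction; the mask is free again once refs drops to 0. */
static void
global_mask_release(struct global_mask *g)
{
    assert(g != NULL && g->refs > 0);
    --g->refs;
}
```

+Once every rule relying on the bit-mask has been released, the next created
+rule is free to configure a different one.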
+
+Unsupported layer types
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Many protocols can be simulated by crafting patterns with the `Item: RAW`_
+type.
+
+PMDs can rely on this capability to simulate support for protocols with
+headers not directly recognized by hardware.
+
+``ANY`` pattern item
+~~~~~~~~~~~~~~~~~~~~
+
+This pattern item stands for anything, which can be difficult to translate
+to something hardware would understand, particularly if followed by more
+specific types.
+
+Consider the following pattern:
+
+.. _table_rte_flow_unsupported_any:
+
+.. table:: Pattern with ANY as L3
+
+   +-------+-----------------------+
+   | Index | Item                  |
+   +=======+=======================+
+   | 0     | ETH                   |
+   +-------+-----+---------+-------+
+   | 1     | ANY | ``num`` | ``1`` |
+   +-------+-----+---------+-------+
+   | 2     | TCP                   |
+   +-------+-----------------------+
+   | 3     | END                   |
+   +-------+-----------------------+
+
+Knowing that TCP does not make sense with something other than IPv4 and IPv6
+as L3, such a pattern may be translated to two flow rules instead:
+
+.. _table_rte_flow_unsupported_any_ipv4:
+
+.. table:: ANY replaced with IPV4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETH                |
+   +-------+--------------------+
+   | 1     | IPV4 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+|
+
+.. _table_rte_flow_unsupported_any_ipv6:
+
+.. table:: ANY replaced with IPV6
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETH                |
+   +-------+--------------------+
+   | 1     | IPV6 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+Note that as soon as an ANY rule covers several layers, this approach may
+yield a large number of hidden flow rules. It is thus suggested to only
+support the most common scenarios (anything as L2 and/or L3).
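+
+The translation shown in the tables above can be sketched as a simple
+substitution. The code below uses a reduced, hypothetical item enum and
+assumes that ANY covers exactly one layer (``num`` set to 1); it copies a
+pattern while replacing the first ANY item, and is called once per candidate
+L3 type.
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of pattern item types. */
enum item_type {
    ITEM_END = 0, ITEM_ANY, ITEM_ETH, ITEM_IPV4, ITEM_IPV6, ITEM_TCP,
};

#define PATTERN_MAX 8 /* arbitrary limit for this example */

/*
 * Copy an END-terminated pattern into dst, replacing the first ANY item
 * with subst. Returns the number of items written (END included), or 0 if
 * the pattern has no ANY item or does not fit in PATTERN_MAX entries.
 */
static size_t
replace_any(const enum item_type *src, enum item_type subst,
            enum item_type dst[PATTERN_MAX])
{
    int replaced = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < PATTERN_MAX; ++i) {
        enum item_type t = src[i];

        if (t == ITEM_ANY && !replaced) {
            t = subst;
            replaced = 1;
        }
        dst[i] = t;
        if (src[i] == ITEM_END)
            return replaced ? i + 1 : 0;
    }
    return 0; /* pattern too long */
}
```

+Calling it with ``ITEM_IPV4`` and then ``ITEM_IPV6`` on the ETH / ANY / TCP
+pattern produces the two hidden patterns listed above (the zeroed masks are
+not modeled here).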
+
+Unsupported actions
+~~~~~~~~~~~~~~~~~~~
+
+- When combined with `Action: QUEUE`_, packet counting (`Action: COUNT`_)
+  and tagging (`Action: MARK`_ or `Action: FLAG`_) may be implemented in
+  software as long as the target queue is used by a single rule.
+
+- A rule specifying both `Action: DUP`_ + `Action: QUEUE`_ may be translated
+  to two hidden rules combining `Action: QUEUE`_ and `Action: PASSTHRU`_.
+
+- When a single target queue is provided, `Action: RSS`_ can also be
+  implemented through `Action: QUEUE`_.
+
+Flow rules priority
+~~~~~~~~~~~~~~~~~~~
+
+While it would naturally make sense, flow rules cannot be assumed to be
+processed by hardware in the same order as their creation for several
+reasons:
+
+- They may be managed internally as a tree or a hash table instead of a
+  list.
+- Removing a flow rule before adding another one can either put the new rule
+  at the end of the list or reuse a freed entry.
+- Duplication may occur when packets are matched by several rules.
+
+For overlapping rules (particularly in order to use `Action: PASSTHRU`_)
+predictable behavior is only guaranteed by using different priority levels.
+
+Priority levels are not necessarily implemented in hardware, or may be
+severely limited (e.g. a single priority bit).
+
+For these reasons, priority levels may be implemented purely in software by
+PMDs.
+
+- For devices expecting flow rules to be added in the correct order, PMDs
+  may destroy and re-create existing rules after adding a new one with
+  a higher priority.
+
+- A configurable number of dummy or empty rules can be created at
+  initialization time to save high priority slots for later.
+
+- In order to save priority levels, PMDs may evaluate whether rules are
+  likely to collide and adjust their priority accordingly.
+
+Future evolutions
+-----------------
+
+- A device profile selection function which could be used to force a
+  permanent profile instead of relying on its automatic configuration based
+  on existing flow rules.
+
+- A method to optimize *rte_flow* rules with specific pattern items and
+  action types generated on the fly by PMDs. DPDK should assign negative
+  numbers to these in order to not collide with the existing types. See
+  `Negative types`_.
+
+- Adding specific egress pattern items and actions as described in
+  `Attribute: Traffic direction`_.
+
+- Optional software fallback when PMDs are unable to handle requested flow
+  rules so applications do not have to implement their own.
+
+API migration
+-------------
+
+Exhaustive list of deprecated filter types (normally prefixed with
+*RTE_ETH_FILTER_*) found in ``rte_eth_ctrl.h`` and methods to convert them
+to *rte_flow* rules.
+
+``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*MACVLAN* can be translated to a basic `Item: ETH`_ flow rule with a
+terminating `Action: VF`_ or `Action: PF`_.
+
+.. _table_rte_flow_migration_macvlan:
+
+.. table:: MACVLAN conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | VF,     |
+   |   |     +----------+-----+ PF      |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*ETHERTYPE* is basically an `Item: ETH`_ flow rule with a terminating
+`Action: QUEUE`_ or `Action: DROP`_.
+
+.. _table_rte_flow_migration_ethertype:
+
+.. table:: ETHERTYPE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | QUEUE,  |
+   |   |     +----------+-----+ DROP    |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``FLEXIBLE`` to ``RAW`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FLEXIBLE* can be translated to one `Item: RAW`_ pattern with a terminating
+`Action: QUEUE`_ and a defined priority level.
+
+.. _table_rte_flow_migration_flexible:
+
+.. table:: FLEXIBLE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | RAW | ``spec`` | any | QUEUE   |
+   |   |     +----------+-----+         |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``SYN`` to ``TCP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*SYN* is an `Item: TCP`_ rule with only the ``syn`` bit enabled and masked,
+and a terminating `Action: QUEUE`_.
+
+Priority level can be set to simulate the high priority bit.
+
+.. _table_rte_flow_migration_syn:
+
+.. table:: SYN conversion
+
+   +-----------------------------------+---------+
+   | Pattern                           | Actions |
+   +===+======+==========+=============+=========+
+   | 0 | ETH  | ``spec`` | unset       | QUEUE   |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+-------------+         |
+   | 1 | IPV4 | ``spec`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+---------+---+         |
+   | 2 | TCP  | ``spec`` | ``syn`` | 1 |         |
+   |   |      +----------+---------+---+         |
+   |   |      | ``mask`` | ``syn`` | 1 |         |
+   +---+------+----------+---------+---+---------+
+   | 3 | END                           | END     |
+   +---+-------------------------------+---------+
+
+``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*NTUPLE* is similar to specifying an empty L2, `Item: IPV4`_ as L3 with
+`Item: TCP`_ or `Item: UDP`_ as L4 and a terminating `Action: QUEUE`_.
+
+A priority level can be specified as well.
+
+.. _table_rte_flow_migration_ntuple:
+
+.. table:: NTUPLE conversion
+
+   +-----------------------------+---------+
+   | Pattern                     | Actions |
+   +===+======+==========+=======+=========+
+   | 0 | ETH  | ``spec`` | unset | QUEUE   |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | unset |         |
+   +---+------+----------+-------+         |
+   | 1 | IPV4 | ``spec`` | any   |         |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+         |
+   | 2 | TCP, | ``spec`` | any   |         |
+   |   | UDP  +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+---------+
+   | 3 | END                     | END     |
+   +---+-------------------------+---------+
+
+``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*TUNNEL* matches common IPv4 and IPv6 L3/L4-based tunnel types.
+
+In the following table, `Item: ANY`_ is used to cover the optional L4.
+
+.. _table_rte_flow_migration_tunnel:
+
+.. table:: TUNNEL conversion
+
+   +--------------------------------------+---------+
+   | Pattern                              | Actions |
+   +===+=========+==========+=============+=========+
+   | 0 | ETH     | ``spec`` | any         | QUEUE   |
+   |   |         +----------+-------------+         |
+   |   |         | ``last`` | unset       |         |
+   |   |         +----------+-------------+         |
+   |   |         | ``mask`` | any         |         |
+   +---+---------+----------+-------------+         |
+   | 1 | IPV4,   | ``spec`` | any         |         |
+   |   | IPV6    +----------+-------------+         |
+   |   |         | ``last`` | unset       |         |
+   |   |         +----------+-------------+         |
+   |   |         | ``mask`` | any         |         |
+   +---+---------+----------+-------------+         |
+   | 2 | ANY     | ``spec`` | any         |         |
+   |   |         +----------+-------------+         |
+   |   |         | ``last`` | unset       |         |
+   |   |         +----------+---------+---+         |
+   |   |         | ``mask`` | ``num`` | 0 |         |
+   +---+---------+----------+---------+---+         |
+   | 3 | VXLAN,  | ``spec`` | any         |         |
+   |   | GENEVE, +----------+-------------+         |
+   |   | TEREDO, | ``last`` | unset       |         |
+   |   | NVGRE,  +----------+-------------+         |
+   |   | GRE,    | ``mask`` | any         |         |
+   |   | ...     |          |             |         |
+   |   |         |          |             |         |
+   |   |         |          |             |         |
+   +---+---------+----------+-------------+---------+
+   | 4 | END                              | END     |
+   +---+----------------------------------+---------+
+
+``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FDIR* is more complex than any other type and there are several methods to
+emulate its functionality. It is summarized for the most part in the table
+below.
+
+A few features are intentionally not supported:
+
+- The ability to configure the matching input set and masks for the entire
+  device. PMDs should take care of it automatically according to the
+  requested flow rules.
+
+  For example, if a device supports only one bit-mask per protocol type,
+  source/destination IPv4 bit-masks can be made immutable by the first
+  created rule. Subsequent IPv4 or TCPv4 rules can only be created if they
+  are compatible.
+
+  Note that only protocol bit-masks affected by existing flow rules are
+  immutable, others can be changed later. They become mutable again after
+  the related flow rules are destroyed.
+
+- Returning four or eight bytes of matched data when using flex bytes
+  filtering. Although a specific action could implement it, it conflicts
+  with the much more useful 32-bit tagging on devices that support it.
+
+- Side effects on RSS processing of the entire device. Flow rules that
+  conflict with the current device configuration should not be
+  allowed. Similarly, device configuration should not be allowed when it
+  affects existing flow rules.
+
+- Device modes of operation. "none" is unsupported since filtering cannot be
+  disabled as long as a flow rule is present.
+
+- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
+  according to the created flow rules.
+
+- Signature mode of operation is not defined but could be handled through a
+  specific item type if needed.
+
+.. _table_rte_flow_migration_fdir:
+
+.. table:: FDIR conversion
+
+   +---------------------------------+------------+
+   | Pattern                         | Actions    |
+   +===+============+==========+=====+============+
+   | 0 | ETH,       | ``spec`` | any | QUEUE,     |
+   |   | RAW        +----------+-----+ DROP,      |
+   |   |            | ``last`` | N/A | PASSTHRU   |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+------------+
+   | 1 | IPV4,      | ``spec`` | any | MARK       |
+   |   | IPV6       +----------+-----+            |
+   |   |            | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+            |
+   | 2 | TCP,       | ``spec`` | any |            |
+   |   | UDP,       +----------+-----+            |
+   |   | SCTP       | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+            |
+   | 3 | VF,        | ``spec`` | any |            |
+   |   | PF         +----------+-----+            |
+   |   | (optional) | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+------------+
+   | 4 | END                         | END        |
+   +---+-----------------------------+------------+
+
+``HASH``
+~~~~~~~~
+
+There is no counterpart to this filter type because it translates to a
+global device setting instead of a pattern item. Device settings are
+automatically set according to the created flow rules.
+
+``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All packets are matched. This type alters incoming packets to encapsulate
+them in a chosen tunnel type, and can optionally redirect them to a VF.
+
+The destination pool for tag based forwarding can be emulated with other
+flow rules using `Action: DUP`_.
+
+.. _table_rte_flow_migration_l2tunnel:
+
+.. table:: L2_TUNNEL conversion
+
+   +---------------------------+------------+
+   | Pattern                   | Actions    |
+   +===+======+==========+=====+============+
+   | 0 | VOID | ``spec`` | N/A | VXLAN,     |
+   |   |      |          |     | GENEVE,    |
+   |   |      |          |     | ...        |
+   |   |      +----------+-----+------------+
+   |   |      | ``last`` | N/A | VF         |
+   |   |      +----------+-----+ (optional) |
+   |   |      | ``mask`` | N/A |            |
+   |   |      |          |     |            |
+   +---+------+----------+-----+------------+
+   | 1 | END                   | END        |
+   +---+-----------------------+------------+
-- 
2.1.4

* [dpdk-dev] [PATCH v3 03/25] doc: announce deprecation of legacy filter types
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
                           ` (22 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

They are superseded by the generic flow API (rte_flow). Target release is
not defined yet.

Suggested-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2d17bc6..1438c77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -71,3 +71,11 @@ Deprecation Notices
 * mempool: The functions for single/multi producer/consumer are deprecated
   and will be removed in 17.02.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
+
+* ethdev: the legacy filter API, including
+  ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
+  as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
+  HASH and L2_TUNNEL, is superseded by the generic flow API (rte_flow) in
+  PMDs that implement the latter.
+  Target release for removal of the legacy API will be defined once most
+  PMDs have switched to rte_flow.
-- 
2.1.4

* [dpdk-dev] [PATCH v3 04/25] cmdline: add support for dynamic tokens
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (2 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 05/25] cmdline: add alignment constraint Adrien Mazarguil
                           ` (21 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
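Illustration only, not part of the patch: the callback protocol described
in the cmdline_parse.h comment can be sketched outside librte_cmdline as
follows. Everything named here (struct token_hdr, MAX_DYNAMIC_TOKENS,
demo_tokens, demo_get_token(), demo_fill()) is a hypothetical stand-in, and
the callback uses typed arguments instead of the void pointers of the real
f() prototype. It only shows how the current index is derived from the
first argument and how a NULL token ends the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cmdline_parse_token_hdr_t. */
struct token_hdr {
	const char *name;
};

/* Stand-in for CMDLINE_PARSE_DYNAMIC_TOKENS (value chosen for the demo). */
#define MAX_DYNAMIC_TOKENS 8

/* Tokens this hypothetical instruction hands out, in order. */
static struct token_hdr demo_tokens[] = {
	{ "flow" }, { "create" }, { "port_id" },
};

/*
 * Callback following the documented protocol: store the address of the
 * token for the current index in *hdr, or NULL to end the list. The index
 * is derived from the position of hdr inside the tokens array.
 */
static void
demo_get_token(struct token_hdr **hdr, void *cl,
	       struct token_hdr *(*tokens)[MAX_DYNAMIC_TOKENS])
{
	ptrdiff_t index = hdr - &(*tokens)[0];

	(void)cl; /* documented as always NULL in this mode */
	if (index >= 0 &&
	    (size_t)index < sizeof(demo_tokens) / sizeof(demo_tokens[0]))
		*hdr = &demo_tokens[index];
	else
		*hdr = NULL;
}

/* Fill the list lazily the way a parser might; return the token count. */
static size_t
demo_fill(struct token_hdr *(*tokens)[MAX_DYNAMIC_TOKENS])
{
	size_t n = 0;

	assert(tokens != NULL);
	/* The last slot is left NULL so the list stays NULL-terminated. */
	while (n < MAX_DYNAMIC_TOKENS - 1) {
		if (!(*tokens)[n])
			demo_get_token(&(*tokens)[n], NULL, tokens);
		if (!(*tokens)[n])
			break;
		++n;
	}
	return n;
}
```

Calling demo_fill() on a zeroed array of MAX_DYNAMIC_TOKENS pointers fills
the first three slots and leaves the fourth NULL. Slots that are already
populated are not requested again.
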
 lib/librte_cmdline/cmdline_parse.c | 60 +++++++++++++++++++++++++++++----
 lib/librte_cmdline/cmdline_parse.h | 21 ++++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+	   cmdline_parse_token_hdr_t
+		*(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 	struct cmdline_token_hdr token_hdr;
 
 	token_p = inst->tokens[token_num];
+	if (!token_p && dyn_tokens && inst->f) {
+		if (!(*dyn_tokens)[0])
+			inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+		token_p = (*dyn_tokens)[0];
+	}
 	if (token_p)
 		memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		buf += n;
 
 		token_num ++;
-		token_p = inst->tokens[token_num];
+		if (!inst->tokens[0]) {
+			if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!(*dyn_tokens)[token_num])
+					inst->f(&(*dyn_tokens)[token_num],
+						NULL,
+						dyn_tokens);
+				token_p = (*dyn_tokens)[token_num];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[token_num];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 	}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
 	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		return CMDLINE_PARSE_BAD_ARGS;
 
 	ctx = cl->ctx;
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/*
 	 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 	cmdline_parse_token_hdr_t *token_p;
 	struct cmdline_token_hdr token_hdr;
 	char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	unsigned int partial_tok_len;
 	int comp_len = -1;
 	int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 
 	debug_printf("%s called\n", __func__);
 	memset(&token_hdr, 0, sizeof(token_hdr));
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/* count the number of complete token to parse */
 	for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+			if (nb_token &&
+			    match_inst(inst, buf, nb_token, NULL, 0,
+				       &dyn_tokens))
 				goto next;
 
 			debug_printf("instruction match\n");
-			token_p = inst->tokens[nb_token];
+			if (!inst->tokens[0]) {
+				if (nb_token <
+				    (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+					if (!dyn_tokens[nb_token])
+						inst->f(&dyn_tokens[nb_token],
+							NULL,
+							&dyn_tokens);
+					token_p = dyn_tokens[nb_token];
+				} else
+					token_p = NULL;
+			} else
+				token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -490,10 +525,21 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+		if (nb_token &&
+		    match_inst(inst, buf, nb_token, NULL, 0, &dyn_tokens))
 			goto next2;
 
-		token_p = inst->tokens[nb_token];
+		if (!inst->tokens[0]) {
+			if (nb_token < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!dyn_tokens[nb_token])
+					inst->f(&dyn_tokens[nb_token],
+						NULL,
+						&dyn_tokens);
+				token_p = dyn_tokens[nb_token];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[nb_token];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index 4ac05d6..65b18d4 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -83,6 +83,9 @@ extern "C" {
 /* maximum buffer size for parsed result */
 #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
 
+/* maximum number of dynamic tokens */
+#define CMDLINE_PARSE_DYNAMIC_TOKENS 128
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -130,6 +133,24 @@ struct cmdline;
  * Store a instruction, which is a pointer to a callback function and
  * its parameter that is called when the instruction is parsed, a help
  * string, and a list of token composing this instruction.
+ *
+ * When no tokens are defined (tokens[0] == NULL), they are retrieved
+ * dynamically by calling f() as follows:
+ *
+ *  f((struct cmdline_token_hdr **)&token_hdr,
+ *    NULL,
+ *    (struct cmdline_token_hdr *[])tokens);
+ *
+ * The address of the resulting token is expected at the location pointed by
+ * the first argument. Can be set to NULL to end the list.
+ *
+ * The cmdline argument (struct cmdline *) is always NULL.
+ *
+ * The last argument points to the NULL-terminated list of dynamic tokens
+ * defined so far. Since token_hdr points to an index of that list, the
+ * current index can be derived as follows:
+ *
+ *  int index = token_hdr - &(*tokens)[0];
  */
 struct cmdline_inst {
 	/* f(parsed_struct, data) */
-- 
2.1.4

* [dpdk-dev] [PATCH v3 05/25] cmdline: add alignment constraint
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (3 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
                           ` (20 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

This prevents SIGBUS errors on architectures that cannot handle unaligned
accesses to the output buffer.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
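Illustration only, not part of the patch: a stand-alone sketch of the union
trick used here. RESULT_BUFSIZE, union result_storage and demo_is_aligned()
are hypothetical names. The sketch assumes a C11 compiler for alignof. It
relies only on a union being at least as strictly aligned as its most
demanding member, so buf inherits the alignment of long double. The exact
value is platform-dependent.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors CMDLINE_PARSE_RESULT_BUFSIZE for the purpose of the demo. */
#define RESULT_BUFSIZE 8192

/* Same shape as the union introduced by the patch. */
union result_storage {
	char buf[RESULT_BUFSIZE];
	long double align; /* raises the alignment of the whole union */
};

/* Return nonzero if ptr is a multiple of align (align must be nonzero). */
static int
demo_is_aligned(const void *ptr, size_t align)
{
	assert(align != 0);
	return ((uintptr_t)ptr % align) == 0;
}
```

A plain char array only guarantees an alignment of 1, so a parser that
stores wider types into it may fault on strict-alignment architectures.
With the union, demo_is_aligned(res.buf, alignof(long double)) holds for
any object res of type union result_storage.
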
 lib/librte_cmdline/cmdline_parse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	union {
+		char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+		long double align; /* strong alignment constraint for buf */
+	} result;
 	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+		tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 
 	/* call func */
 	if (f) {
-		f(result_buf, cl, data);
+		f(result.buf, cl, data);
 	}
 
 	/* no match */
-- 
2.1.4

* [dpdk-dev] [PATCH v3 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (4 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 05/25] cmdline: add alignment constraint Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command Adrien Mazarguil
                           ` (19 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
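Illustration only, not part of the patch: the name/size tables in config.c
are built with designated initializers, so unlisted types leave NULL holes
and any index must be checked against the array size before use. The sketch
below shows that lookup pattern with hypothetical names (enum
demo_item_type, DEMO_DIM, MK_DEMO_ITEM, demo_item[], demo_item_name(),
demo_item_is()). The sizes are arbitrary demo values, not DPDK structure
sizes. Note that the range test has to use >= because an index equal to the
element count is already out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item types standing in for enum rte_flow_item_type. */
enum demo_item_type {
	DEMO_ITEM_END,
	DEMO_ITEM_VOID,
	DEMO_ITEM_ETH,
	DEMO_ITEM_RESERVED, /* deliberately has no table entry */
	DEMO_ITEM_UDP,
};

/* Number of elements in an array (same idea as RTE_DIM). */
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Generate a demo_item[] entry indexed by its enum value. */
#define MK_DEMO_ITEM(t, s) \
	[DEMO_ITEM_ ## t] = { \
		.name = # t, \
		.size = s, \
	}

static const struct {
	const char *name;
	size_t size;
} demo_item[] = {
	MK_DEMO_ITEM(END, 0),
	MK_DEMO_ITEM(VOID, 0),
	MK_DEMO_ITEM(ETH, 14),
	MK_DEMO_ITEM(UDP, 8),
};

/* Return the item name, or NULL for out-of-range or unlisted types. */
static const char *
demo_item_name(int type)
{
	if ((unsigned int)type >= DEMO_DIM(demo_item) ||
	    !demo_item[type].name)
		return NULL;
	return demo_item[type].name;
}

/* Return nonzero if type is known and carries the expected name. */
static int
demo_item_is(int type, const char *name)
{
	const char *found = demo_item_name(type);

	assert(name != NULL);
	return found != NULL && strcmp(found, name) == 0;
}
```

The cast to unsigned int also rejects negative values, since they wrap to
large unsigned numbers.
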
 app/test-pmd/cmdline.c     |   1 +
 app/test-pmd/config.c      | 498 ++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/csumonly.c    |   1 +
 app/test-pmd/flowgen.c     |   1 +
 app/test-pmd/icmpecho.c    |   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c       |   1 +
 app/test-pmd/macfwd.c      |   1 +
 app/test-pmd/macswap.c     |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c      |   1 +
 app/test-pmd/testpmd.c     |   6 +
 app/test-pmd/testpmd.h     |  27 +++
 app/test-pmd/txonly.c      |   1 +
 14 files changed, 542 insertions(+)
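
Illustration only, not part of the patch: port_flow_list() sorts rules by
group, priority and ID without disturbing the per-port list, using a second
link field and a pointer-to-pointer insertion. A stand-alone sketch of that
technique follows. struct demo_flow, demo_flow_after() and
demo_flow_sorted() are hypothetical, trimmed-down stand-ins for the testpmd
structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for testpmd's struct port_flow. */
struct demo_flow {
	struct demo_flow *next; /* per-port list, newest rule first */
	struct demo_flow *tmp;  /* temporary link used only while listing */
	uint32_t id;
	uint32_t group;
	uint32_t priority;
};

/* Nonzero when a must be listed after b (group, then priority, then ID). */
static int
demo_flow_after(const struct demo_flow *a, const struct demo_flow *b)
{
	assert(a != NULL && b != NULL);
	if (a->group != b->group)
		return a->group > b->group;
	if (a->priority != b->priority)
		return a->priority > b->priority;
	return a->id > b->id;
}

/* Build a sorted view through the tmp links; next links are untouched. */
static struct demo_flow *
demo_flow_sorted(struct demo_flow *head)
{
	struct demo_flow *list = NULL;
	struct demo_flow *pf;

	for (pf = head; pf != NULL; pf = pf->next) {
		struct demo_flow **pos = &list;

		/* Walk the address of each link until pf fits. */
		while (*pos && demo_flow_after(pf, *pos))
			pos = &(*pos)->tmp;
		pf->tmp = *pos;
		*pos = pf;
	}
	return list;
}
```

Because pos always holds the address of the link to rewrite, inserting at
the head and inserting in the middle are the same two assignments. This is
an insertion sort, quadratic in the number of rules, which is presumably
acceptable for an interactive listing.
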

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index d03a592..5d1c0dd 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include <rte_string_fns.h>
 #include <rte_devargs.h>
 #include <rte_eth_ctrl.h>
+#include <rte_flow.h>
 
 #include <cmdline_rdline.h>
 #include <cmdline_parse.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 8cf537d..9716ce7 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
 #include <rte_cycles.h>
+#include <rte_flow.h>
+#include <rte_errno.h>
 
 #include "testpmd.h"
 
@@ -751,6 +753,502 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
 	printf("Set MTU failed. diag=%d\n", diag);
 }
 
+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow pattern items. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_item[] = {
+	MK_FLOW_ITEM(END, 0),
+	MK_FLOW_ITEM(VOID, 0),
+	MK_FLOW_ITEM(INVERT, 0),
+	MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+	MK_FLOW_ITEM(PF, 0),
+	MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+	MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+	MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+	MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+	MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+	MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+	MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+	MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+	MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+	MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+	MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+	MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+		    size_t *size, size_t *pad)
+{
+	if (!item->spec)
+		goto empty;
+	switch (item->type) {
+		union {
+			const struct rte_flow_item_raw *raw;
+		} spec;
+
+	case RTE_FLOW_ITEM_TYPE_RAW:
+		spec.raw = item->spec;
+		*size = offsetof(struct rte_flow_item_raw, pattern) +
+			spec.raw->length * sizeof(*spec.raw->pattern);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow actions. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_action[] = {
+	MK_FLOW_ACTION(END, 0),
+	MK_FLOW_ACTION(VOID, 0),
+	MK_FLOW_ACTION(PASSTHRU, 0),
+	MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+	MK_FLOW_ACTION(FLAG, 0),
+	MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+	MK_FLOW_ACTION(DROP, 0),
+	MK_FLOW_ACTION(COUNT, 0),
+	MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+	MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+	MK_FLOW_ACTION(PF, 0),
+	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+		      size_t *size, size_t *pad)
+{
+	if (!action->conf)
+		goto empty;
+	switch (action->type) {
+		union {
+			const struct rte_flow_action_rss *rss;
+		} conf;
+
+	case RTE_FLOW_ACTION_TYPE_RSS:
+		conf.rss = action->conf;
+		*size = offsetof(struct rte_flow_action_rss, queue) +
+			conf.rss->num * sizeof(*conf.rss->queue);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate a port_flow entry from attributes/pattern/actions. */
+static struct port_flow *
+port_flow_new(const struct rte_flow_attr *attr,
+	      const struct rte_flow_item *pattern,
+	      const struct rte_flow_action *actions)
+{
+	const struct rte_flow_item *item;
+	const struct rte_flow_action *action;
+	struct port_flow *pf = NULL;
+	size_t tmp;
+	size_t pad;
+	size_t off1 = 0;
+	size_t off2 = 0;
+	int err = ENOTSUP;
+
+store:
+	item = pattern;
+	if (pf)
+		pf->pattern = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_item *dst = NULL;
+
+		if ((unsigned int)item->type >= RTE_DIM(flow_item) ||
+		    !flow_item[item->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, item, sizeof(*item));
+		off1 += sizeof(*item);
+		flow_item_spec_size(item, &tmp, &pad);
+		if (item->spec) {
+			if (pf)
+				dst->spec = memcpy(pf->data + off2,
+						   item->spec, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->last) {
+			if (pf)
+				dst->last = memcpy(pf->data + off2,
+						   item->last, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->mask) {
+			if (pf)
+				dst->mask = memcpy(pf->data + off2,
+						   item->mask, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((item++)->type != RTE_FLOW_ITEM_TYPE_END);
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	action = actions;
+	if (pf)
+		pf->actions = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_action *dst = NULL;
+
+		if ((unsigned int)action->type >= RTE_DIM(flow_action) ||
+		    !flow_action[action->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, action, sizeof(*action));
+		off1 += sizeof(*action);
+		flow_action_conf_size(action, &tmp, &pad);
+		if (action->conf) {
+			if (pf)
+				dst->conf = memcpy(pf->data + off2,
+						   action->conf, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((action++)->type != RTE_FLOW_ACTION_TYPE_END);
+	if (pf != NULL)
+		return pf;
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	tmp = RTE_ALIGN_CEIL(offsetof(struct port_flow, data), sizeof(double));
+	pf = calloc(1, tmp + off1 + off2);
+	if (pf == NULL)
+		err = errno;
+	else {
+		*pf = (const struct port_flow){
+			.size = tmp + off1 + off2,
+			.attr = *attr,
+		};
+		tmp -= offsetof(struct port_flow, data);
+		off2 = tmp + off1;
+		off1 = tmp;
+		goto store;
+	}
+notsup:
+	rte_errno = err;
+	return NULL;
+}
+
+/** Print a message out of a flow error. */
+static int
+port_flow_complain(struct rte_flow_error *error)
+{
+	static const char *const errstrlist[] = {
+		[RTE_FLOW_ERROR_TYPE_NONE] = "no error",
+		[RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
+		[RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
+		[RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
+		[RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
+		[RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
+		[RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
+		[RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
+	};
+	const char *errstr;
+	char buf[32];
+	int err = rte_errno;
+
+	if ((unsigned int)error->type >= RTE_DIM(errstrlist) ||
+	    !errstrlist[error->type])
+		errstr = "unknown type";
+	else
+		errstr = errstrlist[error->type];
+	printf("Caught error type %d (%s): %s%s\n",
+	       error->type, errstr,
+	       error->cause ? (snprintf(buf, sizeof(buf), "cause: %p, ",
+					error->cause), buf) : "",
+	       error->message ? error->message : "(no stated reason)");
+	return -err;
+}
+
+/** Validate flow rule. */
+int
+port_flow_validate(portid_t port_id,
+		   const struct rte_flow_attr *attr,
+		   const struct rte_flow_item *pattern,
+		   const struct rte_flow_action *actions)
+{
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x11, sizeof(error));
+	if (rte_flow_validate(port_id, attr, pattern, actions, &error))
+		return port_flow_complain(&error);
+	printf("Flow rule validated\n");
+	return 0;
+}
+
+/** Create flow rule. */
+int
+port_flow_create(portid_t port_id,
+		 const struct rte_flow_attr *attr,
+		 const struct rte_flow_item *pattern,
+		 const struct rte_flow_action *actions)
+{
+	struct rte_flow *flow;
+	struct rte_port *port;
+	struct port_flow *pf;
+	uint32_t id;
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x22, sizeof(error));
+	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
+	if (!flow)
+		return port_flow_complain(&error);
+	port = &ports[port_id];
+	if (port->flow_list) {
+		if (port->flow_list->id == UINT32_MAX) {
+			printf("Highest rule ID is already assigned, delete"
+			       " it first\n");
+			rte_flow_destroy(port_id, flow, NULL);
+			return -ENOMEM;
+		}
+		id = port->flow_list->id + 1;
+	} else
+		id = 0;
+	pf = port_flow_new(attr, pattern, actions);
+	if (!pf) {
+		int err = rte_errno;
+
+		printf("Cannot allocate flow: %s\n", rte_strerror(err));
+		rte_flow_destroy(port_id, flow, NULL);
+		return -err;
+	}
+	pf->next = port->flow_list;
+	pf->id = id;
+	pf->flow = flow;
+	port->flow_list = pf;
+	printf("Flow rule #%u created\n", pf->id);
+	return 0;
+}
+
+/** Destroy a number of flow rules. */
+int
+port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule)
+{
+	struct rte_port *port;
+	struct port_flow **tmp;
+	uint32_t c = 0;
+	int ret = 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	tmp = &port->flow_list;
+	while (*tmp) {
+		uint32_t i;
+
+		for (i = 0; i != n; ++i) {
+			struct rte_flow_error error;
+			struct port_flow *pf = *tmp;
+
+			if (rule[i] != pf->id)
+				continue;
+			/*
+			 * Poisoning to make sure PMDs update it in case
+			 * of error.
+			 */
+			memset(&error, 0x33, sizeof(error));
+			if (rte_flow_destroy(port_id, pf->flow, &error)) {
+				ret = port_flow_complain(&error);
+				continue;
+			}
+			printf("Flow rule #%u destroyed\n", pf->id);
+			*tmp = pf->next;
+			free(pf);
+			break;
+		}
+		if (i == n)
+			tmp = &(*tmp)->next;
+		++c;
+	}
+	return ret;
+}
+
+/** Remove all flow rules. */
+int
+port_flow_flush(portid_t port_id)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	int ret = 0;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x44, sizeof(error));
+	if (rte_flow_flush(port_id, &error)) {
+		ret = port_flow_complain(&error);
+		if (port_id_is_invalid(port_id, DISABLED_WARN) ||
+		    port_id == (portid_t)RTE_PORT_ALL)
+			return ret;
+	}
+	port = &ports[port_id];
+	while (port->flow_list) {
+		struct port_flow *pf = port->flow_list->next;
+
+		free(port->flow_list);
+		port->flow_list = pf;
+	}
+	return ret;
+}
+
+/** Query a flow rule. */
+int
+port_flow_query(portid_t port_id, uint32_t rule,
+		enum rte_flow_action_type action)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	struct port_flow *pf;
+	const char *name;
+	union {
+		struct rte_flow_query_count count;
+	} query;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	for (pf = port->flow_list; pf; pf = pf->next)
+		if (pf->id == rule)
+			break;
+	if (!pf) {
+		printf("Flow rule #%u not found\n", rule);
+		return -ENOENT;
+	}
+	if ((unsigned int)action >= RTE_DIM(flow_action) ||
+	    !flow_action[action].name)
+		name = "unknown";
+	else
+		name = flow_action[action].name;
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		break;
+	default:
+		printf("Cannot query action type %d (%s)\n", action, name);
+		return -ENOTSUP;
+	}
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x55, sizeof(error));
+	memset(&query, 0, sizeof(query));
+	if (rte_flow_query(port_id, pf->flow, action, &query, &error))
+		return port_flow_complain(&error);
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		printf("%s:\n"
+		       " hits_set: %u\n"
+		       " bytes_set: %u\n"
+		       " hits: %" PRIu64 "\n"
+		       " bytes: %" PRIu64 "\n",
+		       name,
+		       query.count.hits_set,
+		       query.count.bytes_set,
+		       query.count.hits,
+		       query.count.bytes);
+		break;
+	default:
+		printf("Cannot display result for action type %d (%s)\n",
+		       action, name);
+		break;
+	}
+	return 0;
+}
+
+/** List flow rules. */
+void
+port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n])
+{
+	struct rte_port *port;
+	struct port_flow *pf;
+	struct port_flow *list = NULL;
+	uint32_t i;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return;
+	port = &ports[port_id];
+	if (!port->flow_list)
+		return;
+	/* Sort flows by group, priority and ID. */
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		struct port_flow **tmp;
+
+		if (n) {
+			/* Filter out unwanted groups. */
+			for (i = 0; i != n; ++i)
+				if (pf->attr.group == group[i])
+					break;
+			if (i == n)
+				continue;
+		}
+		tmp = &list;
+		while (*tmp &&
+		       (pf->attr.group > (*tmp)->attr.group ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority > (*tmp)->attr.priority) ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority == (*tmp)->attr.priority &&
+			 pf->id > (*tmp)->id)))
+			tmp = &(*tmp)->tmp;
+		pf->tmp = *tmp;
+		*tmp = pf;
+	}
+	printf("ID\tGroup\tPrio\tAttr\tRule\n");
+	for (pf = list; pf != NULL; pf = pf->tmp) {
+		const struct rte_flow_item *item = pf->pattern;
+		const struct rte_flow_action *action = pf->actions;
+
+		printf("%" PRIu32 "\t%" PRIu32 "\t%" PRIu32 "\t%c%c\t",
+		       pf->id,
+		       pf->attr.group,
+		       pf->attr.priority,
+		       pf->attr.ingress ? 'i' : '-',
+		       pf->attr.egress ? 'e' : '-');
+		while (item->type != RTE_FLOW_ITEM_TYPE_END) {
+			if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
+				printf("%s ", flow_item[item->type].name);
+			++item;
+		}
+		printf("=>");
+		while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+			if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
+				printf(" %s", flow_action[action->type].name);
+			++action;
+		}
+		printf("\n");
+	}
+}
+
 /*
  * RX/TX ring descriptors display functions.
  */
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..dd67ebf 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_sctp.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index b13ff89..13b4f90 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 6a4e750..f25a8f5 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include <rte_ip.h>
 #include <rte_icmp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
index 0d3b37a..51170ee 100644
--- a/app/test-pmd/ieee1588fwd.c
+++ b/app/test-pmd/ieee1588fwd.c
@@ -34,6 +34,7 @@
 
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 26936b7..15cb4a2 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -64,6 +64,7 @@
 #include <rte_ether.h>
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index 86e01de..d361db1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 36e139f..f996039 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 08e5a76..28db8cd 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index fff815c..cf00576 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -67,6 +67,7 @@
 #include <rte_ip.h>
 #include <rte_udp.h>
 #include <rte_net.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..bfb2f8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
@@ -1545,6 +1546,8 @@ close_port(portid_t pid)
 			continue;
 		}
 
+		if (port->flow_list)
+			port_flow_flush(pi);
 		rte_eth_dev_close(pi);
 
 		if (rte_atomic16_cmpset(&(port->port_status),
@@ -1599,6 +1602,9 @@ detach_port(uint8_t port_id)
 		return;
 	}
 
+	if (ports[port_id].flow_list)
+		port_flow_flush(port_id);
+
 	if (rte_eth_dev_detach(port_id, name))
 		return;
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..22ce2d6 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,19 @@ struct fwd_stream {
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
 
+/** Descriptor for a single flow. */
+struct port_flow {
+	size_t size; /**< Allocated space including data[]. */
+	struct port_flow *next; /**< Next flow in list. */
+	struct port_flow *tmp; /**< Temporary linking. */
+	uint32_t id; /**< Flow rule ID. */
+	struct rte_flow *flow; /**< Opaque flow object returned by PMD. */
+	struct rte_flow_attr attr; /**< Attributes. */
+	struct rte_flow_item *pattern; /**< Pattern. */
+	struct rte_flow_action *actions; /**< Actions. */
+	uint8_t data[]; /**< Storage for pattern/actions. */
+};
+
 /**
  * The data structure associated with each port.
  */
@@ -177,6 +190,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	struct port_flow        *flow_list; /**< Associated flows. */
 };
 
 extern portid_t __rte_unused
@@ -504,6 +518,19 @@ void port_reg_bit_field_set(portid_t port_id, uint32_t reg_off,
 			    uint8_t bit1_pos, uint8_t bit2_pos, uint32_t value);
 void port_reg_display(portid_t port_id, uint32_t reg_off);
 void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value);
+int port_flow_validate(portid_t port_id,
+		       const struct rte_flow_attr *attr,
+		       const struct rte_flow_item *pattern,
+		       const struct rte_flow_action *actions);
+int port_flow_create(portid_t port_id,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item *pattern,
+		     const struct rte_flow_action *actions);
+int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule);
+int port_flow_flush(portid_t port_id);
+int port_flow_query(portid_t port_id, uint32_t rule,
+		    enum rte_flow_action_type action);
+void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group);
 
 void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id);
 void tx_ring_desc_display(portid_t port_id, queueid_t txq_id, uint16_t txd_id);
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 8513a06..e996f35 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
-- 
2.1.4

* [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (5 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-20 16:13           ` Ferruh Yigit
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
                           ` (18 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

Managing generic flow API functions from the command line requires the use
of dynamic tokens for convenience, as flow rules are not fixed and cannot
be defined statically.

This commit adds specific flexible parser code and a parser object for a
new "flow" command in a separate file.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
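Note for readers: the stack of token lists driving cmd_flow_parse() can be
illustrated outside testpmd. What follows is a minimal standalone sketch
under stated assumptions, not the code from this patch: the names tok,
tok_def, ctx, ctx_init() and feed(), as well as the "list" and "flush"
tokens, are hypothetical and chosen for illustration only. Whole words are
compared with strcmp() instead of going through per-token callbacks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token indices; T_ZERO doubles as the list terminator. */
enum tok { T_ZERO = 0, T_END, T_FLOW, T_LIST, T_FLUSH };

struct tok_def {
	const char *name;            /* Word to match. */
	const enum tok *const *next; /* NULL-terminated lists to push. */
};

#define NEXT(...) (const enum tok *const []){ __VA_ARGS__, NULL }
#define NEXT_ENTRY(...) (const enum tok []){ __VA_ARGS__, T_ZERO }

static const struct tok_def defs[] = {
	[T_ZERO] = { .name = "", .next = NEXT(NEXT_ENTRY(T_FLOW)) },
	[T_END] = { .name = "" },
	[T_FLOW] = { .name = "flow", .next = NEXT(NEXT_ENTRY(T_LIST, T_FLUSH)) },
	[T_LIST] = { .name = "list" },
	[T_FLUSH] = { .name = "flush" },
};

#define STACK_SIZE 16

struct ctx {
	const enum tok *stack[STACK_SIZE]; /* Pending candidate lists. */
	int depth;                         /* Number of entries in stack[]. */
	enum tok curr;                     /* Last token matched. */
};

static void
ctx_init(struct ctx *c)
{
	c->depth = 0;
	c->curr = T_ZERO;
}

/* Feed one word to the parser; returns 0 if accepted, -1 otherwise. */
static int
feed(struct ctx *c, const char *word)
{
	const struct tok_def *match = NULL;
	const enum tok *list;
	int i;

	/* Seed the stack from the current token when it is empty. */
	if (!c->depth) {
		if (!defs[c->curr].next)
			return -1;
		c->stack[c->depth++] = defs[c->curr].next[0];
	}
	/* Try every candidate of the list on top of the stack. */
	list = c->stack[c->depth - 1];
	for (i = 0; list[i] != T_ZERO; ++i)
		if (!strcmp(word, defs[list[i]].name)) {
			match = &defs[list[i]];
			c->curr = list[i];
			break;
		}
	if (!match)
		return -1;
	/* Consume the list, then push whatever the matched token expects. */
	--c->depth;
	if (match->next)
		for (i = 0; match->next[i]; ++i) {
			if (c->depth == STACK_SIZE)
				return -1;
			c->stack[c->depth++] = match->next[i];
		}
	return 0;
}
```

Feeding "flow" then "list" to a freshly initialized context is accepted,
while an unknown word is rejected and leaves the stack untouched so that
another candidate can still be tried.
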
 app/test-pmd/Makefile       |   1 +
 app/test-pmd/cmdline.c      |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 5d1c0dd..b124412 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9567,6 +9567,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
 	},
 };
 
+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11605,6 +11608,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
 	(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
 	(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+	(cmdline_parse_inst_t *)&cmd_flow,
 	(cmdline_parse_inst_t *)&cmd_mcast_addr,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 0000000..a36da06
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+	/* Special tokens. */
+	ZERO = 0,
+	END,
+
+	/* Top-level command. */
+	FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+	/** Stack of subsequent token lists to process. */
+	const enum index *next[CTX_STACK_SIZE];
+	enum index curr; /**< Current token index. */
+	enum index prev; /**< Index of the last token seen. */
+	int next_num; /**< Number of entries in next[]. */
+	uint32_t reparse:1; /**< Start over from the beginning. */
+	uint32_t eol:1; /**< EOL has been detected. */
+	uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+	/** Type displayed during completion (defaults to "TOKEN"). */
+	const char *type;
+	/** Help displayed during completion (defaults to token name). */
+	const char *help;
+	/**
+	 * Lists of subsequent tokens to push on the stack. Each call to the
+	 * parser consumes the last entry of that stack.
+	 */
+	const enum index *const *next;
+	/**
+	 * Token-processing callback, returns -1 in case of error, the
+	 * length of the matched string otherwise. If NULL, attempts to
+	 * match the token name.
+	 *
+	 * If buf is not NULL, the result should be stored in it according
+	 * to context. An error is returned if not large enough.
+	 */
+	int (*call)(struct context *ctx, const struct token *token,
+		    const char *str, unsigned int len,
+		    void *buf, unsigned int size);
+	/**
+	 * Callback that provides possible values for this token, used for
+	 * completion. Returns -1 in case of error, the number of possible
+	 * values otherwise. If NULL, the token name is used.
+	 *
+	 * If buf is not NULL, entry index ent is written to buf and the
+	 * full length of the entry is returned (same behavior as
+	 * snprintf()).
+	 */
+	int (*comp)(struct context *ctx, const struct token *token,
+		    unsigned int ent, char *buf, unsigned int size);
+	/** Mandatory token name, no default value. */
+	const char *name;
+};
+
+/** Static initializer for the next field. */
+#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for a NEXT() entry. */
+#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
+
+/** Parser output buffer layout expected by cmd_flow_parsed(). */
+struct buffer {
+	enum index command; /**< Flow command. */
+	uint16_t port; /**< Affected port ID. */
+};
+
+static int parse_init(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
+
+/** Token definitions. */
+static const struct token token_list[] = {
+	/* Special tokens. */
+	[ZERO] = {
+		.name = "ZERO",
+		.help = "null entry, abused as the entry point",
+		.next = NEXT(NEXT_ENTRY(FLOW)),
+	},
+	[END] = {
+		.name = "",
+		.type = "RETURN",
+		.help = "command may end here",
+	},
+	/* Top-level command. */
+	[FLOW] = {
+		.name = "flow",
+		.type = "{command} {port_id} [{arg} [...]]",
+		.help = "manage ingress/egress flow rules",
+		.call = parse_init,
+	},
+};
+
+/** Default parsing function for token name matching. */
+static int
+parse_default(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)buf;
+	(void)size;
+	if (strncmp(str, token->name, len))
+		return -1;
+	return len;
+}
+
+/** Parse flow command, initialize output buffer for subsequent tokens. */
+static int
+parse_init(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	/* Initialize buffer. */
+	memset(out, 0x00, sizeof(*out));
+	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	return len;
+}
+
+/** Internal context. */
+static struct context cmd_flow_context;
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow;
+
+/** Initialize context. */
+static void
+cmd_flow_context_init(struct context *ctx)
+{
+	/* A full memset() is not necessary. */
+	ctx->curr = 0;
+	ctx->prev = 0;
+	ctx->next_num = 0;
+	ctx->reparse = 0;
+	ctx->eol = 0;
+	ctx->last = 0;
+}
+
+/** Parse a token (cmdline API). */
+static int
+cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
+	       unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token;
+	const enum index *list;
+	int len;
+	int i;
+
+	(void)hdr;
+	/* Restart as requested. */
+	if (ctx->reparse)
+		cmd_flow_context_init(ctx);
+	token = &token_list[ctx->curr];
+	/* Check argument length. */
+	ctx->eol = 0;
+	ctx->last = 1;
+	for (len = 0; src[len]; ++len)
+		if (src[len] == '#' || isspace(src[len]))
+			break;
+	if (!len)
+		return -1;
+	/* Last argument and EOL detection. */
+	for (i = len; src[i]; ++i)
+		if (src[i] == '#' || src[i] == '\r' || src[i] == '\n')
+			break;
+		else if (!isspace(src[i])) {
+			ctx->last = 0;
+			break;
+		}
+	for (; src[i]; ++i)
+		if (src[i] == '\r' || src[i] == '\n') {
+			ctx->eol = 1;
+			break;
+		}
+	/* Initialize context if necessary. */
+	if (!ctx->next_num) {
+		if (!token->next)
+			return 0;
+		ctx->next[ctx->next_num++] = token->next[0];
+	}
+	/* Process argument through candidates. */
+	ctx->prev = ctx->curr;
+	list = ctx->next[ctx->next_num - 1];
+	for (i = 0; list[i]; ++i) {
+		const struct token *next = &token_list[list[i]];
+		int tmp;
+
+		ctx->curr = list[i];
+		if (next->call)
+			tmp = next->call(ctx, next, src, len, result, size);
+		else
+			tmp = parse_default(ctx, next, src, len, result, size);
+		if (tmp == -1 || tmp != len)
+			continue;
+		token = next;
+		break;
+	}
+	if (!list[i])
+		return -1;
+	--ctx->next_num;
+	/* Push subsequent tokens if any. */
+	if (token->next)
+		for (i = 0; token->next[i]; ++i) {
+			if (ctx->next_num == RTE_DIM(ctx->next))
+				return -1;
+			ctx->next[ctx->next_num++] = token->next[i];
+		}
+	return len;
+}
+
+/** Return number of completion entries (cmdline API). */
+static int
+cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return 0;
+	/*
+	 * If there is a single token, use its completion callback, otherwise
+	 * return the number of entries.
+	 */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, 0, NULL, 0);
+	}
+	return i;
+}
+
+/** Return a completion entry (cmdline API). */
+static int
+cmd_flow_complete_get_elt(cmdline_parse_token_hdr_t *hdr, int index,
+			  char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return -1;
+	/* If there is a single token, use its completion callback. */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, index, dst, size) < 0 ? -1 : 0;
+	}
+	/* Otherwise make sure the index is valid and use defaults. */
+	if (index >= i)
+		return -1;
+	token = &token_list[list[index]];
+	snprintf(dst, size, "%s", token->name);
+	/* Save index for cmd_flow_get_help(). */
+	ctx->prev = list[index];
+	return 0;
+}
+
+/** Populate help strings for current token (cmdline API). */
+static int
+cmd_flow_get_help(cmdline_parse_token_hdr_t *hdr, char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->prev];
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	if (!size)
+		return -1;
+	/* Set token type and update global help with details. */
+	snprintf(dst, size, "%s", (token->type ? token->type : "TOKEN"));
+	if (token->help)
+		cmd_flow.help_str = token->help;
+	else
+		cmd_flow.help_str = token->name;
+	return 0;
+}
+
+/** Token definition template (cmdline API). */
+static struct cmdline_token_hdr cmd_flow_token_hdr = {
+	.ops = &(struct cmdline_token_ops){
+		.parse = cmd_flow_parse,
+		.complete_get_nb = cmd_flow_complete_get_nb,
+		.complete_get_elt = cmd_flow_complete_get_elt,
+		.get_help = cmd_flow_get_help,
+	},
+	.offset = 0,
+};
+
+/** Populate the next dynamic token. */
+static void
+cmd_flow_tok(cmdline_parse_token_hdr_t **hdr,
+	     cmdline_parse_token_hdr_t *(*hdrs)[])
+{
+	struct context *ctx = &cmd_flow_context;
+
+	/* Always reinitialize context before requesting the first token. */
+	if (!(hdr - *hdrs))
+		cmd_flow_context_init(ctx);
+	/* Return NULL when no more tokens are expected. */
+	if (!ctx->next_num && ctx->curr) {
+		*hdr = NULL;
+		return;
+	}
+	/* Determine if command should end here. */
+	if (ctx->eol && ctx->last && ctx->next_num) {
+		const enum index *list = ctx->next[ctx->next_num - 1];
+		int i;
+
+		for (i = 0; list[i]; ++i) {
+			if (list[i] != END)
+				continue;
+			*hdr = NULL;
+			return;
+		}
+	}
+	*hdr = &cmd_flow_token_hdr;
+}
+
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_flow_parsed(const struct buffer *in)
+{
+	switch (in->command) {
+	default:
+		break;
+	}
+}
+
+/** Token generator and output processing callback (cmdline API). */
+static void
+cmd_flow_cb(void *arg0, struct cmdline *cl, void *arg2)
+{
+	if (cl == NULL)
+		cmd_flow_tok(arg0, arg2);
+	else
+		cmd_flow_parsed(arg0);
+}
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow = {
+	.f = cmd_flow_cb,
+	.data = NULL, /**< Unused. */
+	.help_str = NULL, /**< Updated by cmd_flow_get_help(). */
+	.tokens = {
+		NULL,
+	}, /**< Tokens are returned by cmd_flow_tok(). */
+};
-- 
2.1.4

* [dpdk-dev] [PATCH v3 08/25] app/testpmd: add rte_flow integer support
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (6 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 09/25] app/testpmd: add flow list command Adrien Mazarguil
                           ` (17 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

Parse all integer types and handle conversion to network byte order in a
single function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
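Note for readers: the idea of parsing any integer width and handling
network byte order in one place can be sketched without DPDK headers. This
is only an illustration under stated assumptions: struct field_desc and
store_int() are hypothetical names (loosely modeled on struct arg and
parse_int()), the input is a NUL-terminated string rather than a
length-delimited token, and big-endian storage is done by hand instead of
through rte_cpu_to_be_*().

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field descriptor, not the testpmd definition. */
struct field_desc {
	unsigned int hton; /* Store in network byte order. */
	unsigned int sign; /* Parse as a signed value. */
	size_t offset;     /* Offset of the field inside the object. */
	size_t size;       /* Field size in bytes: 1, 2, 4 or 8. */
};

/*
 * Parse the whole string as an integer and store it in the field
 * described by d. Returns 0 on success, -1 on error.
 */
static int
store_int(void *object, const struct field_desc *d, const char *str)
{
	uint8_t *dst = (uint8_t *)object + d->offset;
	uintmax_t u;
	char *end;
	size_t i;

	errno = 0;
	u = d->sign ?
		(uintmax_t)strtoimax(str, &end, 0) :
		strtoumax(str, &end, 0);
	/* Reject range errors, empty input and trailing garbage. */
	if (errno || end == str || *end != '\0')
		return -1;
	if (d->size != 1 && d->size != 2 && d->size != 4 && d->size != 8)
		return -1;
	if (d->hton) {
		/* Most significant byte first, whatever the host order. */
		for (i = 0; i != d->size; ++i)
			dst[i] = (uint8_t)(u >> (8 * (d->size - 1 - i)));
	} else {
		/* Host order: copy from a temporary of the exact width. */
		uint8_t v8 = (uint8_t)u;
		uint16_t v16 = (uint16_t)u;
		uint32_t v32 = (uint32_t)u;
		uint64_t v64 = (uint64_t)u;
		const void *src = d->size == 1 ? (const void *)&v8 :
				  d->size == 2 ? (const void *)&v16 :
				  d->size == 4 ? (const void *)&v32 :
				  (const void *)&v64;

		memcpy(dst, src, d->size);
	}
	return 0;
}
```

A single descriptor per field keeps the caller free of width-specific
code; like the patch, this sketch does not check that the value fits in
the target field.
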
 app/test-pmd/cmdline_flow.c | 148 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a36da06..81281f9 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
+#include <inttypes.h>
+#include <errno.h>
 #include <ctype.h>
 #include <string.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
+#include <rte_byteorder.h>
 #include <cmdline_parse.h>
 #include <rte_flow.h>
 
@@ -50,6 +53,10 @@ enum index {
 	ZERO = 0,
 	END,
 
+	/* Common tokens. */
+	INTEGER,
+	UNSIGNED,
+
 	/* Top-level command. */
 	FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
 	/** Stack of subsequent token lists to process. */
 	const enum index *next[CTX_STACK_SIZE];
+	/** Arguments for stacked tokens. */
+	const void *args[CTX_STACK_SIZE];
 	enum index curr; /**< Current token index. */
 	enum index prev; /**< Index of the last token seen. */
 	int next_num; /**< Number of entries in next[]. */
+	int args_num; /**< Number of entries in args[]. */
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+	uint32_t hton:1; /**< Use network byte ordering. */
+	uint32_t sign:1; /**< Value is signed. */
+	uint32_t offset; /**< Relative offset from ctx->object. */
+	uint32_t size; /**< Field size. */
 };
 
 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 	 * parser consumes the last entry of that stack.
 	 */
 	const enum index *const *next;
+	/** Arguments stack for subsequent tokens that need them. */
+	const struct arg *const *args;
 	/**
 	 * Token-processing callback, returns -1 in case of error, the
 	 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
 
+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(*((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+		     const char *, unsigned int,
+		     void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
 		.type = "RETURN",
 		.help = "command may end here",
 	},
+	/* Common tokens. */
+	[INTEGER] = {
+		.name = "{int}",
+		.type = "INTEGER",
+		.help = "integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[UNSIGNED] = {
+		.name = "{unsigned}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
 	},
 };
 
+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+	return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+	if (ctx->args_num == CTX_STACK_SIZE)
+		return -1;
+	ctx->args[ctx->args_num++] = arg;
+	return 0;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -178,9 +252,74 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->object = out;
 	return len;
 }
 
+/**
+ * Parse signed/unsigned integers from 8 to 64 bits long.
+ *
+ * Last argument (ctx->args) is retrieved to determine integer type and
+ * storage location.
+ */
+static int
+parse_int(struct context *ctx, const struct token *token,
+	  const char *str, unsigned int len,
+	  void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	uintmax_t u;
+	char *end;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = arg->sign ?
+		(uintmax_t)strtoimax(str, &end, 0) :
+		strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	size = arg->size;
+	switch (size) {
+	case sizeof(uint8_t):
+		*(uint8_t *)buf = u;
+		break;
+	case sizeof(uint16_t):
+		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
+		break;
+	case sizeof(uint32_t):
+		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
+		break;
+	case sizeof(uint64_t):
+		*(uint64_t *)buf = arg->hton ? rte_cpu_to_be_64(u) : u;
+		break;
+	default:
+		goto error;
+	}
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/** No completion. */
+static int
+comp_none(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)token;
+	(void)ent;
+	(void)buf;
+	(void)size;
+	return 0;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -195,9 +334,11 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->curr = 0;
 	ctx->prev = 0;
 	ctx->next_num = 0;
+	ctx->args_num = 0;
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->object = NULL;
 }
 
 /** Parse a token (cmdline API). */
@@ -270,6 +411,13 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 				return -1;
 			ctx->next[ctx->next_num++] = token->next[i];
 		}
+	/* Push arguments if any. */
+	if (token->args)
+		for (i = 0; token->args[i]; ++i) {
+			if (ctx->args_num == RTE_DIM(ctx->args))
+				return -1;
+			ctx->args[ctx->args_num++] = token->args[i];
+		}
 	return len;
 }
 
-- 
2.1.4

* [dpdk-dev] [PATCH v3 09/25] app/testpmd: add flow list command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (7 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
@ 2016-12-19 17:48         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command Adrien Mazarguil
                           ` (16 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:48 UTC (permalink / raw)
  To: dev

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
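Note for readers: the ordering applied when listing rules (group first,
then priority, then rule ID, with an optional group filter) can be shown
on its own. The sketch below is an assumption-laden illustration rather
than the testpmd code: struct rule, rule_after() and sort_rules() are
hypothetical names, and only the fields needed for ordering are kept.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down flow descriptor. */
struct rule {
	uint32_t id;
	uint32_t group;
	uint32_t priority;
	struct rule *next; /* Creation order. */
	struct rule *tmp;  /* Temporary link for the sorted view. */
};

/* Return nonzero when a must be listed strictly after b. */
static int
rule_after(const struct rule *a, const struct rule *b)
{
	if (a->group != b->group)
		return a->group > b->group;
	if (a->priority != b->priority)
		return a->priority > b->priority;
	return a->id > b->id;
}

/*
 * Build a sorted view of the list through the tmp links. When n is
 * nonzero, only rules whose group appears in group[] are kept.
 */
static struct rule *
sort_rules(struct rule *head, uint32_t n, const uint32_t *group)
{
	struct rule *sorted = NULL;
	struct rule *r;

	for (r = head; r != NULL; r = r->next) {
		struct rule **pos = &sorted;
		uint32_t i;

		if (n) {
			/* Filter out unwanted groups. */
			for (i = 0; i != n; ++i)
				if (r->group == group[i])
					break;
			if (i == n)
				continue;
		}
		/* Insertion sort: find the first entry r goes before. */
		while (*pos && rule_after(r, *pos))
			pos = &(*pos)->tmp;
		r->tmp = *pos;
		*pos = r;
	}
	return sorted;
}
```

Using a second link field leaves the creation-order list untouched, so
the sorted view can be rebuilt on every listing without allocating.
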
 app/test-pmd/cmdline.c      |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b124412..0dc6c63 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
+
+			"flow list {port_id} [group {group_id}] [...]\n"
+			"    List existing flow rules sorted by priority,"
+			" filtered by group identifiers.\n\n"
 		);
 	}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 81281f9..bd3da38 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PORT_ID,
+	GROUP_ID,
 
 	/* Top-level command. */
 	FLOW,
+
+	/* Sub-level commands. */
+	LIST,
+
+	/* List arguments. */
+	LIST_GROUP,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	uint16_t port; /**< Current port ID (for completions). */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -153,16 +162,36 @@ struct token {
 struct buffer {
 	enum index command; /**< Flow command. */
 	uint16_t port; /**< Affected port ID. */
+	union {
+		struct {
+			uint32_t *group;
+			uint32_t group_n;
+		} list; /**< List arguments. */
+	} args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+	LIST_GROUP,
+	END,
+	ZERO,
 };
 
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PORT_ID] = {
+		.name = "{port_id}",
+		.type = "PORT ID",
+		.help = "port identifier",
+		.call = parse_port,
+		.comp = comp_port,
+	},
+	[GROUP_ID] = {
+		.name = "{group_id}",
+		.type = "GROUP ID",
+		.help = "group identifier",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
+		.next = NEXT(NEXT_ENTRY(LIST)),
 		.call = parse_init,
 	},
+	/* Sub-level commands. */
+	[LIST] = {
+		.name = "list",
+		.help = "list existing flow rules",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_list,
+	},
+	/* List arguments. */
+	[LIST_GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+		.call = parse_list,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -256,6 +316,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for list command. */
+static int
+parse_list(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != LIST)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.list.group =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
+	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.list.group + out->args.list.group_n++;
+	return len;
+}
+
 /**
  * Parse signed/unsigned integers 8 to 64-bit long.
  *
@@ -307,6 +400,29 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/** Parse port and update context. */
+static int
+parse_port(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = &(struct buffer){ .port = 0 };
+	int ret;
+
+	if (buf)
+		out = buf;
+	else {
+		ctx->object = out;
+		size = sizeof(*out);
+	}
+	ret = parse_int(ctx, token, str, len, out, size);
+	if (ret >= 0)
+		ctx->port = out->port;
+	if (!buf)
+		ctx->object = NULL;
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -320,6 +436,26 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete available ports. */
+static int
+comp_port(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	portid_t p;
+
+	(void)ctx;
+	(void)token;
+	FOREACH_PORT(p, ports) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", p);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -338,6 +474,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->port = 0;
 	ctx->object = NULL;
 }
 
@@ -561,6 +698,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case LIST:
+		port_flow_list(in->port, in->args.list.group_n,
+			       in->args.list.group);
+		break;
 	default:
 		break;
 	}
-- 
2.1.4


* [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (8 preceding siblings ...)
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 09/25] app/testpmd: add flow list command Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-20  7:32           ` Zhao1, Wei
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
                           ` (15 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.
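
As an illustration only, flushing can be thought of as walking a per-port
rule list and releasing every entry. port_flow_flush() itself is not part
of this patch, so the list layout and the demo_* helpers below are
assumptions; the real command presumably also asks the PMD to drop the
rules.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical singly linked rule entry; not the testpmd structure. */
struct demo_flow {
	struct demo_flow *next; /* Next rule in list, NULL at the end. */
	uint32_t id;            /* Rule identifier. */
};

/* Prepend a rule; return 0 on success, -1 on allocation failure. */
static int
demo_flow_add(struct demo_flow **head, uint32_t id)
{
	struct demo_flow *pf = malloc(sizeof(*pf));

	if (pf == NULL)
		return -1;
	pf->id = id;
	pf->next = *head;
	*head = pf;
	return 0;
}

/* Release every rule, leave the list empty, return the number released. */
static unsigned int
demo_flow_flush(struct demo_flow **head)
{
	unsigned int n = 0;

	assert(head != NULL);
	while (*head != NULL) {
		struct demo_flow *pf = *head;

		*head = pf->next;
		free(pf);
		++n;
	}
	return n;
}
```

Flushing an already empty list is harmless in this sketch and simply
reports zero released rules.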

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0dc6c63..6e2b289 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow flush {port_id}\n"
+			"    Destroy all flow rules.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bd3da38..49578eb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	FLUSH,
 	LIST,
 
 	/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
-		.next = NEXT(NEXT_ENTRY(LIST)),
+		.next = NEXT(NEXT_ENTRY
+			     (FLUSH,
+			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[FLUSH] = {
+		.name = "flush",
+		.help = "destroy all flow rules",
+		.next = NEXT(NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_flush,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != FLUSH)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+	}
+	return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case FLUSH:
+		port_flow_flush(in->port);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH v3 11/25] app/testpmd: add flow destroy command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (9 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
                           ` (14 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.
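
The parser stores the rule IDs in the space that follows the command
header inside the same result buffer. The sketch below mirrors that idea
under assumptions: demo_buffer is a cut-down stand-in for struct buffer,
DEMO_ALIGN_CEIL stands in for RTE_ALIGN_CEIL, and the bounds check is
written with byte offsets rather than pointer comparisons.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round "v" up to a multiple of "a" (a power of two). */
#define DEMO_ALIGN_CEIL(v, a) \
	(((uintptr_t)(v) + ((a) - 1)) & ~((uintptr_t)(a) - 1))

/* Hypothetical cut-down command header. */
struct demo_buffer {
	uint16_t port;   /* Affected port ID. */
	uint32_t *rule;  /* Rule IDs stored right after the header. */
	uint32_t rule_n; /* Number of rule IDs stored so far. */
};

/* Point the rule array at the first aligned byte past the header. */
static int
demo_destroy_init(void *buf, size_t size)
{
	struct demo_buffer *out = buf;

	assert(buf != NULL);
	if (sizeof(*out) > size)
		return -1;
	out->port = 0;
	out->rule_n = 0;
	out->rule = (void *)DEMO_ALIGN_CEIL(out + 1, sizeof(double));
	return 0;
}

/* Store one more rule ID; return -1 if it would not fit in the buffer. */
static int
demo_destroy_push(void *buf, size_t size, uint32_t id)
{
	struct demo_buffer *out = buf;
	size_t used = (size_t)((uintptr_t)out->rule - (uintptr_t)out) +
		      (size_t)out->rule_n * sizeof(*out->rule);

	if (used > size || size - used < sizeof(*out->rule))
		return -1;
	out->rule[out->rule_n++] = id;
	return 0;
}
```

Each "rule {rule_id}" pair on the command line would map to one
demo_destroy_push() call; the command fails as soon as the buffer is full.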

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6e2b289..80ddda2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow destroy {port_id} rule {rule_id} [...]\n"
+			"    Destroy specific flow rules.\n\n"
+
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 49578eb..786b718 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
 
@@ -63,9 +64,13 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	DESTROY,
 	FLUSH,
 	LIST,
 
+	/* Destroy arguments. */
+	DESTROY_RULE,
+
 	/* List arguments. */
 	LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			uint32_t *rule;
+			uint32_t rule_n;
+		} destroy; /**< Destroy arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
 	} args; /**< Command arguments. */
 };
 
+static const enum index next_destroy_attr[] = {
+	DESTROY_RULE,
+	END,
+	ZERO,
+};
+
 static const enum index next_list_attr[] = {
 	LIST_GROUP,
 	END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[RULE_ID] = {
+		.name = "{rule_id}",
+		.type = "RULE ID",
+		.help = "rule identifier",
+		.call = parse_int,
+		.comp = comp_rule_id,
+	},
 	[PORT_ID] = {
 		.name = "{port_id}",
 		.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (FLUSH,
+			     (DESTROY,
+			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[DESTROY] = {
+		.name = "destroy",
+		.help = "destroy specific flow rules",
+		.next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_destroy,
+	},
 	[FLUSH] = {
 		.name = "flush",
 		.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_list,
 	},
+	/* Destroy arguments. */
+	[DESTROY_RULE] = {
+		.name = "rule",
+		.help = "specify a rule identifier",
+		.next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
+		.call = parse_destroy,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -329,6 +372,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for destroy command. */
+static int
+parse_destroy(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != DESTROY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.destroy.rule =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
+	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	return len;
+}
+
 /** Parse tokens for flush command. */
 static int
 parse_flush(struct context *ctx, const struct token *token,
@@ -494,6 +570,30 @@ comp_port(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete available rule IDs. */
+static int
+comp_rule_id(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	struct rte_port *port;
+	struct port_flow *pf;
+
+	(void)token;
+	if (port_id_is_invalid(ctx->port, DISABLED_WARN) ||
+	    ctx->port == (uint16_t)RTE_PORT_ALL)
+		return -1;
+	port = &ports[ctx->port];
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", pf->id);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -736,6 +836,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case DESTROY:
+		port_flow_destroy(in->port, in->args.destroy.rule_n,
+				  in->args.destroy.rule);
+		break;
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH v3 12/25] app/testpmd: add flow validate/create commands
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (10 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 13/25] app/testpmd: add flow query command Adrien Mazarguil
                           ` (13 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Syntax:

 flow (validate|create) {port_id}
    [group {group_id}] [priority {level}] [ingress] [egress]
    pattern {item} [/ {item} [...]] / end
    actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates lists of pattern items and actions.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, processes packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.
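
The validate/create parser carves two regions out of a single buffer:
descriptors are appended from the front while their associated data is
reserved from the back, and a new entry is refused once both regions
would meet. Below is a minimal sketch of that layout under assumptions:
demo_item, demo_arena and DEMO_ALIGN_FLOOR are hypothetical stand-ins,
and linking the descriptor to its data is done here only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round "v" down to a multiple of "a" (a power of two). */
#define DEMO_ALIGN_FLOOR(v, a) ((uintptr_t)(v) & ~((uintptr_t)(a) - 1))

/* Hypothetical descriptor, standing in for a pattern item or action. */
struct demo_item {
	int type;   /* Item type. */
	void *data; /* Associated data area, NULL if none. */
};

/* Hypothetical bookkeeping for the two regions. */
struct demo_arena {
	struct demo_item *items; /* Descriptors, growing upward. */
	uint32_t items_n;        /* Number of descriptors. */
	uint8_t *data;           /* Lowest used data byte, growing downward. */
};

/* "mem" must be suitably aligned for struct demo_item. */
static void
demo_arena_init(struct demo_arena *a, void *mem, size_t size)
{
	assert(a != NULL && mem != NULL);
	a->items = mem;
	a->items_n = 0;
	a->data = (uint8_t *)mem + size;
}

/*
 * Append one descriptor and reserve "data_size" zeroed bytes for it.
 * Return NULL when both regions would overlap.
 */
static struct demo_item *
demo_arena_push(struct demo_arena *a, int type, size_t data_size)
{
	struct demo_item *item = a->items + a->items_n;
	uintptr_t lo = (uintptr_t)item + sizeof(*item);
	uintptr_t hi = (uintptr_t)a->data;
	uint8_t *data;

	if (hi < lo || hi - lo < data_size)
		return NULL;
	data = (uint8_t *)DEMO_ALIGN_FLOOR(hi - data_size, sizeof(double));
	if ((uintptr_t)data < lo)
		return NULL;
	memset(data, 0, data_size);
	item->type = type;
	item->data = data_size ? data : NULL;
	a->data = data;
	++a->items_n;
	return item;
}
```

In the patch itself the reserved size for a pattern item is three times
the specification size (spec, last, mask), while an action reserves a
single configuration structure.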

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 80ddda2..23f4b48 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow validate {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Check whether a flow rule can be created.\n\n"
+
+			"flow create {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Create a flow rule.\n\n"
+
 			"flow destroy {port_id} rule {rule_id} [...]\n"
 			"    Destroy specific flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 786b718..2fd3a5d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
+	PRIORITY_LEVEL,
 
 	/* Top-level command. */
 	FLOW,
 
 	/* Sub-level commands. */
+	VALIDATE,
+	CREATE,
 	DESTROY,
 	FLUSH,
 	LIST,
@@ -73,6 +76,26 @@ enum index {
 
 	/* List arguments. */
 	LIST_GROUP,
+
+	/* Validate/create arguments. */
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+
+	/* Validate/create pattern. */
+	PATTERN,
+	ITEM_NEXT,
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+
+	/* Validate/create actions. */
+	ACTIONS,
+	ACTION_NEXT,
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
 	uint16_t port; /**< Current port ID (for completions). */
+	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -109,6 +133,8 @@ struct token {
 	const char *type;
 	/** Help displayed during completion (defaults to token name). */
 	const char *help;
+	/** Private data used by parser functions. */
+	const void *priv;
 	/**
 	 * Lists of subsequent tokens to push on the stack. Each call to the
 	 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			struct rte_flow_attr attr;
+			struct rte_flow_item *pattern;
+			struct rte_flow_action *actions;
+			uint32_t pattern_n;
+			uint32_t actions_n;
+			uint8_t *data;
+		} vc; /**< Validate/create arguments. */
+		struct {
 			uint32_t *rule;
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
 	} args; /**< Command arguments. */
 };
 
+/** Private data for pattern items. */
+struct parse_item_priv {
+	enum rte_flow_item_type type; /**< Item type. */
+	uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+	(&(const struct parse_item_priv){ \
+		.type = RTE_FLOW_ITEM_TYPE_ ## t, \
+		.size = s, \
+	})
+
+/** Private data for actions. */
+struct parse_action_priv {
+	enum rte_flow_action_type type; /**< Action type. */
+	uint32_t size; /**< Size of action configuration structure. */
+};
+
+#define PRIV_ACTION(t, s) \
+	(&(const struct parse_action_priv){ \
+		.type = RTE_FLOW_ACTION_TYPE_ ## t, \
+		.size = s, \
+	})
+
+static const enum index next_vc_attr[] = {
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+	PATTERN,
+	ZERO,
+};
+
 static const enum index next_destroy_attr[] = {
 	DESTROY_RULE,
 	END,
@@ -192,9 +259,26 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+static const enum index next_item[] = {
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+	ZERO,
+};
+
+static const enum index next_action[] = {
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_vc(struct context *, const struct token *,
+		    const char *, unsigned int,
+		    void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -266,18 +350,41 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PRIORITY_LEVEL] = {
+		.name = "{level}",
+		.type = "PRIORITY",
+		.help = "priority level",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (DESTROY,
+			     (VALIDATE,
+			      CREATE,
+			      DESTROY,
 			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[VALIDATE] = {
+		.name = "validate",
+		.help = "check whether a flow rule can be created",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
+	[CREATE] = {
+		.name = "create",
+		.help = "create a flow rule",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
 	[DESTROY] = {
 		.name = "destroy",
 		.help = "destroy specific flow rules",
@@ -315,6 +422,98 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
 		.call = parse_list,
 	},
+	/* Validate/create attributes. */
+	[GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, group)),
+		.call = parse_vc,
+	},
+	[PRIORITY] = {
+		.name = "priority",
+		.help = "specify a priority level",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, priority)),
+		.call = parse_vc,
+	},
+	[INGRESS] = {
+		.name = "ingress",
+		.help = "apply rule to ingress traffic",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	[EGRESS] = {
+		.name = "egress",
+		.help = "apply rule to egress traffic",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	/* Validate/create pattern. */
+	[PATTERN] = {
+		.name = "pattern",
+		.help = "submit a list of pattern items",
+		.next = NEXT(next_item),
+		.call = parse_vc,
+	},
+	[ITEM_NEXT] = {
+		.name = "/",
+		.help = "specify next pattern item",
+		.next = NEXT(next_item),
+	},
+	[ITEM_END] = {
+		.name = "end",
+		.help = "end list of pattern items",
+		.priv = PRIV_ITEM(END, 0),
+		.next = NEXT(NEXT_ENTRY(ACTIONS)),
+		.call = parse_vc,
+	},
+	[ITEM_VOID] = {
+		.name = "void",
+		.help = "no-op pattern item",
+		.priv = PRIV_ITEM(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_INVERT] = {
+		.name = "invert",
+		.help = "perform actions when pattern does not match",
+		.priv = PRIV_ITEM(INVERT, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	/* Validate/create actions. */
+	[ACTIONS] = {
+		.name = "actions",
+		.help = "submit a list of associated actions",
+		.next = NEXT(next_action),
+		.call = parse_vc,
+	},
+	[ACTION_NEXT] = {
+		.name = "/",
+		.help = "specify next action",
+		.next = NEXT(next_action),
+	},
+	[ACTION_END] = {
+		.name = "end",
+		.help = "end list of actions",
+		.priv = PRIV_ACTION(END, 0),
+		.call = parse_vc,
+	},
+	[ACTION_VOID] = {
+		.name = "void",
+		.help = "no-op action",
+		.priv = PRIV_ACTION(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PASSTHRU] = {
+		.name = "passthru",
+		.help = "let subsequent rule process matched packets",
+		.priv = PRIV_ACTION(PASSTHRU, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -368,10 +567,108 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->objdata = 0;
 	ctx->object = out;
 	return len;
 }
 
+/** Parse tokens for validate/create commands. */
+static int
+parse_vc(struct context *ctx, const struct token *token,
+	 const char *str, unsigned int len,
+	 void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	uint8_t *data;
+	uint32_t data_size;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != VALIDATE && ctx->curr != CREATE)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		out->args.vc.data = (uint8_t *)out + size;
+		return len;
+	}
+	ctx->objdata = 0;
+	ctx->object = &out->args.vc.attr;
+	switch (ctx->curr) {
+	case GROUP:
+	case PRIORITY:
+		return len;
+	case INGRESS:
+		out->args.vc.attr.ingress = 1;
+		return len;
+	case EGRESS:
+		out->args.vc.attr.egress = 1;
+		return len;
+	case PATTERN:
+		out->args.vc.pattern =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		ctx->object = out->args.vc.pattern;
+		return len;
+	case ACTIONS:
+		out->args.vc.actions =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)
+					       (out->args.vc.pattern +
+						out->args.vc.pattern_n),
+					       sizeof(double));
+		ctx->object = out->args.vc.actions;
+		return len;
+	default:
+		if (!token->priv)
+			return -1;
+		break;
+	}
+	if (!out->args.vc.actions) {
+		const struct parse_item_priv *priv = token->priv;
+		struct rte_flow_item *item =
+			out->args.vc.pattern + out->args.vc.pattern_n;
+
+		data_size = priv->size * 3; /* spec, last, mask */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)item + sizeof(*item) > data)
+			return -1;
+		*item = (struct rte_flow_item){
+			.type = priv->type,
+		};
+		++out->args.vc.pattern_n;
+		ctx->object = item;
+	} else {
+		const struct parse_action_priv *priv = token->priv;
+		struct rte_flow_action *action =
+			out->args.vc.actions + out->args.vc.actions_n;
+
+		data_size = priv->size; /* configuration */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)action + sizeof(*action) > data)
+			return -1;
+		*action = (struct rte_flow_action){
+			.type = priv->type,
+		};
+		++out->args.vc.actions_n;
+		ctx->object = action;
+	}
+	memset(data, 0, data_size);
+	out->args.vc.data = data;
+	ctx->objdata = data_size;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -392,6 +689,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -401,6 +699,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
 	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
 	return len;
 }
@@ -425,6 +724,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 	}
 	return len;
@@ -450,6 +750,7 @@ parse_list(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -459,6 +760,7 @@ parse_list(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
 	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
 	return len;
 }
@@ -526,6 +828,7 @@ parse_port(struct context *ctx, const struct token *token,
 	if (buf)
 		out = buf;
 	else {
+		ctx->objdata = 0;
 		ctx->object = out;
 		size = sizeof(*out);
 	}
@@ -613,6 +916,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->eol = 0;
 	ctx->last = 0;
 	ctx->port = 0;
+	ctx->objdata = 0;
 	ctx->object = NULL;
 }
 
@@ -836,6 +1140,14 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case VALIDATE:
+		port_flow_validate(in->port, &in->args.vc.attr,
+				   in->args.vc.pattern, in->args.vc.actions);
+		break;
+	case CREATE:
+		port_flow_create(in->port, &in->args.vc.attr,
+				 in->args.vc.pattern, in->args.vc.actions);
+		break;
 	case DESTROY:
 		port_flow_destroy(in->port, in->args.destroy.rule_n,
 				  in->args.destroy.rule);
-- 
2.1.4


* [dpdk-dev] [PATCH v3 13/25] app/testpmd: add flow query command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (11 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
                           ` (12 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.
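
Conceptually, a query first resolves the rule ID and then looks for the
requested action type among the actions of that rule. The following is a
rough sketch of such a lookup, not the code behind this command: every
demo_* type and function is hypothetical, and the action list is assumed
to be END-terminated like the lists given to validate/create.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types; END terminates a list. */
enum demo_action_type {
	DEMO_ACTION_END,
	DEMO_ACTION_VOID,
	DEMO_ACTION_PASSTHRU,
	DEMO_ACTION_COUNT,
};

struct demo_action {
	enum demo_action_type type; /* Action type. */
};

/* Hypothetical rule entry with its END-terminated action list. */
struct demo_flow {
	struct demo_flow *next;            /* Next rule, NULL at the end. */
	uint32_t id;                       /* Rule identifier. */
	const struct demo_action *actions; /* END-terminated actions. */
};

/*
 * Return the first action of type "type" in rule "rule", or NULL when
 * either the rule or the action cannot be found.
 */
static const struct demo_action *
demo_query_lookup(const struct demo_flow *list, uint32_t rule,
		  enum demo_action_type type)
{
	const struct demo_flow *pf;
	const struct demo_action *act;

	for (pf = list; pf != NULL; pf = pf->next)
		if (pf->id == rule)
			break;
	if (pf == NULL)
		return NULL;
	assert(pf->actions != NULL);
	for (act = pf->actions; act->type != DEMO_ACTION_END; ++act)
		if (act->type == type)
			return act;
	return NULL;
}
```

A NULL result covers both failure cases here; an actual command would
likely report them separately.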

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 23f4b48..f768b6b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
+			"flow query {port_id} {rule_id} {action}\n"
+			"    Query an existing flow rule.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 2fd3a5d..8f7ec1d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
 	CREATE,
 	DESTROY,
 	FLUSH,
+	QUERY,
 	LIST,
 
 	/* Destroy arguments. */
 	DESTROY_RULE,
 
+	/* Query arguments. */
+	QUERY_ACTION,
+
 	/* List arguments. */
 	LIST_GROUP,
 
@@ -208,6 +212,10 @@ struct buffer {
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
 		struct {
+			uint32_t rule;
+			enum rte_flow_action_type action;
+		} query; /**< Query arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
 			      CREATE,
 			      DESTROY,
 			      FLUSH,
-			      LIST)),
+			      LIST,
+			      QUERY)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_flush,
 	},
+	[QUERY] = {
+		.name = "query",
+		.help = "query an existing flow rule",
+		.next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+			     NEXT_ENTRY(RULE_ID),
+			     NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+			     ARGS_ENTRY(struct buffer, args.query.rule),
+			     ARGS_ENTRY(struct buffer, port)),
+		.call = parse_query,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
 		.call = parse_destroy,
 	},
+	/* Query arguments. */
+	[QUERY_ACTION] = {
+		.name = "{action}",
+		.type = "ACTION",
+		.help = "action to query, must be part of the rule",
+		.call = parse_action,
+		.comp = comp_action,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -730,6 +766,67 @@ parse_flush(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for query command. */
+static int
+parse_query(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != QUERY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+	}
+	return len;
+}
+
+/** Parse action names. */
+static int
+parse_action(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+
+	(void)size;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	/* Parse action name. */
+	for (i = 0; next_action[i]; ++i) {
+		const struct parse_action_priv *priv;
+
+		token = &token_list[next_action[i]];
+		if (strncmp(token->name, str, len))
+			continue;
+		priv = token->priv;
+		if (!priv)
+			goto error;
+		if (out)
+			memcpy((uint8_t *)ctx->object + arg->offset,
+			       &priv->type,
+			       arg->size);
+		return len;
+	}
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -853,6 +950,24 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete action names. */
+static int
+comp_action(struct context *ctx, const struct token *token,
+	    unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; next_action[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s",
+					token_list[next_action[i]].name);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete available ports. */
 static int
 comp_port(struct context *ctx, const struct token *token,
@@ -1155,6 +1270,10 @@ cmd_flow_parsed(const struct buffer *in)
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
+	case QUERY:
+		port_flow_query(in->port, in->args.query.rule,
+				in->args.query.action);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4

* [dpdk-dev] [PATCH v3 14/25] app/testpmd: add rte_flow item spec handler
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (12 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 13/25] app/testpmd: add flow query command Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
                           ` (11 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- is: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets upper value of the spec => last range.
- mask: sets bit-mask affecting both spec and last from arbitrary value.
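
The effect of these operators can be sketched on a single 16-bit field. This
is a simplified model, not the parser code: the demo_* types are hypothetical,
and the full-match operator is the one spelled "is" in the token list of this
patch.

```c
#include <stdint.h>

/* Hypothetical one-field item; real rte_flow items have more fields. */
struct demo_item_field {
	uint16_t value;
};

/* spec, last and mask stored back to back, one copy of the item each. */
struct demo_item_data {
	struct demo_item_field spec;
	struct demo_item_field last;
	struct demo_item_field mask;
};

enum demo_param { DEMO_IS, DEMO_SPEC, DEMO_LAST, DEMO_MASK };

/* Apply one operator to the field. Returns 0 on success, -1 otherwise. */
static int
demo_set(struct demo_item_data *d, enum demo_param param, uint16_t v)
{
	switch (param) {
	case DEMO_IS:
		/* Write the value and force a full bit-mask. */
		d->spec.value = v;
		d->mask.value = UINT16_MAX;
		return 0;
	case DEMO_SPEC:
		/* Write the value, leave the bit-mask untouched. */
		d->spec.value = v;
		return 0;
	case DEMO_LAST:
		/* Upper bound of the spec => last range. */
		d->last.value = v;
		return 0;
	case DEMO_MASK:
		/* Arbitrary bit-mask applied to both spec and last. */
		d->mask.value = v;
		return 0;
	}
	return -1;
}
```

In this model, the full-match operator is equivalent to "spec" followed by
"mask" with all bits set, which is why it is the convenient choice for exact
matches.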

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 111 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 8f7ec1d..b66fecf 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,15 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+__rte_unused
+static const enum index item_param[] = {
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	ZERO,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +301,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +508,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_IS] = {
+		.name = "is",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +641,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +669,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +691,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +700,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +723,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +740,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +748,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_IS:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +824,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +835,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +861,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +888,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +950,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +961,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +994,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1011,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1036,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1143,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
-- 
2.1.4

* [dpdk-dev] [PATCH v3 15/25] app/testpmd: add rte_flow item spec prefix length
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (13 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
                           ` (10 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Generating bit-masks from prefix lengths is often more convenient than
providing them entirely (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
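
For reference, a minimal sketch of the conversion for a field stored in
network byte order (most significant byte first). It only covers that case
and leaves out the host-order variant; demo_prefix_to_mask() is a
hypothetical name used for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill buf (size bytes) with a mask whose prefix most significant bits
 * are set, most significant byte first. Returns 0 on success, -1 if the
 * prefix does not fit in the field.
 */
static int
demo_prefix_to_mask(uint8_t *buf, size_t size, unsigned int prefix)
{
	/* Partial byte values for 0 to 7 leading bits set. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	if (bytes + !!extra > size)
		return -1;
	/* Full bytes first, then clear the remainder. */
	memset(buf, 0xff, bytes);
	memset(buf + bytes, 0x00, size - bytes);
	/* Partially covered byte, if any. */
	if (extra)
		buf[bytes] = conv[extra];
	return 0;
}
```

A prefix length of 20 on a 4-byte field yields ff ff f0 00, i.e. the IPv4
netmask 255.255.240.0.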

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 80 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b66fecf..07f895e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PREFIX,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -278,6 +280,7 @@ static const enum index item_param[] = {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ZERO,
 };
 
@@ -321,6 +324,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -361,6 +367,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PREFIX] = {
+		.name = "{prefix}",
+		.type = "PREFIX",
+		.help = "prefix length for bit-mask",
+		.call = parse_prefix,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -528,6 +541,11 @@ static const struct token token_list[] = {
 		.help = "specify bit-mask with relevant bits set to one",
 		.call = parse_vc_spec,
 	},
+	[ITEM_PARAM_PREFIX] = {
+		.name = "prefix",
+		.help = "generate bit-mask from a prefix length",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +623,62 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+	char *end;
+	uintmax_t u;
+	unsigned int bytes;
+	unsigned int extra;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	bytes = u / 8;
+	extra = u % 8;
+	size = arg->size;
+	if (bytes > size || bytes + !!extra > size)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+		memset(buf, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+	} else
+#endif
+	{
+		memset(buf, 0xff, bytes);
+		memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[bytes] = conv[extra];
+	}
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -776,6 +850,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	case ITEM_PARAM_LAST:
 		index = 1;
 		break;
+	case ITEM_PARAM_PREFIX:
+		/* Modify next token to expect a prefix. */
+		if (ctx->next_num < 2)
+			return -1;
+		ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
+		/* Fall through. */
 	case ITEM_PARAM_MASK:
 		index = 2;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v3 16/25] app/testpmd: add rte_flow bit-field support
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (14 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
                           ` (9 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.
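
The idea can be sketched as follows: a byte-wise mask marks the bits that
belong to the bit-field, and the value is spread over them one bit at a
time. This is a simplified, standalone variant written for illustration;
demo_bf_fill() and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Spread the low bits of val into dst, one bit per set bit of mask,
 * walking bytes in address order and bits from least significant up.
 * Bits of dst outside the mask are preserved. Returns the number of
 * bits covered by mask; dst may be NULL to only count them.
 */
static size_t
demo_bf_fill(uint8_t *dst, uintmax_t val, const uint8_t *mask, size_t size)
{
	size_t len = 0;
	size_t i;

	for (i = 0; i != size; ++i) {
		unsigned int shift;

		for (shift = 0; shift != 8; ++shift) {
			if (!(mask[i] & (1u << shift)))
				continue;
			++len;
			if (!dst)
				continue;
			/* Clear the target bit, then copy the next value bit. */
			dst[i] = (uint8_t)((dst[i] & ~(1u << shift)) |
					   ((val & 1) << shift));
			val >>= 1;
		}
	}
	return len;
}
```

Calling it with a NULL destination gives the width of the bit-field, which
is enough to reject values or prefix lengths that do not fit before anything
is written.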

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 59 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 07f895e..69887fc 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
 	uint32_t sign:1; /**< Value is signed. */
 	uint32_t offset; /**< Relative offset from ctx->object. */
 	uint32_t size; /**< Field size. */
+	const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };
 
 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
 		.size = sizeof(((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f, b) \
+	(&(const struct arg){ \
+		.size = sizeof(s), \
+		.mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -623,6 +631,34 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+	uint32_t i;
+	size_t len = 0;
+
+	/* Endian conversion is not supported on bit-fields. */
+	if (!arg->mask || arg->hton)
+		return 0;
+	for (i = 0; i != arg->size; ++i) {
+		unsigned int shift = 0;
+		uint8_t *buf = (uint8_t *)dst + i;
+
+		for (shift = 0; arg->mask[i] >> shift; ++shift) {
+			if (!(arg->mask[i] & (1 << shift)))
+				continue;
+			++len;
+			if (!dst)
+				continue;
+			*buf &= ~(1 << shift);
+			*buf |= (val & 1) << shift;
+			val >>= 1;
+		}
+	}
+	return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -649,6 +685,23 @@ parse_prefix(struct context *ctx, const struct token *token,
 	u = strtoumax(str, &end, 0);
 	if (errno || (size_t)(end - str) != len)
 		goto error;
+	if (arg->mask) {
+		uintmax_t v = 0;
+
+		extra = arg_entry_bf_fill(NULL, 0, arg);
+		if (u > extra)
+			goto error;
+		if (!ctx->object)
+			return len;
+		extra -= u;
+		while (u--)
+			(v <<= 1, v |= 1);
+		v <<= extra;
+		if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	bytes = u / 8;
 	extra = u % 8;
 	size = arg->size;
@@ -1072,6 +1125,12 @@ parse_int(struct context *ctx, const struct token *token,
 		goto error;
 	if (!ctx->object)
 		return len;
+	if (arg->mask) {
+		if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
 objmask:
-- 
2.1.4

* [dpdk-dev] [PATCH v3 17/25] app/testpmd: add item any to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (15 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 18/25] app/testpmd: add various items " Adrien Mazarguil
                           ` (8 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

This pattern item matches any protocol in place of the current layer and
has one property:

- num: number of layers covered.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 69887fc..1736954 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,8 @@ enum index {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ITEM_ANY_NUM,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -282,7 +284,6 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
-__rte_unused
 static const enum index item_param[] = {
 	ITEM_PARAM_IS,
 	ITEM_PARAM_SPEC,
@@ -296,6 +297,13 @@ static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ZERO,
+};
+
+static const enum index item_any[] = {
+	ITEM_ANY_NUM,
+	ITEM_NEXT,
 	ZERO,
 };
 
@@ -580,6 +588,19 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
 		.call = parse_vc,
 	},
+	[ITEM_ANY] = {
+		.name = "any",
+		.help = "match any protocol for the current layer",
+		.priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+		.next = NEXT(item_any),
+		.call = parse_vc,
+	},
+	[ITEM_ANY_NUM] = {
+		.name = "num",
+		.help = "number of layers covered",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v3 18/25] app/testpmd: add various items to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (16 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 19/25] app/testpmd: add item raw " Adrien Mazarguil
                           ` (7 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 53 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1736954..ac93679 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -101,6 +101,11 @@ enum index {
 	ITEM_INVERT,
 	ITEM_ANY,
 	ITEM_ANY_NUM,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_VF_ID,
+	ITEM_PORT,
+	ITEM_PORT_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -298,6 +303,9 @@ static const enum index next_item[] = {
 	ITEM_VOID,
 	ITEM_INVERT,
 	ITEM_ANY,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_PORT,
 	ZERO,
 };
 
@@ -307,6 +315,18 @@ static const enum index item_any[] = {
 	ZERO,
 };
 
+static const enum index item_vf[] = {
+	ITEM_VF_ID,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_port[] = {
+	ITEM_PORT_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -601,6 +621,39 @@ static const struct token token_list[] = {
 		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
 	},
+	[ITEM_PF] = {
+		.name = "pf",
+		.help = "match packets addressed to the physical function",
+		.priv = PRIV_ITEM(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_VF] = {
+		.name = "vf",
+		.help = "match packets addressed to a virtual function ID",
+		.priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+		.next = NEXT(item_vf),
+		.call = parse_vc,
+	},
+	[ITEM_VF_ID] = {
+		.name = "id",
+		.help = "destination VF ID",
+		.next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+	},
+	[ITEM_PORT] = {
+		.name = "port",
+		.help = "device-specific physical port index to use",
+		.priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+		.next = NEXT(item_port),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_INDEX] = {
+		.name = "index",
+		.help = "physical port index",
+		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v3 19/25] app/testpmd: add item raw to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (17 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 18/25] app/testpmd: add various items " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
                           ` (6 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Matches arbitrary byte strings with properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length.
- pattern: byte string to look for.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 208 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index ac93679..dafb07f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
 	INTEGER,
 	UNSIGNED,
 	PREFIX,
+	BOOLEAN,
+	STRING,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -106,6 +108,12 @@ enum index {
 	ITEM_VF_ID,
 	ITEM_PORT,
 	ITEM_PORT_INDEX,
+	ITEM_RAW,
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -115,6 +123,13 @@ enum index {
 	ACTION_PASSTHRU,
 };
 
+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -216,6 +231,13 @@ struct token {
 		.size = sizeof(*((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = (sz), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -306,6 +328,7 @@ static const enum index next_item[] = {
 	ITEM_PF,
 	ITEM_VF,
 	ITEM_PORT,
+	ITEM_RAW,
 	ZERO,
 };
 
@@ -327,6 +350,16 @@ static const enum index item_port[] = {
 	ZERO,
 };
 
+static const enum index item_raw[] = {
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -363,11 +396,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
 		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -410,6 +451,20 @@ static const struct token token_list[] = {
 		.call = parse_prefix,
 		.comp = comp_none,
 	},
+	[BOOLEAN] = {
+		.name = "{boolean}",
+		.type = "BOOLEAN",
+		.help = "any boolean value",
+		.call = parse_boolean,
+		.comp = comp_boolean,
+	},
+	[STRING] = {
+		.name = "{string}",
+		.type = "STRING",
+		.help = "fixed string",
+		.call = parse_string,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -654,6 +709,52 @@ static const struct token token_list[] = {
 		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
 	},
+	[ITEM_RAW] = {
+		.name = "raw",
+		.help = "match an arbitrary byte string",
+		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+		.next = NEXT(item_raw),
+		.call = parse_vc,
+	},
+	[ITEM_RAW_RELATIVE] = {
+		.name = "relative",
+		.help = "look for pattern after the previous item",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   relative, 1)),
+	},
+	[ITEM_RAW_SEARCH] = {
+		.name = "search",
+		.help = "search pattern from offset (see also limit)",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   search, 1)),
+	},
+	[ITEM_RAW_OFFSET] = {
+		.name = "offset",
+		.help = "absolute or relative offset for pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
+	},
+	[ITEM_RAW_LIMIT] = {
+		.name = "limit",
+		.help = "search area limit for start of pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
+	},
+	[ITEM_RAW_PATTERN] = {
+		.name = "pattern",
+		.help = "byte string to look for",
+		.next = NEXT(item_raw,
+			     NEXT_ENTRY(STRING),
+			     NEXT_ENTRY(ITEM_PARAM_IS,
+					ITEM_PARAM_SPEC,
+					ITEM_PARAM_MASK)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
+			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
+					    pattern,
+					    ITEM_RAW_PATTERN_SIZE)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1235,6 +1336,96 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a string.
+ *
+ * Two arguments (ctx->args) are retrieved from the stack to store data and
+ * its length (in that order).
+ */
+static int
+parse_string(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg_data = pop_args(ctx);
+	const struct arg *arg_len = pop_args(ctx);
+	char tmp[16]; /* Ought to be enough. */
+	int ret;
+
+	/* Arguments are expected. */
+	if (!arg_data)
+		return -1;
+	if (!arg_len) {
+		push_args(ctx, arg_data);
+		return -1;
+	}
+	size = arg_data->size;
+	/* Bit-mask fill is not supported. */
+	if (arg_data->mask || size < len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	/* Let parse_int() fill length information first. */
+	ret = snprintf(tmp, sizeof(tmp), "%u", len);
+	if (ret < 0)
+		goto error;
+	push_args(ctx, arg_len);
+	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		goto error;
+	}
+	buf = (uint8_t *)ctx->object + arg_data->offset;
+	/* Output buffer is not necessarily NUL-terminated. */
+	memcpy(buf, str, len);
+	memset((uint8_t *)buf + len, 0x55, size - len);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
+	return len;
+error:
+	push_args(ctx, arg_len);
+	push_args(ctx, arg_data);
+	return -1;
+}
+
+/** Boolean values (even indices stand for false). */
+static const char *const boolean_name[] = {
+	"0", "1",
+	"false", "true",
+	"no", "yes",
+	"N", "Y",
+	NULL,
+};
+
+/**
+ * Parse a boolean value.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_boolean(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	for (i = 0; boolean_name[i]; ++i)
+		if (!strncmp(str, boolean_name[i], len))
+			break;
+	/* Process token as integer. */
+	if (boolean_name[i])
+		str = i & 1 ? "1" : "0";
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, strlen(str), buf, size);
+	return ret > 0 ? (int)len : ret;
+}
+
 /** Parse port and update context. */
 static int
 parse_port(struct context *ctx, const struct token *token,
@@ -1273,6 +1464,23 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete boolean values. */
+static int
+comp_boolean(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; boolean_name[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", boolean_name[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete action names. */
 static int
 comp_action(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH v3 20/25] app/testpmd: add items eth/vlan to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (18 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 19/25] app/testpmd: add item raw " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
                           ` (5 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.
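
As a reminder of what the TCI field carries and why these fields are stored
in network byte order, the sketch below packs the 802.1Q priority (PCP),
drop eligibility (DEI) and VLAN ID into a 16-bit TCI and writes it out
big-endian. The helper names vlan_tci_pack() and store_be16() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Build a host-order TCI: 3-bit PCP, 1-bit DEI, 12-bit VLAN ID. */
static uint16_t
vlan_tci_pack(unsigned int pcp, unsigned int dei, unsigned int vid)
{
	assert(pcp < 8 && dei < 2 && vid < 4096);
	return (uint16_t)((pcp << 13) | (dei << 12) | vid);
}

/* Store a 16-bit value in network (big-endian) byte order. */
static void
store_be16(uint8_t dst[2], uint16_t v)
{
	dst[0] = (uint8_t)(v >> 8);
	dst[1] = (uint8_t)(v & 0xff);
}
```

Writing bytes explicitly like this yields the same layout on little-endian
and big-endian hosts.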

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 126 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index dafb07f..53709fe 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include <rte_ethdev.h>
 #include <rte_byteorder.h>
 #include <cmdline_parse.h>
+#include <cmdline_parse_etheraddr.h>
 #include <rte_flow.h>
 
 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
 	PREFIX,
 	BOOLEAN,
 	STRING,
+	MAC_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -114,6 +116,13 @@ enum index {
 	ITEM_RAW_OFFSET,
 	ITEM_RAW_LIMIT,
 	ITEM_RAW_PATTERN,
+	ITEM_ETH,
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_VLAN,
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -238,6 +247,14 @@ struct token {
 		.size = (sz), \
 	})
 
+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -329,6 +346,8 @@ static const enum index next_item[] = {
 	ITEM_VF,
 	ITEM_PORT,
 	ITEM_RAW,
+	ITEM_ETH,
+	ITEM_VLAN,
 	ZERO,
 };
 
@@ -360,6 +379,21 @@ static const enum index item_raw[] = {
 	ZERO,
 };
 
+static const enum index item_eth[] = {
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vlan[] = {
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -402,6 +436,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+			  const char *, unsigned int,
+			  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -465,6 +502,13 @@ static const struct token token_list[] = {
 		.call = parse_string,
 		.comp = comp_none,
 	},
+	[MAC_ADDR] = {
+		.name = "{MAC address}",
+		.type = "MAC-48",
+		.help = "standard MAC address notation",
+		.call = parse_mac_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -755,6 +799,50 @@ static const struct token token_list[] = {
 					    pattern,
 					    ITEM_RAW_PATTERN_SIZE)),
 	},
+	[ITEM_ETH] = {
+		.name = "eth",
+		.help = "match Ethernet header",
+		.priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+		.next = NEXT(item_eth),
+		.call = parse_vc,
+	},
+	[ITEM_ETH_DST] = {
+		.name = "dst",
+		.help = "destination MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+	},
+	[ITEM_ETH_SRC] = {
+		.name = "src",
+		.help = "source MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+	},
+	[ITEM_ETH_TYPE] = {
+		.name = "type",
+		.help = "EtherType",
+		.next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+	},
+	[ITEM_VLAN] = {
+		.name = "vlan",
+		.help = "match 802.1Q/ad VLAN tag",
+		.priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+		.next = NEXT(item_vlan),
+		.call = parse_vc,
+	},
+	[ITEM_VLAN_TPID] = {
+		.name = "tpid",
+		.help = "tag protocol identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
+	},
+	[ITEM_VLAN_TCI] = {
+		.name = "tci",
+		.help = "tag control information",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1388,6 +1476,44 @@ parse_string(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a MAC address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_mac_addr(struct context *ctx, const struct token *token,
+	       const char *str, unsigned int len,
+	       void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	struct ether_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	ret = cmdline_parse_etheraddr(NULL, str, &tmp, size);
+	if (ret < 0 || (unsigned int)ret != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (19 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-20  9:21           ` Pei, Yulong
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items " Adrien Mazarguil
                           ` (4 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).
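
Command-line tokens are passed as a pointer plus length and are not
NUL-terminated, so they have to be copied before being handed to
inet_pton(). The following is a simplified sketch of that step only;
parse_ipv4_token() is a hypothetical helper, uses a fixed-size buffer, and
omits the integer fallback present in the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Parse a length-delimited IPv4 token into 4 network-order bytes. */
static int
parse_ipv4_token(const char *str, unsigned int len, uint8_t out[4])
{
	char tmp[INET_ADDRSTRLEN]; /* room for "255.255.255.255" plus NUL */
	struct in_addr addr;

	assert(str != NULL && out != NULL);
	if (len >= sizeof(tmp))
		return -1;
	/* inet_pton() needs a NUL-terminated string. */
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	if (inet_pton(AF_INET, tmp, &addr) != 1)
		return -1;
	/* struct in_addr already holds the address in network byte order. */
	memcpy(out, &addr, sizeof(addr));
	return 0;
}
```

Because inet_pton() already produces network byte order, no further
conversion is applied when copying the result.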

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 53709fe..c2725a5 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -123,6 +126,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -348,6 +357,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	ZERO,
 };
 
@@ -394,6 +405,20 @@ static const enum index item_vlan[] = {
 	ZERO,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -439,6 +464,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -509,6 +540,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -843,6 +888,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1514,6 +1601,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (20 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-20  9:14           ` Pei, Yulong
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 23/25] app/testpmd: add various actions " Adrien Mazarguil
                           ` (3 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.
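
The VXLAN network identifier is a 24-bit field, which has no native C
integer type; this is why a 3-byte case is needed when storing integers. A
minimal sketch of storing and reading such a value in network byte order
follows. The helpers store_be24() and load_be24() are hypothetical names
used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 24 bits of v in network (big-endian) byte order. */
static void
store_be24(uint8_t dst[3], uint32_t v)
{
	assert(v <= 0xffffff);
	dst[0] = (uint8_t)(v >> 16);
	dst[1] = (uint8_t)(v >> 8);
	dst[2] = (uint8_t)v;
}

/* Read back a 24-bit big-endian value. */
static uint32_t
load_be24(const uint8_t src[3])
{
	return ((uint32_t)src[0] << 16) | ((uint32_t)src[1] << 8) | src[2];
}
```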

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c2725a5..a340a75 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -132,6 +132,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -359,6 +373,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	ZERO,
 };
 
@@ -419,6 +438,40 @@ static const enum index item_ipv6[] = {
 	ZERO,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -930,6 +983,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1491,6 +1641,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v3 23/25] app/testpmd: add various actions to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (21 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 24/25] app/testpmd: add queue " Adrien Mazarguil
                           ` (2 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Add the following actions:

- MARK: attach 32-bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.
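
Actions with parameters (such as MARK and VF) carry a pointer to a
configuration structure, while the others need none. The sketch below
illustrates that idea with an END-terminated list. All types and names in it
(enum action_type, struct action, struct action_mark and the two helpers)
are hypothetical stand-ins for illustration, not the rte_flow definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real rte_flow types. */
enum action_type { ACT_END, ACT_MARK, ACT_FLAG, ACT_DROP, ACT_COUNT };

struct action_mark {
	uint32_t id; /* 32-bit value returned with matching packets */
};

struct action {
	enum action_type type;
	const void *conf; /* NULL for actions without parameters */
};

/* Count entries before the END terminator. */
static size_t
action_list_len(const struct action *list)
{
	size_t n = 0;

	assert(list != NULL);
	while (list[n].type != ACT_END)
		++n;
	return n;
}

/* Fetch the ID of the first MARK action, if any. */
static int
action_list_mark_id(const struct action *list, uint32_t *id)
{
	for (; list->type != ACT_END; ++list) {
		if (list->type == ACT_MARK && list->conf != NULL) {
			*id = ((const struct action_mark *)list->conf)->id;
			return 0;
		}
	}
	return -1;
}
```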

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 121 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a340a75..90712bf 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -153,6 +153,15 @@ enum index {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_MARK_ID,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
 };
 
 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -476,6 +485,25 @@ static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ZERO,
+};
+
+static const enum index action_mark[] = {
+	ACTION_MARK_ID,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_vf[] = {
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
+	ACTION_NEXT,
 	ZERO,
 };
 
@@ -487,6 +515,8 @@ static int parse_vc(struct context *, const struct token *,
 		    void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1112,6 +1142,70 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_MARK] = {
+		.name = "mark",
+		.help = "attach 32 bit value to packets",
+		.priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+		.next = NEXT(action_mark),
+		.call = parse_vc,
+	},
+	[ACTION_MARK_ID] = {
+		.name = "id",
+		.help = "32 bit value to return with packets",
+		.next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_FLAG] = {
+		.name = "flag",
+		.help = "flag packets",
+		.priv = PRIV_ACTION(FLAG, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_DROP] = {
+		.name = "drop",
+		.help = "drop packets (note: passthru has priority)",
+		.priv = PRIV_ACTION(DROP, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_COUNT] = {
+		.name = "count",
+		.help = "enable counters for this rule",
+		.priv = PRIV_ACTION(COUNT, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PF] = {
+		.name = "pf",
+		.help = "redirect packets to physical device function",
+		.priv = PRIV_ACTION(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_VF] = {
+		.name = "vf",
+		.help = "redirect packets to virtual device function",
+		.priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+		.next = NEXT(action_vf),
+		.call = parse_vc,
+	},
+	[ACTION_VF_ORIGINAL] = {
+		.name = "original",
+		.help = "use original VF ID if possible",
+		.next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+					   original, 1)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_VF_ID] = {
+		.name = "id",
+		.help = "VF ID to redirect packets to",
+		.next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -1435,6 +1529,33 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse action configuration field. */
+static int
+parse_vc_conf(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Update configuration pointer. */
+	action->conf = ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH v3 24/25] app/testpmd: add queue actions to flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (22 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 23/25] app/testpmd: add various actions " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd " Adrien Mazarguil
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 152 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 90712bf..2376b8f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -156,8 +156,15 @@ enum index {
 	ACTION_MARK,
 	ACTION_MARK_ID,
 	ACTION_FLAG,
+	ACTION_QUEUE,
+	ACTION_QUEUE_INDEX,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_DUP_INDEX,
+	ACTION_RSS,
+	ACTION_RSS_QUEUES,
+	ACTION_RSS_QUEUE,
 	ACTION_PF,
 	ACTION_VF,
 	ACTION_VF_ORIGINAL,
@@ -171,6 +178,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
 
+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+	(offsetof(struct rte_flow_action_rss, queue) + \
+	 sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -487,8 +502,11 @@ static const enum index next_action[] = {
 	ACTION_PASSTHRU,
 	ACTION_MARK,
 	ACTION_FLAG,
+	ACTION_QUEUE,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_RSS,
 	ACTION_PF,
 	ACTION_VF,
 	ZERO,
@@ -500,6 +518,24 @@ static const enum index action_mark[] = {
 	ZERO,
 };
 
+static const enum index action_queue[] = {
+	ACTION_QUEUE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_dup[] = {
+	ACTION_DUP_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_rss[] = {
+	ACTION_RSS_QUEUES,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static const enum index action_vf[] = {
 	ACTION_VF_ORIGINAL,
 	ACTION_VF_ID,
@@ -517,6 +553,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+				     const char *, unsigned int, void *,
+				     unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -566,6 +605,8 @@ static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
 			unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+				    unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1163,6 +1204,21 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_QUEUE] = {
+		.name = "queue",
+		.help = "assign packets to a given queue index",
+		.priv = PRIV_ACTION(QUEUE,
+				    sizeof(struct rte_flow_action_queue)),
+		.next = NEXT(action_queue),
+		.call = parse_vc,
+	},
+	[ACTION_QUEUE_INDEX] = {
+		.name = "index",
+		.help = "queue index to use",
+		.next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_DROP] = {
 		.name = "drop",
 		.help = "drop packets (note: passthru has priority)",
@@ -1177,6 +1233,39 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_DUP] = {
+		.name = "dup",
+		.help = "duplicate packets to a given queue index",
+		.priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+		.next = NEXT(action_dup),
+		.call = parse_vc,
+	},
+	[ACTION_DUP_INDEX] = {
+		.name = "index",
+		.help = "queue index to duplicate packets to",
+		.next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_dup, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS] = {
+		.name = "rss",
+		.help = "spread packets among several queues",
+		.priv = PRIV_ACTION(RSS, ACTION_RSS_SIZE),
+		.next = NEXT(action_rss),
+		.call = parse_vc,
+	},
+	[ACTION_RSS_QUEUES] = {
+		.name = "queues",
+		.help = "queue indices to use",
+		.next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS_QUEUE] = {
+		.name = "{queue}",
+		.help = "queue index",
+		.call = parse_vc_action_rss_queue,
+		.comp = comp_vc_action_rss_queue,
+	},
 	[ACTION_PF] = {
 		.name = "pf",
 		.help = "redirect packets to physical device function",
@@ -1556,6 +1645,51 @@ parse_vc_conf(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/**
+ * Parse queue field for RSS action.
+ *
+ * Valid tokens are queue indices and the "end" token.
+ */
+static int
+parse_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	static const enum index next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
+	int ret;
+	int i;
+
+	(void)token;
+	(void)buf;
+	(void)size;
+	if (ctx->curr != ACTION_RSS_QUEUE)
+		return -1;
+	i = ctx->objdata >> 16;
+	if (!strncmp(str, "end", len)) {
+		ctx->objdata &= 0xffff;
+		return len;
+	}
+	if (i >= ACTION_RSS_NUM)
+		return -1;
+	if (push_args(ctx, ARGS_ENTRY(struct rte_flow_action_rss, queue[i])))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	++i;
+	ctx->objdata = i << 16 | (ctx->objdata & 0xffff);
+	/* Repeat token. */
+	if (ctx->next_num == RTE_DIM(ctx->next))
+		return -1;
+	ctx->next[ctx->next_num++] = next;
+	if (!ctx->object)
+		return len;
+	((struct rte_flow_action_rss *)ctx->object)->num = i;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -2130,6 +2264,24 @@ comp_rule_id(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete queue field for RSS action. */
+static int
+comp_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			 unsigned int ent, char *buf, unsigned int size)
+{
+	static const char *const str[] = { "", "end", NULL };
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; str[i] != NULL; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", str[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
-- 
2.1.4


* [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (23 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 24/25] app/testpmd: add queue " Adrien Mazarguil
@ 2016-12-19 17:49         ` Adrien Mazarguil
  2016-12-19 20:44           ` Mcnamara, John
  2016-12-20 17:06           ` Ferruh Yigit
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
  25 siblings, 2 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-19 17:49 UTC (permalink / raw)
  To: dev

Document syntax, interaction with rte_flow and provide usage examples.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 612 +++++++++++++++++++++++
 1 file changed, 612 insertions(+)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index f1c269a..50cba16 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1631,6 +1631,9 @@ Filter Functions
 
 This section details the available filter functions that are available.
 
+Note these functions interface with the deprecated legacy filtering framework,
+superseded by *rte_flow*. See `Flow rules management`_.
+
 ethertype_filter
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -2041,3 +2044,612 @@ Set different GRE key length for input set::
 For example to set GRE key length for input set to 4 bytes on port 0::
 
    testpmd> global_config 0 gre-key-len 4
+
+
+.. _testpmd_rte_flow:
+
+Flow rules management
+---------------------
+
+Control of the generic flow API (*rte_flow*) is fully exposed through the
+``flow`` command (validation, creation, destruction and queries).
+
+Considering *rte_flow* overlaps with all `Filter Functions`_, using both
+features simultaneously may cause undefined side-effects and is therefore
+not recommended.
+
+``flow`` syntax
+~~~~~~~~~~~~~~~
+
+Because the ``flow`` command uses dynamic tokens to handle the large number
+of possible flow rule combinations, its behavior differs slightly from
+other commands, in particular:
+
+- Pressing *?* or the *<tab>* key displays contextual help for the current
+  token, not that of the entire command.
+
+- Optional and repeated parameters are supported (provided they are listed
+  in the contextual help).
+
+The first parameter stands for the operation mode. Possible operations and
+their general syntax are described below. They are covered in detail in the
+following sections.
+
+- Check whether a flow rule can be created::
+
+   flow validate {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Create a flow rule::
+
+   flow create {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Destroy specific flow rules::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+- Destroy all flow rules::
+
+   flow flush {port_id}
+
+- Query an existing flow rule::
+
+   flow query {port_id} {rule_id} {action}
+
+- List existing flow rules sorted by priority, filtered by group
+  identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+Validating flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow validate`` reports whether a flow rule would be accepted by the
+underlying device in its current state but stops short of creating it. It is
+bound to ``rte_flow_validate()``::
+
+ flow validate {port_id}
+     [group {group_id}] [priority {level}] [ingress] [egress]
+     pattern {item} [/ {item} [...]] / end
+     actions {action} [/ {action} [...]] / end
+
+If successful, it will show::
+
+ Flow rule validated
+
+Otherwise it will show an error message of the form::
+
+ Caught error type [...] ([...]): [...]
+
+This command uses the same parameters as ``flow create``; their format is
+described in `Creating flow rules`_.
+
+Check whether redirecting any Ethernet packet received on port 0 to RX queue
+index 6 is supported::
+
+ testpmd> flow validate 0 ingress pattern eth / end
+     actions queue index 6 / end
+ Flow rule validated
+ testpmd>
+
+Port 0 does not support TCPv6 rules::
+
+ testpmd> flow validate 0 ingress pattern eth / ipv6 / tcp / end
+     actions drop / end
+ Caught error type 9 (specific pattern item): Invalid argument.
+ testpmd>
+
+Creating flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow create`` validates and creates the specified flow rule. It is bound
+to ``rte_flow_create()``::
+
+ flow create {port_id}
+     [group {group_id}] [priority {level}] [ingress] [egress]
+     pattern {item} [/ {item} [...]] / end
+     actions {action} [/ {action} [...]] / end
+
+If successful, it will return a flow rule ID usable with other commands::
+
+ Flow rule #[...] created
+
+Otherwise it will show an error message of the form::
+
+ Caught error type [...] ([...]): [...]
+
+Parameters are described in the following order:
+
+- Attributes (*group*, *priority*, *ingress*, *egress* tokens).
+- A matching pattern, starting with the *pattern* token and terminated by an
+  *end* pattern item.
+- Actions, starting with the *actions* token and terminated by an *end*
+  action.
+
+These translate directly to *rte_flow* objects provided as-is to the
+underlying functions.
+
+The shortest valid definition only comprises mandatory tokens::
+
+ testpmd> flow create 0 pattern end actions end
+
+Note that PMDs may refuse rules that essentially do nothing such as this
+one.
+
+**All unspecified object values are automatically initialized to 0.**
+
+Attributes
+^^^^^^^^^^
+
+These tokens affect flow rule attributes (``struct rte_flow_attr``) and are
+specified before the ``pattern`` token.
+
+- ``group {group id}``: priority group.
+- ``priority {level}``: priority level within group.
+- ``ingress``: rule applies to ingress traffic.
+- ``egress``: rule applies to egress traffic.
+
+Each instance of an attribute specified several times overrides the previous
+value as shown below (group 4 is used)::
+
+ testpmd> flow create 0 group 42 group 24 group 4 [...]
+
+Note that once enabled, ``ingress`` and ``egress`` cannot be disabled.
+
+While not specifying a direction is an error, some rules may allow both
+simultaneously.
+
+Most rules affect RX and therefore contain the ``ingress`` token::
+
+ testpmd> flow create 0 ingress pattern [...]
+
+Matching pattern
+^^^^^^^^^^^^^^^^
+
+A matching pattern starts after the ``pattern`` token. It is made of pattern
+items and is terminated by a mandatory ``end`` item.
+
+Items are named after their type (*RTE_FLOW_ITEM_TYPE_* from ``enum
+rte_flow_item_type``).
+
+The ``/`` token is used as a separator between pattern items as shown
+below::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end [...]
+
+Note that protocol items like these must be stacked from lowest to highest
+layer to make sense. For instance, the following rule is either invalid or
+unlikely to match any packet::
+
+ testpmd> flow create 0 ingress pattern eth / udp / ipv4 / end [...]
+
+More information on these restrictions can be found in the *rte_flow*
+documentation.
+
+Several items support additional specification structures, for example
+``ipv4`` allows specifying source and destination addresses as follows::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 src is 10.1.1.1
+     dst is 10.2.0.0 / end [...]
+
+This rule matches all IPv4 traffic with the specified properties.
+
+In this example, ``src`` and ``dst`` are field names of the underlying
+``struct rte_flow_item_ipv4`` object. All item properties can be specified
+in a similar fashion.
+
+The ``is`` token means that the subsequent value must be matched exactly,
+and assigns ``spec`` and ``mask`` fields in ``struct rte_flow_item``
+accordingly. Possible assignment tokens are:
+
+- ``is``: match value perfectly (with full bit-mask).
+- ``spec``: match value according to configured bit-mask.
+- ``last``: specify upper bound to establish a range.
+- ``mask``: specify bit-mask with relevant bits set to one.
+- ``prefix``: generate bit-mask from a prefix length.
+
+These yield identical results::
+
+ ipv4 src is 10.1.1.1
+
+::
+
+ ipv4 src spec 10.1.1.1 src mask 255.255.255.255
+
+::
+
+ ipv4 src spec 10.1.1.1 src prefix 32
+
+::
+
+ ipv4 src is 10.1.1.1 src last 10.1.1.1 # range with a single value
+
+::
+
+ ipv4 src is 10.1.1.1 src last 0 # 0 disables range
+
+Inclusive ranges can be defined with ``last``::
+
+ ipv4 src is 10.1.1.1 src last 10.2.3.4 # 10.1.1.1 to 10.2.3.4
+
+Note that ``mask`` affects both ``spec`` and ``last``::
+
+ ipv4 src is 10.1.1.1 src last 10.2.3.4 src mask 255.255.0.0
+    # matches 10.1.0.0 to 10.2.255.255
+
+Properties can be modified multiple times::
+
+ ipv4 src is 10.1.1.1 src is 10.1.2.3 src is 10.2.3.4 # matches 10.2.3.4
+
+::
+
+ ipv4 src is 10.1.1.1 src prefix 24 src prefix 16 # matches 10.1.0.0/16
+
+Pattern items
+^^^^^^^^^^^^^
+
+This section lists supported pattern items and their attributes, if any.
+
+- ``end``: end list of pattern items.
+
+- ``void``: no-op pattern item.
+
+- ``invert``: perform actions when pattern does not match.
+
+- ``any``: match any protocol for the current layer.
+
+  - ``num {unsigned}``: number of layers covered.
+
+- ``pf``: match packets addressed to the physical function.
+
+- ``vf``: match packets addressed to a virtual function ID.
+
+  - ``id {unsigned}``: destination VF ID.
+
+- ``port``: device-specific physical port index to use.
+
+  - ``index {unsigned}``: physical port index.
+
+- ``raw``: match an arbitrary byte string.
+
+  - ``relative {boolean}``: look for pattern after the previous item.
+  - ``search {boolean}``: search pattern from offset (see also limit).
+  - ``offset {integer}``: absolute or relative offset for pattern.
+  - ``limit {unsigned}``: search area limit for start of pattern.
+  - ``pattern {string}``: byte string to look for.
+
+- ``eth``: match Ethernet header.
+
+  - ``dst {MAC-48}``: destination MAC.
+  - ``src {MAC-48}``: source MAC.
+  - ``type {unsigned}``: EtherType.
+
+- ``vlan``: match 802.1Q/ad VLAN tag.
+
+  - ``tpid {unsigned}``: tag protocol identifier.
+  - ``tci {unsigned}``: tag control information.
+
+- ``ipv4``: match IPv4 header.
+
+  - ``src {ipv4 address}``: source address.
+  - ``dst {ipv4 address}``: destination address.
+
+- ``ipv6``: match IPv6 header.
+
+  - ``src {ipv6 address}``: source address.
+  - ``dst {ipv6 address}``: destination address.
+
+- ``icmp``: match ICMP header.
+
+  - ``type {unsigned}``: ICMP packet type.
+  - ``code {unsigned}``: ICMP packet code.
+
+- ``udp``: match UDP header.
+
+  - ``src {unsigned}``: UDP source port.
+  - ``dst {unsigned}``: UDP destination port.
+
+- ``tcp``: match TCP header.
+
+  - ``src {unsigned}``: TCP source port.
+  - ``dst {unsigned}``: TCP destination port.
+
+- ``sctp``: match SCTP header.
+
+  - ``src {unsigned}``: SCTP source port.
+  - ``dst {unsigned}``: SCTP destination port.
+
+- ``vxlan``: match VXLAN header.
+
+  - ``vni {unsigned}``: VXLAN identifier.
+
+Actions list
+^^^^^^^^^^^^
+
+A list of actions starts after the ``actions`` token in the same fashion as
+`Matching pattern`_; actions are separated by ``/`` tokens and the list is
+terminated by a mandatory ``end`` action.
+
+Actions are named after their type (*RTE_FLOW_ACTION_TYPE_* from ``enum
+rte_flow_action_type``).
+
+Dropping all incoming UDPv4 packets can be expressed as follows::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions drop / end
+
+Several actions have configurable properties which must be specified when
+there is no valid default value. For example, ``queue`` requires a target
+queue index.
+
+This rule redirects incoming UDPv4 traffic to queue index 6::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions queue index 6 / end
+
+While this one could be rejected by PMDs (unspecified queue index)::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+     actions queue / end
+
+As defined by *rte_flow*, the list is not ordered; all actions of a given
+rule are performed simultaneously. These are equivalent::
+
+ queue index 6 / void / mark id 42 / end
+
+::
+
+ void / mark id 42 / queue index 6 / end
+
+All actions in a list should have different types, otherwise only the last
+action of a given type is taken into account::
+
+ queue index 4 / queue index 5 / queue index 6 / end # will use queue 6
+
+::
+
+ drop / drop / drop / end # drop is performed only once
+
+::
+
+ mark id 42 / queue index 3 / mark id 24 / end # mark will be 24
+
+Considering they are performed simultaneously, opposite and overlapping
+actions can sometimes be combined when the end result is unambiguous::
+
+ drop / queue index 6 / end # drop has no effect
+
+::
+
+ drop / dup index 6 / end # same as above
+
+::
+
+ queue index 6 / rss queues 6 7 8 end / end # queue has no effect
+
+::
+
+ drop / passthru / end # drop has no effect
+
+Note that PMDs may still refuse such combinations.
+
+Actions
+^^^^^^^
+
+This section lists supported actions and their attributes, if any.
+
+- ``end``: end list of actions.
+
+- ``void``: no-op action.
+
+- ``passthru``: let subsequent rule process matched packets.
+
+- ``mark``: attach 32 bit value to packets.
+
+  - ``id {unsigned}``: 32 bit value to return with packets.
+
+- ``flag``: flag packets.
+
+- ``queue``: assign packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to use.
+
+- ``drop``: drop packets (note: passthru has priority).
+
+- ``count``: enable counters for this rule.
+
+- ``dup``: duplicate packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to duplicate packets to.
+
+- ``rss``: spread packets among several queues.
+
+  - ``queues [{unsigned} [...]] end``: queue indices to use.
+
+- ``pf``: redirect packets to physical device function.
+
+- ``vf``: redirect packets to virtual device function.
+
+  - ``original {boolean}``: use original VF ID if possible.
+  - ``id {unsigned}``: VF ID to redirect packets to.
+
+Destroying flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow destroy`` destroys one or more rules from their rule ID (as returned
+by ``flow create``). This command calls ``rte_flow_destroy()`` as many
+times as necessary::
+
+ flow destroy {port_id} rule {rule_id} [...]
+
+If successful, it will show::
+
+ Flow rule #[...] destroyed
+
+It does not report anything for rule IDs that do not exist. The usual error
+message is shown when a rule cannot be destroyed::
+
+ Caught error type [...] ([...]): [...]
+
+``flow flush`` destroys all rules on a device and does not take extra
+arguments. It is bound to ``rte_flow_flush()``::
+
+ flow flush {port_id}
+
+Any errors are reported as above.
+
+Creating several rules and destroying them::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow destroy 0 rule 0 rule 1
+ Flow rule #1 destroyed
+ Flow rule #0 destroyed
+ testpmd>
+
+The same result can be achieved using ``flow flush``::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow flush 0
+ testpmd>
+
+Non-existent rule IDs are ignored::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 3 / end
+ Flow rule #1 created
+ testpmd> flow destroy 0 rule 42 rule 10 rule 2
+ testpmd>
+ testpmd> flow destroy 0 rule 0
+ Flow rule #0 destroyed
+ testpmd>
+
+Querying flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow query`` queries a specific action of a flow rule having that
+ability. Such actions collect information that can be reported using this
+command. It is bound to ``rte_flow_query()``::
+
+ flow query {port_id} {rule_id} {action}
+
+If successful, it will display either the retrieved data for known actions
+or the following message::
+
+ Cannot display result for action type [...] ([...])
+
+Otherwise, it will complain either that the rule does not exist or that some
+error occurred::
+
+ Flow rule #[...] not found
+
+::
+
+ Caught error type [...] ([...]): [...]
+
+Currently only the ``count`` action is supported. This action reports the
+number of packets that hit the flow rule and the total number of bytes. Its
+output has the following format::
+
+ count:
+  hits_set: [...] # whether "hits" contains a valid value
+  bytes_set: [...] # whether "bytes" contains a valid value
+  hits: [...] # number of packets
+  bytes: [...] # number of bytes
+
+Querying counters for TCPv6 packets redirected to queue 6::
+
+ testpmd> flow create 0 ingress pattern eth / ipv6 / tcp / end
+     actions queue index 6 / count / end
+ Flow rule #4 created
+ testpmd> flow query 0 4 count
+ count:
+  hits_set: 1
+  bytes_set: 0
+  hits: 386446
+  bytes: 0
+ testpmd>
+
+Listing flow rules
+~~~~~~~~~~~~~~~~~~
+
+``flow list`` lists existing flow rules sorted by priority and optionally
+filtered by group identifiers::
+
+ flow list {port_id} [group {group_id}] [...]
+
+This command only fails with the following message if the device does not
+exist::
+
+ Invalid port [...]
+
+Output consists of a header line followed by a short description of each
+flow rule, one per line. There is no output at all when no flow rules are
+configured on the device::
+
+ ID      Group   Prio    Attr    Rule
+ [...]   [...]   [...]   [...]   [...]
+
+``Attr`` column flags:
+
+- ``i`` for ``ingress``.
+- ``e`` for ``egress``.
+
+Creating several flow rules and listing them::
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / end
+     actions queue index 6 / end
+ Flow rule #0 created
+ testpmd> flow create 0 ingress pattern eth / ipv6 / end
+     actions queue index 2 / end
+ Flow rule #1 created
+ testpmd> flow create 0 priority 5 ingress pattern eth / ipv4 / udp / end
+     actions rss queues 6 7 8 end / end
+ Flow rule #2 created
+ testpmd> flow list 0
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH IPV4 => QUEUE
+ 1       0       0       i-      ETH IPV6 => QUEUE
+ 2       0       5       i-      ETH IPV4 UDP => RSS
+ testpmd>
+
+Rules are sorted by priority (i.e. group ID first, then priority level)::
+
+ testpmd> flow list 1
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH => COUNT
+ 6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+ 5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+ 1       24      0       i-      ETH IPV4 UDP => QUEUE
+ 4       24      10      i-      ETH IPV4 TCP => DROP
+ 3       24      20      i-      ETH IPV4 => DROP
+ 2       24      42      i-      ETH IPV4 UDP => QUEUE
+ 7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+ testpmd>
+
+Output can be limited to specific groups::
+
+ testpmd> flow list 1 group 0 group 63
+ ID      Group   Prio    Attr    Rule
+ 0       0       0       i-      ETH => COUNT
+ 6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+ 5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+ 7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+ testpmd>
-- 
2.1.4


* Re: [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd " Adrien Mazarguil
@ 2016-12-19 20:44           ` Mcnamara, John
  2016-12-20 10:51             ` Adrien Mazarguil
  2016-12-20 17:06           ` Ferruh Yigit
  1 sibling, 1 reply; 262+ messages in thread
From: Mcnamara, John @ 2016-12-19 20:44 UTC (permalink / raw)
  To: Adrien Mazarguil, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Monday, December 19, 2016 5:49 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
> 
> Document syntax, interaction with rte_flow and provide usage examples.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> 
> ...
>
> +
> +- Check whether a flow rule can be created::
> +
> +   flow validate {port_id}
> +       [group {group_id}] [priority {level}] [ingress] [egress]
> +       pattern {item} [/ {item} [...]] / end
> +       actions {action} [/ {action} [...]] / end
> +
> +- Create a flow rule::
> +
> +   flow create {port_id}
> +       [group {group_id}] [priority {level}] [ingress] [egress]
> +       pattern {item} [/ {item} [...]] / end
> +       actions {action} [/ {action} [...]] / end
> +
> +- Destroy specific flow rules::
> +
> +   flow destroy {port_id} rule {rule_id} [...]
> +
> +- Destroy all flow rules::
> +
> +   flow flush {port_id}
> +

Just a note:

The verbs destroy and flush don't sound right here. Create/destroy is a common
verb pair for objects but these actions are more like add/remove. I guess the
names come from the underlying APIs, which are possibly creating/freeing
objects/structures, but maybe they should be called add/remove as well.

And flush generally applies to a pipeline or a queue. The action here is closer
to "remove all".

Probably not worth reworking at this stage if it hasn't bothered anyone else.


> +underlying device in its current state but stops short of creating it.
> +It is bound to ``rte_flow_validate()``::
> +
> + flow validate {port_id}
> +     [group {group_id}] [priority {level}] [ingress] [egress]
> +     pattern {item} [/ {item} [...]] / end
> +     actions {action} [/ {action} [...]] / end
> +

Here and elsewhere the indentation should be the RST standard 3 spaces,
similar to the rest of the doc. This is only worth changing if you
do some other revision of this doc.

Otherwise very good documentation.

Acked-by: John McNamara <john.mcnamara@intel.com>


* Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-19 10:19           ` Adrien Mazarguil
@ 2016-12-20  1:57             ` Xing, Beilei
  2016-12-20  9:38               ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Xing, Beilei @ 2016-12-20  1:57 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Pei, Yulong



> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Monday, December 19, 2016 6:20 PM
> To: Xing, Beilei <beilei.xing@intel.com>
> Cc: dev@dpdk.org; Pei, Yulong <yulong.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> support for rte_flow
> 
> Hi Beilei,
> 
> On Mon, Dec 19, 2016 at 08:37:20AM +0000, Xing, Beilei wrote:
> > Hi Adrien,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> > > Mazarguil
> > > Sent: Saturday, December 17, 2016 12:25 AM
> > > To: dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> > > support for rte_flow
> > >
> > > Add basic management functions for the generic flow API (validate,
> > > create, destroy, flush, query and list). Flow rule objects and
> > > properties are arranged in lists associated with each port.
> > >
> > > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > +/** Create flow rule. */
> > > +int
> > > +port_flow_create(portid_t port_id,
> > > +		 const struct rte_flow_attr *attr,
> > > +		 const struct rte_flow_item *pattern,
> > > +		 const struct rte_flow_action *actions) {
> > > +	struct rte_flow *flow;
> > > +	struct rte_port *port;
> > > +	struct port_flow *pf;
> > > +	uint32_t id;
> > > +	struct rte_flow_error error;
> > > +
> >
> > I think there should be memset for error here, e.g. memset(&error, 0,
> > sizeof(struct rte_flow_error)); Since both cause and message may be NULL
> > regardless of the error type, if there's no error.cause and error.message
> > returned from PMD, Segmentation fault will happen in port_flow_complain.
> > PS: This issue doesn't happen if add "export EXTRA_CFLAGS=' -g O0'" when
> > compiling.
> 
> Actually, PMDs must fill the error structure only in case of error if the
> application provides one, it's not optional. I didn't initialize this structure for
> this reason.
> 
> I suggest we initialize it with a known poisoning value for debugging purposes
> though, to make it fail every time. Does it sound reasonable?

OK, I see. So you want the PMD to allocate the memory for the error's cause and message, and to fill them in whenever an error occurs, right?
In that case, could the application set the cause and message pointers to NULL? Then the PMD could tell whether it needs to allocate memory or can overwrite what is already there.

> 
> > > +	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
> > > +	if (!flow)
> > > +		return port_flow_complain(&error);
> > > +	port = &ports[port_id];
> > > +	if (port->flow_list) {
> > > +		if (port->flow_list->id == UINT32_MAX) {
> > > +			printf("Highest rule ID is already assigned, delete"
> > > +			       " it first");
> > > +			rte_flow_destroy(port_id, flow, NULL);
> > > +			return -ENOMEM;
> > > +		}
> > > +		id = port->flow_list->id + 1;
> > > +	} else
> > > +		id = 0;
> > > +	pf = port_flow_new(attr, pattern, actions);
> > > +	if (!pf) {
> > > +		int err = rte_errno;
> > > +
> > > +		printf("Cannot allocate flow: %s\n", rte_strerror(err));
> > > +		rte_flow_destroy(port_id, flow, NULL);
> > > +		return -err;
> > > +	}
> > > +	pf->next = port->flow_list;
> > > +	pf->id = id;
> > > +	pf->flow = flow;
> > > +	port->flow_list = pf;
> > > +	printf("Flow rule #%u created\n", pf->id);
> > > +	return 0;
> > > +}
> > > +
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-12-20  7:32           ` Zhao1, Wei
  2016-12-20  9:45             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhao1, Wei @ 2016-12-20  7:32 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi,  Adrien

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, December 20, 2016 1:49 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush
> command
> 
> Syntax:
> 
>  flow flush {port_id}
> 
> Destroy all flow rules on a port.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>
> ---
>  app/test-pmd/cmdline.c      |  3 +++
>  app/test-pmd/cmdline_flow.c | 43
> +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> 0dc6c63..6e2b289 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void
> *parsed_result,
>  			" (select|add)\n"
>  			"    Set the input set for FDir.\n\n"
> 
> +			"flow flush {port_id}\n"
> +			"    Destroy all flow rules.\n\n"
> +
>  			"flow list {port_id} [group {group_id}] [...]\n"
>  			"    List existing flow rules sorted by priority,"
>  			" filtered by group identifiers.\n\n"
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index bd3da38..49578eb 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -63,6 +63,7 @@ enum index {
>  	FLOW,
> 
>  	/* Sub-level commands. */
> +	FLUSH,
>  	LIST,
> 
>  	/* List arguments. */
> @@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {  static int
> parse_init(struct context *, const struct token *,
>  		      const char *, unsigned int,
>  		      void *, unsigned int);
> +static int parse_flush(struct context *, const struct token *,
> +		       const char *, unsigned int,
> +		       void *, unsigned int);
>  static int parse_list(struct context *, const struct token *,
>  		      const char *, unsigned int,
>  		      void *, unsigned int);
> @@ -240,10 +244,19 @@ static const struct token token_list[] = {
>  		.name = "flow",
>  		.type = "{command} {port_id} [{arg} [...]]",
>  		.help = "manage ingress/egress flow rules",
> -		.next = NEXT(NEXT_ENTRY(LIST)),
> +		.next = NEXT(NEXT_ENTRY
> +			     (FLUSH,
> +			      LIST)),
>  		.call = parse_init,
>  	},
>  	/* Sub-level commands. */
> +	[FLUSH] = {
> +		.name = "flush",
> +		.help = "destroy all flow rules",
> +		.next = NEXT(NEXT_ENTRY(PORT_ID)),
> +		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
> +		.call = parse_flush,
> +	},
>  	[LIST] = {
>  		.name = "list",
>  		.help = "list existing flow rules",
> @@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token
> *token,
>  	return len;
>  }
> 
> +/** Parse tokens for flush command. */
> +static int
> +parse_flush(struct context *ctx, const struct token *token,
> +	    const char *str, unsigned int len,
> +	    void *buf, unsigned int size)
> +{
> +	struct buffer *out = buf;
> +
> +	/* Token name must match. */
> +	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
> +		return -1;
> +	/* Nothing else to do if there is no buffer. */
> +	if (!out)
> +		return len;
> +	if (!out->command) {
> +		if (ctx->curr != FLUSH)
> +			return -1;
> +		if (sizeof(*out) > size)
> +			return -1;
> +		out->command = ctx->curr;
> +		ctx->object = out;
> +	}
> +	return len;
> +}
> +
>  /** Parse tokens for list command. */
>  static int
>  parse_list(struct context *ctx, const struct token *token, @@ -698,6 +736,9
> @@ static void  cmd_flow_parsed(const struct buffer *in)  {
>  	switch (in->command) {
> +	case FLUSH:
> +		port_flow_flush(in->port);
> +		break;
>  	case LIST:
>  		port_flow_list(in->port, in->args.list.group_n,
>  			       in->args.list.group);
> --
> 2.1.4

When the user issues the flow flush command, the PMD will flush all the rules on the specified port, and the memory that the rte_flow handles point to must be freed as well.
This memory is returned by flow create; will the rte layer free it, or is the PMD responsible for freeing it?
BTW, no rte_flow argument is passed down to the PMD layer in the flush function.

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items to flow command
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-12-20  9:14           ` Pei, Yulong
  2016-12-20  9:50             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Pei, Yulong @ 2016-12-20  9:14 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Xing, Beilei

Hi Adrien,

For SCTP, support for setting 'tag' may be needed, since FDIR (--pkt-filter-mode=perfect) requires it when filtering SCTP flows.

struct sctp_hdr {
        uint16_t src_port; /**< Source port. */
        uint16_t dst_port; /**< Destin port. */
        uint32_t tag;      /**< Validation tag. */
        uint32_t cksum;    /**< Checksum. */
} __attribute__((__packed__));

testpmd> flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 dst is 192.168.0.2 / sctp
 src [TOKEN]: SCTP source port
 dst [TOKEN]: SCTP destination port
 / [TOKEN]: specify next pattern item

Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
Sent: Tuesday, December 20, 2016 1:49 AM
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items to flow command

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c2725a5..a340a75 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -132,6 +132,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -359,6 +373,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	ZERO,
 };
 
@@ -419,6 +438,40 @@ static const enum index item_ipv6[] = {
 	ZERO,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -930,6 +983,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1491,6 +1641,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
--
2.1.4
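
The three-byte case added to parse_int above stores a 24-bit value such as
a VXLAN VNI one byte at a time, most significant byte first when network
order is requested. A minimal sketch of that branch follows; the helper name
store_be24 is hypothetical and not part of testpmd:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a testpmd function): store the low 24 bits of
 * u into buf in network byte order, i.e. most significant byte first.
 */
static void
store_be24(uint8_t buf[3], uint32_t u)
{
	buf[0] = (uint8_t)(u >> 16);
	buf[1] = (uint8_t)(u >> 8);
	buf[2] = (uint8_t)u;
}
```

Because the bytes are written individually, the result does not depend on
host endianness, which is why the patch only needs the #if block for the
little-endian, non-hton case.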

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 to flow command
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-12-20  9:21           ` Pei, Yulong
  2016-12-20 10:02             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Pei, Yulong @ 2016-12-20  9:21 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Xing, Beilei

Hi Adrien,

Is it possible to support setting IPv4 TOS, IPv4 PROTO, IPv4 TTL and IPv6 TC, IPv6 next-header, IPv6 hop-limits, since the
previous FDIR for i40e already supports them?

Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
Sent: Tuesday, December 20, 2016 1:49 AM
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 to flow command

Add the ability to match basic fields from IPv4 and IPv6 headers (source and destination addresses only).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 53709fe..c2725a5 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -123,6 +126,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -348,6 +357,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	ZERO,
 };
 
@@ -394,6 +405,20 @@ static const enum index item_vlan[] = {
 	ZERO,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -439,6 +464,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -509,6 +540,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -843,6 +888,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1514,6 +1601,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
--
2.1.4
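
The IPv4 parser in this patch copies the token into a NUL-terminated
buffer, tries inet_pton(), and falls back to integer parsing when the text
is not in dotted notation. A self-contained sketch of that flow is shown
below. The function name parse_ipv4_token, the fixed 64-byte buffer and the
use of strtoul() for the fallback are assumptions made for illustration;
testpmd itself goes through parse_int() and its context structures.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse a token of len bytes (not NUL-terminated)
 * into a network-order IPv4 address. Returns 0 on success, -1 otherwise.
 */
static int
parse_ipv4_token(const char *str, size_t len, uint32_t *out_be)
{
	char tmp[64];
	struct in_addr addr;
	char *end;
	unsigned long u;

	if (len == 0 || len >= sizeof(tmp))
		return -1;
	/* The token is delimited by len only, so terminate a private copy. */
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	/* Standard dotted notation first. */
	if (inet_pton(AF_INET, tmp, &addr) == 1) {
		*out_be = addr.s_addr;
		return 0;
	}
	/* Otherwise attempt integer parsing, converted to network order. */
	if (!isdigit((unsigned char)tmp[0]))
		return -1;
	errno = 0;
	u = strtoul(tmp, &end, 0);
	if (errno || *end != '\0' || u > UINT32_MAX)
		return -1;
	*out_be = htonl((uint32_t)u);
	return 0;
}
```

With this sketch, the tokens "192.168.0.1" and "3232235521" yield the same
network-order value, while a non-numeric token is rejected.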

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-20  1:57             ` Xing, Beilei
@ 2016-12-20  9:38               ` Adrien Mazarguil
  2016-12-21  5:23                 ` Xing, Beilei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20  9:38 UTC (permalink / raw)
  To: Xing, Beilei; +Cc: dev, Pei, Yulong

On Tue, Dec 20, 2016 at 01:57:46AM +0000, Xing, Beilei wrote:
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Monday, December 19, 2016 6:20 PM
> > To: Xing, Beilei <beilei.xing@intel.com>
> > Cc: dev@dpdk.org; Pei, Yulong <yulong.pei@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> > support for rte_flow
> > 
> > Hi Beilei,
> > 
> > On Mon, Dec 19, 2016 at 08:37:20AM +0000, Xing, Beilei wrote:
> > > Hi Adrien,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> > > > Mazarguil
> > > > Sent: Saturday, December 17, 2016 12:25 AM
> > > > To: dev@dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> > > > support for rte_flow
> > > >
> > > > Add basic management functions for the generic flow API (validate,
> > > > create, destroy, flush, query and list). Flow rule objects and
> > > > properties are arranged in lists associated with each port.
> > > >
> > > > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > +/** Create flow rule. */
> > > > +int
> > > > +port_flow_create(portid_t port_id,
> > > > +		 const struct rte_flow_attr *attr,
> > > > +		 const struct rte_flow_item *pattern,
> > > > +		 const struct rte_flow_action *actions) {
> > > > +	struct rte_flow *flow;
> > > > +	struct rte_port *port;
> > > > +	struct port_flow *pf;
> > > > +	uint32_t id;
> > > > +	struct rte_flow_error error;
> > > > +
> > >
> > > I think there should be memset for error here, e.g. memset(&error, 0,
> > > sizeof(struct rte_flow_error)); Since both cause and message may be NULL
> > > regardless of the error type, if there's no error.cause and error.message
> > > returned from PMD, Segmentation fault will happen in port_flow_complain.
> > > PS: This issue doesn't happen if add "export EXTRA_CFLAGS=' -g O0'" when
> > > compiling.
> > 
> > Actually, PMDs must fill the error structure only in case of error if the
> > application provides one, it's not optional. I didn't initialize this structure for
> > this reason.
> > 
> > I suggest we initialize it with a known poisoning value for debugging purposes
> > though, to make it fail every time. Does it sound reasonable?

Done for v3 by the way.

> OK, I see. Do you want PMD to allocate the memory for cause and message of error, and must fill the cause and message if error exists, right?
> So is it possible to set NULL for pointers of cause and message in application? then PMD can judge if it's need to allocate or overlap memory.

PMDs never allocate this structure, applications do and initialize it
however they want. They only provide a non-NULL pointer if they want
additional details in case of error.

It will likely be allocated on the stack in most cases (as in testpmd).

From a PMD standpoint, the contents of this structure must be updated in
case of non-NULL pointer and error state.

The message itself can be allocated dynamically, but this is not
recommended; it is far easier to make it point to some constant
string. Applications won't free it anyway, so PMDs would have to do it
during dev_close(). Please see "Verbose error reporting" documentation
section.
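
The contract described above can be sketched as follows. This is only an
illustration under stated assumptions: struct flow_error is a hypothetical
stand-in that mirrors the type, cause and message fields of struct
rte_flow_error, the function names are made up, and the poison byte is
arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_flow_error (same three fields). */
struct flow_error {
	int type;            /* Error type. */
	const void *cause;   /* Object responsible for the error, may be NULL. */
	const char *message; /* Human-readable message, may be NULL. */
};

/*
 * Hypothetical PMD callback: fills *error only on failure, and only if
 * the application supplied a non-NULL pointer.
 */
static int
pmd_flow_validate(int supported, struct flow_error *error)
{
	if (supported)
		return 0; /* Success: error is left untouched. */
	if (error) {
		error->type = 1; /* Arbitrary non-zero type for this sketch. */
		error->cause = NULL;
		/* Constant string: nothing to free during dev_close(). */
		error->message = "pattern item not supported";
	}
	return -1;
}

/* Application side: the structure lives on the stack and is poisoned. */
static const char *
app_try_rule(int supported)
{
	struct flow_error error;

	/* Arbitrary poison byte so a PMD that forgets to fill it is caught. */
	memset(&error, 0x5a, sizeof(error));
	if (pmd_flow_validate(supported, &error) < 0)
		return error.message;
	return NULL;
}
```

The application never frees the message and the PMD never allocates the
structure, which matches the division of responsibilities given above.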

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command
  2016-12-20  7:32           ` Zhao1, Wei
@ 2016-12-20  9:45             ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20  9:45 UTC (permalink / raw)
  To: Zhao1, Wei; +Cc: dev

Hi Wei,

On Tue, Dec 20, 2016 at 07:32:29AM +0000, Zhao1, Wei wrote:
> Hi,  Adrien
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Tuesday, December 20, 2016 1:49 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush
> > command
> > 
> > Syntax:
> > 
> >  flow flush {port_id}
> > 
> > Destroy all flow rules on a port.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Acked-by: Olga Shern <olgas@mellanox.com>
> > ---
> >  app/test-pmd/cmdline.c      |  3 +++
> >  app/test-pmd/cmdline_flow.c | 43
> > +++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 45 insertions(+), 1 deletion(-)
> > 
> > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > 0dc6c63..6e2b289 100644
> > --- a/app/test-pmd/cmdline.c
> > +++ b/app/test-pmd/cmdline.c
> > @@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void
> > *parsed_result,
> >  			" (select|add)\n"
> >  			"    Set the input set for FDir.\n\n"
> > 
> > +			"flow flush {port_id}\n"
> > +			"    Destroy all flow rules.\n\n"
> > +
> >  			"flow list {port_id} [group {group_id}] [...]\n"
> >  			"    List existing flow rules sorted by priority,"
> >  			" filtered by group identifiers.\n\n"
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index bd3da38..49578eb 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -63,6 +63,7 @@ enum index {
> >  	FLOW,
> > 
> >  	/* Sub-level commands. */
> > +	FLUSH,
> >  	LIST,
> > 
> >  	/* List arguments. */
> > @@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {  static int
> > parse_init(struct context *, const struct token *,
> >  		      const char *, unsigned int,
> >  		      void *, unsigned int);
> > +static int parse_flush(struct context *, const struct token *,
> > +		       const char *, unsigned int,
> > +		       void *, unsigned int);
> >  static int parse_list(struct context *, const struct token *,
> >  		      const char *, unsigned int,
> >  		      void *, unsigned int);
> > @@ -240,10 +244,19 @@ static const struct token token_list[] = {
> >  		.name = "flow",
> >  		.type = "{command} {port_id} [{arg} [...]]",
> >  		.help = "manage ingress/egress flow rules",
> > -		.next = NEXT(NEXT_ENTRY(LIST)),
> > +		.next = NEXT(NEXT_ENTRY
> > +			     (FLUSH,
> > +			      LIST)),
> >  		.call = parse_init,
> >  	},
> >  	/* Sub-level commands. */
> > +	[FLUSH] = {
> > +		.name = "flush",
> > +		.help = "destroy all flow rules",
> > +		.next = NEXT(NEXT_ENTRY(PORT_ID)),
> > +		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
> > +		.call = parse_flush,
> > +	},
> >  	[LIST] = {
> >  		.name = "list",
> >  		.help = "list existing flow rules",
> > @@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token
> > *token,
> >  	return len;
> >  }
> > 
> > +/** Parse tokens for flush command. */
> > +static int
> > +parse_flush(struct context *ctx, const struct token *token,
> > +	    const char *str, unsigned int len,
> > +	    void *buf, unsigned int size)
> > +{
> > +	struct buffer *out = buf;
> > +
> > +	/* Token name must match. */
> > +	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
> > +		return -1;
> > +	/* Nothing else to do if there is no buffer. */
> > +	if (!out)
> > +		return len;
> > +	if (!out->command) {
> > +		if (ctx->curr != FLUSH)
> > +			return -1;
> > +		if (sizeof(*out) > size)
> > +			return -1;
> > +		out->command = ctx->curr;
> > +		ctx->object = out;
> > +	}
> > +	return len;
> > +}
> > +
> >  /** Parse tokens for list command. */
> >  static int
> >  parse_list(struct context *ctx, const struct token *token, @@ -698,6 +736,9
> > @@ static void  cmd_flow_parsed(const struct buffer *in)  {
> >  	switch (in->command) {
> > +	case FLUSH:
> > +		port_flow_flush(in->port);
> > +		break;
> >  	case LIST:
> >  		port_flow_list(in->port, in->args.list.group_n,
> >  			       in->args.list.group);
> > --
> > 2.1.4
> 
> When user  flow flush cmd, PMD will flush all the rule on the specific port, and  the memory of which rte_flow point to must be flushed.

Right.

> This memory is returned when flow create, will rte layer flush this memory or PMD is responsible for that memory flush?

All handles are considered destroyed and their memory freed, i.e. no
rte_flow object remains valid after flush. Applications still need to clean
up the memory they allocated to manage these objects, but that's their
problem.

> BTW, there is no argument about rte_flow in flush function pass into PMD layer.

Right, that's because flush does not request the destruction of a specific
rule. PMDs that allocate memory for rte_flow objects must link them together
somehow to retrieve them during a flush event.

Note this is likely already necessary to clean up the memory allocated for
flow rules during dev_close().
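
One way a PMD could link its flow objects so that a flush can find them is
a simple singly linked list in its private data. The sketch below assumes
hypothetical names (struct pmd_flow, struct pmd_priv and the pmd_flow_*
functions) and leaves out all hardware programming; it is not taken from
any existing driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical PMD-private flow object, linked for later retrieval. */
struct pmd_flow {
	struct pmd_flow *next;
	unsigned int rule; /* Placeholder for hardware rule state. */
};

/* Hypothetical per-port private data. */
struct pmd_priv {
	struct pmd_flow *flows; /* Head of the list of created flows. */
};

/* Allocate a flow object and link it at the head of the port list. */
static struct pmd_flow *
pmd_flow_create(struct pmd_priv *priv, unsigned int rule)
{
	struct pmd_flow *flow = calloc(1, sizeof(*flow));

	if (!flow)
		return NULL;
	flow->rule = rule;
	flow->next = priv->flows;
	priv->flows = flow;
	return flow;
}

/* Destroy every flow on the port; no handle is needed from the caller. */
static void
pmd_flow_flush(struct pmd_priv *priv)
{
	while (priv->flows) {
		struct pmd_flow *flow = priv->flows;

		priv->flows = flow->next;
		free(flow);
	}
}

/* Count the flows currently linked to the port. */
static unsigned int
pmd_flow_count(const struct pmd_priv *priv)
{
	const struct pmd_flow *flow;
	unsigned int n = 0;

	for (flow = priv->flows; flow; flow = flow->next)
		++n;
	return n;
}
```

The same list walk can be reused from dev_close(), and any handle the
application still holds after pmd_flow_flush() must be treated as invalid.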

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items to flow command
  2016-12-20  9:14           ` Pei, Yulong
@ 2016-12-20  9:50             ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20  9:50 UTC (permalink / raw)
  To: Pei, Yulong; +Cc: dev, Xing, Beilei

Hi Yulong,

On Tue, Dec 20, 2016 at 09:14:55AM +0000, Pei, Yulong wrote:
> Hi Adrien,
> 
> For SCTP, it may need support to set  'tag'  since in FDIR (--pkt-filter-mode=perfect) need set it when filter sctp flow.
> 
> struct sctp_hdr {
>         uint16_t src_port; /**< Source port. */
>         uint16_t dst_port; /**< Destin port. */
>         uint32_t tag;      /**< Validation tag. */
>         uint32_t cksum;    /**< Checksum. */
> } __attribute__((__packed__));
> 
> testpmd> flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 dst is 192.168.0.2 / sctp
>  src [TOKEN]: SCTP source port
>  dst [TOKEN]: SCTP destination port
>  / [TOKEN]: specify next pattern item

Sure, let's add it in a subsequent patch after this series is applied, it is
only a few lines of code in testpmd (basically like all missing protocol
fields).

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 to flow command
  2016-12-20  9:21           ` Pei, Yulong
@ 2016-12-20 10:02             ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 10:02 UTC (permalink / raw)
  To: Pei, Yulong; +Cc: dev, Xing, Beilei

Hi Yulong,

On Tue, Dec 20, 2016 at 09:21:28AM +0000, Pei, Yulong wrote:
> Hi adrien,
> 
> Is it possible to support to set ipv4 TOS, ipv4 PROTO, ipv4 TTL and ipv6 tc, ipv6 next-header, ipv6 hop-limits since 
> previous FDIR for i40e already support it.

I suggest we add them later (like for SCTP tag, it's just a bunch of new
flow tokens to manage in testpmd), it is not blocking for the API itself.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
  2016-12-19 20:44           ` Mcnamara, John
@ 2016-12-20 10:51             ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 10:51 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev, Kevin Traynor

On Mon, Dec 19, 2016 at 08:44:07PM +0000, Mcnamara, John wrote:
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Monday, December 19, 2016 5:49 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
> > 
> > Document syntax, interaction with rte_flow and provide usage examples.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > 
> > ...
> >
> > +
> > +- Check whether a flow rule can be created::
> > +
> > +   flow validate {port_id}
> > +       [group {group_id}] [priority {level}] [ingress] [egress]
> > +       pattern {item} [/ {item} [...]] / end
> > +       actions {action} [/ {action} [...]] / end
> > +
> > +- Create a flow rule::
> > +
> > +   flow create {port_id}
> > +       [group {group_id}] [priority {level}] [ingress] [egress]
> > +       pattern {item} [/ {item} [...]] / end
> > +       actions {action} [/ {action} [...]] / end
> > +
> > +- Destroy specific flow rules::
> > +
> > +   flow destroy {port_id} rule {rule_id} [...]
> > +
> > +- Destroy all flow rules::
> > +
> > +   flow flush {port_id}
> > +
> 
> Just a note:
> 
> The verbs destroy and flush don't sound right here. Create/destroy is a common
> verb pair for objects, but these actions are more like add/remove. I guess the
> names come from the underlying APIs, which possibly create/free
> objects/structures, but maybe they should be called add/remove as well.
> 
> And flush generally applies to a pipeline or a queue. The action here is closer
> to "remove all".
> 
> Probably not worth reworking at this stage if it hasn't bothered anyone else.

Well, Kevin Traynor made a similar suggestion to which I replied that the
name would be modified if enough people complained [1].

I understand your point but for some reason I keep hearing a flushing noise
every time all rules are removed at once, hence the name.

Problem is also that we now have 3 PMD series floating on the ML that depend
on the current definition. If we decided to change it, I suggest doing so in
a separate fix. A few more complaints from developers are needed before it's
too late for 17.02.

> > +underlying device in its current state but stops short of creating it.
> > +It is bound to ``rte_flow_validate()``::
> > +
> > + flow validate {port_id}
> > +     [group {group_id}] [priority {level}] [ingress] [egress]
> > +     pattern {item} [/ {item} [...]] / end
> > +     actions {action} [/ {action} [...]] / end
> > +
> 
> Here and elsewhere the indentation should be the RST standard 3 spaces,
> similar to the rest of the doc. This is only worth changing if you
> do some other revision of this doc.
> 
> Otherwise very good documentation.
> 
> Acked-by: John McNamara <john.mcnamara@intel.com>

Thanks, I'll make those changes if anything else warrants a v4.

[1] http://dpdk.org/ml/archives/dev/2016-December/050973.html

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command Adrien Mazarguil
@ 2016-12-20 16:13           ` Ferruh Yigit
  0 siblings, 0 replies; 262+ messages in thread
From: Ferruh Yigit @ 2016-12-20 16:13 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi Adrien,

On 12/19/2016 5:48 PM, Adrien Mazarguil wrote:
> Managing generic flow API functions from command line requires the use of
> dynamic tokens for convenience as flow rules are not fixed and cannot be
> defined statically.
> 
> This commit adds specific flexible parser code and object for a new "flow"
> command in separate file.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>

<...>

> +/** Initialize context. */
> +static void
> +cmd_flow_context_init(struct context *ctx)
> +{
> +	/* A full memset() is not necessary. */
> +	ctx->curr = 0;
> +	ctx->prev = 0;

It seems you have cleaned up all compiler warnings, including a bunch of
ICC e188 ones, instead of ignoring them. Thank you for your effort.

These are the only remaining ones as far as I can see; they should read:
ctx->curr = ZERO;
ctx->prev = ZERO;

Thanks,
ferruh

<...>


* Re: [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-20 16:30           ` Mcnamara, John
  0 siblings, 0 replies; 262+ messages in thread
From: Mcnamara, John @ 2016-12-20 16:30 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi Adrien,

There seems to be an issue when building the PDF version of the docs
due to the width of some of the fields in 2 of the tables. I think
this may be a Sphinx/Latex bug since it doesn't happen without the
table:: directive.

If you replace the following tables with the format below (the text
in between is omitted) it resolves the issue:

.. _table_rte_flow_migration_tunnel:

.. table:: TUNNEL conversion

   +-------------------------------------------------------+---------+
   | Pattern                                               | Actions |
   +===+==========================+==========+=============+=========+
   | 0 | ETH                      | ``spec`` | any         | QUEUE   |
   |   |                          +----------+-------------+         |
   |   |                          | ``last`` | unset       |         |
   |   |                          +----------+-------------+         |
   |   |                          | ``mask`` | any         |         |
   +---+--------------------------+----------+-------------+         |
   | 1 | IPV4, IPV6               | ``spec`` | any         |         |
   |   |                          +----------+-------------+         |
   |   |                          | ``last`` | unset       |         |
   |   |                          +----------+-------------+         |
   |   |                          | ``mask`` | any         |         |
   +---+--------------------------+----------+-------------+         |
   | 2 | ANY                      | ``spec`` | any         |         |
   |   |                          +----------+-------------+         |
   |   |                          | ``last`` | unset       |         |
   |   |                          +----------+---------+---+         |
   |   |                          | ``mask`` | ``num`` | 0 |         |
   +---+--------------------------+----------+---------+---+         |
   | 3 | VXLAN, GENEVE, TEREDO,   | ``spec`` | any         |         |
   |   | NVGRE, GRE, ...          +----------+-------------+         |
   |   |                          | ``last`` | unset       |         |
   |   |                          +----------+-------------+         |
   |   |                          | ``mask`` | any         |         |
   +---+--------------------------+----------+-------------+---------+
   | 4 | END                                               | END     |
   +---+---------------------------------------------------+---------+


<text>

.. _table_rte_flow_migration_fdir:

.. table:: FDIR conversion

   +----------------------------------------+-----------------------+
   | Pattern                                | Actions               |
   +===+===================+==========+=====+=======================+
   | 0 | ETH, RAW          | ``spec`` | any | QUEUE, DROP, PASSTHRU |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+-----------------------+
   | 1 | IPV4, IPv6        | ``spec`` | any | MARK                  |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+                       |
   | 2 | TCP, UDP, SCTP    | ``spec`` | any |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+                       |
   | 3 | VF, PF (optional) | ``spec`` | any |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+-----------------------+
   | 4 | END                                | END                   |
   +---+------------------------------------+-----------------------+




* Re: [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd flow command
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd " Adrien Mazarguil
  2016-12-19 20:44           ` Mcnamara, John
@ 2016-12-20 17:06           ` Ferruh Yigit
  1 sibling, 0 replies; 262+ messages in thread
From: Ferruh Yigit @ 2016-12-20 17:06 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

On 12/19/2016 5:49 PM, Adrien Mazarguil wrote:
> Document syntax, interaction with rte_flow and provide usage examples.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>
> ---

<...>

> +
> +Check whether redirecting any Ethernet packet received on port 0 to RX queue
> +index 6 is supported::
> +
> + testpmd> flow validate 1 ingress pattern eth / end

Small detail, but since the description mentions "port 0", the command should be:
testpmd> flow validate 0 ingress pattern eth / end ...

<...>


* [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow)
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
                           ` (24 preceding siblings ...)
  2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd " Adrien Mazarguil
@ 2016-12-20 18:42         ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 01/25] ethdev: introduce generic flow API Adrien Mazarguil
                             ` (25 more replies)
  25 siblings, 26 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague; it may be
called "Generic flow API", "Generic flow interface" (possibly shortened
as "GFI") to refer to the name of the new filter type, or "rte_flow" from
the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since v3 series:

- Fixed documentation tables that broke PDF generation
  (John McNamara / rte_flow.rst).

- Also properly aligned "Action" lines in several tables with their
  corresponding "Index" (rte_flow.rst).

- Fixed remaining ICC error #188 in testpmd (Ferruh / cmdline_flow.c).

- Indented testpmd examples properly (John / testpmd_funcs.rst).

- Fixed wrong port in example (Ferruh / testpmd_funcs.rst).

Changes since v2 series:

- Replaced ENOTSUP with ENOSYS in the code (although doing so triggers
  spurious checkpatch warnings) to tell apart unimplemented callbacks from
  unsupported flow rules and match the documented behavior.

- Fixed missing include seen by check-includes.sh in rte_flow_driver.h.

- Made clearer that PMDs must initialize rte_flow_error (if non-NULL) in
  case of error, added related memory poisoning in testpmd to catch missing
  initializations.

- Fixed rte_flow programmer's guide according to John McNamara's comments
  (tables, section headers and typos).

- Fixed deprecation notice as well.
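
To illustrate the rte_flow_error point above, here is a minimal,
self-contained sketch of the poisoning idea: fill the error structure with a
known byte pattern before calling a callback, then check whether a failing
callback left it untouched. Everything in it is hypothetical (struct
flow_error is a simplified stand-in for struct rte_flow_error, and
POISON_BYTE, good_create(), buggy_create() and call_and_check() are names
made up for this example); it is not the testpmd implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct rte_flow_error. */
struct flow_error {
	int type;            /* Cause category. */
	const void *cause;   /* Object responsible for the error, if any. */
	const char *message; /* Human-readable description, if any. */
};

#define POISON_BYTE 0x22 /* Arbitrary fill pattern chosen for this sketch. */

/* Return 1 if every byte of *err still holds the poison pattern. */
static int
error_is_untouched(const struct flow_error *err)
{
	const unsigned char *p = (const unsigned char *)err;
	size_t i;

	for (i = 0; i < sizeof(*err); ++i)
		if (p[i] != POISON_BYTE)
			return 0;
	return 1;
}

/* Hypothetical well-behaved callback: fills the error on failure. */
static int
good_create(struct flow_error *err)
{
	if (err) {
		err->type = 1;
		err->cause = NULL;
		err->message = "pattern not supported";
	}
	return -ENOTSUP;
}

/* Hypothetical buggy callback: fails without touching the error. */
static int
buggy_create(struct flow_error *err)
{
	(void)err;
	return -ENOTSUP;
}

/*
 * Poison the error, run the callback, and return 1 when the callback
 * failed but forgot to initialize the error structure, 0 otherwise.
 */
static int
call_and_check(int (*cb)(struct flow_error *))
{
	struct flow_error err;

	memset(&err, POISON_BYTE, sizeof(err));
	if (cb(&err) < 0 && error_is_untouched(&err))
		return 1;
	return 0;
}
```

With these definitions, call_and_check(good_create) reports no problem while
call_and_check(buggy_create) flags the missing initialization.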

Changes since v1 series:

- Added programmer's guide documentation for rte_flow.

- Added deprecation notice for the legacy API.

- Documented testpmd flow command.

- Fixed missing rte_flow_flush symbol in rte_ether_version.map.

- Cleaned up API documentation in rte_flow.h.

- Replaced "min/max" parameters with "num" in struct rte_flow_item_any, to
  align behavior with other item definitions.

- Fixed "type" (EtherType) size in struct rte_flow_item_eth.

- Renamed "queues" to "num" in struct rte_flow_action_rss.

- Fixed missing const in rte_flow_error_set() prototype definition.

- Fixed testpmd flow create command that did not save the rte_flow object
  pointer, causing crashes.

- Hopefully fixed all the remaining ICC/clang errors.

- Replaced testpmd flow command's "fix" token with "is" for clarity.

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32-bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect
  struct rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
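
As a rough illustration of two of the changes above (END-terminated item
lists and the spec/last/mask range), here is a minimal sketch in plain C.
The types are simplified stand-ins, not the definitions from rte_flow.h:
struct item, enum item_type, ITEM_PORT, pattern_length() and item_matches()
are all hypothetical names, values are assumed to be in host byte order, and
masking the candidate value before comparing it is an assumption of this
sketch rather than something the series states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; the real API defines many more. */
enum item_type {
	ITEM_END,  /* Terminates the list. */
	ITEM_PORT, /* Made-up item carrying a 16-bit value. */
};

/* Simplified stand-in for struct rte_flow_item. */
struct item {
	enum item_type type;
	const uint16_t *spec; /* Value to match, NULL matches anything. */
	const uint16_t *last; /* Upper bound of the range, NULL for exact match. */
	const uint16_t *mask; /* Relevant bits, NULL means all bits here. */
};

/* Count items up to, but not including, the END terminator. */
static size_t
pattern_length(const struct item pattern[])
{
	size_t n = 0;

	while (pattern[n].type != ITEM_END)
		++n;
	return n;
}

/*
 * Check a value against the inclusive interval between (spec & mask) and
 * (last & mask). All three pointers are optional.
 */
static int
item_matches(const struct item *item, uint16_t value)
{
	uint16_t mask = item->mask ? *item->mask : UINT16_MAX;
	uint16_t lo;
	uint16_t hi;

	if (!item->spec)
		return 1;
	lo = *item->spec & mask;
	hi = item->last ? (uint16_t)(*item->last & mask) : lo;
	value &= mask;
	return value >= lo && value <= hi;
}
```

Since the list carries its own terminator, no separate length needs to be
passed around, which is presumably why the wrapper structures could be
dropped.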

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
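
The wrapper/callback split described above can be sketched roughly as
follows. This is a simplified, self-contained model, not the code from
rte_flow.c: struct flow_ops, ops_get(), drv_validate(), flow_validate() and
flow_flush() are hypothetical names, ports are plain integers, and the
lookup is hard-coded instead of going through a filter_ctrl query. The one
behavior it tries to mirror is that a missing callback yields ENOSYS.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_flow_ops; callbacks are optional. */
struct flow_ops {
	int (*validate)(int port);
	int (*flush)(int port);
};

/* Hypothetical driver that implements validate only. */
static int
drv_validate(int port)
{
	(void)port;
	return 0;
}

static const struct flow_ops drv_ops = { .validate = drv_validate };

/* Hypothetical lookup: port 0 is backed by drv_ops, others have no ops. */
static const struct flow_ops *
ops_get(int port)
{
	return port == 0 ? &drv_ops : NULL;
}

/* Public wrapper: forward to the driver or report ENOSYS. */
static int
flow_validate(int port)
{
	const struct flow_ops *ops = ops_get(port);

	if (!ops || !ops->validate)
		return -ENOSYS;
	return ops->validate(port);
}

/* Same pattern for a callback this driver does not provide. */
static int
flow_flush(int port)
{
	const struct flow_ops *ops = ops_get(port);

	if (!ops || !ops->flush)
		return -ENOSYS;
	return ops->flush(port);
}
```

A caller can then tell "this driver does not implement the operation"
(-ENOSYS from the wrapper) apart from whatever error code the driver
callback itself returns for a rule it cannot handle.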

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (25):
  ethdev: introduce generic flow API
  doc: add rte_flow prog guide
  doc: announce deprecation of legacy filter types
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command
  doc: describe testpmd flow command

 MAINTAINERS                                 |    4 +
 app/test-pmd/Makefile                       |    1 +
 app/test-pmd/cmdline.c                      |   32 +
 app/test-pmd/cmdline_flow.c                 | 2575 ++++++++++++++++++++++
 app/test-pmd/config.c                       |  498 +++++
 app/test-pmd/csumonly.c                     |    1 +
 app/test-pmd/flowgen.c                      |    1 +
 app/test-pmd/icmpecho.c                     |    1 +
 app/test-pmd/ieee1588fwd.c                  |    1 +
 app/test-pmd/iofwd.c                        |    1 +
 app/test-pmd/macfwd.c                       |    1 +
 app/test-pmd/macswap.c                      |    1 +
 app/test-pmd/parameters.c                   |    1 +
 app/test-pmd/rxonly.c                       |    1 +
 app/test-pmd/testpmd.c                      |    6 +
 app/test-pmd/testpmd.h                      |   27 +
 app/test-pmd/txonly.c                       |    1 +
 doc/api/doxy-api-index.md                   |    2 +
 doc/guides/prog_guide/index.rst             |    1 +
 doc/guides/prog_guide/rte_flow.rst          | 2042 +++++++++++++++++
 doc/guides/rel_notes/deprecation.rst        |    8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  612 +++++
 lib/librte_cmdline/cmdline_parse.c          |   67 +-
 lib/librte_cmdline/cmdline_parse.h          |   21 +
 lib/librte_ether/Makefile                   |    3 +
 lib/librte_ether/rte_eth_ctrl.h             |    1 +
 lib/librte_ether/rte_ether_version.map      |   11 +
 lib/librte_ether/rte_flow.c                 |  159 ++
 lib/librte_ether/rte_flow.h                 |  947 ++++++++
 lib/librte_ether/rte_flow_driver.h          |  182 ++
 30 files changed, 7200 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 doc/guides/prog_guide/rte_flow.rst
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

-- 
2.1.4


* [dpdk-dev] [PATCH v4 01/25] ethdev: introduce generic flow API
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide Adrien Mazarguil
                             ` (24 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for; applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device; applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 MAINTAINERS                            |   4 +
 doc/api/doxy-api-index.md              |   2 +
 lib/librte_ether/Makefile              |   3 +
 lib/librte_ether/rte_eth_ctrl.h        |   1 +
 lib/librte_ether/rte_ether_version.map |  11 +
 lib/librte_ether/rte_flow.c            | 159 +++++
 lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h     | 182 ++++++
 8 files changed, 1309 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bb0b99..775b058 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
 F: scripts/test-null.sh
 
+Generic flow API
+M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de65b4c..4951552 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -39,6 +39,8 @@ There are many libraries, so their headers may be grouped by topics:
   [dev]                (@ref rte_dev.h),
   [ethdev]             (@ref rte_ethdev.h),
   [ethctrl]            (@ref rte_eth_ctrl.h),
+  [rte_flow]           (@ref rte_flow.h),
+  [rte_flow_driver]    (@ref rte_flow_driver.h),
   [cryptodev]          (@ref rte_cryptodev.h),
   [devargs]            (@ref rte_devargs.h),
   [bond]               (@ref rte_eth_bond.h),
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5
 
 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c
 
 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
 	RTE_ETH_FILTER_L2_TUNNEL,
+	RTE_ETH_FILTER_GENERIC,
 	RTE_ETH_FILTER_MAX
 };
 
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..384cdee 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,14 @@ DPDK_16.11 {
 	rte_eth_dev_pci_remove;
 
 } DPDK_16.07;
+
+DPDK_17.02 {
+	global:
+
+	rte_flow_validate;
+	rte_flow_create;
+	rte_flow_destroy;
+	rte_flow_flush;
+	rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 0000000..d98fb1b
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+/* Get generic flow operations structure from a port. */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops;
+	int code;
+
+	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
+		code = ENODEV;
+	else if (unlikely(!dev->dev_ops->filter_ctrl ||
+			  dev->dev_ops->filter_ctrl(dev,
+						    RTE_ETH_FILTER_GENERIC,
+						    RTE_ETH_FILTER_GET,
+						    &ops) ||
+			  !ops))
+		code = ENOSYS;
+	else
+		return ops;
+	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(code));
+	return NULL;
+}
+
+/* Check whether a flow rule can be created on a given port. */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error)
+{
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->validate))
+		return ops->validate(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Create a flow rule on a given port. */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return NULL;
+	if (likely(!!ops->create))
+		return ops->create(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return NULL;
+}
+
+/* Destroy a flow rule on a given port. */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->destroy))
+		return ops->destroy(dev, flow, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Destroy all flow rules associated with a port. */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->flush))
+		return ops->flush(dev, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Query an existing flow rule. */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (!ops)
+		return -rte_errno;
+	if (likely(!!ops->query))
+		return ops->query(dev, flow, action, data, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..98084ac
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,947 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority, the highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application, they
+ * do not need to be contiguous nor start from 0, however the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * At least one direction must be specified.
+ *
+ * Specifying both directions at once for a given rule is not recommended
+ * but may be valid in a few cases (e.g. shared counter).
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Pattern items fall in two categories:
+ *
+ * - Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+ *   IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+ *   specification structure. These must be stacked in the same order as the
+ *   protocol layers to match, starting from the lowest.
+ *
+ * - Matching meta-data or affecting pattern processing (END, VOID, INVERT,
+ *   PF, VF, PORT and so on), often without a specification structure. Since
+ *   they do not match packet contents, these can be specified anywhere
+ *   within item lists without affecting others.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer, a single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an 802.1Q/ad VLAN tag.
+	 *
+	 * See struct rte_flow_item_vlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VLAN,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches an SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer, a single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A zeroed mask stands for any number of layers.
+ */
+struct rte_flow_item_any {
+	uint32_t num; /**< Number of layers covered. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   VF IDs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * A zeroed mask can be used to match any VF ID.
+ */
+struct rte_flow_item_vf {
+	uint32_t id; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device, they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ *
+ * A zeroed mask can be used to match any port index.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed, doing so resets the relative
+ * offset for subsequent items.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	uint16_t type; /**< EtherType. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VLAN
+ *
+ * Matches an 802.1Q/ad VLAN tag.
+ *
+ * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
+ * RTE_FLOW_ITEM_TYPE_VLAN.
+ */
+struct rte_flow_item_vlan {
+	uint16_t tpid; /**< Tag protocol identifier. */
+	uint16_t tci; /**< Tag control information. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6.
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP.
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP.
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP.
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP.
+ *
+ * Matches an SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN.
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint8_t flags; /**< Normally 0x08 (I flag). */
+	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
+	uint8_t vni[3]; /**< VXLAN identifier. */
+	uint8_t rsvd1; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack without affecting the meaning
+ * of the resulting pattern.
+ *
+ * Patterns are terminated by END items.
+ *
+ * The spec field should be a valid pointer to a structure of the related
+ * item type. It may be set to NULL in many cases to use default values.
+ *
+ * Optionally, last can point to a structure of the same type to define an
+ * inclusive range. This is mostly supported by integer and address fields,
+ * may cause errors otherwise. Fields that do not support ranges must be set
+ * to 0 or to the same value as the corresponding fields in spec.
+ *
+ * By default all fields present in spec are considered relevant (see note
+ * below). This behavior can be altered by providing a mask structure of the
+ * same type with applicable bits set to one. It can also be used to
+ * partially filter out specific fields (e.g. as an alternate means to match
+ * ranges of IP addresses).
+ *
+ * Mask is a simple bit-mask applied before interpreting the contents of
+ * spec and last, which may yield unexpected results if not used
+ * carefully. For example, if for an IPv4 address field, spec provides
+ * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
+ * effective range becomes 10.1.0.0 to 10.3.255.255.
+ *
+ * Note: the defaults for data-matching items such as IPv4 when mask is not
+ * specified actually depend on the underlying implementation since only
+ * recognized fields can be taken into account.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *last; /**< Defines an inclusive range (spec to last). */
+	const void *mask; /**< Bit-mask applied to spec and last. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non-terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non-terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible).
+ *
+ * Only the last action of a given type is taken into account. PMDs still
+ * perform error checking on the entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32 bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE, however when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32 bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32 bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t index; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE, however when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t index; /**< Queue index to duplicate packets to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t num; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queue indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t id; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * A list of actions is terminated by an END action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs, the
+ * message points to a constant string which does not need to be freed by
+ * the application, however its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameters affecting flow rules in any way are
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * \see RTE_FLOW_ACTION_TYPE_COUNT
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
new file mode 100644
index 0000000..274562c
--- /dev/null
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -0,0 +1,182 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_DRIVER_H_
+#define RTE_FLOW_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic flow API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs, they
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_ethdev.h>
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic flow operations structure implemented and returned by PMDs.
+ *
+ * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
+ * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
+ * as the RTE_ETH_FILTER_GET filter operation.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific
+ * struct rte_flow_ops written to the argument address as described below:
+ *
+ * \code
+ *
+ * // PMD filter_ctrl callback
+ *
+ * static const struct rte_flow_ops pmd_flow_ops = { ... };
+ *
+ * switch (filter_type) {
+ * case RTE_ETH_FILTER_GENERIC:
+ *     if (filter_op != RTE_ETH_FILTER_GET)
+ *         return -EINVAL;
+ *     *(const void **)arg = &pmd_flow_ops;
+ *     return 0;
+ * }
+ *
+ * \endcode
+ *
+ * See also rte_flow_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_flow.h.
+ *
+ * Public-facing wrapper functions perform a few consistency checks so that
+ * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
+ * callbacks otherwise only differ by their first argument (with port ID
+ * already resolved to a pointer to struct rte_eth_dev).
+ */
+struct rte_flow_ops {
+	/** See rte_flow_validate(). */
+	int (*validate)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_create(). */
+	struct rte_flow *(*create)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_destroy(). */
+	int (*destroy)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 struct rte_flow_error *);
+	/** See rte_flow_flush(). */
+	int (*flush)
+		(struct rte_eth_dev *,
+		 struct rte_flow_error *);
+	/** See rte_flow_query(). */
+	int (*query)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 enum rte_flow_action_type,
+		 void *,
+		 struct rte_flow_error *);
+};
+
+/**
+ * Initialize generic flow error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param[out] error
+ *   Pointer to flow error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error types.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Pointer to flow error structure.
+ */
+static inline struct rte_flow_error *
+rte_flow_error_set(struct rte_flow_error *error,
+		   int code,
+		   enum rte_flow_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_flow_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return error;
+}
+
+/**
+ * Get generic flow operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_DRIVER_H_ */
-- 
2.1.4

* [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 01/25] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-21 10:55             ` Mcnamara, John
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
                             ` (23 subsequent siblings)
  25 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

This documentation is based on the latest RFC submission, subsequently
updated according to feedback from the community.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/prog_guide/index.rst    |    1 +
 doc/guides/prog_guide/rte_flow.rst | 2042 +++++++++++++++++++++++++++++++
 2 files changed, 2043 insertions(+)

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index e5a50a8..ed7f770 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -42,6 +42,7 @@ Programmer's Guide
     mempool_lib
     mbuf_lib
     poll_mode_drv
+    rte_flow
     cryptodev_lib
     link_bonding_poll_mode_drv_lib
     timer_lib
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
new file mode 100644
index 0000000..98c672e
--- /dev/null
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -0,0 +1,2042 @@
+..  BSD LICENSE
+    Copyright 2016 6WIND S.A.
+    Copyright 2016 Mellanox.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of 6WIND S.A. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Generic_flow_API:
+
+Generic flow API (rte_flow)
+===========================
+
+Overview
+--------
+
+This API provides a generic means to configure hardware to match specific
+ingress or egress traffic, alter its fate and query related counters
+according to any number of user-defined rules.
+
+It is named *rte_flow* after the prefix used for all its symbols, and is
+defined in ``rte_flow.h``.
+
+- Matching can be performed on packet data (protocol headers, payload) and
+  properties (e.g. associated physical port, virtual device function ID).
+
+- Possible operations include dropping traffic, diverting it to specific
+  queues, to virtual/physical device functions or ports, performing tunnel
+  offloads, adding marks and so on.
+
+It is slightly higher-level than the legacy filtering framework which it
+encompasses and supersedes (including all functions and filter types) in
+order to expose a single interface with an unambiguous behavior that is
+common to all poll-mode drivers (PMDs).
+
+Several methods to migrate existing applications are described in `API
+migration`_.
+
+Flow rule
+---------
+
+Description
+~~~~~~~~~~~
+
+A flow rule is the combination of attributes with a matching pattern and a
+list of actions. Flow rules form the basis of this API.
+
+Flow rules can have several distinct actions (such as counting,
+encapsulating, decapsulating before redirecting packets to a particular
+queue, etc.), instead of relying on several rules to achieve this and having
+applications deal with hardware implementation details regarding their
+order.
+
+Support for different priority levels on a rule basis is provided, for
+example in order to force a more specific rule to come before a more generic
+one for packets matched by both. However hardware support for more than a
+single priority level cannot be guaranteed. When supported, the number of
+available priority levels is usually low, which is why they can also be
+implemented in software by PMDs (e.g. missing priority levels may be
+emulated by reordering rules).
+
+In order to remain as hardware-agnostic as possible, by default all rules
+are considered to have the same priority, which means that the order between
+overlapping rules (when a packet is matched by several filters) is
+undefined.
+
+PMDs may refuse to create overlapping rules at a given priority level when
+they can be detected (e.g. if a pattern matches an existing filter).
+
+Thus predictable results for a given priority level can only be achieved
+with non-overlapping rules, using perfect matching on all protocol layers.
+
+Flow rules can also be grouped; the flow rule priority is specific to the
+group they belong to. All flow rules in a given group are thus processed
+either before or after those of another group.
+
+Support for multiple actions per rule may be implemented internally on top
+of non-default hardware priorities, as a result both features may not be
+simultaneously available to applications.
+
+Considering that allowed pattern/actions combinations cannot be known in
+advance and would result in an impractically large number of capabilities to
+expose, a method is provided to validate a given rule from the current
+device configuration state.
+
+This enables applications to check if the rule types they need are supported
+at initialization time, before starting their data path. This method can be
+used anytime, its only requirement being that the resources needed by a rule
+should exist (e.g. a target RX queue should be configured first).
+
+Each defined rule is associated with an opaque handle managed by the PMD;
+applications are responsible for keeping it. Handles can be used for queries
+and rule management, such as retrieving counters or other data and
+destroying rules.
+
+To avoid resource leaks on the PMD side, handles must be explicitly
+destroyed by the application before releasing associated resources such as
+queues and ports.
+
+The following sections cover:
+
+- **Attributes** (represented by ``struct rte_flow_attr``): properties of a
+  flow rule such as its direction (ingress or egress) and priority.
+
+- **Pattern item** (represented by ``struct rte_flow_item``): part of a
+  matching pattern that either matches specific packet data or traffic
+  properties. It can also describe properties of the pattern itself, such as
+  inverted matching.
+
+- **Matching pattern**: traffic properties to look for, a combination of any
+  number of items.
+
+- **Actions** (represented by ``struct rte_flow_action``): operations to
+  perform whenever a packet is matched by a pattern.
+
+Attributes
+~~~~~~~~~~
+
+Attribute: Group
+^^^^^^^^^^^^^^^^
+
+Flow rules can be grouped by assigning them a common group number. Lower
+values have higher priority. Group 0 has the highest priority.
+
+Although optional, applications are encouraged to group similar rules as
+much as possible to fully take advantage of hardware capabilities
+(e.g. optimized matching) and work around limitations (e.g. a single pattern
+type possibly allowed in a given group).
+
+Note that support for more than a single group is not guaranteed.
+
+Attribute: Priority
+^^^^^^^^^^^^^^^^^^^
+
+A priority level can be assigned to a flow rule. Like groups, lower values
+denote higher priority, with 0 as the maximum.
+
+A rule with priority 0 in group 8 is always matched after a rule with
+priority 8 in group 0.
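+
+The ordering implied by groups and priority levels can be sketched as a
+comparison function. This is only an illustration: ``struct demo_attr`` and
+``demo_attr_cmp()`` are hypothetical names that are not part of the API, and
+PMDs are free to implement the ordering differently.
+

```c
#include <stdint.h>

/* Hypothetical stand-in for the group and priority attributes. */
struct demo_attr {
	uint32_t group;    /* lower value = higher priority group */
	uint32_t priority; /* lower value = higher priority within a group */
};

/*
 * Return a negative value when rule a is considered before rule b, a
 * positive value when it is considered after, and 0 when both share the
 * same group and priority level.
 */
static int
demo_attr_cmp(const struct demo_attr *a, const struct demo_attr *b)
{
	/* Groups are compared first, whatever the priority levels are. */
	if (a->group != b->group)
		return a->group < b->group ? -1 : 1;
	/* Priority levels only order rules that belong to the same group. */
	if (a->priority != b->priority)
		return a->priority < b->priority ? -1 : 1;
	return 0;
}
```

+
+With this comparison, a rule with priority 8 in group 0 sorts before a rule
+with priority 0 in group 8. A return value of 0 corresponds to the case
+whose outcome is undefined when both rules match the same packet.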
+
+Group and priority levels are arbitrary and up to the application; they do
+not need to be contiguous or start from 0. However, the maximum number
+varies between devices and may be affected by existing flow rules.
+
+If a packet is matched by several rules of a given group for a given
+priority level, the outcome is undefined. It can take any path, may be
+duplicated or even cause unrecoverable errors.
+
+Note that support for more than a single priority level is not guaranteed.
+
+Attribute: Traffic direction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+
+Several pattern items and actions are valid and can be used in both
+directions. At least one direction must be specified.
+
+Specifying both directions at once for a given rule is not recommended but
+may be valid in a few cases (e.g. shared counters).
+
+Pattern item
+~~~~~~~~~~~~
+
+Pattern items fall in two categories:
+
+- Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+  IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+  specification structure.
+
+- Matching meta-data or affecting pattern processing (END, VOID, INVERT, PF,
+  VF, PORT and so on), often without a specification structure.
+
+Item specification structures are used to match specific values among
+protocol fields (or item properties). The documentation of each item states
+whether it is associated with one and, if so, its type name.
+
+Up to three structures of the same type can be set for a given item:
+
+- ``spec``: values to match (e.g. a given IPv4 address).
+
+- ``last``: upper bound for an inclusive range with corresponding fields in
+  ``spec``.
+
+- ``mask``: bit-mask applied to both ``spec`` and ``last`` whose purpose is
+  to distinguish the values to take into account and/or partially mask them
+  out (e.g. in order to match an IPv4 address prefix).
+
+Usage restrictions and expected behavior:
+
+- Setting either ``mask`` or ``last`` without ``spec`` is an error.
+
+- Field values in ``last`` which are either 0 or equal to the corresponding
+  values in ``spec`` are ignored; they do not generate a range. Nonzero
+  values lower than those in ``spec`` are not supported.
+
+- Setting ``spec`` and optionally ``last`` without ``mask`` causes the PMD
+  to only take the fields it can recognize into account. There is no error
+  checking for unsupported fields.
+
+- Not setting any of them (assuming item type allows it) uses default
+  parameters that depend on the item type. Most of the time, particularly
+  for protocol header items, it is equivalent to providing an empty (zeroed)
+  ``mask``.
+
+- ``mask`` is a simple bit-mask applied before interpreting the contents of
+  ``spec`` and ``last``, which may yield unexpected results if not used
+  carefully. For example, if for an IPv4 address field, ``spec`` provides
+  *10.1.2.3*, ``last`` provides *10.3.4.5* and ``mask`` provides
+  *255.255.0.0*, the effective range becomes *10.1.0.0* to *10.3.255.255*.
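+
+The range arithmetic from the last example can be reproduced with a short
+helper. This is a sketch under stated assumptions: ``struct demo_range`` and
+``demo_ipv4_range()`` are hypothetical names, addresses are handled in host
+byte order for readability, and the result only describes a contiguous
+range for prefix-style masks.
+

```c
#include <stdint.h>

/* Hypothetical inclusive range of values matched by a single field. */
struct demo_range {
	uint32_t low;
	uint32_t high;
};

/*
 * Compute the effective range matched by an IPv4 address field from its
 * spec, last and mask values (host byte order in this sketch).
 */
static struct demo_range
demo_ipv4_range(uint32_t spec, uint32_t last, uint32_t mask)
{
	struct demo_range r;

	/* A zero "last" is ignored, it does not generate a range. */
	if (last == 0)
		last = spec;
	/* The mask is applied before interpreting spec and last. */
	r.low = spec & mask;
	/* Masked-out bits can take any value, hence all ones at the top. */
	r.high = (last & mask) | ~mask;
	return r;
}
```

+
+With ``spec`` *10.1.2.3* (``0x0a010203``), ``last`` *10.3.4.5*
+(``0x0a030405``) and ``mask`` *255.255.0.0* (``0xffff0000``), the helper
+yields *10.1.0.0* to *10.3.255.255*.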
+
+Example of an item specification matching an Ethernet header:
+
+.. _table_rte_flow_pattern_item_example:
+
+.. table:: Ethernet item
+
+   +----------+----------+--------------------+
+   | Field    | Subfield | Value              |
+   +==========+==========+====================+
+   | ``spec`` | ``src``  | ``00:01:02:03:04`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:2a:66:00:01`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x22aa``         |
+   +----------+----------+--------------------+
+   | ``last`` | unspecified                   |
+   +----------+----------+--------------------+
+   | ``mask`` | ``src``  | ``00:ff:ff:ff:00`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:00:00:00:ff`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x0000``         |
+   +----------+----------+--------------------+
+
+Non-masked bits stand for any value (shown as ``?`` below), Ethernet headers
+with the following properties are thus matched:
+
+- ``src``: ``??:01:02:03:??``
+- ``dst``: ``??:??:??:??:01``
+- ``type``: ``0x????``
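+
+The comparison implied by the table above can be sketched as a byte-wise
+masked match. ``demo_masked_match()`` is a hypothetical helper, not an API
+function; the length is a parameter so that byte strings can be used exactly
+as printed in the table.
+

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compare a packet field with a spec value, only taking the bits set in
 * mask into account. Bits cleared in mask stand for any value.
 */
static bool
demo_masked_match(const uint8_t *field, const uint8_t *spec,
		  const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* XOR exposes differing bits, the mask keeps relevant ones. */
		if ((field[i] ^ spec[i]) & mask[i])
			return false;
	}
	return true;
}
```

+
+A field whose masked bytes equal those of ``spec`` matches regardless of its
+remaining bytes, and a zeroed mask such as the one used for ``type`` matches
+any value.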
+
+Matching pattern
+~~~~~~~~~~~~~~~~
+
+A pattern is formed by stacking items starting from the lowest protocol
+layer to match. This stacking restriction does not apply to meta items which
+can be placed anywhere in the stack without affecting the meaning of the
+resulting pattern.
+
+Patterns are terminated by END items.
+
+Examples:
+
+.. _table_rte_flow_tcpv4_as_l4:
+
+.. table:: TCPv4 as L4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | IPv4     |
+   +-------+----------+
+   | 2     | TCP      |
+   +-------+----------+
+   | 3     | END      |
+   +-------+----------+
+
+|
+
+.. _table_rte_flow_tcpv6_in_vxlan:
+
+.. table:: TCPv6 in VXLAN
+
+   +-------+------------+
+   | Index | Item       |
+   +=======+============+
+   | 0     | Ethernet   |
+   +-------+------------+
+   | 1     | IPv4       |
+   +-------+------------+
+   | 2     | UDP        |
+   +-------+------------+
+   | 3     | VXLAN      |
+   +-------+------------+
+   | 4     | Ethernet   |
+   +-------+------------+
+   | 5     | IPv6       |
+   +-------+------------+
+   | 6     | TCP        |
+   +-------+------------+
+   | 7     | END        |
+   +-------+------------+
+
+|
+
+.. _table_rte_flow_tcpv4_as_l4_meta:
+
+.. table:: TCPv4 as L4 with meta items
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | VOID     |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | VOID     |
+   +-------+----------+
+   | 3     | IPv4     |
+   +-------+----------+
+   | 4     | TCP      |
+   +-------+----------+
+   | 5     | VOID     |
+   +-------+----------+
+   | 6     | VOID     |
+   +-------+----------+
+   | 7     | END      |
+   +-------+----------+
+
+The above example shows how meta items do not affect packet data matching
+items, as long as those remain stacked properly. The resulting matching
+pattern is identical to "TCPv4 as L4".
+
+.. _table_rte_flow_udpv6_anywhere:
+
+.. table:: UDPv6 anywhere
+
+   +-------+------+
+   | Index | Item |
+   +=======+======+
+   | 0     | IPv6 |
+   +-------+------+
+   | 1     | UDP  |
+   +-------+------+
+   | 2     | END  |
+   +-------+------+
+
+If supported by the PMD, omitting one or several protocol layers at the
+bottom of the stack as in the above example (missing an Ethernet
+specification) enables looking up anywhere in packets.
+
+It is unspecified whether the payload of supported encapsulations
+(e.g. VXLAN payload) is matched by such a pattern, which may apply to inner,
+outer or both packets.
+
+.. _table_rte_flow_invalid_l3:
+
+.. table:: Invalid, missing L3
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | UDP      |
+   +-------+----------+
+   | 2     | END      |
+   +-------+----------+
+
+The above pattern is invalid due to a missing L3 specification between L2
+(Ethernet) and L4 (UDP). Omitting layers is only allowed at the bottom and
+at the top of the stack.
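+
+The stacking rule can be sketched as a check over a simplified item list.
+Everything below is hypothetical (the ``demo_*`` names, the reduced set of
+item types and the layer numbers), tunnels such as VXLAN are left out, and
+whether a PMD accepts a pattern with omitted bottom layers remains
+device-specific.
+

```c
#include <stdbool.h>

/* Hypothetical, reduced set of pattern item types. */
enum demo_pattern_item {
	DEMO_P_END = 0,
	DEMO_P_VOID,
	DEMO_P_ETH,
	DEMO_P_IPV4,
	DEMO_P_IPV6,
	DEMO_P_UDP,
	DEMO_P_TCP,
};

/* Return the protocol layer of an item, 0 for meta items. */
static int
demo_item_layer(enum demo_pattern_item item)
{
	switch (item) {
	case DEMO_P_ETH:
		return 2;
	case DEMO_P_IPV4:
	case DEMO_P_IPV6:
		return 3;
	case DEMO_P_UDP:
	case DEMO_P_TCP:
		return 4;
	default:
		return 0;
	}
}

/* Check that data items are stacked from lowest to highest layer. */
static bool
demo_pattern_is_stacked(const enum demo_pattern_item *pattern)
{
	int prev = 0;

	for (; *pattern != DEMO_P_END; pattern++) {
		int layer = demo_item_layer(*pattern);

		/* Meta items can be placed anywhere in the stack. */
		if (layer == 0)
			continue;
		/* Layers may only be skipped before the first data item. */
		if (prev != 0 && layer != prev + 1)
			return false;
		prev = layer;
	}
	return true;
}
```

+
+Under these assumptions, "TCPv4 as L4", its variant with meta items and
+"UDPv6 anywhere" pass the check, while Ethernet directly followed by UDP
+does not.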
+
+Meta item types
+~~~~~~~~~~~~~~~
+
+Meta items match meta-data or affect pattern processing instead of matching
+packet data directly; most of them do not need a specification structure.
+This particularity allows them to be specified anywhere in the stack without
+causing any side effect.
+
+Item: ``END``
+^^^^^^^^^^^^^
+
+End marker for item lists. Prevents further processing of items, thereby
+ending the pattern.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_end:
+
+.. table:: END
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Item: ``VOID``
+^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_void:
+
+.. table:: VOID
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+One usage example for this type is generating rules that share a common
+prefix quickly without reallocating memory, only by updating item types:
+
+.. _table_rte_flow_item_void_example:
+
+.. table:: TCP, UDP or ICMP as L4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | Ethernet           |
+   +-------+--------------------+
+   | 1     | IPv4               |
+   +-------+------+------+------+
+   | 2     | UDP  | VOID | VOID |
+   +-------+------+------+------+
+   | 3     | VOID | TCP  | VOID |
+   +-------+------+------+------+
+   | 4     | VOID | VOID | ICMP |
+   +-------+------+------+------+
+   | 5     | END                |
+   +-------+--------------------+
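+
+The table above can be expressed as one pre-allocated array in which only
+the item types of slots 2 to 4 change. The ``demo_*`` names are
+hypothetical; real items also carry ``spec``, ``last`` and ``mask``
+pointers, which are omitted from this sketch.
+

```c
/* Hypothetical, reduced set of item types for this sketch. */
enum demo_slot_type {
	DEMO_S_END = 0,
	DEMO_S_VOID,
	DEMO_S_ETH,
	DEMO_S_IPV4,
	DEMO_S_UDP,
	DEMO_S_TCP,
	DEMO_S_ICMP,
};

/* Hypothetical pattern item reduced to its type. */
struct demo_slot {
	enum demo_slot_type type;
};

/*
 * Select the L4 protocol of a six-item pattern laid out as
 * ETH / IPV4 / UDP-or-VOID / TCP-or-VOID / ICMP-or-VOID / END.
 * Only item types are updated, the array itself is never reallocated.
 */
static void
demo_select_l4(struct demo_slot pattern[6], enum demo_slot_type l4)
{
	pattern[2].type = (l4 == DEMO_S_UDP) ? DEMO_S_UDP : DEMO_S_VOID;
	pattern[3].type = (l4 == DEMO_S_TCP) ? DEMO_S_TCP : DEMO_S_VOID;
	pattern[4].type = (l4 == DEMO_S_ICMP) ? DEMO_S_ICMP : DEMO_S_VOID;
}
```

+
+Calling the helper before creating each rule yields the three columns of
+the table in turn while the Ethernet, IPv4 and END items stay in place.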
+
+Item: ``INVERT``
+^^^^^^^^^^^^^^^^
+
+Inverted matching, i.e. process packets that do not match the pattern.
+
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_invert:
+
+.. table:: INVERT
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Usage example, matching non-TCPv4 packets only:
+
+.. _table_rte_flow_item_invert_example:
+
+.. table:: Anything but TCPv4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | INVERT   |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | IPv4     |
+   +-------+----------+
+   | 3     | TCP      |
+   +-------+----------+
+   | 4     | END      |
+   +-------+----------+
+
+Item: ``PF``
+^^^^^^^^^^^^
+
+Matches packets addressed to the physical function of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: PF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if applied to a VF
+  device.
+- Can be combined with any number of `Item: VF`_ to match both PF and VF
+  traffic.
+- ``spec``, ``last`` and ``mask`` must not be set.
+
+.. _table_rte_flow_item_pf:
+
+.. table:: PF
+
+   +----------+-------+
+   | Field    | Value |
+   +==========+=======+
+   | ``spec`` | unset |
+   +----------+-------+
+   | ``last`` | unset |
+   +----------+-------+
+   | ``mask`` | unset |
+   +----------+-------+
+
+Item: ``VF``
+^^^^^^^^^^^^
+
+Matches packets addressed to a virtual function ID of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: VF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if this causes a VF
+  device to match traffic addressed to a different VF.
+- Can be specified multiple times to match traffic addressed to several VF
+  IDs.
+- Can be combined with a PF item to match both PF and VF traffic.
+
+.. _table_rte_flow_item_vf:
+
+.. table:: VF
+
+   +----------+----------+---------------------------+
+   | Field    | Subfield | Value                     |
+   +==========+==========+===========================+
+   | ``spec`` | ``id``   | destination VF ID         |
+   +----------+----------+---------------------------+
+   | ``last`` | ``id``   | upper range value         |
+   +----------+----------+---------------------------+
+   | ``mask`` | ``id``   | zeroed to match any VF ID |
+   +----------+----------+---------------------------+
+
+Item: ``PORT``
+^^^^^^^^^^^^^^
+
+Matches packets coming from the specified physical port of the underlying
+device.
+
+The first PORT item overrides the physical port normally associated with the
+specified DPDK input port (port_id). This item can be provided several times
+to match additional physical ports.
+
+Note that physical ports are not necessarily tied to DPDK input ports
+(port_id) when those are not under DPDK control. Possible values are
+specific to each device; they are not necessarily indexed from zero and may
+not be contiguous.
+
+As a device property, the list of allowed values as well as the value
+associated with a port_id should be retrieved by other means.
+
+.. _table_rte_flow_item_port:
+
+.. table:: PORT
+
+   +----------+-----------+--------------------------------+
+   | Field    | Subfield  | Value                          |
+   +==========+===========+================================+
+   | ``spec`` | ``index`` | physical port index            |
+   +----------+-----------+--------------------------------+
+   | ``last`` | ``index`` | upper range value              |
+   +----------+-----------+--------------------------------+
+   | ``mask`` | ``index`` | zeroed to match any port index |
+   +----------+-----------+--------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Most of these are basically protocol header definitions with associated
+bit-masks. They must be specified (stacked) from lowest to highest protocol
+layer to form a matching pattern.
+
+The following list is not exhaustive; new protocols will be added in the
+future.
+
+Item: ``ANY``
+^^^^^^^^^^^^^
+
+Matches any protocol in place of the current layer. A single ANY may also
+stand for several protocol layers.
+
+This is usually specified as the first pattern item when looking for a
+protocol anywhere in a packet.
+
+.. _table_rte_flow_item_any:
+
+.. table:: ANY
+
+   +----------+----------+--------------------------------------+
+   | Field    | Subfield | Value                                |
+   +==========+==========+======================================+
+   | ``spec`` | ``num``  | number of layers covered             |
+   +----------+----------+--------------------------------------+
+   | ``last`` | ``num``  | upper range value                    |
+   +----------+----------+--------------------------------------+
+   | ``mask`` | ``num``  | zeroed to cover any number of layers |
+   +----------+----------+--------------------------------------+
+
+Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
+and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
+or IPv6) matched by the second ANY specification:
+
+.. _table_rte_flow_item_any_example:
+
+.. table:: TCP in VXLAN with wildcards
+
+   +-------+------+----------+----------+-------+
+   | Index | Item | Field    | Subfield | Value |
+   +=======+======+==========+==========+=======+
+   | 0     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 1     | ANY  | ``spec`` | ``num``  | 2     |
+   +-------+------+----------+----------+-------+
+   | 2     | VXLAN                              |
+   +-------+------------------------------------+
+   | 3     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 4     | ANY  | ``spec`` | ``num``  | 1     |
+   +-------+------+----------+----------+-------+
+   | 5     | TCP                                |
+   +-------+------------------------------------+
+   | 6     | END                                |
+   +-------+------------------------------------+
+
+Item: ``RAW``
+^^^^^^^^^^^^^
+
+Matches a byte string of a given length at a given offset.
+
+Offset is either absolute (using the start of the packet) or relative to the
+end of the previous matched item in the stack, in which case negative values
+are allowed.
+
+If search is enabled, offset is used as the starting point. The search area
+can be delimited by setting limit to a nonzero value, which is the maximum
+number of bytes after offset where the pattern may start.
+
+Matching a zero-length pattern is allowed; doing so resets the relative
+offset for subsequent items.
+
+- This type does not support ranges (``last`` field).
+
+.. _table_rte_flow_item_raw:
+
+.. table:: RAW
+
+   +----------+--------------+-------------------------------------------------+
+   | Field    | Subfield     | Value                                           |
+   +==========+==============+=================================================+
+   | ``spec`` | ``relative`` | look for pattern after the previous item        |
+   |          +--------------+-------------------------------------------------+
+   |          | ``search``   | search pattern from offset (see also ``limit``) |
+   |          +--------------+-------------------------------------------------+
+   |          | ``reserved`` | reserved, must be set to zero                   |
+   |          +--------------+-------------------------------------------------+
+   |          | ``offset``   | absolute or relative offset for ``pattern``     |
+   |          +--------------+-------------------------------------------------+
+   |          | ``limit``    | search area limit for start of ``pattern``      |
+   |          +--------------+-------------------------------------------------+
+   |          | ``length``   | ``pattern`` length                              |
+   |          +--------------+-------------------------------------------------+
+   |          | ``pattern``  | byte string to look for                         |
+   +----------+--------------+-------------------------------------------------+
+   | ``last`` | if specified, either all 0 or with the same values as ``spec`` |
+   +----------+----------------------------------------------------------------+
+   | ``mask`` | bit-mask applied to ``spec`` values with usual behavior        |
+   +----------+----------------------------------------------------------------+
+
+Example pattern looking for several strings at various offsets of a UDP
+payload, using combined RAW items:
+
+.. _table_rte_flow_item_raw_example:
+
+.. table:: UDP payload matching
+
+   +-------+------+----------+--------------+-------+
+   | Index | Item | Field    | Subfield     | Value |
+   +=======+======+==========+==============+=======+
+   | 0     | Ethernet                               |
+   +-------+----------------------------------------+
+   | 1     | IPv4                                   |
+   +-------+----------------------------------------+
+   | 2     | UDP                                    |
+   +-------+------+----------+--------------+-------+
+   | 3     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 10    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "foo" |
+   +-------+------+----------+--------------+-------+
+   | 4     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 20    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "bar" |
+   +-------+------+----------+--------------+-------+
+   | 5     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | -29   |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "baz" |
+   +-------+------+----------+--------------+-------+
+   | 6     | END                                    |
+   +-------+----------------------------------------+
+
+This translates to:
+
+- Locate "foo" at least 10 bytes deep inside UDP payload.
+- Locate "bar" after "foo" plus 20 bytes.
+- Locate "baz" after "bar" minus 29 bytes.
+
+Such a packet may be represented as follows (not to scale)::
+
+ 0                     >= 10 B           == 20 B
+ |                  |<--------->|     |<--------->|
+ |                  |           |     |           |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+ | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+                          |                             |
+                          |<--------------------------->|
+                                      == 29 B
+
+Note that matching subsequent pattern items would resume after "baz", not
+"bar" since matching is always performed after the previous item of the
+stack.
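+
+The walk-through above can be reproduced with a small matcher working on a
+UDP payload. This is a simplified sketch, not how a PMD or a device
+implements RAW items: ``struct demo_raw`` and ``demo_raw_match()`` are
+hypothetical, the pattern is referenced through a pointer, ``mask`` is
+ignored and only the first occurrence found by a search is considered (no
+backtracking).
+

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RAW item specification. */
struct demo_raw {
	int relative;           /* offset is relative to the previous item */
	int search;             /* pattern may start at or after offset */
	int32_t offset;         /* absolute or relative offset */
	uint16_t limit;         /* search area limit, 0 for no limit */
	uint16_t length;        /* pattern length */
	const uint8_t *pattern; /* byte string to look for */
};

/*
 * Match one RAW item against buf. prev_end is the position following the
 * previously matched item. Return the position following the matched
 * bytes, or -1 when the pattern is not found.
 */
static long
demo_raw_match(const uint8_t *buf, size_t len, size_t prev_end,
	       const struct demo_raw *raw)
{
	long start = (raw->relative ? (long)prev_end : 0) + raw->offset;
	long last_start = start;
	long pos;

	/* When searching, the pattern may start up to limit bytes later. */
	if (raw->search)
		last_start = raw->limit ? start + raw->limit : (long)len;
	for (pos = start; pos <= last_start; pos++) {
		/* Skip positions where the pattern would not fit in buf. */
		if (pos < 0 || (size_t)pos + raw->length > len)
			continue;
		if (memcmp(buf + pos, raw->pattern, raw->length) == 0)
			return pos + raw->length;
	}
	return -1;
}
```

+
+Starting from the beginning of a UDP payload that holds "baz" at offset 7,
+"foo" at offset 10 and "bar" at offset 33, the three RAW items of the
+example match in turn and the returned positions are 13, 36 and 10.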
+
+Item: ``ETH``
+^^^^^^^^^^^^^
+
+Matches an Ethernet header.
+
+- ``dst``: destination MAC.
+- ``src``: source MAC.
+- ``type``: EtherType.
+
+Item: ``VLAN``
+^^^^^^^^^^^^^^
+
+Matches an 802.1Q/ad VLAN tag.
+
+- ``tpid``: tag protocol identifier.
+- ``tci``: tag control information.
+
+Item: ``IPV4``
+^^^^^^^^^^^^^^
+
+Matches an IPv4 header.
+
+Note: IPv4 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv4 header definition (``rte_ip.h``).
+
+Item: ``IPV6``
+^^^^^^^^^^^^^^
+
+Matches an IPv6 header.
+
+Note: IPv6 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv6 header definition (``rte_ip.h``).
+
+Item: ``ICMP``
+^^^^^^^^^^^^^^
+
+Matches an ICMP header.
+
+- ``hdr``: ICMP header definition (``rte_icmp.h``).
+
+Item: ``UDP``
+^^^^^^^^^^^^^
+
+Matches a UDP header.
+
+- ``hdr``: UDP header definition (``rte_udp.h``).
+
+Item: ``TCP``
+^^^^^^^^^^^^^
+
+Matches a TCP header.
+
+- ``hdr``: TCP header definition (``rte_tcp.h``).
+
+Item: ``SCTP``
+^^^^^^^^^^^^^^
+
+Matches an SCTP header.
+
+- ``hdr``: SCTP header definition (``rte_sctp.h``).
+
+Item: ``VXLAN``
+^^^^^^^^^^^^^^^
+
+Matches a VXLAN header (RFC 7348).
+
+- ``flags``: normally 0x08 (I flag).
+- ``rsvd0``: reserved, normally 0x000000.
+- ``vni``: VXLAN network identifier.
+- ``rsvd1``: reserved, normally 0x00.
+
+Actions
+~~~~~~~
+
+Each possible action is represented by a type. Some have associated
+configuration structures. Several actions combined in a list can be assigned
+to a flow rule. That list is not ordered.
+
+They fall in three categories:
+
+- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+  processing matched packets by subsequent flow rules, unless overridden
+  with PASSTHRU.
+
+- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
+  additional processing by subsequent flow rules.
+
+- Other non-terminating meta actions that do not affect the fate of packets
+  (END, VOID, MARK, FLAG, COUNT).
+
+When several actions are combined in a flow rule, they should all have
+different types (e.g. dropping a packet twice is not possible).
+
+Only the last action of a given type is taken into account. PMDs still
+perform error checking on the entire list.
+
+Like matching patterns, action lists are terminated by END items.
+
+*Note that PASSTHRU is the only action able to override a terminating rule.*
+
+Example of action that redirects packets to queue index 10:
+
+.. _table_rte_flow_action_example:
+
+.. table:: Queue action
+
+   +-----------+-------+
+   | Field     | Value |
+   +===========+=======+
+   | ``index`` | 10    |
+   +-----------+-------+
+
+Examples of action lists follow. Since their order is not significant,
+applications must consider all actions to be performed simultaneously:
+
+.. _table_rte_flow_count_and_drop:
+
+.. table:: Count and drop
+
+   +-------+--------+
+   | Index | Action |
+   +=======+========+
+   | 0     | COUNT  |
+   +-------+--------+
+   | 1     | DROP   |
+   +-------+--------+
+   | 2     | END    |
+   +-------+--------+
+
+|
+
+.. _table_rte_flow_mark_count_redirect:
+
+.. table:: Mark, count and redirect
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | MARK   | ``id``    | 0x2a  |
+   +-------+--------+-----------+-------+
+   | 1     | COUNT                      |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 10    |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+|
+
+.. _table_rte_flow_redirect_queue_5:
+
+.. table:: Redirect to queue 5
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | DROP                       |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+In the above example, considering both actions are performed simultaneously,
+the end result is that only QUEUE has any effect.
+
+.. _table_rte_flow_redirect_queue_3:
+
+.. table:: Redirect to queue 3
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 1     | VOID                       |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 3     |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+As previously described, only the last action of a given type found in the
+list is taken into account. The above example also shows that VOID is
+ignored.
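+
+The rule that only the last action of a given type counts can be sketched
+as a simple list walk. This is an illustration under stated assumptions:
+``enum sketch_action_type``, ``struct sketch_action`` and
+``sketch_last_queue()`` are hypothetical stand-ins and not part of the
+proposed API.
+
+.. code-block:: c
+
+   #include <stddef.h>
+
+   /* Hypothetical stand-ins for action types; END is 0 as in the text. */
+   enum sketch_action_type {
+       SKETCH_ACTION_END = 0,
+       SKETCH_ACTION_VOID,
+       SKETCH_ACTION_QUEUE,
+       SKETCH_ACTION_DROP,
+   };
+
+   struct sketch_action {
+       enum sketch_action_type type;
+       unsigned int index; /* queue index, only meaningful for QUEUE */
+   };
+
+   /*
+    * Walk an END-terminated list and return the last QUEUE action found,
+    * or NULL if there is none. VOID entries are skipped.
+    */
+   static const struct sketch_action *
+   sketch_last_queue(const struct sketch_action list[])
+   {
+       const struct sketch_action *last = NULL;
+       size_t i;
+
+       for (i = 0; list[i].type != SKETCH_ACTION_END; ++i) {
+           if (list[i].type == SKETCH_ACTION_VOID)
+               continue;
+           if (list[i].type == SKETCH_ACTION_QUEUE)
+               last = &list[i];
+       }
+       return last;
+   }
+
+Applied to a list such as "Redirect to queue 3", this walk returns the
+entry targeting queue 3.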
+
+Action types
+~~~~~~~~~~~~
+
+Common action types are described in this section. Like pattern item types,
+this list is not exhaustive as new actions will be added in the future.
+
+Action: ``END``
+^^^^^^^^^^^^^^^
+
+End marker for action lists. Prevents further processing of actions, thereby
+ending the list.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_end:
+
+.. table:: END
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VOID``
+^^^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_void:
+
+.. table:: VOID
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``PASSTHRU``
+^^^^^^^^^^^^^^^^^^^^
+
+Leaves packets up for additional processing by subsequent flow rules. This
+is the default when a rule does not contain a terminating action, but can be
+specified to force a rule to become non-terminating.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_passthru:
+
+.. table:: PASSTHRU
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Example to copy a packet to a queue and continue processing by subsequent
+flow rules:
+
+.. _table_rte_flow_action_passthru_example:
+
+.. table:: Copy to queue 8
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | PASSTHRU                   |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 8     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+Action: ``MARK``
+^^^^^^^^^^^^^^^^
+
+Attaches a 32-bit value to packets.
+
+This value is arbitrary and application-defined. For compatibility with FDIR
+it is returned in the ``hash.fdir.hi`` mbuf field. ``PKT_RX_FDIR_ID`` is
+also set in ``ol_flags``.
+
+.. _table_rte_flow_action_mark:
+
+.. table:: MARK
+
+   +--------+-------------------------------------+
+   | Field  | Value                               |
+   +========+=====================================+
+   | ``id`` | 32-bit value to return with packets |
+   +--------+-------------------------------------+
+
+Action: ``FLAG``
+^^^^^^^^^^^^^^^^
+
+Flag packets. Similar to `Action: MARK`_ but only affects ``ol_flags``.
+
+- No configurable properties.
+
+Note: a distinctive flag must be defined for it.
+
+.. _table_rte_flow_action_flag:
+
+.. table:: FLAG
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``QUEUE``
+^^^^^^^^^^^^^^^^^
+
+Assigns packets to a given queue index.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_queue:
+
+.. table:: QUEUE
+
+   +-----------+--------------------+
+   | Field     | Value              |
+   +===========+====================+
+   | ``index`` | queue index to use |
+   +-----------+--------------------+
+
+Action: ``DROP``
+^^^^^^^^^^^^^^^^
+
+Drop packets.
+
+- No configurable properties.
+- Terminating by default.
+- PASSTHRU overrides this action if both are specified.
+
+.. _table_rte_flow_action_drop:
+
+.. table:: DROP
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``COUNT``
+^^^^^^^^^^^^^^^^^
+
+Enables counters for this rule.
+
+These counters can be retrieved and reset through ``rte_flow_query()``, see
+``struct rte_flow_query_count``.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_count:
+
+.. table:: COUNT
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Query structure to retrieve and reset flow rule counters:
+
+.. _table_rte_flow_query_count:
+
+.. table:: COUNT query
+
+   +---------------+-----+-----------------------------------+
+   | Field         | I/O | Value                             |
+   +===============+=====+===================================+
+   | ``reset``     | in  | reset counter after query         |
+   +---------------+-----+-----------------------------------+
+   | ``hits_set``  | out | ``hits`` field is set             |
+   +---------------+-----+-----------------------------------+
+   | ``bytes_set`` | out | ``bytes`` field is set            |
+   +---------------+-----+-----------------------------------+
+   | ``hits``      | out | number of hits for this rule      |
+   +---------------+-----+-----------------------------------+
+   | ``bytes``     | out | number of bytes through this rule |
+   +---------------+-----+-----------------------------------+
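+
+A minimal sketch of how an application might consume this structure. The
+layout below is an assumption derived from the table above (field widths
+are guesses) and ``sketch_count_hits()`` is a hypothetical helper.
+
+.. code-block:: c
+
+   #include <stdint.h>
+
+   /* Assumed layout mirroring the COUNT query table. */
+   struct sketch_query_count {
+       uint32_t reset:1;     /* in: reset counter after query */
+       uint32_t hits_set:1;  /* out: hits field is set */
+       uint32_t bytes_set:1; /* out: bytes field is set */
+       uint64_t hits;        /* out: number of hits for this rule */
+       uint64_t bytes;       /* out: number of bytes through this rule */
+   };
+
+   /*
+    * Store the hit counter in *hits and return 0 when the PMD reported
+    * it, return -1 and leave *hits untouched otherwise.
+    */
+   static int
+   sketch_count_hits(const struct sketch_query_count *qc, uint64_t *hits)
+   {
+       if (!qc->hits_set)
+           return -1;
+       *hits = qc->hits;
+       return 0;
+   }
+
+Checking ``hits_set`` and ``bytes_set`` first matters because a device may
+only be able to report one of the two counters.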
+
+Action: ``DUP``
+^^^^^^^^^^^^^^^
+
+Duplicates packets to a given queue index.
+
+This is normally combined with QUEUE. When used alone, it is similar to
+QUEUE + PASSTHRU.
+
+- Non-terminating by default.
+
+.. _table_rte_flow_action_dup:
+
+.. table:: DUP
+
+   +-----------+------------------------------------+
+   | Field     | Value                              |
+   +===========+====================================+
+   | ``index`` | queue index to duplicate packet to |
+   +-----------+------------------------------------+
+
+Action: ``RSS``
+^^^^^^^^^^^^^^^
+
+Similar to QUEUE, except RSS is additionally performed on packets to spread
+them among several queues according to the provided parameters.
+
+Note: RSS hash result is normally stored in the ``hash.rss`` mbuf field,
+however it conflicts with `Action: MARK`_ as they share the same space. When
+both actions are specified, the RSS hash is discarded and
+``PKT_RX_RSS_HASH`` is not set in ``ol_flags``. MARK has priority. The mbuf
+structure should eventually evolve to store both.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_rss:
+
+.. table:: RSS
+
+   +--------------+------------------------------+
+   | Field        | Value                        |
+   +==============+==============================+
+   | ``rss_conf`` | RSS parameters               |
+   +--------------+------------------------------+
+   | ``num``      | number of entries in queue[] |
+   +--------------+------------------------------+
+   | ``queue[]``  | queue indices to use         |
+   +--------------+------------------------------+
+
+Action: ``PF``
+^^^^^^^^^^^^^^
+
+Redirects packets to the physical function (PF) of the current device.
+
+- No configurable properties.
+- Terminating by default.
+
+.. _table_rte_flow_action_pf:
+
+.. table:: PF
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VF``
+^^^^^^^^^^^^^^
+
+Redirects packets to a virtual function (VF) of the current device.
+
+Packets matched by a VF pattern item can be redirected to their original VF
+ID instead of the specified one. This parameter may not be available and is
+not guaranteed to work properly if the VF part is matched by a prior flow
+rule or if packets are not addressed to a VF in the first place.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_vf:
+
+.. table:: VF
+
+   +--------------+--------------------------------+
+   | Field        | Value                          |
+   +==============+================================+
+   | ``original`` | use original VF ID if possible |
+   +--------------+--------------------------------+
+   | ``vf``       | VF ID to redirect packets to   |
+   +--------------+--------------------------------+
+
+Negative types
+~~~~~~~~~~~~~~
+
+All specified pattern items (``enum rte_flow_item_type``) and actions
+(``enum rte_flow_action_type``) use positive identifiers.
+
+The negative space is reserved for dynamic types generated by PMDs during
+run-time. PMDs may encounter them as a result but must not accept negative
+identifiers they are not aware of.
+
+A method to generate them remains to be defined.
+
+Planned types
+~~~~~~~~~~~~~
+
+Pattern item types will be added as new protocols are implemented.
+
+Support for variable headers will be provided through dedicated pattern
+items. For example, items matching specific IPv4 options and IPv6 extension
+headers would be stacked after IPv4/IPv6 items.
+
+Other action types are planned but are not defined yet. These include the
+ability to alter packet data in several ways, such as performing
+encapsulation/decapsulation of tunnel headers.
+
+Rules management
+----------------
+
+A rather simple API with few functions is provided to fully manage flow
+rules.
+
+Each created flow rule is associated with an opaque, PMD-specific handle
+pointer. The application is responsible for keeping it until the rule is
+destroyed.
+
+Flow rules are represented by ``struct rte_flow`` objects.
+
+Validation
+~~~~~~~~~~
+
+Given that expressing a definite set of device capabilities is not
+practical, a dedicated function is provided to check if a flow rule is
+supported and can be created.
+
+.. code-block:: c
+
+   int
+   rte_flow_validate(uint8_t port_id,
+                     const struct rte_flow_attr *attr,
+                     const struct rte_flow_item pattern[],
+                     const struct rte_flow_action actions[],
+                     struct rte_flow_error *error);
+
+While this function has no effect on the target device, the flow rule is
+validated against its current configuration state and the returned value
+should be considered valid by the caller for that state only.
+
+The returned value is guaranteed to remain valid only as long as no
+successful calls to ``rte_flow_create()`` or ``rte_flow_destroy()`` are made
+in the meantime and no device parameter affecting flow rules in any way is
+modified, due to possible collisions or resource limitations (although in
+such cases ``EINVAL`` should not be returned).
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 if flow rule is valid and can be created, a negative errno value
+  otherwise (``rte_errno`` is also set). The following errors are defined:
+- ``-ENOSYS``: underlying device does not support this functionality.
+- ``-EINVAL``: unknown or invalid rule specification.
+- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial
+  bit-masks are unsupported).
+- ``-EEXIST``: collision with an existing rule.
+- ``-ENOMEM``: not enough resources.
+- ``-EBUSY``: action cannot be performed due to busy device resources, may
+  succeed if the affected queues or even the entire port are in a stopped
+  state (see ``rte_eth_dev_rx_queue_stop()`` and ``rte_eth_dev_stop()``).
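+
+The error codes above could be turned into diagnostics along these lines.
+This is only a sketch: ``sketch_flow_strerror()`` is a hypothetical helper,
+not part of the proposed API, and it assumes a platform where these errno
+values are all distinct.
+
+.. code-block:: c
+
+   #include <errno.h>
+
+   /* Describe a value returned by rte_flow_validate() (0 or -errno). */
+   static const char *
+   sketch_flow_strerror(int ret)
+   {
+       switch (-ret) {
+       case 0:
+           return "rule is valid and can be created";
+       case ENOSYS:
+           return "device does not support this functionality";
+       case EINVAL:
+           return "unknown or invalid rule specification";
+       case ENOTSUP:
+           return "valid but unsupported rule specification";
+       case EEXIST:
+           return "collision with an existing rule";
+       case ENOMEM:
+           return "not enough resources";
+       case EBUSY:
+           return "busy resources, stopping queues or port may help";
+       default:
+           return "unspecified error";
+       }
+   }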
+
+Creation
+~~~~~~~~
+
+Creating a flow rule is similar to validating one, except the rule is
+actually created and a handle returned.
+
+.. code-block:: c
+
+   struct rte_flow *
+   rte_flow_create(uint8_t port_id,
+                   const struct rte_flow_attr *attr,
+                   const struct rte_flow_item pattern[],
+                   const struct rte_flow_action actions[],
+                   struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+A valid handle in case of success, NULL otherwise and ``rte_errno`` is set
+to the positive version of one of the error codes defined for
+``rte_flow_validate()``.
+
+Destruction
+~~~~~~~~~~~
+
+Flow rule destruction is not automatic, and a queue or a port should not be
+released while flow rules are still attached to it. Applications must take
+care of performing this step before releasing resources.
+
+.. code-block:: c
+
+   int
+   rte_flow_destroy(uint8_t port_id,
+                    struct rte_flow *flow,
+                    struct rte_flow_error *error);
+
+
+Failure to destroy a flow rule handle may occur when other flow rules depend
+on it, and destroying it would result in an inconsistent state.
+
+This function is only guaranteed to succeed if handles are destroyed in
+reverse order of their creation.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to destroy.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
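+
+Since success is only guaranteed in reverse creation order, an application
+may keep its handles on a stack. The sketch below assumes a hypothetical
+``struct sketch_flow_stack`` and a caller-provided ``destroy`` callback
+standing in for a wrapper around ``rte_flow_destroy()``.
+
+.. code-block:: c
+
+   #include <stddef.h>
+
+   #define SKETCH_MAX_FLOWS 64
+
+   struct rte_flow; /* opaque handle, defined by the PMD */
+
+   struct sketch_flow_stack {
+       struct rte_flow *flow[SKETCH_MAX_FLOWS];
+       size_t num; /* number of handles currently stored */
+   };
+
+   /* Remember a handle; returns -1 when the stack is full. */
+   static int
+   sketch_flow_push(struct sketch_flow_stack *s, struct rte_flow *flow)
+   {
+       if (s->num == SKETCH_MAX_FLOWS)
+           return -1;
+       s->flow[s->num++] = flow;
+       return 0;
+   }
+
+   /*
+    * Destroy handles starting from the most recent one. Stops at the
+    * first failure and returns its value, returns 0 otherwise.
+    */
+   static int
+   sketch_flow_destroy_all(struct sketch_flow_stack *s,
+                           int (*destroy)(struct rte_flow *flow))
+   {
+       while (s->num) {
+           int ret = destroy(s->flow[s->num - 1]);
+
+           if (ret)
+               return ret;
+           --s->num;
+       }
+       return 0;
+   }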
+
+Flush
+~~~~~
+
+Convenience function to destroy all flow rule handles associated with a
+port. They are released as with successive calls to ``rte_flow_destroy()``.
+
+.. code-block:: c
+
+   int
+   rte_flow_flush(uint8_t port_id,
+                  struct rte_flow_error *error);
+
+In the unlikely event of failure, handles are still considered destroyed and
+no longer valid but the port must be assumed to be in an inconsistent state.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Query
+~~~~~
+
+Query an existing flow rule.
+
+This function allows retrieving flow-specific data such as counters. Data
+is gathered by special actions which must be present in the flow rule
+definition.
+
+.. code-block:: c
+
+   int
+   rte_flow_query(uint8_t port_id,
+                  struct rte_flow *flow,
+                  enum rte_flow_action_type action,
+                  void *data,
+                  struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to query.
+- ``action``: action type to query.
+- ``data``: pointer to storage for the associated query data type.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Verbose error reporting
+-----------------------
+
+The defined *errno* values may not be accurate enough for users or
+application developers who want to investigate issues related to flow rules
+management. A dedicated error object is defined for this purpose:
+
+.. code-block:: c
+
+   enum rte_flow_error_type {
+       RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+       RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+       RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+       RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+       RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+       RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+       RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+       RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+   };
+
+   struct rte_flow_error {
+       enum rte_flow_error_type type; /**< Cause field and error types. */
+       const void *cause; /**< Object responsible for the error. */
+       const char *message; /**< Human-readable error message. */
+   };
+
+Error type ``RTE_FLOW_ERROR_TYPE_NONE`` stands for no error, in which case
+remaining fields can be ignored. Other error types describe the type of the
+object pointed to by ``cause``.
+
+If non-NULL, ``cause`` points to the object responsible for the error. For a
+flow rule, this may be a pattern item or an individual action.
+
+If non-NULL, ``message`` provides a human-readable error message.
+
+This object is normally allocated by applications and set by PMDs in case of
+error. The message points to a constant string which does not need to be
+freed by the application. However, its pointer can be considered valid only
+as long as its associated DPDK port remains configured. Closing the
+underlying device or unloading the PMD invalidates it.
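+
+A sketch of how an application might log such an object. It repeats the
+definitions above without their comments in order to remain self-contained,
+and ``sketch_flow_error_str()`` is a hypothetical helper.
+
+.. code-block:: c
+
+   #include <stddef.h>
+   #include <stdio.h>
+
+   enum rte_flow_error_type {
+       RTE_FLOW_ERROR_TYPE_NONE,
+       RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+       RTE_FLOW_ERROR_TYPE_HANDLE,
+       RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
+       RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
+       RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
+       RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
+       RTE_FLOW_ERROR_TYPE_ATTR,
+       RTE_FLOW_ERROR_TYPE_ITEM_NUM,
+       RTE_FLOW_ERROR_TYPE_ITEM,
+       RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+       RTE_FLOW_ERROR_TYPE_ACTION,
+   };
+
+   struct rte_flow_error {
+       enum rte_flow_error_type type;
+       const void *cause;
+       const char *message;
+   };
+
+   /*
+    * Format an error object into buf, returns what snprintf() returns.
+    * NULL cause and message pointers are tolerated as described above.
+    */
+   static int
+   sketch_flow_error_str(char *buf, size_t len,
+                         const struct rte_flow_error *error)
+   {
+       if (error->type == RTE_FLOW_ERROR_TYPE_NONE)
+           return snprintf(buf, len, "no error");
+       return snprintf(buf, len, "error type %d, cause %s: %s",
+                       (int)error->type,
+                       error->cause ? "set" : "unset",
+                       error->message ? error->message : "(no message)");
+   }
+
+Because PMDs initialize the structure in case of error only, an application
+using such a helper would presumably set ``type`` to
+``RTE_FLOW_ERROR_TYPE_NONE`` before each call.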
+
+Caveats
+-------
+
+- DPDK does not keep track of flow rules definitions or flow rule objects
+  automatically. Applications may keep track of the former and must keep
+  track of the latter. PMDs may also do it for internal needs, however this
+  must not be relied on by applications.
+
+- Flow rules are not maintained between successive port initializations. An
+  application exiting without releasing them and restarting must re-create
+  them from scratch.
+
+- API operations are synchronous and blocking (``EAGAIN`` cannot be
+  returned).
+
+- There is no provision for reentrancy/multi-thread safety, although nothing
+  should prevent different devices from being configured at the same
+  time. PMDs may protect their control path functions accordingly.
+
+- Stopping the data path (TX/RX) should not be necessary when managing flow
+  rules. If this cannot be achieved naturally or with workarounds (such as
+  temporarily replacing the burst function pointers), an appropriate error
+  code must be returned (``EBUSY``).
+
+- PMDs, not applications, are responsible for maintaining flow rules
+  configuration when stopping and restarting a port or performing other
+  actions which may affect them. They can only be destroyed explicitly by
+  applications.
+
+For devices exposing multiple ports sharing global settings affected by flow
+rules:
+
+- All ports under DPDK control must behave consistently. PMDs are
+  responsible for making sure that existing flow rules on a port are not
+  affected by other ports.
+
+- Ports not under DPDK control (unaffected or handled by other applications)
+  are the user's responsibility. They may affect existing flow rules and cause
+  undefined behavior. PMDs aware of this may prevent flow rules creation
+  altogether in such cases.
+
+PMD interface
+-------------
+
+The PMD interface is defined in ``rte_flow_driver.h``. It is not subject to
+API/ABI versioning constraints as it is not exposed to applications and may
+evolve independently.
+
+It is currently implemented on top of the legacy filtering framework through
+filter type *RTE_ETH_FILTER_GENERIC* that accepts the single operation
+*RTE_ETH_FILTER_GET* to return PMD-specific *rte_flow* callbacks wrapped
+inside ``struct rte_flow_ops``.
+
+This overhead is temporarily necessary in order to keep compatibility with
+the legacy filtering framework, which should eventually disappear.
+
+- PMD callbacks implement exactly the interface described in `Rules
+  management`_, except for the port ID argument which has already been
+  converted to a pointer to the underlying ``struct rte_eth_dev``.
+
+- Public API functions do not process flow rules definitions at all before
+  calling PMD functions (no basic error checking, no validation
+  whatsoever). They only make sure these callbacks are non-NULL or return
+  the ``ENOSYS`` (function not supported) error.
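+
+The dispatch described above could look roughly like the following. All
+``sketch_`` names are hypothetical; the real callbacks table is ``struct
+rte_flow_ops``, whose layout is not shown in this document.
+
+.. code-block:: c
+
+   #include <errno.h>
+   #include <stddef.h>
+
+   struct sketch_dev; /* stand-in for struct rte_eth_dev */
+
+   /* Stand-in for a callbacks table reduced to a single entry. */
+   struct sketch_flow_ops {
+       int (*flush)(struct sketch_dev *dev);
+   };
+
+   /* Forward the request untouched, or fail with -ENOSYS. */
+   static int
+   sketch_flow_flush(struct sketch_dev *dev,
+                     const struct sketch_flow_ops *ops)
+   {
+       if (ops == NULL || ops->flush == NULL)
+           return -ENOSYS;
+       return ops->flush(dev);
+   }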
+
+This interface additionally defines the following helper functions:
+
+- ``rte_flow_ops_get()``: get generic flow operations structure from a
+  port.
+
+- ``rte_flow_error_set()``: initialize generic flow error structure.
+
+More will be added over time.
+
+Device compatibility
+--------------------
+
+No known implementation supports all the described features.
+
+Unsupported features or combinations are not expected to be fully emulated
+in software by PMDs for performance reasons. Partially supported features
+may be completed in software as long as hardware performs most of the work
+(such as queue redirection and packet recognition).
+
+However PMDs are expected to do their best to satisfy application requests
+by working around hardware limitations as long as doing so does not affect
+the behavior of existing flow rules.
+
+The following sections provide a few examples of such cases and describe how
+PMDs should handle them. They are based on limitations built into the
+previous APIs.
+
+Global bit-masks
+~~~~~~~~~~~~~~~~
+
+Each flow rule comes with its own, per-layer bit-masks, while hardware may
+support only a single, device-wide bit-mask for a given layer type, so that
+two IPv4 rules cannot use different bit-masks.
+
+The expected behavior in this case is that PMDs automatically configure
+global bit-masks according to the needs of the first flow rule created.
+
+Subsequent rules are allowed only if their bit-masks match those. The
+``EEXIST`` error code should be returned otherwise.
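+
+The expected PMD behavior can be sketched as follows. ``struct sketch_dev``
+and ``sketch_ipv4_mask_check()`` are hypothetical, and a single 32-bit value
+stands in for a whole per-layer bit-mask.
+
+.. code-block:: c
+
+   #include <errno.h>
+   #include <stdint.h>
+
+   /* Hypothetical per-device state: one global mask for the IPv4 layer. */
+   struct sketch_dev {
+       int ipv4_mask_set;  /* nonzero once a rule configured the mask */
+       uint32_t ipv4_mask; /* device-wide mask currently in use */
+   };
+
+   /*
+    * Called when creating a rule. The first rule configures the global
+    * mask, later rules must use the same one or are rejected.
+    */
+   static int
+   sketch_ipv4_mask_check(struct sketch_dev *dev, uint32_t rule_mask)
+   {
+       if (!dev->ipv4_mask_set) {
+           dev->ipv4_mask = rule_mask;
+           dev->ipv4_mask_set = 1;
+           return 0;
+       }
+       return dev->ipv4_mask == rule_mask ? 0 : -EEXIST;
+   }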
+
+Unsupported layer types
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Many protocols can be simulated by crafting patterns with the `Item: RAW`_
+type.
+
+PMDs can rely on this capability to simulate support for protocols with
+headers not directly recognized by hardware.
+
+``ANY`` pattern item
+~~~~~~~~~~~~~~~~~~~~
+
+This pattern item stands for anything, which can be difficult to translate
+to something hardware would understand, particularly if followed by more
+specific types.
+
+Consider the following pattern:
+
+.. _table_rte_flow_unsupported_any:
+
+.. table:: Pattern with ANY as L3
+
+   +-------+-----------------------+
+   | Index | Item                  |
+   +=======+=======================+
+   | 0     | ETH                   |
+   +-------+-----+---------+-------+
+   | 1     | ANY | ``num`` | ``1`` |
+   +-------+-----+---------+-------+
+   | 2     | TCP                   |
+   +-------+-----------------------+
+   | 3     | END                   |
+   +-------+-----------------------+
+
+Knowing that TCP does not make sense with something other than IPv4 and IPv6
+as L3, such a pattern may be translated to two flow rules instead:
+
+.. _table_rte_flow_unsupported_any_ipv4:
+
+.. table:: ANY replaced with IPV4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETH                |
+   +-------+--------------------+
+   | 1     | IPV4 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+|
+
+.. _table_rte_flow_unsupported_any_ipv6:
+
+.. table:: ANY replaced with IPV6
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETH                |
+   +-------+--------------------+
+   | 1     | IPV6 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+Note that as soon as an ANY rule covers several layers, this approach may
+yield a large number of hidden flow rules. It is thus suggested to only
+support the most common scenarios (anything as L2 and/or L3).
+
+Unsupported actions
+~~~~~~~~~~~~~~~~~~~
+
+- When combined with `Action: QUEUE`_, packet counting (`Action: COUNT`_)
+  and tagging (`Action: MARK`_ or `Action: FLAG`_) may be implemented in
+  software as long as the target queue is used by a single rule.
+
+- A rule specifying both `Action: DUP`_ + `Action: QUEUE`_ may be translated
+  to two hidden rules combining `Action: QUEUE`_ and `Action: PASSTHRU`_.
+
+- When a single target queue is provided, `Action: RSS`_ can also be
+  implemented through `Action: QUEUE`_.
+
+Flow rules priority
+~~~~~~~~~~~~~~~~~~~
+
+While it would naturally make sense, flow rules cannot be assumed to be
+processed by hardware in the same order as their creation for several
+reasons:
+
+- They may be managed internally as a tree or a hash table instead of a
+  list.
+- Removing a flow rule before adding another one can either put the new rule
+  at the end of the list or reuse a freed entry.
+- Duplication may occur when packets are matched by several rules.
+
+For overlapping rules (particularly in order to use `Action: PASSTHRU`_)
+predictable behavior is only guaranteed by using different priority levels.
+
+Priority levels are not necessarily implemented in hardware, or may be
+severely limited (e.g. a single priority bit).
+
+For these reasons, priority levels may be implemented purely in software by
+PMDs.
+
+- For devices expecting flow rules to be added in the correct order, PMDs
+  may destroy and re-create existing rules after adding a new one with
+  a higher priority.
+
+- A configurable number of dummy or empty rules can be created at
+  initialization time to save high priority slots for later.
+
+- In order to save priority levels, PMDs may evaluate whether rules are
+  likely to collide and adjust their priority accordingly.
+
+Future evolutions
+-----------------
+
+- A device profile selection function which could be used to force a
+  permanent profile instead of relying on its automatic configuration based
+  on existing flow rules.
+
+- A method to optimize *rte_flow* rules with specific pattern items and
+  action types generated on the fly by PMDs. DPDK should assign negative
+  numbers to these in order to not collide with the existing types. See
+  `Negative types`_.
+
+- Adding specific egress pattern items and actions as described in
+  `Attribute: Traffic direction`_.
+
+- Optional software fallback when PMDs are unable to handle requested flow
+  rules so applications do not have to implement their own.
+
+API migration
+-------------
+
+Exhaustive list of deprecated filter types (normally prefixed with
+*RTE_ETH_FILTER_*) found in ``rte_eth_ctrl.h`` and methods to convert them
+to *rte_flow* rules.
+
+``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*MACVLAN* can be translated to a basic `Item: ETH`_ flow rule with a
+terminating `Action: VF`_ or `Action: PF`_.
+
+.. _table_rte_flow_migration_macvlan:
+
+.. table:: MACVLAN conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | VF,     |
+   |   |     +----------+-----+ PF      |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*ETHERTYPE* is basically an `Item: ETH`_ flow rule with a terminating
+`Action: QUEUE`_ or `Action: DROP`_.
+
+.. _table_rte_flow_migration_ethertype:
+
+.. table:: ETHERTYPE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | QUEUE,  |
+   |   |     +----------+-----+ DROP    |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``FLEXIBLE`` to ``RAW`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FLEXIBLE* can be translated to one `Item: RAW`_ pattern with a terminating
+`Action: QUEUE`_ and a defined priority level.
+
+.. _table_rte_flow_migration_flexible:
+
+.. table:: FLEXIBLE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | RAW | ``spec`` | any | QUEUE   |
+   |   |     +----------+-----+         |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``SYN`` to ``TCP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*SYN* is an `Item: TCP`_ rule with only the ``syn`` bit enabled and masked,
+and a terminating `Action: QUEUE`_.
+
+Priority level can be set to simulate the high priority bit.
+
+.. _table_rte_flow_migration_syn:
+
+.. table:: SYN conversion
+
+   +-----------------------------------+---------+
+   | Pattern                           | Actions |
+   +===+======+==========+=============+=========+
+   | 0 | ETH  | ``spec`` | unset       | QUEUE   |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+-------------+---------+
+   | 1 | IPV4 | ``spec`` | unset       | END     |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+---------+---+         |
+   | 2 | TCP  | ``spec`` | ``syn`` | 1 |         |
+   |   |      +----------+---------+---+         |
+   |   |      | ``mask`` | ``syn`` | 1 |         |
+   +---+------+----------+---------+---+         |
+   | 3 | END                           |         |
+   +---+-------------------------------+---------+
+
+``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*NTUPLE* is similar to specifying an empty L2, `Item: IPV4`_ as L3 with
+`Item: TCP`_ or `Item: UDP`_ as L4 and a terminating `Action: QUEUE`_.
+
+A priority level can be specified as well.
+
+.. _table_rte_flow_migration_ntuple:
+
+.. table:: NTUPLE conversion
+
+   +-----------------------------+---------+
+   | Pattern                     | Actions |
+   +===+======+==========+=======+=========+
+   | 0 | ETH  | ``spec`` | unset | QUEUE   |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | unset |         |
+   +---+------+----------+-------+---------+
+   | 1 | IPV4 | ``spec`` | any   | END     |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+         |
+   | 2 | TCP, | ``spec`` | any   |         |
+   |   | UDP  +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+         |
+   | 3 | END                     |         |
+   +---+-------------------------+---------+
+
+``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*TUNNEL* matches common IPv4 and IPv6 L3/L4-based tunnel types.
+
+In the following table, `Item: ANY`_ is used to cover the optional L4.
+
+.. _table_rte_flow_migration_tunnel:
+
+.. table:: TUNNEL conversion
+
+   +-------------------------------------------------------+---------+
+   | Pattern                                               | Actions |
+   +===+==========================+==========+=============+=========+
+   | 0 | ETH                      | ``spec`` | any         | QUEUE   |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+---------+
+   | 1 | IPV4, IPV6               | ``spec`` | any         | END     |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+         |
+   | 2 | ANY                      | ``spec`` | any         |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+---------+---+         |
+   |   |                          | ``mask`` | ``num`` | 0 |         |
+   +---+--------------------------+----------+---------+---+         |
+   | 3 | VXLAN, GENEVE, TEREDO,   | ``spec`` | any         |         |
+   |   | NVGRE, GRE, ...          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+         |
+   | 4 | END                                               |         |
+   +---+---------------------------------------------------+---------+
+
+``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FDIR* is more complex than any other type; there are several methods to
+emulate its functionality. It is summarized for the most part in the table
+below.
+
+A few features are intentionally not supported:
+
+- The ability to configure the matching input set and masks for the entire
+  device; PMDs should take care of it automatically according to the
+  requested flow rules.
+
+  For example, if a device supports only one bit-mask per protocol type,
+  source/destination IPv4 address bit-masks can be made immutable by the
+  first created rule. Subsequent IPv4 or TCPv4 rules can only be created if
+  they are compatible.
+
+  Note that only protocol bit-masks affected by existing flow rules are
+  immutable, others can be changed later. They become mutable again after
+  the related flow rules are destroyed.
+
+- Returning four or eight bytes of matched data when using flex bytes
+  filtering. Although a specific action could implement it, it conflicts
+  with the much more useful 32-bit tagging on devices that support it.
+
+- Side effects on RSS processing of the entire device. Flow rules that
+  conflict with the current device configuration should not be
+  allowed. Similarly, device configuration should not be allowed when it
+  affects existing flow rules.
+
+- Device modes of operation. "none" is unsupported since filtering cannot be
+  disabled as long as a flow rule is present.
+
+- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
+  according to the created flow rules.
+
+- Signature mode of operation is not defined but could be handled through a
+  specific item type if needed.
+
+.. _table_rte_flow_migration_fdir:
+
+.. table:: FDIR conversion
+
+   +---------------------------------+------------+
+   | Pattern                         | Actions    |
+   +===+============+==========+=====+============+
+   | 0 | ETH,       | ``spec`` | any | QUEUE,     |
+   |   | RAW        +----------+-----+ DROP,      |
+   |   |            | ``last`` | N/A | PASSTHRU   |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+------------+
+   | 1 | IPV4,      | ``spec`` | any | MARK       |
+   |   | IPV6       +----------+-----+            |
+   |   |            | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+------------+
+   | 2 | TCP,       | ``spec`` | any | END        |
+   |   | UDP,       +----------+-----+            |
+   |   | SCTP       | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+            |
+   | 3 | VF,        | ``spec`` | any |            |
+   |   | PF         +----------+-----+            |
+   |   | (optional) | ``last`` | N/A |            |
+   |   |            +----------+-----+            |
+   |   |            | ``mask`` | any |            |
+   +---+------------+----------+-----+            |
+   | 4 | END                         |            |
+   +---+-----------------------------+------------+
+
+``HASH``
+~~~~~~~~
+
+There is no counterpart to this filter type because it translates to a
+global device setting instead of a pattern item. Device settings are
+automatically set according to the created flow rules.
+
+``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All packets are matched. This type alters incoming packets to encapsulate
+them in a chosen tunnel type and optionally redirects them to a VF as well.
+
+The destination pool for tag based forwarding can be emulated with other
+flow rules using `Action: DUP`_.
+
+.. _table_rte_flow_migration_l2tunnel:
+
+.. table:: L2_TUNNEL conversion
+
+   +---------------------------+------------+
+   | Pattern                   | Actions    |
+   +===+======+==========+=====+============+
+   | 0 | VOID | ``spec`` | N/A | VXLAN,     |
+   |   |      |          |     | GENEVE,    |
+   |   |      |          |     | ...        |
+   |   |      +----------+-----+            |
+   |   |      | ``last`` | N/A |            |
+   |   |      +----------+-----+            |
+   |   |      | ``mask`` | N/A |            |
+   |   |      |          |     |            |
+   +---+------+----------+-----+------------+
+   | 1 | END                   | VF         |
+   |   |                       | (optional) |
+   +---+                       +------------+
+   | 2 |                       | END        |
+   +---+-----------------------+------------+
-- 
2.1.4


* [dpdk-dev] [PATCH v4 03/25] doc: announce deprecation of legacy filter types
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 01/25] ethdev: introduce generic flow API Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
                             ` (22 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

They are superseded by the generic flow API (rte_flow). Target release is
not defined yet.

Suggested-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2d17bc6..1438c77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -71,3 +71,11 @@ Deprecation Notices
 * mempool: The functions for single/multi producer/consumer are deprecated
   and will be removed in 17.02.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
+
+* ethdev: the legacy filter API, including
+  ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
+  as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
+  HASH and L2_TUNNEL, is superseded by the generic flow API (rte_flow) in
+  PMDs that implement the latter.
+  Target release for removal of the legacy API will be defined once most
+  PMDs have switched to rte_flow.
-- 
2.1.4


* [dpdk-dev] [PATCH v4 04/25] cmdline: add support for dynamic tokens
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (2 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 05/25] cmdline: add alignment constraint Adrien Mazarguil
                             ` (21 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
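Illustrative sketch (not part of the patch): how a dynamic token callback
could be written against the calling convention described above. Everything
named here (struct token_hdr, DYNAMIC_TOKENS, dyn_token_cb(),
count_dynamic_tokens()) is a stand-in invented for this example so that it
compiles without librte_cmdline; only the argument order and the index
derivation are taken from the patch. The cmdline argument is typed as void *
here for the same reason.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for cmdline_parse_token_hdr_t (assumption, not the real type). */
struct token_hdr {
	const char *name;
};

/* Mirrors CMDLINE_PARSE_DYNAMIC_TOKENS from the patch. */
#define DYNAMIC_TOKENS 128

/* Hypothetical three-token command used only for this example. */
static struct token_hdr tok_flow = { "flow" };
static struct token_hdr tok_create = { "create" };
static struct token_hdr tok_port = { "port_id" };

/*
 * Hypothetical callback shaped like inst->f() in dynamic mode:
 * arg0 is the address of the slot to fill, cl is always NULL and
 * arg2 points to the list of dynamic tokens defined so far.
 */
static void
dyn_token_cb(void *arg0, void *cl, void *arg2)
{
	static struct token_hdr *const seq[] = {
		&tok_flow, &tok_create, &tok_port,
	};
	struct token_hdr **slot = arg0;
	struct token_hdr *(*tokens)[DYNAMIC_TOKENS] = arg2;
	/* Same derivation as in the header comment added by the patch. */
	ptrdiff_t index = slot - &(*tokens)[0];

	assert(cl == NULL);
	assert(index >= 0 && index < DYNAMIC_TOKENS);
	/* Storing NULL ends the list. */
	*slot = (size_t)index < sizeof(seq) / sizeof(seq[0]) ?
		seq[index] : NULL;
}

/* Fill slots lazily until the callback ends the list. */
static size_t
count_dynamic_tokens(void)
{
	struct token_hdr *tokens[DYNAMIC_TOKENS];
	size_t n = 0;

	memset(tokens, 0, sizeof(tokens));
	while (n < DYNAMIC_TOKENS - 1) {
		if (!tokens[n])
			dyn_token_cb(&tokens[n], NULL, &tokens);
		if (!tokens[n])
			break;
		++n;
	}
	return n;
}
```

Calling count_dynamic_tokens() walks the list until the callback stores
NULL, roughly as match_inst() does for an instruction whose static token
list is empty.
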
 lib/librte_cmdline/cmdline_parse.c | 60 +++++++++++++++++++++++++++++----
 lib/librte_cmdline/cmdline_parse.h | 21 ++++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+	   cmdline_parse_token_hdr_t
+		*(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 	struct cmdline_token_hdr token_hdr;
 
 	token_p = inst->tokens[token_num];
+	if (!token_p && dyn_tokens && inst->f) {
+		if (!(*dyn_tokens)[0])
+			inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+		token_p = (*dyn_tokens)[0];
+	}
 	if (token_p)
 		memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		buf += n;
 
 		token_num ++;
-		token_p = inst->tokens[token_num];
+		if (!inst->tokens[0]) {
+			if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!(*dyn_tokens)[token_num])
+					inst->f(&(*dyn_tokens)[token_num],
+						NULL,
+						dyn_tokens);
+				token_p = (*dyn_tokens)[token_num];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[token_num];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 	}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
 	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		return CMDLINE_PARSE_BAD_ARGS;
 
 	ctx = cl->ctx;
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/*
 	 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 	cmdline_parse_token_hdr_t *token_p;
 	struct cmdline_token_hdr token_hdr;
 	char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	unsigned int partial_tok_len;
 	int comp_len = -1;
 	int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 
 	debug_printf("%s called\n", __func__);
 	memset(&token_hdr, 0, sizeof(token_hdr));
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/* count the number of complete token to parse */
 	for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+			if (nb_token &&
+			    match_inst(inst, buf, nb_token, NULL, 0,
+				       &dyn_tokens))
 				goto next;
 
 			debug_printf("instruction match\n");
-			token_p = inst->tokens[nb_token];
+			if (!inst->tokens[0]) {
+				if (nb_token <
+				    (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+					if (!dyn_tokens[nb_token])
+						inst->f(&dyn_tokens[nb_token],
+							NULL,
+							&dyn_tokens);
+					token_p = dyn_tokens[nb_token];
+				} else
+					token_p = NULL;
+			} else
+				token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -490,10 +525,21 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+		if (nb_token &&
+		    match_inst(inst, buf, nb_token, NULL, 0, &dyn_tokens))
 			goto next2;
 
-		token_p = inst->tokens[nb_token];
+		if (!inst->tokens[0]) {
+			if (nb_token < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!dyn_tokens[nb_token])
+					inst->f(&dyn_tokens[nb_token],
+						NULL,
+						&dyn_tokens);
+				token_p = dyn_tokens[nb_token];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[nb_token];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index 4ac05d6..65b18d4 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -83,6 +83,9 @@ extern "C" {
 /* maximum buffer size for parsed result */
 #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
 
+/* maximum number of dynamic tokens */
+#define CMDLINE_PARSE_DYNAMIC_TOKENS 128
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -130,6 +133,24 @@ struct cmdline;
  * Store a instruction, which is a pointer to a callback function and
  * its parameter that is called when the instruction is parsed, a help
  * string, and a list of token composing this instruction.
+ *
+ * When no tokens are defined (tokens[0] == NULL), they are retrieved
+ * dynamically by calling f() as follows:
+ *
+ *  f((struct cmdline_token_hdr **)&token_hdr,
+ *    NULL,
+ *    (struct cmdline_token_hdr *[])tokens);
+ *
+ * The address of the resulting token is expected at the location pointed to
+ * by the first argument. It can be set to NULL to end the list.
+ *
+ * The cmdline argument (struct cmdline *) is always NULL.
+ *
+ * The last argument points to the NULL-terminated list of dynamic tokens
+ * defined so far. Since token_hdr points to an index of that list, the
+ * current index can be derived as follows:
+ *
+ *  int index = token_hdr - &(*tokens)[0];
  */
 struct cmdline_inst {
 	/* f(parsed_struct, data) */
-- 
2.1.4


* [dpdk-dev] [PATCH v4 05/25] cmdline: add alignment constraint
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (3 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
                             ` (20 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

This prevents SIGBUS errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
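Illustrative sketch (not part of the patch): the union trick used for the
result buffer, reduced to a standalone program fragment. union result,
RESULT_BUFSIZE and result_buf_is_aligned() are names made up for this
example; the idea shown is only that a char array sharing a union with a
long double inherits the stricter alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors CMDLINE_PARSE_RESULT_BUFSIZE. */
#define RESULT_BUFSIZE 8192

/* Same layout as the union introduced in cmdline_parse(). */
union result {
	char buf[RESULT_BUFSIZE];
	long double align; /* strong alignment constraint for buf */
};

/* Check that a stack instance of the buffer starts on a suitable boundary. */
static int
result_buf_is_aligned(void)
{
	union result result;

	return ((uintptr_t)result.buf % alignof(long double)) == 0;
}
```

A plain char array only guarantees an alignment of 1, which is why parsed
structures written into it could previously end up misaligned.
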
 lib/librte_cmdline/cmdline_parse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	union {
+		char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+		long double align; /* strong alignment constraint for buf */
+	} result;
 	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+		tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 
 	/* call func */
 	if (f) {
-		f(result_buf, cl, data);
+		f(result.buf, cl, data);
 	}
 
 	/* no match */
-- 
2.1.4


* [dpdk-dev] [PATCH v4 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (4 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 05/25] cmdline: add alignment constraint Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 07/25] app/testpmd: add flow command Adrien Mazarguil
                             ` (19 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
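Illustrative sketch (not part of the patch): the per-port list bookkeeping
in isolation. struct flow_entry, flow_list_push() and flow_list_remove() are
hypothetical stand-ins for struct port_flow and the port_flow_create() /
port_flow_destroy() logic; the PMD calls and the copied pattern/actions are
left out. It assumes, as the patch does, that the newest rule sits at the
head of the list and carries the highest ID.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct port_flow, keeping only the list fields. */
struct flow_entry {
	struct flow_entry *next;
	uint32_t id;
};

/* Push a new entry at the head; its ID is the previous head's ID plus one. */
static int
flow_list_push(struct flow_entry **head, uint32_t *id)
{
	struct flow_entry *e;
	uint32_t next_id = 0;

	if (*head) {
		if ((*head)->id == UINT32_MAX)
			return -1; /* highest rule ID already assigned */
		next_id = (*head)->id + 1;
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return -1;
	e->next = *head;
	e->id = next_id;
	*head = e;
	*id = next_id;
	return 0;
}

/* Unlink and free the entry with the given ID; return 1 if it was found. */
static int
flow_list_remove(struct flow_entry **head, uint32_t id)
{
	struct flow_entry **pp = head;

	while (*pp) {
		if ((*pp)->id == id) {
			struct flow_entry *e = *pp;

			*pp = e->next;
			free(e);
			return 1;
		}
		pp = &(*pp)->next;
	}
	return 0;
}
```

Because IDs are derived from the head entry, destroying the most recent
rule makes its ID available again, while destroying an older rule leaves a
gap that is not reused.
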
 app/test-pmd/cmdline.c     |   1 +
 app/test-pmd/config.c      | 498 ++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/csumonly.c    |   1 +
 app/test-pmd/flowgen.c     |   1 +
 app/test-pmd/icmpecho.c    |   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c       |   1 +
 app/test-pmd/macfwd.c      |   1 +
 app/test-pmd/macswap.c     |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c      |   1 +
 app/test-pmd/testpmd.c     |   6 +
 app/test-pmd/testpmd.h     |  27 +++
 app/test-pmd/txonly.c      |   1 +
 14 files changed, 542 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index d03a592..5d1c0dd 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include <rte_string_fns.h>
 #include <rte_devargs.h>
 #include <rte_eth_ctrl.h>
+#include <rte_flow.h>
 
 #include <cmdline_rdline.h>
 #include <cmdline_parse.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 8cf537d..9716ce7 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
 #include <rte_cycles.h>
+#include <rte_flow.h>
+#include <rte_errno.h>
 
 #include "testpmd.h"
 
@@ -751,6 +753,502 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
 	printf("Set MTU failed. diag=%d\n", diag);
 }
 
+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow pattern items. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_item[] = {
+	MK_FLOW_ITEM(END, 0),
+	MK_FLOW_ITEM(VOID, 0),
+	MK_FLOW_ITEM(INVERT, 0),
+	MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+	MK_FLOW_ITEM(PF, 0),
+	MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+	MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+	MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+	MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+	MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+	MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+	MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+	MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+	MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+	MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+	MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+	MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+		    size_t *size, size_t *pad)
+{
+	if (!item->spec)
+		goto empty;
+	switch (item->type) {
+		union {
+			const struct rte_flow_item_raw *raw;
+		} spec;
+
+	case RTE_FLOW_ITEM_TYPE_RAW:
+		spec.raw = item->spec;
+		*size = offsetof(struct rte_flow_item_raw, pattern) +
+			spec.raw->length * sizeof(*spec.raw->pattern);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow actions. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_action[] = {
+	MK_FLOW_ACTION(END, 0),
+	MK_FLOW_ACTION(VOID, 0),
+	MK_FLOW_ACTION(PASSTHRU, 0),
+	MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+	MK_FLOW_ACTION(FLAG, 0),
+	MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+	MK_FLOW_ACTION(DROP, 0),
+	MK_FLOW_ACTION(COUNT, 0),
+	MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+	MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+	MK_FLOW_ACTION(PF, 0),
+	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+		      size_t *size, size_t *pad)
+{
+	if (!action->conf)
+		goto empty;
+	switch (action->type) {
+		union {
+			const struct rte_flow_action_rss *rss;
+		} conf;
+
+	case RTE_FLOW_ACTION_TYPE_RSS:
+		conf.rss = action->conf;
+		*size = offsetof(struct rte_flow_action_rss, queue) +
+			conf.rss->num * sizeof(*conf.rss->queue);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate a port_flow entry from attributes/pattern/actions. */
+static struct port_flow *
+port_flow_new(const struct rte_flow_attr *attr,
+	      const struct rte_flow_item *pattern,
+	      const struct rte_flow_action *actions)
+{
+	const struct rte_flow_item *item;
+	const struct rte_flow_action *action;
+	struct port_flow *pf = NULL;
+	size_t tmp;
+	size_t pad;
+	size_t off1 = 0;
+	size_t off2 = 0;
+	int err = ENOTSUP;
+
+store:
+	item = pattern;
+	if (pf)
+		pf->pattern = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_item *dst = NULL;
+
+		if ((unsigned int)item->type >= RTE_DIM(flow_item) ||
+		    !flow_item[item->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, item, sizeof(*item));
+		off1 += sizeof(*item);
+		flow_item_spec_size(item, &tmp, &pad);
+		if (item->spec) {
+			if (pf)
+				dst->spec = memcpy(pf->data + off2,
+						   item->spec, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->last) {
+			if (pf)
+				dst->last = memcpy(pf->data + off2,
+						   item->last, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->mask) {
+			if (pf)
+				dst->mask = memcpy(pf->data + off2,
+						   item->mask, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((item++)->type != RTE_FLOW_ITEM_TYPE_END);
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	action = actions;
+	if (pf)
+		pf->actions = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_action *dst = NULL;
+
+		if ((unsigned int)action->type >= RTE_DIM(flow_action) ||
+		    !flow_action[action->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, action, sizeof(*action));
+		off1 += sizeof(*action);
+		flow_action_conf_size(action, &tmp, &pad);
+		if (action->conf) {
+			if (pf)
+				dst->conf = memcpy(pf->data + off2,
+						   action->conf, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((action++)->type != RTE_FLOW_ACTION_TYPE_END);
+	if (pf != NULL)
+		return pf;
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	tmp = RTE_ALIGN_CEIL(offsetof(struct port_flow, data), sizeof(double));
+	pf = calloc(1, tmp + off1 + off2);
+	if (pf == NULL)
+		err = errno;
+	else {
+		*pf = (const struct port_flow){
+			.size = tmp + off1 + off2,
+			.attr = *attr,
+		};
+		tmp -= offsetof(struct port_flow, data);
+		off2 = tmp + off1;
+		off1 = tmp;
+		goto store;
+	}
+notsup:
+	rte_errno = err;
+	return NULL;
+}
+
+/** Print a message out of a flow error. */
+static int
+port_flow_complain(struct rte_flow_error *error)
+{
+	static const char *const errstrlist[] = {
+		[RTE_FLOW_ERROR_TYPE_NONE] = "no error",
+		[RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
+		[RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
+		[RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
+		[RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
+		[RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
+		[RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
+		[RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
+	};
+	const char *errstr;
+	char buf[32];
+	int err = rte_errno;
+
+	if ((unsigned int)error->type >= RTE_DIM(errstrlist) ||
+	    !errstrlist[error->type])
+		errstr = "unknown type";
+	else
+		errstr = errstrlist[error->type];
+	printf("Caught error type %d (%s): %s%s\n",
+	       error->type, errstr,
+	       error->cause ? (snprintf(buf, sizeof(buf), "cause: %p, ",
+					error->cause), buf) : "",
+	       error->message ? error->message : "(no stated reason)");
+	return -err;
+}
+
+/** Validate flow rule. */
+int
+port_flow_validate(portid_t port_id,
+		   const struct rte_flow_attr *attr,
+		   const struct rte_flow_item *pattern,
+		   const struct rte_flow_action *actions)
+{
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x11, sizeof(error));
+	if (rte_flow_validate(port_id, attr, pattern, actions, &error))
+		return port_flow_complain(&error);
+	printf("Flow rule validated\n");
+	return 0;
+}
+
+/** Create flow rule. */
+int
+port_flow_create(portid_t port_id,
+		 const struct rte_flow_attr *attr,
+		 const struct rte_flow_item *pattern,
+		 const struct rte_flow_action *actions)
+{
+	struct rte_flow *flow;
+	struct rte_port *port;
+	struct port_flow *pf;
+	uint32_t id;
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x22, sizeof(error));
+	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
+	if (!flow)
+		return port_flow_complain(&error);
+	port = &ports[port_id];
+	if (port->flow_list) {
+		if (port->flow_list->id == UINT32_MAX) {
+			printf("Highest rule ID is already assigned, delete"
+			       " it first\n");
+			rte_flow_destroy(port_id, flow, NULL);
+			return -ENOMEM;
+		}
+		id = port->flow_list->id + 1;
+	} else
+		id = 0;
+	pf = port_flow_new(attr, pattern, actions);
+	if (!pf) {
+		int err = rte_errno;
+
+		printf("Cannot allocate flow: %s\n", rte_strerror(err));
+		rte_flow_destroy(port_id, flow, NULL);
+		return -err;
+	}
+	pf->next = port->flow_list;
+	pf->id = id;
+	pf->flow = flow;
+	port->flow_list = pf;
+	printf("Flow rule #%u created\n", pf->id);
+	return 0;
+}
+
+/** Destroy a number of flow rules. */
+int
+port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule)
+{
+	struct rte_port *port;
+	struct port_flow **tmp;
+	uint32_t c = 0;
+	int ret = 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	tmp = &port->flow_list;
+	while (*tmp) {
+		uint32_t i;
+
+		for (i = 0; i != n; ++i) {
+			struct rte_flow_error error;
+			struct port_flow *pf = *tmp;
+
+			if (rule[i] != pf->id)
+				continue;
+			/*
+			 * Poisoning to make sure PMDs update it in case
+			 * of error.
+			 */
+			memset(&error, 0x33, sizeof(error));
+			if (rte_flow_destroy(port_id, pf->flow, &error)) {
+				ret = port_flow_complain(&error);
+				continue;
+			}
+			printf("Flow rule #%u destroyed\n", pf->id);
+			*tmp = pf->next;
+			free(pf);
+			break;
+		}
+		if (i == n)
+			tmp = &(*tmp)->next;
+		++c;
+	}
+	return ret;
+}
+
+/** Remove all flow rules. */
+int
+port_flow_flush(portid_t port_id)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	int ret = 0;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x44, sizeof(error));
+	if (rte_flow_flush(port_id, &error)) {
+		ret = port_flow_complain(&error);
+		if (port_id_is_invalid(port_id, DISABLED_WARN) ||
+		    port_id == (portid_t)RTE_PORT_ALL)
+			return ret;
+	}
+	port = &ports[port_id];
+	while (port->flow_list) {
+		struct port_flow *pf = port->flow_list->next;
+
+		free(port->flow_list);
+		port->flow_list = pf;
+	}
+	return ret;
+}
+
+/** Query a flow rule. */
+int
+port_flow_query(portid_t port_id, uint32_t rule,
+		enum rte_flow_action_type action)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	struct port_flow *pf;
+	const char *name;
+	union {
+		struct rte_flow_query_count count;
+	} query;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	for (pf = port->flow_list; pf; pf = pf->next)
+		if (pf->id == rule)
+			break;
+	if (!pf) {
+		printf("Flow rule #%u not found\n", rule);
+		return -ENOENT;
+	}
+	if ((unsigned int)action >= RTE_DIM(flow_action) ||
+	    !flow_action[action].name)
+		name = "unknown";
+	else
+		name = flow_action[action].name;
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		break;
+	default:
+		printf("Cannot query action type %d (%s)\n", action, name);
+		return -ENOTSUP;
+	}
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x55, sizeof(error));
+	memset(&query, 0, sizeof(query));
+	if (rte_flow_query(port_id, pf->flow, action, &query, &error))
+		return port_flow_complain(&error);
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		printf("%s:\n"
+		       " hits_set: %u\n"
+		       " bytes_set: %u\n"
+		       " hits: %" PRIu64 "\n"
+		       " bytes: %" PRIu64 "\n",
+		       name,
+		       query.count.hits_set,
+		       query.count.bytes_set,
+		       query.count.hits,
+		       query.count.bytes);
+		break;
+	default:
+		printf("Cannot display result for action type %d (%s)\n",
+		       action, name);
+		break;
+	}
+	return 0;
+}
+
+/** List flow rules. */
+void
+port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n])
+{
+	struct rte_port *port;
+	struct port_flow *pf;
+	struct port_flow *list = NULL;
+	uint32_t i;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return;
+	port = &ports[port_id];
+	if (!port->flow_list)
+		return;
+	/* Sort flows by group, priority and ID. */
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		struct port_flow **tmp;
+
+		if (n) {
+			/* Filter out unwanted groups. */
+			for (i = 0; i != n; ++i)
+				if (pf->attr.group == group[i])
+					break;
+			if (i == n)
+				continue;
+		}
+		tmp = &list;
+		while (*tmp &&
+		       (pf->attr.group > (*tmp)->attr.group ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority > (*tmp)->attr.priority) ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority == (*tmp)->attr.priority &&
+			 pf->id > (*tmp)->id)))
+			tmp = &(*tmp)->tmp;
+		pf->tmp = *tmp;
+		*tmp = pf;
+	}
+	printf("ID\tGroup\tPrio\tAttr\tRule\n");
+	for (pf = list; pf != NULL; pf = pf->tmp) {
+		const struct rte_flow_item *item = pf->pattern;
+		const struct rte_flow_action *action = pf->actions;
+
+		printf("%" PRIu32 "\t%" PRIu32 "\t%" PRIu32 "\t%c%c\t",
+		       pf->id,
+		       pf->attr.group,
+		       pf->attr.priority,
+		       pf->attr.ingress ? 'i' : '-',
+		       pf->attr.egress ? 'e' : '-');
+		while (item->type != RTE_FLOW_ITEM_TYPE_END) {
+			if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
+				printf("%s ", flow_item[item->type].name);
+			++item;
+		}
+		printf("=>");
+		while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+			if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
+				printf(" %s", flow_action[action->type].name);
+			++action;
+		}
+		printf("\n");
+	}
+}
+
 /*
  * RX/TX ring descriptors display functions.
  */
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..dd67ebf 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_sctp.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index b13ff89..13b4f90 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 6a4e750..f25a8f5 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include <rte_ip.h>
 #include <rte_icmp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
index 0d3b37a..51170ee 100644
--- a/app/test-pmd/ieee1588fwd.c
+++ b/app/test-pmd/ieee1588fwd.c
@@ -34,6 +34,7 @@
 
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 26936b7..15cb4a2 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -64,6 +64,7 @@
 #include <rte_ether.h>
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index 86e01de..d361db1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 36e139f..f996039 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 08e5a76..28db8cd 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index fff815c..cf00576 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -67,6 +67,7 @@
 #include <rte_ip.h>
 #include <rte_udp.h>
 #include <rte_net.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..bfb2f8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
@@ -1545,6 +1546,8 @@ close_port(portid_t pid)
 			continue;
 		}
 
+		if (port->flow_list)
+			port_flow_flush(pi);
 		rte_eth_dev_close(pi);
 
 		if (rte_atomic16_cmpset(&(port->port_status),
@@ -1599,6 +1602,9 @@ detach_port(uint8_t port_id)
 		return;
 	}
 
+	if (ports[port_id].flow_list)
+		port_flow_flush(port_id);
+
 	if (rte_eth_dev_detach(port_id, name))
 		return;
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..22ce2d6 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,19 @@ struct fwd_stream {
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
 
+/** Descriptor for a single flow. */
+struct port_flow {
+	size_t size; /**< Allocated space including data[]. */
+	struct port_flow *next; /**< Next flow in list. */
+	struct port_flow *tmp; /**< Temporary linking. */
+	uint32_t id; /**< Flow rule ID. */
+	struct rte_flow *flow; /**< Opaque flow object returned by PMD. */
+	struct rte_flow_attr attr; /**< Attributes. */
+	struct rte_flow_item *pattern; /**< Pattern. */
+	struct rte_flow_action *actions; /**< Actions. */
+	uint8_t data[]; /**< Storage for pattern/actions. */
+};
+
 /**
  * The data structure associated with each port.
  */
@@ -177,6 +190,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	struct port_flow        *flow_list; /**< Associated flows. */
 };
 
 extern portid_t __rte_unused
@@ -504,6 +518,19 @@ void port_reg_bit_field_set(portid_t port_id, uint32_t reg_off,
 			    uint8_t bit1_pos, uint8_t bit2_pos, uint32_t value);
 void port_reg_display(portid_t port_id, uint32_t reg_off);
 void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value);
+int port_flow_validate(portid_t port_id,
+		       const struct rte_flow_attr *attr,
+		       const struct rte_flow_item *pattern,
+		       const struct rte_flow_action *actions);
+int port_flow_create(portid_t port_id,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item *pattern,
+		     const struct rte_flow_action *actions);
+int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule);
+int port_flow_flush(portid_t port_id);
+int port_flow_query(portid_t port_id, uint32_t rule,
+		    enum rte_flow_action_type action);
+void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group);
 
 void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id);
 void tx_ring_desc_display(portid_t port_id, queueid_t txq_id, uint16_t txd_id);
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 8513a06..e996f35 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
-- 
2.1.4


* [dpdk-dev] [PATCH v4 07/25] app/testpmd: add flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (5 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
                             ` (18 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Managing generic flow API functions from the command line requires dynamic
tokens, as flow rules are not fixed and cannot be defined statically.

This commit adds dedicated, flexible parser code and a parser object for a
new "flow" command in a separate file.
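
As a rough illustration of the idea (not the code from this patch), the
sketch below shows how a stack of candidate token lists can drive parsing
when the grammar is only known at run time. All sketch_* names and the
"list" and "flush" keywords are assumptions made up for the example; only
"flow" comes from the patch. Unlike parse_default(), the sketch requires an
exact keyword match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token indices; S_ZERO doubles as the list terminator. */
enum sketch_index { S_ZERO = 0, S_FLOW, S_LIST, S_FLUSH };

#define SKETCH_STACK_SIZE 16

/* A token is a keyword plus the list of tokens allowed after it. */
struct sketch_token {
	const char *name;
	const enum sketch_index *next; /* S_ZERO-terminated, or NULL. */
};

static const enum sketch_index sketch_after_zero[] = { S_FLOW, S_ZERO };
static const enum sketch_index sketch_after_flow[] = {
	S_LIST, S_FLUSH, S_ZERO,
};

static const struct sketch_token sketch_tokens[] = {
	[S_ZERO] = { .name = "ZERO", .next = sketch_after_zero },
	[S_FLOW] = { .name = "flow", .next = sketch_after_flow },
	[S_LIST] = { .name = "list" },
	[S_FLUSH] = { .name = "flush" },
};

struct sketch_context {
	const enum sketch_index *next[SKETCH_STACK_SIZE];
	int next_num;
	enum sketch_index curr;
};

/* Start from the entry point and push its candidate list. */
static void
sketch_init(struct sketch_context *ctx)
{
	ctx->curr = S_ZERO;
	ctx->next_num = 0;
	ctx->next[ctx->next_num++] = sketch_tokens[S_ZERO].next;
}

/* Match one word against the list on top of the stack. */
static int
sketch_parse(struct sketch_context *ctx, const char *word)
{
	const enum sketch_index *list;
	size_t len = strlen(word);
	int i;

	if (!ctx->next_num)
		return -1; /* Nothing more is expected. */
	list = ctx->next[ctx->next_num - 1];
	for (i = 0; list[i]; ++i) {
		const char *name = sketch_tokens[list[i]].name;

		if (strlen(name) == len && !strncmp(word, name, len))
			break;
	}
	if (!list[i])
		return -1; /* No candidate matched, stack is left as is. */
	ctx->curr = list[i];
	--ctx->next_num; /* Consume the list the match came from. */
	if (sketch_tokens[ctx->curr].next) {
		if (ctx->next_num == SKETCH_STACK_SIZE)
			return -1;
		ctx->next[ctx->next_num++] = sketch_tokens[ctx->curr].next;
	}
	return (int)len;
}
```

Each successful match consumes the list it was found in and pushes the
list allowed after the matched token, so the set of valid words changes as
the command line is read. An empty stack means the command is complete.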

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/Makefile       |   1 +
 app/test-pmd/cmdline.c      |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 5d1c0dd..b124412 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9567,6 +9567,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
 	},
 };
 
+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11605,6 +11608,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
 	(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
 	(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+	(cmdline_parse_inst_t *)&cmd_flow,
 	(cmdline_parse_inst_t *)&cmd_mcast_addr,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 0000000..f5aef0f
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+	/* Special tokens. */
+	ZERO = 0,
+	END,
+
+	/* Top-level command. */
+	FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+	/** Stack of subsequent token lists to process. */
+	const enum index *next[CTX_STACK_SIZE];
+	enum index curr; /**< Current token index. */
+	enum index prev; /**< Index of the last token seen. */
+	int next_num; /**< Number of entries in next[]. */
+	uint32_t reparse:1; /**< Start over from the beginning. */
+	uint32_t eol:1; /**< EOL has been detected. */
+	uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+	/** Type displayed during completion (defaults to "TOKEN"). */
+	const char *type;
+	/** Help displayed during completion (defaults to token name). */
+	const char *help;
+	/**
+	 * Lists of subsequent tokens to push on the stack. Each call to the
+	 * parser consumes the last entry of that stack.
+	 */
+	const enum index *const *next;
+	/**
+	 * Token-processing callback, returns -1 in case of error, the
+	 * length of the matched string otherwise. If NULL, attempts to
+	 * match the token name.
+	 *
+	 * If buf is not NULL, the result should be stored in it according
+	 * to context. An error is returned if not large enough.
+	 */
+	int (*call)(struct context *ctx, const struct token *token,
+		    const char *str, unsigned int len,
+		    void *buf, unsigned int size);
+	/**
+	 * Callback that provides possible values for this token, used for
+	 * completion. Returns -1 in case of error, the number of possible
+	 * values otherwise. If NULL, the token name is used.
+	 *
+	 * If buf is not NULL, entry index ent is written to buf and the
+	 * full length of the entry is returned (same behavior as
+	 * snprintf()).
+	 */
+	int (*comp)(struct context *ctx, const struct token *token,
+		    unsigned int ent, char *buf, unsigned int size);
+	/** Mandatory token name, no default value. */
+	const char *name;
+};
+
+/** Static initializer for the next field. */
+#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for a NEXT() entry. */
+#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
+
+/** Parser output buffer layout expected by cmd_flow_parsed(). */
+struct buffer {
+	enum index command; /**< Flow command. */
+	uint16_t port; /**< Affected port ID. */
+};
+
+static int parse_init(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
+
+/** Token definitions. */
+static const struct token token_list[] = {
+	/* Special tokens. */
+	[ZERO] = {
+		.name = "ZERO",
+		.help = "null entry, abused as the entry point",
+		.next = NEXT(NEXT_ENTRY(FLOW)),
+	},
+	[END] = {
+		.name = "",
+		.type = "RETURN",
+		.help = "command may end here",
+	},
+	/* Top-level command. */
+	[FLOW] = {
+		.name = "flow",
+		.type = "{command} {port_id} [{arg} [...]]",
+		.help = "manage ingress/egress flow rules",
+		.call = parse_init,
+	},
+};
+
+/** Default parsing function for token name matching. */
+static int
+parse_default(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)buf;
+	(void)size;
+	if (strncmp(str, token->name, len))
+		return -1;
+	return len;
+}
+
+/** Parse flow command, initialize output buffer for subsequent tokens. */
+static int
+parse_init(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	/* Initialize buffer. */
+	memset(out, 0x00, sizeof(*out));
+	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	return len;
+}
+
+/** Internal context. */
+static struct context cmd_flow_context;
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow;
+
+/** Initialize context. */
+static void
+cmd_flow_context_init(struct context *ctx)
+{
+	/* A full memset() is not necessary. */
+	ctx->curr = ZERO;
+	ctx->prev = ZERO;
+	ctx->next_num = 0;
+	ctx->reparse = 0;
+	ctx->eol = 0;
+	ctx->last = 0;
+}
+
+/** Parse a token (cmdline API). */
+static int
+cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
+	       unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token;
+	const enum index *list;
+	int len;
+	int i;
+
+	(void)hdr;
+	/* Restart as requested. */
+	if (ctx->reparse)
+		cmd_flow_context_init(ctx);
+	token = &token_list[ctx->curr];
+	/* Check argument length. */
+	ctx->eol = 0;
+	ctx->last = 1;
+	for (len = 0; src[len]; ++len)
+		if (src[len] == '#' || isspace(src[len]))
+			break;
+	if (!len)
+		return -1;
+	/* Last argument and EOL detection. */
+	for (i = len; src[i]; ++i)
+		if (src[i] == '#' || src[i] == '\r' || src[i] == '\n')
+			break;
+		else if (!isspace(src[i])) {
+			ctx->last = 0;
+			break;
+		}
+	for (; src[i]; ++i)
+		if (src[i] == '\r' || src[i] == '\n') {
+			ctx->eol = 1;
+			break;
+		}
+	/* Initialize context if necessary. */
+	if (!ctx->next_num) {
+		if (!token->next)
+			return 0;
+		ctx->next[ctx->next_num++] = token->next[0];
+	}
+	/* Process argument through candidates. */
+	ctx->prev = ctx->curr;
+	list = ctx->next[ctx->next_num - 1];
+	for (i = 0; list[i]; ++i) {
+		const struct token *next = &token_list[list[i]];
+		int tmp;
+
+		ctx->curr = list[i];
+		if (next->call)
+			tmp = next->call(ctx, next, src, len, result, size);
+		else
+			tmp = parse_default(ctx, next, src, len, result, size);
+		if (tmp == -1 || tmp != len)
+			continue;
+		token = next;
+		break;
+	}
+	if (!list[i])
+		return -1;
+	--ctx->next_num;
+	/* Push subsequent tokens if any. */
+	if (token->next)
+		for (i = 0; token->next[i]; ++i) {
+			if (ctx->next_num == RTE_DIM(ctx->next))
+				return -1;
+			ctx->next[ctx->next_num++] = token->next[i];
+		}
+	return len;
+}
+
+/** Return number of completion entries (cmdline API). */
+static int
+cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return 0;
+	/*
+	 * If there is a single token, use its completion callback, otherwise
+	 * return the number of entries.
+	 */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, 0, NULL, 0);
+	}
+	return i;
+}
+
+/** Return a completion entry (cmdline API). */
+static int
+cmd_flow_complete_get_elt(cmdline_parse_token_hdr_t *hdr, int index,
+			  char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return -1;
+	/* If there is a single token, use its completion callback. */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, index, dst, size) < 0 ? -1 : 0;
+	}
+	/* Otherwise make sure the index is valid and use defaults. */
+	if (index >= i)
+		return -1;
+	token = &token_list[list[index]];
+	snprintf(dst, size, "%s", token->name);
+	/* Save index for cmd_flow_get_help(). */
+	ctx->prev = list[index];
+	return 0;
+}
+
+/** Populate help strings for current token (cmdline API). */
+static int
+cmd_flow_get_help(cmdline_parse_token_hdr_t *hdr, char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->prev];
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	if (!size)
+		return -1;
+	/* Set token type and update global help with details. */
+	snprintf(dst, size, "%s", (token->type ? token->type : "TOKEN"));
+	if (token->help)
+		cmd_flow.help_str = token->help;
+	else
+		cmd_flow.help_str = token->name;
+	return 0;
+}
+
+/** Token definition template (cmdline API). */
+static struct cmdline_token_hdr cmd_flow_token_hdr = {
+	.ops = &(struct cmdline_token_ops){
+		.parse = cmd_flow_parse,
+		.complete_get_nb = cmd_flow_complete_get_nb,
+		.complete_get_elt = cmd_flow_complete_get_elt,
+		.get_help = cmd_flow_get_help,
+	},
+	.offset = 0,
+};
+
+/** Populate the next dynamic token. */
+static void
+cmd_flow_tok(cmdline_parse_token_hdr_t **hdr,
+	     cmdline_parse_token_hdr_t *(*hdrs)[])
+{
+	struct context *ctx = &cmd_flow_context;
+
+	/* Always reinitialize context before requesting the first token. */
+	if (!(hdr - *hdrs))
+		cmd_flow_context_init(ctx);
+	/* Return NULL when no more tokens are expected. */
+	if (!ctx->next_num && ctx->curr) {
+		*hdr = NULL;
+		return;
+	}
+	/* Determine if command should end here. */
+	if (ctx->eol && ctx->last && ctx->next_num) {
+		const enum index *list = ctx->next[ctx->next_num - 1];
+		int i;
+
+		for (i = 0; list[i]; ++i) {
+			if (list[i] != END)
+				continue;
+			*hdr = NULL;
+			return;
+		}
+	}
+	*hdr = &cmd_flow_token_hdr;
+}
+
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_flow_parsed(const struct buffer *in)
+{
+	switch (in->command) {
+	default:
+		break;
+	}
+}
+
+/** Token generator and output processing callback (cmdline API). */
+static void
+cmd_flow_cb(void *arg0, struct cmdline *cl, void *arg2)
+{
+	if (cl == NULL)
+		cmd_flow_tok(arg0, arg2);
+	else
+		cmd_flow_parsed(arg0);
+}
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow = {
+	.f = cmd_flow_cb,
+	.data = NULL, /**< Unused. */
+	.help_str = NULL, /**< Updated by cmd_flow_get_help(). */
+	.tokens = {
+		NULL,
+	}, /**< Tokens are returned by cmd_flow_tok(). */
+};
-- 
2.1.4


* [dpdk-dev] [PATCH v4 08/25] app/testpmd: add rte_flow integer support
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (6 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 07/25] app/testpmd: add flow command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 09/25] app/testpmd: add flow list command Adrien Mazarguil
                             ` (17 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Parse all integer types and handle conversion to network byte order in a
single function.
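
To make the approach concrete, here is a minimal standalone sketch of such
a function; it is not the parse_int() from this patch. It assumes a
descriptor loosely modelled on struct arg, and the names sketch_arg,
sketch_object and sketch_store_int are hypothetical. Big-endian output is
written byte by byte so that the example does not depend on rte_byteorder.h.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field descriptor, loosely modelled on struct arg. */
struct sketch_arg {
	unsigned int hton:1; /* Store in network (big-endian) order. */
	unsigned int sign:1; /* Value is signed. */
	size_t offset; /* Offset of the field inside the object. */
	size_t size; /* Field size in bytes: 1, 2, 4 or 8. */
};

/* Hypothetical output object used by the example. */
struct sketch_object {
	uint16_t port;
	int8_t delta;
	uint32_t id;
};

/* Parse str and store it in the described field; 0 on success, else -1. */
static int
sketch_store_int(void *object, const struct sketch_arg *arg, const char *str)
{
	uint8_t *dst = (uint8_t *)object + arg->offset;
	unsigned int bits = (unsigned int)arg->size * 8;
	intmax_t v = 0;
	uintmax_t u;
	char *end;
	size_t i;

	if (arg->size != 1 && arg->size != 2 && arg->size != 4 &&
	    arg->size != 8)
		return -1;
	/* strtoumax() silently wraps negative input, reject it early. */
	if (!arg->sign && strchr(str, '-'))
		return -1;
	errno = 0;
	if (arg->sign) {
		v = strtoimax(str, &end, 0);
		u = (uintmax_t)v;
	} else {
		u = strtoumax(str, &end, 0);
	}
	if (errno || end == str || *end != '\0')
		return -1;
	/* Make sure the value fits the target field. */
	if (bits < 64) {
		if (arg->sign &&
		    (v < -(INTMAX_C(1) << (bits - 1)) ||
		     v > (INTMAX_C(1) << (bits - 1)) - 1))
			return -1;
		if (!arg->sign && (u >> bits))
			return -1;
	}
	if (arg->hton) {
		/* Most significant byte first. */
		for (i = 0; i < arg->size; ++i)
			dst[i] = (uint8_t)(u >> (8 * (arg->size - 1 - i)));
		return 0;
	}
	/* Host byte order, through a temporary of the right width. */
	switch (arg->size) {
	case 1: {
		uint8_t t = (uint8_t)u;
		memcpy(dst, &t, sizeof(t));
		break;
	}
	case 2: {
		uint16_t t = (uint16_t)u;
		memcpy(dst, &t, sizeof(t));
		break;
	}
	case 4: {
		uint32_t t = (uint32_t)u;
		memcpy(dst, &t, sizeof(t));
		break;
	}
	default: {
		uint64_t t = (uint64_t)u;
		memcpy(dst, &t, sizeof(t));
		break;
	}
	}
	return 0;
}
```

Describing each field by offset and size lets one function serve every
integer width, which is the point of handling all types in a single place.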

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 148 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f5aef0f..c5a4209 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
+#include <inttypes.h>
+#include <errno.h>
 #include <ctype.h>
 #include <string.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
+#include <rte_byteorder.h>
 #include <cmdline_parse.h>
 #include <rte_flow.h>
 
@@ -50,6 +53,10 @@ enum index {
 	ZERO = 0,
 	END,
 
+	/* Common tokens. */
+	INTEGER,
+	UNSIGNED,
+
 	/* Top-level command. */
 	FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
 	/** Stack of subsequent token lists to process. */
 	const enum index *next[CTX_STACK_SIZE];
+	/** Arguments for stacked tokens. */
+	const void *args[CTX_STACK_SIZE];
 	enum index curr; /**< Current token index. */
 	enum index prev; /**< Index of the last token seen. */
 	int next_num; /**< Number of entries in next[]. */
+	int args_num; /**< Number of entries in args[]. */
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+	uint32_t hton:1; /**< Use network byte ordering. */
+	uint32_t sign:1; /**< Value is signed. */
+	uint32_t offset; /**< Relative offset from ctx->object. */
+	uint32_t size; /**< Field size. */
 };
 
 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 	 * parser consumes the last entry of that stack.
 	 */
 	const enum index *const *next;
+	/** Arguments stack for subsequent tokens that need them. */
+	const struct arg *const *args;
 	/**
 	 * Token-processing callback, returns -1 in case of error, the
 	 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
 
+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(*((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+		     const char *, unsigned int,
+		     void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
 		.type = "RETURN",
 		.help = "command may end here",
 	},
+	/* Common tokens. */
+	[INTEGER] = {
+		.name = "{int}",
+		.type = "INTEGER",
+		.help = "integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[UNSIGNED] = {
+		.name = "{unsigned}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
 	},
 };
 
+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+	return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+	if (ctx->args_num == CTX_STACK_SIZE)
+		return -1;
+	ctx->args[ctx->args_num++] = arg;
+	return 0;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -178,9 +252,74 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->object = out;
 	return len;
 }
 
+/**
+ * Parse signed/unsigned integers 8 to 64-bit long.
+ *
+ * Last argument (ctx->args) is retrieved to determine integer type and
+ * storage location.
+ */
+static int
+parse_int(struct context *ctx, const struct token *token,
+	  const char *str, unsigned int len,
+	  void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	uintmax_t u;
+	char *end;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = arg->sign ?
+		(uintmax_t)strtoimax(str, &end, 0) :
+		strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	size = arg->size;
+	switch (size) {
+	case sizeof(uint8_t):
+		*(uint8_t *)buf = u;
+		break;
+	case sizeof(uint16_t):
+		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
+		break;
+	case sizeof(uint32_t):
+		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
+		break;
+	case sizeof(uint64_t):
+		*(uint64_t *)buf = arg->hton ? rte_cpu_to_be_64(u) : u;
+		break;
+	default:
+		goto error;
+	}
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/** No completion. */
+static int
+comp_none(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)token;
+	(void)ent;
+	(void)buf;
+	(void)size;
+	return 0;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -195,9 +334,11 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->curr = ZERO;
 	ctx->prev = ZERO;
 	ctx->next_num = 0;
+	ctx->args_num = 0;
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->object = NULL;
 }
 
 /** Parse a token (cmdline API). */
@@ -270,6 +411,13 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 				return -1;
 			ctx->next[ctx->next_num++] = token->next[i];
 		}
+	/* Push arguments if any. */
+	if (token->args)
+		for (i = 0; token->args[i]; ++i) {
+			if (ctx->args_num == RTE_DIM(ctx->args))
+				return -1;
+			ctx->args[ctx->args_num++] = token->args[i];
+		}
 	return len;
 }
 
-- 
2.1.4
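
For readers following the argument mechanism added by this patch, here is a
minimal standalone sketch of the idea behind ARGS_ENTRY() and parse_int():
a descriptor records the offset and size of a field, and a generic routine
stores a parsed integer at that location. This is not the testpmd code.
struct field, FIELD(), store_int() and struct demo are hypothetical names
used only for illustration, and byte-order conversion and signed values are
left out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct arg: where to store and how wide. */
struct field {
	size_t offset; /* Offset from the start of the target object. */
	size_t size; /* Width of the target field in bytes. */
};

/* Describe field f of structure type s, in the spirit of ARGS_ENTRY(). */
#define FIELD(s, f) \
	((struct field){ offsetof(s, f), sizeof(((s *)0)->f) })

/* Hypothetical target object. */
struct demo {
	uint16_t port;
	uint32_t group;
};

/*
 * Parse str as an unsigned integer and store it in obj at the location
 * described by fld. Returns 0 on success, -1 on error (obj is left
 * untouched in that case).
 */
static int
store_int(void *obj, struct field fld, const char *str)
{
	uint8_t *dst = (uint8_t *)obj + fld.offset;
	uintmax_t u;
	char *end;

	assert(obj != NULL && str != NULL);
	errno = 0;
	u = strtoumax(str, &end, 0);
	/* Reject range errors, empty input and trailing garbage. */
	if (errno || end == str || *end != '\0')
		return -1;
	switch (fld.size) {
	case sizeof(uint8_t): {
		uint8_t v = (uint8_t)u;

		memcpy(dst, &v, sizeof(v));
		break;
	}
	case sizeof(uint16_t): {
		uint16_t v = (uint16_t)u;

		memcpy(dst, &v, sizeof(v));
		break;
	}
	case sizeof(uint32_t): {
		uint32_t v = (uint32_t)u;

		memcpy(dst, &v, sizeof(v));
		break;
	}
	case sizeof(uint64_t): {
		uint64_t v = (uint64_t)u;

		memcpy(dst, &v, sizeof(v));
		break;
	}
	default:
		/* Unsupported field width. */
		return -1;
	}
	return 0;
}
```

A call such as store_int(&d, FIELD(struct demo, port), "42") would set
d.port without the caller knowing the field type, which is what lets a
single integer token serve many different fields.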


* [dpdk-dev] [PATCH v4 09/25] app/testpmd: add flow list command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (7 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 10/25] app/testpmd: add flow flush command Adrien Mazarguil
                             ` (16 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.
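
The filtering described above can be sketched as follows. This is only an
illustration under assumptions: struct rule, group_selected(), rule_cmp()
and list_rules() are hypothetical names, the ordering (ascending priority,
then ID) is inferred from the help text in this patch, and an empty group
list is assumed to select every rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified flow rule record. */
struct rule {
	uint32_t id;
	uint32_t group;
	uint32_t priority;
};

/* Return nonzero if group is selected; an empty filter selects all. */
static int
group_selected(uint32_t group, const uint32_t *filter, size_t filter_n)
{
	size_t i;

	if (!filter_n)
		return 1;
	for (i = 0; i != filter_n; ++i)
		if (filter[i] == group)
			return 1;
	return 0;
}

/* qsort() comparator: ascending priority, then ascending ID. */
static int
rule_cmp(const void *a, const void *b)
{
	const struct rule *ra = a;
	const struct rule *rb = b;

	if (ra->priority != rb->priority)
		return ra->priority < rb->priority ? -1 : 1;
	if (ra->id != rb->id)
		return ra->id < rb->id ? -1 : 1;
	return 0;
}

/*
 * Copy the rules matching the filter to out[] (room for n entries) and
 * sort them. Returns the number of rules copied.
 */
static size_t
list_rules(const struct rule *rules, size_t n,
	   const uint32_t *filter, size_t filter_n, struct rule *out)
{
	size_t count = 0;
	size_t i;

	assert(rules != NULL && out != NULL);
	for (i = 0; i != n; ++i)
		if (group_selected(rules[i].group, filter, filter_n))
			out[count++] = rules[i];
	qsort(out, count, sizeof(*out), rule_cmp);
	return count;
}
```

With this shape, "flow list 0" would map to an empty filter and
"flow list 0 group 1 group 7" to a two-entry filter.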

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b124412..0dc6c63 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
+
+			"flow list {port_id} [group {group_id}] [...]\n"
+			"    List existing flow rules sorted by priority,"
+			" filtered by group identifiers.\n\n"
 		);
 	}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c5a4209..7a2aaa4 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PORT_ID,
+	GROUP_ID,
 
 	/* Top-level command. */
 	FLOW,
+
+	/* Sub-level commands. */
+	LIST,
+
+	/* List arguments. */
+	LIST_GROUP,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	uint16_t port; /**< Current port ID (for completions). */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -153,16 +162,36 @@ struct token {
 struct buffer {
 	enum index command; /**< Flow command. */
 	uint16_t port; /**< Affected port ID. */
+	union {
+		struct {
+			uint32_t *group;
+			uint32_t group_n;
+		} list; /**< List arguments. */
+	} args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+	LIST_GROUP,
+	END,
+	ZERO,
 };
 
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PORT_ID] = {
+		.name = "{port_id}",
+		.type = "PORT ID",
+		.help = "port identifier",
+		.call = parse_port,
+		.comp = comp_port,
+	},
+	[GROUP_ID] = {
+		.name = "{group_id}",
+		.type = "GROUP ID",
+		.help = "group identifier",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
+		.next = NEXT(NEXT_ENTRY(LIST)),
 		.call = parse_init,
 	},
+	/* Sub-level commands. */
+	[LIST] = {
+		.name = "list",
+		.help = "list existing flow rules",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_list,
+	},
+	/* List arguments. */
+	[LIST_GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+		.call = parse_list,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -256,6 +316,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for list command. */
+static int
+parse_list(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != LIST)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.list.group =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
+	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.list.group + out->args.list.group_n++;
+	return len;
+}
+
 /**
  * Parse signed/unsigned integers 8 to 64-bit long.
  *
@@ -307,6 +400,29 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/** Parse port and update context. */
+static int
+parse_port(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = &(struct buffer){ .port = 0 };
+	int ret;
+
+	if (buf)
+		out = buf;
+	else {
+		ctx->object = out;
+		size = sizeof(*out);
+	}
+	ret = parse_int(ctx, token, str, len, out, size);
+	if (ret >= 0)
+		ctx->port = out->port;
+	if (!buf)
+		ctx->object = NULL;
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -320,6 +436,26 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete available ports. */
+static int
+comp_port(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	portid_t p;
+
+	(void)ctx;
+	(void)token;
+	FOREACH_PORT(p, ports) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", p);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -338,6 +474,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->port = 0;
 	ctx->object = NULL;
 }
 
@@ -561,6 +698,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case LIST:
+		port_flow_list(in->port, in->args.list.group_n,
+			       in->args.list.group);
+		break;
 	default:
 		break;
 	}
-- 
2.1.4


* [dpdk-dev] [PATCH v4 10/25] app/testpmd: add flow flush command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (8 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 09/25] app/testpmd: add flow list command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
                             ` (15 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0dc6c63..6e2b289 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow flush {port_id}\n"
+			"    Destroy all flow rules.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7a2aaa4..5972b80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	FLUSH,
 	LIST,
 
 	/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
-		.next = NEXT(NEXT_ENTRY(LIST)),
+		.next = NEXT(NEXT_ENTRY
+			     (FLUSH,
+			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[FLUSH] = {
+		.name = "flush",
+		.help = "destroy all flow rules",
+		.next = NEXT(NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_flush,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != FLUSH)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+	}
+	return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case FLUSH:
+		port_flow_flush(in->port);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH v4 11/25] app/testpmd: add flow destroy command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (9 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 10/25] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
                             ` (14 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.
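
As a rough illustration of destroying a set of rules by ID, the sketch
below walks a singly linked list (similar in spirit to the per-port flow
list used for completion in this patch) and unlinks matching entries.
struct rule_node, rule_add() and rule_destroy() are hypothetical names;
the real implementation also has to release the rule in the PMD, which is
not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified rule node. */
struct rule_node {
	struct rule_node *next;
	uint32_t id;
};

/* Prepend a new rule with the given ID. Returns 0 on success, -1 on error. */
static int
rule_add(struct rule_node **head, uint32_t id)
{
	struct rule_node *node;

	assert(head != NULL);
	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->id = id;
	node->next = *head;
	*head = node;
	return 0;
}

/*
 * Unlink and free every rule whose ID appears in ids[]. Unknown IDs are
 * ignored. Returns the number of rules destroyed.
 */
static unsigned int
rule_destroy(struct rule_node **head, const uint32_t *ids, unsigned int n)
{
	unsigned int destroyed = 0;
	unsigned int i;

	assert(head != NULL);
	for (i = 0; i != n; ++i) {
		/* Pointer to the link that may need rewriting. */
		struct rule_node **link = head;

		while (*link) {
			struct rule_node *node = *link;

			if (node->id != ids[i]) {
				link = &node->next;
				continue;
			}
			*link = node->next;
			free(node);
			++destroyed;
			break;
		}
	}
	return destroyed;
}
```

Tracking the address of the link rather than the previous node avoids a
special case for removing the list head.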

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6e2b289..80ddda2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow destroy {port_id} rule {rule_id} [...]\n"
+			"    Destroy specific flow rules.\n\n"
+
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5972b80..5c45b3a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
 
@@ -63,9 +64,13 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	DESTROY,
 	FLUSH,
 	LIST,
 
+	/* Destroy arguments. */
+	DESTROY_RULE,
+
 	/* List arguments. */
 	LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			uint32_t *rule;
+			uint32_t rule_n;
+		} destroy; /**< Destroy arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
 	} args; /**< Command arguments. */
 };
 
+static const enum index next_destroy_attr[] = {
+	DESTROY_RULE,
+	END,
+	ZERO,
+};
+
 static const enum index next_list_attr[] = {
 	LIST_GROUP,
 	END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[RULE_ID] = {
+		.name = "{rule_id}",
+		.type = "RULE ID",
+		.help = "rule identifier",
+		.call = parse_int,
+		.comp = comp_rule_id,
+	},
 	[PORT_ID] = {
 		.name = "{port_id}",
 		.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (FLUSH,
+			     (DESTROY,
+			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[DESTROY] = {
+		.name = "destroy",
+		.help = "destroy specific flow rules",
+		.next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_destroy,
+	},
 	[FLUSH] = {
 		.name = "flush",
 		.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_list,
 	},
+	/* Destroy arguments. */
+	[DESTROY_RULE] = {
+		.name = "rule",
+		.help = "specify a rule identifier",
+		.next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
+		.call = parse_destroy,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -329,6 +372,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for destroy command. */
+static int
+parse_destroy(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != DESTROY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.destroy.rule =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
+	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	return len;
+}
+
 /** Parse tokens for flush command. */
 static int
 parse_flush(struct context *ctx, const struct token *token,
@@ -494,6 +570,30 @@ comp_port(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete available rule IDs. */
+static int
+comp_rule_id(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	struct rte_port *port;
+	struct port_flow *pf;
+
+	(void)token;
+	if (port_id_is_invalid(ctx->port, DISABLED_WARN) ||
+	    ctx->port == (uint16_t)RTE_PORT_ALL)
+		return -1;
+	port = &ports[ctx->port];
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", pf->id);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -736,6 +836,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case DESTROY:
+		port_flow_destroy(in->port, in->args.destroy.rule_n,
+				  in->args.destroy.rule);
+		break;
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH v4 12/25] app/testpmd: add flow validate/create commands
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (10 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 13/25] app/testpmd: add flow query command Adrien Mazarguil
                             ` (13 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Syntax:

 flow (validate|create) {port_id}
    [group {group_id}] [priority {level}] [ingress] [egress]
    pattern {item} [/ {item} [...]] / end
    actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates pattern item and action lists.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, process packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.
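
To make the list syntax concrete, here is a minimal sketch of how a
"/"-separated, "end"-terminated item list could be checked and converted
to an array of types. It is independent from the token-based parser in this
patch: enum item_type, item_lookup() and parse_items() are hypothetical
names, and only the item names mentioned above are recognized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item types mirroring the ones named in this commit. */
enum item_type { ITEM_TYPE_END, ITEM_TYPE_VOID, ITEM_TYPE_INVERT };

/* Map an item name to its type. Returns 0 on success, -1 if unknown. */
static int
item_lookup(const char *name, enum item_type *type)
{
	static const struct {
		const char *name;
		enum item_type type;
	} map[] = {
		{ "end", ITEM_TYPE_END },
		{ "void", ITEM_TYPE_VOID },
		{ "invert", ITEM_TYPE_INVERT },
	};
	size_t i;

	for (i = 0; i != sizeof(map) / sizeof(map[0]); ++i)
		if (!strcmp(map[i].name, name)) {
			*type = map[i].type;
			return 0;
		}
	return -1;
}

/*
 * Parse words such as { "void", "/", "invert", "/", "end" } into out[].
 * Items must be separated by "/" and the list must finish with "end".
 * Returns the number of items stored (END included), -1 on error.
 */
static int
parse_items(const char *const *words, size_t words_n,
	    enum item_type *out, size_t out_n)
{
	size_t n = 0;
	size_t i;

	assert(words != NULL && out != NULL);
	for (i = 0; i < words_n; i += 2) {
		enum item_type type;

		if (n == out_n || item_lookup(words[i], &type) < 0)
			return -1;
		out[n++] = type;
		/* "end" must be the last word. */
		if (type == ITEM_TYPE_END)
			return i + 1 == words_n ? (int)n : -1;
		/* Any other item must be followed by a separator. */
		if (i + 1 >= words_n || strcmp(words[i + 1], "/"))
			return -1;
	}
	/* Ran out of words before seeing "end". */
	return -1;
}
```

The same approach would apply to the actions list, with "passthru" in
place of "invert".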

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 80ddda2..23f4b48 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow validate {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Check whether a flow rule can be created.\n\n"
+
+			"flow create {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Create a flow rule.\n\n"
+
 			"flow destroy {port_id} rule {rule_id} [...]\n"
 			"    Destroy specific flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5c45b3a..dc68685 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
+	PRIORITY_LEVEL,
 
 	/* Top-level command. */
 	FLOW,
 
 	/* Sub-level commands. */
+	VALIDATE,
+	CREATE,
 	DESTROY,
 	FLUSH,
 	LIST,
@@ -73,6 +76,26 @@ enum index {
 
 	/* List arguments. */
 	LIST_GROUP,
+
+	/* Validate/create arguments. */
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+
+	/* Validate/create pattern. */
+	PATTERN,
+	ITEM_NEXT,
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+
+	/* Validate/create actions. */
+	ACTIONS,
+	ACTION_NEXT,
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
 	uint16_t port; /**< Current port ID (for completions). */
+	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -109,6 +133,8 @@ struct token {
 	const char *type;
 	/** Help displayed during completion (defaults to token name). */
 	const char *help;
+	/** Private data used by parser functions. */
+	const void *priv;
 	/**
 	 * Lists of subsequent tokens to push on the stack. Each call to the
 	 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			struct rte_flow_attr attr;
+			struct rte_flow_item *pattern;
+			struct rte_flow_action *actions;
+			uint32_t pattern_n;
+			uint32_t actions_n;
+			uint8_t *data;
+		} vc; /**< Validate/create arguments. */
+		struct {
 			uint32_t *rule;
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
 	} args; /**< Command arguments. */
 };
 
+/** Private data for pattern items. */
+struct parse_item_priv {
+	enum rte_flow_item_type type; /**< Item type. */
+	uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+	(&(const struct parse_item_priv){ \
+		.type = RTE_FLOW_ITEM_TYPE_ ## t, \
+		.size = s, \
+	})
+
+/** Private data for actions. */
+struct parse_action_priv {
+	enum rte_flow_action_type type; /**< Action type. */
+	uint32_t size; /**< Size of action configuration structure. */
+};
+
+#define PRIV_ACTION(t, s) \
+	(&(const struct parse_action_priv){ \
+		.type = RTE_FLOW_ACTION_TYPE_ ## t, \
+		.size = s, \
+	})
+
+static const enum index next_vc_attr[] = {
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+	PATTERN,
+	ZERO,
+};
+
 static const enum index next_destroy_attr[] = {
 	DESTROY_RULE,
 	END,
@@ -192,9 +259,26 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+static const enum index next_item[] = {
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+	ZERO,
+};
+
+static const enum index next_action[] = {
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_vc(struct context *, const struct token *,
+		    const char *, unsigned int,
+		    void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -266,18 +350,41 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PRIORITY_LEVEL] = {
+		.name = "{level}",
+		.type = "PRIORITY",
+		.help = "priority level",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (DESTROY,
+			     (VALIDATE,
+			      CREATE,
+			      DESTROY,
 			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[VALIDATE] = {
+		.name = "validate",
+		.help = "check whether a flow rule can be created",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
+	[CREATE] = {
+		.name = "create",
+		.help = "create a flow rule",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
 	[DESTROY] = {
 		.name = "destroy",
 		.help = "destroy specific flow rules",
@@ -315,6 +422,98 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
 		.call = parse_list,
 	},
+	/* Validate/create attributes. */
+	[GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, group)),
+		.call = parse_vc,
+	},
+	[PRIORITY] = {
+		.name = "priority",
+		.help = "specify a priority level",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, priority)),
+		.call = parse_vc,
+	},
+	[INGRESS] = {
+		.name = "ingress",
+		.help = "apply rule to ingress traffic",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	[EGRESS] = {
+		.name = "egress",
+		.help = "apply rule to egress traffic",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	/* Validate/create pattern. */
+	[PATTERN] = {
+		.name = "pattern",
+		.help = "submit a list of pattern items",
+		.next = NEXT(next_item),
+		.call = parse_vc,
+	},
+	[ITEM_NEXT] = {
+		.name = "/",
+		.help = "specify next pattern item",
+		.next = NEXT(next_item),
+	},
+	[ITEM_END] = {
+		.name = "end",
+		.help = "end list of pattern items",
+		.priv = PRIV_ITEM(END, 0),
+		.next = NEXT(NEXT_ENTRY(ACTIONS)),
+		.call = parse_vc,
+	},
+	[ITEM_VOID] = {
+		.name = "void",
+		.help = "no-op pattern item",
+		.priv = PRIV_ITEM(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_INVERT] = {
+		.name = "invert",
+		.help = "perform actions when pattern does not match",
+		.priv = PRIV_ITEM(INVERT, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	/* Validate/create actions. */
+	[ACTIONS] = {
+		.name = "actions",
+		.help = "submit a list of associated actions",
+		.next = NEXT(next_action),
+		.call = parse_vc,
+	},
+	[ACTION_NEXT] = {
+		.name = "/",
+		.help = "specify next action",
+		.next = NEXT(next_action),
+	},
+	[ACTION_END] = {
+		.name = "end",
+		.help = "end list of actions",
+		.priv = PRIV_ACTION(END, 0),
+		.call = parse_vc,
+	},
+	[ACTION_VOID] = {
+		.name = "void",
+		.help = "no-op action",
+		.priv = PRIV_ACTION(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PASSTHRU] = {
+		.name = "passthru",
+		.help = "let subsequent rule process matched packets",
+		.priv = PRIV_ACTION(PASSTHRU, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -368,10 +567,108 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->objdata = 0;
 	ctx->object = out;
 	return len;
 }
 
+/** Parse tokens for validate/create commands. */
+static int
+parse_vc(struct context *ctx, const struct token *token,
+	 const char *str, unsigned int len,
+	 void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	uint8_t *data;
+	uint32_t data_size;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != VALIDATE && ctx->curr != CREATE)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		out->args.vc.data = (uint8_t *)out + size;
+		return len;
+	}
+	ctx->objdata = 0;
+	ctx->object = &out->args.vc.attr;
+	switch (ctx->curr) {
+	case GROUP:
+	case PRIORITY:
+		return len;
+	case INGRESS:
+		out->args.vc.attr.ingress = 1;
+		return len;
+	case EGRESS:
+		out->args.vc.attr.egress = 1;
+		return len;
+	case PATTERN:
+		out->args.vc.pattern =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		ctx->object = out->args.vc.pattern;
+		return len;
+	case ACTIONS:
+		out->args.vc.actions =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)
+					       (out->args.vc.pattern +
+						out->args.vc.pattern_n),
+					       sizeof(double));
+		ctx->object = out->args.vc.actions;
+		return len;
+	default:
+		if (!token->priv)
+			return -1;
+		break;
+	}
+	if (!out->args.vc.actions) {
+		const struct parse_item_priv *priv = token->priv;
+		struct rte_flow_item *item =
+			out->args.vc.pattern + out->args.vc.pattern_n;
+
+		data_size = priv->size * 3; /* spec, last, mask */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)item + sizeof(*item) > data)
+			return -1;
+		*item = (struct rte_flow_item){
+			.type = priv->type,
+		};
+		++out->args.vc.pattern_n;
+		ctx->object = item;
+	} else {
+		const struct parse_action_priv *priv = token->priv;
+		struct rte_flow_action *action =
+			out->args.vc.actions + out->args.vc.actions_n;
+
+		data_size = priv->size; /* configuration */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)action + sizeof(*action) > data)
+			return -1;
+		*action = (struct rte_flow_action){
+			.type = priv->type,
+		};
+		++out->args.vc.actions_n;
+		ctx->object = action;
+	}
+	memset(data, 0, data_size);
+	out->args.vc.data = data;
+	ctx->objdata = data_size;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -392,6 +689,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -401,6 +699,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
 	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
 	return len;
 }
@@ -425,6 +724,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 	}
 	return len;
@@ -450,6 +750,7 @@ parse_list(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -459,6 +760,7 @@ parse_list(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
 	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
 	return len;
 }
@@ -526,6 +828,7 @@ parse_port(struct context *ctx, const struct token *token,
 	if (buf)
 		out = buf;
 	else {
+		ctx->objdata = 0;
 		ctx->object = out;
 		size = sizeof(*out);
 	}
@@ -613,6 +916,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->eol = 0;
 	ctx->last = 0;
 	ctx->port = 0;
+	ctx->objdata = 0;
 	ctx->object = NULL;
 }
 
@@ -836,6 +1140,14 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case VALIDATE:
+		port_flow_validate(in->port, &in->args.vc.attr,
+				   in->args.vc.pattern, in->args.vc.actions);
+		break;
+	case CREATE:
+		port_flow_create(in->port, &in->args.vc.attr,
+				 in->args.vc.pattern, in->args.vc.actions);
+		break;
 	case DESTROY:
 		port_flow_destroy(in->port, in->args.destroy.rule_n,
 				  in->args.destroy.rule);
-- 
2.1.4

* [dpdk-dev] [PATCH v4 13/25] app/testpmd: add flow query command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (11 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
                             ` (12 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.
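
For illustration only (not part of this patch), resolving an {action}
argument to an action type could be sketched as below. The enum values,
the action_names table and lookup_action() are hypothetical stand-ins for
the token list and private data used by the parser, and this sketch
requires the whole name to match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical action types standing in for enum rte_flow_action_type. */
enum action_type { ACT_END, ACT_VOID, ACT_PASSTHRU };

struct action_name {
	const char *name;
	enum action_type type;
};

/* Hypothetical table; the parser derives this from its token list. */
static const struct action_name action_names[] = {
	{ "end", ACT_END },
	{ "void", ACT_VOID },
	{ "passthru", ACT_PASSTHRU },
};

/* Look up the first len bytes of str; return 0 and set *type on success. */
static int
lookup_action(const char *str, size_t len, enum action_type *type)
{
	size_t i;

	assert(str != NULL && type != NULL);
	for (i = 0; i != sizeof(action_names) / sizeof(action_names[0]); ++i) {
		const char *name = action_names[i].name;

		/* Reject partial names such as "pass". */
		if (strlen(name) != len || strncmp(name, str, len))
			continue;
		*type = action_names[i].type;
		return 0;
	}
	return -1;
}
```

The input does not have to be NUL-terminated at the token boundary, since
only the first len bytes are compared.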

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 23f4b48..f768b6b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
+			"flow query {port_id} {rule_id} {action}\n"
+			"    Query an existing flow rule.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index dc68685..fb9489d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
 	CREATE,
 	DESTROY,
 	FLUSH,
+	QUERY,
 	LIST,
 
 	/* Destroy arguments. */
 	DESTROY_RULE,
 
+	/* Query arguments. */
+	QUERY_ACTION,
+
 	/* List arguments. */
 	LIST_GROUP,
 
@@ -208,6 +212,10 @@ struct buffer {
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
 		struct {
+			uint32_t rule;
+			enum rte_flow_action_type action;
+		} query; /**< Query arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
 			      CREATE,
 			      DESTROY,
 			      FLUSH,
-			      LIST)),
+			      LIST,
+			      QUERY)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_flush,
 	},
+	[QUERY] = {
+		.name = "query",
+		.help = "query an existing flow rule",
+		.next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+			     NEXT_ENTRY(RULE_ID),
+			     NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+			     ARGS_ENTRY(struct buffer, args.query.rule),
+			     ARGS_ENTRY(struct buffer, port)),
+		.call = parse_query,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
 		.call = parse_destroy,
 	},
+	/* Query arguments. */
+	[QUERY_ACTION] = {
+		.name = "{action}",
+		.type = "ACTION",
+		.help = "action to query, must be part of the rule",
+		.call = parse_action,
+		.comp = comp_action,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -730,6 +766,67 @@ parse_flush(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for query command. */
+static int
+parse_query(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != QUERY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+	}
+	return len;
+}
+
+/** Parse action names. */
+static int
+parse_action(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+
+	(void)size;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	/* Parse action name. */
+	for (i = 0; next_action[i]; ++i) {
+		const struct parse_action_priv *priv;
+
+		token = &token_list[next_action[i]];
+		if (strncmp(token->name, str, len))
+			continue;
+		priv = token->priv;
+		if (!priv)
+			goto error;
+		if (out)
+			memcpy((uint8_t *)ctx->object + arg->offset,
+			       &priv->type,
+			       arg->size);
+		return len;
+	}
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -853,6 +950,24 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete action names. */
+static int
+comp_action(struct context *ctx, const struct token *token,
+	    unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; next_action[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s",
+					token_list[next_action[i]].name);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete available ports. */
 static int
 comp_port(struct context *ctx, const struct token *token,
@@ -1155,6 +1270,10 @@ cmd_flow_parsed(const struct buffer *in)
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
+	case QUERY:
+		port_flow_query(in->port, in->args.query.rule,
+				in->args.query.action);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4

* [dpdk-dev] [PATCH v4 14/25] app/testpmd: add rte_flow item spec handler
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (12 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 13/25] app/testpmd: add flow query command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
                             ` (11 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- is: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets upper value of the spec => last range.
- mask: sets bit-mask affecting both spec and last from arbitrary value.
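
As a rough sketch of these operators (not the parser code itself), assume
the spec, last and mask copies of an item structure are stored back to
back, as the "spec, last, mask" comments in the parser suggest. The enum
and set_field() below are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operator identifiers, named after the tokens. */
enum param { PARAM_IS, PARAM_SPEC, PARAM_LAST, PARAM_MASK };

/*
 * data holds three consecutive copies of an item structure of item_size
 * bytes each: spec, last and mask (in that order).
 */
static void
set_field(uint8_t *data, size_t item_size, enum param param,
	  size_t offset, const void *value, size_t size)
{
	/* Index of the copy each operator writes to. */
	static const size_t index[] = {
		[PARAM_IS] = 0,
		[PARAM_SPEC] = 0,
		[PARAM_LAST] = 1,
		[PARAM_MASK] = 2,
	};

	assert(data != NULL && value != NULL);
	assert(offset + size <= item_size);
	memcpy(data + index[param] * item_size + offset, value, size);
	/* "is" additionally requests a perfect match on this field. */
	if (param == PARAM_IS)
		memset(data + 2 * item_size + offset, 0xff, size);
}
```

Only "is" touches two copies at once; the other operators leave the
remaining copies untouched so they can be combined freely.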

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 111 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index fb9489d..7bc1aa7 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,15 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+__rte_unused
+static const enum index item_param[] = {
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	ZERO,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +301,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +508,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_IS] = {
+		.name = "is",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +641,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +669,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +691,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +700,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +723,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +740,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +748,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_IS:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +824,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +835,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +861,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +888,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +950,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +961,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +994,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1011,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1036,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1143,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
-- 
2.1.4

* [dpdk-dev] [PATCH v4 15/25] app/testpmd: add rte_flow item spec prefix length
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (13 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
                             ` (10 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Generating bit-masks from prefix lengths is often more convenient than
providing them entirely (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
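
A minimal sketch of the conversion for a field stored in network byte
order, for illustration only: prefix_to_mask() is a hypothetical helper
and not the function added by this patch, which also handles host byte
order and the parser context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill a network-order mask of size bytes whose
 * prefix most significant bits are set.
 * Returns 0 on success, -1 if the prefix does not fit in the field.
 */
static int
prefix_to_mask(uint8_t *mask, size_t size, unsigned int prefix)
{
	/* conv[n] has its n most significant bits set. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	assert(mask != NULL);
	if (bytes + !!extra > size)
		return -1;
	/* Whole bytes first, then clear the rest. */
	memset(mask, 0xff, bytes);
	memset(mask + bytes, 0x00, size - bytes);
	/* Partially covered byte, if any. */
	if (extra)
		mask[bytes] = conv[extra];
	return 0;
}
```

With a 4-byte field, a prefix of 24 yields ff ff ff 00 and a prefix of 20
yields ff ff f0 00, which matches the usual IPv4 netmask notation.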

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 80 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7bc1aa7..9a6f37d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PREFIX,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -278,6 +280,7 @@ static const enum index item_param[] = {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ZERO,
 };
 
@@ -321,6 +324,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -361,6 +367,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PREFIX] = {
+		.name = "{prefix}",
+		.type = "PREFIX",
+		.help = "prefix length for bit-mask",
+		.call = parse_prefix,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -528,6 +541,11 @@ static const struct token token_list[] = {
 		.help = "specify bit-mask with relevant bits set to one",
 		.call = parse_vc_spec,
 	},
+	[ITEM_PARAM_PREFIX] = {
+		.name = "prefix",
+		.help = "generate bit-mask from a prefix length",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +623,62 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+	char *end;
+	uintmax_t u;
+	unsigned int bytes;
+	unsigned int extra;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	bytes = u / 8;
+	extra = u % 8;
+	size = arg->size;
+	if (bytes > size || bytes + !!extra > size)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+		memset(buf, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+	} else
+#endif
+	{
+		memset(buf, 0xff, bytes);
+		memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[bytes] = conv[extra];
+	}
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -776,6 +850,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	case ITEM_PARAM_LAST:
 		index = 1;
 		break;
+	case ITEM_PARAM_PREFIX:
+		/* Modify next token to expect a prefix. */
+		if (ctx->next_num < 2)
+			return -1;
+		ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
+		/* Fall through. */
 	case ITEM_PARAM_MASK:
 		index = 2;
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v4 16/25] app/testpmd: add rte_flow bit-field support
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (14 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
                             ` (9 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.
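
The idea can be sketched as follows, for illustration only: each bit set
in a byte-level mask receives the next least significant bit of the
value. spread_bits() is a hypothetical stand-alone variant and takes the
mask and its size directly instead of a struct arg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone variant of the bit spreading loop.
 * Returns the number of bits set in mask; when dst is NULL, bits are
 * only counted and nothing is written.
 */
static size_t
spread_bits(uint8_t *dst, uintmax_t val, const uint8_t *mask, size_t size)
{
	size_t count = 0;
	size_t i;

	assert(mask != NULL);
	for (i = 0; i != size; ++i) {
		unsigned int shift;

		for (shift = 0; shift != 8; ++shift) {
			if (!(mask[i] & (1u << shift)))
				continue;
			++count;
			if (!dst)
				continue;
			/* Replace this bit with the next bit of val. */
			dst[i] = (uint8_t)((dst[i] & ~(1u << shift)) |
					   ((val & 1) << shift));
			val >>= 1;
		}
	}
	return count;
}
```

Calling it with a NULL destination gives the width of the bit-field,
which is how a caller can check that a value or prefix length fits
before writing anything.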

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 59 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 9a6f37d..72b5297 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
 	uint32_t sign:1; /**< Value is signed. */
 	uint32_t offset; /**< Relative offset from ctx->object. */
 	uint32_t size; /**< Field size. */
+	const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };
 
 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
 		.size = sizeof(((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f, b) \
+	(&(const struct arg){ \
+		.size = sizeof(s), \
+		.mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -623,6 +631,34 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+	uint32_t i;
+	size_t len = 0;
+
+	/* Endian conversion is not supported on bit-fields. */
+	if (!arg->mask || arg->hton)
+		return 0;
+	for (i = 0; i != arg->size; ++i) {
+		unsigned int shift = 0;
+		uint8_t *buf = (uint8_t *)dst + i;
+
+		for (shift = 0; arg->mask[i] >> shift; ++shift) {
+			if (!(arg->mask[i] & (1 << shift)))
+				continue;
+			++len;
+			if (!dst)
+				continue;
+			*buf &= ~(1 << shift);
+			*buf |= (val & 1) << shift;
+			val >>= 1;
+		}
+	}
+	return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -649,6 +685,23 @@ parse_prefix(struct context *ctx, const struct token *token,
 	u = strtoumax(str, &end, 0);
 	if (errno || (size_t)(end - str) != len)
 		goto error;
+	if (arg->mask) {
+		uintmax_t v = 0;
+
+		extra = arg_entry_bf_fill(NULL, 0, arg);
+		if (u > extra)
+			goto error;
+		if (!ctx->object)
+			return len;
+		extra -= u;
+		while (u--)
+			(v <<= 1, v |= 1);
+		v <<= extra;
+		if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	bytes = u / 8;
 	extra = u % 8;
 	size = arg->size;
@@ -1072,6 +1125,12 @@ parse_int(struct context *ctx, const struct token *token,
 		goto error;
 	if (!ctx->object)
 		return len;
+	if (arg->mask) {
+		if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
 objmask:
-- 
2.1.4

* [dpdk-dev] [PATCH v4 17/25] app/testpmd: add item any to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (15 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 18/25] app/testpmd: add various items " Adrien Mazarguil
                             ` (8 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

This pattern item matches any protocol in place of the current layer and
has one property:

- num: number of layers covered.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 72b5297..09e9177 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,8 @@ enum index {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ITEM_ANY_NUM,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -282,7 +284,6 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
-__rte_unused
 static const enum index item_param[] = {
 	ITEM_PARAM_IS,
 	ITEM_PARAM_SPEC,
@@ -296,6 +297,13 @@ static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ZERO,
+};
+
+static const enum index item_any[] = {
+	ITEM_ANY_NUM,
+	ITEM_NEXT,
 	ZERO,
 };
 
@@ -580,6 +588,19 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
 		.call = parse_vc,
 	},
+	[ITEM_ANY] = {
+		.name = "any",
+		.help = "match any protocol for the current layer",
+		.priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+		.next = NEXT(item_any),
+		.call = parse_vc,
+	},
+	[ITEM_ANY_NUM] = {
+		.name = "num",
+		.help = "number of layers covered",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4


* [dpdk-dev] [PATCH v4 18/25] app/testpmd: add various items to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (16 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 19/25] app/testpmd: add item raw " Adrien Mazarguil
                             ` (7 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Add the following pattern items:

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 53 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
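
Lists such as item_vf[] and item_port[] in this patch enumerate the
tokens accepted next and end with ZERO. A minimal sketch of how such a
list can be walked is given here. The enum, the name table and
match_next() are hypothetical simplifications, and the "/" spelling for
ITEM_NEXT is an assumption since that token is not defined in this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the parser's token indices; ZERO ends a list. */
enum index_sketch { ZERO = 0, ITEM_VF_ID, ITEM_NEXT };

/* Hypothetical token names, indexed by enum index_sketch. */
static const char *const name_sketch[] = {
	[ITEM_VF_ID] = "id",
	[ITEM_NEXT] = "/", /* Assumed spelling of the item separator. */
};

/* Candidates accepted after "vf", mirroring item_vf[] in the patch. */
static const enum index_sketch item_vf_sketch[] = {
	ITEM_VF_ID,
	ITEM_NEXT,
	ZERO,
};

/* Return the first candidate whose name equals word, ZERO if none. */
static enum index_sketch
match_next(const enum index_sketch *list, const char *word)
{
	size_t i;

	for (i = 0; list[i] != ZERO; ++i)
		if (!strcmp(name_sketch[list[i]], word))
			return list[i];
	return ZERO;
}
```

With this layout, adding a new property to an item only requires one
more entry in its list, which is what the item_vf[] and item_port[]
hunks do.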

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 09e9177..bc200a7 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -101,6 +101,11 @@ enum index {
 	ITEM_INVERT,
 	ITEM_ANY,
 	ITEM_ANY_NUM,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_VF_ID,
+	ITEM_PORT,
+	ITEM_PORT_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -298,6 +303,9 @@ static const enum index next_item[] = {
 	ITEM_VOID,
 	ITEM_INVERT,
 	ITEM_ANY,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_PORT,
 	ZERO,
 };
 
@@ -307,6 +315,18 @@ static const enum index item_any[] = {
 	ZERO,
 };
 
+static const enum index item_vf[] = {
+	ITEM_VF_ID,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_port[] = {
+	ITEM_PORT_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -601,6 +621,39 @@ static const struct token token_list[] = {
 		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
 	},
+	[ITEM_PF] = {
+		.name = "pf",
+		.help = "match packets addressed to the physical function",
+		.priv = PRIV_ITEM(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_VF] = {
+		.name = "vf",
+		.help = "match packets addressed to a virtual function ID",
+		.priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+		.next = NEXT(item_vf),
+		.call = parse_vc,
+	},
+	[ITEM_VF_ID] = {
+		.name = "id",
+		.help = "destination VF ID",
+		.next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+	},
+	[ITEM_PORT] = {
+		.name = "port",
+		.help = "device-specific physical port index to use",
+		.priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+		.next = NEXT(item_port),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_INDEX] = {
+		.name = "index",
+		.help = "physical port index",
+		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4


* [dpdk-dev] [PATCH v4 19/25] app/testpmd: add item raw to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (17 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 18/25] app/testpmd: add various items " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
                             ` (6 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Matches arbitrary byte strings with properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length (filled in from the pattern string, there is no
  separate keyword for it).
- pattern: byte string to look for.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 208 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)
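
The boolean_name[] table in this patch encodes the truth value in the
parity of the index: even entries mean false, odd entries mean true. A
minimal sketch of that lookup follows. bool_words and bool_from_word()
are hypothetical names. Unlike parse_boolean(), which compares only the
first len characters with strncmp() and therefore appears to accept
abbreviations, this sketch deliberately requires an exact match.

```c
#include <assert.h>
#include <string.h>

/* Same layout as boolean_name[] in the patch: even indices mean false. */
static const char *const bool_words[] = {
	"0", "1",
	"false", "true",
	"no", "yes",
	"N", "Y",
	NULL,
};

/* Return 0 or 1 for a recognized word, -1 otherwise (exact match only). */
static int
bool_from_word(const char *str, size_t len)
{
	size_t i;

	for (i = 0; bool_words[i]; ++i)
		if (strlen(bool_words[i]) == len &&
		    !strncmp(str, bool_words[i], len))
			return (int)(i & 1);
	return -1;
}
```

Keeping false/true pairs adjacent means new spellings can be appended
without touching the lookup code.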

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bc200a7..7161a04 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
 	INTEGER,
 	UNSIGNED,
 	PREFIX,
+	BOOLEAN,
+	STRING,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -106,6 +108,12 @@ enum index {
 	ITEM_VF_ID,
 	ITEM_PORT,
 	ITEM_PORT_INDEX,
+	ITEM_RAW,
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -115,6 +123,13 @@ enum index {
 	ACTION_PASSTHRU,
 };
 
+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -216,6 +231,13 @@ struct token {
 		.size = sizeof(*((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = (sz), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -306,6 +328,7 @@ static const enum index next_item[] = {
 	ITEM_PF,
 	ITEM_VF,
 	ITEM_PORT,
+	ITEM_RAW,
 	ZERO,
 };
 
@@ -327,6 +350,16 @@ static const enum index item_port[] = {
 	ZERO,
 };
 
+static const enum index item_raw[] = {
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -363,11 +396,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
 		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -410,6 +451,20 @@ static const struct token token_list[] = {
 		.call = parse_prefix,
 		.comp = comp_none,
 	},
+	[BOOLEAN] = {
+		.name = "{boolean}",
+		.type = "BOOLEAN",
+		.help = "any boolean value",
+		.call = parse_boolean,
+		.comp = comp_boolean,
+	},
+	[STRING] = {
+		.name = "{string}",
+		.type = "STRING",
+		.help = "fixed string",
+		.call = parse_string,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -654,6 +709,52 @@ static const struct token token_list[] = {
 		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
 	},
+	[ITEM_RAW] = {
+		.name = "raw",
+		.help = "match an arbitrary byte string",
+		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+		.next = NEXT(item_raw),
+		.call = parse_vc,
+	},
+	[ITEM_RAW_RELATIVE] = {
+		.name = "relative",
+		.help = "look for pattern after the previous item",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   relative, 1)),
+	},
+	[ITEM_RAW_SEARCH] = {
+		.name = "search",
+		.help = "search pattern from offset (see also limit)",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   search, 1)),
+	},
+	[ITEM_RAW_OFFSET] = {
+		.name = "offset",
+		.help = "absolute or relative offset for pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
+	},
+	[ITEM_RAW_LIMIT] = {
+		.name = "limit",
+		.help = "search area limit for start of pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
+	},
+	[ITEM_RAW_PATTERN] = {
+		.name = "pattern",
+		.help = "byte string to look for",
+		.next = NEXT(item_raw,
+			     NEXT_ENTRY(STRING),
+			     NEXT_ENTRY(ITEM_PARAM_IS,
+					ITEM_PARAM_SPEC,
+					ITEM_PARAM_MASK)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
+			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
+					    pattern,
+					    ITEM_RAW_PATTERN_SIZE)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1235,6 +1336,96 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a string.
+ *
+ * Two arguments (ctx->args) are retrieved from the stack to store data and
+ * its length (in that order).
+ */
+static int
+parse_string(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg_data = pop_args(ctx);
+	const struct arg *arg_len = pop_args(ctx);
+	char tmp[16]; /* Ought to be enough. */
+	int ret;
+
+	/* Arguments are expected. */
+	if (!arg_data)
+		return -1;
+	if (!arg_len) {
+		push_args(ctx, arg_data);
+		return -1;
+	}
+	size = arg_data->size;
+	/* Bit-mask fill is not supported. */
+	if (arg_data->mask || size < len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	/* Let parse_int() fill length information first. */
+	ret = snprintf(tmp, sizeof(tmp), "%u", len);
+	if (ret < 0)
+		goto error;
+	push_args(ctx, arg_len);
+	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		goto error;
+	}
+	buf = (uint8_t *)ctx->object + arg_data->offset;
+	/* Output buffer is not necessarily NUL-terminated. */
+	memcpy(buf, str, len);
+	memset((uint8_t *)buf + len, 0x55, size - len);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
+	return len;
+error:
+	push_args(ctx, arg_len);
+	push_args(ctx, arg_data);
+	return -1;
+}
+
+/** Boolean values (even indices stand for false). */
+static const char *const boolean_name[] = {
+	"0", "1",
+	"false", "true",
+	"no", "yes",
+	"N", "Y",
+	NULL,
+};
+
+/**
+ * Parse a boolean value.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_boolean(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	for (i = 0; boolean_name[i]; ++i)
+		if (!strncmp(str, boolean_name[i], len))
+			break;
+	/* Process token as integer. */
+	if (boolean_name[i])
+		str = i & 1 ? "1" : "0";
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, strlen(str), buf, size);
+	return ret > 0 ? (int)len : ret;
+}
+
 /** Parse port and update context. */
 static int
 parse_port(struct context *ctx, const struct token *token,
@@ -1273,6 +1464,23 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete boolean values. */
+static int
+comp_boolean(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; boolean_name[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", boolean_name[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete action names. */
 static int
 comp_action(struct context *ctx, const struct token *token,
-- 
2.1.4


* [dpdk-dev] [PATCH v4 20/25] app/testpmd: add items eth/vlan to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (18 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 19/25] app/testpmd: add item raw " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
                             ` (5 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 126 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)
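
For readers without the cmdline library at hand, the conversion done by
parse_mac_addr() can be approximated with a small standalone sketch.
The patch itself relies on cmdline_parse_etheraddr(); the sscanf()
based mac_from_string() below is a hypothetical substitute that only
handles the colon-separated notation and is more lenient about each
byte than a strict parser would be.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse "xx:xx:xx:xx:xx:xx" into six bytes; return 0 on success, else -1. */
static int
mac_from_string(const char *str, uint8_t mac[6])
{
	unsigned int b[6];
	int used = 0;
	int i;

	/* %n records how many characters were consumed. */
	if (sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x%n",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &used) != 6)
		return -1;
	/* Reject trailing characters after the sixth byte. */
	if (str[used] != '\0')
		return -1;
	for (i = 0; i < 6; ++i)
		mac[i] = (uint8_t)b[i];
	return 0;
}
```

MAC addresses are stored as a plain byte sequence, which is consistent
with the dst and src tokens using ARGS_ENTRY() while the 16-bit type,
tpid and tci fields use ARGS_ENTRY_HTON().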

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7161a04..ffa18d3 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include <rte_ethdev.h>
 #include <rte_byteorder.h>
 #include <cmdline_parse.h>
+#include <cmdline_parse_etheraddr.h>
 #include <rte_flow.h>
 
 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
 	PREFIX,
 	BOOLEAN,
 	STRING,
+	MAC_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -114,6 +116,13 @@ enum index {
 	ITEM_RAW_OFFSET,
 	ITEM_RAW_LIMIT,
 	ITEM_RAW_PATTERN,
+	ITEM_ETH,
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_VLAN,
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -238,6 +247,14 @@ struct token {
 		.size = (sz), \
 	})
 
+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -329,6 +346,8 @@ static const enum index next_item[] = {
 	ITEM_VF,
 	ITEM_PORT,
 	ITEM_RAW,
+	ITEM_ETH,
+	ITEM_VLAN,
 	ZERO,
 };
 
@@ -360,6 +379,21 @@ static const enum index item_raw[] = {
 	ZERO,
 };
 
+static const enum index item_eth[] = {
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vlan[] = {
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -402,6 +436,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+			  const char *, unsigned int,
+			  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -465,6 +502,13 @@ static const struct token token_list[] = {
 		.call = parse_string,
 		.comp = comp_none,
 	},
+	[MAC_ADDR] = {
+		.name = "{MAC address}",
+		.type = "MAC-48",
+		.help = "standard MAC address notation",
+		.call = parse_mac_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -755,6 +799,50 @@ static const struct token token_list[] = {
 					    pattern,
 					    ITEM_RAW_PATTERN_SIZE)),
 	},
+	[ITEM_ETH] = {
+		.name = "eth",
+		.help = "match Ethernet header",
+		.priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+		.next = NEXT(item_eth),
+		.call = parse_vc,
+	},
+	[ITEM_ETH_DST] = {
+		.name = "dst",
+		.help = "destination MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+	},
+	[ITEM_ETH_SRC] = {
+		.name = "src",
+		.help = "source MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+	},
+	[ITEM_ETH_TYPE] = {
+		.name = "type",
+		.help = "EtherType",
+		.next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+	},
+	[ITEM_VLAN] = {
+		.name = "vlan",
+		.help = "match 802.1Q/ad VLAN tag",
+		.priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+		.next = NEXT(item_vlan),
+		.call = parse_vc,
+	},
+	[ITEM_VLAN_TPID] = {
+		.name = "tpid",
+		.help = "tag protocol identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
+	},
+	[ITEM_VLAN_TCI] = {
+		.name = "tci",
+		.help = "tag control information",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1388,6 +1476,44 @@ parse_string(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a MAC address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_mac_addr(struct context *ctx, const struct token *token,
+	       const char *str, unsigned int len,
+	       void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	struct ether_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	ret = cmdline_parse_etheraddr(NULL, str, &tmp, size);
+	if (ret < 0 || (unsigned int)ret != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4


* [dpdk-dev] [PATCH v4 21/25] app/testpmd: add items ipv4/ipv6 to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (19 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 22/25] app/testpmd: add L4 items " Adrien Mazarguil
                             ` (4 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)
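
Tokens handed to the parser are not NUL-terminated, which is why
parse_ipv4_addr() and parse_ipv6_addr() copy them before calling
inet_pton(). A minimal standalone sketch of that step follows.
ipv4_from_token() is a hypothetical helper; it uses a fixed-size buffer
where the patch uses a variable-length array, and it does not attempt
the integer fallback.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Parse the first len bytes of str as dotted-quad IPv4; 0 on success. */
static int
ipv4_from_token(const char *str, size_t len, uint8_t out[4])
{
	char tmp[INET_ADDRSTRLEN];
	struct in_addr addr;

	/* Leave room for the terminating NUL. */
	if (len >= sizeof(tmp))
		return -1;
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	if (inet_pton(AF_INET, tmp, &addr) != 1)
		return -1;
	/* inet_pton() already returns network byte order. */
	memcpy(out, &addr, 4);
	return 0;
}
```

Because inet_pton() writes the address in network byte order, no
further conversion is needed, which matches the patch rejecting
arguments that are not flagged hton.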

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index ffa18d3..dfb6d6f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -123,6 +126,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -348,6 +357,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	ZERO,
 };
 
@@ -394,6 +405,20 @@ static const enum index item_vlan[] = {
 	ZERO,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -439,6 +464,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -509,6 +540,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -843,6 +888,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1514,6 +1601,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4


* [dpdk-dev] [PATCH v4 22/25] app/testpmd: add L4 items to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (20 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 23/25] app/testpmd: add various actions " Adrien Mazarguil
                             ` (3 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)
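
The VXLAN network identifier is a 3-byte field, hence the new
sizeof(uint8_t [3]) case in parse_int(). The network-order branch of
that case can be sketched in isolation as below. store_be24() and
load_be24() are hypothetical helper names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 24 bits of value, most significant byte first. */
static void
store_be24(uint8_t dst[3], uint32_t value)
{
	dst[0] = (uint8_t)(value >> 16);
	dst[1] = (uint8_t)(value >> 8);
	dst[2] = (uint8_t)value;
}

/* Read a 24-bit big-endian value back into host order. */
static uint32_t
load_be24(const uint8_t src[3])
{
	return ((uint32_t)src[0] << 16) | ((uint32_t)src[1] << 8) | src[2];
}
```

Writing the bytes one at a time avoids any dependency on host
endianness or alignment, which a 3-byte field cannot get from the
rte_cpu_to_be_16/32 helpers used by the neighbouring cases.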

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index dfb6d6f..e63982f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -132,6 +132,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -359,6 +373,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	ZERO,
 };
 
@@ -419,6 +438,40 @@ static const enum index item_ipv6[] = {
 	ZERO,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -930,6 +983,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1491,6 +1641,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
-- 
2.1.4
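
For readers following the 24-bit case added to parse_int() above (used for the VXLAN VNI), here is a minimal standalone sketch of the network-byte-order path only. The helper names store_be24() and load_be24() are hypothetical and not part of testpmd; the sketch assumes the value fits in 24 bits and silently discards higher bits, as the patch does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store the low 24 bits of u into buf, most
 * significant byte first (network byte order). This mirrors the
 * fall-through path of the uint8_t [3] case in parse_int().
 */
static void
store_be24(uint8_t buf[3], uint32_t u)
{
	buf[0] = (uint8_t)(u >> 16);
	buf[1] = (uint8_t)(u >> 8);
	buf[2] = (uint8_t)u;
}

/* Hypothetical helper: read the value back from network byte order. */
static uint32_t
load_be24(const uint8_t buf[3])
{
	return (uint32_t)buf[0] << 16 | (uint32_t)buf[1] << 8 | buf[2];
}
```

Writing the bytes one at a time avoids any dependency on host endianness for the big-endian result, which is presumably why only the little-endian, non-hton case needs a separate branch in the patch.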


* [dpdk-dev] [PATCH v4 23/25] app/testpmd: add various actions to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (21 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 22/25] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 24/25] app/testpmd: add queue " Adrien Mazarguil
                             ` (2 subsequent siblings)
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 121 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e63982f..d0b6754 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -153,6 +153,15 @@ enum index {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_MARK_ID,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
 };
 
 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -476,6 +485,25 @@ static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ZERO,
+};
+
+static const enum index action_mark[] = {
+	ACTION_MARK_ID,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_vf[] = {
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
+	ACTION_NEXT,
 	ZERO,
 };
 
@@ -487,6 +515,8 @@ static int parse_vc(struct context *, const struct token *,
 		    void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1112,6 +1142,70 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_MARK] = {
+		.name = "mark",
+		.help = "attach 32 bit value to packets",
+		.priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+		.next = NEXT(action_mark),
+		.call = parse_vc,
+	},
+	[ACTION_MARK_ID] = {
+		.name = "id",
+		.help = "32 bit value to return with packets",
+		.next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_FLAG] = {
+		.name = "flag",
+		.help = "flag packets",
+		.priv = PRIV_ACTION(FLAG, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_DROP] = {
+		.name = "drop",
+		.help = "drop packets (note: passthru has priority)",
+		.priv = PRIV_ACTION(DROP, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_COUNT] = {
+		.name = "count",
+		.help = "enable counters for this rule",
+		.priv = PRIV_ACTION(COUNT, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PF] = {
+		.name = "pf",
+		.help = "redirect packets to physical device function",
+		.priv = PRIV_ACTION(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_VF] = {
+		.name = "vf",
+		.help = "redirect packets to virtual device function",
+		.priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+		.next = NEXT(action_vf),
+		.call = parse_vc,
+	},
+	[ACTION_VF_ORIGINAL] = {
+		.name = "original",
+		.help = "use original VF ID if possible",
+		.next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+					   original, 1)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_VF_ID] = {
+		.name = "id",
+		.help = "VF ID to redirect packets to",
+		.next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -1435,6 +1529,33 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse action configuration field. */
+static int
+parse_vc_conf(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Update configuration pointer. */
+	action->conf = ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
-- 
2.1.4
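
The idea behind parse_vc_conf() above can be sketched in isolation: the most recently added action gets its conf pointer set to the storage that subsequent field tokens fill in. The types and the function name below are hypothetical, simplified stand-ins and not the real rte_flow or testpmd definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for rte_flow action types. */
enum sketch_action_type { SKETCH_ACTION_END, SKETCH_ACTION_MARK };

struct sketch_action {
	enum sketch_action_type type;
	const void *conf; /* NULL until a configuration field is parsed */
};

struct sketch_action_mark {
	uint32_t id;
};

/*
 * Attach configuration storage to the last action of the list.
 * Returns -1 when the list is empty, as parse_vc_conf() does when
 * actions_n is zero.
 */
static int
sketch_set_conf(struct sketch_action *actions, unsigned int actions_n,
		void *data)
{
	if (!actions_n)
		return -1;
	actions[actions_n - 1].conf = data;
	return 0;
}
```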


* [dpdk-dev] [PATCH v4 24/25] app/testpmd: add queue actions to flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (22 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 23/25] app/testpmd: add various actions " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 25/25] doc: describe testpmd " Adrien Mazarguil
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 152 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index d0b6754..145838e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -156,8 +156,15 @@ enum index {
 	ACTION_MARK,
 	ACTION_MARK_ID,
 	ACTION_FLAG,
+	ACTION_QUEUE,
+	ACTION_QUEUE_INDEX,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_DUP_INDEX,
+	ACTION_RSS,
+	ACTION_RSS_QUEUES,
+	ACTION_RSS_QUEUE,
 	ACTION_PF,
 	ACTION_VF,
 	ACTION_VF_ORIGINAL,
@@ -171,6 +178,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
 
+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+	(offsetof(struct rte_flow_action_rss, queue) + \
+	 sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -487,8 +502,11 @@ static const enum index next_action[] = {
 	ACTION_PASSTHRU,
 	ACTION_MARK,
 	ACTION_FLAG,
+	ACTION_QUEUE,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_RSS,
 	ACTION_PF,
 	ACTION_VF,
 	ZERO,
@@ -500,6 +518,24 @@ static const enum index action_mark[] = {
 	ZERO,
 };
 
+static const enum index action_queue[] = {
+	ACTION_QUEUE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_dup[] = {
+	ACTION_DUP_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_rss[] = {
+	ACTION_RSS_QUEUES,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static const enum index action_vf[] = {
 	ACTION_VF_ORIGINAL,
 	ACTION_VF_ID,
@@ -517,6 +553,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+				     const char *, unsigned int, void *,
+				     unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -566,6 +605,8 @@ static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
 			unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+				    unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1163,6 +1204,21 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_QUEUE] = {
+		.name = "queue",
+		.help = "assign packets to a given queue index",
+		.priv = PRIV_ACTION(QUEUE,
+				    sizeof(struct rte_flow_action_queue)),
+		.next = NEXT(action_queue),
+		.call = parse_vc,
+	},
+	[ACTION_QUEUE_INDEX] = {
+		.name = "index",
+		.help = "queue index to use",
+		.next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_DROP] = {
 		.name = "drop",
 		.help = "drop packets (note: passthru has priority)",
@@ -1177,6 +1233,39 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_DUP] = {
+		.name = "dup",
+		.help = "duplicate packets to a given queue index",
+		.priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+		.next = NEXT(action_dup),
+		.call = parse_vc,
+	},
+	[ACTION_DUP_INDEX] = {
+		.name = "index",
+		.help = "queue index to duplicate packets to",
+		.next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_dup, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS] = {
+		.name = "rss",
+		.help = "spread packets among several queues",
+		.priv = PRIV_ACTION(RSS, ACTION_RSS_SIZE),
+		.next = NEXT(action_rss),
+		.call = parse_vc,
+	},
+	[ACTION_RSS_QUEUES] = {
+		.name = "queues",
+		.help = "queue indices to use",
+		.next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS_QUEUE] = {
+		.name = "{queue}",
+		.help = "queue index",
+		.call = parse_vc_action_rss_queue,
+		.comp = comp_vc_action_rss_queue,
+	},
 	[ACTION_PF] = {
 		.name = "pf",
 		.help = "redirect packets to physical device function",
@@ -1556,6 +1645,51 @@ parse_vc_conf(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/**
+ * Parse queue field for RSS action.
+ *
+ * Valid tokens are queue indices and the "end" token.
+ */
+static int
+parse_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	static const enum index next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
+	int ret;
+	int i;
+
+	(void)token;
+	(void)buf;
+	(void)size;
+	if (ctx->curr != ACTION_RSS_QUEUE)
+		return -1;
+	i = ctx->objdata >> 16;
+	if (!strncmp(str, "end", len)) {
+		ctx->objdata &= 0xffff;
+		return len;
+	}
+	if (i >= ACTION_RSS_NUM)
+		return -1;
+	if (push_args(ctx, ARGS_ENTRY(struct rte_flow_action_rss, queue[i])))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	++i;
+	ctx->objdata = i << 16 | (ctx->objdata & 0xffff);
+	/* Repeat token. */
+	if (ctx->next_num == RTE_DIM(ctx->next))
+		return -1;
+	ctx->next[ctx->next_num++] = next;
+	if (!ctx->object)
+		return len;
+	((struct rte_flow_action_rss *)ctx->object)->num = i;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -2130,6 +2264,24 @@ comp_rule_id(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete queue field for RSS action. */
+static int
+comp_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			 unsigned int ent, char *buf, unsigned int size)
+{
+	static const char *const str[] = { "", "end", NULL };
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; str[i] != NULL; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", str[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
-- 
2.1.4
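
Two techniques from the patch above may be easier to follow in isolation: sizing storage for a structure that ends with a flexible array, and keeping the running queue count in the upper 16 bits of objdata while preserving the lower 16 bits. This is only a sketch; struct sketch_rss and the helper names are hypothetical, and the real struct rte_flow_action_rss has a different layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Same capacity as ACTION_RSS_NUM in the patch. */
#define SKETCH_RSS_NUM 32

/* Hypothetical stand-in for the RSS action structure. */
struct sketch_rss {
	uint16_t num;     /* number of valid entries in queue[] */
	uint16_t queue[]; /* flexible array member */
};

/* Storage for the fixed part plus SKETCH_RSS_NUM queue entries. */
#define SKETCH_RSS_SIZE \
	(offsetof(struct sketch_rss, queue) + \
	 sizeof(((struct sketch_rss *)0)->queue[0]) * SKETCH_RSS_NUM)

/* Read the running index stored in the upper 16 bits. */
static unsigned int
objdata_index(uint32_t objdata)
{
	return objdata >> 16;
}

/* Store a new index in the upper 16 bits, keeping the lower 16 bits. */
static uint32_t
objdata_set_index(uint32_t objdata, unsigned int i)
{
	return (uint32_t)i << 16 | (objdata & 0xffff);
}

/* Drop the index once the "end" token is seen. */
static uint32_t
objdata_clear_index(uint32_t objdata)
{
	return objdata & 0xffff;
}
```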


* [dpdk-dev] [PATCH v4 25/25] doc: describe testpmd flow command
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (23 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 24/25] app/testpmd: add queue " Adrien Mazarguil
@ 2016-12-20 18:42           ` Adrien Mazarguil
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
  25 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-20 18:42 UTC (permalink / raw)
  To: dev

Document syntax, interaction with rte_flow and provide usage examples.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 612 +++++++++++++++++++++++
 1 file changed, 612 insertions(+)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index f1c269a..8defb88 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1631,6 +1631,9 @@ Filter Functions
 
 This section details the available filter functions that are available.
 
+Note that these functions interface with the deprecated legacy filtering
+framework, superseded by *rte_flow*. See `Flow rules management`_.
+
 ethertype_filter
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -2041,3 +2044,612 @@ Set different GRE key length for input set::
 For example to set GRE key length for input set to 4 bytes on port 0::
 
    testpmd> global_config 0 gre-key-len 4
+
+
+.. _testpmd_rte_flow:
+
+Flow rules management
+---------------------
+
+Control of the generic flow API (*rte_flow*) is fully exposed through the
+``flow`` command (validation, creation, destruction and queries).
+
+Considering *rte_flow* overlaps with all `Filter Functions`_, using both
+features simultaneously may cause undefined side-effects and is therefore
+not recommended.
+
+``flow`` syntax
+~~~~~~~~~~~~~~~
+
+Because the ``flow`` command uses dynamic tokens to handle the large number
+of possible flow rule combinations, its behavior differs slightly from
+other commands, in particular:
+
+- Pressing *?* or the *<tab>* key displays contextual help for the current
+  token, not that of the entire command.
+
+- Optional and repeated parameters are supported (provided they are listed
+  in the contextual help).
+
+The first parameter stands for the operation mode. Possible operations and
+their general syntax are described below. They are covered in detail in the
+following sections.
+
+- Check whether a flow rule can be created::
+
+   flow validate {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Create a flow rule::
+
+   flow create {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Destroy specific flow rules::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+- Destroy all flow rules::
+
+   flow flush {port_id}
+
+- Query an existing flow rule::
+
+   flow query {port_id} {rule_id} {action}
+
+- List existing flow rules sorted by priority, filtered by group
+  identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+Validating flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow validate`` reports whether a flow rule would be accepted by the
+underlying device in its current state but stops short of creating it. It is
+bound to ``rte_flow_validate()``::
+
+   flow validate {port_id}
+      [group {group_id}] [priority {level}] [ingress] [egress]
+      pattern {item} [/ {item} [...]] / end
+      actions {action} [/ {action} [...]] / end
+
+If successful, it will show::
+
+   Flow rule validated
+
+Otherwise it will show an error message of the form::
+
+   Caught error type [...] ([...]): [...]
+
+This command uses the same parameters as ``flow create``; their format is
+described in `Creating flow rules`_.
+
+Check whether redirecting any Ethernet packet received on port 0 to RX queue
+index 6 is supported::
+
+   testpmd> flow validate 0 ingress pattern eth / end
+      actions queue index 6 / end
+   Flow rule validated
+   testpmd>
+
+Port 0 does not support TCPv6 rules::
+
+   testpmd> flow validate 0 ingress pattern eth / ipv6 / tcp / end
+      actions drop / end
+   Caught error type 9 (specific pattern item): Invalid argument.
+   testpmd>
+
+Creating flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow create`` validates and creates the specified flow rule. It is bound
+to ``rte_flow_create()``::
+
+   flow create {port_id}
+      [group {group_id}] [priority {level}] [ingress] [egress]
+      pattern {item} [/ {item} [...]] / end
+      actions {action} [/ {action} [...]] / end
+
+If successful, it will return a flow rule ID usable with other commands::
+
+   Flow rule #[...] created
+
+Otherwise it will show an error message of the form::
+
+   Caught error type [...] ([...]): [...]
+
+Parameters are specified in the following order:
+
+- Attributes (*group*, *priority*, *ingress*, *egress* tokens).
+- A matching pattern, starting with the *pattern* token and terminated by an
+  *end* pattern item.
+- Actions, starting with the *actions* token and terminated by an *end*
+  action.
+
+These translate directly to *rte_flow* objects provided as-is to the
+underlying functions.
+
+The shortest valid definition only comprises mandatory tokens::
+
+   testpmd> flow create 0 pattern end actions end
+
+Note that PMDs may refuse rules that essentially do nothing such as this
+one.
+
+**All unspecified object values are automatically initialized to 0.**
+
+Attributes
+^^^^^^^^^^
+
+These tokens affect flow rule attributes (``struct rte_flow_attr``) and are
+specified before the ``pattern`` token.
+
+- ``group {group_id}``: priority group.
+- ``priority {level}``: priority level within group.
+- ``ingress``: rule applies to ingress traffic.
+- ``egress``: rule applies to egress traffic.
+
+Each instance of an attribute specified several times overrides the previous
+value as shown below (group 4 is used)::
+
+   testpmd> flow create 0 group 42 group 24 group 4 [...]
+
+Note that once enabled, ``ingress`` and ``egress`` cannot be disabled.
+
+While not specifying a direction is an error, some rules may allow both
+simultaneously.
+
+Most rules affect RX and therefore contain the ``ingress`` token::
+
+   testpmd> flow create 0 ingress pattern [...]
+
+Matching pattern
+^^^^^^^^^^^^^^^^
+
+A matching pattern starts after the ``pattern`` token. It is made of pattern
+items and is terminated by a mandatory ``end`` item.
+
+Items are named after their type (*RTE_FLOW_ITEM_TYPE_* from ``enum
+rte_flow_item_type``).
+
+The ``/`` token is used as a separator between pattern items as shown
+below::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end [...]
+
+Note that protocol items like these must be stacked from lowest to highest
+layer to make sense. For instance, the following rule is either invalid or
+unlikely to match any packet::
+
+   testpmd> flow create 0 ingress pattern eth / udp / ipv4 / end [...]
+
+More information on these restrictions can be found in the *rte_flow*
+documentation.
+
+Several items support additional specification structures, for example
+``ipv4`` allows specifying source and destination addresses as follows::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 src is 10.1.1.1
+      dst is 10.2.0.0 / end [...]
+
+This rule matches all IPv4 traffic with the specified properties.
+
+In this example, ``src`` and ``dst`` are field names of the underlying
+``struct rte_flow_item_ipv4`` object. All item properties can be specified
+in a similar fashion.
+
+The ``is`` token means that the subsequent value must be matched exactly,
+and assigns ``spec`` and ``mask`` fields in ``struct rte_flow_item``
+accordingly. Possible assignment tokens are:
+
+- ``is``: match value perfectly (with full bit-mask).
+- ``spec``: match value according to configured bit-mask.
+- ``last``: specify upper bound to establish a range.
+- ``mask``: specify bit-mask with relevant bits set to one.
+- ``prefix``: generate bit-mask from a prefix length.
+
+These yield identical results::
+
+   ipv4 src is 10.1.1.1
+
+::
+
+   ipv4 src spec 10.1.1.1 src mask 255.255.255.255
+
+::
+
+   ipv4 src spec 10.1.1.1 src prefix 32
+
+::
+
+   ipv4 src is 10.1.1.1 src last 10.1.1.1 # range with a single value
+
+::
+
+   ipv4 src is 10.1.1.1 src last 0 # 0 disables range
+
+Inclusive ranges can be defined with ``last``::
+
+   ipv4 src is 10.1.1.1 src last 10.2.3.4 # 10.1.1.1 to 10.2.3.4
+
+Note that ``mask`` affects both ``spec`` and ``last``::
+
+   ipv4 src is 10.1.1.1 src last 10.2.3.4 src mask 255.255.0.0
+      # matches 10.1.0.0 to 10.2.255.255
+
+Properties can be modified multiple times::
+
+   ipv4 src is 10.1.1.1 src is 10.1.2.3 src is 10.2.3.4 # matches 10.2.3.4
+
+::
+
+   ipv4 src is 10.1.1.1 src prefix 24 src prefix 16 # matches 10.1.0.0/16
+
+Pattern items
+^^^^^^^^^^^^^
+
+This section lists supported pattern items and their attributes, if any.
+
+- ``end``: end list of pattern items.
+
+- ``void``: no-op pattern item.
+
+- ``invert``: perform actions when pattern does not match.
+
+- ``any``: match any protocol for the current layer.
+
+  - ``num {unsigned}``: number of layers covered.
+
+- ``pf``: match packets addressed to the physical function.
+
+- ``vf``: match packets addressed to a virtual function ID.
+
+  - ``id {unsigned}``: destination VF ID.
+
+- ``port``: device-specific physical port index to use.
+
+  - ``index {unsigned}``: physical port index.
+
+- ``raw``: match an arbitrary byte string.
+
+  - ``relative {boolean}``: look for pattern after the previous item.
+  - ``search {boolean}``: search pattern from offset (see also limit).
+  - ``offset {integer}``: absolute or relative offset for pattern.
+  - ``limit {unsigned}``: search area limit for start of pattern.
+  - ``pattern {string}``: byte string to look for.
+
+- ``eth``: match Ethernet header.
+
+  - ``dst {MAC-48}``: destination MAC.
+  - ``src {MAC-48}``: source MAC.
+  - ``type {unsigned}``: EtherType.
+
+- ``vlan``: match 802.1Q/ad VLAN tag.
+
+  - ``tpid {unsigned}``: tag protocol identifier.
+  - ``tci {unsigned}``: tag control information.
+
+- ``ipv4``: match IPv4 header.
+
+  - ``src {ipv4 address}``: source address.
+  - ``dst {ipv4 address}``: destination address.
+
+- ``ipv6``: match IPv6 header.
+
+  - ``src {ipv6 address}``: source address.
+  - ``dst {ipv6 address}``: destination address.
+
+- ``icmp``: match ICMP header.
+
+  - ``type {unsigned}``: ICMP packet type.
+  - ``code {unsigned}``: ICMP packet code.
+
+- ``udp``: match UDP header.
+
+  - ``src {unsigned}``: UDP source port.
+  - ``dst {unsigned}``: UDP destination port.
+
+- ``tcp``: match TCP header.
+
+  - ``src {unsigned}``: TCP source port.
+  - ``dst {unsigned}``: TCP destination port.
+
+- ``sctp``: match SCTP header.
+
+  - ``src {unsigned}``: SCTP source port.
+  - ``dst {unsigned}``: SCTP destination port.
+
+- ``vxlan``: match VXLAN header.
+
+  - ``vni {unsigned}``: VXLAN identifier.
+
+Actions list
+^^^^^^^^^^^^
+
+A list of actions starts after the ``actions`` token in the same fashion as
+`Matching pattern`_; actions are separated by ``/`` tokens and the list is
+terminated by a mandatory ``end`` action.
+
+Actions are named after their type (*RTE_FLOW_ACTION_TYPE_* from ``enum
+rte_flow_action_type``).
+
+Dropping all incoming UDPv4 packets can be expressed as follows::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions drop / end
+
+Several actions have configurable properties which must be specified when
+there is no valid default value. For example, ``queue`` requires a target
+queue index.
+
+This rule redirects incoming UDPv4 traffic to queue index 6::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions queue index 6 / end
+
+While this one could be rejected by PMDs (unspecified queue index)::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions queue / end
+
+As defined by *rte_flow*, the list is not ordered; all actions of a given
+rule are performed simultaneously. These are equivalent::
+
+   queue index 6 / void / mark id 42 / end
+
+::
+
+   void / mark id 42 / queue index 6 / end
+
+All actions in a list should have different types, otherwise only the last
+action of a given type is taken into account::
+
+   queue index 4 / queue index 5 / queue index 6 / end # will use queue 6
+
+::
+
+   drop / drop / drop / end # drop is performed only once
+
+::
+
+   mark id 42 / queue index 3 / mark id 24 / end # mark will be 24
+
+Considering they are performed simultaneously, opposite and overlapping
+actions can sometimes be combined when the end result is unambiguous::
+
+   drop / queue index 6 / end # drop has no effect
+
+::
+
+   drop / dup index 6 / end # same as above
+
+::
+
+   queue index 6 / rss queues 6 7 8 end / end # queue has no effect
+
+::
+
+   drop / passthru / end # drop has no effect
+
+Note that PMDs may still refuse such combinations.
+
+Actions
+^^^^^^^
+
+This section lists supported actions and their attributes, if any.
+
+- ``end``: end list of actions.
+
+- ``void``: no-op action.
+
+- ``passthru``: let subsequent rule process matched packets.
+
+- ``mark``: attach 32 bit value to packets.
+
+  - ``id {unsigned}``: 32 bit value to return with packets.
+
+- ``flag``: flag packets.
+
+- ``queue``: assign packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to use.
+
+- ``drop``: drop packets (note: passthru has priority).
+
+- ``count``: enable counters for this rule.
+
+- ``dup``: duplicate packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to duplicate packets to.
+
+- ``rss``: spread packets among several queues.
+
+  - ``queues [{unsigned} [...]] end``: queue indices to use.
+
+- ``pf``: redirect packets to physical device function.
+
+- ``vf``: redirect packets to virtual device function.
+
+  - ``original {boolean}``: use original VF ID if possible.
+  - ``id {unsigned}``: VF ID to redirect packets to.
+
+Destroying flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow destroy`` destroys one or more rules from their rule ID (as returned
+by ``flow create``). This command calls ``rte_flow_destroy()`` as many
+times as necessary::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+If successful, it will show::
+
+   Flow rule #[...] destroyed
+
+It does not report anything for rule IDs that do not exist. The usual error
+message is shown when a rule cannot be destroyed::
+
+   Caught error type [...] ([...]): [...]
+
+``flow flush`` destroys all rules on a device and does not take extra
+arguments. It is bound to ``rte_flow_flush()``::
+
+   flow flush {port_id}
+
+Any errors are reported as above.
+
+Creating several rules and destroying them::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow destroy 0 rule 0 rule 1
+   Flow rule #1 destroyed
+   Flow rule #0 destroyed
+   testpmd>
+
+The same result can be achieved using ``flow flush``::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow flush 0
+   testpmd>
+
+Non-existent rule IDs are ignored::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow destroy 0 rule 42 rule 10 rule 2
+   testpmd>
+   testpmd> flow destroy 0 rule 0
+   Flow rule #0 destroyed
+   testpmd>
+
+Querying flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow query`` queries a specific action of a flow rule having that
+ability. Such actions collect information that can be reported using this
+command. It is bound to ``rte_flow_query()``::
+
+   flow query {port_id} {rule_id} {action}
+
+If successful, it will display either the retrieved data for known actions
+or the following message::
+
+   Cannot display result for action type [...] ([...])
+
+Otherwise, it will complain either that the rule does not exist or that some
+error occurred::
+
+   Flow rule #[...] not found
+
+::
+
+   Caught error type [...] ([...]): [...]
+
+Currently only the ``count`` action is supported. This action reports the
+number of packets that hit the flow rule and the total number of bytes. Its
+output has the following format::
+
+   count:
+    hits_set: [...] # whether "hits" contains a valid value
+    bytes_set: [...] # whether "bytes" contains a valid value
+    hits: [...] # number of packets
+    bytes: [...] # number of bytes
+
+Querying counters for TCPv6 packets redirected to queue 6::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / tcp / end
+      actions queue index 6 / count / end
+   Flow rule #4 created
+   testpmd> flow query 0 4 count
+   count:
+    hits_set: 1
+    bytes_set: 0
+    hits: 386446
+    bytes: 0
+   testpmd>
+
+Listing flow rules
+~~~~~~~~~~~~~~~~~~
+
+``flow list`` lists existing flow rules sorted by priority and optionally
+filtered by group identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+This command fails only if the device does not exist, in which case it shows
+the following message::
+
+   Invalid port [...]
+
+Output consists of a header line followed by a short description of each
+flow rule, one per line. There is no output at all when no flow rules are
+configured on the device::
+
+   ID      Group   Prio    Attr    Rule
+   [...]   [...]   [...]   [...]   [...]
+
+``Attr`` column flags:
+
+- ``i`` for ``ingress``.
+- ``e`` for ``egress``.
+
+Creating several flow rules and listing them::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 6 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #1 created
+   testpmd> flow create 0 priority 5 ingress pattern eth / ipv4 / udp / end
+      actions rss queues 6 7 8 end / end
+   Flow rule #2 created
+   testpmd> flow list 0
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH IPV4 => QUEUE
+   1       0       0       i-      ETH IPV6 => QUEUE
+   2       0       5       i-      ETH IPV4 UDP => RSS
+   testpmd>
+
+Rules are sorted by priority (i.e. group ID first, then priority level)::
+
+   testpmd> flow list 1
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH => COUNT
+   6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+   5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+   1       24      0       i-      ETH IPV4 UDP => QUEUE
+   4       24      10      i-      ETH IPV4 TCP => DROP
+   3       24      20      i-      ETH IPV4 => DROP
+   2       24      42      i-      ETH IPV4 UDP => QUEUE
+   7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+   testpmd>
+
+Output can be limited to specific groups::
+
+   testpmd> flow list 1 group 0 group 63
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH => COUNT
+   6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+   5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+   7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+   testpmd>
-- 
2.1.4


* Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow
  2016-12-20  9:38               ` Adrien Mazarguil
@ 2016-12-21  5:23                 ` Xing, Beilei
  0 siblings, 0 replies; 262+ messages in thread
From: Xing, Beilei @ 2016-12-21  5:23 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Pei, Yulong



> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Tuesday, December 20, 2016 5:38 PM
> To: Xing, Beilei <beilei.xing@intel.com>
> Cc: dev@dpdk.org; Pei, Yulong <yulong.pei@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic
> support for rte_flow
> 
> On Tue, Dec 20, 2016 at 01:57:46AM +0000, Xing, Beilei wrote:
> > > -----Original Message-----
> > > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > > Sent: Monday, December 19, 2016 6:20 PM
> > > To: Xing, Beilei <beilei.xing@intel.com>
> > > Cc: dev@dpdk.org; Pei, Yulong <yulong.pei@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement
> > > basic support for rte_flow
> > >
> > > Hi Beilei,
> > >
> > > On Mon, Dec 19, 2016 at 08:37:20AM +0000, Xing, Beilei wrote:
> > > > Hi Adrien,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> > > > > Mazarguil
> > > > > Sent: Saturday, December 17, 2016 12:25 AM
> > > > > To: dev@dpdk.org
> > > > > Subject: [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement
> > > > > basic support for rte_flow
> > > > >
> > > > > Add basic management functions for the generic flow API
> > > > > (validate, create, destroy, flush, query and list). Flow rule
> > > > > objects and properties are arranged in lists associated with each port.
> > > > >
> > > > > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > +/** Create flow rule. */
> > > > > +int
> > > > > +port_flow_create(portid_t port_id,
> > > > > +		 const struct rte_flow_attr *attr,
> > > > > +		 const struct rte_flow_item *pattern,
> > > > > +		 const struct rte_flow_action *actions) {
> > > > > +	struct rte_flow *flow;
> > > > > +	struct rte_port *port;
> > > > > +	struct port_flow *pf;
> > > > > +	uint32_t id;
> > > > > +	struct rte_flow_error error;
> > > > > +
> > > >
> > > > I think there should be memset for error here, e.g. memset(&error,
> > > > 0, sizeof(struct rte_flow_error)); Since both cause and message
> > > > may be NULL regardless of the error type, if there's no error.cause
> > > > and error.message returned from PMD, Segmentation fault will happen
> > > > in port_flow_complain.
> > > > PS: This issue doesn't happen if add "export EXTRA_CFLAGS=' -g
> > > > O0'" when compiling.
> > >
> > > Actually, PMDs must fill the error structure only in case of error
> > > if the application provides one, it's not optional. I didn't
> > > initialize this structure for this reason.
> > >
> > > I suggest we initialize it with a known poisoning value for
> > > debugging purposes though, to make it fail every time. Does it sound
> > > reasonable?
> 
> Done for v3 by the way.
> 
> > OK, I see. Do you want PMD to allocate the memory for cause and message
> > of error, and must fill the cause and message if error exists, right?
> > So is it possible to set NULL for pointers of cause and message in
> > application? then PMD can judge if it's need to allocate or overlap
> > memory.
> 
> PMDs never allocate this structure, applications do and initialize it however
> they want. They only provide a non-NULL pointer if they want additional
> details in case of error.
> 
> It will likely be allocated on the stack in most cases (as in testpmd).
> 
> From a PMD standpoint, the contents of this structure must be updated in
> case of non-NULL pointer and error state.
> 
> Now the message pointer can be allocated dynamically but it's not
> recommended, it's far easier to make it point to some constant string.
> Applications won't free it anyway, so PMDs would have to do it during
> dev_close(). Please see "Verbose error reporting" documentation section.

Got it, thanks. Seems the rte_flow_error_set function can be used for PMD.
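
To make the contract discussed above concrete, here is a minimal sketch of
it, under stated assumptions. It deliberately does not use the DPDK headers:
struct demo_flow_error, demo_error_poison(), demo_error_set() and
demo_pmd_validate() are hypothetical stand-ins invented for illustration,
and the poison byte is arbitrary. It only shows the application allocating
and poisoning the structure, and the driver updating it through a helper
when the pointer is non-NULL, with the message pointing to a constant string.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_flow_error (not the DPDK type). */
enum demo_error_type {
	DEMO_ERROR_TYPE_NONE,
	DEMO_ERROR_TYPE_UNSPECIFIED,
};

struct demo_flow_error {
	enum demo_error_type type;
	const void *cause;   /* offending object, may be NULL */
	const char *message; /* human-readable text, may be NULL */
};

/* Application side: poison the structure so a missing update stands out. */
static void
demo_error_poison(struct demo_flow_error *error)
{
	memset(error, 0x22, sizeof(*error));
}

/* Driver side: fill the structure if one was provided, return -code. */
static int
demo_error_set(struct demo_flow_error *error, int code,
	       enum demo_error_type type, const void *cause,
	       const char *message)
{
	assert(code > 0);
	if (error != NULL) {
		*error = (struct demo_flow_error){
			.type = type,
			.cause = cause,
			.message = message,
		};
	}
	return -code;
}

/* Hypothetical driver callback that rejects every rule. */
static int
demo_pmd_validate(struct demo_flow_error *error)
{
	/* Constant string: nothing to allocate, nothing to free later. */
	return demo_error_set(error, ENOSYS, DEMO_ERROR_TYPE_UNSPECIFIED,
			      NULL, "rule not supported");
}
```

An application would typically declare the structure on the stack, call
demo_error_poison() on it, pass its address to the driver and read the
fields only when the call reports an error. Passing NULL instead simply
skips the extra details.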

Regards,
Beilei


* Re: [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-21 10:55             ` Mcnamara, John
  2016-12-21 11:31               ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Mcnamara, John @ 2016-12-21 10:55 UTC (permalink / raw)
  To: Adrien Mazarguil, dev

Hi Adrien,

You missed the changes I suggested to table_rte_flow_migration_fdir, which
means that the PDF build is still broken. The changes to
table_rte_flow_migration_l2tunnel also break the PDF build.

You can test as follows:

    make doc-guides-pdf -j

I'd suggest the following tables which retain the information and format
that you are trying to achieve but which should compile:


.. _table_rte_flow_migration_fdir:

.. table:: FDIR conversion

   +----------------------------------------+-----------------------+
   | Pattern                                | Actions               |
   +===+===================+==========+=====+=======================+
   | 0 | ETH, RAW          | ``spec`` | any | QUEUE, DROP, PASSTHRU |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+-----------------------+
   | 1 | IPV4, IPV6        | ``spec`` | any | MARK                  |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+-----------------------+
   | 2 | TCP, UDP, SCTP    | ``spec`` | any | END                   |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+                       |
   | 3 | VF, PF (optional) | ``spec`` | any |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``last`` | N/A |                       |
   |   |                   +----------+-----+                       |
   |   |                   | ``mask`` | any |                       |
   +---+-------------------+----------+-----+                       |
   | 4 | END                                |                       |
   +---+------------------------------------+-----------------------+


.. _table_rte_flow_migration_l2tunnel:

.. table:: L2_TUNNEL conversion

   +---------------------------+--------------------+
   | Pattern                   | Actions            |
   +===+======+==========+=====+====================+
   | 0 | VOID | ``spec`` | N/A | VXLAN, GENEVE, ... |
   |   |      |          |     |                    |
   |   |      |          |     |                    |
   |   |      +----------+-----+                    |
   |   |      | ``last`` | N/A |                    |
   |   |      +----------+-----+                    |
   |   |      | ``mask`` | N/A |                    |
   |   |      |          |     |                    |
   +---+------+----------+-----+--------------------+
   | 1 | END                   | VF (optional)      |
   +---+                       +--------------------+
   | 2 |                       | END                |
   +---+-----------------------+--------------------+



* Re: [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide
  2016-12-21 10:55             ` Mcnamara, John
@ 2016-12-21 11:31               ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 11:31 UTC (permalink / raw)
  To: Mcnamara, John; +Cc: dev

Hi John,

On Wed, Dec 21, 2016 at 10:55:39AM +0000, Mcnamara, John wrote:
> Hi Adrien,
> 
> You missed the changes I suggested to table_rte_flow_migration_fdir, which
> means that the PDF build is still broken. The changes to
> table_rte_flow_migration_l2tunnel also break the PDF build.
> 
> You can test as follows:
> 
>     make doc-guides-pdf -j

OK. I'm unable to generate PDF documentation on my setup even though I've
installed the dependencies, so I usually rely on HTML output only. That's
why I did not see the remaining issue.

> I'd suggest the following tables which retain the information and format
> that you are trying to achieve but which should compile:
> 
> .. _table_rte_flow_migration_fdir:
> 
> .. table:: FDIR conversion
> 
>    +----------------------------------------+-----------------------+
>    | Pattern                                | Actions               |
>    +===+===================+==========+=====+=======================+
>    | 0 | ETH, RAW          | ``spec`` | any | QUEUE, DROP, PASSTHRU |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``last`` | N/A |                       |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``mask`` | any |                       |
>    +---+-------------------+----------+-----+-----------------------+
>    | 1 | IPV4, IPV6        | ``spec`` | any | MARK                  |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``last`` | N/A |                       |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``mask`` | any |                       |
>    +---+-------------------+----------+-----+-----------------------+
>    | 2 | TCP, UDP, SCTP    | ``spec`` | any | END                   |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``last`` | N/A |                       |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``mask`` | any |                       |
>    +---+-------------------+----------+-----+                       |
>    | 3 | VF, PF (optional) | ``spec`` | any |                       |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``last`` | N/A |                       |
>    |   |                   +----------+-----+                       |
>    |   |                   | ``mask`` | any |                       |
>    +---+-------------------+----------+-----+                       |
>    | 4 | END                                |                       |
>    +---+------------------------------------+-----------------------+
> 
> 
> .. _table_rte_flow_migration_l2tunnel:
> 
> .. table:: L2_TUNNEL conversion
> 
>    +---------------------------+--------------------+
>    | Pattern                   | Actions            |
>    +===+======+==========+=====+====================+
>    | 0 | VOID | ``spec`` | N/A | VXLAN, GENEVE, ... |
>    |   |      |          |     |                    |
>    |   |      |          |     |                    |
>    |   |      +----------+-----+                    |
>    |   |      | ``last`` | N/A |                    |
>    |   |      +----------+-----+                    |
>    |   |      | ``mask`` | N/A |                    |
>    |   |      |          |     |                    |
>    +---+------+----------+-----+--------------------+
>    | 1 | END                   | VF (optional)      |
>    +---+                       +--------------------+
>    | 2 |                       | END                |
>    +---+-----------------------+--------------------+

No problem, I'll include those changes along with a remaining bugfix and
testpmd support for additional protocol fields (requested by several
people). Brace yourself for v5!

Thanks for all the reviews.

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow)
  2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
                             ` (24 preceding siblings ...)
  2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 25/25] doc: describe testpmd " Adrien Mazarguil
@ 2016-12-21 14:51           ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 01/26] ethdev: introduce generic flow API Adrien Mazarguil
                               ` (26 more replies)
  25 siblings, 27 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

As previously discussed in RFC v1 [1], RFC v2 [2], with changes
described in [3] (also pasted below), here is the first non-draft series
for this new API.

Its capabilities are so generic that its name had to be vague. It may be
called "Generic flow API" or "Generic flow interface" (possibly shortened
to "GFI") after the name of the new filter type, or "rte_flow" after
the prefix used for its public symbols. I personally favor the latter.

While it is currently meant to supersede existing filter types in order for
all PMDs to expose a common filtering/classification interface, it may
eventually evolve to cover the following ideas as well:

- Rx/Tx offloads configuration through automatic offloads for specific
  packets, e.g. performing checksum on TCP packets could be expressed with
  an egress rule with a TCP pattern and a kind of checksum action.

- RSS configuration (already defined actually). Could be global or per rule
  depending on hardware capabilities.

- Switching configuration for devices with many physical ports; rules doing
  both ingress and egress could even be used to completely bypass software
  if supported by hardware.

 [1] http://dpdk.org/ml/archives/dev/2016-July/043365.html
 [2] http://dpdk.org/ml/archives/dev/2016-August/045383.html
 [3] http://dpdk.org/ml/archives/dev/2016-November/050044.html

Changes since v4 series:

- Fixed compilation regression introduced in v2, caused by new #include
  directive (rte_flow_driver.h).

- Fixed remaining documentation tables that broke PDF generation
  (John Mcnamara / rte_flow.rst).

- Fixed testpmd flow command's bit-field handler that filled values in the
  wrong order on big-endian systems. This change also makes
  arg_entry_bf_fill() support endian conversion at no additional cost
  (cmdline_flow.c).

- Removed extra period from testpmd documentation example
  (testpmd_funcs.rst).

- Extra commit to handle additional protocol item fields (for VLAN, IPv4,
  IPv6 and SCTP) requested by several people (cmdline_flow.c).

Changes since v3 series:

- Fixed documentation tables that broke PDF generation
  (John Mcnamara / rte_flow.rst).

- Also properly aligned "Action" lines in several tables with their
  corresponding "Index" (rte_flow.rst).

- Fixed remaining ICC error #188 in testpmd (Ferruh / cmdline_flow.c).

- Indented testpmd examples properly (John / testpmd_funcs.rst).

- Fixed wrong port in example (Ferruh / testpmd_funcs.rst).

Changes since v2 series:

- Replaced ENOTSUP with ENOSYS in the code (although doing so triggers
  spurious checkpatch warnings) to tell apart unimplemented callbacks from
  unsupported flow rules and match the documented behavior.

- Fixed missing include seen by check-includes.sh in rte_flow_driver.h.

- Made clearer that PMDs must initialize rte_flow_error (if non-NULL) in
  case of error, added related memory poisoning in testpmd to catch missing
  initializations.

- Fixed rte_flow programmer's guide according to John Mcnamara's comments
  (tables, sections header and typos).

- Fixed deprecation notice as well.

Changes since v1 series:

- Added programmer's guide documentation for rte_flow.

- Added deprecation notice for the legacy API.

- Documented testpmd flow command.

- Fixed missing rte_flow_flush symbol in rte_ether_version.map.

- Cleaned up API documentation in rte_flow.h.

- Replaced "min/max" parameters with "num" in struct rte_flow_item_any, to
  align behavior with other item definitions.

- Fixed "type" (EtherType) size in struct rte_flow_item_eth.

- Renamed "queues" to "num" in struct rte_flow_action_rss.

- Fixed missing const in rte_flow_error_set() prototype definition.

- Fixed testpmd flow create command that did not save the rte_flow object
  pointer, causing crashes.

- Hopefully fixed all the remaining ICC/clang errors.

- Replaced testpmd flow command's "fix" token with "is" for clarity.

Changes since RFC v2:

- New separate VLAN pattern item (previously part of the ETH definition),
  found to be much more convenient.

- Removed useless "any" field from VF pattern item, the same effect can be
  achieved by not providing a specification structure.

- Replaced bit-fields from the VXLAN pattern item to avoid endianness
  conversion issues on 24-bit fields.

- Updated struct rte_flow_item with a new "last" field to create inclusive
  ranges. They are defined as the interval between (spec & mask) and
  (last & mask). All three parameters are optional.

- Renamed ID action to MARK.

- Renamed "queue" fields in actions QUEUE and DUP to "index".

- "rss_conf" field in RSS action is now const.

- VF action now uses a 32 bit ID like its pattern item counterpart.

- Removed redundant struct rte_flow_pattern, API functions now expect
  struct rte_flow_item lists terminated by END items.

- Replaced struct rte_flow_actions for the same reason, with struct
  rte_flow_action lists terminated by END actions.

- Error types (enum rte_flow_error_type) have been updated and the cause
  pointer in struct rte_flow_error is now const.

- Function prototypes (rte_flow_create, rte_flow_validate) have also been
  updated for clarity.
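
The inclusive range semantics of the new "last" field can be illustrated
with a small, self-contained C sketch. demo_item_match_u16() is a
hypothetical helper, not part of rte_flow. It assumes a single 16-bit field
in host byte order, and the defaults chosen for absent parameters (no spec
matches anything, no last means an exact match on spec, no mask means all
bits are significant) are assumptions of this sketch rather than a
statement of what PMDs do:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: check one 16-bit field against an optional
 * spec/last/mask triplet. The accepted interval is
 * [(spec & mask), (last & mask)], both bounds included.
 */
static bool
demo_item_match_u16(uint16_t value, const uint16_t *spec,
		    const uint16_t *last, const uint16_t *mask)
{
	uint16_t m = mask ? *mask : UINT16_MAX;
	uint16_t lo;
	uint16_t hi;

	if (spec == NULL)
		return true; /* assumption: no spec matches any value */
	lo = *spec & m;
	hi = last ? (uint16_t)(*last & m) : lo;
	value &= m;
	return value >= lo && value <= hi;
}
```

With spec = 1000, last = 2000 and a full mask, this helper accepts 1500 and
rejects 2001. With only spec = 1000 provided, it accepts 1000 alone.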

Additions:

- Public wrapper functions rte_flow_{validate|create|destroy|flush|query}
  are now implemented in rte_flow.c, with their symbols exported and
  versioned. Related filter type RTE_ETH_FILTER_GENERIC has been added.

- A separate header (rte_flow_driver.h) has been added for driver-side
  functionality, in particular struct rte_flow_ops which contains PMD
  callbacks returned by RTE_ETH_FILTER_GENERIC query.

- testpmd now exposes most of this API through the new "flow" command.
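
The split between public wrappers and driver callbacks described above can
be sketched roughly as follows. This is a reduced model, not the code from
rte_flow.c: struct demo_flow_ops, struct demo_dev, demo_flow_flush() and
demo_pmd_flush() are hypothetical names, and the device lookup by port ID is
replaced by a plain pointer. It only shows the dispatch idea, where a
missing callbacks table or a missing callback yields ENOSYS:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of a driver callbacks table. */
struct demo_flow_ops {
	int (*flush)(void *dev);
};

/* Hypothetical device: ops is NULL when the driver lacks the generic API. */
struct demo_dev {
	const struct demo_flow_ops *ops;
	unsigned int nb_rules;
};

/* Public wrapper: validate the device, then dispatch to the driver. */
static int
demo_flow_flush(struct demo_dev *dev)
{
	if (dev == NULL)
		return -ENODEV; /* no such device */
	if (dev->ops == NULL || dev->ops->flush == NULL)
		return -ENOSYS; /* callback not implemented */
	return dev->ops->flush(dev);
}

/* Sample driver callback: forget every rule. */
static int
demo_pmd_flush(void *dev)
{
	((struct demo_dev *)dev)->nb_rules = 0;
	return 0;
}

static const struct demo_flow_ops demo_pmd_ops = {
	.flush = demo_pmd_flush,
};
```

Keeping every callback optional lets a driver implement the API
incrementally, while applications can tell an unimplemented operation
(ENOSYS) apart from a rule the device rejects.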

What remains to be done:

- Using endian-aware integer types (rte_beX_t) where necessary for clarity.

- API documentation (based on RFC).

- testpmd flow command documentation (although context-aware command
  completion should already help quite a bit in this regard).

- A few pattern item / action properties cannot be configured yet
  (e.g. rss_conf parameter for RSS action) and a few completions
  (e.g. possible queue IDs) should be added.

Adrien Mazarguil (26):
  ethdev: introduce generic flow API
  doc: add rte_flow prog guide
  doc: announce deprecation of legacy filter types
  cmdline: add support for dynamic tokens
  cmdline: add alignment constraint
  app/testpmd: implement basic support for rte_flow
  app/testpmd: add flow command
  app/testpmd: add rte_flow integer support
  app/testpmd: add flow list command
  app/testpmd: add flow flush command
  app/testpmd: add flow destroy command
  app/testpmd: add flow validate/create commands
  app/testpmd: add flow query command
  app/testpmd: add rte_flow item spec handler
  app/testpmd: add rte_flow item spec prefix length
  app/testpmd: add rte_flow bit-field support
  app/testpmd: add item any to flow command
  app/testpmd: add various items to flow command
  app/testpmd: add item raw to flow command
  app/testpmd: add items eth/vlan to flow command
  app/testpmd: add items ipv4/ipv6 to flow command
  app/testpmd: add L4 items to flow command
  app/testpmd: add various actions to flow command
  app/testpmd: add queue actions to flow command
  doc: describe testpmd flow command
  app/testpmd: add protocol fields to flow command

 MAINTAINERS                                 |    4 +
 app/test-pmd/Makefile                       |    1 +
 app/test-pmd/cmdline.c                      |   32 +
 app/test-pmd/cmdline_flow.c                 | 2713 ++++++++++++++++++++++
 app/test-pmd/config.c                       |  498 ++++
 app/test-pmd/csumonly.c                     |    1 +
 app/test-pmd/flowgen.c                      |    1 +
 app/test-pmd/icmpecho.c                     |    1 +
 app/test-pmd/ieee1588fwd.c                  |    1 +
 app/test-pmd/iofwd.c                        |    1 +
 app/test-pmd/macfwd.c                       |    1 +
 app/test-pmd/macswap.c                      |    1 +
 app/test-pmd/parameters.c                   |    1 +
 app/test-pmd/rxonly.c                       |    1 +
 app/test-pmd/testpmd.c                      |    6 +
 app/test-pmd/testpmd.h                      |   27 +
 app/test-pmd/txonly.c                       |    1 +
 doc/api/doxy-api-index.md                   |    2 +
 doc/guides/prog_guide/index.rst             |    1 +
 doc/guides/prog_guide/rte_flow.rst          | 2041 ++++++++++++++++
 doc/guides/rel_notes/deprecation.rst        |    8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  624 +++++
 lib/librte_cmdline/cmdline_parse.c          |   67 +-
 lib/librte_cmdline/cmdline_parse.h          |   21 +
 lib/librte_ether/Makefile                   |    3 +
 lib/librte_ether/rte_eth_ctrl.h             |    1 +
 lib/librte_ether/rte_ether_version.map      |   11 +
 lib/librte_ether/rte_flow.c                 |  159 ++
 lib/librte_ether/rte_flow.h                 |  947 ++++++++
 lib/librte_ether/rte_flow_driver.h          |  182 ++
 30 files changed, 7349 insertions(+), 9 deletions(-)
 create mode 100644 app/test-pmd/cmdline_flow.c
 create mode 100644 doc/guides/prog_guide/rte_flow.rst
 create mode 100644 lib/librte_ether/rte_flow.c
 create mode 100644 lib/librte_ether/rte_flow.h
 create mode 100644 lib/librte_ether/rte_flow_driver.h

-- 
2.1.4


* [dpdk-dev] [PATCH v5 01/26] ethdev: introduce generic flow API
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide Adrien Mazarguil
                               ` (25 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for: applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, so applications do not need to be aware of hardware quirks.

- Extensible by design; API/ABI breakage should rarely occur, if at all.

- Documentation is self-contained; there is no need to look elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 MAINTAINERS                            |   4 +
 doc/api/doxy-api-index.md              |   2 +
 lib/librte_ether/Makefile              |   3 +
 lib/librte_ether/rte_eth_ctrl.h        |   1 +
 lib/librte_ether/rte_ether_version.map |  11 +
 lib/librte_ether/rte_flow.c            | 159 +++++
 lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
 lib/librte_ether/rte_flow_driver.h     | 182 ++++++
 8 files changed, 1309 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bb0b99..775b058 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
 F: lib/librte_ether/
 F: scripts/test-null.sh
 
+Generic flow API
+M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
+F: lib/librte_ether/rte_flow*
+
 Crypto API
 M: Declan Doherty <declan.doherty@intel.com>
 F: lib/librte_cryptodev/
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de65b4c..4951552 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -39,6 +39,8 @@ There are many libraries, so their headers may be grouped by topics:
   [dev]                (@ref rte_dev.h),
   [ethdev]             (@ref rte_ethdev.h),
   [ethctrl]            (@ref rte_eth_ctrl.h),
+  [rte_flow]           (@ref rte_flow.h),
+  [rte_flow_driver]    (@ref rte_flow_driver.h),
   [cryptodev]          (@ref rte_cryptodev.h),
   [devargs]            (@ref rte_devargs.h),
   [bond]               (@ref rte_eth_bond.h),
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index efe1e5f..9335361 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
 LIBABIVER := 5
 
 SRCS-y += rte_ethdev.c
+SRCS-y += rte_flow.c
 
 #
 # Export include files
@@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
 SYMLINK-y-include += rte_ethdev.h
 SYMLINK-y-include += rte_eth_ctrl.h
 SYMLINK-y-include += rte_dev_info.h
+SYMLINK-y-include += rte_flow.h
+SYMLINK-y-include += rte_flow_driver.h
 
 # this lib depends upon:
 DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index fe80eb0..8386904 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -99,6 +99,7 @@ enum rte_filter_type {
 	RTE_ETH_FILTER_FDIR,
 	RTE_ETH_FILTER_HASH,
 	RTE_ETH_FILTER_L2_TUNNEL,
+	RTE_ETH_FILTER_GENERIC,
 	RTE_ETH_FILTER_MAX
 };
 
diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
index 72be66d..384cdee 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -147,3 +147,14 @@ DPDK_16.11 {
 	rte_eth_dev_pci_remove;
 
 } DPDK_16.07;
+
+DPDK_17.02 {
+	global:
+
+	rte_flow_validate;
+	rte_flow_create;
+	rte_flow_destroy;
+	rte_flow_flush;
+	rte_flow_query;
+
+} DPDK_16.11;
diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
new file mode 100644
index 0000000..d98fb1b
--- /dev/null
+++ b/lib/librte_ether/rte_flow.c
@@ -0,0 +1,159 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include "rte_ethdev.h"
+#include "rte_flow_driver.h"
+#include "rte_flow.h"
+
+/* Get generic flow operations structure from a port. */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops;
+	int code;
+
+	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
+		code = ENODEV;
+	else if (unlikely(!dev->dev_ops->filter_ctrl ||
+			  dev->dev_ops->filter_ctrl(dev,
+						    RTE_ETH_FILTER_GENERIC,
+						    RTE_ETH_FILTER_GET,
+						    &ops) ||
+			  !ops))
+		code = ENOSYS;
+	else
+		return ops;
+	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(code));
+	return NULL;
+}
+
+/* Check whether a flow rule can be created on a given port. */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error)
+{
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->validate))
+		return ops->validate(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Create a flow rule on a given port. */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return NULL;
+	if (likely(!!ops->create))
+		return ops->create(dev, attr, pattern, actions, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return NULL;
+}
+
+/* Destroy a flow rule on a given port. */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->destroy))
+		return ops->destroy(dev, flow, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Destroy all flow rules associated with a port. */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->flush))
+		return ops->flush(dev, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
+
+/* Query an existing flow rule. */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops))
+		return -rte_errno;
+	if (likely(!!ops->query))
+		return ops->query(dev, flow, action, data, error);
+	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			   NULL, rte_strerror(ENOSYS));
+	return -rte_errno;
+}
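
Not part of the patch: the wrappers in rte_flow.c above all follow one
pattern, which is to fetch the driver's operations table, forward to the
callback when it is provided, and fail with -ENOSYS otherwise. A minimal
sketch of that pattern follows. The names demo_dev, demo_ops,
demo_pmd_flush() and demo_flush() are hypothetical stand-ins invented for
illustration. They are not DPDK API, and error reporting through
struct rte_flow_error is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device and its flow ops; not DPDK API. */
struct demo_dev;

struct demo_ops {
	int (*flush)(struct demo_dev *dev); /* Optional callback, may be NULL. */
};

struct demo_dev {
	const struct demo_ops *ops; /* NULL when the driver has no flow support. */
	int rules;                  /* Number of rules currently installed. */
};

/* Example driver callback: drop every rule and report success. */
int
demo_pmd_flush(struct demo_dev *dev)
{
	dev->rules = 0;
	return 0;
}

/*
 * Generic wrapper: forward to the driver callback when present,
 * otherwise fail with -ENOSYS, as the rte_flow_*() wrappers do.
 */
int
demo_flush(struct demo_dev *dev)
{
	assert(dev != NULL);
	if (dev->ops == NULL || dev->ops->flush == NULL)
		return -ENOSYS;
	return dev->ops->flush(dev);
}
```

A device whose table provides flush gets its rules cleared, while a device
with no table or with a NULL callback gets -ENOSYS and is left untouched.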
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
new file mode 100644
index 0000000..98084ac
--- /dev/null
+++ b/lib/librte_ether/rte_flow.h
@@ -0,0 +1,947 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_H_
+#define RTE_FLOW_H_
+
+/**
+ * @file
+ * RTE generic flow API
+ *
+ * This interface provides the ability to program packet matching and
+ * associated actions in hardware through flow rules.
+ */
+
+#include <rte_arp.h>
+#include <rte_ether.h>
+#include <rte_icmp.h>
+#include <rte_ip.h>
+#include <rte_sctp.h>
+#include <rte_tcp.h>
+#include <rte_udp.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Flow rule attributes.
+ *
+ * Priorities are set on two levels: per group and per rule within groups.
+ *
+ * Lower values denote higher priority. The highest priority for both levels
+ * is 0, so that a rule with priority 0 in group 8 is always matched after a
+ * rule with priority 8 in group 0.
+ *
+ * Although optional, applications are encouraged to group similar rules as
+ * much as possible to fully take advantage of hardware capabilities
+ * (e.g. optimized matching) and work around limitations (e.g. a single
+ * pattern type possibly allowed in a given group).
+ *
+ * Group and priority levels are arbitrary and up to the application. They
+ * do not need to be contiguous nor start from 0. However, the maximum number
+ * varies between devices and may be affected by existing flow rules.
+ *
+ * If a packet is matched by several rules of a given group for a given
+ * priority level, the outcome is undefined. It can take any path, may be
+ * duplicated or even cause unrecoverable errors.
+ *
+ * Note that support for more than a single group and priority level is not
+ * guaranteed.
+ *
+ * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+ *
+ * Several pattern items and actions are valid and can be used in both
+ * directions. Those valid for only one direction are described as such.
+ *
+ * At least one direction must be specified.
+ *
+ * Specifying both directions at once for a given rule is not recommended
+ * but may be valid in a few cases (e.g. shared counter).
+ */
+struct rte_flow_attr {
+	uint32_t group; /**< Priority group. */
+	uint32_t priority; /**< Priority level within group. */
+	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
+	uint32_t egress:1; /**< Rule applies to egress traffic. */
+	uint32_t reserved:30; /**< Reserved, must be zero. */
+};
+
+/**
+ * Matching pattern item types.
+ *
+ * Pattern items fall in two categories:
+ *
+ * - Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+ *   IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+ *   specification structure. These must be stacked in the same order as the
+ *   protocol layers to match, starting from the lowest.
+ *
+ * - Matching meta-data or affecting pattern processing (END, VOID, INVERT,
+ *   PF, VF, PORT and so on), often without a specification structure. Since
+ *   they do not match packet contents, these can be specified anywhere
+ *   within item lists without affecting others.
+ *
+ * See the description of individual types for more information. Those
+ * marked with [META] fall into the second category.
+ */
+enum rte_flow_item_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for item lists. Prevents further processing of items,
+	 * thereby ending the pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_VOID,
+
+	/**
+	 * [META]
+	 *
+	 * Inverted matching, i.e. process packets that do not match the
+	 * pattern.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_INVERT,
+
+	/**
+	 * Matches any protocol in place of the current layer. A single ANY
+	 * may also stand for several protocol layers.
+	 *
+	 * See struct rte_flow_item_any.
+	 */
+	RTE_FLOW_ITEM_TYPE_ANY,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to the physical function of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a PF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * No associated specification structure.
+	 */
+	RTE_FLOW_ITEM_TYPE_PF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets addressed to a virtual function ID of the device.
+	 *
+	 * If the underlying device function differs from the one that would
+	 * normally receive the matched traffic, specifying this item
+	 * prevents it from reaching that device unless the flow rule
+	 * contains a VF action. Packets are not duplicated between device
+	 * instances by default.
+	 *
+	 * See struct rte_flow_item_vf.
+	 */
+	RTE_FLOW_ITEM_TYPE_VF,
+
+	/**
+	 * [META]
+	 *
+	 * Matches packets coming from the specified physical port of the
+	 * underlying device.
+	 *
+	 * The first PORT item overrides the physical port normally
+	 * associated with the specified DPDK input port (port_id). This
+	 * item can be provided several times to match additional physical
+	 * ports.
+	 *
+	 * See struct rte_flow_item_port.
+	 */
+	RTE_FLOW_ITEM_TYPE_PORT,
+
+	/**
+	 * Matches a byte string of a given length at a given offset.
+	 *
+	 * See struct rte_flow_item_raw.
+	 */
+	RTE_FLOW_ITEM_TYPE_RAW,
+
+	/**
+	 * Matches an Ethernet header.
+	 *
+	 * See struct rte_flow_item_eth.
+	 */
+	RTE_FLOW_ITEM_TYPE_ETH,
+
+	/**
+	 * Matches an 802.1Q/ad VLAN tag.
+	 *
+	 * See struct rte_flow_item_vlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VLAN,
+
+	/**
+	 * Matches an IPv4 header.
+	 *
+	 * See struct rte_flow_item_ipv4.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV4,
+
+	/**
+	 * Matches an IPv6 header.
+	 *
+	 * See struct rte_flow_item_ipv6.
+	 */
+	RTE_FLOW_ITEM_TYPE_IPV6,
+
+	/**
+	 * Matches an ICMP header.
+	 *
+	 * See struct rte_flow_item_icmp.
+	 */
+	RTE_FLOW_ITEM_TYPE_ICMP,
+
+	/**
+	 * Matches a UDP header.
+	 *
+	 * See struct rte_flow_item_udp.
+	 */
+	RTE_FLOW_ITEM_TYPE_UDP,
+
+	/**
+	 * Matches a TCP header.
+	 *
+	 * See struct rte_flow_item_tcp.
+	 */
+	RTE_FLOW_ITEM_TYPE_TCP,
+
+	/**
+	 * Matches an SCTP header.
+	 *
+	 * See struct rte_flow_item_sctp.
+	 */
+	RTE_FLOW_ITEM_TYPE_SCTP,
+
+	/**
+	 * Matches a VXLAN header.
+	 *
+	 * See struct rte_flow_item_vxlan.
+	 */
+	RTE_FLOW_ITEM_TYPE_VXLAN,
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ANY
+ *
+ * Matches any protocol in place of the current layer. A single ANY may also
+ * stand for several protocol layers.
+ *
+ * This is usually specified as the first pattern item when looking for a
+ * protocol anywhere in a packet.
+ *
+ * A zeroed mask stands for any number of layers.
+ */
+struct rte_flow_item_any {
+	uint32_t num; /**< Number of layers covered. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VF
+ *
+ * Matches packets addressed to a virtual function ID of the device.
+ *
+ * If the underlying device function differs from the one that would
+ * normally receive the matched traffic, specifying this item prevents it
+ * from reaching that device unless the flow rule contains a VF
+ * action. Packets are not duplicated between device instances by default.
+ *
+ * - Likely to return an error or never match any traffic if this causes a
+ *   VF device to match traffic addressed to a different VF.
+ * - Can be specified multiple times to match traffic addressed to several
+ *   VF IDs.
+ * - Can be combined with a PF item to match both PF and VF traffic.
+ *
+ * A zeroed mask can be used to match any VF ID.
+ */
+struct rte_flow_item_vf {
+	uint32_t id; /**< Destination VF ID. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_PORT
+ *
+ * Matches packets coming from the specified physical port of the underlying
+ * device.
+ *
+ * The first PORT item overrides the physical port normally associated with
+ * the specified DPDK input port (port_id). This item can be provided
+ * several times to match additional physical ports.
+ *
+ * Note that physical ports are not necessarily tied to DPDK input ports
+ * (port_id) when those are not under DPDK control. Possible values are
+ * specific to each device, they are not necessarily indexed from zero and
+ * may not be contiguous.
+ *
+ * As a device property, the list of allowed values as well as the value
+ * associated with a port_id should be retrieved by other means.
+ *
+ * A zeroed mask can be used to match any port index.
+ */
+struct rte_flow_item_port {
+	uint32_t index; /**< Physical port index. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_RAW
+ *
+ * Matches a byte string of a given length at a given offset.
+ *
+ * Offset is either absolute (using the start of the packet) or relative to
+ * the end of the previous matched item in the stack, in which case negative
+ * values are allowed.
+ *
+ * If search is enabled, offset is used as the starting point. The search
+ * area can be delimited by setting limit to a nonzero value, which is the
+ * maximum number of bytes after offset where the pattern may start.
+ *
+ * Matching a zero-length pattern is allowed; doing so resets the relative
+ * offset for subsequent items.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_raw {
+	uint32_t relative:1; /**< Look for pattern after the previous item. */
+	uint32_t search:1; /**< Search pattern from offset (see also limit). */
+	uint32_t reserved:30; /**< Reserved, must be set to zero. */
+	int32_t offset; /**< Absolute or relative offset for pattern. */
+	uint16_t limit; /**< Search area limit for start of pattern. */
+	uint16_t length; /**< Pattern length. */
+	uint8_t pattern[]; /**< Byte string to look for. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ETH
+ *
+ * Matches an Ethernet header.
+ */
+struct rte_flow_item_eth {
+	struct ether_addr dst; /**< Destination MAC. */
+	struct ether_addr src; /**< Source MAC. */
+	uint16_t type; /**< EtherType. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VLAN
+ *
+ * Matches an 802.1Q/ad VLAN tag.
+ *
+ * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
+ * RTE_FLOW_ITEM_TYPE_VLAN.
+ */
+struct rte_flow_item_vlan {
+	uint16_t tpid; /**< Tag protocol identifier. */
+	uint16_t tci; /**< Tag control information. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV4
+ *
+ * Matches an IPv4 header.
+ *
+ * Note: IPv4 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv4 {
+	struct ipv4_hdr hdr; /**< IPv4 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_IPV6
+ *
+ * Matches an IPv6 header.
+ *
+ * Note: IPv6 options are handled by dedicated pattern items.
+ */
+struct rte_flow_item_ipv6 {
+	struct ipv6_hdr hdr; /**< IPv6 header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_ICMP
+ *
+ * Matches an ICMP header.
+ */
+struct rte_flow_item_icmp {
+	struct icmp_hdr hdr; /**< ICMP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_UDP
+ *
+ * Matches a UDP header.
+ */
+struct rte_flow_item_udp {
+	struct udp_hdr hdr; /**< UDP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_TCP
+ *
+ * Matches a TCP header.
+ */
+struct rte_flow_item_tcp {
+	struct tcp_hdr hdr; /**< TCP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_SCTP
+ *
+ * Matches an SCTP header.
+ */
+struct rte_flow_item_sctp {
+	struct sctp_hdr hdr; /**< SCTP header definition. */
+};
+
+/**
+ * RTE_FLOW_ITEM_TYPE_VXLAN
+ *
+ * Matches a VXLAN header (RFC 7348).
+ */
+struct rte_flow_item_vxlan {
+	uint8_t flags; /**< Normally 0x08 (I flag). */
+	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
+	uint8_t vni[3]; /**< VXLAN identifier. */
+	uint8_t rsvd1; /**< Reserved, normally 0x00. */
+};
+
+/**
+ * Matching pattern item definition.
+ *
+ * A pattern is formed by stacking items starting from the lowest protocol
+ * layer to match. This stacking restriction does not apply to meta items
+ * which can be placed anywhere in the stack without affecting the meaning
+ * of the resulting pattern.
+ *
+ * Patterns are terminated by END items.
+ *
+ * The spec field should be a valid pointer to a structure of the related
+ * item type. It may be set to NULL in many cases to use default values.
+ *
+ * Optionally, last can point to a structure of the same type to define an
+ * inclusive range. This is mostly supported by integer and address fields,
+ * may cause errors otherwise. Fields that do not support ranges must be set
+ * to 0 or to the same value as the corresponding fields in spec.
+ *
+ * By default all fields present in spec are considered relevant (see note
+ * below). This behavior can be altered by providing a mask structure of the
+ * same type with applicable bits set to one. It can also be used to
+ * partially filter out specific fields (e.g. as an alternate means to match
+ * ranges of IP addresses).
+ *
+ * Mask is a simple bit-mask applied before interpreting the contents of
+ * spec and last, which may yield unexpected results if not used
+ * carefully. For example, if for an IPv4 address field, spec provides
+ * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
+ * effective range becomes 10.1.0.0 to 10.3.255.255.
+ *
+ * Note: the defaults for data-matching items such as IPv4 when mask is not
+ * specified actually depend on the underlying implementation since only
+ * recognized fields can be taken into account.
+ */
+struct rte_flow_item {
+	enum rte_flow_item_type type; /**< Item type. */
+	const void *spec; /**< Pointer to item specification structure. */
+	const void *last; /**< Defines an inclusive range (spec to last). */
+	const void *mask; /**< Bit-mask applied to spec and last. */
+};
+
+/**
+ * Action types.
+ *
+ * Each possible action is represented by a type. Some have associated
+ * configuration structures. Several actions combined in a list can be
+ * assigned to a flow rule. That list is not ordered.
+ *
+ * They fall in three categories:
+ *
+ * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+ *   processing matched packets by subsequent flow rules, unless overridden
+ *   with PASSTHRU.
+ *
+ * - Non-terminating actions (PASSTHRU, DUP) that leave matched packets up
+ *   for additional processing by subsequent flow rules.
+ *
+ * - Other non-terminating meta actions that do not affect the fate of
+ *   packets (END, VOID, MARK, FLAG, COUNT).
+ *
+ * When several actions are combined in a flow rule, they should all have
+ * different types (e.g. dropping a packet twice is not possible).
+ *
+ * Only the last action of a given type is taken into account. PMDs still
+ * perform error checking on the entire list.
+ *
+ * Note that PASSTHRU is the only action able to override a terminating
+ * rule.
+ */
+enum rte_flow_action_type {
+	/**
+	 * [META]
+	 *
+	 * End marker for action lists. Prevents further processing of
+	 * actions, thereby ending the list.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_END,
+
+	/**
+	 * [META]
+	 *
+	 * Used as a placeholder for convenience. It is ignored and simply
+	 * discarded by PMDs.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_VOID,
+
+	/**
+	 * Leaves packets up for additional processing by subsequent flow
+	 * rules. This is the default when a rule does not contain a
+	 * terminating action, but can be specified to force a rule to
+	 * become non-terminating.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PASSTHRU,
+
+	/**
+	 * [META]
+	 *
+	 * Attaches a 32-bit value to packets.
+	 *
+	 * See struct rte_flow_action_mark.
+	 */
+	RTE_FLOW_ACTION_TYPE_MARK,
+
+	/**
+	 * [META]
+	 *
+	 * Flag packets. Similar to MARK but only affects ol_flags.
+	 *
+	 * Note: a distinctive flag must be defined for it.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_FLAG,
+
+	/**
+	 * Assigns packets to a given queue index.
+	 *
+	 * See struct rte_flow_action_queue.
+	 */
+	RTE_FLOW_ACTION_TYPE_QUEUE,
+
+	/**
+	 * Drops packets.
+	 *
+	 * PASSTHRU overrides this action if both are specified.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_DROP,
+
+	/**
+	 * [META]
+	 *
+	 * Enables counters for this rule.
+	 *
+	 * These counters can be retrieved and reset through rte_flow_query(),
+	 * see struct rte_flow_query_count.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_COUNT,
+
+	/**
+	 * Duplicates packets to a given queue index.
+	 *
+	 * This is normally combined with QUEUE, however when used alone, it
+	 * is actually similar to QUEUE + PASSTHRU.
+	 *
+	 * See struct rte_flow_action_dup.
+	 */
+	RTE_FLOW_ACTION_TYPE_DUP,
+
+	/**
+	 * Similar to QUEUE, except RSS is additionally performed on packets
+	 * to spread them among several queues according to the provided
+	 * parameters.
+	 *
+	 * See struct rte_flow_action_rss.
+	 */
+	RTE_FLOW_ACTION_TYPE_RSS,
+
+	/**
+	 * Redirects packets to the physical function (PF) of the current
+	 * device.
+	 *
+	 * No associated configuration structure.
+	 */
+	RTE_FLOW_ACTION_TYPE_PF,
+
+	/**
+	 * Redirects packets to the virtual function (VF) of the current
+	 * device with the specified ID.
+	 *
+	 * See struct rte_flow_action_vf.
+	 */
+	RTE_FLOW_ACTION_TYPE_VF,
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_MARK
+ *
+ * Attaches a 32-bit value to packets.
+ *
+ * This value is arbitrary and application-defined. For compatibility with
+ * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
+ * also set in ol_flags.
+ */
+struct rte_flow_action_mark {
+	uint32_t id; /**< 32-bit value to return with packets. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_QUEUE
+ *
+ * Assign packets to a given queue index.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_queue {
+	uint16_t index; /**< Queue index to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_COUNT (query)
+ *
+ * Query structure to retrieve and reset flow rule counters.
+ */
+struct rte_flow_query_count {
+	uint32_t reset:1; /**< Reset counters after query [in]. */
+	uint32_t hits_set:1; /**< hits field is set [out]. */
+	uint32_t bytes_set:1; /**< bytes field is set [out]. */
+	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
+	uint64_t hits; /**< Number of hits for this rule [out]. */
+	uint64_t bytes; /**< Number of bytes through this rule [out]. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_DUP
+ *
+ * Duplicates packets to a given queue index.
+ *
+ * This is normally combined with QUEUE, however when used alone, it is
+ * actually similar to QUEUE + PASSTHRU.
+ *
+ * Non-terminating by default.
+ */
+struct rte_flow_action_dup {
+	uint16_t index; /**< Queue index to duplicate packets to. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t num; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queue indices to use. */
+};
+
+/**
+ * RTE_FLOW_ACTION_TYPE_VF
+ *
+ * Redirects packets to a virtual function (VF) of the current device.
+ *
+ * Packets matched by a VF pattern item can be redirected to their original
+ * VF ID instead of the specified one. This parameter may not be available
+ * and is not guaranteed to work properly if the VF part is matched by a
+ * prior flow rule or if packets are not addressed to a VF in the first
+ * place.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_vf {
+	uint32_t original:1; /**< Use original VF ID if possible. */
+	uint32_t reserved:31; /**< Reserved, must be zero. */
+	uint32_t id; /**< VF ID to redirect packets to. */
+};
+
+/**
+ * Definition of a single action.
+ *
+ * A list of actions is terminated by an END action.
+ *
+ * For simple actions without a configuration structure, conf remains NULL.
+ */
+struct rte_flow_action {
+	enum rte_flow_action_type type; /**< Action type. */
+	const void *conf; /**< Pointer to action configuration structure. */
+};
+
+/**
+ * Opaque type returned after successfully creating a flow.
+ *
+ * This handle can be used to manage and query the related flow (e.g. to
+ * destroy it or retrieve counters).
+ */
+struct rte_flow;
+
+/**
+ * Verbose error types.
+ *
+ * Most of them provide the type of the object referenced by struct
+ * rte_flow_error.cause.
+ */
+enum rte_flow_error_type {
+	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+};
+
+/**
+ * Verbose error structure definition.
+ *
+ * This object is normally allocated by applications and set by PMDs. The
+ * message points to a constant string which does not need to be freed by
+ * the application. However, its pointer can be considered valid only as long
+ * as its associated DPDK port remains configured. Closing the underlying
+ * device or unloading the PMD invalidates it.
+ *
+ * Both cause and message may be NULL regardless of the error type.
+ */
+struct rte_flow_error {
+	enum rte_flow_error_type type; /**< Cause field and error types. */
+	const void *cause; /**< Object responsible for the error. */
+	const char *message; /**< Human-readable error message. */
+};
+
+/**
+ * Check whether a flow rule can be created on a given port.
+ *
+ * While this function has no effect on the target device, the flow rule is
+ * validated against its current configuration state and the returned value
+ * should be considered valid by the caller for that state only.
+ *
+ * The returned value is guaranteed to remain valid only as long as no
+ * successful calls to rte_flow_create() or rte_flow_destroy() are made in
+ * the meantime and no device parameters affecting flow rules in any way are
+ * modified, due to possible collisions or resource limitations (although in
+ * such cases EINVAL should not be returned).
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 if flow rule is valid and can be created. A negative errno value
+ *   otherwise (rte_errno is also set), the following errors are defined:
+ *
+ *   -ENOSYS: underlying device does not support this functionality.
+ *
+ *   -EINVAL: unknown or invalid rule specification.
+ *
+ *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
+ *   bit-masks are unsupported).
+ *
+ *   -EEXIST: collision with an existing rule.
+ *
+ *   -ENOMEM: not enough resources.
+ *
+ *   -EBUSY: action cannot be performed due to busy device resources, may
+ *   succeed if the affected queues or even the entire port are in a stopped
+ *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
+ */
+int
+rte_flow_validate(uint8_t port_id,
+		  const struct rte_flow_attr *attr,
+		  const struct rte_flow_item pattern[],
+		  const struct rte_flow_action actions[],
+		  struct rte_flow_error *error);
+
+/**
+ * Create a flow rule on a given port.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] attr
+ *   Flow rule attributes.
+ * @param[in] pattern
+ *   Pattern specification (list terminated by the END pattern item).
+ * @param[in] actions
+ *   Associated actions (list terminated by the END action).
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   A valid handle in case of success, NULL otherwise and rte_errno is set
+ *   to the positive version of one of the error codes defined for
+ *   rte_flow_validate().
+ */
+struct rte_flow *
+rte_flow_create(uint8_t port_id,
+		const struct rte_flow_attr *attr,
+		const struct rte_flow_item pattern[],
+		const struct rte_flow_action actions[],
+		struct rte_flow_error *error);
+
+/**
+ * Destroy a flow rule on a given port.
+ *
+ * Failure to destroy a flow rule handle may occur when other flow rules
+ * depend on it, and destroying it would result in an inconsistent state.
+ *
+ * This function is only guaranteed to succeed if handles are destroyed in
+ * reverse order of their creation.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to destroy.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_destroy(uint8_t port_id,
+		 struct rte_flow *flow,
+		 struct rte_flow_error *error);
+
+/**
+ * Destroy all flow rules associated with a port.
+ *
+ * In the unlikely event of failure, handles are still considered destroyed
+ * and no longer valid but the port must be assumed to be in an inconsistent
+ * state.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_flush(uint8_t port_id,
+	       struct rte_flow_error *error);
+
+/**
+ * Query an existing flow rule.
+ *
+ * This function allows retrieving flow-specific data such as counters.
+ * Data is gathered by special actions which must be present in the flow
+ * rule definition.
+ *
+ * @see RTE_FLOW_ACTION_TYPE_COUNT
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param flow
+ *   Flow rule handle to query.
+ * @param action
+ *   Action type to query.
+ * @param[in, out] data
+ *   Pointer to storage for the associated query data type.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+rte_flow_query(uint8_t port_id,
+	       struct rte_flow *flow,
+	       enum rte_flow_action_type action,
+	       void *data,
+	       struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_H_ */
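
Not part of the patch: two rules documented in rte_flow.h above are easy
to get wrong, so here is a small self-contained sketch of both. The first
function shows how mask is applied to spec and last before the range is
interpreted, as in the struct rte_flow_item comment. The second shows the
group-then-priority ordering from the struct rte_flow_attr comment. The
helpers demo_ipv4(), demo_range_match() and demo_rule_before() are
hypothetical and written for illustration only. Addresses are kept in host
byte order to make the arithmetic readable, which is an assumption of this
sketch and not how struct ipv4_hdr stores them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from dotted-quad components. */
uint32_t
demo_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Return true when addr falls in the inclusive range spec..last once
 * mask has been applied to all three values.
 */
bool
demo_range_match(uint32_t addr, uint32_t spec, uint32_t last, uint32_t mask)
{
	uint32_t lo = spec & mask;
	uint32_t hi = last & mask;

	assert(lo <= hi);
	return (addr & mask) >= lo && (addr & mask) <= hi;
}

/*
 * Return true when rule A (group_a, prio_a) is considered before rule B
 * (group_b, prio_b). Groups are compared first, then priorities within
 * the same group; lower values win in both cases.
 */
bool
demo_rule_before(uint32_t group_a, uint32_t prio_a,
		 uint32_t group_b, uint32_t prio_b)
{
	if (group_a != group_b)
		return group_a < group_b;
	return prio_a < prio_b;
}
```

With spec 10.1.2.3, last 10.3.4.5 and mask 255.255.0.0, demo_range_match()
accepts 10.1.0.0 through 10.3.255.255, which is the effective range quoted
in the header comment. demo_rule_before(0, 8, 8, 0) is true, since a rule
in group 0 comes before any rule in group 8 whatever their priorities.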
diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
new file mode 100644
index 0000000..cc97785
--- /dev/null
+++ b/lib/librte_ether/rte_flow_driver.h
@@ -0,0 +1,182 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef RTE_FLOW_DRIVER_H_
+#define RTE_FLOW_DRIVER_H_
+
+/**
+ * @file
+ * RTE generic flow API (driver side)
+ *
+ * This file provides implementation helpers for internal use by PMDs. They
+ * are not intended to be exposed to applications and are not subject to ABI
+ * versioning.
+ */
+
+#include <stdint.h>
+
+#include <rte_errno.h>
+#include "rte_ethdev.h"
+#include "rte_flow.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Generic flow operations structure implemented and returned by PMDs.
+ *
+ * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
+ * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
+ * as the RTE_ETH_FILTER_GET filter operation.
+ *
+ * If successful, this operation must result in a pointer to a PMD-specific
+ * struct rte_flow_ops written to the argument address as described below:
+ *
+ * \code
+ *
+ * // PMD filter_ctrl callback
+ *
+ * static const struct rte_flow_ops pmd_flow_ops = { ... };
+ *
+ * switch (filter_type) {
+ * case RTE_ETH_FILTER_GENERIC:
+ *     if (filter_op != RTE_ETH_FILTER_GET)
+ *         return -EINVAL;
+ *     *(const void **)arg = &pmd_flow_ops;
+ *     return 0;
+ * }
+ *
+ * \endcode
+ *
+ * See also rte_flow_ops_get().
+ *
+ * These callback functions are not supposed to be used by applications
+ * directly, which must rely on the API defined in rte_flow.h.
+ *
+ * Public-facing wrapper functions perform a few consistency checks so that
+ * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
+ * callbacks otherwise only differ by their first argument (with port ID
+ * already resolved to a pointer to struct rte_eth_dev).
+ */
+struct rte_flow_ops {
+	/** See rte_flow_validate(). */
+	int (*validate)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_create(). */
+	struct rte_flow *(*create)
+		(struct rte_eth_dev *,
+		 const struct rte_flow_attr *,
+		 const struct rte_flow_item [],
+		 const struct rte_flow_action [],
+		 struct rte_flow_error *);
+	/** See rte_flow_destroy(). */
+	int (*destroy)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 struct rte_flow_error *);
+	/** See rte_flow_flush(). */
+	int (*flush)
+		(struct rte_eth_dev *,
+		 struct rte_flow_error *);
+	/** See rte_flow_query(). */
+	int (*query)
+		(struct rte_eth_dev *,
+		 struct rte_flow *,
+		 enum rte_flow_action_type,
+		 void *,
+		 struct rte_flow_error *);
+};
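The wrapper behavior described in the comment above (an unimplemented, i.e.
NULL, callback yields -ENOTSUP) can be sketched in isolation as follows. This
is a minimal standalone model, not the librte_ether implementation: demo_dev,
demo_flow_ops, demo_flow_flush() and demo_pmd_flush() are hypothetical
stand-ins for struct rte_eth_dev, struct rte_flow_ops and the real wrappers.

```c
#include <errno.h>
#include <stddef.h>

struct demo_flow_ops;

/* Hypothetical stand-in for struct rte_eth_dev. */
struct demo_dev {
	const struct demo_flow_ops *ops; /* NULL if the PMD has no flow support */
	int flushed;
};

/* Hypothetical stand-in for struct rte_flow_ops (single callback). */
struct demo_flow_ops {
	int (*flush)(struct demo_dev *dev);
};

/* Public-facing wrapper: a missing ops table or callback means ENOTSUP. */
static int
demo_flow_flush(struct demo_dev *dev)
{
	if (dev->ops == NULL || dev->ops->flush == NULL)
		return -ENOTSUP;
	return dev->ops->flush(dev);
}

/* Example PMD callback, only records that it was called. */
static int
demo_pmd_flush(struct demo_dev *dev)
{
	dev->flushed = 1;
	return 0;
}
```

A device whose ops pointer is NULL gets -ENOTSUP from demo_flow_flush(),
while one whose ops table points at demo_pmd_flush() gets 0.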
+
+/**
+ * Initialize generic flow error structure.
+ *
+ * This function also sets rte_errno to a given value.
+ *
+ * @param[out] error
+ *   Pointer to flow error structure (may be NULL).
+ * @param code
+ *   Related error code (rte_errno).
+ * @param type
+ *   Cause field and error types.
+ * @param cause
+ *   Object responsible for the error.
+ * @param message
+ *   Human-readable error message.
+ *
+ * @return
+ *   Pointer to flow error structure.
+ */
+static inline struct rte_flow_error *
+rte_flow_error_set(struct rte_flow_error *error,
+		   int code,
+		   enum rte_flow_error_type type,
+		   const void *cause,
+		   const char *message)
+{
+	if (error) {
+		*error = (struct rte_flow_error){
+			.type = type,
+			.cause = cause,
+			.message = message,
+		};
+	}
+	rte_errno = code;
+	return error;
+}
+
+/**
+ * Get generic flow operations structure from a port.
+ *
+ * @param port_id
+ *   Port identifier to query.
+ * @param[out] error
+ *   Pointer to flow error structure.
+ *
+ * @return
+ *   The flow operations structure associated with port_id, NULL in case of
+ *   error, in which case rte_errno is set and the error structure contains
+ *   additional details.
+ */
+const struct rte_flow_ops *
+rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FLOW_DRIVER_H_ */
-- 
2.1.4

* [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 01/26] ethdev: introduce generic flow API Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 15:09               ` Mcnamara, John
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 03/26] doc: announce deprecation of legacy filter types Adrien Mazarguil
                               ` (24 subsequent siblings)
  26 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

This documentation is based on the latest RFC submission, subsequently
updated according to feedback from the community.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/prog_guide/index.rst    |    1 +
 doc/guides/prog_guide/rte_flow.rst | 2041 +++++++++++++++++++++++++++++++
 2 files changed, 2042 insertions(+)

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index e5a50a8..ed7f770 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -42,6 +42,7 @@ Programmer's Guide
     mempool_lib
     mbuf_lib
     poll_mode_drv
+    rte_flow
     cryptodev_lib
     link_bonding_poll_mode_drv_lib
     timer_lib
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
new file mode 100644
index 0000000..f415a73
--- /dev/null
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -0,0 +1,2041 @@
+..  BSD LICENSE
+    Copyright 2016 6WIND S.A.
+    Copyright 2016 Mellanox.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of 6WIND S.A. nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+.. _Generic_flow_API:
+
+Generic flow API (rte_flow)
+===========================
+
+Overview
+--------
+
+This API provides a generic means to configure hardware to match specific
+ingress or egress traffic, alter its fate and query related counters
+according to any number of user-defined rules.
+
+It is named *rte_flow* after the prefix used for all its symbols, and is
+defined in ``rte_flow.h``.
+
+- Matching can be performed on packet data (protocol headers, payload) and
+  properties (e.g. associated physical port, virtual device function ID).
+
+- Possible operations include dropping traffic, diverting it to specific
+  queues, to virtual/physical device functions or ports, performing tunnel
+  offloads, adding marks and so on.
+
+It is slightly higher-level than the legacy filtering framework which it
+encompasses and supersedes (including all functions and filter types) in
+order to expose a single interface with an unambiguous behavior that is
+common to all poll-mode drivers (PMDs).
+
+Several methods to migrate existing applications are described in `API
+migration`_.
+
+Flow rule
+---------
+
+Description
+~~~~~~~~~~~
+
+A flow rule is the combination of attributes with a matching pattern and a
+list of actions. Flow rules form the basis of this API.
+
+Flow rules can have several distinct actions (such as counting,
+encapsulating, decapsulating before redirecting packets to a particular
+queue, etc.), instead of relying on several rules to achieve this and having
+applications deal with hardware implementation details regarding their
+order.
+
+Support for different priority levels on a rule basis is provided, for
+example in order to force a more specific rule to come before a more generic
+one for packets matched by both. However hardware support for more than a
+single priority level cannot be guaranteed. When supported, the number of
+available priority levels is usually low, which is why they can also be
+implemented in software by PMDs (e.g. missing priority levels may be
+emulated by reordering rules).
+
+In order to remain as hardware-agnostic as possible, by default all rules
+are considered to have the same priority, which means that the order between
+overlapping rules (when a packet is matched by several filters) is
+undefined.
+
+PMDs may refuse to create overlapping rules at a given priority level when
+they can be detected (e.g. if a pattern matches an existing filter).
+
+Thus predictable results for a given priority level can only be achieved
+with non-overlapping rules, using perfect matching on all protocol layers.
+
+Flow rules can also be grouped; the priority of a flow rule is specific to
+the group it belongs to. All flow rules in a given group are thus processed
+either before or after those of another group.
+
+Support for multiple actions per rule may be implemented internally on top
+of non-default hardware priorities, as a result both features may not be
+simultaneously available to applications.
+
+Considering that allowed pattern/actions combinations cannot be known in
+advance and would result in an impractically large number of capabilities to
+expose, a method is provided to validate a given rule from the current
+device configuration state.
+
+This enables applications to check if the rule types they need are supported
+at initialization time, before starting their data path. This method can be
+used anytime, its only requirement being that the resources needed by a rule
+should exist (e.g. a target RX queue should be configured first).
+
+Each defined rule is associated with an opaque handle managed by the PMD;
+applications are responsible for keeping it. Handles can be used for queries
+and rule management, such as retrieving counters or other data and
+destroying the associated rules.
+
+To avoid resource leaks on the PMD side, handles must be explicitly
+destroyed by the application before releasing associated resources such as
+queues and ports.
+
+The following sections cover:
+
+- **Attributes** (represented by ``struct rte_flow_attr``): properties of a
+  flow rule such as its direction (ingress or egress) and priority.
+
+- **Pattern item** (represented by ``struct rte_flow_item``): part of a
+  matching pattern that either matches specific packet data or traffic
+  properties. It can also describe properties of the pattern itself, such as
+  inverted matching.
+
+- **Matching pattern**: traffic properties to look for, a combination of any
+  number of items.
+
+- **Actions** (represented by ``struct rte_flow_action``): operations to
+  perform whenever a packet is matched by a pattern.
+
+Attributes
+~~~~~~~~~~
+
+Attribute: Group
+^^^^^^^^^^^^^^^^
+
+Flow rules can be grouped by assigning them a common group number. Lower
+values have higher priority. Group 0 has the highest priority.
+
+Although optional, applications are encouraged to group similar rules as
+much as possible to fully take advantage of hardware capabilities
+(e.g. optimized matching) and work around limitations (e.g. a single pattern
+type possibly allowed in a given group).
+
+Note that support for more than a single group is not guaranteed.
+
+Attribute: Priority
+^^^^^^^^^^^^^^^^^^^
+
+A priority level can be assigned to a flow rule. Like groups, lower values
+denote higher priority, with 0 as the maximum.
+
+A rule with priority 0 in group 8 is always matched after a rule with
+priority 8 in group 0.
+
+Group and priority levels are arbitrary and up to the application. They do
+not need to be contiguous nor start from 0. However, the maximum number of
+levels varies between devices and may be affected by existing flow rules.
+
+If a packet is matched by several rules of a given group for a given
+priority level, the outcome is undefined. It can take any path, may be
+duplicated or even cause unrecoverable errors.
+
+Note that support for more than a single priority level is not guaranteed.
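The ordering implied by groups and priorities can be illustrated with a
comparator. This is a sketch only: ``struct demo_rule``, ``demo_rule_cmp()``
and ``demo_rules_sort()`` are hypothetical names, and real PMDs are free to
implement ordering differently (or not at all).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical rule descriptor holding only the two ordering attributes. */
struct demo_rule {
	uint32_t group;
	uint32_t priority;
};

/* qsort() comparator: lower group first, then lower priority value first. */
static int
demo_rule_cmp(const void *a, const void *b)
{
	const struct demo_rule *ra = a;
	const struct demo_rule *rb = b;

	if (ra->group != rb->group)
		return ra->group < rb->group ? -1 : 1;
	if (ra->priority != rb->priority)
		return ra->priority < rb->priority ? -1 : 1;
	return 0; /* same group and priority: relative order is undefined */
}

/* Sort rules in the order they would be considered for matching. */
static void
demo_rules_sort(struct demo_rule *rules, size_t n)
{
	qsort(rules, n, sizeof(*rules), demo_rule_cmp);
}
```

Sorting the rules (group 8, priority 0) and (group 0, priority 8) places the
group 0 rule first, which mirrors the statement above.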
+
+Attribute: Traffic direction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
+
+Several pattern items and actions are valid and can be used in both
+directions. At least one direction must be specified.
+
+Specifying both directions at once for a given rule is not recommended but
+may be valid in a few cases (e.g. shared counters).
+
+Pattern item
+~~~~~~~~~~~~
+
+Pattern items fall in two categories:
+
+- Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
+  IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
+  specification structure.
+
+- Matching meta-data or affecting pattern processing (END, VOID, INVERT, PF,
+  VF, PORT and so on), often without a specification structure.
+
+Item specification structures are used to match specific values among
+protocol fields (or item properties). The documentation of each item states
+whether it is associated with such a structure and, if so, its type name.
+
+Up to three structures of the same type can be set for a given item:
+
+- ``spec``: values to match (e.g. a given IPv4 address).
+
+- ``last``: upper bound for an inclusive range with corresponding fields in
+  ``spec``.
+
+- ``mask``: bit-mask applied to both ``spec`` and ``last`` whose purpose is
+  to distinguish the values to take into account and/or partially mask them
+  out (e.g. in order to match an IPv4 address prefix).
+
+Usage restrictions and expected behavior:
+
+- Setting either ``mask`` or ``last`` without ``spec`` is an error.
+
+- Field values in ``last`` which are either 0 or equal to the corresponding
+  values in ``spec`` are ignored; they do not generate a range. Nonzero
+  values lower than those in ``spec`` are not supported.
+
+- Setting ``spec`` and optionally ``last`` without ``mask`` causes the PMD
+  to only take the fields it can recognize into account. There is no error
+  checking for unsupported fields.
+
+- Not setting any of them (assuming item type allows it) uses default
+  parameters that depend on the item type. Most of the time, particularly
+  for protocol header items, it is equivalent to providing an empty (zeroed)
+  ``mask``.
+
+- ``mask`` is a simple bit-mask applied before interpreting the contents of
+  ``spec`` and ``last``, which may yield unexpected results if not used
+  carefully. For example, if for an IPv4 address field, ``spec`` provides
+  *10.1.2.3*, ``last`` provides *10.3.4.5* and ``mask`` provides
+  *255.255.0.0*, the effective range becomes *10.1.0.0* to *10.3.255.255*.
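The IPv4 example from the last bullet can be worked through in code. The
sketch below assumes a contiguous prefix mask and host-order 32-bit values;
``struct demo_range``, ``demo_effective_range()`` and ``DEMO_IPV4()`` are
hypothetical helpers, not part of the API.

```c
#include <stdint.h>

/* Hypothetical helper type: inclusive range covered by a masked field. */
struct demo_range {
	uint32_t first;
	uint32_t last;
};

/* Build a host-order IPv4 address from its four bytes. */
#define DEMO_IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Apply mask to spec and last, then widen by the wildcard (unmasked) bits. */
static struct demo_range
demo_effective_range(uint32_t spec, uint32_t last, uint32_t mask)
{
	struct demo_range r;

	/* A zero "last" does not generate a range. */
	if (last == 0)
		last = spec;
	r.first = spec & mask;
	/* Bits cleared in the mask match anything, hence all ones at the top. */
	r.last = (last & mask) | ~mask;
	return r;
}
```

With ``spec`` 10.1.2.3, ``last`` 10.3.4.5 and ``mask`` 255.255.0.0, this
yields 10.1.0.0 to 10.3.255.255, the range given in the text.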
+
+Example of an item specification matching an Ethernet header:
+
+.. _table_rte_flow_pattern_item_example:
+
+.. table:: Ethernet item
+
+   +----------+----------+--------------------+
+   | Field    | Subfield | Value              |
+   +==========+==========+====================+
+   | ``spec`` | ``src``  | ``00:01:02:03:04`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:2a:66:00:01`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x22aa``         |
+   +----------+----------+--------------------+
+   | ``last`` | unspecified                   |
+   +----------+----------+--------------------+
+   | ``mask`` | ``src``  | ``00:ff:ff:ff:00`` |
+   |          +----------+--------------------+
+   |          | ``dst``  | ``00:00:00:00:ff`` |
+   |          +----------+--------------------+
+   |          | ``type`` | ``0x0000``         |
+   +----------+----------+--------------------+
+
+Non-masked bits stand for any value (shown as ``?`` below). Ethernet headers
+with the following properties are thus matched:
+
+- ``src``: ``??:01:02:03:??``
+- ``dst``: ``??:??:??:??:01``
+- ``type``: ``0x????``
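Masked matching as used in the table above amounts to comparing only the
bits set in ``mask``. A minimal sketch, assuming byte-string fields and a
hypothetical helper named ``demo_masked_match()``:

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if data equals spec on every bit set in mask, 0 otherwise. */
static int
demo_masked_match(const uint8_t *data, const uint8_t *spec,
		  const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i != len; ++i)
		if ((data[i] ^ spec[i]) & mask[i])
			return 0; /* a masked bit differs */
	return 1;
}
```

Using the ``src`` values as written in the table, data ``aa:01:02:03:bb``
matches since only the three middle bytes are compared, whereas
``aa:01:02:04:bb`` does not.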
+
+Matching pattern
+~~~~~~~~~~~~~~~~
+
+A pattern is formed by stacking items starting from the lowest protocol
+layer to match. This stacking restriction does not apply to meta items which
+can be placed anywhere in the stack without affecting the meaning of the
+resulting pattern.
+
+Patterns are terminated by END items.
+
+Examples:
+
+.. _table_rte_flow_tcpv4_as_l4:
+
+.. table:: TCPv4 as L4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | IPv4     |
+   +-------+----------+
+   | 2     | TCP      |
+   +-------+----------+
+   | 3     | END      |
+   +-------+----------+
+
+|
+
+.. _table_rte_flow_tcpv6_in_vxlan:
+
+.. table:: TCPv6 in VXLAN
+
+   +-------+------------+
+   | Index | Item       |
+   +=======+============+
+   | 0     | Ethernet   |
+   +-------+------------+
+   | 1     | IPv4       |
+   +-------+------------+
+   | 2     | UDP        |
+   +-------+------------+
+   | 3     | VXLAN      |
+   +-------+------------+
+   | 4     | Ethernet   |
+   +-------+------------+
+   | 5     | IPv6       |
+   +-------+------------+
+   | 6     | TCP        |
+   +-------+------------+
+   | 7     | END        |
+   +-------+------------+
+
+|
+
+.. _table_rte_flow_tcpv4_as_l4_meta:
+
+.. table:: TCPv4 as L4 with meta items
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | VOID     |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | VOID     |
+   +-------+----------+
+   | 3     | IPv4     |
+   +-------+----------+
+   | 4     | TCP      |
+   +-------+----------+
+   | 5     | VOID     |
+   +-------+----------+
+   | 6     | VOID     |
+   +-------+----------+
+   | 7     | END      |
+   +-------+----------+
+
+The above example shows how meta items do not affect packet data matching
+items, as long as those remain stacked properly. The resulting matching
+pattern is identical to "TCPv4 as L4".
+
+.. _table_rte_flow_udpv6_anywhere:
+
+.. table:: UDPv6 anywhere
+
+   +-------+------+
+   | Index | Item |
+   +=======+======+
+   | 0     | IPv6 |
+   +-------+------+
+   | 1     | UDP  |
+   +-------+------+
+   | 2     | END  |
+   +-------+------+
+
+If supported by the PMD, omitting one or several protocol layers at the
+bottom of the stack as in the above example (missing an Ethernet
+specification) enables looking up anywhere in packets.
+
+It is unspecified whether the payload of supported encapsulations
+(e.g. VXLAN payload) is matched by such a pattern, which may apply to inner,
+outer or both packets.
+
+.. _table_rte_flow_invalid_l3:
+
+.. table:: Invalid, missing L3
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | Ethernet |
+   +-------+----------+
+   | 1     | UDP      |
+   +-------+----------+
+   | 2     | END      |
+   +-------+----------+
+
+The above pattern is invalid due to a missing L3 specification between L2
+(Ethernet) and L4 (UDP). Omitting layers is only allowed at the bottom and
+at the top of the stack.
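The stacking rule can be sketched as a small checker. This is an assumption-
laden model: the ``stk_item`` enumeration and its layer numbers are
hypothetical, tunnels such as VXLAN are not handled, and actual validation is
left to each PMD.

```c
/* Hypothetical item types, a small subset of what the API defines. */
enum stk_item {
	STK_END = 0,
	STK_VOID,
	STK_ETH,
	STK_IPV4,
	STK_IPV6,
	STK_UDP,
	STK_TCP,
};

/* Map an item to its protocol layer, 0 for meta items. */
static int
stk_layer(enum stk_item item)
{
	switch (item) {
	case STK_ETH:
		return 2;
	case STK_IPV4:
	case STK_IPV6:
		return 3;
	case STK_UDP:
	case STK_TCP:
		return 4;
	default:
		return 0;
	}
}

/* Return 1 if data items are stacked without gaps, 0 otherwise. */
static int
stk_pattern_is_stacked(const enum stk_item *pattern)
{
	int prev = 0;

	for (; *pattern != STK_END; ++pattern) {
		int layer = stk_layer(*pattern);

		if (layer == 0)
			continue; /* meta items do not affect stacking */
		if (prev != 0 && layer != prev + 1)
			return 0; /* gap or wrong order inside the stack */
		prev = layer;
	}
	return 1;
}
```

The "TCPv4 as L4", "TCPv4 as L4 with meta items" and "UDPv6 anywhere"
patterns pass this check, while "Invalid, missing L3" is rejected.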
+
+Meta item types
+~~~~~~~~~~~~~~~
+
+They match meta-data or affect pattern processing instead of matching packet
+data directly, and most of them do not need a specification structure. This
+particularity allows them to be specified anywhere in the stack without
+causing any side effect.
+
+Item: ``END``
+^^^^^^^^^^^^^
+
+End marker for item lists. Prevents further processing of items, thereby
+ending the pattern.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_end:
+
+.. table:: END
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Item: ``VOID``
+^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_void:
+
+.. table:: VOID
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+One usage example for this type is generating rules that share a common
+prefix quickly without reallocating memory, only by updating item types:
+
+.. _table_rte_flow_item_void_example:
+
+.. table:: TCP, UDP or ICMP as L4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | Ethernet           |
+   +-------+--------------------+
+   | 1     | IPv4               |
+   +-------+------+------+------+
+   | 2     | UDP  | VOID | VOID |
+   +-------+------+------+------+
+   | 3     | VOID | TCP  | VOID |
+   +-------+------+------+------+
+   | 4     | VOID | VOID | ICMP |
+   +-------+------+------+------+
+   | 5     | END                |
+   +-------+--------------------+
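The table above can be read as one fixed array of six items in which only the
types at indices 2 to 4 change between rules. A minimal sketch of that idea,
using a hypothetical ``slot_item`` enumeration and ``slot_select_l4()``
helper rather than real API types:

```c
/* Hypothetical item types, a small subset of what the API defines. */
enum slot_item {
	SLOT_END = 0,
	SLOT_VOID,
	SLOT_ETH,
	SLOT_IPV4,
	SLOT_UDP,
	SLOT_TCP,
	SLOT_ICMP,
};

/*
 * Indices 2 to 4 are reserved for UDP, TCP and ICMP respectively.
 * Only item types are rewritten, the array itself is never reallocated.
 */
static void
slot_select_l4(enum slot_item pattern[6], enum slot_item l4)
{
	pattern[2] = l4 == SLOT_UDP ? SLOT_UDP : SLOT_VOID;
	pattern[3] = l4 == SLOT_TCP ? SLOT_TCP : SLOT_VOID;
	pattern[4] = l4 == SLOT_ICMP ? SLOT_ICMP : SLOT_VOID;
}
```

Calling ``slot_select_l4()`` with each L4 type in turn on the same array
produces the three columns of the table one after another.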
+
+Item: ``INVERT``
+^^^^^^^^^^^^^^^^
+
+Inverted matching, i.e. process packets that do not match the pattern.
+
+- ``spec``, ``last`` and ``mask`` are ignored.
+
+.. _table_rte_flow_item_invert:
+
+.. table:: INVERT
+
+   +----------+---------+
+   | Field    | Value   |
+   +==========+=========+
+   | ``spec`` | ignored |
+   +----------+---------+
+   | ``last`` | ignored |
+   +----------+---------+
+   | ``mask`` | ignored |
+   +----------+---------+
+
+Usage example, matching non-TCPv4 packets only:
+
+.. _table_rte_flow_item_invert_example:
+
+.. table:: Anything but TCPv4
+
+   +-------+----------+
+   | Index | Item     |
+   +=======+==========+
+   | 0     | INVERT   |
+   +-------+----------+
+   | 1     | Ethernet |
+   +-------+----------+
+   | 2     | IPv4     |
+   +-------+----------+
+   | 3     | TCP      |
+   +-------+----------+
+   | 4     | END      |
+   +-------+----------+
+
+Item: ``PF``
+^^^^^^^^^^^^
+
+Matches packets addressed to the physical function of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: PF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if applied to a VF
+  device.
+- Can be combined with any number of `Item: VF`_ to match both PF and VF
+  traffic.
+- ``spec``, ``last`` and ``mask`` must not be set.
+
+.. _table_rte_flow_item_pf:
+
+.. table:: PF
+
+   +----------+-------+
+   | Field    | Value |
+   +==========+=======+
+   | ``spec`` | unset |
+   +----------+-------+
+   | ``last`` | unset |
+   +----------+-------+
+   | ``mask`` | unset |
+   +----------+-------+
+
+Item: ``VF``
+^^^^^^^^^^^^
+
+Matches packets addressed to a virtual function ID of the device.
+
+If the underlying device function differs from the one that would normally
+receive the matched traffic, specifying this item prevents it from reaching
+that device unless the flow rule contains an `Action: VF`_. Packets are not
+duplicated between device instances by default.
+
+- Likely to return an error or never match any traffic if this causes a VF
+  device to match traffic addressed to a different VF.
+- Can be specified multiple times to match traffic addressed to several VF
+  IDs.
+- Can be combined with a PF item to match both PF and VF traffic.
+
+.. _table_rte_flow_item_vf:
+
+.. table:: VF
+
+   +----------+----------+---------------------------+
+   | Field    | Subfield | Value                     |
+   +==========+==========+===========================+
+   | ``spec`` | ``id``   | destination VF ID         |
+   +----------+----------+---------------------------+
+   | ``last`` | ``id``   | upper range value         |
+   +----------+----------+---------------------------+
+   | ``mask`` | ``id``   | zeroed to match any VF ID |
+   +----------+----------+---------------------------+
+
+Item: ``PORT``
+^^^^^^^^^^^^^^
+
+Matches packets coming from the specified physical port of the underlying
+device.
+
+The first PORT item overrides the physical port normally associated with the
+specified DPDK input port (port_id). This item can be provided several times
+to match additional physical ports.
+
+Note that physical ports are not necessarily tied to DPDK input ports
+(port_id) when those are not under DPDK control. Possible values are
+specific to each device, they are not necessarily indexed from zero and may
+not be contiguous.
+
+As a device property, the list of allowed values as well as the value
+associated with a port_id should be retrieved by other means.
+
+.. _table_rte_flow_item_port:
+
+.. table:: PORT
+
+   +----------+-----------+--------------------------------+
+   | Field    | Subfield  | Value                          |
+   +==========+===========+================================+
+   | ``spec`` | ``index`` | physical port index            |
+   +----------+-----------+--------------------------------+
+   | ``last`` | ``index`` | upper range value              |
+   +----------+-----------+--------------------------------+
+   | ``mask`` | ``index`` | zeroed to match any port index |
+   +----------+-----------+--------------------------------+
+
+Data matching item types
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Most of these are basically protocol header definitions with associated
+bit-masks. They must be specified (stacked) from lowest to highest protocol
+layer to form a matching pattern.
+
+The following list is not exhaustive, new protocols will be added in the
+future.
+
+Item: ``ANY``
+^^^^^^^^^^^^^
+
+Matches any protocol in place of the current layer. A single ANY may also
+stand for several protocol layers.
+
+This is usually specified as the first pattern item when looking for a
+protocol anywhere in a packet.
+
+.. _table_rte_flow_item_any:
+
+.. table:: ANY
+
+   +----------+----------+--------------------------------------+
+   | Field    | Subfield | Value                                |
+   +==========+==========+======================================+
+   | ``spec`` | ``num``  | number of layers covered             |
+   +----------+----------+--------------------------------------+
+   | ``last`` | ``num``  | upper range value                    |
+   +----------+----------+--------------------------------------+
+   | ``mask`` | ``num``  | zeroed to cover any number of layers |
+   +----------+----------+--------------------------------------+
+
+Example for VXLAN TCP payload matching regardless of outer L3 (IPv4 or IPv6)
+and L4 (UDP) both matched by the first ANY specification, and inner L3 (IPv4
+or IPv6) matched by the second ANY specification:
+
+.. _table_rte_flow_item_any_example:
+
+.. table:: TCP in VXLAN with wildcards
+
+   +-------+------+----------+----------+-------+
+   | Index | Item | Field    | Subfield | Value |
+   +=======+======+==========+==========+=======+
+   | 0     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 1     | ANY  | ``spec`` | ``num``  | 2     |
+   +-------+------+----------+----------+-------+
+   | 2     | VXLAN                              |
+   +-------+------------------------------------+
+   | 3     | Ethernet                           |
+   +-------+------+----------+----------+-------+
+   | 4     | ANY  | ``spec`` | ``num``  | 1     |
+   +-------+------+----------+----------+-------+
+   | 5     | TCP                                |
+   +-------+------------------------------------+
+   | 6     | END                                |
+   +-------+------------------------------------+
+
+Item: ``RAW``
+^^^^^^^^^^^^^
+
+Matches a byte string of a given length at a given offset.
+
+Offset is either absolute (using the start of the packet) or relative to the
+end of the previous matched item in the stack, in which case negative values
+are allowed.
+
+If search is enabled, offset is used as the starting point. The search area
+can be delimited by setting limit to a nonzero value, which is the maximum
+number of bytes after offset where the pattern may start.
+
+Matching a zero-length pattern is allowed, doing so resets the relative
+offset for subsequent items.
+
+- This type does not support ranges (``last`` field).
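The offset, search and limit semantics described above can be modeled over a
flat buffer. This is one possible reading of the text, not the behavior of any
given PMD: ``demo_raw_match()`` and its parameters are hypothetical, and
``base`` stands for either 0 (absolute offset) or the end of the previously
matched item (relative offset).

```c
#include <stddef.h>
#include <string.h>

/*
 * Look for pattern in buf. Returns the offset following the match, which a
 * subsequent relative item could use as its base, or -1 if nothing matches.
 */
static long
demo_raw_match(const unsigned char *buf, size_t buf_len, size_t base,
	       long offset, int search, size_t limit,
	       const unsigned char *pattern, size_t length)
{
	long start = (long)base + offset;
	size_t first, last, pos;

	/* The pattern must fit between the starting point and buffer end. */
	if (start < 0 || (size_t)start > buf_len ||
	    length > buf_len - (size_t)start)
		return -1;
	first = (size_t)start;
	/* Without search, the pattern must start exactly at "first". */
	last = first;
	if (search) {
		last = buf_len - length;
		/* A nonzero limit bounds where the pattern may start. */
		if (limit != 0 && limit < last - first)
			last = first + limit;
	}
	for (pos = first; pos <= last; ++pos)
		if (memcmp(buf + pos, pattern, length) == 0)
			return (long)(pos + length);
	return -1;
}
```

In this model a zero-length pattern always matches at the starting point and
returns it unchanged, which corresponds to resetting the relative offset for
subsequent items.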
+
+.. _table_rte_flow_item_raw:
+
+.. table:: RAW
+
+   +----------+--------------+-------------------------------------------------+
+   | Field    | Subfield     | Value                                           |
+   +==========+==============+=================================================+
+   | ``spec`` | ``relative`` | look for pattern after the previous item        |
+   |          +--------------+-------------------------------------------------+
+   |          | ``search``   | search pattern from offset (see also ``limit``) |
+   |          +--------------+-------------------------------------------------+
+   |          | ``reserved`` | reserved, must be set to zero                   |
+   |          +--------------+-------------------------------------------------+
+   |          | ``offset``   | absolute or relative offset for ``pattern``     |
+   |          +--------------+-------------------------------------------------+
+   |          | ``limit``    | search area limit for start of ``pattern``      |
+   |          +--------------+-------------------------------------------------+
+   |          | ``length``   | ``pattern`` length                              |
+   |          +--------------+-------------------------------------------------+
+   |          | ``pattern``  | byte string to look for                         |
+   +----------+--------------+-------------------------------------------------+
+   | ``last`` | if specified, either all 0 or with the same values as ``spec`` |
+   +----------+----------------------------------------------------------------+
+   | ``mask`` | bit-mask applied to ``spec`` values with usual behavior        |
+   +----------+----------------------------------------------------------------+
+
+Example pattern looking for several strings at various offsets of a UDP
+payload, using combined RAW items:
+
+.. _table_rte_flow_item_raw_example:
+
+.. table:: UDP payload matching
+
+   +-------+------+----------+--------------+-------+
+   | Index | Item | Field    | Subfield     | Value |
+   +=======+======+==========+==============+=======+
+   | 0     | Ethernet                               |
+   +-------+----------------------------------------+
+   | 1     | IPv4                                   |
+   +-------+----------------------------------------+
+   | 2     | UDP                                    |
+   +-------+------+----------+--------------+-------+
+   | 3     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 10    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "foo" |
+   +-------+------+----------+--------------+-------+
+   | 4     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | 20    |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "bar" |
+   +-------+------+----------+--------------+-------+
+   | 5     | RAW  | ``spec`` | ``relative`` | 1     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``search``   | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``offset``   | -29   |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``limit``    | 0     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``length``   | 3     |
+   |       |      |          +--------------+-------+
+   |       |      |          | ``pattern``  | "baz" |
+   +-------+------+----------+--------------+-------+
+   | 6     | END                                    |
+   +-------+----------------------------------------+
+
+This translates to:
+
+- Locate "foo" at least 10 bytes deep inside UDP payload.
+- Locate "bar" after "foo" plus 20 bytes.
+- Locate "baz" after "bar" minus 29 bytes.
+
+Such a packet may be represented as follows (not to scale)::
+
+ 0                     >= 10 B           == 20 B
+ |                  |<--------->|     |<--------->|
+ |                  |           |     |           |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+ | ETH | IPv4 | UDP | ... | baz | foo | ......... | bar | .... |
+ |-----|------|-----|-----|-----|-----|-----------|-----|------|
+                          |                             |
+                          |<--------------------------->|
+                                      == 29 B
+
+Note that matching subsequent pattern items would resume after "baz", not
+"bar" since matching is always performed after the previous item of the
+stack.
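+
+The relative offset arithmetic above can be sketched in plain C. This is a
+minimal, greedy software model written for illustration only: ``struct
+raw_spec``, ``raw_match()`` and ``raw_match_all()`` are hypothetical names
+that merely mirror the table fields. They are not part of the proposed API,
+and hardware may resolve searches differently.
+
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the RAW item "spec" fields listed above. */
struct raw_spec {
    unsigned relative;      /* offset counts from the end of the previous item */
    unsigned search;        /* search for the pattern starting at offset */
    int32_t offset;         /* absolute or relative offset */
    uint16_t limit;         /* max distance of pattern start, 0 means no limit */
    uint16_t length;        /* pattern length */
    const uint8_t *pattern; /* byte string to look for */
};

/*
 * Try to match one RAW item inside buf[0..len). prev_end is the position
 * right after the previous matched item. Returns the position right after
 * the match, or -1 when the pattern is not found.
 */
static long
raw_match(const uint8_t *buf, size_t len, long prev_end,
          const struct raw_spec *s)
{
    long start = (s->relative ? prev_end : 0) + s->offset;
    long last = start; /* last candidate start position */

    if (s->search) {
        last = (long)len - s->length;
        if (s->limit && start + s->limit < last)
            last = start + s->limit;
    }
    for (long pos = start; pos <= last; pos++) {
        if (pos < 0 || (size_t)pos + s->length > len)
            continue; /* candidate falls outside the buffer */
        if (memcmp(buf + pos, s->pattern, s->length) == 0)
            return pos + s->length;
    }
    return -1;
}

/* Match n stacked RAW items; each one resumes after the previous match. */
static long
raw_match_all(const uint8_t *buf, size_t len,
              const struct raw_spec *items, size_t n)
{
    long end = 0;

    for (size_t i = 0; i < n && end >= 0; i++)
        end = raw_match(buf, len, end, &items[i]);
    return end;
}
```
+
+Fed with the three RAW items of the example and a UDP payload laid out as in
+the diagram, ``raw_match_all()`` ends right after "baz", which is where
+matching of any subsequent item would resume.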
+
+Item: ``ETH``
+^^^^^^^^^^^^^
+
+Matches an Ethernet header.
+
+- ``dst``: destination MAC.
+- ``src``: source MAC.
+- ``type``: EtherType.
+
+Item: ``VLAN``
+^^^^^^^^^^^^^^
+
+Matches an 802.1Q/ad VLAN tag.
+
+- ``tpid``: tag protocol identifier.
+- ``tci``: tag control information.
+
+Item: ``IPV4``
+^^^^^^^^^^^^^^
+
+Matches an IPv4 header.
+
+Note: IPv4 options are handled by dedicated pattern items.
+
+- ``hdr``: IPv4 header definition (``rte_ip.h``).
+
+Item: ``IPV6``
+^^^^^^^^^^^^^^
+
+Matches an IPv6 header.
+
+Note: IPv6 extension headers are handled by dedicated pattern items.
+
+- ``hdr``: IPv6 header definition (``rte_ip.h``).
+
+Item: ``ICMP``
+^^^^^^^^^^^^^^
+
+Matches an ICMP header.
+
+- ``hdr``: ICMP header definition (``rte_icmp.h``).
+
+Item: ``UDP``
+^^^^^^^^^^^^^
+
+Matches a UDP header.
+
+- ``hdr``: UDP header definition (``rte_udp.h``).
+
+Item: ``TCP``
+^^^^^^^^^^^^^
+
+Matches a TCP header.
+
+- ``hdr``: TCP header definition (``rte_tcp.h``).
+
+Item: ``SCTP``
+^^^^^^^^^^^^^^
+
+Matches an SCTP header.
+
+- ``hdr``: SCTP header definition (``rte_sctp.h``).
+
+Item: ``VXLAN``
+^^^^^^^^^^^^^^^
+
+Matches a VXLAN header (RFC 7348).
+
+- ``flags``: normally 0x08 (I flag).
+- ``rsvd0``: reserved, normally 0x000000.
+- ``vni``: VXLAN network identifier.
+- ``rsvd1``: reserved, normally 0x00.
+
+Actions
+~~~~~~~
+
+Each possible action is represented by a type. Some have associated
+configuration structures. Several actions combined in a list can be assigned
+to a flow rule. That list is not ordered.
+
+They fall in three categories:
+
+- Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
+  processing matched packets by subsequent flow rules, unless overridden
+  with PASSTHRU.
+
+- Non-terminating actions (PASSTHRU, DUP) that leave matched packets up for
+  additional processing by subsequent flow rules.
+
+- Other non-terminating meta actions that do not affect the fate of packets
+  (END, VOID, MARK, FLAG, COUNT).
+
+When several actions are combined in a flow rule, they should all have
+different types (e.g. dropping a packet twice is not possible).
+
+Only the last action of a given type is taken into account. PMDs still
+perform error checking on the entire list.
+
+Like matching patterns, action lists are terminated by END items.
+
+*Note that PASSTHRU is the only action able to override a terminating rule.*
+
+Example of action that redirects packets to queue index 10:
+
+.. _table_rte_flow_action_example:
+
+.. table:: Queue action
+
+   +-----------+-------+
+   | Field     | Value |
+   +===========+=======+
+   | ``index`` | 10    |
+   +-----------+-------+
+
+Action list examples follow. Their order is not significant; applications
+must consider all actions to be performed simultaneously:
+
+.. _table_rte_flow_count_and_drop:
+
+.. table:: Count and drop
+
+   +-------+--------+
+   | Index | Action |
+   +=======+========+
+   | 0     | COUNT  |
+   +-------+--------+
+   | 1     | DROP   |
+   +-------+--------+
+   | 2     | END    |
+   +-------+--------+
+
+|
+
+.. _table_rte_flow_mark_count_redirect:
+
+.. table:: Mark, count and redirect
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | MARK   | ``id``    | 0x2a  |
+   +-------+--------+-----------+-------+
+   | 1     | COUNT                      |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 10    |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+|
+
+.. _table_rte_flow_redirect_queue_5:
+
+.. table:: Redirect to queue 5
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | DROP                       |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+In the above example, considering both actions are performed simultaneously,
+the end result is that only QUEUE has any effect.
+
+.. _table_rte_flow_redirect_queue_3:
+
+.. table:: Redirect to queue 3
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | QUEUE  | ``index`` | 5     |
+   +-------+--------+-----------+-------+
+   | 1     | VOID                       |
+   +-------+--------+-----------+-------+
+   | 2     | QUEUE  | ``index`` | 3     |
+   +-------+--------+-----------+-------+
+   | 3     | END                        |
+   +-------+----------------------------+
+
+As previously described, only the last action of a given type found in the
+list is taken into account. The above example also shows that VOID is
+ignored.
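+
+The "last action of a given type wins" rule can be modelled with a short C
+sketch. The ``enum act_type``, ``struct act`` and ``act_resolve()`` names
+below are hypothetical and only serve this illustration; the only value
+taken from this document is that END is 0.
+
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical action types; only END being 0 comes from the text. */
enum act_type {
    ACT_END = 0,
    ACT_VOID,
    ACT_QUEUE,
    ACT_MARK,
    ACT_COUNT,
    ACT_DROP,
    ACT_TYPE_MAX
};

/* Hypothetical action entry with a single optional parameter. */
struct act {
    enum act_type type;
    uint32_t value; /* queue index or mark id, unused otherwise */
};

/* Effective actions once the whole list has been read. */
struct act_summary {
    int present[ACT_TYPE_MAX];
    uint32_t value[ACT_TYPE_MAX];
};

/* Walk an END-terminated list, keeping only the last action of each type. */
static void
act_resolve(const struct act list[], struct act_summary *out)
{
    for (size_t t = 0; t < ACT_TYPE_MAX; t++) {
        out->present[t] = 0;
        out->value[t] = 0;
    }
    for (size_t i = 0; list[i].type != ACT_END; i++) {
        enum act_type t = list[i].type;

        if (t == ACT_VOID || t >= ACT_TYPE_MAX)
            continue; /* VOID is a placeholder, unknown types are skipped */
        out->present[t] = 1;
        out->value[t] = list[i].value; /* later entries overwrite earlier ones */
    }
}
```
+
+Applied to the "Redirect to queue 3" list, the summary reports a single
+QUEUE action with value 3 and no trace of VOID.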
+
+Action types
+~~~~~~~~~~~~
+
+Common action types are described in this section. Like pattern item types,
+this list is not exhaustive as new actions will be added in the future.
+
+Action: ``END``
+^^^^^^^^^^^^^^^
+
+End marker for action lists. Prevents further processing of actions, thereby
+ending the list.
+
+- Its numeric value is 0 for convenience.
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_end:
+
+.. table:: END
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VOID``
+^^^^^^^^^^^^^^^^
+
+Used as a placeholder for convenience. It is ignored and simply discarded by
+PMDs.
+
+- PMD support is mandatory.
+- No configurable properties.
+
+.. _table_rte_flow_action_void:
+
+.. table:: VOID
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``PASSTHRU``
+^^^^^^^^^^^^^^^^^^^^
+
+Leaves packets up for additional processing by subsequent flow rules. This
+is the default when a rule does not contain a terminating action, but can be
+specified to force a rule to become non-terminating.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_passthru:
+
+.. table:: PASSTHRU
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Example to copy a packet to a queue and continue processing by subsequent
+flow rules:
+
+.. _table_rte_flow_action_passthru_example:
+
+.. table:: Copy to queue 8
+
+   +-------+--------+-----------+-------+
+   | Index | Action | Field     | Value |
+   +=======+========+===========+=======+
+   | 0     | PASSTHRU                   |
+   +-------+--------+-----------+-------+
+   | 1     | QUEUE  | ``index`` | 8     |
+   +-------+--------+-----------+-------+
+   | 2     | END                        |
+   +-------+----------------------------+
+
+Action: ``MARK``
+^^^^^^^^^^^^^^^^
+
+Attaches a 32 bit value to packets.
+
+This value is arbitrary and application-defined. For compatibility with FDIR
+it is returned in the ``hash.fdir.hi`` mbuf field. ``PKT_RX_FDIR_ID`` is
+also set in ``ol_flags``.
+
+.. _table_rte_flow_action_mark:
+
+.. table:: MARK
+
+   +--------+-------------------------------------+
+   | Field  | Value                               |
+   +========+=====================================+
+   | ``id`` | 32 bit value to return with packets |
+   +--------+-------------------------------------+
+
+Action: ``FLAG``
+^^^^^^^^^^^^^^^^
+
+Flag packets. Similar to `Action: MARK`_ but only affects ``ol_flags``.
+
+- No configurable properties.
+
+Note: a distinctive flag must be defined for it.
+
+.. _table_rte_flow_action_flag:
+
+.. table:: FLAG
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``QUEUE``
+^^^^^^^^^^^^^^^^^
+
+Assigns packets to a given queue index.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_queue:
+
+.. table:: QUEUE
+
+   +-----------+--------------------+
+   | Field     | Value              |
+   +===========+====================+
+   | ``index`` | queue index to use |
+   +-----------+--------------------+
+
+Action: ``DROP``
+^^^^^^^^^^^^^^^^
+
+Drop packets.
+
+- No configurable properties.
+- Terminating by default.
+- PASSTHRU overrides this action if both are specified.
+
+.. _table_rte_flow_action_drop:
+
+.. table:: DROP
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``COUNT``
+^^^^^^^^^^^^^^^^^
+
+Enables counters for this rule.
+
+These counters can be retrieved and reset through ``rte_flow_query()``, see
+``struct rte_flow_query_count``.
+
+- No configurable properties.
+
+.. _table_rte_flow_action_count:
+
+.. table:: COUNT
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Query structure to retrieve and reset flow rule counters:
+
+.. _table_rte_flow_query_count:
+
+.. table:: COUNT query
+
+   +---------------+-----+-----------------------------------+
+   | Field         | I/O | Value                             |
+   +===============+=====+===================================+
+   | ``reset``     | in  | reset counter after query         |
+   +---------------+-----+-----------------------------------+
+   | ``hits_set``  | out | ``hits`` field is set             |
+   +---------------+-----+-----------------------------------+
+   | ``bytes_set`` | out | ``bytes`` field is set            |
+   +---------------+-----+-----------------------------------+
+   | ``hits``      | out | number of hits for this rule      |
+   +---------------+-----+-----------------------------------+
+   | ``bytes``     | out | number of bytes through this rule |
+   +---------------+-----+-----------------------------------+
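+
+The in/out roles of these fields can be illustrated as follows. This is only
+a sketch: ``struct count_query`` is a hypothetical layout mirroring the table
+above, and ``count_query_hits_only()`` models an imaginary driver that counts
+packets but not bytes.
+
```c
#include <stdint.h>

/* Hypothetical layout mirroring the fields of the COUNT query table. */
struct count_query {
    uint32_t reset:1;     /* in: reset counter after query */
    uint32_t hits_set:1;  /* out: hits field is set */
    uint32_t bytes_set:1; /* out: bytes field is set */
    uint64_t hits;        /* out: number of hits for this rule */
    uint64_t bytes;       /* out: number of bytes through this rule */
};

/* Hypothetical per-rule state of a driver that only counts packets. */
struct rule_counter {
    uint64_t hits;
};

/* Fill the query structure, then honor the reset request. */
static void
count_query_hits_only(struct rule_counter *c, struct count_query *q)
{
    q->hits_set = 1;
    q->hits = c->hits;
    q->bytes_set = 0; /* byte counting is not available in this model */
    q->bytes = 0;
    if (q->reset)
        c->hits = 0;
}
```
+
+A caller would check ``hits_set`` and ``bytes_set`` before trusting the
+corresponding values.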
+
+Action: ``DUP``
+^^^^^^^^^^^^^^^
+
+Duplicates packets to a given queue index.
+
+This is normally combined with QUEUE. When used alone, however, it is
+actually similar to QUEUE + PASSTHRU.
+
+- Non-terminating by default.
+
+.. _table_rte_flow_action_dup:
+
+.. table:: DUP
+
+   +-----------+------------------------------------+
+   | Field     | Value                              |
+   +===========+====================================+
+   | ``index`` | queue index to duplicate packet to |
+   +-----------+------------------------------------+
+
+Action: ``RSS``
+^^^^^^^^^^^^^^^
+
+Similar to QUEUE, except RSS is additionally performed on packets to spread
+them among several queues according to the provided parameters.
+
+Note: RSS hash result is normally stored in the ``hash.rss`` mbuf field,
+however it conflicts with `Action: MARK`_ as they share the same space. When
+both actions are specified, the RSS hash is discarded and
+``PKT_RX_RSS_HASH`` is not set in ``ol_flags``. MARK has priority. The mbuf
+structure should eventually evolve to store both.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_rss:
+
+.. table:: RSS
+
+   +--------------+------------------------------+
+   | Field        | Value                        |
+   +==============+==============================+
+   | ``rss_conf`` | RSS parameters               |
+   +--------------+------------------------------+
+   | ``num``      | number of entries in queue[] |
+   +--------------+------------------------------+
+   | ``queue[]``  | queue indices to use         |
+   +--------------+------------------------------+
+
+Action: ``PF``
+^^^^^^^^^^^^^^
+
+Redirects packets to the physical function (PF) of the current device.
+
+- No configurable properties.
+- Terminating by default.
+
+.. _table_rte_flow_action_pf:
+
+.. table:: PF
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
+Action: ``VF``
+^^^^^^^^^^^^^^
+
+Redirects packets to a virtual function (VF) of the current device.
+
+Packets matched by a VF pattern item can be redirected to their original VF
+ID instead of the specified one. The ``original`` parameter may not be
+available and is not guaranteed to work properly if the VF part is matched by
+a prior flow rule or if packets are not addressed to a VF in the first place.
+
+- Terminating by default.
+
+.. _table_rte_flow_action_vf:
+
+.. table:: VF
+
+   +--------------+--------------------------------+
+   | Field        | Value                          |
+   +==============+================================+
+   | ``original`` | use original VF ID if possible |
+   +--------------+--------------------------------+
+   | ``vf``       | VF ID to redirect packets to   |
+   +--------------+--------------------------------+
+
+Negative types
+~~~~~~~~~~~~~~
+
+All specified pattern items (``enum rte_flow_item_type``) and actions
+(``enum rte_flow_action_type``) use positive identifiers.
+
+The negative space is reserved for dynamic types generated by PMDs during
+run-time. PMDs may encounter them as a result but must not accept negative
+identifiers they are not aware of.
+
+A method to generate them remains to be defined.
+
+Planned types
+~~~~~~~~~~~~~
+
+Pattern item types will be added as new protocols are implemented.
+
+Support for variable headers will come through dedicated pattern items. For
+example, items matching specific IPv4 options and IPv6 extension headers
+would be stacked after IPv4/IPv6 items.
+
+Other action types are planned but are not defined yet. These include the
+ability to alter packet data in several ways, such as performing
+encapsulation/decapsulation of tunnel headers.
+
+Rules management
+----------------
+
+A rather simple API with few functions is provided to fully manage flow
+rules.
+
+Each created flow rule is associated with an opaque, PMD-specific handle
+pointer. The application is responsible for keeping it until the rule is
+destroyed.
+
+Flow rules are represented by ``struct rte_flow`` objects.
+
+Validation
+~~~~~~~~~~
+
+Given that expressing a definite set of device capabilities is not
+practical, a dedicated function is provided to check if a flow rule is
+supported and can be created.
+
+.. code-block:: c
+
+   int
+   rte_flow_validate(uint8_t port_id,
+                     const struct rte_flow_attr *attr,
+                     const struct rte_flow_item pattern[],
+                     const struct rte_flow_action actions[],
+                     struct rte_flow_error *error);
+
+While this function has no effect on the target device, the flow rule is
+validated against its current configuration state and the returned value
+should be considered valid by the caller for that state only.
+
+The returned value is guaranteed to remain valid only as long as no
+successful calls to ``rte_flow_create()`` or ``rte_flow_destroy()`` are made
+in the meantime and no device parameter affecting flow rules in any way is
+modified, due to possible collisions or resource limitations (although in
+such cases ``EINVAL`` should not be returned).
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 if the flow rule is valid and can be created, a negative errno value
+  otherwise (``rte_errno`` is also set). The following errors are defined:
+- ``-ENOSYS``: underlying device does not support this functionality.
+- ``-EINVAL``: unknown or invalid rule specification.
+- ``-ENOTSUP``: valid but unsupported rule specification (e.g. partial
+  bit-masks are unsupported).
+- ``-EEXIST``: collision with an existing rule.
+- ``-ENOMEM``: not enough resources.
+- ``-EBUSY``: action cannot be performed due to busy device resources, may
+  succeed if the affected queues or even the entire port are in a stopped
+  state (see ``rte_eth_dev_rx_queue_stop()`` and ``rte_eth_dev_stop()``).
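+
+An application may want to turn these return values into log messages. The
+helper below is a minimal sketch: ``flow_validate_reason()`` is a
+hypothetical name, not part of the proposed API, and its strings simply
+restate the list above.
+
```c
#include <errno.h>

/* Hypothetical helper mapping a validation result to a short description. */
static const char *
flow_validate_reason(int ret)
{
    switch (ret) {
    case 0:
        return "rule is valid and can be created";
    case -ENOSYS:
        return "device does not support this functionality";
    case -EINVAL:
        return "unknown or invalid rule specification";
    case -ENOTSUP:
        return "valid but unsupported rule specification";
    case -EEXIST:
        return "collision with an existing rule";
    case -ENOMEM:
        return "not enough resources";
    case -EBUSY:
        return "busy device resources, may succeed once stopped";
    default:
        return "unexpected error";
    }
}
```
+
+The value passed in would be the one returned by ``rte_flow_validate()``.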
+
+Creation
+~~~~~~~~
+
+Creating a flow rule is similar to validating one, except the rule is
+actually created and a handle returned.
+
+.. code-block:: c
+
+   struct rte_flow *
+   rte_flow_create(uint8_t port_id,
+                   const struct rte_flow_attr *attr,
+                   const struct rte_flow_item pattern[],
+                   const struct rte_flow_action actions[],
+                   struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``attr``: flow rule attributes.
+- ``pattern``: pattern specification (list terminated by the END pattern
+  item).
+- ``actions``: associated actions (list terminated by the END action).
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+A valid handle in case of success, NULL otherwise and ``rte_errno`` is set
+to the positive version of one of the error codes defined for
+``rte_flow_validate()``.
+
+Destruction
+~~~~~~~~~~~
+
+Flow rule destruction is not automatic, and a queue or a port should not be
+released if any flow rules are still attached to it. Applications must take
+care of performing this step before releasing resources.
+
+.. code-block:: c
+
+   int
+   rte_flow_destroy(uint8_t port_id,
+                    struct rte_flow *flow,
+                    struct rte_flow_error *error);
+
+
+Failure to destroy a flow rule handle may occur when other flow rules depend
+on it, and destroying it would result in an inconsistent state.
+
+This function is only guaranteed to succeed if handles are destroyed in
+reverse order of their creation.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to destroy.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
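+
+Since success is only guaranteed when handles are destroyed in reverse order
+of their creation, an application could keep them on a stack. The following
+is a sketch under that assumption: ``struct flow_stack``, the
+``flow_destroy_fn`` callback type (standing in for a wrapper around
+``rte_flow_destroy()``) and the logging stub are all hypothetical.
+
```c
#include <stddef.h>

struct rte_flow; /* opaque handle, as described in this document */

/* Hypothetical callback type wrapping the real destroy call. */
typedef int (*flow_destroy_fn)(struct rte_flow *flow, void *ctx);

#define FLOW_STACK_MAX 64

/* Handles stored in creation order. */
struct flow_stack {
    struct rte_flow *handle[FLOW_STACK_MAX];
    size_t count;
};

/* Remember a newly created handle; returns -1 when the stack is full. */
static int
flow_stack_push(struct flow_stack *s, struct rte_flow *flow)
{
    if (s->count == FLOW_STACK_MAX)
        return -1;
    s->handle[s->count++] = flow;
    return 0;
}

/* Destroy handles newest first, stopping at the first failure. */
static int
flow_stack_destroy_all(struct flow_stack *s, flow_destroy_fn destroy,
                       void *ctx)
{
    while (s->count > 0) {
        int ret = destroy(s->handle[s->count - 1], ctx);

        if (ret != 0)
            return ret; /* remaining handles stay on the stack */
        s->count--;
    }
    return 0;
}

/* Demonstration stub: records the destruction order instead of destroying. */
struct destroy_log {
    struct rte_flow *seen[FLOW_STACK_MAX];
    size_t n;
};

static int
destroy_and_log(struct rte_flow *flow, void *ctx)
{
    struct destroy_log *log = ctx;

    log->seen[log->n++] = flow;
    return 0;
}
```
+
+In a real application the callback would call ``rte_flow_destroy()`` with
+the relevant port ID instead of logging.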
+
+Flush
+~~~~~
+
+Convenience function to destroy all flow rule handles associated with a
+port. They are released as with successive calls to ``rte_flow_destroy()``.
+
+.. code-block:: c
+
+   int
+   rte_flow_flush(uint8_t port_id,
+                  struct rte_flow_error *error);
+
+In the unlikely event of failure, handles are still considered destroyed and
+no longer valid but the port must be assumed to be in an inconsistent state.
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Query
+~~~~~
+
+Query an existing flow rule.
+
+This function allows retrieving flow-specific data such as counters. Data
+is gathered by special actions which must be present in the flow rule
+definition.
+
+.. code-block:: c
+
+   int
+   rte_flow_query(uint8_t port_id,
+                  struct rte_flow *flow,
+                  enum rte_flow_action_type action,
+                  void *data,
+                  struct rte_flow_error *error);
+
+Arguments:
+
+- ``port_id``: port identifier of Ethernet device.
+- ``flow``: flow rule handle to query.
+- ``action``: action type to query.
+- ``data``: pointer to storage for the associated query data type.
+- ``error``: perform verbose error reporting if not NULL. PMDs initialize
+  this structure in case of error only.
+
+Return values:
+
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+
+Verbose error reporting
+-----------------------
+
+The defined *errno* values may not be accurate enough for users or
+application developers who want to investigate issues related to flow rules
+management. A dedicated error object is defined for this purpose:
+
+.. code-block:: c
+
+   enum rte_flow_error_type {
+       RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
+       RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
+       RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
+       RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
+       RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
+       RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
+       RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
+       RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
+       RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
+   };
+
+   struct rte_flow_error {
+       enum rte_flow_error_type type; /**< Cause field and error types. */
+       const void *cause; /**< Object responsible for the error. */
+       const char *message; /**< Human-readable error message. */
+   };
+
+Error type ``RTE_FLOW_ERROR_TYPE_NONE`` stands for no error, in which case
+remaining fields can be ignored. Other error types describe the type of the
+object pointed to by ``cause``.
+
+If non-NULL, ``cause`` points to the object responsible for the error. For a
+flow rule, this may be a pattern item or an individual action.
+
+If non-NULL, ``message`` provides a human-readable error message.
+
+This object is normally allocated by applications and set by PMDs in case of
+error. The message points to a constant string which does not need to be
+freed by the application; however, its pointer can be considered valid only
+as long as its associated DPDK port remains configured. Closing the
+underlying device or unloading the PMD invalidates it.
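+
+As an illustration, an application could format such an error object for
+logging as sketched below. The enum and structure are copied from the
+definitions above so that the sketch builds without DPDK headers;
+``flow_error_names`` and ``flow_error_format()`` are hypothetical and not
+part of the proposed API.
+
```c
#include <stddef.h>
#include <stdio.h>

enum rte_flow_error_type {
    RTE_FLOW_ERROR_TYPE_NONE,
    RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
    RTE_FLOW_ERROR_TYPE_HANDLE,
    RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
    RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
    RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
    RTE_FLOW_ERROR_TYPE_ATTR,
    RTE_FLOW_ERROR_TYPE_ITEM_NUM,
    RTE_FLOW_ERROR_TYPE_ITEM,
    RTE_FLOW_ERROR_TYPE_ACTION_NUM,
    RTE_FLOW_ERROR_TYPE_ACTION,
};

struct rte_flow_error {
    enum rte_flow_error_type type;
    const void *cause;
    const char *message;
};

/* Hypothetical descriptions, taken from the comments of the enum above. */
static const char *const flow_error_names[] = {
    [RTE_FLOW_ERROR_TYPE_NONE] = "no error",
    [RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
    [RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
    [RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
    [RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
    [RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
    [RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
    [RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
    [RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
    [RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
    [RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
    [RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
};

/* Write "<error type>: <message>" into buf; returns snprintf()'s result. */
static int
flow_error_format(const struct rte_flow_error *err, char *buf, size_t size)
{
    const size_t count = sizeof(flow_error_names) / sizeof(flow_error_names[0]);
    const char *what = "unknown error type";

    if (err == NULL || err->type == RTE_FLOW_ERROR_TYPE_NONE)
        return snprintf(buf, size, "no error");
    if ((size_t)err->type < count)
        what = flow_error_names[err->type];
    return snprintf(buf, size, "%s: %s", what,
                    err->message != NULL ? err->message : "(no message)");
}
```
+
+The ``cause`` pointer is left untouched here; interpreting it requires
+knowing which pattern or action list was submitted.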
+
+Caveats
+-------
+
+- DPDK does not keep track of flow rules definitions or flow rule objects
+  automatically. Applications may keep track of the former and must keep
+  track of the latter. PMDs may also do it for internal needs, however this
+  must not be relied on by applications.
+
+- Flow rules are not maintained between successive port initializations. An
+  application exiting without releasing them and restarting must re-create
+  them from scratch.
+
+- API operations are synchronous and blocking (``EAGAIN`` cannot be
+  returned).
+
+- There is no provision for reentrancy/multi-thread safety, although nothing
+  should prevent different devices from being configured at the same
+  time. PMDs may protect their control path functions accordingly.
+
+- Stopping the data path (TX/RX) should not be necessary when managing flow
+  rules. If this cannot be achieved naturally or with workarounds (such as
+  temporarily replacing the burst function pointers), an appropriate error
+  code must be returned (``EBUSY``).
+
+- PMDs, not applications, are responsible for maintaining flow rules
+  configuration when stopping and restarting a port or performing other
+  actions which may affect them. They can only be destroyed explicitly by
+  applications.
+
+For devices exposing multiple ports sharing global settings affected by flow
+rules:
+
+- All ports under DPDK control must behave consistently. PMDs are
+  responsible for making sure that existing flow rules on a port are not
+  affected by other ports.
+
+- Ports not under DPDK control (unaffected or handled by other applications)
+  are the user's responsibility. They may affect existing flow rules and cause
+  undefined behavior. PMDs aware of this may prevent flow rules creation
+  altogether in such cases.
+
+PMD interface
+-------------
+
+The PMD interface is defined in ``rte_flow_driver.h``. It is not subject to
+API/ABI versioning constraints as it is not exposed to applications and may
+evolve independently.
+
+It is currently implemented on top of the legacy filtering framework through
+filter type *RTE_ETH_FILTER_GENERIC* that accepts the single operation
+*RTE_ETH_FILTER_GET* to return PMD-specific *rte_flow* callbacks wrapped
+inside ``struct rte_flow_ops``.
+
+This overhead is temporarily necessary in order to keep compatibility with
+the legacy filtering framework, which should eventually disappear.
+
+- PMD callbacks implement exactly the interface described in `Rules
+  management`_, except for the port ID argument which has already been
+  converted to a pointer to the underlying ``struct rte_eth_dev``.
+
+- Public API functions do not process flow rules definitions at all before
+  calling PMD functions (no basic error checking, no validation
+  whatsoever). They only make sure these callbacks are non-NULL or return
+  the ``ENOSYS`` (function not supported) error.
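+
+The dispatch logic described above may look roughly like the sketch below.
+``struct flow_ops_sketch`` is a hypothetical, single-callback subset of
+``struct rte_flow_ops`` and ``flow_destroy_dispatch()`` is not an actual
+library function; the stub only exists to exercise the non-NULL path.
+
```c
#include <errno.h>
#include <stddef.h>

struct rte_eth_dev;    /* opaque in this sketch */
struct rte_flow;       /* opaque in this sketch */
struct rte_flow_error; /* opaque in this sketch */

/* Hypothetical subset of the PMD callbacks structure. */
struct flow_ops_sketch {
    int (*destroy)(struct rte_eth_dev *dev, struct rte_flow *flow,
                   struct rte_flow_error *error);
};

/*
 * Public-side wrapper: no validation of the rule itself, only a check
 * that the PMD provides the callback.
 */
static int
flow_destroy_dispatch(const struct flow_ops_sketch *ops,
                      struct rte_eth_dev *dev, struct rte_flow *flow,
                      struct rte_flow_error *error)
{
    if (ops == NULL || ops->destroy == NULL)
        return -ENOSYS; /* function not supported */
    return ops->destroy(dev, flow, error);
}

/* Demonstration stub standing in for a PMD callback that always succeeds. */
static int
flow_destroy_stub(struct rte_eth_dev *dev, struct rte_flow *flow,
                  struct rte_flow_error *error)
{
    (void)dev;
    (void)flow;
    (void)error;
    return 0;
}
```
+
+Everything beyond the NULL check, including error reporting, remains the
+responsibility of the PMD callback.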
+
+This interface additionally defines the following helper functions:
+
+- ``rte_flow_ops_get()``: get generic flow operations structure from a
+  port.
+
+- ``rte_flow_error_set()``: initialize generic flow error structure.
+
+More will be added over time.
+
+Device compatibility
+--------------------
+
+No known implementation supports all the described features.
+
+Unsupported features or combinations are not expected to be fully emulated
+in software by PMDs for performance reasons. Partially supported features
+may be completed in software as long as hardware performs most of the work
+(such as queue redirection and packet recognition).
+
+However PMDs are expected to do their best to satisfy application requests
+by working around hardware limitations as long as doing so does not affect
+the behavior of existing flow rules.
+
+The following sections provide a few examples of such cases and describe how
+PMDs should handle them. They are based on limitations built into the
+previous APIs.
+
+Global bit-masks
+~~~~~~~~~~~~~~~~
+
+Each flow rule comes with its own, per-layer bit-masks, while hardware may
+support only a single, device-wide bit-mask for a given layer type, so that
+two IPv4 rules cannot use different bit-masks.
+
+The expected behavior in this case is that PMDs automatically configure
+global bit-masks according to the needs of the first flow rule created.
+
+Subsequent rules are allowed only if their bit-masks match those; the
+``EEXIST`` error code should be returned otherwise.
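+
+A PMD could implement this behavior along the following lines. This is a
+sketch only: ``struct global_mask`` and its helpers are hypothetical, the
+mask is reduced to a single 32-bit word, and allowing the mask to be
+reconfigured once no rule uses it anymore is an assumption not stated in
+this document.
+
```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device-wide mask state for one layer type. */
struct global_mask {
    uint32_t mask;   /* mask currently programmed in hardware */
    unsigned refcnt; /* number of flow rules relying on it */
};

/* Called on rule creation: the first rule decides, others must agree. */
static int
global_mask_acquire(struct global_mask *g, uint32_t rule_mask)
{
    if (g->refcnt == 0) {
        g->mask = rule_mask; /* configure hardware for the first rule */
        g->refcnt = 1;
        return 0;
    }
    if (g->mask != rule_mask)
        return -EEXIST; /* conflicts with the mask already in use */
    g->refcnt++;
    return 0;
}

/* Called on rule destruction. */
static void
global_mask_release(struct global_mask *g)
{
    if (g->refcnt > 0)
        g->refcnt--;
}
```
+
+With this model, a second IPv4 rule using a different mask is rejected until
+the first rule is destroyed.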
+
+Unsupported layer types
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Many protocols can be simulated by crafting patterns with the `Item: RAW`_
+type.
+
+PMDs can rely on this capability to simulate support for protocols with
+headers not directly recognized by hardware.
+
+``ANY`` pattern item
+~~~~~~~~~~~~~~~~~~~~
+
+This pattern item stands for anything, which can be difficult to translate
+to something hardware would understand, particularly if followed by more
+specific types.
+
+Consider the following pattern:
+
+.. _table_rte_flow_unsupported_any:
+
+.. table:: Pattern with ANY as L3
+
+   +-------+-----------------------+
+   | Index | Item                  |
+   +=======+=======================+
+   | 0     | ETH                   |
+   +-------+-----+---------+-------+
+   | 1     | ANY | ``num`` | ``1`` |
+   +-------+-----+---------+-------+
+   | 2     | TCP                   |
+   +-------+-----------------------+
+   | 3     | END                   |
+   +-------+-----------------------+
+
+Knowing that TCP does not make sense with something other than IPv4 and IPv6
+as L3, such a pattern may be translated to two flow rules instead:
+
+.. _table_rte_flow_unsupported_any_ipv4:
+
+.. table:: ANY replaced with IPV4
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETH                |
+   +-------+--------------------+
+   | 1     | IPV4 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+|
+
+.. _table_rte_flow_unsupported_any_ipv6:
+
+.. table:: ANY replaced with IPV6
+
+   +-------+--------------------+
+   | Index | Item               |
+   +=======+====================+
+   | 0     | ETHER              |
+   +-------+--------------------+
+   | 1     | IPV6 (zeroed mask) |
+   +-------+--------------------+
+   | 2     | TCP                |
+   +-------+--------------------+
+   | 3     | END                |
+   +-------+--------------------+
+
+Note that as soon as an ANY rule covers several layers, this approach may
+yield a large number of hidden flow rules. It is thus suggested to only
+support the most common scenarios (anything as L2 and/or L3).
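
A minimal sketch of the L3 expansion shown in the tables above, assuming a pattern is a plain array of item types terminated by END. The ``enum item_type`` values and ``expand_any_l3()`` are hypothetical stand-ins for illustration; they are not part of *rte_flow*.

```c
#include <stddef.h>

/* Hypothetical item types standing in for RTE_FLOW_ITEM_TYPE_* values. */
enum item_type {
	ITEM_END, ITEM_ETH, ITEM_ANY, ITEM_IPV4, ITEM_IPV6, ITEM_TCP
};

#define MAX_ITEMS 8

/*
 * Expand a pattern whose first ANY item stands for L3 into one copy per
 * candidate L3 type (IPV4, then IPV6).
 * Returns the number of patterns written to out[], or 0 when the pattern
 * has no ANY item or does not fit in MAX_ITEMS entries.
 */
size_t
expand_any_l3(const enum item_type *pattern,
	      enum item_type out[2][MAX_ITEMS])
{
	static const enum item_type l3[2] = { ITEM_IPV4, ITEM_IPV6 };
	size_t len = 0;
	size_t any = 0;
	int found = 0;

	/* Locate the first ANY item and measure the pattern. */
	while (pattern[len] != ITEM_END) {
		if (pattern[len] == ITEM_ANY && !found) {
			any = len;
			found = 1;
		}
		++len;
	}
	if (!found || len + 1 > MAX_ITEMS)
		return 0;
	for (size_t v = 0; v < 2; ++v) {
		/* Copy all items including the terminating END. */
		for (size_t i = 0; i <= len; ++i)
			out[v][i] = pattern[i];
		out[v][any] = l3[v];
	}
	return 2;
}
```

Each additional ANY layer would multiply the number of generated patterns, which is why the text recommends limiting this approach to the most common cases.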
+
+Unsupported actions
+~~~~~~~~~~~~~~~~~~~
+
+- When combined with `Action: QUEUE`_, packet counting (`Action: COUNT`_)
+  and tagging (`Action: MARK`_ or `Action: FLAG`_) may be implemented in
+  software as long as the target queue is used by a single rule.
+
+- A rule specifying both `Action: DUP`_ + `Action: QUEUE`_ may be translated
+  to two hidden rules combining `Action: QUEUE`_ and `Action: PASSTHRU`_.
+
+- When a single target queue is provided, `Action: RSS`_ can also be
+  implemented through `Action: QUEUE`_.
+
+Flow rules priority
+~~~~~~~~~~~~~~~~~~~
+
+While it would naturally make sense, flow rules cannot be assumed to be
+processed by hardware in the same order as their creation for several
+reasons:
+
+- They may be managed internally as a tree or a hash table instead of a
+  list.
+- Removing a flow rule before adding another one can either put the new rule
+  at the end of the list or reuse a freed entry.
+- Duplication may occur when packets are matched by several rules.
+
+For overlapping rules (particularly in order to use `Action: PASSTHRU`_)
+predictable behavior is only guaranteed by using different priority levels.
+
+Priority levels are not necessarily implemented in hardware, or may be
+severely limited (e.g. a single priority bit).
+
+For these reasons, priority levels may be implemented purely in software by
+PMDs.
+
+- For devices expecting flow rules to be added in the correct order, PMDs
+  may destroy and re-create existing rules after adding a new one with
+  a higher priority.
+
+- A configurable number of dummy or empty rules can be created at
+  initialization time to save high priority slots for later.
+
+- In order to save priority levels, PMDs may evaluate whether rules are
+  likely to collide and adjust their priority accordingly.
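
The first approach in the list above (re-creating rules so that hardware sees them in the correct order) could look like the sketch below. It assumes, as *rte_flow* does, that lower values denote higher priority. ``struct sw_rule``, ``struct sw_table`` and ``sw_table_insert()`` are hypothetical names, and the fixed-size array only keeps the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RULES 16

/* Hypothetical software copy of a flow rule. */
struct sw_rule {
	uint32_t priority; /* lower value means higher priority */
	int id;            /* opaque handle for the hardware entry */
};

struct sw_table {
	struct sw_rule rules[MAX_RULES];
	size_t count;
};

/*
 * Insert a rule while keeping the table sorted by ascending priority
 * value; rules with equal priority keep their creation order.
 * Returns the index of the new rule, or -1 when the table is full.
 * Every entry located after the returned index was shifted and would
 * have to be destroyed and re-created in hardware by the caller.
 */
int
sw_table_insert(struct sw_table *t, struct sw_rule rule)
{
	size_t pos = 0;

	if (t->count == MAX_RULES)
		return -1;
	while (pos < t->count && t->rules[pos].priority <= rule.priority)
		++pos;
	memmove(&t->rules[pos + 1], &t->rules[pos],
		(t->count - pos) * sizeof(t->rules[0]));
	t->rules[pos] = rule;
	++t->count;
	return (int)pos;
}
```

Reserving dummy entries at initialization time, as in the second item of the list, would reduce how often the tail of the table has to be re-created.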
+
+Future evolutions
+-----------------
+
+- A device profile selection function which could be used to force a
+  permanent profile instead of relying on its automatic configuration based
+  on existing flow rules.
+
+- A method to optimize *rte_flow* rules with specific pattern items and
+  action types generated on the fly by PMDs. DPDK should assign negative
+  numbers to these in order to not collide with the existing types. See
+  `Negative types`_.
+
+- Adding specific egress pattern items and actions as described in
+  `Attribute: Traffic direction`_.
+
+- Optional software fallback when PMDs are unable to handle requested flow
+  rules so applications do not have to implement their own.
+
+API migration
+-------------
+
+Exhaustive list of deprecated filter types (normally prefixed with
+*RTE_ETH_FILTER_*) found in ``rte_eth_ctrl.h`` and methods to convert them
+to *rte_flow* rules.
+
+``MACVLAN`` to ``ETH`` → ``VF``, ``PF``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*MACVLAN* can be translated to a basic `Item: ETH`_ flow rule with a
+terminating `Action: VF`_ or `Action: PF`_.
+
+.. _table_rte_flow_migration_macvlan:
+
+.. table:: MACVLAN conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | VF,     |
+   |   |     +----------+-----+ PF      |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``ETHERTYPE`` to ``ETH`` → ``QUEUE``, ``DROP``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*ETHERTYPE* is basically an `Item: ETH`_ flow rule with a terminating
+`Action: QUEUE`_ or `Action: DROP`_.
+
+.. _table_rte_flow_migration_ethertype:
+
+.. table:: ETHERTYPE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | ETH | ``spec`` | any | QUEUE,  |
+   |   |     +----------+-----+ DROP    |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``FLEXIBLE`` to ``RAW`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FLEXIBLE* can be translated to one `Item: RAW`_ pattern with a terminating
+`Action: QUEUE`_ and a defined priority level.
+
+.. _table_rte_flow_migration_flexible:
+
+.. table:: FLEXIBLE conversion
+
+   +--------------------------+---------+
+   | Pattern                  | Actions |
+   +===+=====+==========+=====+=========+
+   | 0 | RAW | ``spec`` | any | QUEUE   |
+   |   |     +----------+-----+         |
+   |   |     | ``last`` | N/A |         |
+   |   |     +----------+-----+         |
+   |   |     | ``mask`` | any |         |
+   +---+-----+----------+-----+---------+
+   | 1 | END                  | END     |
+   +---+----------------------+---------+
+
+``SYN`` to ``TCP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*SYN* is an `Item: TCP`_ rule with only the ``syn`` bit enabled and masked,
+and a terminating `Action: QUEUE`_.
+
+Priority level can be set to simulate the high priority bit.
+
+.. _table_rte_flow_migration_syn:
+
+.. table:: SYN conversion
+
+   +-----------------------------------+---------+
+   | Pattern                           | Actions |
+   +===+======+==========+=============+=========+
+   | 0 | ETH  | ``spec`` | unset       | QUEUE   |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+-------------+---------+
+   | 1 | IPV4 | ``spec`` | unset       | END     |
+   |   |      +----------+-------------+         |
+   |   |      | ``last`` | unset       |         |
+   |   |      +----------+-------------+         |
+   |   |      | ``mask`` | unset       |         |
+   +---+------+----------+---------+---+         |
+   | 2 | TCP  | ``spec`` | ``syn`` | 1 |         |
+   |   |      +----------+---------+---+         |
+   |   |      | ``mask`` | ``syn`` | 1 |         |
+   +---+------+----------+---------+---+         |
+   | 3 | END                           |         |
+   +---+-------------------------------+---------+
+
+``NTUPLE`` to ``IPV4``, ``TCP``, ``UDP`` → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*NTUPLE* is similar to specifying an empty L2, `Item: IPV4`_ as L3 with
+`Item: TCP`_ or `Item: UDP`_ as L4 and a terminating `Action: QUEUE`_.
+
+A priority level can be specified as well.
+
+.. _table_rte_flow_migration_ntuple:
+
+.. table:: NTUPLE conversion
+
+   +-----------------------------+---------+
+   | Pattern                     | Actions |
+   +===+======+==========+=======+=========+
+   | 0 | ETH  | ``spec`` | unset | QUEUE   |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | unset |         |
+   +---+------+----------+-------+---------+
+   | 1 | IPV4 | ``spec`` | any   | END     |
+   |   |      +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+         |
+   | 2 | TCP, | ``spec`` | any   |         |
+   |   | UDP  +----------+-------+         |
+   |   |      | ``last`` | unset |         |
+   |   |      +----------+-------+         |
+   |   |      | ``mask`` | any   |         |
+   +---+------+----------+-------+         |
+   | 3 | END                     |         |
+   +---+-------------------------+---------+
+
+``TUNNEL`` to ``ETH``, ``IPV4``, ``IPV6``, ``VXLAN`` (or other) → ``QUEUE``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*TUNNEL* matches common IPv4 and IPv6 L3/L4-based tunnel types.
+
+In the following table, `Item: ANY`_ is used to cover the optional L4.
+
+.. _table_rte_flow_migration_tunnel:
+
+.. table:: TUNNEL conversion
+
+   +-------------------------------------------------------+---------+
+   | Pattern                                               | Actions |
+   +===+==========================+==========+=============+=========+
+   | 0 | ETH                      | ``spec`` | any         | QUEUE   |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+---------+
+   | 1 | IPV4, IPV6               | ``spec`` | any         | END     |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+         |
+   | 2 | ANY                      | ``spec`` | any         |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+---------+---+         |
+   |   |                          | ``mask`` | ``num`` | 0 |         |
+   +---+--------------------------+----------+---------+---+         |
+   | 3 | VXLAN, GENEVE, TEREDO,   | ``spec`` | any         |         |
+   |   | NVGRE, GRE, ...          +----------+-------------+         |
+   |   |                          | ``last`` | unset       |         |
+   |   |                          +----------+-------------+         |
+   |   |                          | ``mask`` | any         |         |
+   +---+--------------------------+----------+-------------+         |
+   | 4 | END                                               |         |
+   +---+---------------------------------------------------+---------+
+
+``FDIR`` to most item types → ``QUEUE``, ``DROP``, ``PASSTHRU``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*FDIR* is more complex than any other type, and there are several methods
+to emulate its functionality. Most of it is summarized in the table
+below.
+
+A few features are intentionally not supported:
+
+- The ability to configure the matching input set and masks for the entire
+  device; PMDs should take care of it automatically according to the
+  requested flow rules.
+
+  For example, if a device supports only one bit-mask per protocol type,
+  source/destination IPv4 address bit-masks can be made immutable by the
+  first created rule. Subsequent IPv4 or TCPv4 rules can only be created if
+  they are compatible.
+
+  Note that only protocol bit-masks affected by existing flow rules are
+  immutable, others can be changed later. They become mutable again after
+  the related flow rules are destroyed.
+
+- Returning four or eight bytes of matched data when using flex bytes
+  filtering. Although a specific action could implement it, it conflicts
+  with the much more useful 32-bit tagging on devices that support it.
+
+- Side effects on RSS processing of the entire device. Flow rules that
+  conflict with the current device configuration should not be
+  allowed. Similarly, device configuration should not be allowed when it
+  affects existing flow rules.
+
+- Device modes of operation. "none" is unsupported since filtering cannot be
+  disabled as long as a flow rule is present.
+
+- "MAC VLAN" or "tunnel" perfect matching modes should be automatically set
+  according to the created flow rules.
+
+- Signature mode of operation is not defined but could be handled through a
+  specific item type if needed.
+
+.. _table_rte_flow_migration_fdir:
+
+.. table:: FDIR conversion
+
+   +----------------------------------------+-----------------------+
+   | Pattern                                | Actions               |
+   +===+===================+==========+=====+=======================+
+   | 0 | ETH, RAW          | ``spec`` | any | QUEUE, DROP, PASSTHRU |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``last`` | N/A |                       |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``mask`` | any |                       |
+   +---+-------------------+----------+-----+-----------------------+
+   | 1 | IPV4, IPV6        | ``spec`` | any | MARK                  |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``last`` | N/A |                       |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``mask`` | any |                       |
+   +---+-------------------+----------+-----+-----------------------+
+   | 2 | TCP, UDP, SCTP    | ``spec`` | any | END                   |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``last`` | N/A |                       |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``mask`` | any |                       |
+   +---+-------------------+----------+-----+                       |
+   | 3 | VF, PF (optional) | ``spec`` | any |                       |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``last`` | N/A |                       |
+   |   |                   +----------+-----+                       |
+   |   |                   | ``mask`` | any |                       |
+   +---+-------------------+----------+-----+                       |
+   | 4 | END                                |                       |
+   +---+------------------------------------+-----------------------+
+
+``HASH``
+~~~~~~~~
+
+There is no counterpart to this filter type because it translates to a
+global device setting instead of a pattern item. Device settings are
+automatically set according to the created flow rules.
+
+``L2_TUNNEL`` to ``VOID`` → ``VXLAN`` (or others)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All packets are matched. This type alters incoming packets to encapsulate
+them in a chosen tunnel type, and optionally redirects them to a VF as well.
+
+The destination pool for tag based forwarding can be emulated with other
+flow rules using `Action: DUP`_.
+
+.. _table_rte_flow_migration_l2tunnel:
+
+.. table:: L2_TUNNEL conversion
+
+   +---------------------------+--------------------+
+   | Pattern                   | Actions            |
+   +===+======+==========+=====+====================+
+   | 0 | VOID | ``spec`` | N/A | VXLAN, GENEVE, ... |
+   |   |      |          |     |                    |
+   |   |      |          |     |                    |
+   |   |      +----------+-----+                    |
+   |   |      | ``last`` | N/A |                    |
+   |   |      +----------+-----+                    |
+   |   |      | ``mask`` | N/A |                    |
+   |   |      |          |     |                    |
+   +---+------+----------+-----+--------------------+
+   | 1 | END                   | VF (optional)      |
+   +---+                       +--------------------+
+   | 2 |                       | END                |
+   +---+-----------------------+--------------------+
-- 
2.1.4


* [dpdk-dev] [PATCH v5 03/26] doc: announce deprecation of legacy filter types
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 01/26] ethdev: introduce generic flow API Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 04/26] cmdline: add support for dynamic tokens Adrien Mazarguil
                               ` (23 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

They are superseded by the generic flow API (rte_flow). Target release is
not defined yet.

Suggested-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/rel_notes/deprecation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 2d17bc6..1438c77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -71,3 +71,11 @@ Deprecation Notices
 * mempool: The functions for single/multi producer/consumer are deprecated
   and will be removed in 17.02.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
+
+* ethdev: the legacy filter API, including
+  ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
+  as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
+  HASH and L2_TUNNEL, is superseded by the generic flow API (rte_flow) in
+  PMDs that implement the latter.
+  Target release for removal of the legacy API will be defined once most
+  PMDs have switched to rte_flow.
-- 
2.1.4


* [dpdk-dev] [PATCH v5 04/26] cmdline: add support for dynamic tokens
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (2 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 03/26] doc: announce deprecation of legacy filter types Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 05/26] cmdline: add alignment constraint Adrien Mazarguil
                               ` (22 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Because tokens must be hard-coded in a list that is part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.
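
As a rough illustration of the calling convention, a generator callback could look like the sketch below. It is self-contained and therefore uses stand-ins: ``struct token_hdr`` replaces cmdline_parse_token_hdr_t, ``DYNAMIC_TOKENS`` mirrors CMDLINE_PARSE_DYNAMIC_TOKENS, the second parameter is a plain ``void *`` instead of ``struct cmdline *``, and ``dyn_token_cb()`` with its two tokens is purely hypothetical.

```c
#include <stddef.h>

/* Stand-in for cmdline_parse_token_hdr_t (defined in cmdline_parse.h). */
struct token_hdr {
	const char *name;
};

#define DYNAMIC_TOKENS 128 /* mirrors CMDLINE_PARSE_DYNAMIC_TOKENS */

static struct token_hdr tok_flow = { "flow" };
static struct token_hdr tok_create = { "create" };

/*
 * Same shape as the f() callback in dynamic mode:
 * - arg0: location where the address of the next token must be stored,
 * - cl: always NULL in this mode,
 * - arg2: pointer to the whole list of dynamic tokens built so far.
 */
void
dyn_token_cb(void *arg0, void *cl, void *arg2)
{
	struct token_hdr **hdr = arg0;
	struct token_hdr *(*tokens)[DYNAMIC_TOKENS] = arg2;
	/* arg0 points inside the list, hence the index is a difference. */
	ptrdiff_t index = hdr - &(*tokens)[0];

	(void)cl;
	switch (index) {
	case 0:
		*hdr = &tok_flow;
		break;
	case 1:
		*hdr = &tok_create;
		break;
	default:
		*hdr = NULL; /* end of list */
		break;
	}
}
```

A real callback would typically inspect the tokens already stored in the list to decide which token comes next, which is what makes the list context-dependent.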

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 lib/librte_cmdline/cmdline_parse.c | 60 +++++++++++++++++++++++++++++----
 lib/librte_cmdline/cmdline_parse.h | 21 ++++++++++++
 2 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index b496067..14f5553 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -146,7 +146,9 @@ nb_common_chars(const char * s1, const char * s2)
  */
 static int
 match_inst(cmdline_parse_inst_t *inst, const char *buf,
-	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size)
+	   unsigned int nb_match_token, void *resbuf, unsigned resbuf_size,
+	   cmdline_parse_token_hdr_t
+		*(*dyn_tokens)[CMDLINE_PARSE_DYNAMIC_TOKENS])
 {
 	unsigned int token_num=0;
 	cmdline_parse_token_hdr_t * token_p;
@@ -155,6 +157,11 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 	struct cmdline_token_hdr token_hdr;
 
 	token_p = inst->tokens[token_num];
+	if (!token_p && dyn_tokens && inst->f) {
+		if (!(*dyn_tokens)[0])
+			inst->f(&(*dyn_tokens)[0], NULL, dyn_tokens);
+		token_p = (*dyn_tokens)[0];
+	}
 	if (token_p)
 		memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -196,7 +203,17 @@ match_inst(cmdline_parse_inst_t *inst, const char *buf,
 		buf += n;
 
 		token_num ++;
-		token_p = inst->tokens[token_num];
+		if (!inst->tokens[0]) {
+			if (token_num < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!(*dyn_tokens)[token_num])
+					inst->f(&(*dyn_tokens)[token_num],
+						NULL,
+						dyn_tokens);
+				token_p = (*dyn_tokens)[token_num];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[token_num];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 	}
@@ -239,6 +256,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
 	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
 	int comment = 0;
@@ -255,6 +273,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		return CMDLINE_PARSE_BAD_ARGS;
 
 	ctx = cl->ctx;
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/*
 	 * - look if the buffer contains at least one line
@@ -299,7 +318,8 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf));
+		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
 			err = CMDLINE_PARSE_BAD_ARGS;
@@ -355,6 +375,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 	cmdline_parse_token_hdr_t *token_p;
 	struct cmdline_token_hdr token_hdr;
 	char tmpbuf[CMDLINE_BUFFER_SIZE], comp_buf[CMDLINE_BUFFER_SIZE];
+	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	unsigned int partial_tok_len;
 	int comp_len = -1;
 	int tmp_len = -1;
@@ -374,6 +395,7 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 
 	debug_printf("%s called\n", __func__);
 	memset(&token_hdr, 0, sizeof(token_hdr));
+	memset(&dyn_tokens, 0, sizeof(dyn_tokens));
 
 	/* count the number of complete token to parse */
 	for (i=0 ; buf[i] ; i++) {
@@ -396,11 +418,24 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		inst = ctx[inst_num];
 		while (inst) {
 			/* parse the first tokens of the inst */
-			if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+			if (nb_token &&
+			    match_inst(inst, buf, nb_token, NULL, 0,
+				       &dyn_tokens))
 				goto next;
 
 			debug_printf("instruction match\n");
-			token_p = inst->tokens[nb_token];
+			if (!inst->tokens[0]) {
+				if (nb_token <
+				    (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+					if (!dyn_tokens[nb_token])
+						inst->f(&dyn_tokens[nb_token],
+							NULL,
+							&dyn_tokens);
+					token_p = dyn_tokens[nb_token];
+				} else
+					token_p = NULL;
+			} else
+				token_p = inst->tokens[nb_token];
 			if (token_p)
 				memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
@@ -490,10 +525,21 @@ cmdline_complete(struct cmdline *cl, const char *buf, int *state,
 		/* we need to redo it */
 		inst = ctx[inst_num];
 
-		if (nb_token && match_inst(inst, buf, nb_token, NULL, 0))
+		if (nb_token &&
+		    match_inst(inst, buf, nb_token, NULL, 0, &dyn_tokens))
 			goto next2;
 
-		token_p = inst->tokens[nb_token];
+		if (!inst->tokens[0]) {
+			if (nb_token < (CMDLINE_PARSE_DYNAMIC_TOKENS - 1)) {
+				if (!dyn_tokens[nb_token])
+					inst->f(&dyn_tokens[nb_token],
+						NULL,
+						&dyn_tokens);
+				token_p = dyn_tokens[nb_token];
+			} else
+				token_p = NULL;
+		} else
+			token_p = inst->tokens[nb_token];
 		if (token_p)
 			memcpy(&token_hdr, token_p, sizeof(token_hdr));
 
diff --git a/lib/librte_cmdline/cmdline_parse.h b/lib/librte_cmdline/cmdline_parse.h
index 4ac05d6..65b18d4 100644
--- a/lib/librte_cmdline/cmdline_parse.h
+++ b/lib/librte_cmdline/cmdline_parse.h
@@ -83,6 +83,9 @@ extern "C" {
 /* maximum buffer size for parsed result */
 #define CMDLINE_PARSE_RESULT_BUFSIZE 8192
 
+/* maximum number of dynamic tokens */
+#define CMDLINE_PARSE_DYNAMIC_TOKENS 128
+
 /**
  * Stores a pointer to the ops struct, and the offset: the place to
  * write the parsed result in the destination structure.
@@ -130,6 +133,24 @@ struct cmdline;
  * Store a instruction, which is a pointer to a callback function and
  * its parameter that is called when the instruction is parsed, a help
  * string, and a list of token composing this instruction.
+ *
+ * When no tokens are defined (tokens[0] == NULL), they are retrieved
+ * dynamically by calling f() as follows:
+ *
+ *  f((struct cmdline_token_hdr **)&token_hdr,
+ *    NULL,
+ *    (struct cmdline_token_hdr *[])tokens);
+ *
+ * The address of the resulting token is expected at the location pointed by
+ * the first argument. Can be set to NULL to end the list.
+ *
+ * The cmdline argument (struct cmdline *) is always NULL.
+ *
+ * The last argument points to the NULL-terminated list of dynamic tokens
+ * defined so far. Since token_hdr points to an index of that list, the
+ * current index can be derived as follows:
+ *
+ *  int index = token_hdr - &(*tokens)[0];
  */
 struct cmdline_inst {
 	/* f(parsed_struct, data) */
-- 
2.1.4


* [dpdk-dev] [PATCH v5 05/26] cmdline: add alignment constraint
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (3 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 04/26] cmdline: add support for dynamic tokens Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 06/26] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
                               ` (21 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

This prevents SIGBUS errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.
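
The idea can be shown in isolation with the sketch below. ``union result_storage`` and ``result_buf_is_aligned()`` are hypothetical names used only for this example, and ``RESULT_BUFSIZE`` mirrors CMDLINE_PARSE_RESULT_BUFSIZE. Wrapping the character buffer in a union with a ``long double`` member makes the compiler align the whole object, and therefore the buffer, at least as strictly as ``long double`` requires.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define RESULT_BUFSIZE 8192 /* mirrors CMDLINE_PARSE_RESULT_BUFSIZE */

/* Character buffer forced to the alignment of long double. */
union result_storage {
	char buf[RESULT_BUFSIZE];
	long double align; /* never accessed, only constrains alignment */
};

/* Return nonzero when buf can safely hold a long double-aligned object. */
int
result_buf_is_aligned(const union result_storage *r)
{
	return ((uintptr_t)r->buf % alignof(long double)) == 0;
}
```

A bare ``char`` array carries no such guarantee, so a parsed result structure written into it could start at an address unsuitable for its wider members.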

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 lib/librte_cmdline/cmdline_parse.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse.c b/lib/librte_cmdline/cmdline_parse.c
index 14f5553..763c286 100644
--- a/lib/librte_cmdline/cmdline_parse.c
+++ b/lib/librte_cmdline/cmdline_parse.c
@@ -255,7 +255,10 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 	unsigned int inst_num=0;
 	cmdline_parse_inst_t *inst;
 	const char *curbuf;
-	char result_buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+	union {
+		char buf[CMDLINE_PARSE_RESULT_BUFSIZE];
+		long double align; /* strong alignment constraint for buf */
+	} result;
 	cmdline_parse_token_hdr_t *dyn_tokens[CMDLINE_PARSE_DYNAMIC_TOKENS];
 	void (*f)(void *, struct cmdline *, void *) = NULL;
 	void *data = NULL;
@@ -318,7 +321,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 		debug_printf("INST %d\n", inst_num);
 
 		/* fully parsed */
-		tok = match_inst(inst, buf, 0, result_buf, sizeof(result_buf),
+		tok = match_inst(inst, buf, 0, result.buf, sizeof(result.buf),
 				 &dyn_tokens);
 
 		if (tok > 0) /* we matched at least one token */
@@ -353,7 +356,7 @@ cmdline_parse(struct cmdline *cl, const char * buf)
 
 	/* call func */
 	if (f) {
-		f(result_buf, cl, data);
+		f(result.buf, cl, data);
 	}
 
 	/* no match */
-- 
2.1.4


* [dpdk-dev] [PATCH v5 06/26] app/testpmd: implement basic support for rte_flow
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (4 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 05/26] cmdline: add alignment constraint Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 07/26] app/testpmd: add flow command Adrien Mazarguil
                               ` (20 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Add basic management functions for the generic flow API (validate, create,
destroy, flush, query and list). Flow rule objects and properties are
arranged in lists associated with each port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c     |   1 +
 app/test-pmd/config.c      | 498 ++++++++++++++++++++++++++++++++++++++++
 app/test-pmd/csumonly.c    |   1 +
 app/test-pmd/flowgen.c     |   1 +
 app/test-pmd/icmpecho.c    |   1 +
 app/test-pmd/ieee1588fwd.c |   1 +
 app/test-pmd/iofwd.c       |   1 +
 app/test-pmd/macfwd.c      |   1 +
 app/test-pmd/macswap.c     |   1 +
 app/test-pmd/parameters.c  |   1 +
 app/test-pmd/rxonly.c      |   1 +
 app/test-pmd/testpmd.c     |   6 +
 app/test-pmd/testpmd.h     |  27 +++
 app/test-pmd/txonly.c      |   1 +
 14 files changed, 542 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index d03a592..5d1c0dd 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -75,6 +75,7 @@
 #include <rte_string_fns.h>
 #include <rte_devargs.h>
 #include <rte_eth_ctrl.h>
+#include <rte_flow.h>
 
 #include <cmdline_rdline.h>
 #include <cmdline_parse.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 8cf537d..9716ce7 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -92,6 +92,8 @@
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
 #include <rte_cycles.h>
+#include <rte_flow.h>
+#include <rte_errno.h>
 
 #include "testpmd.h"
 
@@ -751,6 +753,502 @@ port_mtu_set(portid_t port_id, uint16_t mtu)
 	printf("Set MTU failed. diag=%d\n", diag);
 }
 
+/* Generic flow management functions. */
+
+/** Generate flow_item[] entry. */
+#define MK_FLOW_ITEM(t, s) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow pattern items. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_item[] = {
+	MK_FLOW_ITEM(END, 0),
+	MK_FLOW_ITEM(VOID, 0),
+	MK_FLOW_ITEM(INVERT, 0),
+	MK_FLOW_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+	MK_FLOW_ITEM(PF, 0),
+	MK_FLOW_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+	MK_FLOW_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+	MK_FLOW_ITEM(RAW, sizeof(struct rte_flow_item_raw)), /* +pattern[] */
+	MK_FLOW_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+	MK_FLOW_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+	MK_FLOW_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+	MK_FLOW_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+	MK_FLOW_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+	MK_FLOW_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+	MK_FLOW_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+	MK_FLOW_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+	MK_FLOW_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+};
+
+/** Compute storage space needed by item specification. */
+static void
+flow_item_spec_size(const struct rte_flow_item *item,
+		    size_t *size, size_t *pad)
+{
+	if (!item->spec)
+		goto empty;
+	switch (item->type) {
+		union {
+			const struct rte_flow_item_raw *raw;
+		} spec;
+
+	case RTE_FLOW_ITEM_TYPE_RAW:
+		spec.raw = item->spec;
+		*size = offsetof(struct rte_flow_item_raw, pattern) +
+			spec.raw->length * sizeof(*spec.raw->pattern);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate flow_action[] entry. */
+#define MK_FLOW_ACTION(t, s) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = s, \
+	}
+
+/** Information about known flow actions. */
+static const struct {
+	const char *name;
+	size_t size;
+} flow_action[] = {
+	MK_FLOW_ACTION(END, 0),
+	MK_FLOW_ACTION(VOID, 0),
+	MK_FLOW_ACTION(PASSTHRU, 0),
+	MK_FLOW_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+	MK_FLOW_ACTION(FLAG, 0),
+	MK_FLOW_ACTION(QUEUE, sizeof(struct rte_flow_action_queue)),
+	MK_FLOW_ACTION(DROP, 0),
+	MK_FLOW_ACTION(COUNT, 0),
+	MK_FLOW_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+	MK_FLOW_ACTION(RSS, sizeof(struct rte_flow_action_rss)), /* +queue[] */
+	MK_FLOW_ACTION(PF, 0),
+	MK_FLOW_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+};
+
+/** Compute storage space needed by action configuration. */
+static void
+flow_action_conf_size(const struct rte_flow_action *action,
+		      size_t *size, size_t *pad)
+{
+	if (!action->conf)
+		goto empty;
+	switch (action->type) {
+		union {
+			const struct rte_flow_action_rss *rss;
+		} conf;
+
+	case RTE_FLOW_ACTION_TYPE_RSS:
+		conf.rss = action->conf;
+		*size = offsetof(struct rte_flow_action_rss, queue) +
+			conf.rss->num * sizeof(*conf.rss->queue);
+		break;
+	default:
+empty:
+		*size = 0;
+		break;
+	}
+	*pad = RTE_ALIGN_CEIL(*size, sizeof(double)) - *size;
+}
+
+/** Generate a port_flow entry from attributes/pattern/actions. */
+static struct port_flow *
+port_flow_new(const struct rte_flow_attr *attr,
+	      const struct rte_flow_item *pattern,
+	      const struct rte_flow_action *actions)
+{
+	const struct rte_flow_item *item;
+	const struct rte_flow_action *action;
+	struct port_flow *pf = NULL;
+	size_t tmp;
+	size_t pad;
+	size_t off1 = 0;
+	size_t off2 = 0;
+	int err = ENOTSUP;
+
+store:
+	item = pattern;
+	if (pf)
+		pf->pattern = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_item *dst = NULL;
+
+		if ((unsigned int)item->type >= RTE_DIM(flow_item) ||
+		    !flow_item[item->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, item, sizeof(*item));
+		off1 += sizeof(*item);
+		flow_item_spec_size(item, &tmp, &pad);
+		if (item->spec) {
+			if (pf)
+				dst->spec = memcpy(pf->data + off2,
+						   item->spec, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->last) {
+			if (pf)
+				dst->last = memcpy(pf->data + off2,
+						   item->last, tmp);
+			off2 += tmp + pad;
+		}
+		if (item->mask) {
+			if (pf)
+				dst->mask = memcpy(pf->data + off2,
+						   item->mask, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((item++)->type != RTE_FLOW_ITEM_TYPE_END);
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	action = actions;
+	if (pf)
+		pf->actions = (void *)&pf->data[off1];
+	do {
+		struct rte_flow_action *dst = NULL;
+
+		if ((unsigned int)action->type >= RTE_DIM(flow_action) ||
+		    !flow_action[action->type].name)
+			goto notsup;
+		if (pf)
+			dst = memcpy(pf->data + off1, action, sizeof(*action));
+		off1 += sizeof(*action);
+		flow_action_conf_size(action, &tmp, &pad);
+		if (action->conf) {
+			if (pf)
+				dst->conf = memcpy(pf->data + off2,
+						   action->conf, tmp);
+			off2 += tmp + pad;
+		}
+		off2 = RTE_ALIGN_CEIL(off2, sizeof(double));
+	} while ((action++)->type != RTE_FLOW_ACTION_TYPE_END);
+	if (pf != NULL)
+		return pf;
+	off1 = RTE_ALIGN_CEIL(off1, sizeof(double));
+	tmp = RTE_ALIGN_CEIL(offsetof(struct port_flow, data), sizeof(double));
+	pf = calloc(1, tmp + off1 + off2);
+	if (pf == NULL)
+		err = errno;
+	else {
+		*pf = (const struct port_flow){
+			.size = tmp + off1 + off2,
+			.attr = *attr,
+		};
+		tmp -= offsetof(struct port_flow, data);
+		off2 = tmp + off1;
+		off1 = tmp;
+		goto store;
+	}
+notsup:
+	rte_errno = err;
+	return NULL;
+}
+
+/** Print a message out of a flow error. */
+static int
+port_flow_complain(struct rte_flow_error *error)
+{
+	static const char *const errstrlist[] = {
+		[RTE_FLOW_ERROR_TYPE_NONE] = "no error",
+		[RTE_FLOW_ERROR_TYPE_UNSPECIFIED] = "cause unspecified",
+		[RTE_FLOW_ERROR_TYPE_HANDLE] = "flow rule (handle)",
+		[RTE_FLOW_ERROR_TYPE_ATTR_GROUP] = "group field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY] = "priority field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_INGRESS] = "ingress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR_EGRESS] = "egress field",
+		[RTE_FLOW_ERROR_TYPE_ATTR] = "attributes structure",
+		[RTE_FLOW_ERROR_TYPE_ITEM_NUM] = "pattern length",
+		[RTE_FLOW_ERROR_TYPE_ITEM] = "specific pattern item",
+		[RTE_FLOW_ERROR_TYPE_ACTION_NUM] = "number of actions",
+		[RTE_FLOW_ERROR_TYPE_ACTION] = "specific action",
+	};
+	const char *errstr;
+	char buf[32];
+	int err = rte_errno;
+
+	if ((unsigned int)error->type >= RTE_DIM(errstrlist) ||
+	    !errstrlist[error->type])
+		errstr = "unknown type";
+	else
+		errstr = errstrlist[error->type];
+	printf("Caught error type %d (%s): %s%s\n",
+	       error->type, errstr,
+	       error->cause ? (snprintf(buf, sizeof(buf), "cause: %p, ",
+					error->cause), buf) : "",
+	       error->message ? error->message : "(no stated reason)");
+	return -err;
+}
+
+/** Validate flow rule. */
+int
+port_flow_validate(portid_t port_id,
+		   const struct rte_flow_attr *attr,
+		   const struct rte_flow_item *pattern,
+		   const struct rte_flow_action *actions)
+{
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x11, sizeof(error));
+	if (rte_flow_validate(port_id, attr, pattern, actions, &error))
+		return port_flow_complain(&error);
+	printf("Flow rule validated\n");
+	return 0;
+}
+
+/** Create flow rule. */
+int
+port_flow_create(portid_t port_id,
+		 const struct rte_flow_attr *attr,
+		 const struct rte_flow_item *pattern,
+		 const struct rte_flow_action *actions)
+{
+	struct rte_flow *flow;
+	struct rte_port *port;
+	struct port_flow *pf;
+	uint32_t id;
+	struct rte_flow_error error;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x22, sizeof(error));
+	flow = rte_flow_create(port_id, attr, pattern, actions, &error);
+	if (!flow)
+		return port_flow_complain(&error);
+	port = &ports[port_id];
+	if (port->flow_list) {
+		if (port->flow_list->id == UINT32_MAX) {
+			printf("Highest rule ID is already assigned, delete"
+			       " it first\n");
+			rte_flow_destroy(port_id, flow, NULL);
+			return -ENOMEM;
+		}
+		id = port->flow_list->id + 1;
+	} else
+		id = 0;
+	pf = port_flow_new(attr, pattern, actions);
+	if (!pf) {
+		int err = rte_errno;
+
+		printf("Cannot allocate flow: %s\n", rte_strerror(err));
+		rte_flow_destroy(port_id, flow, NULL);
+		return -err;
+	}
+	pf->next = port->flow_list;
+	pf->id = id;
+	pf->flow = flow;
+	port->flow_list = pf;
+	printf("Flow rule #%u created\n", pf->id);
+	return 0;
+}
+
+/** Destroy a number of flow rules. */
+int
+port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule)
+{
+	struct rte_port *port;
+	struct port_flow **tmp;
+	uint32_t c = 0;
+	int ret = 0;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	tmp = &port->flow_list;
+	while (*tmp) {
+		uint32_t i;
+
+		for (i = 0; i != n; ++i) {
+			struct rte_flow_error error;
+			struct port_flow *pf = *tmp;
+
+			if (rule[i] != pf->id)
+				continue;
+			/*
+			 * Poisoning to make sure PMDs update it in case
+			 * of error.
+			 */
+			memset(&error, 0x33, sizeof(error));
+			if (rte_flow_destroy(port_id, pf->flow, &error)) {
+				ret = port_flow_complain(&error);
+				continue;
+			}
+			printf("Flow rule #%u destroyed\n", pf->id);
+			*tmp = pf->next;
+			free(pf);
+			break;
+		}
+		if (i == n)
+			tmp = &(*tmp)->next;
+		++c;
+	}
+	return ret;
+}
+
+/** Remove all flow rules. */
+int
+port_flow_flush(portid_t port_id)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	int ret = 0;
+
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x44, sizeof(error));
+	if (rte_flow_flush(port_id, &error)) {
+		ret = port_flow_complain(&error);
+		if (port_id_is_invalid(port_id, DISABLED_WARN) ||
+		    port_id == (portid_t)RTE_PORT_ALL)
+			return ret;
+	}
+	port = &ports[port_id];
+	while (port->flow_list) {
+		struct port_flow *pf = port->flow_list->next;
+
+		free(port->flow_list);
+		port->flow_list = pf;
+	}
+	return ret;
+}
+
+/** Query a flow rule. */
+int
+port_flow_query(portid_t port_id, uint32_t rule,
+		enum rte_flow_action_type action)
+{
+	struct rte_flow_error error;
+	struct rte_port *port;
+	struct port_flow *pf;
+	const char *name;
+	union {
+		struct rte_flow_query_count count;
+	} query;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return -EINVAL;
+	port = &ports[port_id];
+	for (pf = port->flow_list; pf; pf = pf->next)
+		if (pf->id == rule)
+			break;
+	if (!pf) {
+		printf("Flow rule #%u not found\n", rule);
+		return -ENOENT;
+	}
+	if ((unsigned int)action >= RTE_DIM(flow_action) ||
+	    !flow_action[action].name)
+		name = "unknown";
+	else
+		name = flow_action[action].name;
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		break;
+	default:
+		printf("Cannot query action type %d (%s)\n", action, name);
+		return -ENOTSUP;
+	}
+	/* Poisoning to make sure PMDs update it in case of error. */
+	memset(&error, 0x55, sizeof(error));
+	memset(&query, 0, sizeof(query));
+	if (rte_flow_query(port_id, pf->flow, action, &query, &error))
+		return port_flow_complain(&error);
+	switch (action) {
+	case RTE_FLOW_ACTION_TYPE_COUNT:
+		printf("%s:\n"
+		       " hits_set: %u\n"
+		       " bytes_set: %u\n"
+		       " hits: %" PRIu64 "\n"
+		       " bytes: %" PRIu64 "\n",
+		       name,
+		       query.count.hits_set,
+		       query.count.bytes_set,
+		       query.count.hits,
+		       query.count.bytes);
+		break;
+	default:
+		printf("Cannot display result for action type %d (%s)\n",
+		       action, name);
+		break;
+	}
+	return 0;
+}
+
+/** List flow rules. */
+void
+port_flow_list(portid_t port_id, uint32_t n, const uint32_t group[n])
+{
+	struct rte_port *port;
+	struct port_flow *pf;
+	struct port_flow *list = NULL;
+	uint32_t i;
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN) ||
+	    port_id == (portid_t)RTE_PORT_ALL)
+		return;
+	port = &ports[port_id];
+	if (!port->flow_list)
+		return;
+	/* Sort flows by group, priority and ID. */
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		struct port_flow **tmp;
+
+		if (n) {
+			/* Filter out unwanted groups. */
+			for (i = 0; i != n; ++i)
+				if (pf->attr.group == group[i])
+					break;
+			if (i == n)
+				continue;
+		}
+		tmp = &list;
+		while (*tmp &&
+		       (pf->attr.group > (*tmp)->attr.group ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority > (*tmp)->attr.priority) ||
+			(pf->attr.group == (*tmp)->attr.group &&
+			 pf->attr.priority == (*tmp)->attr.priority &&
+			 pf->id > (*tmp)->id)))
+			tmp = &(*tmp)->tmp;
+		pf->tmp = *tmp;
+		*tmp = pf;
+	}
+	printf("ID\tGroup\tPrio\tAttr\tRule\n");
+	for (pf = list; pf != NULL; pf = pf->tmp) {
+		const struct rte_flow_item *item = pf->pattern;
+		const struct rte_flow_action *action = pf->actions;
+
+		printf("%" PRIu32 "\t%" PRIu32 "\t%" PRIu32 "\t%c%c\t",
+		       pf->id,
+		       pf->attr.group,
+		       pf->attr.priority,
+		       pf->attr.ingress ? 'i' : '-',
+		       pf->attr.egress ? 'e' : '-');
+		while (item->type != RTE_FLOW_ITEM_TYPE_END) {
+			if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
+				printf("%s ", flow_item[item->type].name);
+			++item;
+		}
+		printf("=>");
+		while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+			if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
+				printf(" %s", flow_action[action->type].name);
+			++action;
+		}
+		printf("\n");
+	}
+}
+
 /*
  * RX/TX ring descriptors display functions.
  */
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 57e6ae2..dd67ebf 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -70,6 +70,7 @@
 #include <rte_sctp.h>
 #include <rte_prefetch.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 #include "testpmd.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index b13ff89..13b4f90 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 6a4e750..f25a8f5 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -61,6 +61,7 @@
 #include <rte_ip.h>
 #include <rte_icmp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/ieee1588fwd.c b/app/test-pmd/ieee1588fwd.c
index 0d3b37a..51170ee 100644
--- a/app/test-pmd/ieee1588fwd.c
+++ b/app/test-pmd/ieee1588fwd.c
@@ -34,6 +34,7 @@
 
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 26936b7..15cb4a2 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -64,6 +64,7 @@
 #include <rte_ether.h>
 #include <rte_ethdev.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index 86e01de..d361db1 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 36e139f..f996039 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -65,6 +65,7 @@
 #include <rte_ethdev.h>
 #include <rte_ip.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 08e5a76..28db8cd 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -76,6 +76,7 @@
 #ifdef RTE_LIBRTE_PMD_BOND
 #include <rte_eth_bond.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index fff815c..cf00576 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -67,6 +67,7 @@
 #include <rte_ip.h>
 #include <rte_udp.h>
 #include <rte_net.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a0332c2..bfb2f8e 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -78,6 +78,7 @@
 #ifdef RTE_LIBRTE_PDUMP
 #include <rte_pdump.h>
 #endif
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
@@ -1545,6 +1546,8 @@ close_port(portid_t pid)
 			continue;
 		}
 
+		if (port->flow_list)
+			port_flow_flush(pi);
 		rte_eth_dev_close(pi);
 
 		if (rte_atomic16_cmpset(&(port->port_status),
@@ -1599,6 +1602,9 @@ detach_port(uint8_t port_id)
 		return;
 	}
 
+	if (ports[port_id].flow_list)
+		port_flow_flush(port_id);
+
 	if (rte_eth_dev_detach(port_id, name))
 		return;
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c1e703..22ce2d6 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -144,6 +144,19 @@ struct fwd_stream {
 /** Insert double VLAN header in forward engine */
 #define TESTPMD_TX_OFFLOAD_INSERT_QINQ       0x0080
 
+/** Descriptor for a single flow. */
+struct port_flow {
+	size_t size; /**< Allocated space including data[]. */
+	struct port_flow *next; /**< Next flow in list. */
+	struct port_flow *tmp; /**< Temporary linking. */
+	uint32_t id; /**< Flow rule ID. */
+	struct rte_flow *flow; /**< Opaque flow object returned by PMD. */
+	struct rte_flow_attr attr; /**< Attributes. */
+	struct rte_flow_item *pattern; /**< Pattern. */
+	struct rte_flow_action *actions; /**< Actions. */
+	uint8_t data[]; /**< Storage for pattern/actions. */
+};
+
 /**
  * The data structure associated with each port.
  */
@@ -177,6 +190,7 @@ struct rte_port {
 	struct ether_addr       *mc_addr_pool; /**< pool of multicast addrs */
 	uint32_t                mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
 	uint8_t                 slave_flag; /**< bonding slave port */
+	struct port_flow        *flow_list; /**< Associated flows. */
 };
 
 extern portid_t __rte_unused
@@ -504,6 +518,19 @@ void port_reg_bit_field_set(portid_t port_id, uint32_t reg_off,
 			    uint8_t bit1_pos, uint8_t bit2_pos, uint32_t value);
 void port_reg_display(portid_t port_id, uint32_t reg_off);
 void port_reg_set(portid_t port_id, uint32_t reg_off, uint32_t value);
+int port_flow_validate(portid_t port_id,
+		       const struct rte_flow_attr *attr,
+		       const struct rte_flow_item *pattern,
+		       const struct rte_flow_action *actions);
+int port_flow_create(portid_t port_id,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item *pattern,
+		     const struct rte_flow_action *actions);
+int port_flow_destroy(portid_t port_id, uint32_t n, const uint32_t *rule);
+int port_flow_flush(portid_t port_id);
+int port_flow_query(portid_t port_id, uint32_t rule,
+		    enum rte_flow_action_type action);
+void port_flow_list(portid_t port_id, uint32_t n, const uint32_t *group);
 
 void rx_ring_desc_display(portid_t port_id, queueid_t rxq_id, uint16_t rxd_id);
 void tx_ring_desc_display(portid_t port_id, queueid_t txq_id, uint16_t txd_id);
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 8513a06..e996f35 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -68,6 +68,7 @@
 #include <rte_tcp.h>
 #include <rte_udp.h>
 #include <rte_string_fns.h>
+#include <rte_flow.h>
 
 #include "testpmd.h"
 
-- 
2.1.4
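
For readers following port_flow_new() in the patch above: it walks the
pattern and action lists twice, first with pf == NULL to measure the space
needed, then again after calloc() to copy everything into one allocation.
A minimal standalone sketch of that measure-then-store idea follows. Note
that struct entry, struct packed, pack_entries() and the explicit
payload_len field are made up for illustration and are not part of testpmd
or rte_flow; the real code derives sizes from the item/action type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round v up to a multiple of a (a must be a power of two). */
#define ALIGN_CEIL(v, a) (((v) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical stand-in for rte_flow_item; type 0 ends the list. */
struct entry {
	int type;
	const void *payload; /* may be NULL */
	size_t payload_len;  /* assumption: length is given explicitly */
};

/* Hypothetical stand-in for struct port_flow. */
struct packed {
	size_t size;           /* total allocated size */
	struct entry *entries; /* points into data[] */
	unsigned char data[];  /* entries first, payloads after */
};

static struct packed *
pack_entries(const struct entry *src)
{
	struct packed *p = NULL;
	size_t ent_off = 0; /* offset of the next entry in data[] */
	size_t pay_off = 0; /* offset of the next payload in data[] */

	assert(src != NULL);
	for (;;) {
		const struct entry *e = src;

		/* Same loop for both passes; only copies once p is set. */
		do {
			if (p) {
				struct entry *dst =
					memcpy(p->data + ent_off, e, sizeof(*e));

				if (e->payload)
					dst->payload = memcpy(p->data + pay_off,
							      e->payload,
							      e->payload_len);
			}
			ent_off += sizeof(*e);
			if (e->payload)
				pay_off += ALIGN_CEIL(e->payload_len,
						      sizeof(double));
		} while ((e++)->type != 0);
		if (p)
			return p; /* second pass done */
		/* First pass done: sizes are known, allocate and rewind. */
		ent_off = ALIGN_CEIL(ent_off, sizeof(double));
		p = calloc(1, sizeof(*p) + ent_off + pay_off);
		if (!p)
			return NULL;
		p->size = sizeof(*p) + ent_off + pay_off;
		p->entries = (struct entry *)p->data;
		pay_off = ent_off; /* payloads start after the entries */
		ent_off = 0;
	}
}
```

Because every internal pointer refers into the same block, a single free()
releases the whole copy, which is what port_flow_destroy() relies on when
it calls free(pf).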


* [dpdk-dev] [PATCH v5 07/26] app/testpmd: add flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (5 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 06/26] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 08/26] app/testpmd: add rte_flow integer support Adrien Mazarguil
                               ` (19 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Managing generic flow API functions from the command line requires dynamic
tokens, since flow rules are not fixed and cannot be defined statically.

This commit adds a dedicated, flexible parser and its cmdline object for a
new "flow" command in a separate file.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
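
Not part of the patch: a reduced, standalone sketch of how the context
stack in cmd_flow_parse() walks the token graph may help reviewers. It
assumes exact name matching and omits callbacks, completion and the output
buffer. The LIST and FLUSH tokens are hypothetical here (this patch only
defines ZERO, END and FLOW), as is the parse_word() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum index { ZERO = 0, END, FLOW, LIST, FLUSH };

struct token {
	const char *name;
	/* NULL-terminated array of ZERO-terminated candidate lists. */
	const enum index *const *next;
};

#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }

static const struct token token_list[] = {
	[ZERO] = { .name = "ZERO", .next = NEXT(NEXT_ENTRY(FLOW)) },
	[END] = { .name = "" },
	[FLOW] = { .name = "flow", .next = NEXT(NEXT_ENTRY(LIST, FLUSH)) },
	[LIST] = { .name = "list" },   /* hypothetical sub-command */
	[FLUSH] = { .name = "flush" }, /* hypothetical sub-command */
};

#define STACK_SIZE 16

struct context {
	const enum index *next[STACK_SIZE]; /* stack of candidate lists */
	int next_num;                       /* entries in next[] */
	enum index curr;                    /* last matched token */
};

/* Consume one word: 0 if it matches a candidate, -1 otherwise. */
static int
parse_word(struct context *ctx, const char *word)
{
	const struct token *token = &token_list[ctx->curr];
	const enum index *list;
	int i;

	assert(ctx->next_num >= 0);
	/* Empty stack: seed it from the current token. */
	if (!ctx->next_num) {
		if (!token->next)
			return -1;
		ctx->next[ctx->next_num++] = token->next[0];
	}
	/* Try each candidate in the list on top of the stack. */
	list = ctx->next[ctx->next_num - 1];
	for (i = 0; list[i]; ++i)
		if (!strcmp(word, token_list[list[i]].name))
			break;
	if (!list[i])
		return -1;
	ctx->curr = list[i];
	token = &token_list[list[i]];
	/* Pop the consumed list, then push the lists that follow. */
	--ctx->next_num;
	if (token->next)
		for (i = 0; token->next[i]; ++i) {
			if (ctx->next_num == STACK_SIZE)
				return -1;
			ctx->next[ctx->next_num++] = token->next[i];
		}
	return 0;
}
```

Starting from a zeroed context, feeding "flow" then "flush" succeeds and
leaves the stack empty, which corresponds to the condition cmd_flow_tok()
uses to stop handing out tokens.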
 app/test-pmd/Makefile       |   1 +
 app/test-pmd/cmdline.c      |   4 +
 app/test-pmd/cmdline_flow.c | 439 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 444 insertions(+)

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 891b85a..5988c3e 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -47,6 +47,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-y := testpmd.c
 SRCS-y += parameters.c
 SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline.c
+SRCS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_flow.c
 SRCS-y += config.c
 SRCS-y += iofwd.c
 SRCS-y += macfwd.c
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 5d1c0dd..b124412 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -9567,6 +9567,9 @@ cmdline_parse_inst_t cmd_set_flow_director_flex_payload = {
 	},
 };
 
+/* Generic flow interface command. */
+extern cmdline_parse_inst_t cmd_flow;
+
 /* *** Classification Filters Control *** */
 /* *** Get symmetric hash enable per port *** */
 struct cmd_get_sym_hash_ena_per_port_result {
@@ -11605,6 +11608,7 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_set_hash_global_config,
 	(cmdline_parse_inst_t *)&cmd_set_hash_input_set,
 	(cmdline_parse_inst_t *)&cmd_set_fdir_input_set,
+	(cmdline_parse_inst_t *)&cmd_flow,
 	(cmdline_parse_inst_t *)&cmd_mcast_addr,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_all,
 	(cmdline_parse_inst_t *)&cmd_config_l2_tunnel_eth_type_specific,
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
new file mode 100644
index 0000000..f5aef0f
--- /dev/null
+++ b/app/test-pmd/cmdline_flow.c
@@ -0,0 +1,439 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2016 6WIND S.A.
+ *   Copyright 2016 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+/** Parser token indices. */
+enum index {
+	/* Special tokens. */
+	ZERO = 0,
+	END,
+
+	/* Top-level command. */
+	FLOW,
+};
+
+/** Maximum number of subsequent tokens and arguments on the stack. */
+#define CTX_STACK_SIZE 16
+
+/** Parser context. */
+struct context {
+	/** Stack of subsequent token lists to process. */
+	const enum index *next[CTX_STACK_SIZE];
+	enum index curr; /**< Current token index. */
+	enum index prev; /**< Index of the last token seen. */
+	int next_num; /**< Number of entries in next[]. */
+	uint32_t reparse:1; /**< Start over from the beginning. */
+	uint32_t eol:1; /**< EOL has been detected. */
+	uint32_t last:1; /**< No more arguments. */
+};
+
+/** Parser token definition. */
+struct token {
+	/** Type displayed during completion (defaults to "TOKEN"). */
+	const char *type;
+	/** Help displayed during completion (defaults to token name). */
+	const char *help;
+	/**
+	 * Lists of subsequent tokens to push on the stack. Each call to the
+	 * parser consumes the last entry of that stack.
+	 */
+	const enum index *const *next;
+	/**
+	 * Token-processing callback, returns -1 in case of error, the
+	 * length of the matched string otherwise. If NULL, attempts to
+	 * match the token name.
+	 *
+	 * If buf is not NULL, the result should be stored in it according
+	 * to context. An error is returned if not large enough.
+	 */
+	int (*call)(struct context *ctx, const struct token *token,
+		    const char *str, unsigned int len,
+		    void *buf, unsigned int size);
+	/**
+	 * Callback that provides possible values for this token, used for
+	 * completion. Returns -1 in case of error, the number of possible
+	 * values otherwise. If NULL, the token name is used.
+	 *
+	 * If buf is not NULL, entry index ent is written to buf and the
+	 * full length of the entry is returned (same behavior as
+	 * snprintf()).
+	 */
+	int (*comp)(struct context *ctx, const struct token *token,
+		    unsigned int ent, char *buf, unsigned int size);
+	/** Mandatory token name, no default value. */
+	const char *name;
+};
+
+/** Static initializer for the next field. */
+#define NEXT(...) (const enum index *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for a NEXT() entry. */
+#define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
+
+/** Parser output buffer layout expected by cmd_flow_parsed(). */
+struct buffer {
+	enum index command; /**< Flow command. */
+	uint16_t port; /**< Affected port ID. */
+};
+
+static int parse_init(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
+
+/** Token definitions. */
+static const struct token token_list[] = {
+	/* Special tokens. */
+	[ZERO] = {
+		.name = "ZERO",
+		.help = "null entry, abused as the entry point",
+		.next = NEXT(NEXT_ENTRY(FLOW)),
+	},
+	[END] = {
+		.name = "",
+		.type = "RETURN",
+		.help = "command may end here",
+	},
+	/* Top-level command. */
+	[FLOW] = {
+		.name = "flow",
+		.type = "{command} {port_id} [{arg} [...]]",
+		.help = "manage ingress/egress flow rules",
+		.call = parse_init,
+	},
+};
+
+/** Default parsing function for token name matching. */
+static int
+parse_default(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)buf;
+	(void)size;
+	if (strlen(token->name) != len || strncmp(str, token->name, len))
+		return -1;
+	return len;
+}
+
+/** Parse flow command, initialize output buffer for subsequent tokens. */
+static int
+parse_init(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	/* Make sure buffer is large enough. */
+	if (size < sizeof(*out))
+		return -1;
+	/* Initialize buffer. */
+	memset(out, 0x00, sizeof(*out));
+	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	return len;
+}
+
+/** Internal context. */
+static struct context cmd_flow_context;
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow;
+
+/** Initialize context. */
+static void
+cmd_flow_context_init(struct context *ctx)
+{
+	/* A full memset() is not necessary. */
+	ctx->curr = ZERO;
+	ctx->prev = ZERO;
+	ctx->next_num = 0;
+	ctx->reparse = 0;
+	ctx->eol = 0;
+	ctx->last = 0;
+}
+
+/** Parse a token (cmdline API). */
+static int
+cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
+	       unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token;
+	const enum index *list;
+	int len;
+	int i;
+
+	(void)hdr;
+	/* Restart as requested. */
+	if (ctx->reparse)
+		cmd_flow_context_init(ctx);
+	token = &token_list[ctx->curr];
+	/* Check argument length. */
+	ctx->eol = 0;
+	ctx->last = 1;
+	for (len = 0; src[len]; ++len)
+		if (src[len] == '#' || isspace((unsigned char)src[len]))
+			break;
+	if (!len)
+		return -1;
+	/* Last argument and EOL detection. */
+	for (i = len; src[i]; ++i)
+		if (src[i] == '#' || src[i] == '\r' || src[i] == '\n')
+			break;
+		else if (!isspace((unsigned char)src[i])) {
+			ctx->last = 0;
+			break;
+		}
+	for (; src[i]; ++i)
+		if (src[i] == '\r' || src[i] == '\n') {
+			ctx->eol = 1;
+			break;
+		}
+	/* Initialize context if necessary. */
+	if (!ctx->next_num) {
+		if (!token->next)
+			return 0;
+		ctx->next[ctx->next_num++] = token->next[0];
+	}
+	/* Process argument through candidates. */
+	ctx->prev = ctx->curr;
+	list = ctx->next[ctx->next_num - 1];
+	for (i = 0; list[i]; ++i) {
+		const struct token *next = &token_list[list[i]];
+		int tmp;
+
+		ctx->curr = list[i];
+		if (next->call)
+			tmp = next->call(ctx, next, src, len, result, size);
+		else
+			tmp = parse_default(ctx, next, src, len, result, size);
+		if (tmp == -1 || tmp != len)
+			continue;
+		token = next;
+		break;
+	}
+	if (!list[i])
+		return -1;
+	--ctx->next_num;
+	/* Push subsequent tokens if any. */
+	if (token->next)
+		for (i = 0; token->next[i]; ++i) {
+			if (ctx->next_num == RTE_DIM(ctx->next))
+				return -1;
+			ctx->next[ctx->next_num++] = token->next[i];
+		}
+	return len;
+}
+
+/** Return number of completion entries (cmdline API). */
+static int
+cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return 0;
+	/*
+	 * If there is a single token, use its completion callback, otherwise
+	 * return the number of entries.
+	 */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, 0, NULL, 0);
+	}
+	return i;
+}
+
+/** Return a completion entry (cmdline API). */
+static int
+cmd_flow_complete_get_elt(cmdline_parse_token_hdr_t *hdr, int index,
+			  char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->curr];
+	const enum index *list;
+	int i;
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	/* Count number of tokens in current list. */
+	if (ctx->next_num)
+		list = ctx->next[ctx->next_num - 1];
+	else
+		list = token->next[0];
+	for (i = 0; list[i]; ++i)
+		;
+	if (!i)
+		return -1;
+	/* If there is a single token, use its completion callback. */
+	token = &token_list[list[0]];
+	if (i == 1 && token->comp) {
+		/* Save index for cmd_flow_get_help(). */
+		ctx->prev = list[0];
+		return token->comp(ctx, token, index, dst, size) < 0 ? -1 : 0;
+	}
+	/* Otherwise make sure the index is valid and use defaults. */
+	if (index >= i)
+		return -1;
+	token = &token_list[list[index]];
+	snprintf(dst, size, "%s", token->name);
+	/* Save index for cmd_flow_get_help(). */
+	ctx->prev = list[index];
+	return 0;
+}
+
+/** Populate help strings for current token (cmdline API). */
+static int
+cmd_flow_get_help(cmdline_parse_token_hdr_t *hdr, char *dst, unsigned int size)
+{
+	struct context *ctx = &cmd_flow_context;
+	const struct token *token = &token_list[ctx->prev];
+
+	(void)hdr;
+	/* Tell cmd_flow_parse() that context must be reinitialized. */
+	ctx->reparse = 1;
+	if (!size)
+		return -1;
+	/* Set token type and update global help with details. */
+	snprintf(dst, size, "%s", (token->type ? token->type : "TOKEN"));
+	if (token->help)
+		cmd_flow.help_str = token->help;
+	else
+		cmd_flow.help_str = token->name;
+	return 0;
+}
+
+/** Token definition template (cmdline API). */
+static struct cmdline_token_hdr cmd_flow_token_hdr = {
+	.ops = &(struct cmdline_token_ops){
+		.parse = cmd_flow_parse,
+		.complete_get_nb = cmd_flow_complete_get_nb,
+		.complete_get_elt = cmd_flow_complete_get_elt,
+		.get_help = cmd_flow_get_help,
+	},
+	.offset = 0,
+};
+
+/** Populate the next dynamic token. */
+static void
+cmd_flow_tok(cmdline_parse_token_hdr_t **hdr,
+	     cmdline_parse_token_hdr_t *(*hdrs)[])
+{
+	struct context *ctx = &cmd_flow_context;
+
+	/* Always reinitialize context before requesting the first token. */
+	if (!(hdr - *hdrs))
+		cmd_flow_context_init(ctx);
+	/* Return NULL when no more tokens are expected. */
+	if (!ctx->next_num && ctx->curr) {
+		*hdr = NULL;
+		return;
+	}
+	/* Determine if command should end here. */
+	if (ctx->eol && ctx->last && ctx->next_num) {
+		const enum index *list = ctx->next[ctx->next_num - 1];
+		int i;
+
+		for (i = 0; list[i]; ++i) {
+			if (list[i] != END)
+				continue;
+			*hdr = NULL;
+			return;
+		}
+	}
+	*hdr = &cmd_flow_token_hdr;
+}
+
+/** Dispatch parsed buffer to function calls. */
+static void
+cmd_flow_parsed(const struct buffer *in)
+{
+	switch (in->command) {
+	default:
+		break;
+	}
+}
+
+/** Token generator and output processing callback (cmdline API). */
+static void
+cmd_flow_cb(void *arg0, struct cmdline *cl, void *arg2)
+{
+	if (cl == NULL)
+		cmd_flow_tok(arg0, arg2);
+	else
+		cmd_flow_parsed(arg0);
+}
+
+/** Global parser instance (cmdline API). */
+cmdline_parse_inst_t cmd_flow = {
+	.f = cmd_flow_cb,
+	.data = NULL, /**< Unused. */
+	.help_str = NULL, /**< Updated by cmd_flow_get_help(). */
+	.tokens = {
+		NULL,
+	}, /**< Tokens are returned by cmd_flow_tok(). */
+};
-- 
2.1.4

* [dpdk-dev] [PATCH v5 08/26] app/testpmd: add rte_flow integer support
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (6 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 07/26] app/testpmd: add flow command Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 09/26] app/testpmd: add flow list command Adrien Mazarguil
                               ` (18 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Parse all integer types and handle conversion to network byte order in a
single function.
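
As a rough illustration of the idea (not the code from this patch), the
sketch below parses a string and stores the result into a field described by
an offset and a size, optionally in network byte order. The names
`field_desc`, `demo_obj` and `store_int` are hypothetical; the patch itself
uses `struct arg`, `parse_int()` and the `rte_cpu_to_be_*()` helpers, which
this standalone version replaces with portable byte shifts.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct arg. */
struct field_desc {
	unsigned int hton; /* Store in network (big-endian) byte order. */
	unsigned int sign; /* Parse the string as a signed value. */
	size_t offset;     /* Byte offset of the field inside the object. */
	size_t size;       /* Field size in bytes: 1, 2, 4 or 8. */
};

/* Hypothetical target object, used only for illustration. */
struct demo_obj {
	uint16_t port;
	uint32_t id;
};

/*
 * Parse a NUL-terminated string and store the value into obj as described
 * by fd. Returns 0 on success, -1 on error.
 */
static int
store_int(void *obj, const struct field_desc *fd, const char *str)
{
	uint8_t *dst;
	uintmax_t u;
	char *end;
	size_t i;

	assert(obj != NULL && fd != NULL && str != NULL);
	errno = 0;
	u = fd->sign ?
		(uintmax_t)strtoimax(str, &end, 0) :
		strtoumax(str, &end, 0);
	/* Reject range errors, empty input and trailing garbage. */
	if (errno || end == str || *end != '\0')
		return -1;
	if (fd->size != 1 && fd->size != 2 && fd->size != 4 && fd->size != 8)
		return -1;
	dst = (uint8_t *)obj + fd->offset;
	if (fd->hton) {
		/* Most significant byte first, whatever the host order is. */
		for (i = 0; i < fd->size; ++i)
			dst[i] = (uint8_t)(u >> (8 * (fd->size - 1 - i)));
	} else {
		/* Host order: truncate to the field width, then copy. */
		uint8_t v8 = (uint8_t)u;
		uint16_t v16 = (uint16_t)u;
		uint32_t v32 = (uint32_t)u;
		uint64_t v64 = (uint64_t)u;

		switch (fd->size) {
		case 1: memcpy(dst, &v8, sizeof(v8)); break;
		case 2: memcpy(dst, &v16, sizeof(v16)); break;
		case 4: memcpy(dst, &v32, sizeof(v32)); break;
		default: memcpy(dst, &v64, sizeof(v64)); break;
		}
	}
	return 0;
}
```

Like the patch, this sketch does not range-check the value against the field
width; it is simply truncated.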

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 148 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f5aef0f..c5a4209 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -34,11 +34,14 @@
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
+#include <inttypes.h>
+#include <errno.h>
 #include <ctype.h>
 #include <string.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
+#include <rte_byteorder.h>
 #include <cmdline_parse.h>
 #include <rte_flow.h>
 
@@ -50,6 +53,10 @@ enum index {
 	ZERO = 0,
 	END,
 
+	/* Common tokens. */
+	INTEGER,
+	UNSIGNED,
+
 	/* Top-level command. */
 	FLOW,
 };
@@ -61,12 +68,24 @@ enum index {
 struct context {
 	/** Stack of subsequent token lists to process. */
 	const enum index *next[CTX_STACK_SIZE];
+	/** Arguments for stacked tokens. */
+	const void *args[CTX_STACK_SIZE];
 	enum index curr; /**< Current token index. */
 	enum index prev; /**< Index of the last token seen. */
 	int next_num; /**< Number of entries in next[]. */
+	int args_num; /**< Number of entries in args[]. */
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	void *object; /**< Address of current object for relative offsets. */
+};
+
+/** Token argument. */
+struct arg {
+	uint32_t hton:1; /**< Use network byte ordering. */
+	uint32_t sign:1; /**< Value is signed. */
+	uint32_t offset; /**< Relative offset from ctx->object. */
+	uint32_t size; /**< Field size. */
 };
 
 /** Parser token definition. */
@@ -80,6 +99,8 @@ struct token {
 	 * parser consumes the last entry of that stack.
 	 */
 	const enum index *const *next;
+	/** Arguments stack for subsequent tokens that need them. */
+	const struct arg *const *args;
 	/**
 	 * Token-processing callback, returns -1 in case of error, the
 	 * length of the matched string otherwise. If NULL, attempts to
@@ -112,6 +133,22 @@ struct token {
 /** Static initializer for a NEXT() entry. */
 #define NEXT_ENTRY(...) (const enum index []){ __VA_ARGS__, ZERO, }
 
+/** Static initializer for the args field. */
+#define ARGS(...) (const struct arg *const []){ __VA_ARGS__, NULL, }
+
+/** Static initializer for ARGS() to target a field. */
+#define ARGS_ENTRY(s, f) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
+/** Static initializer for ARGS() to target a pointer. */
+#define ARGS_ENTRY_PTR(s, f) \
+	(&(const struct arg){ \
+		.size = sizeof(*((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -121,6 +158,11 @@ struct buffer {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_int(struct context *, const struct token *,
+		     const char *, unsigned int,
+		     void *, unsigned int);
+static int comp_none(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -135,6 +177,21 @@ static const struct token token_list[] = {
 		.type = "RETURN",
 		.help = "command may end here",
 	},
+	/* Common tokens. */
+	[INTEGER] = {
+		.name = "{int}",
+		.type = "INTEGER",
+		.help = "integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[UNSIGNED] = {
+		.name = "{unsigned}",
+		.type = "UNSIGNED",
+		.help = "unsigned integer value",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -144,6 +201,23 @@ static const struct token token_list[] = {
 	},
 };
 
+/** Remove and return last entry from argument stack. */
+static const struct arg *
+pop_args(struct context *ctx)
+{
+	return ctx->args_num ? ctx->args[--ctx->args_num] : NULL;
+}
+
+/** Add entry on top of the argument stack. */
+static int
+push_args(struct context *ctx, const struct arg *arg)
+{
+	if (ctx->args_num == CTX_STACK_SIZE)
+		return -1;
+	ctx->args[ctx->args_num++] = arg;
+	return 0;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -178,9 +252,74 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->object = out;
 	return len;
 }
 
+/**
+ * Parse signed/unsigned integers from 8 to 64 bits wide.
+ *
+ * Last argument (ctx->args) is retrieved to determine integer type and
+ * storage location.
+ */
+static int
+parse_int(struct context *ctx, const struct token *token,
+	  const char *str, unsigned int len,
+	  void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	uintmax_t u;
+	char *end;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = arg->sign ?
+		(uintmax_t)strtoimax(str, &end, 0) :
+		strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	size = arg->size;
+	switch (size) {
+	case sizeof(uint8_t):
+		*(uint8_t *)buf = u;
+		break;
+	case sizeof(uint16_t):
+		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
+		break;
+	case sizeof(uint32_t):
+		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
+		break;
+	case sizeof(uint64_t):
+		*(uint64_t *)buf = arg->hton ? rte_cpu_to_be_64(u) : u;
+		break;
+	default:
+		goto error;
+	}
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/** No completion. */
+static int
+comp_none(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	(void)ctx;
+	(void)token;
+	(void)ent;
+	(void)buf;
+	(void)size;
+	return 0;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -195,9 +334,11 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->curr = ZERO;
 	ctx->prev = ZERO;
 	ctx->next_num = 0;
+	ctx->args_num = 0;
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->object = NULL;
 }
 
 /** Parse a token (cmdline API). */
@@ -270,6 +411,13 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 				return -1;
 			ctx->next[ctx->next_num++] = token->next[i];
 		}
+	/* Push arguments if any. */
+	if (token->args)
+		for (i = 0; token->args[i]; ++i) {
+			if (ctx->args_num == RTE_DIM(ctx->args))
+				return -1;
+			ctx->args[ctx->args_num++] = token->args[i];
+		}
 	return len;
 }
 
-- 
2.1.4

* [dpdk-dev] [PATCH v5 09/26] app/testpmd: add flow list command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (7 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 08/26] app/testpmd: add rte_flow integer support Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 10/26] app/testpmd: add flow flush command Adrien Mazarguil
                               ` (17 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Syntax:

 flow list {port_id} [group {group_id}] [...]

List configured flow rules on a port. Output can optionally be limited to a
given set of group identifiers.
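
The group identifiers are collected in an array that lives in the same
output buffer as the command header, right after it. The sketch below shows
that layout technique in isolation, under simplified assumptions: `list_buf`,
`align_ceil`, `list_buf_init` and `list_buf_add` are hypothetical names, and
the header is a reduced version of the patch's `struct buffer`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of the patch's struct buffer. */
struct list_buf {
	uint16_t port;     /* Affected port ID. */
	uint32_t *group;   /* Points past this header, inside the same block. */
	uint32_t group_n;  /* Number of entries stored in group[]. */
};

/* Round v up to the next multiple of align (must be a power of two). */
static uintptr_t
align_ceil(uintptr_t v, uintptr_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Set up the header at the start of a block of size bytes. */
static int
list_buf_init(void *mem, size_t size)
{
	struct list_buf *out = mem;

	if (size < sizeof(*out))
		return -1;
	out->port = 0;
	out->group_n = 0;
	/* The array starts at the first aligned address after the header. */
	out->group = (uint32_t *)align_ceil((uintptr_t)(out + 1),
					    sizeof(double));
	return 0;
}

/* Append one group ID, refusing to write past the end of the block. */
static int
list_buf_add(void *mem, size_t size, uint32_t id)
{
	struct list_buf *out = mem;
	uint8_t *entry_end = (uint8_t *)(out->group + out->group_n + 1);

	if (entry_end > (uint8_t *)mem + size)
		return -1;
	out->group[out->group_n++] = id;
	return 0;
}
```

Keeping the array inside the caller-provided block avoids any dynamic
allocation; the number of groups that can be listed is then bounded by the
size of that block.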

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   4 ++
 app/test-pmd/cmdline_flow.c | 141 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b124412..0dc6c63 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -810,6 +810,10 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"sctp-src-port|sctp-dst-port|sctp-veri-tag|none)"
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
+
+			"flow list {port_id} [group {group_id}] [...]\n"
+			"    List existing flow rules sorted by priority,"
+			" filtered by group identifiers.\n\n"
 		);
 	}
 }
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c5a4209..7a2aaa4 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,9 +56,17 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PORT_ID,
+	GROUP_ID,
 
 	/* Top-level command. */
 	FLOW,
+
+	/* Sub-level commands. */
+	LIST,
+
+	/* List arguments. */
+	LIST_GROUP,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -77,6 +85,7 @@ struct context {
 	uint32_t reparse:1; /**< Start over from the beginning. */
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
+	uint16_t port; /**< Current port ID (for completions). */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -153,16 +162,36 @@ struct token {
 struct buffer {
 	enum index command; /**< Flow command. */
 	uint16_t port; /**< Affected port ID. */
+	union {
+		struct {
+			uint32_t *group;
+			uint32_t group_n;
+		} list; /**< List arguments. */
+	} args; /**< Command arguments. */
+};
+
+static const enum index next_list_attr[] = {
+	LIST_GROUP,
+	END,
+	ZERO,
 };
 
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_list(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_port(struct context *, const struct token *,
+		      const char *, unsigned int,
+		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_port(struct context *, const struct token *,
+		     unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -192,13 +221,44 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PORT_ID] = {
+		.name = "{port_id}",
+		.type = "PORT ID",
+		.help = "port identifier",
+		.call = parse_port,
+		.comp = comp_port,
+	},
+	[GROUP_ID] = {
+		.name = "{group_id}",
+		.type = "GROUP ID",
+		.help = "group identifier",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
+		.next = NEXT(NEXT_ENTRY(LIST)),
 		.call = parse_init,
 	},
+	/* Sub-level commands. */
+	[LIST] = {
+		.name = "list",
+		.help = "list existing flow rules",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_list,
+	},
+	/* List arguments. */
+	[LIST_GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_list_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
+		.call = parse_list,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -256,6 +316,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for list command. */
+static int
+parse_list(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != LIST)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.list.group =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
+	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.list.group + out->args.list.group_n++;
+	return len;
+}
+
 /**
  * Parse signed/unsigned integers 8 to 64-bit long.
  *
@@ -307,6 +400,29 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/** Parse port and update context. */
+static int
+parse_port(struct context *ctx, const struct token *token,
+	   const char *str, unsigned int len,
+	   void *buf, unsigned int size)
+{
+	struct buffer *out = &(struct buffer){ .port = 0 };
+	int ret;
+
+	if (buf)
+		out = buf;
+	else {
+		ctx->object = out;
+		size = sizeof(*out);
+	}
+	ret = parse_int(ctx, token, str, len, out, size);
+	if (ret >= 0)
+		ctx->port = out->port;
+	if (!buf)
+		ctx->object = NULL;
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -320,6 +436,26 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete available ports. */
+static int
+comp_port(struct context *ctx, const struct token *token,
+	  unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	portid_t p;
+
+	(void)ctx;
+	(void)token;
+	FOREACH_PORT(p, ports) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", p);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -338,6 +474,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->reparse = 0;
 	ctx->eol = 0;
 	ctx->last = 0;
+	ctx->port = 0;
 	ctx->object = NULL;
 }
 
@@ -561,6 +698,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case LIST:
+		port_flow_list(in->port, in->args.list.group_n,
+			       in->args.list.group);
+		break;
 	default:
 		break;
 	}
-- 
2.1.4

* [dpdk-dev] [PATCH v5 10/26] app/testpmd: add flow flush command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (8 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 09/26] app/testpmd: add flow list command Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 11/26] app/testpmd: add flow destroy command Adrien Mazarguil
                               ` (16 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Syntax:

 flow flush {port_id}

Destroy all flow rules on a port.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  3 +++
 app/test-pmd/cmdline_flow.c | 43 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0dc6c63..6e2b289 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow flush {port_id}\n"
+			"    Destroy all flow rules.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7a2aaa4..5972b80 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -63,6 +63,7 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	FLUSH,
 	LIST,
 
 	/* List arguments. */
@@ -179,6 +180,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_flush(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -240,10 +244,19 @@ static const struct token token_list[] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
-		.next = NEXT(NEXT_ENTRY(LIST)),
+		.next = NEXT(NEXT_ENTRY
+			     (FLUSH,
+			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[FLUSH] = {
+		.name = "flush",
+		.help = "destroy all flow rules",
+		.next = NEXT(NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_flush,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -316,6 +329,31 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for flush command. */
+static int
+parse_flush(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != FLUSH)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+	}
+	return len;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -698,6 +736,9 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case FLUSH:
+		port_flow_flush(in->port);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4

* [dpdk-dev] [PATCH v5 11/26] app/testpmd: add flow destroy command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (9 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 10/26] app/testpmd: add flow flush command Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 12/26] app/testpmd: add flow validate/create commands Adrien Mazarguil
                               ` (15 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Syntax:

 flow destroy {port_id} rule {rule_id} [...]

Destroy a given set of flow rules associated with a port.
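
Rule identifiers can be completed from the list of rules currently attached
to the port. The completion callbacks in this series serve two purposes
depending on whether an output buffer is given: counting candidates, or
writing one of them. A minimal sketch of that convention follows; `rule` and
`complete_rule_id` are hypothetical names, and the real `comp_rule_id()`
walks testpmd's per-port flow list instead.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical rule list node; testpmd's own structure has more fields. */
struct rule {
	struct rule *next;
	uint32_t id;
};

/*
 * With buf == NULL, return the number of completion candidates.
 * Otherwise write candidate number ent into buf and return its length,
 * or -1 when ent is out of range.
 */
static int
complete_rule_id(const struct rule *head, unsigned int ent,
		 char *buf, unsigned int size)
{
	const struct rule *r;
	unsigned int i = 0;

	for (r = head; r != NULL; r = r->next) {
		if (buf && i == ent)
			return snprintf(buf, size, "%" PRIu32, r->id);
		++i;
	}
	if (buf)
		return -1;
	return (int)i;
}
```

A caller would typically invoke it once without a buffer to learn how many
entries exist, then once per entry to retrieve each string.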

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 ++
 app/test-pmd/cmdline_flow.c | 106 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 6e2b289..80ddda2 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow destroy {port_id} rule {rule_id} [...]\n"
+			"    Destroy specific flow rules.\n\n"
+
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5972b80..5c45b3a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
 
@@ -63,9 +64,13 @@ enum index {
 	FLOW,
 
 	/* Sub-level commands. */
+	DESTROY,
 	FLUSH,
 	LIST,
 
+	/* Destroy arguments. */
+	DESTROY_RULE,
+
 	/* List arguments. */
 	LIST_GROUP,
 };
@@ -165,12 +170,22 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			uint32_t *rule;
+			uint32_t rule_n;
+		} destroy; /**< Destroy arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
 	} args; /**< Command arguments. */
 };
 
+static const enum index next_destroy_attr[] = {
+	DESTROY_RULE,
+	END,
+	ZERO,
+};
+
 static const enum index next_list_attr[] = {
 	LIST_GROUP,
 	END,
@@ -180,6 +195,9 @@ static const enum index next_list_attr[] = {
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_destroy(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
@@ -196,6 +214,8 @@ static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_rule_id(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -225,6 +245,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[RULE_ID] = {
+		.name = "{rule_id}",
+		.type = "RULE ID",
+		.help = "rule identifier",
+		.call = parse_int,
+		.comp = comp_rule_id,
+	},
 	[PORT_ID] = {
 		.name = "{port_id}",
 		.type = "PORT ID",
@@ -245,11 +272,19 @@ static const struct token token_list[] = {
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (FLUSH,
+			     (DESTROY,
+			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[DESTROY] = {
+		.name = "destroy",
+		.help = "destroy specific flow rules",
+		.next = NEXT(NEXT_ENTRY(DESTROY_RULE), NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_destroy,
+	},
 	[FLUSH] = {
 		.name = "flush",
 		.help = "destroy all flow rules",
@@ -264,6 +299,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_list,
 	},
+	/* Destroy arguments. */
+	[DESTROY_RULE] = {
+		.name = "rule",
+		.help = "specify a rule identifier",
+		.next = NEXT(next_destroy_attr, NEXT_ENTRY(RULE_ID)),
+		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
+		.call = parse_destroy,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -329,6 +372,39 @@ parse_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for destroy command. */
+static int
+parse_destroy(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != DESTROY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->object = out;
+		out->args.destroy.rule =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		return len;
+	}
+	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
+	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
+		return -1;
+	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	return len;
+}
+
 /** Parse tokens for flush command. */
 static int
 parse_flush(struct context *ctx, const struct token *token,
@@ -494,6 +570,30 @@ comp_port(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete available rule IDs. */
+static int
+comp_rule_id(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i = 0;
+	struct rte_port *port;
+	struct port_flow *pf;
+
+	(void)token;
+	if (port_id_is_invalid(ctx->port, DISABLED_WARN) ||
+	    ctx->port == (uint16_t)RTE_PORT_ALL)
+		return -1;
+	port = &ports[ctx->port];
+	for (pf = port->flow_list; pf != NULL; pf = pf->next) {
+		if (buf && i == ent)
+			return snprintf(buf, size, "%u", pf->id);
+		++i;
+	}
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
@@ -736,6 +836,10 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case DESTROY:
+		port_flow_destroy(in->port, in->args.destroy.rule_n,
+				  in->args.destroy.rule);
+		break;
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
-- 
2.1.4

* [dpdk-dev] [PATCH v5 12/26] app/testpmd: add flow validate/create commands
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (10 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 11/26] app/testpmd: add flow destroy command Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 13/26] app/testpmd: add flow query command Adrien Mazarguil
                               ` (14 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Syntax:

 flow (validate|create) {port_id}
    [group {group_id}] [priority {level}] [ingress] [egress]
    pattern {item} [/ {item} [...]] / end
    actions {action} [/ {action} [...]] / end

Either check the validity of a flow rule or create it. Any number of
pattern items and actions can be provided in any order. Completion is
available for convenience.

This commit only adds support for the most basic item and action types,
namely:

- END: terminates pattern items and actions lists.
- VOID: item/action filler, no operation.
- INVERT: inverted pattern matching, process packets that do not match.
- PASSTHRU: action that leaves packets up for additional processing by
  subsequent flow rules.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |  14 ++
 app/test-pmd/cmdline_flow.c | 314 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 327 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 80ddda2..23f4b48 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -811,6 +811,20 @@ static void cmd_help_long_parsed(void *parsed_result,
 			" (select|add)\n"
 			"    Set the input set for FDir.\n\n"
 
+			"flow validate {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Check whether a flow rule can be created.\n\n"
+
+			"flow create {port_id}"
+			" [group {group_id}] [priority {level}]"
+			" [ingress] [egress]"
+			" pattern {item} [/ {item} [...]] / end"
+			" actions {action} [/ {action} [...]] / end\n"
+			"    Create a flow rule.\n\n"
+
 			"flow destroy {port_id} rule {rule_id} [...]\n"
 			"    Destroy specific flow rules.\n\n"
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 5c45b3a..dc68685 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -59,11 +59,14 @@ enum index {
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
+	PRIORITY_LEVEL,
 
 	/* Top-level command. */
 	FLOW,
 
 	/* Sub-level commands. */
+	VALIDATE,
+	CREATE,
 	DESTROY,
 	FLUSH,
 	LIST,
@@ -73,6 +76,26 @@ enum index {
 
 	/* List arguments. */
 	LIST_GROUP,
+
+	/* Validate/create arguments. */
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+
+	/* Validate/create pattern. */
+	PATTERN,
+	ITEM_NEXT,
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+
+	/* Validate/create actions. */
+	ACTIONS,
+	ACTION_NEXT,
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
 };
 
 /** Maximum number of subsequent tokens and arguments on the stack. */
@@ -92,6 +115,7 @@ struct context {
 	uint32_t eol:1; /**< EOL has been detected. */
 	uint32_t last:1; /**< No more arguments. */
 	uint16_t port; /**< Current port ID (for completions). */
+	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
 };
 
@@ -109,6 +133,8 @@ struct token {
 	const char *type;
 	/** Help displayed during completion (defaults to token name). */
 	const char *help;
+	/** Private data used by parser functions. */
+	const void *priv;
 	/**
 	 * Lists of subsequent tokens to push on the stack. Each call to the
 	 * parser consumes the last entry of that stack.
@@ -170,6 +196,14 @@ struct buffer {
 	uint16_t port; /**< Affected port ID. */
 	union {
 		struct {
+			struct rte_flow_attr attr;
+			struct rte_flow_item *pattern;
+			struct rte_flow_action *actions;
+			uint32_t pattern_n;
+			uint32_t actions_n;
+			uint8_t *data;
+		} vc; /**< Validate/create arguments. */
+		struct {
 			uint32_t *rule;
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
@@ -180,6 +214,39 @@ struct buffer {
 	} args; /**< Command arguments. */
 };
 
+/** Private data for pattern items. */
+struct parse_item_priv {
+	enum rte_flow_item_type type; /**< Item type. */
+	uint32_t size; /**< Size of item specification structure. */
+};
+
+#define PRIV_ITEM(t, s) \
+	(&(const struct parse_item_priv){ \
+		.type = RTE_FLOW_ITEM_TYPE_ ## t, \
+		.size = s, \
+	})
+
+/** Private data for actions. */
+struct parse_action_priv {
+	enum rte_flow_action_type type; /**< Action type. */
+	uint32_t size; /**< Size of action configuration structure. */
+};
+
+#define PRIV_ACTION(t, s) \
+	(&(const struct parse_action_priv){ \
+		.type = RTE_FLOW_ACTION_TYPE_ ## t, \
+		.size = s, \
+	})
+
+static const enum index next_vc_attr[] = {
+	GROUP,
+	PRIORITY,
+	INGRESS,
+	EGRESS,
+	PATTERN,
+	ZERO,
+};
+
 static const enum index next_destroy_attr[] = {
 	DESTROY_RULE,
 	END,
@@ -192,9 +259,26 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+static const enum index next_item[] = {
+	ITEM_END,
+	ITEM_VOID,
+	ITEM_INVERT,
+	ZERO,
+};
+
+static const enum index next_action[] = {
+	ACTION_END,
+	ACTION_VOID,
+	ACTION_PASSTHRU,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
+static int parse_vc(struct context *, const struct token *,
+		    const char *, unsigned int,
+		    void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -266,18 +350,41 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PRIORITY_LEVEL] = {
+		.name = "{level}",
+		.type = "PRIORITY",
+		.help = "priority level",
+		.call = parse_int,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
 		.type = "{command} {port_id} [{arg} [...]]",
 		.help = "manage ingress/egress flow rules",
 		.next = NEXT(NEXT_ENTRY
-			     (DESTROY,
+			     (VALIDATE,
+			      CREATE,
+			      DESTROY,
 			      FLUSH,
 			      LIST)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
+	[VALIDATE] = {
+		.name = "validate",
+		.help = "check whether a flow rule can be created",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
+	[CREATE] = {
+		.name = "create",
+		.help = "create a flow rule",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
+		.call = parse_vc,
+	},
 	[DESTROY] = {
 		.name = "destroy",
 		.help = "destroy specific flow rules",
@@ -315,6 +422,98 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.list.group)),
 		.call = parse_list,
 	},
+	/* Validate/create attributes. */
+	[GROUP] = {
+		.name = "group",
+		.help = "specify a group",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(GROUP_ID)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, group)),
+		.call = parse_vc,
+	},
+	[PRIORITY] = {
+		.name = "priority",
+		.help = "specify a priority level",
+		.next = NEXT(next_vc_attr, NEXT_ENTRY(PRIORITY_LEVEL)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_attr, priority)),
+		.call = parse_vc,
+	},
+	[INGRESS] = {
+		.name = "ingress",
+		.help = "affect rule to ingress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	[EGRESS] = {
+		.name = "egress",
+		.help = "affect rule to egress",
+		.next = NEXT(next_vc_attr),
+		.call = parse_vc,
+	},
+	/* Validate/create pattern. */
+	[PATTERN] = {
+		.name = "pattern",
+		.help = "submit a list of pattern items",
+		.next = NEXT(next_item),
+		.call = parse_vc,
+	},
+	[ITEM_NEXT] = {
+		.name = "/",
+		.help = "specify next pattern item",
+		.next = NEXT(next_item),
+	},
+	[ITEM_END] = {
+		.name = "end",
+		.help = "end list of pattern items",
+		.priv = PRIV_ITEM(END, 0),
+		.next = NEXT(NEXT_ENTRY(ACTIONS)),
+		.call = parse_vc,
+	},
+	[ITEM_VOID] = {
+		.name = "void",
+		.help = "no-op pattern item",
+		.priv = PRIV_ITEM(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_INVERT] = {
+		.name = "invert",
+		.help = "perform actions when pattern does not match",
+		.priv = PRIV_ITEM(INVERT, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	/* Validate/create actions. */
+	[ACTIONS] = {
+		.name = "actions",
+		.help = "submit a list of associated actions",
+		.next = NEXT(next_action),
+		.call = parse_vc,
+	},
+	[ACTION_NEXT] = {
+		.name = "/",
+		.help = "specify next action",
+		.next = NEXT(next_action),
+	},
+	[ACTION_END] = {
+		.name = "end",
+		.help = "end list of actions",
+		.priv = PRIV_ACTION(END, 0),
+		.call = parse_vc,
+	},
+	[ACTION_VOID] = {
+		.name = "void",
+		.help = "no-op action",
+		.priv = PRIV_ACTION(VOID, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PASSTHRU] = {
+		.name = "passthru",
+		.help = "let subsequent rule process matched packets",
+		.priv = PRIV_ACTION(PASSTHRU, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -368,10 +567,108 @@ parse_init(struct context *ctx, const struct token *token,
 	/* Initialize buffer. */
 	memset(out, 0x00, sizeof(*out));
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
+	ctx->objdata = 0;
 	ctx->object = out;
 	return len;
 }
 
+/** Parse tokens for validate/create commands. */
+static int
+parse_vc(struct context *ctx, const struct token *token,
+	 const char *str, unsigned int len,
+	 void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	uint8_t *data;
+	uint32_t data_size;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != VALIDATE && ctx->curr != CREATE)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		out->args.vc.data = (uint8_t *)out + size;
+		return len;
+	}
+	ctx->objdata = 0;
+	ctx->object = &out->args.vc.attr;
+	switch (ctx->curr) {
+	case GROUP:
+	case PRIORITY:
+		return len;
+	case INGRESS:
+		out->args.vc.attr.ingress = 1;
+		return len;
+	case EGRESS:
+		out->args.vc.attr.egress = 1;
+		return len;
+	case PATTERN:
+		out->args.vc.pattern =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
+					       sizeof(double));
+		ctx->object = out->args.vc.pattern;
+		return len;
+	case ACTIONS:
+		out->args.vc.actions =
+			(void *)RTE_ALIGN_CEIL((uintptr_t)
+					       (out->args.vc.pattern +
+						out->args.vc.pattern_n),
+					       sizeof(double));
+		ctx->object = out->args.vc.actions;
+		return len;
+	default:
+		if (!token->priv)
+			return -1;
+		break;
+	}
+	if (!out->args.vc.actions) {
+		const struct parse_item_priv *priv = token->priv;
+		struct rte_flow_item *item =
+			out->args.vc.pattern + out->args.vc.pattern_n;
+
+		data_size = priv->size * 3; /* spec, last, mask */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)item + sizeof(*item) > data)
+			return -1;
+		*item = (struct rte_flow_item){
+			.type = priv->type,
+		};
+		++out->args.vc.pattern_n;
+		ctx->object = item;
+	} else {
+		const struct parse_action_priv *priv = token->priv;
+		struct rte_flow_action *action =
+			out->args.vc.actions + out->args.vc.actions_n;
+
+		data_size = priv->size; /* configuration */
+		data = (void *)RTE_ALIGN_FLOOR((uintptr_t)
+					       (out->args.vc.data - data_size),
+					       sizeof(double));
+		if ((uint8_t *)action + sizeof(*action) > data)
+			return -1;
+		*action = (struct rte_flow_action){
+			.type = priv->type,
+		};
+		++out->args.vc.actions_n;
+		ctx->object = action;
+	}
+	memset(data, 0, data_size);
+	out->args.vc.data = data;
+	ctx->objdata = data_size;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -392,6 +689,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -401,6 +699,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.destroy.rule + out->args.destroy.rule_n) +
 	     sizeof(*out->args.destroy.rule)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
 	return len;
 }
@@ -425,6 +724,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 	}
 	return len;
@@ -450,6 +750,7 @@ parse_list(struct context *ctx, const struct token *token,
 		if (sizeof(*out) > size)
 			return -1;
 		out->command = ctx->curr;
+		ctx->objdata = 0;
 		ctx->object = out;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
@@ -459,6 +760,7 @@ parse_list(struct context *ctx, const struct token *token,
 	if (((uint8_t *)(out->args.list.group + out->args.list.group_n) +
 	     sizeof(*out->args.list.group)) > (uint8_t *)out + size)
 		return -1;
+	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
 	return len;
 }
@@ -526,6 +828,7 @@ parse_port(struct context *ctx, const struct token *token,
 	if (buf)
 		out = buf;
 	else {
+		ctx->objdata = 0;
 		ctx->object = out;
 		size = sizeof(*out);
 	}
@@ -613,6 +916,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->eol = 0;
 	ctx->last = 0;
 	ctx->port = 0;
+	ctx->objdata = 0;
 	ctx->object = NULL;
 }
 
@@ -836,6 +1140,14 @@ static void
 cmd_flow_parsed(const struct buffer *in)
 {
 	switch (in->command) {
+	case VALIDATE:
+		port_flow_validate(in->port, &in->args.vc.attr,
+				   in->args.vc.pattern, in->args.vc.actions);
+		break;
+	case CREATE:
+		port_flow_create(in->port, &in->args.vc.attr,
+				 in->args.vc.pattern, in->args.vc.actions);
+		break;
 	case DESTROY:
 		port_flow_destroy(in->port, in->args.destroy.rule_n,
 				  in->args.destroy.rule);
-- 
2.1.4


* [dpdk-dev] [PATCH v5 13/26] app/testpmd: add flow query command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (11 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 12/26] app/testpmd: add flow validate/create commands Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 14/26] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
                               ` (13 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Syntax:

 flow query {port_id} {rule_id} {action}

Query a specific action of an existing flow rule.
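
A minimal sketch of how an {action} name given on the command line could be
resolved to an action type, in the spirit of the parse_action() helper added
by this patch. The enum, table and function names below are hypothetical
stand-ins, not testpmd or rte_flow symbols. One deliberate difference: the
patch compares only the first len bytes with strncmp(), which appears to
accept a shorter input such as "pass" for "passthru", whereas this sketch
requires the full name to match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for enum rte_flow_action_type. */
enum sketch_action_type {
	SKETCH_ACTION_END,
	SKETCH_ACTION_VOID,
	SKETCH_ACTION_PASSTHRU,
};

/* Maps a command-line name to an action type. */
struct sketch_action_name {
	const char *name;
	enum sketch_action_type type;
};

static const struct sketch_action_name sketch_actions[] = {
	{ "end", SKETCH_ACTION_END },
	{ "void", SKETCH_ACTION_VOID },
	{ "passthru", SKETCH_ACTION_PASSTHRU },
};

/*
 * Resolve the first len bytes of str to an action type.
 * Returns 0 and stores the result in *type, or -1 if the name is unknown.
 */
static int
sketch_action_lookup(const char *str, size_t len,
		     enum sketch_action_type *type)
{
	size_t n = sizeof(sketch_actions) / sizeof(sketch_actions[0]);
	size_t i;

	assert(str != NULL && type != NULL);
	for (i = 0; i < n; ++i) {
		const char *name = sketch_actions[i].name;

		/* Require a full-name match, not just a common prefix. */
		if (strlen(name) != len || memcmp(name, str, len))
			continue;
		*type = sketch_actions[i].type;
		return 0;
	}
	return -1;
}
```

The caller passes the token length explicitly because, as in the patch, the
token is a slice of the command line and is not NUL-terminated on its own.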

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline.c      |   3 +
 app/test-pmd/cmdline_flow.c | 121 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 23f4b48..f768b6b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -831,6 +831,9 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"flow flush {port_id}\n"
 			"    Destroy all flow rules.\n\n"
 
+			"flow query {port_id} {rule_id} {action}\n"
+			"    Query an existing flow rule.\n\n"
+
 			"flow list {port_id} [group {group_id}] [...]\n"
 			"    List existing flow rules sorted by priority,"
 			" filtered by group identifiers.\n\n"
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index dc68685..fb9489d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -69,11 +69,15 @@ enum index {
 	CREATE,
 	DESTROY,
 	FLUSH,
+	QUERY,
 	LIST,
 
 	/* Destroy arguments. */
 	DESTROY_RULE,
 
+	/* Query arguments. */
+	QUERY_ACTION,
+
 	/* List arguments. */
 	LIST_GROUP,
 
@@ -208,6 +212,10 @@ struct buffer {
 			uint32_t rule_n;
 		} destroy; /**< Destroy arguments. */
 		struct {
+			uint32_t rule;
+			enum rte_flow_action_type action;
+		} query; /**< Query arguments. */
+		struct {
 			uint32_t *group;
 			uint32_t group_n;
 		} list; /**< List arguments. */
@@ -285,6 +293,12 @@ static int parse_destroy(struct context *, const struct token *,
 static int parse_flush(struct context *, const struct token *,
 		       const char *, unsigned int,
 		       void *, unsigned int);
+static int parse_query(struct context *, const struct token *,
+		       const char *, unsigned int,
+		       void *, unsigned int);
+static int parse_action(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_list(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -296,6 +310,8 @@ static int parse_port(struct context *, const struct token *,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_action(struct context *, const struct token *,
+		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
@@ -367,7 +383,8 @@ static const struct token token_list[] = {
 			      CREATE,
 			      DESTROY,
 			      FLUSH,
-			      LIST)),
+			      LIST,
+			      QUERY)),
 		.call = parse_init,
 	},
 	/* Sub-level commands. */
@@ -399,6 +416,17 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY(struct buffer, port)),
 		.call = parse_flush,
 	},
+	[QUERY] = {
+		.name = "query",
+		.help = "query an existing flow rule",
+		.next = NEXT(NEXT_ENTRY(QUERY_ACTION),
+			     NEXT_ENTRY(RULE_ID),
+			     NEXT_ENTRY(PORT_ID)),
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.query.action),
+			     ARGS_ENTRY(struct buffer, args.query.rule),
+			     ARGS_ENTRY(struct buffer, port)),
+		.call = parse_query,
+	},
 	[LIST] = {
 		.name = "list",
 		.help = "list existing flow rules",
@@ -414,6 +442,14 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_PTR(struct buffer, args.destroy.rule)),
 		.call = parse_destroy,
 	},
+	/* Query arguments. */
+	[QUERY_ACTION] = {
+		.name = "{action}",
+		.type = "ACTION",
+		.help = "action to query, must be part of the rule",
+		.call = parse_action,
+		.comp = comp_action,
+	},
 	/* List arguments. */
 	[LIST_GROUP] = {
 		.name = "group",
@@ -730,6 +766,67 @@ parse_flush(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse tokens for query command. */
+static int
+parse_query(struct context *ctx, const struct token *token,
+	    const char *str, unsigned int len,
+	    void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->command) {
+		if (ctx->curr != QUERY)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+	}
+	return len;
+}
+
+/** Parse action names. */
+static int
+parse_action(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+
+	(void)size;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	/* Parse action name. */
+	for (i = 0; next_action[i]; ++i) {
+		const struct parse_action_priv *priv;
+
+		token = &token_list[next_action[i]];
+		if (strncmp(token->name, str, len))
+			continue;
+		priv = token->priv;
+		if (!priv)
+			goto error;
+		if (out)
+			memcpy((uint8_t *)ctx->object + arg->offset,
+			       &priv->type,
+			       arg->size);
+		return len;
+	}
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Parse tokens for list command. */
 static int
 parse_list(struct context *ctx, const struct token *token,
@@ -853,6 +950,24 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete action names. */
+static int
+comp_action(struct context *ctx, const struct token *token,
+	    unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; next_action[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s",
+					token_list[next_action[i]].name);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete available ports. */
 static int
 comp_port(struct context *ctx, const struct token *token,
@@ -1155,6 +1270,10 @@ cmd_flow_parsed(const struct buffer *in)
 	case FLUSH:
 		port_flow_flush(in->port);
 		break;
+	case QUERY:
+		port_flow_query(in->port, in->args.query.rule,
+				in->args.query.action);
+		break;
 	case LIST:
 		port_flow_list(in->port, in->args.list.group_n,
 			       in->args.list.group);
-- 
2.1.4


* [dpdk-dev] [PATCH v5 14/26] app/testpmd: add rte_flow item spec handler
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (12 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 13/26] app/testpmd: add flow query command Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 15/26] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
                               ` (12 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Add parser code to fully set individual fields of pattern item
specification structures, using the following operators:

- is: sets field and applies full bit-mask for perfect matching.
- spec: sets field without modifying its bit-mask.
- last: sets upper value of the spec => last range.
- mask: sets bit-mask affecting both spec and last from arbitrary value.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 111 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 111 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index fb9489d..7bc1aa7 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -89,6 +89,10 @@ enum index {
 
 	/* Validate/create pattern. */
 	PATTERN,
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -121,6 +125,7 @@ struct context {
 	uint16_t port; /**< Current port ID (for completions). */
 	uint32_t objdata; /**< Object-specific data. */
 	void *object; /**< Address of current object for relative offsets. */
+	void *objmask; /**< Object a full mask must be written to. */
 };
 
 /** Token argument. */
@@ -267,6 +272,15 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
+__rte_unused
+static const enum index item_param[] = {
+	ITEM_PARAM_IS,
+	ITEM_PARAM_SPEC,
+	ITEM_PARAM_LAST,
+	ITEM_PARAM_MASK,
+	ZERO,
+};
+
 static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
@@ -287,6 +301,8 @@ static int parse_init(struct context *, const struct token *,
 static int parse_vc(struct context *, const struct token *,
 		    const char *, unsigned int,
 		    void *, unsigned int);
+static int parse_vc_spec(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -492,6 +508,26 @@ static const struct token token_list[] = {
 		.next = NEXT(next_item),
 		.call = parse_vc,
 	},
+	[ITEM_PARAM_IS] = {
+		.name = "is",
+		.help = "match value perfectly (with full bit-mask)",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_SPEC] = {
+		.name = "spec",
+		.help = "match value according to configured bit-mask",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_LAST] = {
+		.name = "last",
+		.help = "specify upper bound to establish a range",
+		.call = parse_vc_spec,
+	},
+	[ITEM_PARAM_MASK] = {
+		.name = "mask",
+		.help = "specify bit-mask with relevant bits set to one",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +641,7 @@ parse_init(struct context *ctx, const struct token *token,
 	memset((uint8_t *)out + sizeof(*out), 0x22, size - sizeof(*out));
 	ctx->objdata = 0;
 	ctx->object = out;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -632,11 +669,13 @@ parse_vc(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.vc.data = (uint8_t *)out + size;
 		return len;
 	}
 	ctx->objdata = 0;
 	ctx->object = &out->args.vc.attr;
+	ctx->objmask = NULL;
 	switch (ctx->curr) {
 	case GROUP:
 	case PRIORITY:
@@ -652,6 +691,7 @@ parse_vc(struct context *ctx, const struct token *token,
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
 		ctx->object = out->args.vc.pattern;
+		ctx->objmask = NULL;
 		return len;
 	case ACTIONS:
 		out->args.vc.actions =
@@ -660,6 +700,7 @@ parse_vc(struct context *ctx, const struct token *token,
 						out->args.vc.pattern_n),
 					       sizeof(double));
 		ctx->object = out->args.vc.actions;
+		ctx->objmask = NULL;
 		return len;
 	default:
 		if (!token->priv)
@@ -682,6 +723,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.pattern_n;
 		ctx->object = item;
+		ctx->objmask = NULL;
 	} else {
 		const struct parse_action_priv *priv = token->priv;
 		struct rte_flow_action *action =
@@ -698,6 +740,7 @@ parse_vc(struct context *ctx, const struct token *token,
 		};
 		++out->args.vc.actions_n;
 		ctx->object = action;
+		ctx->objmask = NULL;
 	}
 	memset(data, 0, data_size);
 	out->args.vc.data = data;
@@ -705,6 +748,60 @@ parse_vc(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse pattern item parameter type. */
+static int
+parse_vc_spec(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_item *item;
+	uint32_t data_size;
+	int index;
+	int objmask = 0;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Parse parameter types. */
+	switch (ctx->curr) {
+	case ITEM_PARAM_IS:
+		index = 0;
+		objmask = 1;
+		break;
+	case ITEM_PARAM_SPEC:
+		index = 0;
+		break;
+	case ITEM_PARAM_LAST:
+		index = 1;
+		break;
+	case ITEM_PARAM_MASK:
+		index = 2;
+		break;
+	default:
+		return -1;
+	}
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.pattern_n)
+		return -1;
+	item = &out->args.vc.pattern[out->args.vc.pattern_n - 1];
+	data_size = ctx->objdata / 3; /* spec, last, mask */
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data + (data_size * index);
+	if (objmask) {
+		ctx->objmask = out->args.vc.data + (data_size * 2); /* mask */
+		item->mask = ctx->objmask;
+	} else
+		ctx->objmask = NULL;
+	/* Update relevant item pointer. */
+	*((const void **[]){ &item->spec, &item->last, &item->mask })[index] =
+		ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -727,6 +824,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.destroy.rule =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -737,6 +835,7 @@ parse_destroy(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.destroy.rule + out->args.destroy.rule_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -762,6 +861,7 @@ parse_flush(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -788,6 +888,7 @@ parse_query(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 	}
 	return len;
 }
@@ -849,6 +950,7 @@ parse_list(struct context *ctx, const struct token *token,
 		out->command = ctx->curr;
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		out->args.list.group =
 			(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
 					       sizeof(double));
@@ -859,6 +961,7 @@ parse_list(struct context *ctx, const struct token *token,
 		return -1;
 	ctx->objdata = 0;
 	ctx->object = out->args.list.group + out->args.list.group_n++;
+	ctx->objmask = NULL;
 	return len;
 }
 
@@ -891,6 +994,7 @@ parse_int(struct context *ctx, const struct token *token,
 		return len;
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
+objmask:
 	switch (size) {
 	case sizeof(uint8_t):
 		*(uint8_t *)buf = u;
@@ -907,6 +1011,11 @@ parse_int(struct context *ctx, const struct token *token,
 	default:
 		goto error;
 	}
+	if (ctx->objmask && buf != (uint8_t *)ctx->objmask + arg->offset) {
+		u = -1;
+		buf = (uint8_t *)ctx->objmask + arg->offset;
+		goto objmask;
+	}
 	return len;
 error:
 	push_args(ctx, arg);
@@ -927,6 +1036,7 @@ parse_port(struct context *ctx, const struct token *token,
 	else {
 		ctx->objdata = 0;
 		ctx->object = out;
+		ctx->objmask = NULL;
 		size = sizeof(*out);
 	}
 	ret = parse_int(ctx, token, str, len, out, size);
@@ -1033,6 +1143,7 @@ cmd_flow_context_init(struct context *ctx)
 	ctx->port = 0;
 	ctx->objdata = 0;
 	ctx->object = NULL;
+	ctx->objmask = NULL;
 }
 
 /** Parse a token (cmdline API). */
-- 
2.1.4


* [dpdk-dev] [PATCH v5 15/26] app/testpmd: add rte_flow item spec prefix length
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (13 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 14/26] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 16/26] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
                               ` (11 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Generating bit-masks from prefix lengths is often more convenient than
providing them entirely (e.g. to define IPv4 and IPv6 subnets).

This commit adds the "prefix" operator that assigns generated bit-masks to
any pattern item specification field.
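
A minimal sketch of the prefix-to-mask conversion for a field stored in
network byte order, modelled on the big-endian branch of parse_prefix() in
this patch. The function name is hypothetical, and the host-order
little-endian case handled by the patch is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill buf (size bytes, most significant byte first) with a bit-mask whose
 * first prefix bits are set. Returns 0 on success, or -1 if the prefix
 * does not fit in the field.
 */
static int
sketch_prefix_mask(uint8_t *buf, size_t size, unsigned int prefix)
{
	/* Partial-byte masks indexed by the number of leading bits. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	assert(buf != NULL);
	if (bytes + !!extra > size)
		return -1;
	/* Whole bytes covered by the prefix, then clear the remainder. */
	memset(buf, 0xff, bytes);
	memset(buf + bytes, 0x00, size - bytes);
	/* Leading bits of the first partially covered byte, if any. */
	if (extra)
		buf[bytes] = conv[extra];
	return 0;
}
```

For a 4-byte IPv4 address field, a prefix of 24 yields ff ff ff 00 and a
prefix of 20 yields ff ff f0 00; a prefix of 33 is rejected.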

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 80 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7bc1aa7..9a6f37d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -56,6 +56,7 @@ enum index {
 	/* Common tokens. */
 	INTEGER,
 	UNSIGNED,
+	PREFIX,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -93,6 +94,7 @@ enum index {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ITEM_NEXT,
 	ITEM_END,
 	ITEM_VOID,
@@ -278,6 +280,7 @@ static const enum index item_param[] = {
 	ITEM_PARAM_SPEC,
 	ITEM_PARAM_LAST,
 	ITEM_PARAM_MASK,
+	ITEM_PARAM_PREFIX,
 	ZERO,
 };
 
@@ -321,6 +324,9 @@ static int parse_list(struct context *, const struct token *,
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
+static int parse_prefix(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -361,6 +367,13 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[PREFIX] = {
+		.name = "{prefix}",
+		.type = "PREFIX",
+		.help = "prefix length for bit-mask",
+		.call = parse_prefix,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -528,6 +541,11 @@ static const struct token token_list[] = {
 		.help = "specify bit-mask with relevant bits set to one",
 		.call = parse_vc_spec,
 	},
+	[ITEM_PARAM_PREFIX] = {
+		.name = "prefix",
+		.help = "generate bit-mask from a prefix length",
+		.call = parse_vc_spec,
+	},
 	[ITEM_NEXT] = {
 		.name = "/",
 		.help = "specify next pattern item",
@@ -605,6 +623,62 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/**
+ * Parse a prefix length and generate a bit-mask.
+ *
+ * Last argument (ctx->args) is retrieved to determine mask size, storage
+ * location and whether the result must use network byte ordering.
+ */
+static int
+parse_prefix(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	static const uint8_t conv[] = "\x00\x80\xc0\xe0\xf0\xf8\xfc\xfe\xff";
+	char *end;
+	uintmax_t u;
+	unsigned int bytes;
+	unsigned int extra;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	errno = 0;
+	u = strtoumax(str, &end, 0);
+	if (errno || (size_t)(end - str) != len)
+		goto error;
+	bytes = u / 8;
+	extra = u % 8;
+	size = arg->size;
+	if (bytes > size || bytes + !!extra > size)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		memset((uint8_t *)buf + size - bytes, 0xff, bytes);
+		memset(buf, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[size - bytes - 1] = conv[extra];
+	} else
+#endif
+	{
+		memset(buf, 0xff, bytes);
+		memset((uint8_t *)buf + bytes, 0x00, size - bytes);
+		if (extra)
+			((uint8_t *)buf)[bytes] = conv[extra];
+	}
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Default parsing function for token name matching. */
 static int
 parse_default(struct context *ctx, const struct token *token,
@@ -776,6 +850,12 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	case ITEM_PARAM_LAST:
 		index = 1;
 		break;
+	case ITEM_PARAM_PREFIX:
+		/* Modify next token to expect a prefix. */
+		if (ctx->next_num < 2)
+			return -1;
+		ctx->next[ctx->next_num - 2] = NEXT_ENTRY(PREFIX);
+		/* Fall through. */
 	case ITEM_PARAM_MASK:
 		index = 2;
 		break;
-- 
2.1.4
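
The prefix-to-mask conversion performed by parse_prefix() in the patch above
can be sketched in isolation as follows. This is a simplified illustration
rather than the patch's code: prefix_to_mask() is a hypothetical helper that
only covers the network byte order layout (most significant byte first),
whereas the patch also mirrors the layout for host-order fields on
little-endian targets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill buf (size bytes) with a mask whose `prefix` most significant bits
 * are set, most significant byte first (network byte order).
 * Returns 0 on success, -1 if the prefix does not fit in the buffer.
 */
static int
prefix_to_mask(uint8_t *buf, size_t size, unsigned int prefix)
{
	/* Partial-byte masks for 0..8 leading bits, same values as conv[]. */
	static const uint8_t conv[] = {
		0x00, 0x80, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe, 0xff,
	};
	size_t bytes = prefix / 8;
	unsigned int extra = prefix % 8;

	/* A partial byte needs one more byte of storage. */
	if (bytes + !!extra > size)
		return -1;
	memset(buf, 0xff, bytes);
	memset(buf + bytes, 0x00, size - bytes);
	if (extra)
		buf[bytes] = conv[extra];
	return 0;
}
```

With a 4-byte buffer, a prefix of 20 should yield ff ff f0 00, and any prefix
above 32 is rejected.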

* [dpdk-dev] [PATCH v5 16/26] app/testpmd: add rte_flow bit-field support
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.
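
As a rough illustration of the idea, the sketch below spreads the low bits of
a value into a buffer wherever a byte-array mask has bits set. It is a
simplified stand-in for arg_entry_bf_fill(), not a copy of it: bf_fill() and
its parameters are hypothetical, bytes are always walked from index 0 upward,
and the hton handling of the patch is left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Spread the low bits of val into dst wherever mask has a bit set,
 * walking bytes from index 0 upward and bits from LSB to MSB.
 * Returns the number of bits covered by the mask; when dst is NULL,
 * only that count is computed.
 */
static size_t
bf_fill(uint8_t *dst, const uint8_t *mask, size_t size, uintmax_t val)
{
	size_t len = 0;
	size_t i;

	for (i = 0; i < size; ++i) {
		unsigned int shift;

		for (shift = 0; shift < 8; ++shift) {
			if (!(mask[i] & (1u << shift)))
				continue;
			++len;
			if (!dst)
				continue;
			/* Clear the target bit, then copy the next value bit. */
			dst[i] &= (uint8_t)~(1u << shift);
			dst[i] |= (uint8_t)((val & 1) << shift);
			val >>= 1;
		}
	}
	return len;
}
```

Calling it with a NULL destination only measures the bit-field width, which
is how the patch bounds prefix lengths for bit-fields before writing anything.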

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 70 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 9a6f37d..fc4d824 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -136,6 +136,7 @@ struct arg {
 	uint32_t sign:1; /**< Value is signed. */
 	uint32_t offset; /**< Relative offset from ctx->object. */
 	uint32_t size; /**< Field size. */
+	const uint8_t *mask; /**< Bit-mask to use instead of offset/size. */
 };
 
 /** Parser token definition. */
@@ -195,6 +196,13 @@ struct token {
 		.size = sizeof(((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() to target a bit-field. */
+#define ARGS_ENTRY_BF(s, f, b) \
+	(&(const struct arg){ \
+		.size = sizeof(s), \
+		.mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -623,6 +631,45 @@ push_args(struct context *ctx, const struct arg *arg)
 	return 0;
 }
 
+/** Spread value into buffer according to bit-mask. */
+static size_t
+arg_entry_bf_fill(void *dst, uintmax_t val, const struct arg *arg)
+{
+	uint32_t i = arg->size;
+	uint32_t end = 0;
+	int sub = 1;
+	int add = 0;
+	size_t len = 0;
+
+	if (!arg->mask)
+		return 0;
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+	if (!arg->hton) {
+		i = 0;
+		end = arg->size;
+		sub = 0;
+		add = 1;
+	}
+#endif
+	while (i != end) {
+		unsigned int shift = 0;
+		uint8_t *buf = (uint8_t *)dst + arg->offset + (i -= sub);
+
+		for (shift = 0; arg->mask[i] >> shift; ++shift) {
+			if (!(arg->mask[i] & (1 << shift)))
+				continue;
+			++len;
+			if (!dst)
+				continue;
+			*buf &= ~(1 << shift);
+			*buf |= (val & 1) << shift;
+			val >>= 1;
+		}
+		i += add;
+	}
+	return len;
+}
+
 /**
  * Parse a prefix length and generate a bit-mask.
  *
@@ -649,6 +696,23 @@ parse_prefix(struct context *ctx, const struct token *token,
 	u = strtoumax(str, &end, 0);
 	if (errno || (size_t)(end - str) != len)
 		goto error;
+	if (arg->mask) {
+		uintmax_t v = 0;
+
+		extra = arg_entry_bf_fill(NULL, 0, arg);
+		if (u > extra)
+			goto error;
+		if (!ctx->object)
+			return len;
+		extra -= u;
+		while (u--)
+			(v <<= 1, v |= 1);
+		v <<= extra;
+		if (!arg_entry_bf_fill(ctx->object, v, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	bytes = u / 8;
 	extra = u % 8;
 	size = arg->size;
@@ -1072,6 +1136,12 @@ parse_int(struct context *ctx, const struct token *token,
 		goto error;
 	if (!ctx->object)
 		return len;
+	if (arg->mask) {
+		if (!arg_entry_bf_fill(ctx->object, u, arg) ||
+		    !arg_entry_bf_fill(ctx->objmask, -1, arg))
+			goto error;
+		return len;
+	}
 	buf = (uint8_t *)ctx->object + arg->offset;
 	size = arg->size;
 objmask:
-- 
2.1.4

* [dpdk-dev] [PATCH v5 17/26] app/testpmd: add item any to flow command
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

This pattern item matches any protocol in place of the current layer and
has a single property:

- num: number of layers covered.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index fc4d824..7504fc7 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -99,6 +99,8 @@ enum index {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ITEM_ANY_NUM,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -282,7 +284,6 @@ static const enum index next_list_attr[] = {
 	ZERO,
 };
 
-__rte_unused
 static const enum index item_param[] = {
 	ITEM_PARAM_IS,
 	ITEM_PARAM_SPEC,
@@ -296,6 +297,13 @@ static const enum index next_item[] = {
 	ITEM_END,
 	ITEM_VOID,
 	ITEM_INVERT,
+	ITEM_ANY,
+	ZERO,
+};
+
+static const enum index item_any[] = {
+	ITEM_ANY_NUM,
+	ITEM_NEXT,
 	ZERO,
 };
 
@@ -580,6 +588,19 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
 		.call = parse_vc,
 	},
+	[ITEM_ANY] = {
+		.name = "any",
+		.help = "match any protocol for the current layer",
+		.priv = PRIV_ITEM(ANY, sizeof(struct rte_flow_item_any)),
+		.next = NEXT(item_any),
+		.call = parse_vc,
+	},
+	[ITEM_ANY_NUM] = {
+		.name = "num",
+		.help = "number of layers covered",
+		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v5 18/26] app/testpmd: add various items to flow command
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Add the following pattern items:

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 53 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7504fc7..0592969 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -101,6 +101,11 @@ enum index {
 	ITEM_INVERT,
 	ITEM_ANY,
 	ITEM_ANY_NUM,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_VF_ID,
+	ITEM_PORT,
+	ITEM_PORT_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -298,6 +303,9 @@ static const enum index next_item[] = {
 	ITEM_VOID,
 	ITEM_INVERT,
 	ITEM_ANY,
+	ITEM_PF,
+	ITEM_VF,
+	ITEM_PORT,
 	ZERO,
 };
 
@@ -307,6 +315,18 @@ static const enum index item_any[] = {
 	ZERO,
 };
 
+static const enum index item_vf[] = {
+	ITEM_VF_ID,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_port[] = {
+	ITEM_PORT_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -601,6 +621,39 @@ static const struct token token_list[] = {
 		.next = NEXT(item_any, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_any, num)),
 	},
+	[ITEM_PF] = {
+		.name = "pf",
+		.help = "match packets addressed to the physical function",
+		.priv = PRIV_ITEM(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ITEM_NEXT)),
+		.call = parse_vc,
+	},
+	[ITEM_VF] = {
+		.name = "vf",
+		.help = "match packets addressed to a virtual function ID",
+		.priv = PRIV_ITEM(VF, sizeof(struct rte_flow_item_vf)),
+		.next = NEXT(item_vf),
+		.call = parse_vc,
+	},
+	[ITEM_VF_ID] = {
+		.name = "id",
+		.help = "destination VF ID",
+		.next = NEXT(item_vf, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_vf, id)),
+	},
+	[ITEM_PORT] = {
+		.name = "port",
+		.help = "device-specific physical port index to use",
+		.priv = PRIV_ITEM(PORT, sizeof(struct rte_flow_item_port)),
+		.next = NEXT(item_port),
+		.call = parse_vc,
+	},
+	[ITEM_PORT_INDEX] = {
+		.name = "index",
+		.help = "physical port index",
+		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
-- 
2.1.4

* [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (17 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 18/26] app/testpmd: add various items " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2017-05-11  6:53               ` Zhao1, Wei
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 20/26] app/testpmd: add items eth/vlan " Adrien Mazarguil
                               ` (7 subsequent siblings)
  26 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

This pattern item matches arbitrary byte strings and has the following
properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length (filled in from the pattern string).
- pattern: byte string to look for.
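
The boolean properties (relative, search) rely on a table of accepted
spellings in which even indices stand for false and odd ones for true. A
minimal sketch of that lookup is shown below. Note the assumptions:
boolean_lookup() is a hypothetical helper, and it requires an exact-length
match, whereas parse_boolean() in this patch compares only the first len
characters with strncmp() and hands unrecognized tokens to parse_int().

```c
#include <stddef.h>
#include <string.h>

/* Accepted spellings; even indices stand for false, odd ones for true. */
static const char *const boolean_name[] = {
	"0", "1",
	"false", "true",
	"no", "yes",
	"N", "Y",
	NULL,
};

/*
 * Look up a token of len bytes (not necessarily NUL-terminated).
 * Returns 0 for false, 1 for true, -1 if the token is not recognized.
 */
static int
boolean_lookup(const char *str, size_t len)
{
	size_t i;

	for (i = 0; boolean_name[i]; ++i)
		if (strlen(boolean_name[i]) == len &&
		    !memcmp(str, boolean_name[i], len))
			return (int)(i & 1);
	return -1;
}
```

Keeping the pairs adjacent means the truth value is simply the parity of the
matching index, so new spellings can be added without touching the lookup.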

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 208 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0592969..c52a8f7 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -57,6 +57,8 @@ enum index {
 	INTEGER,
 	UNSIGNED,
 	PREFIX,
+	BOOLEAN,
+	STRING,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -106,6 +108,12 @@ enum index {
 	ITEM_VF_ID,
 	ITEM_PORT,
 	ITEM_PORT_INDEX,
+	ITEM_RAW,
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -115,6 +123,13 @@ enum index {
 	ACTION_PASSTHRU,
 };
 
+/** Size of pattern[] field in struct rte_flow_item_raw. */
+#define ITEM_RAW_PATTERN_SIZE 36
+
+/** Storage size for struct rte_flow_item_raw including pattern. */
+#define ITEM_RAW_SIZE \
+	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -216,6 +231,13 @@ struct token {
 		.size = sizeof(*((s *)0)->f), \
 	})
 
+/** Static initializer for ARGS() with arbitrary size. */
+#define ARGS_ENTRY_USZ(s, f, sz) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = (sz), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -306,6 +328,7 @@ static const enum index next_item[] = {
 	ITEM_PF,
 	ITEM_VF,
 	ITEM_PORT,
+	ITEM_RAW,
 	ZERO,
 };
 
@@ -327,6 +350,16 @@ static const enum index item_port[] = {
 	ZERO,
 };
 
+static const enum index item_raw[] = {
+	ITEM_RAW_RELATIVE,
+	ITEM_RAW_SEARCH,
+	ITEM_RAW_OFFSET,
+	ITEM_RAW_LIMIT,
+	ITEM_RAW_PATTERN,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -363,11 +396,19 @@ static int parse_int(struct context *, const struct token *,
 static int parse_prefix(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_boolean(struct context *, const struct token *,
+			 const char *, unsigned int,
+			 void *, unsigned int);
+static int parse_string(struct context *, const struct token *,
+			const char *, unsigned int,
+			void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
 static int comp_none(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
+static int comp_boolean(struct context *, const struct token *,
+			unsigned int, char *, unsigned int);
 static int comp_action(struct context *, const struct token *,
 		       unsigned int, char *, unsigned int);
 static int comp_port(struct context *, const struct token *,
@@ -410,6 +451,20 @@ static const struct token token_list[] = {
 		.call = parse_prefix,
 		.comp = comp_none,
 	},
+	[BOOLEAN] = {
+		.name = "{boolean}",
+		.type = "BOOLEAN",
+		.help = "any boolean value",
+		.call = parse_boolean,
+		.comp = comp_boolean,
+	},
+	[STRING] = {
+		.name = "{string}",
+		.type = "STRING",
+		.help = "fixed string",
+		.call = parse_string,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -654,6 +709,52 @@ static const struct token token_list[] = {
 		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
 	},
+	[ITEM_RAW] = {
+		.name = "raw",
+		.help = "match an arbitrary byte string",
+		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
+		.next = NEXT(item_raw),
+		.call = parse_vc,
+	},
+	[ITEM_RAW_RELATIVE] = {
+		.name = "relative",
+		.help = "look for pattern after the previous item",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   relative, 1)),
+	},
+	[ITEM_RAW_SEARCH] = {
+		.name = "search",
+		.help = "search pattern from offset (see also limit)",
+		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
+					   search, 1)),
+	},
+	[ITEM_RAW_OFFSET] = {
+		.name = "offset",
+		.help = "absolute or relative offset for pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
+	},
+	[ITEM_RAW_LIMIT] = {
+		.name = "limit",
+		.help = "search area limit for start of pattern",
+		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
+	},
+	[ITEM_RAW_PATTERN] = {
+		.name = "pattern",
+		.help = "byte string to look for",
+		.next = NEXT(item_raw,
+			     NEXT_ENTRY(STRING),
+			     NEXT_ENTRY(ITEM_PARAM_IS,
+					ITEM_PARAM_SPEC,
+					ITEM_PARAM_MASK)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
+			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
+					    pattern,
+					    ITEM_RAW_PATTERN_SIZE)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1246,6 +1347,96 @@ parse_int(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a string.
+ *
+ * Two arguments (ctx->args) are retrieved from the stack to store data and
+ * its length (in that order).
+ */
+static int
+parse_string(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	const struct arg *arg_data = pop_args(ctx);
+	const struct arg *arg_len = pop_args(ctx);
+	char tmp[16]; /* Ought to be enough. */
+	int ret;
+
+	/* Arguments are expected. */
+	if (!arg_data)
+		return -1;
+	if (!arg_len) {
+		push_args(ctx, arg_data);
+		return -1;
+	}
+	size = arg_data->size;
+	/* Bit-mask fill is not supported. */
+	if (arg_data->mask || size < len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	/* Let parse_int() fill length information first. */
+	ret = snprintf(tmp, sizeof(tmp), "%u", len);
+	if (ret < 0)
+		goto error;
+	push_args(ctx, arg_len);
+	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		goto error;
+	}
+	buf = (uint8_t *)ctx->object + arg_data->offset;
+	/* Output buffer is not necessarily NUL-terminated. */
+	memcpy(buf, str, len);
+	memset((uint8_t *)buf + len, 0x55, size - len);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
+	return len;
+error:
+	push_args(ctx, arg_len);
+	push_args(ctx, arg_data);
+	return -1;
+}
+
+/** Boolean values (even indices stand for false). */
+static const char *const boolean_name[] = {
+	"0", "1",
+	"false", "true",
+	"no", "yes",
+	"N", "Y",
+	NULL,
+};
+
+/**
+ * Parse a boolean value.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_boolean(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	unsigned int i;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	for (i = 0; boolean_name[i]; ++i)
+		if (!strncmp(str, boolean_name[i], len))
+			break;
+	/* Process token as integer. */
+	if (boolean_name[i])
+		str = i & 1 ? "1" : "0";
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, strlen(str), buf, size);
+	return ret > 0 ? (int)len : ret;
+}
+
 /** Parse port and update context. */
 static int
 parse_port(struct context *ctx, const struct token *token,
@@ -1284,6 +1475,23 @@ comp_none(struct context *ctx, const struct token *token,
 	return 0;
 }
 
+/** Complete boolean values. */
+static int
+comp_boolean(struct context *ctx, const struct token *token,
+	     unsigned int ent, char *buf, unsigned int size)
+{
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; boolean_name[i]; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", boolean_name[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Complete action names. */
 static int
 comp_action(struct context *ctx, const struct token *token,
-- 
2.1.4

* [dpdk-dev] [PATCH v5 20/26] app/testpmd: add items eth/vlan to flow command
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.
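
For readers unfamiliar with the two storage conventions involved, here is a
small standalone sketch. It does not use cmdline_parse_etheraddr() as the
patch does; mac_parse() and store_be16() are hypothetical helpers built on
standard C only. The first parses colon-separated MAC notation into six
bytes, the second stores a 16-bit value such as an EtherType or TCI in
network byte order, which is what ARGS_ENTRY_HTON() requests from the parser.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "xx:xx:xx:xx:xx:xx" into six bytes.
 * Returns 0 on success, -1 on malformed or trailing input.
 */
static int
mac_parse(const char *str, uint8_t mac[6])
{
	unsigned int b[6];
	int end = 0;
	int i;

	if (sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x%n",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &end) != 6 ||
	    str[end] != '\0')
		return -1;
	for (i = 0; i < 6; ++i)
		mac[i] = (uint8_t)b[i];
	return 0;
}

/* Store a 16-bit host-order value most significant byte first. */
static void
store_be16(uint8_t dst[2], uint16_t val)
{
	dst[0] = (uint8_t)(val >> 8);
	dst[1] = (uint8_t)(val & 0xff);
}
```

MAC addresses are plain byte arrays and need no conversion, which is why the
dst and src fields use ARGS_ENTRY() while type, tpid and tci use the HTON
variant.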

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 126 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index c52a8f7..e22e0c2 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -43,6 +43,7 @@
 #include <rte_ethdev.h>
 #include <rte_byteorder.h>
 #include <cmdline_parse.h>
+#include <cmdline_parse_etheraddr.h>
 #include <rte_flow.h>
 
 #include "testpmd.h"
@@ -59,6 +60,7 @@ enum index {
 	PREFIX,
 	BOOLEAN,
 	STRING,
+	MAC_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -114,6 +116,13 @@ enum index {
 	ITEM_RAW_OFFSET,
 	ITEM_RAW_LIMIT,
 	ITEM_RAW_PATTERN,
+	ITEM_ETH,
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_VLAN,
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -238,6 +247,14 @@ struct token {
 		.size = (sz), \
 	})
 
+/** Same as ARGS_ENTRY() using network byte ordering. */
+#define ARGS_ENTRY_HTON(s, f) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+	})
+
 /** Parser output buffer layout expected by cmd_flow_parsed(). */
 struct buffer {
 	enum index command; /**< Flow command. */
@@ -329,6 +346,8 @@ static const enum index next_item[] = {
 	ITEM_VF,
 	ITEM_PORT,
 	ITEM_RAW,
+	ITEM_ETH,
+	ITEM_VLAN,
 	ZERO,
 };
 
@@ -360,6 +379,21 @@ static const enum index item_raw[] = {
 	ZERO,
 };
 
+static const enum index item_eth[] = {
+	ITEM_ETH_DST,
+	ITEM_ETH_SRC,
+	ITEM_ETH_TYPE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vlan[] = {
+	ITEM_VLAN_TPID,
+	ITEM_VLAN_TCI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -402,6 +436,9 @@ static int parse_boolean(struct context *, const struct token *,
 static int parse_string(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_mac_addr(struct context *, const struct token *,
+			  const char *, unsigned int,
+			  void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -465,6 +502,13 @@ static const struct token token_list[] = {
 		.call = parse_string,
 		.comp = comp_none,
 	},
+	[MAC_ADDR] = {
+		.name = "{MAC address}",
+		.type = "MAC-48",
+		.help = "standard MAC address notation",
+		.call = parse_mac_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -755,6 +799,50 @@ static const struct token token_list[] = {
 					    pattern,
 					    ITEM_RAW_PATTERN_SIZE)),
 	},
+	[ITEM_ETH] = {
+		.name = "eth",
+		.help = "match Ethernet header",
+		.priv = PRIV_ITEM(ETH, sizeof(struct rte_flow_item_eth)),
+		.next = NEXT(item_eth),
+		.call = parse_vc,
+	},
+	[ITEM_ETH_DST] = {
+		.name = "dst",
+		.help = "destination MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, dst)),
+	},
+	[ITEM_ETH_SRC] = {
+		.name = "src",
+		.help = "source MAC",
+		.next = NEXT(item_eth, NEXT_ENTRY(MAC_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_eth, src)),
+	},
+	[ITEM_ETH_TYPE] = {
+		.name = "type",
+		.help = "EtherType",
+		.next = NEXT(item_eth, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_eth, type)),
+	},
+	[ITEM_VLAN] = {
+		.name = "vlan",
+		.help = "match 802.1Q/ad VLAN tag",
+		.priv = PRIV_ITEM(VLAN, sizeof(struct rte_flow_item_vlan)),
+		.next = NEXT(item_vlan),
+		.call = parse_vc,
+	},
+	[ITEM_VLAN_TPID] = {
+		.name = "tpid",
+		.help = "tag protocol identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tpid)),
+	},
+	[ITEM_VLAN_TCI] = {
+		.name = "tci",
+		.help = "tag control information",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1399,6 +1487,44 @@ parse_string(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse a MAC address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_mac_addr(struct context *ctx, const struct token *token,
+	       const char *str, unsigned int len,
+	       void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	struct ether_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	ret = cmdline_parse_etheraddr(NULL, str, &tmp, size);
+	if (ret < 0 || (unsigned int)ret != len)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4

* [dpdk-dev] [PATCH v5 21/26] app/testpmd: add items ipv4/ipv6 to flow command
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).
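
Command-line tokens are not NUL-terminated, so they must be copied before
being handed to inet_pton(). The sketch below shows that step for IPv4 under
a few assumptions: ipv4_token_parse() is a hypothetical helper, it uses a
fixed-size buffer instead of the variable-length array found in the patch,
and it omits the patch's fallback to integer parsing.

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Parse the first len bytes of str as a dotted-quad IPv4 address and
 * write it to out in network byte order.
 * Returns 0 on success, -1 if the token is too long or malformed.
 */
static int
ipv4_token_parse(const char *str, size_t len, struct in_addr *out)
{
	char tmp[INET_ADDRSTRLEN];

	/* Leave room for the terminating NUL. */
	if (len >= sizeof(tmp))
		return -1;
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	return inet_pton(AF_INET, tmp, out) == 1 ? 0 : -1;
}
```

Since inet_pton() already produces network byte order, the result can be
copied into the item as-is, which matches the patch rejecting arguments that
do not have hton set.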

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 177 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index e22e0c2..1f6a5a0 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -38,6 +38,7 @@
 #include <errno.h>
 #include <ctype.h>
 #include <string.h>
+#include <arpa/inet.h>
 
 #include <rte_common.h>
 #include <rte_ethdev.h>
@@ -61,6 +62,8 @@ enum index {
 	BOOLEAN,
 	STRING,
 	MAC_ADDR,
+	IPV4_ADDR,
+	IPV6_ADDR,
 	RULE_ID,
 	PORT_ID,
 	GROUP_ID,
@@ -123,6 +126,12 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_IPV4,
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_IPV6,
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -348,6 +357,8 @@ static const enum index next_item[] = {
 	ITEM_RAW,
 	ITEM_ETH,
 	ITEM_VLAN,
+	ITEM_IPV4,
+	ITEM_IPV6,
 	ZERO,
 };
 
@@ -394,6 +405,20 @@ static const enum index item_vlan[] = {
 	ZERO,
 };
 
+static const enum index item_ipv4[] = {
+	ITEM_IPV4_SRC,
+	ITEM_IPV4_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_ipv6[] = {
+	ITEM_IPV6_SRC,
+	ITEM_IPV6_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -439,6 +464,12 @@ static int parse_string(struct context *, const struct token *,
 static int parse_mac_addr(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int parse_ipv4_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
+static int parse_ipv6_addr(struct context *, const struct token *,
+			   const char *, unsigned int,
+			   void *, unsigned int);
 static int parse_port(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -509,6 +540,20 @@ static const struct token token_list[] = {
 		.call = parse_mac_addr,
 		.comp = comp_none,
 	},
+	[IPV4_ADDR] = {
+		.name = "{IPv4 address}",
+		.type = "IPV4 ADDRESS",
+		.help = "standard IPv4 address notation",
+		.call = parse_ipv4_addr,
+		.comp = comp_none,
+	},
+	[IPV6_ADDR] = {
+		.name = "{IPv6 address}",
+		.type = "IPV6 ADDRESS",
+		.help = "standard IPv6 address notation",
+		.call = parse_ipv6_addr,
+		.comp = comp_none,
+	},
 	[RULE_ID] = {
 		.name = "{rule id}",
 		.type = "RULE ID",
@@ -843,6 +888,48 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_IPV4] = {
+		.name = "ipv4",
+		.help = "match IPv4 header",
+		.priv = PRIV_ITEM(IPV4, sizeof(struct rte_flow_item_ipv4)),
+		.next = NEXT(item_ipv4),
+		.call = parse_vc,
+	},
+	[ITEM_IPV4_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV4_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(IPV4_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.dst_addr)),
+	},
+	[ITEM_IPV6] = {
+		.name = "ipv6",
+		.help = "match IPv6 header",
+		.priv = PRIV_ITEM(IPV6, sizeof(struct rte_flow_item_ipv6)),
+		.next = NEXT(item_ipv6),
+		.call = parse_vc,
+	},
+	[ITEM_IPV6_SRC] = {
+		.name = "src",
+		.help = "source address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.src_addr)),
+	},
+	[ITEM_IPV6_DST] = {
+		.name = "dst",
+		.help = "destination address",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(IPV6_ADDR), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.dst_addr)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1525,6 +1612,96 @@ parse_mac_addr(struct context *ctx, const struct token *token,
 	return -1;
 }
 
+/**
+ * Parse an IPv4 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv4_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in_addr tmp;
+	int ret;
+
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET, str2, &tmp);
+	if (ret != 1) {
+		/* Attempt integer parsing. */
+		push_args(ctx, arg);
+		return parse_int(ctx, token, str, len, buf, size);
+	}
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
+/**
+ * Parse an IPv6 address.
+ *
+ * Last argument (ctx->args) is retrieved to determine storage size and
+ * location.
+ */
+static int
+parse_ipv6_addr(struct context *ctx, const struct token *token,
+		const char *str, unsigned int len,
+		void *buf, unsigned int size)
+{
+	const struct arg *arg = pop_args(ctx);
+	char str2[len + 1];
+	struct in6_addr tmp;
+	int ret;
+
+	(void)token;
+	/* Argument is expected. */
+	if (!arg)
+		return -1;
+	size = arg->size;
+	/* Bit-mask fill is not supported. */
+	if (arg->mask || size != sizeof(tmp))
+		goto error;
+	/* Only network endian is supported. */
+	if (!arg->hton)
+		goto error;
+	memcpy(str2, str, len);
+	str2[len] = '\0';
+	ret = inet_pton(AF_INET6, str2, &tmp);
+	if (ret != 1)
+		goto error;
+	if (!ctx->object)
+		return len;
+	buf = (uint8_t *)ctx->object + arg->offset;
+	memcpy(buf, &tmp, size);
+	if (ctx->objmask)
+		memset((uint8_t *)ctx->objmask + arg->offset, 0xff, size);
+	return len;
+error:
+	push_args(ctx, arg);
+	return -1;
+}
+
 /** Boolean values (even indices stand for false). */
 static const char *const boolean_name[] = {
 	"0", "1",
-- 
2.1.4
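
The address parsers in this patch copy the token into a NUL-terminated
buffer, try inet_pton(), and, for IPv4 only, fall back to integer parsing.
Below is a minimal standalone sketch of that flow. The helper name
ipv4_from_token() and its strtoul()-based fallback are assumptions made for
illustration; testpmd relies on its own parse_int() and argument stack
instead.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of testpmd): parse the first len bytes of
 * str as an IPv4 address and store it in network byte order.
 * Returns 0 on success, -1 on error.
 */
static int
ipv4_from_token(const char *str, size_t len, uint8_t out[4])
{
	char tmp[64];
	struct in_addr addr;
	unsigned long u;
	char *end;

	assert(str != NULL && out != NULL);
	if (len == 0 || len >= sizeof(tmp))
		return -1;
	/* Tokens are not NUL-terminated, inet_pton() needs a C string. */
	memcpy(tmp, str, len);
	tmp[len] = '\0';
	if (inet_pton(AF_INET, tmp, &addr) == 1) {
		/* struct in_addr is already in network byte order. */
		memcpy(out, &addr, 4);
		return 0;
	}
	/* Fallback: plain integer, stored most significant byte first. */
	if (tmp[0] < '0' || tmp[0] > '9')
		return -1;
	errno = 0;
	u = strtoul(tmp, &end, 0);
	if (errno != 0 || *end != '\0' || u > 0xffffffffUL)
		return -1;
	out[0] = (uint8_t)(u >> 24);
	out[1] = (uint8_t)(u >> 16);
	out[2] = (uint8_t)(u >> 8);
	out[3] = (uint8_t)u;
	return 0;
}
```

With this sketch, the tokens "10.2.0.0" and "0x0a020000" produce the same
four bytes, which is the reason for keeping an integer fallback on the IPv4
side.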


* [dpdk-dev] [PATCH v5 22/26] app/testpmd: add L4 items to flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (20 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 21/26] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 23/26] app/testpmd: add various actions " Adrien Mazarguil
                               ` (4 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 163 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 163 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1f6a5a0..259e9eb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -132,6 +132,20 @@ enum index {
 	ITEM_IPV6,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
+	ITEM_ICMP,
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_UDP,
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_TCP,
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_SCTP,
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_VXLAN,
+	ITEM_VXLAN_VNI,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -359,6 +373,11 @@ static const enum index next_item[] = {
 	ITEM_VLAN,
 	ITEM_IPV4,
 	ITEM_IPV6,
+	ITEM_ICMP,
+	ITEM_UDP,
+	ITEM_TCP,
+	ITEM_SCTP,
+	ITEM_VXLAN,
 	ZERO,
 };
 
@@ -419,6 +438,40 @@ static const enum index item_ipv6[] = {
 	ZERO,
 };
 
+static const enum index item_icmp[] = {
+	ITEM_ICMP_TYPE,
+	ITEM_ICMP_CODE,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_udp[] = {
+	ITEM_UDP_SRC,
+	ITEM_UDP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_tcp[] = {
+	ITEM_TCP_SRC,
+	ITEM_TCP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_sctp[] = {
+	ITEM_SCTP_SRC,
+	ITEM_SCTP_DST,
+	ITEM_NEXT,
+	ZERO,
+};
+
+static const enum index item_vxlan[] = {
+	ITEM_VXLAN_VNI,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -930,6 +983,103 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 					     hdr.dst_addr)),
 	},
+	[ITEM_ICMP] = {
+		.name = "icmp",
+		.help = "match ICMP header",
+		.priv = PRIV_ITEM(ICMP, sizeof(struct rte_flow_item_icmp)),
+		.next = NEXT(item_icmp),
+		.call = parse_vc,
+	},
+	[ITEM_ICMP_TYPE] = {
+		.name = "type",
+		.help = "ICMP packet type",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_type)),
+	},
+	[ITEM_ICMP_CODE] = {
+		.name = "code",
+		.help = "ICMP packet code",
+		.next = NEXT(item_icmp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_icmp,
+					     hdr.icmp_code)),
+	},
+	[ITEM_UDP] = {
+		.name = "udp",
+		.help = "match UDP header",
+		.priv = PRIV_ITEM(UDP, sizeof(struct rte_flow_item_udp)),
+		.next = NEXT(item_udp),
+		.call = parse_vc,
+	},
+	[ITEM_UDP_SRC] = {
+		.name = "src",
+		.help = "UDP source port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.src_port)),
+	},
+	[ITEM_UDP_DST] = {
+		.name = "dst",
+		.help = "UDP destination port",
+		.next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_udp,
+					     hdr.dst_port)),
+	},
+	[ITEM_TCP] = {
+		.name = "tcp",
+		.help = "match TCP header",
+		.priv = PRIV_ITEM(TCP, sizeof(struct rte_flow_item_tcp)),
+		.next = NEXT(item_tcp),
+		.call = parse_vc,
+	},
+	[ITEM_TCP_SRC] = {
+		.name = "src",
+		.help = "TCP source port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.src_port)),
+	},
+	[ITEM_TCP_DST] = {
+		.name = "dst",
+		.help = "TCP destination port",
+		.next = NEXT(item_tcp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tcp,
+					     hdr.dst_port)),
+	},
+	[ITEM_SCTP] = {
+		.name = "sctp",
+		.help = "match SCTP header",
+		.priv = PRIV_ITEM(SCTP, sizeof(struct rte_flow_item_sctp)),
+		.next = NEXT(item_sctp),
+		.call = parse_vc,
+	},
+	[ITEM_SCTP_SRC] = {
+		.name = "src",
+		.help = "SCTP source port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.src_port)),
+	},
+	[ITEM_SCTP_DST] = {
+		.name = "dst",
+		.help = "SCTP destination port",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.dst_port)),
+	},
+	[ITEM_VXLAN] = {
+		.name = "vxlan",
+		.help = "match VXLAN header",
+		.priv = PRIV_ITEM(VXLAN, sizeof(struct rte_flow_item_vxlan)),
+		.next = NEXT(item_vxlan),
+		.call = parse_vc,
+	},
+	[ITEM_VXLAN_VNI] = {
+		.name = "vni",
+		.help = "VXLAN identifier",
+		.next = NEXT(item_vxlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vxlan, vni)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -1502,6 +1652,19 @@ parse_int(struct context *ctx, const struct token *token,
 	case sizeof(uint16_t):
 		*(uint16_t *)buf = arg->hton ? rte_cpu_to_be_16(u) : u;
 		break;
+	case sizeof(uint8_t [3]):
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+		if (!arg->hton) {
+			((uint8_t *)buf)[0] = u;
+			((uint8_t *)buf)[1] = u >> 8;
+			((uint8_t *)buf)[2] = u >> 16;
+			break;
+		}
+#endif
+		((uint8_t *)buf)[0] = u >> 16;
+		((uint8_t *)buf)[1] = u >> 8;
+		((uint8_t *)buf)[2] = u;
+		break;
 	case sizeof(uint32_t):
 		*(uint32_t *)buf = arg->hton ? rte_cpu_to_be_32(u) : u;
 		break;
-- 
2.1.4


* [dpdk-dev] [PATCH v5 23/26] app/testpmd: add various actions to flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (21 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 22/26] app/testpmd: add L4 items " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 24/26] app/testpmd: add queue " Adrien Mazarguil
                               ` (3 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 121 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 259e9eb..a4e8ebe 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -153,6 +153,15 @@ enum index {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_MARK_ID,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
 };
 
 /** Size of pattern[] field in struct rte_flow_item_raw. */
@@ -476,6 +485,25 @@ static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
 	ACTION_PASSTHRU,
+	ACTION_MARK,
+	ACTION_FLAG,
+	ACTION_DROP,
+	ACTION_COUNT,
+	ACTION_PF,
+	ACTION_VF,
+	ZERO,
+};
+
+static const enum index action_mark[] = {
+	ACTION_MARK_ID,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_vf[] = {
+	ACTION_VF_ORIGINAL,
+	ACTION_VF_ID,
+	ACTION_NEXT,
 	ZERO,
 };
 
@@ -487,6 +515,8 @@ static int parse_vc(struct context *, const struct token *,
 		    void *, unsigned int);
 static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_conf(struct context *, const struct token *,
+			 const char *, unsigned int, void *, unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -1112,6 +1142,70 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_MARK] = {
+		.name = "mark",
+		.help = "attach 32 bit value to packets",
+		.priv = PRIV_ACTION(MARK, sizeof(struct rte_flow_action_mark)),
+		.next = NEXT(action_mark),
+		.call = parse_vc,
+	},
+	[ACTION_MARK_ID] = {
+		.name = "id",
+		.help = "32 bit value to return with packets",
+		.next = NEXT(action_mark, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_mark, id)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_FLAG] = {
+		.name = "flag",
+		.help = "flag packets",
+		.priv = PRIV_ACTION(FLAG, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_DROP] = {
+		.name = "drop",
+		.help = "drop packets (note: passthru has priority)",
+		.priv = PRIV_ACTION(DROP, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_COUNT] = {
+		.name = "count",
+		.help = "enable counters for this rule",
+		.priv = PRIV_ACTION(COUNT, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_PF] = {
+		.name = "pf",
+		.help = "redirect packets to physical device function",
+		.priv = PRIV_ACTION(PF, 0),
+		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+		.call = parse_vc,
+	},
+	[ACTION_VF] = {
+		.name = "vf",
+		.help = "redirect packets to virtual device function",
+		.priv = PRIV_ACTION(VF, sizeof(struct rte_flow_action_vf)),
+		.next = NEXT(action_vf),
+		.call = parse_vc,
+	},
+	[ACTION_VF_ORIGINAL] = {
+		.name = "original",
+		.help = "use original VF ID if possible",
+		.next = NEXT(action_vf, NEXT_ENTRY(BOOLEAN)),
+		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_vf,
+					   original, 1)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_VF_ID] = {
+		.name = "id",
+		.help = "VF ID to redirect packets to",
+		.next = NEXT(action_vf, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_vf, id)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
@@ -1446,6 +1540,33 @@ parse_vc_spec(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/** Parse action configuration field. */
+static int
+parse_vc_conf(struct context *ctx, const struct token *token,
+	      const char *str, unsigned int len,
+	      void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+	struct rte_flow_action *action;
+
+	(void)size;
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (!out->args.vc.actions_n)
+		return -1;
+	action = &out->args.vc.actions[out->args.vc.actions_n - 1];
+	/* Point to selected object. */
+	ctx->object = out->args.vc.data;
+	ctx->objmask = NULL;
+	/* Update configuration pointer. */
+	action->conf = ctx->object;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
-- 
2.1.4


* [dpdk-dev] [PATCH v5 24/26] app/testpmd: add queue actions to flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (22 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 23/26] app/testpmd: add various actions " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 25/26] doc: describe testpmd " Adrien Mazarguil
                               ` (2 subsequent siblings)
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
---
 app/test-pmd/cmdline_flow.c | 152 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a4e8ebe..db680c6 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -156,8 +156,15 @@ enum index {
 	ACTION_MARK,
 	ACTION_MARK_ID,
 	ACTION_FLAG,
+	ACTION_QUEUE,
+	ACTION_QUEUE_INDEX,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_DUP_INDEX,
+	ACTION_RSS,
+	ACTION_RSS_QUEUES,
+	ACTION_RSS_QUEUE,
 	ACTION_PF,
 	ACTION_VF,
 	ACTION_VF_ORIGINAL,
@@ -171,6 +178,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)
 
+/** Number of queue[] entries in struct rte_flow_action_rss. */
+#define ACTION_RSS_NUM 32
+
+/** Storage size for struct rte_flow_action_rss including queues. */
+#define ACTION_RSS_SIZE \
+	(offsetof(struct rte_flow_action_rss, queue) + \
+	 sizeof(*((struct rte_flow_action_rss *)0)->queue) * ACTION_RSS_NUM)
+
 /** Maximum number of subsequent tokens and arguments on the stack. */
 #define CTX_STACK_SIZE 16
 
@@ -487,8 +502,11 @@ static const enum index next_action[] = {
 	ACTION_PASSTHRU,
 	ACTION_MARK,
 	ACTION_FLAG,
+	ACTION_QUEUE,
 	ACTION_DROP,
 	ACTION_COUNT,
+	ACTION_DUP,
+	ACTION_RSS,
 	ACTION_PF,
 	ACTION_VF,
 	ZERO,
@@ -500,6 +518,24 @@ static const enum index action_mark[] = {
 	ZERO,
 };
 
+static const enum index action_queue[] = {
+	ACTION_QUEUE_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_dup[] = {
+	ACTION_DUP_INDEX,
+	ACTION_NEXT,
+	ZERO,
+};
+
+static const enum index action_rss[] = {
+	ACTION_RSS_QUEUES,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static const enum index action_vf[] = {
 	ACTION_VF_ORIGINAL,
 	ACTION_VF_ID,
@@ -517,6 +553,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 			 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_action_rss_queue(struct context *, const struct token *,
+				     const char *, unsigned int, void *,
+				     unsigned int);
 static int parse_destroy(struct context *, const struct token *,
 			 const char *, unsigned int,
 			 void *, unsigned int);
@@ -566,6 +605,8 @@ static int comp_port(struct context *, const struct token *,
 		     unsigned int, char *, unsigned int);
 static int comp_rule_id(struct context *, const struct token *,
 			unsigned int, char *, unsigned int);
+static int comp_vc_action_rss_queue(struct context *, const struct token *,
+				    unsigned int, char *, unsigned int);
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -1163,6 +1204,21 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_QUEUE] = {
+		.name = "queue",
+		.help = "assign packets to a given queue index",
+		.priv = PRIV_ACTION(QUEUE,
+				    sizeof(struct rte_flow_action_queue)),
+		.next = NEXT(action_queue),
+		.call = parse_vc,
+	},
+	[ACTION_QUEUE_INDEX] = {
+		.name = "index",
+		.help = "queue index to use",
+		.next = NEXT(action_queue, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_queue, index)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_DROP] = {
 		.name = "drop",
 		.help = "drop packets (note: passthru has priority)",
@@ -1177,6 +1233,39 @@ static const struct token token_list[] = {
 		.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
 		.call = parse_vc,
 	},
+	[ACTION_DUP] = {
+		.name = "dup",
+		.help = "duplicate packets to a given queue index",
+		.priv = PRIV_ACTION(DUP, sizeof(struct rte_flow_action_dup)),
+		.next = NEXT(action_dup),
+		.call = parse_vc,
+	},
+	[ACTION_DUP_INDEX] = {
+		.name = "index",
+		.help = "queue index to duplicate packets to",
+		.next = NEXT(action_dup, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_dup, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS] = {
+		.name = "rss",
+		.help = "spread packets among several queues",
+		.priv = PRIV_ACTION(RSS, ACTION_RSS_SIZE),
+		.next = NEXT(action_rss),
+		.call = parse_vc,
+	},
+	[ACTION_RSS_QUEUES] = {
+		.name = "queues",
+		.help = "queue indices to use",
+		.next = NEXT(action_rss, NEXT_ENTRY(ACTION_RSS_QUEUE)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_RSS_QUEUE] = {
+		.name = "{queue}",
+		.help = "queue index",
+		.call = parse_vc_action_rss_queue,
+		.comp = comp_vc_action_rss_queue,
+	},
 	[ACTION_PF] = {
 		.name = "pf",
 		.help = "redirect packets to physical device function",
@@ -1567,6 +1656,51 @@ parse_vc_conf(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/**
+ * Parse queue field for RSS action.
+ *
+ * Valid tokens are queue indices and the "end" token.
+ */
+static int
+parse_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			  const char *str, unsigned int len,
+			  void *buf, unsigned int size)
+{
+	static const enum index next[] = NEXT_ENTRY(ACTION_RSS_QUEUE);
+	int ret;
+	int i;
+
+	(void)token;
+	(void)buf;
+	(void)size;
+	if (ctx->curr != ACTION_RSS_QUEUE)
+		return -1;
+	i = ctx->objdata >> 16;
+	if (!strncmp(str, "end", len)) {
+		ctx->objdata &= 0xffff;
+		return len;
+	}
+	if (i >= ACTION_RSS_NUM)
+		return -1;
+	if (push_args(ctx, ARGS_ENTRY(struct rte_flow_action_rss, queue[i])))
+		return -1;
+	ret = parse_int(ctx, token, str, len, NULL, 0);
+	if (ret < 0) {
+		pop_args(ctx);
+		return -1;
+	}
+	++i;
+	ctx->objdata = i << 16 | (ctx->objdata & 0xffff);
+	/* Repeat token. */
+	if (ctx->next_num == RTE_DIM(ctx->next))
+		return -1;
+	ctx->next[ctx->next_num++] = next;
+	if (!ctx->object)
+		return len;
+	((struct rte_flow_action_rss *)ctx->object)->num = i;
+	return len;
+}
+
 /** Parse tokens for destroy command. */
 static int
 parse_destroy(struct context *ctx, const struct token *token,
@@ -2141,6 +2275,24 @@ comp_rule_id(struct context *ctx, const struct token *token,
 	return i;
 }
 
+/** Complete queue field for RSS action. */
+static int
+comp_vc_action_rss_queue(struct context *ctx, const struct token *token,
+			 unsigned int ent, char *buf, unsigned int size)
+{
+	static const char *const str[] = { "", "end", NULL };
+	unsigned int i;
+
+	(void)ctx;
+	(void)token;
+	for (i = 0; str[i] != NULL; ++i)
+		if (buf && i == ent)
+			return snprintf(buf, size, "%s", str[i]);
+	if (buf)
+		return -1;
+	return i;
+}
+
 /** Internal context. */
 static struct context cmd_flow_context;
 
-- 
2.1.4


* [dpdk-dev] [PATCH v5 25/26] doc: describe testpmd flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (23 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 24/26] app/testpmd: add queue " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 26/26] app/testpmd: add protocol fields to " Adrien Mazarguil
  2016-12-23  9:30             ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Thomas Monjalon
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev

Document syntax, interaction with rte_flow and provide usage examples.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
---
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 612 +++++++++++++++++++++++
 1 file changed, 612 insertions(+)

diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index f1c269a..03b6fa9 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -1631,6 +1631,9 @@ Filter Functions
 
 This section details the available filter functions that are available.
 
+Note that these functions interface with the deprecated legacy filtering
+framework, superseded by *rte_flow*. See `Flow rules management`_.
+
 ethertype_filter
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -2041,3 +2044,612 @@ Set different GRE key length for input set::
 For example to set GRE key length for input set to 4 bytes on port 0::
 
    testpmd> global_config 0 gre-key-len 4
+
+
+.. _testpmd_rte_flow:
+
+Flow rules management
+---------------------
+
+Control of the generic flow API (*rte_flow*) is fully exposed through the
+``flow`` command (validation, creation, destruction and queries).
+
+Considering *rte_flow* overlaps with all `Filter Functions`_, using both
+features simultaneously may cause undefined side-effects and is therefore
+not recommended.
+
+``flow`` syntax
+~~~~~~~~~~~~~~~
+
+Because the ``flow`` command uses dynamic tokens to handle the large number
+of possible flow rule combinations, its behavior differs slightly from
+other commands, in particular:
+
+- Pressing *?* or the *<tab>* key displays contextual help for the current
+  token, not that of the entire command.
+
+- Optional and repeated parameters are supported (provided they are listed
+  in the contextual help).
+
+The first parameter stands for the operation mode. Possible operations and
+their general syntax are described below. They are covered in detail in the
+following sections.
+
+- Check whether a flow rule can be created::
+
+   flow validate {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Create a flow rule::
+
+   flow create {port_id}
+       [group {group_id}] [priority {level}] [ingress] [egress]
+       pattern {item} [/ {item} [...]] / end
+       actions {action} [/ {action} [...]] / end
+
+- Destroy specific flow rules::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+- Destroy all flow rules::
+
+   flow flush {port_id}
+
+- Query an existing flow rule::
+
+   flow query {port_id} {rule_id} {action}
+
+- List existing flow rules sorted by priority, filtered by group
+  identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+Validating flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow validate`` reports whether a flow rule would be accepted by the
+underlying device in its current state but stops short of creating it. It is
+bound to ``rte_flow_validate()``::
+
+   flow validate {port_id}
+      [group {group_id}] [priority {level}] [ingress] [egress]
+      pattern {item} [/ {item} [...]] / end
+      actions {action} [/ {action} [...]] / end
+
+If successful, it will show::
+
+   Flow rule validated
+
+Otherwise it will show an error message of the form::
+
+   Caught error type [...] ([...]): [...]
+
+This command uses the same parameters as ``flow create``; their format is
+described in `Creating flow rules`_.
+
+Check whether redirecting any Ethernet packet received on port 0 to RX queue
+index 6 is supported::
+
+   testpmd> flow validate 0 ingress pattern eth / end
+      actions queue index 6 / end
+   Flow rule validated
+   testpmd>
+
+Port 0 does not support TCPv6 rules::
+
+   testpmd> flow validate 0 ingress pattern eth / ipv6 / tcp / end
+      actions drop / end
+   Caught error type 9 (specific pattern item): Invalid argument
+   testpmd>
+
+Creating flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow create`` validates and creates the specified flow rule. It is bound
+to ``rte_flow_create()``::
+
+   flow create {port_id}
+      [group {group_id}] [priority {level}] [ingress] [egress]
+      pattern {item} [/ {item} [...]] / end
+      actions {action} [/ {action} [...]] / end
+
+If successful, it will return a flow rule ID usable with other commands::
+
+   Flow rule #[...] created
+
+Otherwise it will show an error message of the form::
+
+   Caught error type [...] ([...]): [...]
+
+Parameters describe, in the following order:
+
+- Attributes (*group*, *priority*, *ingress*, *egress* tokens).
+- A matching pattern, starting with the *pattern* token and terminated by an
+  *end* pattern item.
+- Actions, starting with the *actions* token and terminated by an *end*
+  action.
+
+These translate directly to *rte_flow* objects provided as-is to the
+underlying functions.
+
+The shortest valid definition only comprises mandatory tokens::
+
+   testpmd> flow create 0 pattern end actions end
+
+Note that PMDs may refuse rules that essentially do nothing such as this
+one.
+
+**All unspecified object values are automatically initialized to 0.**
+
+Attributes
+^^^^^^^^^^
+
+These tokens affect flow rule attributes (``struct rte_flow_attr``) and are
+specified before the ``pattern`` token.
+
+- ``group {group_id}``: priority group.
+- ``priority {level}``: priority level within group.
+- ``ingress``: rule applies to ingress traffic.
+- ``egress``: rule applies to egress traffic.
+
+Each instance of an attribute specified several times overrides the previous
+value as shown below (group 4 is used)::
+
+   testpmd> flow create 0 group 42 group 24 group 4 [...]
+
+Note that once enabled, ``ingress`` and ``egress`` cannot be disabled.
+
+While not specifying a direction is an error, some rules may allow both
+simultaneously.
+
+Most rules affect RX and therefore contain the ``ingress`` token::
+
+   testpmd> flow create 0 ingress pattern [...]
+
+Matching pattern
+^^^^^^^^^^^^^^^^
+
+A matching pattern starts after the ``pattern`` token. It is made of pattern
+items and is terminated by a mandatory ``end`` item.
+
+Items are named after their type (*RTE_FLOW_ITEM_TYPE_* from ``enum
+rte_flow_item_type``).
+
+The ``/`` token is used as a separator between pattern items as shown
+below::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end [...]
+
+Note that protocol items like these must be stacked from lowest to highest
+layer to make sense. For instance, the following rule is either invalid or
+unlikely to match any packet::
+
+   testpmd> flow create 0 ingress pattern eth / udp / ipv4 / end [...]
+
+More information on these restrictions can be found in the *rte_flow*
+documentation.
+
+Several items support additional specification structures, for example
+``ipv4`` allows specifying source and destination addresses as follows::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 src is 10.1.1.1
+      dst is 10.2.0.0 / end [...]
+
+This rule matches all IPv4 traffic with the specified properties.
+
+In this example, ``src`` and ``dst`` are field names of the underlying
+``struct rte_flow_item_ipv4`` object. All item properties can be specified
+in a similar fashion.
+
+The ``is`` token means that the subsequent value must be matched exactly,
+and assigns ``spec`` and ``mask`` fields in ``struct rte_flow_item``
+accordingly. Possible assignment tokens are:
+
+- ``is``: match value perfectly (with full bit-mask).
+- ``spec``: match value according to configured bit-mask.
+- ``last``: specify upper bound to establish a range.
+- ``mask``: specify bit-mask with relevant bits set to one.
+- ``prefix``: generate bit-mask from a prefix length.
+
+These yield identical results::
+
+   ipv4 src is 10.1.1.1
+
+::
+
+   ipv4 src spec 10.1.1.1 src mask 255.255.255.255
+
+::
+
+   ipv4 src spec 10.1.1.1 src prefix 32
+
+::
+
+   ipv4 src is 10.1.1.1 src last 10.1.1.1 # range with a single value
+
+::
+
+   ipv4 src is 10.1.1.1 src last 0 # 0 disables range
+
+Inclusive ranges can be defined with ``last``::
+
+   ipv4 src is 10.1.1.1 src last 10.2.3.4 # 10.1.1.1 to 10.2.3.4
+
+Note that ``mask`` affects both ``spec`` and ``last``::
+
+   ipv4 src is 10.1.1.1 src last 10.2.3.4 src mask 255.255.0.0
+      # matches 10.1.0.0 to 10.2.255.255
+
+Properties can be modified multiple times::
+
+   ipv4 src is 10.1.1.1 src is 10.1.2.3 src is 10.2.3.4 # matches 10.2.3.4
+
+::
+
+   ipv4 src is 10.1.1.1 src prefix 24 src prefix 16 # matches 10.1.0.0/16
+
+Pattern items
+^^^^^^^^^^^^^
+
+This section lists supported pattern items and their attributes, if any.
+
+- ``end``: end list of pattern items.
+
+- ``void``: no-op pattern item.
+
+- ``invert``: perform actions when pattern does not match.
+
+- ``any``: match any protocol for the current layer.
+
+  - ``num {unsigned}``: number of layers covered.
+
+- ``pf``: match packets addressed to the physical function.
+
+- ``vf``: match packets addressed to a virtual function ID.
+
+  - ``id {unsigned}``: destination VF ID.
+
+- ``port``: device-specific physical port index to use.
+
+  - ``index {unsigned}``: physical port index.
+
+- ``raw``: match an arbitrary byte string.
+
+  - ``relative {boolean}``: look for pattern after the previous item.
+  - ``search {boolean}``: search pattern from offset (see also limit).
+  - ``offset {integer}``: absolute or relative offset for pattern.
+  - ``limit {unsigned}``: search area limit for start of pattern.
+  - ``pattern {string}``: byte string to look for.
+
+- ``eth``: match Ethernet header.
+
+  - ``dst {MAC-48}``: destination MAC.
+  - ``src {MAC-48}``: source MAC.
+  - ``type {unsigned}``: EtherType.
+
+- ``vlan``: match 802.1Q/ad VLAN tag.
+
+  - ``tpid {unsigned}``: tag protocol identifier.
+  - ``tci {unsigned}``: tag control information.
+
+- ``ipv4``: match IPv4 header.
+
+  - ``src {ipv4 address}``: source address.
+  - ``dst {ipv4 address}``: destination address.
+
+- ``ipv6``: match IPv6 header.
+
+  - ``src {ipv6 address}``: source address.
+  - ``dst {ipv6 address}``: destination address.
+
+- ``icmp``: match ICMP header.
+
+  - ``type {unsigned}``: ICMP packet type.
+  - ``code {unsigned}``: ICMP packet code.
+
+- ``udp``: match UDP header.
+
+  - ``src {unsigned}``: UDP source port.
+  - ``dst {unsigned}``: UDP destination port.
+
+- ``tcp``: match TCP header.
+
+  - ``src {unsigned}``: TCP source port.
+  - ``dst {unsigned}``: TCP destination port.
+
+- ``sctp``: match SCTP header.
+
+  - ``src {unsigned}``: SCTP source port.
+  - ``dst {unsigned}``: SCTP destination port.
+
+- ``vxlan``: match VXLAN header.
+
+  - ``vni {unsigned}``: VXLAN identifier.
+
+Actions list
+^^^^^^^^^^^^
+
+A list of actions starts after the ``actions`` token in the same fashion as
+`Matching pattern`_; actions are separated by ``/`` tokens and the list is
+terminated by a mandatory ``end`` action.
+
+Actions are named after their type (*RTE_FLOW_ACTION_TYPE_* from ``enum
+rte_flow_action_type``).
+
+Dropping all incoming UDPv4 packets can be expressed as follows::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions drop / end
+
+Several actions have configurable properties which must be specified when
+there is no valid default value. For example, ``queue`` requires a target
+queue index.
+
+This rule redirects incoming UDPv4 traffic to queue index 6::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions queue index 6 / end
+
+While this one could be rejected by PMDs (unspecified queue index)::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end
+      actions queue / end
+
+As defined by *rte_flow*, the list is not ordered, all actions of a given
+rule are performed simultaneously. These are equivalent::
+
+   queue index 6 / void / mark id 42 / end
+
+::
+
+   void / mark id 42 / queue index 6 / end
+
+All actions in a list should have different types, otherwise only the last
+action of a given type is taken into account::
+
+   queue index 4 / queue index 5 / queue index 6 / end # will use queue 6
+
+::
+
+   drop / drop / drop / end # drop is performed only once
+
+::
+
+   mark id 42 / queue index 3 / mark id 24 / end # mark will be 24
+
+Considering they are performed simultaneously, opposite and overlapping
+actions can sometimes be combined when the end result is unambiguous::
+
+   drop / queue index 6 / end # drop has no effect
+
+::
+
+   drop / dup index 6 / end # same as above
+
+::
+
+   queue index 6 / rss queues 6 7 8 / end # queue has no effect
+
+::
+
+   drop / passthru / end # drop has no effect
+
+Note that PMDs may still refuse such combinations.
+
+Actions
+^^^^^^^
+
+This section lists supported actions and their attributes, if any.
+
+- ``end``: end list of actions.
+
+- ``void``: no-op action.
+
+- ``passthru``: let subsequent rule process matched packets.
+
+- ``mark``: attach 32 bit value to packets.
+
+  - ``id {unsigned}``: 32 bit value to return with packets.
+
+- ``flag``: flag packets.
+
+- ``queue``: assign packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to use.
+
+- ``drop``: drop packets (note: passthru has priority).
+
+- ``count``: enable counters for this rule.
+
+- ``dup``: duplicate packets to a given queue index.
+
+  - ``index {unsigned}``: queue index to duplicate packets to.
+
+- ``rss``: spread packets among several queues.
+
+  - ``queues [{unsigned} [...]] end``: queue indices to use.
+
+- ``pf``: redirect packets to physical device function.
+
+- ``vf``: redirect packets to virtual device function.
+
+  - ``original {boolean}``: use original VF ID if possible.
+  - ``id {unsigned}``: VF ID to redirect packets to.
+
+Destroying flow rules
+~~~~~~~~~~~~~~~~~~~~~
+
+``flow destroy`` destroys one or more rules from their rule ID (as returned
+by ``flow create``), this command calls ``rte_flow_destroy()`` as many
+times as necessary::
+
+   flow destroy {port_id} rule {rule_id} [...]
+
+If successful, it will show::
+
+   Flow rule #[...] destroyed
+
+It does not report anything for rule IDs that do not exist. The usual error
+message is shown when a rule cannot be destroyed::
+
+   Caught error type [...] ([...]): [...]
+
+``flow flush`` destroys all rules on a device and does not take extra
+arguments. It is bound to ``rte_flow_flush()``::
+
+   flow flush {port_id}
+
+Any errors are reported as above.
+
+Creating several rules and destroying them::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow destroy 0 rule 0 rule 1
+   Flow rule #1 destroyed
+   Flow rule #0 destroyed
+   testpmd>
+
+The same result can be achieved using ``flow flush``::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow flush 0
+   testpmd>
+
+Non-existent rule IDs are ignored::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 3 / end
+   Flow rule #1 created
+   testpmd> flow destroy 0 rule 42 rule 10 rule 2
+   testpmd>
+   testpmd> flow destroy 0 rule 0
+   Flow rule #0 destroyed
+   testpmd>
+
+Querying flow rules
+~~~~~~~~~~~~~~~~~~~
+
+``flow query`` queries a specific action of a flow rule having that
+ability. Such actions collect information that can be reported using this
+command. It is bound to ``rte_flow_query()``::
+
+   flow query {port_id} {rule_id} {action}
+
+If successful, it will display either the retrieved data for known actions
+or the following message::
+
+   Cannot display result for action type [...] ([...])
+
+Otherwise, it will complain either that the rule does not exist or that some
+error occurred::
+
+   Flow rule #[...] not found
+
+::
+
+   Caught error type [...] ([...]): [...]
+
+Currently only the ``count`` action is supported. This action reports the
+number of packets that hit the flow rule and the total number of bytes. Its
+output has the following format::
+
+   count:
+    hits_set: [...] # whether "hits" contains a valid value
+    bytes_set: [...] # whether "bytes" contains a valid value
+    hits: [...] # number of packets
+    bytes: [...] # number of bytes
+
+Querying counters for TCPv6 packets redirected to queue 6::
+
+   testpmd> flow create 0 ingress pattern eth / ipv6 / tcp / end
+      actions queue index 6 / count / end
+   Flow rule #4 created
+   testpmd> flow query 0 4 count
+   count:
+    hits_set: 1
+    bytes_set: 0
+    hits: 386446
+    bytes: 0
+   testpmd>
+
+Listing flow rules
+~~~~~~~~~~~~~~~~~~
+
+``flow list`` lists existing flow rules sorted by priority and optionally
+filtered by group identifiers::
+
+   flow list {port_id} [group {group_id}] [...]
+
+This command only fails with the following message if the device does not
+exist::
+
+   Invalid port [...]
+
+Output consists of a header line followed by a short description of each
+flow rule, one per line. There is no output at all when no flow rules are
+configured on the device::
+
+   ID      Group   Prio    Attr    Rule
+   [...]   [...]   [...]   [...]   [...]
+
+``Attr`` column flags:
+
+- ``i`` for ``ingress``.
+- ``e`` for ``egress``.
+
+Creating several flow rules and listing them::
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / end
+      actions queue index 6 / end
+   Flow rule #0 created
+   testpmd> flow create 0 ingress pattern eth / ipv6 / end
+      actions queue index 2 / end
+   Flow rule #1 created
+   testpmd> flow create 0 priority 5 ingress pattern eth / ipv4 / udp / end
+      actions rss queues 6 7 8 end / end
+   Flow rule #2 created
+   testpmd> flow list 0
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH IPV4 => QUEUE
+   1       0       0       i-      ETH IPV6 => QUEUE
+   2       0       5       i-      ETH IPV4 UDP => RSS
+   testpmd>
+
+Rules are sorted by priority (i.e. group ID first, then priority level)::
+
+   testpmd> flow list 1
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH => COUNT
+   6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+   5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+   1       24      0       i-      ETH IPV4 UDP => QUEUE
+   4       24      10      i-      ETH IPV4 TCP => DROP
+   3       24      20      i-      ETH IPV4 => DROP
+   2       24      42      i-      ETH IPV4 UDP => QUEUE
+   7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+   testpmd>
+
+Output can be limited to specific groups::
+
+   testpmd> flow list 1 group 0 group 63
+   ID      Group   Prio    Attr    Rule
+   0       0       0       i-      ETH => COUNT
+   6       0       500     i-      ETH IPV6 TCP => DROP COUNT
+   5       0       1000    i-      ETH IPV6 ICMP => QUEUE
+   7       63      0       i-      ETH IPV6 UDP VXLAN => MARK QUEUE
+   testpmd>
-- 
2.1.4


* [dpdk-dev] [PATCH v5 26/26] app/testpmd: add protocol fields to flow command
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (24 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 25/26] doc: describe testpmd " Adrien Mazarguil
@ 2016-12-21 14:51             ` Adrien Mazarguil
  2016-12-23  9:30             ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Thomas Monjalon
  26 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-21 14:51 UTC (permalink / raw)
  To: dev; +Cc: Xing, Beilei, Pei, Yulong, Nelio Laranjeiro

This commit exposes the following item fields through the flow command:

- VLAN priority code point, drop eligible indicator and VLAN identifier
  (all part of TCI).
- IPv4 type of service, time to live and protocol.
- IPv6 traffic class, flow label, next header and hop limit.
- SCTP tag and checksum.
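
As a rough illustration of how these sub-fields share a single header word,
the sketch below packs and unpacks the VLAN TCI and the IPv6 vtc_flow word
in host byte order. The mask values mirror the byte strings used in this
patch ("\xe0\x00", "\x10\x00", "\x0f\xff", "\x0f\xf0\x00\x00" and
"\x00\x0f\xff\xff"); the helper names are hypothetical and are not part of
testpmd or rte_flow.

```c
#include <assert.h>
#include <stdint.h>

/* Host-order equivalents of the masks passed to ARGS_ENTRY_MASK_HTON(). */
#define TCI_PCP_MASK   0xe000u      /* priority code point, 3 bits */
#define TCI_DEI_MASK   0x1000u      /* drop eligible indicator, 1 bit */
#define TCI_VID_MASK   0x0fffu      /* VLAN identifier, 12 bits */
#define IPV6_TC_MASK   0x0ff00000u  /* traffic class, 8 bits */
#define IPV6_FLOW_MASK 0x000fffffu  /* flow label, 20 bits */

/* Hypothetical helper: build a host-order TCI from its three fields. */
static uint16_t tci_pack(unsigned pcp, unsigned dei, unsigned vid)
{
	assert(pcp <= 7 && dei <= 1 && vid <= 0x0fff);
	return (uint16_t)((pcp << 13) | (dei << 12) | vid);
}

/* Hypothetical accessors: extract each field using the masks above. */
static unsigned tci_pcp(uint16_t tci) { return (tci & TCI_PCP_MASK) >> 13; }
static unsigned tci_dei(uint16_t tci) { return (tci & TCI_DEI_MASK) >> 12; }
static unsigned tci_vid(uint16_t tci) { return tci & TCI_VID_MASK; }

/* Hypothetical helper: build a host-order IPv6 vtc_flow word (version 6). */
static uint32_t vtc_flow_pack(unsigned tc, uint32_t flow)
{
	assert(tc <= 0xff && flow <= IPV6_FLOW_MASK);
	return (6u << 28) | ((uint32_t)tc << 20) | flow;
}
```

A value built this way would still need converting to network byte order
before being stored in a header, which is what the HTON variants of the
ARGS_ENTRY macros take care of for user-supplied values.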

Cc: Xing, Beilei <beilei.xing@intel.com>
Cc: Pei, Yulong <yulong.pei@intel.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 app/test-pmd/cmdline_flow.c                 | 127 +++++++++++++++++++++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  12 +++
 2 files changed, 139 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index db680c6..7760c2d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -126,10 +126,20 @@ enum index {
 	ITEM_VLAN,
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_VLAN_PCP,
+	ITEM_VLAN_DEI,
+	ITEM_VLAN_VID,
 	ITEM_IPV4,
+	ITEM_IPV4_TOS,
+	ITEM_IPV4_TTL,
+	ITEM_IPV4_PROTO,
 	ITEM_IPV4_SRC,
 	ITEM_IPV4_DST,
 	ITEM_IPV6,
+	ITEM_IPV6_TC,
+	ITEM_IPV6_FLOW,
+	ITEM_IPV6_PROTO,
+	ITEM_IPV6_HOP,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
 	ITEM_ICMP,
@@ -144,6 +154,8 @@ enum index {
 	ITEM_SCTP,
 	ITEM_SCTP_SRC,
 	ITEM_SCTP_DST,
+	ITEM_SCTP_TAG,
+	ITEM_SCTP_CKSUM,
 	ITEM_VXLAN,
 	ITEM_VXLAN_VNI,
 
@@ -281,6 +293,23 @@ struct token {
 		.mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \
 	})
 
+/** Static initializer for ARGS() to target an arbitrary bit-mask. */
+#define ARGS_ENTRY_MASK(s, f, m) \
+	(&(const struct arg){ \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+		.mask = (const void *)(m), \
+	})
+
+/** Same as ARGS_ENTRY_MASK() using network byte ordering for the value. */
+#define ARGS_ENTRY_MASK_HTON(s, f, m) \
+	(&(const struct arg){ \
+		.hton = 1, \
+		.offset = offsetof(s, f), \
+		.size = sizeof(((s *)0)->f), \
+		.mask = (const void *)(m), \
+	})
+
 /** Static initializer for ARGS() to target a pointer. */
 #define ARGS_ENTRY_PTR(s, f) \
 	(&(const struct arg){ \
@@ -444,11 +473,17 @@ static const enum index item_eth[] = {
 static const enum index item_vlan[] = {
 	ITEM_VLAN_TPID,
 	ITEM_VLAN_TCI,
+	ITEM_VLAN_PCP,
+	ITEM_VLAN_DEI,
+	ITEM_VLAN_VID,
 	ITEM_NEXT,
 	ZERO,
 };
 
 static const enum index item_ipv4[] = {
+	ITEM_IPV4_TOS,
+	ITEM_IPV4_TTL,
+	ITEM_IPV4_PROTO,
 	ITEM_IPV4_SRC,
 	ITEM_IPV4_DST,
 	ITEM_NEXT,
@@ -456,6 +491,10 @@ static const enum index item_ipv4[] = {
 };
 
 static const enum index item_ipv6[] = {
+	ITEM_IPV6_TC,
+	ITEM_IPV6_FLOW,
+	ITEM_IPV6_PROTO,
+	ITEM_IPV6_HOP,
 	ITEM_IPV6_SRC,
 	ITEM_IPV6_DST,
 	ITEM_NEXT,
@@ -486,6 +525,8 @@ static const enum index item_tcp[] = {
 static const enum index item_sctp[] = {
 	ITEM_SCTP_SRC,
 	ITEM_SCTP_DST,
+	ITEM_SCTP_TAG,
+	ITEM_SCTP_CKSUM,
 	ITEM_NEXT,
 	ZERO,
 };
@@ -1012,6 +1053,27 @@ static const struct token token_list[] = {
 		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_vlan, tci)),
 	},
+	[ITEM_VLAN_PCP] = {
+		.name = "pcp",
+		.help = "priority code point",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
+						  tci, "\xe0\x00")),
+	},
+	[ITEM_VLAN_DEI] = {
+		.name = "dei",
+		.help = "drop eligible indicator",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
+						  tci, "\x10\x00")),
+	},
+	[ITEM_VLAN_VID] = {
+		.name = "vid",
+		.help = "VLAN identifier",
+		.next = NEXT(item_vlan, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_vlan,
+						  tci, "\x0f\xff")),
+	},
 	[ITEM_IPV4] = {
 		.name = "ipv4",
 		.help = "match IPv4 header",
@@ -1019,6 +1081,27 @@ static const struct token token_list[] = {
 		.next = NEXT(item_ipv4),
 		.call = parse_vc,
 	},
+	[ITEM_IPV4_TOS] = {
+		.name = "tos",
+		.help = "type of service",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.type_of_service)),
+	},
+	[ITEM_IPV4_TTL] = {
+		.name = "ttl",
+		.help = "time to live",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.time_to_live)),
+	},
+	[ITEM_IPV4_PROTO] = {
+		.name = "proto",
+		.help = "next protocol ID",
+		.next = NEXT(item_ipv4, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+					     hdr.next_proto_id)),
+	},
 	[ITEM_IPV4_SRC] = {
 		.name = "src",
 		.help = "source address",
@@ -1040,6 +1123,36 @@ static const struct token token_list[] = {
 		.next = NEXT(item_ipv6),
 		.call = parse_vc,
 	},
+	[ITEM_IPV6_TC] = {
+		.name = "tc",
+		.help = "traffic class",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_ipv6,
+						  hdr.vtc_flow,
+						  "\x0f\xf0\x00\x00")),
+	},
+	[ITEM_IPV6_FLOW] = {
+		.name = "flow",
+		.help = "flow label",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_ipv6,
+						  hdr.vtc_flow,
+						  "\x00\x0f\xff\xff")),
+	},
+	[ITEM_IPV6_PROTO] = {
+		.name = "proto",
+		.help = "protocol (next header)",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.proto)),
+	},
+	[ITEM_IPV6_HOP] = {
+		.name = "hop",
+		.help = "hop limit",
+		.next = NEXT(item_ipv6, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
+					     hdr.hop_limits)),
+	},
 	[ITEM_IPV6_SRC] = {
 		.name = "src",
 		.help = "source address",
@@ -1138,6 +1251,20 @@ static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
 					     hdr.dst_port)),
 	},
+	[ITEM_SCTP_TAG] = {
+		.name = "tag",
+		.help = "validation tag",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.tag)),
+	},
+	[ITEM_SCTP_CKSUM] = {
+		.name = "cksum",
+		.help = "checksum",
+		.next = NEXT(item_sctp, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_sctp,
+					     hdr.cksum)),
+	},
 	[ITEM_VXLAN] = {
 		.name = "vxlan",
 		.help = "match VXLAN header",
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 03b6fa9..cacdef1 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -2333,14 +2333,24 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``tpid {unsigned}``: tag protocol identifier.
   - ``tci {unsigned}``: tag control information.
+  - ``pcp {unsigned}``: priority code point.
+  - ``dei {unsigned}``: drop eligible indicator.
+  - ``vid {unsigned}``: VLAN identifier.
 
 - ``ipv4``: match IPv4 header.
 
+  - ``tos {unsigned}``: type of service.
+  - ``ttl {unsigned}``: time to live.
+  - ``proto {unsigned}``: next protocol ID.
   - ``src {ipv4 address}``: source address.
   - ``dst {ipv4 address}``: destination address.
 
 - ``ipv6``: match IPv6 header.
 
+  - ``tc {unsigned}``: traffic class.
+  - ``flow {unsigned}``: flow label.
+  - ``proto {unsigned}``: protocol (next header).
+  - ``hop {unsigned}``: hop limit.
   - ``src {ipv6 address}``: source address.
   - ``dst {ipv6 address}``: destination address.
 
@@ -2363,6 +2373,8 @@ This section lists supported pattern items and their attributes, if any.
 
   - ``src {unsigned}``: SCTP source port.
   - ``dst {unsigned}``: SCTP destination port.
+  - ``tag {unsigned}``: validation tag.
+  - ``cksum {unsigned}``: checksum.
 
 - ``vxlan``: match VXLAN header.
 
-- 
2.1.4


* Re: [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide Adrien Mazarguil
@ 2016-12-21 15:09               ` Mcnamara, John
  0 siblings, 0 replies; 262+ messages in thread
From: Mcnamara, John @ 2016-12-21 15:09 UTC (permalink / raw)
  To: Adrien Mazarguil, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Wednesday, December 21, 2016 2:51 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide
> 
> This documentation is based on the latest RFC submission, subsequently
> updated according to feedback from the community.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>

Acked-by: John McNamara <john.mcnamara@intel.com>



* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
                         ` (26 preceding siblings ...)
  2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
@ 2016-12-21 16:19       ` Simon Horman
  2016-12-22 12:48         ` Adrien Mazarguil
  27 siblings, 1 reply; 262+ messages in thread
From: Simon Horman @ 2016-12-21 16:19 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Fri, Dec 16, 2016 at 05:24:57PM +0100, Adrien Mazarguil wrote:
> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> described in [3] (also pasted below), here is the first non-draft series
> for this new API.
> 
> Its capabilities are so generic that its name had to be vague, it may be
> called "Generic flow API", "Generic flow interface" (possibly shortened
> as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> the prefix used for its public symbols. I personally favor the latter.
> 
> While it is currently meant to supersede existing filter types in order for
> all PMDs to expose a common filtering/classification interface, it may
> eventually evolve to cover the following ideas as well:
> 
> - Rx/Tx offloads configuration through automatic offloads for specific
>   packets, e.g. performing checksum on TCP packets could be expressed with
>   an egress rule with a TCP pattern and a kind of checksum action.
> 
> - RSS configuration (already defined actually). Could be global or per rule
>   depending on hardware capabilities.
> 
> - Switching configuration for devices with many physical ports; rules doing
>   both ingress and egress could even be used to completely bypass software
>   if supported by hardware.

Hi Adrien,

thanks for this valuable work.

I would like to ask some high level questions on the proposal.
I apologise in advance if any of these questions are based on a
misunderstanding on my part.

* I am wondering about provisions for actions to modify packet data or
  metadata.  I do see support for marking packets. Is the implication of
  this that the main focus is to provide a mechanism for classification
  with the assumption that any actions - other than drop and variants of
  output - would be performed elsewhere?

  If so I would observe that this seems somewhat limiting in the case of
  hardware that can perform a richer set of actions. And seems particularly
  limiting on egress as there doesn't seem anywhere else that other actions
  could be performed after classification is performed by this API.

* I am curious to know what considerations have been given to supporting
  tunnelling (encapsulation and decapsulation of e.g. VXLAN),
  tagging (pushing and popping e.g. VLANs), and labels (pushing or popping
  e.g. MPLS).

  Such features would seem useful for application of this work in a variety
  of situations including overlay networks and VNFs.

* I am wondering if any thought has gone into supporting matching on the
  n-th instance of a field that may appear more than once: e.g. VLAN tag.

With the above questions in mind I am curious to know what use-cases
the proposal is targeted at.


* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2016-12-21 16:19       ` [dpdk-dev] [PATCH v2 00/25] " Simon Horman
@ 2016-12-22 12:48         ` Adrien Mazarguil
  2017-01-04  9:53           ` Simon Horman
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2016-12-22 12:48 UTC (permalink / raw)
  To: Simon Horman; +Cc: dev

On Wed, Dec 21, 2016 at 05:19:16PM +0100, Simon Horman wrote:
> On Fri, Dec 16, 2016 at 05:24:57PM +0100, Adrien Mazarguil wrote:
> > As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> > described in [3] (also pasted below), here is the first non-draft series
> > for this new API.
> > 
> > Its capabilities are so generic that its name had to be vague, it may be
> > called "Generic flow API", "Generic flow interface" (possibly shortened
> > as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> > the prefix used for its public symbols. I personally favor the latter.
> > 
> > While it is currently meant to supersede existing filter types in order for
> > all PMDs to expose a common filtering/classification interface, it may
> > eventually evolve to cover the following ideas as well:
> > 
> > - Rx/Tx offloads configuration through automatic offloads for specific
> >   packets, e.g. performing checksum on TCP packets could be expressed with
> >   an egress rule with a TCP pattern and a kind of checksum action.
> > 
> > - RSS configuration (already defined actually). Could be global or per rule
> >   depending on hardware capabilities.
> > 
> > - Switching configuration for devices with many physical ports; rules doing
> >   both ingress and egress could even be used to completely bypass software
> >   if supported by hardware.

Hi Simon,

> Hi Adrien,
> 
> thanks for this valuable work.
> 
> I would like to ask some high level questions on the proposal.
> I apologise in advance if any of these questions are based on a
> misunderstanding on my part.
> 
> * I am wondering about provisions for actions to modify packet data or
>   metadata.  I do see support for marking packets. Is the implication of
>   this that the main focus is to provide a mechanism for classification
>   with the assumption that any actions - other than drop and variants of
>   output - would be performed elsewhere?

I'm not sure I understand what you mean by "elsewhere" here. Packet marking
as currently defined is a purely ingress action, i.e. HW matches some packet
and returns a user-defined tag in related meta-data that the PMD copies to
the appropriate mbuf structure field before returning it to the application.

There is provision for egress rules and I wrote down a few ideas describing
how they could be useful (as above), however they remain to be defined.

>   If so I would observe that this seems somewhat limiting in the case of
>   hardware that can perform a richer set of actions. And seems particularly
>   limiting on egress as there doesn't seem anywhere else that other actions
>   could be performed after classification is performed by this API.

A single flow rule may contain any number of distinct actions. For egress,
it means you could wrap matching packets in VLAN and VXLAN at once.

If you wanted to perform the same action twice on matching packets, you'd
have to provide two rules with defined priorities and use a non-terminating
action for the first one:

- Rule with priority 0: match UDP -> add VLAN 42, passthrough
- Rule with priority 1: match UDP -> add VLAN 64, terminating

This is how automatic QinQ would be defined for outgoing UDP packets.
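
To make the evaluation order concrete, here is a toy model in plain C. It is
only a sketch under stated assumptions: the "add VLAN" action does not exist
in rte_flow at this point, every rule is assumed to match the packet, and
all names (toy_pkt, toy_rule, toy_apply) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TAGS 4

/* Toy packet: only the VLAN IDs pushed so far (last entry is outermost). */
struct toy_pkt {
	uint16_t vid[MAX_TAGS];
	size_t n_tags;
};

/* Toy rule; "push_vid" stands for the hypothetical "add VLAN" action. */
struct toy_rule {
	unsigned priority; /* lower value is evaluated first */
	uint16_t push_vid; /* VLAN ID pushed when the rule is applied */
	int passthru;      /* non-zero: let subsequent rules process the packet */
};

/*
 * Apply rules that are already sorted by ascending priority value.
 * Processing stops after the first terminating (non-passthru) rule.
 * Returns the number of rules applied.
 */
static size_t toy_apply(const struct toy_rule *rules, size_t n,
			struct toy_pkt *pkt)
{
	size_t applied = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(i == 0 || rules[i - 1].priority <= rules[i].priority);
		assert(pkt->n_tags < MAX_TAGS);
		pkt->vid[pkt->n_tags++] = rules[i].push_vid;
		applied++;
		if (!rules[i].passthru)
			break; /* terminating rule: later rules are skipped */
	}
	return applied;
}
```

With the two rules listed above, the packet first gets VLAN 42, then VLAN 64
as the outer tag. If the first rule were terminating, only VLAN 42 would be
added.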

> * I am curious to know what considerations have been given to supporting
>   tunnelling (encapsulation and decapsulation of e.g. VXLAN),
>   tagging (pushing and popping e.g. VLANs), and labels (pushing or popping
>   e.g. MPLS).
> 
>   Such features would seem useful for application of this work in a variety
>   of situations including overlay networks and VNFs.

This is also what I had in mind and we'd only have to define specific
ingress/egress actions for these. Currently rte_flow only implements a basic
set of existing features from the legacy filtering framework, but is meant
to be extended.

> * I am wondering if any thought has gone into supporting matching on the
>   n-th instance of a field that may appear more than once: e.g. VLAN tag.

Sure, please see the latest documentation [1] and testpmd examples [2].
Pattern items being stacked in the same order as protocol layers, matching
specific QinQ traffic and redirecting it to some queue could be expressed
with something like:

 testpmd> flow create 0 ingress pattern eth / vlan vid is 64 / vlan vid is 42 / end
    actions queue index 6 / end

Such a rule is translated as-is to rte_flow pattern items and action
structures.
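
For readers who want to see what such a pattern selects, the following
plain C sketch checks a raw Ethernet frame for an outer and an inner VLAN ID.
It is a software illustration only, not how a PMD or the hardware performs
the match. The function names are hypothetical, and accepting both 0x8100
and 0x88a8 as tag TPIDs is an assumption of this sketch (the rule above
leaves tpid unspecified).

```c
#include <stddef.h>
#include <stdint.h>

#define VID_MASK 0x0fffu /* low 12 bits of the TCI */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Assumption: 802.1Q and 802.1ad TPIDs both identify a VLAN tag. */
static int is_vlan_tpid(uint16_t tpid)
{
	return tpid == 0x8100 || tpid == 0x88a8;
}

/*
 * Return 1 if the frame carries two stacked VLAN tags whose VLAN IDs are
 * outer_vid followed by inner_vid, 0 otherwise. PCP and DEI bits are
 * ignored, as when only "vid" is given in the pattern.
 */
static int match_qinq(const uint8_t *frame, size_t len,
		      uint16_t outer_vid, uint16_t inner_vid)
{
	/* 12 bytes of MAC addresses, two 4-byte tags, inner EtherType. */
	if (len < 12 + 4 + 4 + 2)
		return 0;
	if (!is_vlan_tpid(rd16(frame + 12)) || !is_vlan_tpid(rd16(frame + 16)))
		return 0;
	return (rd16(frame + 14) & VID_MASK) == outer_vid &&
	       (rd16(frame + 18) & VID_MASK) == inner_vid;
}
```

Calling match_qinq(frame, len, 64, 42) corresponds to the two stacked vlan
items above, outermost tag first.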

> With the above questions in mind I am curious to know what use-cases
> the proposal is targeted at.

Well, it should be easier to answer if you have a specific use-case in mind
you would like to support but that cannot be expressed with the API as
defined in [1], in which case please share it with the community.

[1] http://dpdk.org/ml/archives/dev/2016-December/052954.html
[2] http://dpdk.org/ml/archives/dev/2016-December/052975.html

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow)
  2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
                               ` (25 preceding siblings ...)
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 26/26] app/testpmd: add protocol fields to " Adrien Mazarguil
@ 2016-12-23  9:30             ` Thomas Monjalon
  26 siblings, 0 replies; 262+ messages in thread
From: Thomas Monjalon @ 2016-12-23  9:30 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

> Adrien Mazarguil (26):
>   ethdev: introduce generic flow API
>   doc: add rte_flow prog guide
>   doc: announce deprecation of legacy filter types
>   cmdline: add support for dynamic tokens
>   cmdline: add alignment constraint
>   app/testpmd: implement basic support for rte_flow
>   app/testpmd: add flow command
>   app/testpmd: add rte_flow integer support
>   app/testpmd: add flow list command
>   app/testpmd: add flow flush command
>   app/testpmd: add flow destroy command
>   app/testpmd: add flow validate/create commands
>   app/testpmd: add flow query command
>   app/testpmd: add rte_flow item spec handler
>   app/testpmd: add rte_flow item spec prefix length
>   app/testpmd: add rte_flow bit-field support
>   app/testpmd: add item any to flow command
>   app/testpmd: add various items to flow command
>   app/testpmd: add item raw to flow command
>   app/testpmd: add items eth/vlan to flow command
>   app/testpmd: add items ipv4/ipv6 to flow command
>   app/testpmd: add L4 items to flow command
>   app/testpmd: add various actions to flow command
>   app/testpmd: add queue actions to flow command
>   doc: describe testpmd flow command
>   app/testpmd: add protocol fields to flow command

Applied, thanks for the great work!


* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2016-12-22 12:48         ` Adrien Mazarguil
@ 2017-01-04  9:53           ` Simon Horman
  2017-01-04 18:12             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Simon Horman @ 2017-01-04  9:53 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Thu, Dec 22, 2016 at 01:48:04PM +0100, Adrien Mazarguil wrote:
> On Wed, Dec 21, 2016 at 05:19:16PM +0100, Simon Horman wrote:
> > On Fri, Dec 16, 2016 at 05:24:57PM +0100, Adrien Mazarguil wrote:
> > > As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> > > described in [3] (also pasted below), here is the first non-draft series
> > > for this new API.
> > > 
> > > Its capabilities are so generic that its name had to be vague, it may be
> > > called "Generic flow API", "Generic flow interface" (possibly shortened
> > > as "GFI") to refer to the name of the new filter type, or "rte_flow" from
> > > the prefix used for its public symbols. I personally favor the latter.
> > > 
> > > While it is currently meant to supersede existing filter types in order for
> > > all PMDs to expose a common filtering/classification interface, it may
> > > eventually evolve to cover the following ideas as well:
> > > 
> > > - Rx/Tx offloads configuration through automatic offloads for specific
> > >   packets, e.g. performing checksum on TCP packets could be expressed with
> > >   an egress rule with a TCP pattern and a kind of checksum action.
> > > 
> > > - RSS configuration (already defined actually). Could be global or per rule
> > >   depending on hardware capabilities.
> > > 
> > > - Switching configuration for devices with many physical ports; rules doing
> > >   both ingress and egress could even be used to completely bypass software
> > >   if supported by hardware.

Hi Adrien,

apologies for not replying for some time due to my winter vacation.

> Hi Simon,
> 
> > Hi Adrien,
> > 
> > thanks for this valuable work.
> > 
> > I would like to ask some high level questions on the proposal.
> > I apologise in advance if any of these questions are based on a
> > misunderstanding on my part.
> > 
> > * I am wondering about provisions for actions to modify packet data or
> >   metadata.  I do see support for marking packets. Is the implication of
> >   this that the main focus is to provide a mechanism for classification
> >   with the assumption that any actions - other than drop and variants of
> >   output - would be performed elsewhere?
> 
> I'm not sure I understand what you mean by "elsewhere" here. Packet marking
> as currently defined is a purely ingress action, i.e. HW matches some packet
> and returns a user-defined tag in related meta-data that the PMD copies to
> the appropriate mbuf structure field before returning it to the application.

By elsewhere I meant in the application, sorry for being unclear.

> There is provision for egress rules and I wrote down a few ideas describing
> how they could be useful (as above), however they remain to be defined.
> 
> >   If so I would observe that this seems somewhat limiting in the case of
> >   hardware that can perform a richer set of actions. And seems particularly
> >   limiting on egress as there doesn't seem anywhere else that other actions
> >   could be performed after classification is performed by this API.
> 
> A single flow rule may contain any number of distinct actions. For egress,
> it means you could wrap matching packets in VLAN and VXLAN at once.
> 
> If you wanted to perform the same action twice on matching packets, you'd
> have to provide two rules with defined priorities and use a non-terminating
> action for the first one:
> 
> - Rule with priority 0: match UDP -> add VLAN 42, passthrough
> - Rule with priority 1: match UDP -> add VLAN 64, terminating
> 
> This is how automatic QinQ would be defined for outgoing UDP packets.

Ok understood. I have two follow-up questions:

1. Is the "add VLAN" action included at this time? I was not able to find it.
2. Was consideration given to allowing multiple actions in a single rule?
   I see there would be some advantage to that if classification is
   expensive.

> > * I am curious to know what considerations have been given to
> >   support for tunnelling (encapsulation and decapsulation of e.g. VXLAN),
> >   tagging (pushing and popping e.g. VLANs), and labels (pushing or popping
> >   e.g. MPLS).
> > 
> >   Such features would seem useful for application of this work in a variety
> >   of situations including overlay networks and VNFs.
> 
> This is also what I had in mind and we'd only have to define specific
> ingress/egress actions for these. Currently rte_flow only implements a basic
> set of existing features from the legacy filtering framework, but is meant
> to be extended.

Thanks. I think that answers most of my questions: what I see as missing
in terms of actions can be added.

> > * I am wondering if any thought has gone into supporting matching on the
> >   n-th instance of a field that may appear more than once: e.g. VLAN tag.
> 
> Sure, please see the latest documentation [1] and testpmd examples [2].
> Pattern items being stacked in the same order as protocol layers, matching
> specific QinQ traffic and redirecting it to some queue could be expressed
> with something like:
> 
>  testpmd> flow create 0 ingress pattern eth / vlan vid is 64 / vlan vid is 42 / end 
>     actions queue 6 / end
> 
> Such a rule is translated as-is to rte_flow pattern items and action
> structures.

Thanks, I will look over that.

> > With the above questions in mind I am curious to know what use-cases
> > the proposal is targeted at.
> 
> Well, it should be easier to answer if you have a specific use-case in mind
> you would like to support but that cannot be expressed with the API as
> defined in [1], in which case please share it with the community.

A use-case would be implementing OvS DPIF flow offload using this API.

> [1] http://dpdk.org/ml/archives/dev/2016-December/052954.html
> [2] http://dpdk.org/ml/archives/dev/2016-December/052975.html


* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2017-01-04  9:53           ` Simon Horman
@ 2017-01-04 18:12             ` Adrien Mazarguil
  2017-01-04 19:34               ` John Fastabend
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2017-01-04 18:12 UTC (permalink / raw)
  To: Simon Horman; +Cc: dev

On Wed, Jan 04, 2017 at 10:53:50AM +0100, Simon Horman wrote:
> On Thu, Dec 22, 2016 at 01:48:04PM +0100, Adrien Mazarguil wrote:
> > On Wed, Dec 21, 2016 at 05:19:16PM +0100, Simon Horman wrote:
> > > On Fri, Dec 16, 2016 at 05:24:57PM +0100, Adrien Mazarguil wrote:
[...]
> > > I would like to ask some high level questions on the proposal.
> > > I apologise in advance if any of these questions are based on a
> > > misunderstanding on my part.
> > > 
> > > * I am wondering about provisions for actions to modify packet data or
> > >   metadata.  I do see support for marking packets. Is the implication of
> > >   this that the main focus is to provide a mechanism for classification
> > >   with the assumption that any actions - other than drop and variants of
> > >   output - would be performed elsewhere?
> > 
> > I'm not sure to understand what you mean by "elsewhere" here. Packet marking
> > as currently defined is a purely ingress action, i.e. HW matches some packet
> > and returns a user-defined tag in related meta-data that the PMD copies to
> > the appropriate mbuf structure field before returning it to the application.
> 
> By elsewhere I meant in the application, sorry for being unclear.

OK, then a high level definition would be that PMDs perform all actions that
are part of a flow rule, and applications are left to handle what they did not
explicitly request to be offloaded.

> > There is provision for egress rules and I wrote down a few ideas describing
> > how they could be useful (as above), however they remain to be defined.
> > 
> > >   If so I would observe that this seems somewhat limiting in the case of
> > >   hardware that can perform a richer set of actions. And seems particularly
> > >   limiting on egress as there doesn't seem anywhere else that other actions
> > >   could be performed after classification is performed by this API.
> > 
> > A single flow rule may contain any number of distinct actions. For egress,
> > it means you could wrap matching packets in VLAN and VXLAN at once.
> > 
> > If you wanted to perform the same action twice on matching packets, you'd
> > have to provide two rules with defined priorities and use a non-terminating
> > action for the first one:
> > 
> > - Rule with priority 0: match UDP -> add VLAN 42, passthrough
> > - Rule with priority 1: match UDP -> add VLAN 64, terminating
> > 
> > This is how automatic QinQ would be defined for outgoing UDP packets.
> 
> Ok understood. I have two follow-up questions:
> 
> 1. Is the "add VLAN" action included at this time? I was not able to find it.

It has not been defined yet. All egress offload actions remain to be
defined.

> 2. Was consideration given to allowing multiple actions in a single rule?
>    I see there would be some advantage to that if classification is
>    expensive.

Yes, it is supported as described in the documentation (now available
on-line since the series has been applied [3]).

What is not supported however is requesting the same action to be performed
multiple times in a single flow rule, because order is not guaranteed by
design. Actions may all happen simultaneously.

This scenario can be handled with multiple rules using priorities as in my
QinQ example above, or through a new action performing several things at
once in a defined order (e.g. a single "QinQ" action).

> > > * I am curious to know what considerations have been given to
> > >   support for tunnelling (encapsulation and decapsulation of e.g. VXLAN),
> > >   tagging (pushing and popping e.g. VLANs), and labels (pushing or popping
> > >   e.g. MPLS).
> > > 
> > >   Such features would seem useful for application of this work in a variety
> > >   of situations including overlay networks and VNFs.
> > 
> > This is also what I had in mind and we'd only have to define specific
> > ingress/egress actions for these. Currently rte_flow only implements a basic
> > set of existing features from the legacy filtering framework, but is meant
> > to be extended.
> 
> Thanks. I think that answers most of my questions: what I see as missing
> in terms of actions can be added.
> 
> > > * I am wondering if any thought has gone into supporting matching on the
> > >   n-th instance of a field that may appear more than once: e.g. VLAN tag.
> > 
> > Sure, please see the latest documentation [1] and testpmd examples [2].
> > Pattern items being stacked in the same order as protocol layers, matching
> > specific QinQ traffic and redirecting it to some queue could be expressed
> > with something like:
> > 
> >  testpmd> flow create 0 ingress pattern eth / vlan vid is 64 / vlan vid is 42 / end 
> >     actions queue 6 / end
> > 
> > Such a rule is translated as-is to rte_flow pattern items and action
> > structures.
> 
> Thanks, I will look over that.
> 
> > > With the above questions in mind I am curious to know what use-cases
> > > the proposal is targeted at.
> > 
> > Well, it should be easier to answer if you have a specific use-case in mind
> > you would like to support but that cannot be expressed with the API as
> > defined in [1], in which case please share it with the community.
> 
> A use-case would be implementing OvS DPIF flow offload using this API.

OK, OVS has been mentioned several times in this thread and my understanding
is that rte_flow seems to accommodate most of its needs according to people
familiar with it. Perhaps ML archives can answer the remaining questions you
may have about combining rte_flow with OVS.

> > [1] http://dpdk.org/ml/archives/dev/2016-December/052954.html
> > [2] http://dpdk.org/ml/archives/dev/2016-December/052975.html

[3] http://dpdk.org/doc/guides/prog_guide/rte_flow.html

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow)
  2017-01-04 18:12             ` Adrien Mazarguil
@ 2017-01-04 19:34               ` John Fastabend
  0 siblings, 0 replies; 262+ messages in thread
From: John Fastabend @ 2017-01-04 19:34 UTC (permalink / raw)
  To: Adrien Mazarguil, Simon Horman; +Cc: dev

[...]

>>> Well, it should be easier to answer if you have a specific use-case in mind
>>> you would like to support but that cannot be expressed with the API as
>>> defined in [1], in which case please share it with the community.
>>
>> A use-case would be implementing OvS DPIF flow offload using this API.
> 
> OK, OVS has been mentioned several times in this thread and my understanding
> is that rte_flow seems to accommodate most of its needs according to people
> familiar with it. Perhaps ML archives can answer the remaining questions you
> may have about combining rte_flow with OVS.

For what it's worth, I reviewed this and believe it should be sufficient
to support the OVS SR-IOV offload use case with action/classifier extensions
but without any fundamental changes to the design. We built a prototype
OVS offload on top of another API we dubbed Flow-API a year+ ago and there
seems to be a 1:1 mapping between that older API and the one now in DPDK
so I'm happy. And the missing things seem to fit nicely into extensions.

Also I believe the partial pre-classify use cases should be easily handled
as well although I'm not as familiar with the bit-level details of that
implementation.

At some point capability discovery will be useful, but we certainly don't need
it in the first iteration, and rte_flow doesn't preclude this type of extension,
so that is good.

By the way thanks for doing this work Adrien, glad to see it being accepted
and drivers picking it up.

Thanks,
John

> 
>>> [1] http://dpdk.org/ml/archives/dev/2016-December/052954.html
>>> [2] http://dpdk.org/ml/archives/dev/2016-December/052975.html
> 
> [3] http://dpdk.org/doc/guides/prog_guide/rte_flow.html
> 


* Re: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow command
  2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw " Adrien Mazarguil
@ 2017-05-11  6:53               ` Zhao1, Wei
  2017-05-12  9:12                 ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhao1, Wei @ 2017-05-11  6:53 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Xing, Beilei, Lu, Wenzhuo

Hi, Adrien

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Wednesday, December 21, 2016 10:52 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow
> command
> 
> Matches arbitrary byte strings with properties:
> 
> - relative: look for pattern after the previous item.
> - search: search pattern from offset (see also limit).
> - offset: absolute or relative offset for pattern.
> - limit: search area limit for start of pattern.
> - length: pattern length.
> - pattern: byte string to look for.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>
> ---
>  app/test-pmd/cmdline_flow.c | 208 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 208 insertions(+)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 0592969..c52a8f7 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -57,6 +57,8 @@ enum index {
>  	INTEGER,
>  	UNSIGNED,
>  	PREFIX,
> +	BOOLEAN,
> +	STRING,
>  	RULE_ID,
>  	PORT_ID,
>  	GROUP_ID,
> @@ -106,6 +108,12 @@ enum index {
>  	ITEM_VF_ID,
>  	ITEM_PORT,
>  	ITEM_PORT_INDEX,
> +	ITEM_RAW,
> +	ITEM_RAW_RELATIVE,
> +	ITEM_RAW_SEARCH,
> +	ITEM_RAW_OFFSET,
> +	ITEM_RAW_LIMIT,
> +	ITEM_RAW_PATTERN,
> 
>  	/* Validate/create actions. */
>  	ACTIONS,
> @@ -115,6 +123,13 @@ enum index {
>  	ACTION_PASSTHRU,
>  };
> 
> +/** Size of pattern[] field in struct rte_flow_item_raw. */
> +#define ITEM_RAW_PATTERN_SIZE 36
> +
> +/** Storage size for struct rte_flow_item_raw including pattern. */
> +#define ITEM_RAW_SIZE \
> +	(offsetof(struct rte_flow_item_raw, pattern) + ITEM_RAW_PATTERN_SIZE)

#define  ITEM_RAW_PATTERN_SIZE 36

The flex byte filter of the i350 NIC can accommodate patterns up to 128 bytes long.
What is the reason for defining this as 36? If it is the maximum pattern length,
maybe 128 would be more appropriate? Maybe I have not understood your purpose.

Thank you.

> +
>  /** Maximum number of subsequent tokens and arguments on the stack. */
>  #define CTX_STACK_SIZE 16
> 
> @@ -216,6 +231,13 @@ struct token {
>  		.size = sizeof(*((s *)0)->f), \
>  	})
> 
> +/** Static initializer for ARGS() with arbitrary size. */
> +#define ARGS_ENTRY_USZ(s, f, sz) \
> +	(&(const struct arg){ \
> +		.offset = offsetof(s, f), \
> +		.size = (sz), \
> +	})
> +
>  /** Parser output buffer layout expected by cmd_flow_parsed(). */
>  struct buffer {
>  	enum index command; /**< Flow command. */
> @@ -306,6 +328,7 @@ static const enum index next_item[] = {
>  	ITEM_PF,
>  	ITEM_VF,
>  	ITEM_PORT,
> +	ITEM_RAW,
>  	ZERO,
>  };
> 
> @@ -327,6 +350,16 @@ static const enum index item_port[] = {
>  	ZERO,
>  };
> 
> +static const enum index item_raw[] = {
> +	ITEM_RAW_RELATIVE,
> +	ITEM_RAW_SEARCH,
> +	ITEM_RAW_OFFSET,
> +	ITEM_RAW_LIMIT,
> +	ITEM_RAW_PATTERN,
> +	ITEM_NEXT,
> +	ZERO,
> +};
> +
>  static const enum index next_action[] = {
>  	ACTION_END,
>  	ACTION_VOID,
> @@ -363,11 +396,19 @@ static int parse_int(struct context *, const struct token *,
>  static int parse_prefix(struct context *, const struct token *,
>  			const char *, unsigned int,
>  			void *, unsigned int);
> +static int parse_boolean(struct context *, const struct token *,
> +			 const char *, unsigned int,
> +			 void *, unsigned int);
> +static int parse_string(struct context *, const struct token *,
> +			const char *, unsigned int,
> +			void *, unsigned int);
>  static int parse_port(struct context *, const struct token *,
>  		      const char *, unsigned int,
>  		      void *, unsigned int);
>  static int comp_none(struct context *, const struct token *,
>  		     unsigned int, char *, unsigned int);
> +static int comp_boolean(struct context *, const struct token *,
> +			unsigned int, char *, unsigned int);
>  static int comp_action(struct context *, const struct token *,
>  		       unsigned int, char *, unsigned int);
>  static int comp_port(struct context *, const struct token *,
> @@ -410,6 +451,20 @@ static const struct token token_list[] = {
>  		.call = parse_prefix,
>  		.comp = comp_none,
>  	},
> +	[BOOLEAN] = {
> +		.name = "{boolean}",
> +		.type = "BOOLEAN",
> +		.help = "any boolean value",
> +		.call = parse_boolean,
> +		.comp = comp_boolean,
> +	},
> +	[STRING] = {
> +		.name = "{string}",
> +		.type = "STRING",
> +		.help = "fixed string",
> +		.call = parse_string,
> +		.comp = comp_none,
> +	},
>  	[RULE_ID] = {
>  		.name = "{rule id}",
>  		.type = "RULE ID",
> @@ -654,6 +709,52 @@ static const struct token token_list[] = {
>  		.next = NEXT(item_port, NEXT_ENTRY(UNSIGNED), item_param),
>  		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_port, index)),
>  	},
> +	[ITEM_RAW] = {
> +		.name = "raw",
> +		.help = "match an arbitrary byte string",
> +		.priv = PRIV_ITEM(RAW, ITEM_RAW_SIZE),
> +		.next = NEXT(item_raw),
> +		.call = parse_vc,
> +	},
> +	[ITEM_RAW_RELATIVE] = {
> +		.name = "relative",
> +		.help = "look for pattern after the previous item",
> +		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
> +		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
> +					   relative, 1)),
> +	},
> +	[ITEM_RAW_SEARCH] = {
> +		.name = "search",
> +		.help = "search pattern from offset (see also limit)",
> +		.next = NEXT(item_raw, NEXT_ENTRY(BOOLEAN), item_param),
> +		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_raw,
> +					   search, 1)),
> +	},
> +	[ITEM_RAW_OFFSET] = {
> +		.name = "offset",
> +		.help = "absolute or relative offset for pattern",
> +		.next = NEXT(item_raw, NEXT_ENTRY(INTEGER), item_param),
> +		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, offset)),
> +	},
> +	[ITEM_RAW_LIMIT] = {
> +		.name = "limit",
> +		.help = "search area limit for start of pattern",
> +		.next = NEXT(item_raw, NEXT_ENTRY(UNSIGNED), item_param),
> +		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, limit)),
> +	},
> +	[ITEM_RAW_PATTERN] = {
> +		.name = "pattern",
> +		.help = "byte string to look for",
> +		.next = NEXT(item_raw,
> +			     NEXT_ENTRY(STRING),
> +			     NEXT_ENTRY(ITEM_PARAM_IS,
> +					ITEM_PARAM_SPEC,
> +					ITEM_PARAM_MASK)),
> +		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_raw, length),
> +			     ARGS_ENTRY_USZ(struct rte_flow_item_raw,
> +					    pattern,
> +					    ITEM_RAW_PATTERN_SIZE)),
> +	},
>  	/* Validate/create actions. */
>  	[ACTIONS] = {
>  		.name = "actions",
> @@ -1246,6 +1347,96 @@ parse_int(struct context *ctx, const struct token *token,
>  	return -1;
>  }
> 
> +/**
> + * Parse a string.
> + *
> + * Two arguments (ctx->args) are retrieved from the stack to store data and
> + * its length (in that order).
> + */
> +static int
> +parse_string(struct context *ctx, const struct token *token,
> +	     const char *str, unsigned int len,
> +	     void *buf, unsigned int size)
> +{
> +	const struct arg *arg_data = pop_args(ctx);
> +	const struct arg *arg_len = pop_args(ctx);
> +	char tmp[16]; /* Ought to be enough. */
> +	int ret;
> +
> +	/* Arguments are expected. */
> +	if (!arg_data)
> +		return -1;
> +	if (!arg_len) {
> +		push_args(ctx, arg_data);
> +		return -1;
> +	}
> +	size = arg_data->size;
> +	/* Bit-mask fill is not supported. */
> +	if (arg_data->mask || size < len)
> +		goto error;
> +	if (!ctx->object)
> +		return len;
> +	/* Let parse_int() fill length information first. */
> +	ret = snprintf(tmp, sizeof(tmp), "%u", len);
> +	if (ret < 0)
> +		goto error;
> +	push_args(ctx, arg_len);
> +	ret = parse_int(ctx, token, tmp, ret, NULL, 0);
> +	if (ret < 0) {
> +		pop_args(ctx);
> +		goto error;
> +	}
> +	buf = (uint8_t *)ctx->object + arg_data->offset;
> +	/* Output buffer is not necessarily NUL-terminated. */
> +	memcpy(buf, str, len);
> +	memset((uint8_t *)buf + len, 0x55, size - len);
> +	if (ctx->objmask)
> +		memset((uint8_t *)ctx->objmask + arg_data->offset, 0xff, len);
> +	return len;
> +error:
> +	push_args(ctx, arg_len);
> +	push_args(ctx, arg_data);
> +	return -1;
> +}
> +
> +/** Boolean values (even indices stand for false). */
> +static const char *const boolean_name[] = {
> +	"0", "1",
> +	"false", "true",
> +	"no", "yes",
> +	"N", "Y",
> +	NULL,
> +};
> +
> +/**
> + * Parse a boolean value.
> + *
> + * Last argument (ctx->args) is retrieved to determine storage size and
> + * location.
> + */
> +static int
> +parse_boolean(struct context *ctx, const struct token *token,
> +	      const char *str, unsigned int len,
> +	      void *buf, unsigned int size)
> +{
> +	const struct arg *arg = pop_args(ctx);
> +	unsigned int i;
> +	int ret;
> +
> +	/* Argument is expected. */
> +	if (!arg)
> +		return -1;
> +	for (i = 0; boolean_name[i]; ++i)
> +		if (!strncmp(str, boolean_name[i], len))
> +			break;
> +	/* Process token as integer. */
> +	if (boolean_name[i])
> +		str = i & 1 ? "1" : "0";
> +	push_args(ctx, arg);
> +	ret = parse_int(ctx, token, str, strlen(str), buf, size);
> +	return ret > 0 ? (int)len : ret;
> +}
> +
>  /** Parse port and update context. */
>  static int
>  parse_port(struct context *ctx, const struct token *token,
> @@ -1284,6 +1475,23 @@ comp_none(struct context *ctx, const struct token *token,
>  	return 0;
>  }
> 
> +/** Complete boolean values. */
> +static int
> +comp_boolean(struct context *ctx, const struct token *token,
> +	     unsigned int ent, char *buf, unsigned int size)
> +{
> +	unsigned int i;
> +
> +	(void)ctx;
> +	(void)token;
> +	for (i = 0; boolean_name[i]; ++i)
> +		if (buf && i == ent)
> +			return snprintf(buf, size, "%s", boolean_name[i]);
> +	if (buf)
> +		return -1;
> +	return i;
> +}
> +
>  /** Complete action names. */
>  static int
>  comp_action(struct context *ctx, const struct token *token,
> --
> 2.1.4


* Re: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow command
  2017-05-11  6:53               ` Zhao1, Wei
@ 2017-05-12  9:12                 ` Adrien Mazarguil
  2017-05-16  5:05                   ` Zhao1, Wei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2017-05-12  9:12 UTC (permalink / raw)
  To: Zhao1, Wei; +Cc: dev, Xing, Beilei, Lu, Wenzhuo

Hi Wei,

On Thu, May 11, 2017 at 06:53:52AM +0000, Zhao1, Wei wrote:
> Hi, Adrien
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Wednesday, December 21, 2016 10:52 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow
> > command
> > 
> > Matches arbitrary byte strings with properties:
> > 
> > - relative: look for pattern after the previous item.
> > - search: search pattern from offset (see also limit).
> > - offset: absolute or relative offset for pattern.
> > - limit: search area limit for start of pattern.
> > - length: pattern length.
> > - pattern: byte string to look for.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Acked-by: Olga Shern <olgas@mellanox.com>
[...]
> #define  ITEM_RAW_PATTERN_SIZE 36
> 
> The flex byte filter of the i350 NIC can accommodate patterns up to 128 bytes long.
> What is the reason for defining this as 36? If it is the maximum pattern length,
> maybe 128 would be more appropriate? Maybe I have not understood your purpose.
> 
> Thank you.

It's more or less an arbitrary compromise due to various limitations.

Once parsed, the result of an entire command is stored in a fixed buffer of
size CMDLINE_PARSE_RESULT_BUFSIZE (8192). Each parsed token ends up
somewhere in that buffer.

Each flow item always consumes sizeof(struct rte_flow_item) + sizeof(struct
rte_flow_item_xxx) * 3 (spec, last and mask) + alignment constraints.

For the raw item, this makes at least:

 (sizeof(rte_flow_item) +
  (sizeof(rte_flow_item_raw) + ITEM_RAW_PATTERN_SIZE) * 3)
 /* (32 + (12 + 36) * 3) => 176 bytes */

Because space is always consumed regardless of the size of the byte string
to match for implementation reasons, there is a chance to fill the buffer
too quickly with a larger ITEM_RAW_PATTERN_SIZE.
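
As a rough illustration of the arithmetic above, the following C sketch
computes the per-item footprint and how many raw items would fit in the
result buffer. The helper names are made up for this example, and structure
sizes are passed as parameters (32 and 12 are the figures quoted above)
rather than taken from the DPDK headers. Alignment padding is ignored, so the
result is a lower bound.

```c
#include <stddef.h>

/*
 * Parser buffer space consumed by one raw item: the item header plus three
 * copies (spec, last, mask) of the raw structure and its pattern storage.
 * Hypothetical helper; alignment constraints are not accounted for.
 */
static size_t
raw_item_footprint(size_t item_size, size_t raw_size, size_t pattern_size)
{
	return item_size + (raw_size + pattern_size) * 3;
}

/* Number of items of the given footprint that fit in a result buffer. */
static size_t
raw_items_per_buffer(size_t buf_size, size_t footprint)
{
	return footprint ? buf_size / footprint : 0;
}
```

With these inputs, raw_item_footprint(32, 12, 36) gives 176 bytes, so at most
46 raw items fit in 8192 bytes. With a 128-byte pattern the footprint grows
to 452 bytes and the count drops to 18, before any other tokens are stored.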

Also, this does not prevent users from specifying larger raw patterns (even
larger than 128) by combining them, e.g.:

 flow create 0
    pattern eth / raw relative is 1 pattern is foobar /
       raw relative is 1 pattern is barbaz / end
    actions queue index 42 / end

Such a pattern ends up matching a single "foobarbarbaz" string.
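
To illustrate the intended semantics only, here is a small C model of such a
chain. It assumes every raw item has relative set and a zero offset, and that
buf points at the first byte following the previous (non-raw) item. The
function name is hypothetical; this is not how any PMD or testpmd implements
matching.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the given patterns appear back to back starting at buf[0],
 * 0 otherwise. Models a chain of raw items with "relative is 1" and
 * offset 0. Hypothetical helper, not part of rte_flow.
 */
static int
match_raw_chain(const unsigned char *buf, size_t len,
		const char *const patterns[], size_t count)
{
	size_t pos = 0;
	size_t i;

	for (i = 0; i < count; ++i) {
		size_t plen = strlen(patterns[i]);

		if (plen > len - pos || memcmp(buf + pos, patterns[i], plen))
			return 0;
		pos += plen; /* The next item starts where this one ended. */
	}
	return 1;
}
```

Under this model, the two items "foobar" and "barbaz" match a payload
beginning with "foobarbarbaz" and reject one where anything sits between
them.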

To summarize, it is only due to testpmd limitations. Even without PMD
support for combination, the current ability to provide 36 bytes of raw data
to match per specified item is plenty to validate basic functionality. We'll
improve testpmd eventually.

-- 
Adrien Mazarguil
6WIND


* Re: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow command
  2017-05-12  9:12                 ` Adrien Mazarguil
@ 2017-05-16  5:05                   ` Zhao1, Wei
  0 siblings, 0 replies; 262+ messages in thread
From: Zhao1, Wei @ 2017-05-16  5:05 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Xing, Beilei, Lu, Wenzhuo

Hi,  Adrien Mazarguil

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Friday, May 12, 2017 5:13 PM
> To: Zhao1, Wei <wei.zhao1@intel.com>
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to flow
> command
> 
> Hi Wei,
> 
> On Thu, May 11, 2017 at 06:53:52AM +0000, Zhao1, Wei wrote:
> > Hi, Adrien
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien
> > > Mazarguil
> > > Sent: Wednesday, December 21, 2016 10:52 PM
> > > To: dev@dpdk.org
> > > Subject: [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw to
> > > flow command
> > >
> > > Matches arbitrary byte strings with properties:
> > >
> > > - relative: look for pattern after the previous item.
> > > - search: search pattern from offset (see also limit).
> > > - offset: absolute or relative offset for pattern.
> > > - limit: search area limit for start of pattern.
> > > - length: pattern length.
> > > - pattern: byte string to look for.
> > >
> > > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Acked-by: Olga Shern <olgas@mellanox.com>
> [...]
> > #define  ITEM_RAW_PATTERN_SIZE 36
> >
> > The flex byte filter of the i350 NIC can accommodate patterns up to 128 bytes long.
> > What is the reason for defining this as 36? If it is the maximum pattern length,
> > maybe 128 would be more appropriate? Maybe I have not understood your purpose.
> >
> > Thank you.
> 
> It's more or less an arbitrary compromise due to various limitations.
> 
> Once parsed, the result of an entire command is stored in a fixed buffer of
> size CMDLINE_PARSE_RESULT_BUFSIZE (8192). Each parsed token ends up
> somewhere in that buffer.
> 
> Each flow item always consumes sizeof(struct rte_flow_item) + sizeof(struct
> rte_flow_item_xxx) * 3 (spec, last and mask) + alignment constraints.
> 
> For the raw item, this makes at least:
> 
>  (sizeof(rte_flow_item) +
>   (sizeof(rte_flow_item_raw) + ITEM_RAW_PATTERN_SIZE) * 3)
>  /* (32 + (12 + 36) * 3) => 176 bytes */
> 
> Because space is always consumed regardless of the size of the byte string to
> match for implementation reasons, there is a chance to fill the buffer too
> quickly with a larger ITEM_RAW_PATTERN_SIZE.
> 
> Also, this does not prevent users from specifying larger raw patterns (even
> larger than 128) by combining them, e.g.:
> 
>  flow create 0
>     pattern eth / raw relative is 1 pattern is foobar /
>        raw relative is 1 pattern is barbaz / end
>     actions queue index 42 / end
> 
> Such a pattern ends up matching a single "foobarbarbaz" string.
> 
> To summarize, it is only due to testpmd limitations. Even without PMD
> support for combination, the current ability to provide 36 bytes of raw data
> to match per specified item is plenty to validate basic functionality. We'll
> improve testpmd eventually.
> 

Thank you for your detailed explanation.
The igb flex byte filter will support that kind of combination for raw items.
But this testpmd limitation will cause trouble for users and testers.

> --
> Adrien Mazarguil
> 6WIND


* Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API Adrien Mazarguil
@ 2017-05-23  6:07           ` Zhao1, Wei
  2017-05-23  9:50             ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhao1, Wei @ 2017-05-23  6:07 UTC (permalink / raw)
  To: Adrien Mazarguil, dev; +Cc: Xing, Beilei

Hi,  Adrien

> +struct rte_flow_item_raw {
> +	uint32_t relative:1; /**< Look for pattern after the previous item. */
> +	uint32_t search:1; /**< Search pattern from offset (see also limit). */
> +	uint32_t reserved:30; /**< Reserved, must be set to zero. */
> +	int32_t offset; /**< Absolute or relative offset for pattern. */
> +	uint16_t limit; /**< Search area limit for start of pattern. */
> +	uint16_t length; /**< Pattern length. */
> +	uint8_t pattern[]; /**< Byte string to look for. */
> +};

When I use this API to test the igb flex filter, I find that the pattern
member of struct rte_flow_item_raw does not hold what I intended.
For example, if I type "flow create 0 ingress pattern raw relative is 0 pattern is 0123 / end actions queue index 1 / end",
what I get in the NIC layer is pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats 124 times> }.
But what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}.
Regarding the format conversion for flex_filter, I referred to the testpmd function cmd_flex_filter_parsed(),
which shows the details of the conversion from ASCII code to data, for example:
            for (i = 0; i < len; i++) {
                        c = bytes_ptr[i];
                        if (isxdigit(c) == 0) {
                                    /* invalid characters. */
                                    printf("invalid input\n");
                                    return;
                        }
                        val = xdigit2val(c);
                        if (i % 2) {
                                    byte |= val;
                                    filter.bytes[j] = byte;
                                    printf("bytes[%d]:%02x ", j, filter.bytes[j]);
                                    j++;
                                    byte = 0;
                        } else
                                    byte |= val << 4;
            }  

and there is also a usage example in the DPDK document testpmd_app_ug-16.11.pdf
(it does not use ASCII code either):

testpmd> flex_filter 0 add len 16 bytes 0x00000000000000000000000008060000 \
mask 000C priority 3 queue 3

So, will our new generic flow API align with the old flex byte filter format in 17.08 or in the future?
At least, will the pattern member of struct rte_flow_item_raw use the same format as the old filter?
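
For reference, the conversion being asked about (hex digits to byte values)
can be sketched as a standalone C helper. This only illustrates the
difference between the two representations; the function names are
hypothetical and nothing here is part of rte_flow or testpmd.

```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int
hex_digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode an even-length string of hex digits into bytes.
 * Returns the number of bytes written, or -1 on invalid input or if the
 * output buffer is too small.
 */
static int
hex_to_bytes(const char *str, size_t len, unsigned char *out, size_t out_size)
{
	size_t i;

	if (len % 2 || len / 2 > out_size)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = hex_digit_value(str[i]);
		int lo = hex_digit_value(str[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i / 2] = (unsigned char)(hi << 4 | lo);
	}
	return (int)(len / 2);
}
```

Given the input "0123", this yields the two bytes 0x01 and 0x23, whereas
storing the same input as a plain string yields 0x30, 0x31, 0x32, 0x33.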


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, December 20, 2016 1:49 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
> 
> This new API supersedes all the legacy filter types described in rte_eth_ctrl.h.
> It is slightly higher level and as a result relies more on PMDs to process and
> validate flow rules.
> 
> Benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> Existing filter types will be deprecated and removed in the near future.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Acked-by: Olga Shern <olgas@mellanox.com>
> ---
>  MAINTAINERS                            |   4 +
>  doc/api/doxy-api-index.md              |   2 +
>  lib/librte_ether/Makefile              |   3 +
>  lib/librte_ether/rte_eth_ctrl.h        |   1 +
>  lib/librte_ether/rte_ether_version.map |  11 +
>  lib/librte_ether/rte_flow.c            | 159 +++++
>  lib/librte_ether/rte_flow.h            | 947 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_flow_driver.h     | 182 ++++++
>  8 files changed, 1309 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 26d9590..5975cff 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -243,6 +243,10 @@ M: Thomas Monjalon <thomas.monjalon@6wind.com>
>  F: lib/librte_ether/
>  F: scripts/test-null.sh
> 
> +Generic flow API
> +M: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> +F: lib/librte_ether/rte_flow*
> +
>  Crypto API
>  M: Declan Doherty <declan.doherty@intel.com>
>  F: lib/librte_cryptodev/
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index de65b4c..4951552 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -39,6 +39,8 @@ There are many libraries, so their headers may be grouped by topics:
>    [dev]                (@ref rte_dev.h),
>    [ethdev]             (@ref rte_ethdev.h),
>    [ethctrl]            (@ref rte_eth_ctrl.h),
> +  [rte_flow]           (@ref rte_flow.h),
> +  [rte_flow_driver]    (@ref rte_flow_driver.h),
>    [cryptodev]          (@ref rte_cryptodev.h),
>    [devargs]            (@ref rte_devargs.h),
>    [bond]               (@ref rte_eth_bond.h),
> diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
> index efe1e5f..9335361 100644
> --- a/lib/librte_ether/Makefile
> +++ b/lib/librte_ether/Makefile
> @@ -44,6 +44,7 @@ EXPORT_MAP := rte_ether_version.map
>  LIBABIVER := 5
> 
>  SRCS-y += rte_ethdev.c
> +SRCS-y += rte_flow.c
> 
>  #
>  # Export include files
> @@ -51,6 +52,8 @@ SRCS-y += rte_ethdev.c
>  SYMLINK-y-include += rte_ethdev.h
>  SYMLINK-y-include += rte_eth_ctrl.h
>  SYMLINK-y-include += rte_dev_info.h
> +SYMLINK-y-include += rte_flow.h
> +SYMLINK-y-include += rte_flow_driver.h
> 
>  # this lib depends upon:
>  DEPDIRS-y += lib/librte_net lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
> diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
> index fe80eb0..8386904 100644
> --- a/lib/librte_ether/rte_eth_ctrl.h
> +++ b/lib/librte_ether/rte_eth_ctrl.h
> @@ -99,6 +99,7 @@ enum rte_filter_type {
>  	RTE_ETH_FILTER_FDIR,
>  	RTE_ETH_FILTER_HASH,
>  	RTE_ETH_FILTER_L2_TUNNEL,
> +	RTE_ETH_FILTER_GENERIC,
>  	RTE_ETH_FILTER_MAX
>  };
> 
> diff --git a/lib/librte_ether/rte_ether_version.map b/lib/librte_ether/rte_ether_version.map
> index 72be66d..384cdee 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -147,3 +147,14 @@ DPDK_16.11 {
>  	rte_eth_dev_pci_remove;
> 
>  } DPDK_16.07;
> +
> +DPDK_17.02 {
> +	global:
> +
> +	rte_flow_validate;
> +	rte_flow_create;
> +	rte_flow_destroy;
> +	rte_flow_flush;
> +	rte_flow_query;
> +
> +} DPDK_16.11;
> diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> new file mode 100644
> index 0000000..d98fb1b
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow.c
> @@ -0,0 +1,159 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include "rte_ethdev.h"
> +#include "rte_flow_driver.h"
> +#include "rte_flow.h"
> +
> +/* Get generic flow operations structure from a port. */
> +const struct rte_flow_ops *
> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops;
> +	int code;
> +
> +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> +		code = ENODEV;
> +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> +			  dev->dev_ops->filter_ctrl(dev,
> +						    RTE_ETH_FILTER_GENERIC,
> +						    RTE_ETH_FILTER_GET,
> +						    &ops) ||
> +			  !ops))
> +		code = ENOSYS;
> +	else
> +		return ops;
> +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(code));
> +	return NULL;
> +}
> +
> +/* Check whether a flow rule can be created on a given port. */
> +int
> +rte_flow_validate(uint8_t port_id,
> +		  const struct rte_flow_attr *attr,
> +		  const struct rte_flow_item pattern[],
> +		  const struct rte_flow_action actions[],
> +		  struct rte_flow_error *error)
> +{
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->validate))
> +		return ops->validate(dev, attr, pattern, actions, error);
> +	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOSYS));
> +	return -rte_errno;
> +}
> +
> +/* Create a flow rule on a given port. */
> +struct rte_flow *
> +rte_flow_create(uint8_t port_id,
> +		const struct rte_flow_attr *attr,
> +		const struct rte_flow_item pattern[],
> +		const struct rte_flow_action actions[],
> +		struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return NULL;
> +	if (likely(!!ops->create))
> +		return ops->create(dev, attr, pattern, actions, error);
> +	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOSYS));
> +	return NULL;
> +}
> +
> +/* Destroy a flow rule on a given port. */
> +int
> +rte_flow_destroy(uint8_t port_id,
> +		 struct rte_flow *flow,
> +		 struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->destroy))
> +		return ops->destroy(dev, flow, error);
> +	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOSYS));
> +	return -rte_errno;
> +}
> +
> +/* Destroy all flow rules associated with a port. */
> +int
> +rte_flow_flush(uint8_t port_id,
> +	       struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops))
> +		return -rte_errno;
> +	if (likely(!!ops->flush))
> +		return ops->flush(dev, error);
> +	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOSYS));
> +	return -rte_errno;
> +}
> +
> +/* Query an existing flow rule. */
> +int
> +rte_flow_query(uint8_t port_id,
> +	       struct rte_flow *flow,
> +	       enum rte_flow_action_type action,
> +	       void *data,
> +	       struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (!ops)
> +		return -rte_errno;
> +	if (likely(!!ops->query))
> +		return ops->query(dev, flow, action, data, error);
> +	rte_flow_error_set(error, ENOSYS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +			   NULL, rte_strerror(ENOSYS));
> +	return -rte_errno;
> +}
> diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
> new file mode 100644
> index 0000000..98084ac
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow.h
> @@ -0,0 +1,947 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef RTE_FLOW_H_
> +#define RTE_FLOW_H_
> +
> +/**
> + * @file
> + * RTE generic flow API
> + *
> + * This interface provides the ability to program packet matching and
> + * associated actions in hardware through flow rules.
> + */
> +
> +#include <rte_arp.h>
> +#include <rte_ether.h>
> +#include <rte_icmp.h>
> +#include <rte_ip.h>
> +#include <rte_sctp.h>
> +#include <rte_tcp.h>
> +#include <rte_udp.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Flow rule attributes.
> + *
> + * Priorities are set on two levels: per group and per rule within groups.
> + *
> + * Lower values denote higher priority, the highest priority for both levels
> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> + * rule with priority 8 in group 0.
> + *
> + * Although optional, applications are encouraged to group similar rules as
> + * much as possible to fully take advantage of hardware capabilities
> + * (e.g. optimized matching) and work around limitations (e.g. a single
> + * pattern type possibly allowed in a given group).
> + *
> + * Group and priority levels are arbitrary and up to the application, they
> + * do not need to be contiguous nor start from 0, however the maximum number
> + * varies between devices and may be affected by existing flow rules.
> + *
> + * If a packet is matched by several rules of a given group for a given
> + * priority level, the outcome is undefined. It can take any path, may be
> + * duplicated or even cause unrecoverable errors.
> + *
> + * Note that support for more than a single group and priority level is not
> + * guaranteed.
> + *
> + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
> + *
> + * Several pattern items and actions are valid and can be used in both
> + * directions. Those valid for only one direction are described as such.
> + *
> + * At least one direction must be specified.
> + *
> + * Specifying both directions at once for a given rule is not recommended
> + * but may be valid in a few cases (e.g. shared counter).
> + */
> +struct rte_flow_attr {
> +	uint32_t group; /**< Priority group. */
> +	uint32_t priority; /**< Priority level within group. */
> +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
> +	uint32_t egress:1; /**< Rule applies to egress traffic. */
> +	uint32_t reserved:30; /**< Reserved, must be zero. */
> +};
> +
> +/**
> + * Matching pattern item types.
> + *
> + * Pattern items fall in two categories:
> + *
> + * - Matching protocol headers and packet data (ANY, RAW, ETH, VLAN, IPV4,
> + *   IPV6, ICMP, UDP, TCP, SCTP, VXLAN and so on), usually associated with a
> + *   specification structure. These must be stacked in the same order as the
> + *   protocol layers to match, starting from the lowest.
> + *
> + * - Matching meta-data or affecting pattern processing (END, VOID, INVERT,
> + *   PF, VF, PORT and so on), often without a specification structure. Since
> + *   they do not match packet contents, these can be specified anywhere
> + *   within item lists without affecting others.
> + *
> + * See the description of individual types for more information. Those
> + * marked with [META] fall into the second category.
> + */
> +enum rte_flow_item_type {
> +	/**
> +	 * [META]
> +	 *
> +	 * End marker for item lists. Prevents further processing of items,
> +	 * thereby ending the pattern.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_END,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Used as a placeholder for convenience. It is ignored and simply
> +	 * discarded by PMDs.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VOID,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Inverted matching, i.e. process packets that do not match the
> +	 * pattern.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_INVERT,
> +
> +	/**
> +	 * Matches any protocol in place of the current layer, a single ANY
> +	 * may also stand for several protocol layers.
> +	 *
> +	 * See struct rte_flow_item_any.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ANY,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets addressed to the physical function of the device.
> +	 *
> +	 * If the underlying device function differs from the one that would
> +	 * normally receive the matched traffic, specifying this item
> +	 * prevents it from reaching that device unless the flow rule
> +	 * contains a PF action. Packets are not duplicated between device
> +	 * instances by default.
> +	 *
> +	 * No associated specification structure.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_PF,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets addressed to a virtual function ID of the device.
> +	 *
> +	 * If the underlying device function differs from the one that would
> +	 * normally receive the matched traffic, specifying this item
> +	 * prevents it from reaching that device unless the flow rule
> +	 * contains a VF action. Packets are not duplicated between device
> +	 * instances by default.
> +	 *
> +	 * See struct rte_flow_item_vf.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VF,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Matches packets coming from the specified physical port of the
> +	 * underlying device.
> +	 *
> +	 * The first PORT item overrides the physical port normally
> +	 * associated with the specified DPDK input port (port_id). This
> +	 * item can be provided several times to match additional physical
> +	 * ports.
> +	 *
> +	 * See struct rte_flow_item_port.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_PORT,
> +
> +	/**
> +	 * Matches a byte string of a given length at a given offset.
> +	 *
> +	 * See struct rte_flow_item_raw.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_RAW,
> +
> +	/**
> +	 * Matches an Ethernet header.
> +	 *
> +	 * See struct rte_flow_item_eth.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ETH,
> +
> +	/**
> +	 * Matches an 802.1Q/ad VLAN tag.
> +	 *
> +	 * See struct rte_flow_item_vlan.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VLAN,
> +
> +	/**
> +	 * Matches an IPv4 header.
> +	 *
> +	 * See struct rte_flow_item_ipv4.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_IPV4,
> +
> +	/**
> +	 * Matches an IPv6 header.
> +	 *
> +	 * See struct rte_flow_item_ipv6.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_IPV6,
> +
> +	/**
> +	 * Matches an ICMP header.
> +	 *
> +	 * See struct rte_flow_item_icmp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_ICMP,
> +
> +	/**
> +	 * Matches a UDP header.
> +	 *
> +	 * See struct rte_flow_item_udp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_UDP,
> +
> +	/**
> +	 * Matches a TCP header.
> +	 *
> +	 * See struct rte_flow_item_tcp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_TCP,
> +
> +	/**
> +	 * Matches a SCTP header.
> +	 *
> +	 * See struct rte_flow_item_sctp.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_SCTP,
> +
> +	/**
> +	 * Matches a VXLAN header.
> +	 *
> +	 * See struct rte_flow_item_vxlan.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_VXLAN,
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ANY
> + *
> + * Matches any protocol in place of the current layer, a single ANY may also
> + * stand for several protocol layers.
> + *
> + * This is usually specified as the first pattern item when looking for a
> + * protocol anywhere in a packet.
> + *
> + * A zeroed mask stands for any number of layers.
> + */
> +struct rte_flow_item_any {
> +	uint32_t num; /* Number of layers covered. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VF
> + *
> + * Matches packets addressed to a virtual function ID of the device.
> + *
> + * If the underlying device function differs from the one that would
> + * normally receive the matched traffic, specifying this item prevents it
> + * from reaching that device unless the flow rule contains a VF
> + * action. Packets are not duplicated between device instances by default.
> + *
> + * - Likely to return an error or never match any traffic if this causes a
> + *   VF device to match traffic addressed to a different VF.
> + * - Can be specified multiple times to match traffic addressed to several
> + *   VF IDs.
> + * - Can be combined with a PF item to match both PF and VF traffic.
> + *
> + * A zeroed mask can be used to match any VF ID.
> + */
> +struct rte_flow_item_vf {
> +	uint32_t id; /**< Destination VF ID. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_PORT
> + *
> + * Matches packets coming from the specified physical port of the underlying
> + * device.
> + *
> + * The first PORT item overrides the physical port normally associated with
> + * the specified DPDK input port (port_id). This item can be provided
> + * several times to match additional physical ports.
> + *
> + * Note that physical ports are not necessarily tied to DPDK input ports
> + * (port_id) when those are not under DPDK control. Possible values are
> + * specific to each device, they are not necessarily indexed from zero and
> + * may not be contiguous.
> + *
> + * As a device property, the list of allowed values as well as the value
> + * associated with a port_id should be retrieved by other means.
> + *
> + * A zeroed mask can be used to match any port index.
> + */
> +struct rte_flow_item_port {
> +	uint32_t index; /**< Physical port index. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_RAW
> + *
> + * Matches a byte string of a given length at a given offset.
> + *
> + * Offset is either absolute (using the start of the packet) or relative to
> + * the end of the previous matched item in the stack, in which case negative
> + * values are allowed.
> + *
> + * If search is enabled, offset is used as the starting point. The search
> + * area can be delimited by setting limit to a nonzero value, which is the
> + * maximum number of bytes after offset where the pattern may start.
> + *
> + * Matching a zero-length pattern is allowed, doing so resets the relative
> + * offset for subsequent items.
> + *
> + * This type does not support ranges (struct rte_flow_item.last).
> + */
> +struct rte_flow_item_raw {
> +	uint32_t relative:1; /**< Look for pattern after the previous item. */
> +	uint32_t search:1; /**< Search pattern from offset (see also limit). */
> +	uint32_t reserved:30; /**< Reserved, must be set to zero. */
> +	int32_t offset; /**< Absolute or relative offset for pattern. */
> +	uint16_t limit; /**< Search area limit for start of pattern. */
> +	uint16_t length; /**< Pattern length. */
> +	uint8_t pattern[]; /**< Byte string to look for. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ETH
> + *
> + * Matches an Ethernet header.
> + */
> +struct rte_flow_item_eth {
> +	struct ether_addr dst; /**< Destination MAC. */
> +	struct ether_addr src; /**< Source MAC. */
> +	uint16_t type; /**< EtherType. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VLAN
> + *
> + * Matches an 802.1Q/ad VLAN tag.
> + *
> + * This type normally follows either RTE_FLOW_ITEM_TYPE_ETH or
> + * RTE_FLOW_ITEM_TYPE_VLAN.
> + */
> +struct rte_flow_item_vlan {
> +	uint16_t tpid; /**< Tag protocol identifier. */
> +	uint16_t tci; /**< Tag control information. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_IPV4
> + *
> + * Matches an IPv4 header.
> + *
> + * Note: IPv4 options are handled by dedicated pattern items.
> + */
> +struct rte_flow_item_ipv4 {
> +	struct ipv4_hdr hdr; /**< IPv4 header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_IPV6.
> + *
> + * Matches an IPv6 header.
> + *
> + * Note: IPv6 options are handled by dedicated pattern items.
> + */
> +struct rte_flow_item_ipv6 {
> +	struct ipv6_hdr hdr; /**< IPv6 header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_ICMP.
> + *
> + * Matches an ICMP header.
> + */
> +struct rte_flow_item_icmp {
> +	struct icmp_hdr hdr; /**< ICMP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_UDP.
> + *
> + * Matches a UDP header.
> + */
> +struct rte_flow_item_udp {
> +	struct udp_hdr hdr; /**< UDP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_TCP.
> + *
> + * Matches a TCP header.
> + */
> +struct rte_flow_item_tcp {
> +	struct tcp_hdr hdr; /**< TCP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_SCTP.
> + *
> + * Matches a SCTP header.
> + */
> +struct rte_flow_item_sctp {
> +	struct sctp_hdr hdr; /**< SCTP header definition. */
> +};
> +
> +/**
> + * RTE_FLOW_ITEM_TYPE_VXLAN.
> + *
> + * Matches a VXLAN header (RFC 7348).
> + */
> +struct rte_flow_item_vxlan {
> +	uint8_t flags; /**< Normally 0x08 (I flag). */
> +	uint8_t rsvd0[3]; /**< Reserved, normally 0x000000. */
> +	uint8_t vni[3]; /**< VXLAN identifier. */
> +	uint8_t rsvd1; /**< Reserved, normally 0x00. */
> +};
> +
> +/**
> + * Matching pattern item definition.
> + *
> + * A pattern is formed by stacking items starting from the lowest protocol
> + * layer to match. This stacking restriction does not apply to meta items
> + * which can be placed anywhere in the stack without affecting the meaning
> + * of the resulting pattern.
> + *
> + * Patterns are terminated by END items.
> + *
> + * The spec field should be a valid pointer to a structure of the related
> + * item type. It may be set to NULL in many cases to use default values.
> + *
> + * Optionally, last can point to a structure of the same type to define an
> + * inclusive range. This is mostly supported by integer and address fields,
> + * may cause errors otherwise. Fields that do not support ranges must be set
> + * to 0 or to the same value as the corresponding fields in spec.
> + *
> + * By default all fields present in spec are considered relevant (see note
> + * below). This behavior can be altered by providing a mask structure of the
> + * same type with applicable bits set to one. It can also be used to
> + * partially filter out specific fields (e.g. as an alternate mean to match
> + * ranges of IP addresses).
> + *
> + * Mask is a simple bit-mask applied before interpreting the contents of
> + * spec and last, which may yield unexpected results if not used
> + * carefully. For example, if for an IPv4 address field, spec provides
> + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
> + * effective range becomes 10.1.0.0 to 10.3.255.255.
> + *
> + * Note: the defaults for data-matching items such as IPv4 when mask is not
> + * specified actually depend on the underlying implementation since only
> + * recognized fields can be taken into account.
> + */
> +struct rte_flow_item {
> +	enum rte_flow_item_type type; /**< Item type. */
> +	const void *spec; /**< Pointer to item specification structure. */
> +	const void *last; /**< Defines an inclusive range (spec to last). */
> +	const void *mask; /**< Bit-mask applied to spec and last. */
> +};
> +
> +/**
> + * Action types.
> + *
> + * Each possible action is represented by a type. Some have associated
> + * configuration structures. Several actions combined in a list can be
> + * affected to a flow rule. That list is not ordered.
> + *
> + * They fall in three categories:
> + *
> + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> + *   processing matched packets by subsequent flow rules, unless overridden
> + *   with PASSTHRU.
> + *
> + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
> + *   for additional processing by subsequent flow rules.
> + *
> + * - Other non terminating meta actions that do not affect the fate of
> + *   packets (END, VOID, MARK, FLAG, COUNT).
> + *
> + * When several actions are combined in a flow rule, they should all have
> + * different types (e.g. dropping a packet twice is not possible).
> + *
> + * Only the last action of a given type is taken into account. PMDs still
> + * perform error checking on the entire list.
> + *
> + * Note that PASSTHRU is the only action able to override a terminating
> + * rule.
> + */
> +enum rte_flow_action_type {
> +	/**
> +	 * [META]
> +	 *
> +	 * End marker for action lists. Prevents further processing of
> +	 * actions, thereby ending the list.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_END,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Used as a placeholder for convenience. It is ignored and simply
> +	 * discarded by PMDs.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_VOID,
> +
> +	/**
> +	 * Leaves packets up for additional processing by subsequent flow
> +	 * rules. This is the default when a rule does not contain a
> +	 * terminating action, but can be specified to force a rule to
> +	 * become non-terminating.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_PASSTHRU,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Attaches a 32 bit value to packets.
> +	 *
> +	 * See struct rte_flow_action_mark.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_MARK,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Flag packets. Similar to MARK but only affects ol_flags.
> +	 *
> +	 * Note: a distinctive flag must be defined for it.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_FLAG,
> +
> +	/**
> +	 * Assigns packets to a given queue index.
> +	 *
> +	 * See struct rte_flow_action_queue.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_QUEUE,
> +
> +	/**
> +	 * Drops packets.
> +	 *
> +	 * PASSTHRU overrides this action if both are specified.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_DROP,
> +
> +	/**
> +	 * [META]
> +	 *
> +	 * Enables counters for this rule.
> +	 *
> +	 * These counters can be retrieved and reset through rte_flow_query(),
> +	 * see struct rte_flow_query_count.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_COUNT,
> +
> +	/**
> +	 * Duplicates packets to a given queue index.
> +	 *
> +	 * This is normally combined with QUEUE, however when used alone, it
> +	 * is actually similar to QUEUE + PASSTHRU.
> +	 *
> +	 * See struct rte_flow_action_dup.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_DUP,
> +
> +	/**
> +	 * Similar to QUEUE, except RSS is additionally performed on packets
> +	 * to spread them among several queues according to the provided
> +	 * parameters.
> +	 *
> +	 * See struct rte_flow_action_rss.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_RSS,
> +
> +	/**
> +	 * Redirects packets to the physical function (PF) of the current
> +	 * device.
> +	 *
> +	 * No associated configuration structure.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_PF,
> +
> +	/**
> +	 * Redirects packets to the virtual function (VF) of the current
> +	 * device with the specified ID.
> +	 *
> +	 * See struct rte_flow_action_vf.
> +	 */
> +	RTE_FLOW_ACTION_TYPE_VF,
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_MARK
> + *
> + * Attaches a 32 bit value to packets.
> + *
> + * This value is arbitrary and application-defined. For compatibility with
> + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
> + * also set in ol_flags.
> + */
> +struct rte_flow_action_mark {
> +	uint32_t id; /**< 32 bit value to return with packets. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_QUEUE
> + *
> + * Assign packets to a given queue index.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_queue {
> +	uint16_t index; /**< Queue index to use. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_COUNT (query)
> + *
> + * Query structure to retrieve and reset flow rule counters.
> + */
> +struct rte_flow_query_count {
> +	uint32_t reset:1; /**< Reset counters after query [in]. */
> +	uint32_t hits_set:1; /**< hits field is set [out]. */
> +	uint32_t bytes_set:1; /**< bytes field is set [out]. */
> +	uint32_t reserved:29; /**< Reserved, must be zero [in, out]. */
> +	uint64_t hits; /**< Number of hits for this rule [out]. */
> +	uint64_t bytes; /**< Number of bytes through this rule [out]. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_DUP
> + *
> + * Duplicates packets to a given queue index.
> + *
> + * This is normally combined with QUEUE, however when used alone, it is
> + * actually similar to QUEUE + PASSTHRU.
> + *
> + * Non-terminating by default.
> + */
> +struct rte_flow_action_dup {
> +	uint16_t index; /**< Queue index to duplicate packets to. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_RSS
> + *
> + * Similar to QUEUE, except RSS is additionally performed on packets to
> + * spread them among several queues according to the provided parameters.
> + *
> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> + * however it conflicts with the MARK action as they share the same
> + * space. When both actions are specified, the RSS hash is discarded and
> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> + * structure should eventually evolve to store both.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_rss {
> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> +	uint16_t num; /**< Number of entries in queue[]. */
> +	uint16_t queue[]; /**< Queues indices to use. */
> +};
> +
> +/**
> + * RTE_FLOW_ACTION_TYPE_VF
> + *
> + * Redirects packets to a virtual function (VF) of the current device.
> + *
> + * Packets matched by a VF pattern item can be redirected to their original
> + * VF ID instead of the specified one. This parameter may not be available
> + * and is not guaranteed to work properly if the VF part is matched by a
> + * prior flow rule or if packets are not addressed to a VF in the first
> + * place.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_vf {
> +	uint32_t original:1; /**< Use original VF ID if possible. */
> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> +	uint32_t id; /**< VF ID to redirect packets to. */
> +};
> +
> +/**
> + * Definition of a single action.
> + *
> + * A list of actions is terminated by a END action.
> + *
> + * For simple actions without a configuration structure, conf remains NULL.
> + */
> +struct rte_flow_action {
> +	enum rte_flow_action_type type; /**< Action type. */
> +	const void *conf; /**< Pointer to action configuration structure. */
> +};
> +
> +/**
> + * Opaque type returned after successfully creating a flow.
> + *
> + * This handle can be used to manage and query the related flow (e.g. to
> + * destroy it or retrieve counters).
> + */
> +struct rte_flow;
> +
> +/**
> + * Verbose error types.
> + *
> + * Most of them provide the type of the object referenced by struct
> + * rte_flow_error.cause.
> + */
> +enum rte_flow_error_type {
> +	RTE_FLOW_ERROR_TYPE_NONE, /**< No error. */
> +	RTE_FLOW_ERROR_TYPE_UNSPECIFIED, /**< Cause unspecified. */
> +	RTE_FLOW_ERROR_TYPE_HANDLE, /**< Flow rule (handle). */
> +	RTE_FLOW_ERROR_TYPE_ATTR_GROUP, /**< Group field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, /**< Priority field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, /**< Ingress field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, /**< Egress field. */
> +	RTE_FLOW_ERROR_TYPE_ATTR, /**< Attributes structure. */
> +	RTE_FLOW_ERROR_TYPE_ITEM_NUM, /**< Pattern length. */
> +	RTE_FLOW_ERROR_TYPE_ITEM, /**< Specific pattern item. */
> +	RTE_FLOW_ERROR_TYPE_ACTION_NUM, /**< Number of actions. */
> +	RTE_FLOW_ERROR_TYPE_ACTION, /**< Specific action. */
> +};
> +
> +/**
> + * Verbose error structure definition.
> + *
> + * This object is normally allocated by applications and set by PMDs, the
> + * message points to a constant string which does not need to be freed by
> + * the application, however its pointer can be considered valid only as long
> + * as its associated DPDK port remains configured. Closing the underlying
> + * device or unloading the PMD invalidates it.
> + *
> + * Both cause and message may be NULL regardless of the error type.
> + */
> +struct rte_flow_error {
> +	enum rte_flow_error_type type; /**< Cause field and error types. */
> +	const void *cause; /**< Object responsible for the error. */
> +	const char *message; /**< Human-readable error message. */
> +};
> +
> +/**
> + * Check whether a flow rule can be created on a given port.
> + *
> + * While this function has no effect on the target device, the flow rule is
> + * validated against its current configuration state and the returned value
> + * should be considered valid by the caller for that state only.
> + *
> + * The returned value is guaranteed to remain valid only as long as no
> + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
> + * the meantime and no device parameter affecting flow rules in any way are
> + * modified, due to possible collisions or resource limitations (although in
> + * such cases EINVAL should not be returned).
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 if flow rule is valid and can be created. A negative errno value
> + *   otherwise (rte_errno is also set), the following errors are defined:
> + *
> + *   -ENOSYS: underlying device does not support this functionality.
> + *
> + *   -EINVAL: unknown or invalid rule specification.
> + *
> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> + *   bit-masks are unsupported).
> + *
> + *   -EEXIST: collision with an existing rule.
> + *
> + *   -ENOMEM: not enough resources.
> + *
> + *   -EBUSY: action cannot be performed due to busy device resources, may
> + *   succeed if the affected queues or even the entire port are in a stopped
> + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
> + */
> +int
> +rte_flow_validate(uint8_t port_id,
> +		  const struct rte_flow_attr *attr,
> +		  const struct rte_flow_item pattern[],
> +		  const struct rte_flow_action actions[],
> +		  struct rte_flow_error *error);
> +
> +/**
> + * Create a flow rule on a given port.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   A valid handle in case of success, NULL otherwise and rte_errno is set
> + *   to the positive version of one of the error codes defined for
> + *   rte_flow_validate().
> + */
> +struct rte_flow *
> +rte_flow_create(uint8_t port_id,
> +		const struct rte_flow_attr *attr,
> +		const struct rte_flow_item pattern[],
> +		const struct rte_flow_action actions[],
> +		struct rte_flow_error *error);
> +
> +/**
> + * Destroy a flow rule on a given port.
> + *
> + * Failure to destroy a flow rule handle may occur when other flow rules
> + * depend on it, and destroying it would result in an inconsistent state.
> + *
> + * This function is only guaranteed to succeed if handles are destroyed in
> + * reverse order of their creation.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param flow
> + *   Flow rule handle to destroy.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_destroy(uint8_t port_id,
> +		 struct rte_flow *flow,
> +		 struct rte_flow_error *error);
> +
> +/**
> + * Destroy all flow rules associated with a port.
> + *
> + * In the unlikely event of failure, handles are still considered destroyed
> + * and no longer valid but the port must be assumed to be in an inconsistent
> + * state.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_flush(uint8_t port_id,
> +	       struct rte_flow_error *error);
> +
> +/**
> + * Query an existing flow rule.
> + *
> + * This function allows retrieving flow-specific data such as counters.
> + * Data is gathered by special actions which must be present in the flow
> + * rule definition.
> + *
> + * \see RTE_FLOW_ACTION_TYPE_COUNT
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param flow
> + *   Flow rule handle to query.
> + * @param action
> + *   Action type to query.
> + * @param[in, out] data
> + *   Pointer to storage for the associated query data type.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +rte_flow_query(uint8_t port_id,
> +	       struct rte_flow *flow,
> +	       enum rte_flow_action_type action,
> +	       void *data,
> +	       struct rte_flow_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_FLOW_H_ */
> diff --git a/lib/librte_ether/rte_flow_driver.h b/lib/librte_ether/rte_flow_driver.h
> new file mode 100644
> index 0000000..274562c
> --- /dev/null
> +++ b/lib/librte_ether/rte_flow_driver.h
> @@ -0,0 +1,182 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 6WIND S.A.
> + *   Copyright 2016 Mellanox.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of 6WIND S.A. nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef RTE_FLOW_DRIVER_H_
> +#define RTE_FLOW_DRIVER_H_
> +
> +/**
> + * @file
> + * RTE generic flow API (driver side)
> + *
> + * This file provides implementation helpers for internal use by PMDs, they
> + * are not intended to be exposed to applications and are not subject to ABI
> + * versioning.
> + */
> +
> +#include <stdint.h>
> +
> +#include <rte_errno.h>
> +#include <rte_ethdev.h>
> +#include "rte_flow.h"
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * Generic flow operations structure implemented and returned by PMDs.
> + *
> + * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
> + * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
> + * as the RTE_ETH_FILTER_GET filter operation.
> + *
> + * If successful, this operation must result in a pointer to a PMD-specific
> + * struct rte_flow_ops written to the argument address as described below:
> + *
> + * \code
> + *
> + * // PMD filter_ctrl callback
> + *
> + * static const struct rte_flow_ops pmd_flow_ops = { ... };
> + *
> + * switch (filter_type) {
> + * case RTE_ETH_FILTER_GENERIC:
> + *     if (filter_op != RTE_ETH_FILTER_GET)
> + *         return -EINVAL;
> + *     *(const void **)arg = &pmd_flow_ops;
> + *     return 0;
> + * }
> + *
> + * \endcode
> + *
> + * See also rte_flow_ops_get().
> + *
> + * These callback functions are not supposed to be used by applications
> + * directly, which must rely on the API defined in rte_flow.h.
> + *
> + * Public-facing wrapper functions perform a few consistency checks so that
> + * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
> + * callbacks otherwise only differ by their first argument (with port ID
> + * already resolved to a pointer to struct rte_eth_dev).
> + */
> +struct rte_flow_ops {
> +	/** See rte_flow_validate(). */
> +	int (*validate)
> +		(struct rte_eth_dev *,
> +		 const struct rte_flow_attr *,
> +		 const struct rte_flow_item [],
> +		 const struct rte_flow_action [],
> +		 struct rte_flow_error *);
> +	/** See rte_flow_create(). */
> +	struct rte_flow *(*create)
> +		(struct rte_eth_dev *,
> +		 const struct rte_flow_attr *,
> +		 const struct rte_flow_item [],
> +		 const struct rte_flow_action [],
> +		 struct rte_flow_error *);
> +	/** See rte_flow_destroy(). */
> +	int (*destroy)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow *,
> +		 struct rte_flow_error *);
> +	/** See rte_flow_flush(). */
> +	int (*flush)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow_error *);
> +	/** See rte_flow_query(). */
> +	int (*query)
> +		(struct rte_eth_dev *,
> +		 struct rte_flow *,
> +		 enum rte_flow_action_type,
> +		 void *,
> +		 struct rte_flow_error *);
> +};
> +
> +/**
> + * Initialize generic flow error structure.
> + *
> + * This function also sets rte_errno to a given value.
> + *
> + * @param[out] error
> + *   Pointer to flow error structure (may be NULL).
> + * @param code
> + *   Related error code (rte_errno).
> + * @param type
> + *   Cause field and error types.
> + * @param cause
> + *   Object responsible for the error.
> + * @param message
> + *   Human-readable error message.
> + *
> + * @return
> + *   Pointer to flow error structure.
> + */
> +static inline struct rte_flow_error *
> +rte_flow_error_set(struct rte_flow_error *error,
> +		   int code,
> +		   enum rte_flow_error_type type,
> +		   const void *cause,
> +		   const char *message)
> +{
> +	if (error) {
> +		*error = (struct rte_flow_error){
> +			.type = type,
> +			.cause = cause,
> +			.message = message,
> +		};
> +	}
> +	rte_errno = code;
> +	return error;
> +}
> +
> +/**
> + * Get generic flow operations structure from a port.
> + *
> + * @param port_id
> + *   Port identifier to query.
> + * @param[out] error
> + *   Pointer to flow error structure.
> + *
> + * @return
> + *   The flow operations structure associated with port_id, NULL in case of
> + *   error, in which case rte_errno is set and the error structure contains
> + *   additional details.
> + */
> +const struct rte_flow_ops *
> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_FLOW_DRIVER_H_ */
> --
> 2.1.4

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2017-05-23  6:07           ` Zhao1, Wei
@ 2017-05-23  9:50             ` Adrien Mazarguil
  2017-05-24  3:32               ` Zhao1, Wei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2017-05-23  9:50 UTC (permalink / raw)
  To: Zhao1, Wei; +Cc: dev, Xing, Beilei

Hi Wei,

On Tue, May 23, 2017 at 06:07:20AM +0000, Zhao1, Wei wrote:
> Hi,  Adrien
> 
> > +struct rte_flow_item_raw {
> > +	uint32_t relative:1; /**< Look for pattern after the previous item. */
> > +	uint32_t search:1; /**< Search pattern from offset (see also limit). */
> > +	uint32_t reserved:30; /**< Reserved, must be set to zero. */
> > +	int32_t offset; /**< Absolute or relative offset for pattern. */
> > +	uint16_t limit; /**< Search area limit for start of pattern. */
> > +	uint16_t length; /**< Pattern length. */
> > +	uint8_t pattern[]; /**< Byte string to look for. */ };
> 
> When I use this API to test igb flex filter, I find that 
> in the struct rte_flow_item_raw, the member  pattern is not the same as my purpose.
> For example, If I type in  " flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions queue index 1 / end "
> What I get in NIC layer is  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats 124 times> }.
> But what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}

This is the same limitation I described in [1]. It is not a problem with the
rte_flow API itself: the testpmd parser currently passes unprocessed strings
to the PMD, and there is currently no way to work around that.

> About the format conversion in flex_filter, I referred to the testpmd function cmd_flex_filter_parsed().
> It shows the details of the conversion from ASCII code to data, for example:
> 
>             for (i = 0; i < len; i++) {
>                         c = bytes_ptr[i];
>                         if (isxdigit(c) == 0) {
>                                     /* invalid characters. */
>                                     printf("invalid input\n");
>                                     return;
>                         }
>                         val = xdigit2val(c);
>                         if (i % 2) {
>                                     byte |= val;
>                                     filter.bytes[j] = byte;
>                                     printf("bytes[%d]:%02x ", j, filter.bytes[j]);
>                                     j++;
>                                     byte = 0;
>                         } else
>                                     byte |= val << 4;
>             }  
> 
> and there is also a usage example in the DPDK document testpmd_app_ug-16.11.pdf:
> (it also does not use ASCII code)
> 
> testpmd> flex_filter 0 add len 16 bytes 0x00000000000000000000000008060000 \
> mask 000C priority 3 queue 3

I understand. The only difference between the two commands is that, unlike
flex_filter, flow does not interpret the provided string as hexadecimal.
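
To make the difference concrete, here is a minimal standalone sketch of both
interpretations. It is not the actual testpmd code: decode_hex(), copy_raw()
and hex_digit_value() are hypothetical helpers written for this illustration
only, and decode_hex() assumes there is no "0x" prefix in its input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value of one hexadecimal digit, -1 if invalid. */
static int
hex_digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * flex_filter-like interpretation: each pair of hexadecimal digits becomes
 * one byte. Returns the number of bytes written, or -1 on invalid input
 * (non-hexadecimal character, odd digit count, output buffer too small).
 */
static int
decode_hex(const char *str, uint8_t *out, size_t out_size)
{
	size_t len;
	size_t i;

	assert(str != NULL && out != NULL);
	len = strlen(str);
	if (len % 2 || len / 2 > out_size)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = hex_digit_value(str[i]);
		int lo = hex_digit_value(str[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i / 2] = (uint8_t)(hi << 4 | lo);
	}
	return (int)(len / 2);
}

/*
 * Behaviour of the flow command as described in this thread: characters
 * are copied unmodified. Returns the number of bytes written, or -1 if
 * the output buffer is too small.
 */
static int
copy_raw(const char *str, uint8_t *out, size_t out_size)
{
	size_t len;

	assert(str != NULL && out != NULL);
	len = strlen(str);
	if (len > out_size)
		return -1;
	memcpy(out, str, len);
	return (int)len;
}
```

With the input "0123", decode_hex() produces { 0x01, 0x23 } while copy_raw()
produces { 0x30, 0x31, 0x32, 0x33 }, which matches what you observed at the
NIC layer.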

> so, will our new generic flow API align to the old format in flex byte filter in 17.08 or in the future?

What I have in mind instead is a printf-like input method. Using the rule
you provided above:

 flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions queue index 1 / end

Will always yield "0123", however:

 flow create 0 ingress pattern raw relative is 0 pattern is \x00\x01\x02\x03  / end actions queue index 1 / end

Will yield the intended pattern. Currently this format is taken as is
(you'll get the literal string "\x00\x01\x02\x03"), but escape
interpretation is planned.

> At least in the struct rte_flow_item_raw, the member  pattern is the same as old filter?

It is the same as in the old filter, except that testpmd does not let you
provide it in hexadecimal format yet. No changes are needed on the PMD side
in any case.

Again, this is only a testpmd implementation issue. It does not prevent
developers from writing programs that provide binary data directly to RAW
items; the API has no such limitation.
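
As a rough sketch of what such a program could do, the snippet below fills a
RAW item with binary data. To stay compilable without DPDK headers it uses a
local stand-in, struct raw_item_sketch, that only mirrors the layout of
struct rte_flow_item_raw quoted above. That stand-in and raw_item_new() are
assumptions made for this example, not part of the API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Local stand-in mirroring the struct rte_flow_item_raw layout quoted in
 * this thread. A real application would include rte_flow.h and use the
 * actual type instead.
 */
struct raw_item_sketch {
	uint32_t relative:1; /* Look for pattern after the previous item. */
	uint32_t search:1; /* Search pattern from offset. */
	uint32_t reserved:30; /* Reserved, must be set to zero. */
	int32_t offset; /* Absolute or relative offset for pattern. */
	uint16_t limit; /* Search area limit for start of pattern. */
	uint16_t length; /* Pattern length. */
	uint8_t pattern[]; /* Byte string to look for. */
};

/*
 * Hypothetical helper: allocate an item large enough for its flexible
 * array member and copy the binary pattern into it. All other fields are
 * left zeroed by calloc(). Returns NULL on allocation failure; the caller
 * releases the item with free().
 */
static struct raw_item_sketch *
raw_item_new(const uint8_t *bytes, uint16_t length)
{
	struct raw_item_sketch *item;

	assert(bytes != NULL || length == 0);
	item = calloc(1, sizeof(*item) + length);
	if (item == NULL)
		return NULL;
	item->length = length;
	if (length)
		memcpy(item->pattern, bytes, length);
	return item;
}
```

Calling raw_item_new() with the two bytes { 0x01, 0x23 } gives an item whose
pattern holds exactly those bytes, with no string parsing involved.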

[1] http://dpdk.org/ml/archives/dev/2017-May/065798.html

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2017-05-23  9:50             ` Adrien Mazarguil
@ 2017-05-24  3:32               ` Zhao1, Wei
  2017-05-24  7:32                 ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhao1, Wei @ 2017-05-24  3:32 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Xing, Beilei

Hi,

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Tuesday, May 23, 2017 5:51 PM
> To: Zhao1, Wei <wei.zhao1@intel.com>
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
>
> Hi Wei,
>
> On Tue, May 23, 2017 at 06:07:20AM +0000, Zhao1, Wei wrote:
> > Hi,  Adrien
> >
> > > +struct rte_flow_item_raw {
> > > + uint32_t relative:1; /**< Look for pattern after the previous item. */
> > > + uint32_t search:1; /**< Search pattern from offset (see also limit). */
> > > + uint32_t reserved:30; /**< Reserved, must be set to zero. */
> > > + int32_t offset; /**< Absolute or relative offset for pattern. */
> > > + uint16_t limit; /**< Search area limit for start of pattern. */
> > > + uint16_t length; /**< Pattern length. */
> > > + uint8_t pattern[]; /**< Byte string to look for. */ };
> >
> > When I use this API to test igb flex filter, I find that in the struct
> > rte_flow_item_raw, the member  pattern is not the same as my purpose.
> > For example, If I type in  " flow create 0 ingress pattern raw relative is 0
> pattern is 0123  / end actions queue index 1 / end "
> > What I get in NIC layer is  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats
> 124 times> }.
> > But what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}
>
> Similar limitation as I answered in [1] then. This is not a problem in the
> rte_flow API, it's only that the testpmd parser currently provides
> unprocessed strings to the PMD, and there is currently no method to work
> around that.
>
> > About the format change of flex_filter, I have reference to the
> > testpmd function cmd_flex_filter_parsed(), There is details of format
> change from ASIC code to data, for example:
> >
> >             for (i = 0; i < len; i++) {
> >                         c = bytes_ptr[i];
> >                         if (isxdigit(c) == 0) {
> >                                     /* invalid characters. */
> >                                     printf("invalid input\n");
> >                                     return;
> >                         }
> >                         val = xdigit2val(c);
> >                         if (i % 2) {
> >                                     byte |= val;
> >                                     filter.bytes[j] = byte;
> >                                     printf("bytes[%d]:%02x ", j, filter.bytes[j]);
> >                                     j++;
> >                                     byte = 0;
> >                         } else
> >                                     byte |= val << 4;
> >             }
> >
> > and there is also usage example in the DPDK document testpmd_app_ug-
> 16.11.pdf:
> > (it also not use ASIC code)
> >
> > testpmd> flex_filter 0 add len 16 bytes 0x00000000000000000000000008060000 \
> > mask 000C priority 3 queue 3
>
> I understand, the difference between both commands is only that unlike
> flex_filter, flow does not interpret the provided string as hexadecimal.
>
> > so, will our new generic flow API align to the old format in flex byte filter in
> 17.08 or in the future?
>
> What I have in mind instead is a printf-like input method. Using the rule you
> provided above:
>
>  flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions
> queue index 1 / end
>
> Will always yield "0123", however:
>
>  flow create 0 ingress pattern raw relative is 0 pattern is \x00\x01\x02\x03  /
> end actions queue index 1 / end
>
> Will yield the intended pattern. Currently this format is interpreted as is
> (you'll get "\x00\x01\x02\x03") however escape interpretation is in the plans.
>

Thank you for your explanation, but there is a key point I want to repeat.
For example, if I type in " flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions queue index 1 / end ",
or, maybe more accurately, " flow create 0 ingress pattern raw relative is 0 pattern is 0x0123  / end actions queue index 1 / end ",
what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>},
not  pattern[]={ 0x00, 0x01, 0x02, 0x03, 0x0 <repeats 124 times> },
and also not  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats 124 times> }.
This problem does not block code development for 17.08, but testers and users will need it in the future.

> > At least in the struct rte_flow_item_raw, the member  pattern is the same
> as old filter?
>
> It is the same as the old filter, except you cannot provide it in hexadecimal
> format yet. No changes needed on the PMD side in any case.
>
> Again, this is only a testpmd implementation issue, that doesn't prevent
> developers from creating programs that directly provide binary data to RAW
> items, there's no such limitation.
>
> [1] http://dpdk.org/ml/archives/dev/2017-May/065798.html
>
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2017-05-24  3:32               ` Zhao1, Wei
@ 2017-05-24  7:32                 ` Adrien Mazarguil
  2017-05-24  8:46                   ` Zhao1, Wei
  0 siblings, 1 reply; 262+ messages in thread
From: Adrien Mazarguil @ 2017-05-24  7:32 UTC (permalink / raw)
  To: Zhao1, Wei; +Cc: dev, Xing, Beilei

On Wed, May 24, 2017 at 03:32:02AM +0000, Zhao1, Wei wrote:
> Hi,
> 
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Tuesday, May 23, 2017 5:51 PM
> > To: Zhao1, Wei <wei.zhao1@intel.com>
> > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
> >
> > Hi Wei,
> >
> > On Tue, May 23, 2017 at 06:07:20AM +0000, Zhao1, Wei wrote:
> > > Hi,  Adrien
> > >
> > > > +struct rte_flow_item_raw {
> > > > + uint32_t relative:1; /**< Look for pattern after the previous item. */
> > > > + uint32_t search:1; /**< Search pattern from offset (see also limit). */
> > > > + uint32_t reserved:30; /**< Reserved, must be set to zero. */
> > > > + int32_t offset; /**< Absolute or relative offset for pattern. */
> > > > + uint16_t limit; /**< Search area limit for start of pattern. */
> > > > + uint16_t length; /**< Pattern length. */
> > > > + uint8_t pattern[]; /**< Byte string to look for. */ };
> > >
> > > When I use this API to test igb flex filter, I find that in the struct
> > > rte_flow_item_raw, the member  pattern is not the same as my purpose.
> > > For example, If I type in  " flow create 0 ingress pattern raw relative is 0
> > pattern is 0123  / end actions queue index 1 / end "
> > > What I get in NIC layer is  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats
> > 124 times> }.
> > > But what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}
> >
> > Similar limitation as I answered in [1] then. This is not a problem in the
> > rte_flow API, it's only that the testpmd parser currently provides
> > unprocessed strings to the PMD, and there is currently no method to work
> > around that.
> >
> > > About the format change of flex_filter, I have reference to the
> > > testpmd function cmd_flex_filter_parsed(), There is details of format
> > change from ASIC code to data, for example:
> > >
> > >             for (i = 0; i < len; i++) {
> > >                         c = bytes_ptr[i];
> > >                         if (isxdigit(c) == 0) {
> > >                                     /* invalid characters. */
> > >                                     printf("invalid input\n");
> > >                                     return;
> > >                         }
> > >                         val = xdigit2val(c);
> > >                         if (i % 2) {
> > >                                     byte |= val;
> > >                                     filter.bytes[j] = byte;
> > >                                     printf("bytes[%d]:%02x ", j, filter.bytes[j]);
> > >                                     j++;
> > >                                     byte = 0;
> > >                         } else
> > >                                     byte |= val << 4;
> > >             }
> > >
> > > and there is also usage example in the DPDK document testpmd_app_ug-
> > 16.11.pdf:
> > > (it also not use ASIC code)
> > >
> > > testpmd> flex_filter 0 add len 16 bytes 0x00000000000000000000000008060000 \
> > > mask 000C priority 3 queue 3
> >
> > I understand, the difference between both commands is only that unlike
> > flex_filter, flow does not interpret the provided string as hexadecimal.
> >
> > > so, will our new generic flow API align to the old format in flex byte filter in
> > 17.08 or in the future?
> >
> > What I have in mind instead is a printf-like input method. Using the rule you
> > provided above:
> >
> >  flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions
> > queue index 1 / end
> >
> > Will always yield "0123", however:
> >
> >  flow create 0 ingress pattern raw relative is 0 pattern is \x00\x01\x02\x03  /
> > end actions queue index 1 / end
> >
> > Will yield the intended pattern. Currently this format is interpreted as is
> > (you'll get "\x00\x01\x02\x03") however escape interpretation is in the plans.
> >
> 
> Thank you for your explanation. But there is some key point I want to repeat:
> For example, If I type in  " flow create 0 ingress pattern raw relative is 0 pattern is 0123  / end actions queue index 1 / end "
> Or maybe more accurate, " flow create 0 ingress pattern raw relative is 0 pattern is 0x0123  / end actions queue index 1 / end "
> what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}.
> not  pattern[]={ 0x00, 0x01, 0x02, 0x03, 0x0 <repeats 124 times> }.
> And also, not  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats 124 times> }.

Right, I misread your original pattern[] intent. You would get such a
pattern by specifying \x01\x23 (just like C string literals). It would even
accept octal notation "\01\043". Both would yield { 0x01, 0x23 }.

Does something like that satisfy the requirements?
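
For reference, the kind of escape interpretation I have in mind can be
sketched as below. This is only an illustration, not the pending testpmd
patch: unescape() and hex_value() are hypothetical names, and limiting \x to
at most two hexadecimal digits is an assumption of this sketch (C string
literals consume as many digits as follow).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of one hexadecimal digit, -1 if invalid. */
static int
hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode "\xH" or "\xHH" (hexadecimal) and "\O" to "\OOO" (octal) escapes.
 * "\\" yields a single backslash and any other character is copied
 * unmodified. Returns the number of bytes written, or -1 on a malformed
 * escape, an octal value above 0xff or a too small output buffer.
 */
static int
unescape(const char *str, uint8_t *out, size_t out_size)
{
	size_t n = 0;

	assert(str != NULL && out != NULL);
	while (*str != '\0') {
		unsigned int byte = 0;
		int digits = 0;

		if (*str != '\\') {
			byte = (unsigned char)*str++;
		} else if (str[1] == 'x') {
			str += 2;
			while (digits < 2 && hex_value(*str) >= 0) {
				byte = byte << 4 |
				       (unsigned int)hex_value(*str++);
				++digits;
			}
			if (digits == 0)
				return -1;
		} else if (str[1] >= '0' && str[1] <= '7') {
			++str;
			while (digits < 3 && *str >= '0' && *str <= '7') {
				byte = byte << 3 |
				       (unsigned int)(*str++ - '0');
				++digits;
			}
			if (byte > 0xff)
				return -1;
		} else if (str[1] == '\\') {
			byte = '\\';
			str += 2;
		} else {
			return -1;
		}
		if (n == out_size)
			return -1;
		out[n++] = (uint8_t)byte;
	}
	return (int)n;
}
```

Given either \x01\x23 or \01\043 as typed on the command line, this produces
{ 0x01, 0x23 }. A plain 0123 is still copied as the four characters
{ 0x30, 0x31, 0x32, 0x33 }.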

> This problem does not block code development for 17.08, but testers and users will need it in the future.

Well, I've actually started implementing the above long ago in testpmd but
didn't have time to clean up the patch and submit it yet (moreover it was
not needed until now). If the idea works for your use case, I can attempt to
do that soon.

> > > At least in the struct rte_flow_item_raw, the member  pattern is the same
> > as old filter?
> >
> > It is the same as the old filter, except you cannot provide it in hexadecimal
> > format yet. No changes needed on the PMD side in any case.
> >
> > Again, this is only a testpmd implementation issue, that doesn't prevent
> > developers from creating programs that directly provide binary data to RAW
> > items, there's no such limitation.
> >
> > [1] http://dpdk.org/ml/archives/dev/2017-May/065798.html
> >
> > --
> > Adrien Mazarguil
> > 6WIND

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
  2017-05-24  7:32                 ` Adrien Mazarguil
@ 2017-05-24  8:46                   ` Zhao1, Wei
  0 siblings, 0 replies; 262+ messages in thread
From: Zhao1, Wei @ 2017-05-24  8:46 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Xing, Beilei

Hi, Adrien

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, May 24, 2017 3:32 PM
> To: Zhao1, Wei <wei.zhao1@intel.com>
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API
> 
> On Wed, May 24, 2017 at 03:32:02AM +0000, Zhao1, Wei wrote:
> > Hi,
> >
> > > -----Original Message-----
> > > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > > Sent: Tuesday, May 23, 2017 5:51 PM
> > > To: Zhao1, Wei <wei.zhao1@intel.com>
> > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic
> > > flow API
> > >
> > > Hi Wei,
> > >
> > > On Tue, May 23, 2017 at 06:07:20AM +0000, Zhao1, Wei wrote:
> > > > Hi,  Adrien
> > > >
> > > > > +struct rte_flow_item_raw {
> > > > > + uint32_t relative:1; /**< Look for pattern after the previous item. */
> > > > > + uint32_t search:1; /**< Search pattern from offset (see also limit). */
> > > > > + uint32_t reserved:30; /**< Reserved, must be set to zero. */
> > > > > + int32_t offset; /**< Absolute or relative offset for pattern. */
> > > > > + uint16_t limit; /**< Search area limit for start of pattern. */
> > > > > + uint16_t length; /**< Pattern length. */
> > > > > + uint8_t pattern[]; /**< Byte string to look for. */
> > > > > +};
> > > >
> > > > When I use this API to test igb flex filter, I find that in the
> > > > struct rte_flow_item_raw, the member  pattern is not the same as my
> purpose.
> > > > For example, If I type in  " flow create 0 ingress pattern raw
> > > > relative is 0
> > > pattern is 0123  / end actions queue index 1 / end "
> > > > What I get in NIC layer is  pattern[]={ 0x30, 0x31, 0x32, 0x33,
> > > > 0x0 <repeats
> > > 124 times> }.
> > > > But what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}
> > >
> > > Similar limitation as I answered in [1] then. This is not a problem
> > > in the rte_flow API, it's only that the testpmd parser currently
> > > provides unprocessed strings to the PMD, and there is currently no
> > > method to work around that.
> > >
> > > > About the format change of flex_filter, I have reference to the
> > > > testpmd function cmd_flex_filter_parsed(), There is details of
> > > > format
> > > change from ASIC code to data, for example:
> > > >
> > > >             for (i = 0; i < len; i++) {
> > > >                         c = bytes_ptr[i];
> > > >                         if (isxdigit(c) == 0) {
> > > >                                     /* invalid characters. */
> > > >                                     printf("invalid input\n");
> > > >                                     return;
> > > >                         }
> > > >                         val = xdigit2val(c);
> > > >                         if (i % 2) {
> > > >                                     byte |= val;
> > > >                                     filter.bytes[j] = byte;
> > > >                                     printf("bytes[%d]:%02x ", j, filter.bytes[j]);
> > > >                                     j++;
> > > >                                     byte = 0;
> > > >                         } else
> > > >                                     byte |= val << 4;
> > > >             }
> > > >
> > > > and there is also usage example in the DPDK document
> > > > testpmd_app_ug-
> > > 16.11.pdf:
> > > > (it also not use ASIC code)
> > > >
> > > > testpmd> flex_filter 0 add len 16 bytes
> > > > testpmd> 0x00000000000000000000000008060000 \
> > > > mask 000C priority 3 queue 3
> > >
> > > I understand, the difference between both commands is only that
> > > unlike flex_filter, flow does not interpret the provided string as
> hexadecimal.
> > >
> > > > so, will our new generic flow API align to the old format in flex
> > > > byte filter in
> > > 17.08 or in the future?
> > >
> > > What I have in mind instead is a printf-like input method. Using the
> > > rule you provided above:
> > >
> > >  flow create 0 ingress pattern raw relative is 0 pattern is 0123  /
> > > end actions queue index 1 / end
> > >
> > > Will always yield "0123", however:
> > >
> > >  flow create 0 ingress pattern raw relative is 0 pattern is
> > > \x00\x01\x02\x03  / end actions queue index 1 / end
> > >
> > > Will yield the intended pattern. Currently this format is
> > > interpreted as is (you'll get "\x00\x01\x02\x03") however escape
> interpretation is in the plans.
> > >
> >
> > Thank you for your explanation. But there is some key point I want to
> repeat:
> > For example, If I type in  " flow create 0 ingress pattern raw relative is 0
> pattern is 0123  / end actions queue index 1 / end "
> > Or maybe more accurate, " flow create 0 ingress pattern raw relative is 0
> pattern is 0x0123  / end actions queue index 1 / end "
> > what I need is pattern[]={0x01, 0x23, 0x0 <repeats 126 times>}.
> > not  pattern[]={ 0x00, 0x01, 0x02, 0x03, 0x0 <repeats 124 times> }.
> > And also, not  pattern[]={ 0x30, 0x31, 0x32, 0x33, 0x0 <repeats 124 times> }.
> 
> Right, I misread your original pattern[] intent. You would get such a pattern
> by specifying \x01\x23 (just like C string literals). It would even accept octal
> notation "\01\043". Both would yield { 0x01, 0x23 }.
> 
> Does something like that satisfy the requirements?

Yes. After going over it again, I think we understand each other.
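
For illustration only, here is a minimal sketch of the escape interpretation
discussed above: C-style \xHH and octal \OOO sequences are decoded into raw
bytes, and anything else is copied unchanged. It is not the testpmd
implementation; hex_val() and unescape() are hypothetical names, and \x is
assumed to take at most two hex digits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of a hexadecimal digit, or -1 if invalid. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode \xHH (one or two hex digits) and \OOO (one to three octal digits)
 * from src into dst. Other characters are copied as is.
 * Returns the number of bytes written, or -1 on a bad escape or overflow.
 */
static int unescape(const char *src, uint8_t *dst, size_t size)
{
	size_t n = 0;

	while (*src != '\0') {
		unsigned int byte = 0;
		int digits = 0;

		if (n == size)
			return -1; /* destination buffer is full */
		if (*src != '\\') {
			dst[n++] = (uint8_t)*src++;
			continue;
		}
		++src; /* skip the backslash */
		if (*src == 'x') {
			++src; /* skip 'x' */
			while (digits < 2 && hex_val(*src) >= 0) {
				byte = (byte << 4) | (unsigned int)hex_val(*src++);
				++digits;
			}
			if (digits == 0)
				return -1;
		} else if (*src >= '0' && *src <= '7') {
			while (digits < 3 && *src >= '0' && *src <= '7') {
				byte = (byte << 3) | (unsigned int)(*src++ - '0');
				++digits;
			}
			if (byte > 0xff)
				return -1;
		} else {
			return -1; /* unsupported escape sequence */
		}
		dst[n++] = (uint8_t)byte;
	}
	return (int)n;
}
```

With this sketch, both \x01\x23 and \01\043 decode to { 0x01, 0x23 }, while
a plain 0123 is still copied as its four ASCII characters.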

> 
> > And this problem is not a block for code develop for 17.08, but it is needed
> for tester and user in the feature.
> 
> Well, I've actually started implementing the above long ago in testpmd but
> didn't have time to clean up the patch and submit it yet (moreover it was not
> needed until now). If the idea works for your use case, I can attempt to do
> that soon.
> 

That decision is up to you, and I think it is a great idea.

> > > > At least in the struct rte_flow_item_raw, the member  pattern is
> > > > the same
> > > as old filter?
> > >
> > > It is the same as the old filter, except you cannot provide it in
> > > hexadecimal format yet. No changes needed on the PMD side in any case.
> > >
> > > Again, this is only a testpmd implementation issue, that doesn't
> > > prevent developers from creating programs that directly provide
> > > binary data to RAW items, there's no such limitation.
> > >
> > > [1] http://dpdk.org/ml/archives/dev/2017-May/065798.html
> > >
> > > --
> > > Adrien Mazarguil
> > > 6WIND
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
  2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API Adrien Mazarguil
@ 2017-10-23  8:53         ` Zhao1, Wei
  2017-10-31 17:45           ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Zhao1, Wei @ 2017-10-23  8:53 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev, Adrien Mazarguil

Hi, Adrien



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Saturday, December 17, 2016 12:25 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
> 
> This new API supersedes all the legacy filter types described in rte_eth_ctrl.h.
> It is slightly higher level and as a result relies more on PMDs to process and
> validate flow rules.
> 
> Benefits:
> 
> - A unified API is easier to program for, applications do not have to be
>   written for a specific filter type which may or may not be supported by
>   the underlying device.
> 
> - The behavior of a flow rule is the same regardless of the underlying
>   device, applications do not need to be aware of hardware quirks.
> 
> - Extensible by design, API/ABI breakage should rarely occur if at all.
> 
> - Documentation is self-standing, no need to look up elsewhere.
> 
> Existing filter types will be deprecated and removed in the near future.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  MAINTAINERS                            |   4 +
>  doc/api/doxy-api-index.md              |   2 +
>  lib/librte_ether/Makefile              |   3 +
>  lib/librte_ether/rte_eth_ctrl.h        |   1 +
>  lib/librte_ether/rte_ether_version.map |  11 +
>  lib/librte_ether/rte_flow.c            | 159 +++++
>  lib/librte_ether/rte_flow.h            | 942 ++++++++++++++++++++++++++++
>  lib/librte_ether/rte_flow_driver.h     | 181 ++++++
>  8 files changed, 1303 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 26d9590..5975cff 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -243,6 +243,10 @@ M: Thomas Monjalon
> <thomas.monjalon@6wind.com>
>  F: lib/librte_ether/
>  F: scripts/test-null.sh
> 

+/**
+ * RTE_FLOW_ACTION_TYPE_RSS
+ *
+ * Similar to QUEUE, except RSS is additionally performed on packets to
+ * spread them among several queues according to the provided parameters.
+ *
+ * Note: RSS hash result is normally stored in the hash.rss mbuf field,
+ * however it conflicts with the MARK action as they share the same
+ * space. When both actions are specified, the RSS hash is discarded and
+ * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
+ * structure should eventually evolve to store both.
+ *
+ * Terminating by default.
+ */
+struct rte_flow_action_rss {
+	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
+	uint16_t num; /**< Number of entries in queue[]. */
+	uint16_t queue[]; /**< Queues indices to use. */
+};
+

I am planning to move RSS to rte_flow.
May I ask some questions about struct rte_flow_action_rss?
1. Why is rss_conf a pointer rather than an embedded " struct rte_eth_rss_conf  rss_conf "?
We need to fill this struct with the RSS info obtained from the CLI; if it is a pointer, how can we fill in this info?

2. Also, why is "const" used here? Do we need it? How can we configure this parameter?

3. What CLI command format do you expect for RSS configuration? When I type in:
" flow create 0 pattern eth / tcp / end actions rss queues queue 0 /end "
Or " flow create 0 pattern eth / tcp / end actions rss queues {0 1 2} /end "
I get " Bad arguments ".

So, what is the right CLI command?

Thank you!

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
  2017-10-23  8:53         ` Zhao1, Wei
@ 2017-10-31 17:45           ` Adrien Mazarguil
  2017-11-07  6:56             ` Zhao1, Wei
  2017-11-14  3:23             ` Zhao1, Wei
  0 siblings, 2 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2017-10-31 17:45 UTC (permalink / raw)
  To: Zhao1, Wei; +Cc: dev

Hi Wei,

Sorry for the late answer, see below.

On Mon, Oct 23, 2017 at 08:53:49AM +0000, Zhao1, Wei wrote:
<snip>
> +/**
> + * RTE_FLOW_ACTION_TYPE_RSS
> + *
> + * Similar to QUEUE, except RSS is additionally performed on packets to
> + * spread them among several queues according to the provided parameters.
> + *
> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> + * however it conflicts with the MARK action as they share the same
> + * space. When both actions are specified, the RSS hash is discarded 
> +and
> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> + * structure should eventually evolve to store both.
> + *
> + * Terminating by default.
> + */
> +struct rte_flow_action_rss {
> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> +	uint16_t num; /**< Number of entries in queue[]. */
> +	uint16_t queue[]; /**< Queues indices to use. */ };
> +
> 
> I am plan for moving rss to rte_flow.
> May I ask you some question for this struct of rte_flow_action_rss.
> 1. why do you use pointer mode for rss_conf, why not use " struct rte_eth_rss_conf  rss_conf "?
> we need to fill these rss info which get from CLI to this struct, if we use the pointer mode, how can we fill in these info?

The original idea was to provide the ability to share a single rss_conf
structure between many flow rules and avoid redundant allocations.

It's based on the fact that applications currently use a single RSS context
for everything; they can easily make rss_conf point to the global context
found in struct rte_eth_dev.

Currently the testpmd flow command is incomplete, it doesn't support the
configuration of this field and always provides NULL to PMDs instead. This
is not documented in the flow API and may crash PMDs.

For the time being, a PMD can interpret NULL as a kind of default global
rss_conf parameters, as is done in both mlx5 and mlx4 PMDs.

That's a problem that needs to be addressed in testpmd.

> 2. And also why the" const" is not need? We need a const ?How can we config this parameter?

It means the allocation is managed outside of rte_flow, where the data may
or may not be const, it's up to the application. The pointer stored in this
structure is only a copy.

Whether the pointed data remains allocated once the flow rule is created is
unspecified. A PMD cannot assume anything and has to copy these parameters
if needed later.

It's an API issue I'd like to address by embedding the rss_conf parameters
directly in rte_flow_action_rss instead of doing so through a pointer.
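
To illustrate the "has to copy these parameters" point, here is a minimal
sketch of a deep copy a PMD could take at flow creation time. It is not taken
from any existing PMD: struct rss_conf is a local stand-in modeled on the
fields of struct rte_eth_rss_conf, and rss_conf_dup() is a hypothetical
helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in modeled on struct rte_eth_rss_conf (assumption). */
struct rss_conf {
	uint8_t *rss_key;    /* hash key bytes, may be NULL */
	uint8_t rss_key_len; /* hash key length in bytes */
	uint64_t rss_hf;     /* hash function selection bits */
};

/*
 * Hypothetical helper: return a private deep copy of src so that the
 * caller no longer depends on application-owned memory.
 * Returns NULL if src is NULL or if the allocation fails.
 */
static struct rss_conf *rss_conf_dup(const struct rss_conf *src)
{
	struct rss_conf *copy;

	if (src == NULL)
		return NULL; /* caller falls back to its default parameters */
	/* A single allocation holds the structure followed by the key. */
	copy = malloc(sizeof(*copy) + src->rss_key_len);
	if (copy == NULL)
		return NULL;
	*copy = *src;
	if (src->rss_key != NULL && src->rss_key_len != 0) {
		copy->rss_key = (uint8_t *)(copy + 1);
		memcpy(copy->rss_key, src->rss_key, src->rss_key_len);
	} else {
		copy->rss_key = NULL;
		copy->rss_key_len = 0;
	}
	return copy;
}
```

The copy is released with a single free(), and later changes to the
application's key buffer do not affect it.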

> 3. what is your expect mode for CLI command for rss config? When I type in :
> " flow create 0 pattern eth / tcp / end actions rss queues queue 0 /end "
> Or " flow create 0 pattern eth / tcp / end actions rss queues {0 1 2} /end "
> I get " Bad arguments ".
> 
> So, the right CLI command is ?

Basically you need to specify an additional "end" keyword to terminate the
queues list (tab completion shows it):

 flow create 0 ingress pattern eth / tcp / end actions rss queues 0 1 2 end / end

There are other design flaws with the RSS action definition:

- RSS hash algorithm to use is missing, PMDs must rely on global parameters.
- The C99-style flexible queues array is a super pain to manage and is not
  supported by C++. I want to make it a normal pointer (the same applies to
  the RAW pattern item).
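
As a rough illustration of why the flexible array is awkward, this sketch
shows the allocation an application has to perform to build such an action.
struct action_rss is a local stand-in with the same members as the
struct rte_flow_action_rss quoted above, and action_rss_new() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct rss_conf; /* opaque stand-in for struct rte_eth_rss_conf */

/* Local stand-in with the same members as struct rte_flow_action_rss. */
struct action_rss {
	const struct rss_conf *rss_conf; /* RSS parameters */
	uint16_t num;                    /* number of entries in queue[] */
	uint16_t queue[];                /* C99 flexible array member */
};

/*
 * Hypothetical helper: the structure cannot be declared on the stack with
 * its queues, so header and array must be sized and allocated together.
 */
static struct action_rss *action_rss_new(const uint16_t *queues, uint16_t num)
{
	struct action_rss *rss;

	rss = malloc(sizeof(*rss) + (size_t)num * sizeof(rss->queue[0]));
	if (rss == NULL)
		return NULL;
	rss->rss_conf = NULL; /* left unset in this sketch */
	rss->num = num;
	if (num != 0)
		memcpy(rss->queue, queues, (size_t)num * sizeof(rss->queue[0]));
	return rss;
}
```

With a plain pointer member instead, the queue list could live in any
existing array and the structure itself would have a fixed size.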

These changes are planned for the next overhaul of rte_flow, I'll submit a
RFC for them as soon as I get the chance.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
  2017-10-31 17:45           ` Adrien Mazarguil
@ 2017-11-07  6:56             ` Zhao1, Wei
  2017-11-14  3:23             ` Zhao1, Wei
  1 sibling, 0 replies; 262+ messages in thread
From: Zhao1, Wei @ 2017-11-07  6:56 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

Hi, Adrien

Thank you for your detailed answer!
Let me study your mail first; I may then provide more details
when I implement the PMD code for moving RSS to rte_flow for ixgbe.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, November 1, 2017 1:46 AM
> To: Zhao1, Wei <wei.zhao1@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
> 
> Hi Wei,
> 
> Sorry for the late answer, see below.
> 
> On Mon, Oct 23, 2017 at 08:53:49AM +0000, Zhao1, Wei wrote:
> <snip>
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_RSS
> > + *
> > + * Similar to QUEUE, except RSS is additionally performed on packets
> > +to
> > + * spread them among several queues according to the provided
> parameters.
> > + *
> > + * Note: RSS hash result is normally stored in the hash.rss mbuf
> > +field,
> > + * however it conflicts with the MARK action as they share the same
> > + * space. When both actions are specified, the RSS hash is discarded
> > +and
> > + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The
> > +mbuf
> > + * structure should eventually evolve to store both.
> > + *
> > + * Terminating by default.
> > + */
> > +struct rte_flow_action_rss {
> > +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> > +	uint16_t num; /**< Number of entries in queue[]. */
> > +	uint16_t queue[]; /**< Queues indices to use. */ };
> > +
> >
> > I am plan for moving rss to rte_flow.
> > May I ask you some question for this struct of rte_flow_action_rss.
> > 1. why do you use pointer mode for rss_conf, why not use " struct
> rte_eth_rss_conf  rss_conf "?
> > we need to fill these rss info which get from CLI to this struct, if we use the
> pointer mode, how can we fill in these info?
> 
> The original idea was to provide the ability to share a single rss_conf structure
> between many flow rules and avoid redundant allocations.
> 
> It's based on the fact applications currently use a single RSS context for
> everything, they can easily make rss_conf point the global context found in
> struct rte_eth_dev.
> 
> Currently the testpmd flow command is incomplete, it doesn't support the
> configuration of this field and always provides NULL to PMDs instead. This is
> not documented in the flow API and may crash PMDs.
> 
> For the time being, a PMD can interpret NULL as a kind of default global
> rss_conf parameters, as is done in both mlx5 and mlx4 PMDs.
> 
> That's a problem that needs to be addressed in testpmd.
> 
> > 2. And also why the" const" is not need? We need a const ?How can we
> config this parameter?
> 
> It means the allocation is managed outside of rte_flow, where the data may
> or may not be const, it's up to the application. The pointer stored in this
> structure is only a copy.
> 
> Whether the pointed data remains allocated once the flow rule is created is
> unspecified. A PMD cannot assume anything and has to copy these
> parameters if needed later.
> 
> It's an API issue I'd like to address by embedding the rss_conf parameters
> directly in rte_flow_action_rss instead of doing so through a pointer.
> 
> > 3. what is your expect mode for CLI command for rss config? When I type
> in :
> > " flow create 0 pattern eth / tcp / end actions rss queues queue 0 /end "
> > Or " flow create 0 pattern eth / tcp / end actions rss queues {0 1 2} /end "
> > I get " Bad arguments ".
> >
> > So, the right CLI command is ?
> 
> Basically you need to specify an additional "end" keyword to terminate the
> queues list (tab completion shows it):
> 
>  flow create 0 ingress pattern eth / tcp / end actions rss queues 0 1 2 end /
> end
> 
> There are other design flaws with the RSS action definition:
> 
> - RSS hash algorithm to use is missing, PMDs must rely on global parameters.
> - The C99-style flexible queues array is a super pain to manage and is not
>   supported by C++. I want to make it a normal pointer (the same applies to
>   the RAW pattern item).
> 
> These changes are planned for the next overhaul of rte_flow, I'll submit a
> RFC for them as soon as I get the chance.
> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
  2017-10-31 17:45           ` Adrien Mazarguil
  2017-11-07  6:56             ` Zhao1, Wei
@ 2017-11-14  3:23             ` Zhao1, Wei
  1 sibling, 0 replies; 262+ messages in thread
From: Zhao1, Wei @ 2017-11-14  3:23 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

Hi,Adrien

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Wednesday, November 1, 2017 1:46 AM
> To: Zhao1, Wei <wei.zhao1@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API
> 
> Hi Wei,
> 
> Sorry for the late answer, see below.
> 
> On Mon, Oct 23, 2017 at 08:53:49AM +0000, Zhao1, Wei wrote:
> <snip>
> > +/**
> > + * RTE_FLOW_ACTION_TYPE_RSS
> > + *
> > + * Similar to QUEUE, except RSS is additionally performed on packets
> > +to
> > + * spread them among several queues according to the provided
> parameters.
> > + *
> > + * Note: RSS hash result is normally stored in the hash.rss mbuf
> > +field,
> > + * however it conflicts with the MARK action as they share the same
> > + * space. When both actions are specified, the RSS hash is discarded
> > +and
> > + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The
> > +mbuf
> > + * structure should eventually evolve to store both.
> > + *
> > + * Terminating by default.
> > + */
> > +struct rte_flow_action_rss {
> > +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> > +	uint16_t num; /**< Number of entries in queue[]. */
> > +	uint16_t queue[]; /**< Queues indices to use. */ };
> > +
> >
> > I am plan for moving rss to rte_flow.
> > May I ask you some question for this struct of rte_flow_action_rss.
> > 1. why do you use pointer mode for rss_conf, why not use " struct
> rte_eth_rss_conf  rss_conf "?
> > we need to fill these rss info which get from CLI to this struct, if we use the
> pointer mode, how can we fill in these info?
> 
> The original idea was to provide the ability to share a single rss_conf structure
> between many flow rules and avoid redundant allocations.
> 
> It's based on the fact applications currently use a single RSS context for
> everything, they can easily make rss_conf point the global context found in
> struct rte_eth_dev.
> 
> Currently the testpmd flow command is incomplete, it doesn't support the
> configuration of this field and always provides NULL to PMDs instead. This is
> not documented in the flow API and may crash PMDs.
> 
> For the time being, a PMD can interpret NULL as a kind of default global
> rss_conf parameters, as is done in both mlx5 and mlx4 PMDs.
> 
> That's a problem that needs to be addressed in testpmd.

Yes, this is a blocker when I test my code for moving RSS to rte_flow.
I have now written the ixgbe rte_flow action parser for RTE_FLOW_ACTION_TYPE_RSS,
but I cannot configure the rss_conf structure member with testpmd,
so I have to use an extra function to configure these RSS action parameters. Maybe I
have to test it this way.
I will commit this new code for moving RSS to rte_flow later even if testpmd does not support this test, like mlx4 and mlx5.
Do you have any other suggestions for testing this feature?

Thank you.

> > 2. And also why the" const" is not need? We need a const ?How can we
> config this parameter?
> 
> It means the allocation is managed outside of rte_flow, where the data may
> or may not be const, it's up to the application. The pointer stored in this
> structure is only a copy.
> 

Got it.

> Whether the pointed data remains allocated once the flow rule is created is
> unspecified. A PMD cannot assume anything and has to copy these
> parameters if needed later.
> 
> It's an API issue I'd like to address by embedding the rss_conf parameters
> directly in rte_flow_action_rss instead of doing so through a pointer.
> 

Yes, I support your idea, but by now the PMD code, not only for IXGBE but also for MLX4 and MLX5,
has been implemented on the assumption that it is a pointer.


> > 3. what is your expect mode for CLI command for rss config? When I type
> in :
> > " flow create 0 pattern eth / tcp / end actions rss queues queue 0 /end "
> > Or " flow create 0 pattern eth / tcp / end actions rss queues {0 1 2} /end "
> > I get " Bad arguments ".
> >
> > So, the right CLI command is ?
> 
> Basically you need to specify an additional "end" keyword to terminate the
> queues list (tab completion shows it):
> 
>  flow create 0 ingress pattern eth / tcp / end actions rss queues 0 1 2 end /
> end
> 

Ok.

> There are other design flaws with the RSS action definition:
> 
> - RSS hash algorithm to use is missing, PMDs must rely on global parameters.
> - The C99-style flexible queues array is a super pain to manage and is not
>   supported by C++. I want to make it a normal pointer (the same applies to
>   the RAW pattern item).
> 
> These changes are planned for the next overhaul of rte_flow, I'll submit a
> RFC for them as soon as I get the chance.

Good idea, I look forward to it.

> 
> --
> Adrien Mazarguil
> 6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
  2016-08-01 15:08 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Kieran Mansley
@ 2016-08-03 15:19 ` Adrien Mazarguil
  0 siblings, 0 replies; 262+ messages in thread
From: Adrien Mazarguil @ 2016-08-03 15:19 UTC (permalink / raw)
  To: Kieran Mansley; +Cc: dev

Hi Kieran,

On Mon, Aug 01, 2016 at 04:08:51PM +0100, Kieran Mansley wrote:
> Apologies for coming a little late to this thread about the proposed new
> API for filtering etc.
> 
> I've reviewed it based on Solarflare's needs and hardware capabilities,
> and think the proposal is likely to be a big improvement on the current
> system.
> 
> I worry slightly that the goal of having applications that are not aware
> of the hardware they are running on will be difficult to meet.  My guess
> is that the different hardware platforms will have so little overlap in
> the functionality they support that to get best performance the
> applications will still be heavily tailored to the subsets of the API
> that the hardware they are using provides.  The discussion of filter
> priorities is a good example of this: to get best performance the
> application will want to use the hardware's filtering capabilities to do
> all the heavy lifting, but the abilities of different NICs to support
> particular priorities and combinations of filters will mean what works
> very well for one NIC may well return "I can't do that" for another, and
> vice versa.

I also think most applications will end up using mostly generic rules, while
applications tailored for specific devices will use more features. In my
mind this is like how applications would handle SSE/AVX/AltiVec/etc
optimizations. They need to be aware such features exist and have both
specific and more generic code. The query interface should help with that.

> One suggestion for extending the API further would be to allow it to
> also describe transmit filters as well as receive filters.

Yes, TX is probably the next step. I think it will be part of the same API,
using pattern/actions similarly only they would affect the TX path. But
let's focus on the RX side for now.

> There are also some filters that can prove very useful to our customers
> that while they could be achieved through the careful insertion of
> multiple filters with the right order and priorities, could be made more
> application-friendly by having a more meaningful alias. For example:
>  - multicast-mismatch (all multicast traffic that doesn't match another
> filter and would otherwise be discarded)
>  - unicast-mismatch (all unicast traffic that doesn't match another
> filter and would otherwise be discarded)
>  - all-multicast (all multicast traffic)
>  - all-unicast (all unicast traffic)

Why not, those may be added as new pattern items if the community feels they
are necessary. Right now, though, I do not think these are difficult to
specify. Of course one should dedicate priority levels far apart to avoid
collisions with more specific rules, but you still need priorities to
determine which of "all-multicast" or "unicast-mismatch" should match first.
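
As a rough sketch of why "all-multicast" is not hard to express with a
spec/mask pair: multicast destination addresses have the least significant
bit of their first octet set, so masking that single bit is enough. The
names below are hypothetical and the matching function only models what a
device would do; it is not part of the proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAC_LEN 6

/* Spec and mask selecting only the group (multicast) bit of the address. */
static const uint8_t mcast_spec[MAC_LEN] = { 0x01, 0, 0, 0, 0, 0 };
static const uint8_t mcast_mask[MAC_LEN] = { 0x01, 0, 0, 0, 0, 0 };

/* Model of masked matching: compare only the bits selected by mask. */
static bool mac_matches(const uint8_t *addr, const uint8_t *spec,
			const uint8_t *mask)
{
	for (int i = 0; i < MAC_LEN; ++i)
		if ((addr[i] & mask[i]) != (spec[i] & mask[i]))
			return false;
	return true;
}
```

An "all-unicast" rule would use the same mask with a zeroed spec, and the
"mismatch" variants are the same patterns at a lower priority level.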

> Finally, I wonder if any thought has been given to dealing with
> situations where there is a conflict between different virtual or
> physical functions.  E.g. attempting to insert a MAC filter on one VF
> that would steal traffic destined to a different VF.  Will it be up to
> each driver to enforce these sorts of policies or will there be a
> general vendor-neutral framework to deal with this?

PFs and VFs are a complex topic, eh? It is not even guaranteed that a PF can
see VF-addressed traffic; it currently cannot with mlx4 and mlx5 (AFAIK).
It will be up to each PMD, but they must all follow the same logic.

A flow rule with a VF pattern item should not be allowed on a PF device if
the PF is either unable to receive VF-addressed traffic, or if doing so
would prevent traffic from being received by a VF when the flow rule
specifies that it is supposed to pass through (either implicitly or through a
VF action).

Simply matching the MAC address of a VF from a PF (without specifying the VF
pattern item) should be allowed though. It may not work as packets may not
be received at all, but if it does the application should take care of the
consequences as VF may not receive packets anymore.

Creating or updating the MAC address of a VF after adding a conflicting
flow rule on a PF should either not be allowed or remain undefined.

All of this is not described in the specification yet because PF/VF patterns
and actions are not fully defined at the moment, there is still some
uncertainty about them.

> I should reiterate that I think this will be a big improvement, so thank
> you for proposing it.

Thanks!

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 262+ messages in thread

* Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
@ 2016-08-01 15:08 Kieran Mansley
  2016-08-03 15:19 ` Adrien Mazarguil
  0 siblings, 1 reply; 262+ messages in thread
From: Kieran Mansley @ 2016-08-01 15:08 UTC (permalink / raw)
  To: dev; +Cc: adrien.mazarguil

Apologies for coming a little late to this thread about the proposed new
API for filtering etc.

I've reviewed it based on Solarflare's needs and hardware capabilities,
and think the proposal is likely to be a big improvement on the current
system.

I worry slightly that the goal of having applications that are not aware
of the hardware they are running on will be difficult to meet.  My guess
is that the different hardware platforms will have so little overlap in
the functionality they support that to get best performance the
applications will still be heavily tailored to the subsets of the API
that the hardware they are using provides.  The discussion of filter
priorities is a good example of this: to get best performance the
application will want to use the hardware's filtering capabilities to do
all the heavy lifting, but the abilities of different NICs to support
particular priorities and combinations of filters will mean what works
very well for one NIC may well return "I can't do that" for another, and
vice versa.

One suggestion for extending the API further would be to allow it to
also describe transmit filters as well as receive filters.

There are also some filters that can prove very useful to our customers
that while they could be achieved through the careful insertion of
multiple filters with the right order and priorities, could be made more
application-friendly by having a more meaningful alias. For example:
  - multicast-mismatch (all multicast traffic that doesn't match another
filter and would otherwise be discarded)
  - unicast-mismatch (all unicast traffic that doesn't match another
filter and would otherwise be discarded)
  - all-multicast (all multicast traffic)
  - all-unicast (all unicast traffic)

Finally, I wonder if any thought has been given to dealing with
situations where there is a conflict between different virtual or
physical functions.  E.g. attempting to insert a MAC filter on one VF
that would steal traffic destined to a different VF.  Will it be up to
each driver to enforce these sorts of policies or will there be a
general vendor-neutral framework to deal with this?

I should reiterate that I think this will be a big improvement, so thank
you for proposing it.

Kieran

^ permalink raw reply	[flat|nested] 262+ messages in thread

end of thread, other threads:[~2017-11-14  3:23 UTC | newest]

Thread overview: 262+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-05 18:16 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Adrien Mazarguil
2016-07-07  7:14 ` Lu, Wenzhuo
2016-07-07 10:26   ` Adrien Mazarguil
2016-07-19  8:11     ` Lu, Wenzhuo
2016-07-19 13:12       ` Adrien Mazarguil
2016-07-20  2:16         ` Lu, Wenzhuo
2016-07-20 10:41           ` Adrien Mazarguil
2016-07-21  3:18             ` Lu, Wenzhuo
2016-07-21 12:47               ` Adrien Mazarguil
2016-07-22  1:38                 ` Lu, Wenzhuo
2016-07-07 23:15 ` Chandran, Sugesh
2016-07-08 13:03   ` Adrien Mazarguil
2016-07-11 10:42     ` Chandran, Sugesh
2016-07-13 20:03       ` Adrien Mazarguil
2016-07-15  9:23         ` Chandran, Sugesh
2016-07-15 10:02           ` Chilikin, Andrey
2016-07-18 13:26             ` Chandran, Sugesh
2016-07-15 15:04           ` Adrien Mazarguil
2016-07-18 13:26             ` Chandran, Sugesh
2016-07-18 15:00               ` Adrien Mazarguil
2016-07-20 16:32                 ` Chandran, Sugesh
2016-07-20 17:10                   ` Adrien Mazarguil
2016-07-21 11:06                     ` Chandran, Sugesh
2016-07-21 13:37                       ` Adrien Mazarguil
2016-07-22 16:32                         ` Chandran, Sugesh
2016-07-08 11:11 ` Liang, Cunming
2016-07-08 12:38   ` Bruce Richardson
2016-07-08 13:25   ` Adrien Mazarguil
2016-07-11  3:18     ` Liang, Cunming
2016-07-11 10:06       ` Adrien Mazarguil
2016-07-11 10:41 ` Jerin Jacob
2016-07-21 19:20   ` Adrien Mazarguil
2016-07-23 21:10     ` John Fastabend
2016-08-02 18:19       ` John Fastabend
2016-08-03 14:30         ` Adrien Mazarguil
2016-08-03 18:10           ` John Fastabend
2016-08-04 13:05             ` Adrien Mazarguil
2016-08-09 21:24               ` John Fastabend
2016-08-10 11:02                 ` Adrien Mazarguil
2016-08-10 16:35                   ` John Fastabend
2016-07-21  8:13 ` Rahul Lakkireddy
2016-07-21 17:07   ` Adrien Mazarguil
2016-07-25 11:32     ` Rahul Lakkireddy
2016-07-25 16:40       ` John Fastabend
2016-07-26 10:07         ` Rahul Lakkireddy
2016-08-03 16:44           ` Adrien Mazarguil
2016-08-03 19:11             ` John Fastabend
2016-08-04 13:24               ` Adrien Mazarguil
2016-08-09 21:47                 ` John Fastabend
2016-08-10 13:37                   ` Adrien Mazarguil
2016-08-10 16:46                     ` John Fastabend
2016-08-19 21:13           ` John Daley (johndale)
2016-08-19 19:32 ` [dpdk-dev] [RFC v2] " Adrien Mazarguil
2016-08-19 19:32   ` [dpdk-dev] [RFC v2] ethdev: introduce generic flow API Adrien Mazarguil
2016-08-20  7:00     ` Lu, Wenzhuo
2016-08-22 18:20     ` John Fastabend
2016-08-22 18:30   ` [dpdk-dev] [RFC v2] Generic flow director/filtering/classification API John Fastabend
2016-09-29 17:10   ` Adrien Mazarguil
2016-10-31  7:19     ` Zhang, Helin
2016-11-02 11:13       ` Adrien Mazarguil
2016-11-08  1:31         ` Zhang, Helin
2016-11-09 11:07           ` Adrien Mazarguil
2016-11-16 16:23   ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API Adrien Mazarguil
2016-11-18  6:36       ` Xing, Beilei
2016-11-18 10:28         ` Adrien Mazarguil
2016-11-30 17:47       ` Kevin Traynor
2016-12-01  8:36         ` Adrien Mazarguil
2016-12-02 21:06           ` Kevin Traynor
2016-12-06 18:11             ` Chandran, Sugesh
2016-12-08 15:09               ` Adrien Mazarguil
2016-12-09 12:18                 ` Chandran, Sugesh
2016-12-09 16:38                   ` Adrien Mazarguil
2016-12-12 10:20                     ` Chandran, Sugesh
2016-12-12 11:17                       ` Adrien Mazarguil
2016-12-08 17:07             ` Adrien Mazarguil
2016-12-14 11:48               ` Kevin Traynor
2016-12-14 13:54                 ` Adrien Mazarguil
2016-12-14 16:11                   ` Kevin Traynor
2016-12-08  9:00       ` Xing, Beilei
2016-12-08 14:50         ` Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 02/22] cmdline: add support for dynamic tokens Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 03/22] cmdline: add alignment constraint Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 04/22] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 05/22] app/testpmd: add flow command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 06/22] app/testpmd: add rte_flow integer support Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 07/22] app/testpmd: add flow list command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 08/22] app/testpmd: add flow flush command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 09/22] app/testpmd: add flow destroy command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 10/22] app/testpmd: add flow validate/create commands Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 11/22] app/testpmd: add flow query command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 12/22] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
2016-12-16  3:01       ` Pei, Yulong
2016-12-16  9:17         ` Adrien Mazarguil
2016-12-16 12:22           ` Xing, Beilei
2016-12-16 15:25             ` Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 13/22] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 14/22] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 15/22] app/testpmd: add item any to flow command Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 16/22] app/testpmd: add various items " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 17/22] app/testpmd: add item raw " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 18/22] app/testpmd: add items eth/vlan " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 19/22] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 20/22] app/testpmd: add L4 items " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 21/22] app/testpmd: add various actions " Adrien Mazarguil
2016-11-16 16:23     ` [dpdk-dev] [PATCH 22/22] app/testpmd: add queue " Adrien Mazarguil
2016-11-21  9:23     ` [dpdk-dev] [PATCH 00/22] Generic flow API (rte_flow) Nélio Laranjeiro
2016-11-28 10:03     ` Pei, Yulong
2016-12-01  8:39       ` Adrien Mazarguil
2016-12-02 16:58     ` Ferruh Yigit
2016-12-08 15:19       ` Adrien Mazarguil
2016-12-08 17:56         ` Ferruh Yigit
2016-12-15 12:20         ` Ferruh Yigit
2016-12-16  8:22           ` Adrien Mazarguil
2016-12-16 16:24     ` [dpdk-dev] [PATCH v2 00/25] " Adrien Mazarguil
2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 01/25] ethdev: introduce generic flow API Adrien Mazarguil
2017-10-23  8:53         ` Zhao1, Wei
2017-10-31 17:45           ` Adrien Mazarguil
2017-11-07  6:56             ` Zhao1, Wei
2017-11-14  3:23             ` Zhao1, Wei
2016-12-16 16:24       ` [dpdk-dev] [PATCH v2 02/25] doc: add rte_flow prog guide Adrien Mazarguil
2016-12-19 10:45         ` Mcnamara, John
2016-12-19 11:10           ` Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 03/25] doc: announce depreciation of legacy filter types Adrien Mazarguil
2016-12-19 10:47         ` Mcnamara, John
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 05/25] cmdline: add alignment constraint Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
2016-12-19  8:37         ` Xing, Beilei
2016-12-19 10:19           ` Adrien Mazarguil
2016-12-20  1:57             ` Xing, Beilei
2016-12-20  9:38               ` Adrien Mazarguil
2016-12-21  5:23                 ` Xing, Beilei
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 07/25] app/testpmd: add flow command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 09/25] app/testpmd: add flow list command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 10/25] app/testpmd: add flow flush command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 13/25] app/testpmd: add flow query command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 18/25] app/testpmd: add various items " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 19/25] app/testpmd: add item raw " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 22/25] app/testpmd: add L4 items " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 23/25] app/testpmd: add various actions " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 24/25] app/testpmd: add queue " Adrien Mazarguil
2016-12-16 16:25       ` [dpdk-dev] [PATCH v2 25/25] doc: describe testpmd " Adrien Mazarguil
2016-12-17 22:06       ` [dpdk-dev] [PATCH v2 00/25] Generic flow API (rte_flow) Olga Shern
2016-12-19 17:48       ` [dpdk-dev] [PATCH v3 " Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 01/25] ethdev: introduce generic flow API Adrien Mazarguil
2017-05-23  6:07           ` Zhao1, Wei
2017-05-23  9:50             ` Adrien Mazarguil
2017-05-24  3:32               ` Zhao1, Wei
2017-05-24  7:32                 ` Adrien Mazarguil
2017-05-24  8:46                   ` Zhao1, Wei
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 02/25] doc: add rte_flow prog guide Adrien Mazarguil
2016-12-20 16:30           ` Mcnamara, John
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 05/25] cmdline: add alignment constraint Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 07/25] app/testpmd: add flow command Adrien Mazarguil
2016-12-20 16:13           ` Ferruh Yigit
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
2016-12-19 17:48         ` [dpdk-dev] [PATCH v3 09/25] app/testpmd: add flow list command Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 10/25] app/testpmd: add flow flush command Adrien Mazarguil
2016-12-20  7:32           ` Zhao1, Wei
2016-12-20  9:45             ` Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 13/25] app/testpmd: add flow query command Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 18/25] app/testpmd: add various items " Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 19/25] app/testpmd: add item raw " Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
2016-12-20  9:21           ` Pei, Yulong
2016-12-20 10:02             ` Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 22/25] app/testpmd: add L4 items " Adrien Mazarguil
2016-12-20  9:14           ` Pei, Yulong
2016-12-20  9:50             ` Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 23/25] app/testpmd: add various actions " Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 24/25] app/testpmd: add queue " Adrien Mazarguil
2016-12-19 17:49         ` [dpdk-dev] [PATCH v3 25/25] doc: describe testpmd " Adrien Mazarguil
2016-12-19 20:44           ` Mcnamara, John
2016-12-20 10:51             ` Adrien Mazarguil
2016-12-20 17:06           ` Ferruh Yigit
2016-12-20 18:42         ` [dpdk-dev] [PATCH v4 00/25] Generic flow API (rte_flow) Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 01/25] ethdev: introduce generic flow API Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 02/25] doc: add rte_flow prog guide Adrien Mazarguil
2016-12-21 10:55             ` Mcnamara, John
2016-12-21 11:31               ` Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 03/25] doc: announce deprecation of legacy filter types Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 04/25] cmdline: add support for dynamic tokens Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 05/25] cmdline: add alignment constraint Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 06/25] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 07/25] app/testpmd: add flow command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 08/25] app/testpmd: add rte_flow integer support Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 09/25] app/testpmd: add flow list command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 10/25] app/testpmd: add flow flush command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 11/25] app/testpmd: add flow destroy command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 12/25] app/testpmd: add flow validate/create commands Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 13/25] app/testpmd: add flow query command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 14/25] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 15/25] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 16/25] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 17/25] app/testpmd: add item any to flow command Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 18/25] app/testpmd: add various items " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 19/25] app/testpmd: add item raw " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 20/25] app/testpmd: add items eth/vlan " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 21/25] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 22/25] app/testpmd: add L4 items " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 23/25] app/testpmd: add various actions " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 24/25] app/testpmd: add queue " Adrien Mazarguil
2016-12-20 18:42           ` [dpdk-dev] [PATCH v4 25/25] doc: describe testpmd " Adrien Mazarguil
2016-12-21 14:51           ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 01/26] ethdev: introduce generic flow API Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 02/26] doc: add rte_flow prog guide Adrien Mazarguil
2016-12-21 15:09               ` Mcnamara, John
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 03/26] doc: announce deprecation of legacy filter types Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 04/26] cmdline: add support for dynamic tokens Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 05/26] cmdline: add alignment constraint Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 06/26] app/testpmd: implement basic support for rte_flow Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 07/26] app/testpmd: add flow command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 08/26] app/testpmd: add rte_flow integer support Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 09/26] app/testpmd: add flow list command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 10/26] app/testpmd: add flow flush command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 11/26] app/testpmd: add flow destroy command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 12/26] app/testpmd: add flow validate/create commands Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 13/26] app/testpmd: add flow query command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 14/26] app/testpmd: add rte_flow item spec handler Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 15/26] app/testpmd: add rte_flow item spec prefix length Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 16/26] app/testpmd: add rte_flow bit-field support Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 17/26] app/testpmd: add item any to flow command Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 18/26] app/testpmd: add various items " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 19/26] app/testpmd: add item raw " Adrien Mazarguil
2017-05-11  6:53               ` Zhao1, Wei
2017-05-12  9:12                 ` Adrien Mazarguil
2017-05-16  5:05                   ` Zhao1, Wei
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 20/26] app/testpmd: add items eth/vlan " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 21/26] app/testpmd: add items ipv4/ipv6 " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 22/26] app/testpmd: add L4 items " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 23/26] app/testpmd: add various actions " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 24/26] app/testpmd: add queue " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 25/26] doc: describe testpmd " Adrien Mazarguil
2016-12-21 14:51             ` [dpdk-dev] [PATCH v5 26/26] app/testpmd: add protocol fields to " Adrien Mazarguil
2016-12-23  9:30             ` [dpdk-dev] [PATCH v5 00/26] Generic flow API (rte_flow) Thomas Monjalon
2016-12-21 16:19       ` [dpdk-dev] [PATCH v2 00/25] " Simon Horman
2016-12-22 12:48         ` Adrien Mazarguil
2017-01-04  9:53           ` Simon Horman
2017-01-04 18:12             ` Adrien Mazarguil
2017-01-04 19:34               ` John Fastabend
2016-08-01 15:08 [dpdk-dev] [RFC] Generic flow director/filtering/classification API Kieran Mansley
2016-08-03 15:19 ` Adrien Mazarguil
