DPDK patches and discussions
* [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item
@ 2021-09-22 18:04 Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 1/3] " Viacheslav Ovsiienko
                   ` (10 more replies)
  0 siblings, 11 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-09-22 18:04 UTC
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving quickly: network structures are getting
more and more complicated and new application areas are
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that
multiple new items have been introduced since the initial
release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols; for example, many hardware vendors are implementing
flexible and configurable network protocol parsers. As DPDK
developers, could we anticipate the near future in the same
fashion and introduce similar flexibility in the RTE Flow API?

Let's check what is already merged in our project: there is
the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement flow matching on the
header of some relatively new tunnel protocol, say the GENEVE
header with variable length options. On further consideration,
we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols like
the aforementioned GENEVE with variable length options using
the flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the existing drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new network
protocol with RTE Flows. What is given within the protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The intended way to operate with a flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the
    desired configuration according to the protocol
    specification. The call creates the flex item object over
    the specified ethernet device and prepares the PMD and
    underlying hardware to handle the flex item. On item
    creation, the PMD backing the specified ethernet device
    returns an opaque handle identifying the created object

  - application uses rte_flow_item_flex with the obtained handle
    in the flows. The values/masks to match with fields in the
    header are specified in the flex item per flow, as for regular
    items (except that the pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with the fields of the protocol the item was
configured for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
the same number. If byte boundary alignment is needed, an
application can use a dummy type field, which is just a gap
filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy: it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.
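
The pattern layout described above can be sketched in plain C.
This is only an illustration under stated assumptions: the
ex_field structure and both helpers are hypothetical names that
are not part of the proposed API, and only the field widths are
modeled (each field is placed right after the previous one).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor: only the width matters for the pattern layout. */
struct ex_field {
	uint32_t width_bits;
};

/* Bit position of field idx inside the concatenated pattern buffer. */
static uint32_t
ex_pattern_bit_pos(const struct ex_field *fields, uint32_t num, uint32_t idx)
{
	uint32_t pos = 0;
	uint32_t i;

	assert(idx <= num);
	/* Fields are concatenated in configuration order, without gaps. */
	for (i = 0; i < idx; i++)
		pos += fields[i].width_bits;
	return pos;
}

/* Pattern buffer length in bytes, rounded up to a whole byte. */
static uint32_t
ex_pattern_len_bytes(const struct ex_field *fields, uint32_t num)
{
	return (ex_pattern_bit_pos(fields, num, num) + 7) / 8;
}
```

For a 96-bit dummy field followed by a 32-bit field, the second
field starts at bit 96 and the pattern buffer is 16 bytes long,
so the whole header structure can serve as the pattern.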

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell driver and hardware what data should
be extracted from the packet and then presented to match in the
flows. Each field is a bit pattern. It has width, offset from
the header beginning, mode of offset calculation, and offset
related parameters.

The next header field is special: no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet. In other words, the next header offset
specifies the size of the header being parsed by the flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields represent the data to be sampled from the
packet and then matched with established flows.

Several methods are proposed to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*field_offset & offset_mask) << field_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*field_offset & offset_mask) << field_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

The offset mode list can be extended by vendors according to
hardware supported options.
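
The offset modes above can be modeled in software roughly as
follows. This is a sketch, not the driver implementation: the
EX_MODE_* names, ex_field_cfg and ex_field_bit_offset() are
hypothetical, and the indirect value is assumed to be already
read from the packet. It also assumes the shift applies to the
masked value only, which in C needs explicit parentheses because
"+" binds tighter than "<<".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local copies of the offset modes, for illustration only. */
enum ex_field_mode {
	EX_MODE_DUMMY = 0,
	EX_MODE_FIXED,
	EX_MODE_OFFSET,
	EX_MODE_BITMASK,
};

struct ex_field_cfg {
	enum ex_field_mode mode;
	uint32_t field_base;   /* fixed part of the offset, in bits */
	uint32_t offset_mask;  /* mask applied to the indirect value */
	uint32_t offset_shift; /* left shift applied after masking */
};

/* Count the bits set in v. */
static uint32_t
ex_bitcount32(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Resulting bit offset; indirect is the value of the indirect offset field. */
static uint32_t
ex_field_bit_offset(const struct ex_field_cfg *cfg, uint32_t indirect)
{
	uint32_t v = indirect & cfg->offset_mask;

	assert(cfg->offset_shift < 32);
	switch (cfg->mode) {
	case EX_MODE_OFFSET:
		return cfg->field_base + (v << cfg->offset_shift);
	case EX_MODE_BITMASK:
		return cfg->field_base +
		       (ex_bitcount32(v) << cfg->offset_shift);
	case EX_MODE_FIXED:
	case EX_MODE_DUMMY:
	default:
		return cfg->field_base;
	}
}
```

With a 4-bit length field counted in dwords (mask 0xF) and a
shift of 5 to convert dwords to bits, an indirect value of 0x45
yields a bit offset of 160, i.e. a 20-byte header.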

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. The flex item can follow multiple header
types, in which case multiple input links should be specified.
At flow creation time the item with one of the input link types
should precede the flex item and the driver will select the
correct flex item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with next protocol
identifier, and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in chained fashion - two or more flex items within the
same flow, which might be neighbors in the pattern - the flex
items reference each other. In this case, the item that occurs
first should be created with an empty output link list or with
a list including only existing items, and then the second flex
item should be created referencing the first flex item as
input arc.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items should not
overlap (be unique per field). In this case, the driver
will try to engage non-overlapping hardware resources
and provide independent handling of the fields with unique
indices. If the hint index is zero the driver assigns
resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* header length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .sample_num = 2,
    .input_link[0] = &link0,
    .input_num = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A, /* opaque handle returned by PMD */
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
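
How such a spec/mask pair is expected to behave can be modeled
in software as below. This is a hedged sketch rather than what
any PMD does: ex_flex_match() is a hypothetical helper, and the
"sampled" buffer stands for the configured fields extracted from
a packet and concatenated in configuration order. A pattern
shorter than the full configured length is treated as extended
with zeroes for both spec and mask, as the item documentation in
this series describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software model of a masked match over the flex pattern buffer.
 * sampled:   concatenated field bits taken from the packet (full_len bytes)
 * spec/mask: pattern buffers from the flow item (len bytes each)
 * Returns 1 on match, 0 otherwise.
 */
static int
ex_flex_match(const uint8_t *sampled, uint32_t full_len,
	      const uint8_t *spec, const uint8_t *mask, uint32_t len)
{
	uint32_t i;

	assert(len <= full_len);
	for (i = 0; i < len; i++)
		if ((sampled[i] & mask[i]) != (spec[i] & mask[i]))
			return 0;
	/* Bytes beyond len have an implicit zero mask and always match. */
	return 1;
}
```

With the patterns above, only bytes 12..15 (the crucial field)
carry a non-zero mask, so the contents of header_length,
specific0 and specific1 do not affect the result.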

7. Notes:
 - testpmd and mlx5 PMD parts are coming soon
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

Gregory Etelson (2):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API

Viacheslav Ovsiienko (1):
  ethdev: introduce configurable flexible item

 doc/guides/prog_guide/rte_flow.rst     |  24 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_ethdev.h                |   1 +
 lib/ethdev/rte_flow.c                  | 141 +++++++++++++--
 lib/ethdev/rte_flow.h                  | 228 +++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h           |  13 ++
 lib/ethdev/version.map                 |   5 +
 7 files changed, 406 insertions(+), 13 deletions(-)

-- 
2.18.1



* [dpdk-dev] [PATCH 1/3] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-09-22 18:04 ` Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 2/3] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-09-22 18:04 UTC
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving quickly: network structures are getting
more and more complicated and new application areas are
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that
multiple new items have been introduced since the initial
release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols; for example, many hardware vendors are implementing
flexible and configurable network protocol parsers. As DPDK
developers, could we anticipate the near future in the same
fashion and introduce similar flexibility in the RTE Flow API?

Let's check what is already merged in our project: there is
the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement flow matching on the
header of some relatively new tunnel protocol, say the GENEVE
header with variable length options. On further consideration,
we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols like
the aforementioned GENEVE with variable length options using
the flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the existing drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new network
protocol with RTE Flows. What is given within the protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The intended way to operate with a flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the
    desired configuration according to the protocol
    specification. The call creates the flex item object over
    the specified ethernet device and prepares the PMD and
    underlying hardware to handle the flex item. On item
    creation, the PMD backing the specified ethernet device
    returns an opaque handle identifying the created object

  - application uses rte_flow_item_flex with the obtained handle
    in the flows. The values/masks to match with fields in the
    header are specified in the flex item per flow, as for regular
    items (except that the pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with the fields of the protocol the item was
configured for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
the same number. If byte boundary alignment is needed, an
application can use a dummy type field, which is just a gap
filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy: it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell driver and hardware what data should
be extracted from the packet and then presented to match in the
flows. Each field is a bit pattern. It has width, offset from
the header beginning, mode of offset calculation, and offset
related parameters.

The next header field is special: no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet. In other words, the next header offset
specifies the size of the header being parsed by the flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields represent the data to be sampled from the
packet and then matched with established flows.

Several methods are proposed to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*field_offset & offset_mask) << field_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*field_offset & offset_mask) << field_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. The flex item can follow multiple header
types, in which case multiple input links should be specified.
At flow creation time the item with one of the input link types
should precede the flex item and the driver will select the
correct flex item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with next protocol
identifier, and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in chained fashion - two or more flex items within the
same flow, which might be neighbors in the pattern - the flex
items reference each other. In this case, the item that occurs
first should be created with an empty output link list or with
a list including only existing items, and then the second flex
item should be created referencing the first flex item as
input arc.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items should not
overlap (be unique per field). In this case, the driver
will try to engage non-overlapping hardware resources
and provide independent handling of the fields with unique
indices. If the hint index is zero the driver assigns
resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* 3 dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .sample_num = 2,
    .input_link[0] = &link0,
    .input_num = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A, /* opaque handle returned by PMD */
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  24 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_ethdev.h                |   1 +
 lib/ethdev/rte_flow.h                  | 228 +++++++++++++++++++++++++
 4 files changed, 260 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..628f30cea7 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,30 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches the network protocol header of a preconfigured format.
+The application describes the desired header structure, defines the header
+fields attributes and header relations with preceding and following
+protocols and configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call. The item handle is unique within
+  the device port, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  supposed to be appended with zeroes till the full configured item length.
+- ``pattern``: pattern to match. The protocol header fields are considered
+  as bit fields, all offsets and widths are expressed in bits. The pattern
+  is the buffer containing the bit concatenation of all the fields presented
+  at item configuration time, in the same order and same amount. The most
+  regular way is to define all the header fields in the flex item configuration
+  and directly use the header structure as pattern template, i.e. application
+  just can fill the header structures with desired match values and masks and
+  specify these structures as flex item pattern directly.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 68b6405ae0..0c80ab5232 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index bef24173cf..9188c92d9f 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -537,6 +537,7 @@ struct rte_eth_rss_conf {
 #define ETH_RSS_PPPOE		   (1ULL << 31)
 #define ETH_RSS_ECPRI		   (1ULL << 32)
 #define ETH_RSS_MPLS		   (1ULL << 33)
+#define ETH_RSS_FLEX		   (1ULL << 34)
 
 /*
  * We use the following macros to combine with above ETH_RSS_* for
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 70f455d47d..b589ec7cd0 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -573,6 +573,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,160 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset (this is dynamic one - can be calculated by several methods
+ * in runtime) from the header beginning.
+ *
+ * The pattern is concatenation of all bit fields configured at item creation
+ * by rte_flow_flex_item_create() exactly in the same order and amount, no
+ * fields can be omitted or swapped. The dummy mode field can be used for
+ * pattern byte boundary alignment, least significant bit in byte goes first.
+ * Only the fields specified in sample_data configuration parameter participate
+ * in pattern construction.
+ *
+ * If pattern length is smaller than configured fields overall length it is
+ * extended with trailing zeroes, both for value and mask.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	FLEX_TUNNEL_MODE_FIRST = 0, /**< First item occurrence. */
+	FLEX_TUNNEL_MODE_OUTER = 1, /**< Outer item. */
+	FLEX_TUNNEL_MODE_INNER = 2  /**< Inner item. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Match field size in bits. */
+	int32_t field_base; /**< Match field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint16_t tunnel_count:2; /**< 0-first occurrence, 1-outer, 2-inner.*/
+	uint16_t rss_hash:1; /**< Field participates in RSS hash calculation. */
+	uint16_t field_id; /**< device hint, for flows with multiple items. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+	/**
+	 * Specifies whether flex item represents tunnel protocol
+	 */
+	bool tunnel;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_flex_item. There can be multiple
+	 * fields descriptors, the number should be specified by sample_num.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t sample_num;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specified the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match in case if there is the match with one of input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t input_num;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t output_num;
+};
+
 /**
  * Action types.
  *
@@ -4288,6 +4451,71 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
+/**
+ * Modify the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[in] conf
+ *   Item new configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_update(uint16_t port_id,
+			  const struct rte_flow_item_flex_handle *handle,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH 2/3] ethdev: support flow elements with variable length
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 1/3] " Viacheslav Ovsiienko
@ 2021-09-22 18:04 ` Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 3/3] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-09-22 18:04 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new RTE flow item type with a variable length pattern that does not
fit the RAW item meta description cannot use the RAW item.
For example, a new flow item that references a 64-bit PMD handle
cannot be described by the RAW item.

The patch allows RTE conv helper functions to process custom flow
items with variable length pattern.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 lib/ethdev/rte_flow.c | 68 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 55 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..fe199eaeb3 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,54 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * allow PMD private flow item
+	 * see 5d1bff8fe2
+	 * "ethdev: allow negative values in flow rule types"
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -107,8 +148,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +577,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +680,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH 3/3] ethdev: implement RTE flex item API
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 1/3] " Viacheslav Ovsiienko
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 2/3] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-09-22 18:04 ` Viacheslav Ovsiienko
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-09-22 18:04 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flex item API was introduced in
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in port hardware and to offload flows that match
the custom header elements.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 73 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h | 13 +++++++
 lib/ethdev/version.map       |  5 +++
 3 files changed, 91 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index fe199eaeb3..74f74d6009 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -80,6 +80,19 @@ rte_flow_conv_copy(void *buf, const void *data, const size_t size,
 		.desc_fn = fn,               \
 	}
 
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Information about known flow pattern items. */
 static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(END, 0),
@@ -141,6 +154,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -1308,3 +1323,61 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
+
+int
+rte_flow_flex_item_update(uint16_t port_id,
+			  const struct rte_flow_item_flex_handle *handle,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_update))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_update(dev, handle, conf, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..aed2ac03ad 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,19 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
+	int (*flex_item_update)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..994c57f4b2 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,11 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
+	rte_flow_flex_item_update;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (2 preceding siblings ...)
  2021-09-22 18:04 ` [dpdk-dev] [PATCH 3/3] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-01 19:34 ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 01/14] " Viacheslav Ovsiienko
                     ` (13 more replies)
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (6 subsequent siblings)
  10 siblings, 14 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast: network structures are getting
more and more complicated and new application areas keep
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and eventually implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that
multiple new items have been introduced since the initial
release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols
like the aforementioned GENEVE with variable length options
using the flow raw item becomes very complicated and would
require multiple flows and multiple raw items chained in the
same flow (by the way, no support for chained raw items was
found in the implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new
network protocol with RTE Flows. The protocol specification
provides:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The intended way to operate with a flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification. The call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On item creation, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the object that has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.
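
As a rough illustration of the paragraph above, the pattern buffer
length can be derived from the configured field widths. This is only
a sketch: struct flex_field_width and flex_pattern_bytes() are
hypothetical names invented for this example and are not part of the
proposed API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down field descriptor: only the width matters here. */
struct flex_field_width {
	uint32_t size_bits; /* field width in bits, dummy fields included */
};

/* Sum the widths of all configured fields and round up to whole bytes. */
static uint32_t
flex_pattern_bytes(const struct flex_field_width *fields, size_t num)
{
	uint64_t bits = 0;
	size_t i;

	for (i = 0; i < num; i++)
		bits += fields[i].size_bits;
	return (uint32_t)((bits + 7) / 8);
}
```

For a 96-bit dummy field followed by a 32-bit field this gives
16 bytes, and a 3-bit plus a 6-bit field need 2 bytes.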

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.
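
To show why the explicit length helps with copy operations, here is
a minimal sketch of a flat copy: the fixed-size structure is followed
directly by the pattern bytes and the pattern pointer is re-targeted
into the copy. struct flex_item is a local mirror of the layout above
and flex_item_flat_copy() is a hypothetical helper; the conversion
code in the actual patches may behave differently (for example, this
sketch refuses a too small buffer instead of truncating).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the item layout shown above. */
struct flex_item {
	void *handle;
	uint32_t length;
	const uint8_t *pattern;
};

/*
 * Copy src into buf as a flat object: the fixed-size struct first, the
 * pattern bytes right after it. Returns the number of bytes a complete
 * copy needs; nothing is written unless buf can hold all of them.
 * buf is assumed to be suitably aligned for struct flex_item.
 */
static size_t
flex_item_flat_copy(void *buf, size_t size, const struct flex_item *src)
{
	size_t need = sizeof(*src) + src->length;
	struct flex_item *dst = buf;

	if (buf == NULL || size < need)
		return need;
	memcpy(dst, src, sizeof(*src));
	if (src->length != 0)
		memcpy(dst + 1, src->pattern, src->length);
	/* The copy must not keep pointing into the caller's buffer. */
	dst->pattern = (const uint8_t *)(dst + 1);
	return need;
}
```

A caller can pass a NULL buffer first to learn the required size and
then call again with a buffer of that size.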

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell driver and hardware what data should
be extracted from the packet and then presented to match in the
flows. Each field is a bit pattern. It has width, offset from
the header beginning, mode of offset calculation, and offset
related parameters.

The next header field is special: no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet. In other words, the next header offset
specifies the size of the header being parsed by the flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

The offset mode list can be extended by vendors according to
hardware supported options.
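
The offset modes listed above can be sketched in plain C as follows.
This is not the PMD code: enum flex_mode, count_one_bits() and
flex_field_bit_offset() are hypothetical names, the indirect value is
assumed to be already read from the packet, the shift is assumed to
be non-negative, and the shift is assumed to apply to the masked
value before field_base is added.

```c
#include <stdint.h>

/* Hypothetical mirror of the offset calculation modes. */
enum flex_mode { FLEX_DUMMY, FLEX_FIXED, FLEX_OFFSET, FLEX_BITMASK };

/* Portable population count (number of one bits). */
static uint32_t
count_one_bits(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Return the field offset in bits from the header beginning,
 * or -1 for dummy fields, which are not located in the packet.
 */
static int64_t
flex_field_bit_offset(enum flex_mode mode, int32_t field_base,
		      uint32_t indirect, uint32_t offset_mask,
		      uint32_t offset_shift)
{
	uint64_t masked = indirect & offset_mask;

	switch (mode) {
	case FLEX_FIXED:
		return field_base;
	case FLEX_OFFSET:
		return field_base + (int64_t)(masked << offset_shift);
	case FLEX_BITMASK:
		return field_base + (int64_t)
		       ((uint64_t)count_one_bits((uint32_t)masked) << offset_shift);
	case FLEX_DUMMY:
	default:
		return -1;
	}
}
```

For instance, an IPv4 first byte of 0x45 with mask 0x0F and shift 5
(dwords to bits) yields 5 * 32 = 160 bits, the size of an IPv4 header
without options.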

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. The flex item can follow multiple header
types, in which case multiple input links should be specified.
At flow creation time the item with one of the input link types
should precede the flex item and the driver will select the correct
flex item settings, depending on the actual flow pattern.
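
Conceptually, an input link acts as a spec/mask condition on the
preceding header. A minimal software sketch for the UDP destination
port case is shown below; udp_dst_port() and input_link_matches() are
hypothetical helpers working on a raw 8-byte UDP header, while the
real check is expected to happen in the hardware parser.

```c
#include <stdint.h>

/* Read the big-endian destination port from a raw UDP header. */
static uint16_t
udp_dst_port(const uint8_t *udp)
{
	return (uint16_t)((udp[2] << 8) | udp[3]);
}

/* Non-zero if the masked destination port equals the masked spec value. */
static int
input_link_matches(const uint8_t *udp, uint16_t spec, uint16_t mask)
{
	return (udp_dst_port(udp) & mask) == (spec & mask);
}
```

With spec 6081 and mask 0xFFFF only packets sent to the GENEVE port
would take the link and continue into the flex item parsing.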

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with next protocol
identifier, and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with
flex items in chained fashion - two or more flex items within
the same flow, possibly neighbors in the pattern - the
flex items reference each other. In this case,
the item that occurs first should be created with an empty
output link list or with a list including existing items,
and then the second flex item should be created referencing
the first flex item as its input arc.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items should not
overlap (be unique per field). For this case, the driver
will try to engage not overlapping hardware resources
and provide independent handling of the fields with unique
indices. If the hint index is zero the driver assigns
resources on its own.
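
An application could verify the non-overlap rule before creating the
items. The check below is only a sketch under the assumption that the
hint indices of each item are collected into plain arrays;
flex_hints_disjoint() is a hypothetical helper, and zero is treated
as "no hint" as described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if no non-zero hint index of item A is also used by
 * item B. Zero means the driver picks the resources on its own.
 */
static bool
flex_hints_disjoint(const uint16_t *a, size_t na,
		    const uint16_t *b, size_t nb)
{
	size_t i, j;

	for (i = 0; i < na; i++) {
		if (a[i] == 0)
			continue;
		for (j = 0; j < nb; j++)
			if (a[i] == b[j])
				return false;
	}
	return true;
}
```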

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* header length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field samples[] = {
    {
      .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
      .field_size = 96,                /* Skip three dwords from the beginning */
    },
    {
      .field_mode = FIELD_MODE_FIXED,
      .field_size = 32,       /* Field size is one dword */
      .field_base = 96,       /* Skip three dwords from the beginning */
    },
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample_data = samples, /* Array of field descriptors */
    .sample_num = 2,
    .input_link = &link0,   /* Array with a single input link */
    .input_num = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A, /* Handle returned by the PMD */
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

7. Notes:
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/
 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (8):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API
  app/testpmd: add jansson library
  app/testpmd: add flex item CLI commands
  common/mlx5: extend flex parser capabilities
  common/mlx5: fix flex parser DevX creation routine
  net/mlx5: add flex parser DevX object management
  net/mlx5: handle flex item in flows

Viacheslav Ovsiienko (6):
  ethdev: introduce configurable flexible item
  common/mlx5: refactor HCA attributes query
  net/mlx5: update eCPRI flex parser structures
  net/mlx5: add flex item API
  net/mlx5: translate flex item configuration
  net/mlx5: translate flex item pattern into matcher

 app/test-pmd/cmdline.c                      |    2 +
 app/test-pmd/cmdline_flow.c                 |  801 +++++++++++-
 app/test-pmd/meson.build                    |    5 +
 app/test-pmd/testpmd.c                      |    1 -
 app/test-pmd/testpmd.h                      |   18 +
 doc/guides/prog_guide/rte_flow.rst          |   24 +
 doc/guides/rel_notes/release_21_11.rst      |    7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  119 ++
 drivers/common/mlx5/mlx5_devx_cmds.c        |  239 ++--
 drivers/common/mlx5/mlx5_devx_cmds.h        |   65 +-
 drivers/common/mlx5/mlx5_prm.h              |   50 +-
 drivers/net/mlx5/linux/mlx5_os.c            |   15 +-
 drivers/net/mlx5/meson.build                |    1 +
 drivers/net/mlx5/mlx5.c                     |   15 +-
 drivers/net/mlx5/mlx5.h                     |   78 +-
 drivers/net/mlx5/mlx5_flow.c                |   49 +
 drivers/net/mlx5/mlx5_flow.h                |   26 +-
 drivers/net/mlx5/mlx5_flow_dv.c             |   73 +-
 drivers/net/mlx5/mlx5_flow_flex.c           | 1262 +++++++++++++++++++
 lib/ethdev/rte_ethdev.h                     |    1 +
 lib/ethdev/rte_flow.c                       |  141 ++-
 lib/ethdev/rte_flow.h                       |  228 ++++
 lib/ethdev/rte_flow_driver.h                |   13 +
 lib/ethdev/version.map                      |    5 +
 24 files changed, 3096 insertions(+), 142 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_flow_flex.c

-- 
2.18.1



* [dpdk-dev] [PATCH v2 01/14] ethdev: introduce configurable flexible item
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-07 11:08     ` Ori Kam
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 02/14] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, supporting tunnel protocols such as the
aforementioned GENEVE with variable-length options by means of
the raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no existing driver was found to support
chained raw items).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the
    configuration derived from the protocol specification. The
    call creates the flex item object on the specified ethernet
    device and prepares the PMD and underlying hardware to handle
    the flex item. The PMD backing the specified ethernet device
    returns an opaque handle identifying the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release(), part of the
    ethernet device API, which destroys the flex item object in
    the PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much more suitable
for the application to maintain a single structure within a
single pattern buffer.
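
The bit concatenation rule can be illustrated with a small helper
that derives the pattern buffer length and the position of each
field inside it. This is only a sketch under stated assumptions:
the function names are hypothetical and are not part of the
proposed API, and the field widths are passed as a plain array in
configuration order, dummy fields included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: total pattern length in bytes for fields
 * concatenated bit by bit in configuration order. field_bits[]
 * holds each field width in bits, dummy fields included.
 */
static uint32_t
flex_pattern_bytes(const uint32_t *field_bits, size_t num)
{
	uint64_t total = 0;
	size_t i;

	assert(field_bits != NULL || num == 0);
	for (i = 0; i < num; i++)
		total += field_bits[i];
	/* Round the bit count up to whole bytes. */
	return (uint32_t)((total + 7) / 8);
}

/*
 * Hypothetical helper: bit offset of field idx inside the pattern
 * buffer, i.e. the sum of the widths of all preceding fields.
 */
static uint64_t
flex_pattern_bit_offset(const uint32_t *field_bits, size_t idx)
{
	uint64_t off = 0;
	size_t i;

	assert(field_bits != NULL || idx == 0);
	for (i = 0; i < idx; i++)
		off += field_bits[i];
	return off;
}
```

For a 96-bit dummy field followed by a 32-bit sample field, the
pattern is 16 bytes long and the sample field starts at bit 96 of
the pattern buffer.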

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell driver and hardware what data should
be extracted from the packet and then presented to match in the
flows. Each field is a bit pattern. It has width, offset from
the header beginning, mode of offset calculation, and offset
related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet. In other words, the next header offset
specifies the size of the header being parsed by the flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from another header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*field_offset & offset_mask) << field_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from another header field (indirect offset field), the latter
    is considered as a bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*field_offset & offset_mask) << field_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

The offset mode list can be extended by vendors according to
hardware supported options.
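
The offset modes listed above can be sketched in plain C as
follows. This is an illustration only, not the PMD implementation:
the enum, structure and function names are hypothetical, the
indirect offset value is assumed to be already read from the
packet in host byte order, and the shift is assumed to apply to
the masked value only (in C the "+" operator binds tighter than
"<<", so the parentheses are needed).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset calculation modes. */
enum flex_sketch_mode {
	FLEX_SKETCH_DUMMY,
	FLEX_SKETCH_FIXED,
	FLEX_SKETCH_OFFSET,
	FLEX_SKETCH_BITMASK,
};

/* Hypothetical subset of a field descriptor, offsets in bits. */
struct flex_sketch_field {
	enum flex_sketch_mode mode;
	int32_t field_base;    /* constant part of the offset */
	uint32_t offset_mask;  /* mask applied to the indirect value */
	uint32_t offset_shift; /* left shift applied after masking */
};

/* Count the one bits in v without compiler builtins. */
static uint32_t
flex_sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/*
 * Resolve the bit offset of a field. indirect is the value of the
 * indirect offset field; it is ignored for fixed and dummy fields.
 */
static int64_t
flex_sketch_offset(const struct flex_sketch_field *f, uint32_t indirect)
{
	uint32_t v = indirect & f->offset_mask;

	assert(f->offset_shift < 32);
	switch (f->mode) {
	case FLEX_SKETCH_OFFSET:
		return f->field_base + ((int64_t)v << f->offset_shift);
	case FLEX_SKETCH_BITMASK:
		return f->field_base +
		       ((int64_t)flex_sketch_bitcount(v) << f->offset_shift);
	case FLEX_SKETCH_FIXED:
	case FLEX_SKETCH_DUMMY:
	default:
		/* No indirect part: the base is the whole offset. */
		return f->field_base;
	}
}
```

For example, with field_base 0, offset_mask 0xF and a shift of 5
(dwords to bits, an assumption of this sketch), the first IPv4
header byte 0x45 resolves to 160 bits, which is the usual 20-byte
header.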

The input link configuration section tells the driver after
what protocols and under what conditions the flex item can follow.
The input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. The flex item can follow multiple header
types, in which case multiple input links should be specified.
At flow creation time an item with one of the input link types
should precede the flex item and the driver will select the correct
flex item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with next protocol
identifier, and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items in
a chained fashion (two or more flex items within the same flow,
possibly adjacent in the pattern), the flex items reference
each other. In this case, the item that occurs first should be
created with an empty output link list or with a list including
only existing items, and then the second flex item should be
created referencing the first flex item as its input arc.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items should not
overlap (be unique per field). For this case, the driver
will try to engage not overlapping hardware resources
and provide independent handling of the fields with unique
indices. If the hint index is zero the driver assigns
resources on its own.
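
An application may want to verify the hint indices before creating
the items. The check below is a minimal sketch under the
assumption that the hints of each item are collected into a plain
array; the function name is hypothetical and not part of the
proposed API. A zero hint means the driver assigns resources on
its own, so it never conflicts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: return true if no non-zero hint index
 * appears in both arrays.
 */
static bool
flex_hints_disjoint(const uint16_t *a, size_t na,
		    const uint16_t *b, size_t nb)
{
	size_t i, j;

	assert(a != NULL || na == 0);
	assert(b != NULL || nb == 0);
	for (i = 0; i < na; i++) {
		if (a[i] == 0)
			continue; /* driver-assigned, never conflicts */
		for (j = 0; j < nb; j++)
			if (a[i] == b[j])
				return false;
	}
	return true;
}
```

The quadratic scan is deliberate: the number of sample fields per
item is expected to be small, so no sorting or hashing is needed.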

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .sample_num = 2,
    .input_link[0] = &link0,
    .input_num = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
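
The effect of the spec and mask patterns can be expressed as a
plain masked comparison. The function below is a semantic sketch
only, with a hypothetical name: the actual matching is performed
by the hardware, and this code merely shows which packet bits take
part in the decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the match: data matches if it is equal to
 * spec in every bit position where mask has a one bit.
 */
static bool
flex_masked_match(const uint8_t *data, const uint8_t *spec,
		  const uint8_t *mask, size_t len)
{
	size_t i;

	assert(data != NULL && spec != NULL && mask != NULL);
	for (i = 0; i < len; i++)
		if ((data[i] ^ spec[i]) & mask[i])
			return false;
	return true;
}
```

With the patterns above only bytes 12 to 15 of the header (the
crucial field) carry a non-zero mask, so the header_length,
specific0 and specific1 values in the packet do not affect the
result.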

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  24 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_ethdev.h                |   1 +
 lib/ethdev/rte_flow.h                  | 228 +++++++++++++++++++++++++
 4 files changed, 260 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..628f30cea7 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,30 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches with the network protocol header of preliminary configured format.
+The application describes the desired header structure, defines the header
+fields attributes and header relations with preceding and following
+protocols and configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call. The item handle is unique within
+  the device port, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  supposed to be appended with zeroes till the full configured item length.
+- ``pattern``: pattern to match. The protocol header fields are considered
+  as bit fields, all offsets and widths are expressed in bits. The pattern
+  is the buffer containing the bit concatenation of all the fields presented
+  at item configuration time, in the same order and same amount. The most
+  regular way is to define all the header fields in the flex item configuration
+  and directly use the header structure as pattern template, i.e. application
+  just can fill the header structures with desired match values and masks and
+  specify these structures as flex item pattern directly.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 73e377a007..170797f9e9 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index afdc53b674..e9ad7673e9 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -558,6 +558,7 @@ struct rte_eth_rss_conf {
  * it takes the reserved value 0 as input for the hash function.
  */
 #define ETH_RSS_L4_CHKSUM          (1ULL << 35)
+#define ETH_RSS_FLEX		   (1ULL << 36)
 
 /*
  * We use the following macros to combine with above ETH_RSS_* for
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..eccb1e1791 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -574,6 +574,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,160 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset (this is dynamic one - can be calculated by several methods
+ * in runtime) from the header beginning.
+ *
+ * The pattern is concatenation of all bit fields configured at item creation
+ * by rte_flow_flex_item_create() exactly in the same order and amount, no
+ * fields can be omitted or swapped. The dummy mode field can be used for
+ * pattern byte boundary alignment, least significant bit in byte goes first.
+ * Only the fields specified in sample_data configuration parameter participate
+ * in pattern construction.
+ *
+ * If pattern length is smaller than configured fields overall length it is
+ * extended with trailing zeroes, both for value and mask.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*field_offset & offset_mask) << field_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*field_offset & offset_mask) << field_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	FLEX_TUNNEL_MODE_FIRST = 0, /**< First item occurrence. */
+	FLEX_TUNNEL_MODE_OUTER = 1, /**< Outer item. */
+	FLEX_TUNNEL_MODE_INNER = 2  /**< Inner item. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Match field size in bits. */
+	int32_t field_base; /**< Match field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint16_t tunnel_count:2; /**< 0-first occurrence, 1-outer, 2-inner.*/
+	uint16_t rss_hash:1; /**< Field participates in RSS hash calculation. */
+	uint16_t field_id; /**< device hint, for flows with multiple items. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+	/**
+	 * Specifies whether flex item represents tunnel protocol
+	 */
+	bool tunnel;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_flex_item. There can be multiple
+	 * fields descriptors, the number should be specified by sample_num.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t sample_num;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specified the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match in case if there is the match with one of input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t input_num;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t output_num;
+};
+
 /**
  * Action types.
  *
@@ -4288,6 +4451,71 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
+/**
+ * Modify the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[in] conf
+ *   Item new configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_update(uint16_t port_id,
+			  const struct rte_flow_item_flex_handle *handle,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH v2 02/14] ethdev: support flow elements with variable length
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 01/14] " Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 03/14] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new RTE flow item type with a variable length pattern that does
not fit the RAW item meta description cannot use the RAW item.
For example, a new flow item that references a 64-bit PMD handle
cannot be described by the RAW item.

The patch allows RTE conv helper functions to process custom flow
items with variable length pattern.
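
The general technique, a fixed-size part followed by a
variable-length payload copied in one call, can be sketched as
below. This is a simplified illustration and not the code of this
patch: the structure and function names are hypothetical, and
unlike rte_flow_conv_copy() the sketch writes nothing when the
destination is too small, so a zero size acts as a pure size
query.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical item: fixed part plus an out-of-line pattern. */
struct var_item {
	uint32_t length;        /* pattern length in bytes */
	const uint8_t *pattern; /* variable-length payload */
};

/*
 * Copy src into dst, placing the pattern right after the fixed
 * part and pointing the copied item at its own payload. Returns
 * the number of bytes required; nothing is written unless size
 * covers the whole item.
 */
static size_t
var_item_copy(void *dst, size_t size, const struct var_item *src)
{
	struct var_item *out = dst;
	size_t need;

	assert(src != NULL);
	need = sizeof(*src) + src->length;
	if (dst == NULL || size < need)
		return need;
	*out = *src;
	if (src->length != 0 && src->pattern != NULL) {
		memcpy(out + 1, src->pattern, src->length);
		out->pattern = (const uint8_t *)(out + 1);
	} else {
		out->pattern = NULL;
	}
	return need;
}
```

A caller would typically invoke the function once with a NULL
destination to learn the required size, allocate a suitably
aligned buffer, and call it again to perform the copy.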

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 lib/ethdev/rte_flow.c | 68 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 55 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..fe199eaeb3 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,54 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * allow PMD private flow item
+	 * see 5d1bff8fe2
+	 * "ethdev: allow negative values in flow rule types"
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -107,8 +148,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +577,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +680,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v2 03/14] ethdev: implement RTE flex item API
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 01/14] " Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 02/14] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 04/14] app/testpmd: add jansson library Viacheslav Ovsiienko
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

The RTE flex item API was introduced in the
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in port hardware and to offload flows that match
the custom header elements.
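
The wrappers in this patch follow the usual rte_flow dispatch pattern: look
up the driver callback and fail with ENOTSUP when the driver does not
provide it. A minimal sketch of that pattern follows. It is self-contained
and does not use the DPDK headers: the types flex_conf, flex_handle and
flow_ops and the toy driver are hypothetical stand-ins, and a plain int
replaces struct rte_flow_error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DPDK types; not the real definitions. */
struct flex_conf { int dummy; };
struct flex_handle { int id; };

/* Driver callback table; a NULL entry means "not implemented". */
struct flow_ops {
	struct flex_handle *(*flex_item_create)(const struct flex_conf *conf,
						int *err);
	int (*flex_item_release)(const struct flex_handle *handle, int *err);
};

/* Generic wrapper: fail with ENOTSUP if the driver has no create op. */
static struct flex_handle *
flex_item_create(const struct flow_ops *ops, const struct flex_conf *conf,
		 int *err)
{
	if (ops == NULL || ops->flex_item_create == NULL) {
		*err = ENOTSUP;
		return NULL;
	}
	return ops->flex_item_create(conf, err);
}

/* Generic wrapper: fail with ENOTSUP if the driver has no release op. */
static int
flex_item_release(const struct flow_ops *ops,
		  const struct flex_handle *handle, int *err)
{
	if (ops == NULL || ops->flex_item_release == NULL) {
		*err = ENOTSUP;
		return -ENOTSUP;
	}
	return ops->flex_item_release(handle, err);
}

/* Toy driver that implements creation only. */
static struct flex_handle toy_handle = { .id = 1 };

static struct flex_handle *
toy_create(const struct flex_conf *conf, int *err)
{
	(void)conf;
	*err = 0;
	return &toy_handle;
}

static const struct flow_ops toy_ops = { .flex_item_create = toy_create };
```

With toy_ops, creation succeeds and release reports ENOTSUP, which is how an
application can detect that a PMD lacks flex item support.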

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 73 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h | 13 +++++++
 lib/ethdev/version.map       |  5 +++
 3 files changed, 91 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index fe199eaeb3..74f74d6009 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -80,6 +80,19 @@ rte_flow_conv_copy(void *buf, const void *data, const size_t size,
 		.desc_fn = fn,               \
 	}
 
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Information about known flow pattern items. */
 static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(END, 0),
@@ -141,6 +154,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -1308,3 +1323,61 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
+
+int
+rte_flow_flex_item_update(uint16_t port_id,
+			  const struct rte_flow_item_flex_handle *handle,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_update))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_update(dev, handle, conf, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..aed2ac03ad 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,19 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
+	int (*flex_item_update)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..994c57f4b2 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,11 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
+	rte_flow_flex_item_update;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v2 04/14] app/testpmd: add jansson library
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 03/14] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 05/14] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Testpmd interactive mode provides a CLI to configure the application.
Testpmd reads CLI commands and parameters from STDIN and converts
the input into C objects with its internal parser.
The patch adds an optional jansson dependency to testpmd.
With jansson, testpmd can read input in JSON format from STDIN or from
an input file and convert it into C objects using jansson library calls.
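
Because the dependency is optional, code that uses jansson has to be guarded
by RTE_HAS_JANSSON and needs a fallback. A minimal sketch of such a guard
follows. The helper read_field_size() and the choice of the "field_size" key
are illustrative assumptions, not part of the patch; the jansson calls used
(json_load_file, json_object_get, json_is_integer, json_integer_value,
json_decref) are the regular library API.

```c
#include <errno.h>
#include <stdio.h>

#ifdef RTE_HAS_JANSSON
#include <jansson.h>
#endif

/*
 * Read the integer member "field_size" from the top-level JSON object
 * stored in the file `path`.
 * Returns 0 on success, -EINVAL on bad input, -ENOTSUP without jansson.
 */
static int
read_field_size(const char *path, long long *out)
{
#ifdef RTE_HAS_JANSSON
	json_error_t jerr;
	json_t *root = json_load_file(path, 0, &jerr);
	json_t *val;
	int ret = -EINVAL;

	if (root == NULL) {
		printf("Bad JSON file \"%s\": %s\n", path, jerr.text);
		return -EINVAL;
	}
	/* Borrowed reference; NULL if root is not an object or key is absent. */
	val = json_object_get(root, "field_size");
	if (json_is_integer(val)) {
		*out = json_integer_value(val);
		ret = 0;
	}
	json_decref(root);
	return ret;
#else
	(void)path;
	(void)out;
	printf("no JSON library\n");
	return -ENOTSUP;
#endif
}
```

Built without RTE_HAS_JANSSON the helper compiles to the stub branch, so the
rest of the application does not need its own conditional code.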

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 app/test-pmd/meson.build | 5 +++++
 app/test-pmd/testpmd.h   | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..3a8babd604 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 5863b2f43f..876a341cf0 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
-- 
2.18.1



* [dpdk-dev] [PATCH v2 05/14] app/testpmd: add flex item CLI commands
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 04/14] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 06/14] common/mlx5: refactor HCA attributes query Viacheslav Ovsiienko
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with a fixed number of
supported network protocols. If an application must work with a
protocol that is not supported by the port hardware by default, it
can try to add the new protocol to the port hardware.

A flex item, or flex parser, is port infrastructure that allows an
application to add support for a custom network header and to
offload flows that match the header elements.

An application must complete the following tasks to create a flow
rule that matches a custom header:

1. Create a flex item object in port hardware.
The application must provide the custom header configuration to the PMD.
The PMD will use that configuration to create a flex item object in
port hardware.

2. Create flex patterns to match. A flex pattern has spec and mask
components, like a regular flow item. Combined, spec and mask
can target a unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handle value.

3. Create a flow rule with a flex flow item that references
the flex pattern.
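
The way spec and mask combine in step 2 can be illustrated with a small
software model. This is only a sketch of the usual rte_flow spec/mask
semantics, assuming byte-wise comparison; the function masked_match() is
hypothetical, and real matching is done by the port hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when every bit selected by `mask` has the same value in
 * `data` and `spec`. Bits cleared in `mask` are "don't care".
 */
static bool
masked_match(const uint8_t *data, const uint8_t *spec,
	     const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* XOR exposes differing bits; the mask keeps the relevant ones. */
		if ((data[i] ^ spec[i]) & mask[i])
			return false;
	}
	return true;
}
```

With spec {0x12, 0x30} and a full mask {0xff, 0xff} only one data sequence
matches. Relaxing the mask to {0xff, 0xf0} makes the low nibble of the
second byte "don't care", so sixteen data sequences match the same pattern.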

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...

The patch works with the jansson library API.
Jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 801 +++++++++++++++++++-
 app/test-pmd/testpmd.c                      |   1 -
 app/test-pmd/testpmd.h                      |  15 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 5 files changed, 936 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index a9efd027c3..a673e6ef08 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17822,6 +17822,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bb22294dd3..8817b4e210 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,13 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments. */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_MODIFY,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -306,6 +315,9 @@ enum index {
 	ITEM_POL_PORT,
 	ITEM_POL_METER,
 	ITEM_POL_POLICY,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -844,6 +856,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments. */
 	} args; /**< Command arguments. */
 };
 
@@ -871,6 +888,14 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_MODIFY,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -1000,6 +1025,7 @@ static const enum index next_item[] = {
 	ITEM_GENEVE_OPT,
 	ITEM_INTEGRITY,
 	ITEM_CONNTRACK,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1368,6 +1394,13 @@ static const enum index item_integrity_lv[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1724,6 +1757,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1840,6 +1876,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -1904,6 +1942,19 @@ static int comp_set_modify_field_op(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
 static int comp_set_modify_field_id(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static void flex_item_create(portid_t port_id, uint16_t flex_id,
+			     const char *filename);
+static void flex_item_modify(portid_t port_id, uint16_t flex_id,
+			     const char *filename);
+static void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+
+static struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+static struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -2040,6 +2091,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2056,7 +2121,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2168,6 +2234,52 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_MODIFY] = {
+		.name = "modify",
+		.help = "flex item modify",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3608,6 +3720,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_conntrack, flags)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -6999,6 +7132,44 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_MODIFY:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7661,6 +7832,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8167,6 +8403,17 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_MODIFY:
+		flex_item_modify(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8618,6 +8865,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
@@ -8800,3 +9052,550 @@ cmdline_parse_inst_t cmd_show_set_raw_all = {
 		NULL,
 	},
 };
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strcmp(key, pattern) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+static void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message);
+	}
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(tunnel_count, uint16_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		FLEX_FIELD_GET(rss_hash, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *pattern, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	sprintf(flow_rule, "flow create 0 pattern %s / end", pattern);
+	pattern = flow_rule;
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, pattern, (void *)data, sizeof(data));
+		if (ret > 0) {
+			pattern += ret;
+			while (isspace(*pattern))
+				pattern++;
+		}
+	} while (ret > 0 && strlen(pattern));
+	if (ret >= 0 && !strlen(pattern)) {
+		struct rte_flow_item *src =
+			((struct buffer *)data)->args.vc.pattern;
+		item->type = src->type;
+		if (src->spec) {
+			ptr = (void *)(uintptr_t)item->spec;
+			memcpy(ptr, src->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->spec = NULL;
+		}
+		if (src->mask) {
+			ptr = (void *)(uintptr_t)item->mask;
+			memcpy(ptr, src->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->mask = NULL;
+		}
+		if (src->last) {
+			ptr = (void *)(uintptr_t)item->last;
+			memcpy(ptr, src->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->last = NULL;
+		}
+		ret = 0;
+	}
+	cmd_flow_context = saved_flow_ctx;
+	return ret;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+		if (match_strkey(key, "tunnel")) {
+			if (!json_is_true(je) && !json_is_false(je))
+				return -EINVAL;
+			link->tunnel = json_boolean_value(je);
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->sample_num = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->input_num = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->output_num = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+#define ALIGN(x) (((x) + sizeof(uintptr_t) - 1) & ~(sizeof(uintptr_t) - 1))
+
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+	base_size = ALIGN(sizeof(*conf));
+	samples_size = ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+			     sizeof(conf->sample_data[0]));
+	links_size = ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			   sizeof(conf->input_link[0]));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static void
+flex_item_modify(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	json_error_t json_error;
+	json_t *jroot = NULL;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	struct flex_item *modified_fp;
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp) {
+		printf("port-%u: flex item #%u not available\n",
+		       port_id, flex_id);
+		return;
+	}
+	jroot = json_load_file(filename, 0, &json_error);
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return;
+	}
+	modified_fp = flex_item_init();
+	if (!modified_fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_config(jroot, &modified_fp->flex_conf);
+	if (ret)
+		goto out;
+	ret = rte_flow_flex_item_update(port_id, fp->flex_handle,
+					&modified_fp->flex_conf,
+					&flow_error);
+	if (!ret) {
+		modified_fp->flex_handle = fp->flex_handle;
+		flex_items[port_id][flex_id] = modified_fp;
+		printf("port-%u: modified flex item #%u\n", port_id, flex_id);
+		modified_fp = NULL;
+		free(fp);
+	} else {
+		printf("port-%u: cannot modify flex item\n", port_id);
+	}
+out:
+	if (modified_fp)
+		free(modified_fp);
+	if (jroot)
+		json_decref(jroot);
+}
+
+static void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	json_error_t json_error;
+	json_t *jroot = NULL;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	jroot = json_load_file(filename, 0, &json_error);
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+	if (jroot)
+		json_decref(jroot);
+}
+
+#else /* RTE_HAS_JANSSON */
+static void flex_item_create(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id,
+			     __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+static void flex_item_modify(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id,
+			     __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+static void flex_item_destroy(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length does not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..0f76d4c551 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -4017,7 +4017,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 876a341cf0..36d4e29b83 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -282,6 +282,19 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -306,6 +319,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index bbef706374..5efc626260 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5091,3 +5091,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show how to configure and create a flex item object,
+create a flex pattern, and use it in a flow rule.
+The examples use a 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | DW0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | DW1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | DW2
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | DW3
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | DW4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration.
+The header configuration is compiled by testpmd and stored in a variable of type
+``rte_flow_item_flex_conf``.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item create 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+The flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specifies whether the new header has a fixed or variable length, and the basic
+(minimal) header length value.
+
+If the header length is not fixed, the configuration must also give the location of the
+header field whose value completes the length calculation, and a scale/offset function.
+
+The scale function depends on the port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the type of the following network header.
+
+**Flow match samples.**
+
+Describes the locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on the port hardware.
+
+**Input trigger.**
+
+Describes the preceding network header configuration.
+
+**Output trigger.**
+
+Describes the conditions that trigger the transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item sample configuration.
+A flex pattern can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20 bytes length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> set flex_pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> set flex_pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1
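
Note on the pattern format: the "set flex_pattern" handler in this
patch turns a hexadecimal string into a byte array in network order
and, for the "is" form, pairs it with an all-ones mask of the same
length. The sketch below illustrates that convention outside testpmd.
The helper names (hex_to_bytes, masked_match, matches_source_1_2_3_4)
are made up for illustration, and the byte-wise masked comparison is
only an assumption about what spec/mask matching means; the actual
matching is performed by the PMD and the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PATTERN_MAX 64 /* same value as FLEX_MAX_FLOW_PATTERN_LENGTH */

/* Convert "45FF"-style text into bytes; returns byte count, 0 on error. */
static size_t
hex_to_bytes(const char *str, uint8_t *data, size_t max)
{
	size_t i, len = strlen(str);
	char b[3] = { 0 }, *end;

	/* Each byte needs exactly two hex digits. */
	if (len == 0 || (len & 1) || len / 2 > max)
		return 0;
	len /= 2;
	for (i = 0; i < len; i++) {
		b[0] = str[2 * i];
		b[1] = str[2 * i + 1];
		data[i] = (uint8_t)strtoul(b, &end, 16);
		if (end != &b[2])
			return 0; /* a non-hex character was found */
	}
	return len;
}

/* A header matches when every masked byte equals the masked spec byte. */
static int
masked_match(const uint8_t *hdr, const uint8_t *spec,
	     const uint8_t *mask, size_t len)
{
	size_t i;

	assert(hdr != NULL && spec != NULL && mask != NULL);
	for (i = 0; i < len; i++)
		if ((hdr[i] & mask[i]) != (spec[i] & mask[i]))
			return 0;
	return 1;
}

/* Hypothetical check: version/IHL 0x45 and source address 1.2.3.4. */
static int
matches_source_1_2_3_4(const uint8_t hdr[16])
{
	uint8_t spec[PATTERN_MAX], mask[PATTERN_MAX];
	size_t slen = hex_to_bytes("45000000" "00000000" "00000000" "01020304",
				   spec, PATTERN_MAX);
	size_t mlen = hex_to_bytes("FF000000" "00000000" "00000000" "FFFFFFFF",
				   mask, PATTERN_MAX);

	if (slen != 16 || mlen != slen)
		return 0;
	return masked_match(hdr, spec, mask, slen);
}
```

The spec and mask strings must decode to the same number of bytes,
which is the condition the handler checks before it accepts a pattern.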



* [dpdk-dev] [PATCH v2 06/14] common/mlx5: refactor HCA attributes query
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (4 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 05/14] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 07/14] common/mlx5: extend flex parser capabilities Viacheslav Ovsiienko
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The code that queries the HCA attributes from the device has a
common part, and this part can be factored out into a
dedicated routine.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
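Note: the shape of the new helper can be illustrated without the mlx5
glue layer. The sketch below is a stand-alone mock, not the driver
code: fake_general_cmd, get_cap, query_two_pages, OUT_DW and
CAP_OFFSET are hypothetical names, and the reply layout is invented.
It only shows the contract used by mlx5_devx_get_hca_cap(): clear the
buffer, issue the command, return a pointer to the capability area
or NULL, and report a negative error code through the optional
"err" argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OUT_DW 8      /* hypothetical reply size in 32-bit words */
#define CAP_OFFSET 2  /* hypothetical offset of the capability area */

/*
 * Stand-in for the device command: op_mod 0 succeeds, op_mod 1 reports
 * a non-zero status word, anything else fails with a positive code.
 */
static int
fake_general_cmd(uint32_t op_mod, uint32_t *out)
{
	if (op_mod > 1)
		return 22;
	out[0] = op_mod;          /* status word */
	out[CAP_OFFSET] = 0xabcd; /* capability payload */
	return 0;
}

/* Returns the capability area, or NULL with *err set to a negative code. */
static uint32_t *
get_cap(uint32_t *out, int *err, uint32_t op_mod)
{
	int rc;

	assert(out != NULL);
	if (err)
		*err = 0;
	memset(out, 0, OUT_DW * sizeof(*out));
	rc = fake_general_cmd(op_mod, out);
	if (rc) {
		if (err)
			*err = rc > 0 ? -rc : rc;
		return NULL;
	}
	if (out[0]) {
		if (err)
			*err = -1;
		return NULL;
	}
	return &out[CAP_OFFSET];
}

/* Caller side: the result is reassigned on every query before the check. */
static int
query_two_pages(uint32_t first, uint32_t second, uint32_t *value)
{
	uint32_t out[OUT_DW];
	uint32_t *cap;
	int rc;

	cap = get_cap(out, &rc, first);
	if (!cap)
		return rc;
	cap = get_cap(out, &rc, second);
	if (!cap)
		return rc;
	*value = *cap;
	return 0;
}
```

With this contract each call site shrinks to one call and one NULL
check, but the returned pointer has to be stored on every call; testing
a pointer left over from an earlier query would hide a failure.
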
 drivers/common/mlx5/mlx5_devx_cmds.c | 173 +++++++++++----------------
 1 file changed, 73 insertions(+), 100 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 56407cc332..8273e98146 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -13,6 +13,42 @@
 #include "mlx5_common_log.h"
 #include "mlx5_malloc.h"
 
+static void *
+mlx5_devx_get_hca_cap(void *ctx, uint32_t *in, uint32_t *out,
+		      int *err, uint32_t flags)
+{
+	const size_t size_in = MLX5_ST_SZ_DW(query_hca_cap_in) * sizeof(int);
+	const size_t size_out = MLX5_ST_SZ_DW(query_hca_cap_out) * sizeof(int);
+	int status, syndrome, rc;
+
+	if (err)
+		*err = 0;
+	memset(in, 0, size_in);
+	memset(out, 0, size_out);
+	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
+	MLX5_SET(query_hca_cap_in, in, op_mod, flags);
+	rc = mlx5_glue->devx_general_cmd(ctx, in, size_in, out, size_out);
+	if (rc) {
+		DRV_LOG(ERR,
+			"Failed to query devx HCA capabilities func %#02x",
+			flags >> 1);
+		if (err)
+			*err = rc > 0 ? -rc : rc;
+		return NULL;
+	}
+	status = MLX5_GET(query_hca_cap_out, out, status);
+	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
+	if (status) {
+		DRV_LOG(ERR,
+			"Failed to query devx HCA capabilities func %#02x status %x, syndrome = %x",
+			flags >> 1, status, syndrome);
+		if (err)
+			*err = -1;
+		return NULL;
+	}
+	return MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+}
+
 /**
  * Perform read access to the registers. Reads data from register
  * and writes ones to the specified buffer.
@@ -472,21 +508,15 @@ static void
 mlx5_devx_cmd_query_hca_vdpa_attr(void *ctx,
 				  struct mlx5_hca_vdpa_attr *vdpa_attr)
 {
-	uint32_t in[MLX5_ST_SZ_DW(query_hca_cap_in)] = {0};
-	uint32_t out[MLX5_ST_SZ_DW(query_hca_cap_out)] = {0};
-	void *hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
-	int status, syndrome, rc;
+	uint32_t in[MLX5_ST_SZ_DW(query_hca_cap_in)];
+	uint32_t out[MLX5_ST_SZ_DW(query_hca_cap_out)];
+	void *hcattr;
 
-	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
-	MLX5_SET(query_hca_cap_in, in, op_mod,
-		 MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION |
-		 MLX5_HCA_CAP_OPMOD_GET_CUR);
-	rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
-	status = MLX5_GET(query_hca_cap_out, out, status);
-	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
-	if (rc || status) {
-		RTE_LOG(DEBUG, PMD, "Failed to query devx VDPA capabilities,"
-			" status %x, syndrome = %x", status, syndrome);
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, NULL,
+			MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr) {
+		RTE_LOG(DEBUG, PMD, "Failed to query devx VDPA capabilities");
 		vdpa_attr->valid = 0;
 	} else {
 		vdpa_attr->valid = 1;
@@ -741,27 +771,15 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 {
 	uint32_t in[MLX5_ST_SZ_DW(query_hca_cap_in)] = {0};
 	uint32_t out[MLX5_ST_SZ_DW(query_hca_cap_out)] = {0};
-	void *hcattr;
-	int status, syndrome, rc, i;
 	uint64_t general_obj_types_supported = 0;
+	void *hcattr;
+	int rc, i;
 
-	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
-	MLX5_SET(query_hca_cap_in, in, op_mod,
-		 MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE |
-		 MLX5_HCA_CAP_OPMOD_GET_CUR);
-
-	rc = mlx5_glue->devx_general_cmd(ctx,
-					 in, sizeof(in), out, sizeof(out));
-	if (rc)
-		goto error;
-	status = MLX5_GET(query_hca_cap_out, out, status);
-	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
-	if (status) {
-		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
-			"status %x, syndrome = %x", status, syndrome);
-		return -1;
-	}
-	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+			MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr)
+		return rc;
 	attr->flow_counter_bulk_alloc_bitmap =
 			MLX5_GET(cmd_hca_cap, hcattr, flow_counter_bulk_alloc);
 	attr->flow_counters_dump = MLX5_GET(cmd_hca_cap, hcattr,
@@ -884,19 +902,13 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
 	if (attr->qos.sup) {
-		MLX5_SET(query_hca_cap_in, in, op_mod,
-			 MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
-			 MLX5_HCA_CAP_OPMOD_GET_CUR);
-		rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in),
-						 out, sizeof(out));
-		if (rc)
-			goto error;
-		if (status) {
-			DRV_LOG(DEBUG, "Failed to query devx QOS capabilities,"
-				" status %x, syndrome = %x", status, syndrome);
-			return -1;
+		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
+				MLX5_HCA_CAP_OPMOD_GET_CUR);
+		if (!hcattr) {
+			DRV_LOG(DEBUG, "Failed to query devx QOS capabilities");
+			return rc;
 		}
-		hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
 		attr->qos.flow_meter_old =
 				MLX5_GET(qos_cap, hcattr, flow_meter_old);
 		attr->qos.log_max_flow_meter =
@@ -925,27 +937,14 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 		mlx5_devx_cmd_query_hca_vdpa_attr(ctx, &attr->vdpa);
 	if (!attr->eth_net_offloads)
 		return 0;
-
 	/* Query Flow Sampler Capability From FLow Table Properties Layout. */
-	memset(in, 0, sizeof(in));
-	memset(out, 0, sizeof(out));
-	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
-	MLX5_SET(query_hca_cap_in, in, op_mod,
-		 MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
-		 MLX5_HCA_CAP_OPMOD_GET_CUR);
-
-	rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
-	if (rc)
-		goto error;
-	status = MLX5_GET(query_hca_cap_out, out, status);
-	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
-	if (status) {
-		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
-			"status %x, syndrome = %x", status, syndrome);
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+			MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr) {
 		attr->log_max_ft_sampler_num = 0;
-		return -1;
+		return rc;
 	}
-	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
 	attr->log_max_ft_sampler_num = MLX5_GET
 		(flow_table_nic_cap, hcattr,
 		 flow_table_properties_nic_receive.log_max_ft_sampler_num);
@@ -960,27 +959,13 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 		(flow_table_nic_cap, hcattr,
 		 ft_field_support_2_nic_receive.outer_ipv4_ihl);
 	/* Query HCA offloads for Ethernet protocol. */
-	memset(in, 0, sizeof(in));
-	memset(out, 0, sizeof(out));
-	MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
-	MLX5_SET(query_hca_cap_in, in, op_mod,
-		 MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS |
-		 MLX5_HCA_CAP_OPMOD_GET_CUR);
-
-	rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
-	if (rc) {
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+			MLX5_GET_HCA_CAP_OP_MOD_ETHERNET_OFFLOAD_CAPS |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr) {
 		attr->eth_net_offloads = 0;
-		goto error;
+		return rc;
 	}
-	status = MLX5_GET(query_hca_cap_out, out, status);
-	syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
-	if (status) {
-		DRV_LOG(DEBUG, "Failed to query devx HCA capabilities, "
-			"status %x, syndrome = %x", status, syndrome);
-		attr->eth_net_offloads = 0;
-		return -1;
-	}
-	hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
 	attr->wqe_vlan_insert = MLX5_GET(per_protocol_networking_offload_caps,
 					 hcattr, wqe_vlan_insert);
 	attr->csum_cap = MLX5_GET(per_protocol_networking_offload_caps,
@@ -1017,26 +1002,14 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 					 hcattr, rss_ind_tbl_cap);
 	/* Query HCA attribute for ROCE. */
 	if (attr->roce) {
-		memset(in, 0, sizeof(in));
-		memset(out, 0, sizeof(out));
-		MLX5_SET(query_hca_cap_in, in, opcode,
-			 MLX5_CMD_OP_QUERY_HCA_CAP);
-		MLX5_SET(query_hca_cap_in, in, op_mod,
-			 MLX5_GET_HCA_CAP_OP_MOD_ROCE |
-			 MLX5_HCA_CAP_OPMOD_GET_CUR);
-		rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in),
-						 out, sizeof(out));
-		if (rc)
-			goto error;
-		status = MLX5_GET(query_hca_cap_out, out, status);
-		syndrome = MLX5_GET(query_hca_cap_out, out, syndrome);
-		if (status) {
+		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+				MLX5_GET_HCA_CAP_OP_MOD_ROCE |
+				MLX5_HCA_CAP_OPMOD_GET_CUR);
+		if (!hcattr) {
 			DRV_LOG(DEBUG,
-				"Failed to query devx HCA ROCE capabilities, "
-				"status %x, syndrome = %x", status, syndrome);
-			return -1;
+				"Failed to query devx HCA ROCE capabilities");
+			return rc;
 		}
-		hcattr = MLX5_ADDR_OF(query_hca_cap_out, out, capability);
 		attr->qp_ts_format = MLX5_GET(roce_caps, hcattr, qp_ts_format);
 	}
 	if (attr->eth_virt &&
-- 
2.18.1



* [dpdk-dev] [PATCH v2 07/14] common/mlx5: extend flex parser capabilities
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (5 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 06/14] common/mlx5: refactor HCA attributes query Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 08/14] common/mlx5: fix flex parser DevX creation routine Viacheslav Ovsiienko
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

MLX5 PARSE_GRAPH_NODE is the main data structure used by the Flex
Parser when a new parsing protocol is defined. When software
creates a PARSE_GRAPH_NODE object for a new protocol, it must
verify that the configuration parameters it uses comply with
hardware limits.

The patch queries the hardware PARSE_GRAPH_NODE capabilities and
stores them in the PMD internal configuration structure:

 - query capabilities from the parse_graph_node attribute page
 - query the max_num_prog_sample_field capability from HCA page 2

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
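Note: a minimal sketch of how queried limits might be used to validate
a configuration before a PARSE_GRAPH_NODE object is created. This is
an assumption for illustration only: struct flex_caps is a
hypothetical subset of mlx5_hca_flex_attr, hdr_len_mask and
flex_conf_fits are made-up helpers, and the actual checks done by the
PMD are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the capabilities stored by this patch. */
struct flex_caps {
	uint8_t  max_num_sample;
	uint16_t max_base_header_length;
	uint8_t  header_length_mask_width;
};

/* Widest header length field mask allowed by the reported bit width. */
static uint32_t
hdr_len_mask(const struct flex_caps *caps)
{
	uint32_t width;

	assert(caps != NULL);
	width = caps->header_length_mask_width;
	/* Shifting a 32-bit value by 32 or more is undefined, so clamp. */
	return width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
}

/* Returns 1 if the requested values stay within the hardware limits. */
static int
flex_conf_fits(const struct flex_caps *caps, uint32_t nb_samples,
	       uint32_t base_len, uint32_t len_field_mask)
{
	if (nb_samples > caps->max_num_sample)
		return 0;
	if (base_len > caps->max_base_header_length)
		return 0;
	if (len_field_mask & ~hdr_len_mask(caps))
		return 0;
	return 1;
}
```

The limit values used with these helpers would come from the device
query; none are assumed here.
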
 drivers/common/mlx5/mlx5_devx_cmds.c | 57 ++++++++++++++++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.h | 65 +++++++++++++++++++++++++++-
 drivers/common/mlx5/mlx5_prm.h       | 50 ++++++++++++++++++++-
 3 files changed, 168 insertions(+), 4 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 8273e98146..294ac480dc 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -729,6 +729,53 @@ mlx5_devx_cmd_create_flex_parser(void *ctx,
 	return parse_flex_obj;
 }
 
+static int
+mlx5_devx_cmd_query_hca_parse_graph_node_cap
+	(void *ctx, struct mlx5_hca_flex_attr *attr)
+{
+	uint32_t in[MLX5_ST_SZ_DW(query_hca_cap_in)];
+	uint32_t out[MLX5_ST_SZ_DW(query_hca_cap_out)];
+	void *hcattr;
+	int rc;
+
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+			MLX5_GET_HCA_CAP_OP_MOD_PARSE_GRAPH_NODE_CAP |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr)
+		return rc;
+	attr->node_in = MLX5_GET(parse_graph_node_cap, hcattr, node_in);
+	attr->node_out = MLX5_GET(parse_graph_node_cap, hcattr, node_out);
+	attr->header_length_mode = MLX5_GET(parse_graph_node_cap, hcattr,
+					    header_length_mode);
+	attr->sample_offset_mode = MLX5_GET(parse_graph_node_cap, hcattr,
+					    sample_offset_mode);
+	attr->max_num_arc_in = MLX5_GET(parse_graph_node_cap, hcattr,
+					max_num_arc_in);
+	attr->max_num_arc_out = MLX5_GET(parse_graph_node_cap, hcattr,
+					 max_num_arc_out);
+	attr->max_num_sample = MLX5_GET(parse_graph_node_cap, hcattr,
+					max_num_sample);
+	attr->sample_id_in_out = MLX5_GET(parse_graph_node_cap, hcattr,
+					  sample_id_in_out);
+	attr->max_base_header_length = MLX5_GET(parse_graph_node_cap, hcattr,
+						max_base_header_length);
+	attr->max_sample_base_offset = MLX5_GET(parse_graph_node_cap, hcattr,
+						max_sample_base_offset);
+	attr->max_next_header_offset = MLX5_GET(parse_graph_node_cap, hcattr,
+						max_next_header_offset);
+	attr->header_length_mask_width = MLX5_GET(parse_graph_node_cap, hcattr,
+						  header_length_mask_width);
+	/* Get the max supported samples from HCA CAP 2 */
+	hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
+			MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE_2 |
+			MLX5_HCA_CAP_OPMOD_GET_CUR);
+	if (!hcattr)
+		return rc;
+	attr->max_num_prog_sample =
+		MLX5_GET(cmd_hca_cap_2, hcattr,	max_num_prog_sample_field);
+	return 0;
+}
+
 static int
 mlx5_devx_query_pkt_integrity_match(void *hcattr)
 {
@@ -933,6 +980,16 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 					log_max_num_meter_aso);
 		}
 	}
+	/*
+	 * Flex item support needs max_num_prog_sample_field
+	 * from the Capabilities 2 table for PARSE_GRAPH_NODE
+	 */
+	if (attr->parse_graph_flex_node) {
+		rc = mlx5_devx_cmd_query_hca_parse_graph_node_cap
+			(ctx, &attr->flex);
+		if (rc)
+			return -1;
+	}
 	if (attr->vdpa.valid)
 		mlx5_devx_cmd_query_hca_vdpa_attr(ctx, &attr->vdpa);
 	if (!attr->eth_net_offloads)
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index e576e30f24..fcd0b12e22 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -8,6 +8,7 @@
 #include "mlx5_glue.h"
 #include "mlx5_prm.h"
 #include <rte_compat.h>
+#include <rte_bitops.h>
 
 /*
  * Defines the amount of retries to allocate the first UAR in the page.
@@ -94,6 +95,64 @@ struct mlx5_hca_flow_attr {
 	uint32_t tunnel_header_2_3;
 };
 
+/**
+ * Accumulate port PARSE_GRAPH_NODE capabilities from
+ * PARSE_GRAPH_NODE Capabilities and HCA Capabilities 2 tables
+ */
+__extension__
+struct mlx5_hca_flex_attr {
+	uint32_t node_in;
+	uint32_t node_out;
+	uint16_t header_length_mode;
+	uint16_t sample_offset_mode;
+	uint8_t  max_num_arc_in;
+	uint8_t  max_num_arc_out;
+	uint8_t  max_num_sample;
+	uint8_t  max_num_prog_sample:5;	/* From HCA CAP 2 */
+	uint8_t  sample_id_in_out:1;
+	uint16_t max_base_header_length;
+	uint8_t  max_sample_base_offset;
+	uint16_t max_next_header_offset;
+	uint8_t  header_length_mask_width;
+};
+
+/* ISO C restricts enumerator values to range of 'int' */
+__extension__
+enum {
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_HEAD          = RTE_BIT32(1),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_MAC           = RTE_BIT32(2),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_IP            = RTE_BIT32(3),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_GRE           = RTE_BIT32(4),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_UDP           = RTE_BIT32(5),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_MPLS          = RTE_BIT32(6),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_TCP           = RTE_BIT32(7),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_VXLAN_GRE     = RTE_BIT32(8),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_GENEVE        = RTE_BIT32(9),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_IPSEC_ESP     = RTE_BIT32(10),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_IPV4          = RTE_BIT32(11),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_IPV6          = RTE_BIT32(12),
+	PARSE_GRAPH_NODE_CAP_SUPPORTED_PROTOCOL_PROGRAMMABLE  = RTE_BIT32(31)
+};
+
+enum {
+	PARSE_GRAPH_NODE_CAP_LENGTH_MODE_FIXED          = RTE_BIT32(0),
+	PARSE_GRAPH_NODE_CAP_LENGTH_MODE_EXPLISIT_FIELD = RTE_BIT32(1),
+	PARSE_GRAPH_NODE_CAP_LENGTH_MODE_BITMASK_FIELD  = RTE_BIT32(2)
+};
+
+/*
+ * DWORD shift is the base for calculating header_length_field_mask
+ * value in the MLX5_GRAPH_NODE_LEN_FIELD mode.
+ */
+#define MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD 0x02
+
+static inline uint32_t
+mlx5_hca_parse_graph_node_base_hdr_len_mask
+	(const struct mlx5_hca_flex_attr *attr)
+{
+	return (1u << attr->header_length_mask_width) - 1;
+}
+
 /* HCA supports this number of time periods for LRO. */
 #define MLX5_LRO_NUM_SUPP_PERIODS 4
 
@@ -164,6 +223,7 @@ struct mlx5_hca_attr {
 	struct mlx5_hca_qos_attr qos;
 	struct mlx5_hca_vdpa_attr vdpa;
 	struct mlx5_hca_flow_attr flow;
+	struct mlx5_hca_flex_attr flex;
 	int log_max_qp_sz;
 	int log_max_cq_sz;
 	int log_max_qp;
@@ -570,8 +630,9 @@ int mlx5_devx_cmd_query_parse_samples(struct mlx5_devx_obj *flex_obj,
 				      uint32_t ids[], uint32_t num);
 
 __rte_internal
-struct mlx5_devx_obj *mlx5_devx_cmd_create_flex_parser(void *ctx,
-					struct mlx5_devx_graph_node_attr *data);
+struct mlx5_devx_obj *
+mlx5_devx_cmd_create_flex_parser(void *ctx,
+				 struct mlx5_devx_graph_node_attr *data);
 
 __rte_internal
 int mlx5_devx_cmd_register_read(void *ctx, uint16_t reg_id,
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index d361bcf90e..3ff14b4a5a 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -975,7 +975,14 @@ struct mlx5_ifc_fte_match_set_misc4_bits {
 	u8 prog_sample_field_id_2[0x20];
 	u8 prog_sample_field_value_3[0x20];
 	u8 prog_sample_field_id_3[0x20];
-	u8 reserved_at_100[0x100];
+	u8 prog_sample_field_value_4[0x20];
+	u8 prog_sample_field_id_4[0x20];
+	u8 prog_sample_field_value_5[0x20];
+	u8 prog_sample_field_id_5[0x20];
+	u8 prog_sample_field_value_6[0x20];
+	u8 prog_sample_field_id_6[0x20];
+	u8 prog_sample_field_value_7[0x20];
+	u8 prog_sample_field_id_7[0x20];
 };
 
 struct mlx5_ifc_fte_match_set_misc5_bits {
@@ -1244,6 +1251,7 @@ enum {
 	MLX5_GET_HCA_CAP_OP_MOD_ROCE = 0x4 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_VDPA_EMULATION = 0x13 << 1,
+	MLX5_GET_HCA_CAP_OP_MOD_PARSE_GRAPH_NODE_CAP = 0x1C << 1,
 	MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE_2 = 0x20 << 1,
 };
 
@@ -1750,6 +1758,27 @@ struct mlx5_ifc_virtio_emulation_cap_bits {
 	u8 reserved_at_1c0[0x620];
 };
 
+/**
+ * PARSE_GRAPH_NODE Capabilities Field Descriptions
+ */
+struct mlx5_ifc_parse_graph_node_cap_bits {
+	u8 node_in[0x20];
+	u8 node_out[0x20];
+	u8 header_length_mode[0x10];
+	u8 sample_offset_mode[0x10];
+	u8 max_num_arc_in[0x08];
+	u8 max_num_arc_out[0x08];
+	u8 max_num_sample[0x08];
+	u8 reserved_at_78[0x07];
+	u8 sample_id_in_out[0x1];
+	u8 max_base_header_length[0x10];
+	u8 reserved_at_90[0x08];
+	u8 max_sample_base_offset[0x08];
+	u8 max_next_header_offset[0x10];
+	u8 reserved_at_b0[0x08];
+	u8 header_length_mask_width[0x08];
+};
+
 struct mlx5_ifc_flow_table_prop_layout_bits {
 	u8 ft_support[0x1];
 	u8 flow_tag[0x1];
@@ -1844,9 +1873,14 @@ struct mlx5_ifc_flow_table_nic_cap_bits {
 		ft_field_support_2_nic_receive;
 };
 
+/*
+ *  HCA Capabilities 2
+ */
 struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8 reserved_at_0[0x80]; /* End of DW4. */
-	u8 reserved_at_80[0xb];
+	u8 reserved_at_80[0x3];
+	u8 max_num_prog_sample_field[0x5];
+	u8 reserved_at_88[0x3];
 	u8 log_max_num_reserved_qpn[0x5];
 	u8 reserved_at_90[0x3];
 	u8 log_reserved_qpn_granularity[0x5];
@@ -3877,6 +3911,12 @@ enum mlx5_parse_graph_flow_match_sample_offset_mode {
 	MLX5_GRAPH_SAMPLE_OFFSET_BITMASK = 0x2,
 };
 
+enum mlx5_parse_graph_flow_match_sample_tunnel_mode {
+	MLX5_GRAPH_SAMPLE_TUNNEL_OUTER = 0x0,
+	MLX5_GRAPH_SAMPLE_TUNNEL_INNER = 0x1,
+	MLX5_GRAPH_SAMPLE_TUNNEL_FIRST = 0x2
+};
+
 /* Node index for an input / output arc of the flex parser graph. */
 enum mlx5_parse_graph_arc_node_index {
 	MLX5_GRAPH_ARC_NODE_NULL = 0x0,
@@ -3890,9 +3930,15 @@ enum mlx5_parse_graph_arc_node_index {
 	MLX5_GRAPH_ARC_NODE_VXLAN_GPE = 0x8,
 	MLX5_GRAPH_ARC_NODE_GENEVE = 0x9,
 	MLX5_GRAPH_ARC_NODE_IPSEC_ESP = 0xa,
+	MLX5_GRAPH_ARC_NODE_IPV4 = 0xb,
+	MLX5_GRAPH_ARC_NODE_IPV6 = 0xc,
 	MLX5_GRAPH_ARC_NODE_PROGRAMMABLE = 0x1f,
 };
 
+#define MLX5_PARSE_GRAPH_FLOW_SAMPLE_MAX 8
+#define MLX5_PARSE_GRAPH_IN_ARC_MAX 8
+#define MLX5_PARSE_GRAPH_OUT_ARC_MAX 8
+
 /**
  * Convert a user mark to flow mark.
  *
-- 
2.18.1



* [dpdk-dev] [PATCH v2 08/14] common/mlx5: fix flex parser DevX creation routine
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (6 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 07/14] common/mlx5: extend flex parser capabilities Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 09/14] net/mlx5: update eCPRI flex parser structures Viacheslav Ovsiienko
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas, stable

From: Gregory Etelson <getelson@nvidia.com>

Set the modify_field_select, next_header_field_offset and
next_header_field_size fields, which were missing from the
flex parser creation command.

Fixes: 38119ebe01d6 ("common/mlx5: add DevX command for flex parsers")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 294ac480dc..43e51e3f95 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -620,10 +620,9 @@ mlx5_devx_cmd_query_parse_samples(struct mlx5_devx_obj *flex_obj,
 	return ret;
 }
 
-
 struct mlx5_devx_obj *
 mlx5_devx_cmd_create_flex_parser(void *ctx,
-			      struct mlx5_devx_graph_node_attr *data)
+				 struct mlx5_devx_graph_node_attr *data)
 {
 	uint32_t in[MLX5_ST_SZ_DW(create_flex_parser_in)] = {0};
 	uint32_t out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0};
@@ -647,12 +646,18 @@ mlx5_devx_cmd_create_flex_parser(void *ctx,
 		 MLX5_GENERAL_OBJ_TYPE_FLEX_PARSE_GRAPH);
 	MLX5_SET(parse_graph_flex, flex, header_length_mode,
 		 data->header_length_mode);
+	MLX5_SET64(parse_graph_flex, flex, modify_field_select,
+		   data->modify_field_select);
 	MLX5_SET(parse_graph_flex, flex, header_length_base_value,
 		 data->header_length_base_value);
 	MLX5_SET(parse_graph_flex, flex, header_length_field_offset,
 		 data->header_length_field_offset);
 	MLX5_SET(parse_graph_flex, flex, header_length_field_shift,
 		 data->header_length_field_shift);
+	MLX5_SET(parse_graph_flex, flex, next_header_field_offset,
+		 data->next_header_field_offset);
+	MLX5_SET(parse_graph_flex, flex, next_header_field_size,
+		 data->next_header_field_size);
 	MLX5_SET(parse_graph_flex, flex, header_length_field_mask,
 		 data->header_length_field_mask);
 	for (i = 0; i < MLX5_GRAPH_NODE_SAMPLE_NUM; i++) {
-- 
2.18.1



* [dpdk-dev] [PATCH v2 09/14] net/mlx5: update eCPRI flex parser structures
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (7 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 08/14] common/mlx5: fix flex parser DevX creation routine Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 10/14] net/mlx5: add flex item API Viacheslav Ovsiienko
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

To handle the eCPRI protocol in flows, the mlx5 PMD engages
the flex parser hardware feature. While implementing eCPRI
support we anticipated wider flex parser usage, so all the
related variables were named accordingly and contain "flex"
in their names. Now we are preparing to introduce the more
general flex item approach. To avoid naming conflicts and
improve code readability, the variables related to the eCPRI
infrastructure are renamed as a preparation step.

Later, once the new flex item is implemented, we could
consider refactoring the eCPRI protocol support to build on
the common flex item basis.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.c         |  9 +++------
 drivers/net/mlx5/mlx5.h         | 12 +++---------
 drivers/net/mlx5/mlx5_flow_dv.c |  2 +-
 3 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 45ccfe2784..aa49542b9d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -858,8 +858,7 @@ bool
 mlx5_flex_parser_ecpri_exist(struct rte_eth_dev *dev)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_flex_parser_profiles *prf =
-				&priv->sh->fp[MLX5_FLEX_PARSER_ECPRI_0];
+	struct mlx5_ecpri_parser_profile *prf = &priv->sh->ecpri_parser;
 
 	return !!prf->obj;
 }
@@ -878,8 +877,7 @@ int
 mlx5_flex_parser_ecpri_alloc(struct rte_eth_dev *dev)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_flex_parser_profiles *prf =
-				&priv->sh->fp[MLX5_FLEX_PARSER_ECPRI_0];
+	struct mlx5_ecpri_parser_profile *prf =	&priv->sh->ecpri_parser;
 	struct mlx5_devx_graph_node_attr node = {
 		.modify_field_select = 0,
 	};
@@ -942,8 +940,7 @@ static void
 mlx5_flex_parser_ecpri_release(struct rte_eth_dev *dev)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_flex_parser_profiles *prf =
-				&priv->sh->fp[MLX5_FLEX_PARSER_ECPRI_0];
+	struct mlx5_ecpri_parser_profile *prf =	&priv->sh->ecpri_parser;
 
 	if (prf->obj)
 		mlx5_devx_cmd_destroy(prf->obj);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3581414b78..5000d2d4c5 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1035,14 +1035,8 @@ struct mlx5_dev_txpp {
 	uint64_t err_ts_future; /* Timestamp in the distant future. */
 };
 
-/* Supported flex parser profile ID. */
-enum mlx5_flex_parser_profile_id {
-	MLX5_FLEX_PARSER_ECPRI_0 = 0,
-	MLX5_FLEX_PARSER_MAX = 8,
-};
-
-/* Sample ID information of flex parser structure. */
-struct mlx5_flex_parser_profiles {
+/* Sample ID information of eCPRI flex parser structure. */
+struct mlx5_ecpri_parser_profile {
 	uint32_t num;		/* Actual number of samples. */
 	uint32_t ids[8];	/* Sample IDs for this profile. */
 	uint8_t offset[8];	/* Bytes offset of each parser. */
@@ -1190,7 +1184,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_devx_obj *tis; /* TIS object. */
 	struct mlx5_devx_obj *td; /* Transport domain. */
 	void *tx_uar; /* Tx/packet pacing shared UAR. */
-	struct mlx5_flex_parser_profiles fp[MLX5_FLEX_PARSER_MAX];
+	struct mlx5_ecpri_parser_profile ecpri_parser;
 	/* Flex parser profiles information. */
 	void *devx_rx_uar; /* DevX UAR for Rx. */
 	struct mlx5_aso_age_mng *aso_age_mng;
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index b610ad3ef4..fc676d3ee4 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -10020,7 +10020,7 @@ flow_dv_translate_item_ecpri(struct rte_eth_dev *dev, void *matcher,
 	 */
 	if (!ecpri_m->hdr.common.u32)
 		return;
-	samples = priv->sh->fp[MLX5_FLEX_PARSER_ECPRI_0].ids;
+	samples = priv->sh->ecpri_parser.ids;
 	/* Need to take the whole DW as the mask to fill the entry. */
 	dw_m = MLX5_ADDR_OF(fte_match_set_misc4, misc4_m,
 			    prog_sample_field_value_0);
-- 
2.18.1



* [dpdk-dev] [PATCH v2 10/14] net/mlx5: add flex item API
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (8 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 09/14] net/mlx5: update eCPRI flex parser structures Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 11/14] net/mlx5: add flex parser DevX object management Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

This patch is a preparation step for implementing the
flex item feature in the driver. It provides:

  - external entry point routines for flex item
    creation/deletion

  - flex item object management over the ports.

The flex item object keeps information about the item
created over the port: a reference counter to track
whether the item is in use by some active flows, and
a pointer to the underlying shared DevX object, which
provides all the data needed to translate the flow flex
pattern into matcher fields according to the hardware
configuration.
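
The reference counter rule described above can be sketched
roughly as follows. This is only an illustration and not the
driver code: the flex_sketch_* names are hypothetical, C11
<stdatomic.h> is used in place of the __atomic builtins found
in the patch, and the flow-side acquire/put helpers are an
assumption about how flows would take references later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the port flex item context. */
struct flex_item_sketch {
	/* 0 - free, 1 - valid and idle, >1 - referenced by flows. */
	_Atomic uint32_t refcnt;
};

/* Mark an initialized item valid, the creator holds one reference. */
static void
flex_sketch_publish(struct flex_item_sketch *item)
{
	assert(atomic_load(&item->refcnt) == 0);
	atomic_store_explicit(&item->refcnt, 1, memory_order_release);
}

/* A flow takes a reference, only valid items can be referenced. */
static bool
flex_sketch_acquire(struct flex_item_sketch *item)
{
	uint32_t cur = atomic_load_explicit(&item->refcnt,
					    memory_order_acquire);

	while (cur != 0) {
		/* On failure cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit
				(&item->refcnt, &cur, cur + 1,
				 memory_order_acquire, memory_order_relaxed))
			return true;
	}
	return false;
}

/* A flow drops its reference, the creator reference must remain. */
static void
flex_sketch_put(struct flex_item_sketch *item)
{
	uint32_t prev = atomic_fetch_sub_explicit(&item->refcnt, 1,
						  memory_order_release);

	assert(prev > 1);
	(void)prev;
}

/* Release succeeds only if the creator reference is the last one. */
static bool
flex_sketch_release(struct flex_item_sketch *item)
{
	uint32_t expected = 1;

	return atomic_compare_exchange_strong_explicit
			(&item->refcnt, &expected, 0,
			 memory_order_acquire, memory_order_relaxed);
}
```

With this rule a release attempt on an item that still has
flow references fails, and the caller can report EBUSY in
the same way flow_dv_item_release() does.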

Not many flex items are expected to be created on a port,
so the design is optimized for flow insertion rate rather
than for memory savings.
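
A minimal sketch of the fixed array plus bitmap bookkeeping
this design relies on is shown below. All sketch_* names are
hypothetical, the spinlock from the patch is omitted (single
threaded use is assumed), and a plain loop replaces rte_bsf32()
to keep the example free of DPDK dependencies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors MLX5_PORT_FLEX_ITEM_NUM from the patch. */
#define SKETCH_ITEM_NUM 4

/* Hypothetical per-port flex item slot. */
struct sketch_item {
	void *parser; /* Placeholder for the shared parser pointer. */
};

struct sketch_port {
	struct sketch_item item[SKETCH_ITEM_NUM];
	uint32_t map; /* Bit i is set when item[i] is allocated. */
};

/* Take the first free slot, NULL if all slots are in use. */
static struct sketch_item *
sketch_alloc(struct sketch_port *port)
{
	uint32_t idx;

	for (idx = 0; idx < SKETCH_ITEM_NUM; idx++) {
		if (!(port->map & (UINT32_C(1) << idx))) {
			port->map |= UINT32_C(1) << idx;
			return &port->item[idx];
		}
	}
	return NULL;
}

/* Map a pointer back to its slot index, -1 if it is not a valid slot. */
static int
sketch_index(const struct sketch_port *port, const struct sketch_item *item)
{
	uintptr_t start = (uintptr_t)&port->item[0];
	uintptr_t entry = (uintptr_t)item;
	uintptr_t off, idx;

	if (entry < start)
		return -1;
	off = entry - start;
	idx = off / sizeof(struct sketch_item);
	if (idx >= SKETCH_ITEM_NUM ||
	    off % sizeof(struct sketch_item) ||
	    !(port->map & (UINT32_C(1) << idx)))
		return -1;
	return (int)idx;
}

/* Return the slot to the pool, -1 if the pointer is not allocated. */
static int
sketch_free(struct sketch_port *port, struct sketch_item *item)
{
	int idx = sketch_index(port, item);

	if (idx < 0)
		return -1;
	item->parser = NULL;
	port->map &= ~(UINT32_C(1) << idx);
	return 0;
}
```

The opaque handle returned to the application is then simply
the slot address, so validating a handle costs a few integer
operations and no lookup structure is needed.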

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c  |   5 +-
 drivers/net/mlx5/meson.build      |   1 +
 drivers/net/mlx5/mlx5.c           |   2 +-
 drivers/net/mlx5/mlx5.h           |  24 ++++
 drivers/net/mlx5/mlx5_flow.c      |  49 ++++++++
 drivers/net/mlx5/mlx5_flow.h      |  18 ++-
 drivers/net/mlx5/mlx5_flow_dv.c   |   3 +-
 drivers/net/mlx5/mlx5_flow_flex.c | 189 ++++++++++++++++++++++++++++++
 8 files changed, 286 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_flow_flex.c

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..cbbc152782 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -928,7 +928,6 @@ mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
 	return false;
 }
 
-
 /**
  * Spawn an Ethernet device from Verbs information.
  *
@@ -1787,6 +1786,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		err = mlx5_alloc_shared_dr(priv);
 		if (err)
 			goto error;
+		if (mlx5_flex_item_port_init(eth_dev) < 0)
+			goto error;
 	}
 	if (config->devx && config->dv_flow_en && config->dest_tir) {
 		priv->obj_ops = devx_obj_ops;
@@ -1922,6 +1923,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			claim_zero(rte_eth_switch_domain_free(priv->domain_id));
 		if (priv->hrxqs)
 			mlx5_list_destroy(priv->hrxqs);
+		if (eth_dev && priv->flex_item_map)
+			mlx5_flex_item_port_cleanup(eth_dev);
 		mlx5_free(priv);
 		if (eth_dev != NULL)
 			eth_dev->data->dev_private = NULL;
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index dac7f1fabf..f9b21c35d9 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -17,6 +17,7 @@ sources = files(
         'mlx5_flow_meter.c',
         'mlx5_flow_dv.c',
         'mlx5_flow_aso.c',
+        'mlx5_flow_flex.c',
         'mlx5_mac.c',
         'mlx5_mr.c',
         'mlx5_rss.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index aa49542b9d..d902e00ea3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -376,7 +376,6 @@ static const struct mlx5_indexed_pool_config mlx5_ipool_cfg[] = {
 	},
 };
 
-
 #define MLX5_FLOW_MIN_ID_POOL_SIZE 512
 #define MLX5_ID_GENERATION_ARRAY_FACTOR 16
 
@@ -1575,6 +1574,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	mlx5_mp_os_req_stop_rxtx(dev);
 	/* Free the eCPRI flex parser resource. */
 	mlx5_flex_parser_ecpri_release(dev);
+	mlx5_flex_item_port_cleanup(dev);
 	if (priv->rxqs != NULL) {
 		/* XXX race condition if mlx5_rx_burst() is still running. */
 		rte_delay_us_sleep(1000);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 5000d2d4c5..89b4d66374 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -49,6 +49,9 @@
 #define MLX5_MAX_MODIFY_NUM			32
 #define MLX5_ROOT_TBL_MODIFY_NUM		16
 
+/* Maximal number of flex items created on the port. */
+#define MLX5_PORT_FLEX_ITEM_NUM			4
+
 enum mlx5_ipool_index {
 #if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)
 	MLX5_IPOOL_DECAP_ENCAP = 0, /* Pool for encap/decap resource. */
@@ -1112,6 +1115,12 @@ struct mlx5_aso_ct_pools_mng {
 	struct mlx5_aso_sq aso_sq; /* ASO queue objects. */
 };
 
+/* Port flex item context. */
+struct mlx5_flex_item {
+	struct mlx5_flex_parser_devx *devx_fp; /* DevX flex parser object. */
+	uint32_t refcnt; /**< Atomically accessed refcnt by flows. */
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -1448,6 +1457,10 @@ struct mlx5_priv {
 	uint32_t rss_shared_actions; /* RSS shared actions. */
 	struct mlx5_devx_obj *q_counters; /* DevX queue counter object. */
 	uint32_t counter_set_id; /* Queue counter ID to set in DevX objects. */
+	rte_spinlock_t flex_item_sl; /* Flex item list spinlock. */
+	struct mlx5_flex_item flex_item[MLX5_PORT_FLEX_ITEM_NUM];
+	/* Flex items have been created on the port. */
+	uint32_t flex_item_map; /* Map of allocated flex item elements. */
 };
 
 #define PORT_ID(priv) ((priv)->dev_data->port_id)
@@ -1823,4 +1836,15 @@ int mlx5_aso_ct_query_by_wqe(struct mlx5_dev_ctx_shared *sh,
 int mlx5_aso_ct_available(struct mlx5_dev_ctx_shared *sh,
 			  struct mlx5_aso_ct_action *ct);
 
+/* mlx5_flow_flex.c */
+
+struct rte_flow_item_flex_handle *
+flow_dv_item_create(struct rte_eth_dev *dev,
+		    const struct rte_flow_item_flex_conf *conf,
+		    struct rte_flow_error *error);
+int flow_dv_item_release(struct rte_eth_dev *dev,
+		    const struct rte_flow_item_flex_handle *flex_handle,
+		    struct rte_flow_error *error);
+int mlx5_flex_item_port_init(struct rte_eth_dev *dev);
+void mlx5_flex_item_port_cleanup(struct rte_eth_dev *dev);
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c914a7120c..5224daed6c 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -718,6 +718,14 @@ mlx5_flow_tunnel_get_restore_info(struct rte_eth_dev *dev,
 				  struct rte_mbuf *m,
 				  struct rte_flow_restore_info *info,
 				  struct rte_flow_error *err);
+static struct rte_flow_item_flex_handle *
+mlx5_flow_flex_item_create(struct rte_eth_dev *dev,
+			   const struct rte_flow_item_flex_conf *conf,
+			   struct rte_flow_error *error);
+static int
+mlx5_flow_flex_item_release(struct rte_eth_dev *dev,
+			    const struct rte_flow_item_flex_handle *handle,
+			    struct rte_flow_error *error);
 
 static const struct rte_flow_ops mlx5_flow_ops = {
 	.validate = mlx5_flow_validate,
@@ -737,6 +745,8 @@ static const struct rte_flow_ops mlx5_flow_ops = {
 	.tunnel_action_decap_release = mlx5_flow_tunnel_action_release,
 	.tunnel_item_release = mlx5_flow_tunnel_item_release,
 	.get_restore_info = mlx5_flow_tunnel_get_restore_info,
+	.flex_item_create = mlx5_flow_flex_item_create,
+	.flex_item_release = mlx5_flow_flex_item_release,
 };
 
 /* Tunnel information. */
@@ -9398,6 +9408,45 @@ mlx5_release_tunnel_hub(__rte_unused struct mlx5_dev_ctx_shared *sh,
 }
 #endif /* HAVE_IBV_FLOW_DV_SUPPORT */
 
+/* Flex flow item API */
+static struct rte_flow_item_flex_handle *
+mlx5_flow_flex_item_create(struct rte_eth_dev *dev,
+			   const struct rte_flow_item_flex_conf *conf,
+			   struct rte_flow_error *error)
+{
+	static const char err_msg[] = "flex item creation unsupported";
+	struct rte_flow_attr attr = { .transfer = 0 };
+	const struct mlx5_flow_driver_ops *fops =
+			flow_get_drv_ops(flow_get_drv_type(dev, &attr));
+
+	if (!fops->item_create) {
+		DRV_LOG(ERR, "port %u %s.", dev->data->port_id, err_msg);
+		rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION,
+				   NULL, err_msg);
+		return NULL;
+	}
+	return fops->item_create(dev, conf, error);
+}
+
+static int
+mlx5_flow_flex_item_release(struct rte_eth_dev *dev,
+			    const struct rte_flow_item_flex_handle *handle,
+			    struct rte_flow_error *error)
+{
+	static const char err_msg[] = "flex item release unsupported";
+	struct rte_flow_attr attr = { .transfer = 0 };
+	const struct mlx5_flow_driver_ops *fops =
+			flow_get_drv_ops(flow_get_drv_type(dev, &attr));
+
+	if (!fops->item_release) {
+		DRV_LOG(ERR, "port %u %s.", dev->data->port_id, err_msg);
+		rte_flow_error_set(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION,
+				   NULL, err_msg);
+		return -rte_errno;
+	}
+	return fops->item_release(dev, handle, error);
+}
+
 static void
 mlx5_dbg__print_pattern(const struct rte_flow_item *item)
 {
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 5c68d4f7d7..a8f8c49dd2 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1226,6 +1226,19 @@ typedef int (*mlx5_flow_create_def_policy_t)
 			(struct rte_eth_dev *dev);
 typedef void (*mlx5_flow_destroy_def_policy_t)
 			(struct rte_eth_dev *dev);
+typedef struct rte_flow_item_flex_handle *(*mlx5_flow_item_create_t)
+			(struct rte_eth_dev *dev,
+			 const struct rte_flow_item_flex_conf *conf,
+			 struct rte_flow_error *error);
+typedef int (*mlx5_flow_item_release_t)
+			(struct rte_eth_dev *dev,
+			 const struct rte_flow_item_flex_handle *handle,
+			 struct rte_flow_error *error);
+typedef int (*mlx5_flow_item_update_t)
+			(struct rte_eth_dev *dev,
+			 const struct rte_flow_item_flex_handle *handle,
+			 const struct rte_flow_item_flex_conf *conf,
+			 struct rte_flow_error *error);
 
 struct mlx5_flow_driver_ops {
 	mlx5_flow_validate_t validate;
@@ -1260,6 +1273,9 @@ struct mlx5_flow_driver_ops {
 	mlx5_flow_action_update_t action_update;
 	mlx5_flow_action_query_t action_query;
 	mlx5_flow_sync_domain_t sync_domain;
+	mlx5_flow_item_create_t item_create;
+	mlx5_flow_item_release_t item_release;
+	mlx5_flow_item_update_t item_update;
 };
 
 /* mlx5_flow.c */
@@ -1709,6 +1725,4 @@ const struct mlx5_flow_tunnel *
 mlx5_get_tof(const struct rte_flow_item *items,
 	     const struct rte_flow_action *actions,
 	     enum mlx5_tof_rule_type *rule_type);
-
-
 #endif /* RTE_PMD_MLX5_FLOW_H_ */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index fc676d3ee4..a3c35a5edf 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -18011,7 +18011,8 @@ const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops = {
 	.action_update = flow_dv_action_update,
 	.action_query = flow_dv_action_query,
 	.sync_domain = flow_dv_sync_domain,
+	.item_create = flow_dv_item_create,
+	.item_release = flow_dv_item_release,
 };
-
 #endif /* HAVE_IBV_FLOW_DV_SUPPORT */
 
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c b/drivers/net/mlx5/mlx5_flow_flex.c
new file mode 100644
index 0000000000..b7bc4af6fb
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -0,0 +1,189 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 NVIDIA Corporation & Affiliates
+ */
+#include <rte_malloc.h>
+#include <mlx5_devx_cmds.h>
+#include <mlx5_malloc.h>
+#include "mlx5.h"
+#include "mlx5_flow.h"
+
+static_assert(sizeof(uint32_t) * CHAR_BIT >= MLX5_PORT_FLEX_ITEM_NUM,
+	      "Flex item maximal number exceeds uint32_t bit width");
+
+/**
+ *  Routine called once on port initialization to set up flex item
+ *  related infrastructure.
+ *
+ * @param dev
+ *   Ethernet device to perform flex item initialization
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_flex_item_port_init(struct rte_eth_dev *dev)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+
+	rte_spinlock_init(&priv->flex_item_sl);
+	MLX5_ASSERT(!priv->flex_item_map);
+	return 0;
+}
+
+/**
+ *  Routine called once on port close to perform flex item
+ *  related infrastructure cleanup.
+ *
+ * @param dev
+ *   Ethernet device to perform cleanup
+ */
+void
+mlx5_flex_item_port_cleanup(struct rte_eth_dev *dev)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint32_t i;
+
+	for (i = 0; i < MLX5_PORT_FLEX_ITEM_NUM && priv->flex_item_map ; i++) {
+		if (priv->flex_item_map & (1 << i)) {
+			/* DevX object dereferencing should be provided here. */
+			priv->flex_item_map &= ~(1 << i);
+		}
+	}
+}
+
+static int
+mlx5_flex_index(struct mlx5_priv *priv, struct mlx5_flex_item *item)
+{
+	uintptr_t start = (uintptr_t)&priv->flex_item[0];
+	uintptr_t entry = (uintptr_t)item;
+	uintptr_t idx = (entry - start) / sizeof(struct mlx5_flex_item);
+
+	if (entry < start ||
+	    idx >= MLX5_PORT_FLEX_ITEM_NUM ||
+	    (entry - start) % sizeof(struct mlx5_flex_item) ||
+	    !(priv->flex_item_map & (1u << idx)))
+		return -1;
+	return (int)idx;
+}
+
+static struct mlx5_flex_item *
+mlx5_flex_alloc(struct mlx5_priv *priv)
+{
+	struct mlx5_flex_item *item = NULL;
+
+	rte_spinlock_lock(&priv->flex_item_sl);
+	if (~priv->flex_item_map) {
+		uint32_t idx = rte_bsf32(~priv->flex_item_map);
+
+		if (idx < MLX5_PORT_FLEX_ITEM_NUM) {
+			item = &priv->flex_item[idx];
+			MLX5_ASSERT(!item->refcnt);
+			MLX5_ASSERT(!item->devx_fp);
+			item->devx_fp = NULL;
+			__atomic_store_n(&item->refcnt, 0, __ATOMIC_RELEASE);
+			priv->flex_item_map |= 1u << idx;
+		}
+	}
+	rte_spinlock_unlock(&priv->flex_item_sl);
+	return item;
+}
+
+static void
+mlx5_flex_free(struct mlx5_priv *priv, struct mlx5_flex_item *item)
+{
+	int idx = mlx5_flex_index(priv, item);
+
+	MLX5_ASSERT(idx >= 0 &&
+		    idx < MLX5_PORT_FLEX_ITEM_NUM &&
+		    (priv->flex_item_map & (1u << idx)));
+	if (idx >= 0) {
+		rte_spinlock_lock(&priv->flex_item_sl);
+		MLX5_ASSERT(!item->refcnt);
+		MLX5_ASSERT(!item->devx_fp);
+		item->devx_fp = NULL;
+		__atomic_store_n(&item->refcnt, 0, __ATOMIC_RELEASE);
+		priv->flex_item_map &= ~(1u << idx);
+		rte_spinlock_unlock(&priv->flex_item_sl);
+	}
+}
+
+/**
+ * Create the flex item with specified configuration over the Ethernet device.
+ *
+ * @param dev
+ *   Ethernet device to create flex item on.
+ * @param[in] conf
+ *   Flex item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+struct rte_flow_item_flex_handle *
+flow_dv_item_create(struct rte_eth_dev *dev,
+		    const struct rte_flow_item_flex_conf *conf,
+		    struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flex_item *flex;
+
+	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	flex = mlx5_flex_alloc(priv);
+	if (!flex) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "too many flex items created on the port");
+		return NULL;
+	}
+	RTE_SET_USED(conf);
+	/* Mark initialized flex item valid. */
+	__atomic_add_fetch(&flex->refcnt, 1, __ATOMIC_RELEASE);
+	return (struct rte_flow_item_flex_handle *)flex;
+}
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param dev
+ *   Ethernet device to destroy flex item on.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+flow_dv_item_release(struct rte_eth_dev *dev,
+		     const struct rte_flow_item_flex_handle *handle,
+		     struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flex_item *flex =
+		(struct mlx5_flex_item *)(uintptr_t)handle;
+	uint32_t old_refcnt = 1;
+
+	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_spinlock_lock(&priv->flex_item_sl);
+	if (mlx5_flex_index(priv, flex) < 0) {
+		rte_spinlock_unlock(&priv->flex_item_sl);
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "invalid flex item handle value");
+	}
+	if (!__atomic_compare_exchange_n(&flex->refcnt, &old_refcnt, 0, 0,
+					 __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
+		rte_spinlock_unlock(&priv->flex_item_sl);
+		return rte_flow_error_set(error, EBUSY,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "flex item has flow references");
+	}
+	/* Flex item is marked as invalid, we can leave locked section. */
+	rte_spinlock_unlock(&priv->flex_item_sl);
+	mlx5_flex_free(priv, flex);
+	return 0;
+}
-- 
2.18.1



* [dpdk-dev] [PATCH v2 11/14] net/mlx5: add flex parser DevX object management
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (9 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 10/14] net/mlx5: add flex item API Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 12/14] net/mlx5: translate flex item configuration Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

The DevX flex parsers can be shared between representors
within the same IB context. Put the flex parser objects
into a shared list and use the standard mlx5_list_xxx API
to manage them.
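
The sharing idea can be illustrated with a small reference
counted list where registration reuses an entry with an equal
configuration. This is a simplified sketch and not the
mlx5_list implementation: the sketch_* names, the two-field
configuration and the return convention of sketch_unregister()
are assumptions made for the example. Fields are compared one
by one here, the patch compares the configuration with memcmp().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser configuration, stands in for the DevX attributes. */
struct sketch_conf {
	uint32_t header_length;
	uint32_t next_header_offset;
};

struct sketch_parser {
	struct sketch_parser *next;
	uint32_t refcnt;
	struct sketch_conf conf;
};

struct sketch_list {
	struct sketch_parser *head;
};

/* Reuse an entry with an equal configuration or create a new one. */
static struct sketch_parser *
sketch_register(struct sketch_list *list, const struct sketch_conf *conf)
{
	struct sketch_parser *p;

	for (p = list->head; p != NULL; p = p->next) {
		if (p->conf.header_length == conf->header_length &&
		    p->conf.next_header_offset == conf->next_header_offset) {
			p->refcnt++;
			return p;
		}
	}
	p = calloc(1, sizeof(*p));
	if (p == NULL)
		return NULL;
	p->conf = *conf;
	p->refcnt = 1;
	p->next = list->head;
	list->head = p;
	return p;
}

/* Returns 0 if the entry was destroyed, 1 if it is still referenced. */
static int
sketch_unregister(struct sketch_list *list, struct sketch_parser *parser)
{
	struct sketch_parser **pp;

	assert(parser->refcnt > 0);
	if (--parser->refcnt > 0)
		return 1;
	for (pp = &list->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == parser) {
			*pp = parser->next;
			break;
		}
	}
	free(parser);
	return 0;
}
```

Two ports that request the same configuration get the same
entry, and the underlying object is destroyed only when the
last user unregisters it.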

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c  |  10 +++
 drivers/net/mlx5/mlx5.c           |   4 +
 drivers/net/mlx5/mlx5.h           |  20 +++++
 drivers/net/mlx5/mlx5_flow_flex.c | 120 +++++++++++++++++++++++++++++-
 4 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index cbbc152782..e4066d134b 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -384,6 +384,16 @@ mlx5_alloc_shared_dr(struct mlx5_priv *priv)
 					      flow_dv_dest_array_clone_free_cb);
 	if (!sh->dest_array_list)
 		goto error;
+	/* Init shared flex parsers list, no need lcore_share */
+	snprintf(s, sizeof(s), "%s_flex_parsers_list", sh->ibdev_name);
+	sh->flex_parsers_dv = mlx5_list_create(s, sh, false,
+					       mlx5_flex_parser_create_cb,
+					       mlx5_flex_parser_match_cb,
+					       mlx5_flex_parser_remove_cb,
+					       mlx5_flex_parser_clone_cb,
+					       mlx5_flex_parser_clone_free_cb);
+	if (!sh->flex_parsers_dv)
+		goto error;
 #endif
 #ifdef HAVE_MLX5DV_DR
 	void *domain;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d902e00ea3..77fe073f5c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1315,6 +1315,10 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh)
 	if (LIST_EMPTY(&mlx5_dev_ctx_list))
 		mlx5_flow_os_release_workspace();
 	pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
+	if (sh->flex_parsers_dv) {
+		mlx5_list_destroy(sh->flex_parsers_dv);
+		sh->flex_parsers_dv = NULL;
+	}
 	/*
 	 *  Ensure there is no async event handler installed.
 	 *  Only primary process handles async device events.
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 89b4d66374..629ff6ebfe 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1115,6 +1115,15 @@ struct mlx5_aso_ct_pools_mng {
 	struct mlx5_aso_sq aso_sq; /* ASO queue objects. */
 };
 
+/* DevX flex parser context. */
+struct mlx5_flex_parser_devx {
+	struct mlx5_list_entry entry;  /* List element at the beginning. */
+	uint32_t num_samples;
+	void *devx_obj;
+	struct mlx5_devx_graph_node_attr devx_conf;
+	uint32_t sample_ids[MLX5_GRAPH_NODE_SAMPLE_NUM];
+};
+
 /* Port flex item context. */
 struct mlx5_flex_item {
 	struct mlx5_flex_parser_devx *devx_fp; /* DevX flex parser object. */
@@ -1179,6 +1188,7 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_list *push_vlan_action_list; /* Push VLAN actions. */
 	struct mlx5_list *sample_action_list; /* List of sample actions. */
 	struct mlx5_list *dest_array_list;
+	struct mlx5_list *flex_parsers_dv; /* Flex Item parsers. */
 	/* List of destination array actions. */
 	struct mlx5_flow_counter_mng cmng; /* Counters management structure. */
 	void *default_miss_action; /* Default miss action. */
@@ -1847,4 +1857,14 @@ int flow_dv_item_release(struct rte_eth_dev *dev,
 		    struct rte_flow_error *error);
 int mlx5_flex_item_port_init(struct rte_eth_dev *dev);
 void mlx5_flex_item_port_cleanup(struct rte_eth_dev *dev);
+/* Flex parser list callbacks. */
+struct mlx5_list_entry *mlx5_flex_parser_create_cb(void *list_ctx, void *ctx);
+int mlx5_flex_parser_match_cb(void *list_ctx,
+			      struct mlx5_list_entry *iter, void *ctx);
+void mlx5_flex_parser_remove_cb(void *list_ctx,	struct mlx5_list_entry *entry);
+struct mlx5_list_entry *mlx5_flex_parser_clone_cb(void *list_ctx,
+						  struct mlx5_list_entry *entry,
+						  void *ctx);
+void mlx5_flex_parser_clone_free_cb(void *tool_ctx,
+				    struct mlx5_list_entry *entry);
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c b/drivers/net/mlx5/mlx5_flow_flex.c
index b7bc4af6fb..b8a091e259 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -45,7 +45,13 @@ mlx5_flex_item_port_cleanup(struct rte_eth_dev *dev)
 
 	for (i = 0; i < MLX5_PORT_FLEX_ITEM_NUM && priv->flex_item_map ; i++) {
 		if (priv->flex_item_map & (1 << i)) {
-			/* DevX object dereferencing should be provided here. */
+			struct mlx5_flex_item *flex = &priv->flex_item[i];
+
+			claim_zero(mlx5_list_unregister
+					(priv->sh->flex_parsers_dv,
+					 &flex->devx_fp->entry));
+			flex->devx_fp = NULL;
+			flex->refcnt = 0;
 			priv->flex_item_map &= ~(1 << i);
 		}
 	}
@@ -127,7 +133,9 @@ flow_dv_item_create(struct rte_eth_dev *dev,
 		    struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flex_parser_devx devx_config = { .devx_obj = NULL };
 	struct mlx5_flex_item *flex;
+	struct mlx5_list_entry *ent;
 
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	flex = mlx5_flex_alloc(priv);
@@ -137,10 +145,22 @@ flow_dv_item_create(struct rte_eth_dev *dev,
 				   "too many flex items created on the port");
 		return NULL;
 	}
+	ent = mlx5_list_register(priv->sh->flex_parsers_dv, &devx_config);
+	if (!ent) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "flex item creation failure");
+		goto error;
+	}
+	flex->devx_fp = container_of(ent, struct mlx5_flex_parser_devx, entry);
 	RTE_SET_USED(conf);
 	/* Mark initialized flex item valid. */
 	__atomic_add_fetch(&flex->refcnt, 1, __ATOMIC_RELEASE);
 	return (struct rte_flow_item_flex_handle *)flex;
+
+error:
+	mlx5_flex_free(priv, flex);
+	return NULL;
 }
 
 /**
@@ -166,6 +186,7 @@ flow_dv_item_release(struct rte_eth_dev *dev,
 	struct mlx5_flex_item *flex =
 		(struct mlx5_flex_item *)(uintptr_t)handle;
 	uint32_t old_refcnt = 1;
+	int rc;
 
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	rte_spinlock_lock(&priv->flex_item_sl);
@@ -184,6 +205,103 @@ flow_dv_item_release(struct rte_eth_dev *dev,
 	}
 	/* Flex item is marked as invalid, we can leave locked section. */
 	rte_spinlock_unlock(&priv->flex_item_sl);
+	MLX5_ASSERT(flex->devx_fp);
+	rc = mlx5_list_unregister(priv->sh->flex_parsers_dv,
+				  &flex->devx_fp->entry);
+	flex->devx_fp = NULL;
 	mlx5_flex_free(priv, flex);
+	if (rc)
+		return rte_flow_error_set(error, rc,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "flex item release failure");
 	return 0;
 }
+
+/* DevX flex parser list callbacks. */
+struct mlx5_list_entry *
+mlx5_flex_parser_create_cb(void *list_ctx, void *ctx)
+{
+	struct mlx5_dev_ctx_shared *sh = list_ctx;
+	struct mlx5_flex_parser_devx *fp, *conf = ctx;
+	int ret;
+
+	fp = mlx5_malloc(MLX5_MEM_ZERO,	sizeof(struct mlx5_flex_parser_devx),
+			 0, SOCKET_ID_ANY);
+	if (!fp)
+		return NULL;
+	/* Copy the requested configuration. */
+	fp->num_samples = conf->num_samples;
+	memcpy(&fp->devx_conf, &conf->devx_conf, sizeof(fp->devx_conf));
+	/* Create DevX flex parser. */
+	fp->devx_obj = mlx5_devx_cmd_create_flex_parser(sh->ctx,
+							&fp->devx_conf);
+	if (!fp->devx_obj)
+		goto error;
+	/* Query the firmware assigned sample ids. */
+	ret = mlx5_devx_cmd_query_parse_samples(fp->devx_obj,
+						fp->sample_ids,
+						fp->num_samples);
+	if (ret)
+		goto error;
+	DRV_LOG(DEBUG, "DEVx flex parser %p created, samples num: %u\n",
+		(const void *)fp, fp->num_samples);
+	return &fp->entry;
+error:
+	if (fp->devx_obj)
+		mlx5_devx_cmd_destroy((void *)(uintptr_t)fp->devx_obj);
+	if (fp)
+		mlx5_free(fp);
+	return NULL;
+}
+
+int
+mlx5_flex_parser_match_cb(void *list_ctx,
+			  struct mlx5_list_entry *iter, void *ctx)
+{
+	struct mlx5_flex_parser_devx *fp =
+		container_of(iter, struct mlx5_flex_parser_devx, entry);
+	struct mlx5_flex_parser_devx *org =
+		container_of(ctx, struct mlx5_flex_parser_devx, entry);
+
+	RTE_SET_USED(list_ctx);
+	return !iter || !ctx || memcmp(&fp->devx_conf,
+				       &org->devx_conf,
+				       sizeof(fp->devx_conf));
+}
+
+void
+mlx5_flex_parser_remove_cb(void *list_ctx, struct mlx5_list_entry *entry)
+{
+	struct mlx5_flex_parser_devx *fp =
+		container_of(entry, struct mlx5_flex_parser_devx, entry);
+
+	RTE_SET_USED(list_ctx);
+	MLX5_ASSERT(fp->devx_obj);
+	claim_zero(mlx5_devx_cmd_destroy(fp->devx_obj));
+	mlx5_free(entry);
+}
+
+struct mlx5_list_entry *
+mlx5_flex_parser_clone_cb(void *list_ctx,
+			  struct mlx5_list_entry *entry, void *ctx)
+{
+	struct mlx5_flex_parser_devx *fp =
+		container_of(entry, struct mlx5_flex_parser_devx, entry);
+
+	RTE_SET_USED(list_ctx);
+	fp = mlx5_malloc(0, sizeof(struct mlx5_flex_parser_devx),
+			 0, SOCKET_ID_ANY);
+	if (!fp)
+		return NULL;
+	memcpy(fp, ctx, sizeof(struct mlx5_flex_parser_devx));
+	return &fp->entry;
+}
+
+void
+mlx5_flex_parser_clone_free_cb(void *list_ctx, struct mlx5_list_entry *entry)
+{
+	struct mlx5_flex_parser_devx *fp =
+		container_of(entry, struct mlx5_flex_parser_devx, entry);
+	RTE_SET_USED(list_ctx);
+	mlx5_free(fp);
+}
-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v2 12/14] net/mlx5: translate flex item configuration
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (10 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 11/14] net/mlx5: add flex parser DevX object management Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 13/14] net/mlx5: translate flex item pattern into matcher Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 14/14] net/mlx5: handle flex item in flows Viacheslav Ovsiienko
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The RTE Flow flex item configuration should be translated
into actual hardware settings:

  - translate header length and next protocol field samplings
  - translate data field sampling: similar fields with the
    same mode and matching-related parameters are relocated
    and grouped so that they are covered by the minimal number
    of hardware sampling registers (each register can cover
    any 32 adjacent bits of the packet, aligned to a byte
    boundary, so fields of smaller length or segments of
    bigger fields can be combined in one register)
  - translate input and output links
  - prepare data for parsing the flex item pattern on flow creation

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.h           |  16 +-
 drivers/net/mlx5/mlx5_flow_flex.c | 748 +++++++++++++++++++++++++++++-
 2 files changed, 762 insertions(+), 2 deletions(-)
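
The register allocation step described in the commit message can be
sketched in isolation as follows. This is only an illustration under
stated assumptions, not the driver code: cover_intervals() and its
parameters are hypothetical names, the input intervals are assumed to be
sorted, non-overlapping and non-negative, and the 32-bit window width and
byte alignment are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define REG_BITS  32	/* bits covered by one sample register */
#define BYTE_BITS 8	/* register base offsets are byte aligned */

/*
 * Hypothetical helper (not part of the patch): cover sorted,
 * non-overlapping bit intervals [start[i], end[i]) with byte-aligned
 * 32-bit windows. The byte offset of each window is stored in
 * reg_byte_off[]. Returns the number of windows used, or -1 if
 * max_regs windows are not enough. The start[] array is updated
 * while intervals are being consumed.
 */
static int
cover_intervals(int32_t *start, const int32_t *end, int num,
		int32_t *reg_byte_off, int max_regs)
{
	int idx = 0, regs = 0;

	while (idx < num) {
		int32_t win_start, win_end;

		/* Assumption of this sketch: only non-negative offsets. */
		assert(start[idx] >= 0 && end[idx] > start[idx]);
		if (regs >= max_regs)
			return -1;
		/* Align the window start down to a byte boundary. */
		win_start = start[idx] - start[idx] % BYTE_BITS;
		win_end = win_start + REG_BITS;
		reg_byte_off[regs++] = win_start / BYTE_BITS;
		/* Skip fully covered intervals, trim a partially covered one. */
		while (idx < num) {
			if (win_end >= end[idx]) {
				idx++;
				continue;
			}
			if (win_end > start[idx])
				start[idx] = win_end;
			break;
		}
	}
	return regs;
}
```

For example, the bit intervals [4, 20) and [24, 70) would need three
windows at byte offsets 0, 4 and 8: the first window covers the whole
first interval and the head of the second one, and the remaining bits
of the second interval spill into two more windows.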

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 629ff6ebfe..d4fa946485 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -52,6 +52,9 @@
 /* Maximal number of flex items created on the port.*/
 #define MLX5_PORT_FLEX_ITEM_NUM			4
 
+/* Maximal number of field/field parts to map into sample registers. */
+#define MLX5_FLEX_ITEM_MAPPING_NUM		32
+
 enum mlx5_ipool_index {
 #if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)
 	MLX5_IPOOL_DECAP_ENCAP = 0, /* Pool for encap/decap resource. */
@@ -1124,10 +1127,21 @@ struct mlx5_flex_parser_devx {
 	uint32_t sample_ids[MLX5_GRAPH_NODE_SAMPLE_NUM];
 };
 
+/* Pattern field descriptor - how to translate flex pattern into samples. */
+__extension__
+struct mlx5_flex_pattern_field {
+	uint16_t width:6;
+	uint16_t shift:5;
+	uint16_t reg_id:5;
+};
+
 /* Port flex item context. */
 struct mlx5_flex_item {
 	struct mlx5_flex_parser_devx *devx_fp; /* DevX flex parser object. */
-	uint32_t refcnt; /**< Atomically accessed refcnt by flows. */
+	uint32_t refcnt; /* Atomically accessed refcnt by flows. */
+	uint32_t tunnel:1; /* Flex item presents tunnel protocol. */
+	uint32_t mapnum; /* Number of pattern translation entries. */
+	struct mlx5_flex_pattern_field map[MLX5_FLEX_ITEM_MAPPING_NUM];
 };
 
 /*
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c b/drivers/net/mlx5/mlx5_flow_flex.c
index b8a091e259..56b91da839 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -113,6 +113,750 @@ mlx5_flex_free(struct mlx5_priv *priv, struct mlx5_flex_item *item)
 	}
 }
 
+static int
+mlx5_flex_translate_length(struct mlx5_hca_flex_attr *attr,
+			   const struct rte_flow_item_flex_conf *conf,
+			   struct mlx5_flex_parser_devx *devx,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_flex_field *field = &conf->next_header;
+	struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
+	uint32_t len_width;
+
+	if (field->field_base % CHAR_BIT)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "not byte aligned header length field");
+	switch (field->field_mode) {
+	case FIELD_MODE_DUMMY:
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "invalid header length field mode (DUMMY)");
+	case FIELD_MODE_FIXED:
+		if (!(attr->header_length_mode &
+		    RTE_BIT32(MLX5_GRAPH_NODE_LEN_FIXED)))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported header length field mode (FIXED)");
+		if (attr->header_length_mask_width < field->field_size)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "header length field width exceeds limit");
+		if (field->offset_shift < 0 ||
+		    field->offset_shift > attr->header_length_mask_width)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "invalid header length field shift (FIXED)");
+		if (field->field_base < 0)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "negative header length field base (FIXED)");
+		node->header_length_mode = MLX5_GRAPH_NODE_LEN_FIXED;
+		break;
+	case FIELD_MODE_OFFSET:
+		if (!(attr->header_length_mode &
+		    RTE_BIT32(MLX5_GRAPH_NODE_LEN_FIELD)))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported header length field mode (OFFSET)");
+		node->header_length_mode = MLX5_GRAPH_NODE_LEN_FIELD;
+		if (field->offset_mask == 0 ||
+		    !rte_is_power_of_2(field->offset_mask + 1))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "invalid length field offset mask (OFFSET)");
+		len_width = rte_fls_u32(field->offset_mask);
+		if (len_width > attr->header_length_mask_width)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "length field offset mask too wide (OFFSET)");
+		node->header_length_field_mask = field->offset_mask;
+		break;
+	case FIELD_MODE_BITMASK:
+		if (!(attr->header_length_mode &
+		    RTE_BIT32(MLX5_GRAPH_NODE_LEN_BITMASK)))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported header length field mode (BITMASK)");
+		if (attr->header_length_mask_width < field->field_size)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "header length field width exceeds limit");
+		node->header_length_mode = MLX5_GRAPH_NODE_LEN_BITMASK;
+		node->header_length_field_mask = field->offset_mask;
+		break;
+	default:
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unknown header length field mode");
+	}
+	if (field->field_base / CHAR_BIT >= 0 &&
+	    field->field_base / CHAR_BIT > attr->max_base_header_length)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "header length field base exceeds limit");
+	node->header_length_base_value = field->field_base / CHAR_BIT;
+	if (field->field_mode == FIELD_MODE_OFFSET ||
+	    field->field_mode == FIELD_MODE_BITMASK) {
+		if (field->offset_shift > 15 || field->offset_shift < 0)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "header length field shift exceeds limit");
+		node->header_length_field_shift	= field->offset_shift;
+		node->header_length_field_offset = field->offset_base;
+	}
+	return 0;
+}
+
+static int
+mlx5_flex_translate_next(struct mlx5_hca_flex_attr *attr,
+			 const struct rte_flow_item_flex_conf *conf,
+			 struct mlx5_flex_parser_devx *devx,
+			 struct rte_flow_error *error)
+{
+	const struct rte_flow_item_flex_field *field = &conf->next_protocol;
+	struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
+
+	switch (field->field_mode) {
+	case FIELD_MODE_DUMMY:
+		if (conf->output_num)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "next protocol field is required (DUMMY)");
+		return 0;
+	case FIELD_MODE_FIXED:
+		break;
+	case FIELD_MODE_OFFSET:
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unsupported next protocol field mode (OFFSET)");
+		break;
+	case FIELD_MODE_BITMASK:
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unsupported next protocol field mode (BITMASK)");
+	default:
+		return rte_flow_error_set
+			(error, EINVAL,
+			 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unknown next protocol field mode");
+	}
+	MLX5_ASSERT(field->field_mode == FIELD_MODE_FIXED);
+	if (attr->max_next_header_offset < field->field_base)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "next protocol field base exceeds limit");
+	if (field->offset_shift)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unsupported next protocol field shift");
+	node->next_header_field_offset = field->field_base;
+	node->next_header_field_size = field->field_size;
+	return 0;
+}
+
+/* Helper structure to handle field bit intervals. */
+struct mlx5_flex_field_cover {
+	uint16_t num;
+	int32_t start[MLX5_FLEX_ITEM_MAPPING_NUM];
+	int32_t end[MLX5_FLEX_ITEM_MAPPING_NUM];
+	uint8_t mapped[MLX5_FLEX_ITEM_MAPPING_NUM / CHAR_BIT + 1];
+};
+
+static void
+mlx5_flex_insert_field(struct mlx5_flex_field_cover *cover,
+		       uint16_t num, int32_t start, int32_t end)
+{
+	MLX5_ASSERT(num < MLX5_FLEX_ITEM_MAPPING_NUM);
+	MLX5_ASSERT(num <= cover->num);
+	if (num < cover->num) {
+		memmove(&cover->start[num + 1],	&cover->start[num],
+			(cover->num - num) * sizeof(int32_t));
+		memmove(&cover->end[num + 1],	&cover->end[num],
+			(cover->num - num) * sizeof(int32_t));
+	}
+	cover->start[num] = start;
+	cover->end[num] = end;
+	cover->num++;
+}
+
+static void
+mlx5_flex_merge_field(struct mlx5_flex_field_cover *cover, uint16_t num)
+{
+	uint32_t i, del = 0;
+	int32_t end;
+
+	MLX5_ASSERT(num < MLX5_FLEX_ITEM_MAPPING_NUM);
+	MLX5_ASSERT(num < (cover->num - 1));
+	end = cover->end[num];
+	for (i = num + 1; i < cover->num; i++) {
+		if (end < cover->start[i])
+			break;
+		del++;
+		if (end <= cover->end[i]) {
+			cover->end[num] = cover->end[i];
+			break;
+		}
+	}
+	if (del) {
+		MLX5_ASSERT(del < (cover->num - 1u - num));
+		cover->num -= del;
+		MLX5_ASSERT(cover->num > num);
+		if ((cover->num - num) > 1) {
+			memmove(&cover->start[num + 1],
+				&cover->start[num + 1 + del],
+				(cover->num - num - 1) * sizeof(int32_t));
+			memmove(&cover->end[num + 1],
+				&cover->end[num + 1 + del],
+				(cover->num - num - 1) * sizeof(int32_t));
+		}
+	}
+}
+
+/*
+ * Validate the sample field and update interval array
+ * if parameters match with the "match" field.
+ * Returns:
+ *    < 0  - error
+ *    == 0 - no match, interval array not updated
+ *    > 0  - match, interval array updated
+ */
+static int
+mlx5_flex_cover_sample(struct mlx5_flex_field_cover *cover,
+		       struct rte_flow_item_flex_field *field,
+		       struct rte_flow_item_flex_field *match,
+		       struct mlx5_hca_flex_attr *attr,
+		       struct rte_flow_error *error)
+{
+	int32_t start, end;
+	uint32_t i;
+
+	switch (field->field_mode) {
+	case FIELD_MODE_DUMMY:
+		return 0;
+	case FIELD_MODE_FIXED:
+		if (!(attr->sample_offset_mode &
+		    RTE_BIT32(MLX5_GRAPH_SAMPLE_OFFSET_FIXED)))
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported sample field mode (FIXED)");
+		if (field->offset_shift)
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "invalid sample field shift (FIXED)");
+		if (field->field_base < 0)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "invalid sample field base (FIXED)");
+		if (field->field_base / CHAR_BIT > attr->max_sample_base_offset)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "sample field base exceeds limit (FIXED)");
+		break;
+	case FIELD_MODE_OFFSET:
+		if (!(attr->sample_offset_mode &
+		    RTE_BIT32(MLX5_GRAPH_SAMPLE_OFFSET_FIELD)))
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported sample field mode (OFFSET)");
+		if (field->field_base / CHAR_BIT >= 0 &&
+		    field->field_base / CHAR_BIT > attr->max_sample_base_offset)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				"sample field base exceeds limit");
+		break;
+	case FIELD_MODE_BITMASK:
+		if (!(attr->sample_offset_mode &
+		    RTE_BIT32(MLX5_GRAPH_SAMPLE_OFFSET_BITMASK)))
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported sample field mode (BITMASK)");
+		if (field->field_base / CHAR_BIT >= 0 &&
+		    field->field_base / CHAR_BIT > attr->max_sample_base_offset)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				"sample field base exceeds limit");
+		break;
+	default:
+		return rte_flow_error_set
+			(error, EINVAL,
+			 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "unknown data sample field mode");
+	}
+	if (!match) {
+		if (!field->field_size)
+			return rte_flow_error_set
+				(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				"zero sample field width");
+		if (field->rss_hash)
+			return rte_flow_error_set
+				(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				"unsupported RSS hash over flex item fields");
+		if (field->tunnel_count != FLEX_TUNNEL_MODE_FIRST &&
+		    field->tunnel_count != FLEX_TUNNEL_MODE_OUTER &&
+		    field->tunnel_count != FLEX_TUNNEL_MODE_INNER)
+			return rte_flow_error_set
+				(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				"unsupported sample field tunnel mode");
+		if (field->field_id)
+			DRV_LOG(DEBUG, "sample field id hint ignored\n");
+	} else {
+		if (field->field_mode != match->field_mode ||
+		    field->rss_hash != match->rss_hash ||
+		    field->tunnel_count != match->tunnel_count ||
+		    field->offset_base | match->offset_base ||
+		    field->offset_mask | match->offset_mask ||
+		    field->offset_shift | match->offset_shift)
+			return 0;
+	}
+	start = field->field_base;
+	end = start + field->field_size;
+	/* Add the new or similar field to interval array. */
+	if (!cover->num) {
+		cover->start[cover->num] = start;
+		cover->end[cover->num] = end;
+		cover->num = 1;
+		return 1;
+	}
+	for (i = 0; i < cover->num; i++) {
+		if (start > cover->end[i]) {
+			if (i >= (cover->num - 1u)) {
+				mlx5_flex_insert_field(cover, cover->num,
+						       start, end);
+				break;
+			}
+			continue;
+		}
+		if (end < cover->start[i]) {
+			mlx5_flex_insert_field(cover, i, start, end);
+			break;
+		}
+		if (start < cover->start[i])
+			cover->start[i] = start;
+		if (end > cover->end[i]) {
+			cover->end[i] = end;
+			if (i < (cover->num - 1u))
+				mlx5_flex_merge_field(cover, i);
+		}
+		break;
+	}
+	return 1;
+}
+
+static void
+mlx5_flex_config_sample(struct mlx5_devx_match_sample_attr *na,
+		       struct rte_flow_item_flex_field *field)
+{
+	memset(na, 0, sizeof(struct mlx5_devx_match_sample_attr));
+	na->flow_match_sample_en = 1;
+	switch (field->field_mode) {
+	case FIELD_MODE_FIXED:
+		na->flow_match_sample_offset_mode =
+			MLX5_GRAPH_SAMPLE_OFFSET_FIXED;
+		break;
+	case FIELD_MODE_OFFSET:
+		na->flow_match_sample_offset_mode =
+			MLX5_GRAPH_SAMPLE_OFFSET_FIELD;
+		na->flow_match_sample_field_offset = field->offset_base;
+		na->flow_match_sample_field_offset_mask = field->offset_mask;
+		na->flow_match_sample_field_offset_shift = field->offset_shift;
+		break;
+	case FIELD_MODE_BITMASK:
+		na->flow_match_sample_offset_mode =
+			MLX5_GRAPH_SAMPLE_OFFSET_BITMASK;
+		na->flow_match_sample_field_offset = field->offset_base;
+		na->flow_match_sample_field_offset_mask = field->offset_mask;
+		na->flow_match_sample_field_offset_shift = field->offset_shift;
+		break;
+	default:
+		MLX5_ASSERT(false);
+		break;
+	}
+	switch (field->tunnel_count) {
+	case FLEX_TUNNEL_MODE_FIRST:
+		na->flow_match_sample_tunnel_mode =
+			MLX5_GRAPH_SAMPLE_TUNNEL_FIRST;
+		break;
+	case FLEX_TUNNEL_MODE_OUTER:
+		na->flow_match_sample_tunnel_mode =
+			MLX5_GRAPH_SAMPLE_TUNNEL_OUTER;
+		break;
+	case FLEX_TUNNEL_MODE_INNER:
+		na->flow_match_sample_tunnel_mode =
+			MLX5_GRAPH_SAMPLE_TUNNEL_INNER;
+		break;
+	default:
+		MLX5_ASSERT(false);
+		break;
+	}
+}
+
+/* Map specified field to set/subset of allocated sample registers. */
+static int
+mlx5_flex_map_sample(struct rte_flow_item_flex_field *field,
+		     struct mlx5_flex_parser_devx *parser,
+		     struct mlx5_flex_item *item,
+		     struct rte_flow_error *error)
+{
+	struct mlx5_devx_match_sample_attr node;
+	int32_t start = field->field_base;
+	int32_t end = start + field->field_size;
+	uint32_t i, done_bits = 0;
+
+	mlx5_flex_config_sample(&node, field);
+	for (i = 0; i < parser->num_samples; i++) {
+		struct mlx5_devx_match_sample_attr *sample =
+			&parser->devx_conf.sample[i];
+		int32_t reg_start, reg_end;
+		int32_t cov_start, cov_end;
+		struct mlx5_flex_pattern_field *trans;
+
+		MLX5_ASSERT(sample->flow_match_sample_en);
+		if (!sample->flow_match_sample_en)
+			break;
+		node.flow_match_sample_field_base_offset =
+			sample->flow_match_sample_field_base_offset;
+		if (memcmp(&node, sample, sizeof(node)))
+			continue;
+		reg_start = (int8_t)sample->flow_match_sample_field_base_offset;
+		reg_start *= CHAR_BIT;
+		reg_end = reg_start + 32;
+		if (end <= reg_start || start >= reg_end)
+			continue;
+		cov_start = RTE_MAX(reg_start, start);
+		cov_end = RTE_MIN(reg_end, end);
+		MLX5_ASSERT(cov_end > cov_start);
+		done_bits += cov_end - cov_start;
+		if (item->mapnum >= MLX5_FLEX_ITEM_MAPPING_NUM)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "too many flex item pattern translations");
+		trans = &item->map[item->mapnum];
+		item->mapnum++;
+		trans->reg_id = i;
+		trans->shift = cov_start - reg_start;
+		trans->width = cov_end - cov_start;
+	}
+	if (done_bits != field->field_size) {
+		MLX5_ASSERT(false);
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "failed to map field to sample register");
+	}
+	return 0;
+}
+
+/* Allocate sample registers for the specified field type and interval array. */
+static int
+mlx5_flex_alloc_sample(struct mlx5_flex_field_cover *cover,
+		       struct mlx5_flex_parser_devx *parser,
+		       struct rte_flow_item_flex_field *field,
+		       struct mlx5_hca_flex_attr *attr,
+		       struct rte_flow_error *error)
+{
+	struct mlx5_devx_match_sample_attr node;
+	uint32_t idx = 0;
+
+	mlx5_flex_config_sample(&node, field);
+	while (idx < cover->num) {
+		int32_t start, end;
+
+		/* Sample base offsets are in bytes, should align. */
+		start = RTE_ALIGN_FLOOR(cover->start[idx], CHAR_BIT);
+		node.flow_match_sample_field_base_offset =
+						(start / CHAR_BIT) & 0xFF;
+		/* Allocate sample register. */
+		if (parser->num_samples >= MLX5_GRAPH_NODE_SAMPLE_NUM ||
+		    parser->num_samples >= attr->max_num_sample ||
+		    parser->num_samples >= attr->max_num_prog_sample)
+			return rte_flow_error_set
+				(error, EINVAL,
+				 RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "no sample registers to handle all flex item fields");
+		parser->devx_conf.sample[parser->num_samples] = node;
+		parser->num_samples++;
+		/* Remove or update covered intervals. */
+		end = start + 32;
+		while (idx < cover->num) {
+			if (end >= cover->end[idx]) {
+				idx++;
+				continue;
+			}
+			if (end > cover->start[idx])
+				cover->start[idx] = end;
+			break;
+		}
+	}
+	return 0;
+}
+
+static int
+mlx5_flex_translate_sample(struct mlx5_hca_flex_attr *attr,
+			   const struct rte_flow_item_flex_conf *conf,
+			   struct mlx5_flex_parser_devx *parser,
+			   struct mlx5_flex_item *item,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_flex_field_cover cover;
+	uint32_t i, j;
+	int ret;
+
+	if (conf->sample_num > MLX5_FLEX_ITEM_MAPPING_NUM)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "sample field number exceeds limit");
+	/*
+	 * The application can specify fields smaller or bigger than 32 bits
+	 * covered with single sample register and it can specify field
+	 * offsets in any order.
+	 *
+	 * Gather all similar fields together, build array of bit intervals
+	 * in ascending order and try to cover with the smallest set of sample
+	 * registers.
+	 */
+	memset(&cover, 0, sizeof(cover));
+	for (i = 0; i < conf->sample_num; i++) {
+		struct rte_flow_item_flex_field *fl = conf->sample_data + i;
+
+		/* Check whether field was covered in the previous iteration. */
+		if (cover.mapped[i / CHAR_BIT] & (1u << (i % CHAR_BIT)))
+			continue;
+		if (fl->field_mode == FIELD_MODE_DUMMY)
+			continue;
+		/* Build an interval array for the field and similar ones */
+		cover.num = 0;
+		/* Add the first field to array unconditionally. */
+		ret = mlx5_flex_cover_sample(&cover, fl, NULL, attr, error);
+		if (ret < 0)
+			return ret;
+		MLX5_ASSERT(ret > 0);
+		cover.mapped[i / CHAR_BIT] |= 1u << (i % CHAR_BIT);
+		for (j = i + 1; j < conf->sample_num; j++) {
+			struct rte_flow_item_flex_field *ft;
+
+			/* Add field to array if its type matches. */
+			ft = conf->sample_data + j;
+			ret = mlx5_flex_cover_sample(&cover, ft, fl,
+						     attr, error);
+			if (ret < 0)
+				return ret;
+			if (!ret)
+				continue;
+			cover.mapped[j / CHAR_BIT] |= 1u << (j % CHAR_BIT);
+		}
+		/* Allocate sample registers to cover array of intervals. */
+		ret = mlx5_flex_alloc_sample(&cover, parser, fl, attr, error);
+		if (ret)
+			return ret;
+	}
+	/* Build the item pattern translating data on flow creation. */
+	item->mapnum = 0;
+	memset(&item->map, 0, sizeof(item->map));
+	for (i = 0; i < conf->sample_num; i++) {
+		struct rte_flow_item_flex_field *fl = conf->sample_data + i;
+
+		ret = mlx5_flex_map_sample(fl, parser, item, error);
+		if (ret) {
+			MLX5_ASSERT(false);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int
+mlx5_flex_arc_type(enum rte_flow_item_type type, int in)
+{
+	switch (type) {
+	case RTE_FLOW_ITEM_TYPE_ETH:
+		return  MLX5_GRAPH_ARC_NODE_MAC;
+	case RTE_FLOW_ITEM_TYPE_IPV4:
+		return in ? MLX5_GRAPH_ARC_NODE_IP : MLX5_GRAPH_ARC_NODE_IPV4;
+	case RTE_FLOW_ITEM_TYPE_IPV6:
+		return in ? MLX5_GRAPH_ARC_NODE_IP : MLX5_GRAPH_ARC_NODE_IPV6;
+	case RTE_FLOW_ITEM_TYPE_UDP:
+		return MLX5_GRAPH_ARC_NODE_UDP;
+	case RTE_FLOW_ITEM_TYPE_TCP:
+		return MLX5_GRAPH_ARC_NODE_TCP;
+	case RTE_FLOW_ITEM_TYPE_MPLS:
+		return MLX5_GRAPH_ARC_NODE_MPLS;
+	case RTE_FLOW_ITEM_TYPE_GRE:
+		return MLX5_GRAPH_ARC_NODE_GRE;
+	case RTE_FLOW_ITEM_TYPE_GENEVE:
+		return MLX5_GRAPH_ARC_NODE_GENEVE;
+	case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
+		return MLX5_GRAPH_ARC_NODE_VXLAN_GPE;
+	default:
+		return -EINVAL;
+	}
+}
+
+static int
+mlx5_flex_arc_in_eth(const struct rte_flow_item *item,
+		     struct rte_flow_error *error)
+{
+	const struct rte_flow_item_eth *spec = item->spec;
+	const struct rte_flow_item_eth *mask = item->mask;
+	struct rte_flow_item_eth eth = { .hdr.ether_type = RTE_BE16(0xFFFF) };
+
+	if (memcmp(mask, &eth, sizeof(struct rte_flow_item_eth))) {
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, item,
+			 "invalid eth item mask");
+	}
+	return rte_be_to_cpu_16(spec->hdr.ether_type);
+}
+
+static int
+mlx5_flex_arc_in_udp(const struct rte_flow_item *item,
+		     struct rte_flow_error *error)
+{
+	const struct rte_flow_item_udp *spec = item->spec;
+	const struct rte_flow_item_udp *mask = item->mask;
+	struct rte_flow_item_udp udp = { .hdr.dst_port = RTE_BE16(0xFFFF) };
+
+	if (memcmp(mask, &udp, sizeof(struct rte_flow_item_udp))) {
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, item,
+			 "invalid udp item mask");
+	}
+	return rte_be_to_cpu_16(spec->hdr.dst_port);
+}
+
+static int
+mlx5_flex_translate_arc_in(struct mlx5_hca_flex_attr *attr,
+			   const struct rte_flow_item_flex_conf *conf,
+			   struct mlx5_flex_parser_devx *devx,
+			   struct mlx5_flex_item *item,
+			   struct rte_flow_error *error)
+{
+	struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
+	uint32_t i;
+
+	RTE_SET_USED(item);
+	if (conf->input_num > attr->max_num_arc_in)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "too many input links");
+	for (i = 0; i < conf->input_num; i++) {
+		struct mlx5_devx_graph_arc_attr *arc = node->in + i;
+		struct rte_flow_item_flex_link *link = conf->input_link + i;
+		const struct rte_flow_item *rte_item = &link->item;
+		int arc_type;
+		int ret;
+
+		if (!rte_item->spec || !rte_item->mask || rte_item->last)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "invalid flex item IN arc format");
+		arc_type = mlx5_flex_arc_type(rte_item->type, true);
+		if (arc_type < 0 || !(attr->node_in & RTE_BIT32(arc_type)))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported flex item IN arc type");
+		arc->arc_parse_graph_node = arc_type;
+		arc->start_inner_tunnel = link->tunnel ? 1 : 0;
+		/*
+		 * Configure arc IN condition value. The value location depends
+		 * on protocol. Current FW version supports IP & UDP for IN
+		 * arcs only, and locations for these protocols are defined.
+		 * Add more protocols when available.
+		 */
+		switch (rte_item->type) {
+		case RTE_FLOW_ITEM_TYPE_ETH:
+			ret = mlx5_flex_arc_in_eth(rte_item, error);
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			ret = mlx5_flex_arc_in_udp(rte_item, error);
+			break;
+		default:
+			MLX5_ASSERT(false);
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported flex item IN arc type");
+		}
+		if (ret < 0)
+			return ret;
+		arc->compare_condition_value = (uint16_t)ret;
+	}
+	return 0;
+}
+
+static int
+mlx5_flex_translate_arc_out(struct mlx5_hca_flex_attr *attr,
+			    const struct rte_flow_item_flex_conf *conf,
+			    struct mlx5_flex_parser_devx *devx,
+			    struct mlx5_flex_item *item,
+			    struct rte_flow_error *error)
+{
+	struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
+	uint32_t i;
+
+	if (conf->output_num > attr->max_num_arc_out)
+		return rte_flow_error_set
+			(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+			 "too many output links");
+	for (i = 0; i < conf->output_num; i++) {
+		struct mlx5_devx_graph_arc_attr *arc = node->out + i;
+		struct rte_flow_item_flex_link *link = conf->output_link + i;
+		const struct rte_flow_item *rte_item = &link->item;
+		int arc_type;
+
+		if (rte_item->spec || rte_item->mask || rte_item->last)
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "flex node: invalid OUT arc format");
+		arc_type = mlx5_flex_arc_type(rte_item->type, false);
+		if (arc_type < 0 || !(attr->node_out & RTE_BIT32(arc_type)))
+			return rte_flow_error_set
+				(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+				 "unsupported flex item OUT arc type");
+		arc->arc_parse_graph_node = arc_type;
+		arc->start_inner_tunnel = link->tunnel ? 1 : 0;
+		arc->compare_condition_value = link->next;
+		if (link->tunnel)
+			item->tunnel = 1;
+	}
+	return 0;
+}
+
+/* Translate RTE flex item API configuration into flex parser settings. */
+static int
+mlx5_flex_translate_conf(struct rte_eth_dev *dev,
+			 const struct rte_flow_item_flex_conf *conf,
+			 struct mlx5_flex_parser_devx *devx,
+			 struct mlx5_flex_item *item,
+			 struct rte_flow_error *error)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_hca_flex_attr *attr = &priv->config.hca_attr.flex;
+	int ret;
+
+	ret = mlx5_flex_translate_length(attr, conf, devx, error);
+	if (ret)
+		return ret;
+	ret = mlx5_flex_translate_next(attr, conf, devx, error);
+	if (ret)
+		return ret;
+	ret = mlx5_flex_translate_sample(attr, conf, devx, item, error);
+	if (ret)
+		return ret;
+	ret = mlx5_flex_translate_arc_in(attr, conf, devx, item, error);
+	if (ret)
+		return ret;
+	ret = mlx5_flex_translate_arc_out(attr, conf, devx, item, error);
+	if (ret)
+		return ret;
+	return 0;
+}
+
 /**
  * Create the flex item with specified configuration over the Ethernet device.
  *
@@ -145,6 +889,8 @@ flow_dv_item_create(struct rte_eth_dev *dev,
 				   "too many flex items created on the port");
 		return NULL;
 	}
+	if (mlx5_flex_translate_conf(dev, conf, &devx_config, flex, error))
+		goto error;
 	ent = mlx5_list_register(priv->sh->flex_parsers_dv, &devx_config);
 	if (!ent) {
 		rte_flow_error_set(error, ENOMEM,
@@ -153,7 +899,6 @@ flow_dv_item_create(struct rte_eth_dev *dev,
 		goto error;
 	}
 	flex->devx_fp = container_of(ent, struct mlx5_flex_parser_devx, entry);
-	RTE_SET_USED(conf);
 	/* Mark initialized flex item valid. */
 	__atomic_add_fetch(&flex->refcnt, 1, __ATOMIC_RELEASE);
 	return (struct rte_flow_item_flex_handle *)flex;
@@ -278,6 +1023,7 @@ mlx5_flex_parser_remove_cb(void *list_ctx, struct mlx5_list_entry *entry)
 	RTE_SET_USED(list_ctx);
 	MLX5_ASSERT(fp->devx_obj);
 	claim_zero(mlx5_devx_cmd_destroy(fp->devx_obj));
+	DRV_LOG(DEBUG, "DEVx flex parser %p destroyed\n", (const void *)fp);
 	mlx5_free(entry);
 }
 
-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v2 13/14] net/mlx5: translate flex item pattern into matcher
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (11 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 12/14] net/mlx5: translate flex item configuration Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 14/14] net/mlx5: handle flex item in flows Viacheslav Ovsiienko
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The matcher is a steering engine entity that represents
the flow pattern for the hardware to match. In order to
provide a match on the flex item pattern, the appropriate
matcher fields should be configured with values and masks
accordingly.

The flex item related matcher fields are an array of eight
32-bit fields to match with data captured by the sample registers
of the configured flex parser. One packet field presented in the
item pattern can be split between several sample registers,
and multiple fields can be combined into a single
sample register to optimize hardware resource usage
(the number of sample registers is limited), depending on field
modes, widths and offsets. The actual mapping is complicated
and controlled by special translation data, built by the PMD
on flex item creation.
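
As an illustration only (a simplified sketch, not the driver code): the
structure field_map, the helper names and NUM_SAMPLE_REGS below are
hypothetical, and plain arrays stand in for the misc4 matcher fields.
It shows how a field of a given width and shift might be merged into
one of the eight 32-bit sample registers without disturbing the bits
that belong to other fields:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SAMPLE_REGS 8 /* eight 32-bit matcher fields, as described above */

/* Hypothetical descriptor: one pattern field mapped into one register. */
struct field_map {
	uint32_t width;  /* field width in bits, 1..32 */
	uint32_t shift;  /* bit position of the field inside the register */
	uint32_t reg_id; /* sample register index, 0..NUM_SAMPLE_REGS-1 */
};

/* Mask covering the 'width' low bits; width == 32 is handled without UB. */
static uint32_t
low_bits(uint32_t width)
{
	assert(width >= 1 && width <= 32);
	return width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
}

/* Merge one field into the mask/value registers, keeping other bits. */
static void
set_match_sample(uint32_t regs_m[NUM_SAMPLE_REGS],
		 uint32_t regs_v[NUM_SAMPLE_REGS],
		 const struct field_map *map, uint32_t val, uint32_t msk)
{
	uint32_t def;

	assert(map->reg_id < NUM_SAMPLE_REGS);
	assert(map->width >= 1 && map->shift + map->width <= 32);
	/* Bits of the register owned by this field. */
	def = low_bits(map->width) << map->shift;
	msk = (msk << map->shift) & def;
	val = (val << map->shift) & msk;
	regs_m[map->reg_id] = (regs_m[map->reg_id] & ~def) | msk;
	regs_v[map->reg_id] = (regs_v[map->reg_id] & ~def) | val;
}
```

With two descriptors pointing at the same reg_id, an 8-bit field at
shift 0 and a 16-bit field at shift 8 end up sharing one register.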

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.h           |   8 ++
 drivers/net/mlx5/mlx5_flow_flex.c | 209 ++++++++++++++++++++++++++++++
 2 files changed, 217 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index d4fa946485..5cca704977 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1871,6 +1871,14 @@ int flow_dv_item_release(struct rte_eth_dev *dev,
 		    struct rte_flow_error *error);
 int mlx5_flex_item_port_init(struct rte_eth_dev *dev);
 void mlx5_flex_item_port_cleanup(struct rte_eth_dev *dev);
+void mlx5_flex_flow_translate_item(struct rte_eth_dev *dev,
+				   void *matcher, void *key,
+				   const struct rte_flow_item *item);
+int mlx5_flex_acquire_index(struct rte_eth_dev *dev,
+			    struct rte_flow_item_flex_handle *handle,
+			    bool acquire);
+int mlx5_flex_release_index(struct rte_eth_dev *dev, int index);
+
 /* Flex parser list callbacks. */
 struct mlx5_list_entry *mlx5_flex_parser_create_cb(void *list_ctx, void *ctx);
 int mlx5_flex_parser_match_cb(void *list_ctx,
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c b/drivers/net/mlx5/mlx5_flow_flex.c
index 56b91da839..f695198833 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -113,6 +113,215 @@ mlx5_flex_free(struct mlx5_priv *priv, struct mlx5_flex_item *item)
 	}
 }
 
+__rte_always_inline static uint32_t
+mlx5_flex_get_bitfield(const struct rte_flow_item_flex *item,
+		       uint32_t pos, uint32_t width)
+{
+	const uint8_t *ptr = item->pattern;
+	uint32_t val;
+
+	/* Process the bitfield start byte. */
+	MLX5_ASSERT(width <= sizeof(uint32_t) * CHAR_BIT);
+	if (item->length <= pos / CHAR_BIT)
+		return 0;
+	val = ptr[pos / CHAR_BIT] >> (pos % CHAR_BIT);
+	if (width <= CHAR_BIT - pos % CHAR_BIT)
+		return val;
+	width -= CHAR_BIT - pos % CHAR_BIT;
+	pos += CHAR_BIT - pos % CHAR_BIT;
+	while (width >= CHAR_BIT) {
+		val <<= CHAR_BIT;
+		if (pos / CHAR_BIT < item->length)
+			val |= ptr[pos / CHAR_BIT];
+		width -= CHAR_BIT;
+		pos += CHAR_BIT;
+	}
+	/* Process the bitfield end byte. */
+	if (width) {
+		val <<= width;
+		if (pos / CHAR_BIT < item->length)
+			val |= ptr[pos / CHAR_BIT] & (RTE_BIT32(width) - 1);
+	}
+	return val;
+}
+
+#define SET_FP_MATCH_SAMPLE_ID(x, def, msk, val, sid) \
+	do { \
+		uint32_t tmp, out = (def); \
+		tmp = MLX5_GET(fte_match_set_misc4, misc4_m, \
+			       prog_sample_field_value_##x); \
+		tmp = (tmp & ~out) | (msk); \
+		MLX5_SET(fte_match_set_misc4, misc4_m, \
+			 prog_sample_field_value_##x, tmp); \
+		tmp = MLX5_GET(fte_match_set_misc4, misc4_v, \
+			       prog_sample_field_value_##x); \
+		tmp = (tmp & ~out) | (val); \
+		MLX5_SET(fte_match_set_misc4, misc4_v, \
+			 prog_sample_field_value_##x, tmp); \
+		tmp = sid; \
+		MLX5_SET(fte_match_set_misc4, misc4_v, \
+			 prog_sample_field_id_##x, tmp);\
+		MLX5_SET(fte_match_set_misc4, misc4_m, \
+			 prog_sample_field_id_##x, tmp); \
+	} while (0)
+
+__rte_always_inline static void
+mlx5_flex_set_match_sample(void *misc4_m, void *misc4_v,
+			   uint32_t def, uint32_t mask, uint32_t value,
+			   uint32_t sample_id, uint32_t id)
+{
+	switch (id) {
+	case 0:
+		SET_FP_MATCH_SAMPLE_ID(0, def, mask, value, sample_id);
+		break;
+	case 1:
+		SET_FP_MATCH_SAMPLE_ID(1, def, mask, value, sample_id);
+		break;
+	case 2:
+		SET_FP_MATCH_SAMPLE_ID(2, def, mask, value, sample_id);
+		break;
+	case 3:
+		SET_FP_MATCH_SAMPLE_ID(3, def, mask, value, sample_id);
+		break;
+	case 4:
+		SET_FP_MATCH_SAMPLE_ID(4, def, mask, value, sample_id);
+		break;
+	case 5:
+		SET_FP_MATCH_SAMPLE_ID(5, def, mask, value, sample_id);
+		break;
+	case 6:
+		SET_FP_MATCH_SAMPLE_ID(6, def, mask, value, sample_id);
+		break;
+	case 7:
+		SET_FP_MATCH_SAMPLE_ID(7, def, mask, value, sample_id);
+		break;
+	default:
+		MLX5_ASSERT(false);
+		break;
+	}
+#undef SET_FP_MATCH_SAMPLE_ID
+}
+/**
+ * Translate item pattern into matcher fields according to translation
+ * array.
+ *
+ * @param dev
+ *   Ethernet device to translate flex item on.
+ * @param[in, out] matcher
+ *   Flow matcher to configure.
+ * @param[in, out] key
+ *   Flow matcher value.
+ * @param[in] item
+ *   Flow pattern to translate.
+ *
+ * @return
+ *   None.
+ */
+void
+mlx5_flex_flow_translate_item(struct rte_eth_dev *dev,
+			      void *matcher, void *key,
+			      const struct rte_flow_item *item)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	const struct rte_flow_item_flex *spec, *mask;
+	void *misc4_m = MLX5_ADDR_OF(fte_match_param, matcher,
+				     misc_parameters_4);
+	void *misc4_v = MLX5_ADDR_OF(fte_match_param, key, misc_parameters_4);
+	struct mlx5_flex_item *tp;
+	uint32_t i, pos = 0;
+
+	MLX5_ASSERT(item->spec && item->mask);
+	spec = item->spec;
+	mask = item->mask;
+	tp = (struct mlx5_flex_item *)spec->handle;
+	MLX5_ASSERT(mlx5_flex_index(priv, tp) >= 0);
+	for (i = 0; i < tp->mapnum; i++) {
+		struct mlx5_flex_pattern_field *map = tp->map + i;
+		uint32_t id = map->reg_id;
+		uint32_t def = (RTE_BIT32(map->width) - 1) << map->shift;
+		uint32_t val = mlx5_flex_get_bitfield(spec, pos, map->width);
+		uint32_t msk = mlx5_flex_get_bitfield(mask, pos, map->width);
+
+		MLX5_ASSERT(map->width);
+		MLX5_ASSERT(id < tp->devx_fp->num_samples);
+		pos += map->width;
+		val <<= map->shift;
+		msk <<= map->shift;
+		mlx5_flex_set_match_sample(misc4_m, misc4_v,
+					   def, msk & def, val & msk & def,
+					   tp->devx_fp->sample_ids[id], id);
+	}
+}
+
+/**
+ * Convert flex item handle (from the RTE flow) to flex item index on port.
+ * Optionally can increment flex item object reference count.
+ *
+ * @param dev
+ *   Ethernet device to acquire flex item on.
+ * @param[in] handle
+ *   Flow item handle from item spec.
+ * @param[in] acquire
+ *   If set - increment reference counter.
+ *
+ * @return
+ *   >=0 - index on success, a negative errno value otherwise
+ *         and rte_errno is set.
+ */
+int
+mlx5_flex_acquire_index(struct rte_eth_dev *dev,
+			struct rte_flow_item_flex_handle *handle,
+			bool acquire)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flex_item *flex = (struct mlx5_flex_item *)handle;
+	int ret = mlx5_flex_index(priv, flex);
+
+	if (ret < 0) {
+		errno = EINVAL;
+		rte_errno = EINVAL;
+		return ret;
+	}
+	if (acquire)
+		__atomic_add_fetch(&flex->refcnt, 1, __ATOMIC_RELEASE);
+	return ret;
+}
+
+/**
+ * Release flex item index on port - decrements reference counter by index.
+ *
+ * @param dev
+ *   Ethernet device to release flex item on.
+ * @param[in] index
+ *   Flow item index.
+ *
+ * @return
+ *   0 - on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_flex_release_index(struct rte_eth_dev *dev,
+			int index)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_flex_item *flex;
+
+	if (index < 0 || index >= MLX5_PORT_FLEX_ITEM_NUM ||
+	    !(priv->flex_item_map & (1u << index))) {
+		errno = EINVAL;
+		rte_errno = EINVAL;
+		return -EINVAL;
+	}
+	flex = priv->flex_item + index;
+	if (flex->refcnt <= 1) {
+		MLX5_ASSERT(false);
+		errno = EINVAL;
+		rte_errno = EINVAL;
+		return -EINVAL;
+	}
+	__atomic_sub_fetch(&flex->refcnt, 1, __ATOMIC_RELEASE);
+	return 0;
+}
+
 static int
 mlx5_flex_translate_length(struct mlx5_hca_flex_attr *attr,
 			   const struct rte_flow_item_flex_conf *conf,
-- 
2.18.1



* [dpdk-dev] [PATCH v2 14/14] net/mlx5: handle flex item in flows
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (12 preceding siblings ...)
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 13/14] net/mlx5: translate flex item pattern into matcher Viacheslav Ovsiienko
@ 2021-10-01 19:34   ` Viacheslav Ovsiienko
  13 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-01 19:34 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Provide flex item recognition, validation and translation
in flow patterns. Track the flex item referencing.
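
A minimal sketch of the reference tracking idea, under stated
assumptions: the types port_state and flow_handle, the helper names and
FLEX_ITEM_NUM are hypothetical, and plain counters are used where the
driver uses atomic operations. Each flow keeps a bitmask of the flex
item indices it references, takes one reference per index, and drops
them all by walking the set bits on destroy:

```c
#include <assert.h>
#include <stdint.h>

#define FLEX_ITEM_NUM 4 /* hypothetical per-port flex item limit */

struct port_state {
	uint32_t refcnt[FLEX_ITEM_NUM]; /* reference counter per flex item */
};

struct flow_handle {
	uint32_t flex_item; /* bitmask of referenced flex item indices */
};

/* Index of the lowest set bit; v must be non-zero. */
static unsigned int
lowest_set_bit(uint32_t v)
{
	unsigned int i = 0;

	assert(v != 0);
	while (!(v & 1)) {
		v >>= 1;
		i++;
	}
	return i;
}

/* Take at most one reference per flex item for a given flow. */
static void
flow_reference_item(struct port_state *port, struct flow_handle *flow,
		    unsigned int index)
{
	assert(index < FLEX_ITEM_NUM);
	if (flow->flex_item & (UINT32_C(1) << index))
		return; /* already counted, e.g. inner and outer items */
	port->refcnt[index]++;
	flow->flex_item |= UINT32_C(1) << index;
}

/* Drop every reference recorded in the flow bitmask. */
static void
flow_release_items(struct port_state *port, struct flow_handle *flow)
{
	while (flow->flex_item) {
		unsigned int index = lowest_set_bit(flow->flex_item);

		assert(port->refcnt[index] > 0);
		port->refcnt[index]--;
		flow->flex_item &= flow->flex_item - 1; /* clear lowest bit */
	}
}
```

The bitmask keeps the release path independent of the pattern itself:
the flow does not need to be re-parsed to find which items it used.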

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow.h      |  8 +++-
 drivers/net/mlx5/mlx5_flow_dv.c   | 70 +++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_flow_flex.c |  4 +-
 3 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index a8f8c49dd2..c87d8e3168 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -173,6 +173,10 @@ enum mlx5_feature_name {
 /* Conntrack item. */
 #define MLX5_FLOW_LAYER_ASO_CT (UINT64_C(1) << 35)
 
+/* Flex item */
+#define MLX5_FLOW_ITEM_FLEX (UINT64_C(1) << 36)
+#define MLX5_FLOW_ITEM_FLEX_TUNNEL (UINT64_C(1) << 37)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
 	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
@@ -187,7 +191,8 @@ enum mlx5_feature_name {
 	(MLX5_FLOW_LAYER_VXLAN | MLX5_FLOW_LAYER_VXLAN_GPE | \
 	 MLX5_FLOW_LAYER_GRE | MLX5_FLOW_LAYER_NVGRE | MLX5_FLOW_LAYER_MPLS | \
 	 MLX5_FLOW_LAYER_IPIP | MLX5_FLOW_LAYER_IPV6_ENCAP | \
-	 MLX5_FLOW_LAYER_GENEVE | MLX5_FLOW_LAYER_GTP)
+	 MLX5_FLOW_LAYER_GENEVE | MLX5_FLOW_LAYER_GTP | \
+	 MLX5_FLOW_ITEM_FLEX_TUNNEL)
 
 /* Inner Masks. */
 #define MLX5_FLOW_LAYER_INNER_L3 \
@@ -686,6 +691,7 @@ struct mlx5_flow_handle {
 	uint32_t is_meter_flow_id:1; /**< Indate if flow_id is for meter. */
 	uint32_t mark:1; /**< Metadate rxq mark flag. */
 	uint32_t fate_action:3; /**< Fate action type. */
+	uint32_t flex_item; /**< referenced Flex Item bitmask. */
 	union {
 		uint32_t rix_hrxq; /**< Hash Rx queue object index. */
 		uint32_t rix_jump; /**< Index to the jump action resource. */
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a3c35a5edf..3a785e7925 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -6825,6 +6825,38 @@ flow_dv_validate_item_integrity(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+flow_dv_validate_item_flex(struct rte_eth_dev *dev,
+			   const struct rte_flow_item *item,
+			   uint64_t *last_item,
+			   struct rte_flow_error *error)
+{
+	const struct rte_flow_item_flex *flow_spec = item->spec;
+	const struct rte_flow_item_flex *flow_mask = item->mask;
+	struct mlx5_flex_item *flex;
+
+	if (!flow_spec)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "flex flow item spec cannot be NULL");
+	if (!flow_mask)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "flex flow item mask cannot be NULL");
+	if (item->last)
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "flex flow item last not supported");
+	if (mlx5_flex_acquire_index(dev, flow_spec->handle, false) < 0)
+		return rte_flow_error_set(error, EINVAL,
+					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+					  "invalid flex flow item handle");
+	flex = (struct mlx5_flex_item *)flow_spec->handle;
+	*last_item = flex->tunnel ? MLX5_FLOW_ITEM_FLEX_TUNNEL :
+				    MLX5_FLOW_ITEM_FLEX;
+	return 0;
+}
+
 /**
  * Internal validation function. For validating both actions and items.
  *
@@ -7266,6 +7298,14 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct rte_flow_attr *attr,
 			 * list it here as a supported type
 			 */
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			if (item_flags & MLX5_FLOW_ITEM_FLEX)
+				return rte_flow_error_set(error, ENOTSUP,
+							  RTE_FLOW_ERROR_TYPE_ITEM,
+							  NULL, "multiple flex items not supported");
+			ret = flow_dv_validate_item_flex(dev, items,
+							 &last_item, error);
+			break;
 		default:
 			return rte_flow_error_set(error, ENOTSUP,
 						  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -10123,6 +10163,25 @@ flow_dv_translate_item_aso_ct(struct rte_eth_dev *dev,
 			       reg_value, reg_mask);
 }
 
+static void
+flow_dv_translate_item_flex(struct rte_eth_dev *dev, void *matcher, void *key,
+			    const struct rte_flow_item *item,
+			    struct mlx5_flow *dev_flow)
+{
+	const struct rte_flow_item_flex *spec =
+		(const struct rte_flow_item_flex *)item->spec;
+	int index = mlx5_flex_acquire_index(dev, spec->handle, false);
+
+	MLX5_ASSERT(index >= 0 && index < (int)(sizeof(uint32_t) * CHAR_BIT));
+	if (!(dev_flow->handle->flex_item & RTE_BIT32(index))) {
+		/* Don't count both inner and outer flex items in one rule. */
+		if (mlx5_flex_acquire_index(dev, spec->handle, true) != index)
+			MLX5_ASSERT(false);
+		dev_flow->handle->flex_item |= RTE_BIT32(index);
+	}
+	mlx5_flex_flow_translate_item(dev, matcher, key, item);
+}
+
 static uint32_t matcher_zero[MLX5_ST_SZ_DW(fte_match_param)] = { 0 };
 
 #define HEADER_IS_ZERO(match_criteria, headers)				     \
@@ -13520,6 +13579,10 @@ flow_dv_translate(struct rte_eth_dev *dev,
 			flow_dv_translate_item_aso_ct(dev, match_mask,
 						      match_value, items);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			flow_dv_translate_item_flex(dev, match_mask,
+						    match_value, items,
+						    dev_flow);
 		default:
 			break;
 		}
@@ -14393,6 +14456,12 @@ flow_dv_destroy(struct rte_eth_dev *dev, struct rte_flow *flow)
 		if (!dev_handle)
 			return;
 		flow->dev_handles = dev_handle->next.next;
+		while (dev_handle->flex_item) {
+			int index = rte_bsf32(dev_handle->flex_item);
+
+			mlx5_flex_release_index(dev, index);
+			dev_handle->flex_item &= ~RTE_BIT32(index);
+		}
 		if (dev_handle->dvh.matcher)
 			flow_dv_matcher_release(dev, dev_handle);
 		if (dev_handle->dvh.rix_sample)
@@ -18014,5 +18083,6 @@ const struct mlx5_flow_driver_ops mlx5_flow_dv_drv_ops = {
 	.item_create = flow_dv_item_create,
 	.item_release = flow_dv_item_release,
 };
+
 #endif /* HAVE_IBV_FLOW_DV_SUPPORT */
 
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c b/drivers/net/mlx5/mlx5_flow_flex.c
index f695198833..f1567ddfdd 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -1164,8 +1164,8 @@ flow_dv_item_release(struct rte_eth_dev *dev,
 				  &flex->devx_fp->entry);
 	flex->devx_fp = NULL;
 	mlx5_flex_free(priv, flex);
-	if (rc)
-		return rte_flow_error_set(error, rc,
+	if (rc < 0)
+		return rte_flow_error_set(error, EBUSY,
 					  RTE_FLOW_ERROR_TYPE_ITEM, NULL,
 					  "flex item release failure");
 	return 0;
-- 
2.18.1



* Re: [dpdk-dev] [PATCH v2 01/14] ethdev: introduce configurable flexible item
  2021-10-01 19:34   ` [dpdk-dev] [PATCH v2 01/14] " Viacheslav Ovsiienko
@ 2021-10-07 11:08     ` Ori Kam
  2021-10-12  6:42       ` Slava Ovsiienko
  0 siblings, 1 reply; 73+ messages in thread
From: Ori Kam @ 2021-10-07 11:08 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon

Hi Slava,

> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Friday, October 1, 2021 10:34 PM
> Subject: [PATCH v2 01/14] ethdev: introduce configurable flexible item
> 
> 1. Introduction and Retrospective
> 
> Nowadays the networks are evolving fast and wide, the network structures are
> getting more and more complicated, the new application areas are emerging.
> To address these challenges the new network protocols are continuously being
> developed, considered by technical communities, adopted by industry and,
> eventually implemented in hardware and software. The DPDK framework
> follows the common trends and if we bother to glance at the RTE Flow API
> header we see the multiple new items were introduced during the last years
> since the initial release.
> 
> The new protocol adoption and implementation process is not straightforward
> and takes time, the new protocol passes development, consideration,
> adoption, and implementation phases. The industry tries to mitigate and
> address the forthcoming network protocols, for example, many hardware
> vendors are implementing flexible and configurable network protocol parsers.
> As DPDK developers, could we anticipate the near future in the same fashion
> and introduce the similar flexibility in RTE Flow API?
> 
> Let's check what we already have merged in our project, and we see the nice
> raw item (rte_flow_item_raw). At the first glance, it looks superior and we can
> try to implement a flow matching on the header of some relatively new tunnel
> protocol, say on the GENEVE header with variable length options. And, under
> further consideration, we run into the raw item
> limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>   (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>   might be triggered), and actually is not supported
>   by existing PMDs
> - no explicitly specified relations with preceding
>   and following items
> - no tunnel hint support
> 
> As the result, implementing the support for tunnel protocols like
> aforementioned GENEVE with variable extra protocol option with flow raw
> item becomes very complicated and would require multiple flows and
> multiple raw items chained in the same flow (by the way, there is no support
> found for chained raw items in implemented drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle
> matches with existing and new network protocol headers in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there are the requirements to support the new network protocol
> with RTE Flows. What is given within protocol
> specification:
> 
>   - header format
>   - header length, (can be variable, depending on options)
>   - potential presence of extra options following or included
>     in the header
>   - the relations with preceding protocols. For example,
>     the GENEVE follows UDP, eCPRI can follow either UDP
>     or L2 header
>   - the relations with following protocols. For example,
>     the next layer after tunnel header can be L2 or L3
>   - whether the new protocol is a tunnel and the header
>     is a splitting point between outer and inner layers
> 
> The supposed way to operate with flex item:
> 
>   - application defines the header structures according to
>     protocol specification
> 
>   - application calls rte_flow_flex_item_create() with desired
>     configuration according to the protocol specification, it
>     creates the flex item object over specified ethernet device
>     and prepares PMD and underlying hardware to handle flex
>     item. On item creation call PMD backing the specified
>     ethernet device returns the opaque handle identifying
>     the object have been created
> 
>   - application uses the rte_flow_item_flex with obtained handle
>     in the flows, the values/masks to match with fields in the
>     header are specified in the flex item per flow as for regular
>     items (except that pattern buffer combines all fields)
> 
>   - flows with flex items match with packets in a regular fashion,
>     the values and masks for the new protocol header match are
>     taken from the flex items in the flows
> 
>   - application destroys flows with flex items
> 
>   - application calls rte_flow_flex_item_release() as part of
>     ethernet device API and destroys the flex item object in
>     PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow pattern like
> regular RTE flow items and provides the mask and value to match with fields of
> the protocol item was configured for.
> 
>   struct rte_flow_item_flex {
>     void *handle;
>     uint32_t length;
>     const uint8_t* pattern;
>   };
> 
> The handle is some opaque object maintained on per device basis by
> underlying driver.
> 
> The protocol header fields are considered as bit fields, all offsets and widths
> are expressed in bits. The pattern is the buffer containing the bit
> concatenation of all the fields presented at item configuration time, in the
> same order and same amount. If byte boundary alignment is needed an
> application can use a dummy type field, this is just some kind of gap filler.
> 
> The length field specifies the pattern buffer length in bytes and is needed to
> allow rte_flow_copy() operations. The approach of multiple pattern pointers
> and lengths (per field) was considered and found clumsy - it seems to be much
> suitable for the application to maintain the single structure within the single
> pattern buffer.
> 

I think the main thing that was unclear to me, and which I think I now understand
from reading the code, is that the pattern is the entire flex header structure.
Maybe a better word would be header?
At first I thought that only the matchable fields should be given.
Also, you say everything is in bits and then suddenly you are talking in bytes.
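
To make the bits versus bytes point concrete, here is a small sketch
under one possible reading of the description: the pattern is the bit
concatenation of all configured fields, and the length is that total
rounded up to whole bytes. The helper name flex_pattern_bytes is
hypothetical and not part of the proposed API:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: field widths are given in bits, in configuration
 * order; the result is the pattern buffer length in bytes.
 */
static uint32_t
flex_pattern_bytes(const uint32_t *field_bits, size_t num_fields)
{
	uint32_t total_bits = 0;
	size_t i;

	for (i = 0; i < num_fields; i++)
		total_bits += field_bits[i];
	/* Round up to a whole number of bytes. */
	return (total_bits + 7) / 8;
}
```

For the example in section 6 (a 96-bit dummy field followed by a 32-bit
field) this gives 16 bytes, which matches the fixed part of the header.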

> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>   - header field descriptors:
>     - next header
>     - next protocol
>     - sample to match
>   - input link descriptors
>   - output link descriptors
> 
> The field descriptors tell driver and hardware what data should be extracted
> from the packet and then presented to match in the flows. Each field is a bit
> pattern. It has width, offset from the header beginning, mode of offset
> calculation, and offset related parameters.
> 

I'm not sure the indentation is correct for next header, next protocol and sample to match.
Reading the first line suggests that all of these fields are going to be matched,
while in the following sections only the sample fields are matchable.

> The next header field is special, no data are actually taken from the packet,
> but its offset is used as pointer to the next header in the packet, in other words
> the next header offset specifies the size of the header being parsed by flex
> item.
> 

So the name of the next header should be len?

> There is one more special field - next protocol, it specifies where the next
> protocol identifier is contained and packet data sampled from this field will be
> used to determine the next protocol header type to continue packet parsing.
> The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6
> headers.
> 
> The sample fields are used to represent the data be sampled from the packet
> and then matched with established flows.

Should this be samples?

> 
> There are several methods supposed to calculate field offset in runtime
> depending on configuration and packet content:
> 
>   - FIELD_MODE_FIXED - fixed offset. The bit offset from
>     header beginning is permanent and defined by field_base
>     configuration parameter.
> 
>   - FIELD_MODE_OFFSET - the field bit offset is extracted
>     from other header field (indirect offset field). The
>     resulting field offset to match is calculated as:
> 
>   field_base + (*field_offset & offset_mask) << field_shift
> 

Not all of these field names are defined later in this patch, and I'm not
sure what they mean.
Does * mean taking the value that is in field_offset?
How do we know the width of the field (by the value of the mask)?
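
To make the question concrete, one possible reading of the formula is
sketched below. It assumes that *field_offset means the raw value read
from the indirect offset field and that the shift applies only to the
masked value; note that as written, C precedence would parse
a + b << c as (a + b) << c. The helper name is hypothetical and the
parameter names follow the offset_mask/offset_shift members used in the
example configuration:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the proposed API.
 * offset_value is the raw value read from the indirect offset field.
 */
static uint32_t
flex_offset_mode(uint32_t field_base, uint32_t offset_value,
		 uint32_t offset_mask, uint32_t offset_shift)
{
	assert(offset_shift < 32); /* a shift by 32 or more is undefined in C */
	return field_base + ((offset_value & offset_mask) << offset_shift);
}
```

With an IPv4-like first byte 0x45, mask 0x0F and shift 2 this yields 20,
the header length in bytes; whether the result is meant in bits or in
bytes depends on the shift chosen, which is part of what is unclear.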

>     This mode is useful to sample some extra options following
>     the main header with field containing main header length.
>     Also, this mode can be used to calculate offset to the
>     next protocol header, for example - IPv4 header contains
>     the 4-bit field with IPv4 header length expressed in dwords.
>     One more example - this mode would allow us to skip GENEVE
>     header variable length options.
> 
>   - FIELD_MODE_BITMASK - the field bit offset is extracted
>     from other header field (indirect offset field), the latter
>     is considered as bitmask containing some number of one bits,
>     the resulting field offset to match is calculated as:
> 
>   field_base + bitcount(*field_offset & offset_mask) << field_shift

Same comment as above, you are using names that are not defined later.
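
For the bitmask mode, a sketch of how the formula might be read, with
the same caveats: the helper names are hypothetical, bitcount() is
assumed to be a population count of the masked value, and the shift is
assumed to apply to the count only:

```c
#include <assert.h>
#include <stdint.h>

/* Number of one bits in v (portable population count). */
static uint32_t
bitcount32(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical helper, not part of the proposed API.
 * offset_value is the raw value read from the indirect offset field.
 */
static uint32_t
flex_offset_bitmask(uint32_t field_base, uint32_t offset_value,
		    uint32_t offset_mask, uint32_t offset_shift)
{
	assert(offset_shift < 32); /* a shift by 32 or more is undefined in C */
	return field_base +
	       (bitcount32(offset_value & offset_mask) << offset_shift);
}
```

Each flag set under the mask then adds one unit of (1 << offset_shift)
to the base offset, for example one optional 8-bit field per flag when
the shift is 3 and the offsets are in bits.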

> 
>     This mode would be useful to skip the GTP header and its
>     extra options with specified flags.
> 
>   - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>     boundary alignment in pattern. Pattern mask and data are
>     ignored in the match. All configuration parameters besides
>     field size and offset are ignored.
> 
> The offset mode list can be extended by vendors according to hardware
> supported options.
> 
> The input link configuration section tells the driver after what protocols and at
> what conditions the flex item can follow.
> Input link specified the preceding header pattern, for example for GENEVE it
> can be UDP item specifying match on destination port with value 6081. The
> flex item can follow multiple header types and multiple input links should be
> specified. At flow creation time the item with one of input link types should
> precede the flex item and driver will select the correct flex item settings,
> depending on actual flow pattern.
> 
> The output link configuration section tells the driver how to continue packet
> parsing after the flex item protocol.
> If multiple protocols can follow the flex item header the flex item should
> contain the field with next protocol identifier, and the parsing will be
> continued depending on the data contained in this field in the actual packet.
> 
> The flex item fields can participate in RSS hash calculation, the dedicated flag
> is present in field description to specify what fields should be provided for
> hashing.
> 
> 5. Flex Item Chaining
> 
> If there are multiple protocols supposed to be supported with flex items in
> chained fashion - two or more flex items within the same flow and these ones
> might be neighbors in pattern - it means the flex items are mutual referencing.
> In this case, the item that occurred first should be created with empty output
> link list or with the list including existing items, and then the second flex item
> should be created referencing the first flex item as input arc.
> 

And then I assume we should update the output list.

> Also, the hardware resources used by flex items to handle the packet can be
> limited. If there are multiple flex items that are supposed to be used within the
> same flow it would be nice to provide some hint for the driver that these two
> or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices and these indices from
> two or more flex items should not overlap (be unique per field). For this case,
> the driver will try to engage not overlapping hardware resources and provide
> independent handling of the fields with unique indices. If the hint index is zero
> the driver assigns resources on its own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we have the requirements to handle the new tunnel protocol
> that follows UDP header with destination port 0xFADE and is followed by MAC
> header. Let the new protocol header format be like this:
> 
>   struct new_protocol_header {
>     rte_be32 header_length; /* length in dwords, including options */
>     rte_be32 specific0;     /* some protocol data, no intention */
>     rte_be32 specific1;     /* to match in flows on these fields */
>     rte_be32 crucial;       /* data of interest, match is needed */
>     rte_be32 options[0];    /* optional protocol data, variable length */
>   };
> 
> The supposed flex item configuration:
> 
>   struct rte_flow_item_flex_field field0 = {
>     .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>     .field_size = 96,                /* three dwords from the beginning */
>   };
>   struct rte_flow_item_flex_field field1 = {
>     .field_mode = FIELD_MODE_FIXED,
>     .field_size = 32,       /* Field size is one dword */
>     .field_base = 96,       /* Skip three dwords from the beginning */
>   };
>   struct rte_flow_item_udp spec0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFADE),
>     }
>   };
>   struct rte_flow_item_udp mask0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFFFF),
>     }
>   };
>   struct rte_flow_item_flex_link link0 = {
>     .item = {
>        .type = RTE_FLOW_ITEM_TYPE_UDP,
>        .spec = &spec0,
>        .mask = &mask0,
>     },
>   };
> 
>   struct rte_flow_item_flex_conf conf = {
>     .next_header = {
>       .field_mode = FIELD_MODE_OFFSET,
>       .field_base = 0,
>       .offset_base = 0,
>       .offset_mask = 0xFFFFFFFF,
>       .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
>     },
>     .sample = {
>        &field0,
>        &field1,
>     },

Why do you give both fields in sample?
By your decision we only want to match on field1.

>     .sample_num = 2,
>     .input_link[0] = &link0,
>     .input_num = 1
>   };
> 
> Let's suppose we have created the flex item successfully, and PMD returned
> the handle 0x123456789A. We can use the following item pattern to match the
> crucial field in the packet with value 0x00112233:
> 
>   struct new_protocol_header spec_pattern =
>   {
>     .crucial = RTE_BE32(0x00112233),
>   };
>   struct new_protocol_header mask_pattern =
>   {
>     .crucial = RTE_BE32(0xFFFFFFFF),
>   };
>   struct rte_flow_item_flex spec_flex = {
>     .handle = 0x123456789A,
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &spec_pattern,
>   };
>   struct rte_flow_item_flex mask_flex = {
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &mask_pattern,
>   };
>   struct rte_flow_item item_to_match = {
>     .type = RTE_FLOW_ITEM_TYPE_FLEX,
>     .spec = &spec_flex,
>     .mask = &mask_flex,
>   };
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  doc/guides/prog_guide/rte_flow.rst     |  24 +++
>  doc/guides/rel_notes/release_21_11.rst |   7 +
>  lib/ethdev/rte_ethdev.h                |   1 +
>  lib/ethdev/rte_flow.h                  | 228 +++++++++++++++++++++++++
>  4 files changed, 260 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b42d5ec8c..628f30cea7 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1425,6 +1425,30 @@ Matches a conntrack state after conntrack action.
>  - ``flags``: conntrack packet state flags.
>  - Default ``mask`` matches all state bits.
> 
> +Item: ``FLEX``
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Matches with the network protocol header of preliminary configured format.
> +The application describes the desired header structure, defines the
> +header fields attributes and header relations with preceding and
> +following protocols and configures the ethernet devices accordingly via
> +rte_flow_flex_item_create() routine.

How about: matches a custom header that was created using
rte_flow_flex_item_create

> +
> +- ``handle``: the flex item handle returned by the PMD on successful
> +  rte_flow_flex_item_create() call. The item handle is unique within
> +  the device port, mask for this field is ignored.

I think you can remove the statement that the handle is unique.

> +- ``length``: match pattern length in bytes. If the length does not cover
> +  all fields defined in item configuration, the pattern spec and mask are
> +  supposed to be appended with zeroes till the full configured item length.

It looks buggy to say that any length can be given while still expecting the
application to supply the full length.
 
> +- ``pattern``: pattern to match. The protocol header fields are
> +considered
> +  as bit fields, all offsets and widths are expressed in bits. The
> +pattern
> +  is the buffer containing the bit concatenation of all the fields
> +presented
> +  at item configuration time, in the same order and same amount. The
> +most
> +  regular way is to define all the header fields in the flex item
> +configuration
> +  and directly use the header structure as pattern template, i.e.
> +application
> +  just can fill the header structures with desired match values and
> +masks and
> +  specify these structures as flex item pattern directly.
> +

It is hard to understand this comment and what the application should set.
I suggest taking the basic approach and just explaining it (I think that is what
the last few lines do).

>  Actions
>  ~~~~~~~
> 
> diff --git a/doc/guides/rel_notes/release_21_11.rst
> b/doc/guides/rel_notes/release_21_11.rst
> index 73e377a007..170797f9e9 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Introduced RTE Flow Flex Item.**
> +
> +  * The configurable RTE Flow Flex Item provides the capability to introdude
> +    the arbitrary user specified network protocol header, configure the device
> +    hardware accordingly, and perform match on this header with desired
> patterns
> +    and masks.
> +
>  * **Enabled new devargs parser.**
> 
>    * Enabled devargs syntax
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> afdc53b674..e9ad7673e9 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -558,6 +558,7 @@ struct rte_eth_rss_conf {
>   * it takes the reserved value 0 as input for the hash function.
>   */
>  #define ETH_RSS_L4_CHKSUM          (1ULL << 35)
> +#define ETH_RSS_FLEX		   (1ULL << 36)

Is the indentation right?
How do you support FLEX RSS if more than one FLEX item is configured?

> 
>  /*
>   * We use the following macros to combine with above ETH_RSS_* for diff --git
> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> 7b1ed7f110..eccb1e1791 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -574,6 +574,15 @@ enum rte_flow_item_type {
>  	 * @see struct rte_flow_item_conntrack.
>  	 */
>  	RTE_FLOW_ITEM_TYPE_CONNTRACK,
> +
> +	/**
> +	 * Matches a configured set of fields at runtime calculated offsets
> +	 * over the generic network header with variable length and
> +	 * flexible pattern
> +	 *

I think it should say matches on application configured header.

> +	 * @see struct rte_flow_item_flex.
> +	 */
> +	RTE_FLOW_ITEM_TYPE_FLEX,
>  };
> 
>  /**
> @@ -1839,6 +1848,160 @@ struct rte_flow_item {
>  	const void *mask; /**< Bit-mask applied to spec and last. */  };
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_FLEX
> + *
> + * Matches a specified set of fields within the network protocol
> + * header. Each field is presented as set of bits with specified width,
> +and
> + * bit offset (this is dynamic one - can be calulated by several
> +methods
> + * in runtime) from the header beginning.
> + *
> + * The pattern is concatenation of all bit fields configured at item
> +creation
> + * by rte_flow_flex_item_create() exactly in the same order and amount,
> +no
> + * fields can be omitted or swapped. The dummy mode field can be used
> +for
> + * pattern byte boundary alignment, least significant bit in byte goes first.
> + * Only the fields specified in sample_data configuration parameter
> +participate
> + * in pattern construction.
> + *
> + * If pattern length is smaller than configured fields overall length
> +it is
> + * extended with trailing zeroes, both for value and mask.
> + *
> + * This type does not support ranges (struct rte_flow_item.last).
> + */

I think it is too complex to understand; see my comment above.

> +struct rte_flow_item_flex {
> +	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle.
> */
> +	uint32_t length; /**< Pattern length in bytes. */
> +	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
> +};
> +/**
> + * Field bit offset calculation mode.
> + */
> +enum rte_flow_item_flex_field_mode {
> +	/**
> +	 * Dummy field, used for byte boundary alignment in pattern.
> +	 * Pattern mask and data are ignored in the match. All configuration
> +	 * parameters besides field size are ignored.

Since in the item we just set the value and mask, what will happen if
we set the mask to something other than 0 at an offset where we have such a field?

> +	 */
> +	FIELD_MODE_DUMMY = 0,
> +	/**
> +	 * Fixed offset field. The bit offset from header beginning is
> +	 * is permanent and defined by field_base parameter.
> +	 */
> +	FIELD_MODE_FIXED,
> +	/**
> +	 * The field bit offset is extracted from other header field (indirect
> +	 * offset field). The resulting field offset to match is calculated as:
> +	 *
> +	 *    field_base + (*field_offset & offset_mask) << field_shift

I can't find those names in the patch and I'm not clear on what they mean.

> +	 */
> +	FIELD_MODE_OFFSET,
> +	/**
> +	 * The field bit offset is extracted from other header field (indirect
> +	 * offset field), the latter is considered as bitmask containing some
> +	 * number of one bits, the resulting field offset to match is
> +	 * calculated as:

Just like above. 

> +	 *
> +	 *    field_base + bitcount(*field_offset & offset_mask) << field_shift
> +	 */
> +	FIELD_MODE_BITMASK,
> +};
> +
> +/**
> + * Flex item field tunnel mode
> + */
> +enum rte_flow_item_flex_tunnel_mode {
> +	FLEX_TUNNEL_MODE_FIRST = 0, /**< First item occurrence. */
> +	FLEX_TUNNEL_MODE_OUTER = 1, /**< Outer item. */
> +	FLEX_TUNNEL_MODE_INNER = 2  /**< Inner item. */ };
> +

The '}' should be on a new line.
If the item can be inner and outer, do we need to define two flex objects?
Also, why an enum and not defines?
From the API point of view I think it should have the following options:
mode_outer, mode_inner, mode_global and mode_tunnel.
Why is it per field and not per object?

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice  */
> +__extension__ struct rte_flow_item_flex_field {
> +	/** Defines how match field offset is calculated over the packet. */
> +	enum rte_flow_item_flex_field_mode field_mode;
> +	uint32_t field_size; /**< Match field size in bits. */

I think it will be better to remove the word Match.

> +	int32_t field_base; /**< Match field offset in bits. */

I think it will be better to remove the word Match.

> +	uint32_t offset_base; /**< Indirect offset field offset in bits. */

I think a better name will be offset_field /* the offset of the field that holds the offset that
should be used from the field_base */ what do you think?

Maybe just change from offset_base to offset?

> +	uint32_t offset_mask; /**< Indirect offset field bit mask. */

Maybe better wording?
The mask to apply to the value that is set in the offset_field.

> +	int32_t offset_shift; /**< Indirect offset multiply factor. */
> +	uint16_t tunnel_count:2; /**< 0-first occurrence, 1-outer, 2-inner.*/

I think this may result in some warning since you try to cast an enum to 2 bits.
Also, the same question as above: to support inner and outer, do we need
two objects?

> +	uint16_t rss_hash:1; /**< Field participates in RSS hash calculation. */

Please see my comment on the RSS; it is not clear how more than one flex item
can be created and the RSS will still work.

> +	uint16_t field_id; /**< device hint, for flows with multiple items. */

How should this be used? 
Should be capital D in device.

> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice  */
> +struct rte_flow_item_flex_link {
> +	/**
> +	 * Preceding/following header. The item type must be always
> provided.
> +	 * For preceding one item must specify the header value/mask to
> match
> +	 * for the link be taken and start the flex item header parsing.
> +	 */
> +	struct rte_flow_item item;
> +	/**
> +	 * Next field value to match to continue with one of the configured
> +	 * next protocols.
> +	 */
> +	uint32_t next;

Is this offset of the field or the value?

> +	/**
> +	 * Specifies whether flex item represents tunnel protocol
> +	 */
> +	bool tunnel;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice  */
> +struct rte_flow_item_flex_conf {
> +	/**
> +	 * The next header offset, it presents the network header size covered
> +	 * by the flex item and can be obtained with all supported offset
> +	 * calculating methods (fixed, dedicated field, bitmask, etc).
> +	 */
> +	struct rte_flow_item_flex_field next_header;

I think a better name will be size/len

> +	/**
> +	 * Specifies the next protocol field to match with link next protocol
> +	 * values and continue packet parsing with matching link.
> +	 */
> +	struct rte_flow_item_flex_field next_protocol;
> +	/**
> +	 * The fields will be sampled and presented for explicit match
> +	 * with pattern in the rte_flow_flex_item. There can be multiple
> +	 * fields descriptors, the number should be specified by sample_num.
> +	 */
> +	struct rte_flow_item_flex_field *sample_data;
> +	/** Number of field descriptors in the sample_data array. */
> +	uint32_t sample_num;

nb_samples?

> +	/**
> +	 * Input link defines the flex item relation with preceding
> +	 * header. It specified the preceding item type and provides pattern
> +	 * to match. The flex item will continue parsing and will provide the
> +	 * data to flow match in case if there is the match with one of input
> +	 * links.
> +	 */
> +	struct rte_flow_item_flex_link *input_link;
> +	/** Number of link descriptors in the input link array. */
> +	uint32_t input_num;
Nb_inputs
> +	/**
> +	 * Output link defines the next protocol field value to match and
> +	 * the following protocol header to continue packet parsing. Also
> +	 * defines the tunnel-related behaviour.
> +	 */
> +	struct rte_flow_item_flex_link *output_link;
> +	/** Number of link descriptors in the output link array. */
> +	uint32_t output_num;
> +};
> +
>  /**
>   * Action types.
>   *
> @@ -4288,6 +4451,71 @@ rte_flow_tunnel_item_release(uint16_t port_id,
>  			     struct rte_flow_item *items,
>  			     uint32_t num_of_items,
>  			     struct rte_flow_error *error);
> +
> +/**
> + * Create the flex item with specified configuration over
> + * the Ethernet device.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] conf
> + *   Item configuration.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is
> set.
> + */
> +__rte_experimental
> +struct rte_flow_item_flex_handle *
> +rte_flow_flex_item_create(uint16_t port_id,
> +			  const struct rte_flow_item_flex_conf *conf,
> +			  struct rte_flow_error *error);
> +
> +/**
> + * Release the flex item on the specified Ethernet device.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] handle
> + *   Handle of the item existing on the specified device.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_flex_item_release(uint16_t port_id,
> +			   const struct rte_flow_item_flex_handle *handle,
> +			   struct rte_flow_error *error);
> +
> +/**
> + * Modify the flex item on the specified Ethernet device.
> + *
> + * @param port_id
> + *   Port identifier of Ethernet device.
> + * @param[in] handle
> + *   Handle of the item existing on the specified device.
> + * @param[in] conf
> + *   Item new configuration.

Do you need to supply the full configuration for each update?
Maybe add a mask?

> + * @param[out] error
> + *   Perform verbose error reporting if not NULL. PMDs initialize this
> + *   structure in case of error only.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_flex_item_update(uint16_t port_id,
> +			  const struct rte_flow_item_flex_handle *handle,
> +			  const struct rte_flow_item_flex_conf *conf,
> +			  struct rte_flow_error *error);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> --
> 2.18.1

Best,
Ori


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (3 preceding siblings ...)
  2021-10-01 19:34 ` [dpdk-dev] [PATCH v2 00/14] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-11 18:15 ` Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 1/5] " Viacheslav Ovsiienko
                     ` (4 more replies)
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
                   ` (5 subsequent siblings)
  10 siblings, 5 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast: network structures are getting
more and more complicated and new application areas are
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends, and a glance at the RTE Flow API header shows that
multiple new items have been introduced over the years since
the initial release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols in advance; for example, many hardware vendors are
implementing flexible and configurable network protocol
parsers. As DPDK developers, could we anticipate the near
future in the same fashion and introduce similar flexibility
in the RTE Flow API?

Let's check what we already have merged in our project: there
is the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement a flow matching on the
header of some relatively new tunnel protocol, say on the
GENEVE header with variable length options. On further
consideration, however, we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, supporting tunnel protocols like the
aforementioned GENEVE with variable length options using the
flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the existing drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new network
protocol with RTE Flows. What is given within the protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On the item creation call, the PMD backing the
    specified ethernet device returns the opaque handle
    identifying the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release(), which is
    part of the ethernet device API; this destroys the flex item
    object in the PMD and releases the engaged hardware resources
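
The ordering of the steps above can be illustrated with a small
self-contained mock. This is only a sketch: every demo_* name is
hypothetical and none of it is the DPDK API. The rule that release
is refused while a flow still references the item is an assumption
of this sketch, used to make the "destroy flows first" ordering
visible; the actual PMD behaviour may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the opaque per-port flex item object. */
struct demo_flex_item {
    int created;   /* non-zero between create and release */
    int nb_flows;  /* flows currently referencing the item */
};

/* Step 1: create the item; returns a handle, or NULL if already created. */
struct demo_flex_item *demo_item_create(struct demo_flex_item *slot)
{
    if (slot == NULL || slot->created)
        return NULL;
    slot->created = 1;
    slot->nb_flows = 0;
    return slot;
}

/* Step 2: a flow may reference only an existing item. */
int demo_flow_create(struct demo_flex_item *handle)
{
    if (handle == NULL || !handle->created)
        return -1;
    handle->nb_flows++;
    return 0;
}

/* Step 3: flows are destroyed before the item is released. */
int demo_flow_destroy(struct demo_flex_item *handle)
{
    if (handle == NULL || handle->nb_flows == 0)
        return -1;
    handle->nb_flows--;
    return 0;
}

/* Step 4: in this sketch, release is refused while flows use the item. */
int demo_item_release(struct demo_flex_item *handle)
{
    if (handle == NULL || !handle->created || handle->nb_flows != 0)
        return -1;
    handle->created = 0;
    return 0;
}

/* Walk the whole life cycle; returns 0 when every step behaves as expected. */
int demo_life_cycle(void)
{
    struct demo_flex_item slot = { 0, 0 };
    struct demo_flex_item *handle = demo_item_create(&slot);

    if (handle == NULL || demo_flow_create(handle) != 0)
        return -1;
    if (demo_item_release(handle) == 0) /* must be refused: a flow exists */
        return -1;
    if (demo_flow_destroy(handle) != 0)
        return -1;
    if (demo_item_release(handle) != 0)
        return -1;
    assert(!slot.created); /* the item is gone after a successful release */
    return 0;
}
```

In a real application the create and release steps would be the
rte_flow_flex_item_create() and rte_flow_flex_item_release() calls,
and the flows would be regular RTE flows whose pattern contains the
flex item.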

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was
configured for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more convenient
for the application to maintain a single structure within a
single pattern buffer.
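
To make the "bit concatenation" rule concrete, the sketch below
computes where each configured field lands in the pattern buffer
and what the resulting length in bytes is. It is a minimal
illustration; demo_pattern_layout() is a hypothetical helper, not
part of the API, and it assumes fields are simply packed one after
another in configuration order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute where each configured field starts inside the concatenated
 * pattern buffer and return the total pattern length in bytes.
 * field_size_bits[i] is the width of field i in bits; pattern_bit_pos[i]
 * receives the bit position of field i within the pattern (may be NULL).
 */
uint32_t demo_pattern_layout(const uint32_t *field_size_bits,
                             size_t nb_fields,
                             uint32_t *pattern_bit_pos)
{
    uint32_t total_bits = 0;
    size_t i;

    assert(field_size_bits != NULL || nb_fields == 0);
    for (i = 0; i < nb_fields; i++) {
        if (pattern_bit_pos != NULL)
            pattern_bit_pos[i] = total_bits; /* fields keep their order */
        total_bits += field_size_bits[i];
    }
    return (total_bits + 7) / 8; /* round up to whole bytes */
}
```

For a 96-bit dummy field followed by a 32-bit sample field this
gives a 16-byte pattern with the sample field starting at bit 96,
so a four-dword header structure can serve directly as the pattern
template.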

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.
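
The offset modes can be sketched in plain C as below. This is an
illustration only: the demo_* names are hypothetical, the shift is
treated as a non-negative left shift applied to the indirect part
before field_base is added, and returning -1 for a dummy field is
just a convention of this sketch meaning "nothing is sampled from
the packet".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset modes described above. */
enum demo_field_mode {
    DEMO_FIELD_MODE_DUMMY = 0,
    DEMO_FIELD_MODE_FIXED,
    DEMO_FIELD_MODE_OFFSET,
    DEMO_FIELD_MODE_BITMASK,
};

/* Count the one bits in v (portable, no compiler builtins). */
uint32_t demo_bitcount(uint32_t v)
{
    uint32_t n = 0;

    while (v != 0) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Return the bit offset of a field from the header beginning.
 * indirect_value is the data already fetched from the packet at
 * offset_base (the "*offset_base" term in the formulas above).
 */
int64_t demo_field_bit_offset(enum demo_field_mode mode,
                              int32_t field_base,
                              uint32_t indirect_value,
                              uint32_t offset_mask,
                              uint32_t offset_shift)
{
    assert(offset_shift < 32);
    switch (mode) {
    case DEMO_FIELD_MODE_FIXED:
        return field_base;
    case DEMO_FIELD_MODE_OFFSET:
        return (int64_t)field_base +
               ((int64_t)(indirect_value & offset_mask) << offset_shift);
    case DEMO_FIELD_MODE_BITMASK:
        return (int64_t)field_base +
               ((int64_t)demo_bitcount(indirect_value & offset_mask)
                << offset_shift);
    case DEMO_FIELD_MODE_DUMMY:
    default:
        return -1; /* sketch convention: no packet data is sampled */
    }
}
```

For instance, with an IPv4-like first byte of 0x45, a mask of 0x0F
and a shift of 5 (dwords to bits, an assumption about the units),
the OFFSET mode places the next header 160 bits from the header
beginning.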

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
The input link specifies the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in a chained fashion - two or more flex items within
the same flow, which might be neighbors in the pattern - the
flex items reference each other. In this case, the item that
occurs first should be created with an empty output link list
or with a list including only existing items, and then the
second flex item should be created referencing the first flex
item as an input arc; drivers should adjust the item
configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
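
The effect of the spec/mask pair above can be emulated in software
as follows. This is a hedged sketch, not what a PMD does
internally: the demo_* names are hypothetical, plain uint32_t
stands in for the big-endian DPDK type, and the byte-wise
comparison only models "bits with a zero mask are ignored".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of the example header (options omitted). */
struct demo_new_protocol_header {
    uint32_t header_length;
    uint32_t specific0;
    uint32_t specific1;
    uint32_t crucial;
};

/* Store v at p in network (big-endian) byte order. */
void demo_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Return 1 when data equals spec on every bit selected by mask. */
int demo_flex_match(const uint8_t *data, const uint8_t *spec,
                    const uint8_t *mask, size_t length)
{
    size_t i;

    for (i = 0; i < length; i++)
        if ((data[i] ^ spec[i]) & mask[i])
            return 0;
    return 1;
}

/* Build the example spec/mask and test a header carrying the given value. */
int demo_match_crucial(uint32_t crucial_in_packet)
{
    uint8_t spec[sizeof(struct demo_new_protocol_header)] = { 0 };
    uint8_t mask[sizeof(spec)] = { 0 };
    uint8_t pkt[sizeof(spec)];
    const size_t off = offsetof(struct demo_new_protocol_header, crucial);

    /* The 96-bit dummy field covers the three dwords before "crucial". */
    assert(off * 8 == 96);
    demo_put_be32(spec + off, 0x00112233);
    demo_put_be32(mask + off, 0xFFFFFFFF);
    memset(pkt, 0xA5, sizeof(pkt)); /* arbitrary data in unmatched fields */
    demo_put_be32(pkt + off, crucial_in_packet);
    return demo_flex_match(pkt, spec, mask, sizeof(spec));
}
```

Only the bytes of the crucial field carry a non-zero mask, so the
contents of header_length, specific0 and specific1 do not affect
the result.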

7. Notes:

 - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

 - v2 -> v3:
   - comments addressed
   - flex item update removed as not supported
   - RSS over flex item fields removed as not supported and non-complete
     API
   - tunnel mode configuration refactored
   - testpmd updated
   - documentation updated
   - PMD patches are removed temporarily (update is WIP, to be presented in rc2)

 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (4):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API
  app/testpmd: add jansson library
  app/testpmd: add flex item CLI commands

Viacheslav Ovsiienko (1):
  ethdev: introduce configurable flexible item

 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 763 +++++++++++++++++++-
 app/test-pmd/meson.build                    |   5 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  19 +
 doc/guides/prog_guide/rte_flow.rst          |  25 +
 doc/guides/rel_notes/release_21_11.rst      |   7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 lib/ethdev/rte_flow.c                       | 123 +++-
 lib/ethdev/rte_flow.h                       | 222 ++++++
 lib/ethdev/rte_flow_driver.h                |   8 +
 lib/ethdev/version.map                      |   4 +
 12 files changed, 1284 insertions(+), 15 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v3 1/5] ethdev: introduce configurable flexible item
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-11 18:15   ` Viacheslav Ovsiienko
  2021-10-12  6:41     ` Ori Kam
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast: network structures are getting
more and more complicated and new application areas are
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends, and a glance at the RTE Flow API header shows that
multiple new items have been introduced over the years since
the initial release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols in advance; for example, many hardware vendors are
implementing flexible and configurable network protocol
parsers. As DPDK developers, could we anticipate the near
future in the same fashion and introduce similar flexibility
in the RTE Flow API?

Let's check what we already have merged in our project: there
is the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement a flow matching on the
header of some relatively new tunnel protocol, say on the
GENEVE header with variable length options. On further
consideration, however, we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, supporting tunnel protocols like the
aforementioned GENEVE with variable length options using the
flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the existing drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new
network protocol with RTE Flows. The protocol specification
provides:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The intended way to operate with the flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the
    configuration derived from the protocol specification. The
    call creates the flex item object on the specified ethernet
    device and prepares the PMD and underlying hardware to handle
    the flex item. On item creation, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with the obtained handle
    in the flows; the values/masks to match with fields in the
    header are specified in the flex item per flow, as for regular
    items (except that the pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release(), which is part
    of the ethernet device API, to destroy the flex item object in
    the PMD and release the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
number. If byte boundary alignment is needed, an application
can use a dummy type field, which is just a gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more convenient for
the application to maintain a single structure within a
single pattern buffer.
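
As a rough illustration of how the length relates to the
configured fields, the sketch below sums the field widths in
bits and rounds the total up to whole bytes. The helper name
sketch_pattern_bytes is hypothetical and not part of the
proposed API, and rounding a partially filled last byte up is
an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the proposed API: returns the number
 * of bytes needed to hold the bit concatenation of all configured fields.
 * Assumption: a partially used trailing byte counts as a full byte.
 */
static uint32_t
sketch_pattern_bytes(const uint32_t *widths_bits, size_t nb_fields)
{
	uint64_t total_bits = 0;
	size_t i;

	for (i = 0; i < nb_fields; i++)
		total_bits += widths_bits[i];
	return (uint32_t)((total_bits + 7) / 8);
}
```

For two fields of 96 and 32 bits the helper yields 16 bytes.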

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are proposed to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from another header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.
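
The offset modes above can be sketched in plain C as follows.
This is only an illustration under stated assumptions: all
sketch_* names are hypothetical, the indirect value is assumed
to be already sampled from the packet and converted to host
byte order, the shift is assumed to apply to the masked value
only (before field_base is added), and offset_shift is assumed
to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset modes; not the proposed API types. */
enum sketch_field_mode {
	SKETCH_MODE_DUMMY = 0,
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

struct sketch_field {
	enum sketch_field_mode mode;
	int32_t field_base;   /* base offset in bits */
	uint32_t offset_mask; /* mask applied to the indirect value */
	int32_t offset_shift; /* left shift applied after masking */
};

/* Count the one bits in v. */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/*
 * Returns the field offset in bits, or -1 for a dummy field, which has
 * no packet offset. "indirect" stands for *offset_base in the formulas.
 */
static int64_t
sketch_field_offset(const struct sketch_field *f, uint32_t indirect)
{
	uint32_t v = indirect & f->offset_mask;

	/* Assumption: only non-negative shifts below 32 are handled. */
	assert(f->offset_shift >= 0 && f->offset_shift < 32);
	switch (f->mode) {
	case SKETCH_MODE_FIXED:
		return f->field_base;
	case SKETCH_MODE_OFFSET:
		return f->field_base + ((int64_t)v << f->offset_shift);
	case SKETCH_MODE_BITMASK:
		return f->field_base +
		       ((int64_t)sketch_bitcount(v) << f->offset_shift);
	default:
		return -1;
	}
}
```

For an IPv4-like header whose first byte is 0x45, a mask of 0xF
and a shift of 5 give 5 << 5 = 160 bits, that is, the 20-byte
header length.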

The input link configuration section tells the driver after
what protocols and under what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. If the flex item can follow multiple header
types, multiple input links should be specified. At flow
creation time an item with one of the input link types should
precede the flex item, and the driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items
in chained fashion (two or more flex items within the same
flow, possibly neighbors in the pattern), the flex items
reference each other. In this case, the item that occurs
first should be created with an empty output link list or
with a list including existing items, and then the second
flex item should be created referencing the first flex item
as an input arc; drivers should adjust the item
configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If multiple flex items are
supposed to be used within the same flow, it is useful to
give the driver a hint that these items are intended for
simultaneous usage. The fields of the items should be
assigned hint indices, and the indices of fields from two
or more flex items that are supposed to appear in the same
flow should be the same. In other words, the field hint
index specifies the group of fields that can be matched
simultaneously within a single flow. If hint indices are
specified, the driver will try to engage non-overlapping
hardware resources and provide independent handling of the
field groups with unique indices. If the hint index is zero,
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we need to handle a new tunnel protocol that
follows the UDP header with destination port 0xFADE and is
followed by a MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .tunnel = FLEX_TUNNEL_MODE_SINGLE,
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and the PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  25 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_flow.h                  | 222 +++++++++++++++++++++++++
 3 files changed, 254 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..3f40b8891c 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,31 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches with the custom network protocol header that was created
+using rte_flow_flex_item_create() API. The application describes
+the desired header structure, defines the header fields attributes
+and header relations with preceding and following protocols and
+configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  considered by the driver as padded with trailing zeroes till the full
+  configured item pattern length.
+- ``pattern``: pattern to match. The pattern is concatenation of bit fields
+  configured at item creation. At configuration the fields are presented
+  by sample_data array. The order of the bitfields is defined by the order
+  of sample_data elements. The width of each bitfield is defined by the width
+  specified in the corresponding sample_data element as well. If pattern
+  length is smaller than configured fields overall length it is considered
+  as padded with trailing zeroes up to full configured length, both for
+  value and mask.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 73e377a007..4b8cac60d4 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..fb226d9f52 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -574,6 +574,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,177 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset from the header beginning.
+ *
+ * The pattern is concatenation of bit fields configured at item creation
+ * by rte_flow_flex_item_create(). At configuration the fields are presented
+ * by sample_data array.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	/**
+	 * The protocol header can be present in the packet only once.
+	 * No multiple flex item flow inclusions (for inner/outer) are allowed.
+	 * No relations with tunnel protocols are imposed. The drivers
+	 * can optimize hardware resource usage to handle match on single flex
+	 * item of specific type.
+	 */
+	FLEX_TUNNEL_MODE_SINGLE = 0,
+	/**
+	 * Flex item presents outer header only.
+	 */
+	FLEX_TUNNEL_MODE_OUTER,
+	/**
+	 * Flex item presents inner header only.
+	 */
+	FLEX_TUNNEL_MODE_INNER,
+	/**
+	 * Flex item presents either inner or outer header. The driver
+	 * handles as many multiple inners as hardware supports.
+	 */
+	FLEX_TUNNEL_MODE_MULTI,
+	/**
+	 * Flex item presents tunnel protocol header.
+	 */
+	FLEX_TUNNEL_MODE_TUNNEL,
+};
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Field size in bits. */
+	int32_t field_base; /**< Field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint32_t field_id:16; /**< Device hint, for multiple items in flow. */
+	uint32_t reserved:16; /**< Reserved field. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * Specifies the flex item and tunnel relations and tells the PMD
+	 * whether flex item can be used for inner, outer or both headers,
+	 * or whether flex item presents the tunnel protocol itself.
+	 */
+	enum rte_flow_item_flex_tunnel_mode tunnel;
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_item_flex. There can be multiple
+	 * fields descriptors, the number should be specified by nb_samples.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t nb_samples;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specifies the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match if there is a match with one of the input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t nb_inputs;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t nb_outputs;
+};
+
 /**
  * Action types.
  *
@@ -4288,6 +4468,48 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 1/5] " Viacheslav Ovsiienko
@ 2021-10-11 18:15   ` Viacheslav Ovsiienko
  2021-10-12  7:53     ` Ori Kam
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides the RAW item type for packet patterns of variable
length. The RAW item structure has fixed-size members that describe the
variable pattern length and the methods to process it.

A new RTE flow item type with a variable-length pattern that does not
fit the RAW item meta description cannot use the RAW item.
For example, a new flow item that references a 64-bit PMD handle
cannot be described by the RAW item.

The patch allows RTE conv helper functions to process custom flow
items with a variable-length pattern.
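
The general technique can be sketched outside of DPDK as shown
below: the fixed-size item is copied first, the variable-length
pattern is placed right behind it, and the pattern pointer in
the copy is redirected to the new location. This is a simplified
standalone illustration, not the code of this patch; the
sketch_* names are hypothetical, and the caller is assumed to
pass a destination buffer suitably aligned for the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an item with a variable-length pattern. */
struct sketch_flex_item {
	void *handle;
	uint32_t length;        /* pattern length in bytes */
	const uint8_t *pattern; /* variable-length pattern */
};

/*
 * Returns the number of bytes needed for a deep copy of src. The copy is
 * performed only if buf is not NULL and size is large enough, so calling
 * with buf == NULL acts as a size query.
 */
static size_t
sketch_flex_copy(void *buf, size_t size, const struct sketch_flex_item *src)
{
	size_t need;

	assert(src != NULL);
	need = sizeof(*src) + src->length;
	if (buf != NULL && size >= need) {
		struct sketch_flex_item *dst = buf;

		memcpy(dst, src, sizeof(*src));
		/* The pattern bytes live right after the fixed part. */
		dst->pattern = (const uint8_t *)(dst + 1);
		if (src->length > 0)
			memcpy(dst + 1, src->pattern, src->length);
	}
	return need;
}
```

A caller would typically query the size with a NULL buffer,
allocate that many bytes, and then call the helper again to
perform the copy.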

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c | 83 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 70 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..100983ca59 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,67 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * allow PMD private flow item
+	 * see 5d1bff8fe2
+	 * "ethdev: allow negative values in flow rule types"
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -100,6 +154,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -107,8 +163,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +592,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +695,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v3 3/5] ethdev: implement RTE flex item API
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 1/5] " Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-11 18:15   ` Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flex item API was introduced in
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in port hardware and to offload flows that match
the custom header elements.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 40 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h |  8 ++++++++
 lib/ethdev/version.map       |  4 ++++
 3 files changed, 52 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 100983ca59..a858dc31e3 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1323,3 +1323,43 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..34a5a5bcd0 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,14 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..ec3b66d7a1 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,10 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v3 4/5] app/testpmd: add jansson library
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-11 18:15   ` Viacheslav Ovsiienko
  2021-10-12  7:56     ` Ori Kam
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  4 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Testpmd interactive mode provides a CLI to configure application
commands. Testpmd reads CLI commands and parameters from STDIN and
converts the input into C objects with its internal parser.
The patch adds an optional jansson dependency to testpmd.
With jansson, testpmd can read input in JSON format from STDIN or an
input file and convert it into C objects using jansson library calls.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/meson.build | 5 +++++
 app/test-pmd/testpmd.h   | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..3a8babd604 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 5863b2f43f..876a341cf0 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
-- 
2.18.1



* [dpdk-dev] [PATCH v3 5/5] app/testpmd: add flex item CLI commands
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-11 18:15   ` Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-11 18:15 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with a fixed number of
supported network protocols. If an application must work with a
protocol that is not supported by the port hardware by default, it
can try to add the new protocol to the port hardware.

Flex item, or flex parser, is port infrastructure that allows an
application to add support for a custom network header and
offload flows that match the header elements.

An application must complete the following tasks to create a flow
rule that matches a custom header:

1. Create a flex item object in the port hardware.
The application must provide the custom header configuration to the PMD.
The PMD uses that configuration to create the flex item object in
the port hardware.

2. Create flex patterns to match. A flex pattern has spec and mask
components, like a regular flow item. Combined together, spec and mask
can target a unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handle value.

3. Create a flow rule with a flex flow item that references
the flex pattern.
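
To illustrate step 2, here is a minimal sketch of how a spec/mask pair
selects data: a byte sequence matches when it agrees with spec on every
bit that is set in mask, which is the usual rte_flow spec/mask convention.
The helper name flex_pattern_match() is hypothetical and not part of this
patch; the real matching is done by the port hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: return true when every bit
 * selected by mask has the same value in data and in spec.
 */
static bool
flex_pattern_match(const uint8_t *data, const uint8_t *spec,
		   const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* XOR exposes differing bits, AND keeps only the masked ones. */
		if ((data[i] ^ spec[i]) & mask[i])
			return false;
	}
	return true;
}
```

An all-ones mask targets a single data sequence, while cleared mask bits
act as wildcards and let one pattern cover a number of sequences.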

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec_data> mask <mask_data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...
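
The spec and mask data arguments are hex strings with exactly two
characters per byte, in network order (see the comment above
flex_pattern_data() in the diff below). A minimal sketch of such a
conversion follows; hex_nibble() and hex_to_bytes() are hypothetical
helpers written for illustration and are not the code from this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one hex digit, or return -1 if invalid. */
static int
hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: convert a hex string such as "4500" into bytes.
 * Returns the number of bytes written, or 0 when the string is empty,
 * has an odd length, contains a non-hex character, or needs more than
 * cap bytes.
 */
static size_t
hex_to_bytes(const char *str, uint8_t *out, size_t cap)
{
	size_t i, len = strlen(str);

	if (len == 0 || (len & 1) || len / 2 > cap)
		return 0;
	for (i = 0; i < len / 2; i++) {
		int hi = hex_nibble(str[2 * i]);
		int lo = hex_nibble(str[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return 0;
		out[i] = (uint8_t)((hi << 4) | lo);
	}
	return len / 2;
}
```

With the "is" form of the command the patch builds the mask itself: it
sets as many 0xFF bytes as the spec is long, so the whole spec must match.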

This patch uses the jansson library API.
Jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 763 +++++++++++++++++++-
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  16 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 5 files changed, 900 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index a9efd027c3..a673e6ef08 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17822,6 +17822,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bb22294dd3..7c817f716c 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,12 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -306,6 +314,9 @@ enum index {
 	ITEM_POL_PORT,
 	ITEM_POL_METER,
 	ITEM_POL_POLICY,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -844,6 +855,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments. */
 	} args; /**< Command arguments. */
 };
 
@@ -871,6 +887,13 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -1000,6 +1023,7 @@ static const enum index next_item[] = {
 	ITEM_GENEVE_OPT,
 	ITEM_INTEGRITY,
 	ITEM_CONNTRACK,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1368,6 +1392,13 @@ static const enum index item_integrity_lv[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1724,6 +1755,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1840,6 +1874,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -1904,6 +1940,17 @@ static int comp_set_modify_field_op(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
 static int comp_set_modify_field_id(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static void flex_item_create(portid_t port_id, uint16_t flex_id,
+			     const char *filename);
+static void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+
+static struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+static struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -2040,6 +2087,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2056,7 +2117,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2168,6 +2230,41 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3608,6 +3705,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_conntrack, flags)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -6999,6 +7117,43 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7661,6 +7816,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8167,6 +8387,13 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8618,6 +8845,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
@@ -8800,3 +9032,532 @@ cmdline_parse_inst_t cmd_show_set_raw_all = {
 		NULL,
 	},
 };
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strncmp(key, pattern, strlen(key)) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+static void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message ? error.message : "");
+	}
+}
+
+static int
+flex_tunnel_parse(json_t *jtun, enum rte_flow_item_flex_tunnel_mode *tunnel)
+{
+	int tun = -1;
+
+	if (json_is_integer(jtun))
+		tun = (int)json_integer_value(jtun);
+	else if (json_is_real(jtun))
+		tun = (int)json_real_value(jtun);
+	else if (json_is_string(jtun)) {
+		const char *mode = json_string_value(jtun);
+
+		if (match_strkey(mode, "FLEX_TUNNEL_MODE_SINGLE"))
+			tun = FLEX_TUNNEL_MODE_SINGLE;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_OUTER"))
+			tun = FLEX_TUNNEL_MODE_OUTER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_INNER"))
+			tun = FLEX_TUNNEL_MODE_INNER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_MULTI"))
+			tun = FLEX_TUNNEL_MODE_MULTI;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_TUNNEL"))
+			tun = FLEX_TUNNEL_MODE_TUNNEL;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	*tunnel = (enum rte_flow_item_flex_tunnel_mode)tun;
+	return 0;
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *pattern, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	snprintf(flow_rule, sizeof(flow_rule), "flow create 0 pattern %s / end", pattern);
+	pattern = flow_rule;
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, pattern, (void *)data, sizeof(data));
+		if (ret > 0) {
+			pattern += ret;
+			while (isspace(*pattern))
+				pattern++;
+		}
+	} while (ret > 0 && strlen(pattern));
+	if (ret >= 0 && !strlen(pattern)) {
+		struct buffer *pbuf = (struct buffer *)(uintptr_t)data;
+		struct rte_flow_item *src = pbuf->args.vc.pattern;
+
+		item->type = src->type;
+		if (src->spec) {
+			ptr = (void *)(uintptr_t)item->spec;
+			memcpy(ptr, src->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->spec = NULL;
+		}
+		if (src->mask) {
+			ptr = (void *)(uintptr_t)item->mask;
+			memcpy(ptr, src->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->mask = NULL;
+		}
+		if (src->last) {
+			ptr = (void *)(uintptr_t)item->last;
+			memcpy(ptr, src->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->last = NULL;
+		}
+		ret = 0;
+	}
+	cmd_flow_context = saved_flow_ctx;
+	return ret;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "tunnel")) {
+			ret = flex_tunnel_parse(jobj, &flex_conf->tunnel);
+			if (ret) {
+				printf("Can't parse tunnel value\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_samples = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_inputs = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_outputs = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+#define ALIGN(x) (((x) + sizeof(uintptr_t) - 1) & ~(sizeof(uintptr_t) - 1))
+
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+	base_size = ALIGN(sizeof(*conf));
+	samples_size = ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+			     sizeof(conf->sample_data[0]));
+	links_size = ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			   sizeof(conf->input_link[0]));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	json_error_t json_error;
+	json_t *jroot = NULL;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	jroot = json_load_file(filename, 0, &json_error);
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+	if (jroot)
+		json_decref(jroot);
+}
+
+#else /* RTE_HAS_JANSSON */
+static void flex_item_create(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id,
+			     __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+static void flex_item_destroy(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+void
+port_flex_item_flush(portid_t port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < FLEX_MAX_PARSERS_NUM; i++) {
+		flex_item_destroy(port_id, i);
+		flex_items[port_id][i] = NULL;
+	}
+}
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length does not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..26357bc6e3 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2886,6 +2886,7 @@ close_port(portid_t pid)
 
 		if (is_proc_primary()) {
 			port_flow_flush(pi);
+			port_flex_item_flush(pi);
 			rte_eth_dev_close(pi);
 		}
 	}
@@ -4017,7 +4018,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 876a341cf0..3437d7607d 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -282,6 +282,19 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -306,6 +319,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
@@ -1026,6 +1041,7 @@ uint16_t tx_pkt_set_dynf(uint16_t port_id, __rte_unused uint16_t queue,
 void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_jumbo_frame_offload(portid_t portid);
+void port_flex_item_flush(portid_t port_id);
 
 /*
  * Work-around of a compilation error with ICC on invocations of the
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index bbef706374..5efc626260 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5091,3 +5091,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show functions that configure and create a flex item object,
+create a flex pattern, and use it in a flow rule.
+The examples use the 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | DW0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | DW1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | DW2
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | DW3
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | DW4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration. The
+header configuration is compiled by testpmd and stored in a variable of type
+``rte_flow_item_flex_conf``.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item init 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+The flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specifies whether the new header has fixed or variable length, and the basic/minimal
+header length value.
+
+If the header length is not fixed, the configuration must also give the location of the
+header field whose value completes the length calculation, and the scale/offset function.
+
+The scale function depends on the port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the type of the following network header.
+
+**Flow match samples.**
+
+Describes locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on the port hardware.
+
+**Input trigger.**
+
+Describes the preceding network header configuration.
+
+**Output trigger.**
+
+Describes the conditions that trigger the transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item samples configuration.
+A flex pattern can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20 bytes length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> flow flex_item pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> flow flex_item pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1



* Re: [dpdk-dev] [PATCH v3 1/5] ethdev: introduce configurable flexible item
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 1/5] " Viacheslav Ovsiienko
@ 2021-10-12  6:41     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-12  6:41 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon


Hi Slava,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Monday, October 11, 2021 9:15 PM
> Subject: [dpdk-dev] [PATCH v3 1/5] ethdev: introduce configurable flexible item
> 
> 1. Introduction and Retrospective
> 
> Nowadays the networks are evolving fast and wide, the network structures are getting more and more
> complicated, the new application areas are emerging. To address these challenges the new network
> protocols are continuously being developed, considered by technical communities, adopted by industry
> and, eventually implemented in hardware and software. The DPDK framework follows the common
> trends and if we bother to glance at the RTE Flow API header we see the multiple new items were
> introduced during the last years since the initial release.
> 
> The new protocol adoption and implementation process is not straightforward and takes time, the new
> protocol passes development, consideration, adoption, and implementation phases. The industry tries to
> mitigate and address the forthcoming network protocols, for example, many hardware vendors are
> implementing flexible and configurable network protocol parsers. As DPDK developers, could we
> anticipate the near future in the same fashion and introduce the similar flexibility in RTE Flow API?
> 
> Let's check what we already have merged in our project, and we see the nice raw item
> (rte_flow_item_raw). At the first glance, it looks superior and we can try to implement a flow matching on
> the header of some relatively new tunnel protocol, say on the GENEVE header with variable length
> options. And, under further consideration, we run into the raw item
> limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>   (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>   might be triggered), and actually is not supported
>   by existing PMDs
> - no explicitly specified relations with preceding
>   and following items
> - no tunnel hint support
> 
> As the result, implementing the support for tunnel protocols like aforementioned GENEVE with variable
> extra protocol option with flow raw item becomes very complicated and would require multiple flows and
> multiple raw items chained in the same flow (by the way, there is no support found for chained raw items
> in implemented drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle matches with existing and new
> network protocol headers in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there are the requirements to support the new network protocol with RTE Flows. What is
> given within protocol
> specification:
> 
>   - header format
>   - header length, (can be variable, depending on options)
>   - potential presence of extra options following or included
>     in the header
>   - the relations with preceding protocols. For example,
>     the GENEVE follows UDP, eCPRI can follow either UDP
>     or L2 header
>   - the relations with following protocols. For example,
>     the next layer after tunnel header can be L2 or L3
>   - whether the new protocol is a tunnel and the header
>     is a splitting point between outer and inner layers
> 
> The supposed way to operate with flex item:
> 
>   - application defines the header structures according to
>     protocol specification
> 
>   - application calls rte_flow_flex_item_create() with desired
>     configuration according to the protocol specification, it
>     creates the flex item object over specified ethernet device
>     and prepares PMD and underlying hardware to handle flex
>     item. On item creation call PMD backing the specified
>     ethernet device returns the opaque handle identifying
>     the object that has been created
> 
>   - application uses the rte_flow_item_flex with obtained handle
>     in the flows, the values/masks to match with fields in the
>     header are specified in the flex item per flow as for regular
>     items (except that pattern buffer combines all fields)
> 
>   - flows with flex items match with packets in a regular fashion,
>     the values and masks for the new protocol header match are
>     taken from the flex items in the flows
> 
>   - application destroys flows with flex items
> 
>   - application calls rte_flow_flex_item_release() as part of
>     ethernet device API and destroys the flex item object in
>     PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow pattern like regular RTE flow items and
> provides the mask and value to match with fields of the protocol the item was configured for.
> 
>   struct rte_flow_item_flex {
>     void *handle;
>     uint32_t length;
>     const uint8_t* pattern;
>   };
> 
> The handle is some opaque object maintained on per device basis by underlying driver.
> 
> The protocol header fields are considered as bit fields, all offsets and widths are expressed in bits. The
> pattern is the buffer containing the bit concatenation of all the fields presented at item configuration time,
> in the same order and same amount. If byte boundary alignment is needed an application can use a
> dummy type field, this is just some kind of gap filler.
> 
> The length field specifies the pattern buffer length in bytes and is needed to allow rte_flow_copy()
> operations. The approach of multiple pattern pointers and lengths (per field) was considered and found
> clumsy - it seems to be much more suitable for the application to maintain the single structure within the single
> pattern buffer.
> 
> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>   - header field descriptors:
>     - next header
>     - next protocol
>     - sample to match
>   - input link descriptors
>   - output link descriptors
> 
> The field descriptors tell the driver and hardware what data should be extracted from the packet and then
> control the packet handling in the flow engine. Besides this, sample fields can be presented to match with
> patterns in the flows. Each field is a bit pattern.
> It has width, offset from the header beginning, mode of offset calculation, and offset related parameters.
> 
> The next header field is special, no data are actually taken from the packet, but its offset is used as a
> pointer to the next header in the packet, in other words the next header offset specifies the size of the
> header being parsed by flex item.
> 
> There is one more special field - next protocol, it specifies where the next protocol identifier is contained
> and packet data sampled from this field will be used to determine the next protocol header type to
> continue packet parsing. The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6
> headers.
> 
> The sample fields are used to represent the data to be sampled from the packet and then matched with
> established flows.
> 
> There are several methods supposed to calculate field offset in runtime depending on configuration and
> packet content:
> 
>   - FIELD_MODE_FIXED - fixed offset. The bit offset from
>     header beginning is permanent and defined by field_base
>     configuration parameter.
> 
>   - FIELD_MODE_OFFSET - the field bit offset is extracted
>     from other header field (indirect offset field). The
>     resulting field offset to match is calculated as:
> 
>   field_base + ((*offset_base & offset_mask) << offset_shift)
> 
>     This mode is useful to sample some extra options following
>     the main header with field containing main header length.
>     Also, this mode can be used to calculate offset to the
>     next protocol header, for example - IPv4 header contains
>     the 4-bit field with IPv4 header length expressed in dwords.
>     One more example - this mode would allow us to skip GENEVE
>     header variable length options.
> 
>   - FIELD_MODE_BITMASK - the field bit offset is extracted
>     from other header field (indirect offset field), the latter
>     is considered as bitmask containing some number of one bits,
>     the resulting field offset to match is calculated as:
> 
>   field_base + (bitcount(*offset_base & offset_mask) << offset_shift)
> 
>     This mode would be useful to skip the GTP header and its
>     extra options with specified flags.
> 
>   - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>     boundary alignment in pattern. Pattern mask and data are
>     ignored in the match. All configuration parameters besides
>     field size and offset are ignored.
> 
>   Note:  "*" - means the indirect field offset is calculated
>   and actual data are extracted from the packet by this
>   offset (like data are fetched by pointer *p from memory).
> 
> The offset mode list can be extended by vendors according to hardware supported options.
> 
> The input link configuration section tells the driver after what protocols and at what conditions the flex
> item can follow.
> The input link specifies the preceding header pattern; for example, for GENEVE it can be a UDP item specifying
> match on destination port with value 6081. The flex item can follow multiple header types and multiple
> input links should be specified. At flow creation time the item with one of the input link types should
> precede the flex item and driver will select the correct flex item settings, depending on the actual flow
> pattern.
> 
> The output link configuration section tells the driver how to continue packet parsing after the flex item
> protocol.
> If multiple protocols can follow the flex item header the flex item should contain the field with the next
> protocol identifier and the parsing will be continued depending on the data contained in this field in the
> actual packet.
> 
> The flex item fields can participate in RSS hash calculation, the dedicated flag is present in the field
> description to specify what fields should be provided for hashing.
> 
> 5. Flex Item Chaining
> 
> If there are multiple protocols supposed to be supported with flex items in chained fashion - two or more
> flex items within the same flow and these ones might be neighbors in the pattern, it means the flex items
> are mutual referencing.  In this case, the item that occurred first should be created with empty output link
> list or with the list including existing items, and then the second flex item should be created referencing the
> first flex item as input arc, drivers should adjust the item configuration.
> 
> Also, the hardware resources used by flex items to handle the packet can be limited. If there are multiple
> flex items that are supposed to be used within the same flow it would be nice to provide some hint for the
> driver that these two or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices and these indices from two or more flex items
> supposed to be provided within the same flow should be the same as well. In other words, the field hint
> index specifies the group of fields that can be matched simultaneously within a single flow. If hint indices
> are specified, the driver will try to engage not overlapping hardware resources and provide independent
> handling of the field groups with unique indices. If the hint index is zero the driver assigns resources on its
> own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we have the requirements to handle the new tunnel protocol that follows UDP header with
> destination port 0xFADE and is followed by MAC header. Let the new protocol header format be like this:
> 
>   struct new_protocol_header {
>     rte_be32 header_length; /* length in dwords, including options */
>     rte_be32 specific0;     /* some protocol data, no intention */
>     rte_be32 specific1;     /* to match in flows on these fields */
>     rte_be32 crucial;       /* data of interest, match is needed */
>     rte_be32 options[0];    /* optional protocol data, variable length */
>   };
> 
> The supposed flex item configuration:
> 
>   struct rte_flow_item_flex_field field0 = {
>     .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>     .field_size = 96,                /* three dwords from the beginning */
>   };
>   struct rte_flow_item_flex_field field1 = {
>     .field_mode = FIELD_MODE_FIXED,
>     .field_size = 32,       /* Field size is one dword */
>     .field_base = 96,       /* Skip three dwords from the beginning */
>   };
>   struct rte_flow_item_udp spec0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFADE),
>     }
>   };
>   struct rte_flow_item_udp mask0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFFFF),
>     }
>   };
>   struct rte_flow_item_flex_link link0 = {
>     .item = {
>        .type = RTE_FLOW_ITEM_TYPE_UDP,
>        .spec = &spec0,
>        .mask = &mask0,
>     },
>   };
> 
>   struct rte_flow_item_flex_conf conf = {
>     .next_header = {
>       .tunnel = FLEX_TUNNEL_MODE_SINGLE,
>       .field_mode = FIELD_MODE_OFFSET,
>       .field_base = 0,
>       .offset_base = 0,
>       .offset_mask = 0xFFFFFFFF,
>       .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
>     },
>     .sample = {
>        &field0,
>        &field1,
>     },
>     .nb_samples = 2,
>     .input_link[0] = &link0,
>     .nb_inputs = 1
>   };
> 
> Let's suppose we have created the flex item successfully, and PMD returned the handle 0x123456789A.
> We can use the following item pattern to match the crucial field in the packet with value 0x00112233:
> 
>   struct new_protocol_header spec_pattern =
>   {
>     .crucial = RTE_BE32(0x00112233),
>   };
>   struct new_protocol_header mask_pattern =
>   {
>     .crucial = RTE_BE32(0xFFFFFFFF),
>   };
>   struct rte_flow_item_flex spec_flex = {
>     .handle = 0x123456789A,
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &spec_pattern,
>   };
>   struct rte_flow_item_flex mask_flex = {
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &mask_pattern,
>   };
>   struct rte_flow_item item_to_match = {
>     .type = RTE_FLOW_ITEM_TYPE_FLEX,
>     .spec = &spec_flex,
>     .mask = &mask_flex,
>   };
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---

Acked-by: Ori Kam <orika@nvidia.com>
Thanks,
Ori


* Re: [dpdk-dev] [PATCH v2 01/14] ethdev: introduce configurable flexible item
  2021-10-07 11:08     ` Ori Kam
@ 2021-10-12  6:42       ` Slava Ovsiienko
  0 siblings, 0 replies; 73+ messages in thread
From: Slava Ovsiienko @ 2021-10-12  6:42 UTC (permalink / raw)
  To: Ori Kam, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon

Hi, Ori

Thank you very much for the review, I found some of your comments extremely useful.
Please, see below

> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Thursday, October 7, 2021 14:08
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; Gregory Etelson
> <getelson@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>
> Subject: RE: [PATCH v2 01/14] ethdev: introduce configurable flexible item
> 
> Hi Slava,
> 
> > -----Original Message-----
> > From: Slava Ovsiienko <viacheslavo@nvidia.com>
> > Sent: Friday, October 1, 2021 10:34 PM
> > Subject: [PATCH v2 01/14] ethdev: introduce configurable flexible item
> >
> > 1. Introduction and Retrospective

.. snip ..
> >
> > The length field specifies the pattern buffer length in bytes and is
> > needed to allow rte_flow_copy() operations. The approach of multiple
> > pattern pointers and lengths (per field) was considered and found
> > clumsy - it seems to be much suitable for the application to maintain
> > the single structure within the single pattern buffer.
> >
> 
> I think that the main thing that is unclear to me and I think I understand it
> from reading the code is that the pattern is the entire flex header structure.
> maybe a better word will be header?

"pattern is the entire flex header structure" - it is not completely correct, sorry
Pattern represents the set of fields of the header. Yes, usually it coincides
with the entire header (it is just simpler for understanding). But it does not have to!
The flex item can be constructed in a more generic way. It can include fields
in arbitrary order (in general, we may not follow the strict field order in the
header while defining the flex item), and a field can be split into subfields
and reordered. Theoretically it even allows doing many interesting things,
say byte order conversion - the item will sample byte subfields into a single
integer field in the desired host endianness.

Other possibility is to gather some split fields (say we have some offset
split into multiple locations in the header) into one. Yes, for this case
pattern will not correspond to the header structure 1-to-1, but it is not
required. 1-to-1 pattern-to-header mapping is just most straightforward
way to operate, but it is not the only possible one.
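
To illustrate the "pattern is a concatenation of the configured fields" idea,
below is a minimal sketch in plain C of how an application could pack field
values, in configuration order, into one pattern buffer. The names
flex_pattern_writer, flex_writer_init() and flex_writer_put() are hypothetical
helpers made up for this illustration, they are not part of the proposed API,
and the MSB-first bit order inside a byte is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper state: a pattern buffer filled field by field. */
struct flex_pattern_writer {
	uint8_t *buf;    /* pattern buffer owned by the caller */
	size_t cap_bits; /* buffer capacity in bits */
	size_t pos_bits; /* number of bits written so far */
};

static void
flex_writer_init(struct flex_pattern_writer *w, uint8_t *buf, size_t len)
{
	memset(buf, 0, len);
	w->buf = buf;
	w->cap_bits = len * 8;
	w->pos_bits = 0;
}

/*
 * Append the low `width` bits of `value` to the pattern.
 * Assumption: bit 0 of the pattern is the MSB of byte 0.
 */
static void
flex_writer_put(struct flex_pattern_writer *w, uint32_t value,
		unsigned int width)
{
	unsigned int i;

	assert(width >= 1 && width <= 32);
	assert(w->pos_bits + width <= w->cap_bits);
	for (i = 0; i < width; i++) {
		unsigned int b = (value >> (width - 1 - i)) & 1u;
		size_t bit = w->pos_bits + i;

		w->buf[bit / 8] |= (uint8_t)(b << (7 - bit % 8));
	}
	w->pos_bits += width;
}
```

With this helper, two 4-bit fields written one after another (say 0x4 and
then 0x5) land in the single pattern byte 0x45, regardless of where each field
was sampled from in the header.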

> In the beginning I thought that you should only give the matchable fields.
> also you say everything is in bits and suddenly you are talking in bytes.
Yes, data are presented as bytes. But we must provide the capability
to operate with bitfields. If we introduced some byte alignment for
bitfields in the pattern we would lose the opportunity to match
with header structure (that might define bitfields) strictly. That's
why all offsets in flex config are expressed in bits, it provides the
precise control over pattern structure.
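
As a rough illustration of bit-granular addressing over a byte buffer, here is
a small sketch in plain C. The helper flex_get_bits() is hypothetical (it is
not part of the API and not taken from any PMD), and it assumes that bit
offset 0 is the most significant bit of the first byte, as headers are usually
drawn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read `width` bits (1..32) starting at bit offset `bit_off`.
 * Assumption: bit 0 is the MSB of buf[0] (network diagram order).
 */
static uint32_t
flex_get_bits(const uint8_t *buf, size_t buf_len, size_t bit_off,
	      unsigned int width)
{
	uint32_t val = 0;
	unsigned int i;

	assert(width >= 1 && width <= 32);
	assert(bit_off + width <= buf_len * 8);
	for (i = 0; i < width; i++) {
		size_t bit = bit_off + i;
		unsigned int b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;

		val = (val << 1) | b;
	}
	return val;
}
```

For an IPv4 header starting with byte 0x45, offset 0 with width 4 yields the
version (4) and offset 4 with width 4 yields the IHL (5), neither of which is
byte aligned.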

> 
> > 4. Flex Item Configuration
> >
> > The flex item configuration consists of the following parts:
> >
> >   - header field descriptors:
> >     - next header
> >     - next protocol
> >     - sample to match
> >   - input link descriptors
> >   - output link descriptors
> >
> > The field descriptors tell driver and hardware what data should be
> > extracted from the packet and then presented to match in the flows.
> > Each field is a bit pattern. It has width, offset from the header
> > beginning, mode of offset calculation, and offset related parameters.
> >
> 
> I'm not sure your indentation is correct for the next header, next protocol,
> sample to match.
> Since reading the first line means that all fields are going to be matched while
> in following sections only the sample to match are matchable.

"reading the first line means". M-m-m-m, sorry I do not follow.
The first line is "- header field descriptors". It tells what field descriptors we have.
It tells nothing about match. All indented bullets have the same structure type.
Indentation is correct, but the claim
"The field descriptors tell driver .... then presented to match in the flows"
is not. So - agree, fixed.

> 
> > The next header field is special, no data are actually taken from the
> > packet, but its offset is used as pointer to the next header in the
> > packet, in other word the next header offset specifies the size of the
> > header being parsed by flex item.
> >
> 
> So the name of the next header should be len?

We considered using the naming "len". We even started the code development
with this one. But there is some level of indirection. The header length
can be obtained with indirect methods (offset or bitmask field in the packet),
and this field descriptor provides rather some "pointer/offset to next header"
than length itself. So, naming next_header (pointer) is more precise,
in my opinion. I would prefer to keep this.

> 
> > There is one more special field - next protocol, it specifies where
> > the next protocol identifier is contained and packet data sampled from
> > this field will be used to determine the next protocol header type to
> continue packet parsing.
> > The next protocol field is like eth_type field in MAC2, or proto field
> > in IPv4/v6 headers.
> >
> > The sample fields are used to represent the data be sampled from the
> > packet and then matched with established flows.
> 
> Should this be samples?
? IIUC  - "sample" is adjective here, "fieldS" is plural.

> 
> >
> > There are several methods supposed to calculate field offset in
> > runtime depending on configuration and packet content:
> >
> >   - FIELD_MODE_FIXED - fixed offset. The bit offset from
> >     header beginning is permanent and defined by field_base
> >     configuration parameter.
> >
> >   - FIELD_MODE_OFFSET - the field bit offset is extracted
> >     from other header field (indirect offset field). The
> >     resulting field offset to match is calculated as:
> >
> >   field_base + (*field_offset & offset_mask) << field_shift
> >
> 
> Not all of those fields names are defined later in this patch, and I'm not sure
> about what they mean.
Yes, sorry, missed this fix in commit message once code was updated.
> Does * means take the value this is in field_offset?
Yes, it means we should calculate indirect field offset, and extract field
data from the packet (like by pointer *p from memory).
Added the note about this.
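
A minimal sketch of the FIELD_MODE_OFFSET calculation in plain C, just to show
the intended order of operations (mask first, then shift, then add the base).
The structure flex_offset_params and the function flex_offset_mode() are
hypothetical names for this illustration only; `sampled` stands for the data
already fetched from the packet at the indirect offset (the "*" part).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset related field parameters. */
struct flex_offset_params {
	uint32_t field_base;   /* fixed part of the resulting offset */
	uint32_t offset_mask;  /* meaningful bits of the sampled value */
	uint32_t offset_shift; /* scale applied to the masked value */
};

/* field_base + ((sampled & offset_mask) << offset_shift) */
static uint32_t
flex_offset_mode(const struct flex_offset_params *p, uint32_t sampled)
{
	assert(p->offset_shift < 32);
	return p->field_base +
	       ((sampled & p->offset_mask) << p->offset_shift);
}
```

For example, with mask 0x0F and shift 2 the first IPv4 byte 0x45 gives 20,
the header length in bytes. The units of the result follow whatever units
field_base and the shift are configured for.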

> How do we know the width of the field (by the value of the mask)?
By mask, it is common and advanced way to specify the field.

> 
> >     This mode is useful to sample some extra options following
> >     the main header with field containing main header length.
> >     Also, this mode can be used to calculate offset to the
> >     next protocol header, for example - IPv4 header contains
> >     the 4-bit field with IPv4 header length expressed in dwords.
> >     One more example - this mode would allow us to skip GENEVE
> >     header variable length options.
> >
> >   - FIELD_MODE_BITMASK - the field bit offset is extracted
> >     from other header field (indirect offset field), the latter
> >     is considered as bitmask containing some number of one bits,
> >     the resulting field offset to match is calculated as:
> >
> >   field_base + bitcount(*field_offset & offset_mask) << field_shift
> 
> Same comment as above you are using name that are not defined later.
Yes, fixed.
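
For completeness, a similar sketch for FIELD_MODE_BITMASK in plain C. The
function names are hypothetical and the code only mirrors the formula from the
commit message (count the one bits of the masked value, scale, add the base);
it says nothing about how a particular PMD or hardware implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Count the one bits in v (portable, no compiler builtins). */
static unsigned int
flex_bitcount(uint32_t v)
{
	unsigned int n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* field_base + (bitcount(sampled & offset_mask) << offset_shift) */
static uint32_t
flex_bitmask_mode(uint32_t field_base, uint32_t offset_mask,
		  uint32_t offset_shift, uint32_t sampled)
{
	assert(offset_shift < 32);
	return field_base +
	       ((uint32_t)flex_bitcount(sampled & offset_mask) << offset_shift);
}
```

With base 8, mask 0x07 and shift 2, a sampled flags value of 0x05 has two
flags set and gives 8 + (2 << 2) = 16.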

> 
> >
> >     This mode would be useful to skip the GTP header and its
> >     extra options with specified flags.
> >
> >   - FIELD_MODE_DUMMY - dummy field, optionally used for byte
> >     boundary alignment in pattern. Pattern mask and data are
> >     ignored in the match. All configuration parameters besides
> >     field size and offset are ignored.
> >
> > The offset mode list can be extended by vendors according to hardware
> > supported options.
> >
> > The input link configuration section tells the driver after what
> > protocols and at what conditions the flex item can follow.
> > Input link specified the preceding header pattern, for example for
> > GENEVE it can be UDP item specifying match on destination port with
> > value 6081. The flex item can follow multiple header types and
> > multiple input links should be specified. At flow creation time the
> > item with one of input link types should precede the flex item and
> > driver will select the correct flex item settings, depending on actual flow
> pattern.
> >
> > The output link configuration section tells the driver how to continue
> > packet parsing after the flex item protocol.
> > If multiple protocols can follow the flex item header the flex item
> > should contain the field with next protocol identifier, and the
> > parsing will be continued depending on the data contained in this field in
> the actual packet.
> >
> > The flex item fields can participate in RSS hash calculation, the
> > dedicated flag is present in field description to specify what fields
> > should be provided for hashing.
> >
> > 5. Flex Item Chaining
> >
> > If there are multiple protocols supposed to be supported with flex
> > items in chained fashion - two or more flex items within the same flow
> > and these ones might be neighbors in pattern - it means the flex items are
> mutual referencing.
> > In this case, the item that occurred first should be created with
> > empty output link list or with the list including existing items, and
> > then the second flex item should be created referencing the first flex item as
> input arc.
> >
> 
> And then I assume we should update the output list.

It is supposed to be done by the driver on creation of the second item.
Currently the update API is not supported (it depends on FW, and there
are no plans so far to support object modification), so we should
not include the code, and rte_flow_flex_item_update() will be missing,
at least in this release.

And, currently there is no code supporting chaining, it is just
an attempt to consider the potential scenario of flex item chaining and
to get as complete API as we can.

> 
> > Also, the hardware resources used by flex items to handle the packet
> > can be limited. If there are multiple flex items that are supposed to
> > be used within the same flow it would be nice to provide some hint for
> > the driver that these two or more flex items are intended for simultaneous
> usage.
> > The fields of items should be assigned with hint indices and these
> > indices from two or more flex items should not overlap (be unique per
> > field). For this case, the driver will try to engage not overlapping
> > hardware resources and provide independent handling of the fields with
> > unique indices. If the hint index is zero the driver assigns resources on its
> own.
> >
> > 6. Example of New Protocol Handling
> >
> > Let's suppose we have the requirements to handle the new tunnel
> > protocol that follows UDP header with destination port 0xFADE and is
> > followed by MAC header. Let the new protocol header format be like this:
> >
> >   struct new_protocol_header {
> >     rte_be32 header_length; /* length in dwords, including options */
> >     rte_be32 specific0;     /* some protocol data, no intention */
> >     rte_be32 specific1;     /* to match in flows on these fields */
> >     rte_be32 crucial;       /* data of interest, match is needed */
> >     rte_be32 options[0];    /* optional protocol data, variable length */
> >   };
> >
> > The supposed flex item configuration:
> >
> >   struct rte_flow_item_flex_field field0 = {
> >     .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
> >     .field_size = 96,                /* three dwords from the beginning */
> >   };
> >   struct rte_flow_item_flex_field field1 = {
> >     .field_mode = FIELD_MODE_FIXED,
> >     .field_size = 32,       /* Field size is one dword */
> >     .field_base = 96,       /* Skip three dwords from the beginning */
> >   };
> >   struct rte_flow_item_udp spec0 = {
> >     .hdr = {
> >       .dst_port = RTE_BE16(0xFADE),
> >     }
> >   };
> >   struct rte_flow_item_udp mask0 = {
> >     .hdr = {
> >       .dst_port = RTE_BE16(0xFFFF),
> >     }
> >   };
> >   struct rte_flow_item_flex_link link0 = {
> >     .item = {
> >        .type = RTE_FLOW_ITEM_TYPE_UDP,
> >        .spec = &spec0,
> >        .mask = &mask0,
> >     },
> >   };
> >
> >   struct rte_flow_item_flex_conf conf = {
> >     .next_header = {
> >       .field_mode = FIELD_MODE_OFFSET,
> >       .field_base = 0,
> >       .offset_base = 0,
> >       .offset_mask = 0xFFFFFFFF,
> >       .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
> >     },
> >     .sample = {
> >        &field0,
> >        &field1,
> >     },
> 
> Why in sample you give both fields?
> by your decision we just want to match on field1.

Field0 is a placeholder: it covers the gap in the pattern and
makes sure the pattern has exactly the same format as the
protocol header. As an option (by application design choice)
we can omit field0; in that case we get a compact
pattern structure, but it won't be the exact protocol header
structure.
> 
> >     .sample_num = 2,
> >     .input_link[0] = &link0,
> >     .input_num = 1
> >   };
> >
> > Let's suppose we have created the flex item successfully, and PMD
> > returned the handle 0x123456789A. We can use the following item
> > pattern to match the crucial field in the packet with value 0x00112233:
> >
> >   struct new_protocol_header spec_pattern =
> >   {
> >     .crucial = RTE_BE32(0x00112233),
> >   };
> >   struct new_protocol_header mask_pattern =
> >   {
> >     .crucial = RTE_BE32(0xFFFFFFFF),
> >   };
> >   struct rte_flow_item_flex spec_flex = {
> >     .handle = 0x123456789A,
> >     .length = sizeof(struct new_protocol_header),
> >     .pattern = &spec_pattern,
> >   };
> >   struct rte_flow_item_flex mask_flex = {
> >     .length = sizeof(struct new_protocol_header),
> >     .pattern = &mask_pattern,
> >   };
> >   struct rte_flow_item item_to_match = {
> >     .type = RTE_FLOW_ITEM_TYPE_FLEX,
> >     .spec = &spec_flex,
> >     .mask = &mask_flex,
> >   };
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > ---
> >  doc/guides/prog_guide/rte_flow.rst     |  24 +++
> >  doc/guides/rel_notes/release_21_11.rst |   7 +
> >  lib/ethdev/rte_ethdev.h                |   1 +
> >  lib/ethdev/rte_flow.h                  | 228 +++++++++++++++++++++++++
> >  4 files changed, 260 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index 2b42d5ec8c..628f30cea7 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -1425,6 +1425,30 @@ Matches a conntrack state after conntrack action.
> >  - ``flags``: conntrack packet state flags.
> >  - Default ``mask`` matches all state bits.
> >
> > +Item: ``FLEX``
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Matches with the network protocol header of preliminary configured format.
> > +The application describes the desired header structure, defines the
> > +header fields attributes and header relations with preceding and
> > +following protocols and configures the ethernet devices accordingly via
> > +rte_flow_flex_item_create() routine.
> 
> How about: matches a custom header that was created using
> rte_flow_flex_item_create
Np, fixed.

> 
> > +
> > +- ``handle``: the flex item handle returned by the PMD on successful
> > +  rte_flow_flex_item_create() call. The item handle is unique within
> > +  the device port, mask for this field is ignored.
> 
> I think you can remove that it is unique handle.
> 
> > +- ``length``: match pattern length in bytes. If the length does not cover
> > +  all fields defined in item configuration, the pattern spec and mask are
> > +  supposed to be appended with zeroes till the full configured item length.
> 
> It looks bugy saying that you can give any length but expect the application to
> supply the full length.
Yes. The application can configure a 128B protocol header with
rte_flow_flex_item_create(), so the "full configured length" is 128B,
and then provide only 4 bytes of pattern in the flow. The driver
should treat the pattern as the 4 bytes provided followed by 124
zero bytes.
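
A minimal sketch of that padding rule, assuming a hypothetical driver-side
helper (flex_pattern_extend() is not part of the patch). It is applied to
spec and mask alike, so the bytes the application did not provide end up
with a zero mask and do not take part in the match:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): expand a short spec or mask
 * pattern to the full configured item length by appending zero bytes.
 * 'full' must hold 'full_len' bytes.
 */
static void
flex_pattern_extend(uint8_t *full, size_t full_len,
		    const uint8_t *given, size_t given_len)
{
	/* The pattern must not be longer than the configured item. */
	assert(given_len <= full_len);
	memset(full, 0, full_len);
	if (given_len > 0)
		memcpy(full, given, given_len);
}
```

The same call would be made once for rte_flow_item_flex.pattern of the spec
and once for the mask, with rte_flow_item_flex.length as given_len.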

> 
> > +- ``pattern``: pattern to match. The protocol header fields are considered
> > +  as bit fields, all offsets and widths are expressed in bits. The pattern
> > +  is the buffer containing the bit concatenation of all the fields presented
> > +  at item configuration time, in the same order and same amount. The most
> > +  regular way is to define all the header fields in the flex item configuration
> > +  and directly use the header structure as pattern template, i.e. application
> > +  just can fill the header structures with desired match values and masks and
> > +  specify these structures as flex item pattern directly.
> > +
> 
> It hard to understand this comment and what the application should set.
> I suggest to take the basic approach and just explain it. ( I think those are the
> last few lines)
The last few lines describe just one suggested option.
Generally speaking, there are TWO structures: the protocol header and the match pattern. The easiest way ("the most regular way") to use the flex item is to make these TWO structures coincide. But this is not the only way. Fields in the pattern
can reference the same protocol header fields multiple times, in arbitrary
number and combinations. The application can gather split fields together, do byte order conversion, etc.


> 
> >  Actions
> >  ~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_21_11.rst
> > b/doc/guides/rel_notes/release_21_11.rst
> > index 73e377a007..170797f9e9 100644
> > --- a/doc/guides/rel_notes/release_21_11.rst
> > +++ b/doc/guides/rel_notes/release_21_11.rst
> > @@ -55,6 +55,13 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Introduced RTE Flow Flex Item.**
> > +
> > +  * The configurable RTE Flow Flex Item provides the capability to introdude
> > +    the arbitrary user specified network protocol header, configure the device
> > +    hardware accordingly, and perform match on this header with desired patterns
> > +    and masks.
> > +
> >  * **Enabled new devargs parser.**
> >
> >    * Enabled devargs syntax
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > afdc53b674..e9ad7673e9 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -558,6 +558,7 @@ struct rte_eth_rss_conf {
> >   * it takes the reserved value 0 as input for the hash function.
> >   */
> >  #define ETH_RSS_L4_CHKSUM          (1ULL << 35)
> > +#define ETH_RSS_FLEX		   (1ULL << 36)
> 
> Is the indentation right?
> How do you support FLEX RSS if more then on FLEX item is configured?
> 
As we found, we missed some options in the RSS related API (we have to invent
a way to tell the drivers about flex item fields while creating the indirect RSS action), and there are no PMDs supporting RSS over flex fields yet, so we
can omit the RSS-related stuff in this release.

> >
> >  /*
> >   * We use the following macros to combine with above ETH_RSS_* for
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> > 7b1ed7f110..eccb1e1791 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -574,6 +574,15 @@ enum rte_flow_item_type {
> >  	 * @see struct rte_flow_item_conntrack.
> >  	 */
> >  	RTE_FLOW_ITEM_TYPE_CONNTRACK,
> > +
> > +	/**
> > +	 * Matches a configured set of fields at runtime calculated offsets
> > +	 * over the generic network header with variable length and
> > +	 * flexible pattern
> > +	 *
> 
> I think it should say matches on application configured header.
No, no. It matches on the pattern. The pattern may be configured with the same format
as the protocol header has, or may not (a more complicated way to operate, but it could provide some optimizations).
> 
> > +	 * @see struct rte_flow_item_flex.
> > +	 */
> > +	RTE_FLOW_ITEM_TYPE_FLEX,
> >  };
> >
> >  /**
> > @@ -1839,6 +1848,160 @@ struct rte_flow_item {
> >  	const void *mask; /**< Bit-mask applied to spec and last. */  };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_FLEX
> > + *
> > + * Matches a specified set of fields within the network protocol
> > + * header. Each field is presented as set of bits with specified width, and
> > + * bit offset (this is dynamic one - can be calulated by several methods
> > + * in runtime) from the header beginning.
> > + *
> > + * The pattern is concatenation of all bit fields configured at item creation
> > + * by rte_flow_flex_item_create() exactly in the same order and amount, no
> > + * fields can be omitted or swapped. The dummy mode field can be used for
> > + * pattern byte boundary alignment, least significant bit in byte goes first.
> > + * Only the fields specified in sample_data configuration parameter participate
> > + * in pattern construction.
> > + *
> > + * If pattern length is smaller than configured fields overall length it is
> > + * extended with trailing zeroes, both for value and mask.
> > + *
> > + * This type does not support ranges (struct rte_flow_item.last).
> > + */
> 
> I think it is to complex to understand see my comment above.
> 
> > +struct rte_flow_item_flex {
> > +	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
> > +	uint32_t length; /**< Pattern length in bytes. */
> > +	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
> > +};
> > +/**
> > + * Field bit offset calculation mode.
> > + */
> > +enum rte_flow_item_flex_field_mode {
> > +	/**
> > +	 * Dummy field, used for byte boundary alignment in pattern.
> > +	 * Pattern mask and data are ignored in the match. All configuration
> > +	 * parameters besides field size are ignored.
> 
> Since in the item we just set value and mask what will happen if we set mask
> to be different then 0 in an offset that we have such a field?
Nothing. The mask and value for bits covered by DUMMY fields are just ignored. Any values and masks can be there, and they will not be translated to the actual flow matcher. DUMMY is just a placeholder to align the substantial fields with the actual protocol header structure. DUMMY usage is optional, and we
need these fields only to build a pattern structure coinciding with the protocol header without covering the entire header with actual sampling fields (to save HW resources).
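
To illustrate, here is a rough sketch of how a driver could drop the mask
bits under DUMMY fields before building the matcher. All names
(sketch_mode, sketch_field, sketch_effective_mask) are hypothetical, and
the sketch assumes byte-aligned field sizes to stay away from the bit
ordering details:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified field descriptor: byte-aligned sizes only. */
enum sketch_mode { SKETCH_DUMMY = 0, SKETCH_FIXED };

struct sketch_field {
	enum sketch_mode mode;
	uint32_t size_bits; /* assumed to be a multiple of 8 in this sketch */
};

/*
 * Build the mask that is effectively used for matching: bytes covered by
 * DUMMY fields are cleared, whatever the application put there.
 * Returns the number of pattern bytes covered by the field list.
 */
static size_t
sketch_effective_mask(uint8_t *out, const uint8_t *mask, size_t len,
		      const struct sketch_field *fields, size_t nb_fields)
{
	size_t pos = 0;

	for (size_t i = 0; i < nb_fields; i++) {
		size_t n = fields[i].size_bits / 8;

		assert(fields[i].size_bits % 8 == 0);
		assert(pos + n <= len);
		if (fields[i].mode == SKETCH_DUMMY)
			memset(out + pos, 0, n);        /* placeholder, ignored */
		else
			memcpy(out + pos, mask + pos, n); /* real sampled field */
		pos += n;
	}
	return pos;
}
```

With the configuration from the example (a 96-bit DUMMY field followed by a
32-bit FIXED one), only the last four mask bytes survive, i.e. only
"crucial" is matched.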

> 
> > +	 */
> > +	FIELD_MODE_DUMMY = 0,
> > +	/**
> > +	 * Fixed offset field. The bit offset from header beginning is
> > +	 * is permanent and defined by field_base parameter.
> > +	 */
> > +	FIELD_MODE_FIXED,
> > +	/**
> > +	 * The field bit offset is extracted from other header field (indirect
> > +	 * offset field). The resulting field offset to match is calculated as:
> > +	 *
> > +	 *    field_base + (*field_offset & offset_mask) << field_shift
> 
> I can't find those name in the patch and I'm not clear on what they mean.
Yes, it is a typo, fixed, thank you.

> 
> > +	 */
> > +	FIELD_MODE_OFFSET,
> > +	/**
> > +	 * The field bit offset is extracted from other header field (indirect
> > +	 * offset field), the latter is considered as bitmask containing some
> > +	 * number of one bits, the resulting field offset to match is
> > +	 * calculated as:
> 
> Just like above.
> 
> > +	 *
> > +	 *    field_base + bitcount(*field_offset & offset_mask) << field_shift
> > +	 */
> > +	FIELD_MODE_BITMASK,
> > +};
> > +
> > +/**
> > + * Flex item field tunnel mode
> > + */
> > +enum rte_flow_item_flex_tunnel_mode {
> > +	FLEX_TUNNEL_MODE_FIRST = 0, /**< First item occurrence. */
> > +	FLEX_TUNNEL_MODE_OUTER = 1, /**< Outer item. */
> > +	FLEX_TUNNEL_MODE_INNER = 2  /**< Inner item. */ };
> > +
> 
> The '}' should be at a new line.
> If the item can be inner and outer do we need to define two flex objects?
> Also why enum and not defines?
Just looked at rte_flow.h and saw that #defines are not so common there,
they are used just for bit flags. For value sets there are mostly enums. And we have updated
the tunnel settings, so this enum is not needed anymore.

> From API point of view I think it should hav the following options:
> Mode_outer , mode_inner, mode_global and mode_tunnel, Why is per field
> and not per object.
Yes,  agree, thank you very much for discovering this arch gap, updated.

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > +*/ __extension__ struct rte_flow_item_flex_field {
> > +	/** Defines how match field offset is calculated over the packet. */
> > +	enum rte_flow_item_flex_field_mode field_mode;
> > +	uint32_t field_size; /**< Match field size in bits. */
> 
> I think it will be better to remove the word Match.
Yes, right, fixed
> 
> > +	int32_t field_base; /**< Match field offset in bits. */
> 
> I think it will be better to remove the word Match.
Yes, right, fixed

> 
> > +	uint32_t offset_base; /**< Indirect offset field offset in bits. */
> 
> I think a better name will be offset_field /* the offset of the field that holds
> the offset that should be used from the field_base */ what do you think?
It is just one term (first in sum of resulting offset), so xxxx_base looks OK.
> 
> Maybe just change from offset_base to offset?
> 
> > +	uint32_t offset_mask; /**< Indirect offset field bit mask. */
> 
> Maybe better wording?
> The mask to apply to the value that is set in the offset_field.

We have an entity - the offset field.
We have 3 attributes of the entity - base, mask, shift.
So, the naming scheme is "entity-name_attribute-name":
offset_base  - "base" attribute of "offset field"
offset_mask - "mask" attribute of "offset field"
offset_shift - "shift" attribute of "offset field"
field_base - "base" attribute of "field" entity

The word "field" is omitted from the "offset_field" name in order to make the name shorter and to not intermix it with the pure "field" entity.
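
A small sketch of how these attributes could combine in the two indirect
modes. The struct and function names are hypothetical, the value of the
offset field is assumed to be already extracted from the packet (so
offset_base is not modelled), and the parentheses reflect one reading of
the formulas in the header comment:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the attributes named above. */
struct sketch_offset {
	int32_t field_base;   /* "base" of the field itself */
	uint32_t offset_mask; /* "mask" applied to the offset field value */
	int32_t offset_shift; /* "shift" applied to the masked value */
};

/* FIELD_MODE_OFFSET reading: field_base + ((value & mask) << shift). */
static int64_t
sketch_offset_mode(const struct sketch_offset *o, uint32_t offset_field_value)
{
	assert(o->offset_shift >= 0 && o->offset_shift < 32);
	return (int64_t)o->field_base +
	       ((int64_t)(offset_field_value & o->offset_mask) <<
		o->offset_shift);
}

/* FIELD_MODE_BITMASK reading: count the one bits instead of using the value. */
static int64_t
sketch_bitmask_mode(const struct sketch_offset *o, uint32_t offset_field_value)
{
	uint32_t v = offset_field_value & o->offset_mask;
	int64_t ones = 0;

	assert(o->offset_shift >= 0 && o->offset_shift < 32);
	/* Clear the lowest set bit on each iteration. */
	for (; v != 0; v &= v - 1)
		ones++;
	return (int64_t)o->field_base + (ones << o->offset_shift);
}
```

Whether the result is interpreted in bits or bytes is left to the field
definition; the sketch only shows how base, mask and shift are combined.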

> 
> > +	int32_t offset_shift; /**< Indirect offset multiply factor. */
> > +	uint16_t tunnel_count:2; /**< 0-first occurrence, 1-outer, 2-inner.*/
> 
> I think this may result in some warning since you try to cast enum to 2 bits.
> Also the same question from above to support inner and outer do we need
> two objects?
We refactored the tunneling attributes.

> 
> > +	uint16_t rss_hash:1; /**< Field participates in RSS hash calculation. */
> 
> Please see my comment on the RSS, it is not clear how more then one flex
> item can be created and the rss will work.
Yes, you are right, we must update the RSS API as well. Now we have no drivers
supporting RSS over flex item fields. But we considered the opportunity and now (as a result of the review) we have a better understanding of what we should develop to provide RSS over flex.

> 
> > +	uint16_t field_id; /**< device hint, for flows with multiple items. */
> 
> How should this be used?
> Should be capital D in device.
Updated the documentation.
> 
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > +*/ struct rte_flow_item_flex_link {
> > +	/**
> > +	 * Preceding/following header. The item type must be always
> > provided.
> > +	 * For preceding one item must specify the header value/mask to
> > match
> > +	 * for the link be taken and start the flex item header parsing.
> > +	 */
> > +	struct rte_flow_item item;
> > +	/**
> > +	 * Next field value to match to continue with one of the configured
> > +	 * next protocols.
> > +	 */
> > +	uint32_t next;
> 
> Is this offset of the field or the value?
" Next field VALUE"
It is the value. Like 0x0800 in eth_type to specify IPv4 next proto.
Or 17 in IPv4.proto to specify following UDP.

> 
> > +	/**
> > +	 * Specifies whether flex item represents tunnel protocol
> > +	 */
> > +	bool tunnel;
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > +*/ struct rte_flow_item_flex_conf {
> > +	/**
> > +	 * The next header offset, it presents the network header size covered
> > +	 * by the flex item and can be obtained with all supported offset
> > +	 * calculating methods (fixed, dedicated field, bitmask, etc).
> > +	 */
> > +	struct rte_flow_item_flex_field next_header;
> 
> I think a better name will be size/len
Replied above about level of indirection.

> 
> > +	/**
> > +	 * Specifies the next protocol field to match with link next protocol
> > +	 * values and continue packet parsing with matching link.
> > +	 */
> > +	struct rte_flow_item_flex_field next_protocol;
> > +	/**
> > +	 * The fields will be sampled and presented for explicit match
> > +	 * with pattern in the rte_flow_flex_item. There can be multiple
> > +	 * fields descriptors, the number should be specified by sample_num.
> > +	 */
> > +	struct rte_flow_item_flex_field *sample_data;
> > +	/** Number of field descriptors in the sample_data array. */
> > +	uint32_t sample_num;
> 
> nb_samples?
> 
> > +	/**
> > +	 * Input link defines the flex item relation with preceding
> > +	 * header. It specified the preceding item type and provides pattern
> > +	 * to match. The flex item will continue parsing and will provide the
> > +	 * data to flow match in case if there is the match with one of input
> > +	 * links.
> > +	 */
> > +	struct rte_flow_item_flex_link *input_link;
> > +	/** Number of link descriptors in the input link array. */
> > +	uint32_t input_num;
> Nb_inputs
OK, let's rename.
.. snip ..
> > +
> > +/**
> > + * Modify the flex item on the specified Ethernet device.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in] handle
> > + *   Handle of the item existing on the specified device.
> > + * @param[in] conf
> > + *   Item new configuration.
> 
> Do you to supply full configuration for each update?
> Maybe add a mask?
Currently no drivers support update. So, let's remove the update routine.

With best regards,
Slava


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-12  7:53     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-12  7:53 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon

Hi Slava and Gregory,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Monday, October 11, 2021 9:15 PM
> Subject: [dpdk-dev] [PATCH v3 2/5] ethdev: support flow elements with variable length
> 
> From: Gregory Etelson <getelson@nvidia.com>
> 
> RTE flow API provides RAW item type for packet patterns of variable length. The RAW item structure has
> fixed size members that describe the variable pattern length and methods to process it.
> 
> A new RTE flow item type with variable length pattern that does not fit the RAW item meta description
> could not use the RAW item.
> For example, the new flow item that references 64 bits PMD handler cannot be described by the RAW
> item.
> 
> The patch allows RTE conv helper functions to process custom flow items with variable length pattern.
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  lib/ethdev/rte_flow.c | 83 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 70 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c index 8cb7a069c8..100983ca59 100644
> --- a/lib/ethdev/rte_flow.c
> +++ b/lib/ethdev/rte_flow.c
> @@ -30,13 +30,67 @@ uint64_t rte_flow_dynf_metadata_mask;  struct rte_flow_desc_data {
>  	const char *name;
>  	size_t size;
> +	size_t (*desc_fn)(void *dst, const void *src);
>  };
> 
> +/**
> + *
> + * @param buf
> + * Destination memory.
> + * @param data
> + * Source memory
> + * @param size
> + * Requested copy size
> + * @param desc
> + * rte_flow_desc_item - for flow item conversion.
> + * rte_flow_desc_action - for flow action conversion.
> + * @param type
> + * Offset into the desc param or negative value for private flow elements.
> + */
> +static inline size_t
> +rte_flow_conv_copy(void *buf, const void *data, const size_t size,
> +		   const struct rte_flow_desc_data *desc, int type) {
> +	/**
> +	 * allow PMD private flow item
> +	 * see 5d1bff8fe2

There shouldn't be commit reference in source.

> +	 * "ethdev: allow negative values in flow rule types"
> +	 */
> +	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
> +	if (buf == NULL || data == NULL)
> +		return 0;
> +	rte_memcpy(buf, data, (size > sz ? sz : size));
> +	if (desc[type].desc_fn)
> +		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
> +	return sz;
> +}
> +
> +static size_t
> +rte_flow_item_flex_conv(void *buf, const void *data) {
> +	struct rte_flow_item_flex *dst = buf;
> +	const struct rte_flow_item_flex *src = data;
> +	if (buf) {
> +		dst->pattern = rte_memcpy
> +			((void *)((uintptr_t)(dst + 1)), src->pattern,
> +			 src->length);
> +	}
> +	return src->length;
> +}
> +
>  /** Generate flow_item[] entry. */
>  #define MK_FLOW_ITEM(t, s) \
>  	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
>  		.name = # t, \
> -		.size = s, \
> +		.size = s,               \
> +		.desc_fn = NULL,\
> +	}
> +
> +#define MK_FLOW_ITEM_FN(t, s, fn) \
> +	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
> +		.name = # t,                 \
> +		.size = s,                   \
> +		.desc_fn = fn,               \
>  	}
> 
>  /** Information about known flow pattern items. */ @@ -100,6 +154,8 @@ static const struct
> rte_flow_desc_data rte_flow_desc_item[] = {
>  	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
>  	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
>  	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
> +	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
> +			rte_flow_item_flex_conv),
>  };
> 
>  /** Generate flow_action[] entry. */
> @@ -107,8 +163,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
>  	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
>  		.name = # t, \
>  		.size = s, \
> +		.desc_fn = NULL,\
> +	}
> +
> +#define MK_FLOW_ACTION_FN(t, fn) \
> +	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
> +		.name = # t, \
> +		.size = 0, \
> +		.desc_fn = fn,\
>  	}
> 
> +
>  /** Information about known flow actions. */  static const struct rte_flow_desc_data
> rte_flow_desc_action[] = {
>  	MK_FLOW_ACTION(END, 0),
> @@ -527,12 +592,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
>  		}
>  		break;
>  	default:
> -		/**
> -		 * allow PMD private flow item
> -		 */
> -		off = (int)item->type >= 0 ?
> -		      rte_flow_desc_item[item->type].size : sizeof(void *);
> -		rte_memcpy(buf, data, (size > off ? off : size));
> +		off = rte_flow_conv_copy(buf, data, size,
> +					 rte_flow_desc_item, item->type);
>  		break;
>  	}
>  	return off;
> @@ -634,12 +695,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
>  		}
>  		break;
>  	default:
> -		/**
> -		 * allow PMD private flow action
> -		 */
> -		off = (int)action->type >= 0 ?
> -		      rte_flow_desc_action[action->type].size : sizeof(void *);
> -		rte_memcpy(buf, action->conf, (size > off ? off : size));
> +		off = rte_flow_conv_copy(buf, action->conf, size,
> +					 rte_flow_desc_action, action->type);
>  		break;
>  	}
>  	return off;
> --
> 2.18.1

Best,
Ori


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/5] app/testpmd: add jansson library
  2021-10-11 18:15   ` [dpdk-dev] [PATCH v3 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-12  7:56     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-12  7:56 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon


Hi Gregory,

> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Monday, October 11, 2021 9:15 PM
> Subject: [PATCH v3 4/5] app/testpmd: add jansson library
> 
> From: Gregory Etelson <getelson@nvidia.com>
> 
> Testpmd interactive mode provides CLI to configure application commands. Testpmd reads CLI command
> and parameters from STDIN, and converts input into C objects with internal parser.
> The patch adds jansson dependency to testpmd.
> With jansson, testpmd can read input in JSON format from STDIN or input file and convert it into C object
> using jansson library calls.
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  app/test-pmd/meson.build | 5 +++++
>  app/test-pmd/testpmd.h   | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build index 98f3289bdf..3a8babd604
> 100644
> --- a/app/test-pmd/meson.build
> +++ b/app/test-pmd/meson.build
> @@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
>      sources += files('bpf_cmd.c')
>      deps += 'bpf'
>  endif
> +jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
> +if jansson_dep.found()
> +    dpdk_conf.set('RTE_HAS_JANSSON', 1)
> +    ext_deps += jansson_dep
> +endif
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index 5863b2f43f..876a341cf0 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -14,6 +14,9 @@
>  #include <rte_os_shim.h>
>  #include <cmdline.h>
>  #include <sys/queue.h>
> +#ifdef RTE_HAS_JANSSON
> +#include <jansson.h>
> +#endif
> 
>  #define RTE_PORT_ALL            (~(portid_t)0x0)
> 
> --
> 2.18.1

Acked-by: Ori Kam <orika@nvidia.com>
Best,
Ori


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (4 preceding siblings ...)
  2021-10-11 18:15 ` [dpdk-dev] [PATCH v3 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-12 10:49 ` Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
                     ` (4 more replies)
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (4 subsequent siblings)
  10 siblings, 5 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The generic modify field flow action introduced in [1] has
some issues related to the immediate source operand:

  - immediate source can be presented either as an unsigned
    64-bit integer or pointer to data pattern in memory.
    There was no explicit pointer field defined in the union

  - the byte ordering for 64-bit integer was not specified.
    Many fields have lesser lengths and byte ordering
    is crucial.

  - how the bit offset is applied to the immediate source
    field was not defined and documented

  - 64-bit integer size is not enough to provide IPv6
    addresses

In order to cover the issues and exclude any ambiguities
the following is done:

  - introduce the explicit pointer field
    in rte_flow_action_modify_data structure

  - replace the 64-bit unsigned integer with 16-byte array

  - update the modify field flow action documentation

Appropriate deprecation notice has been removed.

[1] commit 73b68f4c54a0 ("ethdev: introduce generic modify flow action")
[2] RFC: http://patches.dpdk.org/project/dpdk/patch/20210910141609.8410-1-viacheslavo@nvidia.com/
[3] Deprecation notice: http://patches.dpdk.org/project/dpdk/patch/20210803085754.643180-1-orika@nvidia.com/
[4] v1 - http://patches.dpdk.org/project/dpdk/cover/20211001195223.31909-1-viacheslavo@nvidia.com/
[5] v2 - http://patches.dpdk.org/project/dpdk/patch/20211010234547.1495-2-viacheslavo@nvidia.com/
[6] v3 - http://patches.dpdk.org/project/dpdk/cover/20211012080631.28504-1-viacheslavo@nvidia.com/

v2: - comments addressed
    - documentation updated
    - typos fixed
    - mlx5 PMD updated

v3: - comments addressed
    - documentation updated
    - typos fixed

v4: - removed erroneously added Ack by Ori K. for mlx5 patch
    - mlx5 patch updated - bug fixes and cleanup

Viacheslav Ovsiienko (5):
  ethdev: update modify field flow action
  ethdev: fix missed experimental tag for modify field action
  app/testpmd: update modify field flow action support
  app/testpmd: fix hex string parser in flow commands
  net/mlx5: update modify field action

 app/test-pmd/cmdline_flow.c            |  60 ++++++++----
 doc/guides/prog_guide/rte_flow.rst     |  24 ++++-
 doc/guides/rel_notes/deprecation.rst   |   4 -
 doc/guides/rel_notes/release_21_11.rst |   7 ++
 drivers/net/mlx5/mlx5_flow_dv.c        | 123 +++++++++----------------
 lib/ethdev/rte_flow.h                  |  19 +++-
 6 files changed, 129 insertions(+), 108 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v4 1/5] ethdev: update modify field flow action
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
@ 2021-10-12 10:49   ` Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 2/5] ethdev: fix missed experimental tag for modify field action Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The generic modify field flow action introduced in [1] has
some issues related to the immediate source operand:

  - immediate source can be presented either as an unsigned
    64-bit integer or pointer to data pattern in memory.
    There was no explicit pointer field defined in the union.

  - the byte ordering for 64-bit integer was not specified.
    Many fields have shorter lengths and byte ordering
    is crucial.

  - how the bit offset is applied to the immediate source
    field was not defined and documented.

  - 64-bit integer size is not enough to provide IPv6
    addresses.

In order to cover the issues and exclude any ambiguities
the following is done:

  - introduce the explicit pointer field
    in rte_flow_action_modify_data structure

  - replace the 64-bit unsigned integer with 16-byte array

  - update the modify field flow action documentation

Appropriate deprecation notice has been removed.
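
The documentation update in this patch illustrates the bit offset rule with
a MAC address: width 8, offset 16, immediate bytes {xxx, xxx, 0x85, xxx,
xxx, xxx}. A rough model of that rule is sketched here for byte-aligned
offset and width only; sketch_apply_immediate() is a hypothetical name and
not a part of the API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the documented rule, for byte-aligned offset and
 * width only: the destination bits [offset, offset + width) are replaced by
 * the immediate bits taken from the same offset.
 */
static void
sketch_apply_immediate(uint8_t *field, size_t field_len,
		       const uint8_t *imm, uint32_t offset_bits,
		       uint32_t width_bits)
{
	size_t off = offset_bits / 8;
	size_t len = width_bits / 8;

	assert(offset_bits % 8 == 0 && width_bits % 8 == 0);
	assert(off + len <= field_len);
	/* The immediate is laid out as the item field, no extra offset. */
	memcpy(field + off, imm + off, len);
}
```

The immediate buffer has the same layout as the item field, so the bytes
marked xxx in the documentation example are simply never read.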

[1] commit 73b68f4c54a0 ("ethdev: introduce generic modify flow action")

Fixes: 2ba49b5f3721 ("doc: announce change to ethdev modify action data")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 doc/guides/prog_guide/rte_flow.rst     | 24 +++++++++++++++++++++++-
 doc/guides/rel_notes/deprecation.rst   |  4 ----
 doc/guides/rel_notes/release_21_11.rst |  7 +++++++
 lib/ethdev/rte_flow.h                  | 16 ++++++++++++----
 4 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..b08087511f 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2835,6 +2835,22 @@ a packet to any other part of it.
 ``value`` sets an immediate value to be used as a source or points to a
 location of the value in memory. It is used instead of ``level`` and ``offset``
 for ``RTE_FLOW_FIELD_VALUE`` and ``RTE_FLOW_FIELD_POINTER`` respectively.
+The data in memory should be presented exactly in the same byte order and
+length as in the relevant flow item, i.e. data for field with type
+``RTE_FLOW_FIELD_MAC_DST`` should follow the conventions of ``dst`` field
+in ``rte_flow_item_eth`` structure, with type ``RTE_FLOW_FIELD_IPV6_SRC`` -
+``rte_flow_item_ipv6`` conventions, and so on. If the field size is larger than
+16 bytes the pattern can be provided as pointer only.
+
+The bitfield extracted from the memory being applied as second operation
+parameter is defined by action width and by the destination field offset.
+Application should provide the data in immediate value memory (either as
+buffer or by pointer) exactly as item field without any applied explicit offset,
+and destination packet field (with specified width and bit offset) will be
+replaced by immediate source bits from the same bit offset. For example,
+to replace the third byte of MAC address with value 0x85, application should
+specify destination width as 8, destination offset as 16, and provide immediate
+value as sequence of bytes {xxx, xxx, 0x85, xxx, xxx, xxx}.
 
 .. _table_rte_flow_action_modify_field:
 
@@ -2865,7 +2881,13 @@ for ``RTE_FLOW_FIELD_VALUE`` and ``RTE_FLOW_FIELD_POINTER`` respectively.
    +---------------+----------------------------------------------------------+
    | ``offset``    | number of bits to skip at the beginning                  |
    +---------------+----------------------------------------------------------+
-   | ``value``     | immediate value or a pointer to this value               |
+   | ``value``     | immediate value buffer (source field only, not           |
+   |               | applicable to destination) for RTE_FLOW_FIELD_VALUE      |
+   |               | field type                                               |
+   +---------------+----------------------------------------------------------+
+   | ``pvalue``    | pointer to immediate value data (source field only, not  |
+   |               | applicable to destination) for RTE_FLOW_FIELD_POINTER    |
+   |               | field type                                               |
    +---------------+----------------------------------------------------------+
 
 Action: ``CONNTRACK``
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index a2fe766d4b..dee14077a5 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -120,10 +120,6 @@ Deprecation Notices
 * ethdev: Announce moving from dedicated modify function for each field,
   to using the general ``rte_flow_modify_field`` action.
 
-* ethdev: The struct ``rte_flow_action_modify_data`` will be modified
-  to support modifying fields larger than 64 bits.
-  In addition, documentation will be updated to clarify byte order.
-
 * ethdev: Attribute ``shared`` of the ``struct rte_flow_action_count``
   is deprecated and will be removed in DPDK 21.11. Shared counters should
   be managed using shared actions API (``rte_flow_shared_action_create`` etc).
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index dfc2cbdeed..578c1206e7 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -187,6 +187,13 @@ API Changes
   the crypto/security operation. This field will be used to communicate
   events such as soft expiry with IPsec in lookaside mode.
 
+* ethdev: ``rte_flow_action_modify_data`` structure updated, immediate data
+  array is extended, data pointer field is explicitly added to union, the
+  action behavior is defined in more strict fashion and documentation updated.
+  The immediate value behavior has been changed, the entire immediate field
+  should be provided, and offset for immediate source bitfield is assigned
+  from destination one.
+
 
 ABI Changes
 -----------
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..f14f77772b 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3217,10 +3217,18 @@ struct rte_flow_action_modify_data {
 			uint32_t offset;
 		};
 		/**
-		 * Immediate value for RTE_FLOW_FIELD_VALUE or
-		 * memory address for RTE_FLOW_FIELD_POINTER.
+		 * Immediate value for RTE_FLOW_FIELD_VALUE, presented in the
+		 * same byte order and length as in relevant rte_flow_item_xxx.
+		 * The immediate source bitfield offset is inherited from
+		 * the destination's one.
 		 */
-		uint64_t value;
+		uint8_t value[16];
+		/**
+		 * Memory address for RTE_FLOW_FIELD_POINTER, memory layout
+		 * should be the same as for relevant field in the
+		 * rte_flow_item_xxx structure.
+		 */
+		void *pvalue;
 	};
 };
 
@@ -3240,7 +3248,7 @@ enum rte_flow_modify_op {
  * RTE_FLOW_ACTION_TYPE_MODIFY_FIELD
  *
  * Modify a destination header field according to the specified
- * operation. Another packet field can be used as a source as well
+ * operation. Another field of the packet can be used as a source as well
  * as tag, mark, metadata, immediate value or a pointer to it.
  */
 struct rte_flow_action_modify_field {
-- 
2.18.1



* [dpdk-dev] [PATCH v4 2/5] ethdev: fix missed experimental tag for modify field action
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
@ 2021-10-12 10:49   ` Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 3/5] app/testpmd: update modify field flow action support Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas, stable

The EXPERIMENTAL tag was missing in the description of the
rte_flow_action_modify_data structure.

Fixes: 73b68f4c54a0 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/ethdev/rte_flow.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index f14f77772b..8a1eddd0b7 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3204,6 +3204,9 @@ enum rte_flow_field_id {
 };
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
  * Field description for MODIFY_FIELD action.
  */
 struct rte_flow_action_modify_data {
-- 
2.18.1



* [dpdk-dev] [PATCH v4 3/5] app/testpmd: update modify field flow action support
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 2/5] ethdev: fix missed experimental tag for modify field action Viacheslav Ovsiienko
@ 2021-10-12 10:49   ` Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: fix hex string parser in flow commands Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 5/5] net/mlx5: update modify field action Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

The testpmd flow create command is updated as follows:

  - the modify field action follows the updated action definition
  - pointer type is added for the action source field
  - pointer and value source fields take a hex string
    instead of an unsigned int in host endianness

Here are some examples of flows with the updated modify
field action:

1. IPv6 destination address bytes 4-7 assignment:
   0000::1111 -> 0000:xxxx:4455:6677::1111

   flow create 0 egress group 1
     pattern eth / ipv6 dst is 0000::1111 / udp / end
     actions modify_field op set
             dst_type ipv6_dst
             dst_offset 32
             src_type value
             src_value 0011223344556677
             width 32 / end

2. Copy second byte of IPv4 destination address to the
   third byte of source address:
     10.0.118.4 -> 192.168.100.1
     10.0.168.4 -> 192.168.100.1

   flow create 0 egress group 1
     pattern eth / ipv4 / udp / end
     actions modify_field op set
             dst_type ipv4_src
             dst_offset 16
             src_type ipv4_dst
             src_offset 8
             width 8 / end

3. Assign the value 11223344 to METADATA, taken from the
   hex string in the linear buffer. Please note that the value
   must be given in host byte order; the example is given
   for x86 (little-endian):

   flow create 0 egress group 1
     pattern eth / ipv4 / end
     actions modify_field op set
             dst_type meta
             src_type pointer
             src_ptr 44332211
             width 32 / end

4. Assign destination MAC with EA:11:0B:AD:0B:ED value:

   flow create 0 egress group 1
     pattern eth / end
     actions modify_field op set
             dst_type mac_dst
             src_type value
             src_value EA110BAD0BED
             width 48 / end
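
For example 3, the src_ptr hex string is the in-memory image of the
32-bit metadata value, so it depends on the host endianness. A small
sketch of how such a string could be produced (u32_to_src_ptr_hex is
a hypothetical helper, not part of testpmd):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render a 32-bit value as the hex string of its
 * bytes in host memory order. `out` must hold 8 digits plus the NUL.
 */
static void
u32_to_src_ptr_hex(uint32_t v, char out[9])
{
	uint8_t raw[sizeof(v)];
	size_t i;

	/* Take the bytes exactly as they are stored in memory. */
	memcpy(raw, &v, sizeof(v));
	for (i = 0; i < sizeof(raw); i++)
		snprintf(out + 2 * i, 3, "%02x", raw[i]);
}
```

On a little-endian host the value 0x11223344 is stored as the bytes
44 33 22 11, which is why the example passes "src_ptr 44332211".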

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 55 +++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 17 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bb22294dd3..736029c4fd 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -448,6 +448,7 @@ enum index {
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
+	ACTION_MODIFY_FIELD_SRC_POINTER,
 	ACTION_MODIFY_FIELD_WIDTH,
 	ACTION_CONNTRACK,
 	ACTION_CONNTRACK_UPDATE,
@@ -468,6 +469,14 @@ enum index {
 #define ITEM_RAW_SIZE \
 	(sizeof(struct rte_flow_item_raw) + ITEM_RAW_PATTERN_SIZE)
 
+/** Maximum size for external pattern in struct rte_flow_action_modify_data. */
+#define ACTION_MODIFY_PATTERN_SIZE 32
+
+/** Storage size for struct rte_flow_action_modify_field including pattern. */
+#define ACTION_MODIFY_SIZE \
+	(sizeof(struct rte_flow_action_modify_field) + \
+	ACTION_MODIFY_PATTERN_SIZE)
+
 /** Maximum number of queue indices in struct rte_flow_action_rss. */
 #define ACTION_RSS_QUEUE_NUM 128
 
@@ -1704,6 +1713,7 @@ static const enum index action_modify_field_src[] = {
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
+	ACTION_MODIFY_FIELD_SRC_POINTER,
 	ACTION_MODIFY_FIELD_WIDTH,
 	ZERO,
 };
@@ -4455,8 +4465,7 @@ static const struct token token_list[] = {
 	[ACTION_MODIFY_FIELD] = {
 		.name = "modify_field",
 		.help = "modify destination field with data from source field",
-		.priv = PRIV_ACTION(MODIFY_FIELD,
-			sizeof(struct rte_flow_action_modify_field)),
+		.priv = PRIV_ACTION(MODIFY_FIELD, ACTION_MODIFY_SIZE),
 		.next = NEXT(NEXT_ENTRY(ACTION_MODIFY_FIELD_OP)),
 		.call = parse_vc,
 	},
@@ -4539,11 +4548,26 @@ static const struct token token_list[] = {
 		.name = "src_value",
 		.help = "source immediate value",
 		.next = NEXT(NEXT_ENTRY(ACTION_MODIFY_FIELD_WIDTH),
-			NEXT_ENTRY(COMMON_UNSIGNED)),
-		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+			     NEXT_ENTRY(COMMON_HEX)),
+		.args = ARGS(ARGS_ENTRY_ARB(0, 0),
+			     ARGS_ENTRY_ARB(0, 0),
+			     ARGS_ENTRY(struct rte_flow_action_modify_field,
 					src.value)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_MODIFY_FIELD_SRC_POINTER] = {
+		.name = "src_ptr",
+		.help = "pointer to source immediate value",
+		.next = NEXT(NEXT_ENTRY(ACTION_MODIFY_FIELD_WIDTH),
+			     NEXT_ENTRY(COMMON_HEX)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					src.pvalue),
+			     ARGS_ENTRY_ARB(0, 0),
+			     ARGS_ENTRY_ARB
+				(sizeof(struct rte_flow_action_modify_field),
+				 ACTION_MODIFY_PATTERN_SIZE)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_MODIFY_FIELD_WIDTH] = {
 		.name = "width",
 		.help = "number of bits to copy",
@@ -7830,15 +7854,11 @@ static int
 comp_set_modify_field_op(struct context *ctx, const struct token *token,
 		   unsigned int ent, char *buf, unsigned int size)
 {
-	uint16_t idx = 0;
-
 	RTE_SET_USED(ctx);
 	RTE_SET_USED(token);
-	for (idx = 0; modify_field_ops[idx]; ++idx)
-		;
 	if (!buf)
-		return idx + 1;
-	if (ent < idx)
+		return RTE_DIM(modify_field_ops);
+	if (ent < RTE_DIM(modify_field_ops) - 1)
 		return strlcpy(buf, modify_field_ops[ent], size);
 	return -1;
 }
@@ -7848,16 +7868,17 @@ static int
 comp_set_modify_field_id(struct context *ctx, const struct token *token,
 		   unsigned int ent, char *buf, unsigned int size)
 {
-	uint16_t idx = 0;
+	const char *name;
 
-	RTE_SET_USED(ctx);
 	RTE_SET_USED(token);
-	for (idx = 0; modify_field_ids[idx]; ++idx)
-		;
 	if (!buf)
-		return idx + 1;
-	if (ent < idx)
-		return strlcpy(buf, modify_field_ids[ent], size);
+		return RTE_DIM(modify_field_ids);
+	if (ent >= RTE_DIM(modify_field_ids) - 1)
+		return -1;
+	name = modify_field_ids[ent];
+	if (ctx->curr == ACTION_MODIFY_FIELD_SRC_TYPE ||
+	    (strcmp(name, "pointer") && strcmp(name, "value")))
+		return strlcpy(buf, name, size);
 	return -1;
 }
 
-- 
2.18.1



* [dpdk-dev] [PATCH v4 4/5] app/testpmd: fix hex string parser in flow commands
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 3/5] app/testpmd: update modify field flow action support Viacheslav Ovsiienko
@ 2021-10-12 10:49   ` Viacheslav Ovsiienko
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 5/5] net/mlx5: update modify field action Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas, stable

The hexadecimal string parser does not check the size of the
target field buffer, so a buffer overflow can happen and
cause an application failure (usually observed as a
segmentation fault).
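
The kind of check being added can be sketched in isolation as follows.
This is a simplified, hypothetical converter (hex_to_bytes is not the
testpmd parse_hex() code) that refuses input whose binary form does
not fit into the destination buffer; rejecting odd-length strings is
a simplification of this sketch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one hex digit to its value, or return -1 if it is not one. */
static int
hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert a hex string into bytes.
 * Returns the number of bytes written, or -1 on malformed input or
 * when the converted data would not fit into `dst_size` bytes.
 */
static int
hex_to_bytes(const char *str, uint8_t *dst, size_t dst_size)
{
	size_t len = strlen(str);
	size_t i;

	/* Check the size before writing anything to the buffer. */
	if (len % 2 != 0 || len / 2 > dst_size)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = hex_digit(str[i]);
		int lo = hex_digit(str[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		dst[i / 2] = (uint8_t)((hi << 4) | lo);
	}
	return (int)(len / 2);
}
```

For a 6-byte MAC buffer, "EA110BAD0BED" converts to 6 bytes, while
a 7-byte string such as "EA110BAD0BED00" is rejected instead of
overflowing the buffer.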

Fixes: 169a9fed1f4c ("app/testpmd: fix hex string parser support for flow API")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 736029c4fd..6827d9228f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -7291,10 +7291,13 @@ parse_hex(struct context *ctx, const struct token *token,
 		hexlen -= 2;
 	}
 	if (hexlen > length)
-		return -1;
+		goto error;
 	ret = parse_hex_string(str, hex_tmp, &hexlen);
 	if (ret < 0)
 		goto error;
+	/* Check the converted binary fits into data buffer. */
+	if (hexlen > size)
+		goto error;
 	/* Let parse_int() fill length information first. */
 	ret = snprintf(tmp, sizeof(tmp), "%u", hexlen);
 	if (ret < 0)
-- 
2.18.1



* [dpdk-dev] [PATCH v4 5/5] net/mlx5: update modify field action
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-12 10:49   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: fix hex string parser in flow commands Viacheslav Ovsiienko
@ 2021-10-12 10:49   ` Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 10:49 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

Update immediate value/pointer source operand support
for modify field RTE Flow action:

  - source operand data can be presented by byte buffer
    (instead of former uint64_t) or by pointer
  - no host byte ordering is assumed anymore for immediate
    data buffer (not uint64_t anymore)
  - no immediate value offset is expected

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_dv.c | 123 +++++++++++---------------------
 1 file changed, 42 insertions(+), 81 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index c6370cd1d6..d13e0d14e9 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1391,7 +1391,7 @@ flow_dv_convert_action_modify_ipv6_dscp
 
 static int
 mlx5_flow_item_field_width(struct mlx5_priv *priv,
-			   enum rte_flow_field_id field)
+			   enum rte_flow_field_id field, int inherit)
 {
 	switch (field) {
 	case RTE_FLOW_FIELD_START:
@@ -1442,7 +1442,8 @@ mlx5_flow_item_field_width(struct mlx5_priv *priv,
 		return __builtin_popcount(priv->sh->dv_meta_mask);
 	case RTE_FLOW_FIELD_POINTER:
 	case RTE_FLOW_FIELD_VALUE:
-		return 64;
+		MLX5_ASSERT(inherit >= 0);
+		return inherit < 0 ? 0 : inherit;
 	default:
 		MLX5_ASSERT(false);
 	}
@@ -1452,17 +1453,14 @@ mlx5_flow_item_field_width(struct mlx5_priv *priv,
 static void
 mlx5_flow_field_id_to_modify_info
 		(const struct rte_flow_action_modify_data *data,
-		 struct field_modify_info *info,
-		 uint32_t *mask, uint32_t *value,
-		 uint32_t width, uint32_t dst_width,
-		 uint32_t *shift, struct rte_eth_dev *dev,
-		 const struct rte_flow_attr *attr,
-		 struct rte_flow_error *error)
+		 struct field_modify_info *info, uint32_t *mask,
+		 uint32_t width, uint32_t *shift, struct rte_eth_dev *dev,
+		 const struct rte_flow_attr *attr, struct rte_flow_error *error)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	uint32_t idx = 0;
 	uint32_t off = 0;
-	uint64_t val = 0;
+
 	switch (data->field) {
 	case RTE_FLOW_FIELD_START:
 		/* not supported yet */
@@ -1472,7 +1470,7 @@ mlx5_flow_field_id_to_modify_info
 		off = data->offset > 16 ? data->offset - 16 : 0;
 		if (mask) {
 			if (data->offset < 16) {
-				info[idx] = (struct field_modify_info){2, 0,
+				info[idx] = (struct field_modify_info){2, 4,
 						MLX5_MODI_OUT_DMAC_15_0};
 				if (width < 16) {
 					mask[idx] = rte_cpu_to_be_16(0xffff >>
@@ -1486,15 +1484,15 @@ mlx5_flow_field_id_to_modify_info
 					break;
 				++idx;
 			}
-			info[idx] = (struct field_modify_info){4, 4 * idx,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_DMAC_47_16};
 			mask[idx] = rte_cpu_to_be_32((0xffffffff >>
 						      (32 - width)) << off);
 		} else {
 			if (data->offset < 16)
-				info[idx++] = (struct field_modify_info){2, 0,
+				info[idx++] = (struct field_modify_info){2, 4,
 						MLX5_MODI_OUT_DMAC_15_0};
-			info[idx] = (struct field_modify_info){4, off,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_DMAC_47_16};
 		}
 		break;
@@ -1502,7 +1500,7 @@ mlx5_flow_field_id_to_modify_info
 		off = data->offset > 16 ? data->offset - 16 : 0;
 		if (mask) {
 			if (data->offset < 16) {
-				info[idx] = (struct field_modify_info){2, 0,
+				info[idx] = (struct field_modify_info){2, 4,
 						MLX5_MODI_OUT_SMAC_15_0};
 				if (width < 16) {
 					mask[idx] = rte_cpu_to_be_16(0xffff >>
@@ -1516,15 +1514,15 @@ mlx5_flow_field_id_to_modify_info
 					break;
 				++idx;
 			}
-			info[idx] = (struct field_modify_info){4, 4 * idx,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_SMAC_47_16};
 			mask[idx] = rte_cpu_to_be_32((0xffffffff >>
 						      (32 - width)) << off);
 		} else {
 			if (data->offset < 16)
-				info[idx++] = (struct field_modify_info){2, 0,
+				info[idx++] = (struct field_modify_info){2, 4,
 						MLX5_MODI_OUT_SMAC_15_0};
-			info[idx] = (struct field_modify_info){4, off,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_SMAC_47_16};
 		}
 		break;
@@ -1584,8 +1582,7 @@ mlx5_flow_field_id_to_modify_info
 	case RTE_FLOW_FIELD_IPV6_SRC:
 		if (mask) {
 			if (data->offset < 32) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 12,
 						MLX5_MODI_OUT_SIPV6_31_0};
 				if (width < 32) {
 					mask[idx] =
@@ -1601,8 +1598,7 @@ mlx5_flow_field_id_to_modify_info
 				++idx;
 			}
 			if (data->offset < 64) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 8,
 						MLX5_MODI_OUT_SIPV6_63_32};
 				if (width < 32) {
 					mask[idx] =
@@ -1618,8 +1614,7 @@ mlx5_flow_field_id_to_modify_info
 				++idx;
 			}
 			if (data->offset < 96) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 4,
 						MLX5_MODI_OUT_SIPV6_95_64};
 				if (width < 32) {
 					mask[idx] =
@@ -1634,19 +1629,19 @@ mlx5_flow_field_id_to_modify_info
 					break;
 				++idx;
 			}
-			info[idx] = (struct field_modify_info){4, 4 * idx,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_SIPV6_127_96};
 			mask[idx] = rte_cpu_to_be_32(0xffffffff >>
 						     (32 - width));
 		} else {
 			if (data->offset < 32)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 12,
 						MLX5_MODI_OUT_SIPV6_31_0};
 			if (data->offset < 64)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 8,
 						MLX5_MODI_OUT_SIPV6_63_32};
 			if (data->offset < 96)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 4,
 						MLX5_MODI_OUT_SIPV6_95_64};
 			if (data->offset < 128)
 				info[idx++] = (struct field_modify_info){4, 0,
@@ -1656,8 +1651,7 @@ mlx5_flow_field_id_to_modify_info
 	case RTE_FLOW_FIELD_IPV6_DST:
 		if (mask) {
 			if (data->offset < 32) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 12,
 						MLX5_MODI_OUT_DIPV6_31_0};
 				if (width < 32) {
 					mask[idx] =
@@ -1673,8 +1667,7 @@ mlx5_flow_field_id_to_modify_info
 				++idx;
 			}
 			if (data->offset < 64) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 8,
 						MLX5_MODI_OUT_DIPV6_63_32};
 				if (width < 32) {
 					mask[idx] =
@@ -1690,8 +1683,7 @@ mlx5_flow_field_id_to_modify_info
 				++idx;
 			}
 			if (data->offset < 96) {
-				info[idx] = (struct field_modify_info){4,
-						4 * idx,
+				info[idx] = (struct field_modify_info){4, 4,
 						MLX5_MODI_OUT_DIPV6_95_64};
 				if (width < 32) {
 					mask[idx] =
@@ -1706,19 +1698,19 @@ mlx5_flow_field_id_to_modify_info
 					break;
 				++idx;
 			}
-			info[idx] = (struct field_modify_info){4, 4 * idx,
+			info[idx] = (struct field_modify_info){4, 0,
 						MLX5_MODI_OUT_DIPV6_127_96};
 			mask[idx] = rte_cpu_to_be_32(0xffffffff >>
 						     (32 - width));
 		} else {
 			if (data->offset < 32)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 12,
 						MLX5_MODI_OUT_DIPV6_31_0};
 			if (data->offset < 64)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 8,
 						MLX5_MODI_OUT_DIPV6_63_32};
 			if (data->offset < 96)
-				info[idx++] = (struct field_modify_info){4, 0,
+				info[idx++] = (struct field_modify_info){4, 4,
 						MLX5_MODI_OUT_DIPV6_95_64};
 			if (data->offset < 128)
 				info[idx++] = (struct field_modify_info){4, 0,
@@ -1838,35 +1830,6 @@ mlx5_flow_field_id_to_modify_info
 		break;
 	case RTE_FLOW_FIELD_POINTER:
 	case RTE_FLOW_FIELD_VALUE:
-		if (data->field == RTE_FLOW_FIELD_POINTER)
-			memcpy(&val, (void *)(uintptr_t)data->value,
-			       sizeof(uint64_t));
-		else
-			val = data->value;
-		for (idx = 0; idx < MLX5_ACT_MAX_MOD_FIELDS; idx++) {
-			if (mask[idx]) {
-				if (dst_width == 48) {
-					/*special case for MAC addresses */
-					value[idx] = rte_cpu_to_be_16(val);
-					val >>= 16;
-					dst_width -= 16;
-				} else if (dst_width > 16) {
-					value[idx] = rte_cpu_to_be_32(val);
-					val >>= 32;
-				} else if (dst_width > 8) {
-					value[idx] = rte_cpu_to_be_16(val);
-					val >>= 16;
-				} else {
-					value[idx] = (uint8_t)val;
-					val >>= 8;
-				}
-				if (*shift)
-					value[idx] <<= *shift;
-				if (!val)
-					break;
-			}
-		}
-		break;
 	default:
 		MLX5_ASSERT(false);
 		break;
@@ -1907,33 +1870,31 @@ flow_dv_convert_action_modify_field
 	struct field_modify_info dcopy[MLX5_ACT_MAX_MOD_FIELDS] = {
 								{0, 0, 0} };
 	uint32_t mask[MLX5_ACT_MAX_MOD_FIELDS] = {0, 0, 0, 0, 0};
-	uint32_t value[MLX5_ACT_MAX_MOD_FIELDS] = {0, 0, 0, 0, 0};
 	uint32_t type;
 	uint32_t shift = 0;
-	uint32_t dst_width = mlx5_flow_item_field_width(priv, conf->dst.field);
+	uint32_t dst_width;
 
+	dst_width = mlx5_flow_item_field_width(priv, conf->dst.field, -1);
 	if (conf->src.field == RTE_FLOW_FIELD_POINTER ||
 		conf->src.field == RTE_FLOW_FIELD_VALUE) {
 		type = MLX5_MODIFICATION_TYPE_SET;
 		/** For SET fill the destination field (field) first. */
 		mlx5_flow_field_id_to_modify_info(&conf->dst, field, mask,
-						  value, conf->width, dst_width,
-						  &shift, dev, attr, error);
-		/** Then copy immediate value from source as per mask. */
-		mlx5_flow_field_id_to_modify_info(&conf->src, dcopy, mask,
-						  value, conf->width, dst_width,
-						  &shift, dev, attr, error);
-		item.spec = &value;
+						  conf->width, &shift, dev,
+						  attr, error);
+		item.spec = conf->src.field == RTE_FLOW_FIELD_POINTER ?
+					(void *)(uintptr_t)conf->src.pvalue :
+					(void *)(uintptr_t)&conf->src.value;
 	} else {
 		type = MLX5_MODIFICATION_TYPE_COPY;
 		/** For COPY fill the destination field (dcopy) without mask. */
 		mlx5_flow_field_id_to_modify_info(&conf->dst, dcopy, NULL,
-						  value, conf->width, dst_width,
-						  &shift, dev, attr, error);
+						  conf->width, &shift, dev,
+						  attr, error);
 		/** Then construct the source field (field) with mask. */
 		mlx5_flow_field_id_to_modify_info(&conf->src, field, mask,
-						  value, conf->width, dst_width,
-						  &shift, dev, attr, error);
+						  conf->width, &shift,
+						  dev, attr, error);
 	}
 	item.mask = &mask;
 	return flow_dv_convert_modify_action(&item,
@@ -4874,9 +4835,9 @@ flow_dv_validate_action_modify_field(struct rte_eth_dev *dev,
 	const struct rte_flow_action_modify_field *action_modify_field =
 		action->conf;
 	uint32_t dst_width = mlx5_flow_item_field_width(priv,
-				action_modify_field->dst.field);
+				action_modify_field->dst.field, -1);
 	uint32_t src_width = mlx5_flow_item_field_width(priv,
-				action_modify_field->src.field);
+				action_modify_field->src.field, dst_width);
 
 	ret = flow_dv_validate_action_modify_hdr(action_flags, action, error);
 	if (ret)
-- 
2.18.1



* [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (5 preceding siblings ...)
  2021-10-12 10:49 ` [dpdk-dev] [PATCH v4 0/5] ethdev: update modify field flow action Viacheslav Ovsiienko
@ 2021-10-12 11:32 ` Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
                     ` (4 more replies)
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (3 subsequent siblings)
  10 siblings, 5 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving quickly: network structures are getting
more and more complicated, and new application areas are
emerging. To address these challenges, new network protocols
are continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that multiple
new items have been introduced over the years since
the initial release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols; for example, many hardware vendors are implementing
flexible and configurable network protocol parsers. As DPDK
developers, could we anticipate the near future in the same
fashion and introduce similar flexibility in the RTE Flow API?

Let's check what is already merged in our project: there is
the raw item (rte_flow_item_raw). At first glance it looks
well suited, and we can try to implement flow matching on the
header of some relatively new tunnel protocol, say the GENEVE
header with variable length options. On further consideration,
however, we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols
like the aforementioned GENEVE with variable extra protocol
options using the raw flow item becomes very complicated and
would require multiple flows and multiple raw items chained in
the same flow (by the way, no support for chained raw items
was found in the implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new
network protocol with RTE Flows. What is given within the
protocol specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification. The call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the object that has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed, an application
can use a dummy type field, which is just a kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much more suitable for
the application to maintain the single structure within the
single pattern buffer.
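
To illustrate the bit concatenation described above, here is a minimal
sketch of how an application might fill the single pattern buffer field
by field. The helper name flex_pattern_put() is hypothetical and not part
of the proposed API, and the most-significant-bit-first order within each
byte is an assumption, since the text above does not fix the bit order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): write the low `width` bits of
 * `value` into `buf` starting at bit offset `bit_off`, most significant
 * bit first (assumed bit order). Returns the bit offset just past the
 * written field, so consecutive fields can be chained.
 */
static size_t
flex_pattern_put(uint8_t *buf, size_t bit_off, unsigned int width,
		 uint64_t value)
{
	for (unsigned int i = 0; i < width; i++) {
		/* Walk the value from its most significant bit down. */
		unsigned int bit = (value >> (width - 1 - i)) & 1u;
		size_t pos = bit_off + i;
		uint8_t mask = (uint8_t)(0x80u >> (pos % 8));

		if (bit)
			buf[pos / 8] |= mask;
		else
			buf[pos / 8] &= (uint8_t)~mask;
	}
	return bit_off + width;
}
```

A dummy field is then just a call with a zero value, which keeps the
following field at the intended bit position.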

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like the EtherType field in the L2 (MAC)
header, or the proto field in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.
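
The offset modes above can be sketched in plain C as follows. This is
only an illustration under stated assumptions: the enum and function
names are local to the sketch, the value fetched from the packet at
offset_base is passed in as `indirect`, and the shift is assumed to be
applied before field_base is added.

```c
#include <stdint.h>

/* Local mirror of the offset modes described above (illustration only). */
enum field_mode {
	MODE_FIXED,
	MODE_OFFSET,
	MODE_BITMASK,
};

/* Count the one bits in v. */
static unsigned int
bitcount32(uint32_t v)
{
	unsigned int n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/*
 * Compute the resulting field offset in bits.
 * indirect: value already fetched from the packet at offset_base,
 * i.e. the "*offset_base" term of the formulas.
 */
static int32_t
flex_field_offset(enum field_mode mode, int32_t field_base,
		  uint32_t indirect, uint32_t offset_mask,
		  unsigned int offset_shift)
{
	switch (mode) {
	case MODE_OFFSET:
		return field_base +
		       (int32_t)((indirect & offset_mask) << offset_shift);
	case MODE_BITMASK:
		return field_base +
		       (int32_t)(bitcount32(indirect & offset_mask) <<
				 offset_shift);
	case MODE_FIXED:
	default:
		return field_base;
	}
}
```

For the IPv4 case mentioned above, the first header byte 0x45 masked with
0x0F gives 5 dwords, and a shift of 5 converts dwords to bits, so the
next header starts 160 bits (20 bytes) from the header beginning.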

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
Input link specifies the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in a chained fashion - two or more flex items within the
same flow, possibly neighbors in the pattern - the flex items
reference each other. In this case, the item that occurs first
should be created with an empty output link list or with a list
including only existing items, and then the second flex item
should be created referencing the first flex item as an input
arc; drivers should adjust the item configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage non-overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[];     /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .tunnel = FLEX_TUNNEL_MODE_SINGLE,
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    /* Array of field descriptors, built from field0 and field1 */
    .sample_data = (struct rte_flow_item_flex_field[]){
       field0,
       field1,
    },
    .nb_samples = 2,
    .input_link = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
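
As a sanity check of the example above, the following sketch shows why
the dummy field is 96 bits wide: it covers the three dwords preceding
`crucial`, so the pattern buffer can simply be the header structure
itself. The stand-in structure uses uint32_t instead of the big-endian
type and omits the variable length options; the enum and function names
are local to this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the example header (fixed part only). */
struct new_protocol_header {
	uint32_t header_length;
	uint32_t specific0;
	uint32_t specific1;
	uint32_t crucial;
};

enum {
	DUMMY_FIELD_BITS = 96,   /* field0: gap filler, not matched */
	CRUCIAL_FIELD_BITS = 32, /* field1: the field to match */
};

/* Bit offset of `crucial` from the header beginning. */
static size_t
crucial_bit_offset(void)
{
	return offsetof(struct new_protocol_header, crucial) * 8;
}

/* Pattern length in bytes implied by the two configured fields. */
static size_t
pattern_bytes(void)
{
	return (DUMMY_FIELD_BITS + CRUCIAL_FIELD_BITS) / 8;
}
```

Both values line up with the structure: `crucial` starts at bit 96 and
the two fields together span the 16 bytes of the fixed header part.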

7. Notes:

 - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
 - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

 - v3 -> v4:
   - comments addressed
   - testpmd compilation issues fixed
   - typos fixed

 - v2 -> v3:
   - comments addressed
   - flex item update removed as not supported
   - RSS over flex item fields removed as not supported and non-complete
     API
   - tunnel mode configuration refactored
   - testpmd updated
   - documentation updated
   - PMD patches are removed temporarily (update is WIP, to be presented in rc2)

 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (4):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API
  app/testpmd: add jansson library
  app/testpmd: add flex item CLI commands

Viacheslav Ovsiienko (1):
  ethdev: introduce configurable flexible item

 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 764 +++++++++++++++++++-
 app/test-pmd/meson.build                    |   5 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  19 +
 doc/guides/prog_guide/rte_flow.rst          |  25 +
 doc/guides/rel_notes/release_21_11.rst      |   7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 lib/ethdev/rte_flow.c                       | 123 +++-
 lib/ethdev/rte_flow.h                       | 222 ++++++
 lib/ethdev/rte_flow_driver.h                |   8 +
 lib/ethdev/version.map                      |   4 +
 12 files changed, 1285 insertions(+), 15 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v4 1/5] ethdev: introduce configurable flexible item
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-12 11:32   ` Viacheslav Ovsiienko
  2021-10-12 11:42     ` Ori Kam
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast and wide, network structures are
getting more and more complicated, and new application areas
are emerging. To address these challenges, new network
protocols are continuously being developed, considered by
technical communities, adopted by industry and, eventually,
implemented in hardware and software. The DPDK framework
follows the common trends, and a glance at the RTE Flow API
header shows that multiple new items have been introduced
over the years since the initial release.

The adoption and implementation of a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming
network protocols; for example, many hardware vendors are
implementing flexible and configurable network protocol
parsers. As DPDK developers, could we anticipate the near
future in the same fashion and introduce similar flexibility
in the RTE Flow API?

Let's check what we already have merged in our project: there
is the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement flow matching on the
header of some relatively new tunnel protocol, say the GENEVE
header with variable length options. Under further
consideration, however, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols like
the aforementioned GENEVE with variable length options using
the flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much more suitable for
the application to maintain the single structure within the
single pattern buffer.
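
Since the pattern is the concatenation of all configured fields and the
length is given in bytes, an application could derive the length from
the configured field widths. The sketch below is an illustration only:
the function name is hypothetical, and rounding up to a whole byte is an
assumption for configurations whose total width is not byte aligned.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pattern buffer length in bytes for a set of
 * fields whose widths are given in bits, in configuration order.
 */
static uint32_t
flex_pattern_length(const uint32_t *field_bits, unsigned int nb_fields)
{
	uint32_t total_bits = 0;

	for (unsigned int i = 0; i < nb_fields; i++)
		total_bits += field_bits[i];
	/* Round up to a whole number of bytes (assumed behaviour). */
	return (total_bits + 7) / 8;
}
```

For a 96-bit dummy field followed by a 32-bit field this yields 16 bytes.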

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like the EtherType field in the L2 (MAC)
header, or the proto field in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
Input link specifies the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.
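
The input link selection can be pictured as a masked comparison against
the preceding header. The sketch below is a simplified model, not driver
code: the structure and function names are hypothetical, only the UDP
destination port is considered, and ports are kept in host byte order
for readability.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified input link: UDP destination port only. */
struct input_link_udp {
	uint16_t dst_port_spec; /* value to match */
	uint16_t dst_port_mask; /* bits taking part in the match */
};

/* Return true if the packet's UDP destination port satisfies the link. */
static bool
input_link_matches(const struct input_link_udp *link, uint16_t pkt_dst_port)
{
	return (pkt_dst_port & link->dst_port_mask) ==
	       (link->dst_port_spec & link->dst_port_mask);
}

/* Return the index of the first matching link, or -1 if none matches. */
static int
select_input_link(const struct input_link_udp *links, unsigned int nb_links,
		  uint16_t pkt_dst_port)
{
	for (unsigned int i = 0; i < nb_links; i++)
		if (input_link_matches(&links[i], pkt_dst_port))
			return (int)i;
	return -1;
}
```

With one link for port 6081 (GENEVE) and another for port 0xFADE, a
packet to either port selects the corresponding link, and any other
port selects none.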

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in a chained fashion - two or more flex items within the
same flow, possibly neighbors in the pattern - the flex items
reference each other. In this case, the item that occurs first
should be created with an empty output link list or with a list
including only existing items, and then the second flex item
should be created referencing the first flex item as an input
arc; drivers should adjust the item configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage non-overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[];     /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .tunnel = FLEX_TUNNEL_MODE_SINGLE,
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    /* Array of field descriptors, built from field0 and field1 */
    .sample_data = (struct rte_flow_item_flex_field[]){
       field0,
       field1,
    },
    .nb_samples = 2,
    .input_link = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  25 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_flow.h                  | 222 +++++++++++++++++++++++++
 3 files changed, 254 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..495d08a6a9 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,31 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches with the custom network protocol header that was created
+using rte_flow_flex_item_create() API. The application describes
+the desired header structure, defines the header fields attributes
+and header relations with preceding and following protocols and
+configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  considered by the driver as padded with trailing zeroes till the full
+  configured item pattern length.
+- ``pattern``: pattern to match. The pattern is concatenation of bit fields
+  configured at item creation. At configuration the fields are presented
+  by sample_data array. The order of the bitfields is defined by the order
+  of sample_data elements. The width of each bitfield is defined by the width
+  specified in the corresponding sample_data element as well. If pattern
+  length is smaller than configured fields overall length it is considered
+  as padded with trailing zeroes up to full configured length, both for
+  value and mask.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 73e377a007..4b8cac60d4 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..fb226d9f52 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -574,6 +574,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,177 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset from the header beginning.
+ *
+ * The pattern is concatenation of bit fields configured at item creation
+ * by rte_flow_flex_item_create(). At configuration the fields are presented
+ * by sample_data array.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	/**
+	 * The protocol header can be present in the packet only once.
+	 * No multiple flex item flow inclusions (for inner/outer) are allowed.
+	 * No any relations with tunnel protocols are imposed. The drivers
+	 * can optimize hardware resource usage to handle match on single flex
+	 * item of specific type.
+	 */
+	FLEX_TUNNEL_MODE_SINGLE = 0,
+	/**
+	 * Flex item presents outer header only.
+	 */
+	FLEX_TUNNEL_MODE_OUTER,
+	/**
+	 * Flex item presents inner header only.
+	 */
+	FLEX_TUNNEL_MODE_INNER,
+	/**
+	 * Flex item presents either inner or outer header. The driver
+	 * handles as many multiple inners as hardware supports.
+	 */
+	FLEX_TUNNEL_MODE_MULTI,
+	/**
+	 * Flex item presents tunnel protocol header.
+	 */
+	FLEX_TUNNEL_MODE_TUNNEL,
+};
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Field size in bits. */
+	int32_t field_base; /**< Field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint32_t field_id:16; /**< Device hint, for multiple items in flow. */
+	uint32_t reserved:16; /**< Reserved field. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must always be provided.
+	 * A preceding item must also specify the header value/mask to match
+	 * for the link to be taken and the flex item header parsing to start.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * Specifies the flex item and tunnel relations and tells the PMD
+	 * whether flex item can be used for inner, outer or both headers,
+	 * or whether flex item presents the tunnel protocol itself.
+	 */
+	enum rte_flow_item_flex_tunnel_mode tunnel;
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_flex_item. There can be multiple
+	 * fields descriptors, the number should be specified by nb_samples.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t nb_samples;
+	/**
+	 * Input link defines the flex item relation with the preceding
+	 * header. It specifies the preceding item type and provides the
+	 * pattern to match. The flex item continues parsing and provides
+	 * the data for flow matching if there is a match with one of the
+	 * input links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t nb_inputs;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t nb_outputs;
+};
+
 /**
  * Action types.
  *
@@ -4288,6 +4468,48 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH v4 2/5] ethdev: support flow elements with variable length
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
@ 2021-10-12 11:32   ` Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new RTE flow item with variable length is coming (the flex item).
To handle this item, and potentially other new ones with variable
pattern length, in the RTE flow copy and conversion routines, a helper
function is introduced.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c | 83 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 70 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..100983ca59 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,67 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * allow PMD private flow item
+	 * see 5d1bff8fe2
+	 * "ethdev: allow negative values in flow rule types"
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -100,6 +154,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -107,8 +163,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +592,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +695,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v4 3/5] ethdev: implement RTE flex item API
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-12 11:32   ` Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flex item API was introduced in
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in port hardware and to offload flows that match
the custom header elements.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 40 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h |  8 ++++++++
 lib/ethdev/version.map       |  4 ++++
 3 files changed, 52 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 100983ca59..a858dc31e3 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1323,3 +1323,43 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..34a5a5bcd0 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,14 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..ec3b66d7a1 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,10 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v4 4/5] app/testpmd: add jansson library
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-12 11:32   ` Viacheslav Ovsiienko
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Testpmd interactive mode provides CLI to configure application
commands. Testpmd reads CLI command and parameters from STDIN, and
converts input into C objects with internal parser.
The patch adds jansson dependency to testpmd.
With jansson, testpmd can read input in JSON format from STDIN or input
file and convert it into C object using jansson library calls.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/meson.build | 5 +++++
 app/test-pmd/testpmd.h   | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..3a8babd604 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 5863b2f43f..876a341cf0 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
-- 
2.18.1



* [dpdk-dev] [PATCH v4 5/5] app/testpmd: add flex item CLI commands
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-12 11:32   ` Viacheslav Ovsiienko
  4 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 11:32 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with fixed number of
supported network protocols. If application must work with a
protocol that is not included in the port hardware by default, it
can try to add the new protocol to port hardware.

Flex item or flex parser is port infrastructure that allows
application to add support for a custom network header and
offload flows to match the header elements.

Application must complete the following tasks to create a flow
rule that matches custom header:

1. Create flex item object in port hardware.
Application must provide custom header configuration to PMD.
PMD will use that configuration to create flex item object in
port hardware.

2. Create flex patterns to match. Flex pattern has a spec and a mask
components, like a regular flow item. Combined together, spec and mask
can target unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handle value.

3. Create a flow rule with a flex flow item that references
the flex pattern.

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...

The patch works with the jansson library API.
Jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 764 +++++++++++++++++++-
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  16 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 5 files changed, 901 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index a9efd027c3..a673e6ef08 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17822,6 +17822,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bb22294dd3..b8dd9369b1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,12 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -306,6 +314,9 @@ enum index {
 	ITEM_POL_PORT,
 	ITEM_POL_METER,
 	ITEM_POL_POLICY,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -844,6 +855,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments*/
 	} args; /**< Command arguments. */
 };
 
@@ -871,6 +887,13 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -1000,6 +1023,7 @@ static const enum index next_item[] = {
 	ITEM_GENEVE_OPT,
 	ITEM_INTEGRITY,
 	ITEM_CONNTRACK,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1368,6 +1392,13 @@ static const enum index item_integrity_lv[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1724,6 +1755,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1840,6 +1874,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -1904,6 +1940,17 @@ static int comp_set_modify_field_op(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
 static int comp_set_modify_field_id(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static void flex_item_create(portid_t port_id, uint16_t flex_id,
+			     const char *filename);
+static void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+
+static struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+static struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -2040,6 +2087,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2056,7 +2117,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2168,6 +2230,41 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3608,6 +3705,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_conntrack, flags)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -6999,6 +7117,43 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7661,6 +7816,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8167,6 +8387,13 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8618,6 +8845,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
@@ -8800,3 +9032,533 @@ cmdline_parse_inst_t cmd_show_set_raw_all = {
 		NULL,
 	},
 };
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strncmp(key, pattern, strlen(key)) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+static void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message);
+	}
+}
+
+static int
+flex_tunnel_parse(json_t *jtun, enum rte_flow_item_flex_tunnel_mode *tunnel)
+{
+	int tun = -1;
+
+	if (json_is_integer(jtun))
+		tun = (int)json_integer_value(jtun);
+	else if (json_is_real(jtun))
+		tun = (int)json_real_value(jtun);
+	else if (json_is_string(jtun)) {
+		const char *mode = json_string_value(jtun);
+
+		if (match_strkey(mode, "FLEX_TUNNEL_MODE_SINGLE"))
+			tun = FLEX_TUNNEL_MODE_SINGLE;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_OUTER"))
+			tun = FLEX_TUNNEL_MODE_OUTER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_INNER"))
+			tun = FLEX_TUNNEL_MODE_INNER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_MULTI"))
+			tun = FLEX_TUNNEL_MODE_MULTI;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_TUNNEL"))
+			tun = FLEX_TUNNEL_MODE_TUNNEL;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	*tunnel = (enum rte_flow_item_flex_tunnel_mode)tun;
+	return 0;
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *pattern, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	snprintf(flow_rule, sizeof(flow_rule), "flow create 0 pattern %s / end", pattern);
+	pattern = flow_rule;
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, pattern, (void *)data, sizeof(data));
+		if (ret > 0) {
+			pattern += ret;
+			while (isspace(*pattern))
+				pattern++;
+		}
+	} while (ret > 0 && strlen(pattern));
+	if (ret >= 0 && !strlen(pattern)) {
+		struct buffer *pbuf = (struct buffer *)(uintptr_t)data;
+		struct rte_flow_item *src = pbuf->args.vc.pattern;
+
+		item->type = src->type;
+		if (src->spec) {
+			ptr = (void *)(uintptr_t)item->spec;
+			memcpy(ptr, src->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->spec = NULL;
+		}
+		if (src->mask) {
+			ptr = (void *)(uintptr_t)item->mask;
+			memcpy(ptr, src->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->mask = NULL;
+		}
+		if (src->last) {
+			ptr = (void *)(uintptr_t)item->last;
+			memcpy(ptr, src->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->last = NULL;
+		}
+		ret = 0;
+	}
+	cmd_flow_context = saved_flow_ctx;
+	return ret;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "tunnel")) {
+			ret = flex_tunnel_parse(jobj, &flex_conf->tunnel);
+			if (ret) {
+				printf("Can't parse tunnel value\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_samples = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_inputs = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_outputs = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+
+	base_size = RTE_ALIGN(sizeof(*conf), sizeof(uintptr_t));
+	samples_size = RTE_ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+			         sizeof(conf->sample_data[0]),
+				 sizeof(uintptr_t));
+	links_size = RTE_ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			       sizeof(conf->input_link[0]),
+			       sizeof(uintptr_t));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	json_error_t json_error;
+	json_t *jroot = NULL;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	jroot = json_load_file(filename, 0, &json_error);
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+	if (jroot)
+		json_decref(jroot);
+}
+
+#else /* RTE_HAS_JANSSON */
+static void flex_item_create(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id,
+			     __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+static void flex_item_destroy(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+void
+port_flex_item_flush(portid_t port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < FLEX_MAX_PARSERS_NUM; i++) {
+		flex_item_destroy(port_id, i);
+		flex_items[port_id][i] = NULL;
+	}
+}
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length does not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..26357bc6e3 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2886,6 +2886,7 @@ close_port(portid_t pid)
 
 		if (is_proc_primary()) {
 			port_flow_flush(pi);
+			port_flex_item_flush(pi);
 			rte_eth_dev_close(pi);
 		}
 	}
@@ -4017,7 +4018,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 876a341cf0..3437d7607d 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -282,6 +282,19 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -306,6 +319,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
@@ -1026,6 +1041,7 @@ uint16_t tx_pkt_set_dynf(uint16_t port_id, __rte_unused uint16_t queue,
 void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_jumbo_frame_offload(portid_t portid);
+void port_flex_item_flush(portid_t port_id);
 
 /*
  * Work-around of a compilation error with ICC on invocations of the
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index bbef706374..4f03efd43f 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5091,3 +5091,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show functions that configure and create a flex item object,
+create a flex pattern, and use it in a flow rule.
+The examples use a 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | +0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | +4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | +8
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | +12
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | +16
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration.
+The header configuration is compiled by testpmd and stored in a variable of
+type ``rte_flow_item_flex_conf``.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item create 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+The flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specifies whether the new header has a fixed or variable length, and the basic
+(minimal) header length value.
+
+If the header length is not fixed, the location in the header of the value that
+completes the length calculation must be added, together with the scale/offset function.
+
+The scale function depends on the port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the type of the following network header.
+
+**Flow match samples.**
+
+Describes locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on the port hardware.
+
+**Input trigger.**
+
+Describes the configuration of the preceding network header.
+
+**Output trigger.**
+
+Describes the conditions that trigger the transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item samples configuration.
+A flex pattern can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20 bytes length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> set flex_pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> set flex_pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1



* Re: [dpdk-dev] [PATCH v4 1/5] ethdev: introduce configurable flexible item
  2021-10-12 11:32   ` [dpdk-dev] [PATCH v4 1/5] " Viacheslav Ovsiienko
@ 2021-10-12 11:42     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-12 11:42 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon

Hi Slava,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Tuesday, October 12, 2021 2:33 PM
> Subject: [dpdk-dev] [PATCH v4 1/5] ethdev: introduce configurable flexible item
> 
> 1. Introduction and Retrospective
> 
> Nowadays the networks are evolving fast and wide, the network structures are getting more and more
> complicated, the new application areas are emerging. To address these challenges the new network
> protocols are continuously being developed, considered by technical communities, adopted by industry
> and, eventually implemented in hardware and software. The DPDK framework follows the common
> trends and if we bother to glance at the RTE Flow API header we see the multiple new items were
> introduced during the last years since the initial release.
> 
> The new protocol adoption and implementation process is not straightforward and takes time, the new
> protocol passes development, consideration, adoption, and implementation phases. The industry tries to
> mitigate and address the forthcoming network protocols, for example, many hardware vendors are
> implementing flexible and configurable network protocol parsers. As DPDK developers, could we
> anticipate the near future in the same fashion and introduce the similar flexibility in RTE Flow API?
> 
> Let's check what we already have merged in our project, and we see the nice raw item
> (rte_flow_item_raw). At the first glance, it looks superior and we can try to implement a flow matching on
> the header of some relatively new tunnel protocol, say on the GENEVE header with variable length
> options. And, under further consideration, we run into the raw item
> limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>   (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>   might be triggered), and actually is not supported
>   by existing PMDs
> - no explicitly specified relations with preceding
>   and following items
> - no tunnel hint support
> 
> As the result, implementing the support for tunnel protocols like aforementioned GENEVE with variable
> extra protocol option with flow raw item becomes very complicated and would require multiple flows and
> multiple raw items chained in the same flow (by the way, there is no support found for chained raw items
> in implemented drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle matches with existing and new
> network protocol headers in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there are the requirements to support the new network protocol with RTE Flows. What is
> given within protocol
> specification:
> 
>   - header format
>   - header length, (can be variable, depending on options)
>   - potential presence of extra options following or included
>     in the header
>   - the relations with preceding protocols. For example,
>     the GENEVE follows UDP, eCPRI can follow either UDP
>     or L2 header
>   - the relations with following protocols. For example,
>     the next layer after tunnel header can be L2 or L3
>   - whether the new protocol is a tunnel and the header
>     is a splitting point between outer and inner layers
> 
> The supposed way to operate with flex item:
> 
>   - application defines the header structures according to
>     protocol specification
> 
>   - application calls rte_flow_flex_item_create() with desired
>     configuration according to the protocol specification, it
>     creates the flex item object over specified ethernet device
>     and prepares PMD and underlying hardware to handle flex
>     item. On item creation call PMD backing the specified
>     ethernet device returns the opaque handle identifying
>     the object that has been created
> 
>   - application uses the rte_flow_item_flex with obtained handle
>     in the flows, the values/masks to match with fields in the
>     header are specified in the flex item per flow as for regular
>     items (except that pattern buffer combines all fields)
> 
>   - flows with flex items match with packets in a regular fashion,
>     the values and masks for the new protocol header match are
>     taken from the flex items in the flows
> 
>   - application destroys flows with flex items
> 
>   - application calls rte_flow_flex_item_release() as part of
>     ethernet device API and destroys the flex item object in
>     PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow pattern like regular RTE flow items and
> provides the mask and value to match with fields of the protocol the item was configured for.
> 
>   struct rte_flow_item_flex {
>     void *handle;
>     uint32_t length;
>     const uint8_t* pattern;
>   };
> 
> The handle is some opaque object maintained on per device basis by underlying driver.
> 
> The protocol header fields are considered as bit fields, all offsets and widths are expressed in bits. The
> pattern is the buffer containing the bit concatenation of all the fields presented at item configuration time,
> in the same order and same amount. If byte boundary alignment is needed an application can use a
> dummy type field, this is just some kind of gap filler.
> 
> The length field specifies the pattern buffer length in bytes and is needed to allow rte_flow_copy()
> operations. The approach of multiple pattern pointers and lengths (per field) was considered and found
> clumsy - it seems to be much more suitable for the application to maintain the single structure within the
> single pattern buffer.
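
To make the length arithmetic above concrete, here is a minimal sketch, assuming the pattern is the plain bit concatenation of the configured fields and the length is rounded up to whole bytes. The helper name sketch_pattern_length is hypothetical and not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the proposed API): total pattern
 * buffer length in bytes for field widths given in bits, concatenated
 * in configuration order.
 */
static uint32_t
sketch_pattern_length(const uint32_t *field_bits, size_t nb_fields)
{
	uint64_t total_bits = 0;
	size_t i;

	assert(field_bits != NULL || nb_fields == 0);
	for (i = 0; i < nb_fields; i++)
		total_bits += field_bits[i];
	/* Round up to whole bytes: the length field is expressed in bytes. */
	return (uint32_t)((total_bits + 7) / 8);
}
```

With a 96-bit dummy field followed by a 32-bit sample field this gives 16 bytes, and a lone 4-bit field would still occupy one byte, which is where a dummy filler field helps keep later fields byte aligned.
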
> 
> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>   - header field descriptors:
>     - next header
>     - next protocol
>     - sample to match
>   - input link descriptors
>   - output link descriptors
> 
> The field descriptors tell the driver and hardware what data should be extracted from the packet and then
> control the packet handling in the flow engine. Besides this, sample fields can be presented to match with
> patterns in the flows. Each field is a bit pattern.
> It has width, offset from the header beginning, mode of offset calculation, and offset related parameters.
> 
> The next header field is special, no data are actually taken from the packet, but its offset is used as a
> pointer to the next header in the packet, in other words the next header offset specifies the size of the
> header being parsed by flex item.
> 
> There is one more special field - next protocol, it specifies where the next protocol identifier is contained
> and packet data sampled from this field will be used to determine the next protocol header type to
> continue packet parsing. The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6
> headers.
> 
> The sample fields are used to represent the data to be sampled from the packet and then matched with
> established flows.
> 
> Several methods are provided to calculate the field offset at runtime, depending on configuration and
> packet content:
> 
>   - FIELD_MODE_FIXED - fixed offset. The bit offset from
>     header beginning is permanent and defined by field_base
>     configuration parameter.
> 
>   - FIELD_MODE_OFFSET - the field bit offset is extracted
>     from other header field (indirect offset field). The
>     resulting field offset to match is calculated as:
> 
>   field_base + (*offset_base & offset_mask) << offset_shift
> 
>     This mode is useful to sample some extra options following
>     the main header with field containing main header length.
>     Also, this mode can be used to calculate offset to the
>     next protocol header, for example - IPv4 header contains
>     the 4-bit field with IPv4 header length expressed in dwords.
>     One more example - this mode would allow us to skip GENEVE
>     header variable length options.
> 
>   - FIELD_MODE_BITMASK - the field bit offset is extracted
>     from other header field (indirect offset field), the latter
>     is considered as bitmask containing some number of one bits,
>     the resulting field offset to match is calculated as:
> 
>   field_base + bitcount(*offset_base & offset_mask) << offset_shift
> 
>     This mode would be useful to skip the GTP header and its
>     extra options with specified flags.
> 
>   - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>     boundary alignment in pattern. Pattern mask and data are
>     ignored in the match. All configuration parameters besides
>     field size and offset are ignored.
> 
>   Note:  "*" - means the indirect field offset is calculated
>   and actual data are extracted from the packet by this
>   offset (like data are fetched by pointer *p from memory).
> 
> The offset mode list can be extended by vendors according to hardware supported options.
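
The offset formulas above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the shift is applied before field_base is added (the formulas as written do not parenthesize it), the value at offset_base has already been read from the packet and is passed in as "indirect", and every sketch_ name is hypothetical rather than part of the proposed API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the field modes named in the cover letter. */
enum sketch_field_mode {
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

/* Count the one bits in v (portable, no compiler builtins). */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/*
 * Resolve a field offset. "indirect" is the value already fetched from
 * the packet at offset_base (the "*offset_base" of the formulas).
 * Assumption: the shift is applied before field_base is added.
 */
static int32_t
sketch_field_offset(enum sketch_field_mode mode, int32_t field_base,
		    uint32_t indirect, uint32_t offset_mask,
		    uint32_t offset_shift)
{
	uint32_t v = indirect & offset_mask;

	assert(offset_shift < 32);
	switch (mode) {
	case SKETCH_MODE_OFFSET:
		return field_base + (int32_t)(v << offset_shift);
	case SKETCH_MODE_BITMASK:
		return field_base +
		       (int32_t)(sketch_bitcount(v) << offset_shift);
	case SKETCH_MODE_FIXED:
	default:
		return field_base;
	}
}
```

For an IPv4-like first byte of 0x45 with offset_mask 0x0F and offset_shift 2, the OFFSET mode yields 20, which matches the IHL example in the text (length expressed in dwords). The unit of the result follows from the configuration; the sketch does not fix it.
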
> 
> The input link configuration section tells the driver after what protocols and at what conditions the flex
> item can follow.
> The input link specifies the preceding header pattern, for example for GENEVE it can be a UDP item specifying
> match on destination port with value 6081. The flex item can follow multiple header types and multiple
> input links should be specified. At flow creation time the item with one of the input link types should
> precede the flex item and driver will select the correct flex item settings, depending on the actual flow
> pattern.
> 
> The output link configuration section tells the driver how to continue packet parsing after the flex item
> protocol.
> If multiple protocols can follow the flex item header the flex item should contain the field with the next
> protocol identifier and the parsing will be continued depending on the data contained in this field in the
> actual packet.
> 
> The flex item fields can participate in RSS hash calculation, the dedicated flag is present in the field
> description to specify what fields should be provided for hashing.
> 
> 5. Flex Item Chaining
> 
> If multiple protocols are to be supported with flex items in a chained fashion (two or more flex items
> within the same flow, possibly neighbors in the pattern), the flex items reference each other.
> In this case, the item that occurs first should be created with an empty output link list or with a list
> including only existing items, and then the second flex item should be created referencing the first flex
> item as its input arc; drivers should adjust the item configuration.
> 
> Also, the hardware resources used by flex items to handle the packet can be limited. If there are multiple
> flex items that are supposed to be used within the same flow it would be nice to provide some hint for the
> driver that these two or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices and these indices from two or more flex items
> supposed to be provided within the same flow should be the same as well. In other words, the field hint
> index specifies the group of fields that can be matched simultaneously within a single flow. If hint indices
> are specified, the driver will try to engage not overlapping hardware resources and provide independent
> handling of the field groups with unique indices. If the hint index is zero the driver assigns resources on its
> own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we have the requirements to handle the new tunnel protocol that follows UDP header with
> destination port 0xFADE and is followed by MAC header. Let the new protocol header format be like this:
> 
>   struct new_protocol_header {
>     rte_be32 header_length; /* length in dwords, including options */
>     rte_be32 specific0;     /* some protocol data, no intention */
>     rte_be32 specific1;     /* to match in flows on these fields */
>     rte_be32 crucial;       /* data of interest, match is needed */
>     rte_be32 options[0];    /* optional protocol data, variable length */
>   };
> 
> The supposed flex item configuration:
> 
>   struct rte_flow_item_flex_field field0 = {
>     .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>     .field_size = 96,                /* three dwords from the beginning */
>   };
>   struct rte_flow_item_flex_field field1 = {
>     .field_mode = FIELD_MODE_FIXED,
>     .field_size = 32,       /* Field size is one dword */
>     .field_base = 96,       /* Skip three dwords from the beginning */
>   };
>   struct rte_flow_item_udp spec0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFADE),
>     }
>   };
>   struct rte_flow_item_udp mask0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFFFF),
>     }
>   };
>   struct rte_flow_item_flex_link link0 = {
>     .item = {
>        .type = RTE_FLOW_ITEM_TYPE_UDP,
>        .spec = &spec0,
>        .mask = &mask0,
>     },
>   };
> 
>   struct rte_flow_item_flex_conf conf = {
>     .tunnel = FLEX_TUNNEL_MODE_SINGLE,
>     .next_header = {
>       .field_mode = FIELD_MODE_OFFSET,
>       .field_base = 0,
>       .offset_base = 0,
>       .offset_mask = 0xFFFFFFFF,
>       .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
>     },
>     .sample = {
>        &field0,
>        &field1,
>     },
>     .nb_samples = 2,
>     .input_link[0] = &link0,
>     .nb_inputs = 1
>   };
> 
> Let's suppose we have created the flex item successfully, and PMD returned the handle 0x123456789A.
> We can use the following item pattern to match the crucial field in the packet with value 0x00112233:
> 
>   struct new_protocol_header spec_pattern =
>   {
>     .crucial = RTE_BE32(0x00112233),
>   };
>   struct new_protocol_header mask_pattern =
>   {
>     .crucial = RTE_BE32(0xFFFFFFFF),
>   };
>   struct rte_flow_item_flex spec_flex = {
>     .handle = 0x123456789A,
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &spec_pattern,
>   };
>   struct rte_flow_item_flex mask_flex = {
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &mask_pattern,
>   };
>   struct rte_flow_item item_to_match = {
>     .type = RTE_FLOW_ITEM_TYPE_FLEX,
>     .spec = &spec_flex,
>     .mask = &mask_flex,
>   };
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---

Acked-by: Ori Kam <orika@nvidia.com>
Thanks,
Ori

^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (6 preceding siblings ...)
  2021-10-12 11:32 ` [dpdk-dev] [PATCH v4 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-12 12:54 ` Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 1/5] " Viacheslav Ovsiienko
                     ` (5 more replies)
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
                   ` (2 subsequent siblings)
  10 siblings, 6 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification; the call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
in the same number. If byte boundary alignment is needed, an
application can use a dummy type field as a gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.
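
To illustrate the bit concatenation described above, here is a
minimal sketch of how an application could build such a pattern
buffer. The helper flex_pattern_put() is hypothetical and is not
part of the proposed API. The most-significant-bit-first packing
order is an assumption made for the illustration only; the actual
bit order is defined by the API and the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of DPDK: append the low `width` bits
 * of `value` to `buf`, starting at bit position `*bit_pos`.
 * Assumption: bits are packed most significant bit first.
 */
static void
flex_pattern_put(uint8_t *buf, size_t buf_len, size_t *bit_pos,
		 uint32_t value, unsigned int width)
{
	assert(width <= 32);
	assert(*bit_pos + width <= buf_len * 8);

	for (unsigned int i = 0; i < width; i++) {
		unsigned int bit = (value >> (width - 1 - i)) & 1u;
		size_t pos = *bit_pos + i;
		uint8_t bit_in_byte = (uint8_t)(0x80u >> (pos % 8));

		if (bit)
			buf[pos / 8] |= bit_in_byte;
		else
			buf[pos / 8] &= (uint8_t)~bit_in_byte;
	}
	/* The next field starts right after this one. */
	*bit_pos += width;
}
```

A 4-bit field followed by a 4-bit dummy filler and a 16-bit field
would be appended with three consecutive calls, and the resulting
bit position divided by 8 gives the value for the length member.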

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

There are several methods to calculate the field offset
at runtime depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).
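
The offset modes listed above can be sketched in plain C as
follows. This is only an illustration: struct field_desc and
flex_field_offset() are hypothetical names, and the `sampled`
argument stands for the value already fetched from the packet
(the "*offset_base" term). The sketch assumes the shift is applied
to the masked value before field_base is added; the formulas above
do not show explicit grouping, so this is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Mode names follow the description above; the enum is illustrative. */
enum field_mode {
	FIELD_MODE_DUMMY,
	FIELD_MODE_FIXED,
	FIELD_MODE_OFFSET,
	FIELD_MODE_BITMASK,
};

/* Hypothetical descriptor holding only the offset related parameters. */
struct field_desc {
	enum field_mode mode;
	uint32_t field_base;
	uint32_t offset_mask;
	uint32_t offset_shift;
};

/* Count the one bits in v (portable replacement for a popcount builtin). */
static uint32_t
bitcount32(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* `sampled` is the value read from the indirect offset field. */
static uint32_t
flex_field_offset(const struct field_desc *f, uint32_t sampled)
{
	assert(f->offset_shift < 32);

	switch (f->mode) {
	case FIELD_MODE_OFFSET:
		return f->field_base +
		       ((sampled & f->offset_mask) << f->offset_shift);
	case FIELD_MODE_BITMASK:
		return f->field_base +
		       (bitcount32(sampled & f->offset_mask) << f->offset_shift);
	case FIELD_MODE_FIXED:
	case FIELD_MODE_DUMMY:
	default:
		return f->field_base;
	}
}
```

For the IPv4 case mentioned above, a sampled header length of 5
with offset_mask 0xF and offset_shift 2 yields 20, which is the
length of an IPv4 header without options in bytes.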

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. If the flex item can follow multiple header
types, multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and the driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with
flex items in chained fashion - two or more flex items within
the same flow, possibly neighbors in the pattern -
the flex items reference each other. In this case,
the item that occurs first should be created with an empty
output link list or with a list including existing items,
and then the second flex item should be created referencing
the first flex item as input arc; drivers should adjust
the item configuration.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32 header_length; /* length in dwords, including options */
    rte_be32 specific0;     /* some protocol data, no intention */
    rte_be32 specific1;     /* to match in flows on these fields */
    rte_be32 crucial;       /* data of interest, match is needed */
    rte_be32 options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
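
The example above relies on the pattern structure and the sample
field configuration describing the same layout: field0 (96 dummy
bits) covers the first three dwords and field1 (32 bits at bit
offset 96) covers the crucial member. A sketch of compile-time
checks for this assumption is given below. The be32_t typedef is
a stand-in for the rte_be32 type used in the example, the enum
constants and next_header_offset() are hypothetical names, and
the header_length value is assumed to be already converted to
host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t be32_t;	/* stand-in for the big-endian 32-bit type */

struct new_protocol_header {
	be32_t header_length;	/* length in dwords, including options */
	be32_t specific0;	/* not matched */
	be32_t specific1;	/* not matched */
	be32_t crucial;		/* data of interest */
	be32_t options[];	/* variable length options */
};

/* Widths and base taken from the field0/field1 configuration above. */
enum {
	FIELD0_BITS = 96,	/* dummy field, three dwords */
	FIELD1_BITS = 32,	/* fixed field, one dword */
	FIELD1_BASE = 96,	/* bit offset of field1 in the header */
};

/* field1 must start exactly where the crucial member is located. */
static_assert(offsetof(struct new_protocol_header, crucial) * 8 ==
	      FIELD1_BASE, "field1 base does not match crucial");

/* The concatenated fields must cover the fixed part of the header. */
static_assert(sizeof(struct new_protocol_header) * 8 ==
	      FIELD0_BITS + FIELD1_BITS, "fields do not cover header");

/* next_header setting above: masked header_length shifted left by 2. */
static uint32_t
next_header_offset(uint32_t header_length)
{
	return (header_length & 0xFFFFFFFFu) << 2;
}
```

With a header_length of 4 (no options) the computed next header
offset is 16, the size of the fixed part of the header in bytes.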

7. Notes:

 - v4:  http://patches.dpdk.org/project/dpdk/patch/20211012113235.24975-2-viacheslavo@nvidia.com/ 
 - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
 - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

 - v4 -> v5:
   - comments addressed
   - testpmd compilation issue fixed

 - v3 -> v4:
   - comments addressed
   - testpmd compilation issues fixed
   - typos fixed

 - v2 -> v3:
   - comments addressed
   - flex item update removed as not supported
   - RSS over flex item fields removed as not supported and non-complete
     API
   - tunnel mode configuration refactored
   - testpmd updated
   - documentation updated
   - PMD patches are removed temporarily (update is WIP, to be presented in rc2)

 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (4):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API
  app/testpmd: add jansson library
  app/testpmd: add flex item CLI commands

Viacheslav Ovsiienko (1):
  ethdev: introduce configurable flexible item

 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 764 +++++++++++++++++++-
 app/test-pmd/meson.build                    |   5 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  19 +
 doc/guides/prog_guide/rte_flow.rst          |  25 +
 doc/guides/rel_notes/release_21_11.rst      |   7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 lib/ethdev/rte_flow.c                       | 121 +++-
 lib/ethdev/rte_flow.h                       | 222 ++++++
 lib/ethdev/rte_flow_driver.h                |   8 +
 lib/ethdev/version.map                      |   4 +
 12 files changed, 1283 insertions(+), 15 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v5 1/5] ethdev: introduce configurable flexible item
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-12 12:54   ` Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification; the call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
in the same number. If byte boundary alignment is needed, an
application can use a dummy type field as a gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

There are several methods to calculate the field offset
at runtime depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. If the flex item can follow multiple header
types, multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and the driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with
flex items in chained fashion - two or more flex items within
the same flow, possibly neighbors in the pattern -
the flex items reference each other. In this case,
the item that occurs first should be created with an empty
output link list or with a list including existing items,
and then the second flex item should be created referencing
the first flex item as input arc; drivers should adjust
the item configuration.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32 header_length; /* length in dwords, including options */
    rte_be32 specific0;     /* some protocol data, no intention */
    rte_be32 specific1;     /* to match in flows on these fields */
    rte_be32 crucial;       /* data of interest, match is needed */
    rte_be32 options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  25 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_flow.h                  | 222 +++++++++++++++++++++++++
 3 files changed, 254 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 2b42d5ec8c..495d08a6a9 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,31 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches a custom network protocol header that was created
+using the rte_flow_flex_item_create() API. The application describes
+the desired header structure, defines the header field attributes
+and the header relations with preceding and following protocols, and
+configures the Ethernet devices accordingly via the
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  considered by the driver as padded with trailing zeroes till the full
+  configured item pattern length.
+- ``pattern``: pattern to match. The pattern is concatenation of bit fields
+  configured at item creation. At configuration the fields are presented
+  by sample_data array. The order of the bitfields is defined by the order
+  of sample_data elements. The width of each bitfield is defined by the width
+  specified in the corresponding sample_data element as well. If pattern
+  length is smaller than configured fields overall length it is considered
+  as padded with trailing zeroes up to full configured length, both for
+  value and mask.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 73e377a007..4b8cac60d4 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 7b1ed7f110..fb226d9f52 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -574,6 +574,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,177 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset from the header beginning.
+ *
+ * The pattern is concatenation of bit fields configured at item creation
+ * by rte_flow_flex_item_create(). At configuration the fields are presented
+ * by sample_data array.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + ((*offset_base & offset_mask) << offset_shift)
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + (bitcount(*offset_base & offset_mask) << offset_shift)
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	/**
+	 * The protocol header can be present in the packet only once.
+	 * No multiple flex item flow inclusions (for inner/outer) are allowed.
+	 * No any relations with tunnel protocols are imposed. The drivers
+	 * can optimize hardware resource usage to handle match on single flex
+	 * item of specific type.
+	 */
+	FLEX_TUNNEL_MODE_SINGLE = 0,
+	/**
+	 * Flex item presents outer header only.
+	 */
+	FLEX_TUNNEL_MODE_OUTER,
+	/**
+	 * Flex item presents inner header only.
+	 */
+	FLEX_TUNNEL_MODE_INNER,
+	/**
+	 * Flex item presents either inner or outer header. The driver
+	 * handles as many multiple inners as hardware supports.
+	 */
+	FLEX_TUNNEL_MODE_MULTI,
+	/**
+	 * Flex item presents tunnel protocol header.
+	 */
+	FLEX_TUNNEL_MODE_TUNNEL,
+};
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Field size in bits. */
+	int32_t field_base; /**< Field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint32_t field_id:16; /**< Device hint, for multiple items in flow. */
+	uint32_t reserved:16; /**< Reserved field. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * Specifies the flex item and tunnel relations and tells the PMD
+	 * whether flex item can be used for inner, outer or both headers,
+	 * or whether flex item presents the tunnel protocol itself.
+	 */
+	enum rte_flow_item_flex_tunnel_mode tunnel;
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_flex_item. There can be multiple
+	 * fields descriptors, the number should be specified by nb_samples.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t nb_samples;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specified the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match in case if there is the match with one of input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t nb_inputs;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t nb_outputs;
+};
+
 /**
  * Action types.
  *
@@ -4288,6 +4468,48 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH v5 2/5] ethdev: support flow elements with variable length
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 1/5] " Viacheslav Ovsiienko
@ 2021-10-12 12:54   ` Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new RTE Flow item with variable length is coming: the flex item.
To handle this item (and potentially other new ones with variable
pattern length) in the RTE flow copy and conversion routines,
a helper function is introduced.
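
The idea of copying a fixed-size descriptor followed by its variable-length
payload can be sketched outside DPDK as below. The struct and function names
are hypothetical stand-ins, and, unlike the helper in this patch, the sketch
copies only when the destination is large enough and otherwise just reports
the required size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an item with a variable-length pattern. */
struct var_item {
    uint32_t length;        /* pattern length in bytes */
    const uint8_t *pattern; /* variable-length payload */
};

/*
 * Copy src and its pattern into buf as one contiguous object and fix up
 * the pattern pointer. buf must be suitably aligned for struct var_item.
 * Returns the number of bytes required, whether or not a copy was made.
 */
static size_t
var_item_copy(void *buf, size_t size, const struct var_item *src)
{
    size_t need = sizeof(*src) + src->length;

    if (buf != NULL && size >= need) {
        struct var_item *dst = buf;

        memcpy(dst, src, sizeof(*src));
        if (src->length != 0)
            memcpy(dst + 1, src->pattern, src->length);
        /* The payload now lives right after the fixed-size part. */
        dst->pattern = (const uint8_t *)(dst + 1);
    }
    return need;
}
```

Calling var_item_copy(NULL, 0, &item) first is one way to learn how much
memory the second, real call needs.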

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 lib/ethdev/rte_flow.c | 81 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..051781b440 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,65 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * Allow PMD private flow item
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -100,6 +152,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -107,8 +161,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +590,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +693,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v5 3/5] ethdev: implement RTE flex item API
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 1/5] " Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 2/5] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-12 12:54   ` Viacheslav Ovsiienko
  2021-10-12 14:39     ` Ori Kam
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flex item API was introduced in
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in the port hardware and to offload flows that match
the custom header elements.
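
The dispatch pattern used by the two new entry points (look up the driver
callbacks, report ENOTSUP when a callback is missing, otherwise forward the
call) can be modelled without DPDK as follows. All types and names here are
simplified, hypothetical stand-ins for the rte_flow_ops machinery.

```c
#include <errno.h>
#include <stddef.h>

struct flex_handle { int id; };  /* hypothetical opaque handle */
struct flex_conf { int unused; }; /* hypothetical configuration */

/* Hypothetical driver callback table. */
struct dev_ops {
    struct flex_handle *(*flex_item_create)(const struct flex_conf *conf,
                                            int *err);
    int (*flex_item_release)(const struct flex_handle *handle);
};

/* Generic layer: fail with ENOTSUP when the driver lacks the callback. */
static struct flex_handle *
flex_create(const struct dev_ops *ops, const struct flex_conf *conf, int *err)
{
    if (ops == NULL || ops->flex_item_create == NULL) {
        *err = ENOTSUP;
        return NULL;
    }
    return ops->flex_item_create(conf, err);
}

static int
flex_release(const struct dev_ops *ops, const struct flex_handle *handle)
{
    if (ops == NULL || ops->flex_item_release == NULL)
        return -ENOTSUP;
    return ops->flex_item_release(handle);
}

/* Toy driver used only to exercise the dispatch path. */
static struct flex_handle toy_handle = { 1 };

static struct flex_handle *
toy_create(const struct flex_conf *conf, int *err)
{
    (void)conf;
    *err = 0;
    return &toy_handle;
}

static int
toy_release(const struct flex_handle *handle)
{
    return handle == &toy_handle ? 0 : -EINVAL;
}
```

A driver that does not implement flex items simply leaves both callbacks
NULL, and callers see ENOTSUP instead of a crash.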

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 40 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h |  8 ++++++++
 lib/ethdev/version.map       |  4 ++++
 3 files changed, 52 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 051781b440..8257ed8c97 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1321,3 +1321,43 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..34a5a5bcd0 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,14 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 904bce6ea1..ec3b66d7a1 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -247,6 +247,10 @@ EXPERIMENTAL {
 	rte_mtr_meter_policy_delete;
 	rte_mtr_meter_policy_update;
 	rte_mtr_meter_policy_validate;
+
+	# added in 21.11
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v5 4/5] app/testpmd: add jansson library
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-12 12:54   ` Viacheslav Ovsiienko
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  2021-10-14 16:09   ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Ferruh Yigit
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Testpmd interactive mode provides a CLI to configure application
commands. Testpmd reads CLI commands and parameters from STDIN and
converts the input into C objects with an internal parser.
The patch adds a jansson dependency to testpmd.
With jansson, testpmd can read input in JSON format from STDIN or from an
input file and convert it into C objects using jansson library calls.
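
Since the dependency is optional, code that uses jansson has to be guarded
by the RTE_HAS_JANSSON macro set in the meson file. A minimal sketch of such
a guard is shown below; the helper name is hypothetical, and the fallback
return value of -1 is an assumption made for this illustration only.

```c
#include <stddef.h>
#ifdef RTE_HAS_JANSSON
#include <jansson.h>
#endif

/*
 * Hypothetical helper: returns 1 if text parses as a JSON object,
 * 0 if it does not, and -1 when built without jansson support.
 */
static int
is_json_object(const char *text)
{
#ifdef RTE_HAS_JANSSON
    json_error_t error;
    json_t *root = json_loads(text, 0, &error);
    int ok = root != NULL && json_is_object(root);

    if (root != NULL)
        json_decref(root); /* drop the reference taken by json_loads() */
    return ok;
#else
    (void)text;
    return -1;
#endif
}
```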

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/meson.build | 5 +++++
 app/test-pmd/testpmd.h   | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..3a8babd604 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 5863b2f43f..876a341cf0 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
-- 
2.18.1



* [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 4/5] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-12 12:54   ` Viacheslav Ovsiienko
  2021-10-14 16:42     ` Ferruh Yigit
  2021-10-14 16:09   ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Ferruh Yigit
  5 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-12 12:54 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with a fixed number of
supported network protocols. If an application must work with a
protocol that is not included in the port hardware by default, it
can try to add the new protocol to the port hardware.

A flex item, or flex parser, is port infrastructure that allows
an application to add support for a custom network header and
offload flows that match the header elements.

The application must complete the following tasks to create a flow
rule that matches a custom header:

1. Create a flex item object in the port hardware.
The application must provide the custom header configuration to the PMD.
The PMD will use that configuration to create the flex item object in
the port hardware.

2. Create flex patterns to match. A flex pattern has spec and mask
components, like a regular flow item. Combined together, spec and mask
can target a unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handle value.

3. Create a flow rule with a flex flow item that references
the flex pattern.

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...
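
The <flex_id> given on the command line is a testpmd-local index rather than
the PMD handle; the parser replaces it with the real handle when the flow
rule is built. A simplified model of that lookup, with hypothetical table
sizes and names, could look like this:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS   4 /* hypothetical limit for this sketch */
#define MAX_PARSERS 8 /* hypothetical limit for this sketch */

/* Hypothetical per-port record holding the handle returned by the PMD. */
struct flex_entry {
    void *handle;
};

static struct flex_entry *flex_table[MAX_PORTS][MAX_PARSERS];

/* Translate a CLI index into the stored handle, or NULL if invalid. */
static void *
flex_id_to_handle(uint16_t port, uint16_t flex_id)
{
    if (port >= MAX_PORTS || flex_id >= MAX_PARSERS)
        return NULL; /* index out of range */
    if (flex_table[port][flex_id] == NULL)
        return NULL; /* no flex item created under this index */
    return flex_table[port][flex_id]->handle;
}
```

Both range checks come before the table access, so a bad index typed at the
prompt is rejected instead of reading outside the array.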

The patch works with the jansson library API.
Jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 764 +++++++++++++++++++-
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  16 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++
 5 files changed, 901 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index a9efd027c3..a673e6ef08 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17822,6 +17822,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index bb22294dd3..f7a6febc1d 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,12 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -306,6 +314,9 @@ enum index {
 	ITEM_POL_PORT,
 	ITEM_POL_METER,
 	ITEM_POL_POLICY,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -844,6 +855,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments*/
 	} args; /**< Command arguments. */
 };
 
@@ -871,6 +887,13 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -1000,6 +1023,7 @@ static const enum index next_item[] = {
 	ITEM_GENEVE_OPT,
 	ITEM_INTEGRITY,
 	ITEM_CONNTRACK,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1368,6 +1392,13 @@ static const enum index item_integrity_lv[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1724,6 +1755,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1840,6 +1874,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -1904,6 +1940,17 @@ static int comp_set_modify_field_op(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
 static int comp_set_modify_field_id(struct context *, const struct token *,
 			      unsigned int, char *, unsigned int);
+static void flex_item_create(portid_t port_id, uint16_t flex_id,
+			     const char *filename);
+static void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+
+static struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+static struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
 
 /** Token definitions. */
 static const struct token token_list[] = {
@@ -2040,6 +2087,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2056,7 +2117,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2168,6 +2230,41 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3608,6 +3705,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_conntrack, flags)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -6999,6 +7117,43 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7661,6 +7816,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8167,6 +8387,13 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8618,6 +8845,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
@@ -8800,3 +9032,533 @@ cmdline_parse_inst_t cmd_show_set_raw_all = {
 		NULL,
 	},
 };
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strncmp(key, pattern, strlen(key)) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+static void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message);
+	}
+}
+
+static int
+flex_tunnel_parse(json_t *jtun, enum rte_flow_item_flex_tunnel_mode *tunnel)
+{
+	int tun = -1;
+
+	if (json_is_integer(jtun))
+		tun = (int)json_integer_value(jtun);
+	else if (json_is_real(jtun))
+		tun = (int)json_real_value(jtun);
+	else if (json_is_string(jtun)) {
+		const char *mode = json_string_value(jtun);
+
+		if (match_strkey(mode, "FLEX_TUNNEL_MODE_SINGLE"))
+			tun = FLEX_TUNNEL_MODE_SINGLE;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_OUTER"))
+			tun = FLEX_TUNNEL_MODE_OUTER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_INNER"))
+			tun = FLEX_TUNNEL_MODE_INNER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_MULTI"))
+			tun = FLEX_TUNNEL_MODE_MULTI;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_TUNNEL"))
+			tun = FLEX_TUNNEL_MODE_TUNNEL;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	*tunnel = (enum rte_flow_item_flex_tunnel_mode)tun;
+	return 0;
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *pattern, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	snprintf(flow_rule, sizeof(flow_rule), "flow create 0 pattern %s / end", pattern);
+	pattern = flow_rule;
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, pattern, (void *)data, sizeof(data));
+		if (ret > 0) {
+			pattern += ret;
+			while (isspace(*pattern))
+				pattern++;
+		}
+	} while (ret > 0 && strlen(pattern));
+	if (ret >= 0 && !strlen(pattern)) {
+		struct buffer *pbuf = (struct buffer *)(uintptr_t)data;
+		struct rte_flow_item *src = pbuf->args.vc.pattern;
+
+		item->type = src->type;
+		if (src->spec) {
+			ptr = (void *)(uintptr_t)item->spec;
+			memcpy(ptr, src->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->spec = NULL;
+		}
+		if (src->mask) {
+			ptr = (void *)(uintptr_t)item->mask;
+			memcpy(ptr, src->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->mask = NULL;
+		}
+		if (src->last) {
+			ptr = (void *)(uintptr_t)item->last;
+			memcpy(ptr, src->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+		} else {
+			item->last = NULL;
+		}
+		ret = 0;
+	}
+	cmd_flow_context = saved_flow_ctx;
+	return ret;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "tunnel")) {
+			ret = flex_tunnel_parse(jobj, &flex_conf->tunnel);
+			if (ret) {
+				printf("Can't parse tunnel value\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_samples = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_inputs = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_outputs = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+
+	base_size = RTE_ALIGN(sizeof(*conf), sizeof(uintptr_t));
+	samples_size = RTE_ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+				 sizeof(conf->sample_data[0]),
+				 sizeof(uintptr_t));
+	links_size = RTE_ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			       sizeof(conf->input_link[0]),
+			       sizeof(uintptr_t));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	json_error_t json_error;
+	json_t *jroot = NULL;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	jroot = json_load_file(filename, 0, &json_error);
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+	if (jroot)
+		json_decref(jroot);
+}
+
+#else /* RTE_HAS_JANSSON */
+static void flex_item_create(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id,
+			     __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+static void flex_item_destroy(__rte_unused portid_t port_id,
+			     __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+void
+port_flex_item_flush(portid_t port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < FLEX_MAX_PARSERS_NUM; i++) {
+		flex_item_destroy(port_id, i);
+		flex_items[port_id][i] = NULL;
+	}
+}
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+				 is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length does not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17e..26357bc6e3 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2886,6 +2886,7 @@ close_port(portid_t pid)
 
 		if (is_proc_primary()) {
 			port_flow_flush(pi);
+			port_flex_item_flush(pi);
 			rte_eth_dev_close(pi);
 		}
 	}
@@ -4017,7 +4018,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 876a341cf0..3437d7607d 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -282,6 +282,19 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -306,6 +319,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
@@ -1026,6 +1041,7 @@ uint16_t tx_pkt_set_dynf(uint16_t port_id, __rte_unused uint16_t queue,
 void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_jumbo_frame_offload(portid_t portid);
+void port_flex_item_flush(portid_t port_id);
 
 /*
  * Work-around of a compilation error with ICC on invocations of the
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index bbef706374..4f03efd43f 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5091,3 +5091,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show how to configure and create a flex item object,
+create a flex pattern, and use both in a flow rule.
+The examples use a 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | +0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | +4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | +8
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | +12
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | +16
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration.
+The header configuration is compiled by testpmd and stored in a variable of type
+``rte_flow_item_flex_conf``.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item create 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+The flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specifies whether the new header has a fixed or variable length, and the basic/minimal
+header length value.
+
+If the header length is not fixed, the configuration must also give the location of the
+header field that completes the length calculation, and the scale/offset function.
+
+The scale function depends on the port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the type of the following network header.
+
+**Flow match samples.**
+
+Describes the locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on the port hardware.
+
+**Input trigger.**
+
+Describes the preceding network header configuration.
+
+**Output trigger.**
+
+Describes the conditions that trigger the transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item samples configuration.
+Flex patterns can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20-byte header length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> set flex_pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> set flex_pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1
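
The hexadecimal spec and mask strings taken by the "set flex_pattern"
command in the patch above can be illustrated with a small standalone
sketch. This is not the testpmd code: hex_to_bytes(), masked_match(),
flex_demo_match() and PATTERN_MAX are hypothetical names, and the byte-wise
masked comparison is an assumption about how a spec/mask pair is usually
interpreted in rte_flow. The real matching is done by the PMD and hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed buffer limit; mirrors FLEX_MAX_FLOW_PATTERN_LENGTH in spirit only. */
#define PATTERN_MAX 64

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse a string such as "45FF" into {0x45, 0xFF}.
 * Every byte needs exactly two hex characters.
 * Returns the byte count, or 0 on empty, odd-length, oversized or non-hex input.
 */
size_t hex_to_bytes(const char *str, uint8_t *data, size_t cap)
{
	size_t i, len = strlen(str);

	if (len == 0 || (len & 1) || len / 2 > cap)
		return 0;
	len /= 2;
	for (i = 0; i < len; i++) {
		int hi = hex_nibble(str[2 * i]);
		int lo = hex_nibble(str[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return 0;
		data[i] = (uint8_t)((hi << 4) | lo);
	}
	return len;
}

/* A header matches when every masked header byte equals the masked spec byte. */
int masked_match(const uint8_t *hdr, const uint8_t *spec,
		 const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if ((hdr[i] & mask[i]) != (spec[i] & mask[i]))
			return 0;
	return 1;
}

/* Check the "source address 1.2.3.4" pattern against a sample IPv4 header. */
int flex_demo_match(void)
{
	uint8_t spec[PATTERN_MAX], mask[PATTERN_MAX];
	/* First 16 bytes of a made-up IPv4 header, source address 1.2.3.4. */
	const uint8_t hdr[16] = {
		0x45, 0x00, 0x00, 0x54, 0x12, 0x34, 0x40, 0x00,
		0x40, 0x11, 0xab, 0xcd, 0x01, 0x02, 0x03, 0x04,
	};
	size_t slen = hex_to_bytes("45000000" "00000000" "00000000" "01020304",
				   spec, sizeof(spec));
	size_t mlen = hex_to_bytes("FF000000" "00000000" "00000000" "FFFFFFFF",
				   mask, sizeof(mask));

	assert(slen == mlen); /* spec and mask must cover the same bytes */
	return slen == sizeof(hdr) && masked_match(hdr, spec, mask, slen);
}
```

With the mask above only the first byte (version/IHL) and bytes 12..15
(source address) take part in the comparison, so the TTL, protocol and
checksum bytes of the sample header are ignored.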


* Re: [dpdk-dev] [PATCH v5 3/5] ethdev: implement RTE flex item API
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 3/5] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-12 14:39     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-12 14:39 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon


Hi Slava,

> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, October 12, 2021 3:55 PM
> Subject: [PATCH v5 3/5] ethdev: implement RTE flex item API
> 
> From: Gregory Etelson <getelson@nvidia.com>
> 
> RTE flex item API was introduced in
> "ethdev: introduce configurable flexible item" patch.
> 
> The API allows a DPDK application to define a parser for a custom network header in port hardware
> and to offload flows that match the custom header elements.
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  lib/ethdev/rte_flow.c        | 40 ++++++++++++++++++++++++++++++++++++
>  lib/ethdev/rte_flow_driver.h |  8 ++++++++
>  lib/ethdev/version.map       |  4 ++++
>  3 files changed, 52 insertions(+)
> 
> diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
> index 051781b440..8257ed8c97 100644
> --- a/lib/ethdev/rte_flow.c
> +++ b/lib/ethdev/rte_flow.c
> @@ -1321,3 +1321,43 @@ rte_flow_tunnel_item_release(uint16_t port_id,
>  				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
>  				  NULL, rte_strerror(ENOTSUP));
>  }
> +
> +struct rte_flow_item_flex_handle *
> +rte_flow_flex_item_create(uint16_t port_id,
> +			  const struct rte_flow_item_flex_conf *conf,
> +			  struct rte_flow_error *error)
> +{
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +	struct rte_flow_item_flex_handle *handle;
> +
> +	if (unlikely(!ops))
> +		return NULL;
> +	if (unlikely(!ops->flex_item_create)) {
> +		rte_flow_error_set(error, ENOTSUP,
> +				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +				   NULL, rte_strerror(ENOTSUP));
> +		return NULL;
> +	}
> +	handle = ops->flex_item_create(dev, conf, error);
> +	if (handle == NULL)
> +		flow_err(port_id, -rte_errno, error);
> +	return handle;
> +}
> +
> +int
> +rte_flow_flex_item_release(uint16_t port_id,
> +			   const struct rte_flow_item_flex_handle *handle,
> +			   struct rte_flow_error *error)
> +{
> +	int ret;
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> +
> +	if (unlikely(!ops || !ops->flex_item_release))
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +					  NULL, rte_strerror(ENOTSUP));
> +	ret = ops->flex_item_release(dev, handle, error);
> +	return flow_err(port_id, ret, error);
> +}
> diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
> index 46f62c2ec2..34a5a5bcd0 100644
> --- a/lib/ethdev/rte_flow_driver.h
> +++ b/lib/ethdev/rte_flow_driver.h
> @@ -139,6 +139,14 @@ struct rte_flow_ops {
>  		 struct rte_flow_item *pmd_items,
>  		 uint32_t num_of_items,
>  		 struct rte_flow_error *err);
> +	struct rte_flow_item_flex_handle *(*flex_item_create)
> +		(struct rte_eth_dev *dev,
> +		 const struct rte_flow_item_flex_conf *conf,
> +		 struct rte_flow_error *error);
> +	int (*flex_item_release)
> +		(struct rte_eth_dev *dev,
> +		 const struct rte_flow_item_flex_handle *handle,
> +		 struct rte_flow_error *error);
>  };
> 
>  /**
> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> index 904bce6ea1..ec3b66d7a1 100644
> --- a/lib/ethdev/version.map
> +++ b/lib/ethdev/version.map
> @@ -247,6 +247,10 @@ EXPERIMENTAL {
>  	rte_mtr_meter_policy_delete;
>  	rte_mtr_meter_policy_update;
>  	rte_mtr_meter_policy_validate;
> +
> +	# added in 21.11
> +	rte_flow_flex_item_create;
> +	rte_flow_flex_item_release;
>  };
> 
>  INTERNAL {
> --
> 2.18.1

Acked-by: Ori Kam <orika@nvidia.com>
Best,
Ori
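
The two wrappers in the patch quoted above share one dispatch shape: look
up the driver callback table, report ENOTSUP when the driver does not
provide the callback, and otherwise forward the call. A minimal standalone
sketch of that shape follows. Every name in it (demo_ops, demo_handle,
demo_item_create(), toy_ops and so on) is hypothetical and none of it is
DPDK API; error reporting is reduced to a plain int for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the handle and the driver callback table. */
struct demo_handle {
	int id;
};

struct demo_ops {
	struct demo_handle *(*item_create)(int conf, int *err);
	int (*item_release)(struct demo_handle *handle, int *err);
};

/* Generic layer: forward to the driver, or fail with ENOTSUP. */
struct demo_handle *demo_item_create(const struct demo_ops *ops,
				     int conf, int *err)
{
	if (ops == NULL || ops->item_create == NULL) {
		*err = ENOTSUP;
		return NULL;
	}
	return ops->item_create(conf, err);
}

int demo_item_release(const struct demo_ops *ops,
		      struct demo_handle *handle, int *err)
{
	if (ops == NULL || ops->item_release == NULL) {
		*err = ENOTSUP;
		return -ENOTSUP;
	}
	return ops->item_release(handle, err);
}

/* Toy driver: one static object, rejects negative configurations. */
static struct demo_handle toy_handle = { 1 };

static struct demo_handle *toy_create(int conf, int *err)
{
	if (conf < 0) {
		*err = EINVAL;
		return NULL;
	}
	*err = 0;
	return &toy_handle;
}

static int toy_release(struct demo_handle *handle, int *err)
{
	(void)handle; /* nothing to free in the toy driver */
	*err = 0;
	return 0;
}

/* A driver with both callbacks, and one that implements neither. */
const struct demo_ops toy_ops = { toy_create, toy_release };
const struct demo_ops empty_ops = { NULL, NULL };
```

Calling demo_item_create(&empty_ops, 0, &err) returns NULL with err set to
ENOTSUP, while the same call with &toy_ops returns the toy handle. An
application sees one entry point either way, and drivers opt in by filling
the table.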



* Re: [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (4 preceding siblings ...)
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
@ 2021-10-14 16:09   ` Ferruh Yigit
  2021-10-14 18:55     ` Slava Ovsiienko
  5 siblings, 1 reply; 73+ messages in thread
From: Ferruh Yigit @ 2021-10-14 16:09 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: rasland, matan, shahafs, orika, getelson, thomas, Qi Zhang

On 10/12/2021 1:54 PM, Viacheslav Ovsiienko wrote:
> 1. Introduction and Retrospective
> 
> Networks are evolving quickly: network structures are
> getting more and more complicated and new application
> areas are emerging. To address these challenges, new
> network protocols are continuously being developed,
> considered by technical communities, adopted by industry
> and eventually implemented in hardware and software. The
> DPDK framework follows these trends: a glance at the
> RTE Flow API header shows that multiple new items have
> been introduced since the initial release.
> 
> Adopting and implementing a new protocol is not
> straightforward and takes time: the protocol passes through
> development, consideration, adoption, and implementation
> phases. The industry tries to anticipate forthcoming
> network protocols; for example, many hardware vendors
> are implementing flexible and configurable network
> protocol parsers. As DPDK developers, could we anticipate
> the near future in the same fashion and introduce similar
> flexibility in the RTE Flow API?
> 
> Looking at what is already merged in the project, we find
> the raw item (rte_flow_item_raw). At first glance it looks
> sufficient, and we can try to use it to implement flow
> matching on the header of some relatively new tunnel protocol,
> say the GENEVE header with variable-length options. On
> closer consideration, however, we run into the raw item
> limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>    (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>    might be triggered), and actually is not supported
>    by existing PMDs
> - no explicitly specified relations with preceding
>    and following items
> - no tunnel hint support
> 
> As a result, implementing support for tunnel protocols
> like the aforementioned GENEVE with variable-length options
> using the flow raw item becomes very complicated and would
> require multiple flows and multiple raw items chained in the
> same flow (by the way, no support for chained raw items was
> found in the implemented drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex)
> to handle matches with existing and new network protocol headers
> in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there are the requirements to support the new
> network protocol with RTE Flows. What is given within protocol
> specification:
> 
>    - header format
>    - header length, (can be variable, depending on options)
>    - potential presence of extra options following or included
>      in the header
>    - the relations with preceding protocols. For example,
>      the GENEVE follows UDP, eCPRI can follow either UDP
>      or L2 header
>    - the relations with following protocols. For example,
>      the next layer after tunnel header can be L2 or L3
>    - whether the new protocol is a tunnel and the header
>      is a splitting point between outer and inner layers
> 
> The supposed way to operate with flex item:
> 
>    - application defines the header structures according to
>      protocol specification
> 
>    - application calls rte_flow_flex_item_create() with desired
>      configuration according to the protocol specification, it
>      creates the flex item object over specified ethernet device
>      and prepares PMD and underlying hardware to handle flex
>      item. On the item creation call, the PMD backing the specified
>      ethernet device returns an opaque handle identifying
>      the object that has been created
> 
>    - application uses the rte_flow_item_flex with obtained handle
>      in the flows, the values/masks to match with fields in the
>      header are specified in the flex item per flow as for regular
>      items (except that pattern buffer combines all fields)
> 
>    - flows with flex items match with packets in a regular fashion,
>      the values and masks for the new protocol header match are
>      taken from the flex items in the flows
> 
>    - application destroys flows with flex items
> 
>    - application calls rte_flow_flex_item_release() as part of
>      ethernet device API and destroys the flex item object in
>      PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow
> pattern like regular RTE flow items and provides the mask and
> value to match with fields of the protocol the item was configured
> for.
> 
>    struct rte_flow_item_flex {
>      void *handle;
>      uint32_t length;
>      const uint8_t* pattern;
>    };
> 
> The handle is some opaque object maintained on per device basis
> by underlying driver.
> 
> The protocol header fields are considered as bit fields, all
> offsets and widths are expressed in bits. The pattern is the
> buffer containing the bit concatenation of all the fields
> presented at item configuration time, in the same order and
> same amount. If byte boundary alignment is needed an application
> can use a dummy type field, this is just some kind of gap filler.
> 
> The length field specifies the pattern buffer length in bytes
> and is needed to allow rte_flow_copy() operations. The approach
> of multiple pattern pointers and lengths (per field) was
> considered and found clumsy - it seems to be much more suitable for
> the application to maintain the single structure within the
> single pattern buffer.
> 
> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>    - header field descriptors:
>      - next header
>      - next protocol
>      - sample to match
>    - input link descriptors
>    - output link descriptors
> 
> The field descriptors tell the driver and hardware what data should
> be extracted from the packet and then control the packet handling
> in the flow engine. Besides this, sample fields can be presented
> to match with patterns in the flows. Each field is a bit pattern.
> It has width, offset from the header beginning, mode of offset
> calculation, and offset related parameters.
> 
> The next header field is special, no data are actually taken
> from the packet, but its offset is used as a pointer to the next
> header in the packet, in other words the next header offset
> specifies the size of the header being parsed by flex item.
> 
> There is one more special field - next protocol, it specifies
> where the next protocol identifier is contained and packet data
> sampled from this field will be used to determine the next
> protocol header type to continue packet parsing. The next
> protocol field is like eth_type field in MAC2, or proto field
> in IPv4/v6 headers.
> 
> The sample fields are used to represent the data to be sampled
> from the packet and then matched with established flows.
> 
> There are several methods supposed to calculate field offset
> in runtime depending on configuration and packet content:
> 
>    - FIELD_MODE_FIXED - fixed offset. The bit offset from
>      header beginning is permanent and defined by field_base
>      configuration parameter.
> 
>    - FIELD_MODE_OFFSET - the field bit offset is extracted
>      from other header field (indirect offset field). The
>      resulting field offset to match is calculated as:
> 
>    field_base + (*offset_base & offset_mask) << offset_shift
> 
>      This mode is useful to sample some extra options following
>      the main header with field containing main header length.
>      Also, this mode can be used to calculate offset to the
>      next protocol header, for example - IPv4 header contains
>      the 4-bit field with IPv4 header length expressed in dwords.
>      One more example - this mode would allow us to skip GENEVE
>      header variable length options.
> 
>    - FIELD_MODE_BITMASK - the field bit offset is extracted
>      from other header field (indirect offset field), the latter
>      is considered as bitmask containing some number of one bits,
>      the resulting field offset to match is calculated as:
> 
>    field_base + bitcount(*offset_base & offset_mask) << offset_shift
> 
>      This mode would be useful to skip the GTP header and its
>      extra options with specified flags.
> 
>    - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>      boundary alignment in pattern. Pattern mask and data are
>      ignored in the match. All configuration parameters besides
>      field size and offset are ignored.
> 
>    Note:  "*" - means the indirect field offset is calculated
>    and actual data are extracted from the packet by this
>    offset (like data are fetched by pointer *p from memory).
> 
> The offset mode list can be extended by vendors according to
> hardware supported options.
> 
> The input link configuration section tells the driver after
> what protocols and at what conditions the flex item can follow.
> Input link specifies the preceding header pattern, for example
> for GENEVE it can be UDP item specifying match on destination
> port with value 6081. The flex item can follow multiple header
> types and multiple input links should be specified. At flow
> creation time the item with one of the input link types should
> precede the flex item and driver will select the correct flex
> item settings, depending on the actual flow pattern.
> 
> The output link configuration section tells the driver how
> to continue packet parsing after the flex item protocol.
> If multiple protocols can follow the flex item header the
> flex item should contain the field with the next protocol
> identifier and the parsing will be continued depending
> on the data contained in this field in the actual packet.
> 
> The flex item fields can participate in RSS hash calculation,
> the dedicated flag is present in the field description to specify
> what fields should be provided for hashing.
> 
> 5. Flex Item Chaining
> 
> If there are multiple protocols supposed to be supported with
> flex items in chained fashion - two or more flex items within
> the same flow and these ones might be neighbors in the pattern,
> it means the flex items are mutual referencing.  In this case,
> the item that occurred first should be created with empty
> output link list or with the list including existing items,
> and then the second flex item should be created referencing
> the first flex item as input arc, drivers should adjust
> the item configuration.
> 
> Also, the hardware resources used by flex items to handle
> the packet can be limited. If there are multiple flex items
> that are supposed to be used within the same flow it would
> be nice to provide some hint for the driver that these two
> or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices
> and these indices from two or more flex items supposed
> to be provided within the same flow should be the same
> as well. In other words, the field hint index specifies
> the group of fields that can be matched simultaneously
> within a single flow. If hint indices are specified,
> the driver will try to engage not overlapping hardware
> resources and provide independent handling of the field
> groups with unique indices. If the hint index is zero
> the driver assigns resources on its own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we have the requirements to handle the new tunnel
> protocol that follows UDP header with destination port 0xFADE
> and is followed by MAC header. Let the new protocol header format
> be like this:
> 
>    struct new_protocol_header {
>      rte_be32 header_length; /* length in dwords, including options */
>      rte_be32 specific0;     /* some protocol data, no intention */
>      rte_be32 specific1;     /* to match in flows on these fields */
>      rte_be32 crucial;       /* data of interest, match is needed */
>      rte_be32 options[0];    /* optional protocol data, variable length */
>    };
> 
> The supposed flex item configuration:
> 
>    struct rte_flow_item_flex_field field0 = {
>      .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>      .field_size = 96,                /* three dwords from the beginning */
>    };
>    struct rte_flow_item_flex_field field1 = {
>      .field_mode = FIELD_MODE_FIXED,
>      .field_size = 32,       /* Field size is one dword */
>      .field_base = 96,       /* Skip three dwords from the beginning */
>    };
>    struct rte_flow_item_udp spec0 = {
>      .hdr = {
>        .dst_port = RTE_BE16(0xFADE),
>      }
>    };
>    struct rte_flow_item_udp mask0 = {
>      .hdr = {
>        .dst_port = RTE_BE16(0xFFFF),
>      }
>    };
>    struct rte_flow_item_flex_link link0 = {
>      .item = {
>         .type = RTE_FLOW_ITEM_TYPE_UDP,
>         .spec = &spec0,
>         .mask = &mask0,
>      },
>    };
> 
>    struct rte_flow_item_flex_conf conf = {
>      .next_header = {
>        .tunnel = FLEX_TUNNEL_MODE_SINGLE,
>        .field_mode = FIELD_MODE_OFFSET,
>        .field_base = 0,
>        .offset_base = 0,
>        .offset_mask = 0xFFFFFFFF,
>        .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
>      },
>      .sample = {
>         &field0,
>         &field1,
>      },
>      .nb_samples = 2,
>      .input_link[0] = &link0,
>      .nb_inputs = 1
>    };
> 
> Let's suppose we have created the flex item successfully, and PMD
> returned the handle 0x123456789A. We can use the following item
> pattern to match the crucial field in the packet with value 0x00112233:
> 
>    struct new_protocol_header spec_pattern =
>    {
>      .crucial = RTE_BE32(0x00112233),
>    };
>    struct new_protocol_header mask_pattern =
>    {
>      .crucial = RTE_BE32(0xFFFFFFFF),
>    };
>    struct rte_flow_item_flex spec_flex = {
>      .handle = 0x123456789A,
>      .length = sizeof(struct new_protocol_header),
>      .pattern = &spec_pattern,
>    };
>    struct rte_flow_item_flex mask_flex = {
>      .length = sizeof(struct new_protocol_header),
>      .pattern = &mask_pattern,
>    };
>    struct rte_flow_item item_to_match = {
>      .type = RTE_FLOW_ITEM_TYPE_FLEX,
>      .spec = &spec_flex,
>      .mask = &mask_flex,
>    };
> 
> 7. Notes:
> 
>   - v4:  http://patches.dpdk.org/project/dpdk/patch/20211012113235.24975-2-viacheslavo@nvidia.com/
>   - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
>   - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
>   - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
>   - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/
> 
>   - v4 -> v5:
>     - comments addressed
>     - testpmd compilation issue fixed
> 
>   - v3 -> v4:
>     - comments addressed
>     - testpmd compilation issues fixed
>     - typos fixed
> 
>   - v2 -> v3:
>     - comments addressed
>     - flex item update removed as not supported
>     - RSS over flex item fields removed as not supported and non-complete
>       API
>     - tunnel mode configuration refactored
>     - testpmd updated
>     - documentation updated
>     - PMD patches are removed temporarily (updating WIP, be presented in rc2)
> 
>   - v1 -> v2:
>     - testpmd CLI to handle flex item is provided
>     - draft PMD code is introduced
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> 
> Gregory Etelson (4):
>    ethdev: support flow elements with variable length
>    ethdev: implement RTE flex item API
>    app/testpmd: add jansson library
>    app/testpmd: add flex item CLI commands
> 
> Viacheslav Ovsiienko (1):
>    ethdev: introduce configurable flexible item
> 

Hi Viacheslav,

This is a nice feature, thanks. But my concern/question is how to test it?

I think testing requires HW that supports custom (flexible) protocols, and I
assume Mellanox devices support it; does it mean specific HW is required
to test the feature?

Or can we use flexible item to emulate an existing protocol and test it on
any hardware? If so what do you think about adding more documentation to
describe how this can be done?

Thanks,
ferruh



* Re: [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands
  2021-10-12 12:54   ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
@ 2021-10-14 16:42     ` Ferruh Yigit
  2021-10-14 18:13       ` Gregory Etelson
  0 siblings, 1 reply; 73+ messages in thread
From: Ferruh Yigit @ 2021-10-14 16:42 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, dev
  Cc: rasland, matan, shahafs, orika, getelson, thomas

On 10/12/2021 1:54 PM, Viacheslav Ovsiienko wrote:
> From: Gregory Etelson <getelson@nvidia.com>
> 
> Network port hardware is shipped with fixed number of
> supported network protocols. If application must work with a
> protocol that is not included in the port hardware by default, it
> can try to add the new protocol to port hardware.
> 
> Flex item or flex parser is port infrastructure that allows
> application to add support for a custom network header and
> offload flows to match the header elements.
> 
> Application must complete the following tasks to create a flow
> rule that matches custom header:
> 
> 1. Create flow item object in port hardware.
> Application must provide custom header configuration to PMD.
> PMD will use that configuration to create flex item object in
> port hardware.
> 
> 2. Create flex patterns to match. Flex pattern has a spec and a mask
> components, like a regular flow item. Combined together, spec and mask
> can target unique data sequence or a number of data sequences in the
> custom header.
> Flex patterns of the same flex item can have different lengths.
> Flex pattern is identified by unique handler value.
> 
> 3. Create a flow rule with a flex flow item that references
> flow pattern.
> 
> Testpmd flex CLI commands are:
> 
> testpmd> flow flex_item create <port> <flex_id> <filename>
> 

The file here is a .json file, right? What do you think about providing a
sample .json file? I am not quite sure where such files should live, though;
perhaps a sub-folder under testpmd?

> testpmd> set flex_pattern <pattern_id> \
>           spec <spec data> mask <mask data>
> 
> testpmd> set flex_pattern <pattern_id> is <spec_data>
> 
> testpmd> flow create <port> ... \
> / flex item is <flex_id> pattern is <pattern_id> / ...
> 
> The patch works with the jansson library API.
> Jansson development files must be present:
> jansson.pc, jansson.h libjansson.[a,so]
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

<...>

> +static void
> +flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
> +{
> +	struct rte_flow_error flow_error;
> +	json_error_t json_error;
> +	json_t *jroot = NULL;
> +	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
> +	int ret;
> +
> +	if (fp == FLEX_PARSER_ERR) {
> +		printf("Bad parameters: port_id=%u flex_id=%u\n",
> +		       port_id, flex_id);
> +		return;
> +	}
> +	if (fp) {
> +		printf("port-%u: flex item #%u is already in use\n",
> +		       port_id, flex_id);
> +		return;
> +	}
> +	jroot = json_load_file(filename, 0, &json_error);
> +	if (!jroot) {
> +		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
> +		return;
> +	}
> +	fp = flex_item_init();
> +	if (!fp) {
> +		printf("Could not allocate flex item\n");
> +		goto out;
> +	}
> +	ret = flex_item_config(jroot, &fp->flex_conf);

What do you think about decoupling json & flex item support a little more?


Like:
flex_item_config(&fp->flex_conf);
	flex_item_config_json(&fp->flex_conf);
		jroot = json_load_file()
		parse json & fill flex_conf
		json_decref(jroot);


> +	if (ret)
> +		goto out;
> +	fp->flex_handle = rte_flow_flex_item_create(port_id,
> +						    &fp->flex_conf,
> +						    &flow_error);
> +	if (fp->flex_handle) {
> +		flex_items[port_id][flex_id] = fp;
> +		printf("port-%u: created flex item #%u\n", port_id, flex_id);
> +		fp = NULL;
> +	} else {
> +		printf("port-%u: flex item #%u creation failed: %s\n",
> +		       port_id, flex_id,
> +		       flow_error.message ? flow_error.message : "");
> +	}
> +out:
> +	if (fp)
> +		free(fp);
> +	if (jroot)
> +		json_decref(jroot);
> +}
> +
> +#else /* RTE_HAS_JANSSON */
> +static void flex_item_create(__rte_unused portid_t port_id,
> +			     __rte_unused uint16_t flex_id,
> +			     __rte_unused const char *filename)
> +{
> +	printf("no JSON library\n");
> +}
> +
> +static void flex_item_destroy(__rte_unused portid_t port_id,
> +			     __rte_unused uint16_t flex_id)
> +{
> +	printf("no JSON library\n");
> +}
> +#endif /* RTE_HAS_JANSSON */

Does it make sense to move all above code (ifdef block) to a separate file?

Just because 'cmdline_flow.c' is getting bigger, I want to get your comment,
no strong opinion.


* Re: [dpdk-dev] [PATCH v5 5/5] app/testpmd: add flex item CLI commands
  2021-10-14 16:42     ` Ferruh Yigit
@ 2021-10-14 18:13       ` Gregory Etelson
  0 siblings, 0 replies; 73+ messages in thread
From: Gregory Etelson @ 2021-10-14 18:13 UTC (permalink / raw)
  To: Ferruh Yigit, Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Ori Kam,
	NBU-Contact-Thomas Monjalon

Hello Ferruh,

..snip..

> > testpmd> flow flex_item create <port>
> <flex_id> <filename>
> >
> 
> The file here is .json file, right? What do you
> think to provide some
> sample .json file? I am not quite sure though
> where can be a place for
> them, perhaps a sub-folder under testpmd?
>

JSON file example will be added in patch update.

Flex item configuration is not part of testpmd infrastructure.
It belongs to run-time test environment. It should not be treated
differently than file parameter for the `load` testpmd command.  

..snip.. 

> 
> What do you think to decouple json & flex item
> support a little more?
> 
> 
> Like:
> flex_item_config(&fp->flex_conf);
>         flex_item_config_json(&fp->flex_conf);
>                 jroot = json_load_file()
>                 parse json & fill flex_conf
>                 json_decref(jroot);
> 
> 

[1].

..snip..

> 
> Does it make sense to move all above code (ifdef
> block) to a separate file?
> 
> Just because 'cmdline_flow.c' is getting bigger, I
> want to get your comment,
> no strong opinion.

[2].

[1], [2] - I'll update the code in upcoming patch

Regards,
Gregory


* Re: [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item
  2021-10-14 16:09   ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Ferruh Yigit
@ 2021-10-14 18:55     ` Slava Ovsiienko
  0 siblings, 0 replies; 73+ messages in thread
From: Slava Ovsiienko @ 2021-10-14 18:55 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Ori Kam,
	Gregory Etelson, NBU-Contact-Thomas Monjalon, Qi Zhang

Hi, Ferruh

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Thursday, October 14, 2021 19:09
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; Ori Kam
> <orika@nvidia.com>; Gregory Etelson <getelson@nvidia.com>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>; Qi Zhang <qi.z.zhang@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable
> flexible item
> 
> On 10/12/2021 1:54 PM, Viacheslav Ovsiienko wrote:
> > 1. Introduction and Retrospective
> >

.. snip ..

> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> >
> > Gregory Etelson (4):
> >    ethdev: support flow elements with variable length
> >    ethdev: implement RTE flex item API
> >    app/testpmd: add jansson library
> >    app/testpmd: add flex item CLI commands
> >
> > Viacheslav Ovsiienko (1):
> >    ethdev: introduce configurable flexible item
> >
> 
> Hi Viacheslav,
> 
> This is a nice feature, thanks. But my concern/question is how to test it?
> 
> I think testing requires HW that support custom (flexible) protocol, and I
> assume mellanox devices supports it, does it mean specific HW is required to
> test the feature?

Hypothetically, the feature can be implemented in software (as can the
entire RTE flow engine), but you are right - the hardware is needed.
Mellanox NICs provide support for the Flex Item feature since ConnectX-6DX.
It seems mlx5 is going to be the only PMD supporting the Flex Item
in hardware for the moment. We tried to avoid any vendor specifics in this API,
and I hope we succeeded in keeping it vendor-neutral.
> 
> Or can we use flexible item to emulate an existing protocol and test it on any
> hardware?
Mmm, let me think a bit.
If we have some NIC supporting, say, UDP, and we would like to handle
UDP with Flex Item (for testing purposes only) - we should update NIC PMD
accordingly - it should handle Flex Item API and recognize Flex Item in Flows.
So, it is rather the question about updating other PMDs and we did not think
in that direction. But it is feasible, I guess.

> If so what do you think about adding more documentation to
> describe how this can be done?

We are going to add a couple of examples of Flex Item API usage (closer to rc3).
The Flex Item idea is simple, but there are a lot of parameters and structures, so
good examples must be provided in documentation. These examples
will be focused mostly on the Flex Item API and can be used as guides for
potential PMD testing implementations.

With best regards,
Slava



* [dpdk-dev] [PATCH v6 0/6] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (7 preceding siblings ...)
  2021-10-12 12:54 ` [dpdk-dev] [PATCH v5 0/5] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-18 18:02 ` Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 1/6] " Viacheslav Ovsiienko
                     ` (5 more replies)
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-20 15:14 ` [dpdk-dev] [PATCH v8 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  10 siblings, 6 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length, (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the object that has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much more suitable for
the application to maintain the single structure within the
single pattern buffer.
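
As a rough illustration of the length rule above, the sketch below sums
the configured field widths (in bits) and rounds the total up to whole
bytes. The helper name flex_pattern_bytes() is hypothetical and is not
part of the proposed API; the sketch only assumes that the fields are
concatenated back to back with no implicit padding between them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not a DPDK function).
 * Returns the pattern buffer length in bytes for a list of field
 * widths given in bits, assuming plain bit concatenation.
 */
static uint32_t
flex_pattern_bytes(const uint32_t *field_bits, size_t nb_fields)
{
	uint64_t total_bits = 0;
	size_t i;

	assert(field_bits != NULL || nb_fields == 0);
	for (i = 0; i < nb_fields; i++)
		total_bits += field_bits[i];
	/* Round up to the next byte boundary. */
	return (uint32_t)((total_bits + 7) / 8);
}
```

For instance, a 96-bit dummy field followed by a 32-bit sample field
gives a 16-byte pattern buffer, while a 4-bit field followed by a 6-bit
field needs 2 bytes unless a dummy field is added to pad it explicitly.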

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

There are several methods supposed to calculate field offset
in runtime depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.
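
The offset modes listed above can be illustrated with a small
standalone sketch. It is not the DPDK implementation: the sketch_*
names are hypothetical, the indirect offset field is assumed to be
a byte-aligned big-endian 32-bit word, and the shift is assumed to be
applied to the masked value only, before field_base is added. The
sketch just evaluates the expression and does not decide the unit of
the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the offset modes; not the DPDK definitions. */
enum sketch_mode { SKETCH_FIXED, SKETCH_OFFSET, SKETCH_BITMASK };

struct sketch_field {
	enum sketch_mode mode;
	int32_t field_base;    /* constant part of the offset */
	uint32_t offset_base;  /* bit position of the indirect offset field */
	uint32_t offset_mask;  /* mask applied to the fetched value */
	int32_t offset_shift;  /* left shift applied to the masked value */
};

/* Assumption: fetch a big-endian 32-bit word at a byte-aligned position. */
static uint32_t
sketch_fetch(const uint8_t *hdr, uint32_t bit_pos)
{
	const uint8_t *p = hdr + bit_pos / 8;

	assert(bit_pos % 8 == 0);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Count the one bits in v. */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/* Evaluate the offset expression of one field over a header buffer. */
static int32_t
sketch_field_offset(const struct sketch_field *f, const uint8_t *hdr)
{
	uint32_t v;

	assert(f->offset_shift >= 0 && f->offset_shift < 32);
	switch (f->mode) {
	case SKETCH_OFFSET:
		v = sketch_fetch(hdr, f->offset_base) & f->offset_mask;
		return f->field_base + (int32_t)(v << f->offset_shift);
	case SKETCH_BITMASK:
		v = sketch_fetch(hdr, f->offset_base) & f->offset_mask;
		return f->field_base +
		       (int32_t)(sketch_bitcount(v) << f->offset_shift);
	case SKETCH_FIXED:
	default:
		return f->field_base;
	}
}
```

For a header whose first dword holds the value 6, the offset mode
with mask 0xFFFFFFFF and shift 2 evaluates to 24. The bitmask mode
applied to a flags word with three bits set, shift 5 and field_base 64
evaluates to 160.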

The input link configuration section tells the driver which
protocols the flex item can follow and under what conditions.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. If the flex item can follow multiple header
types, multiple input links should be specified. At flow
creation time an item with one of the input link types should
precede the flex item, and the driver will select the correct flex
item settings depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items in
a chained fashion (two or more flex items within the same flow,
possibly as neighbors in the pattern), the flex items reference
each other. In this case, the item that occurs first should be
created with an empty output link list or with a list including
only existing items, and then the second flex item should be
created referencing the first flex item as its input arc. Drivers
should adjust the item configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If multiple flex items are
supposed to be used within the same flow, it is helpful
to give the driver a hint that these items are intended
for simultaneous usage. The fields of the items should be
assigned hint indices, and the indices of fields from
different flex items that are supposed to be used within
the same flow should be the same. In other words, the field
hint index specifies the group of fields that can be matched
simultaneously within a single flow. If hint indices are
specified, the driver will try to engage non-overlapping
hardware resources and provide independent handling of the
field groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we need to handle a new tunnel protocol that
follows a UDP header with destination port 0xFADE and is followed
by a MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field samples[] = {
    {
      .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
      .field_size = 96,                /* three dwords from the beginning */
    },
    {
      .field_mode = FIELD_MODE_FIXED,
      .field_size = 32,       /* Field size is one dword */
      .field_base = 96,       /* Skip three dwords from the beginning */
    },
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .tunnel = FLEX_TUNNEL_MODE_SINGLE,
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    .sample_data = samples,  /* array of field descriptors */
    .nb_samples = 2,
    .input_link = &link0,    /* array with a single input link */
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (struct rte_flow_item_flex_handle *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

7. Notes:

 - v5:  http://patches.dpdk.org/project/dpdk/patch/20211012125433.31647-2-viacheslavo@nvidia.com/
 - v4:  http://patches.dpdk.org/project/dpdk/patch/20211012113235.24975-2-viacheslavo@nvidia.com/ 
 - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
 - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

 - v5 -> v6:
   - flex item command moved to dedicated file cmd_flex_item.c

 - v4 -> v5:
   - comments addressed
   - testpmd compilation issue fixed

 - v3 -> v4:
   - comments addressed
   - testpmd compilation issues fixed
   - typos fixed

 - v2 -> v3:
   - comments addressed
   - flex item update removed as not supported
   - RSS over flex item fields removed as not supported and non-complete
     API
   - tunnel mode configuration refactored
   - testpmd updated
   - documentation updated
   - PMD patches are removed temporarily (updating WIP, to be presented in rc2)

 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (5):
  ethdev: support flow elements with variable length
  ethdev: implement RTE flex item API
  app/testpmd: add jansson library
  app/testpmd: add dedicated flow command parsing routine
  app/testpmd: add flex item CLI commands

Viacheslav Ovsiienko (1):
  ethdev: introduce configurable flexible item

 app/test-pmd/cmd_flex_item.c                | 548 ++++++++++++++++++++
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 247 ++++++++-
 app/test-pmd/meson.build                    |   6 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  34 ++
 doc/guides/prog_guide/rte_flow.rst          |  25 +
 doc/guides/rel_notes/release_21_11.rst      |   7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++++
 lib/ethdev/rte_flow.c                       | 121 ++++-
 lib/ethdev/rte_flow.h                       | 222 ++++++++
 lib/ethdev/rte_flow_driver.h                |   8 +
 lib/ethdev/version.map                      |   2 +
 13 files changed, 1328 insertions(+), 15 deletions(-)
 create mode 100644 app/test-pmd/cmd_flex_item.c

-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v6 1/6] ethdev: introduce configurable flexible item
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 2/6] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing support for tunnel protocols
like the aforementioned GENEVE with variable extra protocol options
using the flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, no support for chained raw items was found
in the implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.
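
The concatenation rule above can be sketched as a small bit packer.
This is only an illustration under stated assumptions: the sketch_*
helpers are hypothetical and are not part of the proposed API, and
bits are assumed to be placed most significant bit first within each
byte, which this description does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Append the low `width` bits of `value` to the pattern buffer at bit
 * position *pos, most significant bit first (assumed bit order).
 * The buffer is expected to be zero-initialized by the caller.
 */
static void
sketch_put_bits(uint8_t *buf, size_t buf_len, size_t *pos,
		uint32_t value, unsigned int width)
{
	unsigned int i;

	assert(width <= 32 && *pos + width <= buf_len * 8);
	for (i = 0; i < width; i++) {
		if ((value >> (width - 1 - i)) & 1u)
			buf[*pos / 8] |= (uint8_t)(0x80u >> (*pos % 8));
		(*pos)++;
	}
}

/* A dummy field only advances the position, its bits stay zero. */
static void
sketch_skip_bits(size_t buf_len, size_t *pos, unsigned int width)
{
	assert(*pos + width <= buf_len * 8);
	*pos += width;
}
```

With a 96-bit dummy field followed by a 32-bit sample field, the
sample value lands in bytes 12..15 of a 16-byte pattern buffer, so
the pattern keeps the layout of a plain header structure.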

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver which
protocols the flex item can follow and under what conditions.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port with value 6081. If the flex item can follow multiple header
types, multiple input links should be specified. At flow
creation time an item with one of the input link types should
precede the flex item, and the driver will select the correct flex
item settings depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.
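
The output link selection described above can be modeled as a plain
lookup. This is a simplified sketch and not driver code: the
sketch_out_link structure and sketch_select_link() are hypothetical,
and the item type is reduced to an integer for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced model of an output link. */
struct sketch_out_link {
	int item_type;   /* identifier of the following header type */
	uint32_t next;   /* next protocol value selecting this link */
};

/*
 * Return the output link whose next protocol value equals the value
 * sampled from the packet, or NULL if no configured link matches.
 */
static const struct sketch_out_link *
sketch_select_link(const struct sketch_out_link *links, size_t nb_links,
		   uint32_t next_proto)
{
	size_t i;

	for (i = 0; i < nb_links; i++)
		if (links[i].next == next_proto)
			return &links[i];
	return NULL;
}
```

A packet whose next protocol field carries a value absent from the
list matches no output link. How parsing proceeds in that case is
left to the driver and is not defined by this sketch.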

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items in
a chained fashion (two or more flex items within the same flow,
possibly as neighbors in the pattern), the flex items reference
each other. In this case, the item that occurs first should be
created with an empty output link list or with a list including
only existing items, and then the second flex item should be
created referencing the first flex item as its input arc. Drivers
should adjust the item configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If multiple flex items are
supposed to be used within the same flow, it is helpful
to give the driver a hint that these items are intended
for simultaneous usage. The fields of the items should be
assigned hint indices, and the indices of fields from
different flex items that are supposed to be used within
the same flow should be the same. In other words, the field
hint index specifies the group of fields that can be matched
simultaneously within a single flow. If hint indices are
specified, the driver will try to engage non-overlapping
hardware resources and provide independent handling of the
field groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we need to handle a new tunnel protocol that
follows a UDP header with destination port 0xFADE and is followed
by a MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field samples[] = {
    {
      .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
      .field_size = 96,                /* three dwords from the beginning */
    },
    {
      .field_mode = FIELD_MODE_FIXED,
      .field_size = 32,       /* Field size is one dword */
      .field_base = 96,       /* Skip three dwords from the beginning */
    },
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .tunnel = FLEX_TUNNEL_MODE_SINGLE,
    .next_header = {
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    .sample_data = samples,  /* array of field descriptors */
    .nb_samples = 2,
    .input_link = &link0,    /* array with a single input link */
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (struct rte_flow_item_flex_handle *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  25 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_flow.h                  | 222 +++++++++++++++++++++++++
 3 files changed, 254 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3cb014c1fa..eb472a8b77 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1425,6 +1425,31 @@ Matches a conntrack state after conntrack action.
 - ``flags``: conntrack packet state flags.
 - Default ``mask`` matches all state bits.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matches with the custom network protocol header that was created
+using rte_flow_flex_item_create() API. The application describes
+the desired header structure, defines the header fields attributes
+and header relations with preceding and following protocols and
+configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  considered by the driver as padded with trailing zeroes till the full
+  configured item pattern length.
+- ``pattern``: pattern to match. The pattern is concatenation of bit fields
+  configured at item creation. At configuration the fields are presented
+  by sample_data array. The order of the bitfields is defined by the order
+  of sample_data elements. The width of each bitfield is defined by the width
+  specified in the corresponding sample_data element as well. If pattern
+  length is smaller than configured fields overall length it is considered
+  as padded with trailing zeroes up to full configured length, both for
+  value and mask.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 65ed00261c..6d7d30030c 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Introduced RTE Flow Flex Item.**
+
+  * The configurable RTE Flow Flex Item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 5f87851f8c..b50cb0f693 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -574,6 +574,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_conntrack.
 	 */
 	RTE_FLOW_ITEM_TYPE_CONNTRACK,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1839,6 +1848,177 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset from the header beginning.
+ *
+ * The pattern is concatenation of bit fields configured at item creation
+ * by rte_flow_flex_item_create(). At configuration the fields are presented
+ * by sample_data array.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	/**
+	 * The protocol header can be present in the packet only once.
+	 * No multiple flex item flow inclusions (for inner/outer) are allowed.
+	 * No any relations with tunnel protocols are imposed. The drivers
+	 * can optimize hardware resource usage to handle match on single flex
+	 * item of specific type.
+	 */
+	FLEX_TUNNEL_MODE_SINGLE = 0,
+	/**
+	 * Flex item presents outer header only.
+	 */
+	FLEX_TUNNEL_MODE_OUTER,
+	/**
+	 * Flex item presents inner header only.
+	 */
+	FLEX_TUNNEL_MODE_INNER,
+	/**
+	 * Flex item presents either inner or outer header. The driver
+	 * handles as many multiple inners as hardware supports.
+	 */
+	FLEX_TUNNEL_MODE_MULTI,
+	/**
+	 * Flex item presents tunnel protocol header.
+	 */
+	FLEX_TUNNEL_MODE_TUNNEL,
+};
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Field size in bits. */
+	int32_t field_base; /**< Field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint32_t field_id:16; /**< Device hint, for multiple items in flow. */
+	uint32_t reserved:16; /**< Reserved field. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * Specifies the flex item and tunnel relations and tells the PMD
+	 * whether flex item can be used for inner, outer or both headers,
+	 * or whether flex item presents the tunnel protocol itself.
+	 */
+	enum rte_flow_item_flex_tunnel_mode tunnel;
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_flex_item. There can be multiple
+	 * fields descriptors, the number should be specified by nb_samples.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t nb_samples;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specified the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match in case if there is the match with one of input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t nb_inputs;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t nb_outputs;
+};
+
 /**
  * Action types.
  *
@@ -4286,6 +4466,48 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 			     struct rte_flow_item *items,
 			     uint32_t num_of_items,
 			     struct rte_flow_error *error);
+
+/**
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.18.1



* [dpdk-dev] [PATCH v6 2/6] ethdev: support flow elements with variable length
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 1/6] " Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 3/6] ethdev: implement RTE flex item API Viacheslav Ovsiienko
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

RTE flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new RTE Flow item with variable length is coming: the flex item.
To handle this item (and potentially other new ones with variable
pattern length) in the RTE flow copy and conversion routines, a helper
function is introduced.
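
The idea can be illustrated outside DPDK with a minimal sketch. All
names below (var_item, desc_data, desc_table, conv_copy) are
hypothetical stand-ins, not the DPDK definitions. Unlike the helper in
this patch, the sketch reports the required size even for a NULL buffer
and copies nothing when the buffer is too small:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element with a variable-length tail (not a DPDK type). */
struct var_item {
	uint32_t length;        /* number of bytes in pattern */
	const uint8_t *pattern; /* variable-length data */
};

/* Per-type descriptor: fixed size plus an optional tail callback. */
struct desc_data {
	size_t size;
	size_t (*desc_fn)(void *dst, const void *src);
};

/* Copy the tail right behind the fixed part; always return its length. */
static size_t
var_item_conv(void *dst, const void *src)
{
	struct var_item *d = dst;
	const struct var_item *s = src;

	if (d != NULL) {
		memcpy(d + 1, s->pattern, s->length);
		d->pattern = (const uint8_t *)(d + 1);
	}
	return s->length;
}

static const struct desc_data desc_table[] = {
	[0] = { .size = sizeof(uint32_t), .desc_fn = NULL },
	[1] = { .size = sizeof(struct var_item), .desc_fn = var_item_conv },
};

/*
 * Return the bytes needed for an element of the given type. The copy is
 * done only when buf is non-NULL and size covers fixed part plus tail.
 */
static size_t
conv_copy(void *buf, const void *data, size_t size, int type)
{
	const struct desc_data *d;
	size_t total;

	assert(type >= 0 &&
	       (size_t)type < sizeof(desc_table) / sizeof(desc_table[0]));
	if (data == NULL)
		return 0;
	d = &desc_table[type];
	/* dst == NULL asks the callback for the tail length only. */
	total = d->size + (d->desc_fn != NULL ? d->desc_fn(NULL, data) : 0);
	if (buf == NULL || size < total)
		return total;
	memcpy(buf, data, d->size);
	if (d->desc_fn != NULL)
		d->desc_fn(buf, data);
	return total;
}
```

A caller can first call conv_copy(NULL, &item, 0, 1) to learn the
required size and then call it again with a buffer of that size.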

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 lib/ethdev/rte_flow.c | 81 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 8cb7a069c8..051781b440 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,65 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ * Copy a flow element, including its variable-length part if any.
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * Allow PMD private flow item
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (type >= 0 && desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -100,6 +152,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 	MK_FLOW_ITEM(INTEGRITY, sizeof(struct rte_flow_item_integrity)),
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -107,8 +161,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -527,12 +590,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -634,12 +693,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v6 3/6] ethdev: implement RTE flex item API
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 1/6] " Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 2/6] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  2021-10-19  6:12     ` Ori Kam
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 4/6] app/testpmd: add jansson library Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

The RTE flex item API was introduced in the
"ethdev: introduce configurable flexible item" patch.

The API allows a DPDK application to define a parser for a custom
network header in port hardware and to offload flows that match
the custom header elements.
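
The create/release wrappers in this patch follow the usual dispatch
scheme: the generic layer looks up the driver callback and reports
ENOTSUP when the driver does not provide it. A self-contained sketch of
that scheme follows. The types and functions (dev, dev_ops,
item_handle, item_create, item_release, demo_*) are hypothetical and
only mimic the shape of the real ones:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dev;
struct item_handle { int conf; };

/* Driver callbacks; a driver may leave any of them NULL. */
struct dev_ops {
	struct item_handle *(*item_create)(struct dev *dev, int conf, int *err);
	int (*item_release)(struct dev *dev, const struct item_handle *h);
};

struct dev {
	const struct dev_ops *ops;
	struct item_handle slot; /* the demo driver supports one item */
	int in_use;
};

/* Demo driver: hands out its single slot, fails with EBUSY if taken. */
static struct item_handle *
demo_create(struct dev *dev, int conf, int *err)
{
	if (dev->in_use) {
		*err = EBUSY;
		return NULL;
	}
	dev->in_use = 1;
	dev->slot.conf = conf;
	return &dev->slot;
}

static int
demo_release(struct dev *dev, const struct item_handle *h)
{
	if (h != &dev->slot || !dev->in_use)
		return -EINVAL;
	dev->in_use = 0;
	return 0;
}

static const struct dev_ops demo_ops = {
	.item_create = demo_create,
	.item_release = demo_release,
};

/* Generic layer: dispatch to the driver or report "not supported". */
static struct item_handle *
item_create(struct dev *dev, int conf, int *err)
{
	assert(dev != NULL && err != NULL);
	if (dev->ops == NULL || dev->ops->item_create == NULL) {
		*err = ENOTSUP;
		return NULL;
	}
	return dev->ops->item_create(dev, conf, err);
}

static int
item_release(struct dev *dev, const struct item_handle *h)
{
	assert(dev != NULL);
	if (dev->ops == NULL || dev->ops->item_release == NULL)
		return -ENOTSUP;
	return dev->ops->item_release(dev, h);
}
```

The application only ever sees the opaque handle; what it refers to is
private to the driver that created it.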

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 lib/ethdev/rte_flow.c        | 40 ++++++++++++++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h |  8 ++++++++
 lib/ethdev/version.map       |  2 ++
 3 files changed, 50 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 051781b440..8257ed8c97 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1321,3 +1321,43 @@ rte_flow_tunnel_item_release(uint16_t port_id,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index 46f62c2ec2..34a5a5bcd0 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -139,6 +139,14 @@ struct rte_flow_ops {
 		 struct rte_flow_item *pmd_items,
 		 uint32_t num_of_items,
 		 struct rte_flow_error *err);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 29fb71f1af..6992e25046 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -248,6 +248,8 @@ EXPERIMENTAL {
 
 	# added in 21.11
 	rte_eth_rx_metadata_negotiate;
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
 };
 
 INTERNAL {
-- 
2.18.1



* [dpdk-dev] [PATCH v6 4/6] app/testpmd: add jansson library
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 3/6] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 5/6] app/testpmd: add dedicated flow command parsing routine Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 6/6] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Testpmd interactive mode provides a CLI to configure application
commands. Testpmd reads CLI commands and parameters from STDIN and
converts the input into C objects with an internal parser.
The patch adds an optional jansson dependency to testpmd.
With jansson, testpmd can read input in JSON format from STDIN or an
input file and convert it into C objects using jansson library calls.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 app/test-pmd/meson.build | 5 +++++
 app/test-pmd/testpmd.h   | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..3a8babd604 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -61,3 +61,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index e9d9db06ce..fc43bf2763 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
-- 
2.18.1



* [dpdk-dev] [PATCH v6 5/6] app/testpmd: add dedicated flow command parsing routine
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
                     ` (3 preceding siblings ...)
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 4/6] app/testpmd: add jansson library Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 6/6] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

testpmd flow creation consists of these steps:
  1. receive a string with the flow rule description;
  2. parse the input string and build the flow parameters: port_id value,
     flow attributes, items array, actions array;
  3. create a flow rule from the flow rule parameters.

The flow rule creation steps are built as a pipeline. Each step
starts immediately after its predecessor completes successfully.
Because of this, there are no dedicated routines that provide the
intermediate results of steps 1-3 above.

The patch adds the `flow_parse()` function. It parses an input string
and returns the parsed data to the caller. This is a preparation step
for introducing flex item command processing.
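
The core of such a routine is a loop that calls a one-token parser
until the input is exhausted. The sketch below shows only that loop.
parse_one() and parse_all() are hypothetical names, the token rules are
invented for the example, and the saving and restoring of the parser
context done by the patch is left out:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical one-token parser: returns the number of characters
 * consumed, 0 if no token starts at src, or -1 on an invalid character.
 */
static int
parse_one(const char *src, unsigned int *count)
{
	int len = 0;

	while (src[len] != '\0' && !isspace((unsigned char)src[len])) {
		char c = src[len];

		if (!isalnum((unsigned char)c) && c != '/' && c != '_')
			return -1;
		len++;
	}
	if (len > 0)
		(*count)++;
	return len;
}

/* Parse the whole string; return 0 only if every token was accepted. */
static int
parse_all(const char *src, unsigned int *count)
{
	int ret;

	assert(src != NULL && count != NULL);
	*count = 0;
	do {
		ret = parse_one(src, count);
		if (ret > 0) {
			src += ret;
			/* Skip separators before the next token. */
			while (isspace((unsigned char)*src))
				src++;
		}
	} while (ret > 0 && *src != '\0');
	return (ret >= 0 && *src == '\0') ? 0 : -1;
}
```

For the string "flow create 0 pattern eth / end" parse_all() returns 0
and counts 7 tokens. A string that contains an unexpected character
makes it return -1.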

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 24 ++++++++++++++++++++++++
 app/test-pmd/testpmd.h      |  5 +++++
 2 files changed, 29 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0b5856c7d5..4e8e3e3c29 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -7952,6 +7952,30 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 	return len;
 }
 
+int
+flow_parse(const char *src, void *result, unsigned int size,
+	   struct rte_flow_attr **attr,
+	   struct rte_flow_item **pattern, struct rte_flow_action **actions)
+{
+	int ret;
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, src, result, size);
+		if (ret > 0) {
+			src += ret;
+			while (isspace(*src))
+				src++;
+		}
+	} while (ret > 0 && strlen(src));
+	cmd_flow_context = saved_flow_ctx;
+	*attr = &((struct buffer *)result)->args.vc.attr;
+	*pattern = ((struct buffer *)result)->args.vc.pattern;
+	*actions = ((struct buffer *)result)->args.vc.actions;
+	return (ret >= 0 && !strlen(src)) ? 0 : -1;
+}
+
 /** Return number of completion entries (cmdline API). */
 static int
 cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index fc43bf2763..c580406e99 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -1027,6 +1027,11 @@ void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_jumbo_frame_offload(portid_t portid);
 
+extern int flow_parse(const char *src, void *result, unsigned int size,
+		      struct rte_flow_attr **attr,
+		      struct rte_flow_item **pattern,
+		      struct rte_flow_action **actions);
+
 /*
  * Work-around of a compilation error with ICC on invocations of the
  * rte_be_to_cpu_16() function.
-- 
2.18.1



* [dpdk-dev] [PATCH v6 6/6] app/testpmd: add flex item CLI commands
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
                     ` (4 preceding siblings ...)
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 5/6] app/testpmd: add dedicated flow command parsing routine Viacheslav Ovsiienko
@ 2021-10-18 18:02   ` Viacheslav Ovsiienko
  5 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-18 18:02 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with a fixed number of
supported network protocols. If an application must work with a
protocol that is not included in the port hardware by default, it
can try to add the new protocol to the port hardware.

Flex item, or flex parser, is port infrastructure that allows an
application to add support for a custom network header and
offload flows that match the header elements.

An application must complete the following tasks to create a flow
rule that matches a custom header:

1. Create a flex item object in port hardware.
The application must provide the custom header configuration to the PMD.
The PMD will use that configuration to create the flex item object in
port hardware.

2. Create flex patterns to match. A flex pattern has spec and mask
components, like a regular flow item. Combined together, spec and mask
can target a unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handler value.

3. Create a flow rule with a flex flow item that references
the flex pattern.

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...

The patch works with the jansson library API.
Jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]
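
The spec and mask data in the set flex_pattern commands are hex
strings: two characters per byte, bytes in network order, so 0x15 is
written "15" and 0x1 is written "01". With the "is" form the mask
defaults to all ones over the spec length. A minimal sketch of this
conversion follows. hex_nibble(), hex_to_bytes() and default_mask() are
hypothetical helpers, and the length limit is passed in by the caller
rather than taken from testpmd:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Convert a hex string into bytes. Returns the number of bytes written,
 * or 0 if the string is empty, has odd length, holds a non-hex
 * character, or does not fit into max_len bytes.
 */
static size_t
hex_to_bytes(const char *str, uint8_t *data, size_t max_len)
{
	size_t i, len;

	assert(str != NULL && data != NULL);
	len = strlen(str);
	if (len == 0 || (len & 1) != 0 || len / 2 > max_len)
		return 0;
	len /= 2;
	for (i = 0; i < len; i++) {
		int hi = hex_nibble(str[2 * i]);
		int lo = hex_nibble(str[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return 0;
		data[i] = (uint8_t)((hi << 4) | lo);
	}
	return len;
}

/* Mask used when only a spec is given: match every bit of the spec. */
static void
default_mask(uint8_t *mask, size_t len)
{
	memset(mask, 0xFF, len);
}
```

Returning 0 for every kind of bad input lets the caller treat a zero
length as the single error condition.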

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmd_flex_item.c                | 548 ++++++++++++++++++++
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 223 +++++++-
 app/test-pmd/meson.build                    |   1 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  26 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++++
 7 files changed, 919 insertions(+), 2 deletions(-)
 create mode 100644 app/test-pmd/cmd_flex_item.c

diff --git a/app/test-pmd/cmd_flex_item.c b/app/test-pmd/cmd_flex_item.c
new file mode 100644
index 0000000000..45103e45a8
--- /dev/null
+++ b/app/test-pmd/cmd_flex_item.c
@@ -0,0 +1,548 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 NVIDIA Corporation & Affiliates
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strncmp(key, pattern, strlen(key)) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message);
+	}
+}
+
+static int
+flex_tunnel_parse(json_t *jtun, enum rte_flow_item_flex_tunnel_mode *tunnel)
+{
+	int tun = -1;
+
+	if (json_is_integer(jtun))
+		tun = (int)json_integer_value(jtun);
+	else if (json_is_real(jtun))
+		tun = (int)json_real_value(jtun);
+	else if (json_is_string(jtun)) {
+		const char *mode = json_string_value(jtun);
+
+		if (match_strkey(mode, "FLEX_TUNNEL_MODE_SINGLE"))
+			tun = FLEX_TUNNEL_MODE_SINGLE;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_OUTER"))
+			tun = FLEX_TUNNEL_MODE_OUTER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_INNER"))
+			tun = FLEX_TUNNEL_MODE_INNER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_MULTI"))
+			tun = FLEX_TUNNEL_MODE_MULTI;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_TUNNEL"))
+			tun = FLEX_TUNNEL_MODE_TUNNEL;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	*tunnel = (enum rte_flow_item_flex_tunnel_mode)tun;
+	return 0;
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *src, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct rte_flow_attr *attr;
+	struct rte_flow_item *pattern;
+	struct rte_flow_action *actions;
+
+	snprintf(flow_rule, sizeof(flow_rule), "flow create 0 pattern %s / end", src);
+	src = flow_rule;
+	ret = flow_parse(src, (void *)data, sizeof(data),
+			 &attr, &pattern, &actions);
+	if (ret)
+		return ret;
+	item->type = pattern->type;
+	if (pattern->spec) {
+		ptr = (void *)(uintptr_t)item->spec;
+		memcpy(ptr, pattern->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->spec = NULL;
+	}
+	if (pattern->mask) {
+		ptr = (void *)(uintptr_t)item->mask;
+		memcpy(ptr, pattern->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->mask = NULL;
+	}
+	if (pattern->last) {
+		ptr = (void *)(uintptr_t)item->last;
+		memcpy(ptr, pattern->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->last = NULL;
+	}
+	return 0;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "tunnel")) {
+			ret = flex_tunnel_parse(jobj, &flex_conf->tunnel);
+			if (ret) {
+				printf("Can't parse tunnel value\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_samples = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_inputs = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_outputs = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+
+	base_size = RTE_ALIGN(sizeof(*conf), sizeof(uintptr_t));
+	samples_size = RTE_ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+				 sizeof(conf->sample_data[0]),
+				 sizeof(uintptr_t));
+	links_size = RTE_ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			       sizeof(conf->input_link[0]),
+			       sizeof(uintptr_t));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static int
+flex_item_build_config(struct flex_item *fp, const char *filename)
+{
+	int ret;
+	json_error_t json_error;
+	json_t *jroot = json_load_file(filename, 0, &json_error);
+
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return -1;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	json_decref(jroot);
+	return ret;
+}
+
+void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_build_config(fp, filename);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+}
+
+#else /* RTE_HAS_JANSSON */
+void flex_item_create(__rte_unused portid_t port_id,
+		      __rte_unused uint16_t flex_id,
+		      __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+void flex_item_destroy(__rte_unused portid_t port_id,
+		       __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+void
+port_flex_item_flush(portid_t port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < FLEX_MAX_PARSERS_NUM; i++) {
+		flex_item_destroy(port_id, i);
+		flex_items[port_id][i] = NULL;
+	}
+}
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length does not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 36d50fd3c7..68fb6a4025 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17858,6 +17858,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e8e3e3c29..5734e8082e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,12 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -306,6 +314,9 @@ enum index {
 	ITEM_POL_PORT,
 	ITEM_POL_METER,
 	ITEM_POL_POLICY,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -843,6 +854,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments*/
 	} args; /**< Command arguments. */
 };
 
@@ -870,6 +886,13 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -999,6 +1022,7 @@ static const enum index next_item[] = {
 	ITEM_GENEVE_OPT,
 	ITEM_INTEGRITY,
 	ITEM_CONNTRACK,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1367,6 +1391,13 @@ static const enum index item_integrity_lv[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1722,6 +1753,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1838,6 +1872,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -2038,6 +2074,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2054,7 +2104,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2166,6 +2217,41 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3606,6 +3692,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_conntrack, flags)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -6989,6 +7096,43 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7651,6 +7795,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8181,6 +8390,13 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8632,6 +8848,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 3a8babd604..201bed013f 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -10,6 +10,7 @@ sources = files(
         'cmdline_flow.c',
         'cmdline_mtr.c',
         'cmdline_tm.c',
+        'cmd_flex_item.c',
         'config.c',
         'csumonly.c',
         'flowgen.c',
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a7841c557f..aa01a6fcdb 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2937,6 +2937,7 @@ close_port(portid_t pid)
 
 		if (is_proc_primary()) {
 			port_flow_flush(pi);
+			port_flex_item_flush(pi);
 			rte_eth_dev_close(pi);
 		}
 	}
@@ -4066,7 +4067,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index c580406e99..0e22ddc610 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -282,6 +282,27 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+extern struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+extern struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -306,6 +327,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
@@ -1026,6 +1049,9 @@ uint16_t tx_pkt_set_dynf(uint16_t port_id, __rte_unused uint16_t queue,
 void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_jumbo_frame_offload(portid_t portid);
+void flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename);
+void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+void port_flex_item_flush(portid_t port_id);
 
 extern int flow_parse(const char *src, void *result, unsigned int size,
 		      struct rte_flow_attr **attr,
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index a0efb7d0b0..5ca51c6e4a 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5072,3 +5072,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show functions that configure and create a flex item object,
+create a flex pattern, and use it in a flow rule.
+The examples use the 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | +0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | +4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | +8
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | +12
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | +16
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration. The
+header configuration is compiled by testpmd and stored in a variable of type
+``rte_flow_item_flex_conf``.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item create 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+The flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specifies whether the new header has a fixed or variable length, and the basic
+(minimal) header length value.
+
+If the header length is not fixed, the location of the header field whose value
+completes the header length calculation, and the scale/offset function, must be added.
+
+The scale function depends on the port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the following network header type.
+
+**Flow match samples.**
+
+Describes the locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on the port hardware.
+
+**Input trigger.**
+
+Describes the preceding network header configuration.
+
+**Output trigger.**
+
+Describes the conditions that trigger transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item samples configuration.
+A flex pattern can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20-byte header length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> set flex_pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> set flex_pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [dpdk-dev] [PATCH v6 3/6] ethdev: implement RTE flex item API
  2021-10-18 18:02   ` [dpdk-dev] [PATCH v6 3/6] ethdev: implement RTE flex item API Viacheslav Ovsiienko
@ 2021-10-19  6:12     ` Ori Kam
  0 siblings, 0 replies; 73+ messages in thread
From: Ori Kam @ 2021-10-19  6:12 UTC (permalink / raw)
  To: Slava Ovsiienko, dev
  Cc: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson,
	NBU-Contact-Thomas Monjalon

Hi Slava and Gregory,


> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo@nvidia.com>
> Sent: Monday, October 18, 2021 9:03 PM
> To: dev@dpdk.org
> Cc: Raslan Darawsheh <rasland@nvidia.com>; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; Ori Kam <orika@nvidia.com>; Gregory Etelson <getelson@nvidia.com>; NBU-
> Contact-Thomas Monjalon <thomas@monjalon.net>
> Subject: [PATCH v6 3/6] ethdev: implement RTE flex item API
> 
> From: Gregory Etelson <getelson@nvidia.com>
> 
> RTE flex item API was introduced in
> "ethdev: introduce configurable flexible item" patch.
> 
> The API allows DPDK application to define parser for custom network header in port hardware and
> offload flows that will match the custom header elements.
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---

Acked-by: Ori Kam <orika@nvidia.com>
Thanks,
Ori

^ permalink raw reply	[flat|nested] 73+ messages in thread

* [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (8 preceding siblings ...)
  2021-10-18 18:02 ` [dpdk-dev] [PATCH v6 0/6] " Viacheslav Ovsiienko
@ 2021-10-20 15:06 ` Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 1/4] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (3 more replies)
  2021-10-20 15:14 ` [dpdk-dev] [PATCH v8 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  10 siblings, 4 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:06 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast: network structures are getting more
and more complicated, and new application areas are emerging.
To address these challenges, new network protocols are
continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that multiple
new items have been introduced since the initial release.

Adopting and implementing a new protocol is not straightforward
and takes time: the protocol passes through development,
consideration, adoption, and implementation phases. The industry
tries to anticipate forthcoming network protocols; for example,
many hardware vendors are implementing flexible and configurable
network protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce similar
flexibility in the RTE Flow API?

Let's check what is already merged in our project: there is the
raw item (rte_flow_item_raw). At first glance it looks promising,
and we can try to implement flow matching on the header of some
relatively new tunnel protocol, say the GENEVE header with
variable length options. On further consideration, however, we
run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, supporting tunnel protocols like the aforementioned
GENEVE with variable length protocol options using the raw flow
item becomes very complicated and would require multiple flows
and multiple raw items chained in the same flow (by the way, no
support for chained raw items was found in the implemented
drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new network
protocol with RTE Flows. The protocol specification provides:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The intended way to operate with a flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification. The call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On item creation, the PMD backing the specified ethernet
    device returns an opaque handle identifying the object that
    has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.
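
To illustrate the pattern layout, here is a minimal sketch. It
assumes an item configured with two fields, a 96-bit dummy gap
followed by a 32-bit sample. The struct flex_item_sketch type and
the put_be32() and build_flex_match() helpers are hypothetical
names; the structure is mirrored locally so the code builds
without DPDK headers:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the structure above (hypothetical name, no DPDK needed). */
struct flex_item_sketch {
	void *handle;
	uint32_t length;
	const uint8_t *pattern;
};

/* Assumed layout: 96-bit dummy gap + 32-bit sample = 16 bytes. */
#define SKETCH_GAP_BYTES 12
#define SKETCH_PATTERN_LEN 16

/* Store a 32-bit value in network byte order at buf[0..3]. */
static void
put_be32(uint8_t *buf, uint32_t v)
{
	buf[0] = (uint8_t)(v >> 24);
	buf[1] = (uint8_t)(v >> 16);
	buf[2] = (uint8_t)(v >> 8);
	buf[3] = (uint8_t)v;
}

/*
 * Fill spec and mask so that only the 32-bit sample is matched.
 * Both buffers must hold SKETCH_PATTERN_LEN bytes and must outlive
 * the spec/mask structures that point into them.
 */
static void
build_flex_match(void *handle, uint32_t value,
		 uint8_t *spec_buf, uint8_t *mask_buf,
		 struct flex_item_sketch *spec,
		 struct flex_item_sketch *mask)
{
	assert(spec_buf && mask_buf && spec && mask);
	/* Gap bytes stay zero in both buffers. */
	memset(spec_buf, 0, SKETCH_PATTERN_LEN);
	memset(mask_buf, 0, SKETCH_PATTERN_LEN);
	put_be32(spec_buf + SKETCH_GAP_BYTES, value);
	put_be32(mask_buf + SKETCH_GAP_BYTES, UINT32_MAX);
	spec->handle = handle;
	spec->length = SKETCH_PATTERN_LEN;
	spec->pattern = spec_buf;
	mask->handle = NULL; /* assumption: the handle is taken from spec only */
	mask->length = SKETCH_PATTERN_LEN;
	mask->pattern = mask_buf;
}
```

The spec and mask share one length, and the sample occupies
bytes 12..15 of each buffer in network byte order.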

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol. It specifies
where the next protocol identifier is contained, and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like the eth_type field in the L2 (MAC)
header, or the proto field in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.
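
The offset calculation for the first three modes can be sketched
as below. This is only an illustration under stated assumptions:
the sketch_* names are hypothetical, the indirect field is given
an explicit offset_width (not part of the description above), bits
are numbered from the most significant bit of the first header
byte, and the shift is read as applying to the fetched value only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the offset modes listed above. */
enum sketch_field_mode {
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

struct sketch_field {
	enum sketch_field_mode mode;
	uint32_t field_base;   /* constant part of the offset, in bits */
	uint32_t offset_base;  /* bit offset of the indirect offset field */
	uint32_t offset_width; /* assumed width of that field, 1..32 bits */
	uint32_t offset_mask;  /* mask applied to the fetched value */
	uint32_t offset_shift; /* left shift applied to the masked value */
};

/* Fetch 'width' bits starting at bit offset 'bit_off', MSB first. */
static uint32_t
sketch_fetch_bits(const uint8_t *hdr, uint32_t bit_off, uint32_t width)
{
	uint32_t v = 0;

	for (uint32_t i = 0; i < width; i++) {
		uint32_t bit = bit_off + i;

		v = (v << 1) | ((hdr[bit / 8] >> (7 - bit % 8)) & 1u);
	}
	return v;
}

/* Count the one bits in v. */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/* Return the resulting field offset in bits for the given header. */
static uint32_t
sketch_field_offset(const struct sketch_field *f, const uint8_t *hdr)
{
	uint32_t v;

	assert(f != NULL && hdr != NULL);
	switch (f->mode) {
	case SKETCH_MODE_OFFSET:
		v = sketch_fetch_bits(hdr, f->offset_base, f->offset_width);
		return f->field_base + ((v & f->offset_mask) << f->offset_shift);
	case SKETCH_MODE_BITMASK:
		v = sketch_fetch_bits(hdr, f->offset_base, f->offset_width);
		return f->field_base +
		       (sketch_bitcount(v & f->offset_mask) << f->offset_shift);
	case SKETCH_MODE_FIXED:
	default:
		return f->field_base;
	}
}
```

For an IPv4-like header whose first byte is 0x46, with
offset_base 4, offset_width 4, offset_mask 0xF and offset_shift 5
(dwords to bits), the sketch yields 6 << 5 = 192 bits, that is
24 bytes to the next header.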

The input link configuration section tells the driver after
which protocols and under which conditions the flex item can
follow. An input link specifies the preceding header pattern,
for example for GENEVE it can be a UDP item specifying a match
on destination port with value 6081. The flex item can follow
multiple header types, in which case multiple input links should
be specified. At flow creation time an item with one of the input
link types should precede the flex item, and the driver will
select the correct flex item settings, depending on the actual
flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items in a
chained fashion (two or more flex items within the same flow,
possibly neighbors in the pattern), the flex items reference
each other. In this case, the item that occurs first should be
created with an empty output link list or with a list including
only existing items, and then the second flex item should be
created referencing the first flex item as an input arc; drivers
should adjust the item configuration.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage non-overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };
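
For illustration only, the masked comparison implied by the spec/mask
pair above can be sketched in plain C. The helper name
sketch_flex_match() and its byte-wise processing are assumptions made
for this sketch and are not part of the proposed API; the real match is
performed by the PMD and hardware. Pattern bytes beyond the supplied
length are treated as zero for both value and mask, as the flex item
documentation in this series describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of rte_flow: compare cfg_len bytes of
 * header data with a spec/mask pair of pat_len bytes each.
 * Bytes past pat_len count as zero in both spec and mask.
 */
static bool
sketch_flex_match(const uint8_t *hdr, size_t cfg_len,
		  const uint8_t *spec, const uint8_t *mask, size_t pat_len)
{
	size_t i;

	assert(hdr != NULL && (pat_len == 0 || (spec != NULL && mask != NULL)));
	for (i = 0; i < cfg_len; i++) {
		uint8_t m = i < pat_len ? mask[i] : 0;
		uint8_t s = i < pat_len ? spec[i] : 0;

		/* Only the bits selected by the mask take part in the match. */
		if ((hdr[i] & m) != (s & m))
			return false;
	}
	return true;
}
```

With the example header, only bytes 12..15 (the crucial field) carry a
non-zero mask, so the remaining header bytes never affect the result.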

7. Notes:

 - v6:  http://patches.dpdk.org/project/dpdk/cover/20211018180252.14106-1-viacheslavo@nvidia.com/
 - v5:  http://patches.dpdk.org/project/dpdk/patch/20211012125433.31647-2-viacheslavo@nvidia.com/
 - v4:  http://patches.dpdk.org/project/dpdk/patch/20211012113235.24975-2-viacheslavo@nvidia.com/ 
 - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
 - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
 - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
 - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/

 - v6 -> v7:
   - series re-split and patches reordered, code is the same
   - documentation fixes

 - v5 -> v6:
   - flex item command moved to dedicated file cmd_flex_item.c

 - v4 -> v5:
   - comments addressed
   - testpmd compilation issue fixed

 - v3 -> v4:
   - comments addressed
   - testpmd compilation issues fixed
   - typos fixed

 - v2 -> v3:
   - comments addressed
   - flex item update removed as not supported
   - RSS over flex item fields removed as not supported and non-complete
     API
   - tunnel mode configuration refactored
   - testpmd updated
   - documentation updated
   - PMD patches are removed temporarily (updating WIP, be presented in rc2)

 - v1 -> v2:
   - testpmd CLI to handle flex item is provided
   - draft PMD code is introduced

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Gregory Etelson (2):
  app/testpmd: add dedicated flow command parsing routine
  app/testpmd: add flex item CLI commands

Viacheslav Ovsiienko (2):
  ethdev: support flow elements with variable length
  ethdev: introduce configurable flexible item

 app/test-pmd/cmd_flex_item.c                | 548 ++++++++++++++++++++
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 247 ++++++++-
 app/test-pmd/meson.build                    |   6 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  35 ++
 doc/guides/prog_guide/rte_flow.rst          |  25 +
 doc/guides/rel_notes/release_21_11.rst      |   7 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++++
 lib/ethdev/rte_flow.c                       | 121 ++++-
 lib/ethdev/rte_flow.h                       | 225 ++++++++
 lib/ethdev/rte_flow_driver.h                |   8 +
 lib/ethdev/version.map                      |   2 +
 13 files changed, 1332 insertions(+), 15 deletions(-)
 create mode 100644 app/test-pmd/cmd_flex_item.c

-- 
2.18.1



* [dpdk-dev] [PATCH v7 1/4] ethdev: support flow elements with variable length
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-20 15:06   ` Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 2/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:06 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

Flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new flow item with variable length is coming: the flex
item. To handle this item (and potentially other new ones
with variable pattern length) in flow copy and conversion routines,
the helper function is introduced.
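
As a rough illustration of the idea (not the routine added by this
patch), a deep copy of an item that carries a trailing variable-length
pattern can be sketched as below. The structure sketch_var_item and the
function sketch_var_item_copy() are hypothetical names used only here.
Unlike rte_flow_conv_copy(), this simplified version copies nothing
when the destination is too small; it only reports the size needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an item with a trailing variable pattern. */
struct sketch_var_item {
	uint32_t length;        /* pattern length in bytes */
	const uint8_t *pattern; /* variable-length payload */
};

/*
 * Return the number of bytes a deep copy needs. The copy is performed
 * only when buf is non-NULL and size is large enough; the pattern is
 * stored right behind the fixed-size structure.
 */
static size_t
sketch_var_item_copy(void *buf, size_t size, const struct sketch_var_item *src)
{
	size_t need;
	struct sketch_var_item *dst = buf;

	assert(src != NULL);
	need = sizeof(*src) + src->length;
	if (buf == NULL || size < need)
		return need;
	*dst = *src;
	dst->pattern = NULL;
	if (src->length > 0)
		dst->pattern = memcpy(dst + 1, src->pattern, src->length);
	return need;
}
```

A caller would typically invoke it once with a NULL buffer to learn the
required size and then again with a suitably aligned buffer.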

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 lib/ethdev/rte_flow.c | 66 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 53 insertions(+), 13 deletions(-)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 29f2b0e954..c8e12404a7 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -30,13 +30,52 @@ uint64_t rte_flow_dynf_metadata_mask;
 struct rte_flow_desc_data {
 	const char *name;
 	size_t size;
+	size_t (*desc_fn)(void *dst, const void *src);
 };
 
+/**
+ *
+ * @param buf
+ * Destination memory.
+ * @param data
+ * Source memory
+ * @param size
+ * Requested copy size
+ * @param desc
+ * rte_flow_desc_item - for flow item conversion.
+ * rte_flow_desc_action - for flow action conversion.
+ * @param type
+ * Offset into the desc param or negative value for private flow elements.
+ */
+static inline size_t
+rte_flow_conv_copy(void *buf, const void *data, const size_t size,
+		   const struct rte_flow_desc_data *desc, int type)
+{
+	/**
+	 * Allow PMD private flow item
+	 */
+	size_t sz = type >= 0 ? desc[type].size : sizeof(void *);
+	if (buf == NULL || data == NULL)
+		return 0;
+	rte_memcpy(buf, data, (size > sz ? sz : size));
+	if (desc[type].desc_fn)
+		sz += desc[type].desc_fn(size > 0 ? buf : NULL, data);
+	return sz;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
 		.name = # t, \
-		.size = s, \
+		.size = s,               \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ITEM_FN(t, s, fn) \
+	[RTE_FLOW_ITEM_TYPE_ ## t] = {\
+		.name = # t,                 \
+		.size = s,                   \
+		.desc_fn = fn,               \
 	}
 
 /** Information about known flow pattern items. */
@@ -109,8 +148,17 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
 		.name = # t, \
 		.size = s, \
+		.desc_fn = NULL,\
+	}
+
+#define MK_FLOW_ACTION_FN(t, fn) \
+	[RTE_FLOW_ACTION_TYPE_ ## t] = { \
+		.name = # t, \
+		.size = 0, \
+		.desc_fn = fn,\
 	}
 
+
 /** Information about known flow actions. */
 static const struct rte_flow_desc_data rte_flow_desc_action[] = {
 	MK_FLOW_ACTION(END, 0),
@@ -531,12 +579,8 @@ rte_flow_conv_item_spec(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow item
-		 */
-		off = (int)item->type >= 0 ?
-		      rte_flow_desc_item[item->type].size : sizeof(void *);
-		rte_memcpy(buf, data, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, data, size,
+					 rte_flow_desc_item, item->type);
 		break;
 	}
 	return off;
@@ -638,12 +682,8 @@ rte_flow_conv_action_conf(void *buf, const size_t size,
 		}
 		break;
 	default:
-		/**
-		 * allow PMD private flow action
-		 */
-		off = (int)action->type >= 0 ?
-		      rte_flow_desc_action[action->type].size : sizeof(void *);
-		rte_memcpy(buf, action->conf, (size > off ? off : size));
+		off = rte_flow_conv_copy(buf, action->conf, size,
+					 rte_flow_desc_action, action->type);
 		break;
 	}
 	return off;
-- 
2.18.1



* [dpdk-dev] [PATCH v7 2/4] ethdev: introduce configurable flexible item
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 1/4] ethdev: support flow elements with variable length Viacheslav Ovsiienko
@ 2021-10-20 15:06   ` Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 3/4] app/testpmd: add dedicated flow command parsing routine Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 4/4] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  3 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:06 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length, (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the desired
    configuration according to the protocol specification. The call
    creates the flex item object over the specified ethernet device
    and prepares the PMD and underlying hardware to handle the flex
    item. On the creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the object that has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match against fields of the protocol the item was
configured for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.
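
A minimal sketch of how an application might assemble such a pattern
buffer is shown below. The helper sketch_put_bits() is a hypothetical
name, and the most-significant-bit-first ordering within each byte is
an assumption of this sketch; the text above does not define the bit
order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: append the low `width` bits of `value` to buf
 * at bit position *pos, most significant bit first, and advance *pos.
 * A dummy (gap filler) field is simply appended with value 0.
 */
static void
sketch_put_bits(uint8_t *buf, size_t buf_bits, size_t *pos,
		uint32_t value, unsigned int width)
{
	unsigned int i;

	assert(width <= 32 && *pos + width <= buf_bits);
	for (i = 0; i < width; i++) {
		unsigned int bit = (value >> (width - 1 - i)) & 1u;
		size_t p = *pos + i;

		if (bit)
			buf[p / 8] |= (uint8_t)(0x80u >> (p % 8));
		else
			buf[p / 8] &= (uint8_t)~(0x80u >> (p % 8));
	}
	*pos += width;
}
```

For example, a 3-bit field followed by a 5-bit dummy field brings the
next field back onto a byte boundary.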

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).
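
The offset modes above can be sketched in plain C as follows. The names
here (sketch_field_mode, sketch_field_offset) are hypothetical, and the
value fetched from the packet at offset_base is passed in as the
`indirect` argument rather than extracted. The sketch also assumes the
shift applies to the masked value before field_base is added, which is
how the formulas read in prose; taken literally as C expressions they
would shift the sum instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset modes described above. */
enum sketch_field_mode {
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

/* Count the one bits in a 32-bit value. */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/* Compute the resulting field offset for the given mode. */
static int64_t
sketch_field_offset(enum sketch_field_mode mode, int32_t field_base,
		    uint32_t indirect, uint32_t offset_mask,
		    uint32_t offset_shift)
{
	uint32_t v = indirect & offset_mask;

	assert(offset_shift < 32);
	switch (mode) {
	case SKETCH_MODE_OFFSET:
		return field_base + ((int64_t)v << offset_shift);
	case SKETCH_MODE_BITMASK:
		return field_base +
		       ((int64_t)sketch_bitcount(v) << offset_shift);
	case SKETCH_MODE_FIXED:
	default:
		return field_base;
	}
}
```

For instance, an IPv4 header length field holding 5 dwords with a shift
of 2 yields 20, the header size in bytes.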

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and under what conditions the flex item can follow.
An input link specifies the preceding header pattern, for example
for GENEVE it can be a UDP item specifying a match on destination
port 6081. The flex item can follow multiple header types, in
which case multiple input links should be specified. At flow
creation time an item with one of the input link types should
precede the flex item, and the driver will select the correct flex
item settings depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are to be supported with flex items in
chained fashion (two or more flex items within the same flow,
possibly as neighbors in the pattern), the flex items reference
each other. In this case, the item that occurs first should be
created with an empty output link list or with a list including
only existing items, and then the second flex item should be
created referencing the first flex item as an input arc. Drivers
should adjust the item configuration accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage non-overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2	   /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst     |  25 +++
 doc/guides/rel_notes/release_21_11.rst |   7 +
 lib/ethdev/rte_flow.c                  |  55 ++++++
 lib/ethdev/rte_flow.h                  | 225 +++++++++++++++++++++++++
 lib/ethdev/rte_flow_driver.h           |   8 +
 lib/ethdev/version.map                 |   2 +
 6 files changed, 322 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index fa05fe0845..aeba374182 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1548,6 +1548,31 @@ This item is meant to use the same structure as `Item: PORT_REPRESENTOR`_.
 
 See also `Action: REPRESENTED_PORT`_.
 
+Item: ``FLEX``
+^^^^^^^^^^^^^^
+
+Matches with the custom network protocol header that was created
+using rte_flow_flex_item_create() API. The application describes
+the desired header structure, defines the header fields attributes
+and header relations with preceding and following protocols and
+configures the ethernet devices accordingly via
+rte_flow_flex_item_create() routine.
+
+- ``handle``: the flex item handle returned by the PMD on successful
+  rte_flow_flex_item_create() call, mask for this field is ignored.
+- ``length``: match pattern length in bytes. If the length does not cover
+  all fields defined in item configuration, the pattern spec and mask are
+  considered by the driver as padded with trailing zeroes till the full
+  configured item pattern length.
+- ``pattern``: pattern to match. The pattern is concatenation of bit fields
+  configured at item creation. At configuration the fields are presented
+  by sample_data array. The order of the bitfields is defined by the order
+  of sample_data elements. The width of each bitfield is defined by the width
+  specified in the corresponding sample_data element as well. If pattern
+  length is smaller than configured fields overall length it is considered
+  as padded with trailing zeroes up to full configured length, both for
+  value and mask.
+
 Actions
 ~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 44379da5af..e2b1a5882d 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -80,6 +80,13 @@ New Features
   Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and
   TCP/UDP/SCTP header checksum field can be used as input set for RSS.
 
+* **Introduced flow flex item.**
+
+  * The configurable flow flex item provides the capability to introduce
+    the arbitrary user specified network protocol header, configure the device
+    hardware accordingly, and perform match on this header with desired patterns
+    and masks.
+
 * **Added ethdev support to control delivery of Rx metadata from the HW to the PMD**
 
   A new API, ``rte_eth_rx_metadata_negotiate()``, was added.
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index c8e12404a7..14294017ed 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -63,6 +63,19 @@ rte_flow_conv_copy(void *buf, const void *data, const size_t size,
 	return sz;
 }
 
+static size_t
+rte_flow_item_flex_conv(void *buf, const void *data)
+{
+	struct rte_flow_item_flex *dst = buf;
+	const struct rte_flow_item_flex *src = data;
+	if (buf) {
+		dst->pattern = rte_memcpy
+			((void *)((uintptr_t)(dst + 1)), src->pattern,
+			 src->length);
+	}
+	return src->length;
+}
+
 /** Generate flow_item[] entry. */
 #define MK_FLOW_ITEM(t, s) \
 	[RTE_FLOW_ITEM_TYPE_ ## t] = { \
@@ -141,6 +154,8 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(CONNTRACK, sizeof(uint32_t)),
 	MK_FLOW_ITEM(PORT_REPRESENTOR, sizeof(struct rte_flow_item_ethdev)),
 	MK_FLOW_ITEM(REPRESENTED_PORT, sizeof(struct rte_flow_item_ethdev)),
+	MK_FLOW_ITEM_FN(FLEX, sizeof(struct rte_flow_item_flex),
+			rte_flow_item_flex_conv),
 };
 
 /** Generate flow_action[] entry. */
@@ -1332,3 +1347,43 @@ rte_flow_pick_transfer_proxy(uint16_t port_id, uint16_t *proxy_port_id,
 			ops->pick_transfer_proxy(dev, proxy_port_id, error),
 			error);
 }
+
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	struct rte_flow_item_flex_handle *handle;
+
+	if (unlikely(!ops))
+		return NULL;
+	if (unlikely(!ops->flex_item_create)) {
+		rte_flow_error_set(error, ENOTSUP,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL, rte_strerror(ENOTSUP));
+		return NULL;
+	}
+	handle = ops->flex_item_create(dev, conf, error);
+	if (handle == NULL)
+		flow_err(port_id, -rte_errno, error);
+	return handle;
+}
+
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error)
+{
+	int ret;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+
+	if (unlikely(!ops || !ops->flex_item_release))
+		return rte_flow_error_set(error, ENOTSUP,
+					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+					  NULL, rte_strerror(ENOTSUP));
+	ret = ops->flex_item_release(dev, handle, error);
+	return flow_err(port_id, ret, error);
+}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index d5bfdaaaf2..ba069f2f63 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -635,6 +635,15 @@ enum rte_flow_item_type {
 	 * @see struct rte_flow_item_ethdev
 	 */
 	RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT,
+
+	/**
+	 * Matches a configured set of fields at runtime calculated offsets
+	 * over the generic network header with variable length and
+	 * flexible pattern
+	 *
+	 * @see struct rte_flow_item_flex.
+	 */
+	RTE_FLOW_ITEM_TYPE_FLEX,
 };
 
 /**
@@ -1931,6 +1940,177 @@ struct rte_flow_item {
 	const void *mask; /**< Bit-mask applied to spec and last. */
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_FLEX
+ *
+ * Matches a specified set of fields within the network protocol
+ * header. Each field is presented as set of bits with specified width, and
+ * bit offset from the header beginning.
+ *
+ * The pattern is concatenation of bit fields configured at item creation
+ * by rte_flow_flex_item_create(). At configuration the fields are presented
+ * by sample_data array.
+ *
+ * This type does not support ranges (struct rte_flow_item.last).
+ */
+struct rte_flow_item_flex {
+	struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
+	uint32_t length; /**< Pattern length in bytes. */
+	const uint8_t *pattern; /**< Combined bitfields pattern to match. */
+};
+/**
+ * Field bit offset calculation mode.
+ */
+enum rte_flow_item_flex_field_mode {
+	/**
+	 * Dummy field, used for byte boundary alignment in pattern.
+	 * Pattern mask and data are ignored in the match. All configuration
+	 * parameters besides field size are ignored.
+	 */
+	FIELD_MODE_DUMMY = 0,
+	/**
+	 * Fixed offset field. The bit offset from header beginning
+	 * is permanent and defined by field_base parameter.
+	 */
+	FIELD_MODE_FIXED,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field). The resulting field offset to match is calculated as:
+	 *
+	 *    field_base + (*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_OFFSET,
+	/**
+	 * The field bit offset is extracted from other header field (indirect
+	 * offset field), the latter is considered as bitmask containing some
+	 * number of one bits, the resulting field offset to match is
+	 * calculated as:
+	 *
+	 *    field_base + bitcount(*offset_base & offset_mask) << offset_shift
+	 */
+	FIELD_MODE_BITMASK,
+};
+
+/**
+ * Flex item field tunnel mode
+ */
+enum rte_flow_item_flex_tunnel_mode {
+	/**
+	 * The protocol header can be present in the packet only once.
+	 * No multiple flex item flow inclusions (for inner/outer) are allowed.
+	 * No relations with tunnel protocols are imposed. The drivers
+	 * can optimize hardware resource usage to handle match on single flex
+	 * item of specific type.
+	 */
+	FLEX_TUNNEL_MODE_SINGLE = 0,
+	/**
+	 * Flex item presents outer header only.
+	 */
+	FLEX_TUNNEL_MODE_OUTER,
+	/**
+	 * Flex item presents inner header only.
+	 */
+	FLEX_TUNNEL_MODE_INNER,
+	/**
+	 * Flex item presents either inner or outer header. The driver
+	 * handles as many multiple inners as hardware supports.
+	 */
+	FLEX_TUNNEL_MODE_MULTI,
+	/**
+	 * Flex item presents tunnel protocol header.
+	 */
+	FLEX_TUNNEL_MODE_TUNNEL,
+};
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+__extension__
+struct rte_flow_item_flex_field {
+	/** Defines how match field offset is calculated over the packet. */
+	enum rte_flow_item_flex_field_mode field_mode;
+	uint32_t field_size; /**< Field size in bits. */
+	int32_t field_base; /**< Field offset in bits. */
+	uint32_t offset_base; /**< Indirect offset field offset in bits. */
+	uint32_t offset_mask; /**< Indirect offset field bit mask. */
+	int32_t offset_shift; /**< Indirect offset multiply factor. */
+	uint32_t field_id:16; /**< Device hint, for multiple items in flow. */
+	uint32_t reserved:16; /**< Reserved field. */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_link {
+	/**
+	 * Preceding/following header. The item type must be always provided.
+	 * For preceding one item must specify the header value/mask to match
+	 * for the link be taken and start the flex item header parsing.
+	 */
+	struct rte_flow_item item;
+	/**
+	 * Next field value to match to continue with one of the configured
+	 * next protocols.
+	 */
+	uint32_t next;
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ */
+struct rte_flow_item_flex_conf {
+	/**
+	 * Specifies the flex item and tunnel relations and tells the PMD
+	 * whether flex item can be used for inner, outer or both headers,
+	 * or whether flex item presents the tunnel protocol itself.
+	 */
+	enum rte_flow_item_flex_tunnel_mode tunnel;
+	/**
+	 * The next header offset, it presents the network header size covered
+	 * by the flex item and can be obtained with all supported offset
+	 * calculating methods (fixed, dedicated field, bitmask, etc).
+	 */
+	struct rte_flow_item_flex_field next_header;
+	/**
+	 * Specifies the next protocol field to match with link next protocol
+	 * values and continue packet parsing with matching link.
+	 */
+	struct rte_flow_item_flex_field next_protocol;
+	/**
+	 * The fields will be sampled and presented for explicit match
+	 * with pattern in the rte_flow_item_flex. There can be multiple
+	 * field descriptors, the number should be specified by nb_samples.
+	 */
+	struct rte_flow_item_flex_field *sample_data;
+	/** Number of field descriptors in the sample_data array. */
+	uint32_t nb_samples;
+	/**
+	 * Input link defines the flex item relation with preceding
+	 * header. It specifies the preceding item type and provides pattern
+	 * to match. The flex item will continue parsing and will provide the
+	 * data to flow match if there is a match with one of the input
+	 * links.
+	 */
+	struct rte_flow_item_flex_link *input_link;
+	/** Number of link descriptors in the input link array. */
+	uint32_t nb_inputs;
+	/**
+	 * Output link defines the next protocol field value to match and
+	 * the following protocol header to continue packet parsing. Also
+	 * defines the tunnel-related behaviour.
+	 */
+	struct rte_flow_item_flex_link *output_link;
+	/** Number of link descriptors in the output link array. */
+	uint32_t nb_outputs;
+};
+
 /**
  * Action types.
  *
@@ -4477,6 +4657,51 @@ __rte_experimental
 int
 rte_flow_pick_transfer_proxy(uint16_t port_id, uint16_t *proxy_port_id,
 			     struct rte_flow_error *error);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Create the flex item with specified configuration over
+ * the Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] conf
+ *   Item configuration.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
+ */
+__rte_experimental
+struct rte_flow_item_flex_handle *
+rte_flow_flex_item_create(uint16_t port_id,
+			  const struct rte_flow_item_flex_conf *conf,
+			  struct rte_flow_error *error);
+
+/**
+ * Release the flex item on the specified Ethernet device.
+ *
+ * @param port_id
+ *   Port identifier of Ethernet device.
+ * @param[in] handle
+ *   Handle of the item existing on the specified device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL. PMDs initialize this
+ *   structure in case of error only.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+__rte_experimental
+int
+rte_flow_flex_item_release(uint16_t port_id,
+			   const struct rte_flow_item_flex_handle *handle,
+			   struct rte_flow_error *error);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ethdev/rte_flow_driver.h b/lib/ethdev/rte_flow_driver.h
index ed52e59a0a..f691b04af4 100644
--- a/lib/ethdev/rte_flow_driver.h
+++ b/lib/ethdev/rte_flow_driver.h
@@ -144,6 +144,14 @@ struct rte_flow_ops {
 		(struct rte_eth_dev *dev,
 		 uint16_t *proxy_port_id,
 		 struct rte_flow_error *error);
+	struct rte_flow_item_flex_handle *(*flex_item_create)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_conf *conf,
+		 struct rte_flow_error *error);
+	int (*flex_item_release)
+		(struct rte_eth_dev *dev,
+		 const struct rte_flow_item_flex_handle *handle,
+		 struct rte_flow_error *error);
 };
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 5e6f59c6d3..102353a4cd 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -253,6 +253,8 @@ EXPERIMENTAL {
 	rte_eth_macaddrs_get;
 	rte_eth_rx_metadata_negotiate;
 	rte_flow_pick_transfer_proxy;
+	rte_flow_flex_item_create;
+	rte_flow_flex_item_release;
 };
 
 INTERNAL {
-- 
2.18.1
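
For reference, the field offset modes documented in the rte_flow.h hunk
above (FIELD_MODE_FIXED, FIELD_MODE_OFFSET, FIELD_MODE_BITMASK) can be
sketched in plain C. This is only an illustration, not the PMD
implementation. All sketch_* names are hypothetical. The sketch assumes
that the shift is applied to the masked indirect value before field_base
is added (the formula in the comment is not parenthesized), that the
indirect value has already been read from the header at offset_base,
and that the shift is non-negative and below 32.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the modes in the patch, not the DPDK enum. */
enum sketch_field_mode {
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

/* Count the one bits in v. */
static uint32_t
sketch_bitcount(uint32_t v)
{
	uint32_t n = 0;

	while (v) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Compute the bit offset of a field.
 * indirect: value already read from the header at offset_base
 *           (assumption: the caller performs that read).
 * offset_shift: assumed to be in the range 0..31.
 */
static int32_t
sketch_field_offset(enum sketch_field_mode mode, int32_t field_base,
		    uint32_t indirect, uint32_t offset_mask,
		    uint32_t offset_shift)
{
	uint32_t v = indirect & offset_mask;

	switch (mode) {
	case SKETCH_MODE_OFFSET:
		return field_base + (int32_t)(v << offset_shift);
	case SKETCH_MODE_BITMASK:
		return field_base +
		       (int32_t)(sketch_bitcount(v) << offset_shift);
	case SKETCH_MODE_FIXED:
	default:
		return field_base;
	}
}
```

For instance, a header with 64 bits of fixed part followed by options
whose length is given in 32-bit words by a 6-bit field could use
field_base = 64, offset_mask = 0x3f and offset_shift = 5.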



* [dpdk-dev] [PATCH v7 3/4] app/testpmd: add dedicated flow command parsing routine
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 1/4] ethdev: support flow elements with variable length Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 2/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-20 15:06   ` Viacheslav Ovsiienko
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 4/4] app/testpmd: add flex item CLI commands Viacheslav Ovsiienko
  3 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:06 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

testpmd flow creation consists of these steps:
  1. receive a string with the flow rule description;
  2. parse the input string and build the flow parameters: port_id
     value, flow attributes, items array, actions array;
  3. create a flow rule from the flow rule parameters.

The flow rule creation steps are built as a pipeline: each step
starts immediately after its predecessor completes successfully.
Because of this, there are no dedicated routines that provide the
intermediate results of steps 1-3 above.

The patch adds the `flow_parse()` function. It parses an input string
and returns the parsed data to the caller. This is a preparation step
for introducing flex item command processing.
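
The loop used by `flow_parse()` (call a token parser repeatedly, skip
the whitespace after each token, succeed only if the whole string was
consumed) can be sketched in isolation. This is a minimal sketch, not
the testpmd code: sketch_parse_token() is a hypothetical stand-in for
cmd_flow_parse() that accepts any whitespace-delimited token and treats
'!' as a syntax error.

```c
#include <ctype.h>

/*
 * Hypothetical token parser: consumes one whitespace-delimited token,
 * increments *count and returns the token length. Returns -1 when the
 * token contains '!' (stand-in for a syntax error), 0 on empty input.
 */
static int
sketch_parse_token(const char *src, unsigned int *count)
{
	int len = 0;

	while (src[len] != '\0' && !isspace((unsigned char)src[len])) {
		if (src[len] == '!')
			return -1;
		len++;
	}
	if (len > 0)
		(*count)++;
	return len;
}

/* Returns 0 when the whole string was consumed, -1 otherwise. */
static int
sketch_parse_all(const char *src, unsigned int *count)
{
	int ret;

	*count = 0;
	do {
		ret = sketch_parse_token(src, count);
		if (ret > 0) {
			src += ret;
			/* Skip the separator before the next token. */
			while (isspace((unsigned char)*src))
				src++;
		}
	} while (ret > 0 && *src != '\0');
	return (ret >= 0 && *src == '\0') ? 0 : -1;
}
```

With this sketch, sketch_parse_all("flow create 0", &n) returns 0 and
sets n to 3, while a string containing a bad token returns -1.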

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmdline_flow.c | 24 ++++++++++++++++++++++++
 app/test-pmd/testpmd.h      |  5 +++++
 2 files changed, 29 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a90822b660..cd640b9b7a 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -8076,6 +8076,30 @@ cmd_flow_parse(cmdline_parse_token_hdr_t *hdr, const char *src, void *result,
 	return len;
 }
 
+int
+flow_parse(const char *src, void *result, unsigned int size,
+	   struct rte_flow_attr **attr,
+	   struct rte_flow_item **pattern, struct rte_flow_action **actions)
+{
+	int ret;
+	struct context saved_flow_ctx = cmd_flow_context;
+
+	cmd_flow_context_init(&cmd_flow_context);
+	do {
+		ret = cmd_flow_parse(NULL, src, result, size);
+		if (ret > 0) {
+			src += ret;
+			while (isspace(*src))
+				src++;
+		}
+	} while (ret > 0 && strlen(src));
+	cmd_flow_context = saved_flow_ctx;
+	*attr = &((struct buffer *)result)->args.vc.attr;
+	*pattern = ((struct buffer *)result)->args.vc.pattern;
+	*actions = ((struct buffer *)result)->args.vc.actions;
+	return (ret >= 0 && !strlen(src)) ? 0 : -1;
+}
+
 /** Return number of completion entries (cmdline API). */
 static int
 cmd_flow_complete_get_nb(cmdline_parse_token_hdr_t *hdr)
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index dd8f27a296..81be754605 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -1047,6 +1047,11 @@ void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_mtu_from_frame_size(portid_t portid, uint32_t max_rx_pktlen);
 
+extern int flow_parse(const char *src, void *result, unsigned int size,
+		      struct rte_flow_attr **attr,
+		      struct rte_flow_item **pattern,
+		      struct rte_flow_action **actions);
+
 /*
  * Work-around of a compilation error with ICC on invocations of the
  * rte_be_to_cpu_16() function.
-- 
2.18.1



* [dpdk-dev] [PATCH v7 4/4] app/testpmd: add flex item CLI commands
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                     ` (2 preceding siblings ...)
  2021-10-20 15:06   ` [dpdk-dev] [PATCH v7 3/4] app/testpmd: add dedicated flow command parsing routine Viacheslav Ovsiienko
@ 2021-10-20 15:06   ` Viacheslav Ovsiienko
  3 siblings, 0 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:06 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

From: Gregory Etelson <getelson@nvidia.com>

Network port hardware is shipped with a fixed number of
supported network protocols. If an application must work with a
protocol that is not supported by the port hardware by default, it
can try to add the new protocol to the port hardware.

A flex item, or flex parser, is port infrastructure that allows
an application to add support for a custom network header and
offload flows that match the header elements.

An application must complete the following tasks to create a flow
rule that matches a custom header:

1. Create a flex item object in the port hardware.
The application must provide the custom header configuration to the
PMD. The PMD will use that configuration to create the flex item
object in the port hardware.

2. Create flex patterns to match. A flex pattern has spec and mask
components, like a regular flow item. Combined together, spec and mask
can target a unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
A flex pattern is identified by a unique handle value.

3. Create a flow rule with a flex flow item that references
the flex pattern.
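
How a spec and a mask combine in step 2 can be illustrated with a short
software sketch. It assumes the usual rte_flow convention that only the
bits set in the mask are compared. The function name is hypothetical,
and the real match is performed by the port hardware, not by code like
this.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 when data matches spec on every bit selected by mask,
 * 0 otherwise. All three buffers are len bytes long.
 */
static int
sketch_flex_match(const uint8_t *data, const uint8_t *spec,
		  const uint8_t *mask, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* Bits that differ and are selected by the mask. */
		if ((data[i] ^ spec[i]) & mask[i])
			return 0;
	}
	return 1;
}
```

A mask of all 0xff bytes targets one unique data sequence. Clearing
mask bits widens the match to every sequence that differs from spec
only in the cleared positions.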

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...

The patch uses the jansson library API.
A new optional dependency on the jansson library is added for
testpmd. If jansson is not detected, the flex item functionality
is disabled.
The jansson development files must be present:
jansson.pc, jansson.h, libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 app/test-pmd/cmd_flex_item.c                | 548 ++++++++++++++++++++
 app/test-pmd/cmdline.c                      |   2 +
 app/test-pmd/cmdline_flow.c                 | 223 +++++++-
 app/test-pmd/meson.build                    |   6 +
 app/test-pmd/testpmd.c                      |   2 +-
 app/test-pmd/testpmd.h                      |  30 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 119 +++++
 7 files changed, 928 insertions(+), 2 deletions(-)
 create mode 100644 app/test-pmd/cmd_flex_item.c

diff --git a/app/test-pmd/cmd_flex_item.c b/app/test-pmd/cmd_flex_item.c
new file mode 100644
index 0000000000..45103e45a8
--- /dev/null
+++ b/app/test-pmd/cmd_flex_item.c
@@ -0,0 +1,548 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 NVIDIA Corporation & Affiliates
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_ethdev.h>
+#include <cmdline_parse.h>
+#include <cmdline_parse_string.h>
+#include <cmdline_parse_num.h>
+#include <rte_flow.h>
+
+#include "testpmd.h"
+
+struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
+
+#ifdef RTE_HAS_JANSSON
+static __rte_always_inline bool
+match_strkey(const char *key, const char *pattern)
+{
+	return strncmp(key, pattern, strlen(key)) == 0;
+}
+
+static struct flex_item *
+flex_parser_fetch(uint16_t port_id, uint16_t flex_id)
+{
+	if (port_id >= RTE_MAX_ETHPORTS) {
+		printf("Invalid port_id: %u\n", port_id);
+		return FLEX_PARSER_ERR;
+	}
+	if (flex_id >= FLEX_MAX_PARSERS_NUM) {
+		printf("Invalid flex item flex_id: %u\n", flex_id);
+		return FLEX_PARSER_ERR;
+	}
+	return flex_items[port_id][flex_id];
+}
+
+void
+flex_item_destroy(portid_t port_id, uint16_t flex_id)
+{
+	int ret;
+	struct rte_flow_error error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (!fp)
+		return;
+	ret = rte_flow_flex_item_release(port_id, fp->flex_handle, &error);
+	if (!ret) {
+		free(fp);
+		flex_items[port_id][flex_id] = NULL;
+		printf("port-%u: released flex item #%u\n",
+		       port_id, flex_id);
+
+	} else {
+		printf("port-%u: cannot release flex item #%u: %s\n",
+		       port_id, flex_id, error.message);
+	}
+}
+
+static int
+flex_tunnel_parse(json_t *jtun, enum rte_flow_item_flex_tunnel_mode *tunnel)
+{
+	int tun = -1;
+
+	if (json_is_integer(jtun))
+		tun = (int)json_integer_value(jtun);
+	else if (json_is_real(jtun))
+		tun = (int)json_real_value(jtun);
+	else if (json_is_string(jtun)) {
+		const char *mode = json_string_value(jtun);
+
+		if (match_strkey(mode, "FLEX_TUNNEL_MODE_SINGLE"))
+			tun = FLEX_TUNNEL_MODE_SINGLE;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_OUTER"))
+			tun = FLEX_TUNNEL_MODE_OUTER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_INNER"))
+			tun = FLEX_TUNNEL_MODE_INNER;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_MULTI"))
+			tun = FLEX_TUNNEL_MODE_MULTI;
+		else if (match_strkey(mode, "FLEX_TUNNEL_MODE_TUNNEL"))
+			tun = FLEX_TUNNEL_MODE_TUNNEL;
+		else
+			return -EINVAL;
+	} else
+		return -EINVAL;
+	*tunnel = (enum rte_flow_item_flex_tunnel_mode)tun;
+	return 0;
+}
+
+static int
+flex_field_parse(json_t *jfld, struct rte_flow_item_flex_field *fld)
+{
+	const char *key;
+	json_t *je;
+
+#define FLEX_FIELD_GET(fm, t) \
+do {                  \
+	if (!strncmp(key, # fm, strlen(# fm))) { \
+		if (json_is_real(je))   \
+			fld->fm = (t) json_real_value(je); \
+		else if (json_is_integer(je))   \
+			fld->fm = (t) json_integer_value(je); \
+		else   \
+			return -EINVAL; \
+	}         \
+} while (0)
+
+	json_object_foreach(jfld, key, je) {
+		FLEX_FIELD_GET(field_size, uint32_t);
+		FLEX_FIELD_GET(field_base, int32_t);
+		FLEX_FIELD_GET(offset_base, uint32_t);
+		FLEX_FIELD_GET(offset_mask, uint32_t);
+		FLEX_FIELD_GET(offset_shift, int32_t);
+		FLEX_FIELD_GET(field_id, uint16_t);
+		if (match_strkey(key, "field_mode")) {
+			const char *mode;
+			if (!json_is_string(je))
+				return -EINVAL;
+			mode = json_string_value(je);
+			if (match_strkey(mode, "FIELD_MODE_DUMMY"))
+				fld->field_mode = FIELD_MODE_DUMMY;
+			else if (match_strkey(mode, "FIELD_MODE_FIXED"))
+				fld->field_mode = FIELD_MODE_FIXED;
+			else if (match_strkey(mode, "FIELD_MODE_OFFSET"))
+				fld->field_mode = FIELD_MODE_OFFSET;
+			else if (match_strkey(mode, "FIELD_MODE_BITMASK"))
+				fld->field_mode = FIELD_MODE_BITMASK;
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+enum flex_link_type {
+	FLEX_LINK_IN = 0,
+	FLEX_LINK_OUT = 1
+};
+
+static int
+flex_link_item_parse(const char *src, struct rte_flow_item *item)
+{
+#define  FLEX_PARSE_DATA_SIZE 1024
+
+	int ret;
+	uint8_t *ptr, data[FLEX_PARSE_DATA_SIZE] = {0,};
+	char flow_rule[256];
+	struct rte_flow_attr *attr;
+	struct rte_flow_item *pattern;
+	struct rte_flow_action *actions;
+
+	sprintf(flow_rule, "flow create 0 pattern %s / end", src);
+	src = flow_rule;
+	ret = flow_parse(src, (void *)data, sizeof(data),
+			 &attr, &pattern, &actions);
+	if (ret)
+		return ret;
+	item->type = pattern->type;
+	if (pattern->spec) {
+		ptr = (void *)(uintptr_t)item->spec;
+		memcpy(ptr, pattern->spec, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->spec = NULL;
+	}
+	if (pattern->mask) {
+		ptr = (void *)(uintptr_t)item->mask;
+		memcpy(ptr, pattern->mask, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->mask = NULL;
+	}
+	if (pattern->last) {
+		ptr = (void *)(uintptr_t)item->last;
+		memcpy(ptr, pattern->last, FLEX_MAX_FLOW_PATTERN_LENGTH);
+	} else {
+		item->last = NULL;
+	}
+	return 0;
+}
+
+static int
+flex_link_parse(json_t *jobj, struct rte_flow_item_flex_link *link,
+		enum flex_link_type link_type)
+{
+	const char *key;
+	json_t *je;
+	int ret;
+	json_object_foreach(jobj, key, je) {
+		if (match_strkey(key, "item")) {
+			if (!json_is_string(je))
+				return -EINVAL;
+			ret = flex_link_item_parse(json_string_value(je),
+						   &link->item);
+			if (ret)
+				return -EINVAL;
+			if (link_type == FLEX_LINK_IN) {
+				if (!link->item.spec || !link->item.mask)
+					return -EINVAL;
+				if (link->item.last)
+					return -EINVAL;
+			}
+		}
+		if (match_strkey(key, "next")) {
+			if (json_is_integer(je))
+				link->next = (typeof(link->next))
+					     json_integer_value(je);
+			else if (json_is_real(je))
+				link->next = (typeof(link->next))
+					     json_real_value(je);
+			else
+				return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static int flex_item_config(json_t *jroot,
+			    struct rte_flow_item_flex_conf *flex_conf)
+{
+	const char *key;
+	json_t *jobj = NULL;
+	int ret = 0;
+
+	json_object_foreach(jroot, key, jobj) {
+		if (match_strkey(key, "tunnel")) {
+			ret = flex_tunnel_parse(jobj, &flex_conf->tunnel);
+			if (ret) {
+				printf("Can't parse tunnel value\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_header")) {
+			ret = flex_field_parse(jobj, &flex_conf->next_header);
+			if (ret) {
+				printf("Can't parse next_header field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "next_protocol")) {
+			ret = flex_field_parse(jobj,
+					       &flex_conf->next_protocol);
+			if (ret) {
+				printf("Can't parse next_protocol field\n");
+				goto out;
+			}
+		} else if (match_strkey(key, "sample_data")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_field_parse
+					(ji, flex_conf->sample_data + i);
+				if (ret) {
+					printf("Can't parse sample_data field(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_samples = size;
+		} else if (match_strkey(key, "input_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse(ji,
+						      flex_conf->input_link + i,
+						      FLEX_LINK_IN);
+				if (ret) {
+					printf("Can't parse input_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_inputs = size;
+		} else if (match_strkey(key, "output_link")) {
+			json_t *ji;
+			uint32_t i, size = json_array_size(jobj);
+			for (i = 0; i < size; i++) {
+				ji = json_array_get(jobj, i);
+				ret = flex_link_parse
+					(ji, flex_conf->output_link + i,
+					 FLEX_LINK_OUT);
+				if (ret) {
+					printf("Can't parse output_link(s)\n");
+					goto out;
+				}
+			}
+			flex_conf->nb_outputs = size;
+		}
+	}
+out:
+	return ret;
+}
+
+static struct flex_item *
+flex_item_init(void)
+{
+	size_t base_size, samples_size, links_size, spec_size;
+	struct rte_flow_item_flex_conf *conf;
+	struct flex_item *fp;
+	uint8_t (*pattern)[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	int i;
+
+	base_size = RTE_ALIGN(sizeof(*conf), sizeof(uintptr_t));
+	samples_size = RTE_ALIGN(FLEX_ITEM_MAX_SAMPLES_NUM *
+				 sizeof(conf->sample_data[0]),
+				 sizeof(uintptr_t));
+	links_size = RTE_ALIGN(FLEX_ITEM_MAX_LINKS_NUM *
+			       sizeof(conf->input_link[0]),
+			       sizeof(uintptr_t));
+	/* spec & mask for all input links */
+	spec_size = 2 * FLEX_MAX_FLOW_PATTERN_LENGTH * FLEX_ITEM_MAX_LINKS_NUM;
+	fp = calloc(1, base_size + samples_size + 2 * links_size + spec_size);
+	if (fp == NULL) {
+		printf("Can't allocate memory for flex item\n");
+		return NULL;
+	}
+	conf = &fp->flex_conf;
+	conf->sample_data = (typeof(conf->sample_data))
+			    ((uint8_t *)fp + base_size);
+	conf->input_link = (typeof(conf->input_link))
+			   ((uint8_t *)conf->sample_data + samples_size);
+	conf->output_link = (typeof(conf->output_link))
+			    ((uint8_t *)conf->input_link + links_size);
+	pattern = (typeof(pattern))((uint8_t *)conf->output_link + links_size);
+	for (i = 0; i < FLEX_ITEM_MAX_LINKS_NUM; i++) {
+		struct rte_flow_item_flex_link *in = conf->input_link + i;
+		in->item.spec = pattern++;
+		in->item.mask = pattern++;
+	}
+	return fp;
+}
+
+static int
+flex_item_build_config(struct flex_item *fp, const char *filename)
+{
+	int ret;
+	json_error_t json_error;
+	json_t *jroot = json_load_file(filename, 0, &json_error);
+
+	if (!jroot) {
+		printf("Bad JSON file \"%s\": %s\n", filename, json_error.text);
+		return -1;
+	}
+	ret = flex_item_config(jroot, &fp->flex_conf);
+	json_decref(jroot);
+	return ret;
+}
+
+void
+flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename)
+{
+	struct rte_flow_error flow_error;
+	struct flex_item *fp = flex_parser_fetch(port_id, flex_id);
+	int ret;
+
+	if (fp == FLEX_PARSER_ERR) {
+		printf("Bad parameters: port_id=%u flex_id=%u\n",
+		       port_id, flex_id);
+		return;
+	}
+	if (fp) {
+		printf("port-%u: flex item #%u is already in use\n",
+		       port_id, flex_id);
+		return;
+	}
+	fp = flex_item_init();
+	if (!fp) {
+		printf("Could not allocate flex item\n");
+		goto out;
+	}
+	ret = flex_item_build_config(fp, filename);
+	if (ret)
+		goto out;
+	fp->flex_handle = rte_flow_flex_item_create(port_id,
+						    &fp->flex_conf,
+						    &flow_error);
+	if (fp->flex_handle) {
+		flex_items[port_id][flex_id] = fp;
+		printf("port-%u: created flex item #%u\n", port_id, flex_id);
+		fp = NULL;
+	} else {
+		printf("port-%u: flex item #%u creation failed: %s\n",
+		       port_id, flex_id,
+		       flow_error.message ? flow_error.message : "");
+	}
+out:
+	if (fp)
+		free(fp);
+}
+
+#else /* RTE_HAS_JANSSON */
+void flex_item_create(__rte_unused portid_t port_id,
+		      __rte_unused uint16_t flex_id,
+		      __rte_unused const char *filename)
+{
+	printf("no JSON library\n");
+}
+
+void flex_item_destroy(__rte_unused portid_t port_id,
+		       __rte_unused uint16_t flex_id)
+{
+	printf("no JSON library\n");
+}
+#endif /* RTE_HAS_JANSSON */
+
+void
+port_flex_item_flush(portid_t port_id)
+{
+	uint16_t i;
+
+	for (i = 0; i < FLEX_MAX_PARSERS_NUM; i++) {
+		flex_item_destroy(port_id, i);
+		flex_items[port_id][i] = NULL;
+	}
+}
+
+struct flex_pattern_set {
+	cmdline_fixed_string_t set, flex_pattern;
+	cmdline_fixed_string_t is_spec, mask;
+	cmdline_fixed_string_t spec_data, mask_data;
+	uint16_t id;
+};
+
+static cmdline_parse_token_string_t flex_pattern_set_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, set, "set");
+static cmdline_parse_token_string_t flex_pattern_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+flex_pattern, "flex_pattern");
+static cmdline_parse_token_string_t flex_pattern_is_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+is_spec, "is");
+static cmdline_parse_token_string_t flex_pattern_spec_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set,
+is_spec, "spec");
+static cmdline_parse_token_string_t flex_pattern_mask_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask, "mask");
+static cmdline_parse_token_string_t flex_pattern_spec_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, spec_data, NULL);
+static cmdline_parse_token_string_t flex_pattern_mask_data_token =
+	TOKEN_STRING_INITIALIZER(struct flex_pattern_set, mask_data, NULL);
+static cmdline_parse_token_num_t flex_pattern_id_token =
+	TOKEN_NUM_INITIALIZER(struct flex_pattern_set, id, RTE_UINT16);
+
+/*
+ * flex pattern data - spec or mask is a string representation of byte array
+ * in hexadecimal format. Each byte in data string must have 2 characters:
+ * 0x15 - "15"
+ * 0x1  - "01"
+ * Bytes in data array are in network order.
+ */
+static uint32_t
+flex_pattern_data(const char *str, uint8_t *data)
+{
+	uint32_t i, len = strlen(str);
+	char b[3], *endptr;
+
+	if (len & 01)
+		return 0;
+	len /= 2;
+	if (len >= FLEX_MAX_FLOW_PATTERN_LENGTH)
+		return 0;
+	for (i = 0, b[2] = '\0'; i < len; i++) {
+		b[0] = str[2 * i];
+		b[1] = str[2 * i + 1];
+		data[i] = strtoul(b, &endptr, 16);
+		if (endptr != &b[2])
+			return 0;
+	}
+	return len;
+}
+
+static void
+flex_pattern_parsed_fn(void *parsed_result,
+		       __rte_unused struct cmdline *cl,
+		       __rte_unused void *data)
+{
+	struct flex_pattern_set *res = parsed_result;
+	struct flex_pattern *fp;
+	bool full_spec;
+
+	if (res->id >= FLEX_MAX_PATTERNS_NUM) {
+		printf("Bad flex pattern id\n");
+		return;
+	}
+	fp = flex_patterns + res->id;
+	memset(fp->spec_pattern, 0, sizeof(fp->spec_pattern));
+	memset(fp->mask_pattern, 0, sizeof(fp->mask_pattern));
+	fp->spec.length = flex_pattern_data(res->spec_data, fp->spec_pattern);
+	if (!fp->spec.length) {
+		printf("Bad flex pattern spec\n");
+		return;
+	}
+	full_spec = strncmp(res->is_spec, "spec", strlen("spec")) == 0;
+	if (full_spec) {
+		fp->mask.length = flex_pattern_data(res->mask_data,
+						    fp->mask_pattern);
+		if (!fp->mask.length) {
+			printf("Bad flex pattern mask\n");
+			return;
+		}
+	} else {
+		memset(fp->mask_pattern, 0xFF, fp->spec.length);
+		fp->mask.length = fp->spec.length;
+	}
+	if (fp->mask.length != fp->spec.length) {
+		printf("Spec length do not match mask length\n");
+		return;
+	}
+	fp->spec.pattern = fp->spec_pattern;
+	fp->mask.pattern = fp->mask_pattern;
+	printf("created pattern #%u\n", res->id);
+}
+
+cmdline_parse_inst_t cmd_set_flex_is_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> is <spec_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_is_token,
+		(void *)&flex_pattern_spec_data_token,
+		NULL,
+	}
+};
+
+cmdline_parse_inst_t cmd_set_flex_spec_pattern = {
+	.f = flex_pattern_parsed_fn,
+	.data = NULL,
+	.help_str = "set flex_pattern <id> spec <spec_data> mask <mask_data>",
+	.tokens = {
+		(void *)&flex_pattern_set_token,
+		(void *)&flex_pattern_token,
+		(void *)&flex_pattern_id_token,
+		(void *)&flex_pattern_spec_token,
+		(void *)&flex_pattern_spec_data_token,
+		(void *)&flex_pattern_mask_token,
+		(void *)&flex_pattern_mask_data_token,
+		NULL,
+	}
+};
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 88354ccab9..3221f6e1aa 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17861,6 +17861,8 @@ cmdline_parse_ctx_t main_ctx[] = {
 	(cmdline_parse_inst_t *)&cmd_show_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_set_fec_mode,
 	(cmdline_parse_inst_t *)&cmd_show_capability,
+	(cmdline_parse_inst_t *)&cmd_set_flex_is_pattern,
+	(cmdline_parse_inst_t *)&cmd_set_flex_spec_pattern,
 	NULL,
 };
 
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index cd640b9b7a..5437975837 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -54,6 +54,8 @@ enum index {
 	COMMON_PRIORITY_LEVEL,
 	COMMON_INDIRECT_ACTION_ID,
 	COMMON_POLICY_ID,
+	COMMON_FLEX_HANDLE,
+	COMMON_FLEX_TOKEN,
 
 	/* TOP-level command. */
 	ADD,
@@ -81,6 +83,12 @@ enum index {
 	AGED,
 	ISOLATE,
 	TUNNEL,
+	FLEX,
+
+	/* Flex arguments */
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
 
 	/* Tunnel arguments. */
 	TUNNEL_CREATE,
@@ -310,6 +318,9 @@ enum index {
 	ITEM_PORT_REPRESENTOR_PORT_ID,
 	ITEM_REPRESENTED_PORT,
 	ITEM_REPRESENTED_PORT_ETHDEV_PORT_ID,
+	ITEM_FLEX,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_FLEX_PATTERN_HANDLE,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -860,6 +871,11 @@ struct buffer {
 		struct {
 			uint32_t policy_id;
 		} policy;/**< Policy arguments. */
+		struct {
+			uint16_t token;
+			uintptr_t uintptr;
+			char filename[128];
+		} flex; /**< Flex arguments*/
 	} args; /**< Command arguments. */
 };
 
@@ -887,6 +903,13 @@ struct parse_action_priv {
 		.size = s, \
 	})
 
+static const enum index next_flex_item[] = {
+	FLEX_ITEM_INIT,
+	FLEX_ITEM_CREATE,
+	FLEX_ITEM_DESTROY,
+	ZERO,
+};
+
 static const enum index next_ia_create_attr[] = {
 	INDIRECT_ACTION_CREATE_ID,
 	INDIRECT_ACTION_INGRESS,
@@ -1018,6 +1041,7 @@ static const enum index next_item[] = {
 	ITEM_CONNTRACK,
 	ITEM_PORT_REPRESENTOR,
 	ITEM_REPRESENTED_PORT,
+	ITEM_FLEX,
 	END_SET,
 	ZERO,
 };
@@ -1398,6 +1422,13 @@ static const enum index item_represented_port[] = {
 	ZERO,
 };
 
+static const enum index item_flex[] = {
+	ITEM_FLEX_PATTERN_HANDLE,
+	ITEM_FLEX_ITEM_HANDLE,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -1768,6 +1799,9 @@ static int parse_set_sample_action(struct context *, const struct token *,
 static int parse_set_init(struct context *, const struct token *,
 			  const char *, unsigned int,
 			  void *, unsigned int);
+static int
+parse_flex_handle(struct context *, const struct token *,
+		  const char *, unsigned int, void *, unsigned int);
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -1884,6 +1918,8 @@ static int parse_isolate(struct context *, const struct token *,
 static int parse_tunnel(struct context *, const struct token *,
 			const char *, unsigned int,
 			void *, unsigned int);
+static int parse_flex(struct context *, const struct token *,
+		      const char *, unsigned int, void *, unsigned int);
 static int parse_int(struct context *, const struct token *,
 		     const char *, unsigned int,
 		     void *, unsigned int);
@@ -2084,6 +2120,20 @@ static const struct token token_list[] = {
 		.call = parse_int,
 		.comp = comp_none,
 	},
+	[COMMON_FLEX_TOKEN] = {
+		.name = "{flex token}",
+		.type = "flex token",
+		.help = "flex token",
+		.call = parse_int,
+		.comp = comp_none,
+	},
+	[COMMON_FLEX_HANDLE] = {
+		.name = "{flex handle}",
+		.type = "FLEX HANDLE",
+		.help = "fill flex item data",
+		.call = parse_flex_handle,
+		.comp = comp_none,
+	},
 	/* Top-level command. */
 	[FLOW] = {
 		.name = "flow",
@@ -2100,7 +2150,8 @@ static const struct token token_list[] = {
 			      AGED,
 			      QUERY,
 			      ISOLATE,
-			      TUNNEL)),
+			      TUNNEL,
+			      FLEX)),
 		.call = parse_init,
 	},
 	/* Top-level command. */
@@ -2212,6 +2263,41 @@ static const struct token token_list[] = {
 			     ARGS_ENTRY(struct buffer, port)),
 		.call = parse_isolate,
 	},
+	[FLEX] = {
+		.name = "flex_item",
+		.help = "flex item API",
+		.next = NEXT(next_flex_item),
+		.call = parse_flex,
+	},
+	[FLEX_ITEM_INIT] = {
+		.name = "init",
+		.help = "flex item init",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_CREATE] = {
+		.name = "create",
+		.help = "flex item create",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.filename),
+			     ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FILE_PATH),
+			     NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
+	[FLEX_ITEM_DESTROY] = {
+		.name = "destroy",
+		.help = "flex item destroy",
+		.args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
+			     ARGS_ENTRY(struct buffer, port)),
+		.next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
+			     NEXT_ENTRY(COMMON_PORT_ID)),
+		.call = parse_flex
+	},
 	[TUNNEL] = {
 		.name = "tunnel",
 		.help = "new tunnel API",
@@ -3682,6 +3768,27 @@ static const struct token token_list[] = {
 			     item_param),
 		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_ethdev, port_id)),
 	},
+	[ITEM_FLEX] = {
+		.name = "flex",
+		.help = "match flex header",
+		.priv = PRIV_ITEM(FLEX, sizeof(struct rte_flow_item_flex)),
+		.next = NEXT(item_flex),
+		.call = parse_vc,
+	},
+	[ITEM_FLEX_ITEM_HANDLE] = {
+		.name = "item",
+		.help = "flex item handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, handle)),
+	},
+	[ITEM_FLEX_PATTERN_HANDLE] = {
+		.name = "pattern",
+		.help = "flex pattern handle",
+		.next = NEXT(item_flex, NEXT_ENTRY(COMMON_FLEX_HANDLE),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_flex, pattern)),
+	},
 	/* Validate/create actions. */
 	[ACTIONS] = {
 		.name = "actions",
@@ -7113,6 +7220,43 @@ parse_isolate(struct context *ctx, const struct token *token,
 	return len;
 }
 
+static int
+parse_flex(struct context *ctx, const struct token *token,
+	     const char *str, unsigned int len,
+	     void *buf, unsigned int size)
+{
+	struct buffer *out = buf;
+
+	/* Token name must match. */
+	if (parse_default(ctx, token, str, len, NULL, 0) < 0)
+		return -1;
+	/* Nothing else to do if there is no buffer. */
+	if (!out)
+		return len;
+	if (out->command == ZERO) {
+		if (ctx->curr != FLEX)
+			return -1;
+		if (sizeof(*out) > size)
+			return -1;
+		out->command = ctx->curr;
+		ctx->objdata = 0;
+		ctx->object = out;
+		ctx->objmask = NULL;
+	} else {
+		switch (ctx->curr) {
+		default:
+			break;
+		case FLEX_ITEM_INIT:
+		case FLEX_ITEM_CREATE:
+		case FLEX_ITEM_DESTROY:
+			out->command = ctx->curr;
+			break;
+		}
+	}
+
+	return len;
+}
+
 static int
 parse_tunnel(struct context *ctx, const struct token *token,
 	     const char *str, unsigned int len,
@@ -7778,6 +7922,71 @@ parse_set_init(struct context *ctx, const struct token *token,
 	return len;
 }
 
+/*
+ * Replace testpmd handles in a flex flow item with real values.
+ */
+static int
+parse_flex_handle(struct context *ctx, const struct token *token,
+		  const char *str, unsigned int len,
+		  void *buf, unsigned int size)
+{
+	struct rte_flow_item_flex *spec, *mask;
+	const struct rte_flow_item_flex *src_spec, *src_mask;
+	const struct arg *arg = pop_args(ctx);
+	uint32_t offset;
+	uint16_t handle;
+	int ret;
+
+	if (!arg) {
+		printf("Bad environment\n");
+		return -1;
+	}
+	offset = arg->offset;
+	push_args(ctx, arg);
+	ret = parse_int(ctx, token, str, len, buf, size);
+	if (ret <= 0 || !ctx->object)
+		return ret;
+	if (ctx->port >= RTE_MAX_ETHPORTS) {
+		printf("Bad port\n");
+		return -1;
+	}
+	if (offset == offsetof(struct rte_flow_item_flex, handle)) {
+		const struct flex_item *fp;
+		struct rte_flow_item_flex *item_flex = ctx->object;
+		handle = (uint16_t)(uintptr_t)item_flex->handle;
+		if (handle >= FLEX_MAX_PARSERS_NUM) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		fp = flex_items[ctx->port][handle];
+		if (!fp) {
+			printf("Bad flex item handle\n");
+			return -1;
+		}
+		item_flex->handle = fp->flex_handle;
+	} else if (offset == offsetof(struct rte_flow_item_flex, pattern)) {
+		handle = (uint16_t)(uintptr_t)
+			((struct rte_flow_item_flex *)ctx->object)->pattern;
+		if (handle >= FLEX_MAX_PATTERNS_NUM) {
+			printf("Bad pattern handle\n");
+			return -1;
+		}
+		src_spec = &flex_patterns[handle].spec;
+		src_mask = &flex_patterns[handle].mask;
+		spec = ctx->object;
+		mask = spec + 2; /* spec, last, mask */
+		/* fill flow rule spec and mask parameters */
+		spec->length = src_spec->length;
+		spec->pattern = src_spec->pattern;
+		mask->length = src_mask->length;
+		mask->pattern = src_mask->pattern;
+	} else {
+		printf("Bad arguments - unknown flex item offset\n");
+		return -1;
+	}
+	return ret;
+}
+
 /** No completion. */
 static int
 comp_none(struct context *ctx, const struct token *token,
@@ -8305,6 +8514,13 @@ cmd_flow_parsed(const struct buffer *in)
 		port_meter_policy_add(in->port, in->args.policy.policy_id,
 					in->args.vc.actions);
 		break;
+	case FLEX_ITEM_CREATE:
+		flex_item_create(in->port, in->args.flex.token,
+				 in->args.flex.filename);
+		break;
+	case FLEX_ITEM_DESTROY:
+		flex_item_destroy(in->port, in->args.flex.token);
+		break;
 	default:
 		break;
 	}
@@ -8760,6 +8976,11 @@ cmd_set_raw_parsed(const struct buffer *in)
 		case RTE_FLOW_ITEM_TYPE_PFCP:
 			size = sizeof(struct rte_flow_item_pfcp);
 			break;
+		case RTE_FLOW_ITEM_TYPE_FLEX:
+			size = item->spec ?
+				((const struct rte_flow_item_flex *)
+				item->spec)->length : 0;
+			break;
 		default:
 			fprintf(stderr, "Error - Not supported item\n");
 			goto error;
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdf..201bed013f 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -10,6 +10,7 @@ sources = files(
         'cmdline_flow.c',
         'cmdline_mtr.c',
         'cmdline_tm.c',
+	'cmd_flex_item.c',
         'config.c',
         'csumonly.c',
         'flowgen.c',
@@ -61,3 +62,8 @@ if dpdk_conf.has('RTE_LIB_BPF')
     sources += files('bpf_cmd.c')
     deps += 'bpf'
 endif
+jansson_dep = dependency('jansson', required: false, method: 'pkg-config')
+if jansson_dep.found()
+    dpdk_conf.set('RTE_HAS_JANSSON', 1)
+    ext_deps += jansson_dep
+endif
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index de7a8c2955..af0e79fe6d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3118,6 +3118,7 @@ close_port(portid_t pid)
 
 		if (is_proc_primary()) {
 			port_flow_flush(pi);
+			port_flex_item_flush(pi);
 			rte_eth_dev_close(pi);
 		}
 
@@ -4223,7 +4224,6 @@ main(int argc, char** argv)
 		rte_stats_bitrate_reg(bitrate_data);
 	}
 #endif
-
 #ifdef RTE_LIB_CMDLINE
 	if (strlen(cmdline_filename) != 0)
 		cmdline_read_from_file(cmdline_filename);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 81be754605..e3995d24ab 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -14,6 +14,9 @@
 #include <rte_os_shim.h>
 #include <cmdline.h>
 #include <sys/queue.h>
+#ifdef RTE_HAS_JANSSON
+#include <jansson.h>
+#endif
 
 #define RTE_PORT_ALL            (~(portid_t)0x0)
 
@@ -295,6 +298,27 @@ struct fwd_engine {
 	packet_fwd_t     packet_fwd;     /**< Mandatory. */
 };
 
+#define FLEX_ITEM_MAX_SAMPLES_NUM 16
+#define FLEX_ITEM_MAX_LINKS_NUM 16
+#define FLEX_MAX_FLOW_PATTERN_LENGTH 64
+#define FLEX_MAX_PARSERS_NUM 8
+#define FLEX_MAX_PATTERNS_NUM 64
+#define FLEX_PARSER_ERR ((struct flex_item *)-1)
+
+struct flex_item {
+	struct rte_flow_item_flex_conf flex_conf;
+	struct rte_flow_item_flex_handle *flex_handle;
+	uint32_t flex_id;
+};
+
+struct flex_pattern {
+	struct rte_flow_item_flex spec, mask;
+	uint8_t spec_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+	uint8_t mask_pattern[FLEX_MAX_FLOW_PATTERN_LENGTH];
+};
+extern struct flex_item *flex_items[RTE_MAX_ETHPORTS][FLEX_MAX_PARSERS_NUM];
+extern struct flex_pattern flex_patterns[FLEX_MAX_PATTERNS_NUM];
+
 #define BURST_TX_WAIT_US 1
 #define BURST_TX_RETRIES 64
 
@@ -319,6 +343,8 @@ extern struct fwd_engine * fwd_engines[]; /**< NULL terminated array. */
 extern cmdline_parse_inst_t cmd_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw;
 extern cmdline_parse_inst_t cmd_show_set_raw_all;
+extern cmdline_parse_inst_t cmd_set_flex_is_pattern;
+extern cmdline_parse_inst_t cmd_set_flex_spec_pattern;
 
 extern uint16_t mempool_flags;
 
@@ -1046,6 +1072,10 @@ uint16_t tx_pkt_set_dynf(uint16_t port_id, __rte_unused uint16_t queue,
 void add_tx_dynf_callback(portid_t portid);
 void remove_tx_dynf_callback(portid_t portid);
 int update_mtu_from_frame_size(portid_t portid, uint32_t max_rx_pktlen);
+int update_jumbo_frame_offload(portid_t portid);
+void flex_item_create(portid_t port_id, uint16_t flex_id, const char *filename);
+void flex_item_destroy(portid_t port_id, uint16_t flex_id);
+void port_flex_item_flush(portid_t port_id);
 
 extern int flow_parse(const char *src, void *result, unsigned int size,
 		      struct rte_flow_attr **attr,
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 22ba8f0516..6d127d9a7b 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -5091,3 +5091,122 @@ For example to unload BPF filter from TX queue 0, port 0:
 .. code-block:: console
 
    testpmd> bpf-unload tx 0 0
+
+Flex Item Functions
+-------------------
+
+The following sections show functions that configure and create a flex item object,
+create a flex pattern and use it in a flow rule.
+The examples use a 20-byte IPv4 header:
+
+::
+
+   0                   1                   2                   3
+   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  ver  |  IHL  |     TOS       |        length                 | +0
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       identification          | flg |    frag. offset         | +4
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       TTL     |  protocol     |        checksum               | +8
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               source IP address                               | +12
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |              destination IP address                           | +16
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Create flex item
+~~~~~~~~~~~~~~~~
+
+A flex item object is created by the PMD according to a new header configuration. The
+header configuration is compiled by testpmd and stored in a variable of
+``rte_flow_item_flex_conf`` type.
+
+::
+
+   # flow flex_item create <port> <flex id> <configuration file>
+   testpmd> flow flex_item create 0 3 ipv4_flex_config.json
+   port-0: created flex item #3
+
+Flex item configuration is kept in an external JSON file.
+It describes the following header elements:
+
+**New header length.**
+
+Specify whether the new header has fixed or variable length and the basic/minimal
+header length value.
+
+If the header length is not fixed, the location of the header field whose value completes
+the length calculation, and the scale/offset function, must be added.
+
+Scale function depends on port hardware.
+
+**Next protocol.**
+
+Describes the location in the new header that specifies the following network header type.
+
+**Flow match samples.**
+
+Describes locations in the new header that will be used in flow rules.
+
+The number of flow samples and the maximal sample length depend on port hardware.
+
+**Input trigger.**
+
+Describes preceding network header configuration.
+
+**Output trigger.**
+
+Describes conditions that trigger transfer to the following network header.
+
+.. code-block:: json
+
+   {
+      "next_header": { "field_mode": "FIELD_MODE_FIXED", "field_size": 20},
+      "next_protocol": {"field_size": 8, "field_base": 72},
+      "sample_data": [
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 0},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 32},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 64},
+         { "field_mode": "FIELD_MODE_FIXED", "field_size": 32, "field_base": 96}
+      ],
+      "input_link": [
+         {"item": "eth type is 0x0800"},
+         {"item": "vlan inner_type is 0x0800"}
+      ],
+      "output_link": [
+         {"item": "udp", "next": 17},
+         {"item": "tcp", "next": 6},
+         {"item": "icmp", "next": 1}
+      ]
+   }
+
+
+Flex pattern and flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A flex pattern describes the parts of a network header that trigger a flex flow item hit in a flow rule.
+A flex pattern is directly related to the flex item samples configuration.
+Flex pattern can be shared between ports.
+
+**Flex pattern and flow rule to match IPv4 version and 20 bytes length**
+
+::
+
+   # set flex_pattern <pattern_id> is <hex bytes sequence>
+   testpmd> flow flex_item pattern 5 is 45FF
+   created pattern #5
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 5 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
+
+**Flex pattern and flow rule to match packets with source address 1.2.3.4**
+
+::
+
+   testpmd> flow flex_item pattern 2 spec 45000000000000000000000001020304 mask FF0000000000000000000000FFFFFFFF
+   created pattern #2
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / flex item is 3 pattern is 2 / end actions mark id 1 / queue index 0 / end
+   Flow rule #0 created
-- 
2.18.1



* [dpdk-dev] [PATCH v8 0/4] ethdev: introduce configurable flexible item
  2021-09-22 18:04 [dpdk-dev] [PATCH 0/3] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
                   ` (9 preceding siblings ...)
  2021-10-20 15:06 ` [dpdk-dev] [PATCH v7 0/4] ethdev: introduce configurable flexible item Viacheslav Ovsiienko
@ 2021-10-20 15:14 ` Viacheslav Ovsiienko
  2021-10-20 15:14   ` [dpdk-dev] [PATCH v8 1/4] ethdev: support flow elements with variable length Viacheslav Ovsiienko
                     ` (4 more replies)
  10 siblings, 5 replies; 73+ messages in thread
From: Viacheslav Ovsiienko @ 2021-10-20 15:14 UTC (permalink / raw)
  To: dev; +Cc: rasland, matan, shahafs, orika, getelson, thomas

1. Introduction and Retrospective

Networks are evolving fast, network structures are getting
more complicated, and new application areas keep emerging.
To address these challenges, new network protocols are
continuously being developed, considered by technical
communities, adopted by industry and, eventually, implemented
in hardware and software. The DPDK framework follows these
trends: a glance at the RTE Flow API header shows that
multiple new items have been introduced since the initial
release.

Adopting and implementing a new protocol is not
straightforward and takes time: the protocol passes through
development, consideration, adoption, and implementation
phases. The industry tries to prepare for forthcoming network
protocols; for example, many hardware vendors are implementing
flexible and configurable network protocol parsers. As DPDK
developers, could we anticipate the near future in the same
fashion and introduce similar flexibility in the RTE Flow API?

Looking at what is already merged in the project, we find
the raw item (rte_flow_item_raw). At first glance it looks
suitable, and we can try to implement flow matching on the
header of some relatively new tunnel protocol, say the GENEVE
header with variable length options. On further consideration,
however, we run into the raw item limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, supporting tunnel protocols like the
aforementioned GENEVE, with its variable length protocol
options, by means of the raw item becomes very complicated and
would require multiple flows and multiple raw items chained in
the same flow (by the way, no support for chained raw items
was found in the implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there is a requirement to support a new
network protocol with RTE Flows. The protocol specification
provides:

  - header format
  - header length (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with the
    configuration derived from the protocol specification. The
    call creates the flex item object on the specified ethernet
    device and prepares the PMD and underlying hardware to
    handle the flex item. The PMD backing the specified
    ethernet device returns an opaque handle identifying
    the created object

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with the fields of the protocol the item was
configured for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is an opaque object maintained on a per-device basis
by the underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
in the same number. If byte boundary alignment is needed, an
application can use a dummy type field as a gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy: it seems much more suitable for
the application to maintain a single structure within a
single pattern buffer.
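
A minimal sketch of how an application might build such a
pattern buffer is given below. It is not part of the proposed
API: the pattern_buf type, its helpers and the 64-byte limit
are hypothetical, and the most-significant-bit-first packing
order is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATTERN_MAX_BYTES 64 /* hypothetical limit for this sketch */

/* Hypothetical helper type: a pattern buffer filled bit by bit. */
struct pattern_buf {
	uint8_t data[PATTERN_MAX_BYTES];
	uint32_t bits; /* number of bits written so far */
};

static void
pattern_init(struct pattern_buf *p)
{
	memset(p, 0, sizeof(*p));
}

/*
 * Append the 'width' least significant bits of 'value',
 * most significant bit first (assumed ordering).
 */
static void
pattern_append(struct pattern_buf *p, uint32_t value, uint32_t width)
{
	assert(width <= 32);
	assert(p->bits + width <= PATTERN_MAX_BYTES * 8);
	while (width-- > 0) {
		uint32_t bit = (value >> width) & 1u;

		p->data[p->bits / 8] |= (uint8_t)(bit << (7 - p->bits % 8));
		p->bits++;
	}
}

/* Pattern length in bytes, rounded up, as the length field expects. */
static uint32_t
pattern_length(const struct pattern_buf *p)
{
	return (p->bits + 7) / 8;
}
```

For instance, appending a 96-bit dummy field (three calls with
a zero value and width 32) followed by one 32-bit field yields
a 16-byte buffer with the field value in the last four bytes.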

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields represent the data to be sampled
from the packet and then matched with established flows.

There are several methods to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).
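
The offset modes above can be sketched in plain C as follows.
This is only an illustration: the enum and structure are
hypothetical mirrors of the configuration parameters, the
indirect value is assumed to be already fetched from the
packet, and the shift is assumed to apply to the masked value
before field_base is added (the formulas above do not
parenthesize this explicitly).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the offset modes listed above. */
enum field_mode {
	FIELD_MODE_DUMMY,
	FIELD_MODE_FIXED,
	FIELD_MODE_OFFSET,
	FIELD_MODE_BITMASK,
};

/* Hypothetical subset of the field configuration parameters. */
struct field_desc {
	enum field_mode mode;
	uint32_t field_base;   /* constant part of the offset, in bits */
	uint32_t offset_mask;  /* applied to the indirect field value */
	uint32_t offset_shift; /* scale applied to the masked value */
};

/* Count one bits, portable version without compiler builtins. */
static uint32_t
bitcount32(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/*
 * Return the field bit offset from the header beginning.
 * 'indirect' is the value fetched from the packet at offset_base,
 * i.e. "*offset_base" in the formulas above.
 */
static uint32_t
field_bit_offset(const struct field_desc *f, uint32_t indirect)
{
	uint32_t v = indirect & f->offset_mask;

	assert(f->offset_shift < 32);
	switch (f->mode) {
	case FIELD_MODE_OFFSET:
		return f->field_base + (v << f->offset_shift);
	case FIELD_MODE_BITMASK:
		return f->field_base + (bitcount32(v) << f->offset_shift);
	case FIELD_MODE_FIXED:
	case FIELD_MODE_DUMMY:
	default:
		/* Offset is permanent, the packet content is not used. */
		return f->field_base;
	}
}
```

With the IPv4 case mentioned above, the first header byte 0x45
masked with 0x0F gives 5 dwords; a shift of 5 converts dwords
to bits and results in an offset of 160 bits (20 bytes).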

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and under what conditions the flex item can
follow. An input link specifies the preceding header pattern;
for example, for GENEVE it can be a UDP item specifying a match
on destination port with value 6081. The flex item can follow
multiple header types, in which case multiple input links
should be specified. At flow creation time an item with one of
the input link types should precede the flex item, and the
driver will select the correct flex item settings depending on
the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.
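
Conceptually, the output link selection is a lookup of the
value sampled from the next protocol field. The sketch below
only illustrates this idea; the out_link structure and the
next_item() helper are hypothetical and do not belong to the
proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output link: next protocol value -> following item. */
struct out_link {
	uint32_t next;    /* value sampled from the next protocol field */
	const char *item; /* name of the item to continue parsing with */
};

/*
 * Return the name of the item that follows the flex item header,
 * or NULL if no output link matches the sampled protocol value.
 */
static const char *
next_item(const struct out_link *links, size_t n, uint32_t proto)
{
	size_t i;

	assert(links != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (links[i].next == proto)
			return links[i].item;
	return NULL;
}
```

For an IPv4-like header the table could contain the entries
{17, "udp"}, {6, "tcp"} and {1, "icmp"}; a packet carrying any
other protocol value matches none of the output links.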

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If multiple protocols are supposed to be supported with flex
items in a chained fashion (two or more flex items within the
same flow, possibly neighbors in the pattern), the flex items
reference each other. In this case, the item that occurs first
should be created with an empty output link list or with a
list including existing items only, and then the second flex
item should be created referencing the first flex item as an
input arc; drivers should adjust the item configuration
accordingly.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage non-overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we need to handle a new tunnel protocol that
follows the UDP header with destination port 0xFADE and is
followed by a MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_it