DPDK patches and discussions
* [dpdk-dev]  [RFC PATCH v1] regexdev: introduce regexdev subsystem
@ 2019-06-27 15:50 jerinj
  2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
                   ` (6 more replies)
  0 siblings, 7 replies; 62+ messages in thread
From: jerinj @ 2019-06-27 15:50 UTC
  To: dev; +Cc: techboard, Jerin Jacob, Pavan Nikhilesh

From: Jerin Jacob <jerinj@marvell.com>

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them
in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

The Doxygen generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
Hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds 

Requesting review from HW and SW RegEx vendors and RegEx application users
so that DPDK can have a portable API for RegEx.

The API semantics are based on the existing cryptodev, eventdev and ethdev device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and its provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to setup
  a RegEx device (configure it, setup its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx Poll Mode Driver (PMD) to register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts them
into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled into a device-specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: A set of rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier is
provided at the time of rule creation for the application to identify the
rule upon match.

Scan: A pattern matching request through *enqueue* API.

A given RegEx device may not support all the features of PCRE.
The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
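
As an illustration only (not part of this patch), the probe described above
could look like the sketch below. The three flag values are copied from the
header in this RFC; struct sketch_dev_info and sketch_rules_supported() are
hypothetical stand-ins, and the 64-bit width of pcre_unsup_flags is an
assumption based on the flags being defined with 1ULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the rte_regexdev.h proposed in this RFC. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical stand-in for struct rte_regex_dev_info: only the field named
 * in the text is modelled, and its uint64_t type is an assumption.
 */
struct sketch_dev_info {
	uint64_t pcre_unsup_flags;
};

/*
 * Return true when none of the PCRE features listed in 'needed' (a mask of
 * RTE_REGEX_DEV_PCRE_UNSUP_*_F bits) is reported as unsupported.
 */
static bool
sketch_rules_supported(const struct sketch_dev_info *info, uint64_t needed)
{
	assert(info != NULL);
	return (info->pcre_unsup_flags & needed) == 0;
}
```

An application would build the 'needed' mask from the PCRE constructs its
rules actually use, and fall back to a software matcher or reject the rule
set when the check fails.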

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which assume they are not invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function can
be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the upper level application to enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following order:
    - rte_regex_dev_configure()
    - rte_regex_queue_pair_setup()
    - rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the reconfiguration
before calling rte_regex_dev_start() again. The enqueue and dequeue
functions should not be invoked when the device is stopped.

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.
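
As an illustration only (not part of this patch), the ordering rules above
can be modelled as a small state machine. Everything named sketch_* is
hypothetical: the functions only stand in for rte_regex_dev_configure(),
rte_regex_queue_pair_setup(), rte_regex_dev_start(), rte_regex_dev_stop()
and rte_regex_dev_close(), and the -EBUSY/-EINVAL return values are
assumptions, not behaviour defined by this RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device states; the RFC text only describes the ordering. */
enum sketch_state {
	SKETCH_UNCONFIGURED,
	SKETCH_CONFIGURED,
	SKETCH_QP_READY,
	SKETCH_STARTED,
};

struct sketch_dev {
	enum sketch_state state;
};

/* Stand-in for configure: refused while the device is running. */
static int
sketch_configure(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_STARTED)
		return -EBUSY;
	dev->state = SKETCH_CONFIGURED;
	return 0;
}

/* Stand-in for queue pair setup: needs a configured, stopped device. */
static int
sketch_queue_pair_setup(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_STARTED)
		return -EBUSY;
	if (dev->state == SKETCH_UNCONFIGURED)
		return -EINVAL;
	dev->state = SKETCH_QP_READY;
	return 0;
}

/* Stand-in for start: only valid once the queue pairs are set up. */
static int
sketch_start(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->state != SKETCH_QP_READY)
		return -EINVAL;
	dev->state = SKETCH_STARTED;
	return 0;
}

/* Stand-in for stop: a stopped device keeps its queue pair setup. */
static void
sketch_stop(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->state == SKETCH_STARTED)
		dev->state = SKETCH_QP_READY;
}

/* Stand-in for close: the device returns to its initial state. */
static void
sketch_close(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->state = SKETCH_UNCONFIGURED;
}

/* Enqueue and dequeue are only legal while the device is started. */
static int
sketch_can_enqueue(const struct sketch_dev *dev)
{
	assert(dev != NULL);
	return dev->state == SKETCH_STARTED;
}
```

In this model a reconfiguration is the sequence stop, configure, queue pair
setup, start; calling configure on a running device is rejected.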

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
structure by the device init function of the RegEx driver, which is
invoked during the PCI/SoC device probing phase, as explained earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
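
As an illustration only (not part of this patch), the two dispatch paths
described above can be sketched as follows. The sketch_* and toy_* names and
the structure layouts are hypothetical; they only mirror the roles the text
gives to *rte_regex_dev* and *regex_dev_ops*, not the actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_dev;

typedef uint16_t (*sketch_burst_t)(struct sketch_dev *dev, uint16_t qp_id,
				   uint16_t nb_ops);
typedef int (*sketch_ctrl_t)(struct sketch_dev *dev);

/* Control-path callbacks, reached through one extra pointer dereference. */
struct sketch_dev_ops {
	sketch_ctrl_t dev_start;
	sketch_ctrl_t dev_stop;
};

struct sketch_dev {
	/* Fast-path pointers sit first, directly in the device structure. */
	sketch_burst_t enqueue;
	sketch_burst_t dequeue;
	/* Control-path callbacks live behind this pointer. */
	const struct sketch_dev_ops *dev_ops;
	int started;
};

#define SKETCH_MAX_DEVS 4
static struct sketch_dev sketch_devs[SKETCH_MAX_DEVS];

/* Control path: dev_id -> device structure -> ops table -> driver function. */
static int
sketch_dev_start(uint8_t dev_id)
{
	struct sketch_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS)
		return -1;
	dev = &sketch_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -1;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: dev_id -> device structure -> driver function, no ops table. */
static inline uint16_t
sketch_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct sketch_dev *dev = &sketch_devs[dev_id];

	return dev->enqueue(dev, qp_id, nb_ops);
}

/* Toy driver: accepts every op while started, nothing otherwise. */
static int toy_start(struct sketch_dev *dev) { dev->started = 1; return 0; }
static int toy_stop(struct sketch_dev *dev) { dev->started = 0; return 0; }

static uint16_t
toy_burst(struct sketch_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0;
}

static const struct sketch_dev_ops toy_ops = {
	.dev_start = toy_start,
	.dev_stop = toy_stop,
};

/* What a driver init function would do at probe time in this model. */
static void
toy_init(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->enqueue = toy_burst;
	dev->dequeue = toy_burst;
	dev->dev_ops = &toy_ops;
	dev->started = 0;
}
```

The fast-path wrapper deliberately skips parameter validation and the ops
table lookup, which is the trade-off the paragraph above describes.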

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching requests
to the RegEx device and the *dequeue* operation gets a burst of pattern
matching responses for the ones submitted through the *enqueue* operation.
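
As an illustration only (not part of this patch), the burst model can be
sketched with a toy queue pair. It assumes the convention used by other DPDK
burst APIs, where each call returns the number of operations actually
accepted or returned, which may be fewer than requested. The sketch_* types
are hypothetical and do not reflect the real *rte_regex_ops* layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_QP_DEPTH 4

/* Hypothetical request/response object. */
struct sketch_op {
	uint32_t req_id; /* set by the application */
	int done;        /* set by the toy "device" on completion */
};

/* Toy queue pair: a fixed-size ring of op pointers. */
struct sketch_qp {
	struct sketch_op *slots[SKETCH_QP_DEPTH];
	uint16_t head;  /* index of the next op to dequeue */
	uint16_t count; /* number of ops currently queued */
};

/* Accept up to nb_ops requests; return how many fit in the ring. */
static uint16_t
sketch_enqueue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count < SKETCH_QP_DEPTH) {
		uint16_t tail = (uint16_t)((qp->head + qp->count) %
					   SKETCH_QP_DEPTH);

		qp->slots[tail] = ops[n++];
		qp->count++;
	}
	return n;
}

/* Return up to nb_ops completed responses, oldest first. */
static uint16_t
sketch_dequeue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count > 0) {
		struct sketch_op *op = qp->slots[qp->head];

		op->done = 1; /* the toy device completes ops instantly */
		ops[n++] = op;
		qp->head = (uint16_t)((qp->head + 1) % SKETCH_QP_DEPTH);
		qp->count--;
	}
	return n;
}
```

An application polling such a queue pair would retry the ops that were not
accepted by the enqueue call after draining responses with dequeue.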

A typical application using the RegEx device API follows this
programming flow:

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
  database is not provided in rte_regex_dev_config::rule_db for
  rte_regex_dev_configure() and/or the application needs to update it.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---

 config/common_base                 |    5 +
 doc/api/doxy-api-index.md          |    1 +
 doc/api/doxy-api.conf.in           |    1 +
 lib/Makefile                       |    2 +
 lib/librte_regexdev/Makefile       |   23 +
 lib/librte_regexdev/rte_regexdev.c |    5 +
 lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
 7 files changed, 1284 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h

diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
 #
 CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
 
+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
 #
 # Compile librte_ring
 #
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rawdev \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled into a device-specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: A set of rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier is
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features of PCRE.
+ * The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which assume they are not invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * A typical application using the RegEx device API follows this
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update it.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * loading only the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc',
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking back to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches using '(?C)'.
+ * Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD', the RegEx engine
+ * will match 'ABC', perform a user-defined callout and return a successful
+ * match at 'D'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example, in the RegEx 'AB\d*?', '*?' is the ungreedy form of '*'
+ * (zero or more). The greedy 'AB\d*' matches the subject 'AB12345' completely,
+ * whereas the ungreedy 'AB\d*?' returns only 'AB' as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
+ * the given subject is 'dwad1234!', the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The subject 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123',
+ * then even though the entire subject matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and it
+ * does not give up matches as the engine backtracks. With a possessive
+ * quantifier, the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name */
+	struct rte_device *dev;	/**< Device information */
+	uint8_t max_matches;
+	/**< Maximum matches per scan supported by this device */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint16_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device */
+	uint16_t max_groups;
+	/**< Maximum number of groups supported by this device */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint64_t pcre_unsup_flags;
+	/**< Unsupported PCRE features for this RegEx device.
+	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
+	 */
+};
+
+/**
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect matches that occur
+ * across buffer boundaries, where the buffers are related to each other in
+ * some way. Enable this flag to scan a payload larger than
+ * struct rte_regex_dev_info::max_payload_size and/or when matches can span
+ * scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint8_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* value previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint16_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Initial prebuilt rule database to import on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application may
+	 * use the rte_regex_rule_db_update() or rte_regex_rule_db_import()
+	 * API to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, see RTE_REGEX_DEV_CFG_* */
+};
+
+/**
+ * Configure a RegEx device.
+ *
+ * This function must be invoked before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capability of each
+ * resource available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, a scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   - 0: Success, RegEx queue pair correctly set up.
+ *   - <0: RegEx queue configuration failed
+ */
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ * @return
+ *   - 0: Success, device started.
+ *   - <0: Device start failed.
+ */
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * Stop a RegEx device.
+ *
+ * The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *  - 0 if the device was closed successfully.
+ *  - <0 on failure to close the device.
+ */
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to retrieve
+ * @param[out] attr_value A pointer that will be filled in with the attribute
+ *             value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * Set an attribute of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to set
+ * @param attr_value A pointer to the attribute value supplied
+ *                   by the application
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database */
+};
+
+/** Structure to hold a RegEx rule attributes */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
+	 * RTE_REGEX_RULE_OP_REMOVE.
+	 */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The PCRE rule flags supported by the device are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+	 * rule database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * Update the rule database of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed, the caller has to take
+ *   care of them, and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
+ */
+uint16_t
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint16_t nb_rules);
+
+/**
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id RegEx device identifier
+ * @param[out] rule_db
+ *   Block of memory into which the rule database is written. If set to NULL,
+ *   the function returns the required capacity; a non-NULL buffer must be at
+ *   least that large.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory into which ids and names are written. If set to NULL,
+ *   the function returns the required capacity; a non-NULL buffer must be at
+ *   least that large.
+ * @return
+ *   - positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be obtained from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param[out] values
+ *   The values for each stat requested by ID.
+ * @param n
+ *   The number of stats requested
+ * @return
+ *   - positive value: number of stat entries filled into the values array
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param name
+ *   The stat name to retrieve
+ * @param[out] id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
+ *   will be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @return
+ *   - 0: Selftest successful
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+int rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param f
+ *   A pointer to a file for output
+ *
+ * @return
+ *   - 0: on success
+ *   - <0: on failure.
+ */
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t offset;
+			/**< Starting Byte Position for matched rule. */
+			uint16_t len;
+			/**< Length of match in bytes */
+		};
+	};
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest Rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t scan_size;
+	/**< Scan size of the buffer to be scanned in bytes. */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint8_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint8_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**< Allow 8-byte reserved on 32-bit system */
+		void *buf_addr;
+		/**< Virtual address of the buffer to be scanned. */
+	};
+
+	/* W2 */
+	rte_iova_t buf_iova;
+	/**< IOVA address of the buffer to be scanned. */
+
+	/* W3 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W4 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W5 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the
+ *   remaining ops at the end of *ops* are not consumed and the caller has to
+ *   take care of them.
+ */
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that must
+ *   be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   entries at the end of *ops* are not filled in and must not be read by
+ *   the caller.
+ */
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
-- 
2.21.0
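
Review aid, not part of the patch: a minimal sketch of how an application
might check a requested configuration against the limits reported by
rte_regex_dev_info_get() and resolve the documented "0 means default" cases.
The structs are trimmed local mirrors of the RFC definitions so the snippet
builds without DPDK; sketch_resolve_config() and the sketch_* types are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Trimmed local mirrors of the RFC structs (only the fields used here). */
struct sketch_dev_info {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

struct sketch_dev_config {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Hypothetical helper: reject a configuration that exceeds a device limit,
 * then apply the defaults the RFC documents for the value 0.
 * Returns 0 on success or -EINVAL if a limit is exceeded.
 */
static int
sketch_resolve_config(const struct sketch_dev_info *info,
		      struct sketch_dev_config *cfg)
{
	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -EINVAL;

	/* nb_max_matches == 0: the RFC says the value 1 is used. */
	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1;
	/* nb_rules_per_group == 0: the device maximum is used. */
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;
	return 0;
}
```

In a real application the resolved values would be copied into
struct rte_regex_dev_config before calling rte_regex_dev_configure().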

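Review aid, not part of the patch: struct rte_regex_match stores rule_id in
20 bits and group_id in 12 bits, while struct rte_regex_rule declares them as
uint32_t and uint16_t. The sketch below mirrors the match tuple locally and
adds a hypothetical helper, sketch_ids_fit_match(), that an application could
use to check that a rule's identifiers are representable in a match. Bit-field
layout is implementation-defined, so the 8-byte size is an expectation for
common compilers, not a guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the RFC match tuple (C11 anonymous struct in a union). */
struct sketch_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t offset;
			uint16_t len;
		};
	};
};

/*
 * Hypothetical helper: true if the rule identifiers survive the round trip
 * through the narrower bit-fields of the match tuple.
 */
static bool
sketch_ids_fit_match(uint32_t rule_id, uint16_t group_id)
{
	return rule_id < (1u << 20) && group_id < (1u << 12);
}
```

If the narrower widths are intentional, documenting the resulting upper
bounds next to rte_regex_rule::rule_id and rte_regex_rule::group_id would
make the constraint visible to applications.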

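Review aid, not part of the patch: rte_regex_enqueue_burst() may accept fewer
ops than requested and leaves the unconsumed tail of *ops* to the caller. The
sketch below shows one way to handle that with a bounded retry loop. It does
not call DPDK: sketch_enqueue_fn has the same shape as the RFC prototype, and
sketch_enqueue_all() and sketch_mock_enqueue() are hypothetical names, the
latter standing in for a device that accepts at most two ops per call.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_ops; /* opaque stand-in for struct rte_regex_ops */

/* Same shape as rte_regex_enqueue_burst() in the RFC. */
typedef uint16_t (*sketch_enqueue_fn)(uint8_t dev_id, uint16_t qp_id,
				      struct sketch_ops **ops, uint16_t nb_ops);

/*
 * Hypothetical helper: keep offering the unconsumed tail of *ops* until
 * everything is accepted or *max_tries* calls have been made.
 * Returns the total number of ops enqueued.
 */
static uint16_t
sketch_enqueue_all(sketch_enqueue_fn enq, uint8_t dev_id, uint16_t qp_id,
		   struct sketch_ops **ops, uint16_t nb_ops,
		   unsigned int max_tries)
{
	uint16_t done = 0;

	while (done < nb_ops && max_tries-- > 0)
		done += enq(dev_id, qp_id, &ops[done],
			    (uint16_t)(nb_ops - done));
	return done;
}

/* Mock device for the sketch: accepts at most two ops per call. */
static uint16_t
sketch_mock_enqueue(uint8_t dev_id, uint16_t qp_id,
		    struct sketch_ops **ops, uint16_t nb_ops)
{
	(void)dev_id;
	(void)qp_id;
	(void)ops;
	return nb_ops < 2 ? nb_ops : 2;
}
```

The retry count is bounded because the RFC also allows a short return when an
op carries invalid parameters; an unbounded loop would spin forever on such
an op.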
^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
@ 2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
  2019-08-15  9:35 ` Thomas Monjalon
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-07-15  4:26 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, dev
  Cc: techboard, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, j.bromhead, contact, Bruce Richardson,
	Dovrat Zifroni

Ping.

Is anyone interested in collaborating on this RFC[2]?
Marvell would like to contribute one SW (PCRE based) PMD and one HW PMD for this API.

Shahaf from Mellanox proposed a presentation on the DPDK RegEx device for the upcoming userspace summit[1], so
it would be good to iron out the differences between the various HW based regex engines, understand the requirements
from the application perspective, and finalize the specification before the summit.

Let us know if you are interested in collaborating on the RegEx device API for DPDK.

[1]
https://events.linuxfoundation.org/events/dpdk-userspace-2019-bordeaux/program/schedule/

[2]
http://patches.dpdk.org/patch/55505/


> -----Original Message-----
> From: jerinj@marvell.com <jerinj@marvell.com>
> Sent: Thursday, June 27, 2019 9:21 PM
> To: dev@dpdk.org
> Cc: techboard@dpdk.org; Jerin Jacob Kollanukkaran <jerinj@marvell.com>;
> Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of standard API, It is diffcult for DPDK consumer to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds
> 
> Requesting review from HW and SW RegEx vendors and RegEx application
> users, so that DPDK gets a portable API for RegEx.
> 
> The API design is based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
>   a RegEx device (configure it, setup its queue pairs and start it),
>   update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
>   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
>   a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in a device-specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier is
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> A given RegEx device may not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags.
> 
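
A minimal sketch of how an application might act on the unsupported-feature
mask before loading a rule. It is self-contained and does not include
rte_regexdev.h: the two DEMO_* values are local copies of bit positions from
the RFC, while struct demo_dev_info and demo_rule_features_ok() are
hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of two flag values from the RFC header, so that the sketch
 * builds without rte_regexdev.h.
 */
#define DEMO_UNSUP_BACKREFERENCE_F   (1ULL << 4)
#define DEMO_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/* Hypothetical stand-in for struct rte_regex_dev_info. */
struct demo_dev_info {
	uint64_t pcre_unsup_flags; /* a set bit means "feature NOT supported" */
};

/* Return true when none of the PCRE features needed by a rule is flagged
 * as unsupported by the device.
 */
static bool
demo_rule_features_ok(const struct demo_dev_info *info, uint64_t needed)
{
	return (info->pcre_unsup_flags & needed) == 0;
}
```

An application could run such a check once per rule and fall back to a
software engine for the rules that the device cannot handle.
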
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function can
> be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*.
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
>     - rte_regex_dev_configure()
>     - rte_regex_queue_pair_setup()
>     - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
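
The ordering rules above can be sketched as a small state machine. This is
only a toy model under stated assumptions: struct demo_regex_dev, the
demo_*() functions and the errno values they return are hypothetical and are
not taken from the RFC, which does not define return codes for these cases.

```c
#include <errno.h>

enum demo_state { DEMO_UNCONFIGURED, DEMO_CONFIGURED, DEMO_STARTED };

/* Hypothetical per-device bookkeeping, not the RFC's rte_regex_dev. */
struct demo_regex_dev {
	enum demo_state state;
	int qp_ready; /* non-zero once a queue pair has been set up */
};

/* (Re)configuration is only legal while the device is not running. */
static int
demo_configure(struct demo_regex_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY;
	d->state = DEMO_CONFIGURED;
	d->qp_ready = 0; /* queue pairs must be set up again */
	return 0;
}

/* Queue pair setup must follow configure and precede start. */
static int
demo_queue_pair_setup(struct demo_regex_dev *d)
{
	if (d->state == DEMO_STARTED)
		return -EBUSY;
	if (d->state != DEMO_CONFIGURED)
		return -EINVAL;
	d->qp_ready = 1;
	return 0;
}

/* Start requires a configured device with at least one queue pair. */
static int
demo_start(struct demo_regex_dev *d)
{
	if (d->state != DEMO_CONFIGURED || !d->qp_ready)
		return -EINVAL;
	d->state = DEMO_STARTED;
	return 0;
}

/* Stop returns a running device to the configured state. */
static int
demo_stop(struct demo_regex_dev *d)
{
	if (d->state == DEMO_STARTED)
		d->state = DEMO_CONFIGURED;
	return 0;
}
```

In this model a configure call on a running device fails until the device is
stopped, which mirrors the stop, reconfigure, start sequence described above.
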
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the addresses of the fast-path functions of the
> RegEx driver are not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
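
To illustrate the two dispatch paths described above, here is a minimal,
self-contained sketch. The structure layout, the demo_*() wrappers and the
toy_*() driver are all hypothetical; the RFC in this thread does not yet
define *rte_regex_dev* or *regex_dev_ops* in code.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_regex_dev;

typedef uint16_t (*demo_burst_t)(struct demo_regex_dev *dev,
				 uint16_t qp_id, uint16_t nb_ops);

/* Slow-path (control) operations live behind one extra pointer. */
struct demo_regex_dev_ops {
	int (*dev_start)(struct demo_regex_dev *dev);
};

struct demo_regex_dev {
	demo_burst_t enqueue; /* fast path: stored at the start of the struct */
	demo_burst_t dequeue;
	const struct demo_regex_dev_ops *dev_ops; /* slow path */
	int started;
};

#define DEMO_MAX_DEVS 4
static struct demo_regex_dev demo_devs[DEMO_MAX_DEVS];

/* Slow path: dev_id -> device -> ops table -> driver function. */
static int
demo_dev_start(uint8_t dev_id)
{
	struct demo_regex_dev *dev;

	if (dev_id >= DEMO_MAX_DEVS)
		return -1;
	dev = &demo_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -1;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: dev_id -> device -> driver function, one indirection less. */
static inline uint16_t
demo_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct demo_regex_dev *dev = &demo_devs[dev_id];

	return dev->enqueue(dev, qp_id, nb_ops);
}

/* A toy driver that accepts every op once the device is started. */
static int
toy_start(struct demo_regex_dev *dev)
{
	dev->started = 1;
	return 0;
}

static uint16_t
toy_enqueue(struct demo_regex_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0;
}

static const struct demo_regex_dev_ops toy_ops = { .dev_start = toy_start };

/* What a driver's init function would do at probe time. */
static void
toy_init(uint8_t dev_id)
{
	demo_devs[dev_id].enqueue = toy_enqueue;
	demo_devs[dev_id].dev_ops = &toy_ops;
}
```

The fast-path wrapper deliberately skips the checks done on the slow path,
which is the usual trade-off in this style of API.
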
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> A typical application will use the RegEx device API in the following
> programming flow:
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update(), which needs to be invoked if a precompiled rule
>   database is not provided in rte_regex_dev_config::rule_db for
>   rte_regex_dev_configure() and/or the application needs to update it.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
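
The last two steps of the flow above can be pictured with a toy queue pair.
The sketch assumes burst semantics like those of the existing ethdev and
cryptodev APIs (the call returns how many ops were actually accepted or
returned, possibly fewer than requested). struct demo_op, struct demo_qp and
the toyqp_*() functions are hypothetical and only model the queueing, not any
pattern matching.

```c
#include <stdint.h>

#define DEMO_QP_DEPTH 4

/* Hypothetical stand-in for an rte_regex_ops object. */
struct demo_op {
	uint32_t user_id; /* application cookie */
	int done;         /* set when the "device" has processed the op */
};

/* A fixed-size ring standing in for one queue pair. */
struct demo_qp {
	struct demo_op *ring[DEMO_QP_DEPTH];
	uint16_t head;  /* next slot to dequeue */
	uint16_t count; /* ops currently in flight */
};

/* Accept up to nb_ops requests; return how many fit into the ring. */
static uint16_t
toyqp_enqueue_burst(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && qp->count < DEMO_QP_DEPTH; i++) {
		uint16_t tail = (uint16_t)((qp->head + qp->count) % DEMO_QP_DEPTH);

		qp->ring[tail] = ops[i];
		qp->count++;
	}
	return i;
}

/* Return up to nb_ops completed requests in submission order. */
static uint16_t
toyqp_dequeue_burst(struct demo_qp *qp, struct demo_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && qp->count > 0; i++) {
		ops[i] = qp->ring[qp->head];
		ops[i]->done = 1; /* pretend the engine finished the scan */
		qp->head = (uint16_t)((qp->head + 1) % DEMO_QP_DEPTH);
		qp->count--;
	}
	return i;
}
```

With a depth of 4, enqueuing 6 ops accepts only the first 4, so the caller
has to retry the remainder after draining responses.
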
> ---
> 
>  config/common_base                 |    5 +
>  doc/api/doxy-api-index.md          |    1 +
>  doc/api/doxy-api.conf.in           |    1 +
>  lib/Makefile                       |    2 +
>  lib/librte_regexdev/Makefile       |   23 +
>  lib/librte_regexdev/rte_regexdev.c |    5 +
>  lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
>  7 files changed, 1284 insertions(+)
>  create mode 100644 lib/librte_regexdev/Makefile
>  create mode 100644 lib/librte_regexdev/rte_regexdev.c
>  create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
>  #
>  CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
>  #
>  # Compile librte_ring
>  #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
>    [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
>    [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
>    [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
>    [metrics]            (@ref rte_metrics.h),
>    [bitrate]            (@ref rte_bitrate.h),
>    [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
>                            @TOPDIR@/lib/librte_rawdev \
>                            @TOPDIR@/lib/librte_rcu \
>                            @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
>                            @TOPDIR@/lib/librte_ring \
>                            @TOPDIR@/lib/librte_sched \
>                            @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
>                             librte_mempool librte_timer librte_cryptodev
>  DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
>  DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>  DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
>  			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in a device-specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier is
> + * provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features
> + * of PCRE. The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function can
> + * be invoked in parallel by different logical cores on different queue pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*.
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * A typical application will use the RegEx device API in the following
> + * programming flow:
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update(), which needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update it.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * only loading the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc',
> + * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
> + * atomic groups don't allow backtracking back to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function during matching by using '(?C)'.
> + * Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD', the RegEx engine
> + * will match 'ABC', perform a user-defined callout and return a successful
> + * match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as was
> + * most recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, in the RegEx 'AB\d*' the quantifier '*' matches zero or more
> + * digits. In greedy mode the subject 'AB12345' is matched completely,
> + * whereas in ungreedy mode ('AB\d*?') only 'AB' is returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject is 'dwad1234!', the RegEx engine doesn't report any match
> + * because the assertion '(?=!{3,})' fails. The subject 'dwad123!!!' would
> + * return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123',
> + * then even though the entire subject matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Examples of possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar', which matches
> + * the subject 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+', which matches
> + * the string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not
> + * be unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group
> + * matches an empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences
> + * or recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern
> + * and the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum number of groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with
> + *   the contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx
> + *     device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect matches that occur
> + * across buffer boundaries, where the buffers are related to each other in
> + * some way. Enable this flag to scan payloads larger than
> + * struct rte_regex_dev_info::max_payload_size and/or when matches can span
> + * scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* value previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Initial prebuilt rule database to import on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may
> +	 * use the rte_regex_rule_db_update() or rte_regex_rule_db_import()
> +	 * API to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags. @see RTE_REGEX_DEV_CFG_* */
> +};
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities of the
> + * resources available for this regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags. @see RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to set up. The value must be in the
> + *   range [0, nb_queue_pairs - 1] previously supplied to
> + *   rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue
> + *   pair. NULL value is allowed, in which case the default configuration is
> + *   used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0: Device closed successfully.
> + *  - <0: Failed to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied by the
> + *                   application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
> +	 * RTE_REGEX_RULE_OP_REMOVE.
> +	 */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule. */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. The rule flags supported by the device are
> +	 * enumerated in struct rte_regex_dev_info::rule_flags. For a
> +	 * successful rule database update, the application needs to provide
> +	 * only supported rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> + *   structure which contain the regex rules attributes to be updated in the
> + *   rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stat requested by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that
> + *   further requests for the stat can be made using
> + *   rte_regex_dev_xstats_get(), which will be faster as it doesn't need to
> + *   scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is
> + *   NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match
> + * attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan
> + * to reduce the post-processing required by the application. The match with
> + * the lowest rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix
> + * length while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the RegEx device. */
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of the *rte_regex_ops::matches* zero length array
> +	 * will be this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**< Reserve 8 bytes on 32-bit systems */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. At least one group id
> +	 * must be provided by the application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is
> +	 * valid; similar flags apply to group_id2 and group_id3.
> +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> +	 * with the matching group ID by the device. The group ID scheme
> +	 * provides rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold an application specific value to share
> +		 * between enqueue and dequeue operations.
> +		 * The implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
> +/**
> + * Enqueue a burst of scan request on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process, which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
> + *   structures which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The
> + *   return value can be less than the value of the *nb_ops* parameter when
> + *   the regex device's queue is full or if invalid parameters are specified
> + *   in a *rte_regex_ops*. If the return value is less than *nb_ops*, the
> + *   remaining ops at the end of *ops* are not consumed and the caller has to
> + *   take care of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan response from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that
> + *   must be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are left untouched.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
> --
> 2.21.0
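
A note on the rule_flags comment above ("application needs to provide only
supported rule flags"): the check an application would do before calling
rte_regex_rule_db_update() could look roughly like the sketch below. It is
not part of the patch. The three flag values are copied from the RFC so the
snippet builds without DPDK, and regex_unsupported_rule_flags() is a
hypothetical helper name, not a proposed API.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of RFC flag values so the sketch compiles without DPDK. */
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_DOTALL_F   (1ULL << 3)
#define RTE_REGEX_PCRE_RULE_UTF_F      (1ULL << 11)

/*
 * Hypothetical helper (not in the RFC): return the bits of 'wanted' that
 * are missing from the rule_flags advertised in rte_regex_dev_info.
 * A result of zero means every requested flag is supported.
 */
static inline uint64_t
regex_unsupported_rule_flags(uint64_t dev_rule_flags, uint64_t wanted)
{
	return wanted & ~dev_rule_flags;
}
```

With a device advertising CASELESS and DOTALL, a rule asking for CASELESS
and UTF would get back only the UTF bit, so the application can reject or
rewrite that rule before it reaches the driver.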

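The "retrieve as many processed operations as possible" policy described for
rte_regex_dequeue_burst() can be sketched as below. This is only an
illustration under stated assumptions: mock_regex_queue and
mock_dequeue_burst() are hypothetical stand-ins that mimic the documented
return-value contract (never more than nb_ops, fewer when the queue runs
dry), so that the loop can be compiled without a device or DPDK.

```c
#include <assert.h>
#include <stdint.h>

#define BURST_SIZE 4

/* Hypothetical stand-in for a device response queue. */
struct mock_regex_queue {
	uint16_t pending; /* processed ops waiting to be dequeued */
};

/* Mimics the dequeue contract: hands back at most nb_ops ops. */
static inline uint16_t
mock_dequeue_burst(struct mock_regex_queue *q, uint16_t nb_ops)
{
	uint16_t n = q->pending < nb_ops ? q->pending : nb_ops;

	q->pending = (uint16_t)(q->pending - n);
	return n;
}

/* Keep dequeuing until a burst comes back short, then report the total. */
static inline uint32_t
drain_responses(struct mock_regex_queue *q)
{
	uint32_t total = 0;
	uint16_t n;

	do {
		n = mock_dequeue_burst(q, BURST_SIZE);
		total += n;
	} while (n == BURST_SIZE); /* a full burst hints that more remain */

	return total;
}
```

Note that when the number of pending ops is an exact multiple of the burst
size, the loop makes one extra call that returns 0. That is the cost of
having no error or "queue empty" notification on the dequeue path.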

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
  2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
@ 2019-08-15  9:35 ` Thomas Monjalon
  2019-08-15 11:34   ` Thomas Monjalon
  2020-01-27 21:19 ` [dpdk-dev] [PATCH v2] net/regexdev: " Ori Kam
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Thomas Monjalon @ 2019-08-15  9:35 UTC (permalink / raw)
  To: jerinj
  Cc: dev, Pavan Nikhilesh, Shahaf Shuler, Hemant Agrawal, Opher Reviv,
	Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor, Nipun Gupta, Wang,
	Xiang W, Richardson, Bruce, yang.a.hong, harry.chang, gu.jian1,
	shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni

+Cc other interested vendors
+Cc contributors to µDPI project in fd.io

27/06/2019 17:50, jerinj@marvell.com:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation is available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds 
> 
> Requesting review from HW and SW RegEx vendors and RegEx application users
> in order to have a portable DPDK API for RegEx.
> 
> The API semantics are based on the existing cryptodev, eventdev and ethdev device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
>   a RegEx device (configure it, setup its queue pairs and start it),
>   update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
>   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
>   a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled into a device-specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> is provided at the time of rule creation for the application to identify
> the rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It is possible that a given RegEx device does not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags.
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue
> pairs. It is the responsibility of the upper level application to enforce
> this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
>     - rte_regex_dev_configure()
>     - rte_regex_queue_pair_setup()
>     - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
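
The ordering rules above (configure, queue pair setup, start; stop before any
reconfiguration; no enqueue/dequeue while stopped) can be modelled as a small
state machine. This is a toy model only: the `toy_*` names and the state enum
are hypothetical and nothing here is the library's implementation.

```c
#include <stdbool.h>

/* Toy device states; names are illustrative, not taken from the patch. */
enum dev_state {
	DEV_UNCONFIGURED,
	DEV_CONFIGURED,
	DEV_QP_READY,
	DEV_STARTED,
};

struct toy_dev { enum dev_state state; };

/* (Re)configuration is refused while the device is running. */
static int toy_configure(struct toy_dev *d)
{
	if (d->state == DEV_STARTED)
		return -1;
	d->state = DEV_CONFIGURED;
	return 0;
}

/* Queue pairs can only be set up on a configured, stopped device. */
static int toy_queue_pair_setup(struct toy_dev *d)
{
	if (d->state != DEV_CONFIGURED && d->state != DEV_QP_READY)
		return -1;
	d->state = DEV_QP_READY;
	return 0;
}

/* Start requires configure and queue pair setup to have happened. */
static int toy_start(struct toy_dev *d)
{
	if (d->state != DEV_QP_READY)
		return -1;
	d->state = DEV_STARTED;
	return 0;
}

/* Stop returns to a state where reconfiguration is allowed again. */
static int toy_stop(struct toy_dev *d)
{
	if (d->state != DEV_STARTED)
		return -1;
	d->state = DEV_QP_READY;
	return 0;
}

/* Enqueue and dequeue are only legal while the device is started. */
static bool toy_can_enqueue(const struct toy_dev *d)
{
	return d->state == DEV_STARTED;
}
```

In this model a stopped device keeps its queue pair setup, so it can be
restarted directly or reconfigured first; whether a real PMD behaves that way
is an assumption here.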
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> 
> For performance reasons, the addresses of the fast-path functions of the
> RegEx driver are not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
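
A rough sketch of the layout described above: control-path functions are
reached through an ops table, while the fast-path pointers sit at the start
of the device structure. All `toy_*` and `drv_*` names, the field list and the
signatures are assumptions made for illustration. The RFC text names only
*rte_regex_dev* and *regex_dev_ops*.

```c
#include <stdint.h>
#include <stddef.h>

struct toy_regex_dev;

/* Fast-path function type: burst enqueue/dequeue on one queue pair. */
typedef uint16_t (*toy_burst_fn)(struct toy_regex_dev *dev, uint16_t qp_id,
				 void **ops, uint16_t nb_ops);

/* Control-path operations, reached through one extra pointer hop. */
struct toy_regex_dev_ops {
	int (*dev_start)(struct toy_regex_dev *dev);
	int (*dev_stop)(struct toy_regex_dev *dev);
};

struct toy_regex_dev {
	/* Fast-path pointers first, so they sit at the start of the struct. */
	toy_burst_fn enqueue;
	toy_burst_fn dequeue;
	const struct toy_regex_dev_ops *dev_ops;
	int started;
};

/* Fast path: load the pointer from the device, then call. */
static inline uint16_t
toy_enqueue_burst(struct toy_regex_dev *dev, uint16_t qp_id,
		  void **ops, uint16_t nb_ops)
{
	return dev->enqueue(dev, qp_id, ops, nb_ops);
}

/* Control path: load dev_ops, then the pointer, then call. */
static inline int
toy_dev_start(struct toy_regex_dev *dev)
{
	return dev->dev_ops->dev_start(dev);
}

/* A trivial "driver" that accepts every op it is given. */
static uint16_t
drv_burst(struct toy_regex_dev *dev, uint16_t qp_id, void **ops,
	  uint16_t nb_ops)
{
	(void)dev; (void)qp_id; (void)ops;
	return nb_ops;
}

static int drv_start(struct toy_regex_dev *dev) { dev->started = 1; return 0; }
static int drv_stop(struct toy_regex_dev *dev) { dev->started = 0; return 0; }

static const struct toy_regex_dev_ops drv_ops = {
	.dev_start = drv_start,
	.dev_stop = drv_stop,
};

/* What a driver's device init function would fill in at probe time. */
static void
drv_init(struct toy_regex_dev *dev)
{
	dev->enqueue = drv_burst;
	dev->dequeue = drv_burst;
	dev->dev_ops = &drv_ops;
	dev->started = 0;
}
```

The saving is one dependent load per burst call, which matters only because
enqueue and dequeue run in the polling loop.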
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
>   database is not provided in rte_regex_dev_config::rule_db for
>   rte_regex_dev_configure() and/or the application needs to update the rule
>   database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
>  config/common_base                 |    5 +
>  doc/api/doxy-api-index.md          |    1 +
>  doc/api/doxy-api.conf.in           |    1 +
>  lib/Makefile                       |    2 +
>  lib/librte_regexdev/Makefile       |   23 +
>  lib/librte_regexdev/rte_regexdev.c |    5 +
>  lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
>  7 files changed, 1284 insertions(+)
>  create mode 100644 lib/librte_regexdev/Makefile
>  create mode 100644 lib/librte_regexdev/rte_regexdev.c
>  create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
>  #
>  CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
>  
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
>  #
>  # Compile librte_ring
>  #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
>    [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
>    [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
>    [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
>    [metrics]            (@ref rte_metrics.h),
>    [bitrate]            (@ref rte_bitrate.h),
>    [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
>                            @TOPDIR@/lib/librte_rawdev \
>                            @TOPDIR@/lib/librte_rcu \
>                            @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
>                            @TOPDIR@/lib/librte_ring \
>                            @TOPDIR@/lib/librte_sched \
>                            @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
>                             librte_mempool librte_timer librte_cryptodev
>  DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
>  DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
>  DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
>  			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled into a device-specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier
> + * is provided at the time of rule creation for the application to identify
> + * the rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * It is possible that a given RegEx device does not support all the features
> + * of PCRE. The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs. It is the responsibility of the upper level application to enforce
> + * this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the rule
> + *   database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * only loading a pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc',
> + * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc',
> + * because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD', the
> + * RegEx engine will parse 'ABC', perform a user-defined callout and return a
> + * successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
> + * matched by the 2nd capturing group i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, if the RegEx is 'AB\d*?' then '*?' is the lazy (ungreedy) form
> + * of '*', which matches zero or more times. In greedy mode ('AB\d*') the
> + * subject 'AB12345' will be matched completely, whereas in ungreedy mode
> + * 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject is 'dwad1234!', the RegEx engine doesn't report any match
> + * because the assertion '(?=!{3,})' fails. The subject 'dwad123!!!' would
> + * return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123',
> + * then even though the entire subject matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Examples of possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, RegEx engine has to regard both the pattern and the
> + * subject strings that are subsequently processed as strings of UTF characters
> + * instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum number of groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
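
A minimal sketch of how an application might use pcre_unsup_flags once rte_regex_dev_info_get() has filled in the info structure. It is not part of the patch: the macro values are copied locally from the header above so the snippet compiles on its own, and regex_missing_features() is a hypothetical helper name. The idea is that the application describes the PCRE features its rule set relies on with the same bit positions and intersects that with what the device reports as unsupported.

```c
#include <stdint.h>

/* Local copies of a few flag values from the patch, so that this sketch
 * compiles without the proposed rte_regexdev.h.
 */
#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)

/* Hypothetical helper (not in the patch).
 * 'pcre_unsup_flags' is the value reported in rte_regex_dev_info.
 * 'needed' is the set of PCRE features the rule set relies on, expressed
 * with the same RTE_REGEX_DEV_PCRE_UNSUP_* bit positions.
 * Returns the needed features the device cannot handle; 0 means none.
 */
static inline uint64_t
regex_missing_features(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return pcre_unsup_flags & needed;
}
```

A non-zero result tells the application which constructs to rewrite or to handle in a software fallback before it attempts a rule database update.
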
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect matches that occur
> + * across buffer boundaries, where the buffers are related to each other in
> + * some way. Enable this flag to scan a payload larger than
> + * struct rte_regex_dev_info::max_payload_size and/or when matches can span
> + * scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* value previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups* value
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import an initial prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may
> +	 * use the rte_regex_rule_db_update() or rte_regex_rule_db_import()
> +	 * API to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
> +};
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities and
> + * resource limits of this regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
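
To illustrate the relationship between the limits in rte_regex_dev_info and the values in rte_regex_dev_config, here is a rough sketch of a pre-check an application could run before calling rte_regex_dev_configure(). It is an assumption about usage, not something the patch provides: the two structs are trimmed local mirrors holding only the fields involved, and check_dev_request() is a hypothetical helper. It also resolves the two documented "0 means default" cases.

```c
#include <errno.h>
#include <stdint.h>

/* Trimmed local mirror of struct rte_regex_dev_info (limits only). */
struct dev_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Trimmed local mirror of struct rte_regex_dev_config (counts only). */
struct dev_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/* Hypothetical pre-check (not in the patch).
 * Returns -EINVAL if any requested value exceeds the device limit.
 * Otherwise returns 0 and replaces the documented zero defaults with the
 * values the comments say the device would use.
 */
static int
check_dev_request(const struct dev_limits *lim, struct dev_request *req)
{
	if (req->nb_max_matches > lim->max_matches ||
	    req->nb_queue_pairs > lim->max_queue_pairs ||
	    req->nb_rules_per_group > lim->max_rules_per_group ||
	    req->nb_groups > lim->max_groups)
		return -EINVAL;

	if (req->nb_max_matches == 0)
		req->nb_max_matches = 1; /* "value 1 is used" */
	if (req->nb_rules_per_group == 0)
		req->nb_rules_per_group = lim->max_rules_per_group;

	return 0;
}
```
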
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0: Device closed successfully.
> + *  - <0: Failed to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefixes detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied by the
> + *                   application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
> +	 * RTE_REGEX_RULE_OP_REMOVE.
> +	 */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule. */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. The PCRE rule flags supported by the device are
> +	 * enumerated in struct rte_regex_dev_info::rule_flags. For a
> +	 * successful rule database update, the application must provide only
> +	 * supported rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
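
Since a rule database update only succeeds with supported rule flags, an application would probably want to validate rule_flags against rte_regex_dev_info::rule_flags up front. A small sketch under that assumption follows; the macros are local copies of values from the header above and regex_unsupported_rule_flags() is a hypothetical name, not part of the patch. Note the polarity differs from pcre_unsup_flags: here the device reports what it does support.

```c
#include <stdint.h>

/* Local copies of a few rule flag values from the patch, so that this
 * sketch compiles without the proposed rte_regexdev.h.
 */
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)

/* Hypothetical helper (not in the patch).
 * 'supported' is rte_regex_dev_info::rule_flags.
 * 'requested' is rte_regex_rule::rule_flags for the rule to be added.
 * Returns the requested flags the device does not support; 0 means the
 * rule's flags are acceptable.
 */
static inline uint64_t
regex_unsupported_rule_flags(uint64_t supported, uint64_t requested)
{
	return requested & ~supported;
}
```
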
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
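
One consequence of this layout, as far as I can tell from the structs alone: struct rte_regex_rule carries a 32-bit rule_id and a 16-bit group_id, while the match result reports them in 20 and 12 bits. An application may therefore want to reject identifiers that cannot be reported back unmodified. The sketch below only encodes those two widths; the macro and function names are hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit-field widths taken from struct rte_regex_match in the patch. */
#define MATCH_RULE_ID_BITS 20
#define MATCH_GROUP_ID_BITS 12

/* Hypothetical helper (not in the patch): true when both identifiers fit
 * in the bit-fields used to report a match.
 */
static inline bool
regex_ids_fit_match(uint32_t rule_id, uint16_t group_id)
{
	return rule_id < (UINT32_C(1) << MATCH_RULE_ID_BITS) &&
	       group_id < (1u << MATCH_GROUP_ID_BITS);
}
```
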
> +
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. At least one group ID
> +	 * must be provided by the application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is
> +	 * valid; the corresponding flags apply to group_id2 and group_id3.
> +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> +	 * with the matching group ID by the device. The group ID scheme
> +	 * provides rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
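
The group_id0..group_id3 fields and their VALID flags have to be kept in sync by the application, so a small setter seems useful. The following is a sketch only: struct ops_groups is a trimmed local mirror of the group-related fields of struct rte_regex_ops, the flag values are copied from the header above, and regex_ops_set_groups() is a hypothetical helper. group_id0 has no flag because at least one group is always required.

```c
#include <stdint.h>

/* Local copies of the request flags from the patch, so that this sketch
 * compiles without the proposed rte_regexdev.h.
 */
#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)

/* Trimmed mirror of struct rte_regex_ops: only the group-related fields. */
struct ops_groups {
	uint16_t req_flags;
	uint16_t group_id0;
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
};

/* Hypothetical helper (not in the patch): store 1 to 4 group IDs and set
 * the matching VALID flags. Other request flags are left untouched.
 * Returns 0 on success, -1 if 'n' is outside 1..4.
 */
static int
regex_ops_set_groups(struct ops_groups *op, const uint16_t *ids,
		     unsigned int n)
{
	const uint16_t valid_mask = RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F |
				    RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F |
				    RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F;

	if (n < 1 || n > 4)
		return -1;

	/* Clear stale VALID bits from a previous use of this op. */
	op->req_flags &= (uint16_t)~valid_mask;
	op->group_id0 = ids[0];
	if (n > 1) {
		op->group_id1 = ids[1];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F;
	}
	if (n > 2) {
		op->group_id2 = ids[2];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F;
	}
	if (n > 3) {
		op->group_id3 = ids[3];
		op->req_flags |= RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F;
	}
	return 0;
}
```
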
> +
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
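
For the partial-enqueue case described above, one possible way for the caller to "take care of" the remaining ops is a bounded retry, sketched below. This is an assumption about application policy, not something the patch mandates. The enqueue function is passed in as a pointer with the same prototype as rte_regex_enqueue_burst() so the snippet compiles without the proposed header; enqueue_all() and fake_enqueue() are hypothetical names, and fake_enqueue() merely stands in for a device whose queue accepts three ops per call. The retry count is bounded because a short return may also mean an invalid op, in which case retrying would never succeed.

```c
#include <stdint.h>

struct rte_regex_ops; /* opaque here; defined by the proposed header */

/* Same prototype as rte_regex_enqueue_burst() in the patch. */
typedef uint16_t (*enqueue_fn)(uint8_t dev_id, uint16_t qp_id,
			       struct rte_regex_ops **ops, uint16_t nb_ops);

/* Hypothetical helper (not in the patch): resubmit the unconsumed tail of
 * 'ops' until everything is enqueued or 'max_tries' calls have been made.
 * Returns the number of ops enqueued in total.
 */
static uint16_t
enqueue_all(enqueue_fn enq, uint8_t dev_id, uint16_t qp_id,
	    struct rte_regex_ops **ops, uint16_t nb_ops,
	    unsigned int max_tries)
{
	uint16_t sent = 0;

	while (sent < nb_ops && max_tries-- > 0)
		sent += enq(dev_id, qp_id, ops + sent,
			    (uint16_t)(nb_ops - sent));

	return sent;
}

/* Stand-in for a device whose queue accepts at most 3 ops per call. */
static uint16_t
fake_enqueue(uint8_t dev_id, uint16_t qp_id,
	     struct rte_regex_ops **ops, uint16_t nb_ops)
{
	(void)dev_id;
	(void)qp_id;
	(void)ops;
	return nb_ops < 3 ? nb_ops : 3;
}
```
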
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_op* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_op* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are not filled in and the caller must not
> + *   read them.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-15  9:35 ` Thomas Monjalon
@ 2019-08-15 11:34   ` Thomas Monjalon
  2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
  2019-08-21  5:32     ` Shahaf Shuler
  0 siblings, 2 replies; 62+ messages in thread
From: Thomas Monjalon @ 2019-08-15 11:34 UTC (permalink / raw)
  To: dev
  Cc: jerinj, Pavan Nikhilesh, Shahaf Shuler, Hemant Agrawal,
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	Nipun Gupta, Wang, Xiang W, Richardson, Bruce, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su

+Cc more

------------

From: Jerin Jacob <jerinj@marvell.com>
 
Even though there are some vendors which offer Regex HW offload, due to
the lack of a standard API, it is difficult for DPDK consumers to use them
in a portable way.
 
This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
 
The Doxygen generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
 
This RFC is crafted based on SW Regex API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.
 
RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds 
 
Requesting review from HW and SW RegEx vendors and RegEx application users
in order to arrive at a portable DPDK API for RegEx.
 
The API schematics are based on the existing cryptodev, eventdev and ethdev
device APIs.
 
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 
RTE RegEx Device API
--------------------
 
Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
 
The RegEx Device API is composed of two parts:
 
- The application-oriented RegEx API that includes functions to setup
a RegEx device (configure it, setup its queue pairs and start it),
update the rule database and so on.
 
- The driver-oriented RegEx API that exports a function allowing
a RegEx poll Mode Driver (PMD) to simultaneously register itself as
a RegEx device driver.
 
RegEx device components and definitions:
 
    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+
 
RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.
 
RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.
 
PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
 
RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.
 
Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.
 
Rule database: The RegEx device accepts regular expressions and converts them
into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled in device specific binary form.
 
Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.
 
Group ID: Rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier is
provided at the time of rule creation for the application to identify the
rule upon match.
 
Scan: A pattern matching request through *enqueue* API.
 
It is possible that a given RegEx device does not support all the features
of PCRE. The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
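
As a rough illustration of how an application might act on these flags
before loading a rule, here is a minimal sketch. The flag values are copied
from the header proposed in this RFC; the struct regex_dev_info_sketch type,
the 64-bit width of pcre_unsup_flags and the rule_features_supported() helper
are assumptions made for the example and are not part of the proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the rte_regexdev.h proposed in this RFC. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F   (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Stand-in for struct rte_regex_dev_info. Only the field used here is
 * modelled, and its 64-bit width is an assumption.
 */
struct regex_dev_info_sketch {
	uint64_t pcre_unsup_flags;
};

/*
 * Hypothetical helper: 'features_used' describes the PCRE features a rule
 * relies on, expressed with the same UNSUP_* bits. Returns true when the
 * device does not flag any of them as unsupported.
 */
static bool
rule_features_supported(const struct regex_dev_info_sketch *info,
			uint64_t features_used)
{
	assert(info != NULL);
	return (info->pcre_unsup_flags & features_used) == 0;
}
```

An application could run such a check once per rule and fall back to a
software matcher for rules the device cannot handle.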
 
By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which assume they are not invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function can
be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the upper level application to enforce this rule.
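
One way an application could enforce this rule is to validate its
lcore-to-queue-pair assignment once at startup. The sketch below is only an
illustration: the qp_of_worker array and the qp_assignment_is_exclusive()
helper are hypothetical and do not exist in the proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: qp_of_worker[i] is the queue pair that worker
 * lcore i will poll. Returns true only if every index lies within
 * [0, nb_queue_pairs - 1] and no queue pair is shared by two workers.
 */
static bool
qp_assignment_is_exclusive(const uint16_t *qp_of_worker, size_t nb_workers,
			   uint16_t nb_queue_pairs)
{
	assert(qp_of_worker != NULL);
	for (size_t i = 0; i < nb_workers; i++) {
		if (qp_of_worker[i] >= nb_queue_pairs)
			return false;
		for (size_t j = i + 1; j < nb_workers; j++)
			if (qp_of_worker[i] == qp_of_worker[j])
				return false;
	}
	return true;
}
```

The quadratic scan is acceptable here because it runs once and the number of
worker lcores is small.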
 
In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.
 
At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.
 
RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.
 
The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.
 
If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.
 
The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following order:
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_dev_start()
 
Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
matching responses, get the stats, update the rule database,
get/set device attributes and so on.
 
If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the reconfiguration
before calling rte_regex_dev_start() again. The enqueue and dequeue
functions should not be invoked when the device is stopped.
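
The ordering rules above can be modelled as a small state machine. The
sketch below is a toy model for discussion, not driver code: the sketch_*
names and the four states are assumptions, and it additionally assumes that
rte_regex_dev_stop() leaves the previous configuration in place so that the
device may be restarted without reconfiguring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state {
	SKETCH_UNCONFIGURED,
	SKETCH_CONFIGURED,	/* configured, queue pairs not set up yet */
	SKETCH_READY,		/* queue pairs set up, device stopped */
	SKETCH_STARTED,
};

struct sketch_dev {
	enum sketch_state state;
};

/* Models rte_regex_dev_configure(): rejected while the device runs. */
static bool
sketch_configure(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state == SKETCH_STARTED)
		return false;
	d->state = SKETCH_CONFIGURED;
	return true;
}

/* Models rte_regex_queue_pair_setup(): needs a configured, stopped device. */
static bool
sketch_queue_pair_setup(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state != SKETCH_CONFIGURED && d->state != SKETCH_READY)
		return false;
	d->state = SKETCH_READY;
	return true;
}

/* Models rte_regex_dev_start(): only valid after queue pair setup. */
static bool
sketch_start(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state != SKETCH_READY)
		return false;
	d->state = SKETCH_STARTED;
	return true;
}

/* Models rte_regex_dev_stop(): returns the device to the stopped state. */
static bool
sketch_stop(struct sketch_dev *d)
{
	assert(d != NULL);
	if (d->state != SKETCH_STARTED)
		return false;
	d->state = SKETCH_READY;
	return true;
}

/* Enqueue and dequeue are only meaningful on a started device. */
static bool
sketch_can_burst(const struct sketch_dev *d)
{
	assert(d != NULL);
	return d->state == SKETCH_STARTED;
}
```

In this model a configure call on a started device is refused, which mirrors
the requirement to stop the device before reconfiguring it.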
 
Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.
 
Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.
 
For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
structure by the device init function of the RegEx driver, which is
invoked during the PCI/SoC device probing phase, as explained earlier.
 
In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
 
For performance reasons, the address of the fast-path functions of the
RegEx driver is not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
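
To make the layout argument concrete, here is a minimal sketch of such a
device structure. All fp_* names, the device table and the dummy callback
are invented for illustration; the RFC does not define the contents of
*rte_regex_dev* or *regex_dev_ops* in this form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fp_op;	/* stand-in for the RFC's regex operation type */
struct fp_dev;

typedef uint16_t (*fp_burst_t)(struct fp_dev *dev, uint16_t qp_id,
			       struct fp_op **ops, uint16_t nb_ops);

/* Slow-path callbacks, reached through one extra pointer dereference. */
struct fp_dev_ops {
	int (*dev_start)(struct fp_dev *dev);
	int (*dev_stop)(struct fp_dev *dev);
};

struct fp_dev {
	fp_burst_t enqueue;			/* fast path, first member */
	fp_burst_t dequeue;			/* fast path */
	const struct fp_dev_ops *dev_ops;	/* slow path */
};

#define FP_MAX_DEVS 4
static struct fp_dev fp_devices[FP_MAX_DEVS];

/* Shape of a fast-path wrapper: dev_id -> device -> direct call. */
static inline uint16_t
fp_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
		 struct fp_op **ops, uint16_t nb_ops)
{
	assert(dev_id < FP_MAX_DEVS);
	struct fp_dev *dev = &fp_devices[dev_id];

	return dev->enqueue(dev, qp_id, ops, nb_ops);
}

/* Dummy PMD callback that accepts every operation. */
static uint16_t
fp_dummy_enqueue(struct fp_dev *dev, uint16_t qp_id,
		 struct fp_op **ops, uint16_t nb_ops)
{
	(void)dev;
	(void)qp_id;
	(void)ops;
	return nb_ops;
}
```

The wrapper loads the function pointer from the device structure itself,
whereas a slow-path call would first load dev_ops and then the callback.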
 
RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.
 
The *enqueue* operation submits a burst of RegEx pattern matching requests
to the RegEx device and the *dequeue* operation gets a burst of pattern
matching responses for the ones submitted through the *enqueue* operation.
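
The partial-burst behaviour is easiest to see against a mock queue. The
sketch below does not call the proposed API: mock_qp is a small FIFO that
completes operations instantly, and every mock_* name is an assumption made
for the example. It only shows how a caller might retry the ops that an
enqueue did not consume while draining responses in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_op { int id; };	/* stand-in for a regex operation */

#define MOCK_QP_DEPTH 4

/* Mock queue pair: a small FIFO, not a real device queue. */
struct mock_qp {
	struct mock_op *slots[MOCK_QP_DEPTH];
	uint16_t head;	/* next slot to dequeue */
	uint16_t count;	/* ops currently queued */
};

/* Accepts as many ops as fit; may return less than nb_ops when full. */
static uint16_t
mock_enqueue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count < MOCK_QP_DEPTH) {
		qp->slots[(qp->head + qp->count) % MOCK_QP_DEPTH] = ops[n++];
		qp->count++;
	}
	return n;
}

/* Hands back up to nb_ops ops; the mock treats queued ops as completed. */
static uint16_t
mock_dequeue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count > 0) {
		ops[n++] = qp->slots[qp->head];
		qp->head = (qp->head + 1) % MOCK_QP_DEPTH;
		qp->count--;
	}
	return n;
}

/*
 * Submit every op, draining responses whenever the queue fills up.
 * Returns the number of responses collected in 'done'.
 */
static uint16_t
mock_scan_all(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops,
	      struct mock_op **done)
{
	uint16_t sent = 0, got = 0;

	while (sent < nb_ops) {
		/* Unconsumed ops stay at the tail of 'ops'; retry from there. */
		sent += mock_enqueue_burst(qp, ops + sent, nb_ops - sent);
		got += mock_dequeue_burst(qp, done + got, nb_ops - got);
	}
	/* Keep polling until every submitted op has a response. */
	while (got < sent)
		got += mock_dequeue_burst(qp, done + got, nb_ops - got);
	return got;
}
```

A real device completes operations asynchronously, so an application would
normally bound the polling loop or interleave it with other work.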
 
A typical application using the RegEx device API will follow this
programming flow:
 
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
database is not provided in rte_regex_dev_config::rule_db for
rte_regex_dev_configure() and/or the application needs to update it.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()
 
---
 
config/common_base                 |    5 +
doc/api/doxy-api-index.md          |    1 +
doc/api/doxy-api.conf.in           |    1 +
lib/Makefile                       |    2 +
lib/librte_regexdev/Makefile       |   23 +
lib/librte_regexdev/rte_regexdev.c |    5 +
lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
7 files changed, 1284 insertions(+)
create mode 100644 lib/librte_regexdev/Makefile
create mode 100644 lib/librte_regexdev/rte_regexdev.c
create mode 100644 lib/librte_regexdev/rte_regexdev.h
 
diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
#
CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
  
+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
#
# Compile librte_ring
#
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
[event_timer_adapter]    (@ref rte_event_timer_adapter.h),
[event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
[rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
[metrics]            (@ref rte_metrics.h),
[bitrate]            (@ref rte_bitrate.h),
[latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/librte_rawdev \
@TOPDIR@/lib/librte_rcu \
@TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
@TOPDIR@/lib/librte_ring \
@TOPDIR@/lib/librte_sched \
@TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
librte_mempool librte_timer librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
			librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier is
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * It is possible that a given RegEx device does not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which assume they are not invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the address of the fast-path functions of the
+ * RegEx driver is not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * A typical application using the RegEx device API will follow this
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update it.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * only loading a pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c' if the given patterns are 'abc' and 'abcc' then
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking back to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. For example, with the RegEx 'ABC(?C)D' and the pattern 'ABCD', the
+ * RegEx engine will parse ABC, perform a user-defined callout and return a
+ * successful match at D.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the pattern 'AB12345' will be matched completely
+ * whereas in the ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
+ * then even though the entire pattern matches only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name */
+	struct rte_device *dev;	/**< Device information */
+	uint8_t max_matches;
+	/**< Maximum matches per scan supported by this device */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint16_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device */
+	uint16_t max_groups;
+	/**< Maximum number of groups supported by this device */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint64_t pcre_unsup_flags;
+	/**< Unsupported PCRE features for this RegEx device.
+	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
+	 */
+};
+
+/**
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload larger than
+ * struct rte_regex_dev_info::max_payload_size and/or when
+ * matches can span scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint8_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint16_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups*
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Initial prebuilt rule database to import on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. Application may use
+	 * rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+	 * to update or import rule database after the
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
+};
+
+/**
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities and
+ * resource limits of this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as device returns completion. Application should not set out of order scan
+ * flag if it needs to maintain the ingress order of scan request.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   - 0: Success, RegEx queue pair correctly set up.
+ *   - <0: RegEx queue configuration failed
+ */
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ * @return
+ *   - 0: Success, device started.
+ *   - <0: Device start failed.
+ */
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * Stop a RegEx device.
+ *
+ * Stop a RegEx device. The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained in the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *  - 0 on successfully closed the device.
+ *  - <0 on failure to close the device.
+ */
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefix detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to retrieve
+ * @param[out] attr_value A pointer that will be filled in with the attribute
+ *             value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * Set an attribute to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to set
+ * @param attr_value A pointer to the attribute value supplied
+ *                   by the application
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database */
+};
+
+/** Structure to hold a RegEx rule attributes */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
+	 * RTE_REGEX_RULE_OP_REMOVE.
+	 */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The rule flags supported by the device are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a
+	 * successful rule database update, the application must provide
+	 * only supported rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * Update the rule database of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
+ */
+uint16_t
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint16_t nb_rules);
+
+/**
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id RegEx device identifier
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param[out] values
+ *   The values for each stat requested by ID.
+ * @param n
+ *   The number of stats requested
+ * @return
+ *   - positive value: number of stat entries filled into the values array
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param name
+ *   The stat name to retrieve
+ * @param[out] id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
+ *   be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @return
+ *   - 0: Selftest successful
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+int rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param f
+ *   A pointer to a file for output
+ *
+ * @return
+ *   - 0: on success
+ *   - <0: on failure.
+ */
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t offset;
+			/**< Starting Byte Position for matched rule. */
+			uint16_t len;
+			/**< Length of match in bytes */
+		};
+	};
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t scan_size;
+	/**< Scan size of the buffer to be scanned in bytes. */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint8_t nb_actual_matches;
+	/**< The total number of actual matches detected by the RegEx device. */
+	uint8_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**<  Allow 8-byte reserved on 32-bit system */
+		void *buf_addr;
+		/**< Virtual address of the pattern to be matched. */
+	};
+
+	/* W2 */
+	rte_iova_t buf_iova;
+	/**< IOVA address of the pattern to be matched. */
+
+	/* W3 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W4 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W5 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * Enqueue a burst of scan request on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the
+ *   remaining ops at the end of *ops* are not consumed and the caller has to
+ *   take care of them.
+ */
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ *
+ * Dequeue a burst of scan response from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed ops.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that
+ *   must be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   entries at the end of *ops* are not filled in.
+ */
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */




* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-15 11:34   ` Thomas Monjalon
@ 2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
  2019-08-20  1:54       ` Wang, Xiang W
  2019-08-21  5:32     ` Shahaf Shuler
  1 sibling, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-08-19  3:09 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: Pavan Nikhilesh Bhagavatula, Shahaf Shuler, Hemant Agrawal,
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	Nipun Gupta, Wang, Xiang W, Richardson, Bruce, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su, Guy Kaneti,
	Smadar Fuks, Liron Himi

Reply to Xiang's queries in main thread:

Hi all,

Some questions regarding APIs. Could you please give more insights?

1) rte_regex_ops
      a) rsp_flags
      These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
      RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match at the end of current buffer after scan.
      What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?

[Jerin] We need three states to represent a partial match that spans buffers: RTE_REGEX_OPS_RSP_PMI_SOJ_F
marks the start buffer, intermediate buffers carry no flag, and RTE_REGEX_OPS_RSP_PMI_EOJ_F marks the end buffer.
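
The three states described in this answer could be decoded roughly as follows. This is only a minimal sketch: the bit positions given to the two flags and the pmi_state_from_flags() helper are assumptions made for illustration, not values taken from the RFC header.

```c
#include <stdint.h>

/* Hypothetical bit positions; the RFC header defines the real values. */
#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)

/* Position of a buffer inside a cross-buffer partial match (illustrative). */
enum pmi_state {
	PMI_START,        /* SOJ set: first buffer of the partial match */
	PMI_INTERMEDIATE, /* no flag: a buffer in the middle */
	PMI_END,          /* EOJ set: last buffer of the partial match */
};

/*
 * Map response flags to one of the three states. A response carrying both
 * flags is not discussed in the thread; this sketch simply checks SOJ first.
 */
static enum pmi_state
pmi_state_from_flags(uint16_t rsp_flags)
{
	if (rsp_flags & RTE_REGEX_OPS_RSP_PMI_SOJ_F)
		return PMI_START;
	if (rsp_flags & RTE_REGEX_OPS_RSP_PMI_EOJ_F)
		return PMI_END;
	return PMI_INTERMEDIATE;
}
```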

      RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a specific hardware implementation. I am wondering what this PREFIX refers to:)?

[Jerin] Yes, it looks like this is specific to one hardware implementation. The rte_regex_dev_attr_set/get functions were introduced to make it portable and
to allow new implementation-specific fields to be added.
For example, if a rule is
/ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the factor. The prefix is a literal
string, while the factor can contain complex regular expression constructs. As a result, rule matching occurs in
two stages: prefix matching and factor matching.
 
      b)  user_id or user_ptr
      Under what kind of circumstances should an application pass a value into these variables for enqueue and dequeue operations?

[Jerin] Just like rte_crypto_ops, struct rte_regex_ops is normally allocated from a mempool. On enqueue, the user can set user_id
if needed, in order to identify the op on dequeue. A use case could be storing a sequence number from the application's
point of view, or storing a pointer to the mbuf for which the pattern match was requested.
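
A minimal sketch of the sequence-number use case follows. The regex_op_sketch structure is a hypothetical stand-in for struct rte_regex_ops that models only the user_id/user_ptr field; the union layout and both helper names are assumptions for illustration.

```c
#include <stdint.h>

/* Illustrative stand-in for struct rte_regex_ops; only the field under
 * discussion is modelled, and the union layout is an assumption. */
struct regex_op_sketch {
	union {
		uint64_t user_id; /* application-defined identifier */
		void *user_ptr;   /* or an application-defined pointer, e.g. an mbuf */
	};
};

/* Before enqueue: tag the op with the application's sequence number. */
static void
op_tag_seq(struct regex_op_sketch *op, uint64_t seq)
{
	op->user_id = seq;
}

/* After dequeue: read the tag back to pair the response with its request. */
static uint64_t
op_seq(const struct regex_op_sketch *op)
{
	return op->user_id;
}
```

The device never interprets the field, so the application is free to store whichever of the two forms suits its bookkeeping.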
 

 2) rte_regex_match
      a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t len; /**< Length of match in bytes */
      Looks like the matching offset is defined as *starting matching offset* instead of *end matching offset*, e.g. report the offset of "a" instead of "c" for pattern "abc". 
      If so, this makes it hard to integrate software regex libraries such as Hyperscan and RE2 as they only report *end matching offset* without length of match. 
      Although Hyperscan has API for *starting matching offset*, it only delivers partial syntax support. So I think we have to define *end of matching offset* for software solutions.

[Jerin] I understand the tradeoffs of Hyperscan's HS_FLAG_SOM_LEFTMOST. I thought applications would always need the length of the match.
We will probably see how other HW implementations (from Mellanox etc.) handle this and try to abstract it; we could make it a function of what the user requested.
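
To make the two conventions concrete: for pattern "abc" found in "xxabc", a start-offset report would be offset 2 with len 3, while an end-offset report would be 5 if the end is counted as one past the last matched byte. The sketch below shows that relation; match_sketch is a hypothetical record mirroring only the two quoted fields, the type of offset is assumed, and the "one past the last byte" convention is an assumption as well.

```c
#include <stdint.h>

/* Illustrative match record; mirrors only the two fields quoted above. */
struct match_sketch {
	uint16_t offset; /* starting byte position of the match (type assumed) */
	uint16_t len;    /* length of the match in bytes */
};

/* End offset, taken here as one past the last matched byte (assumption). */
static uint32_t
match_end_offset(const struct match_sketch *m)
{
	return (uint32_t)m->offset + m->len;
}
```

The reverse direction is the hard one: an engine that reports only the end offset cannot recover the start without also tracking the match length.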

3)  rte_regex_rule_db_update()
    Does this mean we can dynamically add or delete rules for an already generated database without recompile from scratch for hardware Regex implementation? 
    If so, this isn't possible for software solutions as they don't support dynamic database update and require recompile. 

[Jerin] rte_regex_rule_db_update() would internally call the recompile function for both HW and SW.
See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for the precompiled rule database case.

4) rte_regex_rule_db_import() and rte_regex_rule_db_export()
     What's the expected behavior for import and export operations? Will we create another copy of database when calling them? 

[Jerin] Whether a copy is required is implementation defined. Marvell's HW implementation has a centralized rule database
per device.

Thanks,
Xiang

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, August 15, 2019 5:04 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Pavan Nikhilesh
> Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun
> Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though some vendors offer RegEx HW offload, the lack of a
> standard API makes it difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation is available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW RegEx API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds
> 
> Requesting review from HW and SW RegEx vendors and RegEx application
> users in order to have a portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier is
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It is possible that a given RegEx device does not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the addresses of the fast-path functions of the
> RegEx driver are not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update(). Needs to be invoked if a precompiled rule
> database is not provided in rte_regex_dev_config::rule_db for
> rte_regex_dev_configure() and/or the application needs to update the rule
> database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in device specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier is
> + * provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * It is possible that a given RegEx device does not support all the features
> + * of PCRE. The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule database is not
> + *   provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
> + *   and/or the application needs to update the rule database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device does support compiling the rules at runtime unlike
> + * loading only the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c' if the given patterns are 'abc' and 'abcc' then
> + * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
> + * atomic groups don't allow backtracking back to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using '(?C)'.
> + * Example RegEx 'ABC(?C)D' if a given pattern is 'ABCD' then the RegEx engine
> + * will parse ABC, perform a user-defined callout and return a successful match at
> + * D.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
> + * matched by the 2nd capturing group i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the pattern 'AB12345' will be matched completely
> + * whereas in the ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> + * the given subject is 'dwad1234!', the RegEx engine doesn't report any
> + * matches because the assert '(?=!{3,})' fails. The subject 'dwad123!!!'
> + * would return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123',
> + * then even though the entire subject matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx '(foo|fuzz)\g<1>+bar' matches the
> + * subject 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that
> + * appears later in the RegEx. Example RegEx '(\3ABC|(DEF|(GHI)))+' matches
> + * the following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
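
A note on how I read these bits: the sense is inverted (a set bit means the feature is missing), so an application would presumably AND pcre_unsup_flags against the features its rule set relies on before loading it. A minimal sketch under that assumption; the three macros are local copies of values quoted above, and regex_features_usable() is a hypothetical helper, not part of the proposed API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of three flag values from the quoted patch. */
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F   (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F   (1ULL << 15)

/* Hypothetical helper: true when none of the PCRE features named in
 * 'needed' is reported as unsupported by the device. 'needed' is built
 * by the application from the same UNSUP_* bit positions. */
static bool
regex_features_usable(uint64_t pcre_unsup_flags, uint64_t needed)
{
	return (pcre_unsup_flags & needed) == 0;
}
```

If that reading is wrong and the application is expected to probe by trying rte_regex_rule_db_update() instead, it would be good to say so in the comment.
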
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty
> + * string, such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and
> + * lower case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not
> + * be unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group
> + * matches an empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences
> + * or recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern
> + * and the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with
> + *   the contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx
> + *     device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect matches that occur
> + * across buffer boundaries, where the buffers are related to each other in
> + * some way. Enable this flag when the payload to scan is larger than
> + * struct rte_regex_dev_info::max_payload_size and/or matches can be present
> + * across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import initial set of prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
> +
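
To check my understanding of the limits and the "0 means default" rules in the field comments above, here is how I would expect a driver (or the common layer) to resolve a requested configuration against the reported limits. This is only a sketch: the two structures are trimmed local stand-ins for rte_regex_dev_info and rte_regex_dev_config, regex_cfg_resolve() is a hypothetical name, and returning -EINVAL is my assumption since the patch only promises a negative value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Trimmed stand-ins for the quoted structures; only the fields this
 * sketch needs are reproduced. */
struct dev_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

struct dev_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/* Hypothetical helper: reject requests above the device limits, then
 * apply the documented defaults for fields left at zero. */
static int
regex_cfg_resolve(const struct dev_limits *lim, struct dev_request *req)
{
	if (req->nb_max_matches > lim->max_matches ||
	    req->nb_queue_pairs > lim->max_queue_pairs ||
	    req->nb_rules_per_group > lim->max_rules_per_group ||
	    req->nb_groups > lim->max_groups)
		return -EINVAL; /* assumed error code */

	if (req->nb_max_matches == 0)
		req->nb_max_matches = 1;
	if (req->nb_rules_per_group == 0)
		req->nb_rules_per_group = lim->max_rules_per_group;
	return 0;
}
```

The comments do not say what nb_queue_pairs == 0 or nb_groups == 0 mean, so the sketch leaves them untouched.
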
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capability of each
> + * resource available for this regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to set up. The value must be in the
> + *   range [0, nb_queue_pairs - 1] previously supplied to
> + *   rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue
> + *   pair. NULL value is allowed, in which case the default configuration is
> + *   used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 on successfully closing the device.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefix detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied by the
> + *                   application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
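
Unlike pcre_unsup_flags, rte_regex_dev_info::rule_flags is a positive mask (set bit means supported), so the check before rte_regex_rule_db_update() goes the other way round. A small sketch of what "provide only supported rule flags" would look like on the application side; the macros are local copies of three values from the patch and regex_rule_flags_rejected() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of three rule flag values from the quoted patch. */
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_DOTALL_F   (1ULL << 3)
#define RTE_REGEX_PCRE_RULE_UTF_F      (1ULL << 11)

/* Hypothetical helper: returns the subset of 'rule_flags' the device
 * does not advertise in rte_regex_dev_info::rule_flags. Zero means the
 * rule only asks for supported flags. */
static uint64_t
regex_rule_flags_rejected(uint64_t dev_rule_flags, uint64_t rule_flags)
{
	return rule_flags & ~dev_rule_flags;
}
```

Having one mask with "supported" polarity and one with "unsupported" polarity in the same info structure is easy to get wrong, so it may be worth calling that out in the Doxygen text.
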
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> + *   structure which contain the regex rule attributes to be updated in the
> + *   rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed, the caller has to take
> + *   care of them, and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database into. Its capacity must be
> + *   at least the required size. If set to NULL, the function returns the
> + *   required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert ids and names into. Its capacity must be at
> + *   least the required size. If set to NULL, the function returns the
> + *   required capacity.
> + * @return
> + *   - positive value on success:
> + *        - The return value is the number of entries filled in the stats map.
> + *        - If xstats_map is set to NULL, the required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be obtained from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
> + *   will be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is
> + *   NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match
> + * attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
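
One thing I noticed while reading this layout: struct rte_regex_rule declares rule_id as uint32_t and group_id as uint16_t, but the match tuple only carries 20 and 12 bits for them. If I read it correctly, identifiers above those widths could not be reported back unambiguously. A sketch of the range check an application or the rule update path might want; the bit-width macros and regex_ids_fit_match() are hypothetical names introduced here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit widths taken from the rule_id:20 / group_id:12 fields of the
 * quoted struct rte_regex_match. Macro names are local to this sketch. */
#define MATCH_RULE_ID_BITS  20
#define MATCH_GROUP_ID_BITS 12

/* Hypothetical helper: true when both identifiers can be represented in
 * the match tuple without truncation. */
static bool
regex_ids_fit_match(uint32_t rule_id, uint16_t group_id)
{
	return rule_id < (1u << MATCH_RULE_ID_BITS) &&
	       group_id < (1u << MATCH_GROUP_ID_BITS);
}
```

Either the field comments in rte_regex_rule should state the usable range, or rte_regex_dev_info::max_groups and the rule id space should be bounded accordingly.
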
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan
> + * to reduce the post-processing required by the application. The match with
> + * the lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
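
For the GROUP_IDn_VALID flags, I assume the intent is that group_id0 is always valid and the application fills group_id1..group_id3 in order, setting one flag per extra group. A sketch of building req_flags under that assumption; the macros are local copies of the quoted values and regex_req_group_flags() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the group-valid request flags from the quoted patch. */
#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)

/* Hypothetical helper: request flags for an op that scans against
 * nb_groups groups (1..4), assuming groups are filled in order starting
 * at group_id0, which needs no flag. */
static uint16_t
regex_req_group_flags(unsigned int nb_groups)
{
	uint16_t flags = 0;

	assert(nb_groups >= 1 && nb_groups <= 4);
	if (nb_groups > 1)
		flags |= RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F;
	if (nb_groups > 2)
		flags |= RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F;
	if (nb_groups > 3)
		flags |= RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F;
	return flags;
}
```

If sparse combinations (for example group_id0 and group_id3 only) are meant to be legal, the comments should say so. Bit 3 is skipped between GROUP_ID3_VALID_F and STOP_ON_MATCH_F; is that intentional?
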
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix
> + * length while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
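
On the dequeue side I would expect applications to split these response bits into two classes: "the scan was cut short" (timeout, max matches, max prefix) and "a match may continue in the next buffer" (PMI_EOJ). A sketch of that classification, assuming local copies of the quoted macros; both helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the response flags from the quoted patch. */
#define RTE_REGEX_OPS_RSP_PMI_SOJ_F          (1 << 0)
#define RTE_REGEX_OPS_RSP_PMI_EOJ_F          (1 << 1)
#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
#define RTE_REGEX_OPS_RSP_MAX_MATCH_F        (1 << 3)
#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F       (1 << 4)

/* Hypothetical helper: true when the device hit one of its limits, so
 * the reported matches may be incomplete. */
static bool
regex_rsp_truncated(uint16_t rsp_flags)
{
	return (rsp_flags & (RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F |
			     RTE_REGEX_OPS_RSP_MAX_MATCH_F |
			     RTE_REGEX_OPS_RSP_MAX_PREFIX_F)) != 0;
}

/* Hypothetical helper: true when a partial match was seen at the end of
 * the buffer, which only matters for cross buffer scans. */
static bool
regex_rsp_partial_at_end(uint16_t rsp_flags)
{
	return (rsp_flags & RTE_REGEX_OPS_RSP_PMI_EOJ_F) != 0;
}
```
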
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the RegEx device. */
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of the *rte_regex_ops::matches* zero length array will
> +	 * be this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. At least one group id
> +	 * must be provided by the application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1
> +	 * is valid; similar flags apply to group_id2 and group_id3.
> +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> +	 * with the matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may
> +		 * use this field to hold an application specific value to
> +		 * share between the enqueue and dequeue operations.
> +		 * The implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process, which
> + * are supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued
> + *   for processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
> + *   structures which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The
> + *   return value can be less than the value of the *nb_ops* parameter when
> + *   the regex device's queue is full or if invalid parameters are specified
> + *   in a *rte_regex_ops*. If the return value is less than *nb_ops*, the
> + *   remaining ops at the end of *ops* are not consumed and the caller has
> + *   to take care of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data
> + * structures effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible"
> + * policy can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed
> + *   operations. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that
> + *   must be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are not filled in.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
> 
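
For readers following the enqueue/dequeue contract quoted above, here is a minimal sketch of how a caller might handle a short enqueue and then drain completions. It does not use the proposed header: ops_stub, qp_stub, stub_enqueue_burst(), stub_dequeue_burst() and submit_all() are all hypothetical stand-ins, and the stub "device" completes ops immediately in FIFO order, which a real device need not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_regex_ops; only user_id is kept. */
struct ops_stub {
	uint64_t user_id;
};

/* Hypothetical stand-in for one queue pair: a tiny FIFO of fixed depth. */
#define QP_DEPTH 4
struct qp_stub {
	struct ops_stub *slots[QP_DEPTH];
	uint16_t count;
};

/* Mimics the documented enqueue contract: accept as many ops as fit and
 * return that number; ops beyond it are left for the caller to handle. */
static uint16_t
stub_enqueue_burst(struct qp_stub *qp, struct ops_stub **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < QP_DEPTH)
		qp->slots[qp->count++] = ops[n++];
	return n;
}

/* Mimics the documented dequeue contract: return up to nb_ops ops. */
static uint16_t
stub_dequeue_burst(struct qp_stub *qp, struct ops_stub **ops, uint16_t nb_ops)
{
	uint16_t n = qp->count < nb_ops ? qp->count : nb_ops;
	uint16_t i;

	for (i = 0; i < n; i++)
		ops[i] = qp->slots[i];
	for (i = n; i < qp->count; i++)
		qp->slots[i - n] = qp->slots[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/* Submit all nb_ops ops, dequeuing whenever the queue is full, then keep
 * dequeuing until a burst comes back short. Returns the number of
 * completions written to done[], which must hold nb_ops pointers. */
static uint16_t
submit_all(struct qp_stub *qp, struct ops_stub **ops, uint16_t nb_ops,
	   struct ops_stub **done)
{
	const uint16_t burst = 2;
	uint16_t sent = 0, got = 0;

	assert(qp != NULL && ops != NULL && done != NULL);
	while (sent < nb_ops) {
		uint16_t n = stub_enqueue_burst(qp, ops + sent,
						(uint16_t)(nb_ops - sent));

		sent = (uint16_t)(sent + n);
		if (n == 0) /* queue full: make room before retrying */
			got = (uint16_t)(got + stub_dequeue_burst(qp, done + got,
								  QP_DEPTH));
	}
	for (;;) {
		uint16_t n = stub_dequeue_burst(qp, done + got, burst);

		got = (uint16_t)(got + n);
		if (n < burst) /* short burst: queue is drained */
			break;
	}
	return got;
}
```

The point of the sketch is only the control flow: resume from ops + sent after a short enqueue, and treat a full-size dequeue burst as a hint that more completions may be waiting.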


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
@ 2019-08-20  1:54       ` Wang, Xiang W
  2019-09-10  8:05         ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 62+ messages in thread
From: Wang, Xiang W @ 2019-08-20  1:54 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon, dev
  Cc: Pavan Nikhilesh Bhagavatula, Shahaf Shuler, Hemant Agrawal,
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	Nipun Gupta, Richardson, Bruce, Hong, Yang A, Chang, Harry,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim, Ni,
	Hongjun, j.bromhead, deri, fc, arthur.su, Guy Kaneti,
	Smadar Fuks, Liron Himi

Thanks Jerin. Comments inline.

-----Original Message-----
From: Jerin Jacob Kollanukkaran [mailto:jerinj@marvell.com] 
Sent: Monday, August 19, 2019 11:09 AM
To: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org
Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; Hong, Yang A <yang.a.hong@intel.com>; Chang, Harry <harry.chang@intel.com>; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com; fanchenggang@sunyainfo.com; davidfgao@tencent.com; liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com; Ni, Hongjun <hongjun.ni@intel.com>; j.bromhead@titan-ic.com; deri@ntop.org; fc@napatech.com; arthur.su@lionic.com; Guy Kaneti <guyk@marvell.com>; Smadar Fuks <smadarf@marvell.com>; Liron Himi <lironh@marvell.com>
Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem

Reply to Xiang's queries in main thread:

Hi all,

Some questions regarding APIs. Could you please give more insights?

1) rte_regex_ops
      a) rsp_flags
      These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
      RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match at the end of current buffer after scan.
      What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?

[Jerin] We need three states to represent a partial match across buffers: RTE_REGEX_OPS_RSP_PMI_SOJ_F
represents the start of the buffer, intermediate buffers carry no flag, and RTE_REGEX_OPS_RSP_PMI_EOJ_F represents the end of the buffer.
[Xiang] How could a user leverage these flags for matching? Suppose a large buffer is divided into multiple chunks. Will the absence of RTE_REGEX_OPS_RSP_PMI_SOJ_F after scanning the first chunk cause an early quit? Similarly, does RTE_REGEX_OPS_RSP_PMI_EOJ_F tell a user whether to stop matching future buffers after finishing the last chunk?

      RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a specific hardware implementation. I am wondering what this PREFIX refers to:)?

[Jerin] Yes, it looks like a hardware-specific implementation detail. We introduced the rte_regex_dev_attr_set/get functions to keep the API portable and
to allow new implementation-specific fields to be added.
For example, if a rule is
/ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the factor. The prefix is a literal
string, while the factor can contain complex regular expression constructs. As a result, rule matching occurs in
two stages: prefix matching and factor matching.
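
To make the prefix/factor split concrete, here is a small software sketch of the two stages for the example rule /ABCDEF.*XYZ/. It is only an illustration of the idea described above, not how any particular device implements it; factor_matches() and two_stage_scan() are hypothetical names, the factor matcher is hand-written for this one rule, and '.' is allowed to match any byte, which simplifies PCRE's default behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stage 2 (hypothetical): match the factor EF.*XYZ, anchored at the byte
 * right after the prefix. Hand-written for this one rule. */
static int
factor_matches(const char *p)
{
	assert(p != NULL);
	if (strncmp(p, "EF", 2) != 0)
		return 0;
	return strstr(p + 2, "XYZ") != NULL;
}

/* Stage 1: literal search for the prefix ABCD; each hit is handed to
 * stage 2. Returns the start offset of the first full match, or -1. */
static long
two_stage_scan(const char *buf)
{
	const char *hit = buf;

	while ((hit = strstr(hit, "ABCD")) != NULL) {
		if (factor_matches(hit + 4))
			return (long)(hit - buf);
		hit++; /* prefix matched but factor did not: keep looking */
	}
	return -1;
}
```

The cheap literal search filters most of the input, so the more expensive factor matching only runs at offsets where the prefix was found.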
 
      b)  user_id or user_ptr
      Under what kind of circumstances should an application pass a value into these variables for enqueue and dequeue operations?

[Jerin] Just like rte_crypto_ops, struct rte_regex_ops is normally allocated from a mempool. On enqueue, the user can set user_id
if needed, in order to identify the op on dequeue. The use case could be to store a sequence number from the application's
point of view, or the pointer to the mbuf on which the pattern match was requested, etc.
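
A minimal sketch of that use case, assuming the W4 union layout from the RFC. ops_stub and pkt_stub are hypothetical stand-ins (pkt_stub plays the role of an mbuf), and tag_op_with_pkt() and pkt_from_op() are illustrative helper names, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the W4 word of struct rte_regex_ops. */
struct ops_stub {
	union {
		uint64_t user_id;
		void *user_ptr;
	};
};

/* Hypothetical application-side packet descriptor (stands in for an mbuf). */
struct pkt_stub {
	uint32_t seq;
};

/* Before enqueue: remember which packet this scan request belongs to. */
static void
tag_op_with_pkt(struct ops_stub *op, struct pkt_stub *pkt)
{
	assert(op != NULL && pkt != NULL);
	op->user_id = 0;    /* clear all 8 bytes; matters on 32-bit targets */
	op->user_ptr = pkt; /* the device is expected to leave this untouched */
}

/* After dequeue: recover the packet from the completed op. */
static struct pkt_stub *
pkt_from_op(const struct ops_stub *op)
{
	return op->user_ptr;
}
```

Because the device must not modify the field, the application needs no side table to map completions back to packets.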
 

 2) rte_regex_match
      a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t len; /**< Length of match in bytes */
      Looks like the matching offset is defined as *starting matching offset* instead of *end matching offset*, e.g. report the offset of "a" instead of "c" for pattern "abc". 
      If so, this makes it hard to integrate software regex libraries such as Hyperscan and RE2 as they only report *end matching offset* without length of match. 
      Although Hyperscan has API for *starting matching offset*, it only delivers partial syntax support. So I think we have to define *end of matching offset* for software solutions.

[Jerin] I understand the tradeoffs of Hyperscan's HS_FLAG_SOM_LEFTMOST. I thought an application would always need the length of the match.
We will probably see how other HW implementations (from Mellanox, etc.) handle this. We will try to abstract it; probably we can make it a function of what the user requested.
[Xiang] Yes, it would be good to make it per user request. At least from a Hyperscan user's point of view, start of match and match length are not mandatory.
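
The asymmetry discussed here can be shown in a few lines. The sketch below assumes an exclusive end offset (the index of the byte just past the match) and uses match_stub as a hypothetical stand-in for struct rte_regex_match; the RFC names the fields offset and len, but the width of offset is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_regex_match. */
struct match_stub {
	uint16_t offset; /* start byte position of the match */
	uint16_t len;    /* length of the match in bytes */
};

/* A start-offset report always yields the end offset... */
static uint32_t
match_end(const struct match_stub *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}

/* ...but the start can only be recovered from an end offset if the
 * engine also knows the match length, which end-only engines do not. */
static uint32_t
match_start(uint32_t end, uint16_t len)
{
	assert(end >= len);
	return end - len;
}
```

For pattern "abc" in the buffer "xxabc", a start-offset engine reports offset 2 and len 3, from which the end offset 5 follows; an engine that reports only the end offset cannot fill in offset or len without extra start-of-match tracking.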

3)  rte_regex_rule_db_update()
    Does this mean we can dynamically add or delete rules in an already generated database without recompiling from scratch for a hardware RegEx implementation?
    If so, this isn't possible for software solutions, as they don't support dynamic database updates and require a recompile.

[Jerin] Internally, rte_regex_rule_db_update() would call the recompile function for both HW and SW.
See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for the precompiled rule database case.
[Xiang] OK, it sounds like we have to save the original rule-set for the device in order to recompile. I see both ADD and REMOVE operators in rte_regex_rule.
For rules with the REMOVE operator, what is the expected behavior for handling them against the old rule-set? Do we need to go through the old rule-set and remove the corresponding rules before recompiling?
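
One possible answer to that question, sketched for a software PMD that keeps the source rule-set around: apply ADD and REMOVE entries to the saved set first, then recompile the result. This is only an assumption about the behavior, not something the RFC specifies. All names below (rule_stub, ruleset_stub, ruleset_apply(), RULE_ADD, RULE_REMOVE, MAX_RULES) are hypothetical, and matching a REMOVE on rule ID plus group ID is a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rule_op_stub { RULE_ADD, RULE_REMOVE }; /* hypothetical operators */

struct rule_stub {
	enum rule_op_stub op;
	uint32_t rule_id;
	uint16_t group_id;
	const char *pcre; /* ignored for RULE_REMOVE */
};

#define MAX_RULES 16

/* Saved, uncompiled rule-set that a recompile would start from. */
struct ruleset_stub {
	struct rule_stub rules[MAX_RULES];
	size_t nb_rules;
};

/* Apply n update entries to the saved set in order.
 * Returns 0 on success, -1 if an ADD does not fit. */
static int
ruleset_apply(struct ruleset_stub *set, const struct rule_stub *upd, size_t n)
{
	assert(set != NULL && upd != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t kept = 0;

		if (upd[i].op == RULE_ADD) {
			if (set->nb_rules == MAX_RULES)
				return -1;
			set->rules[set->nb_rules++] = upd[i];
			continue;
		}
		/* RULE_REMOVE: drop every saved rule with the same IDs. */
		for (size_t j = 0; j < set->nb_rules; j++) {
			const struct rule_stub *r = &set->rules[j];

			if (r->rule_id == upd[i].rule_id &&
			    r->group_id == upd[i].group_id)
				continue;
			set->rules[kept++] = *r;
		}
		set->nb_rules = kept;
	}
	return 0; /* the caller would now recompile set->rules[0..nb_rules) */
}
```

A REMOVE that matches nothing is silently ignored in this sketch; whether that should be an error is exactly the kind of detail the API would need to pin down.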

4) rte_regex_rule_db_import() and rte_regex_rule_db_export()
     What's the expected behavior for import and export operations? Will we create another copy of database when calling them? 

[Jerin] Whether it requires a copy or not is implementation defined. Marvell's HW implementation has a centralized rule database
per device.

Thanks,
Xiang

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, August 15, 2019 5:04 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Pavan Nikhilesh
> Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun
> Kapoor <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Wang, Xiang W <xiang.w.wang@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer RegEx HW offload, due to
> the lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW RegEx API frameworks such as libpcre and
> Hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds
> 
> Request to review from HW and SW RegEx vendors and RegEx application
> users to have a portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and
> ethdev device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts
> them into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in a device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Group of rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It may be possible that a given RegEx device does not support all the
> features of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags.
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue
> pairs. It is the responsibility of the upper level application to enforce
> this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue
> pattern matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the
> reconfiguration before calling rte_regex_dev_start() again. The enqueue
> and dequeue functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the
> *rte_regex_dev* structure by the device init function of the RegEx driver,
> which is invoked during the PCI/SoC device probing phase, as explained
> earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the address of the fast-path functions of the
> RegEx driver is not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching
> requests to the RegEx device and the *dequeue* operation gets a burst of
> pattern matching responses for the ones submitted through the *enqueue*
> operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
> database is not provided in rte_regex_dev_config::rule_db for
> rte_regex_dev_configure() and/or the application needs to update the rule
> database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs
> + * to transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts
> + * them into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion
> + * that would be far too expensive to compute at run-time. A rule database
> + * contains a set of rules that are compiled in a device specific binary
> + * form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Group of rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier
> + * provided at the time of rule creation for the application to identify the
> + * rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * It may be possible that a given RegEx device does not support all the
> + * features of PCRE. The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel
> + * on different logical cores to work on the same target object. For
> + * instance, the dequeue function of a PMD cannot be invoked in parallel on
> + * two logical cores to operate on the same RegEx queue pair. Of course,
> + * this function can be invoked in parallel by different logical cores on
> + * different queue pairs. It is the responsibility of the upper level
> + * application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device
> + * probing phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the
> + * probed device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the
> + * reconfiguration before calling rte_regex_dev_start() again. The enqueue and
> + * dequeue functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the
> + * *rte_regex_dev* structure by the device init function of the RegEx driver,
> + * which is invoked during the PCI/SoC device probing phase, as explained
> + * earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * A typical application will use the RegEx device API in the following
> + * programming flow:
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update(). This needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the rule
> + *   database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device supports compiling the rules at runtime, as opposed to
> + * only loading a pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c'. If the given subject strings are 'abc' and
> + * 'abcc' then 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only
> + * 'abcc' because atomic groups don't allow backtracking to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using
> + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject string is 'ABCD'
> + * then the RegEx engine will parse 'ABC', perform a user-defined callout and
> + * return a successful match at 'D'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
> + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or more
> + * repetitions. In greedy mode the subject string 'AB12345' will be matched
> + * completely, whereas in ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If the
> + * given subject string is 'dwad1234!' the RegEx engine doesn't report any
> + * matches because the assert '(?=!{3,})' fails. The subject string
> + * 'dwad123!!!' would return a successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+'. If the subject string is 'dwad123'
> + * then, even though the entire string matches, only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and it
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
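To make the intended use of pcre_unsup_flags concrete, here is a minimal sketch of how an application might reject a rule up front. It copies three flag values from the quoted header so that it builds without DPDK. The helper rule_is_supported() and the idea that the application keeps a mask of the PCRE features each rule relies on are assumptions for illustration, not part of the proposed API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of three flag values from the quoted header, so that this
 * sketch compiles on its own. */
#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F    (1ULL << 0)
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F   (1ULL << 4)

/*
 * Hypothetical helper: needed_features is a mask, built by the application,
 * of the PCRE features a rule relies on. pcre_unsup_flags is the value the
 * device reports in struct rte_regex_dev_info. The rule is usable only if
 * the two masks do not intersect.
 */
static bool
rule_is_supported(uint64_t needed_features, uint64_t pcre_unsup_flags)
{
	return (needed_features & pcre_unsup_flags) == 0;
}
```

An application could run such a check before calling rte_regex_rule_db_update(), so that unsupported rules are routed to a software fallback instead of failing at update time.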
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern and
> + * the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum number of groups supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag when the payload size to scan is
> + * greater than struct rte_regex_dev_info::max_payload_size and/or
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * which was previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* which was previously
> +	 * provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * which was previously provided by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * which was previously provided by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Import an initial prebuilt rule database on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
> +
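The "cannot exceed" and "0 is allowed" rules documented for rte_regex_dev_config could be sketched as below. The structures dev_limits and dev_request are hypothetical, trimmed-down mirrors of rte_regex_dev_info and rte_regex_dev_config, and resolve_config() is an illustrative helper. Returning -EINVAL for an out-of-range value is my assumption, since the RFC only says the driver returns a negative error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical mirror of the limit fields of struct rte_regex_dev_info. */
struct dev_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Hypothetical mirror of the sizing fields of struct rte_regex_dev_config. */
struct dev_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Check req against lim, then replace the documented zero values with their
 * defaults: nb_max_matches 0 becomes 1, nb_rules_per_group 0 becomes
 * max_rules_per_group. Returns 0 on success or -EINVAL (assumed error code).
 */
static int
resolve_config(struct dev_request *req, const struct dev_limits *lim)
{
	if (req->nb_max_matches > lim->max_matches ||
	    req->nb_queue_pairs > lim->max_queue_pairs ||
	    req->nb_rules_per_group > lim->max_rules_per_group ||
	    req->nb_groups > lim->max_groups)
		return -EINVAL;

	if (req->nb_max_matches == 0)
		req->nb_max_matches = 1;
	if (req->nb_rules_per_group == 0)
		req->nb_rules_per_group = lim->max_rules_per_group;
	return 0;
}
```

Whether this resolution happens in the common library or in each PMD is left open by the RFC; the sketch only shows the arithmetic.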
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities and
> + * resources available for this regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, this scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * The device can be restarted with a call to
> + * rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 on successfully closed the device.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefix detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule. */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rule attributes to be updated in the rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
> +
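Since rte_regex_rule_db_update() may consume fewer rules than requested, the caller needs a policy for the remainder. Below is a rough sketch of one such policy, skipping each rejected rule and resubmitting the rest. The struct rule, fake_db_update() and update_skipping_rejected() are stand-ins invented for this illustration; fake_db_update() only imitates the documented return convention. Real code would also inspect rte_errno, because skipping makes sense for -ENOTSUP but not for -ENOSPC.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_regex_rule; 'supported' marks
 * whether the fake device accepts the rule. */
struct rule {
	uint32_t rule_id;
	int supported;
};

/* Stand-in with the documented return convention: rules are processed in
 * order and the number consumed before the first failure is returned. */
static uint16_t
fake_db_update(const struct rule *rules, uint16_t nb_rules)
{
	uint16_t i;

	for (i = 0; i < nb_rules; i++)
		if (!rules[i].supported)
			break;
	return i;
}

/* Submit all rules, skipping every rule the device rejects. Returns the
 * number of accepted rules and stores the number skipped in *nb_rejected. */
static uint16_t
update_skipping_rejected(const struct rule *rules, uint16_t nb_rules,
			 uint16_t *nb_rejected)
{
	uint16_t done = 0, accepted = 0;

	*nb_rejected = 0;
	while (done < nb_rules) {
		uint16_t n = fake_db_update(rules + done,
					    (uint16_t)(nb_rules - done));

		accepted = (uint16_t)(accepted + n);
		done = (uint16_t)(done + n);
		if (done < nb_rules) {
			/* rules[done] is the first unconsumed rule: skip it. */
			(*nb_rejected)++;
			done++;
		}
	}
	return accepted;
}
```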
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully imported the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
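The export call is documented as a two-step protocol: pass NULL to learn the required capacity, then call again with a buffer. A minimal sketch of the caller side follows. fake_db_export() is a stand-in I wrote to imitate that convention with a fixed payload, and export_rule_db() is a hypothetical wrapper; neither is part of the RFC.

```c
#include <stdlib.h>
#include <string.h>

/* Fixed payload standing in for a device rule database (NUL included). */
static const char fake_db[] = "compiled-rule-db";

/* Stand-in following the documented convention: with rule_db == NULL the
 * required capacity is returned, otherwise the database is copied into
 * rule_db and 0 is returned. */
static int
fake_db_export(char *rule_db)
{
	if (rule_db == NULL)
		return (int)sizeof(fake_db);
	memcpy(rule_db, fake_db, sizeof(fake_db));
	return 0;
}

/* Query the size, allocate, then export. Returns a malloc'd buffer that the
 * caller must free, or NULL on failure. *len receives the buffer size. */
static char *
export_rule_db(int *len)
{
	int size = fake_db_export(NULL);
	char *buf;

	if (size <= 0)
		return NULL;
	buf = malloc((size_t)size);
	if (buf == NULL)
		return NULL;
	if (fake_db_export(buf) != 0) {
		free(buf);
		return NULL;
	}
	*len = size;
	return buf;
}
```

One thing the sketch makes visible is that the caller has no way to tell the function how large its buffer is, so a database that grows between the two calls cannot be detected.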
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
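To make the layout above concrete, here is a small sketch of the 64-bit match tuple. It uses a local copy of the structure (match_mirror, a hypothetical name, with RTE_STD_C11 dropped) so that it compiles without the RFC header; make_match() and match_end() are illustrative helpers only. It assumes the compiler packs the two bit-fields into one 32-bit unit, which is the usual behaviour but is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the RFC's struct rte_regex_match. */
struct match_mirror {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;  /* up to 2^20 - 1 */
			uint32_t group_id:12; /* up to 2^12 - 1 */
			uint16_t offset;
			uint16_t len;
		};
	};
};

/* The whole tuple is expected to fit in one 64-bit word. */
static_assert(sizeof(struct match_mirror) == sizeof(uint64_t),
	      "match tuple must be 8 bytes");

#define MATCH_RULE_ID_MAX  ((1u << 20) - 1)
#define MATCH_GROUP_ID_MAX ((1u << 12) - 1)

/* Build a tuple; ids beyond the bit-field widths are rejected. */
static struct match_mirror
make_match(uint32_t rule_id, uint32_t group_id, uint16_t offset, uint16_t len)
{
	struct match_mirror m;

	assert(rule_id <= MATCH_RULE_ID_MAX);
	assert(group_id <= MATCH_GROUP_ID_MAX);
	m.u64 = 0;
	m.rule_id = rule_id;
	m.group_id = group_id;
	m.offset = offset;
	m.len = len;
	return m;
}

/* First byte position after the match. */
static uint32_t
match_end(const struct match_mirror *m)
{
	return (uint32_t)m->offset + m->len;
}
```

Note that group_id is 12 bits wide here while the group_id0..3 fields of rte_regex_ops are uint16_t, so only group ids up to 4095 can be reported back in a match.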
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. Minimum one group id
> +	 * must be provided by application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
> +	 * is valid, respectively similar flags for group_id2 and group_id3.
> +	 * Upon the match, struct rte_regex_match::group_id shall be updated
> +	 * with matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};
> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
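Since matches[] is a flexible array member, each op has to be allocated with room for the match tuples it may carry. A rough sketch of the size computation follows, again on local mirrors of the structures (the "_mirror" names and both helpers are hypothetical, and rte_iova_t is assumed to be 64 bits wide). In a real application the same size would presumably be the element size of the mempool holding the ops rather than a calloc() argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirrors of the RFC structures so the sketch builds on its own. */
struct match_mirror {
	uint64_t u64;
};

struct ops_mirror {
	uint16_t req_flags;            /* W0 */
	uint16_t scan_size;
	uint16_t rsp_flags;
	uint8_t nb_actual_matches;
	uint8_t nb_matches;
	union {                        /* W1 */
		uint64_t u64;
		void *buf_addr;
	};
	uint64_t buf_iova;             /* W2, rte_iova_t in the RFC */
	uint16_t group_id0;            /* W3 */
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
	union {                        /* W4 */
		uint64_t user_id;
		void *user_ptr;
	};
	struct match_mirror matches[]; /* W5 onwards */
};

/* Five fixed 64-bit words precede the match array. */
static_assert(offsetof(struct ops_mirror, matches) == 5 * sizeof(uint64_t),
	      "W0..W4 must precede the match array");

/* Bytes needed for one op able to report up to max_matches tuples. */
static size_t
ops_mirror_size(uint8_t max_matches)
{
	return sizeof(struct ops_mirror) +
	       (size_t)max_matches * sizeof(struct match_mirror);
}

/* Zeroed op with room for max_matches tuples; caller frees. */
static struct ops_mirror *
ops_mirror_alloc(uint8_t max_matches)
{
	return calloc(1, ops_mirror_size(max_matches));
}
```

Because nb_matches is a uint8_t, an op can report at most 255 tuples, which is why the helpers take a uint8_t.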
> +/**
> + * Enqueue a burst of scan requests on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair on which operations are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex device's queue is full or if invalid parameters are specified in
> + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan responses from a queue on the RegEx device.
> + * The dequeued operations are stored in *rte_regex_ops* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_ops* data structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the device's output queue. Applications
> + * implementing a "retrieve as many processed operations as possible" policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed operations.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_ops* structures that must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
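The burst semantics described above (a short enqueue leaves the tail of *ops* with the caller, dequeue returns fewer than requested when the queue runs dry) could be driven roughly as in the sketch below. The two burst functions are stubbed with a 4-entry in-memory queue so the snippet runs standalone; the stub depth, the one-field struct rte_regex_ops and the scan_all() helper are assumptions for illustration, not RFC content.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in; the RFC structure has many more fields. */
struct rte_regex_ops {
	uint64_t user_id;
};

#define STUB_QP_DEPTH 4
static struct rte_regex_ops *stub_q[STUB_QP_DEPTH];
static uint16_t stub_count;

/* Stub: accepts ops until the queue is full, like a busy device. */
static uint16_t
rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
			struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	(void)dev_id;
	(void)qp_id;
	while (n < nb_ops && stub_count < STUB_QP_DEPTH)
		stub_q[stub_count++] = ops[n++];
	return n;
}

/* Stub: hands back queued ops in FIFO order as if already scanned. */
static uint16_t
rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
			struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = 0, i;

	(void)dev_id;
	(void)qp_id;
	while (n < nb_ops && n < stub_count) {
		ops[n] = stub_q[n];
		n++;
	}
	for (i = n; i < stub_count; i++)
		stub_q[i - n] = stub_q[i];
	stub_count = (uint16_t)(stub_count - n);
	return n;
}

/* Submit every op, resubmitting the unconsumed tail after a short
 * enqueue, and collect responses in done[]. Returns ops completed. */
static uint16_t
scan_all(uint8_t dev_id, uint16_t qp_id, struct rte_regex_ops **ops,
	 uint16_t nb_ops, struct rte_regex_ops **done)
{
	uint16_t sent = 0, got = 0;

	assert(ops != NULL && done != NULL);
	while (got < nb_ops) {
		if (sent < nb_ops)
			sent += rte_regex_enqueue_burst(dev_id, qp_id,
					&ops[sent], (uint16_t)(nb_ops - sent));
		got += rte_regex_dequeue_burst(dev_id, qp_id,
				&done[got], (uint16_t)(nb_ops - got));
	}
	return got;
}
```

With real hardware the loop would also want a timeout or backoff, since nothing here bounds how long a device may hold an op.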
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
> 



* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-15 11:34   ` Thomas Monjalon
  2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
@ 2019-08-21  5:32     ` Shahaf Shuler
  2019-08-21 15:12       ` John Bromhead
                         ` (2 more replies)
  1 sibling, 3 replies; 62+ messages in thread
From: Shahaf Shuler @ 2019-08-21  5:32 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: jerinj, Pavan Nikhilesh, Hemant Agrawal, Opher Reviv,
	Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor, Nipun Gupta, Wang,
	Xiang W, Richardson, Bruce, yang.a.hong, harry.chang, gu.jian1,
	shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su

Hi Jerin,

Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> +Cc more
> 
> ------------
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> The Doxygen generated RFC API documentation is available here:
> https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> 
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> • Next Generation Firewalls (NGFW)
> • Deep Packet and Flow Inspection (DPI)
> • Intrusion Prevention Systems (IPS)
> • DDoS Mitigation
> • Network Monitoring
> • Data Loss Prevention (DLP)
> • Smart NICs
> • Grammar based content processing
> • URL, spam and adware filtering
> • Advanced auditing and policing of user/application security policies
> • Financial data mining - parsing of streamed financial feeds

I think two more important use cases to add (at least in the doc of this subsystem) are:
* application recognition
* memory introspection


> 
> Request to review from HW and SW RegEx vendors and RegEx application users
> to have portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> 
> RTE RegEx Device API
> --------------------
> 
> Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> 
> The RegEx Device API is composed of two parts:
> 
> - The application-oriented RegEx API that includes functions to setup
> a RegEx device (configure it, setup its queue pairs and start it),
> update the rule database and so on.
> 
> - The driver-oriented RegEx API that exports a function allowing
> a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> a RegEx device driver.
> 
> RegEx device components and definitions:
> 
>     +-----------------+
>     |                 |
>     |                 o---------+    rte_regex_[en|de]queue_burst()
>     |   PCRE based    o------+  |               |
>     |  RegEx pattern  |      |  |  +--------+   |
>     | matching engine o------+--+--o        |   |    +------+
>     |                 |      |  |  | queue  |<==o===>|Core 0|
>     |                 o----+ |  |  | pair 0 |        |      |
>     |                 |    | |  |  +--------+        +------+
>     +-----------------+    | |  |
>            ^               | |  |  +--------+
>            |               | |  |  |        |        +------+
>            |               | +--+--o queue  |<======>|Core 1|
>        Rule|Database       |    |  | pair 1 |        |      |
>     +------+----------+    |    |  +--------+        +------+
>     |     Group 0     |    |    |
>     | +-------------+ |    |    |  +--------+        +------+
>     | | Rules 0..n  | |    |    |  |        |        |Core 2|
>     | +-------------+ |    |    +--o queue  |<======>|      |
>     |     Group 1     |    |       | pair 2 |        +------+
>     | +-------------+ |    |       +--------+
>     | | Rules 0..n  | |    |
>     | +-------------+ |    |       +--------+
>     |     Group 2     |    |       |        |        +------+
>     | +-------------+ |    |       | queue  |<======>|Core n|
>     | | Rules 0..n  | |    +-------o pair n |        |      |
>     | +-------------+ |            +--------+        +------+
>     |     Group n     |
>     | +-------------+ |<-------rte_regex_rule_db_update()
>     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
>     | +-------------+ |------->rte_regex_rule_db_export()
>     +-----------------+
> 
> RegEx: A regular expression is a concise and flexible means for matching
> strings of text, such as particular characters, words, or patterns of
> characters. A common abbreviation for this is “RegEx”.
> 
> RegEx device: A hardware or software-based implementation of RegEx
> device API for PCRE based pattern matching syntax and semantics.
> 
> PCRE RegEx syntax and semantics specification:
> http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> 
> RegEx queue pair: Each RegEx device should have one or more queue pairs to
> transmit a burst of pattern matching requests and receive a burst of
> pattern matching responses. The pattern matching request/response is
> embedded in the *rte_regex_ops* structure.
> 
> Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> Match ID and Group ID to identify the rule upon the match.
> 
> Rule database: The RegEx device accepts regular expressions and converts them
> into a compiled rule database that can then be used to scan data.
> Compilation allows the device to analyze the given pattern(s) and
> pre-determine how to scan for these patterns in an optimized fashion that
> would be far too expensive to compute at run-time. A rule database contains
> a set of rules that are compiled in device specific binary form.
> 
> Match ID or Rule ID: A unique identifier provided at the time of rule
> creation for the application to identify the rule upon match.
> 
> Group ID: Group of rules can be grouped under one group ID to enable
> rule isolation and effective pattern matching. A unique group identifier
> provided at the time of rule creation for the application to identify the
> rule upon match.
> 
> Scan: A pattern matching request through *enqueue* API.
> 
> It is possible that a given RegEx device may not support all the features
> of PCRE. The application may probe unsupported features through
> struct rte_regex_dev_info::pcre_unsup_flags
> 
> By default, all the functions of the RegEx Device API exported by a PMD
> are lock-free functions which are assumed not to be invoked in parallel on
> different logical cores to work on the same target object. For instance,
> the dequeue function of a PMD cannot be invoked in parallel on two logical
> cores to operate on the same RegEx queue pair. Of course, this function
> can be invoked in parallel by different logical cores on different queue pairs.
> It is the responsibility of the upper level application to enforce this rule.
> 
> In all functions of the RegEx API, the RegEx device is
> designated by an integer >= 0 named the device identifier *dev_id*
> 
> At the RegEx driver level, RegEx devices are represented by a generic
> data structure of type *rte_regex_dev*.
> 
> RegEx devices are dynamically registered during the PCI/SoC device probing
> phase performed at EAL initialization time.
> When a RegEx device is being probed, a *rte_regex_dev* structure and
> a new device identifier are allocated for that device. Then, the
> regex_dev_init() function supplied by the RegEx driver matching the probed
> device is invoked to properly initialize the device.
> 
> The role of the device init function consists of resetting the hardware or
> software RegEx driver implementations.
> 
> If the device init operation is successful, the correspondence between
> the device identifier assigned to the new device and its associated
> *rte_regex_dev* structure is effectively registered.
> Otherwise, both the *rte_regex_dev* structure and the device identifier are
> freed.
> 
> The functions exported by the application RegEx API to setup a device
> designated by its device identifier must be invoked in the following order:
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_dev_start()
> 
> Then, the application can invoke, in any order, the functions
> exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> matching responses, get the stats, update the rule database,
> get/set device attributes and so on.
> 
> If the application wants to change the configuration (i.e. call
> rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> before calling rte_regex_dev_start() again. The enqueue and dequeue
> functions should not be invoked when the device is stopped.
> 
> Finally, an application can close a RegEx device by invoking the
> rte_regex_dev_close() function.
> 
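The ordering rules in the paragraphs above can be read as a small state machine. The sketch below is only a model of those rules: every name in it is hypothetical, the error codes are my own choice, and requiring queue pair setup again after a reconfigure is an assumption the text does not spell out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the documented call order; not RFC code. */
enum model_state {
	MODEL_PROBED,
	MODEL_CONFIGURED,
	MODEL_QP_SETUP,
	MODEL_STARTED,
	MODEL_CLOSED
};

struct regex_dev_model {
	enum model_state state;
};

/* configure: allowed any time the device is neither running nor closed. */
static int
model_configure(struct regex_dev_model *d)
{
	assert(d != NULL);
	if (d->state == MODEL_STARTED || d->state == MODEL_CLOSED)
		return -EBUSY;
	d->state = MODEL_CONFIGURED;
	return 0;
}

/* queue pair setup: only after configure, may be repeated. */
static int
model_queue_pair_setup(struct regex_dev_model *d)
{
	if (d->state != MODEL_CONFIGURED && d->state != MODEL_QP_SETUP)
		return -EINVAL;
	d->state = MODEL_QP_SETUP;
	return 0;
}

/* start: only once queue pairs are set up. */
static int
model_start(struct regex_dev_model *d)
{
	if (d->state != MODEL_QP_SETUP)
		return -EINVAL;
	d->state = MODEL_STARTED;
	return 0;
}

/* stop: returns to the state where reconfiguration is allowed. */
static int
model_stop(struct regex_dev_model *d)
{
	if (d->state != MODEL_STARTED)
		return -EINVAL;
	d->state = MODEL_QP_SETUP;
	return 0;
}

/* enqueue/dequeue: only valid while the device is started. */
static int
model_fast_path_allowed(const struct regex_dev_model *d)
{
	return d->state == MODEL_STARTED ? 0 : -EINVAL;
}

/* close: ends the device's life in this model. */
static int
model_close(struct regex_dev_model *d)
{
	if (d->state == MODEL_CLOSED)
		return -EINVAL;
	d->state = MODEL_CLOSED;
	return 0;
}
```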
> Each function of the application RegEx API invokes a specific function
> of the PMD that controls the target device designated by its device
> identifier.
> 
> For this purpose, all device-specific functions of a RegEx driver are
> supplied through a set of pointers contained in a generic structure of type
> *regex_dev_ops*.
> The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> structure by the device init function of the RegEx driver, which is
> invoked during the PCI/SoC device probing phase, as explained earlier.
> 
> In other words, each function of the RegEx API simply retrieves the
> *rte_regex_dev* structure associated with the device identifier and
> performs an indirect invocation of the corresponding driver function
> supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> structure.
> 
> For performance reasons, the address of the fast-path functions of the
> RegEx driver is not contained in the *regex_dev_ops* structure.
> Instead, they are directly stored at the beginning of the *rte_regex_dev*
> structure to avoid an extra indirect memory access during their invocation.
> 
> RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> functions to applications.
> 
> The *enqueue* operation submits a burst of RegEx pattern matching requests
> to the RegEx device and the *dequeue* operation gets a burst of pattern
> matching responses for the ones submitted through the *enqueue* operation.
> 
> Typical application utilisation of the RegEx device API will follow the
> following programming flow.
> 
> - rte_regex_dev_configure()
> - rte_regex_queue_pair_setup()
> - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule database
> is not provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
> and/or the application needs to update the rule database.
> - Create or reuse an existing mempool for *rte_regex_ops* objects.
> - rte_regex_dev_start()
> - rte_regex_enqueue_burst()
> - rte_regex_dequeue_burst()
> 
> ---
> 
> config/common_base                 |    5 +
> doc/api/doxy-api-index.md          |    1 +
> doc/api/doxy-api.conf.in           |    1 +
> lib/Makefile                       |    2 +
> lib/librte_regexdev/Makefile       |   23 +
> lib/librte_regexdev/rte_regexdev.c |    5 +
> lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> 7 files changed, 1284 insertions(+)
> create mode 100644 lib/librte_regexdev/Makefile
> create mode 100644 lib/librte_regexdev/rte_regexdev.c
> create mode 100644 lib/librte_regexdev/rte_regexdev.h
> 
> diff --git a/config/common_base b/config/common_base
> index e406e7836..986093d6e 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> #
> CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> 
> +#
> +# Compile regex device support
> +#
> +CONFIG_RTE_LIBRTE_REGEXDEV=y
> +
> #
> # Compile librte_ring
> #
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 715248dd1..a0bc27ae4 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> [rawdev]             (@ref rte_rawdev.h),
> +  [regexdev]           (@ref rte_regexdev.h),
> [metrics]            (@ref rte_metrics.h),
> [bitrate]            (@ref rte_bitrate.h),
> [latency]            (@ref rte_latencystats.h),
> diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> index b9896cb63..7adb821bb 100644
> --- a/doc/api/doxy-api.conf.in
> +++ b/doc/api/doxy-api.conf.in
> @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> @TOPDIR@/lib/librte_rawdev \
> @TOPDIR@/lib/librte_rcu \
> @TOPDIR@/lib/librte_reorder \
> +                          @TOPDIR@/lib/librte_regexdev \
> @TOPDIR@/lib/librte_ring \
> @TOPDIR@/lib/librte_sched \
> @TOPDIR@/lib/librte_security \
> diff --git a/lib/Makefile b/lib/Makefile
> index 791e0d991..57de9691a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> librte_mempool librte_timer librte_cryptodev
> DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> +DEPDIRS-librte_regexdev := librte_eal
> DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> 			librte_net
> diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> new file mode 100644
> index 000000000..723b4b28c
> --- /dev/null
> +++ b/lib/librte_regexdev/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(C) 2019 Marvell International Ltd.
> +#
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_regexdev.a
> +
> +# library version
> +LIBABIVER := 1
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +# library source files
> +SRCS-y += rte_regexdev.c
> +
> +# export include files
> +SYMLINK-y-include += rte_regexdev.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> new file mode 100644
> index 000000000..e5be0f29c
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.c
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#include <rte_regexdev.h>
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 000000000..765da4aaa
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1247 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_
> +
> +/**
> + * @file
> + *
> + * RTE RegEx Device API
> + *
> + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> + *
> + * The RegEx Device API is composed of two parts:
> + *
> + * - The application-oriented RegEx API that includes functions to setup
> + *   a RegEx device (configure it, setup its queue pairs and start it),
> + *   update the rule database and so on.
> + *
> + * - The driver-oriented RegEx API that exports a function allowing
> + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> + *   a RegEx device driver.
> + *
> + * RegEx device components and definitions:
> + *
> + *     +-----------------+
> + *     |                 |
> + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> + *     |   PCRE based    o------+  |               |
> + *     |  RegEx pattern  |      |  |  +--------+   |
> + *     | matching engine o------+--+--o        |   |    +------+
> + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> + *     |                 o----+ |  |  | pair 0 |        |      |
> + *     |                 |    | |  |  +--------+        +------+
> + *     +-----------------+    | |  |
> + *            ^               | |  |  +--------+
> + *            |               | |  |  |        |        +------+
> + *            |               | +--+--o queue  |<======>|Core 1|
> + *        Rule|Database       |    |  | pair 1 |        |      |
> + *     +------+----------+    |    |  +--------+        +------+
> + *     |     Group 0     |    |    |
> + *     | +-------------+ |    |    |  +--------+        +------+
> + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> + *     | +-------------+ |    |    +--o queue  |<======>|      |
> + *     |     Group 1     |    |       | pair 2 |        +------+
> + *     | +-------------+ |    |       +--------+
> + *     | | Rules 0..n  | |    |
> + *     | +-------------+ |    |       +--------+
> + *     |     Group 2     |    |       |        |        +------+
> + *     | +-------------+ |    |       | queue  |<======>|Core n|
> + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> + *     | +-------------+ |            +--------+        +------+
> + *     |     Group n     |
> + *     | +-------------+ |<-------rte_regex_rule_db_update()
> + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> + *     | +-------------+ |------->rte_regex_rule_db_export()
> + *     +-----------------+
> + *
> + * RegEx: A regular expression is a concise and flexible means for matching
> + * strings of text, such as particular characters, words, or patterns of
> + * characters. A common abbreviation for this is “RegEx”.
> + *
> + * RegEx device: A hardware or software-based implementation of RegEx
> + * device API for PCRE based pattern matching syntax and semantics.
> + *
> + * PCRE RegEx syntax and semantics specification:
> + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> + *
> + * RegEx queue pair: Each RegEx device should have one or more queue pairs to
> + * transmit a burst of pattern matching requests and receive a burst of
> + * pattern matching responses. The pattern matching request/response is
> + * embedded in the *rte_regex_ops* structure.
> + *
> + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> + * Match ID and Group ID to identify the rule upon the match.
> + *
> + * Rule database: The RegEx device accepts regular expressions and converts them
> + * into a compiled rule database that can then be used to scan data.
> + * Compilation allows the device to analyze the given pattern(s) and
> + * pre-determine how to scan for these patterns in an optimized fashion that
> + * would be far too expensive to compute at run-time. A rule database contains
> + * a set of rules that are compiled in device specific binary form.
> + *
> + * Match ID or Rule ID: A unique identifier provided at the time of rule
> + * creation for the application to identify the rule upon match.
> + *
> + * Group ID: Rules can be grouped under one group ID to enable
> + * rule isolation and effective pattern matching. A unique group identifier
> + * is provided at the time of rule creation for the application to identify
> + * the rule upon match.
> + *
> + * Scan: A pattern matching request through *enqueue* API.
> + *
> + * A given RegEx device may not support all the features of PCRE.
> + * The application may probe unsupported features through
> + * struct rte_regex_dev_info::pcre_unsup_flags.
> + *
> + * By default, all the functions of the RegEx Device API exported by a PMD
> + * are lock-free functions which are assumed not to be invoked in parallel on
> + * different logical cores to work on the same target object. For instance,
> + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> + * cores to operate on the same RegEx queue pair. Of course, this function
> + * can be invoked in parallel by different logical cores on different queue
> + * pairs.
> + * It is the responsibility of the upper level application to enforce this rule.
> + *
> + * In all functions of the RegEx API, the RegEx device is
> + * designated by an integer >= 0 named the device identifier *dev_id*
> + *
> + * At the RegEx driver level, RegEx devices are represented by a generic
> + * data structure of type *rte_regex_dev*.
> + *
> + * RegEx devices are dynamically registered during the PCI/SoC device probing
> + * phase performed at EAL initialization time.
> + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> + * a new device identifier are allocated for that device. Then, the
> + * regex_dev_init() function supplied by the RegEx driver matching the probed
> + * device is invoked to properly initialize the device.
> + *
> + * The role of the device init function consists of resetting the hardware or
> + * software RegEx driver implementations.
> + *
> + * If the device init operation is successful, the correspondence between
> + * the device identifier assigned to the new device and its associated
> + * *rte_regex_dev* structure is effectively registered.
> + * Otherwise, both the *rte_regex_dev* structure and the device identifier are
> + * freed.
> + *
> + * The functions exported by the application RegEx API to setup a device
> + * designated by its device identifier must be invoked in the following order:
> + *     - rte_regex_dev_configure()
> + *     - rte_regex_queue_pair_setup()
> + *     - rte_regex_dev_start()
> + *
> + * Then, the application can invoke, in any order, the functions
> + * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
> + * matching responses, get the stats, update the rule database,
> + * get/set device attributes and so on.
> + *
> + * If the application wants to change the configuration (i.e. call
> + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> + * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
> + * before calling rte_regex_dev_start() again. The enqueue and dequeue
> + * functions should not be invoked when the device is stopped.
> + *
> + * Finally, an application can close a RegEx device by invoking the
> + * rte_regex_dev_close() function.
> + *
> + * Each function of the application RegEx API invokes a specific function
> + * of the PMD that controls the target device designated by its device
> + * identifier.
> + *
> + * For this purpose, all device-specific functions of a RegEx driver are
> + * supplied through a set of pointers contained in a generic structure of type
> + * *regex_dev_ops*.
> + * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
> + * structure by the device init function of the RegEx driver, which is
> + * invoked during the PCI/SoC device probing phase, as explained earlier.
> + *
> + * In other words, each function of the RegEx API simply retrieves the
> + * *rte_regex_dev* structure associated with the device identifier and
> + * performs an indirect invocation of the corresponding driver function
> + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
> + *
> + * For performance reasons, the addresses of the fast-path functions of the
> + * RegEx driver are not contained in the *regex_dev_ops* structure.
> + * Instead, they are directly stored at the beginning of the *rte_regex_dev*
> + * structure to avoid an extra indirect memory access during their invocation.
> + *
> + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> + * functions to applications.
> + *
> + * The *enqueue* operation submits a burst of RegEx pattern matching requests
> + * to the RegEx device and the *dequeue* operation gets a burst of pattern
> + * matching responses for the ones submitted through the *enqueue* operation.
> + *
> + * Typical application utilisation of the RegEx device API will follow the
> + * following programming flow.
> + *
> + * - rte_regex_dev_configure()
> + * - rte_regex_queue_pair_setup()
> + * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
> + *   database is not provided in rte_regex_dev_config::rule_db for
> + *   rte_regex_dev_configure() and/or the application needs to update the
> + *   rule database.
> + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> + * - rte_regex_dev_start()
> + * - rte_regex_enqueue_burst()
> + * - rte_regex_dequeue_burst()
> + *
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_config.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_memory.h>
> +
> +/**
> + * Get the total number of RegEx devices that have been successfully
> + * initialised.
> + *
> + * @return
> + *   The total number of usable RegEx devices.
> + */
> +uint8_t
> +rte_regex_dev_count(void);
> +
> +/**
> + * Get the device identifier for the named RegEx device.
> + *
> + * @param name
> + *   RegEx device name to select the RegEx device identifier.
> + *
> + * @return
> + *   Returns RegEx device identifier on success.
> + *   - <0: Failure to find named RegEx device.
> + */
> +int
> +rte_regex_dev_get_dev_id(const char *name);
> +
> +/* Enumerates RegEx device capabilities */
> +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> +/**< RegEx device does support compiling the rules at runtime unlike
> + * loading only the pre-built rule database using
> + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/* Enumerates unsupported PCRE features for the RegEx device */
> +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> + * previous match or the start of the string for the first match.
> + * This position will change each time the RegEx is applied to the subject
> + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> +/**< RegEx device doesn't support PCRE Atomic grouping.
> + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> + * when the RegEx engine exits from it, automatically throws away all
> + * backtracking positions remembered by any tokens inside the group.
> + * Example RegEx is 'a(?>bc|b)c' if the given patterns are 'abc' and 'abcc' then
> + * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
> + * atomic groups don't allow backtracking back to 'b'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> +/**< RegEx device doesn't support PCRE backtracking control verbs.
> + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> + * (*SKIP), (*PRUNE).
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> +/**< RegEx device doesn't support PCRE callouts.
> + * PCRE supports calling an external function in between matches by using '(?C)'.
> + * Example RegEx 'ABC(?C)D' if a given pattern is 'ABCD' then the RegEx engine
> + * will parse ABC, perform a user-defined callout and return a successful match
> + * at D.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> +/**< RegEx device doesn't support PCRE backreference.
> + * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
> + * matched by the 2nd capturing group i.e. 'GHI'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> +/**< RegEx device doesn't support PCRE Greedy mode.
> + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the pattern 'AB12345' will be matched completely
> + * whereas in ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> +/**< RegEx device doesn't support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
> + * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
> + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
> + * successful match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> +/**< RegEx device doesn't support PCRE match point reset directive.
> + * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
> + * then even though the entire pattern matches only '123'
> + * is reported as a match.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> +/**< RegEx device doesn't support PCRE newline convention.
> + * Newline conventions are represented as follows:
> + * (*CR)        carriage return
> + * (*LF)        linefeed
> + * (*CRLF)      carriage return, followed by linefeed
> + * (*ANYCRLF)   any of the three above
> + * (*ANY)       all Unicode newline sequences
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> +/**< RegEx device doesn't support PCRE newline sequence.
> + * The escape sequence '\R' will match any newline sequence.
> + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> +/**< RegEx device doesn't support PCRE possessive quantifiers.
> + * Example RegEx possessive quantifiers '*+', '++', '?+', '{m,n}+'.
> + * A possessive quantifier repeats the token as many times as possible and it
> + * does not give up matches as the engine backtracks. With a possessive
> + * quantifier, the deal is all or nothing.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> +/**< RegEx device doesn't support PCRE Subroutine references.
> + * PCRE Subroutine references allow for sub patterns to be assessed
> + * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
> + * pattern 'foofoofuzzfoofuzzbar'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> +/**< RegEx device doesn't support UTF-8 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> +/**< RegEx device doesn't support UTF-16 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> +/**< RegEx device doesn't support UTF-32 character encoding.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> +/**< RegEx device doesn't support word boundaries.
> + * The meta character '\b' represents word boundary anchor.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
> +
> +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> +/**< RegEx device doesn't support Forward references.
> + * Forward references allow you to use a back reference to a group that appears
> + * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
> + * following string 'GHIGHIABCDEF'.
> + * @see struct rte_regex_dev_info::pcre_unsup_flags
> + */
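
To make the intended use of pcre_unsup_flags concrete, here is a minimal sketch of an application-side check. The three flag values are copied from the hunk above; app_rules_supported() is a hypothetical helper that is not part of the RFC, and the sketch assumes the application already knows which PCRE features its rule set relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values copied from the quoted patch. */
#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)

/*
 * Hypothetical helper (not in the RFC): returns 1 when none of the PCRE
 * features named in needed_features is flagged as unsupported in
 * pcre_unsup_flags, and 0 otherwise.
 */
static int
app_rules_supported(uint64_t pcre_unsup_flags, uint64_t needed_features)
{
	return (pcre_unsup_flags & needed_features) == 0;
}
```

An application would pass the pcre_unsup_flags value reported by rte_regex_dev_info_get() and fall back to a software matcher, or reject the rule set, when the helper returns 0.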
> +
> +/* Enumerates PCRE rule flags */
> +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> +/**< When this flag is set, patterns that can match against an empty string,
> + * such as '.*', are allowed.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> + * is constrained to match only at the first matching point in the string that
> + * is being searched. Similar to '^' and represented by \A.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> +/**< When this flag is set, letters in the pattern match both upper and lower
> + * case letters in the subject.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> +/**< When this flag is set, a dot metacharacter in the pattern matches any
> + * character, including one that indicates a newline.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> +/**< When this flag is set, names used to identify capture groups need not be
> + * unique.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> +/**< When this flag is set, most white space characters in the pattern are
> + * totally ignored except when escaped or inside a character class.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> +/**< When this flag is set, a backreference to an unset capture group matches an
> + * empty string.
> + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> +/**< When this flag is set, the '^' and '$' constructs match immediately
> + * following or immediately before internal newlines in the subject string,
> + * respectively, as well as at the very start and end.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> +/**< When this flag is set, it disables the use of numbered capturing
> + * parentheses in the pattern. References to capture groups (backreferences or
> + * recursion/subroutine calls) may only refer to named groups, though the
> + * reference can be by name or by number.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> +/**< By default, only ASCII characters are recognized. When this flag is set,
> + * Unicode properties are used instead to classify characters.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> + * so that they are not greedy by default, but become greedy if followed by
> + * '?'.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> +/**< When this flag is set, the RegEx engine has to regard both the pattern and
> + * the subject strings that are subsequently processed as strings of UTF
> + * characters instead of single-code-unit strings.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> + * This escape matches one data unit, even in UTF mode, which can cause
> + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> + * current matching point in the middle of a multi-code-unit character.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name */
> +	struct rte_device *dev;	/**< Device information */
> +	uint8_t max_matches;
> +	/**< Maximum matches per scan supported by this device */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint16_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device */
> +	uint16_t max_groups;
> +	/**< Maximum group supported by this device */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint64_t pcre_unsup_flags;
> +	/**< Unsupported PCRE features for this RegEx device.
> +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> +	 */
> +};
> +
> +/**
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag to scan a payload larger than
> + * struct rte_regex_dev_info::max_payload_size and/or when
> + * matches can be present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +/** RegEx device configuration structure */
> +struct rte_regex_dev_config {
> +	uint8_t nb_max_matches;
> +	/**< Maximum matches per scan configured on this device.
> +	 * This value cannot exceed the *max_matches*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case the value 1 is used.
> +	 * @see struct rte_regex_dev_info::max_matches
> +	 */
> +	uint16_t nb_queue_pairs;
> +	/**< Number of RegEx queue pairs to configure on this device.
> +	 * This value cannot exceed the *max_queue_pairs* previously
> +	 * reported by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_queue_pairs
> +	 */
> +	uint16_t nb_rules_per_group;
> +	/**< Number of rules per group to configure on this device.
> +	 * This value cannot exceed the *max_rules_per_group*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * The value 0 is allowed, in which case
> +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> +	 * @see struct rte_regex_dev_info::max_rules_per_group
> +	 */
> +	uint16_t nb_groups;
> +	/**< Number of groups to configure on this device.
> +	 * This value cannot exceed the *max_groups*
> +	 * previously reported by rte_regex_dev_info_get().
> +	 * @see struct rte_regex_dev_info::max_groups
> +	 */
> +	const char *rule_db;
> +	/**< Initial prebuilt rule database to import on this device.
> +	 * The value NULL is allowed, in which case the device will not
> +	 * be configured with a prebuilt rule database. The application may use
> +	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> +	 * to update or import a rule database after
> +	 * rte_regex_dev_configure().
> +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> +	 */
> +	uint32_t rule_db_len;
> +	/**< Length of *rule_db* buffer. */
> +	uint32_t dev_cfg_flags;
> +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> +};
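
The limits and zero-value defaults documented for this structure can be summarised in a short sketch. The two structs below are trimmed, renamed copies holding only the fields that matter here, and app_check_config() is a hypothetical application-side helper, not something the RFC defines; a real driver may well apply these defaults itself inside rte_regex_dev_configure().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Trimmed stand-in for struct rte_regex_dev_info (limits only). */
struct app_dev_limits {
	uint8_t max_matches;
	uint16_t max_queue_pairs;
	uint16_t max_rules_per_group;
	uint16_t max_groups;
};

/* Trimmed stand-in for struct rte_regex_dev_config (counts only). */
struct app_dev_request {
	uint8_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint16_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Hypothetical helper: reject requests above the device limits, then
 * apply the documented defaults (0 matches means 1, 0 rules per group
 * means the device maximum). Returns 0 on success or -EINVAL.
 */
static int
app_check_config(const struct app_dev_limits *lim, struct app_dev_request *req)
{
	if (req->nb_max_matches > lim->max_matches ||
	    req->nb_queue_pairs > lim->max_queue_pairs ||
	    req->nb_rules_per_group > lim->max_rules_per_group ||
	    req->nb_groups > lim->max_groups)
		return -EINVAL;

	if (req->nb_max_matches == 0)
		req->nb_max_matches = 1;
	if (req->nb_rules_per_group == 0)
		req->nb_rules_per_group = lim->max_rules_per_group;
	return 0;
}
```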
> +
> +/**
> + * Configure a RegEx device.
> + *
> + * This function must be invoked first before any other function in the
> + * API. This function can also be re-invoked when a device is in the
> + * stopped state.
> + *
> + * The caller may use rte_regex_dev_info_get() to get the capabilities of the
> + * resources available for this RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device to configure.
> + * @param cfg
> + *   The RegEx device configuration structure.
> + *
> + * @return
> + *   - 0: Success, device configured.
> + *   - <0: Error code returned by the driver configuration function.
> + */
> +int
> +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> +
> +/* Enumerates RegEx queue pair configuration flags */
> +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> +/**< Out of order scan. If not set, a scan must retire after previously issued
> + * in-order scans to this queue pair. If set, a scan can be retired as soon
> + * as the device returns completion. The application should not set the out of
> + * order scan flag if it needs to maintain the ingress order of scan requests.
> + *
> + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> + */
> +
> +struct rte_regex_ops;
> +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> +				      struct rte_regex_ops *op);
> +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> + * flushed RegEx op.
> + */
> +
> +/** RegEx queue pair configuration structure */
> +struct rte_regex_qp_conf {
> +	uint32_t qp_conf_flags;
> +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> +	uint16_t nb_desc;
> +	/**< The number of descriptors to allocate for this queue pair. */
> +	regexdev_stop_flush_t cb;
> +	/**< Callback function called during rte_regex_dev_stop(), invoked
> +	 * once per flushed regex op. Value NULL is allowed, in which case
> +	 * callback will not be invoked. This function can be used to properly
> +	 * dispose of outstanding regex ops from response queue,
> +	 * for example ops containing memory pointers.
> +	 * @see rte_regex_dev_stop()
> +	 */
> +};
> +
> +/**
> + * Allocate and set up a RegEx queue pair for a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_pair_id
> + *   The index of the RegEx queue pair to setup. The value must be in the range
> + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> + * @param qp_conf
> + *   The pointer to the configuration data to be used for the RegEx queue pair.
> + *   NULL value is allowed, in which case the default configuration is used.
> + *
> + * @return
> + *   - 0: Success, RegEx queue pair correctly set up.
> + *   - <0: RegEx queue configuration failed
> + */
> +int
> +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> +			   const struct rte_regex_qp_conf *qp_conf);
> +
> +/**
> + * Start a RegEx device.
> + *
> + * The device start step is the last one and consists of setting the RegEx
> + * queues to start accepting the pattern matching scan requests.
> + *
> + * On success, all basic functions exported by the API (RegEx enqueue,
> + * RegEx dequeue and so on) can be invoked.
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + * @return
> + *   - 0: Success, device started.
> + *   - <0: Device start failed.
> + */
> +int
> +rte_regex_dev_start(uint8_t dev_id);
> +
> +/**
> + * Stop a RegEx device.
> + *
> + * Stop a RegEx device. The device can be restarted with a call to
> + * rte_regex_dev_start().
> + *
> + * This function causes all queued response regex ops to be drained from the
> + * response queue. While draining ops out of the device,
> + * struct rte_regex_qp_conf::cb will be invoked for each op.
> + *
> + * @param dev_id
> + *   RegEx device identifier.
> + *
> + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> + */
> +void
> +rte_regex_dev_stop(uint8_t dev_id);
> +
> +/**
> + * Close a RegEx device. The device cannot be restarted!
> + *
> + * @param dev_id
> + *   RegEx device identifier
> + *
> + * @return
> + *  - 0 on successfully closed the device.
> + *  - <0 on failure to close the device.
> + */
> +int
> +rte_regex_dev_close(uint8_t dev_id);
> +
> +/* Device get/set attributes */
> +
> +/** Enumerates RegEx device attribute identifier */
> +enum rte_regex_dev_attr_id {
> +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> +	/**< The NUMA socket id to which the device is connected or
> +	 * a default of zero if the socket could not be determined.
> +	 * datatype: *int*
> +	 * operation: *get*
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> +	/**< Maximum number of matches per scan.
> +	 * datatype: *uint8_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> +	/**< Upper bound scan time in ns.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> +	 */
> +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> +	/**< Maximum number of prefix detected per scan.
> +	 * This would be useful for denial of service detection.
> +	 * datatype: *uint16_t*
> +	 * operation: *get* and *set*
> +	 *
> +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> +	 */
> +};
> +
> +/**
> + * Get an attribute from a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to retrieve
> + * @param[out] attr_value A pointer that will be filled in with the attribute
> + *             value if successful.
> + *
> + * @return
> + *   - 0: Successfully retrieved attribute value.
> + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       void *attr_value);
> +
> +/**
> + * Set an attribute to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param attr_id The attribute ID to set
> + * @param attr_value A pointer to the attribute value supplied
> + *                   by the application
> + *
> + * @return
> + *   - 0: Successfully applied the attribute value.
> + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> + */
> +int
> +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> +		       const void *attr_value);
> +
> +/* Rule related APIs */
> +/** Enumerates RegEx rule operation */
> +enum rte_regex_rule_op {
> +	RTE_REGEX_RULE_OP_ADD,
> +	/**< Add RegEx rule to rule database */
> +	RTE_REGEX_RULE_OP_REMOVE
> +	/**< Remove RegEx rule from rule database */
> +};
> +
> +/** Structure to hold a RegEx rule attributes */
> +struct rte_regex_rule {
> +	enum rte_regex_rule_op op;
> +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> +	uint16_t group_id;
> +	/**< Group identifier to which the rule belongs. */
> +	uint32_t rule_id;
> +	/**< Rule identifier which is returned on successful match. */
> +	const char *pcre_rule;
> +	/**< Buffer to hold the PCRE rule. */
> +	uint16_t pcre_rule_len;
> +	/**< Length of the PCRE rule. */
> +	uint64_t rule_flags;
> +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> +	 * database update, application needs to provide only supported
> +	 * rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> +	 */
> +};
> +
> +/**
> + * Update the rule database of a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rules
> + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> + *   which contain the regex rules attributes to be updated in rule database.
> + * @param nb_rules
> + *   The number of PCRE rules to update the rule database.
> + *
> + * @return
> + *   The number of regex rules actually updated on the regex device's rule
> + *   database. The return value can be less than the value of the *nb_rules*
> + *   parameter when the regex device fails to update the rule database or
> + *   if invalid parameters are specified in a *rte_regex_rule*.
> + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> + *   at the end of *rules* are not consumed and the caller has to take
> + *   care of them and rte_errno is set accordingly.
> + *   Possible errno values include:
> + *   - -EINVAL:  Invalid device ID or rules is NULL
> + *   - -ENOTSUP: The last processed rule is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> + */
> +uint16_t
> +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> +			 uint16_t nb_rules);
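
For reference, the partial-update return convention described above could be handled by a caller roughly as follows. This is only a sketch: mock_rule_db_update() stands in for rte_regex_rule_db_update() with a made-up three-rule capacity, mock_errno stands in for rte_errno (storing it as a positive value is an assumption), and struct app_rule is a trimmed copy of the rule structure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for struct rte_regex_rule. */
struct app_rule {
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
};

#define MOCK_DB_CAPACITY 3	/* made-up limit for the mock device */
static uint16_t mock_db_used;	/* rules currently held by the mock */
static int mock_errno;		/* stand-in for rte_errno */

/* Mock update: consumes rules until the database is full. */
static uint16_t
mock_rule_db_update(const struct app_rule *rules, uint16_t nb_rules)
{
	uint16_t i;

	if (rules == NULL) {
		mock_errno = EINVAL;
		return 0;
	}
	for (i = 0; i < nb_rules; i++) {
		if (mock_db_used == MOCK_DB_CAPACITY) {
			mock_errno = ENOSPC;
			break;
		}
		mock_db_used++;
	}
	return i;
}

/*
 * Hypothetical caller: returns 0 when every rule was consumed, else a
 * negative errno with *first_rejected set to the first unconsumed index.
 */
static int
app_load_rules(const struct app_rule *rules, uint16_t nb_rules,
	       uint16_t *first_rejected)
{
	uint16_t done = mock_rule_db_update(rules, nb_rules);

	if (done == nb_rules)
		return 0;
	*first_rejected = done;
	return -mock_errno;
}
```

The caller learns which rule stopped the update only through the returned count, so it has to keep the rules array around until the call completes.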

I think the function name is not informative enough. If this function is meant to compile the rules, the name should say so explicitly.

> +
> +/**
> + * Import a prebuilt rule database from a buffer to a RegEx device.
> + *
> + * @param dev_id RegEx device identifier
> + * @param rule_db
> + *   Points to prebuilt rule database.
> + * @param rule_db_len
> + *   Length of the rule database.
> + *
> + * @return
> + *   - 0: Successfully updated the prebuilt rule database.
> + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> + *   - -ENOTSUP: Rule database import is not supported on this device.
> + *   - -ENOSPC: No space available in rule database.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> + */
> +int
> +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> +			 uint32_t rule_db_len);
> +
> +/**
> + * Export the prebuilt rule database from a RegEx device to the buffer.
> + *
> + * @param dev_id RegEx device identifier
> + * @param[out] rule_db
> + *   Block of memory to insert the rule database. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + *
> + * @return
> + *   - 0: Successfully exported the prebuilt rule database.
> + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> + *   - -EINVAL:  Invalid device ID
> + *   - -ENOTSUP: Rule database export is not supported on this device.
> + *
> + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> + */
> +int
> +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> +
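
The export documentation above describes a two-call pattern: pass NULL to learn the required capacity, then call again with a buffer of that size. A minimal sketch of that pattern follows; mock_rule_db_export() and its fixed payload are assumptions standing in for the proposed rte_regex_rule_db_export(), so the snippet builds without DPDK.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char mock_db[] = "compiled-rule-db";	/* made-up payload */

/* Mock of the export call: NULL asks for the size, otherwise the DB is copied. */
static int
mock_rule_db_export(char *rule_db)
{
	if (rule_db == NULL)
		return (int)sizeof(mock_db);
	memcpy(rule_db, mock_db, sizeof(mock_db));
	return 0;
}

/* Returns a malloc'ed copy of the database and its length; NULL on failure. */
static char *
export_rule_db(int *len)
{
	char *buf;
	int size;

	assert(len != NULL);
	size = mock_rule_db_export(NULL);	/* first call: query capacity */
	if (size <= 0)
		return NULL;
	buf = malloc((size_t)size);
	if (buf == NULL)
		return NULL;
	if (mock_rule_db_export(buf) != 0) {	/* second call: fetch data */
		free(buf);
		return NULL;
	}
	*len = size;
	return buf;
}
```

Note that, as declared in the RFC, the export call takes no buffer-length argument, so the caller has to trust the size returned by the first call.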
> +/* Extended statistics */
> +/** Maximum name length for extended statistics counters */
> +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> +
> +/**
> + * A name-key lookup element for extended statistics.
> + *
> + * This structure is used to map between names and ID numbers
> + * for extended RegEx device statistics.
> + */
> +struct rte_regex_dev_xstats_map {
> +	uint16_t id;
> +	/**< xstat identifier */
> +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> +	/**< xstat name */
> +};
> +
> +/**
> + * Retrieve names of extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the regex device.
> + * @param[out] xstats_map
> + *   Block of memory to insert id and names into. Must be at least size in
> + *   capacity. If set to NULL, function returns required capacity.
> + * @return
> + *   - positive value on success:
> + *        -The return value is the number of entries filled in the stats map.
> + *        -If xstats_map set to NULL then required capacity for xstats_map.
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> +			       struct rte_regex_dev_xstats_map *xstats_map);
> +
> +/**
> + * Retrieve extended statistics of a regex device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param ids
> + *   The id numbers of the stats to get. The ids can be got from the stat
> + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> + *   by using rte_regex_dev_xstats_by_name_get().
> + * @param[out] values
> + *   The values for each stats request by ID.
> + * @param n
> + *   The number of stats requested
> + * @return
> + *   - positive value: number of stat entries filled into the values array
> + *   - negative value on error:
> + *      -ENODEV for invalid *dev_id*
> + *      -ENOTSUP if the device doesn't support this function.
> + */
> +int
> +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> +			 uint64_t values[], uint16_t n);
> +
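
As I read the two xstats calls above, an application would first size and fetch the name map, then read values by id. A rough sketch under that reading; the mock_* functions, the two counter names and their values are invented for illustration and only mirror the shape of the proposed API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define XSTATS_NAME_SIZE 64	/* mirrors RTE_REGEX_DEV_XSTATS_NAME_SIZE */

struct xstats_map {		/* mirrors struct rte_regex_dev_xstats_map */
	uint16_t id;
	char name[XSTATS_NAME_SIZE];
};

/* Invented counters for the mock device. */
static const char *const mock_names[] = { "scans", "matches" };
static const uint64_t mock_values[] = { 100, 7 };
#define MOCK_NB_XSTATS 2

/* Mock of the names_get call: NULL returns the required capacity. */
static int
mock_xstats_names_get(struct xstats_map *map)
{
	int i;

	if (map == NULL)
		return MOCK_NB_XSTATS;
	for (i = 0; i < MOCK_NB_XSTATS; i++) {
		map[i].id = (uint16_t)i;
		snprintf(map[i].name, XSTATS_NAME_SIZE, "%s", mock_names[i]);
	}
	return MOCK_NB_XSTATS;
}

/* Mock of the xstats_get call: fills values[] for the requested ids. */
static int
mock_xstats_get(const uint16_t ids[], uint64_t values[], uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++) {
		if (ids[i] >= MOCK_NB_XSTATS)
			return -EINVAL;
		values[i] = mock_values[ids[i]];
	}
	return n;
}

/* Look up one counter by name; returns 0 and stores it in *value, or -1. */
static int
read_xstat(const char *name, uint64_t *value)
{
	struct xstats_map *map;
	int i, n, ret = -1;

	assert(name != NULL && value != NULL);
	n = mock_xstats_names_get(NULL);
	if (n <= 0)
		return -1;
	map = calloc((size_t)n, sizeof(*map));
	if (map == NULL)
		return -1;
	n = mock_xstats_names_get(map);
	for (i = 0; i < n; i++) {
		if (strcmp(map[i].name, name) == 0) {
			if (mock_xstats_get(&map[i].id, value, 1) == 1)
				ret = 0;
			break;
		}
	}
	free(map);
	return ret;
}
```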
> +/**
> + * Retrieve the value of a single stat by requesting it by name.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param name
> + *   The stat name to retrieve
> + * @param[out] id
> + *   If non-NULL, the numerical id of the stat will be returned, so that further
> + *   requests for the stat can be got using rte_regex_dev_xstats_get, which
> will
> + *   be faster as it doesn't need to scan a list of names for the stat.
> + * @param[out] value
> + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> + *
> + * @return
> + *   - 0: Successfully retrieved xstat value.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> +				 uint16_t *id, uint64_t *value);
> +
> +/**
> + * Reset the values of the xstats of the selected component in the device.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @param ids
> + *   Selects specific statistics to be reset. When NULL, all statistics will be
> + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> + * @param nb_ids
> + *   The number of ids available from the *ids* array. Ignored when ids is
> NULL.
> + * @return
> + *   - 0: Successfully reset the statistics to zero.
> + *   - -EINVAL: invalid parameters
> + *   - -ENOTSUP: if not supported.
> + */
> +int
> +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> +			   uint16_t nb_ids);
> +
> +/**
> + * Trigger the RegEx device self test.
> + *
> + * @param dev_id
> + *   The identifier of the device
> + * @return
> + *   - 0: Selftest successful
> + *   - -ENOTSUP if the device doesn't support selftest
> + *   - other values < 0 on failure.
> + */
> +int rte_regex_dev_selftest(uint8_t dev_id);
> +
> +/**
> + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param f
> + *   A pointer to a file for output
> + *
> + * @return
> + *   - 0: on success
> + *   - <0: on failure.
> + */
> +int
> +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> +
> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match
> attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		struct {
> +			uint32_t rule_id:20;
> +			/**< Rule identifier to which the pattern matched.
> +			 * @see struct rte_regex_rule::rule_id
> +			 */
> +			uint32_t group_id:12;
> +			/**< Group identifier of the rule which the pattern
> +			 * matched. @see struct rte_regex_rule::group_id
> +			 */
> +			uint16_t offset;
> +			/**< Starting Byte Position for matched rule. */
> +			uint16_t len;
> +			/**< Length of match in bytes */
> +		};
> +	};
> +};
> +
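
To make the intent of the u64 overlay concrete: the whole match tuple can be cleared or copied as a single 64-bit word. The sketch below uses a local copy of the proposed layout (named regex_match here, without RTE_STD_C11) so it builds without DPDK; it assumes a C11 compiler with anonymous struct/union support and the usual GCC/Clang bit-field packing.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the proposed struct rte_regex_match layout. */
struct regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t offset;
			uint16_t len;
		};
	};
};

/* Build a match tuple; ids must fit in their 20-bit and 12-bit fields. */
static struct regex_match
make_match(uint32_t rule_id, uint32_t group_id, uint16_t offset, uint16_t len)
{
	struct regex_match m;

	assert(rule_id < (1u << 20) && group_id < (1u << 12));
	m.u64 = 0;		/* one store clears every field */
	m.rule_id = rule_id;
	m.group_id = group_id;
	m.offset = offset;
	m.len = len;
	return m;
}

/* Copy a match tuple with a single 64-bit assignment. */
static struct regex_match
copy_match(struct regex_match src)
{
	struct regex_match dst;

	dst.u64 = src.u64;
	return dst;
}
```

One thing this makes visible: group_id is 12 bits wide here, while the group_id0..3 fields in the ops structure are uint16_t.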
> +/* Enumerates RegEx request flags. */
> +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> +/**< Set when struct rte_regex_ops::group_id1 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> +/**< Set when struct rte_regex_ops::group_id2 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> +/**< Set when struct rte_regex_ops::group_id3 is valid */
> +
> +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> +/**< The RegEx engine will stop scanning and return the first match. */
> +
> +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> +/**< In High Priority mode a maximum of one match will be returned per scan to
> + * reduce the post-processing required by the application. The match with the
> + * lowest Rule id, lowest start pointer and lowest match length will be
> + * returned.
> + *
> + * @see struct rte_regex_ops::nb_actual_matches
> + * @see struct rte_regex_ops::nb_matches
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> +/**< Indicates that the RegEx device has exceeded the max timeout while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> +/**< Indicates that the RegEx device has exceeded the max matches while
> + * scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> + */
> +
> +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> +/**< Indicates that the RegEx device has reached the max allowed prefix
> length
> + * while scanning the given buffer.
> + *
> + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> + */
> +
> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +	/* W0 */
> +	uint16_t req_flags;
> +	/**< Request flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_REQ_*
> +	 */
> +	uint16_t scan_size;
> +	/**< Scan size of the buffer to be scanned in bytes. */
> +	uint16_t rsp_flags;
> +	/**< Response flags for the RegEx ops.
> +	 * @see RTE_REGEX_OPS_RSP_*
> +	 */
> +	uint8_t nb_actual_matches;
> +	/**< The total number of actual matches detected by the Regex
> device.*/
> +	uint8_t nb_matches;
> +	/**< The total number of matches returned by the RegEx device for this
> +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> +	 * this value.
> +	 *
> +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> +	 */
> +
> +	/* W1 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t u64;
> +		/**<  Allow 8-byte reserved on 32-bit system */
> +		void *buf_addr;
> +		/**< Virtual address of the pattern to be matched. */
> +	};
> +
> +	/* W2 */
> +	rte_iova_t buf_iova;
> +	/**< IOVA address of the pattern to be matched. */
> +
> +	/* W3 */
> +	uint16_t group_id0;
> +	/**< First group_id to match the rule against. Minimum one group id
> +	 * must be provided by application.
> +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> group_id1
> +	 * is valid, respectively similar flags for group_id2 and group_id3.
> +	 * Upon the match, struct rte_regex_match::group_id shall be
> updated
> +	 * with matching group ID by the device. Group ID scheme provides
> +	 * rule isolation and effective pattern matching.
> +	 */
> +	uint16_t group_id1;
> +	/**< Second group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> +	 */
> +	uint16_t group_id2;
> +	/**< Third group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> +	 */
> +	uint16_t group_id3;
> +	/**< Fourth group_id to match the rule against.
> +	 *
> +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> +	 */
> +
> +	/* W4 */
> +	RTE_STD_C11
> +	union {
> +		uint64_t user_id;
> +		/**< Application specific opaque value. An application may
> use
> +		 * this field to hold application specific value to share
> +		 * between dequeue and enqueue operation.
> +		 * Implementation should not modify this field.
> +		 */
> +		void *user_ptr;
> +		/**< Pointer representation of *user_id* */
> +	};

Since we target the regex subsystem for both regex and DPI, I think it would be good to add another uint64_t field called connection_id.
Devices that support DPI can treat it as another matchable field when looking up matches on the given buffer.

This field is different from user_id, as it is not opaque to the device.

> +
> +	/* W5 */
> +	struct rte_regex_match matches[];
> +	/**< Zero length array to hold the match tuples.
> +	 * The struct rte_regex_ops::nb_matches value holds the number of
> +	 * elements in this array.
> +	 *
> +	 * @see struct rte_regex_ops::nb_matches
> +	 */
> +};
> +
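
Since matches[] is a flexible array member, each op has to be allocated with room for the maximum number of matches, and the group-id valid flags have to track how many of group_id1..3 are filled in. A sketch of both steps, using a trimmed local copy of the structure (names shortened, calloc() instead of a mempool) purely so it builds without DPDK:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirror RTE_REGEX_OPS_REQ_GROUP_ID{1,2,3}_VALID_F from the RFC. */
#define REQ_GROUP_ID1_VALID_F (1 << 0)
#define REQ_GROUP_ID2_VALID_F (1 << 1)
#define REQ_GROUP_ID3_VALID_F (1 << 2)

struct match { uint64_t u64; };	/* placeholder for struct rte_regex_match */

/* Trimmed local copy of the proposed struct rte_regex_ops. */
struct regex_ops {
	uint16_t req_flags;
	uint16_t scan_size;
	uint16_t rsp_flags;
	uint8_t nb_actual_matches;
	uint8_t nb_matches;
	void *buf_addr;
	uint16_t group_id0;
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
	struct match matches[];	/* sized at allocation time */
};

/* Allocate one zeroed op with room for max_matches results. */
static struct regex_ops *
ops_alloc(uint8_t max_matches)
{
	return calloc(1, sizeof(struct regex_ops) +
			 (size_t)max_matches * sizeof(struct match));
}

/* Fill in the scan buffer and 1..4 group ids, setting the valid flags. */
static void
ops_prepare(struct regex_ops *op, void *buf, uint16_t len,
	    const uint16_t *groups, unsigned int nb_groups)
{
	assert(op != NULL && groups != NULL);
	assert(nb_groups >= 1 && nb_groups <= 4);
	op->buf_addr = buf;
	op->scan_size = len;
	op->req_flags = 0;
	op->group_id0 = groups[0];	/* always required */
	if (nb_groups > 1) {
		op->group_id1 = groups[1];
		op->req_flags |= REQ_GROUP_ID1_VALID_F;
	}
	if (nb_groups > 2) {
		op->group_id2 = groups[2];
		op->req_flags |= REQ_GROUP_ID2_VALID_F;
	}
	if (nb_groups > 3) {
		op->group_id3 = groups[3];
		op->req_flags |= REQ_GROUP_ID3_VALID_F;
	}
}
```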
> +/**
> + * Enqueue a burst of scan request on a RegEx device.
> + *
> + * The rte_regex_enqueue_burst() function is invoked to place
> + * regex operations on the queue *qp_id* of the device designated by
> + * its *dev_id*.
> + *
> + * The *nb_ops* parameter is the number of operations to process which are
> + * supplied in the *ops* array of *rte_regex_ops* structures.
> + *
> + * The rte_regex_enqueue_burst() function returns the number of
> + * operations it actually enqueued for processing. A return value equal to
> + * *nb_ops* means that all operations have been enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param qp_id
> + *   The index of the queue pair which packets are to be enqueued for
> + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> + *   previously supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of *nb_ops* pointers to *rte_regex_op*
> structures
> + *   which contain the regex operations to be processed.
> + * @param nb_ops
> + *   The number of operations to process.
> + *
> + * @return
> + *   The number of operations actually enqueued on the regex device. The
> return
> + *   value can be less than the value of the *nb_ops* parameter when the
> + *   regex devices queue is full or if invalid parameters are specified in
> + *   a *rte_regex_op*. If the return value is less than *nb_ops*, the
> remaining
> + *   ops at the end of *ops* are not consumed and the caller has to take
> care
> + *   of them.
> + */
> +uint16_t
> +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +/**
> + *
> + * Dequeue a burst of scan response from a queue on the RegEx device.
> + * The dequeued operation are stored in *rte_regex_op* structures
> + * whose pointers are supplied in the *ops* array.
> + *
> + * The rte_regex_dequeue_burst() function returns the number of ops
> + * actually dequeued, which is the number of *rte_regex_op* data
> structures
> + * effectively supplied into the *ops* array.
> + *
> + * A return value equal to *nb_ops* indicates that the queue contained
> + * at least *nb_ops* operations, and this is likely to signify that other
> + * processed operations remain in the devices output queue. Applications
> + * implementing a "retrieve as many processed operations as possible"
> policy
> + * can check this specific case and keep invoking the
> + * rte_regex_dequeue_burst() function until a value less than
> + * *nb_ops* is returned.
> + *
> + * The rte_regex_dequeue_burst() function does not provide any error
> + * notification to avoid the corresponding overhead.
> + *
> + * @param dev_id
> + *   The RegEx device identifier
> + * @param qp_id
> + *   The index of the queue pair from which to retrieve processed packets.
> + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> + *   supplied to rte_regex_dev_configure().
> + * @param ops
> + *   The address of an array of pointers to *rte_regex_op* structures that
> must
> + *   be large enough to store *nb_ops* pointers in it.
> + * @param nb_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued, which is the number
> + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> + *   entries at the end of *ops* are not filled in.
> + */
> +uint16_t
> +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> +			struct rte_regex_ops **ops, uint16_t nb_ops);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_REGEXDEV_H_ */
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-21  5:32     ` Shahaf Shuler
@ 2019-08-21 15:12       ` John Bromhead
  2019-09-10 10:31       ` Jerin Jacob Kollanukkaran
  2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
  2 siblings, 0 replies; 62+ messages in thread
From: John Bromhead @ 2019-08-21 15:12 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: Thomas Monjalon, dev, jerinj, Pavan Nikhilesh, Hemant Agrawal,
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	Nipun Gupta, Wang, Xiang W, Richardson, Bruce, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, deri, fc, arthur.su, James Hunter,
	Gareth Douglas, Sakir Sezer

There are probably quite a few other use cases, but I suggest you also add:

Natural Language Processing (NLP)
Sentiment Analysis
Big Data database acceleration (Spark, Hadoop etc.)
Computational Storage

Regards JohnB

John Bromhead
VP of Business Development
Titan IC
San Diego, CA 92130, USA

j.bromhead@titan-ic.com
Cell: +1-858-642-2501
Web: www.titan-ic.com
Personal email: john@bromhead.com
LinkedIn: https://www.linkedin.com/in/jbromhead
To book a meeting: https://calendly.com/johnbromhead/titanic


On Aug 20, 2019, at 10:32 PM, Shahaf Shuler <shahafs@mellanox.com> wrote:

Hi Jerin,

Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
subsystem

+Cc more

------------

From: Jerin Jacob <jerinj@marvell.com>

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

The Doxygen generated RFC API documentation is available here:
https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
hyperscan, and on a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
• Next Generation Firewalls (NGFW)
• Deep Packet and Flow Inspection (DPI)
• Intrusion Prevention Systems (IPS)
• DDoS Mitigation
• Network Monitoring
• Data Loss Prevention (DLP)
• Smart NICs
• Grammar based content processing
• URL, spam and adware filtering
• Advanced auditing and policing of user/application security policies
• Financial data mining - parsing of streamed financial feeds

I think two more important use cases to add (at least in the doc of this subsystem) are:
* application recognition
* memory introspection



Requesting review from HW and SW RegEx vendors and RegEx application
users, in order to have a portable DPDK API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and ethdev
device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and its provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to setup
a RegEx device (configure it, setup its queue pairs and start it),
update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
a RegEx poll Mode Driver (PMD) to simultaneously register itself as
a RegEx device driver.

RegEx device components and definitions:

   +-----------------+
   |                 |
   |                 o---------+    rte_regex_[en|de]queue_burst()
   |   PCRE based    o------+  |               |
   |  RegEx pattern  |      |  |  +--------+   |
   | matching engine o------+--+--o        |   |    +------+
   |                 |      |  |  | queue  |<==o===>|Core 0|
   |                 o----+ |  |  | pair 0 |        |      |
   |                 |    | |  |  +--------+        +------+
   +-----------------+    | |  |
          ^               | |  |  +--------+
          |               | |  |  |        |        +------+
          |               | +--+--o queue  |<======>|Core 1|
      Rule|Database       |    |  | pair 1 |        |      |
   +------+----------+    |    |  +--------+        +------+
   |     Group 0     |    |    |
   | +-------------+ |    |    |  +--------+        +------+
   | | Rules 0..n  | |    |    |  |        |        |Core 2|
   | +-------------+ |    |    +--o queue  |<======>|      |
   |     Group 1     |    |       | pair 2 |        +------+
   | +-------------+ |    |       +--------+
   | | Rules 0..n  | |    |
   | +-------------+ |    |       +--------+
   |     Group 2     |    |       |        |        +------+
   | +-------------+ |    |       | queue  |<======>|Core n|
   | | Rules 0..n  | |    +-------o pair n |        |      |
   | +-------------+ |            +--------+        +------+
   |     Group n     |
   | +-------------+ |<-------rte_regex_rule_db_update()
   | | Rules 0..n  | |<-------rte_regex_rule_db_import()
   | +-------------+ |------->rte_regex_rule_db_export()
   +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is “RegEx”.

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database contains
a set of rules that are compiled in device specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Rules can be grouped under one group ID to enable rule isolation
and effective pattern matching. A unique group identifier is provided at
the time of rule creation for the application to identify the rule upon
match.

Scan: A pattern matching request through *enqueue* API.

A given RegEx device may not support all the features of PCRE. The
application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue
pairs. It is the responsibility of the upper level application to enforce
this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device probing
phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the probed
device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware or
software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier are
freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following order:
- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the reconfiguration
before calling rte_regex_dev_start() again. The enqueue and dequeue
functions should not be invoked when the device is stopped.

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.
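
The ordering rules above (configure, queue pair setup, start; stop before any reconfiguration) can be read as a small state machine. The sketch below models that reading with a mock device; the state names, the mock_* functions and the specific error codes (-EINVAL, -EBUSY) are my assumptions, since the text does not say how a violation is reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device states implied by the ordering rules. */
enum dev_state { DEV_UNCONFIGURED, DEV_CONFIGURED, DEV_QP_READY, DEV_STARTED };

struct mock_dev { enum dev_state state; };

/* Configure is allowed any time the device is not running. */
static int
mock_configure(struct mock_dev *d)
{
	assert(d != NULL);
	if (d->state == DEV_STARTED)
		return -EBUSY;	/* must stop before reconfiguring */
	d->state = DEV_CONFIGURED;
	return 0;
}

/* Queue pairs can be set up only on a configured, stopped device. */
static int
mock_queue_pair_setup(struct mock_dev *d)
{
	assert(d != NULL);
	if (d->state != DEV_CONFIGURED && d->state != DEV_QP_READY)
		return d->state == DEV_STARTED ? -EBUSY : -EINVAL;
	d->state = DEV_QP_READY;
	return 0;
}

/* Start requires configure and queue pair setup to have been done. */
static int
mock_start(struct mock_dev *d)
{
	assert(d != NULL);
	if (d->state != DEV_QP_READY)
		return -EINVAL;
	d->state = DEV_STARTED;
	return 0;
}

/* Stop returns the device to a state where it can be reconfigured. */
static int
mock_stop(struct mock_dev *d)
{
	assert(d != NULL);
	if (d->state != DEV_STARTED)
		return -EINVAL;
	d->state = DEV_QP_READY;
	return 0;
}

/* Enqueue and dequeue are only legal while the device is started. */
static int
mock_fast_path_allowed(const struct mock_dev *d)
{
	assert(d != NULL);
	return d->state == DEV_STARTED;
}
```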

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of type
*regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the
*rte_regex_dev*
structure by the device init function of the RegEx driver, which is
invoked during the PCI/SoC device probing phase, as explained earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
structure.

For performance reasons, the address of the fast-path functions of the
RegEx driver is not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regex_dev*
structure to avoid an extra indirect memory access during their invocation.
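
A sketch of the dispatch scheme described in the last few paragraphs: slow-path calls go through the ops table, while the burst function pointers sit at the start of the device structure. The RFC does not define *rte_regex_dev* or *regex_dev_ops* yet, so every type, field and function name below is hypothetical and only illustrates the idea.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct regex_ops;	/* opaque here */
struct regex_dev;

typedef uint16_t (*burst_fn)(struct regex_dev *dev, uint16_t qp_id,
			     struct regex_ops **ops, uint16_t nb_ops);

/* Slow-path driver callbacks (hypothetical subset of regex_dev_ops). */
struct regex_dev_ops {
	int (*dev_start)(struct regex_dev *dev);
};

struct regex_dev {
	burst_fn enqueue;	/* fast path: first members, no extra hop */
	burst_fn dequeue;
	const struct regex_dev_ops *dev_ops;	/* slow path: one more load */
	int started;
};

#define MAX_DEVS 4
static struct regex_dev devs[MAX_DEVS];	/* dev_id indexes this table */

/* Fast path: device lookup plus one indirect call. */
static inline uint16_t
enqueue_burst(uint8_t dev_id, uint16_t qp_id, struct regex_ops **ops,
	      uint16_t nb_ops)
{
	struct regex_dev *dev;

	assert(dev_id < MAX_DEVS);
	dev = &devs[dev_id];
	return dev->enqueue(dev, qp_id, ops, nb_ops);
}

/* Slow path: validate, then dispatch through the ops table. */
static int
dev_start(uint8_t dev_id)
{
	struct regex_dev *dev;

	if (dev_id >= MAX_DEVS)
		return -EINVAL;
	dev = &devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -ENOTSUP;
	return dev->dev_ops->dev_start(dev);
}

/* Toy driver used to exercise the dispatch. */
static uint16_t
toy_burst(struct regex_dev *dev, uint16_t qp_id, struct regex_ops **ops,
	  uint16_t nb_ops)
{
	(void)dev; (void)qp_id; (void)ops;
	return nb_ops;
}

static int
toy_start(struct regex_dev *dev)
{
	dev->started = 1;
	return 0;
}

static const struct regex_dev_ops toy_ops = { .dev_start = toy_start };

/* What a driver's init function would do at probe time. */
static void
toy_probe(uint8_t dev_id)
{
	devs[dev_id].enqueue = toy_burst;
	devs[dev_id].dequeue = toy_burst;
	devs[dev_id].dev_ops = &toy_ops;
}
```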

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching requests
to the RegEx device and the *dequeue* operation gets a burst of pattern
matching responses for the ones submitted through the *enqueue* operation.

Typical application utilisation of the RegEx device API will follow the
following programming flow.

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
database is not provided in rte_regex_dev_config::rule_db for
rte_regex_dev_configure() and/or the application needs to update the rule
database.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---

config/common_base                 |    5 +
doc/api/doxy-api-index.md          |    1 +
doc/api/doxy-api.conf.in           |    1 +
lib/Makefile                       |    2 +
lib/librte_regexdev/Makefile       |   23 +
lib/librte_regexdev/rte_regexdev.c |    5 +
lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
7 files changed, 1284 insertions(+)
create mode 100644 lib/librte_regexdev/Makefile
create mode 100644 lib/librte_regexdev/rte_regexdev.c
create mode 100644 lib/librte_regexdev/rte_regexdev.h

diff --git a/config/common_base b/config/common_base
index e406e7836..986093d6e 100644
--- a/config/common_base
+++ b/config/common_base
@@ -746,6 +746,11 @@
CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
#
CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y

+#
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+
#
# Compile librte_ring
#
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 715248dd1..a0bc27ae4 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
[event_timer_adapter]    (@ref rte_event_timer_adapter.h),
[event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
[rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
[metrics]            (@ref rte_metrics.h),
[bitrate]            (@ref rte_bitrate.h),
[latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index b9896cb63..7adb821bb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-
index.md \
@TOPDIR@/lib/librte_rawdev \
@TOPDIR@/lib/librte_rcu \
@TOPDIR@/lib/librte_reorder \
+                          @TOPDIR@/lib/librte_regexdev \
@TOPDIR@/lib/librte_ring \
@TOPDIR@/lib/librte_sched \
@TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 791e0d991..57de9691a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring
librte_ethdev librte_hash \
librte_mempool librte_timer librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf
librte_ethdev \
           librte_net
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 000000000..723b4b28c
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+SRCS-y += rte_regexdev.c
+
+# export include files
+SYMLINK-y-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c
b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 000000000..e5be0f29c
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h
b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 000000000..765da4aaa
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1247 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ *
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier that is provided at the time of
+ * rule creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier
+ * is provided at the time of rule creation for the application to identify
+ * the rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features of PCRE.
+ * The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function
+ * can be invoked in parallel by different logical cores on different queue
+ * pairs. It is the responsibility of the upper level application to enforce
+ * this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API will follow the
+ * following programming flow.
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update the
+ *   rule database.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device does support compiling the rules at runtime unlike
+ * loading only the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+
+/* Enumerates unsupported PCRE features for the RegEx device */
+#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
+/**< RegEx device doesn't support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
+/**< RegEx device doesn't support PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given subject strings are 'abc' and
+ * 'abcc', then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only
+ * 'abcc' because atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
+/**< RegEx device doesn't support PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
+/**< RegEx device doesn't support PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject string is 'ABCD',
+ * the RegEx engine will parse 'ABC', perform a user-defined callout and
+ * return a successful match at 'D'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
+/**< RegEx device doesn't support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
+/**< RegEx device doesn't support PCRE Greedy mode.
+ * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or
+ * unlimited matches, as few as possible. In greedy mode ('AB\d*') the
+ * subject string 'AB12345' will be matched completely, whereas in ungreedy
+ * mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
+/**< RegEx device doesn't support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
+ * the given subject string is 'dwad1234!', the RegEx engine doesn't report
+ * any matches because the assert '(?=!{3,})' fails. The subject string
+ * 'dwad123!!!' would return a successful match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
+/**< RegEx device doesn't support PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+'. If the subject string is 'dwad123',
+ * then even though the entire string matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
+/**< RegEx device doesn't support PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
+/**< RegEx device doesn't support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
+/**< RegEx device doesn't support PCRE possessive quantifiers.
+ * Example possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and
+ * does not give up matches as the engine backtracks. With a possessive
+ * quantifier, the deal is all or nothing.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
+/**< RegEx device doesn't support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
+/**< RegEx device doesn't support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
+/**< RegEx device doesn't support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
+/**< RegEx device doesn't support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
+/**< RegEx device doesn't support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
+/**< RegEx device doesn't support Forward references.
+ * Forward references allow you to use a back reference to a group that
+ * appears later in the RegEx. Example RegEx '(\3ABC|(DEF|(GHI)))+' matches
+ * the string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+    const char *driver_name; /**< RegEx driver name */
+    struct rte_device *dev;    /**< Device information */
+    uint8_t max_matches;
+    /**< Maximum matches per scan supported by this device */
+    uint16_t max_queue_pairs;
+    /**< Maximum queue pairs supported by this device */
+    uint16_t max_payload_size;
+    /**< Maximum payload size for a pattern match request or scan.
+     * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+     */
+    uint16_t max_rules_per_group;
+    /**< Maximum rules supported per group by this device */
+    uint16_t max_groups;
+    /**< Maximum group supported by this device */
+    uint32_t regex_dev_capa;
+    /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+    uint64_t rule_flags;
+    /**< Supported compiler rule flags.
+     * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+     */
+    uint64_t pcre_unsup_flags;
+    /**< Unsupported PCRE features for this RegEx device.
+     * @see RTE_REGEX_DEV_PCRE_UNSUP_*
+     */
+};
+
+/**
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload whose size is
+ * greater than struct rte_regex_dev_info::max_payload_size and/or
+ * when matches can be present across scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+    uint8_t nb_max_matches;
+    /**< Maximum matches per scan configured on this device.
+     * This value cannot exceed the *max_matches*
+     * which was previously provided in rte_regex_dev_info_get().
+     * The value 0 is allowed, in which case the value 1 is used.
+     * @see struct rte_regex_dev_info::max_matches
+     */
+    uint16_t nb_queue_pairs;
+    /**< Number of RegEx queue pairs to configure on this device.
+     * This value cannot exceed the *max_queue_pairs* which was previously
+     * provided in rte_regex_dev_info_get().
+     * @see struct rte_regex_dev_info::max_queue_pairs
+     */
+    uint16_t nb_rules_per_group;
+    /**< Number of rules per group to configure on this device.
+     * This value cannot exceed the *max_rules_per_group*
+     * which was previously provided in rte_regex_dev_info_get().
+     * The value 0 is allowed, in which case
+     * struct rte_regex_dev_info::max_rules_per_group is used.
+     * @see struct rte_regex_dev_info::max_rules_per_group
+     */
+    uint16_t nb_groups;
+    /**< Number of groups to configure on this device.
+     * This value cannot exceed the *max_groups*
+     * which was previously provided in rte_regex_dev_info_get().
+     * @see struct rte_regex_dev_info::max_groups
+     */
+    const char *rule_db;
+    /**< Initial prebuilt rule database to import on this device.
+     * The value NULL is allowed, in which case the device will not
+     * be configured with a prebuilt rule database. The application may use
+     * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+     * to update or import the rule database after
+     * rte_regex_dev_configure().
+     * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+     */
+    uint32_t rule_db_len;
+    /**< Length of *rule_db* buffer. */
+    uint32_t dev_cfg_flags;
+    /**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
+};
+
+/**
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities of the
+ * resources available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured.
+ *   - <0: Error code returned by the driver configuration function.
+ */
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+                      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+    uint32_t qp_conf_flags;
+    /**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+    uint16_t nb_desc;
+    /**< The number of descriptors to allocate for this queue pair. */
+    regexdev_stop_flush_t cb;
+    /**< Callback function called during rte_regex_dev_stop(), invoked
+     * once per flushed regex op. Value NULL is allowed, in which case
+     * callback will not be invoked. This function can be used to properly
+     * dispose of outstanding regex ops from response queue,
+     * for example ops containing memory pointers.
+     * @see rte_regex_dev_stop()
+     */
+};
+
+/**
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   - 0: Success, RegEx queue pair correctly set up.
+ *   - <0: RegEx queue configuration failed
+ */
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
+               const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ * @return
+ *   - 0: Success, device started.
+ *   - <0: Device start failed.
+ */
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * Stop a RegEx device.
+ *
+ * Stop a RegEx device. The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *  - 0 on successfully closed the device.
+ *  - <0 on failure to close the device.
+ */
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+    RTE_REGEX_DEV_ATTR_SOCKET_ID,
+    /**< The NUMA socket id to which the device is connected or
+     * a default of zero if the socket could not be determined.
+     * datatype: *int*
+     * operation: *get*
+     */
+    RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+    /**< Maximum number of matches per scan.
+     * datatype: *uint8_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+     */
+    RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+    /**< Upper bound scan time in ns.
+     * datatype: *uint16_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+     */
+    RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+    /**< Maximum number of prefix detected per scan.
+     * This would be useful for denial of service detection.
+     * datatype: *uint16_t*
+     * operation: *get* and *set*
+     *
+     * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+     */
+};
+
+/**
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to retrieve
+ * @param[out] attr_value A pointer that will be filled in with the attribute
+ *             value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+               void *attr_value);
+
+/**
+ * Set an attribute to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param attr_id The attribute ID to set
+ * @param attr_value A pointer to the attribute value supplied
+ *                   by the application
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+               const void *attr_value);
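Since the attribute value travels through a void pointer, the caller has to pass an object of exactly the datatype documented for each attribute ID. The sketch below shows that contract with a mock getter. It is an assumption-laden illustration: `mock_attrs` and `mock_attr_get()` are hypothetical, the enum is a local copy of the one proposed above, and the real rte_regex_dev_attr_get() takes a dev_id rather than a pointer to stored values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the proposed attribute identifiers. */
enum rte_regex_dev_attr_id {
	RTE_REGEX_DEV_ATTR_SOCKET_ID,		/* datatype: int */
	RTE_REGEX_DEV_ATTR_MAX_MATCHES,		/* datatype: uint8_t */
	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,	/* datatype: uint16_t */
	RTE_REGEX_DEV_ATTR_MAX_PREFIX		/* datatype: uint16_t */
};

/* Hypothetical per-device attribute storage used by the mock. */
struct mock_attrs {
	int socket_id;
	uint8_t max_matches;
	uint16_t max_scan_timeout;
	uint16_t max_prefix;
};

/*
 * Mock getter: writes through attr_value using the width documented for
 * each attribute, so the caller must supply a matching object.
 */
static int
mock_attr_get(const struct mock_attrs *a, enum rte_regex_dev_attr_id attr_id,
	      void *attr_value)
{
	assert(a != NULL);	/* mock-only precondition */
	if (attr_value == NULL)
		return -EINVAL;
	switch (attr_id) {
	case RTE_REGEX_DEV_ATTR_SOCKET_ID:
		*(int *)attr_value = a->socket_id;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_MATCHES:
		*(uint8_t *)attr_value = a->max_matches;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT:
		*(uint16_t *)attr_value = a->max_scan_timeout;
		return 0;
	case RTE_REGEX_DEV_ATTR_MAX_PREFIX:
		*(uint16_t *)attr_value = a->max_prefix;
		return 0;
	}
	return -EINVAL;	/* unknown attr_id */
}
```

Passing a uint8_t for a uint16_t attribute would overflow the caller's object, which may be an argument for typed accessors or a length parameter in the final API.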
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation */
+enum rte_regex_rule_op {
+    RTE_REGEX_RULE_OP_ADD,
+    /**< Add RegEx rule to rule database */
+    RTE_REGEX_RULE_OP_REMOVE
+    /**< Remove RegEx rule from rule database */
+};
+
+/** Structure to hold a RegEx rule attributes */
+struct rte_regex_rule {
+    enum rte_regex_rule_op op;
+    /**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
+     * RTE_REGEX_RULE_OP_REMOVE.
+     */
+    uint16_t group_id;
+    /**< Group identifier to which the rule belongs to. */
+    uint32_t rule_id;
+    /**< Rule identifier which is returned on successful match. */
+    const char *pcre_rule;
+    /**< Buffer to hold the PCRE rule. */
+    uint16_t pcre_rule_len;
+    /**< Length of the PCRE rule. */
+    uint64_t rule_flags;
+    /**< PCRE rule flags. The supported device specific PCRE rule flags are
+     * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+     * rule database update, the application needs to provide only
+     * supported rule flags.
+     * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+     */
+};
+
+/**
+ * Update the rule database of a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rule attributes to be updated in the rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
+ */
+uint16_t
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+             uint16_t nb_rules);

I think the function name is not very informative. If this function is meant to compile the rules, that should be explicit in the function name.

+
+/**
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id RegEx device identifier
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+             uint32_t rule_db_len);
+
+/**
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id RegEx device identifier
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+    uint16_t id;
+    /**< xstat identifier */
+    char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+    /**< xstat name */
+};
+
+/**
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+                   struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param[out] values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested
+ * @return
+ *   - positive value: number of stat entries filled into the values array
+ *   - negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+             uint64_t values[], uint16_t n);
+
+/**
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param name
+ *   The stat name to retrieve
+ * @param[out] id
+ *   If non-NULL, the numerical id of the stat will be returned, so that
+ *   further requests for the stat can be made using
+ *   rte_regex_dev_xstats_get(), which will be faster as it doesn't need to
+ *   scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+                 uint16_t *id, uint64_t *value);
+
+/**
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is
+ *   NULL.
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+               uint16_t nb_ids);
+
+/**
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device
+ * @return
+ *   - 0: Selftest successful
+ *   - -ENOTSUP if the device doesn't support selftest
+ *   - other values < 0 on failure.
+ */
+int rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param f
+ *   A pointer to a file for output
+ *
+ * @return
+ *   - 0: on success
+ *   - <0: on failure.
+ */
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match
+ * attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+    RTE_STD_C11
+    union {
+        uint64_t u64;
+        struct {
+            uint32_t rule_id:20;
+            /**< Rule identifier to which the pattern matched.
+             * @see struct rte_regex_rule::rule_id
+             */
+            uint32_t group_id:12;
+            /**< Group identifier of the rule which the pattern
+             * matched. @see struct rte_regex_rule::group_id
+             */
+            uint16_t offset;
+            /**< Starting Byte Position for matched rule. */
+            uint16_t len;
+            /**< Length of match in bytes */
+        };
+    };
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan
+ * to reduce the post-processing required by the application. The match with
+ * the lowest rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix
+ * length while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+    /* W0 */
+    uint16_t req_flags;
+    /**< Request flags for the RegEx ops.
+     * @see RTE_REGEX_OPS_REQ_*
+     */
+    uint16_t scan_size;
+    /**< Scan size of the buffer to be scanned in bytes. */
+    uint16_t rsp_flags;
+    /**< Response flags for the RegEx ops.
+     * @see RTE_REGEX_OPS_RSP_*
+     */
+    uint8_t nb_actual_matches;
+    /**< The total number of actual matches detected by the RegEx device. */
+    uint8_t nb_matches;
+    /**< The total number of matches returned by the RegEx device for this
+     * scan. The size of the *rte_regex_ops::matches* zero length array
+     * will be this value.
+     *
+     * @see struct rte_regex_ops::matches, struct rte_regex_match
+     */
+
+    /* W1 */
+    RTE_STD_C11
+    union {
+        uint64_t u64;
+        /**<  Allow 8-byte reserved on 32-bit system */
+        void *buf_addr;
+        /**< Virtual address of the pattern to be matched. */
+    };
+
+    /* W2 */
+    rte_iova_t buf_iova;
+    /**< IOVA address of the pattern to be matched. */
+
+    /* W3 */
+    uint16_t group_id0;
+    /**< First group_id to match the rule against. At least one group id
+     * must be provided by the application.
+     * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1 is
+     * valid; similar flags apply to group_id2 and group_id3.
+     * Upon a match, struct rte_regex_match::group_id shall be updated
+     * with the matching group ID by the device. The group ID scheme
+     * provides rule isolation and effective pattern matching.
+     */
+    uint16_t group_id1;
+    /**< Second group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+     */
+    uint16_t group_id2;
+    /**< Third group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+     */
+    uint16_t group_id3;
+    /**< Fourth group_id to match the rule against.
+     *
+     * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+     */
+
+    /* W4 */
+    RTE_STD_C11
+    union {
+        uint64_t user_id;
+        /**< Application specific opaque value. An application may use
+         * this field to hold an application specific value to share
+         * between the enqueue and dequeue operations.
+         * The implementation should not modify this field.
+         */
+        void *user_ptr;
+        /**< Pointer representation of *user_id* */
+    };

Since the regex subsystem targets both regex and DPI, I think it would be good to add another uint64_t field called connection_id.
A device that supports DPI can treat it as another matchable field when looking up matches in the given buffer.

This field differs from user_id in that it is not opaque to the device.

+
+    /* W5 */
+    struct rte_regex_match matches[];
+    /**< Zero length array to hold the match tuples.
+     * The struct rte_regex_ops::nb_matches value holds the number of
+     * elements in this array.
+     *
+     * @see struct rte_regex_ops::nb_matches
+     */
+};
+
+/**
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops*
+ *   structures which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The
+ *   return value can be less than the value of the *nb_ops* parameter when
+ *   the regex device's queue is full or if invalid parameters are specified
+ *   in a *rte_regex_ops*. If the return value is less than *nb_ops*, the
+ *   remaining ops at the end of *ops* are not consumed and the caller has
+ *   to take care of them.
+ */
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+            struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data
+ * structures effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible"
+ * policy can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that
+ *   must be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take
+ *   care of them.
+ */
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+            struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-20  1:54       ` Wang, Xiang W
@ 2019-09-10  8:05         ` Jerin Jacob Kollanukkaran
  2019-09-19 13:58           ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-10  8:05 UTC (permalink / raw)
  To: Wang, Xiang W, Thomas Monjalon, dev
  Cc: Pavan Nikhilesh Bhagavatula, Shahaf Shuler, Hemant Agrawal,
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	Nipun Gupta, Richardson, Bruce, Hong, Yang A, Chang, Harry,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim, Ni,
	Hongjun, j.bromhead, deri, fc, arthur.su, Guy Kaneti,
	Smadar Fuks, Liron Himi

Hi Xiang,

Sorry for the delay in responding (I was busy with the 19.11 proposal deadline). Please see inline.
 
> 
> Reply to Xiang's queries in main thread:
> 
> Hi all,
> 
> Some questions regarding APIs. Could you please give more insights?
> 
> 1) rte_regex_ops
>       a) rsp_flags
>       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
>       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match
> at the end of current buffer after scan.
>       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> 
> [Jerin] Since we need three states to represent partial match buffer,
> RTE_REGEX_OPS_RSP_PMI_SOJ_F to
> represent start of the buffer, intermediate buffers with no flag, and end of
> the buffer with RTE_REGEX_OPS_RSP_PMI_EOJ

> [Xiang] How could a user leverage these flags for matching? Suppose a large
> buffer is divided into multiple chunks. Will RTE_REGEX_OPS_RSP_PMI_SOJ_F
> cause an early quit once it isn't set after scan the first chunk. Similarly,
> RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching future
> buffers after finish the last chunk?

Let me describe with an example.

Assume:
1) struct rte_regex_dev_info::max_payload_size is set to 1024
2) rte_regex_dev_config::dev_cfg_flags is configured with RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
3) The device is programmed to match the pattern "hello\s+world"
4) The user enqueues an op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 1024

data[0..1021] = data that does not contain the hello world pattern
data[1022] = 'h'
data[1023] = 'e'

5) The user enqueues a second op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 9

data[0] = 'l'
data[1] = 'l'
data[2] = 'o'
data[3] = ' '
data[4] = 'w'
data[5] = 'o'
data[6] = 'r'
data[7] = 'l'
data[8] = 'd'

If so,

The response to 4) will have RTE_REGEX_OPS_RSP_PMI_SOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset 1022 and len 2.

The response to 5) will have RTE_REGEX_OPS_RSP_PMI_EOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset 0 and len 9.
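
To make the flag usage concrete, here is a minimal application-side sketch of stitching a cross-buffer match from consecutive responses. It is not part of the RFC: struct xbuf_state and xbuf_feed() are hypothetical names, the two flag macros are local stand-ins that mirror the bit values in the proposed header, and it assumes that for an intermediate buffer (no flag set) the reported len is the number of bytes in that buffer that continue the match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the response flag bits proposed in the RFC. */
#define RSP_PMI_SOJ_F (1 << 0) /* a partial match starts in this buffer */
#define RSP_PMI_EOJ_F (1 << 1) /* a partial match ends in this buffer */

/* Hypothetical per-stream state kept by the application. */
struct xbuf_state {
	bool pending;       /* a partial match is in progress */
	uint32_t total_len; /* bytes matched so far across buffers */
};

/*
 * Feed the result of one dequeued op into the state. Returns true when a
 * cross-buffer match has completed; *match_len then holds its full length.
 */
static bool
xbuf_feed(struct xbuf_state *st, uint16_t rsp_flags, uint16_t len,
	  uint32_t *match_len)
{
	assert(st != NULL && match_len != NULL);

	if (rsp_flags & RSP_PMI_SOJ_F) {
		/* First buffer: remember the bytes matched at its tail. */
		st->pending = true;
		st->total_len = len;
		return false;
	}
	if (!st->pending)
		return false; /* nothing in progress, nothing to stitch */

	st->total_len += len;
	if (rsp_flags & RSP_PMI_EOJ_F) {
		/* Last buffer: the match is complete. */
		st->pending = false;
		*match_len = st->total_len;
		return true;
	}
	return false; /* intermediate buffer, keep accumulating */
}
```

Feeding the two responses from the example (SOJ with len 2, then EOJ with len 9) yields a completed match of 11 bytes, which is the length of "hello world".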


> 
>       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a
> specific hardware implementation. I am wondering what this PREFIX refers
> to:)?
> 
> [Jerin] Yes. Looks like it is for hardware specific implementation. Introduced
> rte_regex_dev_attr_set/get functions to make it portable and
> To add new implementation specific fields.
> For example, if a rule is
> /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the
> factor. The prefix is a literal
> string, while the factor can contain complex regular expression constructs. As
> a result, rule matching occurs in
> two stages: prefix matching and factor matching.
> 
>       b)  user_id or user_ptr
>       Under what kind of circumstances should an application pass value into
> these variables for enqueue and dequeuer operations?
> 
> [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also allocated using
> mempool normally, on enqueue, user can specify user_id
> If needed to in order identify the op on dequeue if required. The use case
> could be to store the sequence number from application
> POV or storing the mbuf ptr in which pattern is requested etc.
> 
> 
>  2) rte_regex_match
>       a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t
> len; /**< Length of match in bytes */
>       Looks like the matching offset is defined as *starting matching offset*
> instead of *end matching offset*, e.g. report the offset of "a" instead of "c"
> for pattern "abc".
>       If so, this makes it hard to integrate software regex libraries such as
> Hyperscan and RE2 as they only report *end matching offset* without length
> of match.
>       Although Hyperscan has API for *starting matching offset*, it only delivers
> partial syntax support. So I think we have to define *end of matching offset*
> for software solutions.
> 
> [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs. I
> thought application would need always the length of the match.
> Probably we will see how other HW implementation (from Mellanox) etc. We
> will try to abstract it, probably we can make it as function of "user
> requested".
> [Xiang] Yes, it will be good to make it per user request. At least from
> Hyperscan user's point of view, start of match and match length are not
> mandatory.

OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START
In device configure.

Since offset+len == end, we can introduce following generic inline function.

static inline uint32_t
rte_regex_match_end(const struct rte_regex_match *match)
{
	return (uint32_t)match->offset + match->len;
}

Example: the pattern to match is "hello\s+world" and the data is as follows:
data[4] = 'h'
data[5] = 'e'
data[6] = 'l'
data[7] = 'l'
data[8] = 'o'
data[9] = ' '
data[10] = 'w'
data[11] = 'o'
data[12] = 'r'
data[13] = 'l'
data[14] = 'd'

if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
match->offset returns 4
match->len returns 11

if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
driver MAY return the following(in hyperscan case)
match->offset returns 0
match->len returns 11 + 4

In both cases (irrespective of flags, to make the application's life easy) rte_regex_match_end() would return 15.
If the application asks for MATCH_AS_START, the driver returns match->offset 4 and match->len 11
(i.e. it sets HS_FLAG_SOM_LEFTMOST in the hyperscan driver). The application should still use rte_regex_match_end()
to find the end of the match, so that it works in all cases.
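
As a quick check of the arithmetic, the following standalone sketch reproduces both conventions from the example. struct match and match_end() are reduced local stand-ins (only the offset and len fields of the proposed rte_regex_match are modelled), so treat this as an illustration rather than the final API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for struct rte_regex_match: only the two fields used. */
struct match {
	uint16_t offset; /* start of match, or 0 when start is not tracked */
	uint16_t len;    /* length in bytes, counted from offset */
};

/*
 * End offset of the match (one past the last matched byte). The result is
 * widened to 32 bits so that offset + len cannot wrap at 16 bits.
 */
static inline uint32_t
match_end(const struct match *m)
{
	assert(m != NULL);
	return (uint32_t)m->offset + m->len;
}
```

With offset 4 and len 11 (MATCH_AS_START configured) and with offset 0 and len 15 (not configured), match_end() gives the same value, 15, so an application that only needs the end of the match does not have to care which mode the driver runs in.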

Is it OK? 

> 
> 3)  rte_regex_rule_db_update()
>     Does this mean we can dynamically add or delete rules for an already
> generated database without recompile from scratch for hardware Regex
> implementation?
>     If so, this isn't possible for software solutions as they don't support
> dynamic database update and require recompile.
> 
> [Jerin] rte_regex_rule_db_update() internally it would call recompile
> function for both HW and SW.
> See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> precompiled rule database case.
> [Xiang] OK, sounds like we have to save the original rule-set for the device in
> order to do recompile. I see both ADD and REMOVE operators from
> rte_regex_rule.
> For rules with REMOVE operator, what's the expected behavior to handle
> them for the old rule-set? Do we need to go through the old rule-set and
> remove corresponding rules before doing recompile?

Yes.
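
A rough sketch of what that could look like in a software driver that keeps a shadow copy of the rule-set: apply the ADD/REMOVE ops to the shadow copy, then recompile from it. Everything here is hypothetical (struct shadow_db, shadow_find(), shadow_update(), the fixed capacity), and the choices that an ADD of an existing (group_id, rule_id) replaces the stored rule and that a REMOVE of an unknown rule stops processing are assumptions, not something the RFC specifies. The return value follows the rte_regex_rule_db_update() convention of reporting how many ops were consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rule_op { RULE_OP_ADD, RULE_OP_REMOVE };

/* Reduced stand-in for struct rte_regex_rule. */
struct rule {
	enum rule_op op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
};

#define SHADOW_MAX 8 /* arbitrary capacity for the sketch */

/* Driver-side copy of the uncompiled rule-set. */
struct shadow_db {
	struct rule rules[SHADOW_MAX];
	uint16_t nb_rules;
};

/* Find a stored rule by (group_id, rule_id); returns its index or -1. */
static int
shadow_find(const struct shadow_db *db, uint16_t group_id, uint32_t rule_id)
{
	for (uint16_t i = 0; i < db->nb_rules; i++)
		if (db->rules[i].group_id == group_id &&
		    db->rules[i].rule_id == rule_id)
			return i;
	return -1;
}

/*
 * Apply ADD/REMOVE ops to the shadow set and return the number of ops
 * consumed. Processing stops early when the set is full or when a rule to
 * remove is not present. A real driver would recompile afterwards.
 */
static uint16_t
shadow_update(struct shadow_db *db, const struct rule *ops, uint16_t nb_ops)
{
	uint16_t i;

	assert(db != NULL && ops != NULL);
	for (i = 0; i < nb_ops; i++) {
		int idx = shadow_find(db, ops[i].group_id, ops[i].rule_id);

		if (ops[i].op == RULE_OP_ADD) {
			if (idx >= 0) {
				db->rules[idx] = ops[i]; /* replace in place */
				continue;
			}
			if (db->nb_rules == SHADOW_MAX)
				break; /* no space left */
			db->rules[db->nb_rules++] = ops[i];
		} else {
			if (idx < 0)
				break; /* nothing to remove */
			/* Order is irrelevant: move the last rule into the gap. */
			db->rules[idx] = db->rules[--db->nb_rules];
		}
	}
	return i;
}
```

Keeping the uncompiled rules around costs memory, but it lets the driver service REMOVE without asking the application to resubmit the whole rule-set.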


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-21  5:32     ` Shahaf Shuler
  2019-08-21 15:12       ` John Bromhead
@ 2019-09-10 10:31       ` Jerin Jacob Kollanukkaran
  2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
  2 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-10 10:31 UTC (permalink / raw)
  To: Shahaf Shuler, Thomas Monjalon, dev
  Cc: Pavan Nikhilesh Bhagavatula, Hemant Agrawal, Opher Reviv,
	Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor, Nipun Gupta, Wang,
	Xiang W, Richardson, Bruce, yang.a.hong, harry.chang, gu.jian1,
	shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su

> Hi Jerin,
> 
> Thursday, August 15, 2019 2:34 PM, Thomas Monjalon:
> > Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > +Cc more
> >
> > ------------
> >
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Even though there are some vendors which offer Regex HW offload, due to
> > lack of standard API, it is difficult for DPDK consumer to use them
> > in a portable way.
> >
> > This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> >
> > The Doxygen generated RFC API documentation available here:
> > https://dreamy-noether-22777e.netlify.com/rte__regexdev_8h.html
> >
> > This RFC crafted based on SW Regex API frameworks such as libpcre and
> > hyperscan and a few of the RegEx HW IPs which I am aware of.
> >
> > RegEx pattern matching applications:
> > • Next Generation Firewalls (NGFW)
> > • Deep Packet and Flow Inspection (DPI)
> > • Intrusion Prevention Systems (IPS)
> > • DDoS Mitigation
> > • Network Monitoring
> > • Data Loss Prevention (DLP)
> > • Smart NICs
> > • Grammar based content processing
> > • URL, spam and adware filtering
> > • Advanced auditing and policing of user/application security policies
> > • Financial data mining - parsing of streamed financial feeds
> 
> I think two more important use cases to add (at least in the doc of this
> subsystem) are:
> * application recognition
> * memory introspection
> 
> 
> >
> > Request to review from HW and SW RegEx vendors and RegEx application
> > users
> > to have portable DPDK API for RegEx.
> >
> > The API schematics are based cryptodev, eventdev and ethdev existing
> > device API.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > ---
> >
> > RTE RegEx Device API
> > --------------------
> >
> > Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> >
> > The RegEx Device API is composed of two parts:
> >
> > - The application-oriented RegEx API that includes functions to setup
> > a RegEx device (configure it, setup its queue pairs and start it),
> > update the rule database and so on.
> >
> > - The driver-oriented RegEx API that exports a function allowing
> > a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> > a RegEx device driver.
> >
> > RegEx device components and definitions:
> >
> >     +-----------------+
> >     |                 |
> >     |                 o---------+    rte_regex_[en|de]queue_burst()
> >     |   PCRE based    o------+  |               |
> >     |  RegEx pattern  |      |  |  +--------+   |
> >     | matching engine o------+--+--o        |   |    +------+
> >     |                 |      |  |  | queue  |<==o===>|Core 0|
> >     |                 o----+ |  |  | pair 0 |        |      |
> >     |                 |    | |  |  +--------+        +------+
> >     +-----------------+    | |  |
> >            ^               | |  |  +--------+
> >            |               | |  |  |        |        +------+
> >            |               | +--+--o queue  |<======>|Core 1|
> >        Rule|Database       |    |  | pair 1 |        |      |
> >     +------+----------+    |    |  +--------+        +------+
> >     |     Group 0     |    |    |
> >     | +-------------+ |    |    |  +--------+        +------+
> >     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> >     | +-------------+ |    |    +--o queue  |<======>|      |
> >     |     Group 1     |    |       | pair 2 |        +------+
> >     | +-------------+ |    |       +--------+
> >     | | Rules 0..n  | |    |
> >     | +-------------+ |    |       +--------+
> >     |     Group 2     |    |       |        |        +------+
> >     | +-------------+ |    |       | queue  |<======>|Core n|
> >     | | Rules 0..n  | |    +-------o pair n |        |      |
> >     | +-------------+ |            +--------+        +------+
> >     |     Group n     |
> >     | +-------------+ |<-------rte_regex_rule_db_update()
> >     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> >     | +-------------+ |------->rte_regex_rule_db_export()
> >     +-----------------+
> >
> > RegEx: A regular expression is a concise and flexible means for matching
> > strings of text, such as particular characters, words, or patterns of
> > characters. A common abbreviation for this is “RegEx”.
> >
> > RegEx device: A hardware or software-based implementation of RegEx
> > device API for PCRE based pattern matching syntax and semantics.
> >
> > PCRE RegEx syntax and semantics specification:
> > http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> >
> > RegEx queue pair: Each RegEx device should have one or more queue pairs to
> > transmit a burst of pattern matching requests and receive a burst of
> > pattern matching responses. The pattern matching request/response is
> > embedded in the *rte_regex_ops* structure.
> >
> > Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> > Match ID and Group ID to identify the rule upon the match.
> >
> > Rule database: The RegEx device accepts regular expressions and converts
> > them into a compiled rule database that can then be used to scan data.
> > Compilation allows the device to analyze the given pattern(s) and
> > pre-determine how to scan for these patterns in an optimized fashion that
> > would be far too expensive to compute at run-time. A rule database contains
> > a set of rules that are compiled in device specific binary form.
> >
> > Match ID or Rule ID: A unique identifier provided at the time of rule
> > creation for the application to identify the rule upon match.
> >
> > Group ID: Group of rules can be grouped under one group ID to enable
> > rule isolation and effective pattern matching. A unique group identifier
> > provided at the time of rule creation for the application to identify the
> > rule upon match.
> >
> > Scan: A pattern matching request through *enqueue* API.
> >
> > It is possible that a given RegEx device does not support all the features
> > of PCRE. The application may probe unsupported features through
> > struct rte_regex_dev_info::pcre_unsup_flags
> >
> > By default, all the functions of the RegEx Device API exported by a PMD
> > are lock-free functions which are assumed not to be invoked in parallel on
> > different logical cores to work on the same target object. For instance,
> > the dequeue function of a PMD cannot be invoked in parallel on two logical
> > cores to operate on the same RegEx queue pair. Of course, this function
> > can be invoked in parallel by different logical cores on different queue pairs.
> > It is the responsibility of the upper level application to enforce this rule.
> >
> > In all functions of the RegEx API, the RegEx device is
> > designated by an integer >= 0 named the device identifier *dev_id*
> >
> > At the RegEx driver level, RegEx devices are represented by a generic
> > data structure of type *rte_regex_dev*.
> >
> > RegEx devices are dynamically registered during the PCI/SoC device probing
> > phase performed at EAL initialization time.
> > When a RegEx device is being probed, a *rte_regex_dev* structure and
> > a new device identifier are allocated for that device. Then, the
> > regex_dev_init() function supplied by the RegEx driver matching the probed
> > device is invoked to properly initialize the device.
> >
> > The role of the device init function consists of resetting the hardware or
> > software RegEx driver implementations.
> >
> > If the device init operation is successful, the correspondence between
> > the device identifier assigned to the new device and its associated
> > *rte_regex_dev* structure is effectively registered.
> > Otherwise, both the *rte_regex_dev* structure and the device identifier are
> > freed.
> >
> > The functions exported by the application RegEx API to setup a device
> > designated by its device identifier must be invoked in the following order:
> > - rte_regex_dev_configure()
> > - rte_regex_queue_pair_setup()
> > - rte_regex_dev_start()
> >
> > Then, the application can invoke, in any order, the functions
> > exported by the RegEx API to enqueue pattern matching jobs, dequeue
> > pattern matching responses, get the stats, update the rule database,
> > get/set device attributes and so on.
> >
> > If the application wants to change the configuration (i.e. call
> > rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> > rte_regex_dev_stop() first to stop the device and then do the
> > reconfiguration before calling rte_regex_dev_start() again. The enqueue
> > and dequeue functions should not be invoked when the device is stopped.
> >
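The ordering rules above can be sketched as a small state machine. This
is only an illustration: the state names, the sketch_* functions and the
error codes are my assumptions, and so is the choice that a reconfigure
requires the queue pairs to be set up again. The RFC does not specify
any of these.

```c
#include <errno.h>

/* Hypothetical device states; the RFC does not define these names. */
enum sketch_state {
	SKETCH_UNCONFIGURED,
	SKETCH_CONFIGURED,
	SKETCH_QP_READY,
	SKETCH_STARTED,
};

struct sketch_dev {
	enum sketch_state state;
};

/* Configuration is only legal while the device is stopped. */
static int
sketch_configure(struct sketch_dev *d)
{
	if (d->state == SKETCH_STARTED)
		return -EBUSY;
	d->state = SKETCH_CONFIGURED;
	return 0;
}

/* Queue pairs can only be set up on a configured, stopped device. */
static int
sketch_queue_pair_setup(struct sketch_dev *d)
{
	if (d->state == SKETCH_STARTED)
		return -EBUSY;
	if (d->state == SKETCH_UNCONFIGURED)
		return -EINVAL;
	d->state = SKETCH_QP_READY;
	return 0;
}

/* Start requires configure and queue pair setup to have succeeded. */
static int
sketch_start(struct sketch_dev *d)
{
	if (d->state != SKETCH_QP_READY)
		return -EINVAL;
	d->state = SKETCH_STARTED;
	return 0;
}

/* Stop returns the device to a state where it may be reconfigured. */
static void
sketch_stop(struct sketch_dev *d)
{
	if (d->state == SKETCH_STARTED)
		d->state = SKETCH_QP_READY;
}
```
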
> > Finally, an application can close a RegEx device by invoking the
> > rte_regex_dev_close() function.
> >
> > Each function of the application RegEx API invokes a specific function
> > of the PMD that controls the target device designated by its device
> > identifier.
> >
> > For this purpose, all device-specific functions of a RegEx driver are
> > supplied through a set of pointers contained in a generic structure of type
> > *regex_dev_ops*.
> > The address of the *regex_dev_ops* structure is stored in the
> > *rte_regex_dev* structure by the device init function of the RegEx driver,
> > which is invoked during the PCI/SoC device probing phase, as explained
> > earlier.
> >
> > In other words, each function of the RegEx API simply retrieves the
> > *rte_regex_dev* structure associated with the device identifier and
> > performs an indirect invocation of the corresponding driver function
> > supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> > structure.
> >
> > For performance reasons, the addresses of the fast-path functions of the
> > RegEx driver are not contained in the *regex_dev_ops* structure.
> > Instead, they are stored directly at the beginning of the *rte_regex_dev*
> > structure to avoid an extra indirect memory access during their invocation.
> >
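To make the two dispatch paths concrete, here is a rough sketch. None of
the type or function names below come from the patch (the real
*rte_regex_dev* and *regex_dev_ops* layouts are not shown in this RFC);
they are hypothetical and only illustrate the idea of keeping fast-path
pointers in the device structure and slow-path callbacks in a separate
ops table:

```c
#include <stdint.h>

#define SKETCH_MAX_DEVS 4

struct sketch_regex_dev;

/* Slow-path callbacks, reached through one extra pointer dereference. */
struct sketch_dev_ops {
	int (*dev_start)(struct sketch_regex_dev *dev);
};

typedef uint16_t (*sketch_burst_t)(struct sketch_regex_dev *dev,
				   uint16_t qp_id, uint16_t nb_ops);

struct sketch_regex_dev {
	/* Fast-path pointers are stored first, directly in the device. */
	sketch_burst_t enqueue;
	sketch_burst_t dequeue;
	const struct sketch_dev_ops *dev_ops;
	int started;
};

static struct sketch_regex_dev sketch_devs[SKETCH_MAX_DEVS];

/* Control path: dev_id -> device -> ops table -> driver function. */
static int
sketch_dev_start(uint8_t dev_id)
{
	struct sketch_regex_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS)
		return -1;
	dev = &sketch_devs[dev_id];
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: dev_id -> device -> driver function, no ops table. */
static uint16_t
sketch_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct sketch_regex_dev *dev = &sketch_devs[dev_id];

	return dev->enqueue(dev, qp_id, nb_ops);
}

/* Toy driver used only to exercise the dispatch above. */
static int
toy_start(struct sketch_regex_dev *dev)
{
	dev->started = 1;
	return 0;
}

static uint16_t
toy_enqueue(struct sketch_regex_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0;
}

static const struct sketch_dev_ops toy_ops = { .dev_start = toy_start };
```

The fast-path wrapper skips the bounds check on purpose, mirroring the
usual poll-mode trade-off of leaving validation to the caller.
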
> > RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> > operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> > functions to applications.
> >
> > The *enqueue* operation submits a burst of RegEx pattern matching
> > requests to the RegEx device and the *dequeue* operation gets a burst of
> > pattern matching responses for the ones submitted through the *enqueue*
> > operation.
> >
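The burst semantics might look roughly like the following toy model. The
toy_* names, the ring size and the function signatures are all
assumptions of mine (the real rte_regex_enqueue_burst() and
rte_regex_dequeue_burst() prototypes are not part of this excerpt), and
the "device" here completes every job instantly:

```c
#include <stdint.h>

#define TOY_RING_SIZE 8 /* power of two, so masking wraps the index */

/* Hypothetical stand-in for *rte_regex_ops*: only a job identifier. */
struct toy_op {
	uint32_t job_id;
};

/* Toy queue pair: jobs complete instantly, so a single ring carries
 * requests in and responses out. */
struct toy_qp {
	struct toy_op *ring[TOY_RING_SIZE];
	uint32_t head; /* next slot to dequeue */
	uint32_t tail; /* next slot to enqueue */
};

/* Returns how many of nb_ops were accepted; may be fewer than asked. */
static uint16_t
toy_qp_enqueue_burst(struct toy_qp *qp, struct toy_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->tail - qp->head < TOY_RING_SIZE) {
		qp->ring[qp->tail & (TOY_RING_SIZE - 1)] = ops[n];
		qp->tail++;
		n++;
	}
	return n;
}

/* Returns how many responses were written to ops; 0 means poll again. */
static uint16_t
toy_qp_dequeue_burst(struct toy_qp *qp, struct toy_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->head != qp->tail) {
		ops[n] = qp->ring[qp->head & (TOY_RING_SIZE - 1)];
		qp->head++;
		n++;
	}
	return n;
}
```

The point is that both calls return a count rather than blocking, so
the caller has to handle a partial enqueue and an empty dequeue.
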
> > A typical application uses the RegEx device API in the following
> > programming flow:
> >
> > - rte_regex_dev_configure()
> > - rte_regex_queue_pair_setup()
> > - rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
> >   database is not provided in rte_regex_dev_config::rule_db for
> >   rte_regex_dev_configure() and/or the application needs to update the
> >   rule database.
> > - Create or reuse an existing mempool for *rte_regex_ops* objects.
> > - rte_regex_dev_start()
> > - rte_regex_enqueue_burst()
> > - rte_regex_dequeue_burst()
> >
> > ---
> >
> > config/common_base                 |    5 +
> > doc/api/doxy-api-index.md          |    1 +
> > doc/api/doxy-api.conf.in           |    1 +
> > lib/Makefile                       |    2 +
> > lib/librte_regexdev/Makefile       |   23 +
> > lib/librte_regexdev/rte_regexdev.c |    5 +
> > lib/librte_regexdev/rte_regexdev.h | 1247 ++++++++++++++++++++++++++++
> > 7 files changed, 1284 insertions(+)
> > create mode 100644 lib/librte_regexdev/Makefile
> > create mode 100644 lib/librte_regexdev/rte_regexdev.c
> > create mode 100644 lib/librte_regexdev/rte_regexdev.h
> >
> > diff --git a/config/common_base b/config/common_base
> > index e406e7836..986093d6e 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -746,6 +746,11 @@ CONFIG_RTE_LIBRTE_PMD_DPAA2_QDMA_RAWDEV=n
> > #
> > CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
> >
> > +#
> > +# Compile regex device support
> > +#
> > +CONFIG_RTE_LIBRTE_REGEXDEV=y
> > +
> > #
> > # Compile librte_ring
> > #
> > diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> > index 715248dd1..a0bc27ae4 100644
> > --- a/doc/api/doxy-api-index.md
> > +++ b/doc/api/doxy-api-index.md
> > @@ -26,6 +26,7 @@ The public API headers are grouped by topics:
> > [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
> > [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
> > [rawdev]             (@ref rte_rawdev.h),
> > +  [regexdev]           (@ref rte_regexdev.h),
> > [metrics]            (@ref rte_metrics.h),
> > [bitrate]            (@ref rte_bitrate.h),
> > [latency]            (@ref rte_latencystats.h),
> > diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
> > index b9896cb63..7adb821bb 100644
> > --- a/doc/api/doxy-api.conf.in
> > +++ b/doc/api/doxy-api.conf.in
> > @@ -53,6 +53,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
> > @TOPDIR@/lib/librte_rawdev \
> > @TOPDIR@/lib/librte_rcu \
> > @TOPDIR@/lib/librte_reorder \
> > +                          @TOPDIR@/lib/librte_regexdev \
> > @TOPDIR@/lib/librte_ring \
> > @TOPDIR@/lib/librte_sched \
> > @TOPDIR@/lib/librte_security \
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 791e0d991..57de9691a 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
> > librte_mempool librte_timer librte_cryptodev
> > DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
> > DEPDIRS-librte_rawdev := librte_eal librte_ethdev
> > +DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
> > +DEPDIRS-librte_regexdev := librte_eal
> > DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> > DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
> > 			librte_net
> > diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
> > new file mode 100644
> > index 000000000..723b4b28c
> > --- /dev/null
> > +++ b/lib/librte_regexdev/Makefile
> > @@ -0,0 +1,23 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(C) 2019 Marvell International Ltd.
> > +#
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_regexdev.a
> > +
> > +# library version
> > +LIBABIVER := 1
> > +
> > +# build flags
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS)
> > +
> > +# library source files
> > +SRCS-y += rte_regexdev.c
> > +
> > +# export include files
> > +SYMLINK-y-include += rte_regexdev.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
> > new file mode 100644
> > index 000000000..e5be0f29c
> > --- /dev/null
> > +++ b/lib/librte_regexdev/rte_regexdev.c
> > @@ -0,0 +1,5 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2019 Marvell International Ltd.
> > + */
> > +
> > +#include <rte_regexdev.h>
> > diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> > new file mode 100644
> > index 000000000..765da4aaa
> > --- /dev/null
> > +++ b/lib/librte_regexdev/rte_regexdev.h
> > @@ -0,0 +1,1247 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2019 Marvell International Ltd.
> > + */
> > +
> > +#ifndef _RTE_REGEXDEV_H_
> > +#define _RTE_REGEXDEV_H_
> > +
> > +/**
> > + * @file
> > + *
> > + * RTE RegEx Device API
> > + *
> > + * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
> > + *
> > + * The RegEx Device API is composed of two parts:
> > + *
> > + * - The application-oriented RegEx API that includes functions to setup
> > + *   a RegEx device (configure it, setup its queue pairs and start it),
> > + *   update the rule database and so on.
> > + *
> > + * - The driver-oriented RegEx API that exports a function allowing
> > + *   a RegEx poll Mode Driver (PMD) to simultaneously register itself as
> > + *   a RegEx device driver.
> > + *
> > + * RegEx device components and definitions:
> > + *
> > + *     +-----------------+
> > + *     |                 |
> > + *     |                 o---------+    rte_regex_[en|de]queue_burst()
> > + *     |   PCRE based    o------+  |               |
> > + *     |  RegEx pattern  |      |  |  +--------+   |
> > + *     | matching engine o------+--+--o        |   |    +------+
> > + *     |                 |      |  |  | queue  |<==o===>|Core 0|
> > + *     |                 o----+ |  |  | pair 0 |        |      |
> > + *     |                 |    | |  |  +--------+        +------+
> > + *     +-----------------+    | |  |
> > + *            ^               | |  |  +--------+
> > + *            |               | |  |  |        |        +------+
> > + *            |               | +--+--o queue  |<======>|Core 1|
> > + *        Rule|Database       |    |  | pair 1 |        |      |
> > + *     +------+----------+    |    |  +--------+        +------+
> > + *     |     Group 0     |    |    |
> > + *     | +-------------+ |    |    |  +--------+        +------+
> > + *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
> > + *     | +-------------+ |    |    +--o queue  |<======>|      |
> > + *     |     Group 1     |    |       | pair 2 |        +------+
> > + *     | +-------------+ |    |       +--------+
> > + *     | | Rules 0..n  | |    |
> > + *     | +-------------+ |    |       +--------+
> > + *     |     Group 2     |    |       |        |        +------+
> > + *     | +-------------+ |    |       | queue  |<======>|Core n|
> > + *     | | Rules 0..n  | |    +-------o pair n |        |      |
> > + *     | +-------------+ |            +--------+        +------+
> > + *     |     Group n     |
> > + *     | +-------------+ |<-------rte_regex_rule_db_update()
> > + *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
> > + *     | +-------------+ |------->rte_regex_rule_db_export()
> > + *     +-----------------+
> > + *
> > + * RegEx: A regular expression is a concise and flexible means for matching
> > + * strings of text, such as particular characters, words, or patterns of
> > + * characters. A common abbreviation for this is “RegEx”.
> > + *
> > + * RegEx device: A hardware or software-based implementation of RegEx
> > + * device API for PCRE based pattern matching syntax and semantics.
> > + *
> > + * PCRE RegEx syntax and semantics specification:
> > + *
> > + * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
> > + *
> > + * RegEx queue pair: Each RegEx device should have one or more queue pairs
> > + * to transmit a burst of pattern matching requests and receive a burst of
> > + * pattern matching responses. The pattern matching request/response is
> > + * embedded in the *rte_regex_ops* structure.
> > + *
> > + * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
> > + * Match ID and Group ID to identify the rule upon the match.
> > + *
> > + * Rule database: The RegEx device accepts regular expressions and converts
> > + * them into a compiled rule database that can then be used to scan data.
> > + * Compilation allows the device to analyze the given pattern(s) and
> > + * pre-determine how to scan for these patterns in an optimized fashion that
> > + * would be far too expensive to compute at run-time. A rule database
> > + * contains a set of rules that are compiled in a device-specific binary form.
> > + *
> > + * Match ID or Rule ID: A unique identifier provided at the time of rule
> > + * creation for the application to identify the rule upon match.
> > + *
> > + * Group ID: A set of rules can be grouped under one group ID to enable
> > + * rule isolation and effective pattern matching. A unique group identifier
> > + * is provided at the time of rule creation for the application to identify
> > + * the rule upon match.
> > + *
> > + * Scan: A pattern matching request through *enqueue* API.
> > + *
> > + * A given RegEx device may not support all the features of PCRE.
> > + * The application may probe unsupported features through
> > + * struct rte_regex_dev_info::pcre_unsup_flags.
> > + *
> > + * By default, all the functions of the RegEx Device API exported by a PMD
> > + * are lock-free functions which are assumed not to be invoked in parallel on
> > + * different logical cores to work on the same target object. For instance,
> > + * the dequeue function of a PMD cannot be invoked in parallel on two logical
> > + * cores to operate on the same RegEx queue pair. Of course, this function
> > + * can be invoked in parallel by different logical cores on different queue
> > + * pairs. It is the responsibility of the upper level application to enforce
> > + * this rule.
> > + *
> > + * In all functions of the RegEx API, the RegEx device is
> > + * designated by an integer >= 0 named the device identifier *dev_id*.
> > + *
> > + * At the RegEx driver level, RegEx devices are represented by a generic
> > + * data structure of type *rte_regex_dev*.
> > + *
> > + * RegEx devices are dynamically registered during the PCI/SoC device probing
> > + * phase performed at EAL initialization time.
> > + * When a RegEx device is being probed, a *rte_regex_dev* structure and
> > + * a new device identifier are allocated for that device. Then, the
> > + * regex_dev_init() function supplied by the RegEx driver matching the probed
> > + * device is invoked to properly initialize the device.
> > + *
> > + * The role of the device init function consists of resetting the hardware or
> > + * software RegEx driver implementations.
> > + *
> > + * If the device init operation is successful, the correspondence between
> > + * the device identifier assigned to the new device and its associated
> > + * *rte_regex_dev* structure is effectively registered.
> > + * Otherwise, both the *rte_regex_dev* structure and the device identifier
> > + * are freed.
> > + *
> > + * The functions exported by the application RegEx API to setup a device
> > + * designated by its device identifier must be invoked in the following order:
> > + *     - rte_regex_dev_configure()
> > + *     - rte_regex_queue_pair_setup()
> > + *     - rte_regex_dev_start()
> > + *
> > + * Then, the application can invoke, in any order, the functions
> > + * exported by the RegEx API to enqueue pattern matching jobs, dequeue
> > + * pattern matching responses, get the stats, update the rule database,
> > + * get/set device attributes and so on.
> > + *
> > + * If the application wants to change the configuration (i.e. call
> > + * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
> > + * rte_regex_dev_stop() first to stop the device and then do the
> > + * reconfiguration before calling rte_regex_dev_start() again. The enqueue
> > + * and dequeue functions should not be invoked when the device is stopped.
> > + *
> > + * Finally, an application can close a RegEx device by invoking the
> > + * rte_regex_dev_close() function.
> > + *
> > + * Each function of the application RegEx API invokes a specific function
> > + * of the PMD that controls the target device designated by its device
> > + * identifier.
> > + *
> > + * For this purpose, all device-specific functions of a RegEx driver are
> > + * supplied through a set of pointers contained in a generic structure of type
> > + * *regex_dev_ops*.
> > + * The address of the *regex_dev_ops* structure is stored in the
> > + * *rte_regex_dev* structure by the device init function of the RegEx driver,
> > + * which is invoked during the PCI/SoC device probing phase, as explained
> > + * earlier.
> > + *
> > + * In other words, each function of the RegEx API simply retrieves the
> > + * *rte_regex_dev* structure associated with the device identifier and
> > + * performs an indirect invocation of the corresponding driver function
> > + * supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
> > + * structure.
> > + *
> > + * For performance reasons, the addresses of the fast-path functions of the
> > + * RegEx driver are not contained in the *regex_dev_ops* structure.
> > + * Instead, they are stored directly at the beginning of the *rte_regex_dev*
> > + * structure to avoid an extra indirect memory access during their
> > + * invocation.
> > + *
> > + * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
> > + * operations. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
> > + * functions to applications.
> > + *
> > + * The *enqueue* operation submits a burst of RegEx pattern matching
> > + * requests to the RegEx device and the *dequeue* operation gets a burst of
> > + * pattern matching responses for the ones submitted through the *enqueue*
> > + * operation.
> > + *
> > + * A typical application uses the RegEx device API in the following
> > + * programming flow:
> > + *
> > + * - rte_regex_dev_configure()
> > + * - rte_regex_queue_pair_setup()
> > + * - rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
> > + *   database is not provided in rte_regex_dev_config::rule_db for
> > + *   rte_regex_dev_configure() and/or the application needs to update the
> > + *   rule database.
> > + * - Create or reuse an existing mempool for *rte_regex_ops* objects.
> > + * - rte_regex_dev_start()
> > + * - rte_regex_enqueue_burst()
> > + * - rte_regex_dequeue_burst()
> > + *
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_common.h>
> > +#include <rte_config.h>
> > +#include <rte_dev.h>
> > +#include <rte_errno.h>
> > +#include <rte_memory.h>
> > +
> > +/**
> > + * Get the total number of RegEx devices that have been successfully
> > + * initialised.
> > + *
> > + * @return
> > + *   The total number of usable RegEx devices.
> > + */
> > +uint8_t
> > +rte_regex_dev_count(void);
> > +
> > +/**
> > + * Get the device identifier for the named RegEx device.
> > + *
> > + * @param name
> > + *   RegEx device name to select the RegEx device identifier.
> > + *
> > + * @return
> > + *   Returns RegEx device identifier on success.
> > + *   - <0: Failure to find named RegEx device.
> > + */
> > +int
> > +rte_regex_dev_get_dev_id(const char *name);
> > +
> > +/* Enumerates RegEx device capabilities */
> > +#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> > +/**< RegEx device supports compiling the rules at runtime, as opposed to
> > + * only loading a pre-built rule database through
> > + * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
> > + * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
> > + * @see struct rte_regex_dev_info::regex_dev_capa
> > + */
> > +
> > +
> > +/* Enumerates unsupported PCRE features for the RegEx device */
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_START_ANCHOR_F (1ULL << 0)
> > +/**< RegEx device doesn't support PCRE Anchor to start of match flag.
> > + * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
> > + * previous match or the start of the string for the first match.
> > + * This position will change each time the RegEx is applied to the subject
> > + * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
> > + * be successful for 'foo1foo2' and fail for 'Zfoo3'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_ATOMIC_GROUPING_F (1ULL << 1)
> > +/**< RegEx device doesn't support PCRE Atomic grouping.
> > + * Atomic groups are represented by '(?>)'. An atomic group is a group that,
> > + * when the RegEx engine exits from it, automatically throws away all
> > + * backtracking positions remembered by any tokens inside the group.
> > + * Example RegEx is 'a(?>bc|b)c'. If the subject strings are 'abc' and 'abcc',
> > + * then 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only 'abcc'
> > + * because atomic groups don't allow backtracking to 'b'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKTRACKING_CTRL_F (1ULL << 2)
> > +/**< RegEx device doesn't support PCRE backtracking control verbs.
> > + * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
> > + * (*SKIP), (*PRUNE).
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_CALLOUTS_F (1ULL << 3)
> > +/**< RegEx device doesn't support PCRE callouts.
> > + * PCRE supports calling an external function in between matches by using
> > + * '(?C)'. Example RegEx is 'ABC(?C)D'. If the subject string is 'ABCD', the
> > + * RegEx engine will parse 'ABC', perform a user-defined callout and return a
> > + * successful match at 'D'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_BACKREFERENCE_F (1ULL << 4)
> > +/**< RegEx device doesn't support PCRE backreference.
> > + * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
> > + * recently matched by the 2nd capturing group, i.e. 'GHI'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_GREEDY_F (1ULL << 5)
> > +/**< RegEx device doesn't support PCRE Greedy mode.
> > + * For example, in the RegEx 'AB\d*?' the quantifier '*?' matches zero or
> > + * more digits, as few as possible. In greedy mode ('AB\d*') the subject
> > + * 'AB12345' is matched completely, whereas in ungreedy mode only 'AB' is
> > + * returned as the match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_LOOKAROUND_ASRT_F (1ULL << 6)
> > +/**< RegEx device doesn't support PCRE Lookaround assertions
> > + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
> > + * the subject string is 'dwad1234!', the RegEx engine doesn't report any
> > + * matches because the assertion '(?=!{3,})' fails. The subject 'dwad123!!!'
> > + * would return a successful match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_MATCH_POINT_RST_F (1ULL << 7)
> > +/**< RegEx device doesn't support PCRE match point reset directive.
> > + * Example RegEx is '[a-z]+\K\d+'. If the subject string is 'dwad123',
> > + * then even though the entire string matches, only '123'
> > + * is reported as a match.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_CONVENTIONS_F (1ULL << 8)
> > +/**< RegEx device doesn't support PCRE newline convention.
> > + * Newline conventions are represented as follows:
> > + * (*CR)        carriage return
> > + * (*LF)        linefeed
> > + * (*CRLF)      carriage return, followed by linefeed
> > + * (*ANYCRLF)   any of the three above
> > + * (*ANY)       all Unicode newline sequences
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_NEWLINE_SEQ_F (1ULL << 9)
> > +/**< RegEx device doesn't support PCRE newline sequence.
> > + * The escape sequence '\R' will match any newline sequence.
> > + * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_POSSESSIVE_QUALIFIERS_F (1ULL << 10)
> > +/**< RegEx device doesn't support PCRE possessive quantifiers.
> > + * Examples of possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
> > + * A possessive quantifier repeats the token as many times as possible and
> > + * does not give up matches as the engine backtracks. With a possessive
> > + * quantifier, the deal is all or nothing.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_SUBROUTINE_REFERENCES_F (1ULL << 11)
> > +/**< RegEx device doesn't support PCRE Subroutine references.
> > + * PCRE Subroutine references allow for sub patterns to be assessed
> > + * as part of the RegEx. The example RegEx '(foo|fuzz)\g<1>+bar' matches the
> > + * subject string 'foofoofuzzfoofuzzbar'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_8_F (1ULL << 12)
> > +/**< RegEx device doesn't support UTF-8 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_16_F (1ULL << 13)
> > +/**< RegEx device doesn't support UTF-16 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_UTF_32_F (1ULL << 14)
> > +/**< RegEx device doesn't support UTF-32 character encoding.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_WORD_BOUNDARY_F (1ULL << 15)
> > +/**< RegEx device doesn't support word boundaries.
> > + * The meta character '\b' represents word boundary anchor.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +#define RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F (1ULL << 16)
> > +/**< RegEx device doesn't support Forward references.
> > + * Forward references allow you to use a back reference to a group that
> > + * appears later in the RegEx. The example RegEx '(\3ABC|(DEF|(GHI)))+'
> > + * matches the following string 'GHIGHIABCDEF'.
> > + * @see struct rte_regex_dev_info::pcre_unsup_flags
> > + */
> > +
> > +/* Enumerates PCRE rule flags */
> > +#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
> > +/**< When this flag is set, patterns that can match against an empty
> > + * string, such as '.*', are allowed.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
> > +/**< When this flag is set, the pattern is forced to be "anchored", that is, it
> > + * is constrained to match only at the first matching point in the string that
> > + * is being searched. Similar to '^' and represented by '\A'.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
> > +/**< When this flag is set, letters in the pattern match both upper and
> > + * lower case letters in the subject.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
> > +/**< When this flag is set, a dot metacharacter in the pattern matches any
> > + * character, including one that indicates a newline.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
> > +/**< When this flag is set, names used to identify capture groups need not
> > + * be unique.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
> > +/**< When this flag is set, most white space characters in the pattern are
> > + * totally ignored except when escaped or inside a character class.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
> > +/**< When this flag is set, a backreference to an unset capture group
> > + * matches an empty string.
> > + * @see RTE_REGEX_DEV_PCRE_UNSUP_FORWARD_REFERENCES_F
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
> > +/**< When this flag is set, the '^' and '$' constructs match immediately
> > + * following or immediately before internal newlines in the subject string,
> > + * respectively, as well as at the very start and end.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
> > +/**< When this flag is set, it disables the use of numbered capturing
> > + * parentheses in the pattern. References to capture groups (backreferences
> > + * or recursion/subroutine calls) may only refer to named groups, though the
> > + * reference can be by name or by number.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
> > +/**< By default, only ASCII characters are recognized. When this flag is set,
> > + * Unicode properties are used instead to classify characters.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
> > +/**< When this flag is set, the "greediness" of the quantifiers is inverted
> > + * so that they are not greedy by default, but become greedy if followed by
> > + * '?'.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
> > +/**< When this flag is set, the RegEx engine has to regard both the pattern
> > + * and the subject strings that are subsequently processed as strings of UTF
> > + * characters instead of single-code-unit strings.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
> > +/**< This flag locks out the use of '\C' in the pattern that is being compiled.
> > + * This escape matches one data unit, even in UTF mode, which can cause
> > + * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
> > + * current matching point in the middle of a multi-code-unit character.
> > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > + */
> > +
> > +
> > +/**
> > + * RegEx device information
> > + */
> > +struct rte_regex_dev_info {
> > +	const char *driver_name; /**< RegEx driver name */
> > +	struct rte_device *dev;	/**< Device information */
> > +	uint8_t max_matches;
> > +	/**< Maximum matches per scan supported by this device */
> > +	uint16_t max_queue_pairs;
> > +	/**< Maximum queue pairs supported by this device */
> > +	uint16_t max_payload_size;
> > +	/**< Maximum payload size for a pattern match request or scan.
> > +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > +	 */
> > +	uint16_t max_rules_per_group;
> > +	/**< Maximum rules supported per group by this device */
> > +	uint16_t max_groups;
> > +	/**< Maximum groups supported by this device */
> > +	uint32_t regex_dev_capa;
> > +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > +	uint64_t rule_flags;
> > +	/**< Supported compiler rule flags.
> > +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> > +	 */
> > +	uint64_t pcre_unsup_flags;
> > +	/**< Unsupported PCRE features for this RegEx device.
> > +	 * @see RTE_REGEX_DEV_PCRE_UNSUP_*
> > +	 */
> > +};
> > +
> > +/**
> > + * Retrieve the contextual information of a RegEx device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @param[out] dev_info
> > + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> > + *   contextual information of the device.
> > + *
> > + * @return
> > + *   - 0: Success, driver updates the contextual information of the RegEx device
> > + *   - <0: Error code returned by the driver info get function.
> > + *
> > + */
> > +int
> > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> > +
> > +/* Enumerates RegEx device configuration flags */
> > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > +/**< Cross buffer scan refers to the ability to detect
> > + * matches that occur across buffer boundaries, where the buffers are related
> > + * to each other in some way. Enable this flag to scan a payload larger than
> > + * struct rte_regex_dev_info::max_payload_size and/or when
> > + * matches can be present across scan buffer boundaries.
> > + *
> > + * @see struct rte_regex_dev_info::max_payload_size
> > + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > + */
> > +
> > +/** RegEx device configuration structure */
> > +struct rte_regex_dev_config {
> > +	uint8_t nb_max_matches;
> > +	/**< Maximum matches per scan configured on this device.
> > +	 * This value cannot exceed the *max_matches*
> > +	 * which was previously provided in rte_regex_dev_info_get().
> > +	 * The value 0 is allowed, in which case the value 1 is used.
> > +	 * @see struct rte_regex_dev_info::max_matches
> > +	 */
> > +	uint16_t nb_queue_pairs;
> > +	/**< Number of RegEx queue pairs to configure on this device.
> > +	 * This value cannot exceed the *max_queue_pairs* which was previously
> > +	 * provided in rte_regex_dev_info_get().
> > +	 * @see struct rte_regex_dev_info::max_queue_pairs
> > +	 */
> > +	uint16_t nb_rules_per_group;
> > +	/**< Number of rules per group to configure on this device.
> > +	 * This value cannot exceed the *max_rules_per_group*
> > +	 * which was previously provided in rte_regex_dev_info_get().
> > +	 * The value 0 is allowed, in which case
> > +	 * struct rte_regex_dev_info::max_rules_per_group is used.
> > +	 * @see struct rte_regex_dev_info::max_rules_per_group
> > +	 */
> > +	uint16_t nb_groups;
> > +	/**< Number of groups to configure on this device.
> > +	 * This value cannot exceed the *max_groups*
> > +	 * which was previously provided in rte_regex_dev_info_get().
> > +	 * @see struct rte_regex_dev_info::max_groups
> > +	 */
> > +	const char *rule_db;
> > +	/**< Import an initial prebuilt rule database on this device.
> > +	 * The value NULL is allowed, in which case the device will not
> > +	 * be configured with a prebuilt rule database. Application may use
> > +	 * rte_regex_rule_db_update() or rte_regex_rule_db_import() API
> > +	 * to update or import rule database after the
> > +	 * rte_regex_dev_configure().
> > +	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> > +	 */
> > +	uint32_t rule_db_len;
> > +	/**< Length of *rule_db* buffer. */
> > +	uint32_t dev_cfg_flags;
> > +	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_* */
> > +};
> > +
> > +/**
> > + * Configure a RegEx device.
> > + *
> > + * This function must be invoked first before any other function in the
> > + * API. This function can also be re-invoked when a device is in the
> > + * stopped state.
> > + *
> > + * The caller may use rte_regex_dev_info_get() to get the capabilities of each
> > + * resource available for this regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device to configure.
> > + * @param cfg
> > + *   The RegEx device configuration structure.
> > + *
> > + * @return
> > + *   - 0: Success, device configured.
> > + *   - <0: Error code returned by the driver configuration function.
> > + */
> > +int
> > +rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
> > +
> > +/* Enumerates RegEx queue pair configuration flags */
> > +#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
> > +/**< Out of order scan. If not set, a scan must retire after previously issued
> > + * in-order scans to this queue pair. If set, this scan can be retired as soon
> > + * as the device returns completion. Application should not set the out of order
> > + * scan flag if it needs to maintain the ingress order of scan requests.
> > + *
> > + * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
> > + */
> > +
> > +struct rte_regex_ops;
> > +typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
> > +				      struct rte_regex_ops *op);
> > +/**< Callback function called during rte_regex_dev_stop(), invoked once per
> > + * flushed RegEx op.
> > + */
> > +
> > +/** RegEx queue pair configuration structure */
> > +struct rte_regex_qp_conf {
> > +	uint32_t qp_conf_flags;
> > +	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
> > +	uint16_t nb_desc;
> > +	/**< The number of descriptors to allocate for this queue pair. */
> > +	regexdev_stop_flush_t cb;
> > +	/**< Callback function called during rte_regex_dev_stop(), invoked
> > +	 * once per flushed regex op. Value NULL is allowed, in which case
> > +	 * callback will not be invoked. This function can be used to properly
> > +	 * dispose of outstanding regex ops from response queue,
> > +	 * for example ops containing memory pointers.
> > +	 * @see rte_regex_dev_stop()
> > +	 */
> > +};
> > +
> > +/**
> > + * Allocate and set up a RegEx queue pair for a RegEx device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param queue_pair_id
> > + *   The index of the RegEx queue pair to setup. The value must be in the range
> > + *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
> > + * @param qp_conf
> > + *   The pointer to the configuration data to be used for the RegEx queue pair.
> > + *   NULL value is allowed, in which case the default configuration is used.
> > + *
> > + * @return
> > + *   - 0: Success, RegEx queue pair correctly set up.
> > + *   - <0: RegEx queue configuration failed
> > + */
> > +int
> > +rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> > +			   const struct rte_regex_qp_conf *qp_conf);
> > +
> > +/**
> > + * Start a RegEx device.
> > + *
> > + * The device start step is the last one and consists of setting the RegEx
> > + * queues to start accepting the pattern matching scan requests.
> > + *
> > + * On success, all basic functions exported by the API (RegEx enqueue,
> > + * RegEx dequeue and so on) can be invoked.
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier
> > + * @return
> > + *   - 0: Success, device started.
> > + *   - <0: Device start failed.
> > + */
> > +int
> > +rte_regex_dev_start(uint8_t dev_id);
> > +
> > +/**
> > + * Stop a RegEx device.
> > + *
> > + * Stop a RegEx device. The device can be restarted with a call to
> > + * rte_regex_dev_start().
> > + *
> > + * This function causes all queued response regex ops to be drained from the
> > + * response queue. While draining ops out of the device,
> > + * struct rte_regex_qp_conf::cb will be invoked for each op.
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier.
> > + *
> > + * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
> > + */
> > +void
> > +rte_regex_dev_stop(uint8_t dev_id);
> > +
> > +/**
> > + * Close a RegEx device. The device cannot be restarted!
> > + *
> > + * @param dev_id
> > + *   RegEx device identifier
> > + *
> > + * @return
> > + *  - 0 on successfully closed the device.
> > + *  - <0 on failure to close the device.
> > + */
> > +int
> > +rte_regex_dev_close(uint8_t dev_id);
> > +
> > +/* Device get/set attributes */
> > +
> > +/** Enumerates RegEx device attribute identifier */
> > +enum rte_regex_dev_attr_id {
> > +	RTE_REGEX_DEV_ATTR_SOCKET_ID,
> > +	/**< The NUMA socket id to which the device is connected or
> > +	 * a default of zero if the socket could not be determined.
> > +	 * datatype: *int*
> > +	 * operation: *get*
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> > +	/**< Maximum number of matches per scan.
> > +	 * datatype: *uint8_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> > +	/**< Upper bound scan time in ns.
> > +	 * datatype: *uint16_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> > +	 */
> > +	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> > +	/**< Maximum number of prefixes detected per scan.
> > +	 * This would be useful for denial of service detection.
> > +	 * datatype: *uint16_t*
> > +	 * operation: *get* and *set*
> > +	 *
> > +	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> > +	 */
> > +};
> > +
> > +/**
> > + * Get an attribute from a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param attr_id The attribute ID to retrieve
> > + * @param[out] attr_value A pointer that will be filled in with the attribute
> > + *             value if successful.
> > + *
> > + * @return
> > + *   - 0: Successfully retrieved attribute value.
> > + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> > + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> > + */
> > +int
> > +rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> > +		       void *attr_value);
> > +
> > +/**
> > + * Set an attribute to a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param attr_id The attribute ID to set
> > + * @param attr_value A pointer to the attribute value provided
> > + *                   by the application
> > + *
> > + * @return
> > + *   - 0: Successfully applied the attribute value.
> > + *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
> > + *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
> > + */
> > +int
> > +rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
> > +		       const void *attr_value);
> > +
> > +/* Rule related APIs */
> > +/** Enumerates RegEx rule operation */
> > +enum rte_regex_rule_op {
> > +	RTE_REGEX_RULE_OP_ADD,
> > +	/**< Add RegEx rule to rule database */
> > +	RTE_REGEX_RULE_OP_REMOVE
> > +	/**< Remove RegEx rule from rule database */
> > +};
> > +
> > +/** Structure to hold a RegEx rule attributes */
> > +struct rte_regex_rule {
> > +	enum rte_regex_rule_op op;
> > +	/**< OP type of the rule, either OP_ADD or OP_REMOVE */
> > +	uint16_t group_id;
> > +	/**< Group identifier to which the rule belongs to. */
> > +	uint32_t rule_id;
> > +	/**< Rule identifier which is returned on successful match. */
> > +	const char *pcre_rule;
> > +	/**< Buffer to hold the PCRE rule. */
> > +	uint16_t pcre_rule_len;
> > +	/**< Length of the PCRE rule*/
> > +	uint64_t rule_flags;
> > +	/**< PCRE rule flags. Supported device specific PCRE rules enumerated
> > +	 * in struct rte_regex_dev_info::rule_flags. For successful rule
> > +	 * database update, application needs to provide only supported
> > +	 * rule flags.
> > +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
> > +	 */
> > +};
> > +
> > +/**
> > + * Update the rule database of a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rules
> > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> > + *   which contain the regex rules attributes to be updated in rule database.
> > + * @param nb_rules
> > + *   The number of PCRE rules to update the rule database.
> > + *
> > + * @return
> > + *   The number of regex rules actually updated on the regex device's rule
> > + *   database. The return value can be less than the value of the *nb_rules*
> > + *   parameter when the regex device fails to update the rule database or
> > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > + *   at the end of *rules* are not consumed and the caller has to take
> > + *   care of them and rte_errno is set accordingly.
> > + *   Possible errno values include:
> > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > + */
> > +uint16_t
> > +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> > +			 uint16_t nb_rules);
> 
> I think the function name is not very informative. If this function is meant to
> compile the rule, then that should be explicit in the function name.
> 
> > +
> > +/**
> > + * Import a prebuilt rule database from a buffer to a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rule_db
> > + *   Points to prebuilt rule database.
> > + * @param rule_db_len
> > + *   Length of the rule database.
> > + *
> > + * @return
> > + *   - 0: Successfully updated the prebuilt rule database.
> > + *   - -EINVAL:  Invalid device ID or rule_db is NULL
> > + *   - -ENOTSUP: Rule database import is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
> > + */
> > +int
> > +rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
> > +			 uint32_t rule_db_len);
> > +
> > +/**
> > + * Export the prebuilt rule database from a RegEx device to the buffer.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param[out] rule_db
> > + *   Block of memory to insert the rule database. Must be at least size in
> > + *   capacity. If set to NULL, function returns required capacity.
> > + *
> > + * @return
> > + *   - 0: Successfully exported the prebuilt rule database.
> > + *   - size: If rule_db set to NULL then required capacity for *rule_db*
> > + *   - -EINVAL:  Invalid device ID
> > + *   - -ENOTSUP: Rule database export is not supported on this device.
> > + *
> > + * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
> > + */
> > +int
> > +rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
> > +
> > +/* Extended statistics */
> > +/** Maximum name length for extended statistics counters */
> > +#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
> > +
> > +/**
> > + * A name-key lookup element for extended statistics.
> > + *
> > + * This structure is used to map between names and ID numbers
> > + * for extended RegEx device statistics.
> > + */
> > +struct rte_regex_dev_xstats_map {
> > +	uint16_t id;
> > +	/**< xstat identifier */
> > +	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
> > +	/**< xstat name */
> > +};
> > +
> > +/**
> > + * Retrieve names of extended statistics of a regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the regex device.
> > + * @param[out] xstats_map
> > + *   Block of memory to insert id and names into. Must be at least size in
> > + *   capacity. If set to NULL, function returns required capacity.
> > + * @return
> > + *   - positive value on success:
> > + *        -The return value is the number of entries filled in the stats map.
> > + *        -If xstats_map set to NULL then required capacity for xstats_map.
> > + *   - negative value on error:
> > + *      -ENODEV for invalid *dev_id*
> > + *      -ENOTSUP if the device doesn't support this function.
> > + */
> > +int
> > +rte_regex_dev_xstats_names_get(uint8_t dev_id,
> > +			       struct rte_regex_dev_xstats_map *xstats_map);
> > +
> > +/**
> > + * Retrieve extended statistics of a regex device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param ids
> > + *   The id numbers of the stats to get. The ids can be got from the stat
> > + *   position in the stat list from rte_regex_dev_xstats_names_get(), or
> > + *   by using rte_regex_dev_xstats_by_name_get().
> > + * @param[out] values
> > + *   The values for each stats request by ID.
> > + * @param n
> > + *   The number of stats requested
> > + * @return
> > + *   - positive value: number of stat entries filled into the values array
> > + *   - negative value on error:
> > + *      -ENODEV for invalid *dev_id*
> > + *      -ENOTSUP if the device doesn't support this function.
> > + */
> > +int
> > +rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
> > +			 uint64_t values[], uint16_t n);
> > +
> > +/**
> > + * Retrieve the value of a single stat by requesting it by name.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @param name
> > + *   The stat name to retrieve
> > + * @param[out] id
> > + *   If non-NULL, the numerical id of the stat will be returned, so that further
> > + *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
> > + *   be faster as it doesn't need to scan a list of names for the stat.
> > + * @param[out] value
> > + *   Must be non-NULL, retrieved xstat value will be stored in this address.
> > + *
> > + * @return
> > + *   - 0: Successfully retrieved xstat value.
> > + *   - -EINVAL: invalid parameters
> > + *   - -ENOTSUP: if not supported.
> > + */
> > +int
> > +rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
> > +				 uint16_t *id, uint64_t *value);
> > +
> > +/**
> > + * Reset the values of the xstats of the selected component in the device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @param ids
> > + *   Selects specific statistics to be reset. When NULL, all statistics will be
> > + *   reset. If non-NULL, must point to array of at least *nb_ids* size.
> > + * @param nb_ids
> > + *   The number of ids available from the *ids* array. Ignored when ids is NULL.
> > + * @return
> > + *   - 0: Successfully reset the statistics to zero.
> > + *   - -EINVAL: invalid parameters
> > + *   - -ENOTSUP: if not supported.
> > + */
> > +int
> > +rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
> > +			   uint16_t nb_ids);
> > +
> > +/**
> > + * Trigger the RegEx device self test.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device
> > + * @return
> > + *   - 0: Selftest successful
> > + *   - -ENOTSUP if the device doesn't support selftest
> > + *   - other values < 0 on failure.
> > + */
> > +int rte_regex_dev_selftest(uint8_t dev_id);
> > +
> > +/**
> > + * Dump internal information about *dev_id* to the FILE* provided in *f*.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @param f
> > + *   A pointer to a file for output
> > + *
> > + * @return
> > + *   - 0: on success
> > + *   - <0: on failure.
> > + */
> > +int
> > +rte_regex_dev_dump(uint8_t dev_id, FILE *f);
> > +
> > +/* Fast path APIs */
> > +
> > +/**
> > + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> > + * @see struct rte_regex_ops::matches
> > + */
> > +struct rte_regex_match {
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t u64;
> > +		struct {
> > +			uint32_t rule_id:20;
> > +			/**< Rule identifier to which the pattern matched.
> > +			 * @see struct rte_regex_rule::rule_id
> > +			 */
> > +			uint32_t group_id:12;
> > +			/**< Group identifier of the rule which the pattern
> > +			 * matched. @see struct rte_regex_rule::group_id
> > +			 */
> > +			uint16_t offset;
> > +			/**< Starting Byte Position for matched rule. */
> > +			uint16_t len;
> > +			/**< Length of match in bytes */
> > +		};
> > +	};
> > +};
> > +
> > +/* Enumerates RegEx request flags. */
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
> > +/**< Set when struct rte_regex_ops::group_id1 is valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
> > +/**< Set when struct rte_regex_ops::group_id2 is valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
> > +/**< Set when struct rte_regex_ops::group_id3 is valid */
> > +
> > +#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
> > +/**< The RegEx engine will stop scanning and return the first match. */
> > +
> > +#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
> > +/**< In High Priority mode a maximum of one match will be returned per scan to
> > + * reduce the post-processing required by the application. The match with the
> > + * lowest Rule id, lowest start pointer and lowest match length will be
> > + * returned.
> > + *
> > + * @see struct rte_regex_ops::nb_actual_matches
> > + * @see struct rte_regex_ops::nb_matches
> > + */
> > +
> > +
> > +/* Enumerates RegEx response flags. */
> > +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * start of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * end of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
> > +/**< Indicates that the RegEx device has exceeded the max timeout while
> > + * scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
> > +/**< Indicates that the RegEx device has exceeded the max matches while
> > + * scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
> > + */
> > +
> > +#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
> > +/**< Indicates that the RegEx device has reached the max allowed prefix length
> > + * while scanning the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
> > + */
> > +
> > +/**
> > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > + * for enqueue and dequeue operation.
> > + */
> > +struct rte_regex_ops {
> > +	/* W0 */
> > +	uint16_t req_flags;
> > +	/**< Request flags for the RegEx ops.
> > +	 * @see RTE_REGEX_OPS_REQ_*
> > +	 */
> > +	uint16_t scan_size;
> > +	/**< Scan size of the buffer to be scanned in bytes. */
> > +	uint16_t rsp_flags;
> > +	/**< Response flags for the RegEx ops.
> > +	 * @see RTE_REGEX_OPS_RSP_*
> > +	 */
> > +	uint8_t nb_actual_matches;
> > +	/**< The total number of actual matches detected by the Regex device.*/
> > +	uint8_t nb_matches;
> > +	/**< The total number of matches returned by the RegEx device for this
> > +	 * scan. The size of *rte_regex_ops::matches* zero length array will be
> > +	 * this value.
> > +	 *
> > +	 * @see struct rte_regex_ops::matches, struct rte_regex_match
> > +	 */
> > +
> > +	/* W1 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t u64;
> > +		/**<  Allow 8-byte reserved on 32-bit system */
> > +		void *buf_addr;
> > +		/**< Virtual address of the pattern to be matched. */
> > +	};
> > +
> > +	/* W2 */
> > +	rte_iova_t buf_iova;
> > +	/**< IOVA address of the pattern to be matched. */
> > +
> > +	/* W3 */
> > +	uint16_t group_id0;
> > +	/**< First group_id to match the rule against. At least one group id
> > +	 * must be provided by the application.
> > +	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F is set, group_id1
> > +	 * is valid; similar flags apply to group_id2 and group_id3.
> > +	 * Upon a match, struct rte_regex_match::group_id shall be updated
> > +	 * with the matching group ID by the device. Group ID scheme provides
> > +	 * rule isolation and effective pattern matching.
> > +	 */
> > +	uint16_t group_id1;
> > +	/**< Second group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> > +	 */
> > +	uint16_t group_id2;
> > +	/**< Third group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> > +	 */
> > +	uint16_t group_id3;
> > +	/**< Fourth group_id to match the rule against.
> > +	 *
> > +	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> > +	 */
> > +
> > +	/* W4 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t user_id;
> > +		/**< Application specific opaque value. An application may use
> > +		 * this field to hold application specific value to share
> > +		 * between dequeue and enqueue operation.
> > +		 * Implementation should not modify this field.
> > +		 */
> > +		void *user_ptr;
> > +		/**< Pointer representation of *user_id* */
> > +	};
> 
> Since we target the regex subsystem for both regex and DPI, I think it would be
> good to add another uint64_t field called connection_id.
> Devices that support DPI can refer to it as another matchable field when looking
> up matches on the given buffer.
> 
> This field is different from user_id, as it is not opaque to the device.
> 
> > +
> > +	/* W5 */
> > +	struct rte_regex_match matches[];
> > +	/**< Zero length array to hold the match tuples.
> > +	 * The struct rte_regex_ops::nb_matches value holds the number of
> > +	 * elements in this array.
> > +	 *
> > +	 * @see struct rte_regex_ops::nb_matches
> > +	 */
> > +};
> > +
> > +/**
> > + * Enqueue a burst of scan request on a RegEx device.
> > + *
> > + * The rte_regex_enqueue_burst() function is invoked to place
> > + * regex operations on the queue *qp_id* of the device designated by
> > + * its *dev_id*.
> > + *
> > + * The *nb_ops* parameter is the number of operations to process which are
> > + * supplied in the *ops* array of *rte_regex_ops* structures.
> > + *
> > + * The rte_regex_enqueue_burst() function returns the number of
> > + * operations it actually enqueued for processing. A return value equal to
> > + * *nb_ops* means that all operations have been enqueued.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param qp_id
> > + *   The index of the queue pair on which operations are to be enqueued for
> > + *   processing. The value must be in the range [0, nb_queue_pairs - 1]
> > + *   previously supplied to rte_regex_dev_configure().
> > + * @param ops
> > + *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
> > + *   which contain the regex operations to be processed.
> > + * @param nb_ops
> > + *   The number of operations to process.
> > + *
> > + * @return
> > + *   The number of operations actually enqueued on the regex device. The return
> > + *   value can be less than the value of the *nb_ops* parameter when the
> > + *   regex device's queue is full or if invalid parameters are specified in
> > + *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
> > + *   ops at the end of *ops* are not consumed and the caller has to take care
> > + *   of them.
> > + */
> > +uint16_t
> > +rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
> > +			struct rte_regex_ops **ops, uint16_t nb_ops);
> > +
> > +/**
> > + *
> > + * Dequeue a burst of scan responses from a queue on the RegEx device.
> > + * The dequeued operations are stored in *rte_regex_ops* structures
> > + * whose pointers are supplied in the *ops* array.
> > + *
> > + * The rte_regex_dequeue_burst() function returns the number of ops
> > + * actually dequeued, which is the number of *rte_regex_ops* data structures
> > + * effectively supplied into the *ops* array.
> > + *
> > + * A return value equal to *nb_ops* indicates that the queue contained
> > + * at least *nb_ops* operations, and this is likely to signify that other
> > + * processed operations remain in the device's output queue. Applications
> > + * implementing a "retrieve as many processed operations as possible" policy
> > + * can check this specific case and keep invoking the
> > + * rte_regex_dequeue_burst() function until a value less than
> > + * *nb_ops* is returned.
> > + *
> > + * The rte_regex_dequeue_burst() function does not provide any error
> > + * notification to avoid the corresponding overhead.
> > + *
> > + * @param dev_id
> > + *   The RegEx device identifier
> > + * @param qp_id
> > + *   The index of the queue pair from which to retrieve processed operations.
> > + *   The value must be in the range [0, nb_queue_pairs - 1] previously
> > + *   supplied to rte_regex_dev_configure().
> > + * @param ops
> > + *   The address of an array of pointers to *rte_regex_ops* structures that must
> > + *   be large enough to store *nb_ops* pointers in it.
> > + * @param nb_ops
> > + *   The maximum number of operations to dequeue.
> > + *
> > + * @return
> > + *   The number of operations actually dequeued, which is the number
> > + *   of pointers to *rte_regex_ops* structures effectively supplied to the
> > + *   *ops* array. If the return value is less than *nb_ops*, the remaining
> > + *   ops at the end of *ops* are not consumed and the caller has to take care
> > + *   of them.
> > + */
> > +uint16_t
> > +rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
> > +			struct rte_regex_ops **ops, uint16_t nb_ops);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_REGEXDEV_H_ */
> >
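
For reference, the burst semantics quoted above (a partial enqueue when the queue is full, and "keep dequeuing while the burst comes back full") can be sketched roughly as below. This is only a sketch under assumptions: the mock_* functions, MOCK_QUEUE_DEPTH and the trimmed struct are stand-ins invented for illustration so the snippet builds without a driver, and the mock completes every op instantly. It is not the proposed implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for the RFC structure; only the opaque field is kept. */
struct rte_regex_ops {
	uint64_t user_id;
};

/* Hypothetical mock queue pair: a small FIFO that "completes" ops at once. */
#define MOCK_QUEUE_DEPTH 8
static struct rte_regex_ops *mock_queue[MOCK_QUEUE_DEPTH];
static uint16_t mock_count;

/* Mirrors the quoted enqueue contract: returns how many ops were accepted. */
static uint16_t
mock_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
		   struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	(void)dev_id;
	(void)qp_id;
	while (n < nb_ops && mock_count < MOCK_QUEUE_DEPTH)
		mock_queue[mock_count++] = ops[n++];
	return n; /* ops[n..nb_ops-1] were not consumed */
}

/* Mirrors the quoted dequeue contract: returns how many ops were returned. */
static uint16_t
mock_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
		   struct rte_regex_ops **ops, uint16_t nb_ops)
{
	uint16_t n = 0, i;

	(void)dev_id;
	(void)qp_id;
	while (n < nb_ops && n < mock_count) {
		ops[n] = mock_queue[n];
		n++;
	}
	/* Shift the ops that were not dequeued to the front of the FIFO. */
	for (i = n; i < mock_count; i++)
		mock_queue[i - n] = mock_queue[i];
	mock_count -= n;
	return n;
}

/* "Retrieve as many as possible": loop while a full burst is returned. */
static uint32_t
drain_queue(uint8_t dev_id, uint16_t qp_id, uint16_t burst)
{
	struct rte_regex_ops *ops[MOCK_QUEUE_DEPTH];
	uint32_t total = 0;
	uint16_t n;

	if (burst == 0 || burst > MOCK_QUEUE_DEPTH)
		burst = MOCK_QUEUE_DEPTH;
	do {
		n = mock_dequeue_burst(dev_id, qp_id, ops, burst);
		total += n; /* a real application would inspect matches here */
	} while (n == burst);
	return total;
}
```

With real hardware a short burst only means the response queue is empty at that moment, so an application would normally poll again later rather than treat it as the end of the work.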



* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-08-21  5:32     ` Shahaf Shuler
  2019-08-21 15:12       ` John Bromhead
  2019-09-10 10:31       ` Jerin Jacob Kollanukkaran
@ 2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
  2019-09-27 14:45         ` Jerin Jacob Kollanukkaran
  2 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-10 11:02 UTC (permalink / raw)
  To: Shahaf Shuler, Thomas Monjalon, dev
  Cc: Pavan Nikhilesh Bhagavatula, Hemant Agrawal, Opher Reviv,
	Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor, Nipun Gupta, Wang,
	Xiang W, Richardson, Bruce, yang.a.hong, harry.chang, gu.jian1,
	shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su

> Hi Jerin,

Hi Shahaf,

Sorry for the delay in responding (I was busy with the 19.11 proposal deadline). Please see inline.

> >
> > RegEx pattern matching applications:
> > • Next Generation Firewalls (NGFW)
> > • Deep Packet and Flow Inspection (DPI)
> > • Intrusion Prevention Systems (IPS)
> > • DDoS Mitigation
> > • Network Monitoring
> > • Data Loss Prevention (DLP)
> > • Smart NICs
> > • Grammar based content processing
> > • URL, spam and adware filtering
> > • Advanced auditing and policing of user/application security policies
> > • Financial data mining - parsing of streamed financial feeds
> 
> I think two more important use cases to add (at least in the doc of this
> subsystem) are:
> * application recognition
> * memory introspection

Sure. Will add the following from John as well.

# Natural Language Processing (NLP)
# Sentiment Analysis
# Big Data database acceleration (Spark, Hadoop etc.)
# Computational Storage

> 
> 
> > +/**
> > + * Update the rule database of a RegEx device.
> > + *
> > + * @param dev_id RegEx device identifier
> > + * @param rules
> > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> > + *   which contain the regex rules attributes to be updated in rule database.
> > + * @param nb_rules
> > + *   The number of PCRE rules to update the rule database.
> > + *
> > + * @return
> > + *   The number of regex rules actually updated on the regex device's rule
> > + *   database. The return value can be less than the value of the *nb_rules*
> > + *   parameter when the regex devices fails to update the rule database or
> > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > + *   at the end of *rules* are not consumed and the caller has to take
> > + *   care of them and rte_errno is set accordingly.
> > + *   Possible errno values include:
> > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > + *   - -ENOSPC: No space available in rule database.
> > + *
> > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > + */
> > +uint16_t
> > +rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> > +			 uint16_t nb_rules);
> 
> I think the function name is not too informative. If this function meant to
> compile the rule then it should be explicit on the function name.
 
It is meant to compile the rules and then update the rule database.

I think we can use either option 1 or 2. Let me know your preference, or
whether you have another name to suggest, and I will change it accordingly.

1. rte_regex_rule_db_compile()
2. rte_regex_rule_db_compile_update()


> > +
> > + */
> > +struct rte_regex_ops {
> > +
> > +	/* W4 */
> > +	RTE_STD_C11
> > +	union {
> > +		uint64_t user_id;
> > +		/**< Application specific opaque value. An application may
> > use
> > +		 * this field to hold application specific value to share
> > +		 * between dequeue and enqueue operation.
> > +		 * Implementation should not modify this field.
> > +		 */
> > +		void *user_ptr;
> > +		/**< Pointer representation of *user_id* */
> > +	};
> 
> Since we target the regex subsystem for both regex and DPI I think it will be
> good to add another uint64_t field called connection_id.
> Device that support DPI can refer to it as another match able field when looking
> up for matches on the given buffer.
> 
> This field is different from the user_id, as it is not opaque for the device.

Is this a driver-specific storage area that the application should not touch?

If not, could you share the data flow of this field, i.e. who writes this
field and who reads it?

That is just for the documentation; in any event, we can add new fields.

If it is only for driver usage, then I think some drivers may need more than
8B of storage. In that case, each driver can add its own fields after W4
(i.e. the existing user_id), and we introduce a new field called match_offset
in struct rte_regex_ops,

i.e. struct rte_regex_match *matches = (void *)((uint8_t *)ops + ops->match_offset);
so that each driver can add enough driver-specific metadata.
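
To make the match_offset idea concrete, here is a minimal sketch. All type and field names below (regex_ops_sketch, regex_match_sketch, drv_op_layout_sketch, drv_meta) are hypothetical and only illustrate the layout; they are not the RFC's definitions. The op header carries a byte offset, the driver places its private metadata after the header, and the match array is located with byte arithmetic on the op pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical match record; field widths follow the current RFC discussion. */
struct regex_match_sketch {
	uint32_t rule_id;
	uint16_t offset;
	uint16_t len;
};

/* Hypothetical op header: the user_id word, followed by match_offset. */
struct regex_ops_sketch {
	uint64_t user_id;      /* opaque to the driver */
	uint16_t match_offset; /* bytes from the start of the op to the matches */
	uint16_t nb_matches;
};

/* One possible driver layout: private metadata sits between header and matches. */
struct drv_op_layout_sketch {
	struct regex_ops_sketch ops;
	uint64_t drv_meta[2]; /* assumed driver-specific storage (more than 8B) */
	struct regex_match_sketch matches[4];
};

/* Locate the match array from the op pointer using byte arithmetic. */
static inline struct regex_match_sketch *
regex_ops_matches(struct regex_ops_sketch *ops)
{
	return (struct regex_match_sketch *)((uint8_t *)ops + ops->match_offset);
}
```

A driver using this layout would set ops.match_offset to offsetof(struct drv_op_layout_sketch, matches) when it allocates the op, so the application never needs to know how much metadata the driver reserved.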





^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-09-10  8:05         ` Jerin Jacob Kollanukkaran
@ 2019-09-19 13:58           ` Wang Xiang
  2019-09-27 14:35             ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2019-09-19 13:58 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran
  Cc: Thomas Monjalon, dev, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, Hong, Yang A,
	Chang, Harry, gu.jian1, shanjiangh, zhangy.yun, lixingfu,
	wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, Ni, Hongjun, j.bromhead, deri, fc,
	arthur.su, Guy Kaneti, Smadar Fuks, Liron Himi, edwin.verplanke,
	keith.wiles

Hi Jerin,

Thanks for your response. More comments below and inline.

1) I think the size of some variables (e.g. nb_matches, scan_size,
matching offset, etc.) should be increased based on what Hyperscan supports.

    a) struct rte_regex_ops:

        uint16_t scan_size => uint32_t scan_size
        uint8_t nb_actual_matches => uint64_t nb_actual_matches
        uint8_t nb_matches => uint64_t nb_matches

    b) struct rte_regex_match:
        uint16_t offset => uint32_t offset
        uint16_t len => uint32_t len

    c) uint16_t
        rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
                                 uint16_t nb_rules);
    =>
       uint32_t
        rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
                                 uint32_t nb_rules);

    d) int
    rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
                    const struct rte_regex_qp_conf *qp_conf);
    =>
       int
    rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
                    const struct rte_regex_qp_conf *qp_conf);

    e) struct rte_regex_dev_config:
        uint8_t nb_max_matches => uint64_t nb_max_matches

    f) struct rte_regex_dev_info:
        uint8_t max_matches => uint64_t max_matches

2) rte_regex_dev_attr_get() and rte_regex_dev_attr_set() are both defined.
Can all of the attributes below be set by users? Are any of them read-only?

/** Enumerates RegEx device attribute identifier */
enum rte_regex_dev_attr_id {
    RTE_REGEX_DEV_ATTR_SOCKET_ID,
    /**< The NUMA socket id to which the device is connected or
     * a default of zero if the socket could not be determined.
     * datatype: *int*
     * operation: *get*
     */
    RTE_REGEX_DEV_ATTR_MAX_MATCHES,
    /**< Maximum number of matches per scan.
     * datatype: *uint8_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
     */
    RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
    /**< Upper bound scan time in ns.
     * datatype: *uint16_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
     */
    RTE_REGEX_DEV_ATTR_MAX_PREFIX,
    /**< Maximum number of prefix detected per scan.
     * This would be useful for denial of service detection.
     * datatype: *uint16_t*
     * operation: *get* and *set*
     *
     * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
     */
};

3) Both RTE_REGEX_PCRE_RULE_* and
RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have a
unified regex_dev_capa in struct rte_regex_dev_info?


4) It would be good if we could also define a synchronous matching API for users who
want to run a one-off scan and wait for the results.

On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> Hi Xiang,
> 
> Sorry for delay in response(Was busy with 19.11 proposal deadline). Please see inline.
>  
> > 
> > Reply to Xiang's queries in main thread:
> > 
> > Hi all,
> > 
> > Some questions regarding APIs. Could you please give more insights?
> > 
> > 1) rte_regex_ops
> >       a) rsp_flags
> >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match
> > at the end of current buffer after scan.
> >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > 
> > [Jerin] Since we need three states to represent partial match buffer,
> > RTE_REGEX_OPS_RSP_PMI_SOJ_F to
> > represent start of the buffer, intermediate buffers with no flag, and end of
> > the buffer with RTE_REGEX_OPS_RSP_PMI_EOJ
> 
> > [Xiang] How could a user leverage these flags for matching? Suppose a large
> > buffer is divided into multiple chunks. Will RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > cause an early quit once it isn't set after scan the first chunk. Similarly,
> > RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching future
> > buffers after finish the last chunk?
> 
> Let me describe with an example,
> 
> Assume,
> 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> 2) rte_regex_dev_config:: dev_cfg_flags configured with RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> 3) Device programmed with matching "hello\s+world" pattern
> 4) user enqueue struct rte_regex_ops:: buf_addr point following "data" and struct rte_regex_op:: scan_size = 1024
> 
> data[0..1021] = data don't have hello world pattern
> data[1022] = 'h'
> data[1023] = 'e'
> 
> 5) user enqueue struct rte_regex_ops:: buf_addr point following "data" and struct rte_regex_op:: scan_size = 9
> 
> data[0] = 'l'
> data[1] = 'l'
> data[2] = 'o'
> data[3] = ' '
> data[4] = 'w'
> data[5] = 'o'
> data[6] = 'r'
> data[7] = 'l'
> data[8] = 'd'
> 
> If so,
> 
> Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in rte_regex_ops:: rsp_flags on dequeue
> Where rte_regex_match:: offset is 1022 and len 2
> 
> Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops:: rsp_flags on dequeue
> Where rte_regex_match:: offset is 0 and len 9
>
If the defined pattern is "hello.*world" instead of "hello\s+world", and
we enqueue following struct rte_regex_ops:

1) rte_regex_op:: scan_size = 1024

   data[0..1021] = data don't have hello world pattern
   data[1022] = 'h'
   data[1023] = 'e'

2) rte_regex_op:: scan_size = 9
   data[0] = 'l'
   data[1] = 'l'
   data[2] = 'o'
   data[3] = ' '
   data[4] = 'w'
   data[5] = 'o'
   data[6] = 'r'
   data[7] = 'l'
   data[8] = 'd'

3) rte_regex_op:: scan_size = 5
   data[0] = 'w'
   data[1] = 'o'
   data[2] = 'r'
   data[3] = 'l'
   data[4] = 'd'

Will the response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F set in rte_regex_ops::
rsp_flags on dequeue,
where rte_regex_match:: offset is 0 and len 4?

I am wondering what your expected behavior is for .* or similar syntax, and whether
there are syntax compatibility issues. We report all matches in Hyperscan,
e.g. end match offsets 11 and 16 are reported for pattern "hello.*world" and
corpus "hello worldworld".

BTW, I am not sure how other hardware devices handle cross-buffer scans. Hyperscan
doesn't report matches for start and intermediate buffers; it only reports
the end offset if a full match is found.
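
To illustrate the "report every end offset" semantics, here is a small hand-written sketch for the single pattern "hello.*world". It is not Hyperscan code and does not use any Hyperscan API; report_end_offsets() is a hypothetical helper, and it treats '.' as matching any byte, which is a simplification. End offsets are exclusive, as in the example above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Record every end offset at which "hello.*world" matches in buf.
 * Returns the number of end offsets written to ends[] (at most max_ends).
 */
static size_t
report_end_offsets(const char *buf, size_t len, size_t *ends, size_t max_ends)
{
	static const char head[] = "hello";
	static const char tail[] = "world";
	const size_t hlen = sizeof(head) - 1;
	const size_t tlen = sizeof(tail) - 1;
	size_t first_head_end = 0;
	size_t n = 0;
	size_t i;
	int seen_head = 0;

	for (i = 0; i < len; i++) {
		size_t end = i + 1; /* exclusive end offset of the byte at i */

		/* A "world" ending here matches if a "hello" ended at or before its start. */
		if (seen_head && end >= tlen && end - tlen >= first_head_end &&
		    memcmp(buf + end - tlen, tail, tlen) == 0 && n < max_ends)
			ends[n++] = end;

		/* Remember where the earliest "hello" ends. */
		if (!seen_head && end >= hlen &&
		    memcmp(buf + end - hlen, head, hlen) == 0) {
			seen_head = 1;
			first_head_end = end;
		}
	}
	return n;
}
```

For the corpus "hello worldworld" this records the end offsets 11 and 16, with no start offset or length attached to either match.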

> 
> > 
> >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a
> > specific hardware implementation. I am wondering what this PREFIX refers
> > to:)?
> > 
> > [Jerin] Yes. Looks like it is for hardware specific implementation. Introduced
> > rte_regex_dev_attr_set/get functions to make it portable and
> > To add new implementation specific fields.
> > For example, if a rule is
> > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the
> > factor. The prefix is a literal
> > string, while the factor can contain complex regular expression constructs. As
> > a result, rule matching occurs in
> > two stages: prefix matching and factor matching.
> > 
> >       b)  user_id or user_ptr
> >       Under what kind of circumstances should an application pass value into
> > these variables for enqueue and dequeuer operations?
> > 
> > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also allocated using
> > mempool normally, on enqueue, user can specify user_id
> > If needed to in order identify the op on dequeue if required. The use case
> > could be to store the sequence number from application
> > POV or storing the mbuf ptr in which pattern is requested etc.
> > 
> > 
> >  2) rte_regex_match
> >       a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t
> > len; /**< Length of match in bytes */
> >       Looks like the matching offset is defined as *starting matching offset*
> > instead of *end matching offset*, e.g. report the offset of "a" instead of "c"
> > for pattern "abc".
> >       If so, this makes it hard to integrate software regex libraries such as
> > Hyperscan and RE2 as they only report *end matching offset* without length
> > of match.
> >       Although Hyperscan has API for *starting matching offset*, it only delivers
> > partial syntax support. So I think we have to define *end of matching offset*
> > for software solutions.
> > 
> > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs. I
> > thought application would need always the length of the match.
> > Probably we will see how other HW implementation (from Mellanox) etc. We
> > will try to abstract it, probably we can make it as function of "user
> > requested".
> > [Xiang] Yes, it will be good to make it per user request. At least from
> > Hyperscan user's point of view, start of match and match length are not
> > mandatory.
> 
> OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START
> In device configure.
> 
> Since offset+len == end, we can introduce following generic inline function.
> 
> static inline uint32_t
> rte_regex_match_end(const struct rte_regex_match *match)
> {
> 	return (uint32_t)match->offset + match->len;
> }
> 
> Example:  pattern to match is  "hello\s+world"  and data is following
> data[4] = 'h'
> data[5] = 'e'
> data[6] = 'l'
> data[7] = 'l'
> data[8] = 'o'
> data[9] = ' '
> data[10] = 'w'
> data[11] = 'o'
> data[12] = 'r'
> data[13] = 'l'
> data[14] = 'd'
> 
> if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> match->offset returns 4
> match->len returns 11
> 
> if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> driver MAY return the following(in hyperscan case)
> match->offset returns 0
> match->len returns 11 + 4
> 
> In both case(irrespective of flags, to make application life easy) rte_regex_match_end() would return 15.
> If application demands for MATCH_AS_START then driver can return match->offset returns 4 and match->len returns 11
> Aka set HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use rte_regex_match_end()
> for finding the end of the match. To make, work in all cases.
> 
> Is it OK? 
> 
Can we replace len with an end offset? We could rename "offset" to
"start_offset" and len to "end_offset" in struct rte_regex_match. Users
interested in len could compute "end_offset - start_offset".
We may also rename RTE_REGEX_DEV_CFG_MATCH_AS_START to RTE_REGEX_DEV_CFG_MATCH_START.

In your example,
if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
match->start_offset returns 4
match->end_offset returns 15

if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
match->start_offset returns 0
match->end_offset returns 15
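
A rough sketch of what this would look like from the application side. The struct and helper names (regex_match_se_sketch, regex_match_len) are hypothetical and only mirror the proposed field names; the 16-bit widths are taken from the current RFC and may change.

```c
#include <stdint.h>

/* Hypothetical match record using the proposed field names. */
struct regex_match_se_sketch {
	uint16_t start_offset; /* meaningful only when start-of-match reporting is enabled */
	uint16_t end_offset;   /* one past the last matched byte */
};

/* Match length; only meaningful when start_offset was actually reported. */
static inline uint16_t
regex_match_len(const struct regex_match_se_sketch *m)
{
	return (uint16_t)(m->end_offset - m->start_offset);
}
```

With the values from the example, {4, 15} gives a length of 11, while {0, 15} gives 15, which is just the end offset again. An application that only needs the end of the match reads end_offset directly in both configurations.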

> > 
> > 3)  rte_regex_rule_db_update()
> >     Does this mean we can dynamically add or delete rules for an already
> > generated database without recompile from scratch for hardware Regex
> > implementation?
> >     If so, this isn't possible for software solutions as they don't support
> > dynamic database update and require recompile.
> > 
> > [Jerin] rte_regex_rule_db_update() internally it would call recompile
> > function for both HW and SW.
> > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > precompiled rule database case.
> > [Xiang] OK, sounds like we have to save the original rule-set for the device in
> > order to do recompile. I see both ADD and REMOVE operators from
> > rte_regex_rule.
> > For rules with REMOVE operator, what's the expected behavior to handle
> > them for the old rule-set? Do we need to go through the old rule-set and
> > remove corresponding rules before doing recompile?
> 
> Yes.
>
I think it would be better to change rte_regex_rule_db_update() to
rte_regex_rule_compile() and have users provide the full rule-set.
That way we don't have to maintain the old rule-set and decide which rules to keep
and which to remove. We can simply recompile the new rule-set and get rid of
rte_regex_rule_op in this case.
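
For reference, this is roughly the bookkeeping a driver would need if ADD/REMOVE stays in the API. It is only a sketch under my own assumptions: rule_sketch, rule_op_sketch and rule_set_apply() are hypothetical names, rules are keyed by rule_id, and an ADD that does not fit in the saved set is silently dropped here (a real driver would report an error instead).

```c
#include <stddef.h>
#include <stdint.h>

enum rule_op_sketch { RULE_OP_ADD, RULE_OP_REMOVE };

struct rule_sketch {
	enum rule_op_sketch op;
	uint32_t rule_id;
	const char *pcre;
};

/*
 * Apply ADD/REMOVE updates to the saved rule-set in place and return the
 * new rule count. The caller would then recompile the whole set.
 */
static size_t
rule_set_apply(struct rule_sketch *set, size_t nb_set, size_t cap,
	       const struct rule_sketch *upd, size_t nb_upd)
{
	size_t i, j;

	for (i = 0; i < nb_upd; i++) {
		/* Look for an existing rule with the same id. */
		for (j = 0; j < nb_set; j++)
			if (set[j].rule_id == upd[i].rule_id)
				break;

		if (upd[i].op == RULE_OP_REMOVE) {
			if (j < nb_set) {
				/* Unordered remove: move the last rule into the hole. */
				nb_set--;
				set[j] = set[nb_set];
			}
		} else if (j < nb_set) {
			set[j] = upd[i]; /* ADD of an existing id replaces the rule */
		} else if (nb_set < cap) {
			set[nb_set++] = upd[i];
		}
	}
	return nb_set;
}
```

With a full rule-set compile API, none of this state would have to live in the driver; the application would own the list and pass all of it on each compile.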

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-09-19 13:58           ` Wang Xiang
@ 2019-09-27 14:35             ` Jerin Jacob Kollanukkaran
  2019-10-14 13:59               ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-27 14:35 UTC (permalink / raw)
  To: Wang Xiang
  Cc: Thomas Monjalon, dev, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, Hong, Yang A,
	Chang, Harry, gu.jian1, shanjiangh, zhangy.yun, lixingfu,
	wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, Ni, Hongjun, j.bromhead, deri, fc,
	arthur.su, Guy Kaneti, Smadar Fuks, Liron Himi, edwin.verplanke,
	keith.wiles

> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> 
> Hi Jerin,
> 
> Thanks for your response. More comments below and inline.
> 
> 1) I think the size of some variables (e.g. nb_matches, scan_size, matching
> offset, etc) should be increased based on what Hyperscan supports.
> 
>     a) struct rte_regex_ops:
> 
>         uint16_t scan_size => uint32_t scan_size

I think packet buffers will not be > 64K, and getting more than 64K of contiguous
DMA-able memory will be difficult in DPDK.
Other than that, rte_regex_match is 64 bits now; increasing the width of
len would increase the size of "rte_regex_match", i.e. more
bandwidth would be needed for the response.
Could other HW vendors share the maximum length
supported by their implementations? Based on that, we can decide.
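
To put numbers on the size concern, here is a small sketch. The two structs are hypothetical stand-ins (a 32-bit rule identifier plus offset and len), not the exact rte_regex_match layout from the RFC; they only show how widening offset and len changes the per-match response size.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the current layout: one 64-bit word per match. */
struct match16_sketch {
	uint32_t rule_id;
	uint16_t offset;
	uint16_t len;
};

/* Stand-in for the widened layout with 32-bit offset and len. */
struct match32_sketch {
	uint32_t rule_id;
	uint32_t offset;
	uint32_t len;
};

_Static_assert(sizeof(struct match16_sketch) == 8, "fits in one 64-bit word");
_Static_assert(sizeof(struct match32_sketch) == 12, "grows to 12 bytes");

/* Response bytes needed for nb_matches match records of each layout. */
static inline size_t
response_bytes16(size_t nb_matches)
{
	return nb_matches * sizeof(struct match16_sketch);
}

static inline size_t
response_bytes32(size_t nb_matches)
{
	return nb_matches * sizeof(struct match32_sketch);
}
```

Under these assumptions the response for a full set of 255 matches grows from 2040 to 3060 bytes, and a match record no longer fits in a single 64-bit store.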

>         uint8_t nb_actual_matches => uint64 nb_actual_matches
>         uint8_t nb_matches => uint64 nb__matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
>     b) struct rte_regex_match:
>         uint16_t offset => uint32_t offset
>         uint16_t len => uint32_t len

See above.

> 
>     c) uint16_t
>         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> *rules,
>                                  uint16_t nb_rules);
>     =>
>        uint32_t
>         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> *rules,
>                                  uint32_t nb_rules);

OK. I will change it in the next version.

> 
>     d) int
>     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
>                     const struct rte_regex_qp_conf *qp_conf);
>     =>
>        int
>     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
>                     const struct rte_regex_qp_conf *qp_conf);

OK. I will change it in the next version.

> 
>     e) struct rte_regex_dev_config:
>         uint8_t nb_max_matches => uint64_t nb_max_matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
>     f) struct rte_regex_dev_info:
>         uint8_t max_matches => uint64_t max_matches

2^64 matches will never be possible in a practical system. How about 2^16?

> 
> 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() defined.
> Are all the attributes below could be set by users? Is any of them read-only?

See below,

> /** Enumerates RegEx device attribute identifier */
> enum rte_regex_dev_attr_id {
>     RTE_REGEX_DEV_ATTR_SOCKET_ID,
>     /**< The NUMA socket id to which the device is connected or
>      * a default of zero if the socket could not be determined.
>      * datatype: *int*
>      * operation: *get*

*get* means read-only. *get* and *set* means the attribute supports both operations.

>      */
>     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
>     /**< Maximum number of matches per scan.
>      * datatype: *uint8_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
>      */
>     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
>     /**< Upper bound scan time in ns.
>      * datatype: *uint16_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
>      */
>     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
>     /**< Maximum number of prefix detected per scan.
>      * This would be useful for denial of service detection.
>      * datatype: *uint16_t*
>      * operation: *get* and *set*
>      *
>      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
>      */
> };
> 
> 3) Both RTE_REGEX_PCRE_RULE_* and
> RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
> merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have
> a unified regex_dev_capa in struct rte_regex_dev_info.

Sure. I will fix it in the next version.

> 
> 
> 4) It'll be good if we can also define synchronous matching API for users who
> want to have a one-off scan and wait for the results.

Makes sense. I will add a synchronous matching API in the next version (I understand it will be useful for SW
implementations). We can probably expose an INFO flag so that a device can advertise it as its preference.

> 
> On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> > Hi Xiang,
> >
> > Sorry for delay in response(Was busy with 19.11 proposal deadline). Please
> see inline.
> >
> > >
> > > Reply to Xiang's queries in main thread:
> > >
> > > Hi all,
> > >
> > > Some questions regarding APIs. Could you please give more insights?
> > >
> > > 1) rte_regex_ops
> > >       a) rsp_flags
> > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial
> > > match at the end of current buffer after scan.
> > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > >
> > > [Jerin] Since we need three states to represent partial match
> > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > buffer, intermediate buffers with no flag, and end of the buffer
> > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> >
> > > [Xiang] How could a user leverage these flags for matching? Suppose
> > > a large buffer is divided into multiple chunks. Will
> > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't set
> > > after scan the first chunk. Similarly, RTE_REGEX_OPS_RSP_PMI_EOJ
> > > tells a user whether to stop matching future buffers after finish the last
> chunk?
> >
> > Let me describe with an example,
> >
> > Assume,
> > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > 3) Device programmed with matching "hello\s+world" pattern
> > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > and struct rte_regex_op:: scan_size = 1024
> >
> > data[0..1021] = data don't have hello world pattern
> > data[1022] = 'h'
> > data[1023] = 'e'
> >
> > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > and struct rte_regex_op:: scan_size = 9
> >
> > data[0] = 'l'
> > data[1] = 'l'
> > data[2] = 'o'
> > data[3] = ' '
> > data[4] = 'w'
> > data[5] = 'o'
> > data[6] = 'r'
> > data[7] = 'l'
> > data[8] = 'd'
> >
> > If so,
> >
> > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in rte_regex_ops::
> > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and len 2
> >
> > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9
> >
> If the defined pattern is "hello.*world" instead of "hello\s+world", and we
> enqueue following struct rte_regex_ops:
> 
> 1) rte_regex_op:: scan_size = 1024
> 
>    data[0..1021] = data don't have hello world pattern
>    data[1022] = 'h'
>    data[1023] = 'e'
> 
> 2) rte_regex_op:: scan_size = 9
>    data[0] = 'l'
>    data[1] = 'l'
>    data[2] = 'o'
>    data[3] = ' '
>    data[4] = 'w'
>    data[5] = 'o'
>    data[6] = 'r'
>    data[7] = 'l'
>    data[8] = 'd'
> 
> 3) rte_regex_op:: scan_size = 5
>    data[0] = 'w'
>    data[1] = 'o'
>    data[2] = 'r'
>    data[3] = 'l'
>    data[4] = 'd'
> 
> Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> rsp_flags on dequeue
> Where rte_regex_match:: offset is 0 and len 4?

Yes.

> 
> I am wondering what's your expected behavior for .* or similar syntax and if
> there are syntax compatibility issues. We report all matches in Hyperscan, e.g.
> report end match offsets 11 and 16 for pattern "hello.*world" and corpus
> "hello worldworld".
> 
> BTW, not sure how other hardware devices handle cross buffer scan. Hyperscan
> doesn't reports matches for start and intermediate buffers but only reports end
> offset if a full match is found.
> 
> >
> > >
> > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition
> > > for a specific hardware implementation. I am wondering what this
> > > PREFIX refers to:)?
> > >
> > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > Introduced rte_regex_dev_attr_set/get functions to make it portable
> > > and To add new implementation specific fields.
> > > For example, if a rule is
> > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > considered the factor. The prefix is a literal string, while the
> > > factor can contain complex regular expression constructs. As a
> > > result, rule matching occurs in two stages: prefix matching and
> > > factor matching.
> > >
> > >       b)  user_id or user_ptr
> > >       Under what kind of circumstances should an application pass
> > > value into these variables for enqueue and dequeuer operations?
> > >
> > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also
> > > allocated using mempool normally, on enqueue, user can specify
> > > user_id If needed to in order identify the op on dequeue if
> > > required. The use case could be to store the sequence number from
> > > application POV or storing the mbuf ptr in which pattern is requested etc.
> > >
> > >
> > >  2) rte_regex_match
> > >       a) offset; /**< Starting Byte Position for matched rule. */
> > > and  uint16_t len; /**< Length of match in bytes */
> > >       Looks like the matching offset is defined as *starting
> > > matching offset* instead of *end matching offset*, e.g. report the offset of
> "a" instead of "c"
> > > for pattern "abc".
> > >       If so, this makes it hard to integrate software regex
> > > libraries such as Hyperscan and RE2 as they only report *end
> > > matching offset* without length of match.
> > >       Although Hyperscan has API for *starting matching offset*, it
> > > only delivers partial syntax support. So I think we have to define
> > > *end of matching offset* for software solutions.
> > >
> > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs.
> > > I thought application would need always the length of the match.
> > > Probably we will see how other HW implementation (from Mellanox)
> > > etc. We will try to abstract it, probably we can make it as function
> > > of "user requested".
> > > [Xiang] Yes, it will be good to make it per user request. At least
> > > from Hyperscan user's point of view, start of match and match length
> > > are not mandatory.
> >
> > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In
> > device configure.
> >
> > Since offset+len == end, we can introduce following generic inline function.
> >
> > static inline uint32_t
> > rte_regex_match_end(const struct rte_regex_match *match)
> > {
> > 	return (uint32_t)match->offset + match->len;
> > }
> >
> > Example:  pattern to match is  "hello\s+world"  and data is following
> > data[4] = 'h'
> > data[5] = 'e'
> > data[6] = 'l'
> > data[7] = 'l'
> > data[8] = 'o'
> > data[9] = ' '
> > data[10] = 'w'
> > data[11] = 'o'
> > data[12] = 'r'
> > data[13] = 'l'
> > data[14] = 'd'
> >
> > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > match->offset returns 4
> > match->len returns 11
> >
> > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > driver MAY return the following(in hyperscan case)
> > match->offset returns 0
> > match->len returns 11 + 4
> >
> > In both case(irrespective of flags, to make application life easy)
> > rte_regex_match_end() would return 15.
> > If application demands for MATCH_AS_START then driver can return
> > match->offset returns 4 and match->len returns 11 Aka set
> > HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use
> > rte_regex_match_end() for finding the end of the match. To make, work in all
> > cases.
> >
> > Is it OK?
> >
> Can we replace len with end offset? So we can change "offset" to "start_offset"
> and len to "end_ offset" in struct rte_regex_match. Users interested in len
> could take "end_offset - start_offset".
> We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> RTE_REGEX_DEV_CFG_MATCH_START
> 
> In your example,
> if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> match->start_offset returns 4
> match->end_offset returns 15
> 
> if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> match->start_offset returns 0
> match->end_offset returns 15


This part is a little tricky, as the HW descriptions would need to be rewritten on response.
This is one issue I foresaw earlier: coming up with an rte_regex_match
that works for all implementations without a performance penalty.

We have two HW implementations, and both return start_off and len.
Let's get input from other HW implementations on the semantics of
rte_regex_match. Based on that, we can decide how to go about it.
Thoughts from Mellanox or other vendors?



> 
> > >
> > > 3)  rte_regex_rule_db_update()
> > >     Does this mean we can dynamically add or delete rules for an
> > > already generated database without recompile from scratch for
> > > hardware Regex implementation?
> > >     If so, this isn't possible for software solutions as they don't
> > > support dynamic database update and require recompile.
> > >
> > > [Jerin] rte_regex_rule_db_update() internally it would call
> > > recompile function for both HW and SW.
> > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > > precompiled rule database case.
> > > [Xiang] OK, sounds like we have to save the original rule-set for
> > > the device in order to do recompile. I see both ADD and REMOVE
> > > operators from rte_regex_rule.
> > > For rules with REMOVE operator, what's the expected behavior to
> > > handle them for the old rule-set? Do we need to go through the old
> > > rule-set and remove corresponding rules before doing recompile?
> >
> > Yes.
> >
> I think it'll be better to change rte_regex_rule_db_update() to
> rte_regex_rule_compile() and have users to provide a full rule-set.
> So we don't have to maintain old rule-set and decide which one to keep and
> remove. We can simply recompile new rule-set and get rid of
> rte_regex_rule_op in this case.


On virtualized HW implementations, the rule database is maintained by a single
body, so the above scheme works with both SW and HW implementations.
It also makes the user's life easier, as they don't need to maintain the rules.

I don't have a preference on the rte_regex_rule_db_update() name; I can change it to
rte_regex_rule_compile() if required, keeping the above functionality. Let me know.









^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
@ 2019-09-27 14:45         ` Jerin Jacob Kollanukkaran
  2019-10-02  5:53           ` Shahaf Shuler
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-09-27 14:45 UTC (permalink / raw)
  To: 'Shahaf Shuler', 'Thomas Monjalon',
	'dev@dpdk.org'
  Cc: Pavan Nikhilesh Bhagavatula, 'Hemant Agrawal',
	'Opher Reviv', 'Alex Rosenbaum',
	Dovrat Zifroni, Prasun Kapoor, 'Nipun Gupta',
	'Wang, Xiang W', 'Richardson, Bruce',
	'yang.a.hong@intel.com', 'harry.chang@intel.com',
	'gu.jian1@zte.com.cn',
	'shanjiangh@chinatelecom.cn',
	'zhangy.yun@chinatelecom.cn',
	'lixingfu@huachentel.com', 'wushuai@inspur.com',
	'yuyingxia@yxlink.com',
	'fanchenggang@sunyainfo.com',
	'davidfgao@tencent.com',
	'liuzhong1@chinaunicom.cn',
	'zhaoyong11@huawei.com', 'oc@yunify.com',
	'jim@netgate.com', 'hongjun.ni@intel.com',
	'j.bromhead@titan-ic.com', 'deri@ntop.org',
	'fc@napatech.com', 'arthur.su@lionic.com'

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Tuesday, September 10, 2019 4:33 PM
> To: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Hemant
> Agrawal <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun Gupta
> <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com; yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > Hi Jerin,
> 
> Hi Shahaf,
> 
> Sorry for delay in response(Was busy with 19.11 proposal deadline). Please see
> inline.
> 
> > >
> > > RegEx pattern matching applications:
> > > • Next Generation Firewalls (NGFW)
> > > • Deep Packet and Flow Inspection (DPI)
> > > • Intrusion Prevention Systems (IPS)
> > > • DDoS Mitigation
> > > • Network Monitoring
> > > • Data Loss Prevention (DLP)
> > > • Smart NICs
> > > • Grammar based content processing
> > > • URL, spam and adware filtering
> > > • Advanced auditing and policing of user/application security policies
> > > • Financial data mining - parsing of streamed financial feeds
> >
> > I think two more important use cases to add (at least in the doc of
> > this subsystem) are:
> > * application recognition
> > * memory introspection
> 
> Sure. Will add the following from John as well.
> 
> # Natural Language Processing (NLP)
> # Sentiment Analysis
> # Big Data database acceleration (Spark, Hadoop etc.)
> # Computational Storage
> 
> >
> >
> > > +/**
> > > + * Update the rule database of a RegEx device.
> > > + *
> > > + * @param dev_id RegEx device identifier
> > > + * @param rules
> > > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> > > + *   structure which contain the regex rules attributes to be updated in
> > > + *   rule database.
> > > + * @param nb_rules
> > > + *   The number of PCRE rules to update the rule database.
> > > + *
> > > + * @return
> > > + *   The number of regex rules actually updated on the regex device's rule
> > > + *   database. The return value can be less than the value of the *nb_rules*
> > > + *   parameter when the regex devices fails to update the rule database or
> > > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > > + *   at the end of *rules* are not consumed and the caller has to take
> > > + *   care of them and rte_errno is set accordingly.
> > > + *   Possible errno values include:
> > > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > > + *   - -ENOSPC: No space available in rule database.
> > > + *
> > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > > + */
> > > +uint16_t rte_regex_rule_db_update(uint8_t dev_id,
> > > +			 const struct rte_regex_rule *rules,
> > > +			 uint16_t nb_rules);
> >
> > I think the function name is not too informative. If this function
> > meant to compile the rule then it should be explicit on the function name.
> 
> It is meant to be compile the rules and then  update the rule database.
> 
> I think, we can have either 1 or 2. Let me know your preference or If you have
> any name suggestion. I will change it accordingly.
> 
> 1. rte_regex_rule_db_compile()
> 2. rte_regex_rule_db_compile_update()


@Shahaf Shuler, Thoughts?


> 
> 
> > > +
> > > + */
> > > +struct rte_regex_ops {
> > > +
> > > +	/* W4 */
> > > +	RTE_STD_C11
> > > +	union {
> > > +		uint64_t user_id;
> > > +		/**< Application specific opaque value. An application may
> > > use
> > > +		 * this field to hold application specific value to share
> > > +		 * between dequeue and enqueue operation.
> > > +		 * Implementation should not modify this field.
> > > +		 */
> > > +		void *user_ptr;
> > > +		/**< Pointer representation of *user_id* */
> > > +	};
> >
> > Since we target the regex subsystem for both regex and DPI I think it
> > will be good to add another uint64_t field called connection_id.
> > Device that support DPI can refer to it as another match able field
> > when looking up for matches on the given buffer.
> >
> > This field is different from the user_id, as it is not opaque for the device.
> 
> Is this driver specific storage place where application should not touch it?
> 
> If not, Could you share the data flow of this field? Ie. Who "write" this Field and
> who "read" this field.

@Shahaf Shuler Thoughts?

Based on your input, I will update the next version.

> 
> This is just for documentation, In any event we can add new fields.
> 
> If it is only for driver usage then I think, some driver may need more 8B
> Storage. In that case I think, each driver can add its on field After W4(i.e
> existing user_id) and introduce new field called match_offset in struct
> rte_regex_ops
> 
> ie. struct rte_regex_match *matches == ops + ops-> match_offset; so that, Each
> driver can add enough driver specific metadata.
> 
> 
> 
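
To make the match_offset idea quoted above a bit more concrete, here is a
minimal sketch. All sketch_* names and field layouts are hypothetical and are
not the RFC structures. One detail worth noting: the addition has to be done on
a byte pointer, since "ops + ops->match_offset" on a struct pointer would be
scaled by sizeof(*ops).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit match record (stand-in for struct rte_regex_match). */
struct sketch_match {
	uint16_t group_id;
	uint16_t rule_id;
	uint16_t offset;
	uint16_t len;
};

/* Hypothetical common part of the op, shared by all drivers. */
struct sketch_ops {
	uint64_t user_id;      /* opaque application value (W4 in the RFC) */
	uint16_t match_offset; /* bytes from the start of ops to the matches */
	uint16_t nb_matches;
};

/* Hypothetical driver-specific op: private metadata sits between the
 * common part and the match array.
 */
struct sketch_drv_ops {
	struct sketch_ops ops;
	uint64_t drv_meta[2];
	struct sketch_match matches[4];
};

/* Locate the match array using byte arithmetic on the ops pointer. */
static struct sketch_match *
sketch_ops_matches(struct sketch_ops *ops)
{
	assert(ops != NULL && ops->match_offset >= sizeof(*ops));
	return (struct sketch_match *)((uint8_t *)ops + ops->match_offset);
}
```

With this layout a driver would set match_offset to
offsetof(struct sketch_drv_ops, matches) when it prepares the op, and the
application would only ever go through the accessor.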


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-09-27 14:45         ` Jerin Jacob Kollanukkaran
@ 2019-10-02  5:53           ` Shahaf Shuler
  2019-10-02  8:31             ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 62+ messages in thread
From: Shahaf Shuler @ 2019-10-02  5:53 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon, 'dev@dpdk.org'
  Cc: Pavan Nikhilesh Bhagavatula, 'Hemant Agrawal',
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	'Nipun Gupta', 'Wang, Xiang W',
	'Richardson, Bruce', 'yang.a.hong@intel.com',
	'harry.chang@intel.com', 'gu.jian1@zte.com.cn',
	'shanjiangh@chinatelecom.cn',
	'zhangy.yun@chinatelecom.cn',
	'lixingfu@huachentel.com', 'wushuai@inspur.com',
	'yuyingxia@yxlink.com',
	'fanchenggang@sunyainfo.com',
	'davidfgao@tencent.com',
	'liuzhong1@chinaunicom.cn',
	'zhaoyong11@huawei.com', 'oc@yunify.com',
	'jim@netgate.com', 'hongjun.ni@intel.com',
	'j.bromhead@titan-ic.com', 'deri@ntop.org',
	'fc@napatech.com', 'arthur.su@lionic.com'

Friday, September 27, 2019 5:46 PM, Jerin Jacob Kollanukkaran:
> subsystem
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran
> > Sent: Tuesday, September 10, 2019 4:33 PM
> > To: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; dev@dpdk.org
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Hemant
> > Agrawal <hemant.agrawal@nxp.com>; Opher Reviv
> <opher@mellanox.com>;
> > Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; Nipun
> Gupta
> > <nipun.gupta@nxp.com>; Wang, Xiang W <xiang.w.wang@intel.com>;
> > Richardson, Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn;
> > lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com;
> > fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> > liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> > jim@netgate.com; hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> > deri@ntop.org; fc@napatech.com; arthur.su@lionic.com
> > Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > > Hi Jerin,
> >
> > Hi Shahaf,
> >
> > > > + *
> > > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > > > + */
> > > > +uint16_t rte_regex_rule_db_update(uint8_t dev_id,
> > > > +			 const struct rte_regex_rule *rules,
> > > > +			 uint16_t nb_rules);
> > >
> > > I think the function name is not too informative. If this function
> > > meant to compile the rule then it should be explicit on the function
> name.
> >
> > It is meant to be compile the rules and then  update the rule database.
> >
> > I think, we can have either 1 or 2. Let me know your preference or If
> > you have any name suggestion. I will change it accordingly.
> >
> > 1. rte_regex_rule_db_compile()
> > 2. rte_regex_rule_db_compile_update()
> 
> 
> @Shahaf Shuler, Thoughts?

IMO we should have two separate functions: one that only compiles and one that only updates.

So I would prefer #1, with the addition (if not already present) of an API to update rules.
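
A rough sketch of the split I have in mind is below. It is only an
illustration: every sketch_* name is hypothetical, the "compiled database" is
just a validated copy of the rules, and nothing here is meant as the final
regexdev signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_RULES 8

/* Hypothetical stand-in for struct rte_regex_rule. */
struct sketch_rule {
	uint16_t group_id;
	uint16_t rule_id;
	const char *pcre;
};

/* Hypothetical compiled database: here just a validated copy of the rules. */
struct sketch_rule_db {
	struct sketch_rule rules[SKETCH_MAX_RULES];
	uint32_t nb_rules;
};

/* Hypothetical device state: the database currently used for matching. */
struct sketch_dev {
	struct sketch_rule_db active;
};

/* Step 1: compile only. No device state is touched.
 * Returns the number of rules accepted; stops at the first invalid rule.
 */
static uint32_t
sketch_rule_db_compile(const struct sketch_rule *rules, uint32_t nb_rules,
		       struct sketch_rule_db *db)
{
	uint32_t i;

	assert(rules != NULL && db != NULL);
	db->nb_rules = 0;
	for (i = 0; i < nb_rules && i < SKETCH_MAX_RULES; i++) {
		if (rules[i].pcre == NULL || rules[i].pcre[0] == '\0')
			break; /* a real compiler would set an errno here */
		db->rules[i] = rules[i];
		db->nb_rules++;
	}
	return db->nb_rules;
}

/* Step 2: update only. Installs an already compiled database. */
static int
sketch_rule_db_update(struct sketch_dev *dev, const struct sketch_rule_db *db)
{
	if (dev == NULL || db == NULL || db->nb_rules == 0)
		return -1;
	memcpy(&dev->active, db, sizeof(*db));
	return 0;
}
```

The point of the split is that a failed or partial compile never disturbs the
rules the device is currently matching against.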

> 
> 
> >
> >
> > > > +
> > > > + */
> > > > +struct rte_regex_ops {
> > > > +
> > > > +	/* W4 */
> > > > +	RTE_STD_C11
> > > > +	union {
> > > > +		uint64_t user_id;
> > > > +		/**< Application specific opaque value. An application may
> > > > use
> > > > +		 * this field to hold application specific value to share
> > > > +		 * between dequeue and enqueue operation.
> > > > +		 * Implementation should not modify this field.
> > > > +		 */
> > > > +		void *user_ptr;
> > > > +		/**< Pointer representation of *user_id* */
> > > > +	};
> > >
> > > Since we target the regex subsystem for both regex and DPI I think
> > > it will be good to add another uint64_t field called connection_id.
> > > Device that support DPI can refer to it as another match able field
> > > when looking up for matches on the given buffer.
> > >
> > > This field is different from the user_id, as it is not opaque for the device.
> >
> > Is this driver specific storage place where application should not touch it?
> >
> > If not, Could you share the data flow of this field? Ie. Who "write"
> > this Field and who "read" this field.

The application writes to the field and the device reads from it.
Unlike user_ptr, which is completely opaque to the device, the connection_id field will have some meaning to the device (e.g. DPI rules can apply to it).
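
As a toy illustration of that data flow (all sketch_* names are hypothetical,
and how a real device would consume connection_id is still to be defined):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced view of struct rte_regex_ops. */
struct sketch_ops {
	union {
		uint64_t user_id;   /* opaque to the device, never inspected */
		void *user_ptr;
	};
	uint64_t connection_id;     /* written by the app, read by the device */
	const char *buf;            /* NUL-terminated payload for this toy */
};

/* Hypothetical DPI rule keyed on the connection as well as the payload. */
struct sketch_dpi_rule {
	uint64_t connection_id;
	const char *literal;
};

/* Toy "device" scan: the rule applies only to ops of its connection.
 * Returns 1 on match, 0 otherwise. Reads connection_id, ignores user_id.
 */
static int
sketch_dev_scan(const struct sketch_ops *op, const struct sketch_dpi_rule *rule)
{
	assert(op != NULL && rule != NULL);
	if (op->connection_id != rule->connection_id)
		return 0;
	return strstr(op->buf, rule->literal) != NULL;
}
```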

> 
> @Shahaf Shuler Thoughts?
> 
> Based on your input, I will update the next version.
> 
> >
> > This is just for documentation, In any event we can add new fields.
> >
> > If it is only for driver usage then I think, some driver may need more
> > 8B Storage. In that case I think, each driver can add its on field
> > After W4(i.e existing user_id) and introduce new field called
> > match_offset in struct rte_regex_ops
> >
> > ie. struct rte_regex_match *matches == ops + ops-> match_offset; so
> > that, Each driver can add enough driver specific metadata.
> >
> >
> >


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-10-02  5:53           ` Shahaf Shuler
@ 2019-10-02  8:31             ` Jerin Jacob Kollanukkaran
  2019-10-02  8:52               ` Shahaf Shuler
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-10-02  8:31 UTC (permalink / raw)
  To: Shahaf Shuler, Thomas Monjalon, 'dev@dpdk.org'
  Cc: Pavan Nikhilesh Bhagavatula, 'Hemant Agrawal',
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	'Nipun Gupta', 'Wang, Xiang W',
	'Richardson, Bruce', 'yang.a.hong@intel.com',
	'harry.chang@intel.com', 'gu.jian1@zte.com.cn',
	'shanjiangh@chinatelecom.cn',
	'zhangy.yun@chinatelecom.cn',
	'lixingfu@huachentel.com', 'wushuai@inspur.com',
	'yuyingxia@yxlink.com',
	'fanchenggang@sunyainfo.com',
	'davidfgao@tencent.com',
	'liuzhong1@chinaunicom.cn',
	'zhaoyong11@huawei.com', 'oc@yunify.com',
	'jim@netgate.com', 'hongjun.ni@intel.com',
	'j.bromhead@titan-ic.com', 'deri@ntop.org',
	'fc@napatech.com', 'arthur.su@lionic.com'

> -----Original Message-----
> From: Shahaf Shuler <shahafs@mellanox.com>
> Sent: Wednesday, October 2, 2019 11:23 AM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W' <xiang.w.wang@intel.com>;
> 'Richardson, Bruce' <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> <yang.a.hong@intel.com>; 'harry.chang@intel.com' <harry.chang@intel.com>;
> 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>; 'shanjiangh@chinatelecom.cn'
> <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> <lixingfu@huachentel.com>; 'wushuai@inspur.com' <wushuai@inspur.com>;
> 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-ic.com'
> <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> <arthur.su@lionic.com>
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > > > I think the function name is not too informative. If this function
> > > > meant to compile the rule then it should be explicit on the
> > > > function
> > name.
> > >
> > > It is meant to be compile the rules and then  update the rule database.
> > >
> > > I think, we can have either 1 or 2. Let me know your preference or
> > > If you have any name suggestion. I will change it accordingly.
> > >
> > > 1. rte_regex_rule_db_compile()
> > > 2. rte_regex_rule_db_compile_update()
> >
> >
> > @Shahaf Shuler, Thoughts?
> 
> IMO we should have two separate functions - one to only compile. One to only
> update.
> 
> So I would prefer #1, with addition (if not already present) of API to update
> rules.


OK. Will change it in next version.


> 
> >
> >
> > >
> > >
> > > > > +
> > > > > + */
> > > > > +struct rte_regex_ops {
> > > > > +
> > > > > +	/* W4 */
> > > > > +	RTE_STD_C11
> > > > > +	union {
> > > > > +		uint64_t user_id;
> > > > > +		/**< Application specific opaque value. An application
> may
> > > > > use
> > > > > +		 * this field to hold application specific value to share
> > > > > +		 * between dequeue and enqueue operation.
> > > > > +		 * Implementation should not modify this field.
> > > > > +		 */
> > > > > +		void *user_ptr;
> > > > > +		/**< Pointer representation of *user_id* */
> > > > > +	};
> > > >
> > > > Since we target the regex subsystem for both regex and DPI I think
> > > > it will be good to add another uint64_t field called connection_id.
> > > > Device that support DPI can refer to it as another match able
> > > > field when looking up for matches on the given buffer.
> > > >
> > > > This field is different from the user_id, as it is not opaque for the device.
> > >
> > > Is this driver specific storage place where application should not touch it?
> > >
> > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > this Field and who "read" this field.
> 
> Application writes to the field. Device reads from this fields.
> Unlike the user_ptr which is complete opaque to the device, connection_id field
> will have some meaning (e.g. DPI rules can apply on it).

Will you be connecting the value to rte_flow etc. to get the complete data flow?
I understand that the application writes to this field, but I am not sure what values
need to be written and how it will be connected in the overall scheme of things.
I am not even sure what to write in the Doxygen comment for this field.

Can we add this field once we have the complete data flow? Since the API is
experimental, we can always add a new field.



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-10-02  8:31             ` Jerin Jacob Kollanukkaran
@ 2019-10-02  8:52               ` Shahaf Shuler
  2019-10-02  9:34                 ` Jerin Jacob Kollanukkaran
  0 siblings, 1 reply; 62+ messages in thread
From: Shahaf Shuler @ 2019-10-02  8:52 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran, Thomas Monjalon, 'dev@dpdk.org'
  Cc: Pavan Nikhilesh Bhagavatula, 'Hemant Agrawal',
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	'Nipun Gupta', 'Wang, Xiang W',
	'Richardson, Bruce', 'yang.a.hong@intel.com',
	'harry.chang@intel.com', 'gu.jian1@zte.com.cn',
	'shanjiangh@chinatelecom.cn',
	'zhangy.yun@chinatelecom.cn',
	'lixingfu@huachentel.com', 'wushuai@inspur.com',
	'yuyingxia@yxlink.com',
	'fanchenggang@sunyainfo.com',
	'davidfgao@tencent.com',
	'liuzhong1@chinaunicom.cn',
	'zhaoyong11@huawei.com', 'oc@yunify.com',
	'jim@netgate.com', 'hongjun.ni@intel.com',
	'j.bromhead@titan-ic.com', 'deri@ntop.org',
	'fc@napatech.com', 'arthur.su@lionic.com'

Wednesday, October 2, 2019 11:32 AM, Jerin Jacob Kollanukkaran:
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > -----Original Message-----
> > From: Shahaf Shuler <shahafs@mellanox.com>
> > Sent: Wednesday, October 2, 2019 11:23 AM
> > To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> > <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> > Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> > Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv
> <opher@mellanox.com>;
> > Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> > Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W'
> > <xiang.w.wang@intel.com>; 'Richardson, Bruce'
> <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> > <yang.a.hong@intel.com>; 'harry.chang@intel.com'
> > <harry.chang@intel.com>; 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>;
> 'shanjiangh@chinatelecom.cn'
> > <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> > <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> > <lixingfu@huachentel.com>; 'wushuai@inspur.com'
> <wushuai@inspur.com>;
> > 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> > 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> > 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> > 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> > 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> > <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> > 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-
> ic.com'
> > <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> > 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> > <arthur.su@lionic.com>
> > Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> > subsystem
> >
> > > > > I think the function name is not too informative. If this
> > > > > function meant to compile the rule then it should be explicit on
> > > > > the function
> > > name.
> > > >
> > > > It is meant to be compile the rules and then  update the rule database.
> > > >
> > > > I think, we can have either 1 or 2. Let me know your preference or
> > > > If you have any name suggestion. I will change it accordingly.
> > > >
> > > > 1. rte_regex_rule_db_compile()
> > > > 2. rte_regex_rule_db_compile_update()
> > >
> > >
> > > @Shahaf Shuler, Thoughts?
> >
> > IMO we should have two separate functions - one to only compile. One
> > to only update.
> >
> > So I would prefer #1, with addition (if not already present) of API to
> > update rules.
> 
> 
> OK. Will change it in next version.
> 
> 
> >
> > >
> > >
> > > >
> > > >
> > > > > > +
> > > > > > + */
> > > > > > +struct rte_regex_ops {
> > > > > > +
> > > > > > +	/* W4 */
> > > > > > +	RTE_STD_C11
> > > > > > +	union {
> > > > > > +		uint64_t user_id;
> > > > > > +		/**< Application specific opaque value. An
> application
> > may
> > > > > > use
> > > > > > +		 * this field to hold application specific value to share
> > > > > > +		 * between dequeue and enqueue operation.
> > > > > > +		 * Implementation should not modify this field.
> > > > > > +		 */
> > > > > > +		void *user_ptr;
> > > > > > +		/**< Pointer representation of *user_id* */
> > > > > > +	};
> > > > >
> > > > > Since we target the regex subsystem for both regex and DPI I
> > > > > think it will be good to add another uint64_t field called
> connection_id.
> > > > > Device that support DPI can refer to it as another match able
> > > > > field when looking up for matches on the given buffer.
> > > > >
> > > > > This field is different from the user_id, as it is not opaque for the
> device.
> > > >
> > > > Is this driver specific storage place where application should not touch
> it?
> > > >
> > > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > > this Field and who "read" this field.
> >
> > Application writes to the field. Device reads from this fields.
> > Unlike the user_ptr which is complete opaque to the device,
> > connection_id field will have some meaning (e.g. DPI rules can apply on it).
> 
> Will you be connecting the value to rte_flow etc to get the complete data
> flow.
> I understand applications writes to this field, But I am not sure what values
> Needs to be written and how it will be connected in overall scheme of things.
> I am not sure even what to write doxgygen comment for this field.
> 
> Can we add this field once we have the complete data flow?. Since it is
> Experimental we can always add new field.

Yes. We can revisit it later, as long as we agree that such a field can be added.

> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-10-02  8:52               ` Shahaf Shuler
@ 2019-10-02  9:34                 ` Jerin Jacob Kollanukkaran
  0 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-10-02  9:34 UTC (permalink / raw)
  To: Shahaf Shuler, Thomas Monjalon, 'dev@dpdk.org'
  Cc: Pavan Nikhilesh Bhagavatula, 'Hemant Agrawal',
	Opher Reviv, Alex Rosenbaum, Dovrat Zifroni, Prasun Kapoor,
	'Nipun Gupta', 'Wang, Xiang W',
	'Richardson, Bruce', 'yang.a.hong@intel.com',
	'harry.chang@intel.com', 'gu.jian1@zte.com.cn',
	'shanjiangh@chinatelecom.cn',
	'zhangy.yun@chinatelecom.cn',
	'lixingfu@huachentel.com', 'wushuai@inspur.com',
	'yuyingxia@yxlink.com',
	'fanchenggang@sunyainfo.com',
	'davidfgao@tencent.com',
	'liuzhong1@chinaunicom.cn',
	'zhaoyong11@huawei.com', 'oc@yunify.com',
	'jim@netgate.com', 'hongjun.ni@intel.com',
	'j.bromhead@titan-ic.com', 'deri@ntop.org',
	'fc@napatech.com', 'arthur.su@lionic.com'

> -----Original Message-----
> From: Shahaf Shuler <shahafs@mellanox.com>
> Sent: Wednesday, October 2, 2019 2:23 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>; Thomas Monjalon
> <thomas@monjalon.net>; 'dev@dpdk.org' <dev@dpdk.org>
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; 'Hemant
> Agrawal' <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>; 'Nipun
> Gupta' <nipun.gupta@nxp.com>; 'Wang, Xiang W' <xiang.w.wang@intel.com>;
> 'Richardson, Bruce' <bruce.richardson@intel.com>; 'yang.a.hong@intel.com'
> <yang.a.hong@intel.com>; 'harry.chang@intel.com' <harry.chang@intel.com>;
> 'gu.jian1@zte.com.cn' <gu.jian1@zte.com.cn>; 'shanjiangh@chinatelecom.cn'
> <shanjiangh@chinatelecom.cn>; 'zhangy.yun@chinatelecom.cn'
> <zhangy.yun@chinatelecom.cn>; 'lixingfu@huachentel.com'
> <lixingfu@huachentel.com>; 'wushuai@inspur.com' <wushuai@inspur.com>;
> 'yuyingxia@yxlink.com' <yuyingxia@yxlink.com>;
> 'fanchenggang@sunyainfo.com' <fanchenggang@sunyainfo.com>;
> 'davidfgao@tencent.com' <davidfgao@tencent.com>;
> 'liuzhong1@chinaunicom.cn' <liuzhong1@chinaunicom.cn>;
> 'zhaoyong11@huawei.com' <zhaoyong11@huawei.com>; 'oc@yunify.com'
> <oc@yunify.com>; 'jim@netgate.com' <jim@netgate.com>;
> 'hongjun.ni@intel.com' <hongjun.ni@intel.com>; 'j.bromhead@titan-ic.com'
> <j.bromhead@titan-ic.com>; 'deri@ntop.org' <deri@ntop.org>;
> 'fc@napatech.com' <fc@napatech.com>; 'arthur.su@lionic.com'
> <arthur.su@lionic.com>
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > > > > >
> > > > > > Since we target the regex subsystem for both regex and DPI I
> > > > > > think it will be good to add another uint64_t field called
> > connection_id.
> > > > > > Device that support DPI can refer to it as another match able
> > > > > > field when looking up for matches on the given buffer.
> > > > > >
> > > > > > This field is different from the user_id, as it is not opaque
> > > > > > for the
> > device.
> > > > >
> > > > > Is this driver specific storage place where application should
> > > > > not touch
> > it?
> > > > >
> > > > > If not, Could you share the data flow of this field? Ie. Who "write"
> > > > > this Field and who "read" this field.
> > >
> > > Application writes to the field. Device reads from this fields.
> > > Unlike the user_ptr which is complete opaque to the device,
> > > connection_id field will have some meaning (e.g. DPI rules can apply on it).
> >
> > Will you be connecting the value to rte_flow etc to get the complete
> > data flow.
> > I understand applications writes to this field, But I am not sure what
> > values Needs to be written and how it will be connected in overall scheme of
> things.
> > I am not sure even what to write doxgygen comment for this field.
> >
> > Can we add this field once we have the complete data flow?. Since it
> > is Experimental we can always add new field.
> 
> Yes. We can revisit it later, so long we agree that such field can be added.

Yes. DPI inline support is a valid use case. We can add that support
when the data flow is clear and HW support is available.




> 
> >


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-09-27 14:35             ` Jerin Jacob Kollanukkaran
@ 2019-10-14 13:59               ` Wang Xiang
  2020-01-26 11:55                 ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2019-10-14 13:59 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran
  Cc: Thomas Monjalon, dev, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, Hong, Yang A,
	Chang, Harry, gu.jian1, shanjiangh, zhangy.yun, lixingfu,
	wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, Ni, Hongjun, j.bromhead, deri, fc,
	arthur.su, Guy Kaneti, Smadar Fuks, Liron Himi, edwin.verplanke,
	keith.wiles

On Fri, Sep 27, 2019 at 02:35:00PM +0000, Jerin Jacob Kollanukkaran wrote:
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > 
> > Hi Jerin,
> > 
> > Thanks for your response. More comments below and inline.
> > 
> > 1) I think the size of some variables (e.g. nb_matches, scan_size, matching
> > offset, etc) should be increased based on what Hyperscan supports.
> > 
> >     a) struct rte_regex_ops:
> > 
> >         uint16_t scan_size => uint32_t scan_size
> 
> I think, packet buffers will not be > 64K and getting more than contiguous
> 64K DMAable memory will be difficult in DPDK.
> Other than that, rte_regex_match is 64bit now, increasing width of
> Len could increase the size of  "rte_regex_match". i.e Need more
> Bandwidth for response. 
> Could other HW implementations share the views on max length
> is supported on their implementation? Based on that we can decide.
>
OK, let's gather input from the HW implementations.
> 
> >         uint8_t nb_actual_matches => uint64_t nb_actual_matches
> >         uint8_t nb_matches => uint64_t nb_matches
> 
> 2^64 matches will be never possible in practical system. How about 2^16.
>
I think the number of matches depends on the total number of rules and the
scan size. Based on the definitions (16-bit nb_rules_per_group,
16-bit nb_groups and 16-bit scan size), the maximum possible number of matches
could exceed 2^16. Users may get only partial matches in this case, while
Hyperscan makes no such compromise. It would also be good to check other HW
implementations.
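
As a rough worked example of that bound: assuming (my simplification) that each
rule reports at most one match ending at each byte offset, 100 rules over a
1024-byte buffer can already produce 102400 matches, which is more than 2^16.
The helpers below are hypothetical and only do the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Loose upper bound on matches for one scan, assuming every rule reports
 * at most one match ending at each byte offset of the buffer.
 */
static uint64_t
sketch_max_matches(uint16_t nb_groups, uint16_t nb_rules_per_group,
		   uint16_t scan_size)
{
	uint64_t nb_rules = (uint64_t)nb_groups * nb_rules_per_group;

	return nb_rules * scan_size;
}

/* Does a match count fit in an unsigned counter of the given width? */
static int
sketch_fits(uint64_t nb_matches, unsigned int counter_bits)
{
	assert(counter_bits >= 1 && counter_bits <= 64);
	if (counter_bits == 64)
		return 1;
	return nb_matches < (UINT64_C(1) << counter_bits);
}
```

With all three inputs at their 16-bit maximum the bound stays below 2^48, so a
64-bit counter is always sufficient, while 8-bit and 16-bit counters are not.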
>
> > 
> >     b) struct rte_regex_match:
> >         uint16_t offset => uint32_t offset
> >         uint16_t len => uint32_t len
> 
> See above.
> 
> > 
> >     c) uint16_t
> >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> >                                  uint16_t nb_rules);
> >     =>
> >        uint32_t
> >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > *rules,
> >                                  uint32_t nb_rules);
> 
> OK. I will change it next version.
> 
> > 
> >     d) int
> >     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> >                     const struct rte_regex_qp_conf *qp_conf);
> >     =>
> >        int
> >     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> >                     const struct rte_regex_qp_conf *qp_conf);
> 
> OK. I will change it next version.
> 
> > 
> >     e) struct rte_regex_dev_config:
> >         uint8_t nb_max_matches => uint64_t nb_max_matches
> 
> 2^64 matches will be never possible in practical system. How about 2^16.
>
See above.
>
> > 
> >     f) struct rte_regex_dev_info:
> >         uint8_t max_matches => uint64_t max_matches
> 
> 2^64 matches will be never possible in practical system. How about 2^16.
>
See above.
>
> > 
> > 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set() defined.
> > Are all the attributes below could be set by users? Is any of them read-only?
> 
> See below,
> 
> > /** Enumerates RegEx device attribute identifier */
> > enum rte_regex_dev_attr_id {
> >     RTE_REGEX_DEV_ATTR_SOCKET_ID,
> >     /**< The NUMA socket id to which the device is connected or
> >      * a default of zero if the socket could not be determined.
> >      * datatype: *int*
> >      * operation: *get*
> 
> *get*  means read only. *get* and *set* means it support both operation
> 
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> >     /**< Maximum number of matches per scan.
> >      * datatype: *uint8_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> >     /**< Upper bound scan time in ns.
> >      * datatype: *uint16_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> >      */
> >     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> >     /**< Maximum number of prefix detected per scan.
> >      * This would be useful for denial of service detection.
> >      * datatype: *uint16_t*
> >      * operation: *get* and *set*
> >      *
> >      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> >      */
> > };
> > 
> > 3) Both RTE_REGEX_PCRE_RULE_* and
> > RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can we
> > merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and have
> > a unified regex_dev_capa in struct rte_regex_dev_info.
> 
> Sure. I will fix it next version.
> 
> > 
> > 
> > 4) It'll be good if we can also define synchronous matching API for users who
> > want to have a one-off scan and wait for the results.
> 
> Makes sense. I will add a synchronous matching API in the next version (I understand it will be useful for SW
> implementations). Probably expose it as an INFO flag to expose it as a preference.
> 
> > 
> > On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran wrote:
> > > Hi Xiang,
> > >
> > > Sorry for delay in response(Was busy with 19.11 proposal deadline). Please
> > see inline.
> > >
> > > >
> > > > Reply to Xiang's queries in main thread:
> > > >
> > > > Hi all,
> > > >
> > > > Some questions regarding APIs. Could you please give more insights?
> > > >
> > > > 1) rte_regex_ops
> > > >       a) rsp_flags
> > > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial
> > > > match at the end of current buffer after scan.
> > > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > > >
> > > > [Jerin] Since we need three states to represent partial match
> > > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > > buffer, intermediate buffers with no flag, and end of the buffer
> > > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> > >
> > > > [Xiang] How could a user leverage these flags for matching? Suppose
> > > > a large buffer is divided into multiple chunks. Will
> > > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't set
> > > > after scan the first chunk. Similarly, RTE_REGEX_OPS_RSP_PMI_EOJ
> > > > tells a user whether to stop matching future buffers after finish the last
> > chunk?
> > >
> > > Let me describe with an example,
> > >
> > > Assume,
> > > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > 3) Device programmed with matching "hello\s+world" pattern
> > > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > and struct rte_regex_op:: scan_size = 1024
> > >
> > > > data[0..1021] = data don't have hello world pattern
> > > > data[1022] = 'h'
> > > > data[1023] = 'e'
> > >
> > > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > and struct rte_regex_op:: scan_size = 9
> > >
> > > data[0] = 'l'
> > > data[1] = 'l'
> > > data[2] = 'o'
> > > data[3] = ' '
> > > data[4] = 'w'
> > > data[5] = 'o'
> > > data[6] = 'r'
> > > data[7] = 'l'
> > > data[8] = 'd'
> > >
> > > If so,
> > >
> > > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in rte_regex_ops::
> > > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and len 2
> > >
> > > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> > > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9
> > >
> > If the defined pattern is "hello.*world" instead of "hello\s+world", and we
> > enqueue following struct rte_regex_ops:
> > 
> > 1) rte_regex_op:: scan_size = 1024
> > 
> > >    data[0..1021] = data don't have hello world pattern
> >    data[1022] = 'h'
> >    data[1023] = 'e'
> > 
> > 2) rte_regex_op:: scan_size = 9
> >    data[0] = 'l'
> >    data[1] = 'l'
> >    data[2] = 'o'
> >    data[3] = ' '
> >    data[4] = 'w'
> >    data[5] = 'o'
> >    data[6] = 'r'
> >    data[7] = 'l'
> >    data[8] = 'd'
> > 
> > 3) rte_regex_op:: scan_size = 5
> >    data[0] = 'w'
> >    data[1] = 'o'
> >    data[2] = 'r'
> >    data[3] = 'l'
> >    data[4] = 'd'
> > 
> > Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in rte_regex_ops::
> > rsp_flags on dequeue
> > Where rte_regex_match:: offset is 0 and len 4?
> 
> Yes.
> 
> > 
> > I am wondering what's your expected behavior for .* or similar syntax and if
> > there are syntax compatability issues. We report all matches in Hyperscan, e.g.
> > report end match offsets 11 and 16 for pattern "hello.*world" and corpus
> > "hello worldworld".
> > 
> > BTW, not sure how other hardware devices handle cross buffer scan. Hyperscan
> > doesn't reports matches for start and intermediate buffers but only reports end
> > offset if a full match is found.
> > 
> > >
> > > >
> > > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition
> > > > for a specific hardware implementation. I am wondering what this
> > > > PREFIX refers to:)?
> > > >
> > > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > > Introduced rte_regex_dev_attr_set/get functions to make it portable
> > > > and To add new implementation specific fields.
> > > > For example, if a rule is
> > > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > > considered the factor. The prefix is a literal string, while the
> > > > factor can contain complex regular expression constructs. As a
> > > > result, rule matching occurs in two stages: prefix matching and
> > > > factor matching.
> > > >
> > > >       b)  user_id or user_ptr
> > > >       Under what kind of circumstances should an application pass
> > > > value into these variables for enqueue and dequeuer operations?
> > > >
> > > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also
> > > > allocated using mempool normally, on enqueue, user can specify
> > > > user_id If needed to in order identify the op on dequeue if
> > > > required. The use case could be to store the sequence number from
> > > > application POV or storing the mbuf ptr in which pattern is requested etc.
> > > >
> > > >
> > > >  2) rte_regex_match
> > > >       a) offset; /**< Starting Byte Position for matched rule. */
> > > > and  uint16_t len; /**< Length of match in bytes */
> > > >       Looks like the matching offset is defined as *starting
> > > > matching offset* instead of *end matching offset*, e.g. report the offset of
> > "a" instead of "c"
> > > > for pattern "abc".
> > > >       If so, this makes it hard to integrate software regex
> > > > libraries such as Hyperscan and RE2 as they only report *end
> > > > matching offset* without length of match.
> > > >       Although Hyperscan has API for *starting matching offset*, it
> > > > only delivers partial syntax support. So I think we have to define
> > > > *end of matching offset* for software solutions.
> > > >
> > > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs.
> > > > I thought application would need always the length of the match.
> > > > Probably we will see how other HW implementation (from Mellanox)
> > > > etc. We will try to abstract it, probably we can make it as function
> > > > of "user requested".
> > > > [Xiang] Yes, it will be good to make it per user request. At least
> > > > from Hyperscan user's point of view, start of match and match length
> > > > are not mandatory.
> > >
> > > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In
> > > device configure.
> > >
> > > Since offset+len == end, we can introduce following generic inline function.
> > >
> > > > static inline uint32_t
> > > > rte_regex_match_end(const struct rte_regex_match *match) {
> > > > 	return (uint32_t)match->offset + match->len;
> > > > }
> > >
> > > Example:  pattern to match is  "hello\s+world"  and data is following
> > > data[4] = 'h'
> > > data[5] = 'e'
> > > data[6] = 'l'
> > > data[7] = 'l'
> > > data[8] = 'o'
> > > data[9] = ' '
> > > data[10] = 'w'
> > > data[11] = 'o'
> > > data[12] = 'r'
> > > data[13] = 'l'
> > > data[14] = 'd'
> > >
> > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > match->offset returns 4
> > > match->len returns 11
> > >
> > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > driver MAY return the following(in hyperscan case)
> > > match->offset returns 0
> > > match->len returns 11 + 4
> > >
> > > In both case(irrespective of flags, to make application life easy)
> > rte_regex_match_end() would return 15.
> > > If application demands for MATCH_AS_START then driver can return
> > > match->offset returns 4 and match->len returns 11 Aka set
> > > HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use
> > rte_regex_match_end() for finding the end of the match. To make, work in all
> > cases.
> > >
> > > Is it OK?
> > >
> > Can we replace len with end offset? So we can change "offset" to "start_offset"
> > and len to "end_ offset" in struct rte_regex_match. Users interested in len
> > could take "end_offset - start_offset".
> > We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> > RTE_REGEX_DEV_CFG_MATCH_START
> > 
> > In your example,
> > if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> > match->start_offset returns 4
> > match->end_offset returns 15
> > 
> > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> > match->start_offset returns 0
> > match->end_offset returns 15
> 
> 
> This part is little tricky as HW descriptions need to be rewritten on response.
> This is a one issue, I foresee earlier, to come up with rte_regex_match
> That's works for all implementation  without performance issue.
> 
> We have two HW implementations, both returns start_off and len.
> Lets get input from other HW implementation on the semantics of
> rte_regex_match. Based on that, we can decide how to go about it?
> Thoughts from Mellanox or other vendors?
>
Sure. Let's get more inputs on this.
> 
> 
> > 
> > > >
> > > > 3)  rte_regex_rule_db_update()
> > > >     Does this mean we can dynamically add or delete rules for an
> > > > already generated database without recompile from scratch for
> > > > hardware Regex implementation?
> > > >     If so, this isn't possible for software solutions as they don't
> > > > support dynamic database update and require recompile.
> > > >
> > > > [Jerin] rte_regex_rule_db_update() internally it would call
> > > > recompile function for both HW and SW.
> > > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > > > precompiled rule database case.
> > > > [Xiang] OK, sounds like we have to save the original rule-set for
> > > > the device in order to do recompile. I see both ADD and REMOVE
> > > > operators from rte_regex_rule.
> > > > For rules with REMOVE operator, what's the expected behavior to
> > > > handle them for the old rule-set? Do we need to go through the old
> > > > rule-set and remove corresponding rules before doing recompile?
> > >
> > > Yes.
> > >
> > I think it'll be better to change rte_regex_rule_db_update() to
> > rte_regex_rule_compile() and have users to provide a full rule-set.
> > So we don't have to maintain old rule-set and decide which one to keep and
> > remove. We can simply recompile new rule-set and get rid of
> > rte_regex_rule_op in this case.
> 
> 
> On virtualized, HW implementations, The RULE database is maintained by single
> body. So the above scheme, works with SW and HW implementations.
> And It make user life easy as they don't need to maintain the rules.
> 
> I don't have preference on the rte_regex_rule_db_update() name, I can change to
> rte_regex_rule_compile() if required keeping above functionality. Let me know.
> 
>
OK, I'm good if you are willing to maintain it for users. Then both
rte_regex_rule_db_update() and rte_regex_rule_compile() work for me.
> 
> 
> 
> 
> 
> 


* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
  2019-10-14 13:59               ` Wang Xiang
@ 2020-01-26 11:55                 ` Ori Kam
  0 siblings, 0 replies; 62+ messages in thread
From: Ori Kam @ 2020-01-26 11:55 UTC (permalink / raw)
  To: Wang Xiang, Jerin Jacob Kollanukkaran
  Cc: Thomas Monjalon, dev, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, Hong, Yang A,
	Chang, Harry, gu.jian1, shanjiangh, zhangy.yun, lixingfu,
	wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, Ni, Hongjun, j.bromhead, deri, fc,
	arthur.su, Guy Kaneti, Smadar Fuks, Liron Himi, edwin.verplanke,
	keith.wiles

Hi Jerin and Xiang,
PSB

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> Sent: Monday, October 14, 2019 4:59 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Pavan
> Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun Kapoor
> <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hong, Yang A <yang.a.hong@intel.com>;
> Chang, Harry <harry.chang@intel.com>; gu.jian1@zte.com.cn;
> shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn;
> lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; Ni, Hongjun <hongjun.ni@intel.com>; j.bromhead@titan-
> ic.com; deri@ntop.org; fc@napatech.com; arthur.su@lionic.com; Guy Kaneti
> <guyk@marvell.com>; Smadar Fuks <smadarf@marvell.com>; Liron Himi
> <lironh@marvell.com>; edwin.verplanke@intel.com; keith.wiles@intel.com
> Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> On Fri, Sep 27, 2019 at 02:35:00PM +0000, Jerin Jacob Kollanukkaran wrote:
> > > -----Original Message-----
> > > From: Wang Xiang <xiang.w.wang@intel.com>
> > >
> > > Hi Jerin,
> > >
> > > Thanks for your response. More comments below and inline.
> > >
> > > 1) I think the size of some varaibles (e.g. nb_matches, scan_size, matching
> > > offset, etc) should be increased based on what Hyperscan supports.
> > >
> > >     a) struct rte_regex_ops:
> > >
> > >         uint16_t scan_size => uint32_t scan_size
> >
> > I think, packet buffers will not be > 64K and getting more than contiguous
> > 64K DMAable memory will be difficult in DPDK.
> > Other than that, rte_regex_match is 64bit now, increasing width of
> > Len could increase the size of  "rte_regex_match". i.e Need more
> > Bandwidth for response.
> > Could other HW implementations share the views on max length
> > is supported on their implementation? Based on that we can decide.
> >
> OK, let's gather ideas from HW implementation.

I agree that 16 bits for the buffer length is good, and that the size of rte_regex_match
should stay at 64 bits for better performance (PCI bandwidth and caching).
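
To make the size argument concrete, here is a minimal sketch of a match
descriptor that fits in 64 bits. The struct name, the field names and the
20/12 bit split between rule and group are assumptions made for
illustration, not the layout agreed in this thread; only the 16-bit offset
and len and the 64-bit total come from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical match descriptor, for illustration only.
 * Assumption: 20 bits of rule id + 12 bits of group id share one 32-bit
 * word, followed by a 16-bit offset and a 16-bit length.
 */
struct example_regex_match {
	union {
		uint64_t u64;                 /* whole descriptor as one word */
		struct {
			uint32_t rule_id:20;  /* assumed width */
			uint32_t group_id:12; /* assumed width */
			uint16_t offset;      /* start of match in the buffer */
			uint16_t len;         /* length of match in bytes */
		};
	};
};

/* Widening offset/len to 32 bits would break this 64-bit budget. */
static_assert(sizeof(struct example_regex_match) == sizeof(uint64_t),
	      "match descriptor must stay 64 bits");
```

With this layout one response is a single 64-bit store, which is the
property the PCI bandwidth and caching argument relies on.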

> >
> > >         uint8_t nb_actual_matches => uint64 nb_actual_matches
> > >         uint8_t nb_matches => uint64 nb__matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> I think the number of matches depends on the number of total rules and
> scan size. Based on the definitions (16-bit nb_rules_per_group,
> 16-bit nb_groups and 16-bit scan size), the maximum possible matches
> could exceed 2^16. Users may get partial matches in this case while
> Hyperscan doesn't make compromises. It'll also be good to check other HW
> implementation.

I think we can increase the number of matches to 16 bits. In any case, our HW can't
work on more than 4 groups in a single search and can't support buffers larger than
2^16 bytes, and even if it could, saving that many results in HW is not practical.

> >
> > >
> > >     b) struct rte_regex_match:
> > >         uint16_t offset => uint32_t offset
> > >         uint16_t len => uint32_t len
> >
> > See above.
> >
> > >
> > >     c) uint16_t
> > >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > > *rules,
> > >                                  uint16_t nb_rules);
> > >     =>
> > >        uint32_t
> > >         rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule
> > > *rules,
> > >                                  uint32_t nb_rules);
> >
> > OK. I will change it next version.
> >
> > >
> > >     d) int
> > >     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> > >                     const struct rte_regex_qp_conf *qp_conf);
> > >     =>
> > >        int
> > >     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> > >                     const struct rte_regex_qp_conf *qp_conf);
> >
> > OK. I will change it next version.
> >
> > >
> > >     e) struct rte_regex_dev_config:
> > >         uint8_t nb_max_matches => uint64_t nb_max_matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> See above.

See above.

> >
> > >
> > >     f) struct rte_regex_dev_info:
> > >         uint8_t max_matches => uint64_t max_matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> See above.

See above.

> >
> > >
> > > 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set()
> defined.
> > > Are all the attributes below could be set by users? Is any of them read-only?
> >
> > See below,
> >
> > > /** Enumerates RegEx device attribute identifier */ enum
> > > rte_regex_dev_attr_id {
> > >     RTE_REGEX_DEV_ATTR_SOCKET_ID,
> > >     /**< The NUMA socket id to which the device is connected or
> > >      * a default of zero if the socket could not be determined.
> > >      * datatype: *int*
> > >      * operation: *get*
> >
> > *get*  means read only. *get* and *set* means it support both operation
> >
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> > >     /**< Maximum number of matches per scan.
> > >      * datatype: *uint8_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> > >     /**< Upper bound scan time in ns.
> > >      * datatype: *uint16_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> > >     /**< Maximum number of prefix detected per scan.
> > >      * This would be useful for denial of service detection.
> > >      * datatype: *uint16_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> > >      */
> > > };
> > >
> > > 3) Both RTE_REGEX_PCRE_RULE_* and
> > > RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can
> we
> > > merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F and
> have
> > > a unified regex_dev_capa in struct rte_regex_dev_info.
> >
> > Sure. I will fix it next version.
> >
> > >
> > >
> > > 4) It'll be good if we can also define synchronous matching API for users
> who
> > > want to have a one-off scan and wait for the results.
> >
> > Makes sense. I will add synchronous matching API in next version(I
> understand, it will be useful for SW
> > Implementations). Probably expose as INFO flag to expose the it as
> preference.
> >
> > >
> > > On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran
> wrote:
> > > > Hi Xiang,
> > > >
> > > > Sorry for delay in response(Was busy with 19.11 proposal deadline).
> Please
> > > see inline.
> > > >
> > > > >
> > > > > Reply to Xiang's queries in main thread:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > Some questions regarding APIs. Could you please give more insights?
> > > > >
> > > > > 1) rte_regex_ops
> > > > >       a) rsp_flags
> > > > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > > > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial
> > > > > match at the end of current buffer after scan.
> > > > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > > > >
> > > > > [Jerin] Since we need three states to represent partial match
> > > > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > > > buffer, intermediate buffers with no flag, and end of the buffer
> > > > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> > > >
> > > > > [Xiang] How could a user leverage these flags for matching? Suppose
> > > > > a large buffer is divided into multiple chunks. Will
> > > > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't set
> > > > > after scan the first chunk. Similarly, RTE_REGEX_OPS_RSP_PMI_EOJ
> > > > > tells a user whether to stop matching future buffers after finish the last
> > > chunk?
> > > >
> > > > Let me describe with an example,
> > > >
> > > > Assume,
> > > > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > > > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > > 3) Device programmed with matching "hello\s+world" pattern
> > > > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > > and struct rte_regex_op:: scan_size = 1024
> > > >
> > > > > data[0..1021] = data don't have hello world pattern
> > > > > data[1022] = 'h'
> > > > > data[1023] = 'e'
> > > >
> > > > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > > and struct rte_regex_op:: scan_size = 9
> > > >
> > > > data[0] = 'l'
> > > > data[1] = 'l'
> > > > data[2] = 'o'
> > > > data[3] = ' '
> > > > data[4] = 'w'
> > > > data[5] = 'o'
> > > > data[6] = 'r'
> > > > data[7] = 'l'
> > > > data[8] = 'd'
> > > >
> > > > If so,
> > > >
> > > > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in
> rte_regex_ops::
> > > > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and len 2
> > > >
> > > > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in
> rte_regex_ops::
> > > > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9
> > > >
> > > If the defined pattern is "hello.*world" instead of "hello\s+world", and we
> > > enqueue following struct rte_regex_ops:
> > >
> > > 1) rte_regex_op:: scan_size = 1024
> > >
> > > >    data[0..1021] = data don't have hello world pattern
> > >    data[1022] = 'h'
> > >    data[1023] = 'e'
> > >
> > > 2) rte_regex_op:: scan_size = 9
> > >    data[0] = 'l'
> > >    data[1] = 'l'
> > >    data[2] = 'o'
> > >    data[3] = ' '
> > >    data[4] = 'w'
> > >    data[5] = 'o'
> > >    data[6] = 'r'
> > >    data[7] = 'l'
> > >    data[8] = 'd'
> > >
> > > 3) rte_regex_op:: scan_size = 5
> > >    data[0] = 'w'
> > >    data[1] = 'o'
> > >    data[2] = 'r'
> > >    data[3] = 'l'
> > >    data[4] = 'd'
> > >
> > > Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in
> rte_regex_ops::
> > > rsp_flags on dequeue
> > > Where rte_regex_match:: offset is 0 and len 4?
> >
> > Yes.
> >
> > >
> > > I am wondering what's your expected behavior for .* or similar syntax and
> if
> > > there are syntax compatability issues. We report all matches in Hyperscan,
> e.g.
> > > report end match offsets 11 and 16 for pattern "hello.*world" and corpus
> > > "hello worldworld".
> > >
> > > BTW, not sure how other hardware devices handle cross buffer scan.
> Hyperscan
> > > doesn't reports matches for start and intermediate buffers but only reports
> end
> > > offset if a full match is found.
> > >
> > > >
> > > > >
> > > > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition
> > > > > for a specific hardware implementation. I am wondering what this
> > > > > PREFIX refers to:)?
> > > > >
> > > > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > > > Introduced rte_regex_dev_attr_set/get functions to make it portable
> > > > > and To add new implementation specific fields.
> > > > > For example, if a rule is
> > > > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > > > considered the factor. The prefix is a literal string, while the
> > > > > factor can contain complex regular expression constructs. As a
> > > > > result, rule matching occurs in two stages: prefix matching and
> > > > > factor matching.
> > > > >
> > > > >       b)  user_id or user_ptr
> > > > >       Under what kind of circumstances should an application pass
> > > > > value into these variables for enqueue and dequeuer operations?
> > > > >
> > > > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also
> > > > > allocated using mempool normally, on enqueue, user can specify
> > > > > user_id If needed to in order identify the op on dequeue if
> > > > > required. The use case could be to store the sequence number from
> > > > > application POV or storing the mbuf ptr in which pattern is requested
> etc.
> > > > >
> > > > >
> > > > >  2) rte_regex_match
> > > > >       a) offset; /**< Starting Byte Position for matched rule. */
> > > > > and  uint16_t len; /**< Length of match in bytes */
> > > > >       Looks like the matching offset is defined as *starting
> > > > > matching offset* instead of *end matching offset*, e.g. report the offset
> of
> > > "a" instead of "c"
> > > > > for pattern "abc".
> > > > >       If so, this makes it hard to integrate software regex
> > > > > libraries such as Hyperscan and RE2 as they only report *end
> > > > > matching offset* without length of match.
> > > > >       Although Hyperscan has API for *starting matching offset*, it
> > > > > only delivers partial syntax support. So I think we have to define
> > > > > *end of matching offset* for software solutions.
> > > > >
> > > > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST
> tradeoffs.
> > > > > I thought application would need always the length of the match.
> > > > > Probably we will see how other HW implementation (from Mellanox)
> > > > > etc. We will try to abstract it, probably we can make it as function
> > > > > of "user requested".
> > > > > [Xiang] Yes, it will be good to make it per user request. At least
> > > > > from Hyperscan user's point of view, start of match and match length
> > > > > are not mandatory.
> > > >
> > > > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In
> > > > device configure.
> > > >
> > > > Since offset+len == end, we can introduce following generic inline
> function.
> > > >
> > > > > static inline uint32_t
> > > > > rte_regex_match_end(const struct rte_regex_match *match) {
> > > > > 	return (uint32_t)match->offset + match->len;
> > > > > }
> > > >
> > > > Example:  pattern to match is  "hello\s+world"  and data is following
> > > > data[4] = 'h'
> > > > data[5] = 'e'
> > > > data[6] = 'l'
> > > > data[7] = 'l'
> > > > data[8] = 'o'
> > > > data[9] = ' '
> > > > data[10] = 'w'
> > > > data[11] = 'o'
> > > > data[12] = 'r'
> > > > data[13] = 'l'
> > > > data[14] = 'd'
> > > >
> > > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > match->offset returns 4
> > > > match->len returns 11
> > > >
> > > > if device is NOT configured with
> RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > driver MAY return the following(in hyperscan case)
> > > > match->offset returns 0
> > > > match->len returns 11 + 4
> > > >
> > > > In both case(irrespective of flags, to make application life easy)
> > > rte_regex_match_end() would return 15.
> > > > If application demands for MATCH_AS_START then driver can return
> > > > match->offset returns 4 and match->len returns 11 Aka set
> > > > HS_FLAG_SOM_LEFTMOST in hyperscan driver, But application should use
> > > rte_regex_match_end() for finding the end of the match. To make, work in
> all
> > > cases.
> > > >
> > > > Is it OK?
> > > >
> > > Can we replace len with end offset? So we can change "offset" to
> "start_offset"
> > > and len to "end_ offset" in struct rte_regex_match. Users interested in len
> > > could take "end_offset - start_offset".
> > > We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> > > RTE_REGEX_DEV_CFG_MATCH_START
> > >
> > > In your example,
> > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> > > match->start_offset returns 4
> > > match->end_offset returns 15
> > >
> > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> > > match->start_offset returns 0
> > > match->end_offset returns 15
> >
> >
> > This part is little tricky as HW descriptions need to be rewritten on response.
> > This is a one issue, I foresee earlier, to come up with rte_regex_match
> > That's works for all implementation  without performance issue.
> >
> > We have two HW implementations, both returns start_off and len.
> > Lets get input from other HW implementation on the semantics of
> > rte_regex_match. Based on that, we can decide how to go about it?
> > Thoughts from Mellanox or other vendors?
> >
> Sure. Let's get more inputs on this.


I think Jerin's approach is the better one: at least in our case we see requests
to copy the match, so it is more user friendly to give the offset and len.
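
As a rough sketch of why offset and len are convenient for that use case,
the helpers below compute the end of a match and copy the matched bytes
out of the scanned buffer. The example_* names and the two-field struct
are hypothetical and only mirror the offset/len fields discussed above;
this is not the proposed API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the offset/len part of rte_regex_match. */
struct example_match {
	uint16_t offset; /* start of match in the scanned buffer */
	uint16_t len;    /* length of match in bytes */
};

/* End of match (one past the last matched byte): offset + len. */
static inline uint32_t
example_match_end(const struct example_match *m)
{
	return (uint32_t)m->offset + m->len;
}

/*
 * Copy the matched bytes into dst.
 * Returns the number of bytes copied, or 0 if the match does not fit
 * in the source buffer or in dst.
 */
static inline size_t
example_match_copy(const uint8_t *buf, size_t buf_len,
		   const struct example_match *m,
		   uint8_t *dst, size_t dst_len)
{
	if (example_match_end(m) > buf_len || m->len > dst_len)
		return 0;
	memcpy(dst, buf + m->offset, m->len);
	return m->len;
}
```

For the "hello\s+world" example quoted above (offset 4, len 11), the copy
is a single memcpy() and example_match_end() gives 15, with no
subtraction needed on the application side.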

> >
> >
> > >
> > > > >
> > > > > 3)  rte_regex_rule_db_update()
> > > > >     Does this mean we can dynamically add or delete rules for an
> > > > > already generated database without recompile from scratch for
> > > > > hardware Regex implementation?
> > > > >     If so, this isn't possible for software solutions as they don't
> > > > > support dynamic database update and require recompile.
> > > > >
> > > > > [Jerin] rte_regex_rule_db_update() internally it would call
> > > > > recompile function for both HW and SW.
> > > > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> > > > > precompiled rule database case.
> > > > > [Xiang] OK, sounds like we have to save the original rule-set for
> > > > > the device in order to do recompile. I see both ADD and REMOVE
> > > > > operators from rte_regex_rule.
> > > > > For rules with REMOVE operator, what's the expected behavior to
> > > > > handle them for the old rule-set? Do we need to go through the old
> > > > > rule-set and remove corresponding rules before doing recompile?
> > > >
> > > > Yes.
> > > >
> > > I think it'll be better to change rte_regex_rule_db_update() to
> > > rte_regex_rule_compile() and have users to provide a full rule-set.
> > > So we don't have to maintain old rule-set and decide which one to keep and
> > > remove. We can simply recompile new rule-set and get rid of
> > > rte_regex_rule_op in this case.
> >
> >
> > On virtualized, HW implementations, The RULE database is maintained by
> single
> > body. So the above scheme, works with SW and HW implementations.
> > And It make user life easy as they don't need to maintain the rules.
> >
> > I don't have preference on the rte_regex_rule_db_update() name, I can
> change to
> > rte_regex_rule_compile() if required keeping above functionality. Let me
> know.
> >
> >
> OK, I'm good if your are willing to maintain it for users. Then both
> rte_regex_rule_db_update() and rte_regex_rule_compile() work for me.

Combining this with Shahaf's request, my suggestion is:
rte_regex_rule_db_update() - only inserts/removes rules in the internal rule set.
rte_regex_rule_db_compile_activate() - compiles and activates the new rule set.
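
The split could behave roughly as in the toy model below: updates only
touch a staged rule set, and nothing changes for the matching engine
until the compile-and-activate step. Everything here (the example_*
names, the fixed-size arrays, identifying rules by id only) is an
assumption for illustration and not a proposed implementation.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_RULES 8 /* arbitrary capacity for the sketch */

enum example_rule_op { EXAMPLE_RULE_ADD, EXAMPLE_RULE_REMOVE };

struct example_rule {
	enum example_rule_op op;
	uint32_t rule_id;
};

struct example_dev {
	uint32_t staged[EXAMPLE_MAX_RULES]; /* edited by update() */
	uint16_t nb_staged;
	uint32_t active[EXAMPLE_MAX_RULES]; /* used for matching */
	uint16_t nb_active;
};

/* Stage 1: apply ADD/REMOVE to the staged set only.
 * Returns the number of rules processed. */
static inline uint16_t
example_rule_db_update(struct example_dev *dev,
		       const struct example_rule *rules, uint16_t nb_rules)
{
	uint16_t i, j;

	for (i = 0; i < nb_rules; i++) {
		if (rules[i].op == EXAMPLE_RULE_ADD) {
			if (dev->nb_staged == EXAMPLE_MAX_RULES)
				break; /* staged set is full */
			dev->staged[dev->nb_staged++] = rules[i].rule_id;
			continue;
		}
		/* REMOVE: drop the first staged rule with this id, if any. */
		for (j = 0; j < dev->nb_staged; j++) {
			if (dev->staged[j] == rules[i].rule_id) {
				dev->staged[j] = dev->staged[--dev->nb_staged];
				break;
			}
		}
	}
	return i;
}

/* Stage 2: "compile" the staged set and make it the active one. */
static inline void
example_rule_db_compile_activate(struct example_dev *dev)
{
	memcpy(dev->active, dev->staged,
	       dev->nb_staged * sizeof(dev->staged[0]));
	dev->nb_active = dev->nb_staged;
}
```

The point of the split is that an application can batch several updates
and pay the recompile cost once, when it calls the activate step.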

> >
> >
> >
> >
> >
> >

Best,
Ori


* [dpdk-dev] [PATCH v2] net/regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
  2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
  2019-08-15  9:35 ` Thomas Monjalon
@ 2020-01-27 21:19 ` Ori Kam
  2020-01-28  9:00 ` [dpdk-dev] [PATCH v3] regexdev: " Ori Kam
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: Ori Kam @ 2020-01-27 21:19 UTC (permalink / raw)
  To: jerinj, xiang.w.wang
  Cc: dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr, dovrat,
	pkapoor, nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas, orika

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition
* Memory introspection
* Natural Language Processing (NLP)
* Sentiment Analysis
* Big data database acceleration
* Computational storage

Requesting review from HW and SW RegEx vendors and RegEx application
users, in order to have a portable DPDK API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and
ethdev device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
V2:
 - Update RFC based on ML comments.

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and its provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to setup
  a RegEx device (configure it, setup its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx Poll Mode Driver (PMD) to register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is "RegEx".

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs
to transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. Each pattern matching request/response is
embedded in an *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion
that would be far too expensive to compute at run-time. A rule database
contains a set of rules that are compiled in device specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Rules can be grouped under one group ID to enable rule
isolation and effective pattern matching. A unique group identifier is
provided at the time of rule creation for the application to identify
the rule upon match.

Scan: A pattern matching request through *enqueue* API.

It is possible that a given RegEx device may not support all the
features of PCRE. The application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
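A minimal sketch of how an application might test such a feature
bitmask before loading rules. The bit values are copied from the
capability macros defined in this patch; the two helper functions are
hypothetical and not part of the proposed API, and which
rte_regex_dev_info field carries the mask is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the capability macros in this patch. */
#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)

/* Hypothetical helper: true only if every bit in 'required' is set. */
static bool
regex_capa_has_all(uint64_t capa, uint64_t required)
{
	return (capa & required) == required;
}

/* Hypothetical helper: the subset of 'required' that the device lacks. */
static uint64_t
regex_capa_missing(uint64_t capa, uint64_t required)
{
	return required & ~capa;
}
```

An application would read the mask from the device info once at setup
time and reject (or rewrite) any rule that needs a missing feature.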

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel
on different logical cores to work on the same target object. For
instance, the dequeue function of a PMD cannot be invoked in parallel on
two logical cores to operate on the same RegEx queue pair. Of course,
this function can be invoked in parallel by different logical cores on
different queue pairs. It is the responsibility of the upper level
application to enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device
probing phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the
probed device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware
or software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier
are freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following
order:
    - rte_regex_dev_configure()
    - rte_regex_queue_pair_setup()
    - rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the
reconfiguration before calling rte_regex_dev_start() again. The enqueue
and dequeue functions should not be invoked when the device is stopped.
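The ordering rules above can be sketched as a small state machine. This
is a self-contained mock, not the proposed implementation: the mock_*
names, the states and the errno values returned are all assumptions made
for illustration, and the real rte_regex_dev_*() signatures are not
reproduced here. The mock also assumes that reconfiguring a stopped
device requires the queue pairs to be set up again.

```c
#include <errno.h>
#include <stdbool.h>

enum mock_state {
	MOCK_UNCONFIGURED,
	MOCK_CONFIGURED,
	MOCK_QP_READY,
	MOCK_STARTED,
};

struct mock_regex_dev {
	enum mock_state state;
};

/* Configuration is only legal while the device is not running. */
static int
mock_configure(struct mock_regex_dev *d)
{
	if (d->state == MOCK_STARTED)
		return -EBUSY;
	d->state = MOCK_CONFIGURED;
	return 0;
}

/* Queue pairs need a configured, stopped device. */
static int
mock_queue_pair_setup(struct mock_regex_dev *d)
{
	if (d->state == MOCK_STARTED)
		return -EBUSY;
	if (d->state == MOCK_UNCONFIGURED)
		return -EINVAL;
	d->state = MOCK_QP_READY;
	return 0;
}

/* Start is the last step of the configure/setup/start sequence. */
static int
mock_start(struct mock_regex_dev *d)
{
	if (d->state != MOCK_QP_READY)
		return -EINVAL;
	d->state = MOCK_STARTED;
	return 0;
}

/* Stop keeps the current configuration so the device can be restarted. */
static int
mock_stop(struct mock_regex_dev *d)
{
	if (d->state != MOCK_STARTED)
		return -EINVAL;
	d->state = MOCK_QP_READY;
	return 0;
}

/* Enqueue and dequeue are only meaningful on a started device. */
static bool
mock_can_enqueue(const struct mock_regex_dev *d)
{
	return d->state == MOCK_STARTED;
}
```

A caller that wants to reconfigure would therefore run stop, configure,
queue pair setup and start, in that order.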

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of
type *regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the
*rte_regex_dev* structure by the device init function of the RegEx
driver, which is invoked during the PCI/SoC device probing phase, as
explained earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the
*rte_regex_dev* structure to avoid an extra indirect memory access
during their invocation.
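A minimal sketch of the dispatch scheme described above, assuming a
simplified layout. The sketch_* and toy_* names, the callback
signatures and the table size are hypothetical; the actual
*rte_regex_dev* and *regex_dev_ops* definitions are not shown in this
part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS 4

struct sketch_dev;

/* Fast-path function type (hypothetical signature). */
typedef uint16_t (*sketch_burst_t)(struct sketch_dev *dev, uint16_t qp_id,
				   uint16_t nb_ops);

/* Slow-path callbacks, reached through one extra pointer dereference. */
struct sketch_dev_ops {
	int (*dev_start)(struct sketch_dev *dev);
};

struct sketch_dev {
	sketch_burst_t enqueue; /* fast path: stored first in the structure */
	sketch_burst_t dequeue; /* fast path */
	const struct sketch_dev_ops *dev_ops; /* slow path */
	int started;
};

/* Table mapping a device identifier to its device structure. */
static struct sketch_dev sketch_devs[SKETCH_MAX_DEVS];

/* Toy driver callbacks standing in for a real PMD. */
static int
toy_start(struct sketch_dev *dev)
{
	dev->started = 1;
	return 0;
}

static uint16_t
toy_burst(struct sketch_dev *dev, uint16_t qp_id, uint16_t nb_ops)
{
	(void)dev;
	(void)qp_id;
	return nb_ops > 8 ? 8 : nb_ops; /* accept at most 8 ops per call */
}

static const struct sketch_dev_ops toy_ops = { .dev_start = toy_start };

/* What a driver init function would do at probe time. */
static void
toy_probe(uint8_t dev_id)
{
	sketch_devs[dev_id].enqueue = toy_burst;
	sketch_devs[dev_id].dequeue = toy_burst;
	sketch_devs[dev_id].dev_ops = &toy_ops;
}

/* Slow path: look up the device, then call through dev_ops. */
static int
sketch_dev_start(uint8_t dev_id)
{
	struct sketch_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS)
		return -1;
	dev = &sketch_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -1;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: call the pointer held directly in the device structure. */
static inline uint16_t
sketch_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct sketch_dev *dev = &sketch_devs[dev_id];

	return dev->enqueue(dev, qp_id, nb_ops);
}
```

The fast-path wrapper skips the dev_ops dereference and any argument
checking, which is the trade-off the paragraph above describes.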

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching
requests to the RegEx device and the *dequeue* operation gets a burst of
pattern matching responses for the ones submitted through the *enqueue*
operation.
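The burst semantics can be illustrated with a mock queue pair built on a
fixed-size ring. Everything here is an assumption for illustration: the
mock_* names, the queue depth, and the rule that even request IDs
"match" (a stand-in for the real scan). As with other DPDK burst APIs,
both calls return how many ops were actually processed, which may be
fewer than requested.

```c
#include <stdint.h>

#define MOCK_QP_DEPTH 8

struct mock_op {
	uint32_t req_id;
	int matched;
};

struct mock_qp {
	struct mock_op *ring[MOCK_QP_DEPTH];
	uint16_t head;  /* next slot to dequeue */
	uint16_t count; /* ops currently held by the queue pair */
};

/* Accept up to nb_ops requests; returns how many were actually taken. */
static uint16_t
mock_enqueue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && qp->count < MOCK_QP_DEPTH; i++) {
		/* Stand-in for the scan: even request IDs match. */
		ops[i]->matched = (ops[i]->req_id % 2 == 0);
		qp->ring[(qp->head + qp->count) % MOCK_QP_DEPTH] = ops[i];
		qp->count++;
	}
	return i;
}

/* Return up to nb_ops completed responses in submission order. */
static uint16_t
mock_dequeue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops && qp->count > 0; i++) {
		ops[i] = qp->ring[qp->head];
		qp->head = (uint16_t)((qp->head + 1) % MOCK_QP_DEPTH);
		qp->count--;
	}
	return i;
}
```

An application polls both calls in a loop and resubmits whatever the
enqueue call did not accept.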

A typical application uses the RegEx device API in the following
programming flow:

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update() Needed if a precompiled rule database is
  not provided in rte_regex_dev_config::rule_db for
  rte_regex_dev_configure() and/or the application needs to update the
  rule database.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---
 config/common_base                           |    7 +
 doc/api/doxy-api-index.md                    |    1 +
 doc/api/doxy-api.conf.in                     |    1 +
 lib/Makefile                                 |    2 +
 lib/librte_regexdev/Makefile                 |   27 +
 lib/librte_regexdev/rte_regexdev.c           |    6 +
 lib/librte_regexdev/rte_regexdev.h           | 1411 ++++++++++++++++++++++++++
 lib/librte_regexdev/rte_regexdev_version.map |    3 +
 8 files changed, 1458 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h
 create mode 100644 lib/librte_regexdev/rte_regexdev_version.map

diff --git a/config/common_base b/config/common_base
index f9a68f3..4810849 100644
--- a/config/common_base
+++ b/config/common_base
@@ -806,6 +806,12 @@ CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
 
 #
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+CONFIG_RTE_LIBRTE_REGEXDEV_DEBUG=n
+
+#
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
@@ -1098,3 +1104,4 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..787f7c2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 1c4392e..56c08eb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
                           @TOPDIR@/lib/librte_rib \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..82ff950 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net librte_hash librte_cryptodev
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 0000000..f46f9be
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+# Copyright(C) 2020 Mellanox International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+EXPORT_MAP := rte_regexdev_version.map
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 0000000..b901877
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 0000000..73d504d
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1411 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. Each pattern matching request/response is
+ * embedded in an *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier is
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * It is possible that a given RegEx device may not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * A typical application uses the RegEx device API in the following
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needed if a precompiled rule database is not
+ *   provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
+ *   and/or the application needs to update the rule database.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+__rte_experimental
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+__rte_experimental
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * loading only a pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_START_ANCHOR_F (1ULL << 1)
+/**< RegEx device supports PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_ATOMIC_GROUPING_F (1ULL << 2)
+/**< RegEx device supports PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc'
+ * then 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc'
+ * because atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKTRACKING_CTRL_F (1ULL << 3)
+/**< RegEx device supports PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_CALLOUTS_F (1ULL << 4)
+/**< RegEx device supports PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD' then the
+ * RegEx engine will parse ABC, perform a user-defined callout and return a
+ * successful match at D.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
+/**< RegEx device supports PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
+ * recently matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
+/**< RegEx device supports PCRE Greedy mode.
+ * For example, in the RegEx 'AB\d*' the quantifier '*' is greedy, so the
+ * subject 'AB12345' is matched completely, whereas with the lazy quantifier
+ * '*?' in 'AB\d*?' only 'AB' is returned as the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
+/**< RegEx device supports PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})'. If
+ * the given subject is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The subject 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_POINT_RST_F (1ULL << 8)
+/**< RegEx device supports PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123'
+ * then even though the entire subject matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_NEWLINE_CONVENTIONS_F (1ULL << 9)
+/**< RegEx device supports PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_NEWLINE_SEQ_F (1ULL << 10)
+/**< RegEx device supports PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_POSSESSIVE_QUALIFIERS_F (1ULL << 11)
+/**< RegEx device supports PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * Possessive quantifier repeats the token as many times as possible and it does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_SUBROUTINE_REFERENCES_F (1ULL << 12)
+/**< RegEx device supports PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)
+/**< RegEx device supports UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::pcre_unsup_flags
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_16_F (1ULL << 14)
+/**< RegEx device supports UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_32_F (1ULL << 15)
+/**< RegEx device supports UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_WORD_BOUNDARY_F (1ULL << 16)
+/**< RegEx device supports word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_FORWARD_REFERENCES_F (1ULL << 17)
+/**< RegEx device supports Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_MATCH_AS_START (1ULL << 18)
+/**< RegEx device supports match as start.
+ * Match as start means that the match result holds the starting position of
+ * match and the length of the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name. */
+	struct rte_device *dev;	/**< Device information. */
+	uint16_t max_matches;
+	/**< Maximum matches per scan supported by this device. */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device. */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint32_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device.
+	 * This number can't be larger than 20 bits.
+	 */
+	uint16_t max_groups;
+	/**< Maximum number of groups supported by this device.
+	 * This number can't be larger than 12 bits.
+	 */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint8_t max_scatter_gather;
+	/**< The maximum supported number of buffers that can
+	 * be used in a single op. The total size of all elements
+	 * must be less than max_payload_size.
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload larger
+ * than struct rte_regex_dev_info::max_payload_size and/or when
+ * matches can span scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+#define RTE_REGEX_DEV_CFG_MATCH_AS_START (1ULL << 1)
+/**< Match as start is the ability to return the result as starting offset and
+ * len. When this flag is set, the result for each match will hold the starting
+ * offset of the match in offset, and the length of the match in len.
+ * If this flag is not set, the match result will hold only the end offset of
+ * the match in end_offset, and offset will be zero.
+ *
+ * @see RTE_REGEX_DEV_SUPP_MATCH_AS_START
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint16_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint32_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups*
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Import an initial prebuilt rule database on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application
+	 * may use rte_regex_rule_db_update() or rte_regex_rule_db_import()
+	 * to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags. @see RTE_REGEX_DEV_CFG_* */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a RegEx device.
+ *
+ * This function must be invoked before any other function in the
+ * API. It can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capability of each
+ * resource available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set this flag
+ * if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   A NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a RegEx device. The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+__rte_experimental
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to retrieve.
+ * @param attr_value
+ *   A pointer that will be filled in with the attribute
+ *   value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an attribute of a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to set.
+ * @param attr_value
+ *   Pointer to the attribute value to apply, supplied
+ *   by the application.
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation. */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database. */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database. */
+};
+
+/** Structure to hold a RegEx rule attributes. */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either OP_ADD or OP_REMOVE. */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The device-specific supported PCRE rule flags are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+	 * rule database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the local rule set.
+ * This function only modifies the rule set in memory.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL: Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_compile()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint32_t nb_rules);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compile the local rule set and burn the compiled result to the
+ * RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success, otherwise negative errno.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_update()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_compile(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL: Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL: Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - Positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested.
+ * @return
+ *   - Positive value: number of stat entries filled into the values array
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param name
+ *   The stat name to retrieve.
+ * @param id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
+ *   be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ *
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters.
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   A pointer to a file for output.
+ *
+ * @return
+ *   0 on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t offset;
+			/**< Starting Byte Position for matched rule. */
+			RTE_STD_C11
+			union {
+				uint16_t len;
+				/**< Length of match in bytes */
+				uint16_t end_offset;
+				/**< The end offset of the match. In case
+				 * MATCH_AS_START configuration is disabled.
+				 * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
+				 */
+			};
+		};
+	};
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest rule ID, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/** Struct to hold scatter gather elements in ops. */
+struct rte_regex_iov {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**< Reserve 8 bytes on 32-bit systems */
+		void *buf_addr;
+		/**< Virtual address of the pattern to be matched. */
+	};
+	rte_iova_t buf_iova;
+	/**< IOVA address of the pattern to be matched. */
+	uint16_t buf_size; /**< The buf size. */
+};
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint16_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint16_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	uint16_t num_of_bufs;
+	/**< The number of bufs that are part of this op. The total
+	 * length of all the buffers must be smaller than the maximum buffer
+	 * length.
+	 */
+	uint16_t resv1;
+	uint32_t resv2;
+
+	/* W2 */
+	struct rte_regex_iov *(*bufs)[];
+	/**< Holds a pointer to the buffers list.*/
+
+	/* W3 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W4 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W5 */
+	RTE_STD_C11
+	union {
+		uint64_t cross_buf_id;
+		/**< ID used by the RegEx device in order to handle cross
+		 * buffer detection.
+		 * This ID is given by the RegEx device on dequeue, and
+		 * the application must send it on the following enqueue.
+		 */
+		void *cross_buf_ptr;
+		/**< Pointer representation of *cross_buf_id* */
+	};
+
+	/* W6 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed packets.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that must
+ *   be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the entries of
+ *   *ops* beyond the returned count are not valid and must not be read by
+ *   the caller.
+ */
+__rte_experimental
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
diff --git a/lib/librte_regexdev/rte_regexdev_version.map b/lib/librte_regexdev/rte_regexdev_version.map
new file mode 100644
index 0000000..aabae5e
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev_version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+	global:
+};
-- 
1.8.3.1



* [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
                   ` (2 preceding siblings ...)
  2020-01-27 21:19 ` [dpdk-dev] [PATCH v2] net/regexdev: " Ori Kam
@ 2020-01-28  9:00 ` Ori Kam
  2020-02-22 16:52   ` Jerin Jacob
  2020-02-27 14:40 ` [dpdk-dev] [RFC v4] " Ori Kam
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-01-28  9:00 UTC (permalink / raw)
  To: jerinj, xiang.w.wang
  Cc: dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr, dovrat,
	pkapoor, nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas, orika

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use these devices in a
portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

This RFC is based on SW RegEx API frameworks such as libpcre and
hyperscan, and on a few of the RegEx HW IPs that I am aware of.

RegEx pattern matching applications:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition
* Memory introspection
* Natural Language Processing (NLP)
* Sentiment analysis
* Big data database acceleration
* Computational storage

Requesting review from HW and SW RegEx vendors and RegEx application
users, so that DPDK can have a portable API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and
ethdev device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
V3:
 * Change subject title.
V2:
 * Address ML comments.

RTE RegEx Device API
--------------------

Defines RTE RegEx Device APIs for RegEx operations and its provisioning.

The RegEx Device API is composed of two parts:

- The application-oriented RegEx API that includes functions to setup
  a RegEx device (configure it, setup its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regex_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regex_rule_db_update()
    | | Rules 0..n  | |<-------rte_regex_rule_db_import()
    | +-------------+ |------->rte_regex_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is "RegEx".

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs
to transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion
that would be far too expensive to compute at run-time. A rule database
contains a set of rules that are compiled into a device-specific binary
form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Rules can be grouped under one group ID to enable rule
isolation and effective pattern matching. A unique group identifier is
provided at the time of rule creation for the application to identify
the rule upon match.

Scan: A pattern matching request through *enqueue* API.

A given RegEx device may not support all the features of PCRE. The
application may probe unsupported features through
struct rte_regex_dev_info::pcre_unsup_flags.
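
As a rough illustration of feature probing, the sketch below tests a
64-bit capability word against the features an application needs. It
assumes the device reports its features in
rte_regex_dev_info::regex_dev_capa, as the flag documentation in the
patch suggests. The three bit values are copied from the proposed
header; regex_capa_has_all() is a hypothetical helper and not part of
the proposed API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the proposed rte_regexdev.h in this patch. */
#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)

/*
 * Hypothetical helper: returns true only if every bit set in `required`
 * is also set in the capability word `capa` reported by the device.
 */
static bool
regex_capa_has_all(uint64_t capa, uint64_t required)
{
	return (capa & required) == required;
}
```

An application could build `required` once at start-up from the PCRE
features its rule set uses and refuse to run on a device that does not
provide all of them.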

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions that are assumed not to be invoked in parallel
on different logical cores to work on the same target object. For
instance, the dequeue function of a PMD cannot be invoked in parallel on
two logical cores to operate on the same RegEx queue pair. Of course,
this function can be invoked in parallel by different logical cores on
different queue pairs. It is the responsibility of the upper level
application to enforce this rule.
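
One simple way for an application to honour this rule is to give every
worker core a private queue pair. The sketch below shows that mapping
under the assumption that workers are numbered from 0;
qp_for_worker() is a hypothetical application helper, not something the
patch defines.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): give worker number
 * `worker_idx` its own queue pair, or return -1 if the device was
 * configured with too few queue pairs for every worker to have a
 * private one.
 */
static int
qp_for_worker(unsigned int worker_idx, uint16_t nb_queue_pairs)
{
	if (worker_idx >= nb_queue_pairs)
		return -1;
	return (int)worker_idx;
}
```

With a one-to-one mapping like this, no two cores ever call enqueue or
dequeue on the same queue pair, so no locking is needed in the
application either.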

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regex_dev*.

RegEx devices are dynamically registered during the PCI/SoC device
probing phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regex_dev* structure and
a new device identifier are allocated for that device. Then, the
regex_dev_init() function supplied by the RegEx driver matching the
probed device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware
or software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regex_dev* structure is effectively registered.
Otherwise, both the *rte_regex_dev* structure and the device identifier
are freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following
order:
    - rte_regex_dev_configure()
    - rte_regex_queue_pair_setup()
    - rte_regex_dev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
rte_regex_dev_stop() first to stop the device and then do the
reconfiguration before calling rte_regex_dev_start() again. The enqueue
and dequeue functions should not be invoked when the device is stopped.
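
The ordering rules above amount to a small state machine. The sketch
below models it with a mock device so the rules can be checked without
hardware. Everything prefixed with mock_ or MOCK_ is hypothetical, and
the error codes (-EBUSY, -EINVAL) are my assumption; the patch does not
specify them.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lifecycle states; the patch does not name them. */
enum mock_state {
	MOCK_UNCONFIGURED,
	MOCK_CONFIGURED,	/* configure() done, queue pairs not set up */
	MOCK_STOPPED,		/* fully set up, not running */
	MOCK_STARTED
};

struct mock_regex_dev {
	enum mock_state state;
};

/* (Re)configuration is refused while the device is running. */
static int
mock_configure(struct mock_regex_dev *dev)
{
	if (dev->state == MOCK_STARTED)
		return -EBUSY;
	dev->state = MOCK_CONFIGURED;
	return 0;
}

/* Queue pairs can be set up only after configure() and while stopped. */
static int
mock_queue_pair_setup(struct mock_regex_dev *dev)
{
	if (dev->state == MOCK_STARTED)
		return -EBUSY;
	if (dev->state == MOCK_UNCONFIGURED)
		return -EINVAL;
	dev->state = MOCK_STOPPED;
	return 0;
}

/* Start requires configure() and queue pair setup to have happened. */
static int
mock_start(struct mock_regex_dev *dev)
{
	if (dev->state != MOCK_STOPPED)
		return -EINVAL;
	dev->state = MOCK_STARTED;
	return 0;
}

/* Stop returns the device to a state where it can be reconfigured. */
static int
mock_stop(struct mock_regex_dev *dev)
{
	if (dev->state != MOCK_STARTED)
		return -EINVAL;
	dev->state = MOCK_STOPPED;
	return 0;
}

/* Enqueue and dequeue are only legal on a started device. */
static bool
mock_fast_path_allowed(const struct mock_regex_dev *dev)
{
	return dev->state == MOCK_STARTED;
}
```

A real PMD would of course do actual work in each step; the point here
is only which calls are legal in which state.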

Finally, an application can close a RegEx device by invoking the
rte_regex_dev_close() function.

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of
type *regex_dev_ops*.
The address of the *regex_dev_ops* structure is stored in the
*rte_regex_dev* structure by the device init function of the RegEx
driver, which is invoked during the PCI/SoC device probing phase, as
explained earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regex_dev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regex_dev_ops* structure of the *rte_regex_dev*
structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regex_dev_ops* structure.
Instead, they are directly stored at the beginning of the
*rte_regex_dev* structure to avoid an extra indirect memory access
during their invocation.
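
To make the layout argument concrete, here is a minimal sketch of a
device structure with the fast-path pointers placed first and the
slow-path callbacks behind an ops table. All sketch_ and stub_ names
are hypothetical; the real *rte_regex_dev* and *regex_dev_ops*
definitions are not part of this patch, so the field names and callback
signatures below are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_op;	/* stands in for struct rte_regex_ops */

typedef uint16_t (*sketch_burst_t)(void *qp, struct sketch_op **ops,
				   uint16_t nb_ops);

/* Slow-path callbacks, reached through one extra pointer. */
struct sketch_dev_ops {
	int (*dev_start)(void *priv);
	int (*dev_stop)(void *priv);
};

/* Fast-path pointers sit at the very beginning of the device struct. */
struct sketch_dev {
	sketch_burst_t enqueue;
	sketch_burst_t dequeue;
	const struct sketch_dev_ops *dev_ops;
	void *priv;
};

/* Fast path: load one pointer from the device struct, then call. */
static inline uint16_t
sketch_enqueue_burst(struct sketch_dev *dev, void *qp,
		     struct sketch_op **ops, uint16_t nb_ops)
{
	return dev->enqueue(qp, ops, nb_ops);
}

/* Slow path: load dev_ops, then load the callback, then call. */
static inline int
sketch_dev_start(struct sketch_dev *dev)
{
	return dev->dev_ops->dev_start(dev->priv);
}

/* Trivial stub driver, present only to exercise the wrappers. */
static uint16_t
stub_burst(void *qp, struct sketch_op **ops, uint16_t nb_ops)
{
	(void)qp;
	(void)ops;
	return nb_ops;	/* pretend every op was accepted */
}

static int
stub_ok(void *priv)
{
	(void)priv;
	return 0;
}

static const struct sketch_dev_ops stub_ops = { stub_ok, stub_ok };
```

The slow-path wrapper dereferences dev_ops before it can find the
callback, while the fast-path wrapper reads the function pointer
straight out of the device structure, which is the saving the paragraph
above describes.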

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching
requests to the RegEx device and the *dequeue* operation gets a burst of
pattern matching responses for the ones submitted through the *enqueue*
operation.
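
The burst semantics (partial enqueue when the queue is full, and
dequeuing until a short burst comes back) can be tried out against a
mock queue pair. The sketch below is self-contained and does not touch
the proposed API: the mock_ functions only mimic the documented return
value behaviour of rte_regex_enqueue_burst() and
rte_regex_dequeue_burst(), and the queue depth of 8 is arbitrary.

```c
#include <stdint.h>

#define MOCK_QP_DEPTH 8	/* arbitrary depth for the mock queue pair */

struct mock_op {
	uint32_t req_id;
	int matched;	/* filled in on "completion" */
};

struct mock_qp {
	struct mock_op *slots[MOCK_QP_DEPTH];
	uint16_t count;
};

/* Accept as many ops as fit; the return value may be less than nb_ops. */
static uint16_t
mock_enqueue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < MOCK_QP_DEPTH)
		qp->slots[qp->count++] = ops[n++];
	return n;
}

/* Hand back up to nb_ops ops in FIFO order, marking each as matched. */
static uint16_t
mock_dequeue_burst(struct mock_qp *qp, struct mock_op **ops, uint16_t nb_ops)
{
	uint16_t n = qp->count < nb_ops ? qp->count : nb_ops;
	uint16_t i;

	for (i = 0; i < n; i++) {
		ops[i] = qp->slots[i];
		ops[i]->matched = 1;
	}
	/* Shift the ops that were not dequeued to the front. */
	for (i = n; i < qp->count; i++)
		qp->slots[i - n] = qp->slots[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/*
 * "Retrieve as many as possible" policy: keep dequeuing while a full
 * burst comes back, stop on the first short burst.
 */
static uint32_t
drain_all(struct mock_qp *qp, uint16_t burst)
{
	struct mock_op *done[MOCK_QP_DEPTH];
	uint32_t total = 0;
	uint16_t n;

	if (burst == 0)
		return 0;
	if (burst > MOCK_QP_DEPTH)
		burst = MOCK_QP_DEPTH;
	do {
		n = mock_dequeue_burst(qp, done, burst);
		total += n;
	} while (n == burst);
	return total;
}
```

When enqueue returns fewer ops than requested, the caller still owns
ops[ret] onwards and would typically retry them after draining some
responses.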

A typical application will use the RegEx device API in the following
programming flow:

- rte_regex_dev_configure()
- rte_regex_queue_pair_setup()
- rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
  database is not provided in rte_regex_dev_config::rule_db for
  rte_regex_dev_configure() and/or the application needs to update the
  rule database.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regex_dev_start()
- rte_regex_enqueue_burst()
- rte_regex_dequeue_burst()

---
 config/common_base                           |    7 +
 doc/api/doxy-api-index.md                    |    1 +
 doc/api/doxy-api.conf.in                     |    1 +
 lib/Makefile                                 |    2 +
 lib/librte_regexdev/Makefile                 |   27 +
 lib/librte_regexdev/rte_regexdev.c           |    6 +
 lib/librte_regexdev/rte_regexdev.h           | 1411 ++++++++++++++++++++++++++
 lib/librte_regexdev/rte_regexdev_version.map |    3 +
 8 files changed, 1458 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h
 create mode 100644 lib/librte_regexdev/rte_regexdev_version.map

diff --git a/config/common_base b/config/common_base
index f9a68f3..4810849 100644
--- a/config/common_base
+++ b/config/common_base
@@ -806,6 +806,12 @@ CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
 
 #
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+CONFIG_RTE_LIBRTE_REGEXDEV_DEBUG=n
+
+#
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
@@ -1098,3 +1104,4 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..787f7c2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 1c4392e..56c08eb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
                           @TOPDIR@/lib/librte_rib \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..82ff950 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net librte_hash librte_cryptodev
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 0000000..90b4029
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+# Copyright(C) 2020 Mellanox International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+EXPORT_MAP := rte_regexdev_version.map
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 0000000..b901877
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 0000000..c42128b
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1411 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled into a device-specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier is
+ * provided at the time of rule creation for the application to identify the
+ * rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * A given RegEx device may not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags.
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions that are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. It can, of course, be
+ * invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the addresses of the fast-path functions of the
+ * RegEx driver are not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * A typical application will use the RegEx device API in the following
+ * programming flow:
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update(): needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the application needs to update it.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_memory.h>
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+__rte_experimental
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+__rte_experimental
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * loading only a pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure().
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_START_ANCHOR_F (1ULL << 1)
+/**< RegEx device supports the PCRE anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_ATOMIC_GROUPING_F (1ULL << 2)
+/**< RegEx device supports PCRE atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. Given the subjects 'abc' and 'abcc',
+ * 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKTRACKING_CTRL_F (1ULL << 3)
+/**< RegEx device supports PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_CALLOUTS_F (1ULL << 4)
+/**< RegEx device supports PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D': if the given subject is 'ABCD', the
+ * RegEx engine will parse 'ABC', perform a user-defined callout and return a
+ * successful match at 'D'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
+/**< RegEx device supports PCRE backreferences.
+ * Example RegEx is '(\2ABC|(GHI))+'. Here '\2' matches the same text as most
+ * recently matched by the 2nd capturing group, i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
+/**< RegEx device supports PCRE greedy mode.
+ * For example, if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the subject 'AB12345' will be matched completely,
+ * whereas in ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
+/**< RegEx device supports PCRE lookaround assertions
+ * (zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})': if
+ * the given subject is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The subject 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_POINT_RST_F (1ULL << 8)
+/**< RegEx device supports the PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+': if the subject is 'dwad123'
+ * then even though the entire subject matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_NEWLINE_CONVENTIONS_F (1ULL << 9)
+/**< RegEx device supports PCRE newline conventions.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_NEWLINE_SEQ_F (1ULL << 10)
+/**< RegEx device supports the PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_POSSESSIVE_QUALIFIERS_F (1ULL << 11)
+/**< RegEx device supports PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_SUBROUTINE_REFERENCES_F (1ULL << 12)
+/**< RegEx device supports PCRE subroutine references.
+ * PCRE subroutine references allow sub-patterns to be re-evaluated
+ * as part of the RegEx. Example RegEx '(foo|fuzz)\g<1>+bar' matches the
+ * subject 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)
+/**< RegEx device supports UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_16_F (1ULL << 14)
+/**< RegEx device supports UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_32_F (1ULL << 15)
+/**< RegEx device supports UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_WORD_BOUNDARY_F (1ULL << 16)
+/**< RegEx device supports word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_FORWARD_REFERENCES_F (1ULL << 17)
+/**< RegEx device supports forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_MATCH_AS_START (1ULL << 18)
+/**< RegEx device supports match as start.
+ * Match as start means that the match result holds the starting position of
+ * match and the length of the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name. */
+	struct rte_device *dev;	/**< Device information. */
+	uint16_t max_matches;
+	/**< Maximum matches per scan supported by this device. */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device. */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint32_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device.
+	 * This number can't be larger than 20 bits.
+	 */
+	uint16_t max_groups;
+	/**< Maximum groups supported by this device.
+	 * This number can't be larger than 12 bits.
+	 */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_SUPP_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint8_t max_scatter_gather;
+	/**< The max supported number of buffers that can
+	 * be used in a single op. The total size of all elements
+	 * must be less than max_payload_size.
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
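
The capability bits above are meant to be tested against rte_regex_dev_info::regex_dev_capa after rte_regex_dev_info_get() returns. A minimal sketch of that test follows. It is self-contained, so the two SKETCH_* macros are stand-ins that copy bit positions from this patch, and sketch_capa_has_all() is a hypothetical helper that is not part of the proposed API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins mirroring two capability bits from this patch; a real
 * application would use the RTE_REGEX_DEV_SUPP_* macros instead. */
#define SKETCH_SUPP_PCRE_UTF_8_F   (1ULL << 13)
#define SKETCH_SUPP_MATCH_AS_START (1ULL << 18)

/* Hypothetical helper: true only if every bit in 'required' is also set
 * in 'capa' (the value of rte_regex_dev_info::regex_dev_capa). */
static bool
sketch_capa_has_all(uint32_t capa, uint64_t required)
{
	return ((uint64_t)capa & required) == required;
}
```

An application would typically run such a check once at start-up and refuse to load rules that depend on a missing feature.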
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload larger than
+ * struct rte_regex_dev_info::max_payload_size and/or when
+ * matches can span scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+#define RTE_REGEX_DEV_CFG_MATCH_AS_START (1ULL << 1)
+/**< Match as start is the ability to return the result as starting offset and
+ * len. When this flag is set, the result for each match will hold the starting
+ * offset of the match in offset, and the length of the match in len.
+ * If this flag is not set, then the match result will only hold the end
+ * offset of the match in end_offset, while offset will be zero.
+ *
+ * @see RTE_REGEX_DEV_SUPP_MATCH_AS_START
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint16_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint32_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group*
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups*
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Import an initial prebuilt rule database on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application may use
+	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+	 * to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags. @see RTE_REGEX_DEV_CFG_* */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities of the
+ * resources available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
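
Each nb_* field in the configuration is bounded by the matching max_* field reported by rte_regex_dev_info_get(). The sketch below shows one way an application might validate a configuration before calling rte_regex_dev_configure(). The struct and function names are hypothetical stand-ins that carry only the fields needed here; they are not the structures defined in this patch.

```c
#include <stdint.h>

/* Reduced stand-in for the limits in struct rte_regex_dev_info. */
struct sketch_limits {
	uint16_t max_matches;
	uint16_t max_queue_pairs;
	uint32_t max_rules_per_group;
	uint16_t max_groups;
};

/* Reduced stand-in for the requests in struct rte_regex_dev_config. */
struct sketch_config {
	uint16_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint32_t nb_rules_per_group;
	uint16_t nb_groups;
};

/* Hypothetical helper: returns 0 if every requested value is within the
 * device limit, -1 otherwise. A value of 0 always passes, which agrees
 * with the fields that document 0 as "use the default". */
static int
sketch_config_fits(const struct sketch_limits *lim,
		   const struct sketch_config *cfg)
{
	if (cfg->nb_max_matches > lim->max_matches)
		return -1;
	if (cfg->nb_queue_pairs > lim->max_queue_pairs)
		return -1;
	if (cfg->nb_rules_per_group > lim->max_rules_per_group)
		return -1;
	if (cfg->nb_groups > lim->max_groups)
		return -1;
	return 0;
}
```

Checking up front gives the application a clearer error than a generic negative errno from the driver, although the driver remains the final authority.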
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a RegEx device.
+ *
+ * The device can be restarted with a call to rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+__rte_experimental
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to retrieve.
+ * @param attr_value
+ *   A pointer that will be filled in with the attribute
+ *   value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an attribute of a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to set.
+ * @param attr_value
+ *   Pointer to the attribute value supplied
+ *   by the application.
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation. */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database. */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database. */
+};
+
+/** Structure to hold a RegEx rule attributes. */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< Operation type of the rule, either RTE_REGEX_RULE_OP_ADD or
+	 * RTE_REGEX_RULE_OP_REMOVE.
+	 */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The rule flags supported by the device are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+	 * rule database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the local rule set.
+ * This function only modifies the rule set in memory.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_compile()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint32_t nb_rules);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compile the local rule set and burn the compiled result to the
+ * RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success, otherwise negative errno.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_update()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_compile(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - Positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested.
+ * @return
+ *   - Positive value: number of stat entries filled into the values array
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param name
+ *   The stat name to retrieve.
+ * @param id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
+ *   be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ *
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters.
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   A pointer to a file for output.
+ *
+ * @return
+ *   0 on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t offset;
+			/**< Starting Byte Position for matched rule. */
+			RTE_STD_C11
+			union {
+				uint16_t len;
+				/**< Length of match in bytes */
+				uint16_t end_offset;
+				/**< The end offset of the match. In case
+				 * MATCH_AS_START configuration is disabled.
+				 * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
+				 */
+			};
+		};
+	};
+};
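
The last 16 bits of a match are read as len or as end_offset depending on whether RTE_REGEX_DEV_CFG_MATCH_AS_START was set at configure time. The sketch below shows how an application might turn a match into a byte span under each mode. The type and function names are hypothetical, and treating the computed end as exclusive (offset + len) is an assumption of this sketch; the patch does not say whether end_offset is inclusive or exclusive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: byte span of one match within the payload. */
struct sketch_span {
	uint32_t start;    /* Valid only when start_known is true. */
	uint32_t end;      /* offset + len, or the reported end_offset. */
	bool start_known;
};

/* 'offset' is rte_regex_match::offset; 'len_or_end' is the shared 16-bit
 * field read as len (match as start enabled) or end_offset (disabled). */
static struct sketch_span
sketch_match_span(uint16_t offset, uint16_t len_or_end, bool match_as_start)
{
	struct sketch_span s;

	if (match_as_start) {
		s.start = offset;
		s.end = (uint32_t)offset + len_or_end;
		s.start_known = true;
	} else {
		/* The device reports only where the match ends. */
		s.start = 0;
		s.end = len_or_end;
		s.start_known = false;
	}
	return s;
}
```

The end is widened to 32 bits because offset + len can exceed 65535 even though both inputs are 16-bit.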
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/** Struct to hold scatter gather elements in ops. */
+struct rte_regex_iov {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**< Allow 8-byte reserved on 32-bit system */
+		void *buf_addr;
+		/**< Virtual address of the pattern to be matched. */
+	};
+	rte_iova_t buf_iova;
+	/**< IOVA address of the pattern to be matched. */
+	uint16_t buf_size; /**< The buf size. */
+};
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint16_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint16_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	uint16_t num_of_bufs;
+	/**< The number of bufs that are part of this op. The total length of
+	 * all the buffers must be smaller than the max buffer
+	 * len.
+	 */
+	uint16_t resv1;
+	uint32_t resv2;
+
+	/* W2 */
+	struct rte_regex_iov *(*bufs)[];
+	/**< Holds a pointer to the buffers list.*/
+
+	/* W3 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W4 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W5 */
+	RTE_STD_C11
+	union {
+		uint64_t cross_buf_id;
+		/**< ID used by the RegEx device in order to handle cross
+		 * buffer detection.
+		 * This ID is given by the RegEx device on dequeue, and
+		 * the application must send it on the following enqueue.
+		 */
+		void *cross_buf_ptr;
+		/**< Pointer representation of *cross_buf_id* */
+	};
+
+	/* W6 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process, which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that must
+ *   be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   entries at the end of *ops* are not filled in and must not be read by
+ *   the caller.
+ */
+__rte_experimental
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
diff --git a/lib/librte_regexdev/rte_regexdev_version.map b/lib/librte_regexdev/rte_regexdev_version.map
new file mode 100644
index 0000000..aabae5e
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev_version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+	global:
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-01-28  9:00 ` [dpdk-dev] [PATCH v3] regexdev: " Ori Kam
@ 2020-02-22 16:52   ` Jerin Jacob
  2020-02-23  8:41     ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob @ 2020-02-22 16:52 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, opher, alexr, dovrat,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> new file mode 100644
> index 0000000..c42128b
> --- /dev/null
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -0,0 +1,1411 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2019 Marvell International Ltd.
> + * Copyright(C) 2020 Mellanox International Ltd.

There are a few comments from Xiang as well. So let's add Intel also
to the list.

> + */
> +
> +#ifndef _RTE_REGEXDEV_H_
> +#define _RTE_REGEXDEV_H_

> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +       const char *driver_name; /**< RegEx driver name. */
> +       struct rte_device *dev; /**< Device information. */
> +       uint16_t max_matches;
> +       /**< Maximum matches per scan supported by this device. */
> +       uint16_t max_queue_pairs;
> +       /**< Maximum queue pairs supported by this device. */
> +       uint16_t max_payload_size;
> +       /**< Maximum payload size for a pattern match request or scan.
> +        * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +        */
> +       uint32_t max_rules_per_group;
> +       /**< Maximum rules supported per group by this device.
> +        * This number can't be larger then 20 bits.

s/then/than

I think we don't need to say "This number can't be larger than 20 bits."
Dropping it may help SW drivers.



> +        */
> +       uint16_t max_groups;
> +       /**< Maximum group supported by this device.
> +        * This number can't be larger then 12 bits.
s/then/than
I think we don't need to say "This number can't be larger than 12 bits."
Dropping it may help SW drivers.

> +        */
> +       uint32_t regex_dev_capa;
> +       /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +       uint64_t rule_flags;
> +       /**< Supported compiler rule flags.
> +        * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +        */
> +       uint8_t max_scatter_gather;
> +       /**< The max supported number of buffers that can
> +        * be used in a single ops. The total size of all elements
> +        * must be less then max_payload_size.
> +        */
> +};
<snip>

> +int
> +rte_regex_rule_db_compile(uint8_t dev_id);
> +

I think your "rte_regex_rule_db_compile_activate() - compile and
activate the new rule set"
API name looks good. I am for rte_regex_rule_db_compile_activate().


> +/* Fast path APIs */
> +
> +/**
> + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> + * @see struct rte_regex_ops::matches
> + */
> +struct rte_regex_match {
> +       RTE_STD_C11
> +       union {
> +               uint64_t u64;
> +               struct {
> +                       uint32_t rule_id:20;
> +                       /**< Rule identifier to which the pattern matched.
> +                        * @see struct rte_regex_rule::rule_id
> +                        */
> +                       uint32_t group_id:12;
> +                       /**< Group identifier of the rule which the pattern
> +                        * matched. @see struct rte_regex_rule::group_id
> +                        */
> +                       uint16_t offset;

Since we have end_offset now, IMO, it is better to change this offset
to "start_offset".


> +                       /**< Starting Byte Position for matched rule. */
> +                       RTE_STD_C11
> +                       union {
> +                               uint16_t len;
> +                               /**< Length of match in bytes */
> +                               uint16_t end_offset;
> +                               /**< The end offset of the match. In case
> +                                * MATCH_AS_START configuration is disabled.
> +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> +                                */

We have not concluded on this scheme. Having one field with two different
meanings will be difficult for the application, i.e. the fast path will
need a check for this.

I think, based on the majority of HW/SW implementations, we need to go
with either len or end_offset. What does Mellanox HW return: len or end_offset?

Or we can keep it as len or end_offset based on whichever driver is upstreamed
first; when other drivers come, we can see how to abstract it.
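
To make the fast-path concern concrete, here is a minimal sketch. It is not part of the patch: struct match_sketch is a cut-down stand-in for rte_regex_match, match_len_sketch() and its reports_end_offset argument are hypothetical names, and end_offset is assumed to be exclusive (one past the last matched byte).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for the proposed struct rte_regex_match (assumption:
 * only the offset fields matter for this illustration). */
struct match_sketch {
	uint16_t start_offset;
	union {
		uint16_t len;        /* valid when the device reports lengths */
		uint16_t end_offset; /* valid when the device reports end offsets */
	};
};

/* Hypothetical helper: return the match length regardless of how the
 * device filled the union. reports_end_offset would have to be looked up
 * from the device configuration for every match. */
static inline uint16_t
match_len_sketch(const struct match_sketch *m, bool reports_end_offset)
{
	if (reports_end_offset) {
		/* Assumed exclusive end offset, so it cannot precede start. */
		assert(m->end_offset >= m->start_offset);
		return (uint16_t)(m->end_offset - m->start_offset);
	}
	return m->len;
}
```

With a single agreed field (either len or end_offset), both the branch and the configuration lookup disappear from the per-match path.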

> +                       };
> +               };
> +       };
> +};

> +/**
> + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> + * for enqueue and dequeue operation.
> + */
> +struct rte_regex_ops {
> +       /* W0 */
> +       uint16_t req_flags;
> +       /**< Request flags for the RegEx ops.
> +        * @see RTE_REGEX_OPS_REQ_*
> +        */
> +       uint16_t rsp_flags;
> +       /**< Response flags for the RegEx ops.
> +        * @see RTE_REGEX_OPS_RSP_*
> +        */
> +       uint16_t nb_actual_matches;
> +       /**< The total number of actual matches detected by the Regex device.*/
> +       uint16_t nb_matches;
> +       /**< The total number of matches returned by the RegEx device for this
> +        * scan. The size of *rte_regex_ops::matches* zero length array will be
> +        * this value.
> +        *
> +        * @see struct rte_regex_ops::matches, struct rte_regex_match
> +        */
> +
> +       /* W1 */
> +       uint16_t num_of_bufs;
> +       /**< The number of bufs that are part of this ops. The total size of
> +        * the length of all the buffer must be smaller then the max buffer
> +        * len.
> +        */
> +       uint16_t resv1;
> +       uint32_t resv2;

One point that came up in our implementation is that
HW can return an error for various reasons.

One option could be to set nb_matches to zero and update some flag.

What are your thoughts? Updating the flag may be overkill.
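
A minimal sketch of the two options, for discussion only. RSP_ERROR_SKETCH_F, struct ops_rsp_sketch and classify_sketch() are hypothetical names that do not exist in the patch; the point is only that without a flag the application cannot tell "no match" from "scan failed".

```c
#include <stdint.h>

/* Hypothetical response flag; not defined anywhere in the patch. */
#define RSP_ERROR_SKETCH_F (1u << 0)

/* Cut-down stand-in for the response part of struct rte_regex_ops. */
struct ops_rsp_sketch {
	uint16_t rsp_flags;
	uint16_t nb_matches;
};

enum scan_result_sketch {
	SCAN_NO_MATCH_SKETCH,
	SCAN_MATCHED_SKETCH,
	SCAN_ERROR_SKETCH
};

/* Classify a dequeued op. If the error flag were dropped, the first test
 * would go away and a failed scan would be reported as "no match". */
static inline enum scan_result_sketch
classify_sketch(const struct ops_rsp_sketch *op)
{
	if (op->rsp_flags & RSP_ERROR_SKETCH_F)
		return SCAN_ERROR_SKETCH;
	return op->nb_matches ? SCAN_MATCHED_SKETCH : SCAN_NO_MATCH_SKETCH;
}
```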


> +
> +       /* W2 */
> +       struct rte_regex_iov *(*bufs)[];
> +       /**< Holds a pointer to the buffers list.*/

This memory gets submitted to HW, so it cannot be from the heap.
Cryptodev had a similar dilemma about using a container format for the
multi-segment case; in the end they chose to go with mbuf.

The following elements are already in mbuf. To avoid duplication and
overhead in the most common use case, DPI (with rte_regex_iov,
one needs to copy all the elements from the mbuf on the fast path),
I propose to have mbuf here instead of rte_regex_iov.

struct rte_regex_iov {
RTE_STD_C11
union {
uint64_t u64;
/**<  Allow 8-byte reserved on 32-bit system */
void *buf_addr;
/**< Virtual address of the pattern to be matched. */
};
rte_iova_t buf_iova;
/**< IOVA address of the pattern to be matched. */
uint16_t buf_size; /**< The buf size. */
};


> +
> +       /* W5 */
> +       RTE_STD_C11
> +       union {
> +               uint64_t cross_buf_id;
> +               /**< ID used by the RegEx device in order to handle cross
> +                * buffer detection.
> +                * This ID is given by the RegEx device on dequeue, and
> +                * the application must send it on the following enque.
> +                */
> +               void *cross_buf_ptr;
> +               /**< Pointer representation of *cross_buf_id* */

Could you give an example of how to use cross_buf_id?
Marvell HW does not support cross_buf_id, so we need to add this
feature as a capability.



Regarding the rule attributes, we think the following needs to be added to
control the rule compilation behavior. If it does not converge at first look,
I think we can make the below a separate patch once we have the basic things.


diff --git a/lib/librte_regexdev/rte_regexdev.h
b/lib/librte_regexdev/rte_regexdev.h
index 765da4aaa..1fa6a7135 100644
--- a/lib/librte_regexdev/rte_regexdev.h
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -767,6 +767,10 @@ enum rte_regex_rule_op {
 struct rte_regex_rule {
  enum rte_regex_rule_op op;
  /**< OP type of the rule either a OP_ADD or OP_DELETE */
+ uint32_t nb_prefix;
+ /**< Number of prefix entries */
+ char **prefixes;
+ /**< Rule prefix list */
  uint16_t group_id;
  /**< Group identifier to which the rule belongs to. */
  uint32_t rule_id;
@@ -784,6 +788,169 @@ struct rte_regex_rule {
  */
 };

+/** Enumerates RegEx rule prefix control type */
+enum rte_regex_prefix_control_entry_type
+{
+ RTE_REGEX_PREFIX_CONTROL_ENTRY_BLACKLIST = 0,
+ /**< The RegEx rule compiler will not use any prefixes that are
+ * specified. If there are no options but to use a blacklisted prefix,
+ * the rule will not be compiled.
+ */
+ RTE_REGEX_PREFIX_CONTROL_ENTRY_GRAYLIST = 1,
+ /**< The RegEx rule compiler will try not to use any prefixes that
+ * are specified.
+ * It will settle for these if there are no other options.
+ */
+ RTE_REGEX_PREFIX_CONTROL_ENTRY_WHITELIST = 2
+ /**< The RegEx rule compiler will try to use any prefixes that are
+ * specified. If it can't it will use other prefixes.
+ */
+
+};
+
+/** Structure to hold a RegEx rule prefix control entry */
+struct rte_regex_prefix_control_entry
+{
+        enum rte_regex_prefix_control_entry_type type;
+        /**< Indicates the type of the prefix selection control list entry */
+        char *value;
+        /**< The value associated with the prefix selection control list entry.
+         * This could be a string such as ABCD.
+         * Characters can also be represented in hex format
+         * such as \x00\x01\x02\x03.
+         */
+};
+
+/** Structure to hold a RegEx rule prefix control entry list */
+struct rte_regex_prefix_control
+{
+        uint32_t num;
+        /**< The number of prefix selection control list entries */
+        struct rte_regex_prefix_control_entry *prefixes;
+        /**< The list of prefix selection control list entries */
+};
+
+/** Enumerates RegEx rule compilation capabilities */
+
+#define RTE_REGEX_COMPILE_CAP_AUTO_RULE_ID_F (1ULL << 0)
+/**< RegEx device compilation supports auto rule id
+ * Assign an automatic incrementing ID to each rule.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_CHECKSUM_F (1ULL << 1)
+/**< RegEx device compilation supports checksum calculation.
+ * When performing an incremental compile use the end checksum of the base rule
+ * db as the start checksum for the new rule db.
+ * Must be used in conjunction with an incremental compile.
+ */
+
+/* @warning might not be needed */
+#define RTE_REGEX_COMPILE_CAP_EM_F (1ULL << 2)
+/**< RegEx device compilation supports external memory
+ * Store rule db in external memory only.
+ */
+
+/* @warning might not be needed */
+#define RTE_REGEX_COMPILE_CAP_DISABLE_DISABLE_AIC_F (1ULL << 3)
+/**< RegEx device compilation supports disable of atomic incremental compile
+ * Disable the atomic incremental compile system.
+ * The incremental compilation will be performed but the output
+ * will not be atomized.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_FORCE_F (1ULL << 4)
+/**< RegEx device compilation supports force compilation
+ * Force the compilation to complete even if errors are found.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_INCREMENTAL_F (1ULL << 5)
+/**< RegEx device compilation supports incremental compilation
+ * Perform an incremental compile using a prebuilt rule db as the base.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_NO_INC_COMPILE_PADDING_F (1ULL << 6)
+/**< RegEx device compilation supports padding removal
+ * Remove the padding that is normally introduced to assist with
+ * incremental compile. This will allow for a more compact rule db footprint
+ * but is not recommended if you intend to perform an incremental
+ * compile with the resulting rule db.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_PCRE_PRE_8_36_F (1ULL << 7)
+/**< RegEx device compilation supports change of space class
+ * Set the space class to not include vertical tab (VT) as was used in
+ * PCRE before v8.36.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_STRICT_QUANTIFIERS_F (1ULL << 8)
+/**< RegEx device compilation supports strict quantifiers
+ * To help with performance, by default the REGEX Compiler treats non-fixed
+ * bounded quantifiers as unbounded e.g. .{0,2048} will be the same as .*. This
+ * has the caveat of false positives being possible.
+ * This capability will ensure that the original construct is used in all
+ * cases meaning performance will be worse but no false positives will occur.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_QUICK_REMOVE_RULES_F (1ULL << 9)
+/**< RegEx device compilation supports quick removal of rules
+ * Quick incremental compile to remove rules.
+ * Specify rule_ids_to_remove from rule DB.
+ * Must be used in conjunction with the incremental option.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_QUICK_ADD_RULES_F (1ULL << 10)
+/**< RegEx device compilation supports quick addition of rules
+ * Quick incremental compile to add rules.
+ * Specify rule_ids_to_add to rule DB.
+ * Must be used in conjunction with the incremental option.
+ */
+
+/* @warning might not be needed */
+#define RTE_REGEX_COMPILE_CAP_UTF_8_F (1ULL << 11)
+/**< RegEx device compilation supports UTF-8 mode
+ * Switch on UTF-8 mode.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_DISABLE_BIDIRECTIONAL_F (1ULL << 12)
+/**< RegEx device compilation supports disabling of bidirectional
+ * This disables the bidirectional rule compilation.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_UTF_16_STRING_F (1ULL << 13)
+/**< RegEx device compilation supports UTF-16 strings efficiency
+ * Deal with UTF-16 strings more efficiently.
+ * The Compiler will assume RegEx constructs like the following are intended
+ * to match UTF-16 strings:A\x00?B\x00?C\x00?
+ * with this switch enabled the Compiler will interpret the
+ * above RegEx as:ABCD|A\x00B\x00C\x00
+ */
+
+#define RTE_REGEX_COMPILE_CAP_SWITCH_OFF_LOCALE_SUPPORT_F (1ULL << 14)
+/**< RegEx device compilation supports disabling locale
+ * Disable locale support.
+ */
+
+#define RTE_REGEX_COMPILE_CAP_SPLIT_ALTERNATIONS_F (1ULL << 15)
+/**< RegEx device compilation supports automatic alternation splitting
+ * Enable the automatic alternation splitting.
+ */
+
+/** Structure to hold a RegEx control list data sample */
+struct rte_regex_data_sample
+{
+        float threshold;
+        /**< A % threshold used to blacklist strings in the sample data.
+         * This % value represents the % of bytes in the sample data that
+         * represent the start of a 1 - 4-byte string.
+         * If the string occurs at > the % threshold number of bytes it will
+         * then be blacklisted.
+         */
+        size_t length;
+        /**< The number of bytes in the data sample to be analyzed */
+        char *data;
+        /**< The data sample to be analyzed */
+};
+
 /**
  * Update the rule database of a RegEx device.
  *
@@ -793,6 +960,12 @@ struct rte_regex_rule {
  *   which contain the regex rules attributes to be updated in rule database.
  * @param nb_rules
  *   The number of PCRE rules to update the rule database.
+ * @param prefix_control
+ *   The prefix selection control list.
+ * @param compiler_options
+ *   The compiler options flags @see RTE_REGEX_COMPILE_CAP*.
+ * @param data_sample
+ *   The data sample for auto control list generation.
  *
  * @return
  *   The number of regex rules actually updated on the regex device's rule
@@ -811,7 +984,10 @@ struct rte_regex_rule {
  */
 uint16_t
 rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
- uint16_t nb_rules);
+ uint16_t nb_rules,
+ struct rte_regex_prefix_control *prefix_control,
+ uint32_t compiler_options,
+ struct rte_regex_data_sample *data_sample);

 /**
  * Import a prebuilt rule database from a buffer to a RegEx device.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-22 16:52   ` Jerin Jacob
@ 2020-02-23  8:41     ` Ori Kam
  2020-02-23  9:53       ` Jerin Jacob
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-23  8:41 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Jerin,

Thanks for the review.
Please see below.

Ori Kam

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Saturday, February 22, 2020 6:52 PM
> To: Ori Kam <orika@mellanox.com>
> Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> Shuler <shahafs@mellanox.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> > diff --git a/lib/librte_regexdev/rte_regexdev.h
> b/lib/librte_regexdev/rte_regexdev.h
> > new file mode 100644
> > index 0000000..c42128b
> > --- /dev/null
> > +++ b/lib/librte_regexdev/rte_regexdev.h
> > @@ -0,0 +1,1411 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2019 Marvell International Ltd.
> > + * Copyright(C) 2020 Mellanox International Ltd.
> 
> There are a few comments from Xiang as well. So let's add Intel also
> to the list.
> 

Sure no problem.

> > + */
> > +
> > +#ifndef _RTE_REGEXDEV_H_
> > +#define _RTE_REGEXDEV_H_
> 
> > +
> > +/**
> > + * RegEx device information
> > + */
> > +struct rte_regex_dev_info {
> > +       const char *driver_name; /**< RegEx driver name. */
> > +       struct rte_device *dev; /**< Device information. */
> > +       uint16_t max_matches;
> > +       /**< Maximum matches per scan supported by this device. */
> > +       uint16_t max_queue_pairs;
> > +       /**< Maximum queue pairs supported by this device. */
> > +       uint16_t max_payload_size;
> > +       /**< Maximum payload size for a pattern match request or scan.
> > +        * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > +        */
> > +       uint32_t max_rules_per_group;
> > +       /**< Maximum rules supported per group by this device.
> > +        * This number can't be larger then 20 bits.
> 
> s/then/than
> 
> I think, we don't need to say this " This number can't be larger than 20 bits."
> It may help SW drivers.
> 

Agreed, I will remove the 20 bits part.

> 
> 
> > +        */
> > +       uint16_t max_groups;
> > +       /**< Maximum group supported by this device.
> > +        * This number can't be larger then 12 bits.
> s/then/than
> I think, we don't need to say this " This number can't be larger than 12 bits."
> It may help SW drivers.
>

Agreed, I will remove the 12 bits part.
 
> > +        */
> > +       uint32_t regex_dev_capa;
> > +       /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > +       uint64_t rule_flags;
> > +       /**< Supported compiler rule flags.
> > +        * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> > +        */
> > +       uint8_t max_scatter_gather;
> > +       /**< The max supported number of buffers that can
> > +        * be used in a single ops. The total size of all elements
> > +        * must be less then max_payload_size.
> > +        */
> > +};
> <snip>
> 
> > +int
> > +rte_regex_rule_db_compile(uint8_t dev_id);
> > +
> 
> I think your "rte_regex_rule_db_compile_activate() - compile and
> activate the new rule set"
> API name looks good. I am for rte_regex_rule_db_compile_activate().
> 

I like your name, will change to compile_activate.

> 
> > +/* Fast path APIs */
> > +
> > +/**
> > + * The generic *rte_regex_match* structure to hold the RegEx match
> attributes.
> > + * @see struct rte_regex_ops::matches
> > + */
> > +struct rte_regex_match {
> > +       RTE_STD_C11
> > +       union {
> > +               uint64_t u64;
> > +               struct {
> > +                       uint32_t rule_id:20;
> > +                       /**< Rule identifier to which the pattern matched.
> > +                        * @see struct rte_regex_rule::rule_id
> > +                        */
> > +                       uint32_t group_id:12;
> > +                       /**< Group identifier of the rule which the pattern
> > +                        * matched. @see struct rte_regex_rule::group_id
> > +                        */
> > +                       uint16_t offset;
> 
> Since we have end_offset now, IMO, it is better to change this offset
> to "start_offset".
> 

Agree, will change.

> 
> > +                       /**< Starting Byte Position for matched rule. */
> > +                       RTE_STD_C11
> > +                       union {
> > +                               uint16_t len;
> > +                               /**< Length of match in bytes */
> > +                               uint16_t end_offset;
> > +                               /**< The end offset of the match. In case
> > +                                * MATCH_AS_START configuration is disabled.
> > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > +                                */
> 
> We have not concluded on this scheme. Have one field which has
> different meaning will be difficult
> for application. i.e fast path we need to have a check for this.
> 

This is the time to conclude 😊, at least for the first version.
Why do we have one field with different meanings?
The result can be either len or end_offset.

> I think, Based on the majority of HW/SW implementation, we need to
> either go with len or
> end_offset. What Mellanox HW returns? len or end_offset?
>

From Mellanox's perspective, we prefer the len approach. We also think
it is much more user-oriented.
 
> or We can keep it as len or end_offset based on which drivers upstream first,
> other drivers when it comes, we can see how to abstract it?
> 

I can accept that, assuming we choose the start and len approach 😊

> > +                       };
> > +               };
> > +       };
> > +};
> 
> > +/**
> > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > + * for enqueue and dequeue operation.
> > + */
> > +struct rte_regex_ops {
> > +       /* W0 */
> > +       uint16_t req_flags;
> > +       /**< Request flags for the RegEx ops.
> > +        * @see RTE_REGEX_OPS_REQ_*
> > +        */
> > +       uint16_t rsp_flags;
> > +       /**< Response flags for the RegEx ops.
> > +        * @see RTE_REGEX_OPS_RSP_*
> > +        */
> > +       uint16_t nb_actual_matches;
> > +       /**< The total number of actual matches detected by the Regex
> device.*/
> > +       uint16_t nb_matches;
> > +       /**< The total number of matches returned by the RegEx device for this
> > +        * scan. The size of *rte_regex_ops::matches* zero length array will be
> > +        * this value.
> > +        *
> > +        * @see struct rte_regex_ops::matches, struct rte_regex_match
> > +        */
> > +
> > +       /* W1 */
> > +       uint16_t num_of_bufs;
> > +       /**< The number of bufs that are part of this ops. The total size of
> > +        * the length of all the buffer must be smaller then the max buffer
> > +        * len.
> > +        */
> > +       uint16_t resv1;
> > +       uint32_t resv2;
> 
> One of the point came up in our implementation is that.
> HW can return an error due to various reasons.
> 
> One option could be to make nb_matches as zero? and update some flag?
> 
> What are your thoughts? updating the flag may be overkill.
>

I think we can return just zero matches for now.
 
> 
> > +
> > +       /* W2 */
> > +       struct rte_regex_iov *(*bufs)[];
> > +       /**< Holds a pointer to the buffers list.*/
> 
> This memory gets submitted to HW so it can not be from the heap.
> Cryptodev had a similar dilemma to use the container format for
> multi-segment case, Finally they choose to with mbuf.
> 
> The following elements are in mbuf. Considering to avoid duplication and
> avoid overhead most common usecase DPI(Assume if it is rte_regex_iov,
> one need to copy all the elements from mbuf on fastpath).
> I propose to have mbuf here instead of rte_regex_iov.
> 

The application only needs to set the data pointers (no copy is required).
I agree that there are advantages to the mbuf approach.
The main limitation of the mbuf approach is that the user will need to play
with the offset pointers and the pointers to the next mbuf in order to support
cross buffer.
For example, say we have a packet and we also want to add the last part of the
previous packet to the scan. The application must then modify the data offset
in the previous packet's mbuf, change its next pointer to point to the head of
the new packet, and afterwards restore the original values.

What do you think?
We can start with mbufs and see how it works, or start with the buffer and see how it works.
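
As a rough illustration of the buffer-list side of this trade-off, the sketch below builds a two-entry list covering the tail of the previous packet followed by the whole current packet, without copying or modifying either packet. struct iov_sketch is a simplified stand-in for rte_regex_iov (the IOVA field is left out), and build_cross_iov_sketch() is a hypothetical helper, not a proposed API.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct rte_regex_iov (assumption: virtual
 * address and size only; the real struct also carries an IOVA). */
struct iov_sketch {
	const void *buf_addr;
	uint16_t buf_size;
};

/* Fill iov[0] with the last tail_len bytes of the previous packet and
 * iov[1] with the whole current packet. Only pointers are set; no payload
 * is copied and neither packet is modified. Returns the number of
 * entries written. */
static inline uint16_t
build_cross_iov_sketch(struct iov_sketch iov[2],
		       const uint8_t *prev, uint16_t prev_len,
		       uint16_t tail_len,
		       const uint8_t *cur, uint16_t cur_len)
{
	assert(tail_len <= prev_len);
	iov[0].buf_addr = prev + (prev_len - tail_len);
	iov[0].buf_size = tail_len;
	iov[1].buf_addr = cur;
	iov[1].buf_size = cur_len;
	return 2;
}
```

With mbufs, the same scan would require temporarily changing the data offset and next pointer of the previous packet's mbuf and restoring them afterwards, as described above.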



 
> struct rte_regex_iov {
> RTE_STD_C11
> union {
> uint64_t u64;
> /**<  Allow 8-byte reserved on 32-bit system */
> void *buf_addr;
> /**< Virtual address of the pattern to be matched. */
> };
> rte_iova_t buf_iova;
> /**< IOVA address of the pattern to be matched. */
> uint16_t buf_size; /**< The buf size. */
> };
> 
> 
> > +
> > +       /* W5 */
> > +       RTE_STD_C11
> > +       union {
> > +               uint64_t cross_buf_id;
> > +               /**< ID used by the RegEx device in order to handle cross
> > +                * buffer detection.
> > +                * This ID is given by the RegEx device on dequeue, and
> > +                * the application must send it on the following enque.
> > +                */
> > +               void *cross_buf_ptr;
> > +               /**< Pointer representation of *cross_buf_id* */
> 
> Could you have some example of how to use cross_buf_id?
> Marvell HW does not support cross_buf_id, so we need to add this
> feature as capability.
>

The idea is that this buffer will be used to keep some internal data for the engine,
for example the current state and what was found until now, which is then reused
for the next buffer.
We can remove it for now if we agree that we can add it later.
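
A minimal sketch of how an application might carry the ID between packets of one flow, based only on the field description in the patch (the device returns the ID on dequeue and the application sends it back on the next enqueue). struct flow_ctx_sketch, struct ops_sketch and both helpers are hypothetical, and treating 0 as "no saved state yet" is an assumption.

```c
#include <stdint.h>

/* Cut-down stand-in for struct rte_regex_ops; only the W5 field is kept. */
struct ops_sketch {
	uint64_t cross_buf_id;
};

/* Hypothetical per-flow state kept by the application.
 * Assumption: 0 means the device has not returned any state yet. */
struct flow_ctx_sketch {
	uint64_t cross_buf_id;
};

/* Before enqueue: hand the ID saved from the previous scan of this flow
 * back to the device. */
static inline void
flow_prepare_enqueue_sketch(const struct flow_ctx_sketch *flow,
			    struct ops_sketch *op)
{
	op->cross_buf_id = flow->cross_buf_id;
}

/* After dequeue: remember the ID the device returned so the next buffer
 * of this flow can continue from the saved engine state. */
static inline void
flow_save_dequeue_sketch(struct flow_ctx_sketch *flow,
			 const struct ops_sketch *op)
{
	flow->cross_buf_id = op->cross_buf_id;
}
```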
 


One more thing regarding the ops structure: I think it is better to split it into 2 different
structures, one for enqueue and one for dequeue, since there is no real shared data and we will
be able to save memory. What do you think?

> 
> 
> Regarding the rule attributes, We think, following needs to be added to control
> the rule compilation behavior. If it not converging in first look, I
> think, we can make
> below as separate patch once we have basic things.
> 

Regarding the new code, we also need to add a function to get the compiler capabilities, or
add a new field in dev_info that reports the features the compiler supports.
I prefer a dedicated function.
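
As a sketch of how an application might use such a capability report: the flag values below are copied from the quoted diff, while `compile_opts_valid()` and the way `capa` is obtained are assumptions, since no query function exists in the proposal yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the proposed diff. */
#define RTE_REGEX_COMPILE_CAP_INCREMENTAL_F        (1ULL << 5)
#define RTE_REGEX_COMPILE_CAP_QUICK_REMOVE_RULES_F (1ULL << 9)
#define RTE_REGEX_COMPILE_CAP_QUICK_ADD_RULES_F    (1ULL << 10)

/*
 * Hypothetical helper: `capa` is whatever a dedicated capability query
 * (or a dev_info field) would report; `opts` is what the application
 * wants to pass as compiler options.
 */
static bool
compile_opts_valid(uint64_t capa, uint64_t opts)
{
	const uint64_t quick = RTE_REGEX_COMPILE_CAP_QUICK_REMOVE_RULES_F |
			       RTE_REGEX_COMPILE_CAP_QUICK_ADD_RULES_F;

	/* Every requested option must be supported by the compiler. */
	if ((opts & ~capa) != 0)
		return false;
	/* Quick add/remove are documented as requiring incremental compile. */
	if ((opts & quick) != 0 &&
	    (opts & RTE_REGEX_COMPILE_CAP_INCREMENTAL_F) == 0)
		return false;
	return true;
}
```

Whether the mask comes from a dedicated function or from dev_info, the application-side check stays the same. The difference is only in where the driver reports it.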

> 
> diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
> index 765da4aaa..1fa6a7135 100644
> --- a/lib/librte_regexdev/rte_regexdev.h
> +++ b/lib/librte_regexdev/rte_regexdev.h
> @@ -767,6 +767,10 @@ enum rte_regex_rule_op {
>  struct rte_regex_rule {
>   enum rte_regex_rule_op op;
>   /**< OP type of the rule either a OP_ADD or OP_DELETE */
> + uint32_t nb_prefix;
> + /**< Number of prefix entries */
> + char **prefixes;
> + /**< Rule prefix list */
>   uint16_t group_id;
>   /**< Group identifier to which the rule belongs to. */
>   uint32_t rule_id;
> @@ -784,6 +788,169 @@ struct rte_regex_rule {
>   */
>  };
> 
> +/** Enumerates RegEx rule prefix control type */
> +enum rte_regex_prefix_control_entry_type
> +{
> + RTE_REGEX_PREFIX_CONTROL_ENTRY_BLACKLIST = 0,
> + /**< The RegEx rule compiler will not use any prefixes that are
> + * specified. If there are no options but to use a blacklisted prefix,
> + * the rule will not be compiled.
> + */
> + RTE_REGEX_PREFIX_CONTROL_ENTRY_GRAYLIST = 1,
> + /**< The RegEx rule compiler will try not to use any prefixes that
> + * are specified.
> + * It will settle for these if there are no other options.
> + */
> + RTE_REGEX_PREFIX_CONTROL_ENTRY_WHITELIST = 2
> + /**< The RegEx rule compiler will try to use any prefixes that are
> + * specified. If it can't it will use other prefixes.
> + */
> +
> +};
> +
> +/** Structure to hold a RegEx rule prefix control entry */
> +struct rte_regex_prefix_control_entry
> +{
> +        enum rte_regex_prefix_control_entry_type type;
> +        /**< Indicates the type of the prefix selection control list entry */
> +        char *value;
> +        /**< The value associated with the prefix selection control list entry.
> +         * This could be a string such as ABCD.
> +         * Characters can also be represented in hex format
> +         * such as \x00\x01\x02\x03.
> +         */
> +};
> +
> +/** Structure to hold a RegEx rule prefix control entry list */
> +struct rte_regex_prefix_control
> +{
> +        uint32_t num;
> +        /**< The number of prefix selection control list entries */
> +        struct rte_regex_prefix_control_entry *prefixes;
> +        /**< The list of prefix selection control list entries */
> +};
> +
> +/** Enumerates RegEx rule compilation capabilities */
> +
> +#define RTE_REGEX_COMPILE_CAP_AUTO_RULE_ID_F (1ULL << 0)
> +/**< RegEx device compilation supports auto rule id
> + * Assign an automatic incrementing ID to each rule.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_CHECKSUM_F (1ULL << 1)
> +/**< RegEx device compilation supports checksum calculation.
> + * When performing an incremental compile use the end checksum of the
> + * base rule db as the start checksum for the new rule db.
> + * Must be used in conjunction with an incremental compile.
> + */
> +
> +/* @warning might not be needed */
> +#define RTE_REGEX_COMPILE_CAP_EM_F (1ULL << 2)
> +/**< RegEx device compilation supports external memory
> + * Store rule db in external memory only.
> + */
> +
> +/* @warning might not be needed */
> +#define RTE_REGEX_COMPILE_CAP_DISABLE_DISABLE_AIC_F (1ULL << 3)
> +/**< RegEx device compilation supports disable of atomic incremental compile
> + * Disable the atomic incremental compile system.
> + * The incremental compilation will be performed but the output
> + * will not be atomized.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_FORCE_F (1ULL << 4)
> +/**< RegEx device compilation supports force compilation
> + * Force the compilation to complete even if errors are found.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_INCREMENTAL_F (1ULL << 5)
> +/**< RegEx device compilation supports incremental compilation
> + * Perform an incremental compile using pre build rule db as the base.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_NO_INC_COMPILE_PADDING_F (1ULL << 6)
> +/**< RegEx device compilation supports padding removal
> + * Remove the padding that is normally introduced to assist with
> + * incremental compile. This will allow for a more compact rule db footprint
> + * but is not recommended if you intend to perform an incremental
> + * compile with the resulting rule db.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_PCRE_PRE_8_36_F (1ULL << 7)
> +/**< RegEx device compilation supports change of space class
> + * Set the space class to not include vertical tab (VT) as was used in
> + * PCRE before v8.36.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_STRICT_QUANTIFIERS_F (1ULL << 8)
> +/**< RegEx device compilation supports strict quantifiers
> + * To help with performance, by default the REGEX Compiler treats non-fixed
> + * bounded quantifiers as unbounded e.g. .{0,2048} will be the same as .*. This
> + * has the caveat of false positives being possible.
> + * This capability will ensure that the original construct is used in all
> + * cases meaning performance will be worse but no false positives will occur.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_QUICK_REMOVE_RULES_F (1ULL << 9)
> +/**< RegEx device compilation supports quick removal of rules
> + * Quick incremental compile to remove rules.
> + * Specify rule_ids_to_remove from rule DB.
> + * Must be used in conjunction with the incremental option.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_QUICK_ADD_RULES_F (1ULL << 10)
> +/**< RegEx device compilation supports quick addition of rules
> + * Quick incremental compile to add rules.
> + * Specify rule_ids_to_add to rule DB.
> + * Must be used in conjunction with the incremental option.
> + */
> +
> +/* @warning might not be needed */
> +#define RTE_REGEX_COMPILE_CAP_UTF_8_F (1ULL << 11)
> +/**< RegEx device compilation supports UTF-8 mode
> + * Switch on UTF-8 mode.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_DISABLE_BIDIRECTIONAL_F (1ULL << 12)
> +/**< RegEx device compilation supports disabling of bidirectional
> + * This disables the bidirectional rule compilation.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_UTF_16_STRING_F (1ULL << 13)
> +/**< RegEx device compilation supports UTF-16 strings efficiency
> + * Deal with UTF-16 strings more efficiently.
> + * The Compiler will assume RegEx constructs like the following are intended
> + * to match UTF-16 strings:A\x00?B\x00?C\x00?
> + * with this switch enabled the Compiler will interpret the
> + * above RegEx as:ABCD|A\x00B\x00C\x00
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_SWITCH_OFF_LOCALE_SUPPORT_F (1ULL << 14)
> +/**< RegEx device compilation supports disabling locale
> + * Disable locale support.
> + */
> +
> +#define RTE_REGEX_COMPILE_CAP_SPLIT_ALTERNATIONS_F (1ULL << 15)
> +/**< RegEx device compilation supports automatic alternation splitting
> + * Enable the automatic alternation splitting.
> + */
> +
> +/** Structure to hold a RegEx control list data sample */
> +struct rte_regex_data_sample
> +{
> +        float threshold;
> +        /**< A % threshold used to blacklist strings in the sample data.
> +         * This % value represents the % of bytes in the sample data that
> +         * represent the start of a 1 - 4-byte string.
> +         * If the string occurs at > the % threshold number of bytes it will
> +         * then be blacklisted.
> +         */
> +        size_t length;
> +        /**< The number of bytes in the data sample to be analyzed */
> +        char *data;
> +        /**< The data sample to be analyzed */
> +};
> +
>  /**
>   * Update the rule database of a RegEx device.
>   *
> @@ -793,6 +960,12 @@ struct rte_regex_rule {
>   *   which contain the regex rules attributes to be updated in rule database.
>   * @param nb_rules
>   *   The number of PCRE rules to update the rule database.
> + * @param prefixes
> + *   The prefix selection control list.
> + * @param compiler_options
> + *   The compiler options flags @see RTE_REGEX_COMPILE_CAP*.
> + * @param data_sample
> + *   The data sample for auto control list generation.
>   *
>   * @return
>   *   The number of regex rules actually updated on the regex device's rule
> @@ -811,7 +984,10 @@ struct rte_regex_rule {
>   */
>  uint16_t
>  rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
> - uint16_t nb_rules);
> + uint16_t nb_rules,
> + struct rte_regex_prefix_control *prefix_control,
> + uint32_t compiler_options,
> + struct rte_regex_data_sample *data_sample);
> 
>  /**
>   * Import a prebuilt rule database from a buffer to a RegEx device.


* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-23  8:41     ` Ori Kam
@ 2020-02-23  9:53       ` Jerin Jacob
  2020-02-23 12:33         ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob @ 2020-02-23  9:53 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

On Sun, Feb 23, 2020 at 2:12 PM Ori Kam <orika@mellanox.com> wrote:
>
> Hi Jerin,
>
> Thanks, for the review.
> PSB


Hi Ori

Since we are finalizing the specification part, I thought of
enumerating the list of work that needs to be
completed for a new subsystem in DPDK.

0) Finalize the first version of the spec. Hope v4 will do that.
1) Introduce common library code based on the specification
2) One HW based driver implementation
3) One SW reference driver: libpcre library provides complete PCRE
functionality.
4) app/test/test_regexdev.c like app/test/test_eventdev.c
5) Need a maintainer for maintaining the regex subsystem
6) The first version programming guide documentation
7) Add app/test-regexdev like app/test-eventdev
8) Add an examples/xxxxxx program

IMO, the following items need to be completed to accept a subsystem in
DPDK (we need at least one HW and one SW driver).

0) Finalize the first version of the spec. Hope v4 will do that.
1) Introduce common library code based on the specification
2) One HW based driver implementation
3) One SW reference driver: libpcre library provides complete PCRE
functionality.
4) app/test/test_regexdev.c like app/test/test_eventdev.c
5) Need a maintainer for maintaining the regex subsystem

We have item (3), so Marvell would like to work on item (3). Our HW
driver may be ready by v20.05 or, in the worst case, by v20.08.
Let us know what other items Mellanox or the community would like to work
on. This is to avoid duplication of work
and to get clarity on the next steps.

PSB


>
>
> Ori Kam
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Saturday, February 22, 2020 6:52 PM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> >
> > > diff --git a/lib/librte_regexdev/rte_regexdev.h
> > b/lib/librte_regexdev/rte_regexdev.h
> > > new file mode 100644
> > > index 0000000..c42128b
> > > --- /dev/null
> > > +++ b/lib/librte_regexdev/rte_regexdev.h
> > > @@ -0,0 +1,1411 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(C) 2019 Marvell International Ltd.
> > > + * Copyright(C) 2020 Mellanox International Ltd.
> >
> > There are a few comments from Xiang as well. So let's add Intel also
> > to the list.
> >
>
> Sure no problem.

Thanks

>
> > > + */
> > > +
> > > +#ifndef _RTE_REGEXDEV_H_
> > > +#define _RTE_REGEXDEV_H_
> >
> > > +
> > > +/**
> > > + * RegEx device information
> > > + */
> > > +struct rte_regex_dev_info {
> > > +       const char *driver_name; /**< RegEx driver name. */
> > > +       struct rte_device *dev; /**< Device information. */
> > > +       uint16_t max_matches;
> > > +       /**< Maximum matches per scan supported by this device. */
> > > +       uint16_t max_queue_pairs;
> > > +       /**< Maximum queue pairs supported by this device. */
> > > +       uint16_t max_payload_size;
> > > +       /**< Maximum payload size for a pattern match request or scan.
> > > +        * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > +        */
> > > +       uint32_t max_rules_per_group;
> > > +       /**< Maximum rules supported per group by this device.
> > > +        * This number can't be larger then 20 bits.
> >
> > s/then/than
> >
> > I think, we don't need to say this " This number can't be larger than 20 bits."
> > It may help SW drivers.
> >
>
> Agree I will remove the 20 bits part.
>
> >
> >
> > > +        */
> > > +       uint16_t max_groups;
> > > +       /**< Maximum group supported by this device.
> > > +        * This number can't be larger then 12 bits.
> > s/then/than
> > I think, we don't need to say this " This number can't be larger than 12 bits."
> > It may help SW drivers.
> >
>
> Agree will remove the 12 bits part.
>
> > > +        */
> > > +       uint32_t regex_dev_capa;
> > > +       /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > > +       uint64_t rule_flags;
> > > +       /**< Supported compiler rule flags.
> > > +        * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> > > +        */
> > > +       uint8_t max_scatter_gather;
> > > +       /**< The max supported number of buffers that can
> > > +        * be used in a single ops. The total size of all elements
> > > +        * must be less then max_payload_size.
> > > +        */
> > > +};
> > <snip>
> >
> > > +int
> > > +rte_regex_rule_db_compile(uint8_t dev_id);
> > > +
> >
> > I think your "rte_regex_rule_db_compile_activate() - compile and
> > activate the new rule set"
> > API name looks good. I am for rte_regex_rule_db_compile_activate().
> >
>
> I like your name, will change to compile_activate.

Ack.

>
> >
> > > +/* Fast path APIs */
> > > +
> > > +/**
> > > + * The generic *rte_regex_match* structure to hold the RegEx match
> > attributes.
> > > + * @see struct rte_regex_ops::matches
> > > + */
> > > +struct rte_regex_match {
> > > +       RTE_STD_C11
> > > +       union {
> > > +               uint64_t u64;
> > > +               struct {
> > > +                       uint32_t rule_id:20;
> > > +                       /**< Rule identifier to which the pattern matched.
> > > +                        * @see struct rte_regex_rule::rule_id
> > > +                        */
> > > +                       uint32_t group_id:12;
> > > +                       /**< Group identifier of the rule which the pattern
> > > +                        * matched. @see struct rte_regex_rule::group_id
> > > +                        */
> > > +                       uint16_t offset;
> >
> > Since we have end_offset now, IMO, it is better to change this offset
> > to "start_offset".
> >
>
> Agree, will change.
>
> >
> > > +                       /**< Starting Byte Position for matched rule. */
> > > +                       RTE_STD_C11
> > > +                       union {
> > > +                               uint16_t len;
> > > +                               /**< Length of match in bytes */
> > > +                               uint16_t end_offset;
> > > +                               /**< The end offset of the match. In case
> > > +                                * MATCH_AS_START configuration is disabled.
> > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > +                                */
> >
> > We have not concluded on this scheme. Have one field which has
> > different meaning will be difficult
> > for application. i.e fast path we need to have a check for this.
> >
>
> This is the time to conclude . at least for the first version.
> Why do we have one field with different meaning?
> The result can be ether len or end_offset.
>
> > I think, Based on the majority of HW/SW implementation, we need to
> > either go with len or
> > end_offset. What Mellanox HW returns? len or end_offset?
> >
>
> From Mellanox perspective we prefer the len approach. We also think
> it is much more user oriented.
>
> > or We can keep it as len or end_offset based on which drivers upstream first,
> > other drivers when it comes, we can see how to abstract it?
> >
>
> I can except that assuming we choose the start and len approach

I think we can have the first version with "start and len" by removing
RTE_REGEX_DEV_CFG_MATCH_AS_START.
We can think about how to abstract new drivers when they are upstreamed, based on
the overhead.
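
For illustration, a "start and len" match record could look like the sketch below. The struct mirrors the fields quoted above with `offset` renamed to `start_offset`. The two helpers are hypothetical, and the end offset is assumed to be exclusive (one past the last matched byte), which the thread has not settled.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the match record with the "start and len" layout. */
struct regex_match {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;
			uint32_t group_id:12;
			uint16_t start_offset; /* first matched byte */
			uint16_t len;          /* match length in bytes */
		};
	};
};

/* What a driver whose HW reports an end offset would do at dequeue. */
static struct regex_match
match_from_end(uint32_t rule_id, uint32_t group_id,
	       uint16_t start, uint16_t end)
{
	struct regex_match m = { .u64 = 0 };

	assert(end >= start);
	m.rule_id = rule_id & 0xFFFFF;
	m.group_id = group_id & 0xFFF;
	m.start_offset = start;
	m.len = (uint16_t)(end - start);
	return m;
}

/* Application-side helper: recover the (exclusive) end offset. */
static uint32_t
match_end_offset(const struct regex_match *m)
{
	return (uint32_t)m->start_offset + m->len;
}
```

The conversion is a single subtraction in the driver, so an end_offset-based device can still fill a "start and len" record without the application having to check a configuration flag on the fast path.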


>
> > > +                       };
> > > +               };
> > > +       };
> > > +};
> >
> > > +/**
> > > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > > + * for enqueue and dequeue operation.
> > > + */
> > > +struct rte_regex_ops {
> > > +       /* W0 */
> > > +       uint16_t req_flags;
> > > +       /**< Request flags for the RegEx ops.
> > > +        * @see RTE_REGEX_OPS_REQ_*
> > > +        */
> > > +       uint16_t rsp_flags;
> > > +       /**< Response flags for the RegEx ops.
> > > +        * @see RTE_REGEX_OPS_RSP_*
> > > +        */
> > > +       uint16_t nb_actual_matches;
> > > +       /**< The total number of actual matches detected by the Regex
> > device.*/
> > > +       uint16_t nb_matches;
> > > +       /**< The total number of matches returned by the RegEx device for this
> > > +        * scan. The size of *rte_regex_ops::matches* zero length array will be
> > > +        * this value.
> > > +        *
> > > +        * @see struct rte_regex_ops::matches, struct rte_regex_match
> > > +        */
> > > +
> > > +       /* W1 */
> > > +       uint16_t num_of_bufs;
> > > +       /**< The number of bufs that are part of this ops. The total size of
> > > +        * the length of all the buffer must be smaller then the max buffer
> > > +        * len.
> > > +        */
> > > +       uint16_t resv1;
> > > +       uint32_t resv2;
> >
> > One of the point came up in our implementation is that.
> > HW can return an error due to various reasons.
> >
> > One option could be to make nb_matches as zero? and update some flag?
> >
> > What are your thoughts? updating the flag may be overkill.
> >
>
> I think we can return just zero matches for now.

Ack.

>
> >
> > > +
> > > +       /* W2 */
> > > +       struct rte_regex_iov *(*bufs)[];
> > > +       /**< Holds a pointer to the buffers list.*/
> >
> > This memory gets submitted to HW so it can not be from the heap.
> > Cryptodev had a similar dilemma to use the container format for
> > multi-segment case, Finally they choose to with mbuf.
> >
> > The following elements are in mbuf. Considering to avoid duplication and
> > avoid overhead most common usecase DPI(Assume if it is rte_regex_iov,
> > one need to copy all the elements from mbuf on fastpath).
> > I propose to have mbuf here instead of rte_regex_iov.
> >
>
> The application only needs to set the data pointers. (no copy is required. )
> I agree that there are advantages to the mbuf approach.
> The main limitation for the mbufs approach is that the user will need to play with the offset
> pointers and pointers to the next mbuf, in order to support cross buffer.
> For example we have a packet and we want to add to the scan also the last part of the previous packet,
> this means that the application must modify the data offset in the previous packet mbuf including
> changing the next pointer to point to the head of the new packet, and then return the values to the original position.
>
> What do you think?
> We can start with mbufs and see how it works, or start with the buffer and see how it works.

I think we can start with mbuf to align with other subsystems. We
will see the use case for struct rte_regex_iov later.


>
>
>
>
> > struct rte_regex_iov {
> > RTE_STD_C11
> > union {
> > uint64_t u64;
> > /**<  Allow 8-byte reserved on 32-bit system */
> > void *buf_addr;
> > /**< Virtual address of the pattern to be matched. */
> > };
> > rte_iova_t buf_iova;
> > /**< IOVA address of the pattern to be matched. */
> > uint16_t buf_size; /**< The buf size. */
> > };
> >
> >
> > > +
> > > +       /* W5 */
> > > +       RTE_STD_C11
> > > +       union {
> > > +               uint64_t cross_buf_id;
> > > +               /**< ID used by the RegEx device in order to handle cross
> > > +                * buffer detection.
> > > +                * This ID is given by the RegEx device on dequeue, and
> > > +                * the application must send it on the following enque.
> > > +                */
> > > +               void *cross_buf_ptr;
> > > +               /**< Pointer representation of *cross_buf_id* */
> >
> > Could you have some example of how to use cross_buf_id?
> > Marvell HW does not support cross_buf_id, so we need to add this
> > feature as capability.
> >
>
> The idea is that this buffer will be used to keep some internal data for the engine.
> For example the current state and what was found until now, and then reuse this
> for the next buffer.
> We can remove it for now if we agree that we can add it later.

I think, adding it later would be better so that we can see how to
abstract it well.

>
>
>
> One more thing, regarding the ops structure, I think it is better to split it to 2 different
> structures one enque and one for dequeue, since there are no real shared data and we will
> be able to save memory, what do you think?

Ops are allocated from a mempool, so it would be an overhead to manage both.
Moreover, some
of the fields added in the request can be used in the response as info. cryptodev
follows a similar concept.
I think we can have symmetry with cryptodev wherever possible, so that
end users do not have to learn new API models.
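
A simplified picture of the single-object model, as a sketch only: `struct ops` below is a cut-down stand-in for rte_regex_ops, `user_id` is a hypothetical request field that is echoed back, and calloc() stands in for a mempool whose element size is fixed at creation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct match {
	uint64_t u64;
};

/* Simplified ops: request and response fields share one object. */
struct ops {
	uint16_t req_flags;         /* request: set by the application */
	uint16_t rsp_flags;         /* response: set by the driver */
	uint16_t nb_actual_matches; /* response: matches the device detected */
	uint16_t nb_matches;        /* response: valid entries in matches[] */
	uint64_t user_id;           /* hypothetical request field, echoed back */
	struct match matches[];     /* response: sized when the pool is created */
};

/* Element size a pool of ops would need for a given match limit. */
static size_t
ops_elt_size(uint16_t max_matches)
{
	return sizeof(struct ops) + (size_t)max_matches * sizeof(struct match);
}

/* Stand-in for taking one zeroed element from such a pool. */
static struct ops *
ops_alloc(uint16_t max_matches)
{
	return calloc(1, ops_elt_size(max_matches));
}

/* What a driver would fill in at dequeue, in the same object. */
static void
ops_complete(struct ops *op, uint16_t found, uint16_t max_matches)
{
	assert(op != NULL);
	op->nb_actual_matches = found;
	op->nb_matches = found < max_matches ? found : max_matches;
}
```

One pool and one object per request means the application never has to pair a request object with a separate response object. Request fields such as the hypothetical `user_id` are still there when the result comes back.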



>
> >
> >
> > Regarding the rule attributes, We think, following needs to be added to control
> > the rule compilation behavior. If it not converging in first look, I
> > think, we can make
> > below as separate patch once we have basic things.
> >
>
> Regarding the new code, we need also to add a function to get the capabilities for the compiler or
> add a new field in the dev_info which will report the complier supported features.

I agree. Let's remove this new code from the first version. We can add
it later, with a capability, as
a new patch.

I assume you will send the v4 with these comments. I think with v4 we
can start implementing the common library code.


* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-23  9:53       ` Jerin Jacob
@ 2020-02-23 12:33         ` Ori Kam
  2020-02-25  5:57           ` Jerin Jacob
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-23 12:33 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Jerin,

Best,
Ori
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Sunday, February 23, 2020 11:54 AM
> To: Ori Kam <orika@mellanox.com>
> Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> Shuler <shahafs@mellanox.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> On Sun, Feb 23, 2020 at 2:12 PM Ori Kam <orika@mellanox.com> wrote:
> >
> > Hi Jerin,
> >
> > Thanks, for the review.
> > PSB
> 
> 
> Hi Ori
> 
> Since we are finalizing the specification part, I thought of
> enumerating the list of work needs to be
> completed for a new subsystem in DPDK.
> 
> 0) Finalize the first version of the spec. Hope v4 will do that.
> 1) Introduce common library code for based on the  specification
> 2) One HW based driver implementation
> 3) One SW reference driver: libpcre library provides complete PCRE
> functionality.
> 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> 5) Need a maintainer for maintaining the regex subsystem
> 6) The first version programming guide documentation
> 7) Add app/test-regexdev like app/test-eventdev
> 8) Add an examples/xxxxxx program
> 

> IMO The following items need to be completed to accept a subsystem in
> dpdk(Need at least on HW and SW driver).
> 
> 0) Finalize the first version of the spec. Hope v4 will do that.

I hope so too 😊

> 1) Introduce common library code for based on the  specification

I'm working on it. As soon as we agree on the API (i.e. this RFC gets acked) I can work on this code.
I will send the entire code for ack once we decide whether it will be part of
20.05 or 20.08.

> 2) One HW based driver implementation

Just like you, our driver will be ready by 20.05 or 20.08

> 3) One SW reference driver: libpcre library provides complete PCRE
> functionality.

O.K. We are not working on this part.

> 4) app/test/test_regexdev.c like app/test/test_eventdev.c

We started to create a super basic app; after the API is finalized and we have HW,
we can push it. (If you need it sooner, feel free to go ahead.)

> 5) Need a maintainer for maintaining the regex subsystem
> 
We wish to maintain it if you agree.

> We have item (3) so Marvell would like to work on item (3). Our HW
> driver may ready by v20.05 or the worst case by 20.08.
> Let us know what other items Mellanox or community would like to work
> on. This is to avoid duplication of work
> to get clarity on the next steps.
> 

See my comments above.
From Mellanox's side the best date is 20.08, but we are trying to make it to 20.05,
depending on HW.


> PSB
> 
> 
> >
> >
> > Ori Kam
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > Sent: Saturday, February 22, 2020 6:52 PM
> > > To: Ori Kam <orika@mellanox.com>
> > > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> > > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> > > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Richardson,
> > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> shanjiangh@chinatelecom.cn;
> > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com;
> > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > <thomas@monjalon.net>
> > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > >
> > > > diff --git a/lib/librte_regexdev/rte_regexdev.h
> > > b/lib/librte_regexdev/rte_regexdev.h
> > > > new file mode 100644
> > > > index 0000000..c42128b
> > > > --- /dev/null
> > > > +++ b/lib/librte_regexdev/rte_regexdev.h
> > > > @@ -0,0 +1,1411 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright(C) 2019 Marvell International Ltd.
> > > > + * Copyright(C) 2020 Mellanox International Ltd.
> > >
> > > There are a few comments from Xiang as well. So let's add Intel also
> > > to the list.
> > >
> >
> > Sure no problem.
> 
> Thanks
> 
> >
> > > > + */
> > > > +
> > > > +#ifndef _RTE_REGEXDEV_H_
> > > > +#define _RTE_REGEXDEV_H_
> > >
> > > > +
> > > > +/**
> > > > + * RegEx device information
> > > > + */
> > > > +struct rte_regex_dev_info {
> > > > +       const char *driver_name; /**< RegEx driver name. */
> > > > +       struct rte_device *dev; /**< Device information. */
> > > > +       uint16_t max_matches;
> > > > +       /**< Maximum matches per scan supported by this device. */
> > > > +       uint16_t max_queue_pairs;
> > > > +       /**< Maximum queue pairs supported by this device. */
> > > > +       uint16_t max_payload_size;
> > > > +       /**< Maximum payload size for a pattern match request or scan.
> > > > +        * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > > +        */
> > > > +       uint32_t max_rules_per_group;
> > > > +       /**< Maximum rules supported per group by this device.
> > > > +        * This number can't be larger then 20 bits.
> > >
> > > s/then/than
> > >
> > > I think, we don't need to say this " This number can't be larger than 20 bits."
> > > It may help SW drivers.
> > >
> >
> > Agree I will remove the 20 bits part.
> >
> > >
> > >
> > > > +        */
> > > > +       uint16_t max_groups;
> > > > +       /**< Maximum group supported by this device.
> > > > +        * This number can't be larger then 12 bits.
> > > s/then/than
> > > I think, we don't need to say this " This number can't be larger than 12 bits."
> > > It may help SW drivers.
> > >
> >
> > Agree will remove the 12 bits part.
> >
> > > > +        */
> > > > +       uint32_t regex_dev_capa;
> > > > +       /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > > > +       uint64_t rule_flags;
> > > > +       /**< Supported compiler rule flags.
> > > > +        * @see RTE_REGEX_PCRE_RULE_*, struct
> rte_regex_rule::rule_flags
> > > > +        */
> > > > +       uint8_t max_scatter_gather;
> > > > +       /**< The max supported number of buffers that can
> > > > +        * be used in a single ops. The total size of all elements
> > > > +        * must be less then max_payload_size.
> > > > +        */
> > > > +};
> > > <snip>
> > >
> > > > +int
> > > > +rte_regex_rule_db_compile(uint8_t dev_id);
> > > > +
> > >
> > > I think your "rte_regex_rule_db_compile_activate() - compile and
> > > activate the new rule set"
> > > API name looks good. I am for rte_regex_rule_db_compile_activate().
> > >
> >
> > I like your name, will change to compile_activate.
> 
> Ack.
> 
> >
> > >
> > > > +/* Fast path APIs */
> > > > +
> > > > +/**
> > > > + * The generic *rte_regex_match* structure to hold the RegEx match attributes.
> > > > + * @see struct rte_regex_ops::matches
> > > > + */
> > > > +struct rte_regex_match {
> > > > +       RTE_STD_C11
> > > > +       union {
> > > > +               uint64_t u64;
> > > > +               struct {
> > > > +                       uint32_t rule_id:20;
> > > > +                       /**< Rule identifier to which the pattern matched.
> > > > +                        * @see struct rte_regex_rule::rule_id
> > > > +                        */
> > > > +                       uint32_t group_id:12;
> > > > +                       /**< Group identifier of the rule which the pattern
> > > > +                        * matched. @see struct rte_regex_rule::group_id
> > > > +                        */
> > > > +                       uint16_t offset;
> > >
> > > Since we have end_offset now, IMO, it is better to change this offset
> > > to "start_offset".
> > >
> >
> > Agree, will change.
> >
> > >
> > > > +                       /**< Starting Byte Position for matched rule. */
> > > > +                       RTE_STD_C11
> > > > +                       union {
> > > > +                               uint16_t len;
> > > > +                               /**< Length of match in bytes */
> > > > +                               uint16_t end_offset;
> > > > +                               /**< The end offset of the match. In case
> > > > +                                * MATCH_AS_START configuration is disabled.
> > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > +                                */
> > >
> > > We have not concluded on this scheme. Having one field with
> > > different meanings will be difficult for the application,
> > > i.e. in the fast path we need to have a check for this.
> > >
> >
> > This is the time to conclude, at least for the first version.
> > Why do we have one field with different meanings?
> > The result can be either len or end_offset.
> >
> > > I think, based on the majority of HW/SW implementations, we need to
> > > go with either len or
> > > end_offset. What does Mellanox HW return: len or end_offset?
> > >
> >
> > From Mellanox perspective we prefer the len approach. We also think
> > it is much more user oriented.
> >
> > > Or we can keep it as len or end_offset based on which driver is upstreamed
> > > first; when other drivers come, we can see how to abstract it?
> > >
> >
> > I can accept that, assuming we choose the start and len approach.
> 
> I think, we can have first version with "start and len" by removing
> RTE_REGEX_DEV_CFG_MATCH_AS_START.
> We can think about how to abstract new drivers when they are upstreamed,
> based on the overhead.
> 

Perfect
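
For reference, a minimal sketch of how the match record could look once the points agreed above are applied (offset renamed to start_offset, only len kept). This is an assumption for illustration, not the v4 patch: the struct and helper names are hypothetical, the 8-byte size check relies on typical GCC/Clang bit-field packing, and the end offset is assumed to be exclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: v3 struct with "offset" renamed and the
 * len/end_offset union collapsed to "len" only. */
struct regex_match_sketch {
	union {
		uint64_t u64;
		struct {
			uint32_t rule_id:20;   /* rule that matched */
			uint32_t group_id:12;  /* group of that rule */
			uint16_t start_offset; /* first byte of the match */
			uint16_t len;          /* match length in bytes */
		};
	};
};

/* The whole record is meant to fit in one 64-bit word. */
static_assert(sizeof(struct regex_match_sketch) == sizeof(uint64_t),
	      "match record should be 8 bytes");

/* An application that wants an end offset can derive it from start + len.
 * Assumption: the end offset is exclusive (one past the last byte). */
static inline uint32_t
match_end_offset(const struct regex_match_sketch *m)
{
	return (uint32_t)m->start_offset + m->len;
}
```

With this layout no per-match flag check is needed on the fast path, which was the concern raised against the union.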

> 
> >
> > > > +                       };
> > > > +               };
> > > > +       };
> > > > +};
> > >
> > > > +/**
> > > > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > > > + * for enqueue and dequeue operation.
> > > > + */
> > > > +struct rte_regex_ops {
> > > > +       /* W0 */
> > > > +       uint16_t req_flags;
> > > > +       /**< Request flags for the RegEx ops.
> > > > +        * @see RTE_REGEX_OPS_REQ_*
> > > > +        */
> > > > +       uint16_t rsp_flags;
> > > > +       /**< Response flags for the RegEx ops.
> > > > +        * @see RTE_REGEX_OPS_RSP_*
> > > > +        */
> > > > +       uint16_t nb_actual_matches;
> > > > +       /**< The total number of actual matches detected by the Regex device.*/
> > > > +       uint16_t nb_matches;
> > > > +       /**< The total number of matches returned by the RegEx device for this
> > > > +        * scan. The size of *rte_regex_ops::matches* zero length array will be
> > > > +        * this value.
> > > > +        *
> > > > +        * @see struct rte_regex_ops::matches, struct rte_regex_match
> > > > +        */
> > > > +
> > > > +       /* W1 */
> > > > +       uint16_t num_of_bufs;
> > > > +       /**< The number of bufs that are part of this ops. The total size of
> > > > +        * the length of all the buffer must be smaller then the max buffer
> > > > +        * len.
> > > > +        */
> > > > +       uint16_t resv1;
> > > > +       uint32_t resv2;
> > >
> > > One point that came up in our implementation is that
> > > HW can return an error for various reasons.
> > >
> > > One option could be to make nb_matches as zero? and update some flag?
> > >
> > > What are your thoughts? updating the flag may be overkill.
> > >
> >
> > I think we can return just zero matches for now.
> 
> Ack.
> 
> >
> > >
> > > > +
> > > > +       /* W2 */
> > > > +       struct rte_regex_iov *(*bufs)[];
> > > > +       /**< Holds a pointer to the buffers list.*/
> > >
> > > This memory gets submitted to HW so it can not be from the heap.
> > > Cryptodev had a similar dilemma over using a container format for the
> > > multi-segment case. Finally they chose to go with mbuf.
> > >
> > > The following elements are in mbuf. To avoid duplication and
> > > overhead in the most common use case, DPI (if it is rte_regex_iov,
> > > one needs to copy all the elements from the mbuf on the fast path),
> > > I propose to have mbuf here instead of rte_regex_iov.
> > >
> >
> > The application only needs to set the data pointers (no copy is required).
> > I agree that there are advantages to the mbuf approach.
> > The main limitation of the mbuf approach is that the user will need to play
> > with the offset pointers and the pointers to the next mbuf in order to
> > support cross buffer. For example, we have a packet and we also want to add
> > the last part of the previous packet to the scan. This means that the
> > application must modify the data offset in the previous packet's mbuf,
> > change its next pointer to point to the head of the new packet, and then
> > return the values to their original position.
> >
> > What do you think?
> > We can start with mbufs and see how it works, or start with the buffer and
> > see how it works.
> 
> I think, we can start with mbuf to align with other subsystems. We
> will see later the use case for struct rte_regex_iov.
>

Agree.
 
> 
> >
> >
> >
> >
> > > struct rte_regex_iov {
> > > RTE_STD_C11
> > > union {
> > > uint64_t u64;
> > > /**<  Allow 8-byte reserved on 32-bit system */
> > > void *buf_addr;
> > > /**< Virtual address of the pattern to be matched. */
> > > };
> > > rte_iova_t buf_iova;
> > > /**< IOVA address of the pattern to be matched. */
> > > uint16_t buf_size; /**< The buf size. */
> > > };
> > >
> > >
> > > > +
> > > > +       /* W5 */
> > > > +       RTE_STD_C11
> > > > +       union {
> > > > +               uint64_t cross_buf_id;
> > > > +               /**< ID used by the RegEx device in order to handle cross
> > > > +                * buffer detection.
> > > > +                * This ID is given by the RegEx device on dequeue, and
> > > > +                * the application must send it on the following enque.
> > > > +                */
> > > > +               void *cross_buf_ptr;
> > > > +               /**< Pointer representation of *cross_buf_id* */
> > >
> > > Could you have some example of how to use cross_buf_id?
> > > Marvell HW does not support cross_buf_id, so we need to add this
> > > feature as capability.
> > >
> >
> > The idea is that this buffer will be used to keep some internal data for the
> > engine, for example the current state and what was found until now, and then
> > reuse this for the next buffer.
> > We can remove it for now if we agree that we can add it later.
> 
> I think, adding it later would be better so that we can see how to
> abstract it well.
> 

Agree.

> >
> >
> >
> > One more thing, regarding the ops structure: I think it is better to split it
> > into 2 different structures, one for enqueue and one for dequeue, since there
> > is no real shared data and we will be able to save memory. What do you think?
>
> Ops are allocated from a mempool, so it will be overhead to manage both.
> Moreover, some of the fields added in the req can be used for the resp as
> info. cryptodev follows a similar concept. I think we can have symmetry
> with cryptodev wherever possible, to avoid the end-user having to learn
> new API models.

True, there will be overhead with 2 mempools (a small one),
but let's assume 255 results. This means that the buffer should be 255 * sizeof(rte_regex_match) = 2K.
Also, this will enable us to replace groupX with group[], which will allow even more groups.
In addition, I don't think that crypto is a good example.
The main difference is that in RegEx the output is in a different format than the input.
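
To make the 2K figure concrete: assuming each match record is one 64-bit word (as the u64 member of the v3 union suggests), 255 results take 255 * 8 = 2040 bytes, i.e. roughly 2K per op. A small sketch of that arithmetic, with hypothetical names standing in for the real structure:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_regex_match; assumed to be a single 64-bit word. */
struct match_word_sketch {
	uint64_t u64;
};

/* Hypothetical upper bound on results per op, taken from the example. */
#define SKETCH_MAX_MATCHES 255

/* Bytes needed in an op to hold nb_matches match records. */
static inline size_t
match_area_bytes(size_t nb_matches)
{
	return nb_matches * sizeof(struct match_word_sketch);
}
```

Every op in a single shared mempool carries this area whether it is used as a request or as a response, which is the memory cost the split is meant to avoid.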


> 
> 
> 
> >
> > >
> > >
> > > Regarding the rule attributes, we think the following needs to be added to
> > > control the rule compilation behavior. If it is not converging at first
> > > look, I think we can make the below a separate patch once we have the
> > > basic things.
> > >
> >
> > Regarding the new code, we also need to add a function to get the
> > capabilities of the compiler, or add a new field in the dev_info which will
> > report the compiler's supported features.
> 
> I agree. Lets remove this new code from the first version. We can add
> it later with capability as
> a new patch.
> 

Agree

> I assume you will send the v4 with these comments. I think, with v4 we
> can start implementing common library code.

Just need to agree on the split (one more iteration 😊)
and I will start working on the common code.



* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-23 12:33         ` Ori Kam
@ 2020-02-25  5:57           ` Jerin Jacob
  2020-02-25  7:48             ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob @ 2020-02-25  5:57 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

> > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
>
> We started to create a super basic app; after the API is finalized and we have HW
> we can push it. (If you need it faster, then feel free.)

A simple unit test case needs to be present for the APIs. In the
course of developing the common code,
it can be developed to test the common code with a dummy/skeleton driver.

>
> > 5) Need a maintainer for maintaining the regex subsystem
> >
> We wish to maintain it if you agree.

Yes. Please.

> > >
> > > One more thing, regarding the ops structure: I think it is better to split it
> > > into 2 different structures, one for enqueue and one for dequeue, since there
> > > is no real shared data and we will be able to save memory. What do you think?
> >
> > Ops are allocated from a mempool, so it will be overhead to manage both.
> > Moreover, some of the fields added in the req can be used for the resp as
> > info. cryptodev follows a similar concept. I think we can have symmetry
> > with cryptodev wherever possible, to avoid the end-user having to learn
> > new API models.
>
> True, there will be overhead with 2 mempools (a small one),
> but let's assume 255 results. This means that the buffer should be 255 * sizeof(rte_regex_match) = 2K.
> Also, this will enable us to replace groupX with group[], which will allow even more groups.
> In addition, I don't think that crypto is a good example.
> The main difference is that in RegEx the output is in a different format than the input.

# IMO, some of the fields may be useful for a response as well. I
think the application may be interested in the following
req fields in the response:
a) buf_addr
b) scan_size
c) user_id (This would be the main one)

# Having two mempools adds overhead in per-lcore L1 cache usage and extra
complexity to the application.

# IMO, from a performance perspective, one mempool is good due to less
stress on the cache, and it is costly to
add a new mempool for HW mempool implementations.

# I think we can add the group[] use case when it is required by
introducing a "matches_start_offset" field, which will
tell the req where the end of group[] is and where "matches" starts,
with the single mempool scheme as well.

# I think one other use case for "matches_start_offset" is that
it may be possible to put vendor-specific
opaque data there. It will be filled by the driver on response. The application
can reference the matches as

struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
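
A self-contained sketch of the same idea, without DPDK headers. The op layout and names below are hypothetical; only the "matches_start_offset" field and the pointer arithmetic (RTE_PTR_ADD is a byte-offset add on the pointer) come from the proposal above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_regex_match (assumed 8 bytes). */
struct match_word_sketch {
	uint64_t u64;
};

/* Hypothetical op header. The variable area after it would hold
 * group[] and/or vendor-specific opaque data, followed by the matches. */
struct ops_sketch {
	uint16_t nb_matches;
	uint16_t matches_start_offset; /* bytes from start of op to matches */
	uint32_t resv;
};

/* Equivalent of RTE_PTR_ADD(ops, ops->matches_start_offset). */
static inline struct match_word_sketch *
ops_matches(struct ops_sketch *ops)
{
	return (struct match_word_sketch *)
		((uintptr_t)ops + ops->matches_start_offset);
}
```

The driver would set matches_start_offset when it fills the response, so the application never needs to know how large the group[] or vendor area is.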

>
> > I assume you will send the v4 with these comments. I think, with v4 we
> > can start implementing common library code.
>
> Just need to agree on the split (one more iteration )
> and I will start working on the common code.

Ack.


* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-25  5:57           ` Jerin Jacob
@ 2020-02-25  7:48             ` Ori Kam
  2020-02-26  9:03               ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-25  7:48 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon



> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Tuesday, February 25, 2020 7:57 AM
> To: Ori Kam <orika@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> >
> > We started to create a super basic app; after the API is finalized and we have HW
> > we can push it. (If you need it faster, then feel free.)
> 
> A simple Unit test case needs to be present for the APIs. On the
> course of developing common code,
> it can be developed to test the common code with dummy/skeleton driver.
> 

Agreed, this is what we currently have.

> >
> > > 5) Need a maintainer for maintaining the regex subsystem
> > >
> > We wish to maintain it if you agree.
> 
> Yes. Please.
> 

Great.

> > > >
> > > > One more thing, regarding the ops structure: I think it is better to split it
> > > > into 2 different structures, one for enqueue and one for dequeue, since there
> > > > is no real shared data and we will be able to save memory. What do you think?
> > >
> > > Ops are allocated from a mempool, so it will be overhead to manage both.
> > > Moreover, some of the fields added in the req can be used for the resp as
> > > info. cryptodev follows a similar concept. I think we can have symmetry
> > > with cryptodev wherever possible, to avoid the end-user having to learn
> > > new API models.
> >
> > True, there will be overhead with 2 mempools (a small one),
> > but let's assume 255 results. This means that the buffer should be 255 * sizeof(rte_regex_match) = 2K.
> > Also, this will enable us to replace groupX with group[], which will allow even more groups.
> > In addition, I don't think that crypto is a good example.
> > The main difference is that in RegEx the output is in a different format than the input.
> 
> # IMO, Some of the fields may be useful for a response as well. I
> think application may be interested in following
> req filed in the response.
> a) buf_addr

I don't see how this can be used in the response, since when working with out-of-order results
you don't know which result will be returned.
I also think it is error prone to use the same op for the enqueue and dequeue.

> b) scan_size

Please see above.

> c) user_id (This would be main one)

Agree

> 
> # Having two mempools adds overhead per lcore L1 cache usage and extra
> complexity to the application.
> 
> # IMO, From a performance perspective, one mempool is good due to less
> stress on the cache and it is costly to
> add new mempool for HW mempool implementations.
> 
> # I think, group[] use case we can add it when it required by
> introducing "matches_start_offset" field, which will
> tell the req, where is the end of group[] and where "matches" start
> with single mempool scheme also.
> 
> # I think, one of the other use case for "matches_start_offset" that,
> It may possible to put vendor-specific
> opaque data. It will be filled by driver on response. The application
> can reference the matches as
> 
> struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
> 

OK, for now we will keep it as is, and we will see what is needed in the future.

> >
> > > I assume you will send the v4 with these comments. I think, with v4 we
> > > can start implementing common library code.
> >
> > Just need to agree on the split (one more iteration )
> > and I will start working on the common code.
> 
> Ack.

Great,
I'm starting to work on V4 with all the comments so that the RFC can be acked, and then I will start
coding the rest of the common code.



* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-26  9:03               ` Wang Xiang
@ 2020-02-26  8:36                 ` Ori Kam
  2020-02-27  9:25                   ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-26  8:36 UTC (permalink / raw)
  To: Wang Xiang
  Cc: Jerin Jacob, Jerin Jacob, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Xiang,


> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> Sent: Wednesday, February 26, 2020 11:03 AM
> To: Ori Kam <orika@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> Hi Ori and Jerin,
> 
> One comment regarding my concern with the len and end_offset problem.
> From the perspective of open source SW regex libraries (libpcre, re2 and
> Hyperscan) and Intel, the matching results returned are always start
> offset and end offset. More importantly, Hyperscan only reports the end offset
> most of the time.
>
> It would be good to keep this union as an abstraction and enforce the default
> behavior for each solution, i.e. HW solutions don't support the MATCH_AS_START
> flag at rule compile time. Applications will know the meaning of the variable at
> rule compile time from the flag, so they don't have to do an extra check on the
> fast path during run-time matching.
> Better abstraction ideas are welcome.
> 

I don't mind keeping the union as it was in V3, but I would like to remove the
configuration bit (RTE_REGEX_DEV_CFG_MATCH_AS_START).
This means that if the device reports RTE_REGEX_DEV_SUPP_MATCH_AS_START,
the result will always use start_offset and len.
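
A rough sketch of what this would mean for an application: the meaning of the union field is decided once from the device capabilities at setup time, not per match. The bit value and all names below are hypothetical; the thread only names the capability. The assumption that a device not reporting it fills end_offset follows the V3 union and is not confirmed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position for RTE_REGEX_DEV_SUPP_MATCH_AS_START;
 * the real value is not given in this discussion. */
#define SKETCH_SUPP_MATCH_AS_START (UINT32_C(1) << 0)

/* How the second 16-bit field of a match should be read. */
enum match_tail_sketch {
	MATCH_TAIL_LEN,        /* start_offset + len */
	MATCH_TAIL_END_OFFSET, /* end_offset (assumed fallback) */
};

/* Called once after reading regex_dev_capa from the device info. */
static inline enum match_tail_sketch
match_tail_meaning(uint32_t regex_dev_capa)
{
	return (regex_dev_capa & SKETCH_SUPP_MATCH_AS_START) ?
		MATCH_TAIL_LEN : MATCH_TAIL_END_OFFSET;
}
```

Because the capability is a fixed property of the device, the result can be cached alongside the device handle and the fast path stays free of configuration checks.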

Best,
Ori

> Thanks,
> Xiang
> 
> > > > > +                       /**< Starting Byte Position for matched rule. */
> > > > > +                       RTE_STD_C11
> > > > > +                       union {
> > > > > +                               uint16_t len;
> > > > > +                               /**< Length of match in bytes */
> > > > > +                               uint16_t end_offset;
> > > > > +                               /**< The end offset of the match. In case
> > > > > +                                * MATCH_AS_START configuration is disabled.
> > > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > > +                                */
> > > >
> > > > We have not concluded on this scheme. Have one field which has
> > > > different meaning will be difficult
> > > > for application. i.e fast path we need to have a check for this.
> > > >
> > >
> > > This is the time to conclude, at least for the first version.
> > > Why do we have one field with different meanings?
> > > The result can be either len or end_offset.
> > >
> > > > I think, Based on the majority of HW/SW implementation, we need to
> > > > either go with len or
> > > > end_offset. What Mellanox HW returns? len or end_offset?
> > > >
> > >
> > > From Mellanox perspective we prefer the len approach. We also think
> > > it is much more user oriented.
> > >
> > > > or We can keep it as len or end_offset based on which drivers upstream
> > first,
> > > > other drivers when it comes, we can see how to abstract it?
> > > >
> > >
> > > I can accept that, assuming we choose the start and len approach.
> >
> > I think, we can have first version with "start and len" by removing
> > RTE_REGEX_DEV_CFG_MATCH_AS_START.
> > We can think about how to abstract new drivers when they are upstreamed,
> > based on the overhead.
> >
> 
> 


* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-25  7:48             ` Ori Kam
@ 2020-02-26  9:03               ` Wang Xiang
  2020-02-26  8:36                 ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-02-26  9:03 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, Jerin Jacob, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori and Jerin,

One comment regarding my concern with the len and end_offset problem.
From the perspective of open source SW regex libraries (libpcre, re2 and
Hyperscan) and Intel, the matching results returned are always start
offset and end offset. More importantly, Hyperscan only reports the end offset
most of the time.

It would be good to keep this union as an abstraction and enforce the default
behavior for each solution, i.e. HW solutions don't support the MATCH_AS_START
flag at rule compile time. Applications will know the meaning of the variable at
rule compile time from the flag, so they don't have to do an extra check on the
fast path during run-time matching.
Better abstraction ideas are welcome.

Thanks,
Xiang

> > > > +                       /**< Starting Byte Position for matched rule. */
> > > > +                       RTE_STD_C11
> > > > +                       union {
> > > > +                               uint16_t len;
> > > > +                               /**< Length of match in bytes */
> > > > +                               uint16_t end_offset;
> > > > +                               /**< The end offset of the match. In case
> > > > +                                * MATCH_AS_START configuration is disabled.
> > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > +                                */
> > >
> > > We have not concluded on this scheme. Have one field which has
> > > different meaning will be difficult
> > > for application. i.e fast path we need to have a check for this.
> > >
> >
> > This is the time to conclude, at least for the first version.
> > Why do we have one field with different meanings?
> > The result can be either len or end_offset.
> >
> > > I think, Based on the majority of HW/SW implementation, we need to
> > > either go with len or
> > > end_offset. What Mellanox HW returns? len or end_offset?
> > >
> >
> > From Mellanox perspective we prefer the len approach. We also think
> > it is much more user oriented.
> >
> > > or We can keep it as len or end_offset based on which drivers upstream
> first,
> > > other drivers when it comes, we can see how to abstract it?
> > >
> >
> > I can accept that, assuming we choose the start and len approach.
>
> I think, we can have first version with "start and len" by removing
> RTE_REGEX_DEV_CFG_MATCH_AS_START.
> We can think about how to abstract new drivers when they are upstreamed,
> based on the overhead.
>


On Tue, Feb 25, 2020 at 07:48:54AM +0000, Ori Kam wrote:
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk@gmail.com>
> > Sent: Tuesday, February 25, 2020 7:57 AM
> > To: Ori Kam <orika@mellanox.com>
> > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > 
> > > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> > >
> > > We started to create a super basic app; after the API is finalized and we have HW
> > > we can push it. (If you need it faster, then feel free.)
> > 
> > A simple Unit test case needs to be present for the APIs. On the
> > course of developing common code,
> > it can be developed to test the common code with dummy/skeleton driver.
> > 
> 
> Agreed, this is what we currently have.
> 
> > >
> > > > 5) Need a maintainer for maintaining the regex subsystem
> > > >
> > > We wish to maintain it if you agree.
> > 
> > Yes. Please.
> > 
> 
> Great.
> 
> > > > >
> > > > > One more thing, regarding the ops structure, I think it is better to split it
> > to 2
> > > > different
> > > > > structures one enque and one for dequeue, since there are no real shared
> > > > data and we will
> > > > > be able to save memory, what do you think?
> > > >
> > > > Ops are allocated from mempool so it will be overhead to manage both.
> > > > moreover, some
> > > > of the fields added in req can be used for resp as info. cryptodev
> > > > follows the similar concept,
> > > > I think, we can have symmetry with cryptodev wherever is possible to avoid
> > > > end-user to learn new API models.
> > >
> > > True that there will be overhead with 2 mempools (small one)
> > > but lets assume 255 results. This means that the buffer should be 255 *
> > sizeof(rte_regex_match) = 2K
> > > also this will enable us to replace groupX with group[] which will allow even
> > more groups.
> > > In addition don't think that crypto is a good example.
> > > The main difference is that in RegEx the output is different format then the
> > input.
> > 
> > # IMO, Some of the fields may be useful for a response as well. I
> > think application may be interested in following
> > req filed in the response.
> > a) buf_addr
> 
> I don't see how this can be used in the response, since when results come back out of order
> you don't know which result will be returned.
> I also think it is error prone to use the same op for the enqueue and dequeue.
> 
> > b) scan_size
> 
> Please see above.
> 
> > c) user_id (This would be main one)
> 
> Agree
> 
> > 
> > # Having two mempools adds overhead per lcore L1 cache usage and extra
> > complexity to the application.
> > 
> > # IMO, From a performance perspective, one mempool is good due to less
> > stress on the cache and it is costly to
> > add new mempool for HW mempool implementations.
> > 
> > # I think, group[] use case we can add it when it required by
> > introducing "matches_start_offset" field, which will
> > tell the req, where is the end of group[] and where "matches" start
> > with single mempool scheme also.
> > 
> > # I think, one of the other use case for "matches_start_offset" that,
> > It may possible to put vendor-specific
> > opaque data. It will be filled by driver on response. The application
> > can reference the matches as
> > 
> > struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
> > 
> 
> O.K for now we will keep  it as is, and we will see what will be in the future.
> 
> > >
> > > > I assume you will send the v4 with these comments. I think, with v4 we
> > > > can start implementing common library code.
> > >
> > > Just need to agree on the split (one more iteration )
> > > and I will start working on the common code.
> > 
> > Ack.
> 
> Great,
> I'm starting to work on V4 with all comments so the RFC will be acked and then will start 
> coding the rest of the common code.
> 
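
For reference, the single-mempool layout discussed above (request fields,
optional vendor data, then the match array located through a
"matches_start_offset" field) could look roughly like the sketch below.
Everything in it is an assumption made for illustration: the sketch_* names
and fields are hypothetical and are not the regexdev API, the match record is
simply assumed to be 8 bytes (which gives the 255 * 8 = 2040 bytes, i.e.
about 2K, mentioned earlier), and plain calloc() stands in for a mempool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 8-byte match record (not the real rte_regex_match). */
struct sketch_regex_match {
	uint32_t rule_id;
	uint16_t start_offset;
	uint16_t len;
};

static_assert(sizeof(struct sketch_regex_match) == 8,
	      "255 results then need 255 * 8 = 2040 bytes");

/* Hypothetical op: request fields first, results placed behind them. */
struct sketch_regex_op {
	uint64_t user_id;              /* request field echoed in the response */
	uint16_t nb_matches;           /* filled by the driver on response */
	uint16_t matches_start_offset; /* byte offset of the matches from op */
	/* optional group[] or vendor-specific opaque data would follow here */
};

/*
 * Allocate one op with room for opaque_size bytes of vendor data and
 * max_matches results. opaque_size is assumed to be a multiple of 4 so
 * that the match array stays naturally aligned.
 */
static struct sketch_regex_op *
sketch_op_alloc(uint16_t max_matches, uint16_t opaque_size)
{
	size_t hdr = sizeof(struct sketch_regex_op) + opaque_size;
	size_t total = hdr +
		(size_t)max_matches * sizeof(struct sketch_regex_match);
	struct sketch_regex_op *op;

	if (hdr > UINT16_MAX)
		return NULL;
	op = calloc(1, total);
	if (op != NULL)
		op->matches_start_offset = (uint16_t)hdr;
	return op;
}

/* Byte-wise pointer addition, the same idea as RTE_PTR_ADD(op, offset). */
static struct sketch_regex_match *
sketch_op_matches(struct sketch_regex_op *op)
{
	return (struct sketch_regex_match *)
		((uintptr_t)op + op->matches_start_offset);
}
```

With this layout the application frees or recycles a single object per
request, and the driver is free to grow the area in front of the matches
without changing how the application finds them.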

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-27  9:25                   ` Wang Xiang
@ 2020-02-27  7:31                     ` Ori Kam
  2020-02-27  9:16                       ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-27  7:31 UTC (permalink / raw)
  To: Wang Xiang
  Cc: Jerin Jacob, Jerin Jacob, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Xiang,

> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> Sent: Thursday, February 27, 2020 11:26 AM
> To: Ori Kam <orika@mellanox.com>
> Cc: Jerin Jacob <jerinjacobk@gmail.com>; Jerin Jacob <jerinj@marvell.com>;
> dpdk-dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> 
> Thanks for the comments.
> 
> Hyperscan supports both start_offset and end_offset modes with most
> users choosing end_offset for rule coverage and performance reasons.
> I'm OK to have the default behavior with start_offset and len.
> It'll be good to change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> RTE_REGEX_DEV_CFG_MATCH_AS_END. For users who need only end_offset,
> they have to set RTE_REGEX_DEV_CFG_MATCH_AS_END bit. We may also
> remove
> RTE_REGEX_DEV_SUPP_MATCH_AS_START if you like.

Since you say that Hyperscan can support both modes,
what about changing the capability field to RTE_REGEX_DEV_SUPP_MATCH_AS_END?

> 
> One question is related to the consistency of start_offset definition
> among different solutions, does all solutions return the leftmost
> start_offset, i.e. for rule: foo.*bar and input: foofoobar, the returned
> start_offset will be 0 not 3?
> 
Yes, you are correct.
In the Mellanox case there will be only one result: start_offset 0.
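
To make the leftmost start_offset semantics concrete, here is a small
software-only sketch. It uses the POSIX regex API (regcomp/regexec) purely as
a stand-in matcher; it is not regexdev, Hyperscan, or the Mellanox HW, and the
helper name leftmost_match is made up for this example. POSIX matching
reports the leftmost match, so for rule foo.*bar and input foofoobar the
helper returns start offset 0 and length 9.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Find the leftmost match of a POSIX extended regex in input.
 * Returns 0 and stores the start offset and length of the match,
 * or -1 if the pattern does not compile or nothing matches.
 */
static int
leftmost_match(const char *pattern, const char *input,
	       size_t *start_offset, size_t *len)
{
	regex_t re;
	regmatch_t m;
	int rc;

	assert(pattern != NULL && input != NULL);

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	/* One regmatch_t slot: m describes the whole match. */
	rc = regexec(&re, input, 1, &m, 0);
	regfree(&re);
	if (rc != 0)
		return -1;

	*start_offset = (size_t)m.rm_so;
	*len = (size_t)(m.rm_eo - m.rm_so);
	return 0;
}
```

Calling leftmost_match("foo.*bar", "foofoobar", &start, &len) gives the
"start and len" form of the result discussed in this thread.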

> Thanks,
> Xiang
> 
> On Wed, Feb 26, 2020 at 08:36:51AM +0000, Ori Kam wrote:
> > Hi Xiang,
> >
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> > > Sent: Wednesday, February 26, 2020 11:03 AM
> > > To: Ori Kam <orika@mellanox.com>
> > > Cc: Jerin Jacob <jerinjacobk@gmail.com>; Jerin Jacob
> <jerinj@marvell.com>;
> > > dpdk-dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> > > Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> Richardson,
> > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> shanjiangh@chinatelecom.cn;
> > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com;
> > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > <thomas@monjalon.net>
> > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > >
> > > Hi Ori and Jerin,
> > >
> > > One comment regarding my concern with len and end_offset problem.
> > > From open source SW regex library(libpcre, re2 and Hyperscan) and
> > > Intel's perspective, the matching results returned are always start
> > > offset and end offset. More importantly, Hyperscan only reports end offset
> > > most of the time.
> > >
> > > It'll be good to keep this union as an abstraction and enforce the default
> > > behavior for each solution, i.e. HW solutions doesn't support
> MATCH_AS_START
> > > flag at rule compile time. Applications will know the meaning of variable at
> > > rule compile time with the flag so they don't have to do extra check at fast
> path
> > > run-time matching.
> > > Welcome for better abstraction ideas.
> > >
> >
> > I don't mind to keep the union as it was in V3, but I would like to remove the
> > configuration bit (RTE_REGEX_DEV_CFG_MATCH_AS_START).
> > Meaning that if the device reports
> RTE_REGEX_DEV_SUPP_MATCH_AS_START
> > the result will always be with start_offset and len.
> >
> > Best,
> > Ori
> >
> > > Thanks,
> > > Xiang
> > >
> > > > > > > +                       /**< Starting Byte Position for matched rule. */
> > > > > > > +                       RTE_STD_C11
> > > > > > > +                       union {
> > > > > > > +                               uint16_t len;
> > > > > > > +                               /**< Length of match in bytes */
> > > > > > > +                               uint16_t end_offset;
> > > > > > > +                               /**< The end offset of the match. In case
> > > > > > > +                                * MATCH_AS_START configuration is disabled.
> > > > > > > +                                * @see
> RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > > > > +                                */
> > > > > >
> > > > > > We have not concluded on this scheme. Have one field which has
> > > > > > different meaning will be difficult
> > > > > > for application. i.e fast path we need to have a check for this.
> > > > > >
> > > > >
> > > > > This is the time to conclude . at least for the first version.
> > > > > Why do we have one field with different meaning?
> > > > > > The result can be either len or end_offset.
> > > > >
> > > > > > I think, Based on the majority of HW/SW implementation, we need to
> > > > > > either go with len or
> > > > > > end_offset. What Mellanox HW returns? len or end_offset?
> > > > > >
> > > > >
> > > > > From Mellanox perspective we prefer the len approach. We also think
> > > > > it is much more user oriented.
> > > > >
> > > > > > or We can keep it as len or end_offset based on which drivers
> upstream
> > > > first,
> > > > > > other drivers when it comes, we can see how to abstract it?
> > > > > >
> > > > >
> > > > > > I can accept that, assuming we choose the start and len approach.
> > > >
> > > > I think, we can have first version with "start and len" by removing
> > > > RTE_REGEX_DEV_CFG_MATCH_AS_START.
> > > > > We can think about how to abstract new drivers when they are upstreamed,
> > > > > based on the overhead.
> > > >
> > >
> > >
> > > On Tue, Feb 25, 2020 at 07:48:54AM +0000, Ori Kam wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > Sent: Tuesday, February 25, 2020 7:57 AM
> > > > > To: Ori Kam <orika@mellanox.com>
> > > > > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-
> dev
> > > > > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> Shahaf
> > > > > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> Alex
> > > > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun
> Kapoor
> > > > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> > > Richardson,
> > > > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > > shanjiangh@chinatelecom.cn;
> > > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > > wushuai@inspur.com;
> > > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > > <thomas@monjalon.net>
> > > > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev
> subsystem
> > > > >
> > > > > > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> > > > > >
> > > > > > We started to create a super basic app, after the API will be finalized
> and
> > > we
> > > > > will have HW
> > > > > > we can push it. (if you need it faster than feel free)
> > > > >
> > > > > A simple Unit test case needs to be present for the APIs. On the
> > > > > course of developing common code,
> > > > > it can be developed to test the common code with dummy/skeleton
> driver.
> > > > >
> > > >
> > > > Agree this is what we are currently have.
> > > >
> > > > > >
> > > > > > > 5) Need a maintainer for maintaining the regex subsystem
> > > > > > >
> > > > > > We wish to maintain it if you agree.
> > > > >
> > > > > Yes. Please.
> > > > >
> > > >
> > > > Great.
> > > >
> > > > > > > >
> > > > > > > > One more thing, regarding the ops structure, I think it is better to
> split
> > > it
> > > > > to 2
> > > > > > > different
> > > > > > > > structures one enque and one for dequeue, since there are no real
> > > shared
> > > > > > > data and we will
> > > > > > > > be able to save memory, what do you think?
> > > > > > >
> > > > > > > Ops are allocated from mempool so it will be overhead to manage
> both.
> > > > > > > moreover, some
> > > > > > > of the fields added in req can be used for resp as info. cryptodev
> > > > > > > follows the similar concept,
> > > > > > > I think, we can have symmetry with cryptodev wherever is possible
> to
> > > avoid
> > > > > > > end-user to learn new API models.
> > > > > >
> > > > > > True that there will be overhead with 2 mempools (small one)
> > > > > > but lets assume 255 results. This means that the buffer should be 255
> *
> > > > > sizeof(rte_regex_match) = 2K
> > > > > > also this will enable us to replace groupX with group[] which will allow
> > > even
> > > > > more groups.
> > > > > > In addition don't think that crypto is a good example.
> > > > > > The main difference is that in RegEx the output is different format
> then
> > > the
> > > > > input.
> > > > >
> > > > > # IMO, Some of the fields may be useful for a response as well. I
> > > > > think application may be interested in following
> > > > > req filed in the response.
> > > > > a) buf_addr
> > > >
> > > > I don't see how this can be used in the response. since if working in out of
> > > order result.
> > > > you don’t know which result will be returned.
> > > > I also think it is error prone to use the same op for the enqueue and
> dequeue.
> > > >
> > > > > b) scan_size
> > > >
> > > > Please see above.
> > > >
> > > > > c) user_id (This would be main one)
> > > >
> > > > Agree
> > > >
> > > > >
> > > > > # Having two mempools adds overhead per lcore L1 cache usage and
> extra
> > > > > complexity to the application.
> > > > >
> > > > > # IMO, From a performance perspective, one mempool is good due to
> less
> > > > > stress on the cache and it is costly to
> > > > > add new mempool for HW mempool implementations.
> > > > >
> > > > > # I think, group[] use case we can add it when it required by
> > > > > introducing "matches_start_offset" field, which will
> > > > > tell the req, where is the end of group[] and where "matches" start
> > > > > with single mempool scheme also.
> > > > >
> > > > > # I think, one of the other use case for "matches_start_offset" that,
> > > > > It may possible to put vendor-specific
> > > > > opaque data. It will be filled by driver on response. The application
> > > > > can reference the matches as
> > > > >
> > > > > struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
> > > > >
> > > >
> > > > O.K for now we will keep  it as is, and we will see what will be in the
> future.
> > > >
> > > > > >
> > > > > > > I assume you will send the v4 with these comments. I think, with v4
> we
> > > > > > > can start implementing common library code.
> > > > > >
> > > > > > Just need to agree on the split (one more iteration )
> > > > > > and I will start working on the common code.
> > > > >
> > > > > Ack.
> > > >
> > > > Great,
> > > > I'm starting to work on V4 with all comments so the RFC will be acked and
> > > then will start
> > > > coding the rest of the common code.
> > > >

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-27  7:31                     ` Ori Kam
@ 2020-02-27  9:16                       ` Wang Xiang
  0 siblings, 0 replies; 62+ messages in thread
From: Wang Xiang @ 2020-02-27  9:16 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, Jerin Jacob, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,

Looking forward to your v4 :).

Thanks,
Xiang
On Thu, Feb 27, 2020 at 07:31:48AM +0000, Ori Kam wrote:
> Hi Xiang,
> 
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > Sent: Thursday, February 27, 2020 11:26 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Jerin Jacob <jerinjacobk@gmail.com>; Jerin Jacob <jerinj@marvell.com>;
> > dpdk-dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> > Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > 
> > Hi Ori,
> > 
> > Thanks for the comments.
> > 
> > Hyperscan supports both start_offset and end_offset modes with most
> > users choosing end_offset for rule coverage and performance reasons.
> > I'm OK to have the default behavior with start_offset and len.
> > It'll be good to change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> > RTE_REGEX_DEV_CFG_MATCH_AS_END. For users who need only end_offset,
> > they have to set RTE_REGEX_DEV_CFG_MATCH_AS_END bit. We may also
> > remove
> > RTE_REGEX_DEV_SUPP_MATCH_AS_START if you like.
> 
> Since you say that Hyperscan can support both modes,
> What about changing the cap field to RTE_REGEX_DEV_SUPP_MATCH_AS_END?
> 
Ack. Looks good to me.
> > 
> > One question is related to the consistency of start_offset definition
> > among different solutions, does all solutions return the leftmost
> > start_offset, i.e. for rule: foo.*bar and input: foofoobar, the returned
> > start_offset will be 0 not 3?
> > 
> Yes, you are correct,
> In Mellanox case there will be only one result: start_offset 0
> 
> > Thanks,
> > Xiang
> > 
> > On Wed, Feb 26, 2020 at 08:36:51AM +0000, Ori Kam wrote:
> > > Hi Xiang,
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> > > > Sent: Wednesday, February 26, 2020 11:03 AM
> > > > To: Ori Kam <orika@mellanox.com>
> > > > Cc: Jerin Jacob <jerinjacobk@gmail.com>; Jerin Jacob
> > <jerinj@marvell.com>;
> > > > dpdk-dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> > > > Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> > Richardson,
> > > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > shanjiangh@chinatelecom.cn;
> > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > wushuai@inspur.com;
> > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > <thomas@monjalon.net>
> > > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > > >
> > > > Hi Ori and Jerin,
> > > >
> > > > One comment regarding my concern with len and end_offset problem.
> > > > From open source SW regex library(libpcre, re2 and Hyperscan) and
> > > > Intel's perspective, the matching results returned are always start
> > > > offset and end offset. More importantly, Hyperscan only reports end offset
> > > > most of the time.
> > > >
> > > > It'll be good to keep this union as an abstraction and enforce the default
> > > > behavior for each solution, i.e. HW solutions doesn't support
> > MATCH_AS_START
> > > > flag at rule compile time. Applications will know the meaning of variable at
> > > > rule compile time with the flag so they don't have to do extra check at fast
> > path
> > > > run-time matching.
> > > > Welcome for better abstraction ideas.
> > > >
> > >
> > > I don't mind to keep the union as it was in V3, but I would like to remove the
> > > configuration bit (RTE_REGEX_DEV_CFG_MATCH_AS_START).
> > > Meaning that if the device reports
> > RTE_REGEX_DEV_SUPP_MATCH_AS_START
> > > the result will always be with start_offset and len.
> > >
> > > Best,
> > > Ori
> > >
> > > > Thanks,
> > > > Xiang
> > > >
> > > > > > > > +                       /**< Starting Byte Position for matched rule. */
> > > > > > > > +                       RTE_STD_C11
> > > > > > > > +                       union {
> > > > > > > > +                               uint16_t len;
> > > > > > > > +                               /**< Length of match in bytes */
> > > > > > > > +                               uint16_t end_offset;
> > > > > > > > +                               /**< The end offset of the match. In case
> > > > > > > > +                                * MATCH_AS_START configuration is disabled.
> > > > > > > > +                                * @see
> > RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > > > > > +                                */
> > > > > > >
> > > > > > > We have not concluded on this scheme. Have one field which has
> > > > > > > different meaning will be difficult
> > > > > > > for application. i.e fast path we need to have a check for this.
> > > > > > >
> > > > > >
> > > > > > This is the time to conclude . at least for the first version.
> > > > > > Why do we have one field with different meaning?
> > > > > > > The result can be either len or end_offset.
> > > > > >
> > > > > > > I think, Based on the majority of HW/SW implementation, we need to
> > > > > > > either go with len or
> > > > > > > end_offset. What Mellanox HW returns? len or end_offset?
> > > > > > >
> > > > > >
> > > > > > From Mellanox perspective we prefer the len approach. We also think
> > > > > > it is much more user oriented.
> > > > > >
> > > > > > > or We can keep it as len or end_offset based on which drivers
> > upstream
> > > > > first,
> > > > > > > other drivers when it comes, we can see how to abstract it?
> > > > > > >
> > > > > >
> > > > > > > I can accept that, assuming we choose the start and len approach.
> > > > >
> > > > > I think, we can have first version with "start and len" by removing
> > > > > RTE_REGEX_DEV_CFG_MATCH_AS_START.
> > > > > > We can think about how to abstract new drivers when they are upstreamed,
> > > > > > based on the overhead.
> > > > >
> > > >
> > > >
> > > > On Tue, Feb 25, 2020 at 07:48:54AM +0000, Ori Kam wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > > > Sent: Tuesday, February 25, 2020 7:57 AM
> > > > > > To: Ori Kam <orika@mellanox.com>
> > > > > > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-
> > dev
> > > > > > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> > Shahaf
> > > > > > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > > > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>;
> > Alex
> > > > > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun
> > Kapoor
> > > > > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> > > > Richardson,
> > > > > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > > > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > > > shanjiangh@chinatelecom.cn;
> > > > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > > > wushuai@inspur.com;
> > > > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > > > <thomas@monjalon.net>
> > > > > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev
> > subsystem
> > > > > >
> > > > > > > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> > > > > > >
> > > > > > > We started to create a super basic app, after the API will be finalized
> > and
> > > > we
> > > > > > will have HW
> > > > > > > we can push it. (if you need it faster than feel free)
> > > > > >
> > > > > > A simple Unit test case needs to be present for the APIs. On the
> > > > > > course of developing common code,
> > > > > > it can be developed to test the common code with dummy/skeleton
> > driver.
> > > > > >
> > > > >
> > > > > Agree this is what we are currently have.
> > > > >
> > > > > > >
> > > > > > > > 5) Need a maintainer for maintaining the regex subsystem
> > > > > > > >
> > > > > > > We wish to maintain it if you agree.
> > > > > >
> > > > > > Yes. Please.
> > > > > >
> > > > >
> > > > > Great.
> > > > >
> > > > > > > > >
> > > > > > > > > One more thing, regarding the ops structure, I think it is better to
> > split
> > > > it
> > > > > > to 2
> > > > > > > > different
> > > > > > > > > structures one enque and one for dequeue, since there are no real
> > > > shared
> > > > > > > > data and we will
> > > > > > > > > be able to save memory, what do you think?
> > > > > > > >
> > > > > > > > Ops are allocated from mempool so it will be overhead to manage
> > both.
> > > > > > > > moreover, some
> > > > > > > > of the fields added in req can be used for resp as info. cryptodev
> > > > > > > > follows the similar concept,
> > > > > > > > I think, we can have symmetry with cryptodev wherever is possible
> > to
> > > > avoid
> > > > > > > > end-user to learn new API models.
> > > > > > >
> > > > > > > True that there will be overhead with 2 mempools (small one)
> > > > > > > but lets assume 255 results. This means that the buffer should be 255
> > *
> > > > > > sizeof(rte_regex_match) = 2K
> > > > > > > also this will enable us to replace groupX with group[] which will allow
> > > > even
> > > > > > more groups.
> > > > > > > In addition don't think that crypto is a good example.
> > > > > > > The main difference is that in RegEx the output is different format
> > then
> > > > the
> > > > > > input.
> > > > > >
> > > > > > # IMO, Some of the fields may be useful for a response as well. I
> > > > > > think application may be interested in following
> > > > > > req filed in the response.
> > > > > > a) buf_addr
> > > > >
> > > > > I don't see how this can be used in the response. since if working in out of
> > > > order result.
> > > > > you don’t know which result will be returned.
> > > > > I also think it is error prone to use the same op for the enqueue and
> > dequeue.
> > > > >
> > > > > > b) scan_size
> > > > >
> > > > > Please see above.
> > > > >
> > > > > > c) user_id (This would be main one)
> > > > >
> > > > > Agree
> > > > >
> > > > > >
> > > > > > # Having two mempools adds overhead per lcore L1 cache usage and
> > extra
> > > > > > complexity to the application.
> > > > > >
> > > > > > # IMO, From a performance perspective, one mempool is good due to
> > less
> > > > > > stress on the cache and it is costly to
> > > > > > add new mempool for HW mempool implementations.
> > > > > >
> > > > > > # I think, group[] use case we can add it when it required by
> > > > > > introducing "matches_start_offset" field, which will
> > > > > > tell the req, where is the end of group[] and where "matches" start
> > > > > > with single mempool scheme also.
> > > > > >
> > > > > > # I think, one of the other use case for "matches_start_offset" that,
> > > > > > It may possible to put vendor-specific
> > > > > > opaque data. It will be filled by driver on response. The application
> > > > > > can reference the matches as
> > > > > >
> > > > > > struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
> > > > > >
> > > > >
> > > > > O.K for now we will keep  it as is, and we will see what will be in the
> > future.
> > > > >
> > > > > > >
> > > > > > > > I assume you will send the v4 with these comments. I think, with v4
> > we
> > > > > > > > can start implementing common library code.
> > > > > > >
> > > > > > > Just need to agree on the split (one more iteration )
> > > > > > > and I will start working on the common code.
> > > > > >
> > > > > > Ack.
> > > > >
> > > > > Great,
> > > > > I'm starting to work on V4 with all comments so the RFC will be acked and
> > > > then will start
> > > > > coding the rest of the common code.
> > > > >

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
  2020-02-26  8:36                 ` Ori Kam
@ 2020-02-27  9:25                   ` Wang Xiang
  2020-02-27  7:31                     ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-02-27  9:25 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, Jerin Jacob, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,

Thanks for the comments.

Hyperscan supports both start_offset and end_offset modes with most
users choosing end_offset for rule coverage and performance reasons.
I'm OK to have the default behavior with start_offset and len.
It would be good to replace RTE_REGEX_DEV_CFG_MATCH_AS_START with
RTE_REGEX_DEV_CFG_MATCH_AS_END. Users who need only end_offset would
then have to set the RTE_REGEX_DEV_CFG_MATCH_AS_END bit. We may also remove
RTE_REGEX_DEV_SUPP_MATCH_AS_START if you like.
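
As a rough sketch of what this means for an application: the struct and
helper below are made up for illustration and are not part of the RFC, and
I assume here that end_offset is exclusive (one past the last matched byte).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the match record discussed in this thread. */
struct example_match {
	uint16_t start_offset; /* meaningful only in start_offset + len mode */
	union {
		uint16_t len;        /* default mode: match length in bytes */
		uint16_t end_offset; /* MATCH_AS_END mode: end of the match */
	};
};

/*
 * Return the end offset of a match regardless of the configured mode.
 * Assumption: end_offset is exclusive, so it equals start_offset + len.
 */
static inline uint16_t
example_match_end(const struct example_match *m, bool match_as_end)
{
	return match_as_end ? m->end_offset
			    : (uint16_t)(m->start_offset + m->len);
}
```

The mode is known once at configuration time, so the application can pick
the interpretation up front instead of checking it per match in the fast
path.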

One question concerns the consistency of the start_offset definition
among different solutions: do all solutions return the leftmost
start_offset, i.e. for rule: foo.*bar and input: foofoobar, will the
returned start_offset be 0 and not 3?
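
For comparison, a software matcher with leftmost semantics can be checked
with a few lines of C. This is only a sketch using POSIX regcomp()/regexec();
the helper name leftmost_match is made up, and it says nothing about how any
HW engine behaves.

```c
#include <regex.h>

/*
 * Find the first match of `pattern` in `input` with POSIX regexec().
 * Returns 0 and fills *start / *end (end is exclusive) on a match,
 * -1 on a compile error or when nothing matches.
 */
static int
leftmost_match(const char *pattern, const char *input, int *start, int *end)
{
	regex_t re;
	regmatch_t m;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	rc = regexec(&re, input, 1, &m, 0);
	regfree(&re);
	if (rc != 0)
		return -1;
	*start = (int)m.rm_so;
	*end = (int)m.rm_eo;
	return 0;
}
```

Called with foo.*bar and foofoobar, this reports start 0 and end 9, i.e. the
leftmost start.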

Thanks,
Xiang

On Wed, Feb 26, 2020 at 08:36:51AM +0000, Ori Kam wrote:
> Hi Xiang,
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> > Sent: Wednesday, February 26, 2020 11:03 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: Jerin Jacob <jerinjacobk@gmail.com>; Jerin Jacob <jerinj@marvell.com>;
> > dpdk-dev <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>;
> > Shahaf Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > 
> > Hi Ori and Jerin,
> > 
> > One comment regarding my concern with len and end_offset problem.
> > From open source SW regex library(libpcre, re2 and Hyperscan) and
> > Intel's perspective, the matching results returned are always start
> > offset and end offset. More importantly, Hyperscan only reports end offset
> > most of the time.
> > 
> > It'll be good to keep this union as an abstraction and enforce the default
> > behavior for each solution, i.e. HW solutions doesn't support MATCH_AS_START
> > flag at rule compile time. Applications will know the meaning of variable at
> > rule compile time with the flag so they don't have to do extra check at fast path
> > run-time matching.
> > Welcome for better abstraction ideas.
> > 
> 
> I don't mind to keep the union as it was in V3, but I would like to remove the
> configuration bit (RTE_REGEX_DEV_CFG_MATCH_AS_START). 
> Meaning that if the device reports RTE_REGEX_DEV_SUPP_MATCH_AS_START
> the result will always be with start_offset and len.
> 
> Best,
> Ori
> 
> > Thanks,
> > Xiang
> > 
> > > > > > +                       /**< Starting Byte Position for matched rule. */
> > > > > > +                       RTE_STD_C11
> > > > > > +                       union {
> > > > > > +                               uint16_t len;
> > > > > > +                               /**< Length of match in bytes */
> > > > > > +                               uint16_t end_offset;
> > > > > > +                               /**< The end offset of the match. In case
> > > > > > +                                * MATCH_AS_START configuration is disabled.
> > > > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > > > +                                */
> > > > >
> > > > > We have not concluded on this scheme. Have one field which has
> > > > > different meaning will be difficult
> > > > > for application. i.e fast path we need to have a check for this.
> > > > >
> > > >
> > > > This is the time to conclude . at least for the first version.
> > > > Why do we have one field with different meaning?
> > > > The result can be ether len or end_offset.
> > > >
> > > > > I think, Based on the majority of HW/SW implementation, we need to
> > > > > either go with len or
> > > > > end_offset. What Mellanox HW returns? len or end_offset?
> > > > >
> > > >
> > > > From Mellanox perspective we prefer the len approach. We also think
> > > > it is much more user oriented.
> > > >
> > > > > or We can keep it as len or end_offset based on which drivers upstream
> > > first,
> > > > > other drivers when it comes, we can see how to abstract it?
> > > > >
> > > >
> > > > I can except that assuming we choose the start and len approach
> > >
> > > I think, we can have first version with "start and len" by removing
> > > RTE_REGEX_DEV_CFG_MATCH_AS_START.
> > > When can think, how to abstract new drivers when it upstream based on
> > > the overhead.
> > >
> > 
> > 
> > On Tue, Feb 25, 2020 at 07:48:54AM +0000, Ori Kam wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob <jerinjacobk@gmail.com>
> > > > Sent: Tuesday, February 25, 2020 7:57 AM
> > > > To: Ori Kam <orika@mellanox.com>
> > > > Cc: Jerin Jacob <jerinj@marvell.com>; xiang.w.wang@intel.com; dpdk-dev
> > > > <dev@dpdk.org>; Pavan Nikhilesh <pbhagavatula@marvell.com>; Shahaf
> > > > Shuler <shahafs@mellanox.com>; Hemant Agrawal
> > > > <hemant.agrawal@nxp.com>; Opher Reviv <opher@mellanox.com>; Alex
> > > > Rosenbaum <alexr@mellanox.com>; dovrat@marvell.com; Prasun Kapoor
> > > > <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>;
> > Richardson,
> > > > Bruce <bruce.richardson@intel.com>; yang.a.hong@intel.com;
> > > > harry.chang@intel.com; gu.jian1@zte.com.cn;
> > shanjiangh@chinatelecom.cn;
> > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > wushuai@inspur.com;
> > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > <thomas@monjalon.net>
> > > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > > >
> > > > > > 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> > > > >
> > > > > We started to create a super basic app, after the API will be finalized and
> > we
> > > > will have HW
> > > > > we can push it. (if you need it faster than feel free)
> > > >
> > > > A simple Unit test case needs to be present for the APIs. On the
> > > > course of developing common code,
> > > > it can be developed to test the common code with dummy/skeleton driver.
> > > >
> > >
> > > Agree this is what we are currently have.
> > >
> > > > >
> > > > > > 5) Need a maintainer for maintaining the regex subsystem
> > > > > >
> > > > > We wish to maintain it if you agree.
> > > >
> > > > Yes. Please.
> > > >
> > >
> > > Great.
> > >
> > > > > > >
> > > > > > > One more thing, regarding the ops structure, I think it is better to split
> > it
> > > > to 2
> > > > > > different
> > > > > > > structures one enque and one for dequeue, since there are no real
> > shared
> > > > > > data and we will
> > > > > > > be able to save memory, what do you think?
> > > > > >
> > > > > > Ops are allocated from mempool so it will be overhead to manage both.
> > > > > > moreover, some
> > > > > > of the fields added in req can be used for resp as info. cryptodev
> > > > > > follows the similar concept,
> > > > > > I think, we can have symmetry with cryptodev wherever is possible to
> > avoid
> > > > > > end-user to learn new API models.
> > > > >
> > > > > True that there will be overhead with 2 mempools (small one)
> > > > > but lets assume 255 results. This means that the buffer should be 255 *
> > > > sizeof(rte_regex_match) = 2K
> > > > > also this will enable us to replace groupX with group[] which will allow
> > even
> > > > more groups.
> > > > > In addition don't think that crypto is a good example.
> > > > > The main difference is that in RegEx the output is different format then
> > the
> > > > input.
> > > >
> > > > # IMO, Some of the fields may be useful for a response as well. I
> > > > think application may be interested in following
> > > > req filed in the response.
> > > > a) buf_addr
> > >
> > > I don't see how this can be used in the response. since if working in out of
> > order result.
> > > you don’t know which result will be returned.
> > > I also think it is error prone to use the same op for the enqueue and dequeue.
> > >
> > > > b) scan_size
> > >
> > > Please see above.
> > >
> > > > c) user_id (This would be main one)
> > >
> > > Agree
> > >
> > > >
> > > > # Having two mempools adds overhead per lcore L1 cache usage and extra
> > > > complexity to the application.
> > > >
> > > > # IMO, From a performance perspective, one mempool is good due to less
> > > > stress on the cache and it is costly to
> > > > add new mempool for HW mempool implementations.
> > > >
> > > > # I think, group[] use case we can add it when it required by
> > > > introducing "matches_start_offset" field, which will
> > > > tell the req, where is the end of group[] and where "matches" start
> > > > with single mempool scheme also.
> > > >
> > > > # I think, one of the other use case for "matches_start_offset" that,
> > > > It may possible to put vendor-specific
> > > > opaque data. It will be filled by driver on response. The application
> > > > can reference the matches as
> > > >
> > > > struct rte_regex_match *matches = RTE_PTR_ADD(ops, ops->matches_start_offset);
> > > >
> > >
> > > O.K for now we will keep  it as is, and we will see what will be in the future.
> > >
> > > > >
> > > > > > I assume you will send the v4 with these comments. I think, with v4 we
> > > > > > can start implementing common library code.
> > > > >
> > > > > Just need to agree on the split (one more iteration )
> > > > > and I will start working on the common code.
> > > >
> > > > Ack.
> > >
> > > Great,
> > > I'm starting to work on V4 with all comments so the RFC will be acked and
> > then will start
> > > coding the rest of the common code.
> > >


* [dpdk-dev] [RFC v4] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
                   ` (3 preceding siblings ...)
  2020-01-28  9:00 ` [dpdk-dev] [PATCH v3] regexdev: " Ori Kam
@ 2020-02-27 14:40 ` Ori Kam
  2020-02-27 14:55   ` Jerin Jacob
  2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
  2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
  6 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-02-27 14:40 UTC (permalink / raw)
  To: jerinj, xiang.w.wang
  Cc: dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr, dovrat,
	pkapoor, nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas, orika

From: Jerin Jacob <jerinj@marvell.com>

Even though some vendors offer RegEx HW offload, the lack of a
standard API makes it difficult for DPDK consumers to use them
in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
Hyperscan, and on a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition
* Memory introspection
* Natural Language Processing (NLP)
* Sentiment analysis
* Big data database acceleration
* Computational storage

HW and SW RegEx vendors and RegEx application users are requested to
review this RFC so that we have a portable DPDK API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and ethdev
device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
V4:
 * Replace iova with mbuf.
 * Small ML comments.
V3:
 * Change subject title.
V2:
 * Address ML comments.
---
 config/common_base                           |    7 +
 doc/api/doxy-api-index.md                    |    1 +
 doc/api/doxy-api.conf.in                     |    1 +
 lib/Makefile                                 |    2 +
 lib/librte_regexdev/Makefile                 |   31 +
 lib/librte_regexdev/rte_regexdev.c           |    6 +
 lib/librte_regexdev/rte_regexdev.h           | 1393 ++++++++++++++++++++++++++
 lib/librte_regexdev/rte_regexdev_version.map |   26 +
 8 files changed, 1467 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h
 create mode 100644 lib/librte_regexdev/rte_regexdev_version.map

diff --git a/config/common_base b/config/common_base
index f9a68f3..4810849 100644
--- a/config/common_base
+++ b/config/common_base
@@ -806,6 +806,12 @@ CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
 
 #
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+CONFIG_RTE_LIBRTE_REGEXDEV_DEBUG=n
+
+#
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
@@ -1098,3 +1104,4 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..787f7c2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 1c4392e..56c08eb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
                           @TOPDIR@/lib/librte_rib \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a273564 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net librte_hash librte_cryptodev
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 0000000..6f4cc63
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+# Copyright(C) 2020 Mellanox International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+EXPORT_MAP := rte_regexdev_version.map
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h
+
+# versioning export map
+EXPORT_MAP := rte_regexdev_version.map
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 0000000..b901877
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 0000000..4fd1475
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1393 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ * Copyright(C) 2020 Intel International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | |             | |<-------rte_regex_rule_db_compile_activate()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier
+ * is provided at the time of rule creation for the application to identify
+ * the rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * It is possible that a given RegEx device does not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function
+ * can be invoked in parallel by different cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the address of the fast-path functions of the
+ * RegEx driver is not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API will follow the
+ * following programming flow.
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needed if no precompiled rule database was
+ *   provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
+ *   and/or the application needs to update the rule database.
+ * - rte_regex_rule_db_compile_activate() Needed if
+ *   rte_regex_rule_db_update() was used.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_memory.h>
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+__rte_experimental
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+__rte_experimental
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, as opposed to
+ * loading only the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_START_ANCHOR_F (1ULL << 1)
+/**< RegEx device supports the PCRE anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_ATOMIC_GROUPING_F (1ULL << 2)
+/**< RegEx device supports PCRE atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c' if the given inputs are 'abc' and 'abcc' then
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKTRACKING_CTRL_F (1ULL << 3)
+/**< RegEx device supports PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_CALLOUTS_F (1ULL << 4)
+/**< RegEx device supports PCRE callouts.
+ * PCRE supports calling external function in between matches by using '(?C)'.
+ * Example RegEx 'ABC(?C)D' if a given input is 'ABCD' then the RegEx engine
+ * will parse ABC, perform a user-defined callout and return a successful
+ * match at D.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
+/**< RegEx device supports PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
+/**< RegEx device supports PCRE greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the input 'AB12345' will be matched completely
+ * whereas in ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
+/**< RegEx device supports PCRE lookaround assertions
+ * (zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given input is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The input 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_POINT_RST_F (1ULL << 8)
+/**< RegEx device supports PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+' if the input is 'dwad123'
+ * then even though the entire input matches only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_NEWLINE_CONVENTIONS_F (1ULL << 9)
+/**< RegEx device supports PCRE newline conventions.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_NEWLINE_SEQ_F (1ULL << 10)
+/**< RegEx device supports PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_POSSESSIVE_QUALIFIERS_F (1ULL << 11)
+/**< RegEx device supports PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers '*+', '++', '?+', '{m,n}+'.
+ * A possessive quantifier repeats the token as many times as possible and
+ * does not give up matches as the engine backtracks. With a possessive
+ * quantifier, the deal is all or nothing.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_SUBROUTINE_REFERENCES_F (1ULL << 12)
+/**< RegEx device supports PCRE subroutine references.
+ * PCRE subroutine references allow a sub-pattern to be evaluated again
+ * elsewhere in the RegEx. For example, the RegEx '(foo|fuzz)\g<1>+bar'
+ * matches the string 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)
+/**< RegEx device supports UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_16_F (1ULL << 14)
+/**< RegEx device supports UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_32_F (1ULL << 15)
+/**< RegEx device supports UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_WORD_BOUNDARY_F (1ULL << 16)
+/**< RegEx device supports word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_FORWARD_REFERENCES_F (1ULL << 17)
+/**< RegEx device supports forward references.
+ * Forward references allow a back reference to refer to a group that appears
+ * later in the RegEx. For example, the RegEx '(\3ABC|(DEF|(GHI)))+' matches
+ * the string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_MATCH_AS_END (1ULL << 18)
+/**< RegEx device supports match as end.
+ * Match as end means that the match result holds the end offset of the
+ * detected match. No len value is set.
+ * If the device doesn't support this feature, the match result holds
+ * the starting position of the match and the length of the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. This is similar to '^' and can be expressed as '\A'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name. */
+	struct rte_device *dev;	/**< Device information. */
+	uint16_t max_matches;
+	/**< Maximum matches per scan supported by this device. */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device. */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint32_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device. */
+	uint16_t max_groups;
+	/**< Maximum groups supported by this device. */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_SUPP_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint8_t max_scatter_gather;
+	/**< The maximum number of buffers that can be used
+	 * in a single op. The total size of all elements
+	 * must be less than max_payload_size.
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
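
As a rough illustration of how an application might consume regex_dev_capa and rule_flags, the sketch below checks required flags against what a device reports. It is self-contained rather than built on the proposed header: the two macros are copied from this patch, struct dev_caps is a cut-down stand-in for struct rte_regex_dev_info, and missing_flags() and can_run_caseless_utf8() are hypothetical helpers, not part of the proposed API.

```c
#include <stdint.h>

/* Copied from the patch above so the sketch compiles on its own. */
#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)

/* Stand-in for the two capability fields of struct rte_regex_dev_info. */
struct dev_caps {
	uint32_t regex_dev_capa;
	uint64_t rule_flags;
};

/* Hypothetical helper: return the required bits the device does NOT offer. */
static uint64_t
missing_flags(uint64_t supported, uint64_t required)
{
	return required & ~supported;
}

/* Return 1 if the device can run caseless UTF-8 rules, 0 otherwise. */
static int
can_run_caseless_utf8(const struct dev_caps *caps)
{
	if (missing_flags(caps->regex_dev_capa,
			  RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F) != 0)
		return 0;
	return missing_flags(caps->rule_flags,
			     RTE_REGEX_PCRE_RULE_CASELESS_F) == 0;
}
```

In a real application the two fields would presumably be filled in by rte_regex_dev_info_get() before any rule is submitted, so that unsupported rules are rejected early rather than at database update time.
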
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect matches that occur
+ * across buffer boundaries, where the buffers are related to each other in
+ * some way. Enable this flag to scan a payload larger than
+ * struct rte_regex_dev_info::max_payload_size and/or when matches can be
+ * present across scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+#define RTE_REGEX_DEV_CFG_MATCH_AS_END (1ULL << 1)
+/**< Match as end is the ability to return the result as ending offset.
+ * When this flag is set, the result for each match will hold the ending
+ * offset of the match in end_offset.
+ * If this flag is not set, then the match result will hold the starting offset
+ * in start_offset, and the length of the match in len.
+ *
+ * @see RTE_REGEX_DEV_SUPP_MATCH_AS_END
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint16_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* value previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint32_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Initial prebuilt rule database to import on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application may
+	 * use the rte_regex_rule_db_update() or rte_regex_rule_db_import()
+	 * API to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a RegEx device.
+ *
+ * This function must be invoked before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities and
+ * resource limits of this RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
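
The constraints documented on struct rte_regex_dev_config can be sketched as a small clamping step. This is only an illustration under stated assumptions: struct dev_limits and struct dev_config are cut-down stand-ins for the structures in this patch, and fill_config() is a hypothetical helper that an application might write before calling rte_regex_dev_configure().

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for the limits in struct rte_regex_dev_info. */
struct dev_limits {
	uint16_t max_matches;
	uint16_t max_queue_pairs;
	uint32_t max_rules_per_group;
	uint16_t max_groups;
};

/* Stand-in mirroring the fields of struct rte_regex_dev_config. */
struct dev_config {
	uint16_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint32_t nb_rules_per_group;
	uint16_t nb_groups;
	const char *rule_db;
	uint32_t rule_db_len;
	uint32_t dev_cfg_flags;
};

static uint16_t
min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/* Hypothetical helper: clamp the application's wishes to device limits. */
static void
fill_config(const struct dev_limits *lim, uint16_t want_qps,
	    uint16_t want_matches, struct dev_config *cfg)
{
	cfg->nb_queue_pairs = min_u16(want_qps, lim->max_queue_pairs);
	cfg->nb_max_matches = min_u16(want_matches, lim->max_matches);
	/* 0 is documented above as "use max_rules_per_group". */
	cfg->nb_rules_per_group = 0;
	cfg->nb_groups = lim->max_groups;
	/* No prebuilt database; rules would be added after configure. */
	cfg->rule_db = NULL;
	cfg->rule_db_len = 0;
	cfg->dev_cfg_flags = 0;
}
```
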
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a RegEx device. The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+__rte_experimental
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to retrieve.
+ * @param attr_value
+ *   A pointer that will be filled in with the attribute
+ *   value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an attribute on a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to set.
+ * @param attr_value
+ *   Pointer to the attribute value supplied by the application.
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation. */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database. */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database. */
+};
+
+/** Structure to hold a RegEx rule attributes. */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either OP_ADD or OP_REMOVE. */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The PCRE rule flags supported by the device are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+	 * rule database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the local rule set.
+ * This function only modifies the rule set in memory.
+ * In order for the changes to take effect, the function
+ * rte_regex_rule_db_compile_activate() must be called.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed, the caller has to take
+ *   care of them, and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL: Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_compile_activate()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint32_t nb_rules);
+
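
The partial-return contract of rte_regex_rule_db_update() can be sketched as a retry loop. This is one possible caller-side policy, not something the RFC prescribes: unsupported rules are skipped, any other error stops the loop. Everything here is a stand-in so the sketch compiles on its own: struct rule replaces struct rte_regex_rule, sketch_errno replaces rte_errno, update_all() is a hypothetical helper, and fake_update() is a test double that rejects odd rule ids. The comparison uses the negative value only because the comment above lists the codes with a minus sign.

```c
#include <errno.h>
#include <stdint.h>

struct rule {
	uint32_t rule_id; /* stand-in for struct rte_regex_rule */
};

static int sketch_errno; /* stand-in for rte_errno */

typedef int (*rule_update_fn)(uint8_t dev_id, const struct rule *rules,
			      uint32_t nb_rules);

/* Test double: accepts even rule ids, rejects odd ones as unsupported. */
static int
fake_update(uint8_t dev_id, const struct rule *rules, uint32_t nb_rules)
{
	uint32_t i;

	(void)dev_id;
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].rule_id & 1) {
			sketch_errno = -ENOTSUP;
			return (int)i;
		}
	}
	return (int)nb_rules;
}

/*
 * Hypothetical helper: push all rules, skipping unsupported ones.
 * Returns the number of rules accepted; *skipped counts rejected rules.
 */
static uint32_t
update_all(rule_update_fn update, uint8_t dev_id, const struct rule *rules,
	   uint32_t nb_rules, uint32_t *skipped)
{
	uint32_t done = 0, accepted = 0;

	*skipped = 0;
	while (done < nb_rules) {
		int n = update(dev_id, &rules[done], nb_rules - done);

		if (n < 0)
			break;
		accepted += (uint32_t)n;
		done += (uint32_t)n;
		if (done == nb_rules)
			break;
		if (sketch_errno != -ENOTSUP)
			break; /* e.g. no space left: give up */
		done++; /* step over the unsupported rule */
		(*skipped)++;
	}
	return accepted;
}
```

Per the description above, nothing takes effect on the device until rte_regex_rule_db_compile_activate() is called after the loop.
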
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compile the local rule set and burn the compiled result to the
+ * RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success, otherwise negative errno.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_update()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_compile_activate(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param[out] rule_db
+ *   Block of memory to receive the rule database. Its size must be at least
+ *   the required capacity. If set to NULL, the function returns the required
+ *   capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
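
The "call with NULL to learn the size, then call again with a buffer" convention documented for rte_regex_rule_db_export() might be used as in the sketch below. It is an illustration only: export_db() is a hypothetical helper, the export function is passed in as a pointer so the block compiles without the proposed header, and fake_export() is a test double standing in for a device.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int (*db_export_fn)(uint8_t dev_id, char *rule_db);

/* Test double: behaves as the export comment above describes. */
static const char fake_db[] = "compiled-rules";

static int
fake_export(uint8_t dev_id, char *rule_db)
{
	(void)dev_id;
	if (rule_db == NULL)
		return (int)sizeof(fake_db); /* required capacity */
	memcpy(rule_db, fake_db, sizeof(fake_db));
	return 0;
}

/*
 * Hypothetical helper: query the size, allocate, then export.
 * Returns a malloc()ed buffer the caller must free, or NULL on failure.
 */
static char *
export_db(db_export_fn export_fn, uint8_t dev_id, int *len)
{
	int size = export_fn(dev_id, NULL);
	char *buf;

	if (size <= 0)
		return NULL;
	buf = malloc((size_t)size);
	if (buf == NULL)
		return NULL;
	if (export_fn(dev_id, buf) != 0) {
		free(buf);
		return NULL;
	}
	*len = size;
	return buf;
}
```

One design point this makes visible: the prototype has no length parameter, so the device cannot check the buffer size on the second call and the caller has to trust that the database did not grow in between.
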
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to receive the ids and names. Its size must be at least
+ *   the required capacity. If set to NULL, the function returns the required
+ *   capacity.
+ * @return
+ *   - Positive value on success:
+ *        - The number of entries filled in the stats map.
+ *        - If xstats_map is NULL, the required capacity for xstats_map.
+ *   - Negative value on error:
+ *        - -ENODEV for invalid *dev_id*
+ *        - -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be obtained from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param values
+ *   The values for each stat requested by ID.
+ * @param n
+ *   The number of stats requested.
+ * @return
+ *   - Positive value: number of stat entries filled into the values array
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param name
+ *   The stat name to retrieve.
+ * @param id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be made using rte_regex_dev_xstats_get(), which
+ *   will be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL. The retrieved xstat value will be stored at this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ *
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters.
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   A pointer to a file for output.
+ *
+ * @return
+ *   0 on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t start_offset;
+			/**< Starting Byte Position for matched rule. */
+			RTE_STD_C11
+			union {
+				uint16_t len;
+				/**< Length of match in bytes */
+				uint16_t end_offset;
+				/**< The end offset of the match. In case
+				 * MATCH_AS_END configuration is enabled.
+				 * @see RTE_REGEX_DEV_CFG_MATCH_AS_END
+				 */
+			};
+		};
+	};
+};
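
Because len and end_offset share storage, the application has to know how the device was configured before it interprets a match. The sketch below shows one way to do that. It makes several assumptions: struct match_span is a cut-down stand-in for struct rte_regex_match, match_end() is a hypothetical helper, and the end of a match in start/len mode is taken to be start_offset + len. The RFC does not say whether end_offset is inclusive or exclusive, so that is left to the device here.

```c
#include <stdint.h>

/* Stand-in for the offset part of struct rte_regex_match. */
struct match_span {
	uint16_t start_offset;
	union {
		uint16_t len;        /* valid when MATCH_AS_END is not set */
		uint16_t end_offset; /* valid when MATCH_AS_END is set */
	};
};

/*
 * Hypothetical helper: return the end of the match.
 * match_as_end is non-zero when the device was configured with
 * RTE_REGEX_DEV_CFG_MATCH_AS_END.
 */
static uint32_t
match_end(const struct match_span *m, int match_as_end)
{
	if (match_as_end)
		return m->end_offset;
	/* Widen before adding so 16-bit values cannot wrap around. */
	return (uint32_t)m->start_offset + m->len;
}
```
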
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In high priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest rule ID, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/** Struct to hold scatter gather elements in ops. */
+struct rte_regex_iov {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		/**< Reserve 8 bytes on 32-bit systems */
+		void *buf_addr;
+		/**< Virtual address of the pattern to be matched. */
+	};
+	rte_iova_t buf_iova;
+	/**< IOVA address of the pattern to be matched. */
+	uint16_t buf_size; /**< The buf size. */
+};
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint16_t nb_actual_matches;
+	/**< The total number of actual matches detected by the RegEx device. */
+	uint16_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
+
+	/* W2 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W3 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold an application specific value to share
+		 * between enqueue and dequeue operations.
+		 * The implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W4 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that must
+ *   be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, only that many
+ *   leading entries of *ops* are valid and the caller must ignore the
+ *   remaining entries.
+ */
+__rte_experimental
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
diff --git a/lib/librte_regexdev/rte_regexdev_version.map b/lib/librte_regexdev/rte_regexdev_version.map
new file mode 100644
index 0000000..723104d
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev_version.map
@@ -0,0 +1,26 @@
+EXPERIMENTAL {
+	global:
+
+	rte_regex_dev_count;
+	rte_regex_dev_get_dev_id;
+	rte_regex_dev_info_get;
+	rte_regex_dev_configure;
+	rte_regex_queue_pair_setup;
+	rte_regex_dev_start;
+	rte_regex_dev_stop;
+	rte_regex_dev_close;
+	rte_regex_dev_attr_get;
+	rte_regex_dev_attr_set;
+	rte_regex_rule_db_update;
+	rte_regex_rule_db_compile_activate;
+	rte_regex_rule_db_import;
+	rte_regex_rule_db_export;
+	rte_regex_dev_xstats_names_get;
+	rte_regex_dev_xstats_get;
+	rte_regex_dev_xstats_by_name_get;
+	rte_regex_dev_xstats_reset;
+	rte_regex_dev_selftest;
+	rte_regex_dev_dump;
+	rte_regex_enqueue_burst;
+	rte_regex_dequeue_burst;
+};
-- 
1.8.3.1
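
The burst contract documented in the patch above (a partial enqueue leaves
the tail of *ops* unconsumed; a dequeue returns at most *nb_ops* responses)
can be sketched with a small self-contained mock. Everything named mock_*
below is hypothetical: it only stands in for the proposed
rte_regex_enqueue_burst()/rte_regex_dequeue_burst() calls and for
struct rte_regex_ops. It is not DPDK code and makes no claim about how a
real PMD behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_regex_ops; only user_id is modeled. */
struct mock_regex_ops {
	uint64_t user_id;
};

#define MOCK_QUEUE_DEPTH 4

/* Hypothetical in-memory FIFO standing in for one RegEx queue pair. */
struct mock_qp {
	struct mock_regex_ops *slots[MOCK_QUEUE_DEPTH];
	uint16_t count;
};

/* Accepts ops until the queue is full; returns how many were consumed. */
static uint16_t
mock_enqueue_burst(struct mock_qp *qp, struct mock_regex_ops **ops,
		   uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < MOCK_QUEUE_DEPTH)
		qp->slots[qp->count++] = ops[n++];
	return n;
}

/* Returns up to nb_ops "completed" ops in FIFO order. */
static uint16_t
mock_dequeue_burst(struct mock_qp *qp, struct mock_regex_ops **ops,
		   uint16_t nb_ops)
{
	uint16_t n = 0, i;

	while (n < nb_ops && n < qp->count) {
		ops[n] = qp->slots[n];
		n++;
	}
	/* Shift the ops that were not dequeued to the front of the FIFO. */
	for (i = n; i < qp->count; i++)
		qp->slots[i - n] = qp->slots[i];
	qp->count = (uint16_t)(qp->count - n);
	return n;
}

/*
 * Submit nb_ops scan requests and collect all responses in done[].
 * On a partial enqueue only the unconsumed tail of ops[] is retried.
 */
static uint16_t
scan_all(struct mock_qp *qp, struct mock_regex_ops **ops, uint16_t nb_ops,
	 struct mock_regex_ops **done)
{
	uint16_t sent = 0, recv = 0;

	assert(qp != NULL && ops != NULL && done != NULL);
	while (recv < nb_ops) {
		sent = (uint16_t)(sent + mock_enqueue_burst(qp, &ops[sent],
					(uint16_t)(nb_ops - sent)));
		recv = (uint16_t)(recv + mock_dequeue_burst(qp, &done[recv],
					(uint16_t)(nb_ops - recv)));
	}
	return recv;
}
```

With a queue depth of 4 and 10 requests, scan_all() needs three
enqueue/dequeue rounds. The user_id field plays the role the patch gives it:
an opaque value the application uses to pair a response with its request.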


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v4] regexdev: introduce regexdev subsystem
  2020-02-27 14:40 ` [dpdk-dev] [RFC v4] " Ori Kam
@ 2020-02-27 14:55   ` Jerin Jacob
  0 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob @ 2020-02-27 14:55 UTC (permalink / raw)
  To: Ori Kam
  Cc: Jerin Jacob, xiang.w.wang, dpdk-dev, Pavan Nikhilesh,
	Shahaf Shuler, Hemant Agrawal, Opher Reviv, Alex Rosenbaum,
	dovrat, Prasun Kapoor, Nipun Gupta, Richardson, Bruce,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, Jim Thompson, hongjun.ni, j.bromhead, deri, fc,
	arthur.su, Thomas Monjalon

On Thu, Feb 27, 2020 at 8:11 PM Ori Kam <orika@mellanox.com> wrote:
>
> From: Jerin Jacob <jerinj@marvell.com>
>
> Even though there are some vendors which offer Regex HW offload, due to
> lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
>
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
>
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
>
> RegEx pattern matching applications:
> * Next Generation Firewalls (NGFW)
> * Deep Packet and Flow Inspection (DPI)
> * Intrusion Prevention Systems (IPS)
> * DDoS Mitigation
> * Network Monitoring
> * Data Loss Prevention (DLP)
> * Smart NICs
> * Grammar based content processing
> * URL, spam and adware filtering
> * Advanced auditing and policing of user/application security policies
> * Financial data mining - parsing of streamed financial feeds
> * Application recognition.
> * Memory introspection.
> * Natural Language Processing (NLP)
> * Sentiment Analysis.
> * Big data database acceleration.
> * Computational storage.
>
> Request to review from HW and SW RegEx vendors and RegEx application
> users to have portable DPDK API for RegEx.
>
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
>
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Signed-off-by: Ori Kam <orika@mellanox.com>
> ---
> V4:
>  * Replace iova with mbuf.
>  * Small ML comments.
> V3:
>  * Change subject title.
> V2:
>  * Address ML comments.

> +/** Struct to hold scatter gather elements in ops. */
> +struct rte_regex_iov {
> +       RTE_STD_C11
> +       union {
> +               uint64_t u64;
> +               /**<  Allow 8-byte reserved on 32-bit system */
> +               void *buf_addr;
> +               /**< Virtual address of the pattern to be matched. */
> +       };
> +       rte_iova_t buf_iova;
> +       /**< IOVA address of the pattern to be matched. */
> +       uint16_t buf_size; /**< The buf size. */
> +};

The rte_regex_iov structure is stale. Please remove it.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
                   ` (4 preceding siblings ...)
  2020-02-27 14:40 ` [dpdk-dev] [RFC v4] " Ori Kam
@ 2020-02-27 15:08 ` Ori Kam
  2020-03-01  6:13   ` [dpdk-dev] [EXT] " Pavan Nikhilesh Bhagavatula
  2020-03-02  7:05   ` [dpdk-dev] " Wang Xiang
  2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
  6 siblings, 2 replies; 62+ messages in thread
From: Ori Kam @ 2020-02-27 15:08 UTC (permalink / raw)
  To: jerinj, xiang.w.wang
  Cc: dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr, dovrat,
	pkapoor, nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas, orika

From: Jerin Jacob <jerinj@marvell.com>

Even though there are some vendors which offer Regex HW offload, due to
lack of a standard API, it is difficult for DPDK consumers to use them
in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

This RFC is crafted based on SW Regex API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition.
* Memory introspection.
* Natural Language Processing (NLP)
* Sentiment Analysis.
* Big data database acceleration.
* Computational storage.

Request to review from HW and SW RegEx vendors and RegEx application
users to have portable DPDK API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and ethdev
device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
V5:
 * Remove unused iov struct.
V4:
 * Replace iov with mbuf.
 * Small ML comments.
V3:
 * Change subject title.
V2:
 * Address ML comments.
---
 config/common_base                           |    7 +
 doc/api/doxy-api-index.md                    |    1 +
 doc/api/doxy-api.conf.in                     |    1 +
 lib/Makefile                                 |    2 +
 lib/librte_regexdev/Makefile                 |   31 +
 lib/librte_regexdev/rte_regexdev.c           |    6 +
 lib/librte_regexdev/rte_regexdev.h           | 1379 ++++++++++++++++++++++++++
 lib/librte_regexdev/rte_regexdev_version.map |   26 +
 8 files changed, 1453 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h
 create mode 100644 lib/librte_regexdev/rte_regexdev_version.map

diff --git a/config/common_base b/config/common_base
index f9a68f3..4810849 100644
--- a/config/common_base
+++ b/config/common_base
@@ -806,6 +806,12 @@ CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
 
 #
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+CONFIG_RTE_LIBRTE_REGEXDEV_DEBUG=n
+
+#
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
@@ -1098,3 +1104,4 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..787f7c2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 1c4392e..56c08eb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
                           @TOPDIR@/lib/librte_rib \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a273564 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net librte_hash librte_cryptodev
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 0000000..6f4cc63
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+# Copyright(C) 2020 Mellanox International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+EXPORT_MAP := rte_regexdev_version.map
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h
+
+# versioning export map
+EXPORT_MAP := rte_regexdev_version.map
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 0000000..b901877
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 0000000..2e9fe07
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1379 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ * Copyright(C) 2020 Intel International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | |             | |<-------rte_regex_rule_db_compile_activate()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and to receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier
+ * is provided at the time of rule creation for the application to identify
+ * the rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * It is possible that a given RegEx device does not support all the features
+ * of PCRE. The application may probe unsupported features through
+ * struct rte_regex_dev_info::pcre_unsup_flags
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*.
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the address of the fast-path functions of the
+ * RegEx driver is not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API will follow the
+ * following programming flow.
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Must be invoked if no precompiled rule database
+ *   was provided in rte_regex_dev_config::rule_db for rte_regex_dev_configure()
+ *   and/or the application needs to update the rule database.
+ * - rte_regex_rule_db_compile_activate() Must be invoked if
+ *   rte_regex_rule_db_update() was used.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_memory.h>
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+__rte_experimental
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+__rte_experimental
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, rather than
+ * only loading a pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_START_ANCHOR_F (1ULL << 1)
+/**< RegEx device supports PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_ATOMIC_GROUPING_F (1ULL << 2)
+/**< RegEx device supports PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c' if the given patterns are 'abc' and 'abcc' then
+ * 'a(bc|b)c' matches both whereas 'a(?>bc|b)c' matches only 'abcc' because
+ * atomic groups don't allow backtracking back to 'b'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKTRACKING_CTRL_F (1ULL << 3)
+/**< RegEx device supports PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_CALLOUTS_F (1ULL << 4)
+/**< RegEx device supports PCRE callouts.
+ * PCRE supports calling an external function in between matches using '(?C)'.
+ * Example RegEx 'ABC(?C)D' if a given pattern is 'ABCD' then the RegEx engine
+ * will parse ABC, perform a user-defined callout and return a successful match
+ * at D.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
+/**< RegEx device supports PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
+/**< RegEx device supports PCRE Greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the pattern 'AB12345' will be matched completely
+ * whereas in ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
+/**< RegEx device supports PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_POINT_RST_F (1ULL << 8)
+/**< RegEx device supports PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+' if the pattern is 'dwad123'
+ * then even though the entire pattern matches only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_NEWLINE_CONVENTIONS_F (1ULL << 9)
+/**< RegEx device supports PCRE newline convention.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_NEWLINE_SEQ_F (1ULL << 10)
+/**< RegEx device supports PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_POSSESSIVE_QUALIFIERS_F (1ULL << 11)
+/**< RegEx device supports PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers '*+', '++', '?+', '{m,n}+'.
+ * Possessive quantifier repeats the token as many times as possible and it does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_SUBROUTINE_REFERENCES_F (1ULL << 12)
+/**< RegEx device supports PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 13)
+/**< RegEx device supports UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_16_F (1ULL << 14)
+/**< RegEx device supports UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_32_F (1ULL << 15)
+/**< RegEx device supports UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_WORD_BOUNDARY_F (1ULL << 16)
+/**< RegEx device supports word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_FORWARD_REFERENCES_F (1ULL << 17)
+/**< RegEx device supports Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_MATCH_AS_END (1ULL << 18)
+/**< RegEx device supports match as end.
+ * Match as end means that the match result holds the end offset of the
+ * detected match. No len value is set.
+ * If the device doesn't support this feature it means the match
+ * result holds the starting position of match and the length of the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode, which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name. */
+	struct rte_device *dev;	/**< Device information. */
+	uint16_t max_matches;
+	/**< Maximum matches per scan supported by this device. */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device. */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint32_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device. */
+	uint16_t max_groups;
+	/**< Maximum groups supported by this device. */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_SUPP_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+	uint8_t max_scatter_gather;
+	/**< The maximum supported number of buffers that can
+	 * be used in a single op. The total size of all elements
+	 * must be less than max_payload_size.
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag when the payload to scan is
+ * larger than struct rte_regex_dev_info::max_payload_size and/or
+ * when matches can span scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ */
+
+#define RTE_REGEX_DEV_CFG_MATCH_AS_END (1ULL << 1)
+/**< Match as end is the ability to return the result as ending offset.
+ * When this flag is set, the result for each match will hold the ending
+ * offset of the match in end_offset.
+ * If this flag is not set, then the match result will hold the starting offset
+ * in start_offset, and the length of the match in len.
+ *
+ * @see RTE_REGEX_DEV_SUPP_MATCH_AS_END
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint16_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* value previously
+	 * provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint32_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups* value
+	 * previously provided by rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Import an initial prebuilt rule database on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. The application may use
+	 * the rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+	 * to update or import a rule database after
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities and
+ * resource limits of this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
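As a rough illustration of the limit checks described above, the sketch below validates a requested configuration against the limits a device reports, and resolves the two documented zero defaults. The names regex_limits, regex_request and regex_request_check are hypothetical stand-ins that mirror only a few fields of struct rte_regex_dev_info and struct rte_regex_dev_config; the proposed header is not included, and a real application would call rte_regex_dev_info_get() and rte_regex_dev_configure() instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the limit fields of struct rte_regex_dev_info. */
struct regex_limits {
	uint16_t max_matches;
	uint16_t max_queue_pairs;
	uint32_t max_rules_per_group;
	uint16_t max_groups;
};

/* Hypothetical mirror of the sizing fields of struct rte_regex_dev_config. */
struct regex_request {
	uint16_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint32_t nb_rules_per_group;
	uint16_t nb_groups;
};

/* Return 0 if the request fits the advertised limits, -EINVAL otherwise.
 * The two documented zero defaults are resolved in place first. */
static inline int
regex_request_check(const struct regex_limits *lim, struct regex_request *req)
{
	assert(lim != NULL && req != NULL);

	if (req->nb_max_matches == 0)
		req->nb_max_matches = 1;
	if (req->nb_rules_per_group == 0)
		req->nb_rules_per_group = lim->max_rules_per_group;

	if (req->nb_max_matches > lim->max_matches ||
	    req->nb_queue_pairs > lim->max_queue_pairs ||
	    req->nb_rules_per_group > lim->max_rules_per_group ||
	    req->nb_groups > lim->max_groups)
		return -EINVAL;
	return 0;
}
```

Checking on the application side is optional, since the driver is expected to reject an oversized request anyway; it mainly gives a clearer error path before the device is touched.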
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as the device returns completion. The application should not set the out of
+ * order scan flag if it needs to maintain the ingress order of scan requests.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a RegEx device.
+ *
+ * The device can be restarted with a call to rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+__rte_experimental
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefixes detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to retrieve.
+ * @param attr_value
+ *   A pointer that will be filled in with the attribute
+ *   value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an attribute to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to set.
+ * @param attr_value
+ *   Pointer to the attribute value supplied by the application.
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation. */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database. */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database. */
+};
+
+/** Structure to hold a RegEx rule attributes. */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either RTE_REGEX_RULE_OP_ADD or
+	 * RTE_REGEX_RULE_OP_REMOVE.
+	 */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The PCRE rule flags supported by the device are
+	 * enumerated in struct rte_regex_dev_info::rule_flags. For a successful
+	 * rule database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
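The requirement that a rule carries only supported flags amounts to a mask test. The sketch below is one way to write it, under the assumption that the application has already read the supported mask from struct rte_regex_dev_info::rule_flags; rule_flags_check is a hypothetical helper, not part of the proposed API, and the two macros are copied from the definitions above only so that the fragment compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the flag definitions above so the sketch is self-contained. */
#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)

/* Store in *missing the requested flags that the device does not advertise.
 * Returns 0 when every requested flag is supported, -ENOTSUP otherwise. */
static inline int
rule_flags_check(uint64_t supported, uint64_t requested, uint64_t *missing)
{
	assert(missing != NULL);

	*missing = requested & ~supported;
	return *missing == 0 ? 0 : -ENOTSUP;
}
```

Reporting the offending bits, rather than only a status, lets the caller decide whether to drop the rule or retry it without the unsupported option.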
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the local rule set.
+ * This function only modifies the rule set in memory.
+ * In order for the changes to take effect, the function
+ * rte_regex_rule_db_compile_activate() must be called.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rule attributes to be updated in the rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update in the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated in the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed, the caller has to take
+ *   care of them, and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL: Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_compile_activate()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint32_t nb_rules);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compile the local rule set and burn the compiled result to the
+ * RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success, otherwise negative errno.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_update()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_compile_activate(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - Positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
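The NULL-buffer convention documented above suggests a two-call pattern: query the required capacity first, then allocate and fetch. The sketch below shows that pattern against a stub. names_get_fn, xstats_names_fetch and stub_names_get are hypothetical names; the function pointer only imitates the shape of rte_regex_dev_xstats_names_get() without the device identifier, and struct xstats_map is a local mirror of struct rte_regex_dev_xstats_map.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define XSTATS_NAME_SIZE 64 /* same value as RTE_REGEX_DEV_XSTATS_NAME_SIZE */

/* Local mirror of struct rte_regex_dev_xstats_map. */
struct xstats_map {
	uint16_t id;
	char name[XSTATS_NAME_SIZE];
};

/* Hypothetical getter: a NULL map returns the required number of entries. */
typedef int (*names_get_fn)(struct xstats_map *map);

/* Size query first, then allocate and fetch. On success the caller owns the
 * returned array and *count holds the number of entries; on failure NULL is
 * returned and *count holds zero or a negative error code. */
static inline struct xstats_map *
xstats_names_fetch(names_get_fn get, int *count)
{
	struct xstats_map *map;

	assert(get != NULL && count != NULL);

	*count = get(NULL);
	if (*count <= 0)
		return NULL;

	map = calloc((size_t)*count, sizeof(*map));
	if (map == NULL) {
		*count = -ENOMEM;
		return NULL;
	}

	*count = get(map);
	if (*count < 0) {
		free(map);
		return NULL;
	}
	return map;
}

/* Stub driver for the sketch: reports two made-up counters. */
static inline int
stub_names_get(struct xstats_map *map)
{
	if (map != NULL) {
		map[0].id = 0;
		snprintf(map[0].name, sizeof(map[0].name), "scans");
		map[1].id = 1;
		snprintf(map[1].name, sizeof(map[1].name), "matches");
	}
	return 2;
}
```

Note that the call as proposed carries no capacity argument, so the driver cannot verify the buffer size; the pattern relies on the counter set not changing between the two calls.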
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested.
+ * @return
+ *   - Positive value: number of stat entries filled into the values array
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param name
+ *   The stat name to retrieve.
+ * @param id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
+ *   be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ *
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters.
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   A pointer to a file for output.
+ *
+ * @return
+ *   0 on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t start_offset;
+			/**< Starting Byte Position for matched rule. */
+			RTE_STD_C11
+			union {
+				uint16_t len;
+				/**< Length of match in bytes */
+				uint16_t end_offset;
+				/**< The end offset of the match. In case
+				 * MATCH_AS_END configuration is enabled.
+				 * @see RTE_REGEX_DEV_CFG_MATCH_AS_END
+				 */
+			};
+		};
+	};
+};
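Because len and end_offset share storage, code that consumes a match has to know which reporting mode was configured. The sketch below normalizes both modes to an end offset. struct match_span is a hypothetical local mirror of the offset fields only, and treating start_offset + len as the end offset is an assumption, since the text above does not say whether the reported end is inclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the offset fields of struct rte_regex_match. */
struct match_span {
	uint16_t start_offset;
	union {
		uint16_t len;        /* valid when match-as-end is disabled */
		uint16_t end_offset; /* valid when match-as-end is enabled */
	};
};

/* Return the end offset of a match in either reporting mode. The sum is
 * computed in 32 bits so that start_offset + len cannot wrap around. */
static inline uint32_t
match_end_offset(const struct match_span *m, bool match_as_end)
{
	assert(m != NULL);

	if (match_as_end)
		return m->end_offset;
	return (uint32_t)m->start_offset + m->len;
}
```

The match_as_end argument would come from the application's own record of whether RTE_REGEX_DEV_CFG_MATCH_AS_END was set at configure time; the match record itself does not carry the mode.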
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id1 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id2 is valid */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id3 is valid */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest Rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 2)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 4)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint16_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint16_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */
+
+	/* W2 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. Minimum one group id
+	 * must be provided by application.
+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
+	 * is valid, respectively similar flags for group_id2 and group_id3.
+	 * Upon the match, struct rte_regex_match::group_id shall be updated
+	 * with matching group ID by the device. Group ID scheme provides
+	 * rule isolation and effective pattern matching.
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W3 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W4 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
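Since matches[] is a flexible array member, each op has to be allocated with room for the largest number of matches it may carry. The sketch below shows the size arithmetic on hypothetical stand-ins (match_rec, ops_hdr, regex_ops_bytes) that have the same shape as the structures above: a fixed header followed by 64-bit match records. It assumes the application sizes every op for the configured nb_max_matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_regex_match, which overlays its fields on a u64. */
struct match_rec {
	uint64_t u64;
};

/* Stand-in for struct rte_regex_ops: fixed header plus flexible array. */
struct ops_hdr {
	uint16_t nb_matches;
	struct match_rec matches[];
};

static_assert(sizeof(struct match_rec) == 8, "match record is one 64-bit word");

/* Bytes needed for one op that can report up to max_matches matches. */
static inline size_t
regex_ops_bytes(uint16_t max_matches)
{
	return sizeof(struct ops_hdr) +
	       (size_t)max_matches * sizeof(struct match_rec);
}
```

With the real structures the same expression would use sizeof(struct rte_regex_ops) and sizeof(struct rte_regex_match), for example as the element size of a mempool of ops.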
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
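One way for the caller to take care of the unconsumed tail is to offer it again, with a bound on consecutive attempts that make no progress. The sketch below shows that loop against stubs. burst_fn, enqueue_all and the two stub functions are hypothetical; the function pointer only imitates the shape of rte_regex_enqueue_burst() without the device and queue pair identifiers, and ops are represented as opaque pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst function: returns how many ops it accepted. */
typedef uint16_t (*burst_fn)(void **ops, uint16_t nb_ops);

/* Offer the unconsumed tail again until everything is accepted or
 * max_retries consecutive calls accept nothing. Returns the total number
 * of ops enqueued. */
static inline uint16_t
enqueue_all(burst_fn enqueue, void **ops, uint16_t nb_ops,
	    unsigned int max_retries)
{
	uint16_t sent = 0;
	unsigned int stalled = 0;

	assert(enqueue != NULL && ops != NULL);

	while (sent < nb_ops && stalled <= max_retries) {
		uint16_t n = enqueue(&ops[sent], (uint16_t)(nb_ops - sent));

		sent = (uint16_t)(sent + n);
		stalled = (n == 0) ? stalled + 1 : 0;
	}
	return sent;
}

/* Stub queue for the sketch: accepts at most 3 ops per call. */
static inline uint16_t
stub_enqueue(void **ops, uint16_t nb_ops)
{
	(void)ops;
	return nb_ops < 3 ? nb_ops : 3;
}

/* Stub queue for the sketch: always full. */
static inline uint16_t
stub_enqueue_full(void **ops, uint16_t nb_ops)
{
	(void)ops;
	(void)nb_ops;
	return 0;
}
```

Retrying blindly is only appropriate when the shortfall comes from a full queue; if an op is rejected for invalid parameters, the same op would be rejected again, so a real application would inspect the first unconsumed op before retrying.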
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that must
+ *   be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   entries at the end of *ops* are not written and the caller must not
+ *   treat them as dequeued operations.
+ */
+__rte_experimental
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
diff --git a/lib/librte_regexdev/rte_regexdev_version.map b/lib/librte_regexdev/rte_regexdev_version.map
new file mode 100644
index 0000000..723104d
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev_version.map
@@ -0,0 +1,26 @@
+EXPERIMENTAL {
+	global:
+
+	rte_regex_dev_count;
+	rte_regex_dev_get_dev_id;
+	rte_regex_dev_info_get;
+	rte_regex_dev_configure;
+	rte_regex_queue_pair_setup;
+	rte_regex_dev_start;
+	rte_regex_dev_stop;
+	rte_regex_dev_close;
+	rte_regex_dev_attr_get;
+	rte_regex_dev_attr_set;
+	rte_regex_rule_db_update;
+	rte_regex_rule_db_compile_activate;
+	rte_regex_rule_db_import;
+	rte_regex_rule_db_export;
+	rte_regex_dev_xstats_names_get;
+	rte_regex_dev_xstats_get;
+	rte_regex_dev_xstats_by_name_get;
+	rte_regex_dev_xstats_reset;
+	rte_regex_dev_selftest;
+	rte_regex_dev_dump;
+	rte_regex_enqueue_burst;
+	rte_regex_dequeue_burst;
+};
-- 
1.8.3.1
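
For illustration only, and not part of the patch: a minimal sketch of how a caller might handle a short return from the burst calls documented above. The `sketch_*` names and the small in-memory ring are hypothetical stand-ins for a real device queue pair; only the return-value contract (the count actually enqueued or dequeued, with unconsumed ops left at the tail of the caller's array) is taken from the comments in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for struct rte_regex_ops; only an id is needed here. */
struct sketch_op {
	uint64_t user_id;
};

/* Hypothetical ring standing in for one device queue pair. */
#define SKETCH_RING_SIZE 4u

struct sketch_qp {
	struct sketch_op *slots[SKETCH_RING_SIZE];
	uint16_t head;  /* next slot to dequeue */
	uint16_t count; /* number of queued ops */
};

/* Accept as many ops as fit and return how many were actually taken. */
static uint16_t
sketch_enqueue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count < SKETCH_RING_SIZE) {
		uint16_t tail = (uint16_t)((qp->head + qp->count) %
					   SKETCH_RING_SIZE);

		qp->slots[tail] = ops[n++];
		qp->count++;
	}
	return n;
}

/* Hand back up to nb_ops completed ops and return how many were written. */
static uint16_t
sketch_dequeue_burst(struct sketch_qp *qp, struct sketch_op **ops,
		     uint16_t nb_ops)
{
	uint16_t n = 0;

	while (n < nb_ops && qp->count > 0) {
		ops[n++] = qp->slots[qp->head];
		qp->head = (uint16_t)((qp->head + 1) % SKETCH_RING_SIZE);
		qp->count--;
	}
	return n;
}

/*
 * "Retrieve as many processed operations as possible" policy:
 * keep dequeuing until a burst comes back short or *out* is full.
 */
static uint32_t
sketch_drain(struct sketch_qp *qp, struct sketch_op **out, uint32_t out_cap,
	     uint16_t burst)
{
	uint32_t total = 0;
	uint16_t want, got = 0;

	assert(burst > 0);
	do {
		uint32_t room = out_cap - total;

		want = room < burst ? (uint16_t)room : burst;
		if (want == 0)
			break;
		got = sketch_dequeue_burst(qp, out + total, want);
		total += got;
	} while (got == want);
	return total;
}
```

When `sketch_enqueue_burst()` returns `sent < nb_ops`, the entries `ops[sent]` through `ops[nb_ops - 1]` still belong to the caller, who can resubmit them later by passing `ops + sent` and `nb_ops - sent`.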


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
@ 2020-03-01  6:13   ` Pavan Nikhilesh Bhagavatula
  2020-03-01  7:31     ` Ori Kam
  2020-03-02  7:05   ` [dpdk-dev] " Wang Xiang
  1 sibling, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-01  6:13 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, shahafs, hemant.agrawal, opher, alexr, Dovrat Zifroni,
	Prasun Kapoor, nipun.gupta, bruce.richardson, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas

Hi Ori,

Minor comments below.

<snip>

>+/**
>+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
>+ * for enqueue and dequeue operation.
>+ */
>+struct rte_regex_ops {
>+	/* W0 */
>+	uint16_t req_flags;
>+	/**< Request flags for the RegEx ops.
>+	 * @see RTE_REGEX_OPS_REQ_*
>+	 */
>+	uint16_t rsp_flags;
>+	/**< Response flags for the RegEx ops.
>+	 * @see RTE_REGEX_OPS_RSP_*
>+	 */
>+	uint16_t nb_actual_matches;
>+	/**< The total number of actual matches detected by the Regex device.*/
>+	uint16_t nb_matches;
>+	/**< The total number of matches returned by the RegEx device for this
>+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
>+	 * this value.
>+	 *
>+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
>+	 */
>+
>+	/* W1 */
>+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */

This should be *mbuf.

>+
>+	/* W2 */
>+	uint16_t group_id0;

This should be group_id1.

>+	/**< First group_id to match the rule against. Minimum one group id
>+	 * must be provided by application.
>+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1
>+	 * is valid, respectively similar flags for group_id2 and group_id3.
>+	 * Upon the match, struct rte_regex_match::group_id shall be updated
>+	 * with matching group ID by the device. Group ID scheme provides
>+	 * rule isolation and effective pattern matching.
>+	 */
>+	uint16_t group_id1;
>+	/**< Second group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>+	 */

The above `group_id1` should be removed as it's a duplicate.

>+	uint16_t group_id2;
>+	/**< Third group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>+	 */
>+	uint16_t group_id3;
>+	/**< Fourth group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>+	 */
>+
>+	/* W3 */
>+	RTE_STD_C11
>+	union {
>+		uint64_t user_id;
>+		/**< Application specific opaque value. An application may use
>+		 * this field to hold application specific value to share
>+		 * between dequeue and enqueue operation.
>+		 * Implementation should not modify this field.
>+		 */
>+		void *user_ptr;
>+		/**< Pointer representation of *user_id* */
>+	};
>+
>+	/* W4 */
>+	struct rte_regex_match matches[];
>+	/**< Zero length array to hold the match tuples.
>+	 * The struct rte_regex_ops::nb_matches value holds the number of
>+	 * elements in this array.
>+	 *
>+	 * @see struct rte_regex_ops::nb_matches
>+	 */
>+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01  6:13   ` [dpdk-dev] [EXT] " Pavan Nikhilesh Bhagavatula
@ 2020-03-01  7:31     ` Ori Kam
  2020-03-01 13:23       ` Pavan Nikhilesh Bhagavatula
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-01  7:31 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Pavan,
Thanks for the comments, please see below.

> Hi Ori,
> 
> Minor comments below.
> 
> <snip>
> 
> >+/**
> >+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
> >+ * for enqueue and dequeue operation.
> >+ */
> >+struct rte_regex_ops {
> >+	/* W0 */
> >+	uint16_t req_flags;
> >+	/**< Request flags for the RegEx ops.
> >+	 * @see RTE_REGEX_OPS_REQ_*
> >+	 */
> >+	uint16_t rsp_flags;
> >+	/**< Response flags for the RegEx ops.
> >+	 * @see RTE_REGEX_OPS_RSP_*
> >+	 */
> >+	uint16_t nb_actual_matches;
> >+	/**< The total number of actual matches detected by the
> >Regex device.*/
> >+	uint16_t nb_matches;
> >+	/**< The total number of matches returned by the RegEx
> >device for this
> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
> >will be
> >+	 * this value.
> >+	 *
> >+	 * @see struct rte_regex_ops::matches, struct
> >rte_regex_match
> >+	 */
> >+
> >+	/* W1 */
> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
> 
> This should be *mbuf.

Yes, you are correct; will fix.
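
To illustrate why the pointer matters, here is a minimal sketch (not part of the RFC) comparing the two layouts. The `sketch_*` types are hypothetical placeholders: `sketch_mbuf` assumes a 128-byte mbuf, and `sketch_match` stands in for `struct rte_regex_match`, whose fields are not shown in this mail. `sketch_ops_alloc_size()` uses the usual C flexible-array idiom for sizing `matches[]`; the RFC may well prescribe a different allocation scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder only: stands in for struct rte_mbuf, assumed 128 bytes. */
struct sketch_mbuf {
	unsigned char bytes[128];
};

/* Placeholder for struct rte_regex_match; real fields are not shown here. */
struct sketch_match {
	uint64_t opaque;
};

/* Mirror of the quoted layout with the mbuf embedded by value. */
struct sketch_ops_by_value {
	uint16_t req_flags, rsp_flags, nb_actual_matches, nb_matches; /* W0 */
	struct sketch_mbuf mbuf;                                      /* W1? */
	uint16_t group_id0, group_id1, group_id2, group_id3;
	union {
		uint64_t user_id;
		void *user_ptr;
	};
	struct sketch_match matches[];
};

/* Same layout with the suggested fix: a pointer to the source mbuf. */
struct sketch_ops_by_ptr {
	uint16_t req_flags, rsp_flags, nb_actual_matches, nb_matches; /* W0 */
	struct sketch_mbuf *mbuf;                                     /* W1 */
	uint16_t group_id0, group_id1, group_id2, group_id3;          /* W2 */
	union {                                                       /* W3 */
		uint64_t user_id;
		void *user_ptr;
	};
	struct sketch_match matches[];                                /* W4 */
};

/* Size of the fixed part that precedes matches[] in each variant. */
static size_t
sketch_header_size_by_value(void)
{
	return offsetof(struct sketch_ops_by_value, matches);
}

static size_t
sketch_header_size_by_ptr(void)
{
	return offsetof(struct sketch_ops_by_ptr, matches);
}

/* Bytes needed for one op that can report up to max_matches matches. */
static size_t
sketch_ops_alloc_size(uint16_t max_matches)
{
	return sketch_header_size_by_ptr() +
	       (size_t)max_matches * sizeof(struct sketch_match);
}
```

With the pointer, each Wn comment can correspond to one 64-bit word on a typical LP64 ABI, whereas embedding the mbuf by value pushes every later field well past the second word.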

> 
> >+
> >+	/* W2 */
> >+	uint16_t group_id0;
> 
> This should be group_id1.
> 
No, this is correct; it should be id0. We are starting from group 0.
The comment below states that the first group, meaning group 0, must be
a valid group, while group 1 doesn’t have to be valid.

> >+	/**< First group_id to match the rule against. Minimum one
> >group id
> >+	 * must be provided by application.
> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> >group_id1
> >+	 * is valid, respectively similar flags for group_id2 and group_id3.
> >+	 * Upon the match, struct rte_regex_match::group_id shall be
> >updated
> >+	 * with matching group ID by the device. Group ID scheme
> >provides
> >+	 * rule isolation and effective pattern matching.
> >+	 */
> >+	uint16_t group_id1;
> >+	/**< Second group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >+	 */
> 
> The above `group_id1` should be removed as it's a duplicate.
> 

This is not a duplicate, see above comment.

> >+	uint16_t group_id2;
> >+	/**< Third group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >+	 */
> >+	uint16_t group_id3;
> >+	/**< Forth group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >+	 */
> >+
> >+	/* W3 */
> >+	RTE_STD_C11
> >+	union {
> >+		uint64_t user_id;
> >+		/**< Application specific opaque value. An application
> >may use
> >+		 * this field to hold application specific value to share
> >+		 * between dequeue and enqueue operation.
> >+		 * Implementation should not modify this field.
> >+		 */
> >+		void *user_ptr;
> >+		/**< Pointer representation of *user_id* */
> >+	};
> >+
> >+	/* W4 */
> >+	struct rte_regex_match matches[];
> >+	/**< Zero length array to hold the match tuples.
> >+	 * The struct rte_regex_ops::nb_matches value holds the
> >number of
> >+	 * elements in this array.
> >+	 *
> >+	 * @see struct rte_regex_ops::nb_matches
> >+	 */
> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01  7:31     ` Ori Kam
@ 2020-03-01 13:23       ` Pavan Nikhilesh Bhagavatula
  2020-03-01 14:10         ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-01 13:23 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,

>
>Hi Pavan,
>Thanks for the comments, please see below.
>
>> Hi Ori,
>>
>> Minor comments below.
>>
>> <snip>
>>
>> >+/**
>> >+ * The generic *rte_regex_ops* structure to hold the RegEx
>attributes
>> >+ * for enqueue and dequeue operation.
>> >+ */
>> >+struct rte_regex_ops {
>> >+	/* W0 */
>> >+	uint16_t req_flags;
>> >+	/**< Request flags for the RegEx ops.
>> >+	 * @see RTE_REGEX_OPS_REQ_*
>> >+	 */
>> >+	uint16_t rsp_flags;
>> >+	/**< Response flags for the RegEx ops.
>> >+	 * @see RTE_REGEX_OPS_RSP_*
>> >+	 */
>> >+	uint16_t nb_actual_matches;
>> >+	/**< The total number of actual matches detected by the
>> >Regex device.*/
>> >+	uint16_t nb_matches;
>> >+	/**< The total number of matches returned by the RegEx
>> >device for this
>> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
>> >will be
>> >+	 * this value.
>> >+	 *
>> >+	 * @see struct rte_regex_ops::matches, struct
>> >rte_regex_match
>> >+	 */
>> >+
>> >+	/* W1 */
>> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
>>
>> This should be *mbuf.
>
>Yes, you are correct; will fix.
>
>>
>> >+
>> >+	/* W2 */
>> >+	uint16_t group_id0;
>>
>> This should be group_id1.
>>
>No, this is correct; it should be id0. We are starting from group 0.
>The comment below states that the first group, meaning group 0, must be
>a valid group, while group 1 doesn’t have to be valid.

Would that mean that group_id0 is always valid,
since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F` flag?

>
>> >+	/**< First group_id to match the rule against. Minimum one
>> >group id
>> >+	 * must be provided by application.
>> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
>> >group_id1
>> >+	 * is valid, respectively similar flags for group_id2 and group_id3.
>> >+	 * Upon the match, struct rte_regex_match::group_id shall be
>> >updated
>> >+	 * with matching group ID by the device. Group ID scheme
>> >provides
>> >+	 * rule isolation and effective pattern matching.
>> >+	 */
>> >+	uint16_t group_id1;
>> >+	/**< Second group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>> >+	 */
>>
>> The above `group_id1` should be removed as it's a duplicate.
>>
>
>This is not duplicate, see above comment.
>
>> >+	uint16_t group_id2;
>> >+	/**< Third group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>> >+	 */
>> >+	uint16_t group_id3;
>> >+	/**< Forth group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>> >+	 */
>> >+
>> >+	/* W3 */
>> >+	RTE_STD_C11
>> >+	union {
>> >+		uint64_t user_id;
>> >+		/**< Application specific opaque value. An application
>> >may use
>> >+		 * this field to hold application specific value to share
>> >+		 * between dequeue and enqueue operation.
>> >+		 * Implementation should not modify this field.
>> >+		 */
>> >+		void *user_ptr;
>> >+		/**< Pointer representation of *user_id* */
>> >+	};
>> >+
>> >+	/* W4 */
>> >+	struct rte_regex_match matches[];
>> >+	/**< Zero length array to hold the match tuples.
>> >+	 * The struct rte_regex_ops::nb_matches value holds the
>> >number of
>> >+	 * elements in this array.
>> >+	 *
>> >+	 * @see struct rte_regex_ops::nb_matches
>> >+	 */
>> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01 13:23       ` Pavan Nikhilesh Bhagavatula
@ 2020-03-01 14:10         ` Ori Kam
  2020-03-01 14:38           ` Pavan Nikhilesh Bhagavatula
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-01 14:10 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Pavan,

> Hi Ori,
> 
> >
> >Hi Pavan,
> >Thanks for the comments, please see below.
> >
> >> Hi Ori,
> >>
> >> Minor comments below.
> >>
> >> <snip>
> >>
> >> >+/**
> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
> >attributes
> >> >+ * for enqueue and dequeue operation.
> >> >+ */
> >> >+struct rte_regex_ops {
> >> >+	/* W0 */
> >> >+	uint16_t req_flags;
> >> >+	/**< Request flags for the RegEx ops.
> >> >+	 * @see RTE_REGEX_OPS_REQ_*
> >> >+	 */
> >> >+	uint16_t rsp_flags;
> >> >+	/**< Response flags for the RegEx ops.
> >> >+	 * @see RTE_REGEX_OPS_RSP_*
> >> >+	 */
> >> >+	uint16_t nb_actual_matches;
> >> >+	/**< The total number of actual matches detected by the
> >> >Regex device.*/
> >> >+	uint16_t nb_matches;
> >> >+	/**< The total number of matches returned by the RegEx
> >> >device for this
> >> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
> >> >will be
> >> >+	 * this value.
> >> >+	 *
> >> >+	 * @see struct rte_regex_ops::matches, struct
> >> >rte_regex_match
> >> >+	 */
> >> >+
> >> >+	/* W1 */
> >> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
> >>
> >> This should be *mbuf.
> >
> >Yes, you are correct; will fix.
> >
> >>
> >> >+
> >> >+	/* W2 */
> >> >+	uint16_t group_id0;
> >>
> >> This should be group_id1.
> >>
> >No, this is correct; it should be id0. We are starting from group 0.
> >The comment below states that the first group, meaning group 0, must be
> >a valid group, while group 1 doesn’t have to be valid.
> 
> Would that mean that group_id0 is always valid,
> since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F` flag?
> 
Yes, you must have at least one group.

> >
> >> >+	/**< First group_id to match the rule against. Minimum one
> >> >group id
> >> >+	 * must be provided by application.
> >> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> >> >group_id1
> >> >+	 * is valid, respectively similar flags for group_id2 and group_id3.
> >> >+	 * Upon the match, struct rte_regex_match::group_id shall be
> >> >updated
> >> >+	 * with matching group ID by the device. Group ID scheme
> >> >provides
> >> >+	 * rule isolation and effective pattern matching.
> >> >+	 */
> >> >+	uint16_t group_id1;
> >> >+	/**< Second group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >> >+	 */
> >>
> >> The above `group_id1` should be removed as it's a duplicate.
> >>
> >
> >This is not duplicate, see above comment.
> >
> >> >+	uint16_t group_id2;
> >> >+	/**< Third group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >> >+	 */
> >> >+	uint16_t group_id3;
> >> >+	/**< Forth group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >> >+	 */
> >> >+
> >> >+	/* W3 */
> >> >+	RTE_STD_C11
> >> >+	union {
> >> >+		uint64_t user_id;
> >> >+		/**< Application specific opaque value. An application
> >> >may use
> >> >+		 * this field to hold application specific value to share
> >> >+		 * between dequeue and enqueue operation.
> >> >+		 * Implementation should not modify this field.
> >> >+		 */
> >> >+		void *user_ptr;
> >> >+		/**< Pointer representation of *user_id* */
> >> >+	};
> >> >+
> >> >+	/* W4 */
> >> >+	struct rte_regex_match matches[];
> >> >+	/**< Zero length array to hold the match tuples.
> >> >+	 * The struct rte_regex_ops::nb_matches value holds the
> >> >number of
> >> >+	 * elements in this array.
> >> >+	 *
> >> >+	 * @see struct rte_regex_ops::nb_matches
> >> >+	 */
> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01 14:10         ` Ori Kam
@ 2020-03-01 14:38           ` Pavan Nikhilesh Bhagavatula
  2020-03-01 15:41             ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-01 14:38 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,

>
>Hi Pavan,
>
>> Hi Ori,
>>
>> >
>> >Hi Pavan,
>> >Thanks for the comments, please see below.
>> >
>> >> Hi Ori,
>> >>
>> >> Minor comments below.
>> >>
>> >> <snip>
>> >>
>> >> >+/**
>> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
>> >attributes
>> >> >+ * for enqueue and dequeue operation.
>> >> >+ */
>> >> >+struct rte_regex_ops {
>> >> >+	/* W0 */
>> >> >+	uint16_t req_flags;
>> >> >+	/**< Request flags for the RegEx ops.
>> >> >+	 * @see RTE_REGEX_OPS_REQ_*
>> >> >+	 */
>> >> >+	uint16_t rsp_flags;
>> >> >+	/**< Response flags for the RegEx ops.
>> >> >+	 * @see RTE_REGEX_OPS_RSP_*
>> >> >+	 */
>> >> >+	uint16_t nb_actual_matches;
>> >> >+	/**< The total number of actual matches detected by the
>> >> >Regex device.*/
>> >> >+	uint16_t nb_matches;
>> >> >+	/**< The total number of matches returned by the RegEx
>> >> >device for this
>> >> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
>> >> >will be
>> >> >+	 * this value.
>> >> >+	 *
>> >> >+	 * @see struct rte_regex_ops::matches, struct
>> >> >rte_regex_match
>> >> >+	 */
>> >> >+
>> >> >+	/* W1 */
>> >> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
>> >>
>> >> This should be *mbuf.
>> >
>> >Yes, you are correct; will fix.
>> >
>> >>
>> >> >+
>> >> >+	/* W2 */
>> >> >+	uint16_t group_id0;
>> >>
>> >> This should be group_id1.
>> >>
>> >No, this is correct; it should be id0. We are starting from group 0.
>> >The comment below states that the first group, meaning group 0, must be
>> >a valid group, while group 1 doesn’t have to be valid.
>>
>> Would that mean that group_id0 is always valid,
>> since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F` flag?
>>
>Yes, you must have at least one group.

Makes sense. I think we need to update the comment a bit, as it only says that
at least one group must be provided; it should say that group_id0 must always
be valid.

(An application can erroneously set valid group_id1 instead of group_id0) 
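
A minimal sketch of the rule being discussed, for illustration only: group_id0 is consumed unconditionally, and each further group needs its VALID flag. The `SKETCH_*` bit values and the `sketch_req` struct are assumptions (the real flag values are not quoted in this thread), and holding the group IDs in an array is only a convenience for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; the real RTE_REGEX_OPS_REQ_* values may differ. */
#define SKETCH_REQ_GROUP_ID1_VALID_F (1u << 0)
#define SKETCH_REQ_GROUP_ID2_VALID_F (1u << 1)
#define SKETCH_REQ_GROUP_ID3_VALID_F (1u << 2)
#define SKETCH_REQ_GROUP_MASK \
	(SKETCH_REQ_GROUP_ID1_VALID_F | SKETCH_REQ_GROUP_ID2_VALID_F | \
	 SKETCH_REQ_GROUP_ID3_VALID_F)

/* Hypothetical subset of the request fields of struct rte_regex_ops. */
struct sketch_req {
	uint16_t req_flags;
	uint16_t group_id[4]; /* group_id0 .. group_id3 */
};

/*
 * Fill the group IDs in order, starting at group_id0, so that an
 * application cannot mark group_id1 valid while leaving group_id0 unset.
 * Returns 0 on success, -1 if nb_ids is outside [1, 4].
 */
static int
sketch_set_groups(struct sketch_req *req, const uint16_t *ids,
		  unsigned int nb_ids)
{
	/* No flag exists for group_id0: it is always treated as valid. */
	static const uint16_t valid_f[4] = {
		0,
		SKETCH_REQ_GROUP_ID1_VALID_F,
		SKETCH_REQ_GROUP_ID2_VALID_F,
		SKETCH_REQ_GROUP_ID3_VALID_F,
	};
	unsigned int i;

	if (nb_ids < 1 || nb_ids > 4)
		return -1;

	req->req_flags &= (uint16_t)~SKETCH_REQ_GROUP_MASK;
	for (i = 0; i < nb_ids; i++) {
		req->group_id[i] = ids[i];
		req->req_flags |= valid_f[i];
	}
	return 0;
}

/* Number of groups that would be scanned: group_id0 plus flagged ones. */
static unsigned int
sketch_nb_groups(const struct sketch_req *req)
{
	unsigned int n = 1; /* group_id0 */

	if (req->req_flags & SKETCH_REQ_GROUP_ID1_VALID_F)
		n++;
	if (req->req_flags & SKETCH_REQ_GROUP_ID2_VALID_F)
		n++;
	if (req->req_flags & SKETCH_REQ_GROUP_ID3_VALID_F)
		n++;
	return n;
}
```

Because there is no flag for group_id0, a request with no VALID bits set still scans one group, which is why the comment should say so explicitly.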

>
>> >
>> >> >+	/**< First group_id to match the rule against. Minimum one
>> >> >group id
>> >> >+	 * must be provided by application.
>> >> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
>> >> >group_id1
>> >> >+	 * is valid, respectively similar flags for group_id2 and group_id3.
>> >> >+	 * Upon the match, struct rte_regex_match::group_id shall be
>> >> >updated
>> >> >+	 * with matching group ID by the device. Group ID scheme
>> >> >provides
>> >> >+	 * rule isolation and effective pattern matching.
>> >> >+	 */
>> >> >+	uint16_t group_id1;
>> >> >+	/**< Second group_id to match the rule against.
>> >> >+	 *
>> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>> >> >+	 */
>> >>
>> >> The above `group_id1` should be removed as it's a duplicate.
>> >>
>> >
>> >This is not duplicate, see above comment.
>> >
>> >> >+	uint16_t group_id2;
>> >> >+	/**< Third group_id to match the rule against.
>> >> >+	 *
>> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>> >> >+	 */
>> >> >+	uint16_t group_id3;
>> >> >+	/**< Forth group_id to match the rule against.
>> >> >+	 *
>> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>> >> >+	 */
>> >> >+
>> >> >+	/* W3 */
>> >> >+	RTE_STD_C11
>> >> >+	union {
>> >> >+		uint64_t user_id;
>> >> >+		/**< Application specific opaque value. An application
>> >> >may use
>> >> >+		 * this field to hold application specific value to share
>> >> >+		 * between dequeue and enqueue operation.
>> >> >+		 * Implementation should not modify this field.
>> >> >+		 */
>> >> >+		void *user_ptr;
>> >> >+		/**< Pointer representation of *user_id* */
>> >> >+	};
>> >> >+
>> >> >+	/* W4 */
>> >> >+	struct rte_regex_match matches[];
>> >> >+	/**< Zero length array to hold the match tuples.
>> >> >+	 * The struct rte_regex_ops::nb_matches value holds the
>> >> >number of
>> >> >+	 * elements in this array.
>> >> >+	 *
>> >> >+	 * @see struct rte_regex_ops::nb_matches
>> >> >+	 */
>> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01 14:38           ` Pavan Nikhilesh Bhagavatula
@ 2020-03-01 15:41             ` Ori Kam
  2020-03-01 15:57               ` Pavan Nikhilesh Bhagavatula
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-01 15:41 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Pavan,


> Hi Ori,
> 
> >
> >Hi Pavan,
> >
> >> Hi Ori,
> >>
> >> >
> >> >Hi Pavan,
> >> >Thanks for the comments please see below.
> >> >
> >> >> -----Original Message-----
> >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
> >Nikhilesh
> >> >Bhagavatula
> >> >> Sent: Sunday, March 1, 2020 8:13 AM
> >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> >> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> >> >Alex
> >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> >> ><dovrat@marvell.com>;
> >> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> >> >harry.chang@intel.com;
> >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> >> >wushuai@inspur.com;
> >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> >> >> <thomas@monjalon.net>
> >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
> >regexdev
> >> >subsystem
> >> >>
> >> >> Hi Ori,
> >> >>
> >> >> Minor comments below.
> >> >>
> >> >> <snip>
> >> >>
> >> >> >+/**
> >> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
> >> >attributes
> >> >> >+ * for enqueue and dequeue operation.
> >> >> >+ */
> >> >> >+struct rte_regex_ops {
> >> >> >+	/* W0 */
> >> >> >+	uint16_t req_flags;
> >> >> >+	/**< Request flags for the RegEx ops.
> >> >> >+	 * @see RTE_REGEX_OPS_REQ_*
> >> >> >+	 */
> >> >> >+	uint16_t rsp_flags;
> >> >> >+	/**< Response flags for the RegEx ops.
> >> >> >+	 * @see RTE_REGEX_OPS_RSP_*
> >> >> >+	 */
> >> >> >+	uint16_t nb_actual_matches;
> >> >> >+	/**< The total number of actual matches detected by the
> >> >> >Regex device.*/
> >> >> >+	uint16_t nb_matches;
> >> >> >+	/**< The total number of matches returned by the RegEx
> >> >> >device for this
> >> >> >+	 * scan. The size of *rte_regex_ops::matches* zero length
> array
> >> >> >will be
> >> >> >+	 * this value.
> >> >> >+	 *
> >> >> >+	 * @see struct rte_regex_ops::matches, struct
> >> >> >rte_regex_match
> >> >> >+	 */
> >> >> >+
> >> >> >+	/* W1 */
> >> >> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in. */
> >> >>
> >> >> This should be *mbuf.
> >> >
> >> >Yes you are correct will fix.
> >> >
> >> >>
> >> >> >+
> >> >> >+	/* W2 */
> >> >> >+	uint16_t group_id0;
> >> >>
> >> >> This should be group_id1.
> >> >>
> >> >No this is correct is should be id0. We are starting from group 0.
> >> >The comment below states that the first group, meaning group 0
> >must
> >> >be
> >> >valid group while group 1 doesn’t have to be vaild.
> >>
> >> Would that mean that group_id0 is always valid?
> >> Since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F` flag.
> >>
> >Yes, you must have at least one group.
> 
> Makes sense, I think we need to update the comment a bit as it only mentions
> that
> at least one group but it should be group_id0 has to be always valid.
> 
> (An application can erroneously set valid group_id1 instead of group_id0)
> 

What about the next comment?
/**< First group_id to match the rule against. This group must be valid. In
  * order to support more groups (up to 4 groups), the group number should
  * be set. For example, to enable group 1, group_id1 should be set
  * with the group value and the RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag
  * should be set.
  * Respectively similar flags for group_id2 and group_id3.
  * Upon the match, struct rte_regex_match::group_id shall be updated
  * with matching group ID by the device. Group ID scheme provides
  * rule isolation and effective pattern matching.
  */

/**< First group_id to match the rule against. Minimum one group id       
  * must be provided by application.                                       
  * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then group_id1            
  * is valid, respectively similar flags for group_id2 and group_id3.      
  * Upon the match, struct rte_regex_match::group_id shall be updated      
  * with matching group ID by the device. Group ID scheme provides         
  * rule isolation and effective pattern matching.                         

> >
> >> >
> >> >> >+	/**< First group_id to match the rule against. Minimum one
> >> >> >group id
> >> >> >+	 * must be provided by application.
> >> >> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> >> >> >group_id1
> >> >> >+	 * is valid, respectively similar flags for group_id2 and
> group_id3.
> >> >> >+	 * Upon the match, struct rte_regex_match::group_id shall be
> >> >> >updated
> >> >> >+	 * with matching group ID by the device. Group ID scheme
> >> >> >provides
> >> >> >+	 * rule isolation and effective pattern matching.
> >> >> >+	 */
> >> >> >+	uint16_t group_id1;
> >> >> >+	/**< Second group_id to match the rule against.
> >> >> >+	 *
> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >> >> >+	 */
> >> >>
> >> >> The above `group_id1` should be removed as its duplicate.
> >> >>
> >> >
> >> >This is not duplicate, see above comment.
> >> >
> >> >> >+	uint16_t group_id2;
> >> >> >+	/**< Third group_id to match the rule against.
> >> >> >+	 *
> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >> >> >+	 */
> >> >> >+	uint16_t group_id3;
> >> >> >+	/**< Forth group_id to match the rule against.
> >> >> >+	 *
> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >> >> >+	 */
> >> >> >+
> >> >> >+	/* W3 */
> >> >> >+	RTE_STD_C11
> >> >> >+	union {
> >> >> >+		uint64_t user_id;
> >> >> >+		/**< Application specific opaque value. An application
> >> >> >may use
> >> >> >+		 * this field to hold application specific value to share
> >> >> >+		 * between dequeue and enqueue operation.
> >> >> >+		 * Implementation should not modify this field.
> >> >> >+		 */
> >> >> >+		void *user_ptr;
> >> >> >+		/**< Pointer representation of *user_id* */
> >> >> >+	};
> >> >> >+
> >> >> >+	/* W4 */
> >> >> >+	struct rte_regex_match matches[];
> >> >> >+	/**< Zero length array to hold the match tuples.
> >> >> >+	 * The struct rte_regex_ops::nb_matches value holds the
> >> >> >number of
> >> >> >+	 * elements in this array.
> >> >> >+	 *
> >> >> >+	 * @see struct rte_regex_ops::nb_matches
> >> >> >+	 */
> >> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01 15:41             ` Ori Kam
@ 2020-03-01 15:57               ` Pavan Nikhilesh Bhagavatula
  2020-03-02  7:18                 ` Jerin Jacob
  0 siblings, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-01 15:57 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,

>
>Hi Pavan,
>
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh
>Bhagavatula
>> Sent: Sunday, March 1, 2020 4:38 PM
>> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
>> <jerinj@marvell.com>; xiang.w.wang@intel.com
>> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
>> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
>Alex
>> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
><dovrat@marvell.com>;
>> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
>> bruce.richardson@intel.com; yang.a.hong@intel.com;
>harry.chang@intel.com;
>> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
>> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
>wushuai@inspur.com;
>> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
>> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
>> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
>> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
>> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
>> <thomas@monjalon.net>
>> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev
>subsystem
>>
>> Hi Ori,
>>
>> >
>> >Hi Pavan,
>> >
>> >> -----Original Message-----
>> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
>Nikhilesh
>> >Bhagavatula
>> >> Sent: Sunday, March 1, 2020 3:23 PM
>> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
>> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
>> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
>> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
>> >Alex
>> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
>> ><dovrat@marvell.com>;
>> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
>> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
>> >harry.chang@intel.com;
>> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
>> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
>> >wushuai@inspur.com;
>> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
>> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
>> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
>> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
>> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
>> >> <thomas@monjalon.net>
>> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
>regexdev
>> >subsystem
>> >>
>> >> Hi Ori,
>> >>
>> >> >
>> >> >Hi Pavan,
>> >> >Thanks for the comments please see below.
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
>> >Nikhilesh
>> >> >Bhagavatula
>> >> >> Sent: Sunday, March 1, 2020 8:13 AM
>> >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
>> >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
>> >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
>> >> >> hemant.agrawal@nxp.com; Opher Reviv
><opher@mellanox.com>;
>> >> >Alex
>> >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
>> >> ><dovrat@marvell.com>;
>> >> >> Prasun Kapoor <pkapoor@marvell.com>;
>nipun.gupta@nxp.com;
>> >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
>> >> >harry.chang@intel.com;
>> >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
>> >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
>> >> >wushuai@inspur.com;
>> >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
>> >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
>> >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
>> >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com;
>deri@ntop.org;
>> >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
>> >> >> <thomas@monjalon.net>
>> >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
>> >regexdev
>> >> >subsystem
>> >> >>
>> >> >> Hi Ori,
>> >> >>
>> >> >> Minor comments below.
>> >> >>
>> >> >> <snip>
>> >> >>
>> >> >> >+/**
>> >> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
>> >> >attributes
>> >> >> >+ * for enqueue and dequeue operation.
>> >> >> >+ */
>> >> >> >+struct rte_regex_ops {
>> >> >> >+	/* W0 */
>> >> >> >+	uint16_t req_flags;
>> >> >> >+	/**< Request flags for the RegEx ops.
>> >> >> >+	 * @see RTE_REGEX_OPS_REQ_*
>> >> >> >+	 */
>> >> >> >+	uint16_t rsp_flags;
>> >> >> >+	/**< Response flags for the RegEx ops.
>> >> >> >+	 * @see RTE_REGEX_OPS_RSP_*
>> >> >> >+	 */
>> >> >> >+	uint16_t nb_actual_matches;
>> >> >> >+	/**< The total number of actual matches detected by
>the
>> >> >> >Regex device.*/
>> >> >> >+	uint16_t nb_matches;
>> >> >> >+	/**< The total number of matches returned by the
>RegEx
>> >> >> >device for this
>> >> >> >+	 * scan. The size of *rte_regex_ops::matches* zero
>length
>> array
>> >> >> >will be
>> >> >> >+	 * this value.
>> >> >> >+	 *
>> >> >> >+	 * @see struct rte_regex_ops::matches, struct
>> >> >> >rte_regex_match
>> >> >> >+	 */
>> >> >> >+
>> >> >> >+	/* W1 */
>> >> >> >+	struct rte_mbuf mbuf; /**< source mbuf, to search in.
>*/
>> >> >>
>> >> >> This should be *mbuf.
>> >> >
>> >> >Yes you are correct will fix.
>> >> >
>> >> >>
>> >> >> >+
>> >> >> >+	/* W2 */
>> >> >> >+	uint16_t group_id0;
>> >> >>
>> >> >> This should be group_id1.
>> >> >>
>> >> >No this is correct is should be id0. We are starting from group 0.
>> >> >The comment below states that the first group, meaning group 0
>> >must
>> >> >be
>> >> >valid group while group 1 doesn’t have to be vaild.
>> >>
>> >> Would that mean that group_id0 is always valid?
>> >> Since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F`
>flag.
>> >>
>> >Yes, you must have at least one group.
>>
>> Makes sense, I think we need to update the comment a bit as it only
>mentions
>> that
>> at least one group but it should be group_id0 has to be always valid.
>>
>> (An application can erroneously set valid group_id1 instead of
>group_id0)
>>
>
>What about the next comment?
>/**< First group_id to match the rule against. This group must be valid.
>In
>  * order to support more group (up to 4 groups). The group number
>should
>  * be set. For example to enable group 1 group_id1 should be set
>  * with the group value and  and the
>RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag should be set.
>  * Respectively similar flags for group_id2 and group_id3.
>  * Upon the match, struct rte_regex_match::group_id shall be updated
>  * with matching group ID by the device. Group ID scheme provides
>  * rule isolation and effective pattern matching.
>*/

Looks good with minor corrections as below

/**< First group_id to match the rule against. This group must be valid.
  * In order to support more than one group per op (up to 4 groups), any of
  * group_id<1-3> should hold a valid group ID, with the matching
  * RTE_REGEX_OPS_REQ_GROUP_ID<1-3>_VALID_F flag set.
  * For example, to match against groups 100 and 101, group_id0 should be
  * set to 100, group_id1 should be set to 101, and the
  * RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag should be set.
  * Respectively similar flags for group_id2 and group_id3.
  * Upon the match, struct rte_regex_match::group_id shall be updated
  * with matching group ID by the device. Group ID scheme provides
  * rule isolation and effective pattern matching.
*/
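
To make the intended usage concrete, here is a rough sketch of how an
application might fill the group fields under this wording. It is only an
illustration: struct regex_ops_sketch is a trimmed-down stand-in for
struct rte_regex_ops, the SKETCH_* macros stand in for
RTE_REGEX_OPS_REQ_GROUP_ID<1-3>_VALID_F with assumed bit positions, and
regex_ops_sketch_set_groups() is a hypothetical helper, not part of the RFC.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; the RFC defines the real ones. */
#define SKETCH_GROUP_ID1_VALID_F (1 << 0)
#define SKETCH_GROUP_ID2_VALID_F (1 << 1)
#define SKETCH_GROUP_ID3_VALID_F (1 << 2)
#define SKETCH_GROUP_VALID_MASK \
	(SKETCH_GROUP_ID1_VALID_F | SKETCH_GROUP_ID2_VALID_F | \
	 SKETCH_GROUP_ID3_VALID_F)

/* Trimmed-down stand-in for the request side of struct rte_regex_ops. */
struct regex_ops_sketch {
	uint16_t req_flags;
	uint16_t group_id0; /* no flag: always valid under the proposed wording */
	uint16_t group_id1;
	uint16_t group_id2;
	uint16_t group_id3;
};

/*
 * Hypothetical helper: fill the group fields from an array of 1..4 group IDs.
 * groups[0] always lands in group_id0; every further group sets its own
 * validity flag. Returns 0 on success, -1 on bad arguments.
 */
static int
regex_ops_sketch_set_groups(struct regex_ops_sketch *op,
			    const uint16_t *groups, unsigned int nb_groups)
{
	if (op == NULL || groups == NULL || nb_groups < 1 || nb_groups > 4)
		return -1;

	/* Clear stale validity bits left over from a previous use of this op. */
	op->req_flags &= (uint16_t)~SKETCH_GROUP_VALID_MASK;

	op->group_id0 = groups[0];
	if (nb_groups > 1) {
		op->group_id1 = groups[1];
		op->req_flags |= SKETCH_GROUP_ID1_VALID_F;
	}
	if (nb_groups > 2) {
		op->group_id2 = groups[2];
		op->req_flags |= SKETCH_GROUP_ID2_VALID_F;
	}
	if (nb_groups > 3) {
		op->group_id3 = groups[3];
		op->req_flags |= SKETCH_GROUP_ID3_VALID_F;
	}
	return 0;
}
```

With groups {100, 101} this leaves group_id0 = 100, group_id1 = 101 and only
the ID1 flag set, which is the example given in the comment text. Funnelling
the first group into group_id0 is what prevents the "valid group_id1 but no
group_id0" mistake mentioned earlier.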

Thanks,
Pavan.

>
>/**< First group_id to match the rule against. Minimum one group id
>  * must be provided by application.
>  * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
>group_id1
>  * is valid, respectively similar flags for group_id2 and group_id3.
>  * Upon the match, struct rte_regex_match::group_id shall be updated
>  * with matching group ID by the device. Group ID scheme provides
>  * rule isolation and effective pattern matching.
>
>> >
>> >> >
>> >> >> >+	/**< First group_id to match the rule against. Minimum
>one
>> >> >> >group id
>> >> >> >+	 * must be provided by application.
>> >> >> >+	 * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>set then
>> >> >> >group_id1
>> >> >> >+	 * is valid, respectively similar flags for group_id2 and
>> group_id3.
>> >> >> >+	 * Upon the match, struct rte_regex_match::group_id
>shall be
>> >> >> >updated
>> >> >> >+	 * with matching group ID by the device. Group ID
>scheme
>> >> >> >provides
>> >> >> >+	 * rule isolation and effective pattern matching.
>> >> >> >+	 */
>> >> >> >+	uint16_t group_id1;
>> >> >> >+	/**< Second group_id to match the rule against.
>> >> >> >+	 *
>> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>> >> >> >+	 */
>> >> >>
>> >> >> The above `group_id1` should be removed as its duplicate.
>> >> >>
>> >> >
>> >> >This is not duplicate, see above comment.
>> >> >
>> >> >> >+	uint16_t group_id2;
>> >> >> >+	/**< Third group_id to match the rule against.
>> >> >> >+	 *
>> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>> >> >> >+	 */
>> >> >> >+	uint16_t group_id3;
>> >> >> >+	/**< Forth group_id to match the rule against.
>> >> >> >+	 *
>> >> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>> >> >> >+	 */
>> >> >> >+
>> >> >> >+	/* W3 */
>> >> >> >+	RTE_STD_C11
>> >> >> >+	union {
>> >> >> >+		uint64_t user_id;
>> >> >> >+		/**< Application specific opaque value. An
>application
>> >> >> >may use
>> >> >> >+		 * this field to hold application specific value to
>share
>> >> >> >+		 * between dequeue and enqueue operation.
>> >> >> >+		 * Implementation should not modify this field.
>> >> >> >+		 */
>> >> >> >+		void *user_ptr;
>> >> >> >+		/**< Pointer representation of *user_id* */
>> >> >> >+	};
>> >> >> >+
>> >> >> >+	/* W4 */
>> >> >> >+	struct rte_regex_match matches[];
>> >> >> >+	/**< Zero length array to hold the match tuples.
>> >> >> >+	 * The struct rte_regex_ops::nb_matches value holds
>the
>> >> >> >number of
>> >> >> >+	 * elements in this array.
>> >> >> >+	 *
>> >> >> >+	 * @see struct rte_regex_ops::nb_matches
>> >> >> >+	 */
>> >> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
  2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
  2020-03-01  6:13   ` [dpdk-dev] [EXT] " Pavan Nikhilesh Bhagavatula
@ 2020-03-02  7:05   ` Wang Xiang
  2020-03-03  7:44     ` Ori Kam
  1 sibling, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-03-02  7:05 UTC (permalink / raw)
  To: Ori Kam
  Cc: jerinj, dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr,
	dovrat, pkapoor, nipun.gupta, bruce.richardson, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas

Hi Ori,

Comments below.

Thanks,
Xiang

On Thu, Feb 27, 2020 at 03:08:35PM +0000, Ori Kam wrote:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of standard API, It is diffcult for DPDK consumer to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> This RFC crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> * Next Generation Firewalls (NGFW)
> * Deep Packet and Flow Inspection (DPI)
> * Intrusion Prevention Systems (IPS)
> * DDoS Mitigation
> * Network Monitoring
> * Data Loss Prevention (DLP)
> * Smart NICs
> * Grammar based content processing
> * URL, spam and adware filtering
> * Advanced auditing and policing of user/application security policies
> * Financial data mining - parsing of streamed financial feeds
> * Application recognition.
> * Dmemory introspection.
> * Natural Language Processing (NLP)
> * Sentiment Analysis.
> * Big data databse acceleration.
> * Computational storage.
> 
> Request to review from HW and SW RegEx vendors and RegEx application
> users to have portable DPDK API for RegEx.
> 
> The API schematics are based cryptodev, eventdev and ethdev existing
> device API.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Signed-off-by: Ori Kam <orika@mellanox.com>
> ---
> V5:
>  * Remove unused iov struct.
> V4:
>  * Replace iov with mbuf.
>  * Small ML comments.
> V3:
>  * Change subject title.
> V2:
>  * Address ML comments.
> ---
> +
> +#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
> +/**< RegEx device support PCRE Greedy mode.
> + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the pattern 'AB12345' will be matched completely
> + * where as the ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +

Hyperscan actually supports "match all" semantics, neither greedy nor ungreedy,
which differs from PCRE. In the case above, AB, AB1, ..., AB12345 will all
be returned as matches. Do HW solutions support this?
Can we add a new flag like RTE_REGEX_DEV_SUPP_PCRE_MATCHALL_F?
Similarly, we can define a rule flag RTE_REGEX_PCRE_RULE_MATCHALL_F so Hyperscan
users have to set this flag during rule compile.
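
To illustrate the three semantics side by side, here is a small sketch. It
does not use PCRE or Hyperscan; match_ab_digits() is a hypothetical
hand-rolled matcher for the single pattern AB\d* anchored at the start of the
input, and the enum names are made up for this example. It only shows which
match end offsets each mode would report.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative names only; not part of the RFC, PCRE or Hyperscan. */
enum match_mode { MODE_GREEDY, MODE_UNGREEDY, MODE_MATCHALL };

/*
 * Match AB\d* anchored at buf[0]. Writes the end offset (exclusive) of each
 * reported match into ends[] and returns the number of matches reported.
 */
static size_t
match_ab_digits(const char *buf, enum match_mode mode,
		size_t *ends, size_t max_ends)
{
	size_t len = strlen(buf);
	size_t digits = 0, n = 0, i;

	if (len < 2 || buf[0] != 'A' || buf[1] != 'B')
		return 0;

	/* Count the run of digits that follows "AB". */
	while (2 + digits < len && isdigit((unsigned char)buf[2 + digits]))
		digits++;

	switch (mode) {
	case MODE_GREEDY:	/* longest match only */
		if (n < max_ends)
			ends[n++] = 2 + digits;
		break;
	case MODE_UNGREEDY:	/* shortest match only */
		if (n < max_ends)
			ends[n++] = 2;
		break;
	case MODE_MATCHALL:	/* every prefix that satisfies the pattern */
		for (i = 0; i <= digits && n < max_ends; i++)
			ends[n++] = 2 + i;
		break;
	}
	return n;
}
```

For the input AB12345, greedy reports one match ending at offset 7, ungreedy
one match ending at offset 2, and match-all six matches ending at offsets 2
through 7 (AB, AB1, ..., AB12345).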

> +#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
> +/**< RegEx device support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
> + * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
> + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
> + * successful match.
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name. */
> +	struct rte_device *dev;	/**< Device information. */
> +	uint16_t max_matches;
> +	/**< Maximum matches per scan supported by this device. */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device. */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint32_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device. */
> +	uint16_t max_groups;
> +	/**< Maximum groups supported by this device. */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint8_t max_scatter_gather;
> +	/**< The max supported number of buffers that can
> +	 * be used in a single ops. The total size of all elements
> +	 * must be less then max_payload_size.
> +	 */

s/then/than

> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +__rte_experimental
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to be able to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag when to scan payload size
> + * greater struct struct rte_regex_dev_info::max_payload_size and/or
> + * matches can present across scan buffer boundaries.

s/greater struct struct/greater than struct/

> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +
Hyperscan supports cross buffer scan and reports only true matches instead of
partial matches. Can we let users configure this partial match capability?
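
As a rough illustration of the "true matches only" behaviour, the sketch
below scans a stream for one literal pattern and carries its state from one
buffer to the next, so a match that straddles a boundary is reported once,
in the buffer where it completes. This is not Hyperscan's API; struct
stream_scan and both functions are hypothetical, and a plain KMP literal
matcher stands in for a real regex engine.

```c
#include <stddef.h>
#include <string.h>

#define PAT_MAX 32

/* Hypothetical per-stream context; state survives between buffers. */
struct stream_scan {
	char pat[PAT_MAX];
	size_t fail[PAT_MAX];	/* KMP failure table */
	size_t pat_len;
	size_t state;		/* pattern bytes matched so far */
};

/* Build the KMP failure table. Returns 0 on success, -1 on bad length. */
static int
stream_scan_init(struct stream_scan *s, const char *pat)
{
	size_t len = strlen(pat), k = 0, i;

	if (len == 0 || len > PAT_MAX)
		return -1;
	memcpy(s->pat, pat, len);
	s->pat_len = len;
	s->state = 0;
	s->fail[0] = 0;
	for (i = 1; i < len; i++) {
		while (k > 0 && pat[i] != pat[k])
			k = s->fail[k - 1];
		if (pat[i] == pat[k])
			k++;
		s->fail[i] = k;
	}
	return 0;
}

/* Scan one buffer; returns the number of matches completed in this buffer. */
static size_t
stream_scan_buf(struct stream_scan *s, const char *buf, size_t len)
{
	size_t matches = 0, i;

	for (i = 0; i < len; i++) {
		while (s->state > 0 && buf[i] != s->pat[s->state])
			s->state = s->fail[s->state - 1];
		if (buf[i] == s->pat[s->state])
			s->state++;
		if (s->state == s->pat_len) {
			matches++;	/* a complete (true) match */
			s->state = s->fail[s->state - 1];
		}
	}
	return matches;
}
```

Scanning "xxatt" and then "ackyy" for the pattern "attack" reports nothing
for the first buffer and one match for the second. A device that reports
partial matches would instead flag the first buffer with
RTE_REGEX_OPS_RSP_PMI_EOJ_F and leave the stitching to the application.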

> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-01 15:57               ` Pavan Nikhilesh Bhagavatula
@ 2020-03-02  7:18                 ` Jerin Jacob
  2020-03-03  7:06                   ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Jerin Jacob @ 2020-03-02  7:18 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula
  Cc: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang, dev,
	Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

On Sun, Mar 1, 2020 at 9:28 PM Pavan Nikhilesh Bhagavatula
<pbhagavatula@marvell.com> wrote:
>
> Hi OrI,
>
> >
> >Hi Pavan,
> >
> >
> >> -----Original Message-----
> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh
> >Bhagavatula
> >> Sent: Sunday, March 1, 2020 4:38 PM
> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> >Alex
> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> ><dovrat@marvell.com>;
> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> >harry.chang@intel.com;
> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> >wushuai@inspur.com;
> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> >> <thomas@monjalon.net>
> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev
> >subsystem
> >>
> >> Hi Ori,
> >>
> >> >
> >> >Hi Pavan,
> >> >
> >> >> -----Original Message-----
> >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
> >Nikhilesh
> >> >Bhagavatula
> >> >> Sent: Sunday, March 1, 2020 3:23 PM
> >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> >> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> >> >Alex
> >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> >> ><dovrat@marvell.com>;
> >> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> >> >harry.chang@intel.com;
> >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> >> >wushuai@inspur.com;
> >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> >> >> <thomas@monjalon.net>
> >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
> >regexdev
> >> >subsystem
> >> >>
> >> >> Hi Ori,
> >> >>
> >> >> >
> >> >> >Hi Pavan,
> >> >> >Thanks for the comments please see below.
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
> >> >Nikhilesh
> >> >> >Bhagavatula
> >> >> >> Sent: Sunday, March 1, 2020 8:13 AM
> >> >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> >> >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> >> >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> >> >> >> hemant.agrawal@nxp.com; Opher Reviv
> ><opher@mellanox.com>;
> >> >> >Alex
> >> >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> >> >> ><dovrat@marvell.com>;
> >> >> >> Prasun Kapoor <pkapoor@marvell.com>;
> >nipun.gupta@nxp.com;
> >> >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> >> >> >harry.chang@intel.com;
> >> >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> >> >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> >> >> >wushuai@inspur.com;
> >> >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> >> >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> >> >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> >> >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> >deri@ntop.org;
> >> >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> >> >> >> <thomas@monjalon.net>
> >> >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
> >> >regexdev
> >> >> >subsystem
> >> >> >>
> >> >> >> Hi Ori,
> >> >> >>
> >> >> >> Minor comments below.
> >> >> >>
> >> >> >> <snip>
> >> >> >>
> >> >> >> >+/**
> >> >> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
> >> >> >attributes
> >> >> >> >+ * for enqueue and dequeue operation.
> >> >> >> >+ */
> >> >> >> >+struct rte_regex_ops {
> >> >> >> >+     /* W0 */
> >> >> >> >+     uint16_t req_flags;
> >> >> >> >+     /**< Request flags for the RegEx ops.
> >> >> >> >+      * @see RTE_REGEX_OPS_REQ_*
> >> >> >> >+      */
> >> >> >> >+     uint16_t rsp_flags;
> >> >> >> >+     /**< Response flags for the RegEx ops.
> >> >> >> >+      * @see RTE_REGEX_OPS_RSP_*
> >> >> >> >+      */
> >> >> >> >+     uint16_t nb_actual_matches;
> >> >> >> >+     /**< The total number of actual matches detected by
> >the
> >> >> >> >Regex device.*/
> >> >> >> >+     uint16_t nb_matches;
> >> >> >> >+     /**< The total number of matches returned by the
> >RegEx
> >> >> >> >device for this
> >> >> >> >+      * scan. The size of *rte_regex_ops::matches* zero
> >length
> >> array
> >> >> >> >will be
> >> >> >> >+      * this value.
> >> >> >> >+      *
> >> >> >> >+      * @see struct rte_regex_ops::matches, struct
> >> >> >> >rte_regex_match
> >> >> >> >+      */
> >> >> >> >+
> >> >> >> >+     /* W1 */
> >> >> >> >+     struct rte_mbuf mbuf; /**< source mbuf, to search in.
> >*/
> >> >> >>
> >> >> >> This should be *mbuf.
> >> >> >
> >> >> >Yes you are correct will fix.
> >> >> >
> >> >> >>
> >> >> >> >+
> >> >> >> >+     /* W2 */
> >> >> >> >+     uint16_t group_id0;
> >> >> >>
> >> >> >> This should be group_id1.
> >> >> >>
> >> >> >No this is correct is should be id0. We are starting from group 0.
> >> >> >The comment below states that the first group, meaning group 0
> >> >must
> >> >> >be
> >> >> >valid group while group 1 doesn’t have to be vaild.
> >> >>
> >> >> Would that mean that group_id0 is always valid?
> >> >> Since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F`
> >flag.
> >> >>
> >> >Yes, you must have at least one group.
> >>
> >> Makes sense, I think we need to update the comment a bit as it only
> >mentions
> >> that
> >> at least one group but it should be group_id0 has to be always valid.
> >>
> >> (An application can erroneously set valid group_id1 instead of
> >group_id0)
> >>
> >
> >What about the next comment?
> >/**< First group_id to match the rule against. This group must be valid.
> >In
> >  * order to support more group (up to 4 groups). The group number
> >should
> >  * be set. For example to enable group 1 group_id1 should be set
> >  * with the group value and  and the
> >RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag should be set.
> >  * Respectively similar flags for group_id2 and group_id3.
> >  * Upon the match, struct rte_regex_match::group_id shall be updated
> >  * with matching group ID by the device. Group ID scheme provides
> >  * rule isolation and effective pattern matching.
> >*/
>
> Looks good with minor corrections as below
>
> /**< First group_id to match the rule against. This group must be valid.
>   * In order to support more than one group per each op (up to 4 groups), any of the group_id<1-3> should
>   * hold a valid group id along with RTE_REGEX_OPS_REQ_GROUP_ID<1-3>_VALID_F flag set.
>   * For example, to match against group 100 and 101, group_id0 should be set to 100 and group_id1 should
>   * be set to 101 and the RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag should be set.
>   * Respectively similar flags for group_id2 and group_id3.
>   * Upon the match, struct rte_regex_match::group_id shall be updated
>   * with matching group ID by the device. Group ID scheme provides
>   * rule isolation and effective pattern matching.
> */

I think we can remove the limitation that group 0 is always valid.
There are use cases where each group belongs to a certain functionality
and the application decides the group based on the packet type or
similar. In that case, group 0 may or may not be valid.

IMO, by spec, we can dictate:

At minimum, one of the groups should be valid and selected. Behaviour
is undefined if none of the groups is selected (this is to avoid a fast
path check).

Thoughts?






>
> >
> >/**< First group_id to match the rule against. Minimum one group id
> >  * must be provided by application.
> >  * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> >group_id1
> >  * is valid, respectively similar flags for group_id2 and group_id3.
> >  * Upon the match, struct rte_regex_match::group_id shall be updated
> >  * with matching group ID by the device. Group ID scheme provides
> >  * rule isolation and effective pattern matching.
> >
> >> >
> >> >> >
> >> >> >> >+     /**< First group_id to match the rule against. Minimum
> >one
> >> >> >> >group id
> >> >> >> >+      * must be provided by application.
> >> >> >> >+      * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >set then
> >> >> >> >group_id1
> >> >> >> >+      * is valid, respectively similar flags for group_id2 and
> >> group_id3.
> >> >> >> >+      * Upon the match, struct rte_regex_match::group_id
> >shall be
> >> >> >> >updated
> >> >> >> >+      * with matching group ID by the device. Group ID
> >scheme
> >> >> >> >provides
> >> >> >> >+      * rule isolation and effective pattern matching.
> >> >> >> >+      */
> >> >> >> >+     uint16_t group_id1;
> >> >> >> >+     /**< Second group_id to match the rule against.
> >> >> >> >+      *
> >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >> >> >> >+      */
> >> >> >>
> >> >> >> The above `group_id1` should be removed as its duplicate.
> >> >> >>
> >> >> >
> >> >> >This is not duplicate, see above comment.
> >> >> >
> >> >> >> >+     uint16_t group_id2;
> >> >> >> >+     /**< Third group_id to match the rule against.
> >> >> >> >+      *
> >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >> >> >> >+      */
> >> >> >> >+     uint16_t group_id3;
> >> >> >> >+     /**< Forth group_id to match the rule against.
> >> >> >> >+      *
> >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >> >> >> >+      */
> >> >> >> >+
> >> >> >> >+     /* W3 */
> >> >> >> >+     RTE_STD_C11
> >> >> >> >+     union {
> >> >> >> >+             uint64_t user_id;
> >> >> >> >+             /**< Application specific opaque value. An
> >application
> >> >> >> >may use
> >> >> >> >+              * this field to hold application specific value to
> >share
> >> >> >> >+              * between dequeue and enqueue operation.
> >> >> >> >+              * Implementation should not modify this field.
> >> >> >> >+              */
> >> >> >> >+             void *user_ptr;
> >> >> >> >+             /**< Pointer representation of *user_id* */
> >> >> >> >+     };
> >> >> >> >+
> >> >> >> >+     /* W4 */
> >> >> >> >+     struct rte_regex_match matches[];
> >> >> >> >+     /**< Zero length array to hold the match tuples.
> >> >> >> >+      * The struct rte_regex_ops::nb_matches value holds
> >the
> >> >> >> >number of
> >> >> >> >+      * elements in this array.
> >> >> >> >+      *
> >> >> >> >+      * @see struct rte_regex_ops::nb_matches
> >> >> >> >+      */
> >> >> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-02  7:18                 ` Jerin Jacob
@ 2020-03-03  7:06                   ` Ori Kam
  0 siblings, 0 replies; 62+ messages in thread
From: Ori Kam @ 2020-03-03  7:06 UTC (permalink / raw)
  To: Jerin Jacob, Pavan Nikhilesh Bhagavatula
  Cc: Jerin Jacob Kollanukkaran, xiang.w.wang, dev, Shahaf Shuler,
	hemant.agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, nipun.gupta, bruce.richardson, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi All,

> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Monday, March 2, 2020 9:19 AM
> To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Cc: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; xiang.w.wang@intel.com; dev@dpdk.org; Shahaf Shuler
> <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>; Dovrat
> Zifroni <dovrat@marvell.com>; Prasun Kapoor <pkapoor@marvell.com>;
> nipun.gupta@nxp.com; bruce.richardson@intel.com; yang.a.hong@intel.com;
> harry.chang@intel.com; gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev subsystem
> 
> On Sun, Mar 1, 2020 at 9:28 PM Pavan Nikhilesh Bhagavatula
> <pbhagavatula@marvell.com> wrote:
> >
> > Hi Ori,
> >
> > >
> > >Hi Pavan,
> > >
> > >
> > >> -----Original Message-----
> > >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh
> > >Bhagavatula
> > >> Sent: Sunday, March 1, 2020 4:38 PM
> > >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> > >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> > >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> > >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> > >Alex
> > >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > ><dovrat@marvell.com>;
> > >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> > >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> > >harry.chang@intel.com;
> > >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > >wushuai@inspur.com;
> > >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > >> <thomas@monjalon.net>
> > >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce regexdev
> > >subsystem
> > >>
> > >> Hi Ori,
> > >>
> > >> >
> > >> >Hi Pavan,
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
> > >Nikhilesh
> > >> >Bhagavatula
> > >> >> Sent: Sunday, March 1, 2020 3:23 PM
> > >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> > >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> > >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> > >> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> > >> >Alex
> > >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > >> ><dovrat@marvell.com>;
> > >> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> > >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> > >> >harry.chang@intel.com;
> > >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > >> >wushuai@inspur.com;
> > >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > >> >> <thomas@monjalon.net>
> > >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
> > >regexdev
> > >> >subsystem
> > >> >>
> > >> >> Hi Ori,
> > >> >>
> > >> >> >
> > >> >> >Hi Pavan,
> > >> >> >Thanks for the comments please see below.
> > >> >> >
> > >> >> >> -----Original Message-----
> > >> >> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan
> > >> >Nikhilesh
> > >> >> >Bhagavatula
> > >> >> >> Sent: Sunday, March 1, 2020 8:13 AM
> > >> >> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> > >> >> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> > >> >> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> > >> >> >> hemant.agrawal@nxp.com; Opher Reviv
> > ><opher@mellanox.com>;
> > >> >> >Alex
> > >> >> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> > >> >> ><dovrat@marvell.com>;
> > >> >> >> Prasun Kapoor <pkapoor@marvell.com>;
> > >nipun.gupta@nxp.com;
> > >> >> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> > >> >> >harry.chang@intel.com;
> > >> >> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > >> >> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > >> >> >wushuai@inspur.com;
> > >> >> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > >> >> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > >> >> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > >> >> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com;
> > >deri@ntop.org;
> > >> >> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > >> >> >> <thomas@monjalon.net>
> > >> >> >> Subject: Re: [dpdk-dev] [EXT] [RFC v5] regexdev: introduce
> > >> >regexdev
> > >> >> >subsystem
> > >> >> >>
> > >> >> >> Hi Ori,
> > >> >> >>
> > >> >> >> Minor comments below.
> > >> >> >>
> > >> >> >> <snip>
> > >> >> >>
> > >> >> >> >+/**
> > >> >> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
> > >> >> >attributes
> > >> >> >> >+ * for enqueue and dequeue operation.
> > >> >> >> >+ */
> > >> >> >> >+struct rte_regex_ops {
> > >> >> >> >+     /* W0 */
> > >> >> >> >+     uint16_t req_flags;
> > >> >> >> >+     /**< Request flags for the RegEx ops.
> > >> >> >> >+      * @see RTE_REGEX_OPS_REQ_*
> > >> >> >> >+      */
> > >> >> >> >+     uint16_t rsp_flags;
> > >> >> >> >+     /**< Response flags for the RegEx ops.
> > >> >> >> >+      * @see RTE_REGEX_OPS_RSP_*
> > >> >> >> >+      */
> > >> >> >> >+     uint16_t nb_actual_matches;
> > >> >> >> >+     /**< The total number of actual matches detected by
> > >the
> > >> >> >> >Regex device.*/
> > >> >> >> >+     uint16_t nb_matches;
> > >> >> >> >+     /**< The total number of matches returned by the
> > >RegEx
> > >> >> >> >device for this
> > >> >> >> >+      * scan. The size of *rte_regex_ops::matches* zero
> > >length
> > >> array
> > >> >> >> >will be
> > >> >> >> >+      * this value.
> > >> >> >> >+      *
> > >> >> >> >+      * @see struct rte_regex_ops::matches, struct
> > >> >> >> >rte_regex_match
> > >> >> >> >+      */
> > >> >> >> >+
> > >> >> >> >+     /* W1 */
> > >> >> >> >+     struct rte_mbuf mbuf; /**< source mbuf, to search in.
> > >*/
> > >> >> >>
> > >> >> >> This should be *mbuf.
> > >> >> >
> > >> >> >Yes you are correct will fix.
> > >> >> >
> > >> >> >>
> > >> >> >> >+
> > >> >> >> >+     /* W2 */
> > >> >> >> >+     uint16_t group_id0;
> > >> >> >>
> > >> >> >> This should be group_id1.
> > >> >> >>
> > >> >> >No, this is correct; it should be id0. We are starting from group 0.
> > >> >> >The comment below states that the first group, meaning group 0, must
> > >> >> >be a valid group, while group 1 doesn't have to be valid.
> > >> >>
> > >> >> Would that mean that group_id0 is always valid?
> > >> >> Since there is no `RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F`
> > >flag.
> > >> >>
> > >> >Yes, you must have at least one group.
> > >>
> > >> Makes sense, I think we need to update the comment a bit as it only
> > >mentions
> > >> that
> > >> at least one group but it should be group_id0 has to be always valid.
> > >>
> > >> (An application can erroneously set valid group_id1 instead of
> > >group_id0)
> > >>
> > >
> > >What about the next comment?
> > >/**< First group_id to match the rule against. This group must be valid.
> > >In
> > >  * order to support more group (up to 4 groups). The group number
> > >should
> > >  * be set. For example to enable group 1 group_id1 should be set
> > >  * with the group value and  and the
> > >RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag should be set.
> > >  * Respectively similar flags for group_id2 and group_id3.
> > >  * Upon the match, struct rte_regex_match::group_id shall be updated
> > >  * with matching group ID by the device. Group ID scheme provides
> > >  * rule isolation and effective pattern matching.
> > >*/
> >
> > Looks good with minor corrections as below
> >
> > /**< First group_id to match the rule against. This group must be valid.
> >   * In order to support more than one group per each op (up to 4 groups), any
> of the group_id<1-3> should
> >   * hold a valid group id along with RTE_REGEX_OPS_REQ_GROUP_ID<1-
> 3>_VALID_F flag set.
> >   * For example, to match against group 100 and 101, group_id0 should be set
> to 100 and group_id1 should
> >   * be set to 101 and the RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F flag
> should be set.
> >   * Respectively similar flags for group_id2 and group_id3.
> >   * Upon the match, struct rte_regex_match::group_id shall be updated
> >   * with matching group ID by the device. Group ID scheme provides
> >   * rule isolation and effective pattern matching.
> > */
> 
> I think we can remove the limitation that group 0 is always valid.
> There are use cases where each group belongs to a certain functionality
> and the application decides the group based on the packet type or
> similar. In that case, group 0 may or may not be valid.
> 
> IMO, by spec, we can dictate:
> 
> At minimum, one of the groups should be valid and selected. Behaviour
> is undefined if none of the groups is selected (this is to avoid a fast
> path check).
> 
> Thoughts?
> 

I like your approach; let's go with it.

> 
> 
> 
> 
> 
> >
> > >
> > >/**< First group_id to match the rule against. Minimum one group id
> > >  * must be provided by application.
> > >  * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F set then
> > >group_id1
> > >  * is valid, respectively similar flags for group_id2 and group_id3.
> > >  * Upon the match, struct rte_regex_match::group_id shall be updated
> > >  * with matching group ID by the device. Group ID scheme provides
> > >  * rule isolation and effective pattern matching.
> > >
> > >> >
> > >> >> >
> > >> >> >> >+     /**< First group_id to match the rule against. Minimum
> > >one
> > >> >> >> >group id
> > >> >> >> >+      * must be provided by application.
> > >> >> >> >+      * When RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> > >set then
> > >> >> >> >group_id1
> > >> >> >> >+      * is valid, respectively similar flags for group_id2 and
> > >> group_id3.
> > >> >> >> >+      * Upon the match, struct rte_regex_match::group_id
> > >shall be
> > >> >> >> >updated
> > >> >> >> >+      * with matching group ID by the device. Group ID
> > >scheme
> > >> >> >> >provides
> > >> >> >> >+      * rule isolation and effective pattern matching.
> > >> >> >> >+      */
> > >> >> >> >+     uint16_t group_id1;
> > >> >> >> >+     /**< Second group_id to match the rule against.
> > >> >> >> >+      *
> > >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> > >> >> >> >+      */
> > >> >> >>
> > >> >> >> The above `group_id1` should be removed as its duplicate.
> > >> >> >>
> > >> >> >
> > >> >> >This is not duplicate, see above comment.
> > >> >> >
> > >> >> >> >+     uint16_t group_id2;
> > >> >> >> >+     /**< Third group_id to match the rule against.
> > >> >> >> >+      *
> > >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> > >> >> >> >+      */
> > >> >> >> >+     uint16_t group_id3;
> > >> >> >> >+     /**< Forth group_id to match the rule against.
> > >> >> >> >+      *
> > >> >> >> >+      * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> > >> >> >> >+      */
> > >> >> >> >+
> > >> >> >> >+     /* W3 */
> > >> >> >> >+     RTE_STD_C11
> > >> >> >> >+     union {
> > >> >> >> >+             uint64_t user_id;
> > >> >> >> >+             /**< Application specific opaque value. An
> > >application
> > >> >> >> >may use
> > >> >> >> >+              * this field to hold application specific value to
> > >share
> > >> >> >> >+              * between dequeue and enqueue operation.
> > >> >> >> >+              * Implementation should not modify this field.
> > >> >> >> >+              */
> > >> >> >> >+             void *user_ptr;
> > >> >> >> >+             /**< Pointer representation of *user_id* */
> > >> >> >> >+     };
> > >> >> >> >+
> > >> >> >> >+     /* W4 */
> > >> >> >> >+     struct rte_regex_match matches[];
> > >> >> >> >+     /**< Zero length array to hold the match tuples.
> > >> >> >> >+      * The struct rte_regex_ops::nb_matches value holds
> > >the
> > >> >> >> >number of
> > >> >> >> >+      * elements in this array.
> > >> >> >> >+      *
> > >> >> >> >+      * @see struct rte_regex_ops::nb_matches
> > >> >> >> >+      */
> > >> >> >> >+};

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-02  7:05   ` [dpdk-dev] " Wang Xiang
@ 2020-03-03  7:44     ` Ori Kam
  2020-03-03  7:54       ` Jerin Jacob
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-03  7:44 UTC (permalink / raw)
  To: Wang Xiang
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Xiang,

May I ask when you plan to add the Hyperscan code to DPDK?

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> Sent: Monday, March 2, 2020 9:05 AM
> To: Ori Kam <orika@mellanox.com>
> Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> 
> Comments below.
> 
> Thanks,
> Xiang
> 
> On Thu, Feb 27, 2020 at 03:08:35PM +0000, Ori Kam wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Even though some vendors offer RegEx HW offload, the lack of a
> > standard API makes it difficult for DPDK consumers to use them
> > in a portable way.
> >
> > This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> >
> > This RFC is crafted based on SW RegEx API frameworks such as libpcre and
> > Hyperscan and a few of the RegEx HW IPs which I am aware of.
> >
> > RegEx pattern matching applications:
> > * Next Generation Firewalls (NGFW)
> > * Deep Packet and Flow Inspection (DPI)
> > * Intrusion Prevention Systems (IPS)
> > * DDoS Mitigation
> > * Network Monitoring
> > * Data Loss Prevention (DLP)
> > * Smart NICs
> > * Grammar based content processing
> > * URL, spam and adware filtering
> > * Advanced auditing and policing of user/application security policies
> > * Financial data mining - parsing of streamed financial feeds
> > * Application recognition
> > * Memory introspection
> > * Natural Language Processing (NLP)
> > * Sentiment Analysis
> > * Big data database acceleration
> > * Computational storage
> >
> > Request to review from HW and SW RegEx vendors and RegEx application
> > users to have portable DPDK API for RegEx.
> >
> > The API schematics are based on the existing cryptodev, eventdev and
> > ethdev device APIs.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Signed-off-by: Ori Kam <orika@mellanox.com>
> > ---
> > V5:
> >  * Remove unused iov struct.
> > V4:
> >  * Replace iov with mbuf.
> >  * Small ML comments.
> > V3:
> >  * Change subject title.
> > V2:
> >  * Address ML comments.
> > ---
> > +
> > +#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
> > +/**< RegEx device support PCRE Greedy mode.
> > + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or
> unlimited
> > + * matches. In greedy mode the pattern 'AB12345' will be matched
> completely
> > + * where as the ungreedy mode 'AB' will be returned as the match.
> > + * @see struct rte_regex_dev_info::regex_dev_capa
> > + */
> > +
> 
> Hyperscan actually supports "match all" semantic, neither greedy nor
> ungreedy,
> which is different from PCRE. In the case above, AB, AB1, ..., AB12345 will all
> be returned as matches. Do HW solutions support this?

No, our HW doesn't support this.
Jerin, does Marvell HW support this?
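
To make sure we are talking about the same semantics, here is a small
sketch of the three behaviours for the example above. It is not a regex
engine: it hard-codes the pattern "AB followed by digits", anchored at
the start of the buffer, and only reports match lengths. All ex_/EX_
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum ex_mode {
	EX_GREEDY,    /* longest match only */
	EX_UNGREEDY,  /* shortest match only */
	EX_MATCH_ALL, /* every prefix that matches, as described for Hyperscan */
};

/* Scan buf for "AB" followed by digits, anchored at offset 0.
 * Writes up to max_lens match lengths into lens and returns how many
 * were written. */
static size_t
ex_scan_ab_digits(const char *buf, enum ex_mode mode, size_t *lens,
		  size_t max_lens)
{
	size_t digits = 0, n = 0, i;

	assert(buf != NULL && lens != NULL);
	if (strncmp(buf, "AB", 2) != 0 || max_lens == 0)
		return 0;
	/* Count the digits that directly follow "AB". */
	while (isdigit((unsigned char)buf[2 + digits]))
		digits++;

	switch (mode) {
	case EX_UNGREEDY:
		lens[n++] = 2;          /* "AB" */
		break;
	case EX_GREEDY:
		lens[n++] = 2 + digits; /* "AB12345" */
		break;
	case EX_MATCH_ALL:
		/* "AB", "AB1", ..., up to the full run of digits. */
		for (i = 0; i <= digits && n < max_lens; i++)
			lens[n++] = 2 + i;
		break;
	}
	return n;
}
```

For the buffer 'AB12345' the match-all mode yields six matches where the
other two modes yield one each, so a device in that mode needs a larger
match array per op than a greedy or ungreedy one.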

> Can we add a new flag like RTE_REGEX_DEV_SUPP_PCRE_MATCHALL_F?
> Similarly, we can define a flag RTE_REGEX_PCRE_RULE_MATCHALL_F so
> Hyperscan
> users have to set this flag during rule compile.
> 

Sure.

> > +#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
> > +/**< RegEx device support PCRE Lookaround assertions
> > + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
> > + * the given pattern is 'dwad1234!' the RegEx engine doesn't report any
> matches
> > + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return
> a
> > + * successful match.
> > + * @see struct rte_regex_dev_info::regex_dev_capa
> > + */
> > +
> > +
> > +/**
> > + * RegEx device information
> > + */
> > +struct rte_regex_dev_info {
> > +	const char *driver_name; /**< RegEx driver name. */
> > +	struct rte_device *dev;	/**< Device information. */
> > +	uint16_t max_matches;
> > +	/**< Maximum matches per scan supported by this device. */
> > +	uint16_t max_queue_pairs;
> > +	/**< Maximum queue pairs supported by this device. */
> > +	uint16_t max_payload_size;
> > +	/**< Maximum payload size for a pattern match request or scan.
> > +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > +	 */
> > +	uint32_t max_rules_per_group;
> > +	/**< Maximum rules supported per group by this device. */
> > +	uint16_t max_groups;
> > +	/**< Maximum groups supported by this device. */
> > +	uint32_t regex_dev_capa;
> > +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > +	uint64_t rule_flags;
> > +	/**< Supported compiler rule flags.
> > +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> > +	 */
> > +	uint8_t max_scatter_gather;
> > +	/**< The max supported number of buffers that can
> > +	 * be used in a single ops. The total size of all elements
> > +	 * must be less then max_payload_size.
> > +	 */
> 
> s/then/than
> 

Will fix.

> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Retrieve the contextual information of a RegEx device.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + *
> > + * @param[out] dev_info
> > + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with
> the
> > + *   contextual information of the device.
> > + *
> > + * @return
> > + *   - 0: Success, driver updates the contextual information of the RegEx
> device
> > + *   - <0: Error code returned by the driver info get function.
> > + *
> > + */
> > +__rte_experimental
> > +int
> > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> *dev_info);
> > +
> > +/* Enumerates RegEx device configuration flags */
> > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > +/**< Cross buffer scan refers to the ability to be able to detect
> > + * matches that occur across buffer boundaries, where the buffers are
> related
> > + * to each other in some way. Enable this flag when to scan payload size
> > + * greater struct struct rte_regex_dev_info::max_payload_size and/or
> > + * matches can present across scan buffer boundaries.
> 
> s/struct/than
> 

Will fix.

> > + *
> > + * @see struct rte_regex_dev_info::max_payload_size
> > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> rte_regex_dev_configure()
> > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > + */
> > +
> > +
> > +/* Enumerates RegEx response flags. */
> > +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * start of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +
> Hyperscan supports cross buffer scan and only reports true matches instead of
> partial matches. Can we have users to config this partial match capability?
> 

Do you mean something like this?
RTE_REGEX_OPS_RSP_PMI_FOJ_F
/**< Indicates that the RegEx device has encountered a full match in the
  * current buffer. The match started in a previous buffer.
  */

> > +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> > +/**< Indicates that the RegEx device has encountered a partial match at the
> > + * end of scan in the given buffer.
> > + *
> > + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > + */
> > +

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
  2020-03-03  7:44     ` Ori Kam
@ 2020-03-03  7:54       ` Jerin Jacob
  0 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob @ 2020-03-03  7:54 UTC (permalink / raw)
  To: Ori Kam
  Cc: Wang Xiang, jerinj, dev, pbhagavatula, Shahaf Shuler,
	hemant.agrawal, Opher Reviv, Alex Rosenbaum, dovrat, pkapoor,
	nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, Thomas Monjalon

On Tue, Mar 3, 2020 at 1:14 PM Ori Kam <orika@mellanox.com> wrote:
>
> Hi Xiang,
>
> May I ask when you plan to add the Hyperscan code to DPDK?
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Wang Xiang
> > Sent: Monday, March 2, 2020 9:05 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem
> >
> > Hi Ori,
> >
> > Comments below.
> >
> > Thanks,
> > Xiang
> >
> > On Thu, Feb 27, 2020 at 03:08:35PM +0000, Ori Kam wrote:
> > > From: Jerin Jacob <jerinj@marvell.com>
> > >
> > > Even though some vendors offer RegEx HW offload, the lack of a
> > > standard API makes it difficult for DPDK consumers to use them
> > > in a portable way.
> > >
> > > This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> > >
> > > This RFC is crafted based on SW RegEx API frameworks such as libpcre and
> > > Hyperscan and a few of the RegEx HW IPs which I am aware of.
> > >
> > > RegEx pattern matching applications:
> > > * Next Generation Firewalls (NGFW)
> > > * Deep Packet and Flow Inspection (DPI)
> > > * Intrusion Prevention Systems (IPS)
> > > * DDoS Mitigation
> > > * Network Monitoring
> > > * Data Loss Prevention (DLP)
> > > * Smart NICs
> > > * Grammar based content processing
> > > * URL, spam and adware filtering
> > > * Advanced auditing and policing of user/application security policies
> > > * Financial data mining - parsing of streamed financial feeds
> > > * Application recognition
> > > * Memory introspection
> > > * Natural Language Processing (NLP)
> > > * Sentiment Analysis
> > > * Big data database acceleration
> > > * Computational storage
> > >
> > > Request to review from HW and SW RegEx vendors and RegEx application
> > > users to have portable DPDK API for RegEx.
> > >
> > > The API schematics are based on the existing cryptodev, eventdev and
> > > ethdev device APIs.
> > >
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > > Signed-off-by: Ori Kam <orika@mellanox.com>
> > > ---
> > > V5:
> > >  * Remove unused iov struct.
> > > V4:
> > >  * Replace iov with mbuf.
> > >  * Small ML comments.
> > > V3:
> > >  * Change subject title.
> > > V2:
> > >  * Address ML comments.
> > > ---
> > > +
> > > +#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
> > > +/**< RegEx device support PCRE Greedy mode.
> > > + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or
> > unlimited
> > > + * matches. In greedy mode the pattern 'AB12345' will be matched
> > completely
> > > + * where as the ungreedy mode 'AB' will be returned as the match.
> > > + * @see struct rte_regex_dev_info::regex_dev_capa
> > > + */
> > > +
> >
> > Hyperscan actually supports "match all" semantic, neither greedy nor
> > ungreedy,
> > which is different from PCRE. In the case above, AB, AB1, ..., AB12345 will all
> > be returned as matches. Do HW solutions support this?
>
> No, our HW doesn't support this.
> Jerin, does Marvell HW support this?

No. It does not support it.
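
To make the three semantics discussed above concrete, here is a small,
self-contained C sketch. It is not part of the patch and does not reflect
how Hyperscan or any RegEx device implements matching: the pattern 'AB\d*'
is hard-coded and all helper names are hypothetical. It only reports the
match lengths for a match anchored at the start of the subject.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Greedy 'AB\d*': length of the longest match at the start of s, 0 if none. */
static size_t
greedy_len(const char *s)
{
	size_t n;

	assert(s != NULL);
	if (strncmp(s, "AB", 2) != 0)
		return 0;
	n = 2;
	while (isdigit((unsigned char)s[n]))
		n++;
	return n;
}

/* Ungreedy 'AB\d*?': the shortest match is just "AB". */
static size_t
ungreedy_len(const char *s)
{
	assert(s != NULL);
	return strncmp(s, "AB", 2) == 0 ? 2 : 0;
}

/*
 * Match-all: every prefix between the ungreedy and the greedy match is
 * reported. Stores up to cap lengths in lens and returns how many were stored.
 */
static size_t
match_all_lens(const char *s, size_t *lens, size_t cap)
{
	size_t longest = greedy_len(s);
	size_t count = 0;
	size_t len;

	if (longest == 0)
		return 0;
	for (len = 2; len <= longest && count < cap; len++)
		lens[count++] = len;
	return count;
}
```

For the subject 'AB12345' the greedy length is 7, the ungreedy length is 2,
and match-all yields the six lengths 2 through 7, i.e. AB, AB1, ..., AB12345.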

* [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
                   ` (5 preceding siblings ...)
  2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
@ 2020-03-10 10:32 ` Ori Kam
  2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
  2020-03-13  1:20   ` Wang Xiang
  6 siblings, 2 replies; 62+ messages in thread
From: Ori Kam @ 2020-03-10 10:32 UTC (permalink / raw)
  To: jerinj, xiang.w.wang
  Cc: dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr, dovrat,
	pkapoor, nipun.gupta, bruce.richardson, yang.a.hong, harry.chang,
	gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai, yuyingxia,
	fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc, jim,
	hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas, orika

From: Jerin Jacob <jerinj@marvell.com>

Even though some vendors offer RegEx HW offload, the lack of a standard
API makes it difficult for DPDK consumers to use them in a portable way.

This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.

This RFC is crafted based on SW RegEx API frameworks such as libpcre and
hyperscan and a few of the RegEx HW IPs which I am aware of.

RegEx pattern matching applications:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition.
* Memory introspection.
* Natural Language Processing (NLP)
* Sentiment Analysis.
* Big data database acceleration.
* Computational storage.

Request to review from HW and SW RegEx vendors and RegEx application
users to have portable DPDK API for RegEx.

The API schematics are based on the existing cryptodev, eventdev and ethdev
device APIs.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
---
V6:
 * Small ML comments.
V5:
 * Remove unused iov struct.
V4:
 * Replace iov with mbuf.
 * Small ML comments.
V3:
 * Change subject title.
V2:
 * Address ML comments
---
 config/common_base                           |    7 +
 doc/api/doxy-api-index.md                    |    1 +
 doc/api/doxy-api.conf.in                     |    1 +
 lib/Makefile                                 |    2 +
 lib/librte_regexdev/Makefile                 |   31 +
 lib/librte_regexdev/rte_regexdev.c           |    6 +
 lib/librte_regexdev/rte_regexdev.h           | 1395 ++++++++++++++++++++++++++
 lib/librte_regexdev/rte_regexdev_version.map |   26 +
 8 files changed, 1469 insertions(+)
 create mode 100644 lib/librte_regexdev/Makefile
 create mode 100644 lib/librte_regexdev/rte_regexdev.c
 create mode 100644 lib/librte_regexdev/rte_regexdev.h
 create mode 100644 lib/librte_regexdev/rte_regexdev_version.map
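
For reviewers, a minimal sketch of how an application might validate a
configuration against the limits reported in struct rte_regex_dev_info
before calling rte_regex_dev_configure(). The sketch_* structs and the
sketch_check_config() function are hypothetical and mirror only a subset of
the fields proposed in rte_regexdev.h; the two flag values are copied from
the header. Rejecting RTE_REGEX_DEV_CFG_MATCH_AS_END on a device that lacks
RTE_REGEX_DEV_SUPP_MATCH_AS_END, and the specific error codes, are
assumptions of this sketch, not behavior defined by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the proposed rte_regexdev.h. */
#define RTE_REGEX_DEV_SUPP_MATCH_AS_END (1ULL << 19)
#define RTE_REGEX_DEV_CFG_MATCH_AS_END (1ULL << 1)

/* Hypothetical mirror of a subset of struct rte_regex_dev_info. */
struct sketch_dev_info {
	uint16_t max_matches;
	uint16_t max_queue_pairs;
	uint32_t max_rules_per_group;
	uint16_t max_groups;
	uint32_t regex_dev_capa;
};

/* Hypothetical mirror of a subset of struct rte_regex_dev_config. */
struct sketch_dev_config {
	uint16_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint32_t nb_rules_per_group;
	uint16_t nb_groups;
	uint32_t dev_cfg_flags;
};

/*
 * Check cfg against the device limits and apply the documented defaults:
 * nb_max_matches == 0 means 1, nb_rules_per_group == 0 means the device
 * maximum. Returns 0 on success or a negative errno value (assumed codes).
 */
static int
sketch_check_config(const struct sketch_dev_info *info,
		    struct sketch_dev_config *cfg)
{
	assert(info != NULL && cfg != NULL);

	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -EINVAL;

	/* Assumption: a config flag needs the matching capability bit. */
	if ((cfg->dev_cfg_flags & RTE_REGEX_DEV_CFG_MATCH_AS_END) &&
	    !(info->regex_dev_capa & RTE_REGEX_DEV_SUPP_MATCH_AS_END))
		return -ENOTSUP;

	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1;
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;
	return 0;
}
```

In a real application the info structure would be filled by
rte_regex_dev_info_get() and the validated values copied into
struct rte_regex_dev_config.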

diff --git a/config/common_base b/config/common_base
index f9a68f3..4810849 100644
--- a/config/common_base
+++ b/config/common_base
@@ -806,6 +806,12 @@ CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
 
 #
+# Compile regex device support
+#
+CONFIG_RTE_LIBRTE_REGEXDEV=y
+CONFIG_RTE_LIBRTE_REGEXDEV_DEBUG=n
+
+#
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
@@ -1098,3 +1104,4 @@ CONFIG_RTE_APP_CRYPTO_PERF=y
 # Compile the eventdev application
 #
 CONFIG_RTE_APP_EVENTDEV=y
+
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496b..787f7c2 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -26,6 +26,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter]    (@ref rte_event_timer_adapter.h),
   [event_crypto_adapter]   (@ref rte_event_crypto_adapter.h),
   [rawdev]             (@ref rte_rawdev.h),
+  [regexdev]           (@ref rte_regexdev.h),
   [metrics]            (@ref rte_metrics.h),
   [bitrate]            (@ref rte_bitrate.h),
   [latency]            (@ref rte_latencystats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index 1c4392e..56c08eb 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -58,6 +58,7 @@ INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/librte_rcu \
                           @TOPDIR@/lib/librte_reorder \
                           @TOPDIR@/lib/librte_rib \
+                          @TOPDIR@/lib/librte_regexdev \
                           @TOPDIR@/lib/librte_ring \
                           @TOPDIR@/lib/librte_sched \
                           @TOPDIR@/lib/librte_security \
diff --git a/lib/Makefile b/lib/Makefile
index 46b91ae..a273564 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -44,6 +44,8 @@ DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
                            librte_mempool librte_timer librte_cryptodev
 DIRS-$(CONFIG_RTE_LIBRTE_RAWDEV) += librte_rawdev
 DEPDIRS-librte_rawdev := librte_eal librte_ethdev
+DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
+DEPDIRS-librte_regexdev := librte_eal librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DEPDIRS-librte_vhost := librte_eal librte_mempool librte_mbuf librte_ethdev \
 			librte_net librte_hash librte_cryptodev
diff --git a/lib/librte_regexdev/Makefile b/lib/librte_regexdev/Makefile
new file mode 100644
index 0000000..6f4cc63
--- /dev/null
+++ b/lib/librte_regexdev/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+# Copyright(C) 2020 Mellanox International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_regexdev.a
+
+EXPORT_MAP := rte_regex_version.map
+
+# library version
+LIBABIVER := 1
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf
+
+# library source files
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c
+
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h
+
+# versioning export map
+EXPORT_MAP := rte_regexdev_version.map
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_regexdev/rte_regexdev.c b/lib/librte_regexdev/rte_regexdev.c
new file mode 100644
index 0000000..b901877
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ */
+
+#include <rte_regexdev.h>
diff --git a/lib/librte_regexdev/rte_regexdev.h b/lib/librte_regexdev/rte_regexdev.h
new file mode 100644
index 0000000..cfbefb0
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev.h
@@ -0,0 +1,1395 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ * Copyright(C) 2020 Mellanox International Ltd.
+ * Copyright(C) 2020 Intel International Ltd.
+ */
+
+#ifndef _RTE_REGEXDEV_H_
+#define _RTE_REGEXDEV_H_
+
+/**
+ * @file
+ *
+ * RTE RegEx Device API
+ *
+ * Defines RTE RegEx Device APIs for RegEx operations and its provisioning.
+ *
+ * The RegEx Device API is composed of two parts:
+ *
+ * - The application-oriented RegEx API that includes functions to setup
+ *   a RegEx device (configure it, setup its queue pairs and start it),
+ *   update the rule database and so on.
+ *
+ * - The driver-oriented RegEx API that exports a function allowing
+ *   a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
+ *   a RegEx device driver.
+ *
+ * RegEx device components and definitions:
+ *
+ *     +-----------------+
+ *     |                 |
+ *     |                 o---------+    rte_regex_[en|de]queue_burst()
+ *     |   PCRE based    o------+  |               |
+ *     |  RegEx pattern  |      |  |  +--------+   |
+ *     | matching engine o------+--+--o        |   |    +------+
+ *     |                 |      |  |  | queue  |<==o===>|Core 0|
+ *     |                 o----+ |  |  | pair 0 |        |      |
+ *     |                 |    | |  |  +--------+        +------+
+ *     +-----------------+    | |  |
+ *            ^               | |  |  +--------+
+ *            |               | |  |  |        |        +------+
+ *            |               | +--+--o queue  |<======>|Core 1|
+ *        Rule|Database       |    |  | pair 1 |        |      |
+ *     +------+----------+    |    |  +--------+        +------+
+ *     |     Group 0     |    |    |
+ *     | +-------------+ |    |    |  +--------+        +------+
+ *     | | Rules 0..n  | |    |    |  |        |        |Core 2|
+ *     | +-------------+ |    |    +--o queue  |<======>|      |
+ *     |     Group 1     |    |       | pair 2 |        +------+
+ *     | +-------------+ |    |       +--------+
+ *     | | Rules 0..n  | |    |
+ *     | +-------------+ |    |       +--------+
+ *     |     Group 2     |    |       |        |        +------+
+ *     | +-------------+ |    |       | queue  |<======>|Core n|
+ *     | | Rules 0..n  | |    +-------o pair n |        |      |
+ *     | +-------------+ |            +--------+        +------+
+ *     |     Group n     |
+ *     | +-------------+ |<-------rte_regex_rule_db_update()
+ *     | |             | |<-------rte_regex_rule_db_compile_activate()
+ *     | | Rules 0..n  | |<-------rte_regex_rule_db_import()
+ *     | +-------------+ |------->rte_regex_rule_db_export()
+ *     +-----------------+
+ *
+ * RegEx: A regular expression is a concise and flexible means for matching
+ * strings of text, such as particular characters, words, or patterns of
+ * characters. A common abbreviation for this is “RegEx”.
+ *
+ * RegEx device: A hardware or software-based implementation of RegEx
+ * device API for PCRE based pattern matching syntax and semantics.
+ *
+ * PCRE RegEx syntax and semantics specification:
+ * http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
+ *
+ * RegEx queue pair: Each RegEx device should have one or more queue pairs to
+ * transmit a burst of pattern matching requests and receive a burst of
+ * pattern matching responses. The pattern matching request/response is
+ * embedded in the *rte_regex_ops* structure.
+ *
+ * Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
+ * Match ID and Group ID to identify the rule upon the match.
+ *
+ * Rule database: The RegEx device accepts regular expressions and converts them
+ * into a compiled rule database that can then be used to scan data.
+ * Compilation allows the device to analyze the given pattern(s) and
+ * pre-determine how to scan for these patterns in an optimized fashion that
+ * would be far too expensive to compute at run-time. A rule database contains
+ * a set of rules that are compiled in device specific binary form.
+ *
+ * Match ID or Rule ID: A unique identifier provided at the time of rule
+ * creation for the application to identify the rule upon match.
+ *
+ * Group ID: Rules can be grouped under one group ID to enable
+ * rule isolation and effective pattern matching. A unique group identifier
+ * is provided at the time of rule creation for the application to identify
+ * the rule upon match.
+ *
+ * Scan: A pattern matching request through *enqueue* API.
+ *
+ * It is possible that a given RegEx device does not support all the features
+ * of PCRE. The application may probe the supported features through
+ * struct rte_regex_dev_info::regex_dev_capa
+ *
+ * By default, all the functions of the RegEx Device API exported by a PMD
+ * are lock-free functions which are assumed not to be invoked in parallel on
+ * different logical cores to work on the same target object. For instance,
+ * the dequeue function of a PMD cannot be invoked in parallel on two logical
+ * cores to operate on the same RegEx queue pair. Of course, this function can
+ * be invoked in parallel by different logical cores on different queue pairs.
+ * It is the responsibility of the upper level application to enforce this rule.
+ *
+ * In all functions of the RegEx API, the RegEx device is
+ * designated by an integer >= 0 named the device identifier *dev_id*
+ *
+ * At the RegEx driver level, RegEx devices are represented by a generic
+ * data structure of type *rte_regex_dev*.
+ *
+ * RegEx devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When a RegEx device is being probed, a *rte_regex_dev* structure and
+ * a new device identifier are allocated for that device. Then, the
+ * regex_dev_init() function supplied by the RegEx driver matching the probed
+ * device is invoked to properly initialize the device.
+ *
+ * The role of the device init function consists of resetting the hardware or
+ * software RegEx driver implementations.
+ *
+ * If the device init operation is successful, the correspondence between
+ * the device identifier assigned to the new device and its associated
+ * *rte_regex_dev* structure is effectively registered.
+ * Otherwise, both the *rte_regex_dev* structure and the device identifier are
+ * freed.
+ *
+ * The functions exported by the application RegEx API to setup a device
+ * designated by its device identifier must be invoked in the following order:
+ *     - rte_regex_dev_configure()
+ *     - rte_regex_queue_pair_setup()
+ *     - rte_regex_dev_start()
+ *
+ * Then, the application can invoke, in any order, the functions
+ * exported by the RegEx API to enqueue pattern matching jobs, dequeue pattern
+ * matching responses, get the stats, update the rule database,
+ * get/set device attributes and so on.
+ *
+ * If the application wants to change the configuration (i.e. call
+ * rte_regex_dev_configure() or rte_regex_queue_pair_setup()), it must call
+ * rte_regex_dev_stop() first to stop the device and then do the reconfiguration
+ * before calling rte_regex_dev_start() again. The enqueue and dequeue
+ * functions should not be invoked when the device is stopped.
+ *
+ * Finally, an application can close a RegEx device by invoking the
+ * rte_regex_dev_close() function.
+ *
+ * Each function of the application RegEx API invokes a specific function
+ * of the PMD that controls the target device designated by its device
+ * identifier.
+ *
+ * For this purpose, all device-specific functions of a RegEx driver are
+ * supplied through a set of pointers contained in a generic structure of type
+ * *regex_dev_ops*.
+ * The address of the *regex_dev_ops* structure is stored in the *rte_regex_dev*
+ * structure by the device init function of the RegEx driver, which is
+ * invoked during the PCI/SoC device probing phase, as explained earlier.
+ *
+ * In other words, each function of the RegEx API simply retrieves the
+ * *rte_regex_dev* structure associated with the device identifier and
+ * performs an indirect invocation of the corresponding driver function
+ * supplied in the *regex_dev_ops* structure of the *rte_regex_dev* structure.
+ *
+ * For performance reasons, the address of the fast-path functions of the
+ * RegEx driver is not contained in the *regex_dev_ops* structure.
+ * Instead, they are directly stored at the beginning of the *rte_regex_dev*
+ * structure to avoid an extra indirect memory access during their invocation.
+ *
+ * RTE RegEx device drivers do not use interrupts for enqueue or dequeue
+ * operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
+ * functions to applications.
+ *
+ * The *enqueue* operation submits a burst of RegEx pattern matching requests
+ * to the RegEx device and the *dequeue* operation gets a burst of pattern
+ * matching responses for the ones submitted through the *enqueue* operation.
+ *
+ * Typical application utilisation of the RegEx device API will follow the
+ * following programming flow.
+ *
+ * - rte_regex_dev_configure()
+ * - rte_regex_queue_pair_setup()
+ * - rte_regex_rule_db_update() Needs to be invoked if a precompiled rule
+ *   database is not provided in rte_regex_dev_config::rule_db for
+ *   rte_regex_dev_configure() and/or the rule database needs to be updated.
+ * - rte_regex_rule_db_compile_activate() Needs to be invoked if
+ *   rte_regex_rule_db_update() was used.
+ * - Create or reuse an existing mempool for *rte_regex_ops* objects.
+ * - rte_regex_dev_start()
+ * - rte_regex_enqueue_burst()
+ * - rte_regex_dequeue_burst()
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include <rte_config.h>
+#include <rte_dev.h>
+#include <rte_errno.h>
+#include <rte_mbuf.h>
+#include <rte_memory.h>
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the total number of RegEx devices that have been successfully
+ * initialised.
+ *
+ * @return
+ *   The total number of usable RegEx devices.
+ */
+__rte_experimental
+uint8_t
+rte_regex_dev_count(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the device identifier for the named RegEx device.
+ *
+ * @param name
+ *   RegEx device name to select the RegEx device identifier.
+ *
+ * @return
+ *   Returns RegEx device identifier on success.
+ *   - <0: Failure to find named RegEx device.
+ */
+__rte_experimental
+int
+rte_regex_dev_get_dev_id(const char *name);
+
+/* Enumerates RegEx device capabilities */
+#define RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
+/**< RegEx device supports compiling the rules at runtime, unlike
+ * loading only the pre-built rule database using
+ * struct rte_regex_dev_config::rule_db in rte_regex_dev_configure()
+ * @see struct rte_regex_dev_config::rule_db, rte_regex_dev_configure()
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_START_ANCHOR_F (1ULL << 1)
+/**< RegEx device support PCRE Anchor to start of match flag.
+ * Example RegEx is '/\Gfoo\d/'. Here '\G' asserts position at the end of the
+ * previous match or the start of the string for the first match.
+ * This position will change each time the RegEx is applied to the subject
+ * string. If the RegEx is applied to 'foo1foo2Zfoo3' the first two matches will
+ * be successful for 'foo1foo2' and fail for 'Zfoo3'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_CAPA_SUPP_PCRE_ATOMIC_GROUPING_F (1ULL << 2)
+/**< RegEx device supports PCRE Atomic grouping.
+ * Atomic groups are represented by '(?>)'. An atomic group is a group that,
+ * when the RegEx engine exits from it, automatically throws away all
+ * backtracking positions remembered by any tokens inside the group.
+ * Example RegEx is 'a(?>bc|b)c'. If the given subjects are 'abc' and 'abcc'
+ * then 'a(bc|b)c' matches both, whereas 'a(?>bc|b)c' matches only 'abcc'
+ * because atomic groups don't allow backtracking to 'b'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKTRACKING_CTRL_F (1ULL << 3)
+/**< RegEx device supports PCRE backtracking control verbs.
+ * Some examples of backtracking verbs are (*COMMIT), (*ACCEPT), (*FAIL),
+ * (*SKIP), (*PRUNE).
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_CALLOUTS_F (1ULL << 4)
+/**< RegEx device supports PCRE callouts.
+ * PCRE supports calling an external function in between matches by using
+ * '(?C)'. Example RegEx is 'ABC(?C)D'. If the given subject is 'ABCD' then the
+ * RegEx engine will parse ABC, perform a user-defined callout and return a
+ * successful match at D.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_BACKREFERENCE_F (1ULL << 5)
+/**< RegEx device support PCRE backreference.
+ * Example RegEx is '(\2ABC|(GHI))+' \2 matches the same text as most recently
+ * matched by the 2nd capturing group i.e. 'GHI'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
+/**< RegEx device supports PCRE Greedy mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In greedy mode the subject 'AB12345' will be matched completely,
+ * whereas in ungreedy mode 'AB' will be returned as the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_ALL_F (1ULL << 7)
+/**< RegEx device supports match all mode.
+ * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
+ * matches. In match all mode the subject 'AB12345' will return 6 matches:
+ * AB, AB1, AB12, AB123, AB1234, AB12345.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 8)
+/**< RegEx device support PCRE Lookaround assertions
+ * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
+ * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
+ * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
+ * successful match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_MATCH_POINT_RST_F (1ULL << 9)
+/**< RegEx device supports PCRE match point reset directive.
+ * Example RegEx is '[a-z]+\K\d+'. If the subject is 'dwad123'
+ * then even though the entire subject matches, only '123'
+ * is reported as a match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_NEWLINE_CONVENTIONS_F (1ULL << 10)
+/**< RegEx device supports PCRE newline conventions.
+ * Newline conventions are represented as follows:
+ * (*CR)        carriage return
+ * (*LF)        linefeed
+ * (*CRLF)      carriage return, followed by linefeed
+ * (*ANYCRLF)   any of the three above
+ * (*ANY)       all Unicode newline sequences
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_NEWLINE_SEQ_F (1ULL << 11)
+/**< RegEx device support PCRE newline sequence.
+ * The escape sequence '\R' will match any newline sequence.
+ * It is equivalent to: '(?>\r\n|\n|\x0b|\f|\r|\x85)'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_POSSESSIVE_QUALIFIERS_F (1ULL << 12)
+/**< RegEx device supports PCRE possessive quantifiers.
+ * Example RegEx possessive quantifiers are '*+', '++', '?+', '{m,n}+'.
+ * Possessive quantifier repeats the token as many times as possible and it does
+ * not give up matches as the engine backtracks. With a possessive quantifier,
+ * the deal is all or nothing.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_SUBROUTINE_REFERENCES_F (1ULL << 13)
+/**< RegEx device support PCRE Subroutine references.
+ * PCRE Subroutine references allow for sub patterns to be assessed
+ * as part of the RegEx. Example RegEx is '(foo|fuzz)\g<1>+bar' matches the
+ * pattern 'foofoofuzzfoofuzzbar'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_8_F (1ULL << 14)
+/**< RegEx device support UTF-8 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_16_F (1ULL << 15)
+/**< RegEx device support UTF-16 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_UTF_32_F (1ULL << 16)
+/**< RegEx device support UTF-32 character encoding.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_WORD_BOUNDARY_F (1ULL << 17)
+/**< RegEx device support word boundaries.
+ * The meta character '\b' represents word boundary anchor.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_PCRE_FORWARD_REFERENCES_F (1ULL << 18)
+/**< RegEx device support Forward references.
+ * Forward references allow you to use a back reference to a group that appears
+ * later in the RegEx. Example RegEx is '(\3ABC|(DEF|(GHI)))+' matches the
+ * following string 'GHIGHIABCDEF'.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+#define RTE_REGEX_DEV_SUPP_MATCH_AS_END (1ULL << 19)
+/**< RegEx device support match as end.
+ * Match as end means that the match result holds the end offset of the
+ * detected match. No len value is set.
+ * If the device doesn't support this feature it means the match
+ * result holds the starting position of match and the length of the match.
+ * @see struct rte_regex_dev_info::regex_dev_capa
+ */
+
+/* Enumerates PCRE rule flags */
+#define RTE_REGEX_PCRE_RULE_ALLOW_EMPTY_F (1ULL << 0)
+/**< When this flag is set, patterns that can match against an empty string,
+ * such as '.*', are allowed.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_ANCHORED_F (1ULL << 1)
+/**< When this flag is set, the pattern is forced to be "anchored", that is, it
+ * is constrained to match only at the first matching point in the string that
+ * is being searched. Similar to '^' and represented by \A.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_CASELESS_F (1ULL << 2)
+/**< When this flag is set, letters in the pattern match both upper and lower
+ * case letters in the subject.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DOTALL_F (1ULL << 3)
+/**< When this flag is set, a dot metacharacter in the pattern matches any
+ * character, including one that indicates a newline.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_DUPNAMES_F (1ULL << 4)
+/**< When this flag is set, names used to identify capture groups need not be
+ * unique.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_EXTENDED_F (1ULL << 5)
+/**< When this flag is set, most white space characters in the pattern are
+ * totally ignored except when escaped or inside a character class.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_UNSET_BACKREF_F (1ULL << 6)
+/**< When this flag is set, a backreference to an unset capture group matches an
+ * empty string.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MULTILINE_F (1ULL << 7)
+/**< When this flag is set, the '^' and '$' constructs match immediately
+ * following or immediately before internal newlines in the subject string,
+ * respectively, as well as at the very start and end.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NO_AUTO_CAPTURE_F (1ULL << 8)
+/**< When this Flag is set, it disables the use of numbered capturing
+ * parentheses in the pattern. References to capture groups (backreferences or
+ * recursion/subroutine calls) may only refer to named groups, though the
+ * reference can be by name or by number.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UCP_F (1ULL << 9)
+/**< By default, only ASCII characters are recognized. When this flag is set,
+ * Unicode properties are used instead to classify characters.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UNGREEDY_F (1ULL << 10)
+/**< When this flag is set, the "greediness" of the quantifiers is inverted
+ * so that they are not greedy by default, but become greedy if followed by
+ * '?'.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_UTF_F (1ULL << 11)
+/**< When this flag is set, RegEx engine has to regard both the pattern and the
+ * subject strings that are subsequently processed as strings of UTF characters
+ * instead of single-code-unit strings.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_NEVER_BACKSLASH_C_F (1ULL << 12)
+/**< This flag locks out the use of '\C' in the pattern that is being compiled.
+ * This escape matches one data unit, even in UTF mode which can cause
+ * unpredictable behavior in UTF-8 or UTF-16 modes, because it may leave the
+ * current matching point in the middle of a multi-code-unit character.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
+/**< This flag marks that the results for the pattern that is being compiled
+ * should include all possible matches.
+ * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
+ */
+
+/**
+ * RegEx device information
+ */
+struct rte_regex_dev_info {
+	const char *driver_name; /**< RegEx driver name. */
+	struct rte_device *dev;	/**< Device information. */
+	uint16_t max_matches;
+	/**< Maximum matches per scan supported by this device. */
+	uint16_t max_queue_pairs;
+	/**< Maximum queue pairs supported by this device. */
+	uint16_t max_payload_size;
+	/**< Maximum payload size for a pattern match request or scan.
+	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+	 */
+	uint32_t max_rules_per_group;
+	/**< Maximum rules supported per group by this device. */
+	uint16_t max_groups;
+	/**< Maximum groups supported by this device. */
+	uint32_t regex_dev_capa;
+	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
+	uint64_t rule_flags;
+	/**< Supported compiler rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the contextual information of a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ *
+ * @param[out] dev_info
+ *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
+ *   contextual information of the device.
+ *
+ * @return
+ *   - 0: Success, driver updates the contextual information of the RegEx device
+ *   - <0: Error code returned by the driver info get function.
+ *
+ */
+__rte_experimental
+int
+rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
+
+/* Enumerates RegEx device configuration flags */
+#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
+/**< Cross buffer scan refers to the ability to detect
+ * matches that occur across buffer boundaries, where the buffers are related
+ * to each other in some way. Enable this flag to scan a payload size
+ * greater than struct rte_regex_dev_info::max_payload_size and/or when
+ * matches can be present across scan buffer boundaries.
+ *
+ * @see struct rte_regex_dev_info::max_payload_size
+ * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
+ * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
+ * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
+ */
+
+#define RTE_REGEX_DEV_CFG_MATCH_AS_END (1ULL << 1)
+/**< Match as end is the ability to return the result as ending offset.
+ * When this flag is set, the result for each match will hold the ending
+ * offset of the match in end_offset.
+ * If this flag is not set, then the match result will hold the starting offset
+ * in start_offset, and the length of the match in len.
+ *
+ * @see RTE_REGEX_DEV_SUPP_MATCH_AS_END
+ */
+
+/** RegEx device configuration structure */
+struct rte_regex_dev_config {
+	uint16_t nb_max_matches;
+	/**< Maximum matches per scan configured on this device.
+	 * This value cannot exceed the *max_matches*
+	 * which was previously provided by rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case the value 1 is used.
+	 * @see struct rte_regex_dev_info::max_matches
+	 */
+	uint16_t nb_queue_pairs;
+	/**< Number of RegEx queue pairs to configure on this device.
+	 * This value cannot exceed the *max_queue_pairs* which was previously
+	 * provided in rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_queue_pairs
+	 */
+	uint32_t nb_rules_per_group;
+	/**< Number of rules per group to configure on this device.
+	 * This value cannot exceed the *max_rules_per_group*
+	 * which was previously provided in rte_regex_dev_info_get().
+	 * The value 0 is allowed, in which case
+	 * struct rte_regex_dev_info::max_rules_per_group is used.
+	 * @see struct rte_regex_dev_info::max_rules_per_group
+	 */
+	uint16_t nb_groups;
+	/**< Number of groups to configure on this device.
+	 * This value cannot exceed the *max_groups*
+	 * which was previously provided in rte_regex_dev_info_get().
+	 * @see struct rte_regex_dev_info::max_groups
+	 */
+	const char *rule_db;
+	/**< Import initial set of prebuilt rule database on this device.
+	 * The value NULL is allowed, in which case the device will not
+	 * be configured with a prebuilt rule database. Application may use
+	 * rte_regex_rule_db_update() or rte_regex_rule_db_import() API
+	 * to update or import rule database after the
+	 * rte_regex_dev_configure().
+	 * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+	 */
+	uint32_t rule_db_len;
+	/**< Length of *rule_db* buffer. */
+	uint32_t dev_cfg_flags;
+	/**< RegEx device configuration flags, See RTE_REGEX_DEV_CFG_*  */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Configure a RegEx device.
+ *
+ * This function must be invoked first before any other function in the
+ * API. This function can also be re-invoked when a device is in the
+ * stopped state.
+ *
+ * The caller may use rte_regex_dev_info_get() to get the capabilities of
+ * each resource available for this regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device to configure.
+ * @param cfg
+ *   The RegEx device configuration structure.
+ *
+ * @return
+ *   - 0: Success, device configured. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_configure(uint8_t dev_id, const struct rte_regex_dev_config *cfg);
+
+/* Enumerates RegEx queue pair configuration flags */
+#define RTE_REGEX_QUEUE_PAIR_CFG_OOS_F (1ULL << 0)
+/**< Out of order scan. If not set, a scan must retire after previously issued
+ * in-order scans to this queue pair. If set, this scan can be retired as soon
+ * as device returns completion. Application should not set out of order scan
+ * flag if it needs to maintain the ingress order of scan request.
+ *
+ * @see struct rte_regex_qp_conf::qp_conf_flags, rte_regex_queue_pair_setup()
+ */
+
+struct rte_regex_ops;
+typedef void (*regexdev_stop_flush_t)(uint8_t dev_id, uint16_t qp_id,
+				      struct rte_regex_ops *op);
+/**< Callback function called during rte_regex_dev_stop(), invoked once per
+ * flushed RegEx op.
+ */
+
+/** RegEx queue pair configuration structure */
+struct rte_regex_qp_conf {
+	uint32_t qp_conf_flags;
+	/**< Queue pair config flags, See RTE_REGEX_QUEUE_PAIR_CFG_* */
+	uint16_t nb_desc;
+	/**< The number of descriptors to allocate for this queue pair. */
+	regexdev_stop_flush_t cb;
+	/**< Callback function called during rte_regex_dev_stop(), invoked
+	 * once per flushed regex op. Value NULL is allowed, in which case
+	 * callback will not be invoked. This function can be used to properly
+	 * dispose of outstanding regex ops from response queue,
+	 * for example ops containing memory pointers.
+	 * @see rte_regex_dev_stop()
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Allocate and set up a RegEx queue pair for a RegEx device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_pair_id
+ *   The index of the RegEx queue pair to setup. The value must be in the range
+ *   [0, nb_queue_pairs - 1] previously supplied to rte_regex_dev_configure().
+ * @param qp_conf
+ *   The pointer to the configuration data to be used for the RegEx queue pair.
+ *   NULL value is allowed, in which case the default configuration is used.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
+			   const struct rte_regex_qp_conf *qp_conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Start a RegEx device.
+ *
+ * The device start step is the last one and consists of setting the RegEx
+ * queues to start accepting the pattern matching scan requests.
+ *
+ * On success, all basic functions exported by the API (RegEx enqueue,
+ * RegEx dequeue and so on) can be invoked.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_start(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Stop a RegEx device.
+ *
+ * The device can be restarted with a call to
+ * rte_regex_dev_start().
+ *
+ * This function causes all queued response regex ops to be drained from the
+ * response queue. While draining ops out of the device,
+ * struct rte_regex_qp_conf::cb will be invoked for each op.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @see struct rte_regex_qp_conf::cb, rte_regex_queue_pair_setup()
+ */
+__rte_experimental
+void
+rte_regex_dev_stop(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Close a RegEx device. The device cannot be restarted!
+ *
+ * @param dev_id
+ *   RegEx device identifier
+ *
+ * @return
+ *   0 on success. Otherwise negative errno is returned.
+ */
+__rte_experimental
+int
+rte_regex_dev_close(uint8_t dev_id);
+
+/* Device get/set attributes */
+
+/** Enumerates RegEx device attribute identifier */
+enum rte_regex_dev_attr_id {
+	RTE_REGEX_DEV_ATTR_SOCKET_ID,
+	/**< The NUMA socket id to which the device is connected or
+	 * a default of zero if the socket could not be determined.
+	 * datatype: *int*
+	 * operation: *get*
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_MATCHES,
+	/**< Maximum number of matches per scan.
+	 * datatype: *uint8_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
+	/**< Upper bound scan time in ns.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
+	 */
+	RTE_REGEX_DEV_ATTR_MAX_PREFIX,
+	/**< Maximum number of prefix detected per scan.
+	 * This would be useful for denial of service detection.
+	 * datatype: *uint16_t*
+	 * operation: *get* and *set*
+	 *
+	 * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get an attribute from a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to retrieve.
+ * @param attr_value
+ *   A pointer that will be filled in with the attribute
+ *   value if successful.
+ *
+ * @return
+ *   - 0: Successfully retrieved attribute value.
+ *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_get(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       void *attr_value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an attribute to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param attr_id
+ *   The attribute ID to set.
+ * @param attr_value
+ *   Pointer to the attribute value supplied
+ *   by the application.
+ *
+ * @return
+ *   - 0: Successfully applied the attribute value.
+ *   - -EINVAL: Invalid device or  *attr_id* provided, or *attr_value* is NULL.
+ *   - -ENOTSUP: if the device doesn't support specific *attr_id*.
+ */
+__rte_experimental
+int
+rte_regex_dev_attr_set(uint8_t dev_id, enum rte_regex_dev_attr_id attr_id,
+		       const void *attr_value);
+
+/* Rule related APIs */
+/** Enumerates RegEx rule operation. */
+enum rte_regex_rule_op {
+	RTE_REGEX_RULE_OP_ADD,
+	/**< Add RegEx rule to rule database. */
+	RTE_REGEX_RULE_OP_REMOVE
+	/**< Remove RegEx rule from rule database. */
+};
+
+/** Structure to hold a RegEx rule attributes. */
+struct rte_regex_rule {
+	enum rte_regex_rule_op op;
+	/**< OP type of the rule, either OP_ADD or OP_REMOVE. */
+	uint16_t group_id;
+	/**< Group identifier to which the rule belongs. */
+	uint32_t rule_id;
+	/**< Rule identifier which is returned on successful match. */
+	const char *pcre_rule;
+	/**< Buffer to hold the PCRE rule. */
+	uint16_t pcre_rule_len;
+	/**< Length of the PCRE rule. */
+	uint64_t rule_flags;
+	/**< PCRE rule flags. The flags supported by a device are enumerated
+	 * in struct rte_regex_dev_info::rule_flags. For a successful rule
+	 * database update, the application must provide only supported
+	 * rule flags.
+	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_dev_info::rule_flags
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Update the local rule set.
+ * This function only modifies the rule set in memory.
+ * In order for the changes to take effect, the function
+ * rte_regex_rule_db_compile_activate() must be called.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rules
+ *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
+ *   which contain the regex rules attributes to be updated in rule database.
+ * @param nb_rules
+ *   The number of PCRE rules to update the rule database.
+ *
+ * @return
+ *   The number of regex rules actually updated on the regex device's rule
+ *   database. The return value can be less than the value of the *nb_rules*
+ *   parameter when the regex device fails to update the rule database or
+ *   if invalid parameters are specified in a *rte_regex_rule*.
+ *   If the return value is less than *nb_rules*, the remaining PCRE rules
+ *   at the end of *rules* are not consumed and the caller has to take
+ *   care of them and rte_errno is set accordingly.
+ *   Possible errno values include:
+ *   - -EINVAL:  Invalid device ID or rules is NULL
+ *   - -ENOTSUP: The last processed rule is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_compile_activate()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
+			 uint32_t nb_rules);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Compile local rule set and burn the compiled result to the
+ * RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ *
+ * @return
+ *   0 on success, otherwise negative errno.
+ *
+ * @see rte_regex_rule_db_import(), rte_regex_rule_db_export(),
+ *   rte_regex_rule_db_update()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_compile_activate(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Import a prebuilt rule database from a buffer to a RegEx device.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param rule_db
+ *   Points to prebuilt rule database.
+ * @param rule_db_len
+ *   Length of the rule database.
+ *
+ * @return
+ *   - 0: Successfully updated the prebuilt rule database.
+ *   - -EINVAL:  Invalid device ID or rule_db is NULL
+ *   - -ENOTSUP: Rule database import is not supported on this device.
+ *   - -ENOSPC: No space available in rule database.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_export()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_import(uint8_t dev_id, const char *rule_db,
+			 uint32_t rule_db_len);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Export the prebuilt rule database from a RegEx device to the buffer.
+ *
+ * @param dev_id
+ *   RegEx device identifier.
+ * @param[out] rule_db
+ *   Block of memory to insert the rule database. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ *
+ * @return
+ *   - 0: Successfully exported the prebuilt rule database.
+ *   - size: If rule_db set to NULL then required capacity for *rule_db*
+ *   - -EINVAL:  Invalid device ID
+ *   - -ENOTSUP: Rule database export is not supported on this device.
+ *
+ * @see rte_regex_rule_db_update(), rte_regex_rule_db_import()
+ */
+__rte_experimental
+int
+rte_regex_rule_db_export(uint8_t dev_id, char *rule_db);
+
+/* Extended statistics */
+/** Maximum name length for extended statistics counters */
+#define RTE_REGEX_DEV_XSTATS_NAME_SIZE 64
+
+/**
+ * A name-key lookup element for extended statistics.
+ *
+ * This structure is used to map between names and ID numbers
+ * for extended RegEx device statistics.
+ */
+struct rte_regex_dev_xstats_map {
+	uint16_t id;
+	/**< xstat identifier */
+	char name[RTE_REGEX_DEV_XSTATS_NAME_SIZE];
+	/**< xstat name */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve names of extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the regex device.
+ * @param[out] xstats_map
+ *   Block of memory to insert id and names into. Must be at least size in
+ *   capacity. If set to NULL, function returns required capacity.
+ * @return
+ *   - Positive value on success:
+ *        -The return value is the number of entries filled in the stats map.
+ *        -If xstats_map set to NULL then required capacity for xstats_map.
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_names_get(uint8_t dev_id,
+			       struct rte_regex_dev_xstats_map *xstats_map);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve extended statistics of a regex device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   The id numbers of the stats to get. The ids can be got from the stat
+ *   position in the stat list from rte_regex_dev_xstats_names_get(), or
+ *   by using rte_regex_dev_xstats_by_name_get().
+ * @param values
+ *   The values for each stats request by ID.
+ * @param n
+ *   The number of stats requested.
+ * @return
+ *   - Positive value: number of stat entries filled into the values array
+ *   - Negative value on error:
+ *      -ENODEV for invalid *dev_id*
+ *      -ENOTSUP if the device doesn't support this function.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_get(uint8_t dev_id, const uint16_t ids[],
+			 uint64_t values[], uint16_t n);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the value of a single stat by requesting it by name.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param name
+ *   The stat name to retrieve.
+ * @param id
+ *   If non-NULL, the numerical id of the stat will be returned, so that further
+ *   requests for the stat can be got using rte_regex_dev_xstats_get, which will
+ *   be faster as it doesn't need to scan a list of names for the stat.
+ * @param[out] value
+ *   Must be non-NULL, retrieved xstat value will be stored in this address.
+ *
+ * @return
+ *   - 0: Successfully retrieved xstat value.
+ *   - -EINVAL: invalid parameters
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+				 uint16_t *id, uint64_t *value);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Reset the values of the xstats of the selected component in the device.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param ids
+ *   Selects specific statistics to be reset. When NULL, all statistics will be
+ *   reset. If non-NULL, must point to array of at least *nb_ids* size.
+ * @param nb_ids
+ *   The number of ids available from the *ids* array. Ignored when ids is NULL.
+ *
+ * @return
+ *   - 0: Successfully reset the statistics to zero.
+ *   - -EINVAL: invalid parameters.
+ *   - -ENOTSUP: if not supported.
+ */
+__rte_experimental
+int
+rte_regex_dev_xstats_reset(uint8_t dev_id, const uint16_t ids[],
+			   uint16_t nb_ids);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger the RegEx device self test.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @return
+ *   - 0: Selftest successful.
+ *   - -ENOTSUP if the device doesn't support selftest.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_selftest(uint8_t dev_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump internal information about *dev_id* to the FILE* provided in *f*.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param f
+ *   A pointer to a file for output.
+ *
+ * @return
+ *   0 on success, negative errno on failure.
+ */
+__rte_experimental
+int
+rte_regex_dev_dump(uint8_t dev_id, FILE *f);
+
+/* Fast path APIs */
+
+/**
+ * The generic *rte_regex_match* structure to hold the RegEx match attributes.
+ * @see struct rte_regex_ops::matches
+ */
+struct rte_regex_match {
+	RTE_STD_C11
+	union {
+		uint64_t u64;
+		struct {
+			uint32_t rule_id:20;
+			/**< Rule identifier to which the pattern matched.
+			 * @see struct rte_regex_rule::rule_id
+			 */
+			uint32_t group_id:12;
+			/**< Group identifier of the rule which the pattern
+			 * matched. @see struct rte_regex_rule::group_id
+			 */
+			uint16_t start_offset;
+			/**< Starting Byte Position for matched rule. */
+			RTE_STD_C11
+			union {
+				uint16_t len;
+				/**< Length of match in bytes */
+				uint16_t end_offset;
+				/**< The end offset of the match. In case
+				 * MATCH_AS_END configuration is enabled.
+				 * @see RTE_REGEX_DEV_CFG_MATCH_AS_END
+				 */
+			};
+		};
+	};
+};
+
+/* Enumerates RegEx request flags. */
+#define RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F (1 << 0)
+/**< Set when struct rte_regex_ops::group_id0 is valid. */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F (1 << 1)
+/**< Set when struct rte_regex_ops::group_id1 is valid. */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F (1 << 2)
+/**< Set when struct rte_regex_ops::group_id2 is valid. */
+
+#define RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F (1 << 3)
+/**< Set when struct rte_regex_ops::group_id3 is valid. */
+
+#define RTE_REGEX_OPS_REQ_STOP_ON_MATCH_F (1 << 4)
+/**< The RegEx engine will stop scanning and return the first match. */
+
+#define RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F (1 << 5)
+/**< In High Priority mode a maximum of one match will be returned per scan to
+ * reduce the post-processing required by the application. The match with the
+ * lowest Rule id, lowest start pointer and lowest match length will be
+ * returned.
+ *
+ * @see struct rte_regex_ops::nb_actual_matches
+ * @see struct rte_regex_ops::nb_matches
+ */
+
+
+/* Enumerates RegEx response flags. */
+#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * start of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
+/**< Indicates that the RegEx device has encountered a partial match at the
+ * end of scan in the given buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_PMI_TOJ_F (1 << 2)
+/**< Indicates that the RegEx device has encountered a match that was started
+ * in previous buffer.
+ *
+ * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F (1 << 3)
+/**< Indicates that the RegEx device has exceeded the max timeout while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_MATCH_F (1 << 4)
+/**< Indicates that the RegEx device has exceeded the max matches while
+ * scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_MATCHES
+ */
+
+#define RTE_REGEX_OPS_RSP_MAX_PREFIX_F (1 << 5)
+/**< Indicates that the RegEx device has reached the max allowed prefix length
+ * while scanning the given buffer.
+ *
+ * @see RTE_REGEX_DEV_ATTR_MAX_PREFIX
+ */
+
+/**
+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
+ * for enqueue and dequeue operation.
+ */
+struct rte_regex_ops {
+	/* W0 */
+	uint16_t req_flags;
+	/**< Request flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_REQ_*
+	 */
+	uint16_t rsp_flags;
+	/**< Response flags for the RegEx ops.
+	 * @see RTE_REGEX_OPS_RSP_*
+	 */
+	uint16_t nb_actual_matches;
+	/**< The total number of actual matches detected by the Regex device.*/
+	uint16_t nb_matches;
+	/**< The total number of matches returned by the RegEx device for this
+	 * scan. The size of *rte_regex_ops::matches* zero length array will be
+	 * this value.
+	 *
+	 * @see struct rte_regex_ops::matches, struct rte_regex_match
+	 */
+
+	/* W1 */
+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */
+
+	/* W2 */
+	uint16_t group_id0;
+	/**< First group_id to match the rule against. At minimum one group
+	 * should be valid. Behaviour is undefined if no group is valid.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F
+	 */
+	uint16_t group_id1;
+	/**< Second group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
+	 */
+	uint16_t group_id2;
+	/**< Third group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
+	 */
+	uint16_t group_id3;
+	/**< Fourth group_id to match the rule against.
+	 *
+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
+	 */
+
+	/* W3 */
+	RTE_STD_C11
+	union {
+		uint64_t user_id;
+		/**< Application specific opaque value. An application may use
+		 * this field to hold application specific value to share
+		 * between dequeue and enqueue operation.
+		 * Implementation should not modify this field.
+		 */
+		void *user_ptr;
+		/**< Pointer representation of *user_id* */
+	};
+
+	/* W4 */
+	struct rte_regex_match matches[];
+	/**< Zero length array to hold the match tuples.
+	 * The struct rte_regex_ops::nb_matches value holds the number of
+	 * elements in this array.
+	 *
+	 * @see struct rte_regex_ops::nb_matches
+	 */
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a burst of scan requests on a RegEx device.
+ *
+ * The rte_regex_enqueue_burst() function is invoked to place
+ * regex operations on the queue *qp_id* of the device designated by
+ * its *dev_id*.
+ *
+ * The *nb_ops* parameter is the number of operations to process which are
+ * supplied in the *ops* array of *rte_regex_ops* structures.
+ *
+ * The rte_regex_enqueue_burst() function returns the number of
+ * operations it actually enqueued for processing. A return value equal to
+ * *nb_ops* means that all operations have been enqueued.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param qp_id
+ *   The index of the queue pair on which operations are to be enqueued for
+ *   processing. The value must be in the range [0, nb_queue_pairs - 1]
+ *   previously supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of *nb_ops* pointers to *rte_regex_ops* structures
+ *   which contain the regex operations to be processed.
+ * @param nb_ops
+ *   The number of operations to process.
+ *
+ * @return
+ *   The number of operations actually enqueued on the regex device. The return
+ *   value can be less than the value of the *nb_ops* parameter when the
+ *   regex device's queue is full or if invalid parameters are specified in
+ *   a *rte_regex_ops*. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_enqueue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dequeue a burst of scan responses from a queue on the RegEx device.
+ * The dequeued operations are stored in *rte_regex_ops* structures
+ * whose pointers are supplied in the *ops* array.
+ *
+ * The rte_regex_dequeue_burst() function returns the number of ops
+ * actually dequeued, which is the number of *rte_regex_ops* data structures
+ * effectively supplied into the *ops* array.
+ *
+ * A return value equal to *nb_ops* indicates that the queue contained
+ * at least *nb_ops* operations, and this is likely to signify that other
+ * processed operations remain in the device's output queue. Applications
+ * implementing a "retrieve as many processed operations as possible" policy
+ * can check this specific case and keep invoking the
+ * rte_regex_dequeue_burst() function until a value less than
+ * *nb_ops* is returned.
+ *
+ * The rte_regex_dequeue_burst() function does not provide any error
+ * notification to avoid the corresponding overhead.
+ *
+ * @param dev_id
+ *   The RegEx device identifier
+ * @param qp_id
+ *   The index of the queue pair from which to retrieve processed operations.
+ *   The value must be in the range [0, nb_queue_pairs - 1] previously
+ *   supplied to rte_regex_dev_configure().
+ * @param ops
+ *   The address of an array of pointers to *rte_regex_ops* structures that
+ *   must be large enough to store *nb_ops* pointers in it.
+ * @param nb_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued, which is the number
+ *   of pointers to *rte_regex_ops* structures effectively supplied to the
+ *   *ops* array. If the return value is less than *nb_ops*, the remaining
+ *   ops at the end of *ops* are not consumed and the caller has to take care
+ *   of them.
+ */
+__rte_experimental
+uint16_t
+rte_regex_dequeue_burst(uint8_t dev_id, uint16_t qp_id,
+			struct rte_regex_ops **ops, uint16_t nb_ops);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_REGEXDEV_H_ */
diff --git a/lib/librte_regexdev/rte_regexdev_version.map b/lib/librte_regexdev/rte_regexdev_version.map
new file mode 100644
index 0000000..723104d
--- /dev/null
+++ b/lib/librte_regexdev/rte_regexdev_version.map
@@ -0,0 +1,26 @@
+EXPERIMENTAL {
+	global:
+
+	rte_regex_dev_count;
+	rte_regex_dev_get_dev_id;
+	rte_regex_dev_info_get;
+	rte_regex_dev_configure;
+	rte_regex_queue_pair_setup;
+	rte_regex_dev_start;
+	rte_regex_dev_stop;
+	rte_regex_dev_close;
+	rte_regex_dev_attr_get;
+	rte_regex_dev_attr_set;
+	rte_regex_rule_db_update;
+	rte_regex_rule_db_compile_activate;
+	rte_regex_rule_db_import;
+	rte_regex_rule_db_export;
+	rte_regex_dev_xstats_names_get;
+	rte_regex_dev_xstats_get;
+	rte_regex_dev_xstats_by_name_get;
+	rte_regex_dev_xstats_reset;
+	rte_regex_dev_selftest;
+	rte_regex_dev_dump;
+	rte_regex_enqueue_burst;
+	rte_regex_dequeue_burst;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
@ 2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
  2020-03-10 16:23     ` Ori Kam
  2020-03-13  1:20   ` Wang Xiang
  1 sibling, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-10 13:42 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, shahafs, hemant.agrawal, opher, alexr, Dovrat Zifroni,
	Prasun Kapoor, nipun.gupta, bruce.richardson, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas

Hi Ori,

<snip>

>+
>+/**
>+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
>+ * for enqueue and dequeue operation.
>+ */
>+struct rte_regex_ops {
>+	/* W0 */
>+	uint16_t req_flags;
>+	/**< Request flags for the RegEx ops.
>+	 * @see RTE_REGEX_OPS_REQ_*
>+	 */
>+	uint16_t rsp_flags;
>+	/**< Response flags for the RegEx ops.
>+	 * @see RTE_REGEX_OPS_RSP_*
>+	 */
>+	uint16_t nb_actual_matches;
>+	/**< The total number of actual matches detected by the
>Regex device.*/
>+	uint16_t nb_matches;
>+	/**< The total number of matches returned by the RegEx
>device for this
>+	 * scan. The size of *rte_regex_ops::matches* zero length array
>will be
>+	 * this value.
>+	 *
>+	 * @see struct rte_regex_ops::matches, struct
>rte_regex_match
>+	 */
>+
>+	/* W1 */
>+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */

While implementing the pcre2 SW driver I came across an oddity: having the mbuf alone
wouldn’t suffice. We need a scan start offset and a scan length, as generally we would skip the
L2/L3 header.
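
To make the request concrete, here is a minimal sketch of how such an offset
and length could be derived for an untagged Ethernet + IPv4 frame. The
struct scan_window and the helper scan_window_after_l3() are hypothetical
names used only for illustration; they are not fields or functions of the
proposed API. VLAN tags, IPv6 and Ethernet padding are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14u     /* untagged Ethernet II header */
#define ETHERTYPE_IPV4 0x0800u
#define IPV4_MIN_HDR   20u

/* Hypothetical pair of values an op could carry next to the mbuf. */
struct scan_window {
	uint16_t offset; /* first byte to scan, relative to packet start */
	uint16_t len;    /* number of bytes to scan */
};

/* Fill *w with the region following the L2/L3 headers.
 * Returns 0 on success, -1 if the frame is not untagged IPv4 or is
 * too short to hold the headers it claims to have. */
static int
scan_window_after_l3(const uint8_t *pkt, uint16_t pkt_len,
		     struct scan_window *w)
{
	uint16_t ethertype, ihl_bytes, off;

	if (pkt == NULL || w == NULL || pkt_len < ETH_HDR_LEN + IPV4_MIN_HDR)
		return -1;

	/* EtherType is stored big-endian at bytes 12..13. */
	ethertype = (uint16_t)((pkt[12] << 8) | pkt[13]);
	if (ethertype != ETHERTYPE_IPV4)
		return -1;

	/* High nibble is the IP version, low nibble the header length
	 * in 32-bit words. */
	if ((pkt[ETH_HDR_LEN] >> 4) != 4)
		return -1;
	ihl_bytes = (uint16_t)((pkt[ETH_HDR_LEN] & 0x0f) * 4);
	if (ihl_bytes < IPV4_MIN_HDR)
		return -1;

	off = (uint16_t)(ETH_HDR_LEN + ihl_bytes);
	if (off > pkt_len)
		return -1;

	w->offset = off;
	w->len = (uint16_t)(pkt_len - off);
	return 0;
}
```

For a 64-byte frame with a 20-byte IPv4 header this yields offset 34 and
length 30, which are the two values the op would need to carry in addition
to the mbuf pointer.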

>+
>+	/* W2 */
>+	uint16_t group_id0;
>+	/**< First group_id to match the rule against. At minimum one
>group
>+	 * should be valid. Behaviour is undefined non of the groups are
>valid.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F
>+	 */
>+	uint16_t group_id1;
>+	/**< Second group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>+	 */
>+	uint16_t group_id2;
>+	/**< Third group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>+	 */
>+	uint16_t group_id3;
>+	/**< Forth group_id to match the rule against.
>+	 *
>+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>+	 */
>+
>+	/* W3 */
>+	RTE_STD_C11
>+	union {
>+		uint64_t user_id;
>+		/**< Application specific opaque value. An application
>may use
>+		 * this field to hold application specific value to share
>+		 * between dequeue and enqueue operation.
>+		 * Implementation should not modify this field.
>+		 */
>+		void *user_ptr;
>+		/**< Pointer representation of *user_id* */
>+	};
>+
>+	/* W4 */
>+	struct rte_regex_match matches[];
>+	/**< Zero length array to hold the match tuples.
>+	 * The struct rte_regex_ops::nb_matches value holds the
>number of
>+	 * elements in this array.
>+	 *
>+	 * @see struct rte_regex_ops::nb_matches
>+	 */
>+};
>+

Thanks,
Pavan.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
@ 2020-03-10 16:23     ` Ori Kam
  2020-03-10 16:36       ` Pavan Nikhilesh Bhagavatula
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-10 16:23 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Pavan,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh Bhagavatula
> Sent: Tuesday, March 10, 2020 3:42 PM
> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; xiang.w.wang@intel.com
> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>;
> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> 
> <snip>
> 
> >+
> >+/**
> >+ * The generic *rte_regex_ops* structure to hold the RegEx attributes
> >+ * for enqueue and dequeue operation.
> >+ */
> >+struct rte_regex_ops {
> >+	/* W0 */
> >+	uint16_t req_flags;
> >+	/**< Request flags for the RegEx ops.
> >+	 * @see RTE_REGEX_OPS_REQ_*
> >+	 */
> >+	uint16_t rsp_flags;
> >+	/**< Response flags for the RegEx ops.
> >+	 * @see RTE_REGEX_OPS_RSP_*
> >+	 */
> >+	uint16_t nb_actual_matches;
> >+	/**< The total number of actual matches detected by the
> >Regex device.*/
> >+	uint16_t nb_matches;
> >+	/**< The total number of matches returned by the RegEx
> >device for this
> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
> >will be
> >+	 * this value.
> >+	 *
> >+	 * @see struct rte_regex_ops::matches, struct
> >rte_regex_match
> >+	 */
> >+
> >+	/* W1 */
> >+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */
> 
> While implementing pcre2 SW driver I came across an oddity where having
> mbuf alone
> wouldn’t suffice, we need to have scan start offset and scan length as generally
> we would skip the
> L2/L3 header.
> 

Yes, you are correct. In most cases the application will not need the
whole mbuf, or it will need to chain a number of mbufs.
This can be achieved by modifying the mbuf to point to the correct data
start and decreasing the length.
In one of the previous versions we used a buffer address and iov to solve
this issue, but in order to keep the API the same as crypto we decided to go
with mbuf.
This API is experimental, and based on usage we might change it to iov.

> >+
> >+	/* W2 */
> >+	uint16_t group_id0;
> >+	/**< First group_id to match the rule against. At minimum one
> >group
> >+	 * should be valid. Behaviour is undefined non of the groups are
> >valid.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F
> >+	 */
> >+	uint16_t group_id1;
> >+	/**< Second group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >+	 */
> >+	uint16_t group_id2;
> >+	/**< Third group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >+	 */
> >+	uint16_t group_id3;
> >+	/**< Forth group_id to match the rule against.
> >+	 *
> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >+	 */
> >+
> >+	/* W3 */
> >+	RTE_STD_C11
> >+	union {
> >+		uint64_t user_id;
> >+		/**< Application specific opaque value. An application
> >may use
> >+		 * this field to hold application specific value to share
> >+		 * between dequeue and enqueue operation.
> >+		 * Implementation should not modify this field.
> >+		 */
> >+		void *user_ptr;
> >+		/**< Pointer representation of *user_id* */
> >+	};
> >+
> >+	/* W4 */
> >+	struct rte_regex_match matches[];
> >+	/**< Zero length array to hold the match tuples.
> >+	 * The struct rte_regex_ops::nb_matches value holds the
> >number of
> >+	 * elements in this array.
> >+	 *
> >+	 * @see struct rte_regex_ops::nb_matches
> >+	 */
> >+};
> >+
> 
> Thanks,
> Pavan.

Thanks,
Ori

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 16:23     ` Ori Kam
@ 2020-03-10 16:36       ` Pavan Nikhilesh Bhagavatula
  2020-03-10 17:00         ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2020-03-10 16:36 UTC (permalink / raw)
  To: Ori Kam, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Ori,
>
>Hi Pavan,
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh
>Bhagavatula
>> Sent: Tuesday, March 10, 2020 3:42 PM
>> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
>> <jerinj@marvell.com>; xiang.w.wang@intel.com
>> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
>> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
>Alex
>> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
><dovrat@marvell.com>;
>> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
>> bruce.richardson@intel.com; yang.a.hong@intel.com;
>harry.chang@intel.com;
>> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
>> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
>wushuai@inspur.com;
>> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
>> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
>> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
>> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
>> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
>> <thomas@monjalon.net>
>> Subject: Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev
>subsystem
>>
>> Hi Ori,
>>
>> <snip>
>>
>> >+
>> >+/**
>> >+ * The generic *rte_regex_ops* structure to hold the RegEx
>attributes
>> >+ * for enqueue and dequeue operation.
>> >+ */
>> >+struct rte_regex_ops {
>> >+	/* W0 */
>> >+	uint16_t req_flags;
>> >+	/**< Request flags for the RegEx ops.
>> >+	 * @see RTE_REGEX_OPS_REQ_*
>> >+	 */
>> >+	uint16_t rsp_flags;
>> >+	/**< Response flags for the RegEx ops.
>> >+	 * @see RTE_REGEX_OPS_RSP_*
>> >+	 */
>> >+	uint16_t nb_actual_matches;
>> >+	/**< The total number of actual matches detected by the
>> >Regex device.*/
>> >+	uint16_t nb_matches;
>> >+	/**< The total number of matches returned by the RegEx
>> >device for this
>> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
>> >will be
>> >+	 * this value.
>> >+	 *
>> >+	 * @see struct rte_regex_ops::matches, struct
>> >rte_regex_match
>> >+	 */
>> >+
>> >+	/* W1 */
>> >+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */
>>
>> While implementing pcre2 SW driver I came across an oddity where
>having
>> mbuf alone
>> wouldn’t suffice, we need to have scan start offset and scan length as
>generally
>> we would skip the
>> L2/L3 header.
>>
>
>Yes you are correct, in most cases the application will need
>not the all mbuf or it will connect number of mbuf.
>This can be acchived by modifying the mbuf to point to the correct data
>start, and decrease the len.

Wouldn’t that complicate transmitting the packet later on, after dequeue from regex, if
the user decides to do so?
Instead, we could have two fields in rte_regex_ops for storing scan_start_offset and
scan_size.
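
A minimal sketch of what this could look like, using reduced stand-ins rather
than the real rte_mbuf and rte_regex_ops. The field names scan_start_offset and
scan_size are the ones suggested above; the types sketch_buf and sketch_ops and
the helper sketch_scan_window() are hypothetical and only illustrate how a
driver might derive the scan window without modifying the mbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the packet buffer; not the real struct rte_mbuf. */
struct sketch_buf {
	const uint8_t *data;  /* first byte of packet data (start of L2 header) */
	uint16_t data_len;    /* number of bytes of packet data */
};

/* Hypothetical reduced ops structure carrying the two proposed fields. */
struct sketch_ops {
	struct sketch_buf *mbuf;
	uint16_t scan_start_offset; /* bytes to skip, e.g. the L2/L3 headers */
	uint16_t scan_size;         /* bytes to scan from that offset */
};

/*
 * Compute the window a driver would scan. Returns 0 and fills start and len
 * on success, or -1 if the requested window does not fit in the buffer.
 * The buffer itself is left unmodified, so it can be transmitted as-is later.
 */
static int
sketch_scan_window(const struct sketch_ops *op,
		   const uint8_t **start, uint16_t *len)
{
	const struct sketch_buf *m = op->mbuf;

	if (op->scan_start_offset > m->data_len ||
	    op->scan_size > m->data_len - op->scan_start_offset)
		return -1;
	*start = m->data + op->scan_start_offset;
	*len = op->scan_size;
	return 0;
}
```

With a 64-byte packet, scan_start_offset = 34 and scan_size = 30, the window
starts at byte 34 and covers the rest of the packet; a scan_size of 31 would be
rejected.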

>In one of the previous version we used buffer address and iov to solve
>this issue. But in order to keep the API the same as crypto we decided
>to go
>with mbuf.

The general idea was to save the cycles spent converting an mbuf or a chain of mbufs to iov and back, not
just to stay in line with crypto.
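
To make the conversion cost concrete, here is a rough sketch of the per-op walk
an application would otherwise do. struct sketch_seg is a hypothetical stand-in
for one mbuf segment (a real mbuf has comparable next and data_len fields), and
sketch_chain_to_iov() is an illustrative name, not an existing DPDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical stand-in for one segment of a chained mbuf. */
struct sketch_seg {
	void *data;              /* first data byte of this segment */
	uint16_t data_len;       /* bytes of data in this segment */
	struct sketch_seg *next; /* next segment, or NULL at the end */
};

/*
 * Walk the chain and fill up to max_iov entries.
 * Returns the number of entries written, or -1 if the chain has more
 * segments than max_iov.
 */
static int
sketch_chain_to_iov(struct sketch_seg *seg, struct iovec *iov, int max_iov)
{
	int n = 0;

	for (; seg != NULL; seg = seg->next) {
		if (n == max_iov)
			return -1;
		iov[n].iov_base = seg->data;
		iov[n].iov_len = seg->data_len;
		n++;
	}
	return n;
}
```

This walk, plus the reverse bookkeeping after dequeue, is what taking the mbuf
directly in the op avoids.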

>This API is experimental and based on the usage we might change it to
>iov.
>
>> >+
>> >+	/* W2 */
>> >+	uint16_t group_id0;
>> >+	/**< First group_id to match the rule against. At minimum one
>> >group
>> >+	 * should be valid. Behaviour is undefined non of the groups are
>> >valid.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F
>> >+	 */
>> >+	uint16_t group_id1;
>> >+	/**< Second group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
>> >+	 */
>> >+	uint16_t group_id2;
>> >+	/**< Third group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
>> >+	 */
>> >+	uint16_t group_id3;
>> >+	/**< Forth group_id to match the rule against.
>> >+	 *
>> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
>> >+	 */
>> >+
>> >+	/* W3 */
>> >+	RTE_STD_C11
>> >+	union {
>> >+		uint64_t user_id;
>> >+		/**< Application specific opaque value. An application
>> >may use
>> >+		 * this field to hold application specific value to share
>> >+		 * between dequeue and enqueue operation.
>> >+		 * Implementation should not modify this field.
>> >+		 */
>> >+		void *user_ptr;
>> >+		/**< Pointer representation of *user_id* */
>> >+	};
>> >+
>> >+	/* W4 */
>> >+	struct rte_regex_match matches[];
>> >+	/**< Zero length array to hold the match tuples.
>> >+	 * The struct rte_regex_ops::nb_matches value holds the
>> >number of
>> >+	 * elements in this array.
>> >+	 *
>> >+	 * @see struct rte_regex_ops::nb_matches
>> >+	 */
>> >+};
>> >+
>>
>> Thanks,
>> Pavan.
>
>Thanks,
>Ori

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 16:36       ` Pavan Nikhilesh Bhagavatula
@ 2020-03-10 17:00         ` Ori Kam
  2020-03-12 12:13           ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-10 17:00 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi Pavan,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh Bhagavatula
> Sent: Tuesday, March 10, 2020 6:37 PM
> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; xiang.w.wang@intel.com
> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>; Alex
> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>;
> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> >
> >Hi Pavan,
> >
> >> -----Original Message-----
> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Pavan Nikhilesh
> >Bhagavatula
> >> Sent: Tuesday, March 10, 2020 3:42 PM
> >> To: Ori Kam <orika@mellanox.com>; Jerin Jacob Kollanukkaran
> >> <jerinj@marvell.com>; xiang.w.wang@intel.com
> >> Cc: dev@dpdk.org; Shahaf Shuler <shahafs@mellanox.com>;
> >> hemant.agrawal@nxp.com; Opher Reviv <opher@mellanox.com>;
> >Alex
> >> Rosenbaum <alexr@mellanox.com>; Dovrat Zifroni
> ><dovrat@marvell.com>;
> >> Prasun Kapoor <pkapoor@marvell.com>; nipun.gupta@nxp.com;
> >> bruce.richardson@intel.com; yang.a.hong@intel.com;
> >harry.chang@intel.com;
> >> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> >> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> >wushuai@inspur.com;
> >> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> >> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> >> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> >> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> >> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> >> <thomas@monjalon.net>
> >> Subject: Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev
> >subsystem
> >>
> >> Hi Ori,
> >>
> >> <snip>
> >>
> >> >+
> >> >+/**
> >> >+ * The generic *rte_regex_ops* structure to hold the RegEx
> >attributes
> >> >+ * for enqueue and dequeue operation.
> >> >+ */
> >> >+struct rte_regex_ops {
> >> >+	/* W0 */
> >> >+	uint16_t req_flags;
> >> >+	/**< Request flags for the RegEx ops.
> >> >+	 * @see RTE_REGEX_OPS_REQ_*
> >> >+	 */
> >> >+	uint16_t rsp_flags;
> >> >+	/**< Response flags for the RegEx ops.
> >> >+	 * @see RTE_REGEX_OPS_RSP_*
> >> >+	 */
> >> >+	uint16_t nb_actual_matches;
> >> >+	/**< The total number of actual matches detected by the
> >> >Regex device.*/
> >> >+	uint16_t nb_matches;
> >> >+	/**< The total number of matches returned by the RegEx
> >> >device for this
> >> >+	 * scan. The size of *rte_regex_ops::matches* zero length array
> >> >will be
> >> >+	 * this value.
> >> >+	 *
> >> >+	 * @see struct rte_regex_ops::matches, struct
> >> >rte_regex_match
> >> >+	 */
> >> >+
> >> >+	/* W1 */
> >> >+	struct rte_mbuf *mbuf; /**< source mbuf, to search in. */
> >>
> >> While implementing pcre2 SW driver I came across an oddity where
> >having
> >> mbuf alone
> >> wouldn’t suffice, we need to have scan start offset and scan length as
> >generally
> >> we would skip the
> >> L2/L3 header.
> >>
> >
> >Yes you are correct, in most cases the application will need
> >not the all mbuf or it will connect number of mbuf.
> >This can be acchived by modifying the mbuf to point to the correct data
> >start, and decrease the len.
> 
> Wouldn’t that complicate Txing the packet later on after dequeue from regex if
> the user decides to do so?.
> Instead we can have two fields in rte_regex_ops for storing scan_start_offset
> and
> scan_size
> 
The user will need to return the packet to its original state. I agree
that it is a bit harder for the application (but not by much). In any case the user knows
the size he removed, so when done he just needs to restore the original value.
On the other hand, it saves the user from working with iov structs.

Regarding your idea about start_offset and scan_size: it is a nice idea,
but I don't think it is worth it, since the start_offset is just what the user
needs to keep in order to return the mbuf to its original state.
Also, if the user wants to combine a number of messages, he can't use this
approach, since he will need to remove the header from the second
message as well and bind the two messages. So in any case the user must have some
logic.
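
As a rough illustration of the trim and restore steps described above, the
sketch below uses a hypothetical stand-in, struct sketch_pkt, holding only the
fields involved. In a real application the same effect would presumably come
from rte_pktmbuf_adj() before enqueue and rte_pktmbuf_prepend() after dequeue;
the helper names here are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the fields the sketch needs. */
struct sketch_pkt {
	uint8_t *buf_addr;  /* start of the underlying buffer */
	uint16_t data_off;  /* offset of the first data byte in the buffer */
	uint16_t data_len;  /* bytes of data starting at data_off */
};

/* Hide the first hdr_len bytes from the scan. Returns 0, or -1 if too short. */
static int
sketch_trim_front(struct sketch_pkt *m, uint16_t hdr_len)
{
	if (hdr_len > m->data_len)
		return -1;
	m->data_off += hdr_len;
	m->data_len -= hdr_len;
	return 0;
}

/* Undo sketch_trim_front() with the same hdr_len before transmitting. */
static int
sketch_restore_front(struct sketch_pkt *m, uint16_t hdr_len)
{
	if (hdr_len > m->data_off)
		return -1;
	m->data_off -= hdr_len;
	m->data_len += hdr_len;
	return 0;
}
```

The removed length has to be remembered between enqueue and dequeue; one
option, assuming the application does not need it for anything else, is to
store it in the op's user_id field.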

> >In one of the previous version we used buffer address and iov to solve
> >this issue. But in order to keep the API the same as crypto we decided
> >to go
> >with mbuf.
> 
> The general idea was to save cycles converting mbuf and chain of mbuf to iov
> and back not
> just to stay in line with crypto.
> 

I agree, and this was also my main thinking, but Jerin and other community members raised
this approach.
Each approach has advantages and disadvantages.
If the user wants, he can just give the whole mbuf. Also, since at least in some
cases the regex will be done after crypto, it makes sense to use the same structs.
There is also the advantage of sharing code between all the drivers (net/crypto/regex),
which can be done when using mbuf (for example, memory registration).

> >This API is experimental and based on the usage we might change it to
> >iov.
> >
> >> >+
> >> >+	/* W2 */
> >> >+	uint16_t group_id0;
> >> >+	/**< First group_id to match the rule against. At minimum one
> >> >group
> >> >+	 * should be valid. Behaviour is undefined non of the groups are
> >> >valid.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F
> >> >+	 */
> >> >+	uint16_t group_id1;
> >> >+	/**< Second group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F
> >> >+	 */
> >> >+	uint16_t group_id2;
> >> >+	/**< Third group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F
> >> >+	 */
> >> >+	uint16_t group_id3;
> >> >+	/**< Forth group_id to match the rule against.
> >> >+	 *
> >> >+	 * @see RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F
> >> >+	 */
> >> >+
> >> >+	/* W3 */
> >> >+	RTE_STD_C11
> >> >+	union {
> >> >+		uint64_t user_id;
> >> >+		/**< Application specific opaque value. An application
> >> >may use
> >> >+		 * this field to hold application specific value to share
> >> >+		 * between dequeue and enqueue operation.
> >> >+		 * Implementation should not modify this field.
> >> >+		 */
> >> >+		void *user_ptr;
> >> >+		/**< Pointer representation of *user_id* */
> >> >+	};
> >> >+
> >> >+	/* W4 */
> >> >+	struct rte_regex_match matches[];
> >> >+	/**< Zero length array to hold the match tuples.
> >> >+	 * The struct rte_regex_ops::nb_matches value holds the
> >> >number of
> >> >+	 * elements in this array.
> >> >+	 *
> >> >+	 * @see struct rte_regex_ops::nb_matches
> >> >+	 */
> >> >+};
> >> >+
> >>
> >> Thanks,
> >> Pavan.
> >
> >Thanks,
> >Ori

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 17:00         ` Ori Kam
@ 2020-03-12 12:13           ` Ori Kam
  0 siblings, 0 replies; 62+ messages in thread
From: Ori Kam @ 2020-03-12 12:13 UTC (permalink / raw)
  To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, xiang.w.wang
  Cc: dev, Shahaf Shuler, hemant.agrawal, Opher Reviv, Alex Rosenbaum,
	Dovrat Zifroni, Prasun Kapoor, nipun.gupta, bruce.richardson,
	yang.a.hong, harry.chang, gu.jian1, shanjiangh, zhangy.yun,
	lixingfu, wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, hongjun.ni, j.bromhead, deri, fc, arthur.su,
	Thomas Monjalon

Hi All,

If there are no more comments, I'm starting to implement the new class.

Thanks,
Ori

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
  2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
@ 2020-03-13  1:20   ` Wang Xiang
  2020-03-15 10:05     ` Ori Kam
  1 sibling, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-03-13  1:20 UTC (permalink / raw)
  To: Ori Kam
  Cc: jerinj, dev, pbhagavatula, shahafs, hemant.agrawal, opher, alexr,
	dovrat, pkapoor, nipun.gupta, bruce.richardson, yang.a.hong,
	harry.chang, gu.jian1, shanjiangh, zhangy.yun, lixingfu, wushuai,
	yuyingxia, fanchenggang, davidfgao, liuzhong1, zhaoyong11, oc,
	jim, hongjun.ni, j.bromhead, deri, fc, arthur.su, thomas

Hi Ori,

Sorry for the late response; I have been occupied with other work.
Two comments below to make the definitions compatible with Hyperscan.

Thanks,
Xiang

On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> +/**< This flag marks that the results for the pattern that is being compiled
> + * should include all possible matches.
> + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> + */
> +
Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL, since Hyperscan only supports
match-all mode and users would then not have to specify this flag per rule?

> + */
> +__rte_experimental
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to be able to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag when to scan payload size
> + * greater than struct rte_regex_dev_info::max_payload_size and/or
> + * matches can present across scan buffer boundaries.
> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> + */
> +
Can we add another flag, RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
we only return full matches for cross buffer scan, without any partial results and
without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-13  1:20   ` Wang Xiang
@ 2020-03-15 10:05     ` Ori Kam
  2020-03-16  1:25       ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-15 10:05 UTC (permalink / raw)
  To: Wang Xiang
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Xiang,


> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> Sent: Friday, March 13, 2020 3:20 AM
> To: Ori Kam <orika@mellanox.com>
> Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> 
> Sorry for the late response as I am occupied by other works.
> Two comments below to make the definitions compatible to Hyperscan.
> 
> Thanks,
> Xiang
> 
> On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > +/**< This flag marks that the results for the pattern that is being compiled
> > + * should include all possible matches.
> > + * @see struct rte_regex_dev_info::rule_flags, struct
> rte_regex_rule::rule_flags
> > + */
> > +
> Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since Hyperscan
> only supports
> match all mode and users don't have to specify this flag per rule?
>

Sure, we can replace RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
RTE_REGEX_DEV_CFG_MATCH_ALL, and add RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL.
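
A sketch of how an application might use such a pair, assuming the capability
is reported in the device info and the mode is requested in the device config.
The SKETCH_* macros stand in for RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL and
RTE_REGEX_DEV_CFG_MATCH_ALL; their bit positions, the capa_flags field and the
helper name are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the two flags; the bit positions are assumptions. */
#define SKETCH_CAPA_SUPP_MATCH_ALL (1ULL << 0)
#define SKETCH_CFG_MATCH_ALL       (1ULL << 1)

/* Hypothetical reduced device info and config structures. */
struct sketch_dev_info {
	uint64_t capa_flags;    /* what the device can do (assumed field) */
};

struct sketch_dev_config {
	uint64_t dev_cfg_flags; /* what the application asks for */
};

/*
 * Request match-all mode only if the device reports support for it.
 * Returns 0 on success, or -1 if the device lacks the capability.
 */
static int
sketch_enable_match_all(const struct sketch_dev_info *info,
			struct sketch_dev_config *cfg)
{
	if (!(info->capa_flags & SKETCH_CAPA_SUPP_MATCH_ALL))
		return -1;
	cfg->dev_cfg_flags |= SKETCH_CFG_MATCH_ALL;
	return 0;
}
```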

 
> > + */
> > +__rte_experimental
> > +int
> > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> *dev_info);
> > +
> > +/* Enumerates RegEx device configuration flags */
> > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > +/**< Cross buffer scan refers to the ability to be able to detect
> > + * matches that occur across buffer boundaries, where the buffers are
> related
> > + * to each other in some way. Enable this flag when to scan payload size
> > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > + * matches can present across scan buffer boundaries.
> > + *
> > + * @see struct rte_regex_dev_info::max_payload_size
> > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> rte_regex_dev_configure()
> > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > + */
> > +
> Can we add another flag
> RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> we only return full match for cross buffer scan without any partial result and
> without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.

I think it is good in any case to return a flag if the detection was based on
more than one buffer, so I don't really see the advantage of adding such a flag.
As far as I understand, in your case, if the match started in the previous buffer
and ended in the current buffer, you will also return the
RTE_REGEX_OPS_RSP_PMI_TOJ_F flag.
For my general knowledge: suppose the regex is ABC, the first buffer is xxxA
(size 4) and the second buffer is BCxx.
If I understand correctly, for the first buffer you will return no match.
For the second buffer you will return a match with end offset 2.
Am I correct?
Or will you return end offset 6, because the match started in the previous buffer?


Best,
Ori



* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-15 10:05     ` Ori Kam
@ 2020-03-16  1:25       ` Wang Xiang
  2020-03-16  9:09         ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-03-16  1:25 UTC (permalink / raw)
  To: Ori Kam
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
Hi Ori,

> Hi Xiang,
> 
> 
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > Sent: Friday, March 13, 2020 3:20 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > 
> > Hi Ori,
> > 
> > Sorry for the late response as I am occupied by other works.
> > Two comments below to make the definitions compatible to Hyperscan.
> > 
> > Thanks,
> > Xiang
> > 
> > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > +/**< This flag marks that the results for the pattern that is being compiled
> > > + * should include all possible matches.
> > > + * @see struct rte_regex_dev_info::rule_flags, struct
> > rte_regex_rule::rule_flags
> > > + */
> > > +
> > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since Hyperscan
> > only supports
> > match all mode and users don't have to specify this flag per rule?
> >
> 
> Sure, we can replace the RTE_REGEX_PCRE_RULE_MATCH_ALL_F with 
> RTE_REGEX_DEV_CFG_MATCH_ALL, and add RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL
>
Ack, thanks. 
>  
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > *dev_info);
> > > +
> > > +/* Enumerates RegEx device configuration flags */
> > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > + * matches that occur across buffer boundaries, where the buffers are
> > related
> > > + * to each other in some way. Enable this flag when to scan payload size
> > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > + * matches can present across scan buffer boundaries.
> > > + *
> > > + * @see struct rte_regex_dev_info::max_payload_size
> > > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > rte_regex_dev_configure()
> > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > + */
> > > +
> > Can we add another flag
> > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> > we only return full match for cross buffer scan without any partial result and
> > without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> 
> I think that it is good in any case to return a flag if the detection was based on 
> more than one buffer.
> So I don't really see the advantage of adding such a flag.
> As far as I understand in your case if the match started in previous buffer and ended 
> in the current buffer then you will return also the flag of RTE_REGEX_OPS_RSP_PMI_TOJ_F
> For my general knowledge, in your system if we have the following regex: ABC
> In the first buffer we have xxxA size 4 and the second buffer is BCxx
> If I understand correctly for first buffer you will return no match found.
> For the second buffer you will return found and end offset will be equal to  2
> Am I correct?
> Or you are going to return end offset 6 because it started from the previous buffer? 
> 
Hyperscan guarantees the same matching result regardless of whether the data is in
a single block or scattered across multiple blocks. So we'll return end offset 6 in
this case, without any flag indicating whether the match started in the previous
buffer or the current one.
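
To make the two offset conventions concrete, here is a minimal sketch in C. It is
not Hyperscan code and not part of the proposed regexdev API: struct stream_state
and scan_buffer() are hypothetical names invented for this illustration, and the
toy matcher only handles literal patterns whose first byte does not recur (such as
ABC). The point is only that state kept between buffers yields stream-relative end
offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-stream state that survives between buffers. */
struct stream_state {
	uint64_t consumed; /* bytes seen in earlier buffers of this stream */
	size_t matched;    /* pattern bytes matched so far, kept across buffers */
};

/*
 * Scan one buffer of a stream for the literal pattern `pat`.
 * Returns the end offset of the first match found in this buffer,
 * counted from the start of the stream, or -1 if there is none.
 */
static int64_t
scan_buffer(struct stream_state *st, const char *pat,
	    const char *buf, size_t len)
{
	size_t plen = strlen(pat);
	int64_t end = -1;
	size_t i;

	/* Toy matcher limitation: the first pattern byte must not recur. */
	assert(plen > 0 && strchr(pat + 1, pat[0]) == NULL);

	for (i = 0; i < len; i++) {
		if (buf[i] == pat[st->matched])
			st->matched++;
		else
			st->matched = (buf[i] == pat[0]) ? 1 : 0;

		if (st->matched == plen) {
			if (end < 0)
				end = (int64_t)(st->consumed + i + 1);
			st->matched = 0;
		}
	}
	st->consumed += len;
	return end;
}
```

With one shared state, scanning xxxA and then BCxx reports end offset 6, and
scanning xxxx and then ABCx reports end offset 7. Resetting the state before each
buffer would instead give no match for BCxx and a buffer-relative offset of 3 for
ABCx.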
> 
> Best,
> Ori
> 

Best,
Xiang


* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-16  1:25       ` Wang Xiang
@ 2020-03-16  9:09         ` Ori Kam
  2020-03-16 20:48           ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-16  9:09 UTC (permalink / raw)
  To: Wang Xiang
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Xiang,

> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> Sent: Monday, March 16, 2020 3:26 AM
> To: Ori Kam <orika@mellanox.com>
> Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> 
> On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
> Hi Ori,
> 
> > Hi Xiang,
> >
> >
> > > -----Original Message-----
> > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > Sent: Friday, March 13, 2020 3:20 AM
> > > To: Ori Kam <orika@mellanox.com>
> > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> Shahaf
> > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> harry.chang@intel.com;
> > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com;
> > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > <thomas@monjalon.net>
> > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > >
> > > Hi Ori,
> > >
> > > Sorry for the late response as I am occupied by other works.
> > > Two comments below to make the definitions compatible to Hyperscan.
> > >
> > > Thanks,
> > > Xiang
> > >
> > > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > > +/**< This flag marks that the results for the pattern that is being
> compiled
> > > > + * should include all possible matches.
> > > > + * @see struct rte_regex_dev_info::rule_flags, struct
> > > rte_regex_rule::rule_flags
> > > > + */
> > > > +
> > > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since
> Hyperscan
> > > only supports
> > > match all mode and users don't have to specify this flag per rule?
> > >
> >
> > Sure, we can replace the RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
> > RTE_REGEX_DEV_CFG_MATCH_ALL, and add
> RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL
> >
> Ack, thanks.
> >
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > > *dev_info);
> > > > +
> > > > +/* Enumerates RegEx device configuration flags */
> > > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > > + * matches that occur across buffer boundaries, where the buffers are
> > > related
> > > > + * to each other in some way. Enable this flag when to scan payload size
> > > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > > + * matches can present across scan buffer boundaries.
> > > > + *
> > > > + * @see struct rte_regex_dev_info::max_payload_size
> > > > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > > rte_regex_dev_configure()
> > > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > + */
> > > > +
> > > Can we add another flag
> > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> > > we only return full match for cross buffer scan without any partial result
> and
> > > without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> >
> > I think that it is good in any case to return a flag if the detection was based on
> > more than one buffer.
> > So I don't really see the advantage of adding such a flag.
> > As far as I understand in your case if the match started in previous buffer and
> ended
> > in the current buffer then you will return also the flag of
> RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > For my general knowledge, in your system if we have the following regex:
> ABC
> > In the first buffer we have xxxA size 4 and the second buffer is BCxx
> > If I understand correctly for first buffer you will return no match found.
> > For the second buffer you will return found and end offset will be equal to  2
> > Am I correct?
> > Or you are going to return end offset 6 because it started from the previous
> buffer?
> >
> Hyperscan guarantees the same matching result regardless of the data is in a
> single
> block or scattered to multiple blocks. So we'll return end offset 6 in this case
> without giving any flag indicating whether the match is started in previous
> buffer
> or current buffer.

What happens if the match is entirely in the second buffer? For example,
as before the regex is ABC, but now the first buffer is xxxx and the second
buffer is ABCx. Will the result be end offset 3 or 7?
If the answer is 3, then I think the flag is important, to let the user know
that they should count from the previous buffer.
If the answer is 7, then since only Hyperscan works with end offsets, it could be
defined that when working with end offsets and cross buffer scan is supported,
the result is always the true result.

So I think that RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F is not relevant in any
case, but the flag should be used if the offset returned is 3.


On a related question, how does Hyperscan mark that two buffers should be treated as one?
I think you are missing the cross_buf_id that was introduced in V3 but removed due to
lack of usage. This variable was designed to give the RegEx engine a place
to save the engine state.

> >
> > Best,
> > Ori
> >
> 
> Best,
> Xiang


* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-16 20:48           ` Wang Xiang
@ 2020-03-16 13:49             ` Ori Kam
  2020-03-16 21:10               ` Wang Xiang
  0 siblings, 1 reply; 62+ messages in thread
From: Ori Kam @ 2020-03-16 13:49 UTC (permalink / raw)
  To: Wang Xiang
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Wang,

PSB. If you don't have any objections or other comments,
I will start working on the class and will address all of this thread's comments
in the v1 patch.

Thanks,
Ori 

> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> Sent: Monday, March 16, 2020 10:48 PM
> To: Ori Kam <orika@mellanox.com>
> Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> <thomas@monjalon.net>
> Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> 
> Hi Ori,
> 
> On Mon, Mar 16, 2020 at 09:09:06AM +0000, Ori Kam wrote:
> > Hi Xiang,
> >
> > > -----Original Message-----
> > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > Sent: Monday, March 16, 2020 3:26 AM
> > > To: Ori Kam <orika@mellanox.com>
> > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> Shahaf
> > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> harry.chang@intel.com;
> > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> wushuai@inspur.com;
> > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > <thomas@monjalon.net>
> > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > >
> > > On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
> > > Hi Ori,
> > >
> > > > Hi Xiang,
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > > > Sent: Friday, March 13, 2020 3:20 AM
> > > > > To: Ori Kam <orika@mellanox.com>
> > > > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> > > Shahaf
> > > > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher
> Reviv
> > > > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> > > harry.chang@intel.com;
> > > > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > > wushuai@inspur.com;
> > > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > > <thomas@monjalon.net>
> > > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > > >
> > > > > Hi Ori,
> > > > >
> > > > > Sorry for the late response as I am occupied by other works.
> > > > > Two comments below to make the definitions compatible to Hyperscan.
> > > > >
> > > > > Thanks,
> > > > > Xiang
> > > > >
> > > > > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > > > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > > > > +/**< This flag marks that the results for the pattern that is being
> > > compiled
> > > > > > + * should include all possible matches.
> > > > > > + * @see struct rte_regex_dev_info::rule_flags, struct
> > > > > rte_regex_rule::rule_flags
> > > > > > + */
> > > > > > +
> > > > > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since
> > > Hyperscan
> > > > > only supports
> > > > > match all mode and users don't have to specify this flag per rule?
> > > > >
> > > >
> > > > Sure, we can replace the RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
> > > > RTE_REGEX_DEV_CFG_MATCH_ALL, and add
> > > RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL
> > > >
> > > Ack, thanks.
> > > >
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int
> > > > > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > > > > *dev_info);
> > > > > > +
> > > > > > +/* Enumerates RegEx device configuration flags */
> > > > > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > > > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > > > > + * matches that occur across buffer boundaries, where the buffers
> are
> > > > > related
> > > > > > + * to each other in some way. Enable this flag when to scan payload
> size
> > > > > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > > > > + * matches can present across scan buffer boundaries.
> > > > > > + *
> > > > > > + * @see struct rte_regex_dev_info::max_payload_size
> > > > > > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > > > > rte_regex_dev_configure()
> > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > > > + */
> > > > > > +
> > > > > Can we add another flag
> > > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> > > > > we only return full match for cross buffer scan without any partial result
> > > and
> > > > > without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> > > >
> > > > I think that it is good in any case to return a flag if the detection was
> based on
> > > > more than one buffer.
> > > > So I don't really see the advantage of adding such a flag.
> > > > As far as I understand in your case if the match started in previous buffer
> and
> > > ended
> > > > in the current buffer then you will return also the flag of
> > > RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > For my general knowledge, in your system if we have the following regex:
> > > ABC
> > > > In the first buffer we have xxxA size 4 and the second buffer is BCxx
> > > > If I understand correctly for first buffer you will return no match found.
> > > > For the second buffer you will return found and end offset will be equal to
> 2
> > > > Am I correct?
> > > > Or you are going to return end offset 6 because it started from the
> previous
> > > buffer?
> > > >
> > > Hyperscan guarantees the same matching result regardless of the data is in
> a
> > > single
> > > block or scattered to multiple blocks. So we'll return end offset 6 in this
> case
> > > without giving any flag indicating whether the match is started in previous
> > > buffer
> > > or current buffer.
> >
> > What will happen if the match was only in the second buffer? For example
> > Like before the regex is ABC but now the first buffer is xxxx and the second
> buffer
> > is ABCx will the result be end offset 3 or 7?
> > If the answer is 3 than I think the flag is important, in order to let the user
> know
> > that he should count from previous buffer.
> > If the answer is 7, since only Hyperscan works with end offset if could be
> defined
> > that when working with end offset and cross buffer scan is supported then the
> > result is always true result.
> >
> > So I think that RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F is not
> relevant in any
> > case but the flag should be used if the offset returned is 3.
> >
> Hyperscan returns 7 in this case, so these flags aren't necessary.
> 
> Hyperscan works in two modes:
> 1) return start and end offset
> 2) return end offset
> 
> Since only Hyperscan supports RTE_REGEX_DEV_CFG_MATCH_ALL, we can
> define
> the result always true if match all and cross buffer scan are
> configured. Having the scan full flag will make users better aware of
> the difference from HW solutions. If you really don't want keep this flag,
> please make this definition clear to users.

If I understand correctly, the issue with the new flag is that it would always be
set, so it is redundant. I will try to make this clearer in the comment.

> >
> > In other related question, how do Hyperscan marks that 2 buffers should be
> treated as one?
> > I think you are missing the cross_buf_id that was introduced in V3 but was
> removed due to
> > lack of usage. This variable was designed to be used in order to let the RegEx
> engine a place
> > to save the engine state.
> >
> I agree, we need to have the cross_buf_id back to support cross buffer
> scan.

I will re-add it.

> > > >
> > > > Best,
> > > > Ori
> > > >
> > >
> > > Best,
> > > Xiang
> 
> Thanks,
> Xiang


* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-16  9:09         ` Ori Kam
@ 2020-03-16 20:48           ` Wang Xiang
  2020-03-16 13:49             ` Ori Kam
  0 siblings, 1 reply; 62+ messages in thread
From: Wang Xiang @ 2020-03-16 20:48 UTC (permalink / raw)
  To: Ori Kam
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Ori,

On Mon, Mar 16, 2020 at 09:09:06AM +0000, Ori Kam wrote:
> Hi Xiang,
> 
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > Sent: Monday, March 16, 2020 3:26 AM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > 
> > On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
> > Hi Ori,
> > 
> > > Hi Xiang,
> > >
> > >
> > > > -----Original Message-----
> > > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > > Sent: Friday, March 13, 2020 3:20 AM
> > > > To: Ori Kam <orika@mellanox.com>
> > > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> > Shahaf
> > > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> > harry.chang@intel.com;
> > > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > wushuai@inspur.com;
> > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > <thomas@monjalon.net>
> > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > >
> > > > Hi Ori,
> > > >
> > > > Sorry for the late response as I am occupied by other works.
> > > > Two comments below to make the definitions compatible to Hyperscan.
> > > >
> > > > Thanks,
> > > > Xiang
> > > >
> > > > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > > > +/**< This flag marks that the results for the pattern that is being
> > compiled
> > > > > + * should include all possible matches.
> > > > > + * @see struct rte_regex_dev_info::rule_flags, struct
> > > > rte_regex_rule::rule_flags
> > > > > + */
> > > > > +
> > > > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since
> > Hyperscan
> > > > only supports
> > > > match all mode and users don't have to specify this flag per rule?
> > > >
> > >
> > > Sure, we can replace the RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
> > > RTE_REGEX_DEV_CFG_MATCH_ALL, and add
> > RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL
> > >
> > Ack, thanks.
> > >
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > > > *dev_info);
> > > > > +
> > > > > +/* Enumerates RegEx device configuration flags */
> > > > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > > > + * matches that occur across buffer boundaries, where the buffers are
> > > > related
> > > > > + * to each other in some way. Enable this flag when to scan payload size
> > > > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > > > + * matches can present across scan buffer boundaries.
> > > > > + *
> > > > > + * @see struct rte_regex_dev_info::max_payload_size
> > > > > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > > > rte_regex_dev_configure()
> > > > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > > + */
> > > > > +
> > > > Can we add another flag
> > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> > > > we only return full match for cross buffer scan without any partial result
> > and
> > > > without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> > >
> > > I think that it is good in any case to return a flag if the detection was based on
> > > more than one buffer.
> > > So I don't really see the advantage of adding such a flag.
> > > As far as I understand in your case if the match started in previous buffer and
> > ended
> > > in the current buffer then you will return also the flag of
> > RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > For my general knowledge, in your system if we have the following regex:
> > ABC
> > > In the first buffer we have xxxA size 4 and the second buffer is BCxx
> > > If I understand correctly for first buffer you will return no match found.
> > > For the second buffer you will return found and end offset will be equal to  2
> > > Am I correct?
> > > Or you are going to return end offset 6 because it started from the previous
> > buffer?
> > >
> > Hyperscan guarantees the same matching result regardless of the data is in a
> > single
> > block or scattered to multiple blocks. So we'll return end offset 6 in this case
> > without giving any flag indicating whether the match is started in previous
> > buffer
> > or current buffer.
> 
> What will happen if the match was only in the second buffer? For example
> Like before the regex is ABC but now the first buffer is xxxx and the second buffer
> is ABCx will the result be end offset 3 or 7?
> If the answer is 3 than I think the flag is important, in order to let the user know
> that he should count from previous buffer.
> If the answer is 7, since only Hyperscan works with end offset if could be defined
> that when working with end offset and cross buffer scan is supported then the
> result is always true result.
> 
> So I think that RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F is not relevant in any
> case but the flag should be used if the offset returned is 3.
>
Hyperscan returns 7 in this case, so these flags aren't necessary. 

Hyperscan works in two modes:
1) return start and end offset
2) return end offset

Since only Hyperscan supports RTE_REGEX_DEV_CFG_MATCH_ALL, we can define
the result as always true when match all and cross buffer scan are both
configured. Having the scan full flag would make users more aware of
the difference from HW solutions. If you really don't want to keep this flag,
please make this definition clear to users.
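
One way to state that definition is as a small check on the configured flags. The
sketch below is only an illustration under assumptions: the EX_* macros and their
bit positions are hypothetical stand-ins for the capability and configuration
flags discussed in this thread (they are not the values from the patch), and the
ex_* functions are invented names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define EX_CAPA_CROSS_BUFFER (1ULL << 0)
#define EX_CAPA_MATCH_ALL    (1ULL << 1)
#define EX_CFG_CROSS_BUFFER  (1ULL << 0)
#define EX_CFG_MATCH_ALL     (1ULL << 1)

/* True when every requested configuration flag is backed by a capability. */
static bool
ex_cfg_supported(uint64_t capa, uint64_t cfg)
{
	if ((cfg & EX_CFG_CROSS_BUFFER) && !(capa & EX_CAPA_CROSS_BUFFER))
		return false;
	if ((cfg & EX_CFG_MATCH_ALL) && !(capa & EX_CAPA_MATCH_ALL))
		return false;
	return true;
}

/*
 * The rule proposed above: with match all and cross buffer scan both
 * configured, reported offsets are the true (stream-relative) result.
 */
static bool
ex_offsets_are_stream_relative(uint64_t cfg)
{
	return (cfg & EX_CFG_CROSS_BUFFER) && (cfg & EX_CFG_MATCH_ALL);
}

/* Usage example: a device that supports both features. */
static void
ex_usage(void)
{
	uint64_t capa = EX_CAPA_CROSS_BUFFER | EX_CAPA_MATCH_ALL;
	uint64_t cfg = EX_CFG_CROSS_BUFFER | EX_CFG_MATCH_ALL;

	assert(ex_cfg_supported(capa, cfg));
	assert(ex_offsets_are_stream_relative(cfg));
}
```

An application could run a check like ex_cfg_supported() against the reported
capabilities before configuring the device, and then rely on stream-relative
offsets only when both configuration flags are set.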
> 
> In other related question, how do Hyperscan marks that 2 buffers should be treated as one?
> I think you are missing the cross_buf_id that was introduced in V3 but was removed due to 
> lack of usage. This variable was designed to be used in order to let the RegEx engine a place
> to save the engine state.
>
I agree, we need to bring cross_buf_id back to support cross buffer
scan.
> > >
> > > Best,
> > > Ori
> > >
> > 
> > Best,
> > Xiang

Thanks,
Xiang


* Re: [dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem
  2020-03-16 13:49             ` Ori Kam
@ 2020-03-16 21:10               ` Wang Xiang
  0 siblings, 0 replies; 62+ messages in thread
From: Wang Xiang @ 2020-03-16 21:10 UTC (permalink / raw)
  To: Ori Kam
  Cc: jerinj, dev, pbhagavatula, Shahaf Shuler, hemant.agrawal,
	Opher Reviv, Alex Rosenbaum, dovrat, pkapoor, nipun.gupta,
	bruce.richardson, yang.a.hong, harry.chang, gu.jian1, shanjiangh,
	zhangy.yun, lixingfu, wushuai, yuyingxia, fanchenggang,
	davidfgao, liuzhong1, zhaoyong11, oc, jim, hongjun.ni,
	j.bromhead, deri, fc, arthur.su, Thomas Monjalon

Hi Ori,

Yes, please go ahead with the patch.

Thanks,
Xiang

On Mon, Mar 16, 2020 at 01:49:51PM +0000, Ori Kam wrote:
> Hi Wang,
> 
> PSB, if you don't have any objections and other comments, 
> I will start working on the class and will address all of this thread comments 
> in the v1 patch,
> 
> Thanks,
> Ori 
> 
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang@intel.com>
> > Sent: Monday, March 16, 2020 10:48 PM
> > To: Ori Kam <orika@mellanox.com>
> > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com; Shahaf
> > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > bruce.richardson@intel.com; yang.a.hong@intel.com; harry.chang@intel.com;
> > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com; wushuai@inspur.com;
> > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > <thomas@monjalon.net>
> > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > 
> > Hi Ori,
> > 
> > On Mon, Mar 16, 2020 at 09:09:06AM +0000, Ori Kam wrote:
> > > Hi Xiang,
> > >
> > > > -----Original Message-----
> > > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > > Sent: Monday, March 16, 2020 3:26 AM
> > > > To: Ori Kam <orika@mellanox.com>
> > > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> > Shahaf
> > > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher Reviv
> > > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> > harry.chang@intel.com;
> > > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > wushuai@inspur.com;
> > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > <thomas@monjalon.net>
> > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > >
> > > > On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
> > > > Hi Ori,
> > > >
> > > > > Hi Xiang,
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Wang Xiang <xiang.w.wang@intel.com>
> > > > > > Sent: Friday, March 13, 2020 3:20 AM
> > > > > > To: Ori Kam <orika@mellanox.com>
> > > > > > Cc: jerinj@marvell.com; dev@dpdk.org; pbhagavatula@marvell.com;
> > > > Shahaf
> > > > > > Shuler <shahafs@mellanox.com>; hemant.agrawal@nxp.com; Opher
> > Reviv
> > > > > > <opher@mellanox.com>; Alex Rosenbaum <alexr@mellanox.com>;
> > > > > > dovrat@marvell.com; pkapoor@marvell.com; nipun.gupta@nxp.com;
> > > > > > bruce.richardson@intel.com; yang.a.hong@intel.com;
> > > > harry.chang@intel.com;
> > > > > > gu.jian1@zte.com.cn; shanjiangh@chinatelecom.cn;
> > > > > > zhangy.yun@chinatelecom.cn; lixingfu@huachentel.com;
> > > > wushuai@inspur.com;
> > > > > > yuyingxia@yxlink.com; fanchenggang@sunyainfo.com;
> > > > > > davidfgao@tencent.com; liuzhong1@chinaunicom.cn;
> > > > > > zhaoyong11@huawei.com; oc@yunify.com; jim@netgate.com;
> > > > > > hongjun.ni@intel.com; j.bromhead@titan-ic.com; deri@ntop.org;
> > > > > > fc@napatech.com; arthur.su@lionic.com; Thomas Monjalon
> > > > > > <thomas@monjalon.net>
> > > > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > > > >
> > > > > > Hi Ori,
> > > > > >
> > > > > > Sorry for the late response as I am occupied by other works.
> > > > > > Two comments below to make the definitions compatible to Hyperscan.
> > > > > >
> > > > > > Thanks,
> > > > > > Xiang
> > > > > >
> > > > > > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > > > > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > > > > > +/**< This flag marks that the results for the pattern that is being
> > > > compiled
> > > > > > > + * should include all possible matches.
> > > > > > > + * @see struct rte_regex_dev_info::rule_flags, struct
> > > > > > rte_regex_rule::rule_flags
> > > > > > > + */
> > > > > > > +
> > > > > > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since
> > > > Hyperscan
> > > > > > only supports
> > > > > > match all mode and users don't have to specify this flag per rule?
> > > > > >
> > > > >
> > > > > Sure, we can replace the RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
> > > > > RTE_REGEX_DEV_CFG_MATCH_ALL, and add
> > > > RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL
> > > > >
> > > > Ack, thanks.
> > > > >
> > > > > > > + */
> > > > > > > +__rte_experimental
> > > > > > > +int
> > > > > > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info
> > > > > > *dev_info);
> > > > > > > +
> > > > > > > +/* Enumerates RegEx device configuration flags */
> > > > > > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > > > > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > > > > > + * matches that occur across buffer boundaries, where the buffers
> > are
> > > > > > related
> > > > > > > + * to each other in some way. Enable this flag when to scan payload
> > size
> > > > > > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > > > > > + * matches can present across scan buffer boundaries.
> > > > > > > + *
> > > > > > > + * @see struct rte_regex_dev_info::max_payload_size
> > > > > > > + * @see struct rte_regex_dev_config::dev_cfg_flags,
> > > > > > rte_regex_dev_configure()
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > > > > + */
> > > > > > > +
> > > > > > Can we add another flag
> > > > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F? In this case,
> > > > > > we only return full match for cross buffer scan without any partial result
> > > > and
> > > > > > without returning response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> > > > >
> > > > > I think that it is good in any case to return a flag if the detection was
> > based on
> > > > > more than one buffer.
> > > > > So I don't really see the advantage of adding such a flag.
> > > > > As far as I understand in your case if the match started in previous buffer
> > and
> > > > ended
> > > > > in the current buffer then you will return also the flag of
> > > > RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > > For my general knowledge, in your system if we have the following regex:
> > > > ABC
> > > > > In the first buffer we have xxxA size 4 and the second buffer is BCxx
> > > > > If I understand correctly for first buffer you will return no match found.
> > > > > For the second buffer you will return found and end offset will be equal to
> > 2
> > > > > Am I correct?
> > > > > Or you are going to return end offset 6 because it started from the
> > previous
> > > > buffer?
> > > > >
> > > > Hyperscan guarantees the same matching result regardless of the data is in
> > a
> > > > single
> > > > block or scattered to multiple blocks. So we'll return end offset 6 in this
> > case
> > > > without giving any flag indicating whether the match is started in previous
> > > > buffer
> > > > or current buffer.
> > >
> > > What will happen if the match was only in the second buffer? For example
> > > Like before the regex is ABC but now the first buffer is xxxx and the second
> > buffer
> > > is ABCx will the result be end offset 3 or 7?
> > > If the answer is 3 than I think the flag is important, in order to let the user
> > know
> > > that he should count from previous buffer.
> > > If the answer is 7, since only Hyperscan works with end offset if could be
> > defined
> > > that when working with end offset and cross buffer scan is supported then the
> > > result is always true result.
> > >
> > > So I think that RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F is not
> > relevant in any
> > > case but the flag should be used if the offset returned is 3.
> > >
> > Hyperscan returns 7 in this case, so these flags aren't necessary.
> > 
> > Hyperscan works in two modes:
> > 1) return start and end offset
> > 2) return end offset
> > 
> > Since only Hyperscan supports RTE_REGEX_DEV_CFG_MATCH_ALL, we can
> > define
> > the result always true if match all and cross buffer scan are
> > configured. Having the scan full flag will make users better aware of
> > the difference from HW solutions. If you really don't want keep this flag,
> > please make this definition clear to users.
> 
> The issue with the new flag is that it should always be set, so it is redundant
> if I understand correctly. I will try to make it clearer in the comment.
> 
> > >
> > > In other related question, how do Hyperscan marks that 2 buffers should be
> > treated as one?
> > > I think you are missing the cross_buf_id that was introduced in V3 but was
> > removed due to
> > > lack of usage. This variable was designed to be used in order to let the RegEx
> > engine a place
> > > to save the engine state.
> > >
> > I agree, we need to have the cross_buf_id back to support cross buffer
> > scan.
> 
> I will re-add it.
> 
> > > > >
> > > > > Best,
> > > > > Ori
> > > > >
> > > >
> > > > Best,
> > > > Xiang
> > 
> > Thanks,
> > Xiang


* Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
@ 2019-10-20 14:09 Jerin Jacob Kollanukkaran
  0 siblings, 0 replies; 62+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2019-10-20 14:09 UTC (permalink / raw)
  To: Wang Xiang
  Cc: Thomas Monjalon, dev, Pavan Nikhilesh Bhagavatula, Shahaf Shuler,
	Hemant Agrawal, Opher Reviv, Alex Rosenbaum, Dovrat Zifroni,
	Prasun Kapoor, Nipun Gupta, Richardson, Bruce, Hong, Yang A,
	Chang, Harry, gu.jian1, shanjiangh, zhangy.yun, lixingfu,
	wushuai, yuyingxia, fanchenggang, davidfgao, liuzhong1,
	zhaoyong11, oc, jim, Ni, Hongjun, j.bromhead, deri, fc,
	arthur.su, Guy Kaneti, Smadar Fuks, Liron Himi, edwin.verplanke,
	keith.wiles

> -----Original Message-----
> From: Wang Xiang <xiang.w.wang@intel.com>
> Sent: Monday, October 14, 2019 7:29 PM
> To: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Pavan
> Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Hemant Agrawal <hemant.agrawal@nxp.com>;
> Opher Reviv <opher@mellanox.com>; Alex Rosenbaum
> <alexr@mellanox.com>; Dovrat Zifroni <dovrat@marvell.com>; Prasun Kapoor
> <pkapoor@marvell.com>; Nipun Gupta <nipun.gupta@nxp.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hong, Yang A <yang.a.hong@intel.com>;
> Chang, Harry <harry.chang@intel.com>; gu.jian1@zte.com.cn;
> shanjiangh@chinatelecom.cn; zhangy.yun@chinatelecom.cn;
> lixingfu@huachentel.com; wushuai@inspur.com; yuyingxia@yxlink.com;
> fanchenggang@sunyainfo.com; davidfgao@tencent.com;
> liuzhong1@chinaunicom.cn; zhaoyong11@huawei.com; oc@yunify.com;
> jim@netgate.com; Ni, Hongjun <hongjun.ni@intel.com>; j.bromhead@titan-
> ic.com; deri@ntop.org; fc@napatech.com; arthur.su@lionic.com; Guy Kaneti
> <guyk@marvell.com>; Smadar Fuks <smadarf@marvell.com>; Liron Himi
> <lironh@marvell.com>; edwin.verplanke@intel.com; keith.wiles@intel.com
> Subject: [EXT] Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> On Fri, Sep 27, 2019 at 02:35:00PM +0000, Jerin Jacob Kollanukkaran wrote:
> > > -----Original Message-----
> > > From: Wang Xiang <xiang.w.wang@intel.com>
> > >
> > > Hi Jerin,
> > >
> > > Thanks for your response. More comments below and inline.
> > >
> > > 1) I think the size of some variables (e.g. nb_matches, scan_size,
> > > matching offset, etc) should be increased based on what Hyperscan
> supports.
> > >
> > >     a) struct rte_regex_ops:
> > >
> > >         uint16_t scan_size => uint32_t scan_size
> >
> > I think, packet buffers will not be > 64K and getting more than
> > contiguous 64K DMAable memory will be difficult in DPDK.
> > Other than that, rte_regex_match is 64bit now, increasing width of Len
> > could increase the size of  "rte_regex_match". i.e Need more Bandwidth
> > for response.
> > Could other HW implementations share the views on max length is
> > supported on their implementation? Based on that we can decide.
> >
> OK, let's gather ideas from HW implementation.

Any inputs from Mellanox or other vendors on the "width" of the type and
size of "rte_regex_match", considering the performance implications?
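
To illustrate the size trade-off in question, here is a rough sketch. The two
structs are hypothetical layouts, not the RFC's actual rte_regex_match
definition; they only show how widening offset and len from 16 to 32 bits
grows a 64-bit response record and reduces how many records fit in an assumed
64-byte cache line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit match record with 16-bit offset and len. */
struct ex_match16 {
	uint16_t rule_id;
	uint16_t group_id;
	uint16_t offset;
	uint16_t len;
};

/* Hypothetical widened record with 32-bit offset and len. */
struct ex_match32 {
	uint16_t rule_id;
	uint16_t group_id;
	uint32_t offset;
	uint32_t len;
};

static_assert(sizeof(struct ex_match16) == 8, "narrow record is 64 bits");
static_assert(sizeof(struct ex_match32) == 12, "wide record is 96 bits");

/* Whole records that fit in one 64-byte cache line (assumed line size). */
static size_t ex_matches_per_cache_line(size_t record_size)
{
	return 64 / record_size;
}
```

Under these assumptions a cache line holds 8 narrow records but only 5 wide
ones, which is the bandwidth concern in a nutshell.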



> >
> > > >         uint8_t nb_actual_matches => uint64_t nb_actual_matches
> > > >         uint8_t nb_matches => uint64_t nb_matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> I think the number of matches depends on the number of total rules and scan
> size. Based on the definitions (16-bit nb_rules_per_group, 16-bit nb_groups and
> 16-bit scan size), the maximum possible matches could exceed 2^16. Users may
> get partial matches in this case while Hyperscan doesn't make compromises.
> It'll also be good to check other HW implementation.

See above.
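
As a back-of-the-envelope check on the counter width, here is a small sketch.
It assumes a deliberately loose worst case in which every rule can report one
match per byte scanned (plausible only in a match-all mode); the function
names are hypothetical and the bound is an assumption, not something any
implementation guarantees.

```c
#include <stdint.h>

/* Loose upper bound: every rule reports once per scanned byte (assumption). */
static uint64_t ex_worst_case_matches(uint16_t nb_groups,
				      uint16_t nb_rules_per_group,
				      uint16_t scan_size)
{
	return (uint64_t)nb_groups * nb_rules_per_group * scan_size;
}

/* Returns 1 if a counter of the given bit width can hold n, else 0. */
static int ex_counter_can_hold(uint64_t n, unsigned int bits)
{
	return bits >= 64 || n < (UINT64_C(1) << bits);
}
```

With just two rules in one group and a 65535-byte scan, the bound is already
131070, which no longer fits a 16-bit counter. Whether real traffic gets
anywhere near this bound is a separate question.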

> >
> > >
> > >     b) struct rte_regex_match:
> > >         uint16_t offset => uint32_t offset
> > >         uint16_t len => uint32_t len
> >
> > See above.
> >
> > >
> > >     c) uint16_t
> > >         rte_regex_rule_db_update(uint8_t dev_id, const struct
> > > rte_regex_rule *rules,
> > >                                  uint16_t nb_rules);
> > >     =>
> > >        uint32_t
> > >         rte_regex_rule_db_update(uint8_t dev_id, const struct
> > > rte_regex_rule *rules,
> > >                                  uint32_t nb_rules);
> >
> > OK. I will change it next version.
> >
> > >
> > >     d) int
> > >     rte_regex_queue_pair_setup(uint8_t dev_id, uint8_t queue_pair_id,
> > >                     const struct rte_regex_qp_conf *qp_conf);
> > >     =>
> > >        int
> > >     rte_regex_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> > >                     const struct rte_regex_qp_conf *qp_conf);
> >
> > OK. I will change it next version.
> >
> > >
> > >     e) struct rte_regex_dev_config:
> > >         uint8_t nb_max_matches => uint64_t nb_max_matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> See above.
> >
> > >
> > >     f) struct rte_regex_dev_info:
> > >         uint8_t max_matches => uint64_t max_matches
> >
> > 2^64 matches will be never possible in practical system. How about 2^16.
> >
> See above.
> >
> > >
> > > 2) There are rte_regex_dev_attr_get() and rte_regex_dev_attr_set()
> defined.
> > > Can all the attributes below be set by users? Is any of them read-only?
> >
> > See below,
> >
> > > /** Enumerates RegEx device attribute identifier */ enum
> > > rte_regex_dev_attr_id {
> > >     RTE_REGEX_DEV_ATTR_SOCKET_ID,
> > >     /**< The NUMA socket id to which the device is connected or
> > >      * a default of zero if the socket could not be determined.
> > >      * datatype: *int*
> > >      * operation: *get*
> >
> > *get* means read only. *get* and *set* means it supports both
> > operations.
> >
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_MATCHES,
> > >     /**< Maximum number of matches per scan.
> > >      * datatype: *uint8_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_MATCH_F
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_SCAN_TIMEOUT,
> > >     /**< Upper bound scan time in ns.
> > >      * datatype: *uint16_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_SCAN_TIMEOUT_F
> > >      */
> > >     RTE_REGEX_DEV_ATTR_MAX_PREFIX,
> > >     /**< Maximum number of prefix detected per scan.
> > >      * This would be useful for denial of service detection.
> > >      * datatype: *uint16_t*
> > >      * operation: *get* and *set*
> > >      *
> > >      * @see RTE_REGEX_OPS_RSP_MAX_PREFIX_F
> > >      */
> > > };
> > >
> > > 3) Both RTE_REGEX_PCRE_RULE_* and
> > > RTE_REGEX_DEV_PCRE_UNSUP_* can be viewed as device capabilities. Can
> > > we merge them with RTE_REGEX_DEV_CAPA_RUNTIME_COMPILATION_F
> and have
> > > a unified regex_dev_capa in struct rte_regex_dev_info.
> >
> > Sure. I will fix it next version.
> >
> > >
> > >
> > > 4) It'll be good if we can also define synchronous matching API for
> > > users who want to have a one-off scan and wait for the results.
> >
> > Makes sense. I will add a synchronous matching API in the next version (I
> > understand it will be useful for SW implementations). Probably expose it as
> > an INFO flag to indicate the preference.
> >
> > >
> > > On Tue, Sep 10, 2019 at 08:05:39AM +0000, Jerin Jacob Kollanukkaran
> wrote:
> > > > Hi Xiang,
> > > >
> > > > Sorry for delay in response(Was busy with 19.11 proposal
> > > > deadline). Please
> > > see inline.
> > > >
> > > > >
> > > > > Reply to Xiang's queries in main thread:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > Some questions regarding APIs. Could you please give more insights?
> > > > >
> > > > > 1) rte_regex_ops
> > > > >       a) rsp_flags
> > > > >       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> > > > > RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
> > > > >       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a
> > > > > partial match at the end of current buffer after scan.
> > > > >       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> > > > >
> > > > > [Jerin] Since we need three states to represent partial match
> > > > > buffer, RTE_REGEX_OPS_RSP_PMI_SOJ_F to represent start of the
> > > > > buffer, intermediate buffers with no flag, and end of the buffer
> > > > > with RTE_REGEX_OPS_RSP_PMI_EOJ
> > > >
> > > > > [Xiang] How could a user leverage these flags for matching?
> > > > > Suppose a large buffer is divided into multiple chunks. Will
> > > > > RTE_REGEX_OPS_RSP_PMI_SOJ_F cause an early quit once it isn't
> > > > > set after scan the first chunk. Similarly,
> > > > > RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching
> > > > > future buffers after finish the last
> > > chunk?
> > > >
> > > > Let me describe with an example,
> > > >
> > > > Assume,
> > > > 1) struct rte_regex_dev_info:: max_payload_size set to 1024
> > > > 2) rte_regex_dev_config:: dev_cfg_flags configured with
> > > > RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > > 3) Device programmed with matching "hello\s+world" pattern
> > > > 4) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > > and struct rte_regex_op:: scan_size = 1024
> > > >
> > > > > data[0..1021] = data don't have hello world pattern
> > > > > data[1022] = 'h'
> > > > data[1023] = 'e'
> > > >
> > > > 5) user enqueue struct rte_regex_ops:: buf_addr point following "data"
> > > > and struct rte_regex_op:: scan_size = 9
> > > >
> > > > data[0] = 'l'
> > > > data[1] = 'l'
> > > > data[2] = 'o'
> > > > data[3] = ' '
> > > > data[4] = 'w'
> > > > data[5] = 'o'
> > > > data[6] = 'r'
> > > > data[7] = 'l'
> > > > data[8] = 'd'
> > > >
> > > > If so,
> > > >
> > > > Response to 4) will be RTE_REGEX_OPS_RSP_PMI_SOJ_F in
> rte_regex_ops::
> > > > rsp_flags on dequeue Where rte_regex_match:: offset is 1022 and
> > > > len 2
> > > >
> > > > Response to 5) will be RTE_REGEX_OPS_RSP_PMI_EOJ_F in
> rte_regex_ops::
> > > > rsp_flags on dequeue Where rte_regex_match:: offset is 0 and len 9
> > > >
> > > If the defined pattern is "hello.*world" instead of "hello\s+world",
> > > and we enqueue following struct rte_regex_ops:
> > >
> > > 1) rte_regex_op:: scan_size = 1024
> > >
> > > >    data[0..1021] = data don't have hello world pattern
> > >    data[1022] = 'h'
> > >    data[1023] = 'e'
> > >
> > > 2) rte_regex_op:: scan_size = 9
> > >    data[0] = 'l'
> > >    data[1] = 'l'
> > >    data[2] = 'o'
> > >    data[3] = ' '
> > >    data[4] = 'w'
> > >    data[5] = 'o'
> > >    data[6] = 'r'
> > >    data[7] = 'l'
> > >    data[8] = 'd'
> > >
> > > 3) rte_regex_op:: scan_size = 5
> > >    data[0] = 'w'
> > >    data[1] = 'o'
> > >    data[2] = 'r'
> > >    data[3] = 'l'
> > >    data[4] = 'd'
> > >
> > > Will response to 3) have RTE_REGEX_OPS_RSP_PMI_EOJ_F in
> rte_regex_ops::
> > > rsp_flags on dequeue
> > > Where rte_regex_match:: offset is 0 and len 4?
> >
> > Yes.
> >
> > >
> > > I am wondering what's your expected behavior for .* or similar
> > > > syntax and if there are syntax compatibility issues. We report all matches in
> Hyperscan, e.g.
> > > report end match offsets 11 and 16 for pattern "hello.*world" and
> > > corpus "hello worldworld".
> > >
> > > BTW, not sure how other hardware devices handle cross buffer scan.
> > > > Hyperscan doesn't report matches for start and intermediate buffers
> > > but only reports end offset if a full match is found.
> > >
> > > >
> > > > >
> > > > >       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a
> > > > > definition for a specific hardware implementation. I am
> > > > > wondering what this PREFIX refers to:)?
> > > > >
> > > > > [Jerin] Yes. Looks like it is for hardware specific implementation.
> > > > > Introduced rte_regex_dev_attr_set/get functions to make it
> > > > > portable and To add new implementation specific fields.
> > > > > For example, if a rule is
> > > > > /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is
> > > > > considered the factor. The prefix is a literal string, while the
> > > > > factor can contain complex regular expression constructs. As a
> > > > > result, rule matching occurs in two stages: prefix matching and
> > > > > factor matching.
> > > > >
> > > > >       b)  user_id or user_ptr
> > > > >       Under what kind of circumstances should an application
> > > > > pass value into these variables for enqueue and dequeuer operations?
> > > > >
> > > > > > [Jerin] Just like rte_crypto_ops, struct rte_regex_ops is normally
> > > > > > allocated from a mempool. On enqueue, the user can specify user_id
> > > > > > if needed, in order to identify the op on dequeue. The use case
> > > > > > could be storing a sequence number from the application's POV, or
> > > > > > storing the mbuf ptr for which the pattern match is requested, etc.
> > > > >
> > > > >
> > > > >  2) rte_regex_match
> > > > >       a) offset; /**< Starting Byte Position for matched rule.
> > > > > */ and  uint16_t len; /**< Length of match in bytes */
> > > > >       Looks like the matching offset is defined as *starting
> > > > > matching offset* instead of *end matching offset*, e.g. report
> > > > > the offset of
> > > "a" instead of "c"
> > > > > for pattern "abc".
> > > > >       If so, this makes it hard to integrate software regex
> > > > > libraries such as Hyperscan and RE2 as they only report *end
> > > > > matching offset* without length of match.
> > > > >       Although Hyperscan has API for *starting matching offset*,
> > > > > it only delivers partial syntax support. So I think we have to
> > > > > define *end of matching offset* for software solutions.
> > > > >
> > > > > [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST
> tradeoffs.
> > > > > I thought application would need always the length of the match.
> > > > > Probably we will see how other HW implementation (from Mellanox)
> > > > > etc. We will try to abstract it, probably we can make it as
> > > > > function of "user requested".
> > > > > [Xiang] Yes, it will be good to make it per user request. At
> > > > > least from Hyperscan user's point of view, start of match and
> > > > > match length are not mandatory.
> > > >
> > > > OK. I think, we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START In
> > > > device configure.
> > > >
> > > > Since offset+len == end, we can introduce following generic inline
> function.
> > > >
> > > > > static inline uint32_t
> > > > > rte_regex_match_end(const struct rte_regex_match *match) {
> > > > > 	return (uint32_t)match->offset + match->len;
> > > > > }
> > > >
> > > > > Example: pattern to match is "hello\s+world" and data is the following:
> > > > > data[4] = 'h'
> > > > data[5] = 'e'
> > > > data[6] = 'l'
> > > > data[7] = 'l'
> > > > data[8] = 'o'
> > > > data[9] = ' '
> > > > data[10] = 'w'
> > > > data[11] = 'o'
> > > > data[12] = 'r'
> > > > data[13] = 'l'
> > > > data[14] = 'd'
> > > >
> > > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > match->offset returns 4
> > > > match->len returns 11
> > > >
> > > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > driver MAY return the following(in hyperscan case)
> > > > match->offset returns 0
> > > > match->len returns 11 + 4
> > > >
> > > > > In both cases (irrespective of flags, to make application life easy)
> > > > > rte_regex_match_end() would return 15.
> > > > > If the application demands MATCH_AS_START, then the driver can return
> > > > > match->offset 4 and match->len 11, i.e. set HS_FLAG_SOM_LEFTMOST in the
> > > > > hyperscan driver. But the application should use rte_regex_match_end()
> > > > > to find the end of the match, so that it works in all cases.
> > > >
> > > > Is it OK?
> > > >
> > > Can we replace len with end offset? So we can change "offset" to
> "start_offset"
> > > and len to "end_ offset" in struct rte_regex_match. Users interested
> > > in len could take "end_offset - start_offset".
> > > We may also change RTE_REGEX_DEV_CFG_MATCH_AS_START to
> > > RTE_REGEX_DEV_CFG_MATCH_START
> > >
> > > In your example,
> > > if device is configured with RTE_REGEX_DEV_CFG_MATCH_START
> > > match->start_offset returns 4
> > > match->end_offset returns 15
> > >
> > > if device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_START
> > > match->start_offset returns 0
> > > match->end_offset returns 15
> >
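
The two conventions being compared (offset plus len versus start and end
offsets) can be put side by side in a short sketch. The struct and function
names are hypothetical and the 16-bit field widths are taken from the
discussion; this is not the final rte_regex_match layout.

```c
#include <stdint.h>

/* Convention A: start offset plus length (what the HW here reports). */
struct ex_match_len {
	uint16_t offset;
	uint16_t len;
};

/* Convention B: explicit start and end offsets (the proposal above). */
struct ex_match_range {
	uint16_t start_offset;
	uint16_t end_offset;
};

/* End of match under convention A; widened so the sum cannot wrap. */
static inline uint32_t ex_match_end(const struct ex_match_len *m)
{
	return (uint32_t)m->offset + m->len;
}

/* Convert A to B; assumes the match ends within a 16-bit scan size. */
static inline struct ex_match_range ex_to_range(const struct ex_match_len *m)
{
	struct ex_match_range r;

	r.start_offset = m->offset;
	r.end_offset = (uint16_t)ex_match_end(m);
	return r;
}

/* Length under convention B. */
static inline uint16_t ex_range_len(const struct ex_match_range *r)
{
	return (uint16_t)(r->end_offset - r->start_offset);
}
```

Both {offset 4, len 11} and {offset 0, len 15} end at 15, so an application
that only consumes the end offset behaves the same whether or not start of
match is enabled.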
> >
> > This part is a little tricky, as HW descriptions need to be rewritten on response.
> > This is one issue I foresaw earlier: coming up with an rte_regex_match
> > that works for all implementations without performance issues.
> >
> > We have two HW implementations, both returns start_off and len.
> > Lets get input from other HW implementation on the semantics of
> > rte_regex_match. Based on that, we can decide how to go about it?
> > Thoughts from Mellanox or other vendors?
> >
> Sure. Let's get more inputs on this.
> >
> >
> > >
> > > > >
> > > > > 3)  rte_regex_rule_db_update()
> > > > >     Does this mean we can dynamically add or delete rules for an
> > > > > already generated database without recompile from scratch for
> > > > > hardware Regex implementation?
> > > > >     If so, this isn't possible for software solutions as they
> > > > > don't support dynamic database update and require recompile.
> > > > >
> > > > > [Jerin] rte_regex_rule_db_update() internally it would call
> > > > > recompile function for both HW and SW.
> > > > > See rte_regex_dev_config::rule_db in rte_regex_dev_configure()
> > > > > for precompiled rule database case.
> > > > > [Xiang] OK, sounds like we have to save the original rule-set
> > > > > for the device in order to do recompile. I see both ADD and
> > > > > REMOVE operators from rte_regex_rule.
> > > > > For rules with REMOVE operator, what's the expected behavior to
> > > > > handle them for the old rule-set? Do we need to go through the
> > > > > old rule-set and remove corresponding rules before doing recompile?
> > > >
> > > > Yes.
> > > >
> > > I think it'll be better to change rte_regex_rule_db_update() to
> > > rte_regex_rule_compile() and have users to provide a full rule-set.
> > > So we don't have to maintain old rule-set and decide which one to
> > > keep and remove. We can simply recompile new rule-set and get rid of
> > > rte_regex_rule_op in this case.
> >
> >
> > On virtualized HW implementations, the rule database is maintained by a
> > single body, so the above scheme works with both SW and HW implementations.
> > It also makes users' lives easier, as they don't need to maintain the rules.
> >
> > I don't have a preference on the rte_regex_rule_db_update() name; I can
> > change it to rte_regex_rule_compile() if required, keeping the above
> > functionality. Let me know.
> >
> >
> OK, I'm good if you are willing to maintain it for users. Then both
> rte_regex_rule_db_update() and rte_regex_rule_compile() work for me.
> >
> >
> >
> >
> >
> >
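
For reference, the "driver keeps the rule-set and recompiles" behaviour agreed
on above could look roughly like this inside a SW driver. It is a sketch under
assumptions: the ex_ types, the fixed-size table and the ADD-replaces-existing
policy are all hypothetical, and the actual recompile step is left as a
comment.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_rule_op { EX_RULE_OP_ADD, EX_RULE_OP_REMOVE };

/* Hypothetical rule entry, loosely modelled on the fields discussed. */
struct ex_rule {
	enum ex_rule_op op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre_rule;
};

#define EX_DB_MAX 64

/* Rule-set the driver keeps so that it can recompile after every update. */
struct ex_rule_db {
	struct ex_rule rules[EX_DB_MAX];
	size_t nb_rules;
};

/* Index of the rule with this (group_id, rule_id), or -1 if absent. */
static long ex_db_find(const struct ex_rule_db *db, uint16_t group_id,
		       uint32_t rule_id)
{
	for (size_t i = 0; i < db->nb_rules; i++)
		if (db->rules[i].group_id == group_id &&
		    db->rules[i].rule_id == rule_id)
			return (long)i;
	return -1;
}

/* Apply ADD/REMOVE ops; returns the new rule count, or -1 if the table is full. */
static int ex_db_update(struct ex_rule_db *db, const struct ex_rule *ops,
			size_t nb_ops)
{
	for (size_t i = 0; i < nb_ops; i++) {
		long idx = ex_db_find(db, ops[i].group_id, ops[i].rule_id);

		if (ops[i].op == EX_RULE_OP_ADD) {
			if (idx >= 0)
				db->rules[idx] = ops[i]; /* replace existing */
			else if (db->nb_rules < EX_DB_MAX)
				db->rules[db->nb_rules++] = ops[i];
			else
				return -1;
		} else if (idx >= 0) {
			/* REMOVE: move the last rule into the freed slot. */
			db->rules[idx] = db->rules[--db->nb_rules];
		}
	}
	/* A real driver would recompile the whole rule-set here. */
	return (int)db->nb_rules;
}
```

Removing a rule that is not present is silently ignored in this sketch; a real
API would need to decide whether that is an error.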


end of thread, other threads:[~2020-03-16 13:54 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
2019-06-27 15:50 [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem jerinj
2019-07-15  4:26 ` Jerin Jacob Kollanukkaran
2019-08-15  9:35 ` Thomas Monjalon
2019-08-15 11:34   ` Thomas Monjalon
2019-08-19  3:09     ` Jerin Jacob Kollanukkaran
2019-08-20  1:54       ` Wang, Xiang W
2019-09-10  8:05         ` Jerin Jacob Kollanukkaran
2019-09-19 13:58           ` Wang Xiang
2019-09-27 14:35             ` Jerin Jacob Kollanukkaran
2019-10-14 13:59               ` Wang Xiang
2020-01-26 11:55                 ` Ori Kam
2019-08-21  5:32     ` Shahaf Shuler
2019-08-21 15:12       ` John Bromhead
2019-09-10 10:31       ` Jerin Jacob Kollanukkaran
2019-09-10 11:02       ` Jerin Jacob Kollanukkaran
2019-09-27 14:45         ` Jerin Jacob Kollanukkaran
2019-10-02  5:53           ` Shahaf Shuler
2019-10-02  8:31             ` Jerin Jacob Kollanukkaran
2019-10-02  8:52               ` Shahaf Shuler
2019-10-02  9:34                 ` Jerin Jacob Kollanukkaran
2020-01-27 21:19 ` [dpdk-dev] [PATCH v2] net/regexdev: " Ori Kam
2020-01-28  9:00 ` [dpdk-dev] [PATCH v3] regexdev: " Ori Kam
2020-02-22 16:52   ` Jerin Jacob
2020-02-23  8:41     ` Ori Kam
2020-02-23  9:53       ` Jerin Jacob
2020-02-23 12:33         ` Ori Kam
2020-02-25  5:57           ` Jerin Jacob
2020-02-25  7:48             ` Ori Kam
2020-02-26  9:03               ` Wang Xiang
2020-02-26  8:36                 ` Ori Kam
2020-02-27  9:25                   ` Wang Xiang
2020-02-27  7:31                     ` Ori Kam
2020-02-27  9:16                       ` Wang Xiang
2020-02-27 14:40 ` [dpdk-dev] [RFC v4] " Ori Kam
2020-02-27 14:55   ` Jerin Jacob
2020-02-27 15:08 ` [dpdk-dev] [RFC v5] " Ori Kam
2020-03-01  6:13   ` [dpdk-dev] [EXT] " Pavan Nikhilesh Bhagavatula
2020-03-01  7:31     ` Ori Kam
2020-03-01 13:23       ` Pavan Nikhilesh Bhagavatula
2020-03-01 14:10         ` Ori Kam
2020-03-01 14:38           ` Pavan Nikhilesh Bhagavatula
2020-03-01 15:41             ` Ori Kam
2020-03-01 15:57               ` Pavan Nikhilesh Bhagavatula
2020-03-02  7:18                 ` Jerin Jacob
2020-03-03  7:06                   ` Ori Kam
2020-03-02  7:05   ` [dpdk-dev] " Wang Xiang
2020-03-03  7:44     ` Ori Kam
2020-03-03  7:54       ` Jerin Jacob
2020-03-10 10:32 ` [dpdk-dev] [RFC v6] " Ori Kam
2020-03-10 13:42   ` Pavan Nikhilesh Bhagavatula
2020-03-10 16:23     ` Ori Kam
2020-03-10 16:36       ` Pavan Nikhilesh Bhagavatula
2020-03-10 17:00         ` Ori Kam
2020-03-12 12:13           ` Ori Kam
2020-03-13  1:20   ` Wang Xiang
2020-03-15 10:05     ` Ori Kam
2020-03-16  1:25       ` Wang Xiang
2020-03-16  9:09         ` Ori Kam
2020-03-16 20:48           ` Wang Xiang
2020-03-16 13:49             ` Ori Kam
2020-03-16 21:10               ` Wang Xiang
2019-10-20 14:09 [dpdk-dev] [RFC PATCH v1] " Jerin Jacob Kollanukkaran
