DPDK patches and discussions
From: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
To: "Wang, Xiang W" <xiang.w.wang@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	Opher Reviv <opher@mellanox.com>,
	Alex Rosenbaum <alexr@mellanox.com>,
	Dovrat Zifroni <dovrat@marvell.com>,
	Prasun Kapoor <pkapoor@marvell.com>,
	Nipun Gupta <nipun.gupta@nxp.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Hong, Yang A" <yang.a.hong@intel.com>,
	"Chang, Harry" <harry.chang@intel.com>,
	"gu.jian1@zte.com.cn" <gu.jian1@zte.com.cn>,
	"shanjiangh@chinatelecom.cn" <shanjiangh@chinatelecom.cn>,
	"zhangy.yun@chinatelecom.cn" <zhangy.yun@chinatelecom.cn>,
	"lixingfu@huachentel.com" <lixingfu@huachentel.com>,
	"wushuai@inspur.com" <wushuai@inspur.com>,
	"yuyingxia@yxlink.com" <yuyingxia@yxlink.com>,
	"fanchenggang@sunyainfo.com" <fanchenggang@sunyainfo.com>,
	"davidfgao@tencent.com" <davidfgao@tencent.com>,
	"liuzhong1@chinaunicom.cn" <liuzhong1@chinaunicom.cn>,
	"zhaoyong11@huawei.com" <zhaoyong11@huawei.com>,
	"oc@yunify.com" <oc@yunify.com>,
	"jim@netgate.com" <jim@netgate.com>,
	"Ni, Hongjun" <hongjun.ni@intel.com>,
	"j.bromhead@titan-ic.com" <j.bromhead@titan-ic.com>,
	"deri@ntop.org" <deri@ntop.org>,
	"fc@napatech.com" <fc@napatech.com>,
	"arthur.su@lionic.com" <arthur.su@lionic.com>,
	Guy Kaneti <guyk@marvell.com>, Smadar Fuks <smadarf@marvell.com>,
	Liron Himi <lironh@marvell.com>
Subject: Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem
Date: Tue, 10 Sep 2019 08:05:39 +0000
Message-ID: <BYAPR18MB2424A91AF798B957AD29C9E2C8B60@BYAPR18MB2424.namprd18.prod.outlook.com>
In-Reply-To: <E77DB150E44F474D9AF7B121C3D290E95F061572@SHSMSX105.ccr.corp.intel.com>

Hi Xiang,

Sorry for the delay in responding (I was busy with the 19.11 proposal deadline). Please see inline.
 
> 
> Reply to Xiang's queries in main thread:
> 
> Hi all,
> 
> Some questions regarding APIs. Could you please give more insights?
> 
> 1) rte_regex_ops
>       a) rsp_flags
>       These two flags RTE_REGEX_OPS_RSP_PMI_SOJ_F and
> RTE_REGEX_OPS_RSP_PMI_EOJ_F are used for cross buffer scan.
>       RTE_REGEX_OPS_RSP_PMI_EOJ_F tells whether we have a partial match
> at the end of current buffer after scan.
>       What's the purpose of having RTE_REGEX_OPS_RSP_PMI_SOJ_F?
> 
> [Jerin] Since we need three states to represent partial match buffer,
> RTE_REGEX_OPS_RSP_PMI_SOJ_F to
> represent start of the buffer, intermediate buffers with no flag, and end of
> the buffer with RTE_REGEX_OPS_RSP_PMI_EOJ

> [Xiang] How could a user leverage these flags for matching? Suppose a large
> buffer is divided into multiple chunks. Will RTE_REGEX_OPS_RSP_PMI_SOJ_F
> cause an early quit once it isn't set after scan the first chunk. Similarly,
> RTE_REGEX_OPS_RSP_PMI_EOJ tells a user whether to stop matching future
> buffers after finish the last chunk?

Let me describe it with an example.

Assume:
1) struct rte_regex_dev_info::max_payload_size is set to 1024.
2) rte_regex_dev_config::dev_cfg_flags is configured with RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F.
3) The device is programmed to match the pattern "hello\s+world".
4) The user enqueues an op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 1024:

data[0..1021] = data that does not contain the "hello world" pattern
data[1022] = 'h'
data[1023] = 'e'

5) The user then enqueues a second op with struct rte_regex_ops::buf_addr pointing to the following "data" and struct rte_regex_ops::scan_size = 9:

data[0] = 'l'
data[1] = 'l'
data[2] = 'o'
data[3] = ' '
data[4] = 'w'
data[5] = 'o'
data[6] = 'r'
data[7] = 'l'
data[8] = 'd'

Then:

The response to 4) will have RTE_REGEX_OPS_RSP_PMI_SOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset = 1022 and len = 2.

The response to 5) will have RTE_REGEX_OPS_RSP_PMI_EOJ_F set in rte_regex_ops::rsp_flags on dequeue,
with rte_regex_match::offset = 0 and len = 9.
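
To make the application side concrete, here is a minimal sketch of how an application might stitch the two responses above into one logical match. Everything in it is illustrative: the flag bit positions, struct chunk_rsp, struct xbuf_match and xbuf_match_update() are hypothetical names, not part of the RFC. It also assumes that a response with neither flag set while a match is pending means the match continues through that chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The RFC names these flags; the bit positions below are assumed. */
#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1u << 0)
#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1u << 1)

/* Hypothetical, reduced view of the per-op response discussed above. */
struct chunk_rsp {
	uint16_t rsp_flags; /* PMI flags reported on dequeue */
	uint16_t offset;    /* match offset within this chunk */
	uint16_t len;       /* matched bytes within this chunk */
};

/* Hypothetical application state for one match that may span chunks. */
struct xbuf_match {
	bool in_progress;   /* SOJ seen, EOJ not yet seen */
	bool complete;      /* EOJ seen after SOJ */
	uint32_t start;     /* absolute start offset in the stream */
	uint32_t total_len; /* bytes matched so far across chunks */
};

/*
 * Fold one chunk's response into the state.
 * chunk_base is the stream offset of the chunk's first byte.
 */
static void
xbuf_match_update(struct xbuf_match *m, const struct chunk_rsp *rsp,
		  uint32_t chunk_base)
{
	assert(m != NULL && rsp != NULL);

	if (rsp->rsp_flags & RTE_REGEX_OPS_RSP_PMI_SOJ_F) {
		/* Partial match begins at the tail of this chunk. */
		m->in_progress = true;
		m->complete = false;
		m->start = chunk_base + rsp->offset;
		m->total_len = rsp->len;
	} else if (m->in_progress) {
		/* Intermediate chunk (no flag) or final chunk (EOJ). */
		m->total_len += rsp->len;
		if (rsp->rsp_flags & RTE_REGEX_OPS_RSP_PMI_EOJ_F) {
			m->in_progress = false;
			m->complete = true;
		}
	}
}
```

Feeding it the response to 4) with chunk_base 0 and then the response to 5) with chunk_base 1024 leaves the state complete, with start 1022 and total_len 11, which is the full "hello world" span.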


> 
>       RTE_REGEX_OPS_RSP_MAX_PREFIX_F: This looks like a definition for a
> specific hardware implementation. I am wondering what this PREFIX refers
> to:)?
> 
> [Jerin] Yes. Looks like it is for hardware specific implementation. Introduced
> rte_regex_dev_attr_set/get functions to make it portable and
> To add new implementation specific fields.
> For example, if a rule is
> /ABCDEF.*XYZ/, ABCD is considered the prefix, and EF.*XYZ is considered the
> factor. The prefix is a literal
> string, while the factor can contain complex regular expression constructs. As
> a result, rule matching occurs in
> two stages: prefix matching and factor matching.
> 
>       b)  user_id or user_ptr
>       Under what kind of circumstances should an application pass value into
> these variables for enqueue and dequeuer operations?
> 
> [Jerin] Just like rte_crypto_ops, struct rte_regex_ops also allocated using
> mempool normally, on enqueue, user can specify user_id
> If needed to in order identify the op on dequeue if required. The use case
> could be to store the sequence number from application
> POV or storing the mbuf ptr in which pattern is requested etc.
> 
> 
>  2) rte_regex_match
>       a) offset; /**< Starting Byte Position for matched rule. */ and  uint16_t
> len; /**< Length of match in bytes */
>       Looks like the matching offset is defined as *starting matching offset*
> instead of *end matching offset*, e.g. report the offset of "a" instead of "c"
> for pattern "abc".
>       If so, this makes it hard to integrate software regex libraries such as
> Hyperscan and RE2 as they only report *end matching offset* without length
> of match.
>       Although Hyperscan has API for *starting matching offset*, it only delivers
> partial syntax support. So I think we have to define *end of matching offset*
> for software solutions.
> 
> [Jerin] I understand the hyperscan's HS_FLAG_SOM_LEFTMOST tradeoffs. I
> thought application would need always the length of the match.
> Probably we will see how other HW implementation (from Mellanox) etc. We
> will try to abstract it, probably we can make it as function of "user
> requested".
> [Xiang] Yes, it will be good to make it per user request. At least from
> Hyperscan user's point of view, start of match and match length are not
> mandatory.

OK. I think we can introduce RTE_REGEX_DEV_CFG_MATCH_AS_START
in device configure.

Since offset + len == end, we can introduce the following generic inline function:

/* Return type assumed wide enough to hold offset + len. */
static inline uint32_t
rte_regex_match_end(const struct rte_regex_match *match)
{
	return match->offset + match->len;
}

Example: the pattern to match is "hello\s+world" and the data is as follows:
data[4] = 'h'
data[5] = 'e'
data[6] = 'l'
data[7] = 'l'
data[8] = 'o'
data[9] = ' '
data[10] = 'w'
data[11] = 'o'
data[12] = 'r'
data[13] = 'l'
data[14] = 'd'

If the device is configured with RTE_REGEX_DEV_CFG_MATCH_AS_START:
match->offset returns 4
match->len returns 11

If the device is NOT configured with RTE_REGEX_DEV_CFG_MATCH_AS_START,
the driver MAY return the following (as in the Hyperscan case):
match->offset returns 0
match->len returns 11 + 4

In both cases (irrespective of the flag, to make the application's life easy) rte_regex_match_end() returns 15.
If the application asks for MATCH_AS_START, the driver returns match->offset = 4 and match->len = 11,
i.e. it sets HS_FLAG_SOM_LEFTMOST in the Hyperscan driver. Either way, the application should use
rte_regex_match_end() to find the end of the match, so that it works in all cases.
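
As a rough sketch of the driver side, a software PMD could fill the match from the (from, to) pair that a Hyperscan match callback reports. Hyperscan reports from as 0 when start-of-match tracking is not enabled for the pattern, which gives exactly the offset 0 / len 15 case above. The struct layout (16-bit fields) and regex_match_from_span() are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for struct rte_regex_match; field widths are assumed. */
struct rte_regex_match {
	uint16_t offset; /* start of match, or 0 if start is not tracked */
	uint16_t len;    /* end of match minus offset */
};

/* End of match; valid whether or not the start offset is tracked. */
static inline uint32_t
rte_regex_match_end(const struct rte_regex_match *match)
{
	assert(match != NULL);
	return (uint32_t)match->offset + match->len;
}

/*
 * Hypothetical driver helper: build a match from a [from, to) span.
 * With start-of-match tracking off, the caller passes from == 0.
 */
static inline struct rte_regex_match
regex_match_from_span(uint32_t from, uint32_t to)
{
	struct rte_regex_match m;

	assert(from <= to && to <= UINT16_MAX);
	m.offset = (uint16_t)from;
	m.len = (uint16_t)(to - from);
	return m;
}
```

With the data above, regex_match_from_span(4, 15) yields offset 4 and len 11, regex_match_from_span(0, 15) yields offset 0 and len 15, and rte_regex_match_end() gives 15 for both.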

Is it OK? 

> 
> 3)  rte_regex_rule_db_update()
>     Does this mean we can dynamically add or delete rules for an already
> generated database without recompile from scratch for hardware Regex
> implementation?
>     If so, this isn't possible for software solutions as they don't support
> dynamic database update and require recompile.
> 
> [Jerin] rte_regex_rule_db_update() internally it would call recompile
> function for both HW and SW.
> See rte_regex_dev_config::rule_db in rte_regex_dev_configure() for
> precompiled rule database case.
> [Xiang] OK, sounds like we have to save the original rule-set for the device in
> order to do recompile. I see both ADD and REMOVE operators from
> rte_regex_rule.
> For rules with REMOVE operator, what's the expected behavior to handle
> them for the old rule-set? Do we need to go through the old rule-set and
> remove corresponding rules before doing recompile?

Yes.
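
A minimal sketch of that step, assuming the driver keeps the original rule-set in a flat array and that rule_id alone identifies a rule. The enum, struct rule and both functions are hypothetical names, not the RFC's definitions. After the updates are applied, the driver would recompile the whole resulting set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ADD/REMOVE operators of rte_regex_rule. */
enum rule_op { RULE_OP_ADD, RULE_OP_REMOVE };

struct rule {
	enum rule_op op;
	uint32_t rule_id;
	const char *pcre; /* pattern text; unused for REMOVE */
};

/* Return the index of rule_id in the saved set, or -1 if absent. */
static int
rule_set_find(const struct rule *set, int nb, uint32_t rule_id)
{
	for (int i = 0; i < nb; i++)
		if (set[i].rule_id == rule_id)
			return i;
	return -1;
}

/*
 * Apply ADD/REMOVE updates to the saved rule-set in place.
 * Returns the new rule count, or -1 if an ADD exceeds capacity.
 */
static int
rule_set_apply(struct rule *set, int nb, int cap,
	       const struct rule *upd, int nb_upd)
{
	assert(set != NULL && upd != NULL && nb <= cap);

	for (int i = 0; i < nb_upd; i++) {
		int pos = rule_set_find(set, nb, upd[i].rule_id);

		if (upd[i].op == RULE_OP_REMOVE) {
			if (pos < 0)
				continue; /* nothing to remove */
			/* Close the gap, preserving rule order. */
			memmove(&set[pos], &set[pos + 1],
				(size_t)(nb - pos - 1) * sizeof(set[0]));
			nb--;
		} else if (pos >= 0) {
			set[pos] = upd[i]; /* ADD of an existing id replaces it */
		} else {
			if (nb == cap)
				return -1;
			set[nb++] = upd[i];
		}
	}
	return nb;
}
```

For a saved set with rule ids 1, 2, 3, applying REMOVE 2 followed by ADD 4 leaves ids 1, 3, 4, and that set is what gets recompiled.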

