* [PATCH v1] ethdev: add direction info when creating the transfer table
@ 2022-09-07 2:40 Rongwei Liu
2022-09-11 8:22 ` Ori Kam
2022-09-12 16:57 ` Ivan Malov
0 siblings, 2 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-09-07 2:40 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
The transfer domain rule is able to match traffic wire/vf
origin and it means two directions' underlayer resource.
In customer deployments, they usually match only one direction
traffic in single flow table: either from wire or from vf.
Introduce one new member transfer_mode into rte_flow_attr to
indicate the flow table direction property: from wire, from vf
or bi-direction(default).
It helps to save underlayer memory also on insertion rate.
By default, the transfer domain is bi-direction, and no behavior changes.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
flow template_table 0 create group 0 priority 0 transfer vf_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
lib/ethdev/rte_flow.h | 9 ++++++-
3 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7f50028eb7..b25b595e82 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -177,6 +177,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VF_ORIG] = {
+ .name = "vf_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.flow_attr.transfer_mode = 1;
+ return len;
+ case TABLE_TRANSFER_VF_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.flow_attr.transfer_mode = 2;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 330e34427d..603b7988dd 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vf_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index a79f1e7ef0..512b08d817 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -130,7 +130,14 @@ struct rte_flow_attr {
* through a suitable port. @see rte_flow_pick_transfer_proxy().
*/
uint32_t transfer:1;
- uint32_t reserved:29; /**< Reserved, must be zero. */
+ /**
+ * 0 means bidirection,
+ * 0x1 origin uplink,
+ * 0x2 origin vport,
+ * N/A both set.
+ */
+ uint32_t transfer_mode:2;
+ uint32_t reserved:27; /**< Reserved, must be zero. */
};
/**
--
2.27.0
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-07 2:40 [PATCH v1] ethdev: add direction info when creating the transfer table Rongwei Liu
@ 2022-09-11 8:22 ` Ori Kam
2022-09-12 16:57 ` Ivan Malov
1 sibling, 0 replies; 96+ messages in thread
From: Ori Kam @ 2022-09-11 8:22 UTC (permalink / raw)
To: Rongwei Liu, Matan Azrad, Slava Ovsiienko,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko
Cc: dev, Raslan Darawsheh
Hi Rongwei,
> -----Original Message-----
> From: Rongwei Liu <rongweil@nvidia.com>
> Sent: Wednesday, 7 September 2022 5:40
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
> The transfer domain rule is able to match traffic wire/vf
> origin and it means two directions' underlayer resource.
>
> In customer deployments, they usually match only one direction
> traffic in single flow table: either from wire or from vf.
>
> Introduce one new member transfer_mode into rte_flow_attr to
> indicate the flow table direction property: from wire, from vf
> or bi-direction(default).
>
> It helps to save underlayer memory also on insertion rate.
>
> By default, the transfer domain is bi-direction, and no behavior changes.
>
> 1. Match wire origin only
> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> 2. Match vf origin only
> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> ---
> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> lib/ethdev/rte_flow.h | 9 ++++++-
> 3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 7f50028eb7..b25b595e82 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -177,6 +177,8 @@ enum index {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VF_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VF_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> .next = NEXT(next_table_attr),
> .call = parse_table,
> },
> + [TABLE_TRANSFER_WIRE_ORIG] = {
> + .name = "wire_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> + [TABLE_TRANSFER_VF_ORIG] = {
> + .name = "vf_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> [TABLE_RULES_NUMBER] = {
> .name = "rules_number",
> .help = "number of rules in table",
> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
> case TABLE_TRANSFER:
> out->args.table.attr.flow_attr.transfer = 1;
> return len;
> + case TABLE_TRANSFER_WIRE_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.flow_attr.transfer_mode = 1;
> + return len;
> + case TABLE_TRANSFER_VF_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.flow_attr.transfer_mode = 2;
> + return len;
> default:
> return -1;
> }
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 330e34427d..603b7988dd 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
>
> flow template_table {port_id} create
> [table_id {id}] [group {group_id}]
> - [priority {level}] [ingress] [egress] [transfer]
> + [priority {level}] [ingress] [egress]
> + [transfer [vf_orig] [wire_orig]]
> rules_number {number}
> pattern_template {pattern_template_id}
> actions_template {actions_template_id}
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index a79f1e7ef0..512b08d817 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -130,7 +130,14 @@ struct rte_flow_attr {
> * through a suitable port. @see rte_flow_pick_transfer_proxy().
> */
> uint32_t transfer:1;
> - uint32_t reserved:29; /**< Reserved, must be zero. */
> + /**
> + * 0 means bidirection,
> + * 0x1 origin uplink,
> + * 0x2 origin vport,
> + * N/A both set.
> + */
> + uint32_t transfer_mode:2;
> + uint32_t reserved:27; /**< Reserved, must be zero. */
> };
>
> /**
> --
> 2.27.0
Acked-by: Ori Kam <orika@nvidia.com>
Thanks,
Ori
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-07 2:40 [PATCH v1] ethdev: add direction info when creating the transfer table Rongwei Liu
2022-09-11 8:22 ` Ori Kam
@ 2022-09-12 16:57 ` Ivan Malov
2022-09-13 13:46 ` Rongwei Liu
1 sibling, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-12 16:57 UTC (permalink / raw)
To: Rongwei Liu
Cc: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Andrew Rybchenko, dev, rasland
Hi,
On Wed, 7 Sep 2022, Rongwei Liu wrote:
> The transfer domain rule is able to match traffic wire/vf
> origin and it means two directions' underlayer resource.
The point of fact is that matching traffic coming from
some entity like wire / VF has been long generalised
in the form of representors. So, a flow rule with
attribute "transfer" is able to match traffic
coming from either a REPRESENTED_PORT or from
a PORT_REPRESENTOR (please find these items).
>
> In customer deployments, they usually match only one direction
> traffic in single flow table: either from wire or from vf.
Which customer deployments? Could you please provide detailed examples?
>
> Introduce one new member transfer_mode into rte_flow_attr to
> indicate the flow table direction property: from wire, from vf
> or bi-direction(default).
AFAIK, 'rte_flow_attr' serves both traditional flow rule
insertion and asynchronous (table) approach. The patch
adds the attributes to generic 'rte_flow_attr' but,
for some reason, ignores non-table rules.
For example, the diff below adds the attributes to "table" commands
in testpmd but does not add them to regular (non-table)
commands like "flow create". Why?
>
> It helps to save underlayer memory also on insertion rate.
Which memory? Host memory? NIC memory? Term "underlayer" is vague.
I suggest that the commit message be revised to first explain how
such memory is spent currently, then explain why this is not
optimal and, finally, which way the patch is supposed to
improve that. I.e. be more specific.
>
> By default, the transfer domain is bi-direction, and no behavior changes.
>
> 1. Match wire origin only
> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> 2. Match vf origin only
> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>
> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> ---
> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> lib/ethdev/rte_flow.h | 9 ++++++-
> 3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 7f50028eb7..b25b595e82 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -177,6 +177,8 @@ enum index {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VF_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VF_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> .next = NEXT(next_table_attr),
> .call = parse_table,
> },
> + [TABLE_TRANSFER_WIRE_ORIG] = {
> + .name = "wire_orig",
> + .help = "affect rule direction to transfer",
This does not explain the "wire" aspect. It's too broad.
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> + [TABLE_TRANSFER_VF_ORIG] = {
> + .name = "vf_orig",
> + .help = "affect rule direction to transfer",
This explanation simply duplicates that of "wire_orig".
It does not explain the "vf" part. Should be more specific.
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> [TABLE_RULES_NUMBER] = {
> .name = "rules_number",
> .help = "number of rules in table",
> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
> case TABLE_TRANSFER:
> out->args.table.attr.flow_attr.transfer = 1;
> return len;
> + case TABLE_TRANSFER_WIRE_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.flow_attr.transfer_mode = 1;
> + return len;
> + case TABLE_TRANSFER_VF_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.flow_attr.transfer_mode = 2;
> + return len;
> default:
> return -1;
> }
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 330e34427d..603b7988dd 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
>
> flow template_table {port_id} create
> [table_id {id}] [group {group_id}]
> - [priority {level}] [ingress] [egress] [transfer]
> + [priority {level}] [ingress] [egress]
> + [transfer [vf_orig] [wire_orig]]
Is it correct? Shouldn't it rather be
[transfer] [vf_orig] [wire_orig]
?
> rules_number {number}
> pattern_template {pattern_template_id}
> actions_template {actions_template_id}
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index a79f1e7ef0..512b08d817 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -130,7 +130,14 @@ struct rte_flow_attr {
> * through a suitable port. @see rte_flow_pick_transfer_proxy().
> */
> uint32_t transfer:1;
> - uint32_t reserved:29; /**< Reserved, must be zero. */
> + /**
> + * 0 means bidirection,
> + * 0x1 origin uplink,
What does "uplink" mean? It's too vague. Hardly a good term.
> + * 0x2 origin vport,
What does "origin vport" mean? Hardly a good term as well.
> + * N/A both set.
What's this?
> + */
> + uint32_t transfer_mode:2;
> + uint32_t reserved:27; /**< Reserved, must be zero. */
> };
>
> /**
> --
> 2.27.0
>
Since the attributes are added to generic 'struct rte_flow_attr',
non-table (synchronous) flow rules are supposed to support them,
too. If that is indeed the case, then I'm afraid such proposal
does not agree with the existing items PORT_REPRESENTOR and
REPRESENTED_PORT. They do exactly the same thing, but they
are designed to be way more generic. Why not use them?
Ivan
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-12 16:57 ` Ivan Malov
@ 2022-09-13 13:46 ` Rongwei Liu
2022-09-13 14:33 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-13 13:46 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Tuesday, September 13, 2022 00:57
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
> Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
>
> Hi,
>
> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>
> > The transfer domain rule is able to match traffic wire/vf origin and
> > it means two directions' underlayer resource.
>
> The point of fact is that matching traffic coming from some entity like wire /
> VF has been long generalised in the form of representors. So, a flow rule with
> attribute "transfer" is able to match traffic coming from either a
> REPRESENTED_PORT or from a PORT_REPRESENTOR (please find these items).
>
> >
> > In customer deployments, they usually match only one direction traffic
> > in single flow table: either from wire or from vf.
>
> Which customer deployments? Could you please provide detailed examples?
>
> >
We have seen many customer deployments like:
1. Match overlay traffic from wire and do decap, then send to a specific vport.
2. Match specific 5-tuples and do encap, then send to wire.
The matching criteria have an obvious direction preference.
> > Introduce one new member transfer_mode into rte_flow_attr to indicate
> > the flow table direction property: from wire, from vf or
> > bi-direction(default).
>
> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion and
> asynchronous (table) approach. The patch adds the attributes to generic
> 'rte_flow_attr' but, for some reason, ignores non-table rules.
>
> >
Sync API uses one rule to contain everything. It is hard for the PMD to determine whether this rule has a direction preference or not.
Imagine a situation, just as an example:
1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so the sync API considers them to share the same matching determination logic.
It means "2" has 1M scale capability too. Obviously, this wastes a lot of resources.
In the async API, pattern_template is introduced. We can mark "1" to use pattern_template id 1 and "2" to use pattern_template id 2.
They will be separated from each other and will no longer share resources.
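To put rough numbers on the example above, here is a toy model in C. It assumes that a shared matcher provisions the larger rule count for both directions, while separate tables are sized independently. The function names are hypothetical and real PMD memory accounting will differ per NIC.

```c
#include <stdint.h>

/*
 * Toy model only, not DPDK API.
 * Entries provisioned when both directions share one matcher that has
 * to be sized for the larger of the two rule counts.
 */
static uint64_t shared_entries(uint64_t dir_a_rules, uint64_t dir_b_rules)
{
	uint64_t max = dir_a_rules > dir_b_rules ? dir_a_rules : dir_b_rules;

	/* Each direction ends up with the larger capacity. */
	return 2 * max;
}

/* Entries provisioned when each direction gets its own, separately sized table. */
static uint64_t split_entries(uint64_t dir_a_rules, uint64_t dir_b_rules)
{
	return dir_a_rules + dir_b_rules;
}
```

With the 1 million / 1 hundred split from the example, the shared case provisions 2,000,000 entries in this model and the split case 1,000,100.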
> For example, the diff below adds the attributes to "table" commands in
> testpmd but does not add them to regular (non-table) commands like "flow
> create". Why?
>
> >
The "table" command limits the pattern_template to a single direction or to bidirectional, per the user-specified attribute.
The "rule" command must be tied to one "table_id", so the rule inherits the "table" direction property and there is no need to specify it again.
> > It helps to save underlayer memory also on insertion rate.
>
> Which memory? Host memory? NIC memory? Term "underlayer" is vague.
> I suggest that the commit message be revised to first explain how such
> memory is spent currently, then explain why this is not optimal and, finally,
> which way the patch is supposed to improve that. I.e. be more specific.
>
> >
For large-scale rules, the HW (depending on implementation) always needs memory to hold the rules' patterns and actions, either from the NIC or from the host.
The memory footprint depends heavily on the complexity of the user's rules and also differs between NICs.
A ~50% memory saving is expected if one direction is cut.
> > By default, the transfer domain is bi-direction, and no behavior changes.
> >
> > 1. Match wire origin only
> > flow template_table 0 create group 0 priority 0 transfer wire_orig...
> > 2. Match vf origin only
> > flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >
> > Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> > ---
> > app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
> > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> > lib/ethdev/rte_flow.h | 9 ++++++-
> > 3 files changed, 36 insertions(+), 2 deletions(-)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 7f50028eb7..b25b595e82 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -177,6 +177,8 @@ enum index {
> > TABLE_INGRESS,
> > TABLE_EGRESS,
> > TABLE_TRANSFER,
> > + TABLE_TRANSFER_WIRE_ORIG,
> > + TABLE_TRANSFER_VF_ORIG,
> > TABLE_RULES_NUMBER,
> > TABLE_PATTERN_TEMPLATE,
> > TABLE_ACTIONS_TEMPLATE,
> > @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
> > TABLE_INGRESS,
> > TABLE_EGRESS,
> > TABLE_TRANSFER,
> > + TABLE_TRANSFER_WIRE_ORIG,
> > + TABLE_TRANSFER_VF_ORIG,
> > TABLE_RULES_NUMBER,
> > TABLE_PATTERN_TEMPLATE,
> > TABLE_ACTIONS_TEMPLATE,
> > @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> > .next = NEXT(next_table_attr),
> > .call = parse_table,
> > },
> > + [TABLE_TRANSFER_WIRE_ORIG] = {
> > + .name = "wire_orig",
> > + .help = "affect rule direction to transfer",
>
> This does not explain the "wire" aspect. It's too broad.
>
> > + .next = NEXT(next_table_attr),
> > + .call = parse_table,
> > + },
> > + [TABLE_TRANSFER_VF_ORIG] = {
> > + .name = "vf_orig",
> > + .help = "affect rule direction to transfer",
>
> This explanation simply duplicates such of the "wire_orig".
> It does not explain the "vf" part. Should be more specific.
>
> > + .next = NEXT(next_table_attr),
> > + .call = parse_table,
> > + },
> > [TABLE_RULES_NUMBER] = {
> > .name = "rules_number",
> > .help = "number of rules in table",
> > @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
> > case TABLE_TRANSFER:
> > out->args.table.attr.flow_attr.transfer = 1;
> > return len;
> > + case TABLE_TRANSFER_WIRE_ORIG:
> > + if (!out->args.table.attr.flow_attr.transfer)
> > + return -1;
> > + out->args.table.attr.flow_attr.transfer_mode = 1;
> > + return len;
> > + case TABLE_TRANSFER_VF_ORIG:
> > + if (!out->args.table.attr.flow_attr.transfer)
> > + return -1;
> > + out->args.table.attr.flow_attr.transfer_mode = 2;
> > + return len;
> > default:
> > return -1;
> > }
> > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > index 330e34427d..603b7988dd 100644
> > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
> >
> > flow template_table {port_id} create
> > [table_id {id}] [group {group_id}]
> > - [priority {level}] [ingress] [egress] [transfer]
> > + [priority {level}] [ingress] [egress]
> > + [transfer [vf_orig] [wire_orig]]
>
> Is it correct? Shouldn't it rather be
> [transfer] [vf_orig] [wire_orig]
> ?
>
> > rules_number {number}
> > pattern_template {pattern_template_id}
> > actions_template {actions_template_id}
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index a79f1e7ef0..512b08d817 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -130,7 +130,14 @@ struct rte_flow_attr {
> > * through a suitable port. @see rte_flow_pick_transfer_proxy().
> > */
> > uint32_t transfer:1;
> > - uint32_t reserved:29; /**< Reserved, must be zero. */
> > + /**
> > + * 0 means bidirection,
> > + * 0x1 origin uplink,
>
> What does "uplink" mean? It's too vague. Hardly a good term.
>
> > + * 0x2 origin vport,
>
> What does "origin vport" mean? Hardly a good term as well.
>
> > + * N/A both set.
>
> What's this?
>
> > + */
> > + uint32_t transfer_mode:2;
> > + uint32_t reserved:27; /**< Reserved, must be zero. */
> > };
> >
> > /**
> > --
> > 2.27.0
> >
>
> Since the attributes are added to generic 'struct rte_flow_attr', non-table
> (synchronous) flow rules are supposed to support them, too. If that is indeed
> the case, then I'm afraid such proposal does not agree with the existing items
> PORT_REPRESENTOR and REPRESENTED_PORT. They do exactly the same
> thing, but they are designed to be way more generic. Why not use them?
>
> Ivan
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-13 13:46 ` Rongwei Liu
@ 2022-09-13 14:33 ` Ivan Malov
2022-09-14 5:16 ` Rongwei Liu
2022-09-28 9:24 ` [PATCH v3] ethdev: add hint when creating async " Rongwei Liu
0 siblings, 2 replies; 96+ messages in thread
From: Ivan Malov @ 2022-09-13 14:33 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
PSB
On Tue, 13 Sep 2022, Rongwei Liu wrote:
> Hi
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, September 13, 2022 00:57
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
>> Darawsheh <rasland@nvidia.com>
>> Subject: Re: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>>
>> Hi,
>>
>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>>
>>> The transfer domain rule is able to match traffic wire/vf origin and
>>> it means two directions' underlayer resource.
>>
>> The point of fact is that matching traffic coming from some entity like wire /
>> VF has been long generalised in the form of representors. So, a flow rule with
>> attribute "transfer" is able to match traffic coming from either a
>> REPRESENTED_PORT or from a PORT_REPRESENTOR (please find these items).
>>
>>>
>>> In customer deployments, they usually match only one direction traffic
>>> in single flow table: either from wire or from vf.
>>
>> Which customer deployments? Could you please provide detailed examples?
>>
>>>
>
> We saw a lot of customers' deployment like:
> 1. Match overlay traffic from wire and do decap, then send to specific vport.
> 2. Match specific 5-tuples and do encap, then send to wire.
> The matching criteria has obvious direction preference.
Thank you. My questions are as follows:
In (1), when you say "from wire", do you mean the need to match
packets arriving via whatever physical ports rather than
matching packets arriving from some specific phys. port?
If, however, matching traffic "from wire" in fact means matching
packets arriving from a *specific* physical port, then for sure
item REPRESENTED_PORT should perfectly do the job, and the
proposed attribute is unneeded.
(BTW, in DPDK, it is customary to use term "physical port", not "wire")
In (1), what are "vport"s? Please explain. Once again, I should remind
that, in DPDK, folks prefer terms "represented entity" / "representor"
over vendor-specific terms like "vport", etc.
As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
Could you please explain, why not just add a match item REPRESENTED_PORT
pointing to that VF via its representor? Doing so should perfectly
define the exact direction / traffic source. Isn't that sufficient?
Also please mind that, although I appreciate your explanations here,
on the mailing list, they should finally be added to the commit
message, so that readers do not have to look for them elsewhere.
>
>>> Introduce one new member transfer_mode into rte_flow_attr to indicate
>>> the flow table direction property: from wire, from vf or
>>> bi-direction(default).
>>
>> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion and
>> asynchronous (table) approach. The patch adds the attributes to generic
>> 'rte_flow_attr' but, for some reason, ignores non-table rules.
>>
>>>
> Sync API uses one rule to contain everything. It' hard for PMD to determine if this rule has direction preference or not.
> Image a situation, just for an example:
> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
> 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so sync API consider them share matching determination logic.
> It means "2" have 1M scale capability too. Obviously, it wastes a lot of resources.
Strictly speaking, they do not share the same match pattern.
Your example clearly shows that, in (1), the pattern should
request packets coming from "vport 1" and, in (2), packets
coming from "vport 0".
My point is simple: the "vport" from which packets enter
the embedded switch is ALSO a match criterion. If you
accept this, you'll see: the matching conditions differ.
>
> In async API, there is pattern_template introduced. We can mark "1" to use pattern_tempate id 1 and "2" to use pattern_template 2.
> They will be separated from each other, don't share anymore.
Consider an example. "Wire" is a physical port represented by PF0 which,
in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to
guest and is represented by a representor ethdev 1 in DPDK.
So, some rules (template 1) are needed to deliver packets from "wire"
to "VF" and also decapsulate them. And some rules (template 2) are
needed to deliver packets in the opposite direction, from "VF"
to "wire" and also encapsulate them.
My question is, what prevents you from adding match item
REPRESENTED_PORT[ethdev_id=0] to the pattern template 1
and REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
As I said previously, if you insert such item before eth / ipv4 / etc
to your match pattern, doing so defines an *exact* direction / source.
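As a rough illustration of the two templates described above, here is a self-contained C sketch. The enum, struct and function names are local stand-ins made up for this example. Real code would use struct rte_flow_item with RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT from rte_flow.h, and the exact header chain (eth / ipv4 / udp / vxlan) is only assumed from the earlier example.

```c
#include <stdint.h>

/* Local stand-ins for illustration only; not the rte_flow definitions. */
enum item_type {
	ITEM_END,
	ITEM_REPRESENTED_PORT,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_UDP,
	ITEM_VXLAN,
};

struct item {
	enum item_type type;
	uint16_t ethdev_id; /* meaningful only for ITEM_REPRESENTED_PORT */
};

/* Template 1: traffic entering from the entity represented by ethdev 0 ("wire"). */
static const struct item template1_pattern[] = {
	{ ITEM_REPRESENTED_PORT, 0 },
	{ ITEM_ETH, 0 },
	{ ITEM_IPV4, 0 },
	{ ITEM_UDP, 0 },
	{ ITEM_VXLAN, 0 },
	{ ITEM_END, 0 },
};

/* Template 2: traffic entering from the entity represented by ethdev 1 ("VF"). */
static const struct item template2_pattern[] = {
	{ ITEM_REPRESENTED_PORT, 1 },
	{ ITEM_ETH, 0 },
	{ ITEM_IPV4, 0 },
	{ ITEM_UDP, 0 },
	{ ITEM_END, 0 },
};

/*
 * Return the source ethdev ID when the pattern starts with the
 * represented-port item, or -1 when the pattern does not pin a source.
 */
static int pattern_source(const struct item *pattern)
{
	if (pattern[0].type != ITEM_REPRESENTED_PORT)
		return -1;
	return pattern[0].ethdev_id;
}
```

The point of the sketch is that the traffic source is already visible in the first item of each pattern, so a driver inspecting the template can tell the two directions apart without an extra attribute.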
>
>> For example, the diff below adds the attributes to "table" commands in
>> testpmd but does not add them to regular (non-table) commands like "flow
>> create". Why?
>>
>>>
>
> "table" command limits pattern_template to single direction or bidirection per user specified attribute.
As I say above, the same effect can be achieved by adding item
REPRESENTED_PORT to the corresponding pattern template.
> "rule" command must tight with one "table_id", so the rule will inherit the "table" direction property, no need to specify again.
You might've misunderstood. I am not talking about the "rule" command coupled with
some "table". What I am talking about is regular, NON-async flow insertion
commands.
Please take a look at section "/* Validate/create attributes. */" in
file "app/test-pmd/cmdline_flow.c". When one adds a new flow attribute,
they should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc.
That's it.
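Purely as a sketch of what that could look like if the attribute were kept: one helper that both the table parser and the non-table parser call, so the "transfer must be set first" check and the value range live in one place. The struct below is a local mirror of the bit layout proposed in the patch, not rte_flow.h itself, and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Local mirror of the attribute bits proposed in the patch (stand-in only). */
struct flow_attr {
	uint32_t group;
	uint32_t priority;
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
	uint32_t transfer_mode:2;
	uint32_t reserved:27;
};

/* Hypothetical names for the raw values 0, 1 and 2 used in the patch. */
enum transfer_mode {
	TRANSFER_MODE_BIDIR = 0,
	TRANSFER_MODE_WIRE_ORIG = 1,
	TRANSFER_MODE_VF_ORIG = 2,
};

/*
 * Set the direction on an attribute structure.
 * Returns 0 on success, or -1 when "transfer" has not been set yet or
 * when the mode is out of range (3 would mean "both set").
 */
static int set_transfer_mode(struct flow_attr *attr, unsigned int mode)
{
	if (!attr->transfer)
		return -1;
	if (mode > TRANSFER_MODE_VF_ORIG)
		return -1;
	attr->transfer_mode = mode;
	return 0;
}
```

Named values instead of the bare 1 and 2 would also make the "N/A both set" case from the header comment explicit: the value 3 is simply rejected.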
But, as I say, I still believe that the new attributes aren't needed.
>
>>> It helps to save underlayer memory also on insertion rate.
>>
>> Which memory? Host memory? NIC memory? Term "underlayer" is vague.
>> I suggest that the commit message be revised to first explain how such
>> memory is spent currently, then explain why this is not optimal and, finally,
>> which way the patch is supposed to improve that. I.e. be more specific.
>>
>>>
>
> For large scalable rules, HW (depends on implementation) always needs memory to hold the rules' patterns and actions, either from NIC or from host.
> The memory footprint highly depends on "user rules' complexity", also diff between NICs.
> ~50% memory saving is expected if one-direction is cut.
Regardless of this talk, this explanation should probably be present in
the commit description.
>
>>> By default, the transfer domain is bi-direction, and no behavior changes.
>>>
>>> 1. Match wire origin only
>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>> 2. Match vf origin only
>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>
>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>> ---
>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>> lib/ethdev/rte_flow.h | 9 ++++++-
>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>>> index 7f50028eb7..b25b595e82 100644
>>> --- a/app/test-pmd/cmdline_flow.c
>>> +++ b/app/test-pmd/cmdline_flow.c
>>> @@ -177,6 +177,8 @@ enum index {
>>> TABLE_INGRESS,
>>> TABLE_EGRESS,
>>> TABLE_TRANSFER,
>>> + TABLE_TRANSFER_WIRE_ORIG,
>>> + TABLE_TRANSFER_VF_ORIG,
>>> TABLE_RULES_NUMBER,
>>> TABLE_PATTERN_TEMPLATE,
>>> TABLE_ACTIONS_TEMPLATE,
>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
>>> TABLE_INGRESS,
>>> TABLE_EGRESS,
>>> TABLE_TRANSFER,
>>> + TABLE_TRANSFER_WIRE_ORIG,
>>> + TABLE_TRANSFER_VF_ORIG,
>>> TABLE_RULES_NUMBER,
>>> TABLE_PATTERN_TEMPLATE,
>>> TABLE_ACTIONS_TEMPLATE,
>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
>>> .next = NEXT(next_table_attr),
>>> .call = parse_table,
>>> },
>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>> + .name = "wire_orig",
>>> + .help = "affect rule direction to transfer",
>>
>> This does not explain the "wire" aspect. It's too broad.
>>
>>> + .next = NEXT(next_table_attr),
>>> + .call = parse_table,
>>> + },
>>> + [TABLE_TRANSFER_VF_ORIG] = {
>>> + .name = "vf_orig",
>>> + .help = "affect rule direction to transfer",
>>
>> This explanation simply duplicates such of the "wire_orig".
>> It does not explain the "vf" part. Should be more specific.
>>
>>> + .next = NEXT(next_table_attr),
>>> + .call = parse_table,
>>> + },
>>> [TABLE_RULES_NUMBER] = {
>>> .name = "rules_number",
>>> .help = "number of rules in table",
>>> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
>>> case TABLE_TRANSFER:
>>> out->args.table.attr.flow_attr.transfer = 1;
>>> return len;
>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>> + if (!out->args.table.attr.flow_attr.transfer)
>>> + return -1;
>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
>>> + return len;
>>> + case TABLE_TRANSFER_VF_ORIG:
>>> + if (!out->args.table.attr.flow_attr.transfer)
>>> + return -1;
>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
>>> + return len;
>>> default:
>>> return -1;
>>> }
>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> index 330e34427d..603b7988dd 100644
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
>>>
>>> flow template_table {port_id} create
>>> [table_id {id}] [group {group_id}]
>>> - [priority {level}] [ingress] [egress] [transfer]
>>> + [priority {level}] [ingress] [egress]
>>> + [transfer [vf_orig] [wire_orig]]
>>
>> Is it correct? Shouldn't it rather be
>> [transfer] [vf_orig] [wire_orig]
>> ?
>>
>>> rules_number {number}
>>> pattern_template {pattern_template_id}
>>> actions_template {actions_template_id}
>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>> index a79f1e7ef0..512b08d817 100644
>>> --- a/lib/ethdev/rte_flow.h
>>> +++ b/lib/ethdev/rte_flow.h
>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
>>> */
>>> uint32_t transfer:1;
>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
>>> + /**
>>> + * 0 means bidirection,
>>> + * 0x1 origin uplink,
>>
>> What does "uplink" mean? It's too vague. Hardly a good term.
>>
>>> + * 0x2 origin vport,
>>
>> What does "origin vport" mean? Hardly a good term as well.
>>
>>> + * N/A both set.
>>
>> What's this?
>>
>>> + */
>>> + uint32_t transfer_mode:2;
>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
>>> };
>>>
>>> /**
>>> --
>>> 2.27.0
>>>
>>
>> Since the attributes are added to generic 'struct rte_flow_attr', non-table
>> (synchronous) flow rules are supposed to support them, too. If that is indeed
>> the case, then I'm afraid such proposal does not agree with the existing items
>> PORT_REPRESENTOR and REPRESENTED_PORT. They do exactly the same
>> thing, but they are designed to be way more generic. Why not use them?
The question stands.
>>
>> Ivan
>
Ivan
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-13 14:33 ` Ivan Malov
@ 2022-09-14 5:16 ` Rongwei Liu
2022-09-14 7:32 ` Ivan Malov
2022-09-28 9:24 ` [PATCH v3] ethdev: add hint when creating async " Rongwei Liu
1 sibling, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-14 5:16 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
HI
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Tuesday, September 13, 2022 22:33
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
> Darawsheh <rasland@nvidia.com>
> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
> External email: Use caution opening links or attachments
>
>
> Hi Rongwei,
>
> PSB
>
> On Tue, 13 Sep 2022, Rongwei Liu wrote:
>
> > Hi
> >
> > BR
> > Rongwei
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >> Sent: Tuesday, September 13, 2022 00:57
> >> To: Rongwei Liu <rongweil@nvidia.com>
> >> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
> >> Raslan Darawsheh <rasland@nvidia.com>
> >> Subject: Re: [PATCH v1] ethdev: add direction info when creating the
> >> transfer table
> >>
> >> External email: Use caution opening links or attachments
> >>
> >>
> >> Hi,
> >>
> >> On Wed, 7 Sep 2022, Rongwei Liu wrote:
> >>
> >>> The transfer domain rule is able to match traffic wire/vf origin and
> >>> it means two directions' underlayer resource.
> >>
> >> The point of fact is that matching traffic coming from some entity
> >> like wire / VF has been long generalised in the form of representors.
> >> So, a flow rule with attribute "transfer" is able to match traffic
> >> coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR
> (please find these items).
> >>
> >>>
> >>> In customer deployments, they usually match only one direction
> >>> traffic in single flow table: either from wire or from vf.
> >>
> >> Which customer deployments? Could you please provide detailed examples?
> >>
> >>>
> >
> > We saw a lot of customers' deployment like:
> > 1. Match overlay traffic from wire and do decap, then send to specific vport.
> > 2. Match specific 5-tuples and do encap, then send to wire.
> > The matching criteria has obvious direction preference.
>
> Thank you. My questions are as follows:
>
> In (1), when you say "from wire", do you mean the need to match packets
> arriving via whatever physical ports rather then matching packets arriving
> from some specific phys. port?
>
> If, however, matching traffic "from wire" in fact means matching packets
> arriving from a *specific* physical port, then for sure item
> REPRESENTED_PORT should perfectly do the job, and the proposed attribute is
> unneeded.
>
> (BTW, in DPDK, it is customary to use term "physical port", not "wire")
>
> In (1), what are "vport"s? Please explain. Once again, I should remind that, in
> DPDK, folks prefer terms "represented entity" / "representor"
> over vendor-specific terms like "vport", etc.
>
Vport is short for virtual port, such as a VF.
> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
> Could you please explain, why not just add a match item REPRESENTED_PORT
> pointing to that VF via its representor? Doing so should perfectly define the
> exact direction / traffic source. Isn't that sufficient?
>
In my view, there is a difference between a matching field and a matching value.
Take IPv4 src_addr 1.1.1.1, 1.1.1.2 and 1.1.1.3: would you treat them as the same or as different matching criteria?
I would call them the same, since they can be summarized as 1.1.1.0/30.
REPRESENTED_PORT is just another matching item with no essential difference, and it can't stand for direction info.
The port ID depends on the attach sequence.
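To illustrate the field-versus-value point, here is a minimal sketch. The helper names are hypothetical and are not DPDK API. It only shows that the three addresses above are different values under one and the same mask (the /30 prefix):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not DPDK API): host-order IPv4 address from octets. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Netmask for a prefix length in [0, 32]; 0 is special-cased to avoid a 32-bit shift. */
static uint32_t
prefix_mask(unsigned int len)
{
	assert(len <= 32);
	return len == 0 ? 0 : UINT32_MAX << (32 - len);
}

/* True if addr falls inside net/len. */
static bool
in_prefix(uint32_t addr, uint32_t net, unsigned int len)
{
	uint32_t mask = prefix_mask(len);

	return (addr & mask) == (net & mask);
}
```

With these helpers, 1.1.1.1, 1.1.1.2 and 1.1.1.3 all satisfy in_prefix(addr, ipv4(1, 1, 1, 0), 30), while 1.1.1.4 does not.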
> Also please mind that, although I appreciate your explanations here, on the
> mailing list, they should finally be added to the commit message, so that
> readers do not have to look for them elsewhere.
>
We have explained that single-direction matching is highly likely, right?
It's hard to list all the possible traffic matching preferences.
The underlay case is the one we have encountered so far.
> >
> >>> Introduce one new member transfer_mode into rte_flow_attr to
> >>> indicate the flow table direction property: from wire, from vf or
> >>> bi-direction(default).
> >>
> >> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion
> >> and asynchronous (table) approach. The patch adds the attributes to
> >> generic 'rte_flow_attr' but, for some reason, ignores non-table rules.
> >>
> >>>
> > Sync API uses one rule to contain everything. It' hard for PMD to determine
> if this rule has direction preference or not.
> > Image a situation, just for an example:
> > 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
> > 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
> > 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so
> sync API consider them share matching determination logic.
> > It means "2" have 1M scale capability too. Obviously, it wastes a lot of
> resources.
>
> Strictly speaking, they do not share the same match pattern.
> Your example clearly shows that, in (1), the pattern should request packets
> coming from "vport 1" and, in (2), packets coming from "vport 0".
>
> My point is simple: the "vport" from which packets enter the embedded switch
> is ALSO a match criterion. If you accept this, you'll see: the matching
> conditions differ.
>
See above.
In this case, I think the matching fields are "port_id + ipv4_vxlan" in both. They are the same.
They differ only in values, such as VNI 100 versus VNI 200.
> >
> > In async API, there is pattern_template introduced. We can mark "1" to use
> pattern_tempate id 1 and "2" to use pattern_template 2.
> > They will be separated from each other, don't share anymore.
>
> Consider an example. "Wire" is a physical port represented by PF0 which, in
> turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to guest and is
> represented by a representor ethdev 1 in DPDK.
>
> So, some rules (template 1) are needed to deliver packets from "wire"
> to "VF" and also decapsulate them. And some rules (template 2) are needed to
> deliver packets in the opposite direction, from "VF"
> to "wire" and also encapsulate them.
>
> My question is, what prevents you from adding match item
> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
>
> As I said previously, if you insert such item before eth / ipv4 / etc to your
> match pattern, doing so defines an *exact* direction / source.
>
Could you check the async API guidance? I think a pattern template focuses on the matching fields (the mask).
"REPRESENTED_PORT[ethdev_id=0]" and "REPRESENTED_PORT[ethdev_id=1]" are the same in that respect.
1. pattern template: REPRESENTED_PORT mask 0xffff ...
2. actions template: action1 / action2 /
3. table create with the pattern template plus the actions template..
REPRESENTED_PORT[ethdev_id=0] will be rule 1: rule create REPRESENTED_PORT port_id is 0 / actions ....
REPRESENTED_PORT[ethdev_id=1] will be rule 2: rule create REPRESENTED_PORT port_id is 1 / actions ....
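The split described above can be sketched roughly as follows. The types are hypothetical stand-ins and not the real rte_flow definitions. The template holds only the mask, and each rule created from it holds only its own value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a port-matching item (not the rte_flow struct). */
struct port_item {
	uint16_t port_id;
};

/* A pattern template carries only the mask: which bits take part in matching. */
struct pattern_template {
	struct port_item mask;
};

/* A rule created from a template carries only the value to compare against. */
struct rule {
	const struct pattern_template *tmpl;
	struct port_item spec;
};

/* Compare the packet's source port with the rule value under the template mask. */
static bool
rule_matches(const struct rule *r, uint16_t pkt_port_id)
{
	uint16_t mask;

	assert(r && r->tmpl);
	mask = r->tmpl->mask.port_id;
	return (pkt_port_id & mask) == (r->spec.port_id & mask);
}
```

Two rules with port_id 0 and port_id 1 can then point at the same template with mask 0xffff, which is why the template alone does not tell the two cases apart.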
> >
> >> For example, the diff below adds the attributes to "table" commands
> >> in testpmd but does not add them to regular (non-table) commands like
> >> "flow create". Why?
> >>
> >>>
> >
> > "table" command limits pattern_template to single direction or bidirection
> per user specified attribute.
>
> As I say above, the same effect can be achieved by adding item
> REPRESENTED_PORT to the corresponding pattern template.
See above.
>
> > "rule" command must tight with one "table_id", so the rule will inherit the
> "table" direction property, no need to specify again.
>
> You migh've misunderstood. I do not talk about "rule" command coupled with
> some "table". What I talk about is regular, NON-async flow insertion
> commands.
>
> Please take a look at section "/* Validate/create attributes. */" in file
> "app/test-pmd/cmdline_flow.c". When one adds a new flow attribute, they
> should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc.
>
> That's it.
We don't intend to pass this to sync API. The above code example is for sync API.
>
> But, as I say, I still believe that the new attributes aren't needed.
I think we are not on the same page for now. Can we reach agreement on the same
matching criteria first?
> >
> >>> It helps to save underlayer memory also on insertion rate.
> >>
> >> Which memory? Host memory? NIC memory? Term "underlayer" is vague.
> >> I suggest that the commit message be revised to first explain how
> >> such memory is spent currently, then explain why this is not optimal
> >> and, finally, which way the patch is supposed to improve that. I.e. be more
> specific.
> >>
> >>>
> >
> > For large scalable rules, HW (depends on implementation) always needs
> memory to hold the rules' patterns and actions, either from NIC or from host.
> > The memory footprint highly depends on "user rules' complexity", also diff
> between NICs.
> > ~50% memory saving is expected if one-direction is cut.
>
> Regardless of this talk, this explanation should probably be present in the
> commit description.
>
This number may differ across NICs and implementations. We can't state it for sure.
> >
> >>> By default, the transfer domain is bi-direction, and no behavior changes.
> >>>
> >>> 1. Match wire origin only
> >>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >>> 2. Match vf origin only
> >>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >>>
> >>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> >>> ---
> >>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
> >>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> >>> lib/ethdev/rte_flow.h | 9 ++++++-
> >>> 3 files changed, 36 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> >>> index 7f50028eb7..b25b595e82 100644
> >>> --- a/app/test-pmd/cmdline_flow.c
> >>> +++ b/app/test-pmd/cmdline_flow.c
> >>> @@ -177,6 +177,8 @@ enum index {
> >>> TABLE_INGRESS,
> >>> TABLE_EGRESS,
> >>> TABLE_TRANSFER,
> >>> + TABLE_TRANSFER_WIRE_ORIG,
> >>> + TABLE_TRANSFER_VF_ORIG,
> >>> TABLE_RULES_NUMBER,
> >>> TABLE_PATTERN_TEMPLATE,
> >>> TABLE_ACTIONS_TEMPLATE,
> >>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
> >>> TABLE_INGRESS,
> >>> TABLE_EGRESS,
> >>> TABLE_TRANSFER,
> >>> + TABLE_TRANSFER_WIRE_ORIG,
> >>> + TABLE_TRANSFER_VF_ORIG,
> >>> TABLE_RULES_NUMBER,
> >>> TABLE_PATTERN_TEMPLATE,
> >>> TABLE_ACTIONS_TEMPLATE,
> >>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> >>> .next = NEXT(next_table_attr),
> >>> .call = parse_table,
> >>> },
> >>> + [TABLE_TRANSFER_WIRE_ORIG] = {
> >>> + .name = "wire_orig",
> >>> + .help = "affect rule direction to transfer",
> >>
> >> This does not explain the "wire" aspect. It's too broad.
> >>
> >>> + .next = NEXT(next_table_attr),
> >>> + .call = parse_table,
> >>> + },
> >>> + [TABLE_TRANSFER_VF_ORIG] = {
> >>> + .name = "vf_orig",
> >>> + .help = "affect rule direction to transfer",
> >>
> >> This explanation simply duplicates such of the "wire_orig".
> >> It does not explain the "vf" part. Should be more specific.
> >>
> >>> + .next = NEXT(next_table_attr),
> >>> + .call = parse_table,
> >>> + },
> >>> [TABLE_RULES_NUMBER] = {
> >>> .name = "rules_number",
> >>> .help = "number of rules in table",
> >>> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
> >>> case TABLE_TRANSFER:
> >>> out->args.table.attr.flow_attr.transfer = 1;
> >>> return len;
> >>> + case TABLE_TRANSFER_WIRE_ORIG:
> >>> + if (!out->args.table.attr.flow_attr.transfer)
> >>> + return -1;
> >>> + out->args.table.attr.flow_attr.transfer_mode = 1;
> >>> + return len;
> >>> + case TABLE_TRANSFER_VF_ORIG:
> >>> + if (!out->args.table.attr.flow_attr.transfer)
> >>> + return -1;
> >>> + out->args.table.attr.flow_attr.transfer_mode = 2;
> >>> + return len;
> >>> default:
> >>> return -1;
> >>> }
> >>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> index 330e34427d..603b7988dd 100644
> >>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
> >>>
> >>> flow template_table {port_id} create
> >>> [table_id {id}] [group {group_id}]
> >>> - [priority {level}] [ingress] [egress] [transfer]
> >>> + [priority {level}] [ingress] [egress]
> >>> + [transfer [vf_orig] [wire_orig]]
> >>
> >> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
> >> [wire_orig] ?
> >>
> >>> rules_number {number}
> >>> pattern_template {pattern_template_id}
> >>> actions_template {actions_template_id}
> >>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>> index a79f1e7ef0..512b08d817 100644
> >>> --- a/lib/ethdev/rte_flow.h
> >>> +++ b/lib/ethdev/rte_flow.h
> >>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
> >>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
> >>> */
> >>> uint32_t transfer:1;
> >>> - uint32_t reserved:29; /**< Reserved, must be zero. */
> >>> + /**
> >>> + * 0 means bidirection,
> >>> + * 0x1 origin uplink,
> >>
> >> What does "uplink" mean? It's too vague. Hardly a good term.
> >>
> >>> + * 0x2 origin vport,
> >>
> >> What does "origin vport" mean? Hardly a good term as well.
> >>
> >>> + * N/A both set.
> >>
> >> What's this?
> >>
> >>> + */
> >>> + uint32_t transfer_mode:2;
> >>> + uint32_t reserved:27; /**< Reserved, must be zero. */
> >>> };
> >>>
> >>> /**
> >>> --
> >>> 2.27.0
> >>>
> >>
> >> Since the attributes are added to generic 'struct rte_flow_attr',
> >> non-table
> >> (synchronous) flow rules are supposed to support them, too. If that
> >> is indeed the case, then I'm afraid such proposal does not agree with
> >> the existing items PORT_REPRESENTOR and REPRESENTED_PORT. They do
> >> exactly the same thing, but they are designed to be way more generic. Why
> not use them?
>
> The question stands.
>
> >>
> >> Ivan
> >
>
> Ivan
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-14 5:16 ` Rongwei Liu
@ 2022-09-14 7:32 ` Ivan Malov
2022-09-14 10:17 ` Rongwei Liu
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-14 7:32 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi,
On Wed, 14 Sep 2022, Rongwei Liu wrote:
> HI
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, September 13, 2022 22:33
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
>> Darawsheh <rasland@nvidia.com>
>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Rongwei,
>>
>> PSB
>>
>> On Tue, 13 Sep 2022, Rongwei Liu wrote:
>>
>>> Hi
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Tuesday, September 13, 2022 00:57
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating the
>>>> transfer table
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> Hi,
>>>>
>>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>>>>
>>>>> The transfer domain rule is able to match traffic wire/vf origin and
>>>>> it means two directions' underlayer resource.
>>>>
>>>> The point of fact is that matching traffic coming from some entity
>>>> like wire / VF has been long generalised in the form of representors.
>>>> So, a flow rule with attribute "transfer" is able to match traffic
>>>> coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR
>> (please find these items).
>>>>
>>>>>
>>>>> In customer deployments, they usually match only one direction
>>>>> traffic in single flow table: either from wire or from vf.
>>>>
>>>> Which customer deployments? Could you please provide detailed examples?
>>>>
>>>>>
>>>
>>> We saw a lot of customers' deployment like:
>>> 1. Match overlay traffic from wire and do decap, then send to specific vport.
>>> 2. Match specific 5-tuples and do encap, then send to wire.
>>> The matching criteria has obvious direction preference.
>>
>> Thank you. My questions are as follows:
>>
>> In (1), when you say "from wire", do you mean the need to match packets
>> arriving via whatever physical ports rather then matching packets arriving
>> from some specific phys. port?
^^
Could you please find my question above? Based on your understanding
of templates in async flow approach, an answer to this question may
help us find the common ground.
--
>>
>> If, however, matching traffic "from wire" in fact means matching packets
>> arriving from a *specific* physical port, then for sure item
>> REPRESENTED_PORT should perfectly do the job, and the proposed attribute is
>> unneeded.
>>
>> (BTW, in DPDK, it is customary to use term "physical port", not "wire")
>>
>> In (1), what are "vport"s? Please explain. Once again, I should remind that, in
>> DPDK, folks prefer terms "represented entity" / "representor"
>> over vendor-specific terms like "vport", etc.
>>
> Vport is virtual port for short such as VF.
Thanks. As I say, term "vport" might be confusing to some readers,
so it'd be better to provide this explanation (about VF)
in the commit description next time.
>> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
>> Could you please explain, why not just add a match item REPRESENTED_PORT
>> pointing to that VF via its representor? Doing so should perfectly define the
>> exact direction / traffic source. Isn't that sufficient?
>>
> Per my view, there is matching field and matching value difference.
> Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as same or different matching criteria?
> I would like to call them same since it can be summarized like 1.1.1.0/30
> REPRESENTED_PORT is just another matching item, no essential differences and it can't stand for direction info.
It looks like we're starting to run into disagreement here.
There's no "direction" at all. There's an embedded switch
inside the NIC, and there're (logical) switch ports that
packets enter the switch from.
When the user submits a "transfer" rule and provides
neither REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern,
the embedded switch is supposed to match packets coming from
ANY ports, be it VFs or physical (wire) ports.
But when the user provides, for example, item REPRESENTED_PORT
to point to the physical (wire) port, the embedded switch
knows exactly which port the packets should enter it from.
In this case, it is supposed to match only packets coming
from that physical port. And this should be sufficient.
This in fact replaces the need to know a "direction".
It's just an exact specification of packet's origin.
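As a rough sketch of the semantics just described (the types are hypothetical stand-ins, not the rte_flow definitions): a pattern without a source-port item matches packets from any switch port, and a pattern with one matches only that port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a "transfer" rule pattern. */
struct transfer_pattern {
	bool has_src_port;  /* true if a REPRESENTED_PORT-like item is present */
	uint16_t src_port;  /* switch port the packet must enter from */
};

/* No source-port item: any port matches. Otherwise only the named port does. */
static bool
src_port_matches(const struct transfer_pattern *p, uint16_t ingress_port)
{
	assert(p);
	return !p->has_src_port || p->src_port == ingress_port;
}
```

In this model, the packet's origin is pinned down by the item alone, with no separate notion of "direction".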
> Port id depends on the attach sequence.
Unfortunately, this is hardly a good argument because flow rules
are supposed to be inserted based on the run-time packet
learning. Attach sequence is a don't care here.
>> Also please mind that, although I appreciate your explanations here, on the
>> mailing list, they should finally be added to the commit message, so that
>> readers do not have to look for them elsewhere.
>>
> We have explained the high possibility of single-direction matching, right?
Not quite. As I said, it is not correct to assume any "direction", like in
geographical sense ("north", "south", etc.). Application has ethdevs, and
they are representors of some "virtual ports" (in your terminology)
belonging to the switch, for example, VFs, SFs or physical ports.
The user adds an appropriate item to the pattern (REPRESENTED_PORT),
and doing so specifies the path by which the packet enters the switch.
> It' hard to list all the possibilities of traffic matching preferences.
And let's say more: one need never do this. That's exactly the reason
why DPDK has abandoned the concept of "direction" in *transfer* rules
and switched to the use of precise criteria (REPRESENTED_PORT, etc.).
> The underlay is the one we have met for now.
>>>
>>>>> Introduce one new member transfer_mode into rte_flow_attr to
>>>>> indicate the flow table direction property: from wire, from vf or
>>>>> bi-direction(default).
>>>>
>>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion
>>>> and asynchronous (table) approach. The patch adds the attributes to
>>>> generic 'rte_flow_attr' but, for some reason, ignores non-table rules.
>>>>
>>>>>
>>> Sync API uses one rule to contain everything. It' hard for PMD to determine
>> if this rule has direction preference or not.
>>> Image a situation, just for an example:
>>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
>>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
>>> 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan /...), so
>> sync API consider them share matching determination logic.
>>> It means "2" have 1M scale capability too. Obviously, it wastes a lot of
>> resources.
>>
>> Strictly speaking, they do not share the same match pattern.
>> Your example clearly shows that, in (1), the pattern should request packets
>> coming from "vport 1" and, in (2), packets coming from "vport 0".
>>
>> My point is simple: the "vport" from which packets enter the embedded switch
>> is ALSO a match criterion. If you accept this, you'll see: the matching
>> conditions differ.
>>
> See above.
> In this case, I think the matching fields are both "port_id + ipv4_vxlan". They are same.
> Only differs with values like vni 100 or 200 vice versa.
Not quite. Look closer: you use *different* port IDs for (1) and (2).
The value of "ethdev_id" field in item REPRESENTED_PORT differs.
>>>
>>> In async API, there is pattern_template introduced. We can mark "1" to use
>> pattern_tempate id 1 and "2" to use pattern_template 2.
>>> They will be separated from each other, don't share anymore.
>>
>> Consider an example. "Wire" is a physical port represented by PF0 which, in
>> turn, is attached to DPDK via ethdev 0. "VF" (vport?) is attached to guest and is
>> represented by a representor ethdev 1 in DPDK.
>>
>> So, some rules (template 1) are needed to deliver packets from "wire"
>> to "VF" and also decapsulate them. And some rules (template 2) are needed to
>> deliver packets in the opposite direction, from "VF"
>> to "wire" and also encapsulate them.
>>
>> My question is, what prevents you from adding match item
>> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
>> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
>>
>> As I said previously, if you insert such item before eth / ipv4 / etc to your
>> match pattern, doing so defines an *exact* direction / source.
>>
> Could you check the async API guidance? I think pattern template focusing on the matching field (mask).
> "REPRESENTED_PORT[ethdev_id=0] " and "REPRESENTED_PORT[ethdev_id=1] "are the same.
> 1. pattern template: REPRESENTED_PORT mask 0xffff ...
> 2. action template: action1 / actions2. /
> 3. table create with pattern_template plus action template..
> REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create REPRESENTED_PORT port_id is 0 / actions ....
> REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create REPRESENTED_PORT port_id is 1 / actions ....
OK, so, based on this explanation, it appears that
you might be looking to refer to:
a) a *set* of any physical (wire) ports
b) a *set* of any guest ports (VFs)
You chose to achieve this using an attribute, but:
1) as I explained above, the use of term "direction" is wrong;
please hear me out: I'm not saying that your use case and
your optimisation is wrong: I'm saying that naming for it
is wrong: it has nothing to do with "direction";
2) while naming a *set* of wire ports as "wire_orig" might be OK,
sticking with term "vf_orig" for a *set* of guest ports is
clearly not, simply because the user may pass another PF
to a guest instead of passing a VF; in other words,
a better term is needed here;
3) since it is possible to plug multiple NICs to a DPDK application,
even from different vendors, the user may end up having multiple
physical ports belonging to different physical NICs attached to
the application; if this is the case, then referring to a *set*
of wire ports using the new attribute is ambiguous in the
sense that it's unclear whether this applies only to
wire ports of some specific physical NIC or to the
physical ports of *all* NICs managed by the app;
4) adding an attribute instead of yet another pattern item type
is not quite good because PMDs need to be updated separately
to detect this attribute and throw an error if it's not
supported, whilst with a new item type, the PMDs do not
need to be updated = if a PMD sees an unsupported item
while traversing the items with switch () { case }, it
will anyway throw an error;
5) as in (4), a new attribute is not good from documentation
standpoint; please search for "represented_port = Y" in
documentation = this way, all supported items are
easily defined for various NIC vendors, but the
same isn't true for attributes = there is no
way to indicate supported attributes in doc.
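To illustrate point (4), here is a minimal sketch of how a PMD typically walks a pattern. The enum and function names are hypothetical, not taken from any real driver. An item type the driver does not know falls into the default branch and is rejected without any driver change:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical item types; ITEM_ANY_PHY_PORTS stands for a newly added one. */
enum item_type {
	ITEM_END,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_REPRESENTED_PORT,
	ITEM_ANY_PHY_PORTS,
};

/* Walk an ITEM_END-terminated pattern; reject any item the driver lacks. */
static int
pmd_parse_pattern(const enum item_type *items)
{
	assert(items != NULL);
	for (; *items != ITEM_END; items++) {
		switch (*items) {
		case ITEM_ETH:
		case ITEM_IPV4:
		case ITEM_REPRESENTED_PORT:
			break; /* supported by this (imaginary) driver */
		default:
			return -ENOTSUP; /* unknown item: error without a driver update */
		}
	}
	return 0;
}
```

A new attribute bit gets no such treatment for free: a driver that never looks at it would silently accept the rule.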
If points (1 - 5) make sense to you, then, if I may be so bold,
I'd like to suggest that the idea of adding a new attribute be
abandoned. Instead, I'd like to suggest adding new items:
(the names are just a sketch; for sure, they should be discussed)
ANY_PHY_PORTS { switch_domain_id }
= match packets entering the embedded switch from *whatever*
physical ports belonging to the given switch domain
ANY_GUEST_PORTS { switch_domain_id }
= match packets entering the embedded switch from *whatever*
guest ports (VFs, PFs, etc.) belonging to the given
switch domain
The field "switch_domain_id" is required to tell one physical
board / vendor from another (as I explained in point (3)).
The application can query this parameter from ethdev's
switch info: please see "struct rte_eth_switch_info".
What's your opinion?
>
>>>
>>>> For example, the diff below adds the attributes to "table" commands
>>>> in testpmd but does not add them to regular (non-table) commands like
>>>> "flow create". Why?
>>>>
>>>>>
>>>
>>> "table" command limits pattern_template to single direction or bidirection
>>> per user specified attribute.
>>
>> As I say above, the same effect can be achieved by adding item
>> REPRESENTED_PORT to the corresponding pattern template.
> See above.
>>
>>> "rule" command must be tied to one "table_id", so the rule will inherit the
>>> "table" direction property; no need to specify it again.
>>
>> You might've misunderstood. I am not talking about the "rule" command coupled with
>> some "table". What I am talking about is regular, NON-async flow insertion
>> commands.
>>
>> Please take a look at section "/* Validate/create attributes. */" in file
>> "app/test-pmd/cmdline_flow.c". When one adds a new flow attribute, they
>> should reflect it the same way as VC_INGRESS, VC_TRANSFER, etc.
>>
>> That's it.
> We don't intend to pass this to sync API. The above code example is for sync API.
So I understand. But there's one slight problem: in your patch, you add
the new attributes to the structure which is *shared* between sync and
async use case scenarios. If one adds an attribute to this structure,
they have to provide accessors for it in all sync-related commands
in testpmd, but your patch does not do that.
In other words, it is wrong to assume that "struct rte_flow_attr" only
applies to async approach. It had been introduced long before the
async flow design was added to DPDK. That's it.
>>
>> But, as I say, I still believe that the new attributes aren't needed.
> I think we are not at the same page for now. Can we reach agreement on the same
> matching criteria first?
>>>
>>>>> It helps to save underlayer memory also on insertion rate.
>>>>
>>>> Which memory? Host memory? NIC memory? Term "underlayer" is vague.
>>>> I suggest that the commit message be revised to first explain how
>>>> such memory is spent currently, then explain why this is not optimal
>>>> and, finally, which way the patch is supposed to improve that. I.e. be more
>>>> specific.
>>>>
>>>>>
>>>
>>> For large-scale rule sets, HW (depending on implementation) always needs
>>> memory to hold the rules' patterns and actions, either from the NIC or from the host.
>>> The memory footprint highly depends on "user rules' complexity", and also differs
>>> between NICs.
>>> ~50% memory saving is expected if one direction is cut.
>>
>> Regardless of this talk, this explanation should probably be present in the
>> commit description.
>>
> This number may differ with different NICs or implementation. We can't say it for sure.
Not an exact number, of course, but a brief explanation of:
a) what is wrong / not optimal in the current design;
b) how it is observed in customer deployments;
c) why the proposed patch is a good solution.
>>>
>>>>> By default, the transfer domain is bi-direction, and no behavior changes.
>>>>>
>>>>> 1. Match wire origin only
>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>>> 2. Match vf origin only
>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>>>
>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>>>> ---
>>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>>>> lib/ethdev/rte_flow.h | 9 ++++++-
>>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>>>>> index 7f50028eb7..b25b595e82 100644
>>>>> --- a/app/test-pmd/cmdline_flow.c
>>>>> +++ b/app/test-pmd/cmdline_flow.c
>>>>> @@ -177,6 +177,8 @@ enum index {
>>>>> TABLE_INGRESS,
>>>>> TABLE_EGRESS,
>>>>> TABLE_TRANSFER,
>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>> TABLE_RULES_NUMBER,
>>>>> TABLE_PATTERN_TEMPLATE,
>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
>>>>> TABLE_INGRESS,
>>>>> TABLE_EGRESS,
>>>>> TABLE_TRANSFER,
>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>> TABLE_RULES_NUMBER,
>>>>> TABLE_PATTERN_TEMPLATE,
>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
>>>>> .next = NEXT(next_table_attr),
>>>>> .call = parse_table,
>>>>> },
>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>>>> + .name = "wire_orig",
>>>>> + .help = "affect rule direction to transfer",
>>>>
>>>> This does not explain the "wire" aspect. It's too broad.
>>>>
>>>>> + .next = NEXT(next_table_attr),
>>>>> + .call = parse_table,
>>>>> + },
>>>>> + [TABLE_TRANSFER_VF_ORIG] = {
>>>>> + .name = "vf_orig",
>>>>> + .help = "affect rule direction to transfer",
>>>>
>>>> This explanation simply duplicates such of the "wire_orig".
>>>> It does not explain the "vf" part. Should be more specific.
>>>>
>>>>> + .next = NEXT(next_table_attr),
>>>>> + .call = parse_table,
>>>>> + },
>>>>> [TABLE_RULES_NUMBER] = {
>>>>> .name = "rules_number",
>>>>> .help = "number of rules in table",
>>>>> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
>>>>> case TABLE_TRANSFER:
>>>>> out->args.table.attr.flow_attr.transfer = 1;
>>>>> return len;
>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>> + return -1;
>>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
>>>>> + return len;
>>>>> + case TABLE_TRANSFER_VF_ORIG:
>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>> + return -1;
>>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
>>>>> + return len;
>>>>> default:
>>>>> return -1;
>>>>> }
>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> index 330e34427d..603b7988dd 100644
>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
>>>>>
>>>>> flow template_table {port_id} create
>>>>> [table_id {id}] [group {group_id}]
>>>>> - [priority {level}] [ingress] [egress] [transfer]
>>>>> + [priority {level}] [ingress] [egress]
>>>>> + [transfer [vf_orig] [wire_orig]]
>>>>
>>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
>>>> [wire_orig] ?
>>>>
>>>>> rules_number {number}
>>>>> pattern_template {pattern_template_id}
>>>>> actions_template {actions_template_id}
>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>> index a79f1e7ef0..512b08d817 100644
>>>>> --- a/lib/ethdev/rte_flow.h
>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
>>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
>>>>> */
>>>>> uint32_t transfer:1;
>>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
>>>>> + /**
>>>>> + * 0 means bidirection,
>>>>> + * 0x1 origin uplink,
>>>>
>>>> What does "uplink" mean? It's too vague. Hardly a good term.
I believe this comment should be reworked, in case
the idea of having an extra attribute persists.
>>>>
>>>>> + * 0x2 origin vport,
>>>>
>>>> What does "origin vport" mean? Hardly a good term as well.
I still believe this explanation is way too brief and needs
to be reworked to provide more details, to define the
use case for the attribute more specifically.
>>>>
>>>>> + * N/A both set.
>>>>
>>>> What's this?
The question stands.
>>>>
>>>>> + */
>>>>> + uint32_t transfer_mode:2;
>>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
>>>>> };
>>>>>
>>>>> /**
>>>>> --
>>>>> 2.27.0
>>>>>
>>>>
>>>> Since the attributes are added to generic 'struct rte_flow_attr',
>>>> non-table (synchronous) flow rules are supposed to support them, too. If that
>>>> is indeed the case, then I'm afraid such proposal does not agree with
>>>> the existing items PORT_REPRESENTOR and REPRESENTED_PORT. They do
>>>> exactly the same thing, but they are designed to be way more generic. Why
>>>> not use them?
>>
>> The question stands.
>>
>>>>
>>>> Ivan
>>>
>>
>> Ivan
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-14 7:32 ` Ivan Malov
@ 2022-09-14 10:17 ` Rongwei Liu
2022-09-14 15:18 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-14 10:17 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
HI
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Wednesday, September 14, 2022 15:32
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
> Darawsheh <rasland@nvidia.com>
> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
> Hi,
>
> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>
> > HI
> >
> > BR
> > Rongwei
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >> Sent: Tuesday, September 13, 2022 22:33
> >> To: Rongwei Liu <rongweil@nvidia.com>
> >> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
> >> Raslan Darawsheh <rasland@nvidia.com>
> >> Subject: RE: [PATCH v1] ethdev: add direction info when creating the
> >> transfer table
> >>
> >> Hi Rongwei,
> >>
> >> PSB
> >>
> >> On Tue, 13 Sep 2022, Rongwei Liu wrote:
> >>
> >>> Hi
> >>>
> >>> BR
> >>> Rongwei
> >>>
> >>>> -----Original Message-----
> >>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>> Sent: Tuesday, September 13, 2022 00:57
> >>>> To: Rongwei Liu <rongweil@nvidia.com>
> >>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
> >>>> Raslan Darawsheh <rasland@nvidia.com>
> >>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating
> >>>> the transfer table
> >>>>
> >>>> Hi,
> >>>>
> >>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
> >>>>
> >>>>> The transfer domain rule is able to match traffic wire/vf origin
> >>>>> and it means two directions' underlayer resource.
> >>>>
> >>>> The point of fact is that matching traffic coming from some entity
> >>>> like wire / VF has been long generalised in the form of representors.
> >>>> So, a flow rule with attribute "transfer" is able to match traffic
> >>>> coming from either a REPRESENTED_PORT or from a PORT_REPRESENTOR
> >>>> (please find these items).
> >>>>
> >>>>>
> >>>>> In customer deployments, they usually match only one direction
> >>>>> traffic in single flow table: either from wire or from vf.
> >>>>
> >>>> Which customer deployments? Could you please provide detailed
> >>>> examples?
> >>>>
> >>>>>
> >>>
> >>> We have seen many customer deployments like:
> >>> 1. Match overlay traffic from wire and do decap, then send to a specific vport.
> >>> 2. Match specific 5-tuples and do encap, then send to wire.
> >>> The matching criteria have an obvious direction preference.
> >>
> >> Thank you. My questions are as follows:
> >>
> >> In (1), when you say "from wire", do you mean the need to match
> >> packets arriving via whatever physical ports rather than matching
> >> packets arriving from some specific phys. port?
>
> ^^
>
> Could you please find my question above? Based on your understanding of
> templates in async flow approach, an answer to this question may help us find
> the common ground.
It means traffic arriving from physical ports (the transfer_proxy role), or southbound in your terms.
Traffic from a vport (not the transfer_proxy), or northbound in your terms, won't hit the rule even if the packets are identical.
>
> --
>
> >>
> >> If, however, matching traffic "from wire" in fact means matching
> >> packets arriving from a *specific* physical port, then for sure item
> >> REPRESENTED_PORT should perfectly do the job, and the proposed
> >> attribute is unneeded.
> >>
> >> (BTW, in DPDK, it is customary to use term "physical port", not
> >> "wire")
> >>
> >> In (1), what are "vport"s? Please explain. Once again, I should
> >> remind that, in DPDK, folks prefer terms "represented entity" / "representor"
> >> over vendor-specific terms like "vport", etc.
> >>
> > Vport is virtual port for short such as VF.
>
> Thanks. As I say, term "vport" might be confusing to some readers, so it'd be
> better to provide this explanation (about VF) in the commit description next
> time.
Ack. Will add VF as an example.
>
> >> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
> >> Could you please explain, why not just add a match item
> >> REPRESENTED_PORT pointing to that VF via its representor? Doing so
> >> should perfectly define the exact direction / traffic source. Isn't that
> >> sufficient?
> >>
> > In my view, there is a difference between a matching field and a matching value.
> > Take IPv4 src_addr 1.1.1.1, 1.1.1.2, 1.1.1.3: would you treat them as the same or
> > different matching criteria?
> > I would call them the same since they can be summarized as 1.1.1.0/30.
> > REPRESENTED_PORT is just another matching item, with no essential
> > difference, and it can't stand for direction info.
>
> It looks like we're starting to run into disagreement here.
> There's no "direction" at all. There's an embedded switch inside the NIC, and
> there're (logical) switch ports that packets enter the switch from.
>
> When the user submits a "transfer" rule and provides neither
> REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the embedded
> switch is supposed to match packets coming from ANY ports, be it VFs or
> physical (wire) ports.
>
> But when the user provides, for example, item REPRESENTED_PORT to point to
> the physical (wire) port, the embedded switch knows exactly which port the
> packets should enter it from.
> In this case, it is supposed to match only packets coming from that physical
> port. And this should be sufficient.
> This in fact replaces the need to know a "direction".
> It's just an exact specification of packet's origin.
>
Traffic always arrives at or leaves the switch, so there is always a direction, implicit or explicit.
For transfer rules, there is the concept of a transfer_proxy.
It takes ownership of the switch; all switch rules should be configured via the transfer_proxy.
Imagine a logical switch with one PF and two VFs.
The PF is the transfer proxy and the VFs logically belong to the PF.
When traffic is received from the PF, we can say it comes into the logical switch.
When a packet is sent from a VF (which belongs to the PF), we can say the traffic leaves the switch.
Item REPRESENTED_PORT tells the switch which port the matched traffic was sent from, whether it comes into or leaves the switch.
We can regard it as one kind of packet metadata.
Like you said, DPDK always treats transfer as matching traffic from any port.
When REPRESENTED_PORT is specified, the rules are limited to some dedicated ports.
Other ports are ignored because of the metadata mismatch.
Rules still have the capability to match any port if the metadata matches.
This update will allow the user to cut the matching capability for the other ports.
> > Port id depends on the attach sequence.
>
> Unfortunately, this is hardly a good argument because flow rules are supposed
> to be inserted based on the run-time packet learning. Attach sequence is a
> don't care here.
>
> >> Also please mind that, although I appreciate your explanations here,
> >> on the mailing list, they should finally be added to the commit
> >> message, so that readers do not have to look for them elsewhere.
> >>
> > We have explained the high possibility of single-direction matching, right?
>
> Not quite. As I said, it is not correct to assume any "direction", like in
> geographical sense ("north", "south", etc.). Application has ethdevs, and they
> are representors of some "virtual ports" (in your terminology) belonging to the
> switch, for example, VFs, SFs or physical ports.
>
> The user adds an appropriate item to the pattern (REPRESENTED_PORT), and
> doing so specifies the packet path which it enters the switch.
>
> > It's hard to list all the possibilities of traffic matching preferences.
>
> And let's say more: one need never do this. That's exactly the reason why
> DPDK has abandoned the concept of "direction" in *transfer* rules and
> switched to the use of precise criteria (REPRESENTED_PORT, etc.).
>
As far as I know, DPDK changed "transfer ingress" to "transfer", so it's clearer that transfer can match both directions (both ingress and egress).
REPRESENTED_PORT is the evolution of "port_id"; I think it's only one kind of matching item.
For a large-scale deployment like 10M rules, if we can save resources significantly by introducing direction, why not?
Again, with the async API:
1. pattern template A
2. action template B
3. table C with pattern template A + action template B.
4. rule D, E, F...
The specific REPRESENTED_PORT is provided in the rules (D, E, F...), not in pattern template A, action template B or table C.
Resources may be allocated early, at step 3, because of the table's rule_nums property.
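To make the step 3 argument concrete, here is a purely illustrative model. It
is not how any particular PMD or NIC works: the enum, the function and the
"one entry per rule per covered direction" assumption are all hypothetical,
and real consumption depends on the hardware and the implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical direction modes mirroring transfer_mode in the patch. */
enum sketch_transfer_mode {
	SKETCH_MODE_BIDIR = 0,     /* default: wire origin and VF origin */
	SKETCH_MODE_WIRE_ORIG = 1,
	SKETCH_MODE_VF_ORIG = 2,
};

/*
 * Entries reserved at table creation (step 3), before any rule and any
 * REPRESENTED_PORT value is known.
 * Assumption: one entry per rule per covered direction.
 */
uint64_t
sketch_table_reserved_entries(uint32_t rules_number,
			      enum sketch_transfer_mode mode)
{
	unsigned int directions = (mode == SKETCH_MODE_BIDIR) ? 2 : 1;

	return (uint64_t)rules_number * directions;
}
```

Under this assumption, a table sized for 10M rules reserves 20M entries in the
default mode and 10M when limited to one origin, which is where the rough 50%
figure comes from.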
> > The underlay is the one we have met for now.
> >>>
> >>>>> Introduce one new member transfer_mode into rte_flow_attr to
> >>>>> indicate the flow table direction property: from wire, from vf or
> >>>>> bi-direction(default).
> >>>>
> >>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion
> >>>> and asynchronous (table) approach. The patch adds the attributes to
> >>>> generic 'rte_flow_attr' but, for some reason, ignores non-table rules.
> >>>>
> >>>>>
> >>> Sync API uses one rule to contain everything. It's hard for the PMD to
> >>> determine whether this rule has a direction preference or not.
> >>> Imagine a situation, just as an example:
> >>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
> >>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
> >>> 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan
> >>> /...), so the sync API considers them to share matching determination logic.
> >>> It means "2" has 1M scale capability too. Obviously, it wastes a
> >>> lot of resources.
> >>
> >> Strictly speaking, they do not share the same match pattern.
> >> Your example clearly shows that, in (1), the pattern should request
> >> packets coming from "vport 1" and, in (2), packets coming from "vport 0".
> >>
> >> My point is simple: the "vport" from which packets enter the embedded
> >> switch is ALSO a match criterion. If you accept this, you'll see: the
> >> matching conditions differ.
> >>
> > See above.
> > In this case, I think the matching fields are both "port_id + ipv4_vxlan". They
> > are the same.
> > They only differ in values, like vni 100 or 200, or vice versa.
>
> Not quite. Look closer: you use *different* port IDs for (1) and (2).
> The value of "ethdev_id" field in item REPRESENTED_PORT differs.
>
> >>>
> >>> In async API, there is pattern_template introduced. We can mark "1"
> >>> to use pattern_template id 1 and "2" to use pattern_template 2.
> >>> They will be separated from each other and won't share anymore.
> >>
> >> Consider an example. "Wire" is a physical port represented by PF0
> >> which, in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is
> >> attached to guest and is represented by a representor ethdev 1 in DPDK.
> >>
> >> So, some rules (template 1) are needed to deliver packets from "wire"
> >> to "VF" and also decapsulate them. And some rules (template 2) are
> >> needed to deliver packets in the opposite direction, from "VF"
> >> to "wire" and also encapsulate them.
> >>
> >> My question is, what prevents you from adding match item
> >> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
> >> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
> >>
> >> As I said previously, if you insert such item before eth / ipv4 / etc
> >> to your match pattern, doing so defines an *exact* direction / source.
> >>
> > Could you check the async API guidance? I think a pattern template focuses
> > on the matching field (mask).
> > "REPRESENTED_PORT[ethdev_id=0]" and "REPRESENTED_PORT[ethdev_id=1]" are the same.
> > 1. pattern template: REPRESENTED_PORT mask 0xffff ...
> > 2. action template: action1 / actions2.
> > 3. table create with pattern_template plus action template..
> > REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create REPRESENTED_PORT port_id is 0 / actions ....
> > REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create REPRESENTED_PORT port_id is 1 / actions ....
>
> OK, so, based on this explanation, it appears that you might be looking to refer
> to:
> a) a *set* of any physical (wire) ports
> b) a *set* of any guest ports (VFs)
>
Great, it looks like we are getting closer to an agreement.
> You chose to achieve this using an attribute, but:
>
> 1) as I explained above, the use of term "direction" is wrong;
> please hear me out: I'm not saying that your use case and
> your optimisation is wrong: I'm saying that naming for it
> is wrong: it has nothing to do with "direction";
>
Do you have any better naming proposal?
> 2) while naming a *set* of wire ports as "wire_orig" might be OK,
> sticking with term "vf_orig" for a *set* of guest ports is
> clearly not, simply because the user may pass another PF
> to a guest instead of passing a VF; in other words,
> a better term is needed here;
>
Like you said, a vport may be a VF, SF etc. vport_origin is from the logical switch's perspective.
Any proposal is welcome.
> 3) since it is possible to plug multiple NICs to a DPDK application,
> even from different vendors, the user may end up having multiple
> physical ports belonging to different physical NICs attached to
> the application; if this is the case, then referring to a *set*
> of wire ports using the new attribute is ambiguous in the
> sense that it's unclear whether this applies only to
> wire ports of some specific physical NIC or to the
> physical ports of *all* NICs managed by the app;
>
No matter how many NICs have been probed by DPDK, there is always the switch/PF/VF/SF concept.
Each switch must have an owner identified by transfer_proxy(). A vport (VF/SF) can't cross switches in the normal case.
Traffic coming from one NIC can't be offloaded by other NICs unless it is forwarded by the application.
If the user uses the new attribute to cut one side's resources, I think the user is smart enough to manage the rules on different NICs.
No default behavior changes with this update.
> 4) adding an attribute instead of yet another pattern item type
> is not quite good because PMDs need to be updated separately
> to detect this attribute and throw an error if it's not
> supported, whilst with a new item type, the PMDs do not
> need to be updated: if a PMD sees an unsupported item
> while traversing the items with switch () { case }, it
> will throw an error anyway;
>
A PMD also needs to check whether it supports a new matching item or not, right?
We can't make assumptions about a NIC vendor's PMD implementation, right?
> 5) as in (4), a new attribute is not good from a documentation
> standpoint; please search for "represented_port = Y" in
> documentation: this way, all supported items are
> easily defined for various NIC vendors, but the
> same isn't true for attributes: there is no
> way to indicate supported attributes in doc.
>
> If points (1 - 5) make sense to you, then, if I may be so bold, I'd like to suggest
> that the idea of adding a new attribute be abandoned. Instead, I'd like to
> suggest adding new items:
>
> (the names are just a sketch, of course; they should be discussed)
>
> ANY_PHY_PORTS { switch_domain_id }
> = match packets entering the embedded switch from *whatever*
> physical ports belonging to the given switch domain
>
How many PHY_PORTS can one switch have, in your view? Can I treat the PHY_PORTS as the owner of the { switch_domain_id }, like transfer_proxy()?
> ANY_GUEST_PORTS { switch_domain_id }
> = match packets entering the embedded switch from *whatever*
> guest ports (VFs, PFs, etc.) belonging to the given
> switch domain
>
> The field "switch_domain_id" is required to tell one physical board / vendor
> from another (as I explained in point (3)).
> The application can query this parameter from ethdev's switch info: please see
> "struct rte_eth_switch_info".
>
> What's your opinion?
>
How should we handle the relationship between ANY_PHY_PORTS / ANY_GUEST_PORTS and REPRESENTED_PORT if they conflict?
That needs further tuning.
Like I said before, offloaded rules can't cross different NIC vendors' "switch_domain_id".
If the user probes multiple NICs in one application, the application should take care of packet forwarding.
The application should also be aware of which ports belong to which NICs.
> >
> >>>
> >>>> For example, the diff below adds the attributes to "table" commands
> >>>> in testpmd but does not add them to regular (non-table) commands
> >>>> like "flow create". Why?
> >>>>
> >>>>>
> >>>
> >>> "table" command limits pattern_template to single direction or
> >>> bidirection per user specified attribute.
> >>
> >> As I say above, the same effect can be achieved by adding item
> >> REPRESENTED_PORT to the corresponding pattern template.
> > See above.
> >>
> >>> "rule" command must be tied to one "table_id", so the rule will
> >>> inherit the "table" direction property; no need to specify it again.
> >>
> >> You might've misunderstood. I am not talking about the "rule" command coupled
> >> with some "table". What I am talking about is regular, NON-async flow
> >> insertion commands.
> >>
> >> Please take a look at section "/* Validate/create attributes. */" in
> >> file "app/test-pmd/cmdline_flow.c". When one adds a new flow
> >> attribute, they should reflect it the same way as VC_INGRESS,
> >> VC_TRANSFER, etc.
> >>
> >> That's it.
> > We don't intend to pass this to sync API. The above code example is for sync
> > API.
>
> So I understand. But there's one slight problem: in your patch, you add the new
> attributes to the structure which is *shared* between sync and async use case
> scenarios. If one adds an attribute to this structure, they have to provide
> accessors for it in all sync-related commands in testpmd, but your patch does
> not do that.
>
As the title says, "creating transfer table" is an async operation.
We have limited the scope of this patch. The sync API will be another story.
Maybe we can add one more sentence to emphasize the async API again.
> In other words, it is wrong to assume that "struct rte_flow_attr" only applies to
> async approach. It had been introduced long before the async flow design was
> added to DPDK. That's it.
>
> >>
> >> But, as I say, I still believe that the new attributes aren't needed.
> > I think we are not at the same page for now. Can we reach agreement on
> > the same matching criteria first?
> >>>
> >>>>> It helps to save underlayer memory also on insertion rate.
> >>>>
> >>>> Which memory? Host memory? NIC memory? Term "underlayer" is
> >>>> vague.
> >>>> I suggest that the commit message be revised to first explain how
> >>>> such memory is spent currently, then explain why this is not
> >>>> optimal and, finally, which way the patch is supposed to improve
> >>>> that. I.e. be more specific.
> >>>>
> >>>>>
> >>>
> >>> For large-scale rule sets, HW (depending on implementation) always
> >>> needs memory to hold the rules' patterns and actions, either from the NIC
> >>> or from the host.
> >>> The memory footprint highly depends on "user rules' complexity",
> >>> and also differs between NICs.
> >>> ~50% memory saving is expected if one direction is cut.
> >>
> >> Regardless of this talk, this explanation should probably be present
> >> in the commit description.
> >>
> > This number may differ with different NICs or implementations. We can't say
> > it for sure.
>
> Not an exact number, of course, but a brief explanation of:
> a) what is wrong / not optimal in the current design;
Please check the commit log: transfer has the capability to match bi-directional traffic no matter which ports are involved.
> b) how it is observed in customer deployments;
Customers have a requirement to save resources, and their offloaded rules are direction aware.
> c) why the proposed patch is a good solution.
The new attributes provide a way to remove one direction and save underlayer resources.
All of the above can be found in the commit log.
>
> >>>
> >>>>> By default, the transfer domain is bi-direction, and no behavior changes.
> >>>>>
> >>>>> 1. Match wire origin only
> >>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >>>>> 2. Match vf origin only
> >>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >>>>>
> >>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> >>>>> ---
> >>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
> >>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> >>>>> lib/ethdev/rte_flow.h | 9 ++++++-
> >>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> >>>>> index 7f50028eb7..b25b595e82 100644
> >>>>> --- a/app/test-pmd/cmdline_flow.c
> >>>>> +++ b/app/test-pmd/cmdline_flow.c
> >>>>> @@ -177,6 +177,8 @@ enum index {
> >>>>> TABLE_INGRESS,
> >>>>> TABLE_EGRESS,
> >>>>> TABLE_TRANSFER,
> >>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>> + TABLE_TRANSFER_VF_ORIG,
> >>>>> TABLE_RULES_NUMBER,
> >>>>> TABLE_PATTERN_TEMPLATE,
> >>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
> >>>>> TABLE_INGRESS,
> >>>>> TABLE_EGRESS,
> >>>>> TABLE_TRANSFER,
> >>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>> + TABLE_TRANSFER_VF_ORIG,
> >>>>> TABLE_RULES_NUMBER,
> >>>>> TABLE_PATTERN_TEMPLATE,
> >>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> >>>>> .next = NEXT(next_table_attr),
> >>>>> .call = parse_table,
> >>>>> },
> >>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
> >>>>> + .name = "wire_orig",
> >>>>> + .help = "affect rule direction to transfer",
> >>>>
> >>>> This does not explain the "wire" aspect. It's too broad.
> >>>>
> >>>>> + .next = NEXT(next_table_attr),
> >>>>> + .call = parse_table,
> >>>>> + },
> >>>>> + [TABLE_TRANSFER_VF_ORIG] = {
> >>>>> + .name = "vf_orig",
> >>>>> + .help = "affect rule direction to transfer",
> >>>>
> >>>> This explanation simply duplicates such of the "wire_orig".
> >>>> It does not explain the "vf" part. Should be more specific.
> >>>>
> >>>>> + .next = NEXT(next_table_attr),
> >>>>> + .call = parse_table,
> >>>>> + },
> >>>>> [TABLE_RULES_NUMBER] = {
> >>>>> .name = "rules_number",
> >>>>> .help = "number of rules in table",
> >>>>> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
> >>>>> case TABLE_TRANSFER:
> >>>>> out->args.table.attr.flow_attr.transfer = 1;
> >>>>> return len;
> >>>>> + case TABLE_TRANSFER_WIRE_ORIG:
> >>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>> + return -1;
> >>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
> >>>>> + return len;
> >>>>> + case TABLE_TRANSFER_VF_ORIG:
> >>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>> + return -1;
> >>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
> >>>>> + return len;
> >>>>> default:
> >>>>> return -1;
> >>>>> }
> >>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> index 330e34427d..603b7988dd 100644
> >>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> @@ -3332,7 +3332,8 @@ It is bound to
> >>>> ``rte_flow_template_table_create()``::
> >>>>>
> >>>>> flow template_table {port_id} create
> >>>>> [table_id {id}] [group {group_id}]
> >>>>> - [priority {level}] [ingress] [egress] [transfer]
> >>>>> + [priority {level}] [ingress] [egress]
> >>>>> + [transfer [vf_orig] [wire_orig]]
> >>>>
> >>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
> >>>> [wire_orig] ?
> >>>>
> >>>>> rules_number {number}
> >>>>> pattern_template {pattern_template_id}
> >>>>> actions_template {actions_template_id} diff --git
> >>>>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> >>>>> a79f1e7ef0..512b08d817 100644
> >>>>> --- a/lib/ethdev/rte_flow.h
> >>>>> +++ b/lib/ethdev/rte_flow.h
> >>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
> >>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
> >>>>> */
> >>>>> uint32_t transfer:1;
> >>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
> >>>>> + /**
> >>>>> + * 0 means bidirection,
> >>>>> + * 0x1 origin uplink,
> >>>>
> >>>> What does "uplink" mean? It's too vague. Hardly a good term.
>
> I believe this comment should be reworked, in case the idea of having an extra
> attribute persists.
>
> >>>>
> >>>>> + * 0x2 origin vport,
> >>>>
> >>>> What does "origin vport" mean? Hardly a good term as well.
>
> I still believe this explanation is way too brief and needs to be reworked to
> provide more details, to define the use case for the attribute more specifically.
>
> >>>>
> >>>>> + * N/A both set.
> >>>>
> >>>> What's this?
>
> The question stands.
>
> >>>>
> >>>>> + */
> >>>>> + uint32_t transfer_mode:2;
> >>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
> >>>>> };
> >>>>>
> >>>>> /**
> >>>>> --
> >>>>> 2.27.0
> >>>>>
> >>>>
> >>>> Since the attributes are added to generic 'struct rte_flow_attr',
> >>>> non-table
> >>>> (synchronous) flow rules are supposed to support them, too. If that
> >>>> is indeed the case, then I'm afraid such proposal does not agree
> >>>> with the existing items PORT_REPRESENTOR and REPRESENTED_PORT.
> They
> >>>> do exactly the same thing, but they are designed to be way more
> >>>> generic. Why
> >> not use them?
> >>
> >> The question stands.
> >>
> >>>>
> >>>> Ivan
> >>>
> >>
> >> Ivan
> >
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-14 10:17 ` Rongwei Liu
@ 2022-09-14 15:18 ` Ivan Malov
2022-09-14 21:02 ` Thomas Monjalon
2022-09-15 0:58 ` Rongwei Liu
0 siblings, 2 replies; 96+ messages in thread
From: Ivan Malov @ 2022-09-14 15:18 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
On Wed, 14 Sep 2022, Rongwei Liu wrote:
> HI
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Wednesday, September 14, 2022 15:32
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
>> Darawsheh <rasland@nvidia.com>
>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi,
>>
>> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>>
>>> HI
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Tuesday, September 13, 2022 22:33
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the
>>>> transfer table
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> Hi Rongwei,
>>>>
>>>> PSB
>>>>
>>>> On Tue, 13 Sep 2022, Rongwei Liu wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> BR
>>>>> Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>> Sent: Tuesday, September 13, 2022 00:57
>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating
>>>>>> the transfer table
>>>>>>
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>>>>>>
>>>>>>> The transfer domain rule is able to match traffic wire/vf origin
>>>>>>> and it means two directions' underlayer resource.
>>>>>>
>>>>>> The point of fact is that matching traffic coming from some entity
>>>>>> like wire / VF has been long generalised in the form of representors.
>>>>>> So, a flow rule with attribute "transfer" is able to match traffic
>>>>>> coming from either a REPRESENTED_PORT or from a
>> PORT_REPRESENTOR
>>>> (please find these items).
>>>>>>
>>>>>>>
>>>>>>> In customer deployments, they usually match only one direction
>>>>>>> traffic in single flow table: either from wire or from vf.
>>>>>>
>>>>>> Which customer deployments? Could you please provide detailed
>> examples?
>>>>>>
>>>>>>>
>>>>>
>>>>> We saw a lot of customers' deployment like:
>>>>> 1. Match overlay traffic from wire and do decap, then send to specific
>> vport.
>>>>> 2. Match specific 5-tuples and do encap, then send to wire.
>>>>> The matching criteria has obvious direction preference.
>>>>
>>>> Thank you. My questions are as follows:
>>>>
>>>> In (1), when you say "from wire", do you mean the need to match
>>>> packets arriving via whatever physical ports rather then matching
>>>> packets arriving from some specific phys. port?
>>
>> ^^
>>
>> Could you please find my question above? Based on your understanding of
>> templates in async flow approach, an answer to this question may help us find
>> the common ground.
> It means traffic arrived from physical ports (transfer_proxy role) or south band per you concept.
Transfer proxy has nothing to do with physical ports. And I should stress
that "south band" and the likes are NOT my concepts. Instead, I think
that direction designations like "south" or "north" aren't applicable
when talking about the embedded switch and its flow (transfer) rules.
> Traffic from vport (not transfer_proxy) or north band per your concept won't hit even if same packets.
Please see above. Transfer proxy is a completely different concept.
And I never used the "north band" concept.
>>
>> --
>>
>>>>
>>>> If, however, matching traffic "from wire" in fact means matching
>>>> packets arriving from a *specific* physical port, then for sure item
>>>> REPRESENTED_PORT should perfectly do the job, and the proposed
>>>> attribute is unneeded.
>>>>
>>>> (BTW, in DPDK, it is customary to use term "physical port", not
>>>> "wire")
>>>>
>>>> In (1), what are "vport"s? Please explain. Once again, I should
>>>> remind that, in DPDK, folks prefer terms "represented entity" /
>> "representor"
>>>> over vendor-specific terms like "vport", etc.
>>>>
>>> Vport is virtual port for short such as VF.
>>
>> Thanks. As I say, term "vport" might be confusing to some readers, so it'd be
>> better to provide this explanation (about VF) in the commit description next
>> time.
> Ack. Will add VF as an example.
>>
>>>> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
>>>> Could you please explain, why not just add a match item
>>>> REPRESENTED_PORT pointing to that VF via its representor? Doing so
>>>> should perfectly define the exact direction / traffic source. Isn't that
>> sufficient?
>>>>
>>> Per my view, there is matching field and matching value difference.
>>> Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as same or
>> different matching criteria?
>>> I would like to call them same since it can be summarized like
>>> 1.1.1.0/30 REPRESENTED_PORT is just another matching item, no essential
>> differences and it can't stand for direction info.
>>
>> It looks like we're starting to run into disagreement here.
>> There's no "direction" at all. There's an embedded switch inside the NIC, and
>> there're (logical) switch ports that packets enter the switch from.
>>
>> When the user submits a "transfer" rule and does not provide neither
>> REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the embedded
>> switch is supposed to match packets coming from ANY ports, be it VFs or
>> physical (wire) ports.
>>
>> But when the user provides, in example, item REPRESENTED_PORT to point to
>> the physical (wire) port, the embedded switch knows exactly which port the
>> packets should enter it from.
>> In this case, it is supposed to match only packets coming from that physical
>> port. And this should be sufficient.
>> This in fact replaces the need to know a "direction".
>> It's just an exact specification of packet's origin.
>>
> There is traffic arriving or leaving the switch, so there is always direction, implicit or explicit.
This does not contradict my thoughts above. "Direction" is *defined* by
two points (like in geometry): an initial point (the switch port through
which a packet enters the switch) and the terminal point (the match engine
inside the switch). If one knows these two points, no extra hints are
required to specify some "direction". Because direction is already
represented by this "vector" of sorts. That's why presence of the
port match item in the pattern is absolutely sufficient.
However, based on your later explanations, the use of
precise port item is simply inconvenient in your
use case because you are trying to match traffic
from *multiple* ports that have something in
common (i.e. all VFs or all wire ports).
And, instead of adding a new item type which would serve
exactly your needs, you for some reason try to add an
attribute, which has multiple drawbacks which I
described in my previous letter.
> For transfer rules, there is a concept transfer_proxy.
> It takes the switch ownership; all switch rules should be configured via transfer_proxy.
Yes, such concept exists, but it's a don't care with
regard to the problem that we're discussing, sorry.
Furthermore, unlike "switch domain ID" (which is
the same for all ethdevs belonging to a given
physical NIC board), nobody guarantees that
there's only one transfer proxy port. Some NIC
vendors allow transfer rules to be added
via any ethdev port.
>
> Image a logic switch with one PF and two VFs.
> PF is the transfer proxy and VF belongs to the PF logically.
> When receiving traffic from PF, we can say it comes into the logic switch.
That's correct.
> When packet sent from VF (VF belongs to PF), so we can say traffic leaves the switch.
That's not correct. Traffic sent from VF (for example, a guest VM
is sending packets) also *enters* the switch. PFs and VFs are in
fact *separate* logical ports of the embedded switch.
>
> Item REPRESENTED_PORT indicates switch to match traffic sent from which port, comes into, or leave switch.
That is not correct either. Item REPRESENTED_PORT tells the switch to
match packets which come into the switch FROM the logical port
which is represented by the given DPDK ethdev.
For example, if ethdev="E" is the *main* PF which is bound to
physical port "P", then item REPRESENTED_PORT with ethdev ID
being set to "E" tells the switch that only packets coming
to NIC from *wire* via physical port "P" should match.
> We can say it as one kind of packet metadata.
Kind of yes, but might be vendor-specific. No need to delve into this.
> Like you said, DPDK always treat transfer to match any PORTs traffic.
Slight correction: it treats it this way until it sees an exact port item.
If the user provides REPRESENTED_PORT (or PORT_REPRESENTOR), it's no
longer *any* ports traffic, it's an exact port traffic. That's it.
> When REPRESENTED_PORT is specified, the rules are limited to some dedicated PORTs.
These rules match only packets arriving TO the
embedded switch FROM the said dedicated ports.
> Other PORTs are ignored because metadata mismatching.
Kind of yes, correct.
> Rules still have the capability to match ANY PORTS if metadata matched.
This statement is only correct for the cases when the user uses
neither item REPRESENTED_PORT nor item PORT_REPRESENTOR.
>
> This update will allow user to cut the other PORTs matching capabilities.
As I explained, this is exactly what items PORT_REPRESENTOR
and REPRESENTED_PORT do. No need to have an extra attribute.
If the user adds item REPRESENTED_PORT with ethdev_id="E",
like in the above example, to match packets entering NIC
via the physical port "P", then this rule will NOT match
packets entering NIC from other points. For example,
packets transmitted by a virtual machine via a VF
will not match in this case.
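
To illustrate the point above, here is a minimal sketch of the origin
check in plain C. It is a toy model only: the names sketch_rule,
sketch_lport_t and sketch_origin_matches() are hypothetical and are
not part of DPDK. It only shows that a rule without a port item
matches any origin, while a rule with one matches a single origin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not DPDK API: ID of a logical switch port. */
typedef uint16_t sketch_lport_t;

struct sketch_rule {
	bool has_port_item;    /* true if the pattern carries a port item */
	sketch_lport_t origin; /* logical port the packet must enter from */
};

/* Check whether a packet entering the switch from 'in_port' satisfies
 * the origin criterion of the rule. */
bool
sketch_origin_matches(const struct sketch_rule *rule, sketch_lport_t in_port)
{
	assert(rule != NULL);

	/* No port item in the pattern: packets from any port match. */
	if (!rule->has_port_item)
		return true;

	/* Port item present: only the exact origin matches. */
	return rule->origin == in_port;
}
```

With a rule pinned to the logical port behind ethdev "E", a packet
entering from a VF's logical port fails this check, and no
"direction" hint is involved.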
>>> Port id depends on the attach sequence.
>>
>> Unfortunately, this is hardly a good argument because flow rules are supposed
>> to be inserted based on the run-time packet learning. Attach sequence is a
>> don't care here.
>>
>>>> Also please mind that, although I appreciate your explanations here,
>>>> on the mailing list, they should finally be added to the commit
>>>> message, so that readers do not have to look for them elsewhere.
>>>>
>>> We have explained the high possibility of single-direction matching, right?
>>
>> Not quite. As I said, it is not correct to assume any "direction", like in
>> geographical sense ("north", "south", etc.). Application has ethdevs, and they
>> are representors of some "virtual ports" (in your terminology) belonging to the
>> switch, for example, VFs, SFs or physical ports.
>>
>> The user adds an appropriate item to the pattern (REPRESENTED_PORT), and
>> doing so specifies the packet path which it enters the switch.
>>
>>> It' hard to list all the possibilities of traffic matching preferences.
>>
>> And let's say more: one need never do this. That's exactly the reason why
>> DPDK has abandoned the concept of "direction" in *transfer* rules and
>> switched to the use of precise criteria (REPRESENTED_PORT, etc.).
>>
> As far as I know, DPDK changes "transfer ingress" to "transfer", so it' more clear that transfer can match both directions (both ingress and egress).
Not quite. DPDK has abandoned the use of "ingress / egress" in "transfer"
rules because "ingress" and "egress" are only applicable on the VNIC
level. For example, there is a PF attached to DPDK application:
packets that the application receives through this ethdev, are
ingress, and packets that it transmits (tx_burst) are egress.
I can explain in other words. Imagine yourself standing *inside* a room
which only has one door. When someone enters the room, it's "ingress",
when someone leaves, it's "egress". It's relative to your viewpoint.
In this example, such a room represents a VNIC / ethdev.
And now imagine yourself standing *outside* of another room / auditorium
which has multiple doors / exits. You're standing near some particular
exit "A" (VNIC / ethdev), but people may enter this room via another
door "B" and then leave it via yet another door "C". In this case,
from your viewpoint, this traffic can be considered neither
ingress nor egress. Because these people do not approach you.
Like in this example, embedded switch is like a large auditorium
with many-many doors / exits. And there can be many-many
directions: packet can enter the switch via phys. port "P1"
and then leave it via another phys. port "P2". Or it can
enter the switch via phys. port and then leave it via
VF's logical port (to be delivered to a guest machine),
or a packet can travel from one VF to another one.
There's no PRE-DEFINED direction like "north to south" or "east to west".
And this explains why it's very undesirable to use term "direction".
> REPRESENTED_PORT is the evolution of "port_id", I think, it' only one kind of matching items.
Yes. But nobody prevents you from defining yet another match item
which will be able to refer to a *group* of ports which have
something in common (i.e. "all guest ports of this switch"
pointing to all logical ports currently attached to
virtual machines / guests, or "all wire ports of this switch").
>
> For large scale deployment like 10M rules, if we can save resources significantly by introducing direction, why not?
I do not deny the fact that you have a use case where resources can
be saved significantly if you give the PMD some extra knowledge
when creating a flow table / pattern template. That's totally
OK. What I object to is the very implementation and the use of
term "direction". If you add new item types (like above),
then, when you create an async table 1 pattern template,
you will have item ANY_WIRE_PORTS, and, for table 2
pattern template, you'll have item ANY_GUEST_PORTS.
As you see, the two pattern templates now differ
because the match criteria use different items.
>
> Again, async API:
> 1. pattern template A
> 2. action template B
> 3. table C with pattern template A + action template B.
> 4. rule D, E, F...
> The specified REPRESENTED_PORT is provided in rules (D, E, F...) not pattern template A or action template B or table C.
> Resources may be allocated early at step 3 since table' rule_nums property.
No, item REPRESENTED_PORT *can* be provided inside pattern template A,
but, as you pointed out earlier, the problem is that you can't
distinguish different pattern templates which have this item,
because pattern templates know nothing about *exact* port IDs
and only know item MASKS. Yes, I agree that in your case
such problem exists, but, as I say above, it can be
solved by adding new item types: one for referring to
all phys. ports of a given NIC and another one for
pointing to a group of current guest users (VFs).
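
As a rough sketch of how a PMD could tell such templates apart at
table creation time, consider the toy code below. Everything in it
is an assumption made for illustration: the item names
SKETCH_ITEM_ANY_WIRE_PORTS / SKETCH_ITEM_ANY_GUEST_PORTS, the
SKETCH_ORIGIN_* flags and sketch_template_origins() do not exist in
DPDK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types; the last two model the proposed new items. */
enum sketch_item_type {
	SKETCH_ITEM_END,
	SKETCH_ITEM_ETH,
	SKETCH_ITEM_IPV4,
	SKETCH_ITEM_ANY_WIRE_PORTS,
	SKETCH_ITEM_ANY_GUEST_PORTS,
};

/* Hypothetical flags naming the origin classes a table must serve. */
#define SKETCH_ORIGIN_WIRE  0x1u
#define SKETCH_ORIGIN_GUEST 0x2u

/* Walk an END-terminated pattern template and report which origin
 * classes the table built on it needs resources for. */
unsigned int
sketch_template_origins(const enum sketch_item_type *items)
{
	unsigned int origins = 0;

	assert(items != NULL);

	for (; *items != SKETCH_ITEM_END; items++) {
		if (*items == SKETCH_ITEM_ANY_WIRE_PORTS)
			origins |= SKETCH_ORIGIN_WIRE;
		else if (*items == SKETCH_ITEM_ANY_GUEST_PORTS)
			origins |= SKETCH_ORIGIN_GUEST;
	}

	/* Neither group item present: provision for both classes. */
	if (origins == 0)
		origins = SKETCH_ORIGIN_WIRE | SKETCH_ORIGIN_GUEST;

	return origins;
}
```

The item type alone is enough to make the decision here, so exact
port IDs are not needed at template or table creation time.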
>>> The underlay is the one we have met for now.
>>>>>
>>>>>>> Introduce one new member transfer_mode into rte_flow_attr to
>>>>>>> indicate the flow table direction property: from wire, from vf or
>>>>>>> bi-direction(default).
>>>>>>
>>>>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule insertion
>>>>>> and asynchronous (table) approach. The patch adds the attributes to
>>>>>> generic 'rte_flow_attr' but, for some reason, ignores non-table rules.
>>>>>>
>>>>>>>
>>>>> Sync API uses one rule to contain everything. It' hard for PMD to
>>>>> determine
>>>> if this rule has direction preference or not.
>>>>> Image a situation, just for an example:
>>>>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
>>>>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
>>>>> 1 and 2 share the same matching conditions (eth / ipv4 / udp / vxlan
>>>>> /...), so
>>>> sync API consider them share matching determination logic.
>>>>> It means "2" have 1M scale capability too. Obviously, it wastes a
>>>>> lot of
>>>> resources.
>>>>
>>>> Strictly speaking, they do not share the same match pattern.
>>>> Your example clearly shows that, in (1), the pattern should request
>>>> packets coming from "vport 1" and, in (2), packets coming from "vport 0".
>>>>
>>>> My point is simple: the "vport" from which packets enter the embedded
>>>> switch is ALSO a match criterion. If you accept this, you'll see: the
>>>> matching conditions differ.
>>>>
>>> See above.
>>> In this case, I think the matching fields are both "port_id + ipv4_vxlan". They
>> are same.
>>> Only differs with values like vni 100 or 200 vice versa.
>>
>> Not quite. Look closer: you use *different* port IDs for (1) and (2).
>> The value of "ethdev_id" field in item REPRESENTED_PORT differs.
>>
>>>>>
>>>>> In async API, there is pattern_template introduced. We can mark "1"
>>>>> to use
>>>> pattern_tempate id 1 and "2" to use pattern_template 2.
>>>>> They will be separated from each other, don't share anymore.
>>>>
>>>> Consider an example. "Wire" is a physical port represented by PF0
>>>> which, in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is
>>>> attached to guest and is represented by a representor ethdev 1 in DPDK.
>>>>
>>>> So, some rules (template 1) are needed to deliver packets from "wire"
>>>> to "VF" and also decapsulate them. And some rules (template 2) are
>>>> needed to deliver packets in the opposite direction, from "VF"
>>>> to "wire" and also encapsulate them.
>>>>
>>>> My question is, what prevents you from adding match item
>>>> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
>>>> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
>>>>
>>>> As I said previously, if you insert such item before eth / ipv4 / etc
>>>> to your match pattern, doing so defines an *exact* direction / source.
>>>>
>>> Could you check the async API guidance? I think pattern template focusing
>> on the matching field (mask).
>>> "REPRESENTED_PORT[ethdev_id=0] " and
>> "REPRESENTED_PORT[ethdev_id=1] "are the same.
>>> 1. pattern template: REPRESENTED_PORT mask 0xffff ...
>>> 2. action template: action1 / actions2. / 3. table create with
>>> pattern_template plus action template..
>>> REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create
>> REPRESENTED_PORT port_id is 0 / actions ....
>>> REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create
>> REPRESENTED_PORT port_id is 1 / actions ....
>>
>> OK, so, based on this explanation, it appears that you might be looking to refer
>> to:
>> a) a *set* of any physical (wire) ports
>> b) a *set* of any guest ports (VFs)
>>
> Great, looks we are more and more closer to the agreement.
Looks so.
>> You chose to achieve this using an attribute, but:
>>
>> 1) as I explained above, the use of term "direction" is wrong;
>> please hear me out: I'm not saying that your use case and
>> your optimisation is wrong: I'm saying that naming for it
>> is wrong: it has nothing to do with "direction";
>>
> Do you have any better naming proposal?
As I said, what you are trying to achieve using a new
attribute would be way better to achieve using new
pattern items which can be easily told one from
another in PMD when pre-allocating resources for
different async flow tables.
So, I don't have any proposal for *attribute* naming.
What I propose is to consider new items instead.
>> 2) while naming a *set* of wire ports as "wire_orig" might be OK,
>> sticking with term "vf_orig" for a *set* of guest ports is
>> clearly not, simply because the user may pass another PF
>> to a guest instead of passing a VF; in other words,
>> a better term is needed here;
>>
> Like you said, vport may contain VF, SF etc. vport_orgin is on the logic switch perspective.
> Any proposal is welcome.
The problem is, vport can be easily confused with a slightly more
generic "lport" (embedded switch's "logical port"), and, logical
ports, in turn, are not confined to just VFs or PFs. For example,
physical (wire) ports are ALSO logical ports of the switch.
>> 3) since it is possible to plug multiple NICs to a DPDK application,
>> even from different vendors, the user may end up having multiple
>> physical ports belonging to different physical NICs attached to
>> the application; if this is the case, then referring to a *set*
>> of wire ports using the new attribute is ambiguous in the
>> sense that it's unclear whether this applies only to
>> wire ports of some specific physical NIC or to the
>> physical ports of *all* NICs managed by the app;
>>
> Not matter how many NICs has been probed by the DPDK, there is always switch/PF/VF/SF.. concept.
Correct.
> Each switch must have an owner identified by transfer_proxy(). Vport (VF/SF) can't cross switch in normal case.
No. That is not correct. This is tricky, but please hear me out: an
individual NIC board (that is, a given *switch*) is identified only
by its switch domain ID. As I explained above, "transfer proxy" is
just a technical hint for the application to indicate an ethdev
through which "transfer" rules must be managed. Not all vendors
support this concept (and they are not obliged to support it).
> The traffic comes from one NIC can't be offloaded by other NICs unless forwarded by the application.
Right, but forwarding in software (inside DPDK application) is
out of scope with regard to the problem that we're discussing.
> If user use new attribute to cut one side resource, I think user is smart enough to management the rules in different NICs.
As I explained above, I do not deny the existence of the problem that
your patch is trying to solve. Now it looks like we're on the same
page with regard to understanding the fact that what you're
trying to do is to introduce a match criterion that would
refer to a GROUP of similar ports. In my opinion, this
is not an *attribute*, it's a *match criterion*, and
it should be implemented as two new items.
Having two different item types would perfectly fit the need
to know the difference between such "directions" (as per
your terminology) early enough, when parsing templates.
> No default behavior changed with this update.
>
>> 4) adding an attribute instead of yet another pattern item type
>> is not quite good because PMDs need to be updated separately
>> to detect this attribute and throw an error if it's not
>> supported, whilst with a new item type, the PMDs do not
>> need to be updated = if a PMD sees an unsupported item
>> while traversing the item with switch () { case }, it
>> will anyway throw an error;
>>
> PMD also need to check if it supports new matching item or not, right?
> We can't assume NIC vendor' PMD implementation, right?
No-no-no. Imagine a PMD which does not support "transfer" rules.
In such PMD, in the flow parsing function one would have:
    if (!!attr->transfer) {
        print_error("Transfer is not supported");
        return EINVAL;
    }
If you add a new attribute, then PMDs which are NOT going
to support it need to be updated to add similar check.
Otherwise, they will simply ignore presence / absence
of the attribute in the rule, and validation result
will be unreliable.
Yes, if this attribute is 0x0, then indeed behaviour
does not change. But what if it's 0x1 or 0x2?
PMDs that do not support these values must
somehow reject such rules on parsing.
However, this problem does not manifest itself when
parsing items. Typically, in a PMD, one would have:
    switch (item->type) {
    case RTE_FLOW_ITEM_TYPE_VOID:
        break;
    case RTE_FLOW_ITEM_TYPE_ETH:
        /* blah-blah-blah */
        break;
    default:
        return ENOTSUP;
    }
So, if you introduce two new item types to solve your problem,
then you won't have to update existing PMDs. If the vendor
wants to support the new items (say, MLX or SFC), they'll
update their code to accept the items. But other vendors
will not do anything. If the user tries to pass such an
item to a vendor which doesn't support the feature,
the "default" case will just throw an error.
This is what I mean when pointing out such difference
between adding an attribute VS adding new item types.
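
The difference can be reproduced with a small self-contained model.
This is only a sketch under stated assumptions: struct sketch_attr,
enum sketch_item_type and sketch_legacy_pmd_validate() are
hypothetical stand-ins, not DPDK definitions. The function models a
PMD parser written before either extension existed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flow API types; not DPDK definitions. */
struct sketch_attr {
	unsigned int transfer:1;
	unsigned int transfer_mode:2; /* models the attribute from the patch */
};

enum sketch_item_type {
	SKETCH_ITEM_END,
	SKETCH_ITEM_VOID,
	SKETCH_ITEM_ETH,
	SKETCH_ITEM_ANY_PHY_PORTS, /* hypothetical new item, unknown to old PMDs */
};

/* A parser that predates both extensions. It never reads
 * attr->transfer_mode because that field did not exist when the code
 * was written. Returns 0 on success or a negative errno value. */
int
sketch_legacy_pmd_validate(const struct sketch_attr *attr,
			   const enum sketch_item_type *items)
{
	assert(attr != NULL);
	assert(items != NULL);

	for (; *items != SKETCH_ITEM_END; items++) {
		switch (*items) {
		case SKETCH_ITEM_VOID:
		case SKETCH_ITEM_ETH:
			break;
		default:
			/* Unknown item types are rejected for free. */
			return -ENOTSUP;
		}
	}

	/* A non-zero transfer_mode slips through unnoticed. */
	return 0;
}
```

In this model, a rule with transfer_mode set to 0x1 and a plain ETH
pattern is accepted by the old parser, whereas a pattern that starts
with the unknown item is rejected with -ENOTSUP without any change
to the parser.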
>> 5) as in (4), a new attribute is not good from documentation
>> standpoint; plase search for "represented_port = Y" in
>> documentation = this way, all supported items are
>> easily defined for various NIC vendors, but the
>> same isn't true for attributes = there is no
>> way to indicate supported attributes in doc.
>>
>> If points (1 - 5) make sense to you, then, if I may be so bold, I'd like to suggest
>> that the idea of adding a new attribute be abandoned. Instead, I'd like to
>> suggest adding new items:
>>
>> (the names are just sketch, for sure, it should be discussed)
>>
>> ANY_PHY_PORTS { switch_domain_id }
>> = match packets entering the embedded switch from *whatever*
>> physical ports belonging to the given switch domain
>>
> How many PHY_PORTS can one switch have, per your thought? Can I treat the PHY_PORTS as the { switch_domain_id } owner as transfer_proxy()?
A single physical NIC board is supposed to have a single
embedded switch engine. Hence, if the NIC board has, in
example, two or four physical ports, these will be the
physical ports of the switch. That's it.
As for the transfer proxy, please see my explanations above.
A transfer proxy is not *always* a reliable way to tell whether two
given ethdevs belong to the same physical NIC board or not.
Switch domain ID is the right criterion (for applications).
>> ANY_GUEST_PORTS { switch_domain_id }
>> = match packets entering the embedded switch from *whatever*
>> guest ports (VFs, PFs, etc.) belonging to the given
>> switch domain
>>
>> The field "switch_domain_id" is required to tell one physical board / vendor
>> from another (as I explained in point (3)).
>> The application can query this parameter from ethdev's switch info: please see
>> "struct rte_eth_switch_info".
>>
>> What's your opinion?
>>
> How can we handle ANY_PHY_PORTS/ ANY_GUEST_PORTS ' relationship with REPRESENTED_PORT if conflicts?
> Need future tuning.
And if you carry on with "vf_orig" / "wire_orig" approach, you
will inevitably have the very same problem: possible conflict
with items like REPRESENTED_PORT. So does it matter? Yes,
checks need to be done by PMDs when parsing patterns.
> Like I said before, offloaded rules can't cross different NIC vendor' "switch_domain_id".
> If user probes multiple NICs in one application, application should take care of packet forwarding.
> Also application should be aware which ports belong to which NICs.
Yes, perhaps, domain ID is not needed in the new items.
But the application still must keep track of switch
domain IDs itself so it knows which rules to
manage via which ethdevs.
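
A minimal sketch of such bookkeeping on the application side is
given below. The struct sketch_ethdev and both helpers are
hypothetical; in a real application the domain ID would be read from
the ethdev's switch info ("struct rte_eth_switch_info") rather than
filled in by hand as it is here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port record kept by the application. */
struct sketch_ethdev {
	uint16_t ethdev_id; /* DPDK port ID */
	uint16_t domain_id; /* switch domain ID reported for this port */
};

/* Two ethdevs belong to the same embedded switch if and only if
 * their switch domain IDs are equal. */
bool
sketch_same_switch(const struct sketch_ethdev *a,
		   const struct sketch_ethdev *b)
{
	assert(a != NULL);
	assert(b != NULL);
	return a->domain_id == b->domain_id;
}

/* Count the ethdevs that belong to the given switch domain. */
size_t
sketch_count_in_domain(const struct sketch_ethdev *devs, size_t nb_devs,
		       uint16_t domain_id)
{
	size_t count = 0;
	size_t i;

	assert(devs != NULL || nb_devs == 0);

	for (i = 0; i < nb_devs; i++) {
		if (devs[i].domain_id == domain_id)
			count++;
	}

	return count;
}
```

The comparison uses the domain ID only, so it does not depend on
which ethdev, if any, acts as a transfer proxy.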
Any other opinions?
>>>
>>>>>
>>>>>> For example, the diff below adds the attributes to "table" commands
>>>>>> in testpmd but does not add them to regular (non-table) commands
>>>>>> like "flow create". Why?
>>>>>>
>>>>>>>
>>>>>
>>>>> "table" command limits pattern_template to single direction or
>>>>> bidirection
>>>> per user specified attribute.
>>>>
>>>> As I say above, the same effect can be achieved by adding item
>>>> REPRESENTED_PORT to the corresponding pattern template.
>>> See above.
>>>>
>>>>> "rule" command must tight with one "table_id", so the rule will
>>>>> inherit the
>>>> "table" direction property, no need to specify again.
>>>>
>>>> You migh've misunderstood. I do not talk about "rule" command coupled
>>>> with some "table". What I talk about is regular, NON-async flow
>>>> insertion commands.
>>>>
>>>> Please take a look at section "/* Validate/create attributes. */" in
>>>> file "app/test-pmd/cmdline_flow.c". When one adds a new flow
>>>> attribute, they should reflect it the same way as VC_INGRESS,
>> VC_TRANSFER, etc.
>>>>
>>>> That's it.
>>> We don't intend to pass this to sync API. The above code example is for sync
>> API.
>>
>> So I understand. But there's one slight problem: in your patch, you add the new
>> attributes to the structure which is *shared* between sync and async use case
>> scenarios. If one adds an attribute to this structure, they have to provide
>> accessors for it in all sync-related commands in testpmd, but your patch does
>> not do that.
>>
> Like the title said, "creating transfer table" is the ASYNC operation.
> We have limited the scope of this patch. Sync API will be another story.
> Maybe we can add one more sentence to emphasize async API again.
No-no-no. There might be slight misunderstanding. I understand that
you are limiting the scope of your patch by saying this and this.
That's OK. What I'm trying to point out is the fact that your
patch nevertheless touches the COMMON part of the flow API
which is shared between two approaches (sync and async).
Imagine a reader that does not know anything about the async approach.
He just opens the file in vim and goes directly to struct rte_flow_attr.
And, over there, he sees the new attribute "wire_orig". He then
immediately assumes that these attributes can be used in
testpmd. Now the reader opens testpmd and tries to
insert a flow rule using the sync approach:
flow create priority 0 transfer vf_orig pattern / ... / end actions drop
And doing so will be a failure, because your patch does not add the
new attribute keyword to sync flow rule syntax parser. That's it.
Once again, I should emphasize: the reader MAY know nothing about the async
approach. But if the attribute is present in "struct rte_flow_attr", it
immediately means that it is available everywhere. Both sync and async.
So, with this in mind, your attempt to limit the scope of the patch
to async-only rules looks a little bit artificial. It's not
correct from the *formal* standpoint.
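Roughly speaking, the mismatch looks like the sketch below. This is only
an illustration under my own assumptions: the two keyword lists are
simplified, hypothetical stand-ins for what the patched testpmd would
accept after "flow template_table ... create" and after "flow create";
the real parser in cmdline_flow.c is token-driven and more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of attribute keywords accepted by the table command. */
static const char *const table_attr_keywords[] = {
	"group", "priority", "ingress", "egress", "transfer",
	"wire_orig", "vf_orig", /* added by the patch */
	NULL,
};

/* Hypothetical subset accepted by the sync "flow create" command. */
static const char *const sync_attr_keywords[] = {
	"group", "priority", "ingress", "egress", "transfer",
	NULL, /* the patch adds nothing here */
};

/* Return true if the word is present in the NULL-terminated list. */
static bool
keyword_accepted(const char *const *list, const char *word)
{
	assert(list != NULL && word != NULL);
	for (; *list != NULL; list++)
		if (strcmp(*list, word) == 0)
			return true;
	return false;
}
```

Here, keyword_accepted(table_attr_keywords, "vf_orig") holds, whereas
keyword_accepted(sync_attr_keywords, "vf_orig") does not, even though both
commands fill in the very same "struct rte_flow_attr".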
>
>> In other words, it is wrong to assume that "struct rte_flow_attr" only applies to
>> async approach. It had been introduced long before the async flow design was
>> added to DPDK. That's it.
>>
>>>>
>>>> But, as I say, I still believe that the new attributes aren't needed.
>>> I think we are not on the same page for now. Can we reach agreement on
>>> the same matching criteria first?
>>>>>
>>>>>>> It helps to save underlayer memory also on insertion rate.
>>>>>>
>>>>>> Which memory? Host memory? NIC memory? Term "underlayer" is
>>>>>> vague.
>>>>>> I suggest that the commit message be revised to first explain how
>>>>>> such memory is spent currently, then explain why this is not
>>>>>> optimal and, finally, which way the patch is supposed to improve
>>>>>> that. I.e. be more specific.
>>>>>>
>>>>>>>
>>>>>
>>>>> For large-scale rules, HW (depending on implementation) always
>>>>> needs memory to hold the rules' patterns and actions, either from
>>>>> NIC or from host.
>>>>> The memory footprint highly depends on "user rules' complexity",
>>>>> and also differs between NICs.
>>>>> ~50% memory saving is expected if one-direction is cut.
>>>>
>>>> Regardless of this talk, this explanation should probably be present
>>>> in the commit description.
>>>>
>>> This number may differ with different NICs or implementation. We can't say
>>> it for sure.
>>
>> Not an exact number, of course, but a brief explanation of:
>> a) what is wrong / not optimal in the current design;
> Please check the commit log: transfer has the capability to match bi-direction traffic no matter what ports.
>> b) how it is observed in customer deployments;
> Customers have requirements to save resources, and their offloaded rules are direction-aware.
>> c) why the proposed patch is a good solution.
> New attributes provide the way to remove one direction and save underlayer resource.
> All of the above can be found in the commit log.
I understand all of that, but my point is, the existing commit message is
way too brief. Yes, it mentions that SOME customers have SOME deployments,
but it does not shed light on which specifics these deployments have. For
example, back in the day, when items PORT_REPRESENTOR and REPRESENTED_PORT
were added, the cover letter for that patch series provided details of
deployment specifics (application: OvS, scenario: full offload rules).
So, it's always better to expand on such specifics so that the reader
has full picture in their head and doesn't need to look elsewhere.
Not all readers of the commit message will be happy to delve
into our discussions on the mailing list to get the gist.
>
>>
>
>>>>>
>>>>>>> By default, the transfer domain is bi-direction, and no behavior changes.
>>>>>>>
>>>>>>> 1. Match wire origin only
>>>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>>>>> 2. Match vf origin only
>>>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>>>>>
>>>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>>>>>> ---
>>>>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
>>>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>>>>>> lib/ethdev/rte_flow.h | 9 ++++++-
>>>>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>>>>>>> index 7f50028eb7..b25b595e82 100644
>>>>>>> --- a/app/test-pmd/cmdline_flow.c
>>>>>>> +++ b/app/test-pmd/cmdline_flow.c
>>>>>>> @@ -177,6 +177,8 @@ enum index {
>>>>>>> TABLE_INGRESS,
>>>>>>> TABLE_EGRESS,
>>>>>>> TABLE_TRANSFER,
>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>> TABLE_RULES_NUMBER,
>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
>>>>>>> TABLE_INGRESS,
>>>>>>> TABLE_EGRESS,
>>>>>>> TABLE_TRANSFER,
>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>> TABLE_RULES_NUMBER,
>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
>>>>>>> .next = NEXT(next_table_attr),
>>>>>>> .call = parse_table,
>>>>>>> },
>>>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>>>>>> + .name = "wire_orig",
>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>
>>>>>> This does not explain the "wire" aspect. It's too broad.
>>>>>>
>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>> + .call = parse_table,
>>>>>>> + },
>>>>>>> + [TABLE_TRANSFER_VF_ORIG] = {
>>>>>>> + .name = "vf_orig",
>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>
>>>>>> This explanation simply duplicates that of the "wire_orig".
>>>>>> It does not explain the "vf" part. Should be more specific.
>>>>>>
>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>> + .call = parse_table,
>>>>>>> + },
>>>>>>> [TABLE_RULES_NUMBER] = {
>>>>>>> .name = "rules_number",
>>>>>>> .help = "number of rules in table",
>>>>>>> @@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
>>>>>>> case TABLE_TRANSFER:
>>>>>>> out->args.table.attr.flow_attr.transfer = 1;
>>>>>>> return len;
>>>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>> + return -1;
>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
>>>>>>> + return len;
>>>>>>> + case TABLE_TRANSFER_VF_ORIG:
>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>> + return -1;
>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
>>>>>>> + return len;
>>>>>>> default:
>>>>>>> return -1;
>>>>>>> }
>>>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>> index 330e34427d..603b7988dd 100644
>>>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>> @@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
>>>>>>>
>>>>>>> flow template_table {port_id} create
>>>>>>> [table_id {id}] [group {group_id}]
>>>>>>> - [priority {level}] [ingress] [egress] [transfer]
>>>>>>> + [priority {level}] [ingress] [egress]
>>>>>>> + [transfer [vf_orig] [wire_orig]]
>>>>>>
>>>>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
>>>>>> [wire_orig] ?
>>>>>>
>>>>>>> rules_number {number}
>>>>>>> pattern_template {pattern_template_id}
>>>>>>> actions_template {actions_template_id}
>>>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>>>> index a79f1e7ef0..512b08d817 100644
>>>>>>> --- a/lib/ethdev/rte_flow.h
>>>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
>>>>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
>>>>>>> */
>>>>>>> uint32_t transfer:1;
>>>>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
>>>>>>> + /**
>>>>>>> + * 0 means bidirection,
>>>>>>> + * 0x1 origin uplink,
>>>>>>
>>>>>> What does "uplink" mean? It's too vague. Hardly a good term.
>>
>> I believe this comment should be reworked, in case the idea of having an extra
>> attribute persists.
>>
>>>>>>
>>>>>>> + * 0x2 origin vport,
>>>>>>
>>>>>> What does "origin vport" mean? Hardly a good term as well.
>>
>> I still believe this explanation is way too brief and needs to be reworked to
>> provide more details, to define the use case for the attribute more specifically.
>>
>>>>>>
>>>>>>> + * N/A both set.
>>>>>>
>>>>>> What's this?
>>
>> The question stands.
>>
>>>>>>
>>>>>>> + */
>>>>>>> + uint32_t transfer_mode:2;
>>>>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
>>>>>>> };
>>>>>>>
>>>>>>> /**
>>>>>>> --
>>>>>>> 2.27.0
>>>>>>>
>>>>>>
>>>>>> Since the attributes are added to generic 'struct rte_flow_attr',
>>>>>> non-table (synchronous) flow rules are supposed to support them,
>>>>>> too. If that is indeed the case, then I'm afraid such proposal does
>>>>>> not agree with the existing items PORT_REPRESENTOR and
>>>>>> REPRESENTED_PORT. They do exactly the same thing, but they are
>>>>>> designed to be way more generic. Why not use them?
>>>>
>>>> The question stands.
>>>>
>>>>>>
>>>>>> Ivan
>>>>>
>>>>
>>>> Ivan
>>>
>
Thank you.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-14 15:18 ` Ivan Malov
@ 2022-09-14 21:02 ` Thomas Monjalon
2022-09-15 0:58 ` Rongwei Liu
1 sibling, 0 replies; 96+ messages in thread
From: Thomas Monjalon @ 2022-09-14 21:02 UTC (permalink / raw)
To: Rongwei Liu, Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang,
Andrew Rybchenko, dev, Raslan Darawsheh
14/09/2022 17:18, Ivan Malov:
> So, it's always better to expand on such specifics so that the reader
> has full picture in their head and doesn't need to look elsewhere.
> Not all readers of the commit message will be happy to delve
> into our discussions on the mailing list to get the gist.
Yes clearly, we'll need a summary of this long discussion :)
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-14 15:18 ` Ivan Malov
2022-09-14 21:02 ` Thomas Monjalon
@ 2022-09-15 0:58 ` Rongwei Liu
2022-09-15 7:47 ` Ivan Malov
1 sibling, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-15 0:58 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
HI Ivan:
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Wednesday, September 14, 2022 23:18
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
> Darawsheh <rasland@nvidia.com>
> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
> Hi Rongwei,
>
> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>
> > HI
> >
> > BR
> > Rongwei
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >> Sent: Wednesday, September 14, 2022 15:32
> >> To: Rongwei Liu <rongweil@nvidia.com>
> >> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
> >> Raslan Darawsheh <rasland@nvidia.com>
> >> Subject: RE: [PATCH v1] ethdev: add direction info when creating the
> >> transfer table
> >>
> >> Hi,
> >>
> >> On Wed, 14 Sep 2022, Rongwei Liu wrote:
> >>
> >>> HI
> >>>
> >>> BR
> >>> Rongwei
> >>>
> >>>> -----Original Message-----
> >>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>> Sent: Tuesday, September 13, 2022 22:33
> >>>> To: Rongwei Liu <rongweil@nvidia.com>
> >>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
> >>>> Raslan Darawsheh <rasland@nvidia.com>
> >>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating
> >>>> the transfer table
> >>>>
> >>>> Hi Rongwei,
> >>>>
> >>>> PSB
> >>>>
> >>>> On Tue, 13 Sep 2022, Rongwei Liu wrote:
> >>>>
> >>>>> Hi
> >>>>>
> >>>>> BR
> >>>>> Rongwei
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >>>>>> Sent: Tuesday, September 13, 2022 00:57
> >>>>>> To: Rongwei Liu <rongweil@nvidia.com>
> >>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> >>>>>> NBU-Contact- Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> >>>>>> Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang
> >>>>>> <yuying.zhang@intel.com>; Andrew Rybchenko
> >>>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> >>>>>> <rasland@nvidia.com>
> >>>>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating
> >>>>>> the transfer table
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
> >>>>>>
> >>>>>>> The transfer domain rule is able to match traffic wire/vf origin
> >>>>>>> and it means two directions' underlayer resource.
> >>>>>>
> >>>>>> The point of fact is that matching traffic coming from some
> >>>>>> entity like wire / VF has been long generalised in the form of
> >>>>>> representors.
> >>>>>> So, a flow rule with attribute "transfer" is able to match
> >>>>>> traffic coming from either a REPRESENTED_PORT or from a
> >>>>>> PORT_REPRESENTOR (please find these items).
> >>>>>>
> >>>>>>>
> >>>>>>> In customer deployments, they usually match only one direction
> >>>>>>> traffic in single flow table: either from wire or from vf.
> >>>>>>
> >>>>>> Which customer deployments? Could you please provide detailed
> >>>>>> examples?
> >>>>>>
> >>>>>>>
> >>>>>
> >>>>> We saw a lot of customer deployments like:
> >>>>> 1. Match overlay traffic from wire and do decap, then send to
> >>>>> specific vport.
> >>>>> 2. Match specific 5-tuples and do encap, then send to wire.
> >>>>> The matching criteria have obvious direction preference.
> >>>>
> >>>> Thank you. My questions are as follows:
> >>>>
> >>>> In (1), when you say "from wire", do you mean the need to match
> >>>> packets arriving via whatever physical ports rather than matching
> >>>> packets arriving from some specific phys. port?
> >>
> >> ^^
> >>
> >> Could you please find my question above? Based on your understanding
> >> of templates in async flow approach, an answer to this question may
> >> help us find the common ground.
> > It means traffic arrived from physical ports (transfer_proxy role) or south
> > band per your concept.
>
> Transfer proxy has nothing to do with physical ports. And I should stress
> that "south band" and the likes are NOT my concepts. Instead, I think that
> direction designations like "south" or "north" aren't applicable when talking
> about the embedded switch and its flow (transfer) rules.
>
> > Traffic from vport (not transfer_proxy) or north band per your concept won't
> > hit even if same packets.
>
> Please see above. Transfer proxy is a completely different concept.
> And I never used "north band" concept.
>
> >>
> >> --
> >>
> >>>>
> >>>> If, however, matching traffic "from wire" in fact means matching
> >>>> packets arriving from a *specific* physical port, then for sure
> >>>> item REPRESENTED_PORT should perfectly do the job, and the proposed
> >>>> attribute is unneeded.
> >>>>
> >>>> (BTW, in DPDK, it is customary to use term "physical port", not
> >>>> "wire")
> >>>>
> >>>> In (1), what are "vport"s? Please explain. Once again, I should
> >>>> remind that, in DPDK, folks prefer terms "represented entity" /
> >>>> "representor"
> >>>> over vendor-specific terms like "vport", etc.
> >>>>
> >>> Vport is virtual port for short such as VF.
> >>
> >> Thanks. As I say, term "vport" might be confusing to some readers, so
> >> it'd be better to provide this explanation (about VF) in the commit
> >> description next time.
> > Ack. Will add VF as an example.
> >>
> >>>> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
> >>>> Could you please explain, why not just add a match item
> >>>> REPRESENTED_PORT pointing to that VF via its representor? Doing so
> >>>> should perfectly define the exact direction / traffic source. Isn't
> >>>> that sufficient?
> >>>>
> >>> Per my view, there is matching field and matching value difference.
> >>> Like IPv4 src_addr 1.1.1.1, 1.1.1.2, 1.1.1.3, will you treat it as
> >>> same or different matching criteria?
> >>> I would like to call them same since it can be summarized like
> >>> 1.1.1.0/30. REPRESENTED_PORT is just another matching item, with no
> >>> essential differences, and it can't stand for direction info.
> >>
> >> It looks like we're starting to run into disagreement here.
> >> There's no "direction" at all. There's an embedded switch inside the
> >> NIC, and there're (logical) switch ports that packets enter the switch from.
> >>
> >> When the user submits a "transfer" rule and provides neither
> >> REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the embedded
> >> switch is supposed to match packets coming from ANY ports, be it VFs
> >> or physical (wire) ports.
> >>
> >> But when the user provides, for example, item REPRESENTED_PORT to
> >> point to the physical (wire) port, the embedded switch knows exactly
> >> which port the packets should enter it from.
> >> In this case, it is supposed to match only packets coming from that
> >> physical port. And this should be sufficient.
> >> This in fact replaces the need to know a "direction".
> >> It's just an exact specification of packet's origin.
> >>
> > There is traffic arriving or leaving the switch, so there is always direction,
> > implicit or explicit.
>
> This does not contradict my thoughts above. "Direction" is *defined* by two
> points (like in geometry): an initial point (the switch port through which a
> packet enters the switch) and the terminal point (the match engine inside the
> switch). If one knows these two points, no extra hints are required to specify
> some "direction". Because direction is already represented by this "vector" of
> sorts. That's why presence of the port match item in the pattern is absolutely
> sufficient.
Good to see this. Thanks for the information.
This update leverages exactly the concept you defined: "an initial point (the switch port through which a
packet enters the switch)".
If you think "direction" is not a good term, we can change it to other words like "initial port" / "origin port", etc.
>
> However, based on your later explanations, the use of precise port item is
> simply inconvenient in your use case because you are trying to match traffic
> from *multiple* ports that have something in common (i.e. all VFs or all wire
> ports).
>
> And, instead of adding a new item type which would serve exactly your needs,
> you for some reason try to add an attribute, which has multiple drawbacks
> which I described in my previous letter.
>
> > For transfer rules, there is a concept transfer_proxy.
> > It takes the switch ownership; all switch rules should be configured via
> > transfer_proxy.
>
> Yes, such concept exists, but it's a don't care with regard to the problem that
> we're discussing, sorry.
> Furthermore, unlike "switch domain ID" (which is the same for all ethdevs
> belonging to a given physical NIC board), nobody guarantees that it's only one
> transfer proxy port. Some NIC vendors allow transfer rules to be added via
> any ethdev port.
>
Does any flow rule leverage the switch ID already? Is it too obscure for the end user?
> >
> > Imagine a logical switch with one PF and two VFs.
> > PF is the transfer proxy and VF belongs to the PF logically.
> > When receiving traffic from PF, we can say it comes into the logical switch.
>
> That's correct.
>
> > When a packet is sent from VF (VF belongs to PF), we can say traffic leaves
> > the switch.
>
> That's not correct. Traffic sent from VF (for example, a guest VM is sending
> packets) also *enters* the switch. PFs and VFs are in fact *separate* logical
> ports of the embedded switch.
>
> >
> > Item REPRESENTED_PORT tells the switch which port's traffic to match,
> > whether it comes into or leaves the switch.
>
> That is not correct either. Item REPRESENTED_PORT tells the switch to match
> packets which come into the switch FROM the logical port which is
> represented by the given DPDK ethdev.
>
> For example, if ethdev="E" is the *main* PF which is bound to physical port "P",
> then item REPRESENTED_PORT with ethdev ID being set to "E" tells the switch
> that only packets coming to NIC from *wire* via physical port "P" should match.
>
> > We can say it as one kind of packet metadata.
>
> Kind of yes, but might be vendor-specific. No need to delve into this.
>
> > Like you said, DPDK always treats transfer as matching traffic from any PORTs.
>
> Slight correction: it treats it this way until it sees an exact port item.
> If the user provides REPRESENTED_PORT (or PORT_REPRESENTOR), it's no
> longer *any* ports traffic, it's an exact port traffic. That's it.
>
> > When REPRESENTED_PORT is specified, the rules are limited to some
> > dedicated PORTs.
>
> These rules match only packets arriving TO the embedded switch FROM the
> said dedicated ports.
>
> > Other PORTs are ignored because of metadata mismatch.
>
> Kind of yes, correct.
>
> > Rules still have the capability to match ANY PORTS if metadata matched.
>
> This statement is only correct for the cases when the user uses
> NEITHER item REPRESENTED_PORT NOR item PORT_REPRESENTOR.
>
> >
> > This update will allow user to cut the other PORTs matching capabilities.
>
> As I explained, this is exactly what items PORT_REPRESENTOR and
> REPRESENTED_PORT do. No need to have an extra attribute.
>
> If the user adds item REPRESENTED_PORT with ethdev_id="E", like in the
> above example, to match packets entering NIC via the physical port "P", then
> this rule will NOT match packets entering NIC from other points. For example,
> packets transmitted by a virtual machine via a VF will not match in this case.
>
> >>> Port id depends on the attach sequence.
> >>
> >> Unfortunately, this is hardly a good argument because flow rules are
> >> supposed to be inserted based on the run-time packet learning. Attach
> >> sequence is a don't care here.
> >>
> >>>> Also please mind that, although I appreciate your explanations
> >>>> here, on the mailing list, they should finally be added to the
> >>>> commit message, so that readers do not have to look for them elsewhere.
> >>>>
> >>> We have explained the high possibility of single-direction matching, right?
> >>
> >> Not quite. As I said, it is not correct to assume any "direction",
> >> like in geographical sense ("north", "south", etc.). Application has
> >> ethdevs, and they are representors of some "virtual ports" (in your
> >> terminology) belonging to the switch, for example, VFs, SFs or physical
> >> ports.
> >>
> >> The user adds an appropriate item to the pattern (REPRESENTED_PORT),
> >> and doing so specifies the path by which the packet enters the switch.
> >>
> >>> It's hard to list all the possibilities of traffic matching preferences.
> >>
> >> And let's say more: one need never do this. That's exactly the reason
> >> why DPDK has abandoned the concept of "direction" in *transfer* rules
> >> and switched to the use of precise criteria (REPRESENTED_PORT, etc.).
> >>
> > As far as I know, DPDK changed "transfer ingress" to "transfer", so it's more
> > clear that transfer can match both directions (both ingress and egress).
>
> Not quite. DPDK has abandoned the use of "ingress / egress" in "transfer"
> rules because "ingress" and "egress" are only applicable on the VNIC level. For
> example, there is a PF attached to DPDK application:
> packets that the application receives through this ethdev, are ingress, and
> packets that it transmits (tx_burst) are egress.
>
> I can explain in other words. Imagine yourself standing *inside* a room which
> only has one door. When someone enters the room, it's "ingress", when
> someone leaves, it's "egress". It's relative to your viewpoint.
> In this example, such a room represents a VNIC / ethdev.
>
> And now imagine yourself standing *outside* of another room / auditorium
> which has multiple doors / exits. You're standing near some particular exit "A"
> (VNIC / ethdev), but people may enter this room via another door "B" and then
> leave it via yet another door "C". In this case, from your viewpoint, this traffic
> can be considered neither ingress nor egress, because these people do not
> approach you.
>
> Like in this example, embedded switch is like a large auditorium with
> many-many doors / exits. And there can be many-many
> directions: packet can enter the switch via phys. port "P1"
> and then leave it via another phys. port "P2". Or it can enter the switch via
> phys. port and then leave it via VF's logical port (to be delivered to a guest
> machine), or a packet can travel from one VF to another one.
>
> There's no PRE-DEFINED direction like "north to south" or "east to west".
> And this explains why it's very undesirable to use term "direction".
>
> > REPRESENTED_PORT is the evolution of "port_id"; I think it's only one kind of
> > matching item.
>
> Yes. But nobody prevents you from defining yet another match item which will
> be able to refer to a *group* of ports which have something in common (i.e.
> "all guest ports of this switch"
> pointing to all logical ports currently attached to virtual machines / guests, or
> "all wire ports of this switch").
>
> >
> > For large scale deployment like 10M rules, if we can save resources
> > significantly by introducing direction, why not?
>
> I do not deny the fact that you have a use case where resources can be saved
> significantly if you give the PMD some extra knowledge when creating a flow
> table / pattern template. That's totally OK. What I object is the very
> implementation and the use of term "direction". If you add new item types
> (like above), then, when you create an async table 1 pattern template, you will
> have item ANY_WIRE_PORTS, and, for table 2 pattern template, you'll have
> item ANY_GUEST_PORTS.
> As you see, the two pattern templates now differ because the match criteria
> use different items.
>
> >
> > Again, async API:
> > 1. pattern template A
> > 2. action template B
> > 3. table C with pattern template A + action template B.
> > 4. rule D, E, F...
> > The specified REPRESENTED_PORT is provided in rules (D, E, F...), not in
> > pattern template A or action template B or table C.
> > Resources may be allocated early at step 3 because of the table's rule_nums property.
>
> No, item REPRESENTED_PORT *can* be provided inside pattern template A,
> but, as you pointed out earlier, the problem is that you can't distinguish
> different pattern templates which have this item, because pattern templates
> know nothing about *exact* port IDs and only know item MASKS. Yes, I agree
> that in your case such problem exists, but, as I say above, it can be solved by
> adding new item types: one for referring to all phys. ports of a given NIC and
> another one for pointing to a group of current guest users (VFs).
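The distinction drawn above can be illustrated with a small, self-contained
sketch. It is a toy model only, not the DPDK template API: the item types
ANY_WIRE_PORTS and ANY_GUEST_PORTS are the hypothetical items suggested in
this thread and do not exist in DPDK, and a template is modelled as nothing
more than a list of (item type, mask) pairs, since templates carry masks
rather than exact port IDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy item types; the two ANY_* items are hypothetical. */
enum item_type_sketch {
	ITEM_END = 0,
	ITEM_REPRESENTED_PORT,
	ITEM_ANY_WIRE_PORTS,
	ITEM_ANY_GUEST_PORTS,
	ITEM_ETH,
	ITEM_IPV4,
};

/* A template item: type plus mask, never an exact value. */
struct template_item_sketch {
	enum item_type_sketch type;
	uint32_t mask;
};

/* Two templates that will later be used for different ports. */
static const struct template_item_sketch tmpl_port_a[] = {
	{ ITEM_REPRESENTED_PORT, 0xffff }, { ITEM_ETH, 0 },
	{ ITEM_IPV4, 0xffffffff }, { ITEM_END, 0 },
};
static const struct template_item_sketch tmpl_port_b[] = {
	{ ITEM_REPRESENTED_PORT, 0xffff }, { ITEM_ETH, 0 },
	{ ITEM_IPV4, 0xffffffff }, { ITEM_END, 0 },
};

/* The same patterns, expressed with the hypothetical group items. */
static const struct template_item_sketch tmpl_wire[] = {
	{ ITEM_ANY_WIRE_PORTS, 0 }, { ITEM_ETH, 0 },
	{ ITEM_IPV4, 0xffffffff }, { ITEM_END, 0 },
};
static const struct template_item_sketch tmpl_guest[] = {
	{ ITEM_ANY_GUEST_PORTS, 0 }, { ITEM_ETH, 0 },
	{ ITEM_IPV4, 0xffffffff }, { ITEM_END, 0 },
};

/* Compare two ITEM_END-terminated templates by type and mask only. */
static bool
templates_look_identical(const struct template_item_sketch *a,
			 const struct template_item_sketch *b)
{
	assert(a != NULL && b != NULL);
	for (;; a++, b++) {
		if (a->type != b->type || a->mask != b->mask)
			return false;
		if (a->type == ITEM_END)
			return true;
	}
}
```

In this model, tmpl_port_a and tmpl_port_b look identical to the driver at
table creation time, whereas tmpl_wire and tmpl_guest can be told apart by
item type alone.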
>
> >>> The underlay is the one we have met for now.
> >>>>>
> >>>>>>> Introduce one new member transfer_mode into rte_flow_attr to
> >>>>>>> indicate the flow table direction property: from wire, from vf
> >>>>>>> or bi-direction(default).
> >>>>>>
> >>>>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule
> >>>>>> insertion and asynchronous (table) approach. The patch adds the
> >>>>>> attributes to generic 'rte_flow_attr' but, for some reason, ignores
> >>>>>> non-table rules.
> >>>>>>
> >>>>>>>
> >>>>> Sync API uses one rule to contain everything. It's hard for PMD to
> >>>>> determine if this rule has direction preference or not.
> >>>>> Imagine a situation, just as an example:
> >>>>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale.
> >>>>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
> >>>>> 1 and 2 share the same matching conditions (eth / ipv4 / udp /
> >>>>> vxlan /...), so sync API considers them to share matching
> >>>>> determination logic.
> >>>>> It means "2" has 1M scale capability too. Obviously, it wastes a
> >>>>> lot of resources.
> >>>>
> >>>> Strictly speaking, they do not share the same match pattern.
> >>>> Your example clearly shows that, in (1), the pattern should request
> >>>> packets coming from "vport 1" and, in (2), packets coming from "vport 0".
> >>>>
> >>>> My point is simple: the "vport" from which packets enter the
> >>>> embedded switch is ALSO a match criterion. If you accept this,
> >>>> you'll see: the matching conditions differ.
> >>>>
> >>> See above.
> >>> In this case, I think the matching fields are both "port_id +
> >>> ipv4_vxlan". They are the same.
> >>> They only differ in values, like vni 100 or 200.
> >>
> >> Not quite. Look closer: you use *different* port IDs for (1) and (2).
> >> The value of "ethdev_id" field in item REPRESENTED_PORT differs.
> >>
> >>>>>
> >>>>> In async API, there is pattern_template introduced. We can mark "1"
> >>>>> to use pattern_template id 1 and "2" to use pattern_template 2.
> >>>>> They will be separated from each other and won't share anymore.
> >>>>
> >>>> Consider an example. "Wire" is a physical port represented by PF0
> >>>> which, in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is
> >>>> attached to guest and is represented by a representor ethdev 1 in DPDK.
> >>>>
> >>>> So, some rules (template 1) are needed to deliver packets from "wire"
> >>>> to "VF" and also decapsulate them. And some rules (template 2) are
> >>>> needed to deliver packets in the opposite direction, from "VF"
> >>>> to "wire" and also encapsulate them.
> >>>>
> >>>> My question is, what prevents you from adding match item
> >>>> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
> >>>> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
> >>>>
> >>>> As I said previously, if you insert such item before eth / ipv4 /
> >>>> etc to your match pattern, doing so defines an *exact* direction / source.
> >>>>
> >>> Could you check the async API guidance? I think pattern template
> >>> focuses on the matching field (mask).
> >>> "REPRESENTED_PORT[ethdev_id=0]" and
> >>> "REPRESENTED_PORT[ethdev_id=1]" are the same.
> >>> 1. pattern template: REPRESENTED_PORT mask 0xffff ...
> >>> 2. action template: action1 / actions2.
> >>> 3. table create with pattern_template plus action template..
> >>> REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create
> >>> REPRESENTED_PORT port_id is 0 / actions ....
> >>> REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create
> >>> REPRESENTED_PORT port_id is 1 / actions ....
> >>
> >> OK, so, based on this explanation, it appears that you might be
> >> looking to refer
> >> to:
> >> a) a *set* of any physical (wire) ports
> >> b) a *set* of any guest ports (VFs)
> >>
> > Great, it looks like we are getting closer and closer to an agreement.
>
> Looks so.
>
> >> You chose to achieve this using an attribute, but:
> >>
> >> 1) as I explained above, the use of term "direction" is wrong;
> >> please hear me out: I'm not saying that your use case and
> >> your optimisation is wrong: I'm saying that naming for it
> >> is wrong: it has nothing to do with "direction";
> >>
> > Do you have any better naming proposal?
>
> As I said, what you are trying to achieve using a new attribute would be way
> better to achieve using new pattern items which can be easily told one from
> another in PMD when pre-allocating resources for different async flow tables.
>
> So, I don't have any proposal for *attribute* naming.
> What I propose is to consider new items instead.
>
> >> 2) while naming a *set* of wire ports as "wire_orig" might be OK,
> >> sticking with term "vf_orig" for a *set* of guest ports is
> >> clearly not, simply because the user may pass another PF
> >> to a guest instead of passing a VF; in other words,
> >> a better term is needed here;
> >>
> > Like you said, vport may contain VF, SF etc. vport_origin is on the logic switch
> perspective.
> > Any proposal is welcome.
>
> The problem is, vport can be easily confused with a slightly more generic
> "lport" (embedded switch's "logical port"), and, logical ports, in turn, are not
> confined to just VFs or PFs. For example, physical (wire) ports are ALSO logical
> ports of the switch.
>
> >> 3) since it is possible to plug multiple NICs to a DPDK application,
> >> even from different vendors, the user may end up having multiple
> >> physical ports belonging to different physical NICs attached to
> >> the application; if this is the case, then referring to a *set*
> >> of wire ports using the new attribute is ambiguous in the
> >> sense that it's unclear whether this applies only to
> >> wire ports of some specific physical NIC or to the
> >> physical ports of *all* NICs managed by the app;
> >>
> > No matter how many NICs have been probed by DPDK, there is always the
> switch/PF/VF/SF.. concept.
>
> Correct.
>
> > Each switch must have an owner identified by transfer_proxy(). Vport (VF/SF)
> can't cross switch in normal case.
>
> No. That is not correct. This is tricky, but please hear me out: an individual NIC
> board (that is, a given *switch*) is identified only by its switch domain ID. As I
> explained above, "transfer proxy" is just a technical hint for the application to
> indicate an ethdev through which "transfer" rules must be managed. Not all
> vendors support this concept (and they are not obliged to support it).
>
> > The traffic comes from one NIC can't be offloaded by other NICs unless
> forwarded by the application.
>
> Right, but forwarding in software (inside DPDK application) is out of scope with
> regard to the problem that we're discussing.
>
> > If the user uses the new attribute to cut one side's resources, I think the user is smart
> enough to manage the rules in different NICs.
>
> As I explained above, I do not deny the existence of the problem that your
> patch is trying to solve. Now it looks like we're on the same page with regard
> to understanding the fact that what you're trying to do is to introduce a match
> criterion that would refer to a GROUP of similar ports. In my opinion, this is
> not an *attribute*, it's a *match criterion*, and it should be implemented as
> two new items.
>
> Having two different item types would perfectly fit the need to know the
> difference between such "directions" (as per your terminology) early enough,
> when parsing templates.
>
> > No default behavior changed with this update.
> >
> >> 4) adding an attribute instead of yet another pattern item type
> >> is not quite good because PMDs need to be updated separately
> >> to detect this attribute and throw an error if it's not
> >> supported, whilst with a new item type, the PMDs do not
> >> need to be updated = if a PMD sees an unsupported item
> >> while traversing the item with switch () { case }, it
> >> will anyway throw an error;
> >>
> > PMD also need to check if it supports new matching item or not, right?
> > We can't assume NIC vendor' PMD implementation, right?
>
> No-no-no. Imagine a PMD which does not support "transfer" rules.
> In such PMD, in the flow parsing function one would have:
>
>     if (!!attr->transfer) {
>         print_error("Transfer is not supported");
>         return EINVAL;
>     }
>
> If you add a new attribute, then PMDs which are NOT going to support it need
> to be updated to add similar check.
> Otherwise, they will simply ignore presence / absence of the attribute in the
> rule, and validation result will be unreliable.
>
> Yes, if this attribute is 0x0, then indeed behaviour does not change. But what if
> it's 0x1 or 0x2?
> PMDs that do not support these values must somehow reject such rules on
> parsing.
>
> However, this problem does not manifest itself when parsing items. Typically, in
> a PMD, one would have:
>
>     switch (item->type) {
>     case RTE_FLOW_ITEM_TYPE_VOID:
>         break;
>
>     case RTE_FLOW_ITEM_TYPE_ETH:
>         /* blah-blah-blah */
>         break;
>
>     default:
>         return ENOTSUP;
>     }
Are you assuming all PMDs are implemented in the style shown above?
This new field targets the async API, which was added recently. There is no impact on the sync API.
I don't expect any impact on existing PMD behavior.
But I agree with you: we should emphasize that it's only for async mode.
>
> So, if you introduce two new item types to solve your problem, then you won't
> have to update existing PMDs. If the vendor wants to support the new items
> (say, MLX or SFC), they'll update their code to accept the items. But other
> vendors will not do anything. If the user tries to pass such an item to a vendor
> which doesn't support the feature, the "default" case will just throw an error.
>
> This is what I mean when pointing out such difference between adding an
> attribute VS adding new item types.
>
> >> 5) as in (4), a new attribute is not good from documentation
> >> standpoint; please search for "represented_port = Y" in
> >> documentation = this way, all supported items are
> >> easily defined for various NIC vendors, but the
> >> same isn't true for attributes = there is no
> >> way to indicate supported attributes in doc.
> >>
> >> If points (1 - 5) make sense to you, then, if I may be so bold, I'd
> >> like to suggest that the idea of adding a new attribute be abandoned.
> >> Instead, I'd like to suggest adding new items:
> >>
> >> (the names are just sketch, for sure, it should be discussed)
> >>
> >> ANY_PHY_PORTS { switch_domain_id }
> >> = match packets entering the embedded switch from *whatever*
> >> physical ports belonging to the given switch domain
> >>
> > How many PHY_PORTS can one switch have, per your thought? Can I treat
> the PHY_PORTS as the { switch_domain_id } owner as transfer_proxy()?
>
> A single physical NIC board is supposed to have a single embedded switch
> engine. Hence, if the NIC board has, for example, two or four physical ports,
> these will be the physical ports of the switch. That's it.
>
> As for the transfer proxy, please see my explanations above.
> It's not *always* reliable to tell whether two given ethdevs belong to the same
> physical NIC board or not.
>
> Switch domain ID is the right criterion (for applications).
>
> >> ANY_GUEST_PORTS { switch_domain_id }
> >> = match packets entering the embedded switch from *whatever*
> >> guest ports (VFs, PFs, etc.) belonging to the given
> >> switch domain
> >>
> >> The field "switch_domain_id" is required to tell one physical board /
> >> vendor from another (as I explained in point (3)).
> >> The application can query this parameter from ethdev's switch info:
> >> please see "struct rte_eth_switch_info".
> >>
> >> What's your opinion?
> >>
> > How can we handle ANY_PHY_PORTS/ ANY_GUEST_PORTS ' relationship
> with REPRESENTED_PORT if conflicts?
> > Need future tuning.
>
> And if you carry on with "vf_orig" / "wire_orig" approach, you will inevitably
> have the very same problem: possible conflict with items like
> REPRESENTED_PORT. So does it matter? Yes, checks need to be done by PMDs
> when parsing patterns.
>
> > Like I said before, offloaded rules can't cross different NIC vendor'
> "switch_domain_id".
> > If user probes multiple NICs in one application, application should take care
> of packet forwarding.
> > Also application should be aware which ports belong to which NICs.
>
> Yes, perhaps, domain ID is not needed in the new items.
> But the application still must keep track of switch domain IDs itself so it knows
> which rules to manage via which ethdevs.
>
> Any other opinions?
ANY_PHY_PORTS / ANY_GUEST_PORTS look like a superset of ports.
This raises another question: "why can't we use REPRESENTED_PORT with a mask" or "combine several REPRESENTED_PORT items together"?
>
> >>>
> >>>>>
> >>>>>> For example, the diff below adds the attributes to "table"
> >>>>>> commands in testpmd but does not add them to regular (non-table)
> >>>>>> commands like "flow create". Why?
> >>>>>>
> >>>>>>>
> >>>>>
> >>>>> "table" command limits pattern_template to single direction or
> >>>>> bidirection
> >>>> per user specified attribute.
> >>>>
> >>>> As I say above, the same effect can be achieved by adding item
> >>>> REPRESENTED_PORT to the corresponding pattern template.
> >>> See above.
> >>>>
> >>>>> "rule" command must tight with one "table_id", so the rule will
> >>>>> inherit the
> >>>> "table" direction property, no need to specify again.
> >>>>
> >>>> You might've misunderstood. I do not talk about "rule" command
> >>>> coupled with some "table". What I talk about is regular, NON-async
> >>>> flow insertion commands.
> >>>>
> >>>> Please take a look at section "/* Validate/create attributes. */"
> >>>> in file "app/test-pmd/cmdline_flow.c". When one adds a new flow
> >>>> attribute, they should reflect it the same way as VC_INGRESS,
> >> VC_TRANSFER, etc.
> >>>>
> >>>> That's it.
> >>> We don't intend to pass this to sync API. The above code example is
> >>> for sync
> >> API.
> >>
> >> So I understand. But there's one slight problem: in your patch, you
> >> add the new attributes to the structure which is *shared* between
> >> sync and async use case scenarios. If one adds an attribute to this
> >> structure, they have to provide accessors for it in all sync-related
> >> commands in testpmd, but your patch does not do that.
> >>
> > Like the title said, "creating transfer table" is the ASYNC operation.
> > We have limited the scope of this patch. Sync API will be another story.
> > Maybe we can add one more sentence to emphasize async API again.
>
> No-no-no. There might be slight misunderstanding. I understand that you are
> limiting the scope of your patch by saying this and this.
> That's OK. What I'm trying to point out is the fact that your patch nevertheless
> touches the COMMON part of the flow API which is shared between two
> approaches (sync and async).
Yeah, you are right, we should emphasize in the code and comments that it is for the async API, not the sync one.
>
> Imagine a reader that does not know anything about the async approach.
> He just opens the file in vim and goes directly to struct rte_flow_attr.
> And, over there, he sees the new attribute "wire_orig". He then immediately
> assumes that these attributes can be used in testpmd. Now the reader opens
> testpmd and tries to insert a flow rule using the sync approach:
>
> flow create priority 0 transfer vf_orig pattern / ... / end actions drop
>
This statement is wrong.
If the user has no idea of the cmdline usage, he should rely on the tab completion hints, not on guessing.
The "flow" command prefix is now bifurcated into sync and async, so the user may try any keyword combination.
He will get an "argument error" if the combination is not valid, unless he knows what he is doing.
Again: we should emphasize that it's for the async API only.
> And doing so will be a failure, because your patch does not add the new
> attribute keyword to sync flow rule syntax parser. That's it.
>
> Once again, I should emphasize: the reader MAY know nothing about the async
> approach. But if the attribute is present in "struct rte_flow_attr", it
> immediately means that it is available everywhere. Both sync and async.
>
> So, with this in mind, your attempt to limit the scope of the patch to async-only
> rules looks a little bit artificial. It's not correct from the *formal* standpoint.
>
> >
> >> In other words, it is wrong to assume that "struct rte_flow_attr"
> >> only applies to async approach. It had been introduced long before
> >> the async flow design was added to DPDK. That's it.
> >>
> >>>>
> >>>> But, as I say, I still believe that the new attributes aren't needed.
> >>> I think we are not at the same page for now. Can we reach agreement
> >>> on the same matching criteria first?
> >>>>>
> >>>>>>> It helps to save underlayer memory also on insertion rate.
> >>>>>>
> >>>>>> Which memory? Host memory? NIC memory? Term "underlayer" is
> >> vague.
> >>>>>> I suggest that the commit message be revised to first explain how
> >>>>>> such memory is spent currently, then explain why this is not
> >>>>>> optimal and, finally, which way the patch is supposed to improve
> >>>>>> that. I.e. be more
> >>>> specific.
> >>>>>>
> >>>>>>>
> >>>>>
> >>>>> For large scalable rules, HW (depends on implementation) always
> >>>>> needs
> >>>> memory to hold the rules' patterns and actions, either from NIC or
> >>>> from
> >> host.
> >>>>> The memory footprint highly depends on "user rules' complexity",
> >>>>> also diff
> >>>> between NICs.
> >>>>> ~50% memory saving is expected if one-direction is cut.
> >>>>
> >>>> Regardless of this talk, this explanation should probably be
> >>>> present in the commit description.
> >>>>
> >>> This number may differ with different NICs or implementation. We
> >>> can't say
> >> it for sure.
> >>
> >> Not an exact number, of course, but a brief explanation of:
> >> a) what is wrong / not optimal in the current design;
> > Please check the commit log, transfer have the capability to match bi-
> direction traffic no matter what ports.
> >> b) how it is observed in customer deployments;
> > Customer have the requirements to save resources and their offloaded rules
> is direction aware.
> >> c) why the proposed patch is a good solution.
> > New attributes provide the way to remove one direction and save underlayer
> resource.
> > All of the above can be found in the commit log.
>
> I understand all of that, but my point is, the existing commit message is way
> too brief. Yes, it mentions that SOME customers have SOME deployments, but
> it does not shed light on which specifics these deployments have. For example,
> back in the day, when items PORT_REPRESENTOR and REPRESENTED_PORT
> were added, the cover letter for that patch series provided details of
> deployment specifics (application: OvS, scenario: full offload rules).
>
> So, it's always better to expand on such specifics so that the reader has full
> picture in their head and doesn't need to look elsewhere.
> Not all readers of the commit message will be happy to delve into our
> discussions on the mailing list to get the gist.
>
The approaches diverge here. The pattern item approach will start another discussion thread, right?
We should reach a conclusion and reflect it in the commit changes and log, so that it's easy for others to absorb.
> >
> >>
> >
> >>>>>
> >>>>>>> By default, the transfer domain is bi-direction, and no behavior
> changes.
> >>>>>>>
> >>>>>>> 1. Match wire origin only
> >>>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >>>>>>> 2. Match vf origin only
> >>>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >>>>>>>
> >>>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> >>>>>>> ---
> >>>>>>> app/test-pmd/cmdline_flow.c | 26
> +++++++++++++++++++++
> >>>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> >>>>>>> lib/ethdev/rte_flow.h | 9 ++++++-
> >>>>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/app/test-pmd/cmdline_flow.c
> >>>>>>> b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82
> >>>>>>> 100644
> >>>>>>> --- a/app/test-pmd/cmdline_flow.c
> >>>>>>> +++ b/app/test-pmd/cmdline_flow.c
> >>>>>>> @@ -177,6 +177,8 @@ enum index {
> >>>>>>> TABLE_INGRESS,
> >>>>>>> TABLE_EGRESS,
> >>>>>>> TABLE_TRANSFER,
> >>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>>>> + TABLE_TRANSFER_VF_ORIG,
> >>>>>>> TABLE_RULES_NUMBER,
> >>>>>>> TABLE_PATTERN_TEMPLATE,
> >>>>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] =
> {
> >>>>>>> TABLE_INGRESS,
> >>>>>>> TABLE_EGRESS,
> >>>>>>> TABLE_TRANSFER,
> >>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>>>> + TABLE_TRANSFER_VF_ORIG,
> >>>>>>> TABLE_RULES_NUMBER,
> >>>>>>> TABLE_PATTERN_TEMPLATE,
> >>>>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
> >>>>>>> .next = NEXT(next_table_attr),
> >>>>>>> .call = parse_table,
> >>>>>>> },
> >>>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
> >>>>>>> + .name = "wire_orig",
> >>>>>>> + .help = "affect rule direction to transfer",
> >>>>>>
> >>>>>> This does not explain the "wire" aspect. It's too broad.
> >>>>>>
> >>>>>>> + .next = NEXT(next_table_attr),
> >>>>>>> + .call = parse_table,
> >>>>>>> + },
> >>>>>>> + [TABLE_TRANSFER_VF_ORIG] = {
> >>>>>>> + .name = "vf_orig",
> >>>>>>> + .help = "affect rule direction to transfer",
> >>>>>>
> >>>>>> This explanation simply duplicates such of the "wire_orig".
> >>>>>> It does not explain the "vf" part. Should be more specific.
> >>>>>>
> >>>>>>> + .next = NEXT(next_table_attr),
> >>>>>>> + .call = parse_table,
> >>>>>>> + },
> >>>>>>> [TABLE_RULES_NUMBER] = {
> >>>>>>> .name = "rules_number",
> >>>>>>> .help = "number of rules in table", @@ -8894,6
> >>>>>>> +8910,16 @@ parse_table(struct context *ctx, const struct token
> >>>>>>> +*token,
> >>>>>>> case TABLE_TRANSFER:
> >>>>>>> out->args.table.attr.flow_attr.transfer = 1;
> >>>>>>> return len;
> >>>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
> >>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>>>> + return -1;
> >>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
> >>>>>>> + return len;
> >>>>>>> + case TABLE_TRANSFER_VF_ORIG:
> >>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>>>> + return -1;
> >>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
> >>>>>>> + return len;
> >>>>>>> default:
> >>>>>>> return -1;
> >>>>>>> }
> >>>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>>>> index 330e34427d..603b7988dd 100644
> >>>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>>>> @@ -3332,7 +3332,8 @@ It is bound to
> >>>>>> ``rte_flow_template_table_create()``::
> >>>>>>>
> >>>>>>> flow template_table {port_id} create
> >>>>>>> [table_id {id}] [group {group_id}]
> >>>>>>> - [priority {level}] [ingress] [egress] [transfer]
> >>>>>>> + [priority {level}] [ingress] [egress]
> >>>>>>> + [transfer [vf_orig] [wire_orig]]
> >>>>>>
> >>>>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
> >>>>>> [wire_orig] ?
> >>>>>>
> >>>>>>> rules_number {number}
> >>>>>>> pattern_template {pattern_template_id}
> >>>>>>> actions_template {actions_template_id} diff --git
> >>>>>>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> >>>>>>> a79f1e7ef0..512b08d817 100644
> >>>>>>> --- a/lib/ethdev/rte_flow.h
> >>>>>>> +++ b/lib/ethdev/rte_flow.h
> >>>>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
> >>>>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
> >>>>>>> */
> >>>>>>> uint32_t transfer:1;
> >>>>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
> >>>>>>> + /**
> >>>>>>> + * 0 means bidirection,
> >>>>>>> + * 0x1 origin uplink,
> >>>>>>
> >>>>>> What does "uplink" mean? It's too vague. Hardly a good term.
> >>
> >> I believe this comment should be reworked, in case the idea of having
> >> an extra attribute persists.
> >>
> >>>>>>
> >>>>>>> + * 0x2 origin vport,
> >>>>>>
> >>>>>> What does "origin vport" mean? Hardly a good term as well.
> >>
> >> I still believe this explanation is way too brief and needs to be
> >> reworked to provide more details, to define the use case for the attribute
> more specifically.
> >>
> >>>>>>
> >>>>>>> + * N/A both set.
> >>>>>>
> >>>>>> What's this?
> >>
> >> The question stands.
> >>
> >>>>>>
> >>>>>>> + */
> >>>>>>> + uint32_t transfer_mode:2;
> >>>>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
> >>>>>>> };
> >>>>>>>
> >>>>>>> /**
> >>>>>>> --
> >>>>>>> 2.27.0
> >>>>>>>
> >>>>>>
> >>>>>> Since the attributes are added to generic 'struct rte_flow_attr',
> >>>>>> non-table
> >>>>>> (synchronous) flow rules are supposed to support them, too. If
> >>>>>> that is indeed the case, then I'm afraid such proposal does not
> >>>>>> agree with the existing items PORT_REPRESENTOR and
> REPRESENTED_PORT.
> >> They
> >>>>>> do exactly the same thing, but they are designed to be way more
> >>>>>> generic. Why
> >>>> not use them?
> >>>>
> >>>> The question stands.
> >>>>
> >>>>>>
> >>>>>> Ivan
> >>>>>
> >>>>
> >>>> Ivan
> >>>
> >
>
> Thank you.
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 0:58 ` Rongwei Liu
@ 2022-09-15 7:47 ` Ivan Malov
2022-09-15 8:18 ` Thomas Monjalon
2022-09-15 8:48 ` Rongwei Liu
0 siblings, 2 replies; 96+ messages in thread
From: Ivan Malov @ 2022-09-15 7:47 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
On Thu, 15 Sep 2022, Rongwei Liu wrote:
> HI Ivan:
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Wednesday, September 14, 2022 23:18
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
>> Darawsheh <rasland@nvidia.com>
>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Rongwei,
>>
>> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>>
>>> HI
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Wednesday, September 14, 2022 15:32
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the
>>>> transfer table
>>>>
>>>>
>>>> Hi,
>>>>
>>>> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>>>>
>>>>> HI
>>>>>
>>>>> BR
>>>>> Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>> Sent: Tuesday, September 13, 2022 22:33
>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating
>>>>>> the transfer table
>>>>>>
>>>>>>
>>>>>> Hi Rongwei,
>>>>>>
>>>>>> PSB
>>>>>>
>>>>>> On Tue, 13 Sep 2022, Rongwei Liu wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> BR
>>>>>>> Rongwei
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>>> Sent: Tuesday, September 13, 2022 00:57
>>>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
>>>>>>>> NBU-Contact- Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>>>>>>>> Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang
>>>>>>>> <yuying.zhang@intel.com>; Andrew Rybchenko
>>>>>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>>>>>>>> <rasland@nvidia.com>
>>>>>>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating
>>>>>>>> the transfer table
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>>>>>>>>
>>>>>>>>> The transfer domain rule is able to match traffic wire/vf origin
>>>>>>>>> and it means two directions' underlayer resource.
>>>>>>>>
>>>>>>>> The point of fact is that matching traffic coming from some
>>>>>>>> entity like wire / VF has been long generalised in the form of
>> representors.
>>>>>>>> So, a flow rule with attribute "transfer" is able to match
>>>>>>>> traffic coming from either a REPRESENTED_PORT or from a
>>>> PORT_REPRESENTOR
>>>>>> (please find these items).
>>>>>>>>
>>>>>>>>>
>>>>>>>>> In customer deployments, they usually match only one direction
>>>>>>>>> traffic in single flow table: either from wire or from vf.
>>>>>>>>
>>>>>>>> Which customer deployments? Could you please provide detailed
>>>> examples?
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> We saw a lot of customers' deployment like:
>>>>>>> 1. Match overlay traffic from wire and do decap, then send to
>>>>>>> specific
>>>> vport.
>>>>>>> 2. Match specific 5-tuples and do encap, then send to wire.
>>>>>>> The matching criteria has obvious direction preference.
>>>>>>
>>>>>> Thank you. My questions are as follows:
>>>>>>
>>>>>> In (1), when you say "from wire", do you mean the need to match
>>>>>> packets arriving via whatever physical ports rather than matching
>>>>>> packets arriving from some specific phys. port?
>>>>
>>>> ^^
>>>>
>>>> Could you please find my question above? Based on your understanding
>>>> of templates in async flow approach, an answer to this question may
>>>> help us find the common ground.
>>> It means traffic arrived from physical ports (transfer_proxy role) or south
>> band per your concept.
>>
>> Transfer proxy has nothing to do with physical ports. And I should stress
>> that "south band" and the likes are NOT my concepts. Instead, I think that
>> direction designations like "south" or "north" aren't applicable when talking
>> about the embedded switch and its flow (transfer) rules.
>>
>>> Traffic from vport (not transfer_proxy) or north band per your concept won't
>> hit even if same packets.
>>
>> Please see above. Transfer proxy is a completely different concept.
>> And I never used "north band" concept.
>>
>>>>
>>>> --
>>>>
>>>>>>
>>>>>> If, however, matching traffic "from wire" in fact means matching
>>>>>> packets arriving from a *specific* physical port, then for sure
>>>>>> item REPRESENTED_PORT should perfectly do the job, and the proposed
>>>>>> attribute is unneeded.
>>>>>>
>>>>>> (BTW, in DPDK, it is customary to use term "physical port", not
>>>>>> "wire")
>>>>>>
>>>>>> In (1), what are "vport"s? Please explain. Once again, I should
>>>>>> remind that, in DPDK, folks prefer terms "represented entity" /
>>>> "representor"
>>>>>> over vendor-specific terms like "vport", etc.
>>>>>>
>>>>> Vport is virtual port for short such as VF.
>>>>
>>>> Thanks. As I say, term "vport" might be confusing to some readers, so
>>>> it'd be better to provide this explanation (about VF) in the commit
>>>> description next time.
>>> Ack. Will add VF as an example.
>>>>
>>>>>> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
>>>>>> Could you please explain, why not just add a match item
>>>>>> REPRESENTED_PORT pointing to that VF via its representor? Doing so
>>>>>> should perfectly define the exact direction / traffic source. Isn't
>>>>>> that
>>>> sufficient?
>>>>>>
>>>>> Per my view, there is matching field and matching value difference.
>>>>> Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as
>>>>> same or
>>>> different matching criteria?
>>>>> I would like to call them same since it can be summarized like
>>>>> 1.1.1.0/30 REPRESENTED_PORT is just another matching item, no
>>>>> essential
>>>> differences and it can't stand for direction info.
>>>>
>>>> It looks like we're starting to run into disagreement here.
>>>> There's no "direction" at all. There's an embedded switch inside the
>>>> NIC, and there're (logical) switch ports that packets enter the switch from.
>>>>
>>>> When the user submits a "transfer" rule and provides neither
>>>> REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the
>> embedded
>>>> switch is supposed to match packets coming from ANY ports, be it VFs
>>>> or physical (wire) ports.
>>>>
>>>> But when the user provides, for example, item REPRESENTED_PORT to
>>>> point to the physical (wire) port, the embedded switch knows exactly
>>>> which port the packets should enter it from.
>>>> In this case, it is supposed to match only packets coming from that
>>>> physical port. And this should be sufficient.
>>>> This in fact replaces the need to know a "direction".
>>>> It's just an exact specification of packet's origin.
>>>>
>>> There is traffic arriving or leaving the switch, so there is always direction,
>> implicit or explicit.
>>
>> This does not contradict my thoughts above. "Direction" is *defined* by two
>> points (like in geometry): an initial point (the switch port through which a
>> packet enters the switch) and the terminal point (the match engine inside the
>> switch). If one knows these two points, no extra hints are required to specify
>> some "direction". Because direction is already represented by this "vector" of
>> sorts. That's why presence of the port match item in the pattern is absolutely
>> sufficient.
> Good to see this. Thanks for the information.
You're very welcome.
> This update leverages the concept exactly defined by you: "an initial point (the switch port through which a
> packet enters the switch)"
No, it doesn't seem so. Based on your explanations, it appears that
this update tries to refer to a "super set" of ports which have
something in common. For example, with attribute "wire_orig"
you seem to be trying to request that the rule match packets
arriving from wire through ANY of the phys.ports. So my point
is: why express an obvious match item as an attribute?
For example, nobody tries to replace match item IPv4 with
an attribute "is_ipv4". That would be strange, to say the
least. Why should the "vf_orig" case be an exception then?
> If you think direction not good, we can change to other words like "initial port"/"origin port" etc.
As I explained multiple times, "direction" is rather obscure from the
viewpoint located inside the embedded switch. Yes, on non-transfer (VNIC)
level, there are *exactly* two directions: ingress and egress.
But, inside of the embedded switch (transfer rules), there can
be *multiple* various "directions", which are not even
directions, = they're traffic PATHs in fact.
Renaming to "initial port" and "origin port" won't be helpful either
because, for users, it will be hard to figure out the difference
between the attribute and items PORT_REPRESENTOR / REPRESENTED_PORT.
If, however, you add new items instead of the attribute, the user
will likely see that the new items and the existing ones are
just alternative options = representor-based items help
to address exact ports (one rule - one port), whilst
your new items help to address super sets of ports
like "all wire ports" or "all guest ports".
So, the short of it:
1) these "wire_orig" / "vf_orig" are in fact yet another match criteria;
2) because of that, they should go to match items and not to attributes.
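To make the difference concrete, here is a minimal, self-contained sketch
(not real DPDK code). It assumes two hypothetical item types,
ANY_WIRE_PORTS and ANY_GUEST_PORTS, which do not exist in rte_flow today;
the enum, the helper name template_origin() and the ORIGIN_* values are
all made up for illustration. It shows how a PMD could tell two pattern
templates apart from item TYPES alone, which is all a template offers,
since exact port IDs only arrive later with the rules.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative item types. ITEM_ETH and ITEM_REPRESENTED_PORT mirror
 * existing rte_flow items; ITEM_ANY_WIRE_PORTS and ITEM_ANY_GUEST_PORTS
 * are the hypothetical "super set" items discussed in this thread.
 */
enum item_type {
	ITEM_END = 0,
	ITEM_ETH,
	ITEM_REPRESENTED_PORT,
	ITEM_ANY_WIRE_PORTS,
	ITEM_ANY_GUEST_PORTS,
};

enum table_origin {
	ORIGIN_ANY = 0,	/* no hint: keep resources for every origin */
	ORIGIN_WIRE,	/* only packets entering from wire ports */
	ORIGIN_GUEST,	/* only packets entering from guest ports */
};

/*
 * Derive the origin class of a pattern template by looking at item types
 * only. No spec values (exact port IDs) are needed for this decision.
 */
static enum table_origin
template_origin(const enum item_type *pattern)
{
	assert(pattern != NULL);

	for (; *pattern != ITEM_END; ++pattern) {
		if (*pattern == ITEM_ANY_WIRE_PORTS)
			return ORIGIN_WIRE;
		if (*pattern == ITEM_ANY_GUEST_PORTS)
			return ORIGIN_GUEST;
	}
	return ORIGIN_ANY;
}
```

A template which only carries REPRESENTED_PORT falls into ORIGIN_ANY in
this sketch, because its mask says nothing about which port will be
matched.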
>>
>> However, based on your later explanations, the use of precise port item is
>> simply inconvenient in your use case because you are trying to match traffic
>> from *multiple* ports that have something in common (i.e. all VFs or all wire
>> ports).
>>
>> And, instead of adding a new item type which would serve exactly your needs,
>> you for some reason try to add an attribute, which has multiple drawbacks
>> which I described in my previous letter.
>>
>>> For transfer rules, there is a concept transfer_proxy.
>>> It takes the switch ownership; all switch rules should be configured via
>> transfer_proxy.
>>
>> Yes, such concept exists, but it's a don't care with regard to the problem that
>> we're discussing, sorry.
>> Furthermore, unlike "switch domain ID" (which is the same for all ethdevs
>> belonging to a given physical NIC board), nobody guarantees that it's only one
>> transfer proxy port. Some NIC vendors allow transfer rules to be added via
>> any ethdev port.
>>
> Does any flow rule leverage switchid already? Is it too obscure for end-user?
No, I'm not talking about flow rules. I'm explaining the logic which
an application may use to identify which ethdevs are on which NICs.
Imagine a DPDK application which has two ethdevs instantiated:
one ethdev sits on top of the admin. PF (ethdev 0), the other
one sits on top of a low-privilege PF (ethdev 1).
In the latter case, it can also be a VF.
Both ethdev 0 and ethdev 1 belong to the same physical NIC board.
Now, what I'm trying to explain is the fact that "proxy"
behaviour may differ between various vendors:
- some vendors say that they can support managing "transfer" rules via
any PFs / VFs. They do not require that some specific PF ethdev be
used to do that. With such vendors, if the application makes a
query "What's the proxy port ID for the ethdev 1?", it will
get "The proxy port ID for ethdev 1 is 1" response.
- but other vendors cannot support the above workflow and they require
that "transfer" rules be managed using some specific (admin) ethdev.
If the application makes the same query here, it will get the
following response: "The proxy port ID for ethdev 1 is 0".
So, given these explanations, it is incorrect to assume that
the proxy port ID for all ethdevs belonging to the same NIC
board will be the same. They simply may not be like this.
However, *regardless* of the two above scenarios and regardless
of vendor, for NICs which have embedded switch feature, when the
user tries to check the "switch domain ID" for ethdev 0 and
ethdev 1, they will get the same value. So, this should be
the right criterion for the application (not for flow
rules themselves) to decide which ethdev belongs to
which physical NIC board.
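The two scenarios can be sketched as follows. This is a self-contained
model, not DPDK code: struct ethdev_record and both helpers are
hypothetical names. In a real application, I would expect the domain ID
to come from the switch info reported via rte_eth_dev_info_get() and the
proxy port ID from rte_flow_pick_transfer_proxy(), but please treat that
mapping as an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ethdev record kept by the application. */
struct ethdev_record {
	uint16_t port_id;		/* ethdev port ID */
	uint16_t switch_domain_id;	/* same for all ethdevs of one NIC board */
	uint16_t proxy_port_id;		/* ethdev to manage "transfer" rules via */
};

/* Reliable criterion: compare switch domain IDs. */
static bool
same_nic_board(const struct ethdev_record *a, const struct ethdev_record *b)
{
	assert(a != NULL && b != NULL);
	return a->switch_domain_id == b->switch_domain_id;
}

/* Unreliable criterion: proxy port IDs may differ on the same board. */
static bool
same_proxy(const struct ethdev_record *a, const struct ethdev_record *b)
{
	assert(a != NULL && b != NULL);
	return a->proxy_port_id == b->proxy_port_id;
}
```

With a vendor of the first kind, ethdev 0 and ethdev 1 share the domain
ID but report different proxy port IDs (0 and 1), so only
same_nic_board() gives the right answer.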
>>>
>>> Imagine a logic switch with one PF and two VFs.
>>> PF is the transfer proxy and VF belongs to the PF logically.
>>> When receiving traffic from PF, we can say it comes into the logic switch.
>>
>> That's correct.
>>
>>> When packet sent from VF (VF belongs to PF), so we can say traffic leaves
>> the switch.
>>
>> That's not correct. Traffic sent from VF (for example, a guest VM is sending
>> packets) also *enters* the switch. PFs and VFs are in fact *separate* logical
>> ports of the embedded switch.
>>
>>>
>>> Item REPRESENTED_PORT indicates switch to match traffic sent from which
>> port, comes into, or leave switch.
>>
>> That is not correct either. Item REPRESENTED_PORT tells the switch to match
>> packets which come into the switch FROM the logical port which is
>> represented by the given DPDK ethdev.
>>
>> For example, if ethdev="E" is the *main* PF which is bound to physical port "P",
>> then item REPRESENTED_PORT with ethdev ID being set to "E" tells the switch
>> that only packets coming to NIC from *wire* via physical port "P" should match.
>>
>>> We can say it as one kind of packet metadata.
>>
>> Kind of yes, but might be vendor-specific. No need to delve into this.
>>
>>> Like you said, DPDK always treat transfer to match any PORTs traffic.
>>
>> Slight correction: it treats it this way until it sees an exact port item.
>> If the user provides REPRESENTED_PORT (or PORT_REPRESENTOR), it's no
>> longer *any* ports traffic, it's an exact port traffic. That's it.
>>
>>> When REPRESENTED_PORT is specified, the rules are limited to some
>> dedicated PORTs.
>>
>> These rules match only packets arriving TO the embedded switch FROM the
>> said dedicated ports.
>>
>>> Other PORTs are ignored because metadata mismatching.
>>
>> Kind of yes, correct.
>>
>>> Rules still have the capability to match ANY PORTS if metadata matched.
>>
>> This statement is only correct for the cases when the user uses
>> neither item REPRESENTED_PORT nor item PORT_REPRESENTOR.
>>
>>>
>>> This update will allow user to cut the other PORTs matching capabilities.
>>
>> As I explained, this is exactly what items PORT_REPRESENTOR and
>> REPRESENTED_PORT do. No need to have an extra attribute.
>>
>> If the user adds item REPRESENTED_PORT with ethdev_id="E", like in the
>> above example, to match packets entering NIC via the physical port "P", then
>> this rule will NOT match packets entering NIC from other points. For example,
>> packets transmitted by a virtual machine via a VF will not match in this case.
>>
>>>>> Port id depends on the attach sequence.
>>>>
>>>> Unfortunately, this is hardly a good argument because flow rules are
>>>> supposed to be inserted based on the run-time packet learning. Attach
>>>> sequence is a don't care here.
>>>>
>>>>>> Also please mind that, although I appreciate your explanations
>>>>>> here, on the mailing list, they should finally be added to the
>>>>>> commit message, so that readers do not have to look for them elsewhere.
>>>>>>
>>>>> We have explained the high possibility of single-direction matching, right?
>>>>
>>>> Not quite. As I said, it is not correct to assume any "direction",
>>>> like in geographical sense ("north", "south", etc.). Application has
>>>> ethdevs, and they are representors of some "virtual ports" (in your
>>>> terminology) belonging to the switch, for example, VFs, SFs or physical
>> ports.
>>>>
>>>> The user adds an appropriate item to the pattern (REPRESENTED_PORT),
>>>> and doing so specifies the path by which the packet enters the switch.
>>>>
>>>>> It's hard to list all the possibilities of traffic matching preferences.
>>>>
>>>> And let's say more: one need never do this. That's exactly the reason
>>>> why DPDK has abandoned the concept of "direction" in *transfer* rules
>>>> and switched to the use of precise criteria (REPRESENTED_PORT, etc.).
>>>>
>>> As far as I know, DPDK changes "transfer ingress" to "transfer", so it's more
>> clear that transfer can match both directions (both ingress and egress).
>>
>> Not quite. DPDK has abandoned the use of "ingress / egress" in "transfer"
>> rules because "ingress" and "egress" are only applicable on the VNIC level. For
>> example, there is a PF attached to DPDK application:
>> packets that the application receives through this ethdev, are ingress, and
>> packets that it transmits (tx_burst) are egress.
>>
>> I can explain in other words. Imagine yourself standing *inside* a room which
>> only has one door. When someone enters the room, it's "ingress", when
>> someone leaves, it's "egress". It's relative to your viewpoint.
>> In this example, such a room represents a VNIC / ethdev.
>>
>> And now imagine yourself standing *outside* of another room / auditorium
>> which has multiple doors / exits. You're standing near some particular exit "A"
>> (VNIC / ethdev), but people may enter this room via another door "B" and then
>> leave it via yet another door "C". In this case, from your viewpoint, this traffic
>> can be considered neither ingress nor egress. Because these people do not
>> approach you.
>>
>> Like in this example, embedded switch is like a large auditorium with many-
>> many doors / exits. And there can be many-many
>> directions: packet can enter the switch via phys. port "P1"
>> and then leave it via another phys. port "P2". Or it can enter the switch via
>> phys. port and then leave it via VF's logical port (to be delivered to a guest
>> machine), or a packet can travel from one VF to another one.
>>
>> There's no PRE-DEFINED direction like "north to south" or "east to west".
>> And this explains why it's very undesirable to use term "direction".
>>
>>> REPRESENTED_PORT is the evolution of "port_id", I think, it's only one kind of
>> matching items.
>>
>> Yes. But nobody prevents you from defining yet another match item which will
>> be able to refer to a *group* of ports which have something in common (i.e.
>> "all guest ports of this switch"
>> pointing to all logical ports currently attached to virtual machines / guests, or
>> "all wire ports of this switch").
>>
>>>
>>> For large scale deployment like 10M rules, if we can save resources
>> significantly by introducing direction, why not?
>>
>> I do not deny the fact that you have a use case where resources can be saved
>> significantly if you give the PMD some extra knowledge when creating a flow
>> table / pattern template. That's totally OK. What I object is the very
>> implementation and the use of term "direction". If you add new item types
>> (like above), then, when you create an async table 1 pattern template, you will
>> have item ANY_WIRE_PORTS, and, for table 2 pattern template, you'll have
>> item ANY_GUEST_PORTS.
>> As you see, the two pattern templates now differ because the match criteria
>> use different items.
>>
>>>
>>> Again, async API:
>>> 1. pattern template A
>>> 2. action template B
>>> 3. table C with pattern template A + action template B.
>>> 4. rule D, E, F...
>>> The specified REPRESENTED_PORT is provided in rules (D, E, F...) not pattern
>> template A or action template B or table C.
>>> Resources may be allocated early at step 3 because of the table's rule_nums property.
>>
>> No, item REPRESENTED_PORT *can* be provided inside pattern template A,
>> but, as you pointed out earlier, the problem is that you can't distinguish
>> different pattern templates which have this item, because pattern templates
>> know nothing about *exact* port IDs and only know item MASKS. Yes, I agree
>> that in your case such problem exists, but, as I say above, it can be solved by
>> adding new item types: one for referring to all phys. ports of a given NIC and
>> another one for pointing to a group of current guest users (VFs).
>>
>>>>> The underlay is the one we have met for now.
>>>>>>>
>>>>>>>>> Introduce one new member transfer_mode into rte_flow_attr to
>>>>>>>>> indicate the flow table direction property: from wire, from vf
>>>>>>>>> or bi-direction(default).
>>>>>>>>
>>>>>>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule
>>>>>>>> insertion and asynchronous (table) approach. The patch adds the
>>>>>>>> attributes to generic 'rte_flow_attr' but, for some reason, ignores non-
>> table rules.
>>>>>>>>
>>>>>>>>>
>>>>>>> Sync API uses one rule to contain everything. It's hard for PMD to
>>>>>>> determine
>>>>>> if this rule has direction preference or not.
>>>>>>> Imagine a situation, just for an example:
>>>>>>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
>>>>>>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
>>>>>>> 1 and 2 share the same matching conditions (eth / ipv4 / udp /
>>>>>>> vxlan /...), so
>>>>>> sync API consider them share matching determination logic.
>>>>>>> It means "2" have 1M scale capability too. Obviously, it wastes a
>>>>>>> lot of
>>>>>> resources.
>>>>>>
>>>>>> Strictly speaking, they do not share the same match pattern.
>>>>>> Your example clearly shows that, in (1), the pattern should request
>>>>>> packets coming from "vport 1" and, in (2), packets coming from "vport 0".
>>>>>>
>>>>>> My point is simple: the "vport" from which packets enter the
>>>>>> embedded switch is ALSO a match criterion. If you accept this,
>>>>>> you'll see: the matching conditions differ.
>>>>>>
>>>>> See above.
>>>>> In this case, I think the matching fields are both "port_id +
>>>>> ipv4_vxlan". They
>>>> are same.
>>>>> Only differs with values like vni 100 or 200 vice versa.
>>>>
>>>> Not quite. Look closer: you use *different* port IDs for (1) and (2).
>>>> The value of "ethdev_id" field in item REPRESENTED_PORT differs.
>>>>
>>>>>>>
>>>>>>> In async API, there is pattern_template introduced. We can mark "1"
>>>>>>> to use
>>>>>> pattern_template id 1 and "2" to use pattern_template 2.
>>>>>>> They will be separated from each other, don't share anymore.
>>>>>>
>>>>>> Consider an example. "Wire" is a physical port represented by PF0
>>>>>> which, in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is
>>>>>> attached to guest and is represented by a representor ethdev 1 in DPDK.
>>>>>>
>>>>>> So, some rules (template 1) are needed to deliver packets from "wire"
>>>>>> to "VF" and also decapsulate them. And some rules (template 2) are
>>>>>> needed to deliver packets in the opposite direction, from "VF"
>>>>>> to "wire" and also encapsulate them.
>>>>>>
>>>>>> My question is, what prevents you from adding match item
>>>>>> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
>>>>>> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
>>>>>>
>>>>>> As I said previously, if you insert such item before eth / ipv4 /
>>>>>> etc to your match pattern, doing so defines an *exact* direction / source.
>>>>>>
>>>>> Could you check the async API guidance? I think pattern template
>>>>> focusing
>>>> on the matching field (mask).
>>>>> "REPRESENTED_PORT[ethdev_id=0] " and
>>>> "REPRESENTED_PORT[ethdev_id=1] "are the same.
>>>>> 1. pattern template: REPRESENTED_PORT mask 0xffff ...
>>>>> 2. action template: action1 / actions2. / 3. table create with
>>>>> pattern_template plus action template..
>>>>> REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create
>>>> REPRESENTED_PORT port_id is 0 / actions ....
>>>>> REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create
>>>> REPRESENTED_PORT port_id is 1 / actions ....
>>>>
>>>> OK, so, based on this explanation, it appears that you might be
>>>> looking to refer
>>>> to:
>>>> a) a *set* of any physical (wire) ports
>>>> b) a *set* of any guest ports (VFs)
>>>>
>>> Great, it looks like we are getting closer and closer to an agreement.
>>
>> Looks so.
>>
>>>> You chose to achieve this using an attribute, but:
>>>>
>>>> 1) as I explained above, the use of term "direction" is wrong;
>>>> please hear me out: I'm not saying that your use case and
>>>> your optimisation is wrong: I'm saying that naming for it
>>>> is wrong: it has nothing to do with "direction";
>>>>
>>> Do you have any better naming proposal?
>>
>> As I said, what you are trying to achieve using a new attribute would be way
>> better to achieve using new pattern items which can be easily told one from
>> another in PMD when pre-allocating resources for different async flow tables.
>>
>> So, I don't have any proposal for *attribute* naming.
>> What I propose is to consider new items instead.
>>
>>>> 2) while naming a *set* of wire ports as "wire_orig" might be OK,
>>>> sticking with term "vf_orig" for a *set* of guest ports is
>>>> clearly not, simply because the user may pass another PF
>>>> to a guest instead of passing a VF; in other words,
>>>> a better term is needed here;
>>>>
>>> Like you said, vport may contain VF, SF etc. vport_origin is on the logic switch
>> perspective.
>>> Any proposal is welcome.
>>
>> The problem is, vport can be easily confused with a slightly more generic
>> "lport" (embedded switch's "logical port"), and, logical ports, in turn, are not
>> confined to just VFs or PFs. For example, physical (wire) ports are ALSO logical
>> ports of the switch.
>>
>>>> 3) since it is possible to plug multiple NICs to a DPDK application,
>>>> even from different vendors, the user may end up having multiple
>>>> physical ports belonging to different physical NICs attached to
>>>> the application; if this is the case, then referring to a *set*
>>>> of wire ports using the new attribute is ambiguous in the
>>>> sense that it's unclear whether this applies only to
>>>> wire ports of some specific physical NIC or to the
>>>> physical ports of *all* NICs managed by the app;
>>>>
>>> No matter how many NICs have been probed by the DPDK, there is always
>> switch/PF/VF/SF.. concept.
>>
>> Correct.
>>
>>> Each switch must have an owner identified by transfer_proxy(). Vport (VF/SF)
>> can't cross switch in normal case.
>>
>> No. That is not correct. This is tricky, but please hear me out: an individual NIC
>> board (that is, a given *switch*) is identified only by its switch domain ID. As I
>> explained above, "transfer proxy" is just a technical hint for the application to
>> indicate an ethdev through which "transfer" rules must be managed. Not all
>> vendors support this concept (and they are not obliged to support it).
>>
>>> The traffic comes from one NIC can't be offloaded by other NICs unless
>> forwarded by the application.
>>
>> Right, but forwarding in software (inside DPDK application) is out of scope with
>> regard to the problem that we're discussing.
>>
>>> If the user uses the new attribute to cut one side's resources, I think the user is smart
>> enough to manage the rules in different NICs.
>>
>> As I explained above, I do not deny the existence of the problem that your
>> patch is trying to solve. Now it looks like we're on the same page with regard
>> to understanding the fact that what you're trying to do is to introduce a match
>> criterion that would refer to a GROUP of similar ports. In my opinion, this is
>> not an *attribute*, it's a *match criterion*, and it should be implemented as
>> two new items.
>>
>> Having two different item types would perfectly fit the need to know the
>> difference between such "directions" (as per your terminology) early enough,
>> when parsing templates.
>>
>>> No default behavior changed with this update.
>>>
>>>> 4) adding an attribute instead of yet another pattern item type
>>>> is not quite good because PMDs need to be updated separately
>>>> to detect this attribute and throw an error if it's not
>>>> supported, whilst with a new item type, the PMDs do not
>>>> need to be updated = if a PMD sees an unsupported item
>>>> while traversing the item with switch () { case }, it
>>>> will anyway throw an error;
>>>>
>>> PMD also need to check if it supports new matching item or not, right?
>>> We can't assume NIC vendor's PMD implementation, right?
>>
>> No-no-no. Imagine a PMD which does not support "transfer" rules.
>> In such PMD, in the flow parsing function one would have:
>>
>> if (!!attr->transfer) {
>>         print_error("Transfer is not supported");
>>         return EINVAL;
>> }
>>
>> If you add a new attribute, then PMDs which are NOT going to support it need
>> to be updated to add similar check.
>> Otherwise, they will simply ignore presence / absence of the attribute in the
>> rule, and validation result will be unreliable.
>>
>> Yes, if this attribute is 0x0, then indeed behaviour does not change. But what if
>> it's 0x1 or 0x2?
>> PMDs that do not support these values must somehow reject such rules on
>> parsing.
>>
>> However, this problem does not manifest itself when parsing items. Typically, in
>> a PMD, one would have:
>>
>> switch (item->type) {
>> case RTE_FLOW_ITEM_TYPE_VOID:
>>         break;
>>
>> case RTE_FLOW_ITEM_TYPE_ETH:
>>         /* blah-blah-blah */
>>         break;
>>
>> default:
>>         return ENOTSUP;
>> }
> Are you assuming all PMDs will be implemented in the upper style?
One may take a look at the existing PMDs. It's open source after all.
When one has an array of items of unknown count which is
END-terminated, then, obviously, the PMD has to traverse
it one way or another. If it stumbles upon an unknown
item, it will have nothing to do but to throw an error.
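A minimal, self-contained sketch of such traversal is below. It is a
model rather than code from any real PMD: the enum, struct item and
legacy_pmd_parse_pattern() are illustrative names, and
ITEM_ANY_WIRE_PORTS stands for a hypothetical new item which this
"legacy" PMD was never taught about.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_ANY_WIRE_PORTS,	/* hypothetical item added after this PMD */
};

struct item {
	enum item_type type;
};

/*
 * Walk the END-terminated pattern. The PMD only knows VOID and ETH;
 * anything else, including items invented later, hits "default".
 */
static int
legacy_pmd_parse_pattern(const struct item *pattern)
{
	assert(pattern != NULL);

	for (; pattern->type != ITEM_END; ++pattern) {
		switch (pattern->type) {
		case ITEM_VOID:
			break;
		case ITEM_ETH:
			/* parse the Ethernet header match here */
			break;
		default:
			return -ENOTSUP;
		}
	}
	return 0;
}
```

The point is that the rejection of the new item comes for free: nobody
had to touch this function when ITEM_ANY_WIRE_PORTS was introduced.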
> This new field targets async API which was added recently. No impact on sync API.
Rongwei, I see your point. The problem with it, however, is that even
if you describe it in comments, the code won't prevent non-sync API
from seeing this attribute in "struct rte_flow_attr".
As I say, "struct rte_flow_attr" has been here for ages.
When one adds a flow rule in a sync way, they fill out
the very same structure. And the user may set this new
argument to non-zero by mistake. Yes, you may argue
that the app developer should be smart enough to
read your comment before the struct member which
says that this field is for async only. Right.
But that's not the only scenario. The field may
become non-zero because of some other mistake in
the program which, for example, leads to the
struct memory being corrupted in one way or
another. That's why the PMD has to validate flow rules...
So, the PMD must detect this inconsistency somehow and throw an error.
With your approach (attribute), the PMDs have to be updated to have
these checks. With the item approach that I suggest, updating the
PMDs is obviously not needed. Am I missing something? Let's discuss.
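Here is a small, self-contained model of what I mean. struct attr_model
is a simplified stand-in for "struct rte_flow_attr" (the 2-bit width of
transfer_mode is my assumption, not taken from the patch), and both
validate functions are hypothetical. The "legacy" PMD supports transfer
rules but was written before transfer_mode existed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct rte_flow_attr. */
struct attr_model {
	unsigned int transfer:1;
	unsigned int transfer_mode:2;	/* 0 = bi-direction (default) */
};

/* PMD written before transfer_mode: it never looks at the new field. */
static int
legacy_pmd_validate_attr(const struct attr_model *attr)
{
	assert(attr != NULL);
	(void)attr->transfer;	/* transfer itself is supported */
	return 0;
}

/* The same PMD after being patched to reject the unsupported attribute. */
static int
updated_pmd_validate_attr(const struct attr_model *attr)
{
	assert(attr != NULL);
	if (attr->transfer_mode != 0)
		return -ENOTSUP;
	return 0;
}
```

With transfer_mode set to 0x1, the legacy function silently accepts the
rule, and only the patched one reports the problem. That patching is the
per-PMD effort which the item approach avoids.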
> I don't predict any effort on the existing PMD behavior.
I see your point. But how is this expressed in code?
As I explain above, consistency checks are what
flow validate API is for. New argument means
new checks. That's it.
> But agree with you: we should emphasize it's only for async mode.
It's better to express this in code. So that the problem (if any)
can be detected programmatically and not just from reading comments.
From my point of view, the easiest way to have this done is to
add items instead of attributes, = no need to update PMDs.
>
>>
>> So, if you introduce two new item types to solve your problem, then you won't
>> have to update existing PMDs. If the vendor wants to support the new items
>> (say, MLX or SFC), they'll update their code to accept the items. But other
>> vendors will not do anything. If the user tries to pass such an item to a vendor
>> which doesn't support the feature, the "default" case will just throw an error.
>>
>> This is what I mean when pointing out such difference between adding an
>> attribute VS adding new item types.
>>
>>>> 5) as in (4), a new attribute is not good from documentation
>>>> standpoint; please search for "represented_port = Y" in
>>>> documentation = this way, all supported items are
>>>> easily defined for various NIC vendors, but the
>>>> same isn't true for attributes = there is no
>>>> way to indicate supported attributes in doc.
>>>>
>>>> If points (1 - 5) make sense to you, then, if I may be so bold, I'd
>>>> like to suggest that the idea of adding a new attribute be abandoned.
>>>> Instead, I'd like to suggest adding new items:
>>>>
>>>> (the names are just sketch, for sure, it should be discussed)
>>>>
>>>> ANY_PHY_PORTS { switch_domain_id }
>>>> = match packets entering the embedded switch from *whatever*
>>>> physical ports belonging to the given switch domain
>>>>
>>> How many PHY_PORTS can one switch have, per your thought? Can I treat
>> the PHY_PORTS as the { switch_domain_id } owner as transfer_proxy()?
>>
>> A single physical NIC board is supposed to have a single embedded switch
>> engine. Hence, if the NIC board has, in example, two or four physical ports,
>> these will be the physical ports of the switch. That's it.
>>
>> As for the transfer proxy, please see my explanations above.
>> It's not *always* reliable to tell whether two given ethdevs belong to the same
>> physical NIC board or not.
>>
>> Switch domain ID is the right criterion (for applications).
>>
>>>> ANY_GUEST_PORTS { switch_domain_id }
>>>> = match packets entering the embedded switch from *whatever*
>>>> guest ports (VFs, PFs, etc.) belonging to the given
>>>> switch domain
>>>>
>>>> The field "switch_domain_id" is required to tell one physical board /
>>>> vendor from another (as I explained in point (3)).
>>>> The application can query this parameter from ethdev's switch info:
>>>> please see "struct rte_eth_switch_info".
>>>>
>>>> What's your opinion?
>>>>
>>> How can we handle ANY_PHY_PORTS/ ANY_GUEST_PORTS ' relationship
>> with REPRESENTED_PORT if conflicts?
>>> Need future tuning.
>>
>> And if you carry on with "vf_orig" / "wire_orig" approach, you will inevitably
>> have the very same problem: possible conflict with items like
>> REPRESENTED_PORT. So does it matter? Yes, checks need to be done by PMDs
>> when parsing patterns.
>>
>>> Like I said before, offloaded rules can't cross different NIC vendor'
>> "switch_domain_id".
>>> If user probes multiple NICs in one application, application should take care
>> of packet forwarding.
>>> Also application should be aware which ports belong to which NICs.
>>
>> Yes, perhaps, domain ID is not needed in the new items.
>> But the application still must keep track of switch domain IDs itself so it knows
>> which rules to manage via which ethdevs.
>>
>> Any other opinions?
> ANY_PHY_PORTS/ ANY_GUEST_PORTS looks like a super set of ports.
So does the new attribute, doesn't it?
> This will come another challenge: "why can't we use REPRESENTED_PORT with mask" or "combine several REPRESENTED_PORT together"?
This problem has been here for many other items, including now deprecated
items PF, VF and PHY_PORT. Yes, theoretically, when the PMD looks through
the pattern, it has to check that its items do not overlap / contradict.
That's kind of OK, isn't it? The PMD has to check things after all...
For example, no one prevents user from submitting a pattern
with several adjacent items ETH in it. The PMD is supposed
to turn such request down.
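Such a consistency check could look roughly like the sketch below. It is
self-contained and purely illustrative: the enum and both functions are
made-up names, the ANY_* items are hypothetical, and the rule "at most
one origin item per pattern" is just my assumption about what a PMD
might want to enforce.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum item_type {
	ITEM_END = 0,
	ITEM_ETH,
	ITEM_IPV4,
	ITEM_REPRESENTED_PORT,
	ITEM_ANY_WIRE_PORTS,	/* hypothetical */
	ITEM_ANY_GUEST_PORTS,	/* hypothetical */
};

/* Items which say where the packet enters the embedded switch from. */
static bool
is_origin_item(enum item_type type)
{
	return type == ITEM_REPRESENTED_PORT ||
	       type == ITEM_ANY_WIRE_PORTS ||
	       type == ITEM_ANY_GUEST_PORTS;
}

/*
 * Reject patterns with two adjacent ETH items or with more than one
 * origin item (for example, ANY_WIRE_PORTS next to REPRESENTED_PORT).
 */
static int
check_pattern_conflicts(const enum item_type *pattern)
{
	enum item_type prev = ITEM_END;
	int origin_items = 0;

	assert(pattern != NULL);

	for (; *pattern != ITEM_END; prev = *pattern, ++pattern) {
		if (*pattern == ITEM_ETH && prev == ITEM_ETH)
			return -EINVAL;
		if (is_origin_item(*pattern) && ++origin_items > 1)
			return -EINVAL;
	}
	return 0;
}
```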
>>
>>>>>
>>>>>>>
>>>>>>>> For example, the diff below adds the attributes to "table"
>>>>>>>> commands in testpmd but does not add them to regular (non-table)
>>>>>>>> commands like "flow create". Why?
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> "table" command limits pattern_template to single direction or
>>>>>>> bidirection
>>>>>> per user specified attribute.
>>>>>>
>>>>>> As I say above, the same effect can be achieved by adding item
>>>>>> REPRESENTED_PORT to the corresponding pattern template.
>>>>> See above.
>>>>>>
>>>>>>> "rule" command must tight with one "table_id", so the rule will
>>>>>>> inherit the
>>>>>> "table" direction property, no need to specify again.
>>>>>>
>>>>>> You might've misunderstood. I do not talk about "rule" command
>>>>>> coupled with some "table". What I talk about is regular, NON-async
>>>>>> flow insertion commands.
>>>>>>
>>>>>> Please take a look at section "/* Validate/create attributes. */"
>>>>>> in file "app/test-pmd/cmdline_flow.c". When one adds a new flow
>>>>>> attribute, they should reflect it the same way as VC_INGRESS,
>>>> VC_TRANSFER, etc.
>>>>>>
>>>>>> That's it.
>>>>> We don't intend to pass this to sync API. The above code example is
>>>>> for sync
>>>> API.
>>>>
>>>> So I understand. But there's one slight problem: in your patch, you
>>>> add the new attributes to the structure which is *shared* between
>>>> sync and async use case scenarios. If one adds an attribute to this
>>>> structure, they have to provide accessors for it in all sync-related
>>>> commands in testpmd, but your patch does not do that.
>>>>
>>> Like the title said, "creating transfer table" is the ASYNC operation.
>>> We have limited the scope of this patch. Sync API will be another story.
>>> Maybe we can add one more sentence to emphasize async API again.
>>
>> No-no-no. There might be slight misunderstanding. I understand that you are
>> limiting the scope of your patch by saying this and this.
>> That's OK. What I'm trying to point out is the fact that your patch nevertheless
>> touches the COMMON part of the flow API which is shared between two
>> approaches (sync and async).
> Yeah, you are right, we should emphasize it for async API not sync in the code and comments.
>>
>> Imagine a reader that does not know anything about the async approach.
>> He just opens the file in vim and goes directly to struct rte_flow_attr.
>> And, over there, he sees the new attribute "wire_orig". He then immediately
>> assumes that these attributes can be used in testpmd. Now the reader opens
>> testpmd and tries to insert a flow rule using the sync approach:
>>
>> flow create priority 0 transfer vf_orig pattern / ... / end actions drop
>>
>
> This is a wrong statement.
> If the user has no idea about cmdline usage, he should rely on "tab indication", not on guessing.
>
> The command prefix "flow" is now bifurcated into sync and async; the user may use any keyword combinations.
> He will get "argument error" if it's not good, unless he knows what he is doing.
> Again: we should emphasize it's for async API only.
OK, even if this example is not good enough, I still believe that
it is not right to introduce new match criteria in the form of
rule attributes. Match criteria belong in the pattern.
>
>> And doing so will be a failure, because your patch does not add the new
>> attribute keyword to sync flow rule syntax parser. That's it.
>>
>> Once again, I should emphasize: the reader MAY know nothing about the async
>> approach. But if the attribute is present in "struct rte_flow_attr", it
>> immediately means that it is available everywhere. Both sync and async.
>>
>> So, with this in mind, your attempt to limit the scope of the patch to async-only
>> rules looks a little bit artificial. It's not correct from the *formal* standpoint.
>>
>>>
>>>> In other words, it is wrong to assume that "struct rte_flow_attr"
>>>> only applies to async approach. It had been introduced long before
>>>> the async flow design was added to DPDK. That's it.
>>>>
>>>>>>
>>>>>> But, as I say, I still believe that the new attributes aren't needed.
>>>>> I think we are not on the same page for now. Can we reach agreement
>>>>> on the same matching criteria first?
>>>>>>>
>>>>>>>>> It helps to save underlayer memory also on insertion rate.
>>>>>>>>
>>>>>>>> Which memory? Host memory? NIC memory? Term "underlayer" is
>>>> vague.
>>>>>>>> I suggest that the commit message be revised to first explain how
>>>>>>>> such memory is spent currently, then explain why this is not
>>>>>>>> optimal and, finally, which way the patch is supposed to improve
>>>>>>>> that. I.e. be more
>>>>>> specific.
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> For large scalable rules, HW (depends on implementation) always
>>>>>>> needs
>>>>>> memory to hold the rules' patterns and actions, either from NIC or
>>>>>> from
>>>> host.
>>>>>>> The memory footprint highly depends on "user rules' complexity",
>>>>>>> also diff
>>>>>> between NICs.
>>>>>>> ~50% memory saving is expected if one-direction is cut.
>>>>>>
>>>>>> Regardless of this talk, this explanation should probably be
>>>>>> present in the commit description.
>>>>>>
>>>>> This number may differ with different NICs or implementation. We
>>>>> can't say
>>>> it for sure.
>>>>
>>>> Not an exact number, of course, but a brief explanation of:
>>>> a) what is wrong / not optimal in the current design;
>>> Please check the commit log, transfer have the capability to match bi-
>> direction traffic no matter what ports.
>>>> b) how it is observed in customer deployments;
>>> Customers have a requirement to save resources, and their offloaded rules
>> are direction-aware.
>>>> c) why the proposed patch is a good solution.
>>> New attributes provide the way to remove one direction and save underlayer
>> resource.
>>> All of the above can be found in the commit log.
>>
>> I understand all of that, but my point is, the existing commit message is way
>> too brief. Yes, it mentions that SOME customers have SOME deployments, but
>> it does not shed light on which specifics these deployments have. For example,
>> back in the day, when items PORT_REPRESENTOR and REPRESENTED_PORT
>> were added, the cover letter for that patch series provided details of
>> deployment specifics (application: OvS, scenario: full offload rules).
>>
>> So, it's always better to expand on such specifics so that the reader has full
>> picture in their head and doesn't need to look elsewhere.
>> Not all readers of the commit message will be happy to delve into our
>> discussions on the mailing list to get the gist.
>>
> The approaches differ. The pattern item approach will attract another discussion thread, right?
As I said, match criteria belong in flow pattern. I recognise the
importance of the problem that you're looking to solve. It's very
good that you care to address it, but what this patch tries to do
is to add more match criteria in the form of new attributes with
rather questionable names... There's room for improvement.
When I say that new features should not confuse readers, I mean
a very basic thing: readers know that match criteria all sit
in the pattern. And they refer to the pattern item enum in
the code and in documentation to learn about criteria,
while "struct rte_flow_attr" is an unusual place from
which to learn about match criteria.
> We should reach a conclusion and reflect it in the commit changes and logs, so that it's easy for others to absorb.
Yes, but before we get to that, perhaps it pays to hear
more feedback from other reviewers. Thomas? Ori? Andrew?
>>>
>>>>
>>>
>>>>>>>
>>>>>>>>> By default, the transfer domain is bi-direction, and no behavior
>> changes.
>>>>>>>>>
>>>>>>>>> 1. Match wire origin only
>>>>>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>>>>>>> 2. Match vf origin only
>>>>>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>>>>>>>
>>>>>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>>>>>>>> ---
>>>>>>>>> app/test-pmd/cmdline_flow.c | 26
>> +++++++++++++++++++++
>>>>>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>>>>>>>> lib/ethdev/rte_flow.h | 9 ++++++-
>>>>>>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/app/test-pmd/cmdline_flow.c
>>>>>>>>> b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82
>>>>>>>>> 100644
>>>>>>>>> --- a/app/test-pmd/cmdline_flow.c
>>>>>>>>> +++ b/app/test-pmd/cmdline_flow.c
>>>>>>>>> @@ -177,6 +177,8 @@ enum index {
>>>>>>>>> TABLE_INGRESS,
>>>>>>>>> TABLE_EGRESS,
>>>>>>>>> TABLE_TRANSFER,
>>>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>>>> TABLE_RULES_NUMBER,
>>>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] =
>> {
>>>>>>>>> TABLE_INGRESS,
>>>>>>>>> TABLE_EGRESS,
>>>>>>>>> TABLE_TRANSFER,
>>>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>>>> TABLE_RULES_NUMBER,
>>>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
>>>>>>>>> .next = NEXT(next_table_attr),
>>>>>>>>> .call = parse_table,
>>>>>>>>> },
>>>>>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>>>>>>>> + .name = "wire_orig",
>>>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>>>
>>>>>>>> This does not explain the "wire" aspect. It's too broad.
>>>>>>>>
>>>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>>>> + .call = parse_table,
>>>>>>>>> + },
>>>>>>>>> + [TABLE_TRANSFER_VF_ORIG] = {
>>>>>>>>> + .name = "vf_orig",
>>>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>>>
>>>>>>>> This explanation simply duplicates that of the "wire_orig".
>>>>>>>> It does not explain the "vf" part. Should be more specific.
>>>>>>>>
>>>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>>>> + .call = parse_table,
>>>>>>>>> + },
>>>>>>>>> [TABLE_RULES_NUMBER] = {
>>>>>>>>> .name = "rules_number",
>>>>>>>>> .help = "number of rules in table", @@ -8894,6
>>>>>>>>> +8910,16 @@ parse_table(struct context *ctx, const struct token
>>>>>>>>> +*token,
>>>>>>>>> case TABLE_TRANSFER:
>>>>>>>>> out->args.table.attr.flow_attr.transfer = 1;
>>>>>>>>> return len;
>>>>>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>>>> + return -1;
>>>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
>>>>>>>>> + return len;
>>>>>>>>> + case TABLE_TRANSFER_VF_ORIG:
>>>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>>>> + return -1;
>>>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
>>>>>>>>> + return len;
>>>>>>>>> default:
>>>>>>>>> return -1;
>>>>>>>>> }
>>>>>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> index 330e34427d..603b7988dd 100644
>>>>>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> @@ -3332,7 +3332,8 @@ It is bound to
>>>>>>>> ``rte_flow_template_table_create()``::
>>>>>>>>>
>>>>>>>>> flow template_table {port_id} create
>>>>>>>>> [table_id {id}] [group {group_id}]
>>>>>>>>> - [priority {level}] [ingress] [egress] [transfer]
>>>>>>>>> + [priority {level}] [ingress] [egress]
>>>>>>>>> + [transfer [vf_orig] [wire_orig]]
>>>>>>>>
>>>>>>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
>>>>>>>> [wire_orig] ?
>>>>>>>>
>>>>>>>>> rules_number {number}
>>>>>>>>> pattern_template {pattern_template_id}
>>>>>>>>> actions_template {actions_template_id} diff --git
>>>>>>>>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
>>>>>>>>> a79f1e7ef0..512b08d817 100644
>>>>>>>>> --- a/lib/ethdev/rte_flow.h
>>>>>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>>>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
>>>>>>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
>>>>>>>>> */
>>>>>>>>> uint32_t transfer:1;
>>>>>>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
>>>>>>>>> + /**
>>>>>>>>> + * 0 means bidirection,
>>>>>>>>> + * 0x1 origin uplink,
>>>>>>>>
>>>>>>>> What does "uplink" mean? It's too vague. Hardly a good term.
>>>>
>>>> I believe this comment should be reworked, in case the idea of having
>>>> an extra attribute persists.
>>>>
>>>>>>>>
>>>>>>>>> + * 0x2 origin vport,
>>>>>>>>
>>>>>>>> What does "origin vport" mean? Hardly a good term as well.
>>>>
>>>> I still believe this explanation is way too brief and needs to be
>>>> reworked to provide more details, to define the use case for the attribute
>> more specifically.
>>>>
>>>>>>>>
>>>>>>>>> + * N/A both set.
>>>>>>>>
>>>>>>>> What's this?
>>>>
>>>> The question stands.
>>>>
>>>>>>>>
>>>>>>>>> + */
>>>>>>>>> + uint32_t transfer_mode:2;
>>>>>>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
>>>>>>>>> };
>>>>>>>>>
>>>>>>>>> /**
>>>>>>>>> --
>>>>>>>>> 2.27.0
>>>>>>>>>
>>>>>>>>
>>>>>>>> Since the attributes are added to generic 'struct rte_flow_attr',
>>>>>>>> non-table
>>>>>>>> (synchronous) flow rules are supposed to support them, too. If
>>>>>>>> that is indeed the case, then I'm afraid such proposal does not
>>>>>>>> agree with the existing items PORT_REPRESENTOR and
>> REPRESENTED_PORT.
>>>> They
>>>>>>>> do exactly the same thing, but they are designed to be way more
>>>>>>>> generic. Why
>>>>>> not use them?
>>>>>>
>>>>>> The question stands.
>>>>>>
>>>>>>>>
>>>>>>>> Ivan
>>>>>>>
>>>>>>
>>>>>> Ivan
>>>>>
>>>
>>
>> Thank you.
>
Thanks,
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 7:47 ` Ivan Malov
@ 2022-09-15 8:18 ` Thomas Monjalon
2022-09-15 9:42 ` Ivan Malov
2022-09-15 8:48 ` Rongwei Liu
1 sibling, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-09-15 8:18 UTC (permalink / raw)
To: Rongwei Liu, Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang,
Andrew Rybchenko, dev, Raslan Darawsheh
15/09/2022 09:47, Ivan Malov:
> As I said, match criteria belong in flow pattern. I recognise the
> importance of the problem that you're looking to solve. It's very
> good that you care to address it, but what this patch tries to do
> is to add more match criteria in the form of new attributes with
> rather questionable names... There's room for improvement.
>
> When I say that new features should not confuse readers, I mean
> a very basic thing: readers know that match criteria all sit
> in the pattern. And they refer to the pattern item enum in
> the code and in documentation to learn about criteria,
> while "struct rte_flow_attr" is an unusual place from
> which to learn about match criteria.
>
> > We should reach a conclusion and reflect it in the commit changes and logs, so that it's easy for others to absorb.
>
> Yes, but before we get to that, perhaps it pays to hear
> more feedback from other reviewers. Thomas? Ori? Andrew?
Sorry I did not read all.
I think the main question is about the use of attributes.
I refer to this commit of Ivan last year which was agreed:
ethdev: deprecate direction attributes in transfer flows
Attributes "ingress" and "egress" can only apply unambiguously
to non-"transfer" flows. In "transfer" flows, the standpoint
is effectively shifted to the embedded switch. There can be
many different endpoints connected to the switch, so the
use of "ingress" / "egress" does not shed light on which
endpoints precisely can be considered as traffic sources.
Add relevant deprecation notices and suggest the use of precise
traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT).
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
So +1 for using only pattern items as matching criteria.
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 7:47 ` Ivan Malov
2022-09-15 8:18 ` Thomas Monjalon
@ 2022-09-15 8:48 ` Rongwei Liu
2022-09-15 10:59 ` Ivan Malov
1 sibling, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-15 8:48 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
HI Ivan:
BR
Rongwei
-----Original Message-----
From: Ivan Malov <ivan.malov@oktetlabs.ru>
Sent: Thursday, September 15, 2022 15:47
To: Rongwei Liu <rongweil@nvidia.com>
Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer table
External email: Use caution opening links or attachments
Hi Rongwei,
On Thu, 15 Sep 2022, Rongwei Liu wrote:
> HI Ivan:
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Wednesday, September 14, 2022 23:18
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan
>> Darawsheh <rasland@nvidia.com>
>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> Hi Rongwei,
>>
>> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>>
>>> HI
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Wednesday, September 14, 2022 15:32
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the
>>>> transfer table
>>>>
>>>> Hi,
>>>>
>>>> On Wed, 14 Sep 2022, Rongwei Liu wrote:
>>>>
>>>>> HI
>>>>>
>>>>> BR
>>>>> Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>> Sent: Tuesday, September 13, 2022 22:33
>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>>>> Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org;
>>>>>> Raslan Darawsheh <rasland@nvidia.com>
>>>>>> Subject: RE: [PATCH v1] ethdev: add direction info when creating
>>>>>> the transfer table
>>>>>>
>>>>>> Hi Rongwei,
>>>>>>
>>>>>> PSB
>>>>>>
>>>>>> On Tue, 13 Sep 2022, Rongwei Liu wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> BR
>>>>>>> Rongwei
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>>>>>> Sent: Tuesday, September 13, 2022 00:57
>>>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
>>>>>>>> NBU-Contact- Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
>>>>>>>> Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang
>>>>>>>> <yuying.zhang@intel.com>; Andrew Rybchenko
>>>>>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>>>>>>>> <rasland@nvidia.com>
>>>>>>>> Subject: Re: [PATCH v1] ethdev: add direction info when creating
>>>>>>>> the transfer table
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Wed, 7 Sep 2022, Rongwei Liu wrote:
>>>>>>>>
>>>>>>>>> The transfer domain rule is able to match traffic wire/vf origin
>>>>>>>>> and it means two directions' underlayer resource.
>>>>>>>>
>>>>>>>> The point of fact is that matching traffic coming from some
>>>>>>>> entity like wire / VF has been long generalised in the form of
>> representors.
>>>>>>>> So, a flow rule with attribute "transfer" is able to match
>>>>>>>> traffic coming from either a REPRESENTED_PORT or from a
>>>> PORT_REPRESENTOR
>>>>>> (please find these items).
>>>>>>>>
>>>>>>>>>
>>>>>>>>> In customer deployments, they usually match only one direction
>>>>>>>>> traffic in single flow table: either from wire or from vf.
>>>>>>>>
>>>>>>>> Which customer deployments? Could you please provide detailed
>>>> examples?
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> We saw a lot of customers' deployment like:
>>>>>>> 1. Match overlay traffic from wire and do decap, then send to
>>>>>>> specific
>>>> vport.
>>>>>>> 2. Match specific 5-tuples and do encap, then send to wire.
>>>>>>> The matching criteria has obvious direction preference.
>>>>>>
>>>>>> Thank you. My questions are as follows:
>>>>>>
>>>>>> In (1), when you say "from wire", do you mean the need to match
>>>>>> packets arriving via whatever physical ports rather than matching
>>>>>> packets arriving from some specific phys. port?
>>>>
>>>> ^^
>>>>
>>>> Could you please find my question above? Based on your understanding
>>>> of templates in async flow approach, an answer to this question may
>>>> help us find the common ground.
>>> It means traffic arriving from physical ports (transfer_proxy role), or south
>> band per your concept.
>>
>> Transfer proxy has nothing to do with physical ports. And I should stress
>> that "south band" and the likes are NOT my concepts. Instead, I think that
>> direction designations like "south" or "north" aren't applicable when talking
>> about the embedded switch and its flow (transfer) rules.
>>
>>> Traffic from vport (not transfer_proxy) or north band per your concept won't
>> hit even if same packets.
>>
>> Please see above. Transfer proxy is a completely different concept.
>> And I never used "north band" concept.
>>
>>>>
>>>> --
>>>>
>>>>>>
>>>>>> If, however, matching traffic "from wire" in fact means matching
>>>>>> packets arriving from a *specific* physical port, then for sure
>>>>>> item REPRESENTED_PORT should perfectly do the job, and the proposed
>>>>>> attribute is unneeded.
>>>>>>
>>>>>> (BTW, in DPDK, it is customary to use term "physical port", not
>>>>>> "wire")
>>>>>>
>>>>>> In (1), what are "vport"s? Please explain. Once again, I should
>>>>>> remind that, in DPDK, folks prefer terms "represented entity" /
>>>> "representor"
>>>>>> over vendor-specific terms like "vport", etc.
>>>>>>
>>>>> Vport is virtual port for short such as VF.
>>>>
>>>> Thanks. As I say, term "vport" might be confusing to some readers, so
>>>> it'd be better to provide this explanation (about VF) in the commit
>>>> description next time.
>>> Ack. Will add VF as an example.
>>>>
>>>>>> As for (2), imagine matching 5-tuple traffic emitted by a VF / guest.
>>>>>> Could you please explain, why not just add a match item
>>>>>> REPRESENTED_PORT pointing to that VF via its representor? Doing so
>>>>>> should perfectly define the exact direction / traffic source. Isn't
>>>>>> that
>>>> sufficient?
>>>>>>
>>>>> Per my view, there is matching field and matching value difference.
>>>>> Like IPv4 src_addr 1.1.1.1, 1.1.1.2. 1.1.1.3, will you treat it as
>>>>> same or
>>>> different matching criteria?
>>>>> I would like to call them same since it can be summarized like
>>>>> 1.1.1.0/30 REPRESENTED_PORT is just another matching item, no
>>>>> essential
>>>> differences and it can't stand for direction info.
>>>>
>>>> It looks like we're starting to run into disagreement here.
>>>> There's no "direction" at all. There's an embedded switch inside the
>>>> NIC, and there're (logical) switch ports that packets enter the switch from.
>>>>
>>>> When the user submits a "transfer" rule and provides neither
>>>> REPRESENTED_PORT nor PORT_REPRESENTOR in the pattern, the
>> embedded
>>>> switch is supposed to match packets coming from ANY ports, be it VFs
>>>> or physical (wire) ports.
>>>>
>>>> But when the user provides, for example, item REPRESENTED_PORT to
>>>> point to the physical (wire) port, the embedded switch knows exactly
>>>> which port the packets should enter it from.
>>>> In this case, it is supposed to match only packets coming from that
>>>> physical port. And this should be sufficient.
>>>> This in fact replaces the need to know a "direction".
>>>> It's just an exact specification of packet's origin.
>>>>
>>> There is traffic arriving or leaving the switch, so there is always direction,
>> implicit or explicit.
>>
>> This does not contradict my thoughts above. "Direction" is *defined* by two
>> points (like in geometry): an initial point (the switch port through which a
>> packet enters the switch) and the terminal point (the match engine inside the
>> switch). If one knows these two points, no extra hints are required to specify
>> some "direction". Because direction is already represented by this "vector" of
>> sorts. That's why presence of the port match item in the pattern is absolutely
>> sufficient.
> Good to see this. Thanks for the information.
You're very welcome.
> This update leverages the concept exactly defined by you: "an initial point (the switch port through which a
> packet enters the switch)"
No, it doesn't seem so. Based on your explanations, it appears that
this update tries to refer to a "super set" of ports which have
something in common. For example, with attribute "wire_orig"
you seem to be trying to request that the rule match packets
arriving from wire through ANY of the phys.ports. So my point
is: why express an obvious match item as an attribute?
Let me explain more, based on your own point and wording:
"Direction" is *defined* by two points (like in geometry): an initial point (the switch port through which a
packet enters the switch) and the terminal point (the match engine inside the switch).
Wire_orig: the initial point is the uplink (wire) port and the terminal point is the switch (physical or logical); the switch handles the packets eventually.
Vport_orig: the initial point is a virtual port and the terminal point is the switch (physical or logical).
It looks like they match perfectly.
I think you have some misunderstanding about match items: an item should contain a matching field (rte_item->mask) and a matching value (rte_item->spec).
What you proposed, "ANY_GUEST_PORT/ANY_PHY_PORT", seems to mix both together.
ANY_GUEST_PORT: the port belongs to a specific switch domain and it's virtual.
ANY_PHY_PORT: the port belongs to a specific switch domain and it's physical.
For example, nobody tries to replace match item IPv4 with
an attribute "is_ipv4". That would be strange, to say the
least. Why should the "vf_orig" case be an exception then?
It's a good point. There is already "port_id/represented_port", so why do you want to add "IS_***_PORTS" match items?
Like IPv4, it matches rx_only, tx_only and rx_tx for the INGRESS, EGRESS and TRANSFER domains respectively; eventually it follows the domain principle.
A match item should be generic. It stands for what the users care about and what they want.
"vf_orig"/"wire_orig" is resource sensitive and goes beyond match items; match items should always follow it.
With this, there is no possibility of matching the cut-off path. It's a very advanced feature.
> If you think "direction" is not good, we can change it to other words like "initial port"/"origin port" etc.
As I explained multiple times, "direction" is rather obscure from the
viewpoint located inside the embedded switch. Yes, on non-transfer (VNIC)
level, there are *exactly* two directions: ingress and egress.
But, inside of the embedded switch (transfer rules), there can
be *multiple* various "directions", which are not even
directions, = they're traffic PATHs in fact.
Renaming to "initial port" and "origin port" won't be helpful either
because, for users, it will be hard to figure out the difference
between the attribute and items PORT_REPRESENTOR / REPRESENTED_PORT.
If, however, you add new items instead of the attribute, the user
will likely see that the new items and the existing ones are
just alternative options = representor-based items help
to address exact ports (one rule - one port), whilst
your new items help to address super sets of ports
like "all wire ports" or "all guest ports".
You forgot rte_item->mask here.
So, the short of it:
1) these "wire_orig" / "vf_orig" are in fact yet another match criteria;
2) because of that, they should go to match items and not to attributes.
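To make points 1) and 2) concrete, here is a minimal, self-contained sketch of the idea. It is NOT DPDK code: every type and name in it (port_kind, origin_item, ORIGIN_ANY_PHYS_PORT, ORIGIN_ANY_GUEST_PORT, origin_matches) is hypothetical and only models how a "super set of ports" criterion could sit next to an exact port criterion, assuming each logical switch port is known to be either physical or guest-facing.

```c
/* Hypothetical model, not the DPDK API: all names are invented for illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kind of logical port through which a packet enters the embedded switch. */
enum port_kind { PORT_KIND_PHYS, PORT_KIND_GUEST };

struct switch_port {
	unsigned int id;
	enum port_kind kind;
};

/* Traffic source criteria, all expressed as pattern items. */
enum origin_item_type {
	ORIGIN_ANY,            /* no port item in the pattern: any port matches */
	ORIGIN_EXACT_PORT,     /* stands in for REPRESENTED_PORT / PORT_REPRESENTOR */
	ORIGIN_ANY_PHYS_PORT,  /* hypothetical "all wire ports" item */
	ORIGIN_ANY_GUEST_PORT, /* hypothetical "all guest ports" item */
};

struct origin_item {
	enum origin_item_type type;
	unsigned int port_id; /* meaningful for ORIGIN_EXACT_PORT only */
};

/* Return true if a packet entering via 'entry' satisfies the criterion. */
bool origin_matches(const struct origin_item *item,
		    const struct switch_port *entry)
{
	assert(item != NULL && entry != NULL);

	switch (item->type) {
	case ORIGIN_ANY:
		return true;
	case ORIGIN_EXACT_PORT:
		return entry->id == item->port_id;
	case ORIGIN_ANY_PHYS_PORT:
		return entry->kind == PORT_KIND_PHYS;
	case ORIGIN_ANY_GUEST_PORT:
		return entry->kind == PORT_KIND_GUEST;
	}
	return false;
}
```

In this model, the "wire_orig" case is simply an item of type ORIGIN_ANY_PHYS_PORT and the "vf_orig" case is ORIGIN_ANY_GUEST_PORT. Both are evaluated by the same code path as the exact port item, which is the sense in which they are match criteria rather than attributes.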
>>
>> However, based on your later explanations, the use of precise port item is
>> simply inconvenient in your use case because you are trying to match traffic
>> from *multiple* ports that have something in common (i.e. all VFs or all wire
>> ports).
>>
>> And, instead of adding a new item type which would serve exactly your needs,
>> you for some reason try to add an attribute, which has multiple drawbacks
>> which I described in my previous letter.
>>
>>> For transfer rules, there is a concept transfer_proxy.
>>> It takes the switch ownership; all switch rules should be configured via
>> transfer_proxy.
>>
>> Yes, such concept exists, but it's a don't care with regard to the problem that
>> we're discussing, sorry.
>> Furthermore, unlike "switch domain ID" (which is the same for all ethdevs
>> belonging to a given physical NIC board), nobody guarantees that it's only one
>> transfer proxy port. Some NIC vendors allow transfer rules to be added via
>> any ethdev port.
>>
> Does any flow rule leverage the switch ID already? Is it too obscure for the end user?
No, I'm not talking about flow rules. I'm explaining the logic which
application may use to identify which ethdevs are on which NICs.
Imagine a DPDK application which has two ethdevs instantiated:
one ethdev sits on top of the admin. PF (ethdev 0), the other
one sits on top of a low-privilege PF (ethdev 1).
In the latter case, it can also be a VF.
Both ethdev 0 and ethdev 1 belong to the same physical NIC board.
Now, what I'm trying to explain is the fact that "proxy"
behaviour may differ between various vendors:
- some vendors say that they can support managing "transfer" rules via
any PFs / VFs. They do not require that some specific PF ethdev be
used to do that. With such vendors, if the application makes a
query "What's the proxy port ID for the ethdev 1?", it will
get "The proxy port ID for ethdev 1 is 1" response.
- but other vendors cannot support the above workflow and they require
that "transfer" rules be managed using some specific (admin) ethdev.
If the application makes the same query here, it will get the
following response: "The proxy port ID for ethdev 1 is 0".
So, given these explanations, it is incorrect to assume that
the proxy port ID for all ethdevs belonging to the same NIC
board will be the same. They simply may not be like this.
However, *regardless* of the two above scenarios and regardless
of vendor, for NICs which have embedded switch feature, when the
user tries to check the "switch domain ID" for ethdev 0 and
ethdev 1, they will get the same value. So, this should be
the right criterion for the application (not for flow
rules themselves) to decide which ethdev belongs to
which physical NIC board.
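The difference between the two criteria can be shown with a small standalone sketch. It does not call DPDK: struct ethdev_model and both helpers are hypothetical, and the field values are made up. In a real application the switch domain ID would presumably be read from the switch_info.domain_id field filled in by rte_eth_dev_info_get(), and the proxy port would come from rte_flow_pick_transfer_proxy().

```c
/* Hypothetical model, not the DPDK API: names and values are illustrative. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ethdev_model {
	uint16_t port_id;
	uint16_t switch_domain_id; /* identical for all ethdevs on one NIC board */
	uint16_t proxy_port_id;    /* vendor-dependent answer to the proxy query */
};

/* Reliable criterion: do two ethdevs share the same embedded switch? */
bool same_nic_board(const struct ethdev_model *a, const struct ethdev_model *b)
{
	assert(a != NULL && b != NULL);
	return a->switch_domain_id == b->switch_domain_id;
}

/* Unreliable criterion: proxy ports may differ even on the same board. */
bool same_proxy(const struct ethdev_model *a, const struct ethdev_model *b)
{
	assert(a != NULL && b != NULL);
	return a->proxy_port_id == b->proxy_port_id;
}
```

With a vendor of the first kind, ethdev 0 would be modelled as { 0, 7, 0 } and ethdev 1 as { 1, 7, 1 }: same_nic_board() is true while same_proxy() is false. With a vendor of the second kind, ethdev 1 becomes { 1, 7, 0 } and both helpers return true. Only the switch domain comparison gives the same answer in both cases.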
Why do you say the user is capable of checking the switch domain ID to learn which ports belong where,
but not capable of knowing basic DPDK rte_flow API usage?
There are too many assumptions.
Taking VFs as an example, they are different from the very beginning; see the SR-IOV commands:
echo $num > /sysfs/ .... /PF_BDF/sriov_num
echo VF_BDF > /sysfs/.../bind or unbind
>>>
>>> Imagine a logical switch with one PF and two VFs.
>>> PF is the transfer proxy and VF belongs to the PF logically.
>>> When receiving traffic from PF, we can say it comes into the logic switch.
>>
>> That's correct.
>>
>>> When packet sent from VF (VF belongs to PF), so we can say traffic leaves
>> the switch.
>>
>> That's not correct. Traffic sent from VF (for example, a guest VM is sending
>> packets) also *enters* the switch. PFs and VFs are in fact *separate* logical
>> ports of the embedded switch.
>>
>>>
>>> Item REPRESENTED_PORT indicates switch to match traffic sent from which
>> port, comes into, or leave switch.
>>
>> That is not correct either. Item REPRESENTED_PORT tells the switch to match
>> packets which come into the switch FROM the logical port which is
>> represented by the given DPDK ethdev.
>>
>> For example, if ethdev="E" is the *main* PF which is bound to physical port "P",
>> then item REPRESENTED_PORT with ethdev ID being set to "E" tells the switch
>> that only packets coming to the NIC from the *wire* via physical port "P" should match.
>>
>>> We can say it as one kind of packet metadata.
>>
>> Kind of yes, but might be vendor-specific. No need to delve into this.
>>
>>> Like you said, DPDK always treat transfer to match any PORTs traffic.
>>
>> Slight correction: it treats it this way until it sees an exact port item.
>> If the user provides REPRESENTED_PORT (or PORT_REPRESENTOR), it's no
>> longer *any* ports traffic, it's an exact port traffic. That's it.
>>
>>> When REPRESENTED_PORT is specified, the rules are limited to some
>> dedicated PORTs.
>>
>> These rules match only packets arriving TO the embedded switch FROM the
>> said dedicated ports.
>>
>>> Other PORTs are ignored because metadata mismatching.
>>
>> Kind of yes, correct.
>>
>>> Rules still have the capability to match ANY PORTS if metadata matched.
>>
>> This statement is only correct for the cases when the user uses
>> NEITHER item REPRESENTED_PORT nor item PORT_REPRESENTOR.
>>
>>>
>>> This update will allow user to cut the other PORTs matching capabilities.
>>
>> As I explained, this is exactly what items PORT_REPRESENTOR and
>> REPRESENTED_PORT do. No need to have an extra attribute.
>>
>> If the user adds item REPRESENTED_PORT with ethdev_id="E", like in the
>> above example, to match packets entering NIC via the physical port "P", then
>> this rule will NOT match packets entering NIC from other points. For example,
>> packets transmitted by a virtual machine via a VF will not match in this case.
>>
>>>>> Port id depends on the attach sequence.
>>>>
>>>> Unfortunately, this is hardly a good argument because flow rules are
>>>> supposed to be inserted based on the run-time packet learning. Attach
>>>> sequence is a don't care here.
>>>>
>>>>>> Also please mind that, although I appreciate your explanations
>>>>>> here, on the mailing list, they should finally be added to the
>>>>>> commit message, so that readers do not have to look for them elsewhere.
>>>>>>
>>>>> We have explained the high possibility of single-direction matching, right?
>>>>
>>>> Not quite. As I said, it is not correct to assume any "direction",
>>>> like in geographical sense ("north", "south", etc.). Application has
>>>> ethdevs, and they are representors of some "virtual ports" (in your
>>>> terminology) belonging to the switch, for example, VFs, SFs or physical
>> ports.
>>>>
>>>> The user adds an appropriate item to the pattern (REPRESENTED_PORT),
>>>> and doing so specifies the path by which the packet enters the switch.
>>>>
>>>>> It's hard to list all the possibilities of traffic matching preferences.
>>>>
>>>> And let's say more: one need never do this. That's exactly the reason
>>>> why DPDK has abandoned the concept of "direction" in *transfer* rules
>>>> and switched to the use of precise criteria (REPRESENTED_PORT, etc.).
>>>>
>>> As far as I know, DPDK changes "transfer ingress" to "transfer", so it's more
>> clear that transfer can match both directions (both ingress and egress).
>>
>> Not quite. DPDK has abandoned the use of "ingress / egress" in "transfer"
>> rules because "ingress" and "egress" are only applicable on the VNIC level. For
>> example, there is a PF attached to DPDK application:
>> packets that the application receives through this ethdev, are ingress, and
>> packets that it transmits (tx_burst) are egress.
>>
>> I can explain in other words. Imagine yourself standing *inside* a room which
>> only has one door. When someone enters the room, it's "ingress", when
>> someone leaves, it's "egress". It's relative to your viewpoint.
>> In this example, such a room represents a VNIC / ethdev.
>>
>> And now imagine yourself standing *outside* of another room / auditorium
>> which has multiple doors / exits. You're standing near some particular exit "A"
>> (VNIC / ethdev), but people may enter this room via another door "B" and then
>> leave it via yet another door "C". In this case, from your viewpoint, this traffic
>> can be considered neither ingress nor egress. Because these people do not
>> approach you.
>>
>> Like in this example, embedded switch is like a large auditorium with many-
>> many doors / exits. And there can be many-many
>> directions: packet can enter the switch via phys. port "P1"
>> and then leave it via another phys. port "P2". Or it can enter the switch via
>> phys. port and then leave it via VF's logical port (to be delivered to a guest
>> machine), or a packet can travel from one VF to another one.
>>
>> There's no PRE-DEFINED direction like "north to south" or "east to west".
>> And this explains why it's very undesirable to use term "direction".
>>
>>> REPRESENTED_PORT is the evolution of "port_id", I think, it's only one kind of
>> matching items.
>>
>> Yes. But nobody prevents you from defining yet another match item which will
>> be able to refer to a *group* of ports which have something in common (i.e.
>> "all guest ports of this switch"
>> pointing to all logical ports currently attached to virtual machines / guests, or
>> "all wire ports of this switch").
>>
>>>
>>> For large scale deployment like 10M rules, if we can save resources
>> significantly by introducing direction, why not?
>>
>> I do not deny the fact that you have a use case where resources can be saved
>> significantly if you give the PMD some extra knowledge when creating a flow
>> table / pattern template. That's totally OK. What I object to is the very
>> implementation and the use of term "direction". If you add new item types
>> (like above), then, when you create an async table 1 pattern template, you will
>> have item ANY_WIRE_PORTS, and, for table 2 pattern template, you'll have
>> item ANY_GUEST_PORTS.
>> As you see, the two pattern templates now differ because the match criteria
>> use different items.
>>
>>>
>>> Again, async API:
>>> 1. pattern template A
>>> 2. action template B
>>> 3. table C with pattern template A + action template B.
>>> 4. rule D, E, F...
>>> The specified REPRESENTED_PORT is provided in rules (D, E, F...) not pattern
>> template A or action template B or table C.
>>> Resources may be allocated early at step 3 because of the table's rule_nums property.
>>
>> No, item REPRESENTED_PORT *can* be provided inside pattern template A,
>> but, as you pointed out earlier, the problem is that you can't distinguish
>> different pattern templates which have this item, because pattern templates
>> know nothing about *exact* port IDs and only know item MASKS. Yes, I agree
>> that in your case such problem exists, but, as I say above, it can be solved by
>> adding new item types: one for referring to all phys. ports of a given NIC and
>> another one for pointing to a group of current guest users (VFs).
>>
>>>>> The underlay is the one we have met for now.
>>>>>>>
>>>>>>>>> Introduce one new member transfer_mode into rte_flow_attr to
>>>>>>>>> indicate the flow table direction property: from wire, from vf
>>>>>>>>> or bi-direction(default).
>>>>>>>>
>>>>>>>> AFAIK, 'rte_flow_attr' serves both traditional flow rule
>>>>>>>> insertion and asynchronous (table) approach. The patch adds the
>>>>>>>> attributes to generic 'rte_flow_attr' but, for some reason, ignores non-
>> table rules.
>>>>>>>>
>>>>>>>>>
>>>>>>> Sync API uses one rule to contain everything. It's hard for PMD to
>>>>>>> determine
>>>>>> if this rule has direction preference or not.
>>>>>>> Imagine a situation, just for an example:
>>>>>>> 1. Vport 1 VxLAN do decap send to vport 2. 1 million scale
>>>>>>> 2. Vport 0 (wire) VxLAN do decap send to vport 3. 1 hundred scale.
>>>>>>> 1 and 2 share the same matching conditions (eth / ipv4 / udp /
>>>>>>> vxlan /...), so
>>>>>> sync API consider them share matching determination logic.
>>>>>>> It means "2" have 1M scale capability too. Obviously, it wastes a
>>>>>>> lot of
>>>>>> resources.
>>>>>>
>>>>>> Strictly speaking, they do not share the same match pattern.
>>>>>> Your example clearly shows that, in (1), the pattern should request
>>>>>> packets coming from "vport 1" and, in (2), packets coming from "vport 0".
>>>>>>
>>>>>> My point is simple: the "vport" from which packets enter the
>>>>>> embedded switch is ALSO a match criterion. If you accept this,
>>>>>> you'll see: the matching conditions differ.
>>>>>>
>>>>> See above.
>>>>> In this case, I think the matching fields are both "port_id +
>>>>> ipv4_vxlan". They
>>>> are same.
>>>>> Only differs with values like vni 100 or 200 vice versa.
>>>>
>>>> Not quite. Look closer: you use *different* port IDs for (1) and (2).
>>>> The value of "ethdev_id" field in item REPRESENTED_PORT differs.
>>>>
>>>>>>>
>>>>>>> In async API, there is pattern_template introduced. We can mark "1"
>>>>>>> to use
>>>>>> pattern_template id 1 and "2" to use pattern_template 2.
>>>>>>> They will be separated from each other, don't share anymore.
>>>>>>
>>>>>> Consider an example. "Wire" is a physical port represented by PF0
>>>>>> which, in turn, is attached to DPDK via ethdev 0. "VF" (vport?) is
>>>>>> attached to guest and is represented by a representor ethdev 1 in DPDK.
>>>>>>
>>>>>> So, some rules (template 1) are needed to deliver packets from "wire"
>>>>>> to "VF" and also decapsulate them. And some rules (template 2) are
>>>>>> needed to deliver packets in the opposite direction, from "VF"
>>>>>> to "wire" and also encapsulate them.
>>>>>>
>>>>>> My question is, what prevents you from adding match item
>>>>>> REPRESENTED_PORT[ethdev_id=0] to the pattern template 1 and
>>>>>> REPRESENTED_PORT[ethdev_id=1] to the pattern template 2?
>>>>>>
>>>>>> As I said previously, if you insert such item before eth / ipv4 /
>>>>>> etc to your match pattern, doing so defines an *exact* direction / source.
>>>>>>
>>>>> Could you check the async API guidance? I think pattern template
>>>>> focusing
>>>> on the matching field (mask).
>>>>> "REPRESENTED_PORT[ethdev_id=0] " and
>>>> "REPRESENTED_PORT[ethdev_id=1] "are the same.
>>>>> 1. pattern template: REPRESENTED_PORT mask 0xffff ...
>>>>> 2. action template: action1 / actions2. / 3. table create with
>>>>> pattern_template plus action template..
>>>>> REPRESENTED_PORT[ethdev_id=0] will be rule1: rule create
>>>> REPRESENTED_PORT port_id is 0 / actions ....
>>>>> REPRESENTED_PORT[ethdev_id=1] will be rule2: rule create
>>>> REPRESENTED_PORT port_id is 1 / actions ....
>>>>
>>>> OK, so, based on this explanation, it appears that you might be
>>>> looking to refer
>>>> to:
>>>> a) a *set* of any physical (wire) ports
>>>> b) a *set* of any guest ports (VFs)
>>>>
>>> Great, it looks like we are getting closer to an agreement.
>>
>> Looks so.
>>
>>>> You chose to achieve this using an attribute, but:
>>>>
>>>> 1) as I explained above, the use of term "direction" is wrong;
>>>> please hear me out: I'm not saying that your use case and
>>>> your optimisation is wrong: I'm saying that naming for it
>>>> is wrong: it has nothing to do with "direction";
>>>>
>>> Do you have any better naming proposal?
>>
>> As I said, what you are trying to achieve using a new attribute would be way
>> better to achieve using new pattern items which can be easily told one from
>> another in PMD when pre-allocating resources for different async flow tables.
>>
>> So, I don't have any proposal for *attribute* naming.
>> What I propose is to consider new items instead.
>>
>>>> 2) while naming a *set* of wire ports as "wire_orig" might be OK,
>>>> sticking with term "vf_orig" for a *set* of guest ports is
>>>> clearly not, simply because the user may pass another PF
>>>> to a guest instead of passing a VF; in other words,
>>>> a better term is needed here;
>>>>
>>> Like you said, vport may contain VF, SF etc. vport_orgin is on the logic switch
>> perspective.
>>> Any proposal is welcome.
>>
>> The problem is, vport can be easily confused with a slightly more generic
>> "lport" (embedded switch's "logical port"), and, logical ports, in turn, are not
>> confined to just VFs or PFs. For example, physical (wire) ports are ALSO logical
>> ports of the switch.
>>
>>>> 3) since it is possible to plug multiple NICs to a DPDK application,
>>>> even from different vendors, the user may end up having multiple
>>>> physical ports belonging to different physical NICs attached to
>>>> the application; if this is the case, then referring to a *set*
>>>> of wire ports using the new attribute is ambiguous in the
>>>> sense that it's unclear whether this applies only to
>>>> wire ports of some specific physical NIC or to the
>>>> physical ports of *all* NICs managed by the app;
>>>>
>>> No matter how many NICs have been probed by DPDK, there is always the
>> switch/PF/VF/SF.. concept.
>>
>> Correct.
>>
>>> Each switch must have an owner identified by transfer_proxy(). Vport (VF/SF)
>> can't cross switch in normal case.
>>
>> No. That is not correct. This is tricky, but please hear me out: an individual NIC
>> board (that is, a given *switch*) is identified only by its switch domain ID. As I
>> explained above, "transfer proxy" is just a technical hint for the application to
>> indicate an ethdev through which "transfer" rules must be managed. Not all
>> vendors support this concept (and they are not obliged to support it).
>>
>>> The traffic comes from one NIC can't be offloaded by other NICs unless
>> forwarded by the application.
>>
>> Right, but forwarding in software (inside DPDK application) is out of scope with
>> regard to the problem that we're discussing.
>>
>>> If the user uses the new attribute to cut one side's resources, I think the user is smart
>> enough to manage the rules in different NICs.
>>
>> As I explained above, I do not deny the existence of the problem that your
>> patch is trying to solve. Now it looks like we're on the same page with regard
>> to understanding the fact that what you're trying to do is to introduce a match
>> criterion that would refer to a GROUP of similar ports. In my opinion, this is
>> not an *attribute*, it's a *match criterion*, and it should be implemented as
>> two new items.
>>
>> Having two different item types would perfectly fit the need to know the
>> difference between such "directions" (as per your terminology) early enough,
>> when parsing templates.
>>
>>> No default behavior changed with this update.
>>>
>>>> 4) adding an attribute instead of yet another pattern item type
>>>> is not quite good because PMDs need to be updated separately
>>>> to detect this attribute and throw an error if it's not
>>>> supported, whilst with a new item type, the PMDs do not
>>>> need to be updated = if a PMD sees an unsupported item
>>>> while traversing the item with switch () { case }, it
>>>> will anyway throw an error;
>>>>
>>> PMD also need to check if it supports new matching item or not, right?
>>> We can't assume NIC vendor's PMD implementation, right?
>>
>> No-no-no. Imagine a PMD which does not support "transfer" rules.
>> In such PMD, in the flow parsing function one would have:
>>
>>     if (!!attr->transfer) {
>>         print_error("Transfer is not supported");
>>         return EINVAL;
>>     }
>>
>> If you add a new attribute, then PMDs which are NOT going to support it need
>> to be updated to add similar check.
>> Otherwise, they will simply ignore presence / absence of the attribute in the
>> rule, and validation result will be unreliable.
>>
>> Yes, if this attribute is 0x0, then indeed behaviour does not change. But what if
>> it's 0x1 or 0x2?
>> PMDs that do not support these values must somehow reject such rules on
>> parsing.
>>
>> However, this problem does not manifest itself when parsing items. Typically, in
>> a PMD, one would have:
>>
>>     switch (item->type) {
>>     case RTE_FLOW_ITEM_TYPE_VOID:
>>         break;
>>
>>     case RTE_FLOW_ITEM_TYPE_ETH:
>>         /* blah-blah-blah */
>>         break;
>>
>>     default:
>>         return ENOTSUP;
>>     }
> Are you assuming all PMDs will be implemented in the upper style?
One may take a look at the existing PMDs. It's open source after all.
When one has an array of items of unknown count which is
END-terminated, then, obviously, the PMD has to traverse
it one way or another. If it stumbles upon an unknown
item, it will have nothing to do but to throw an error.
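To make the traversal argument concrete, here is a minimal standalone sketch. It assumes simplified stand-in types (enum item_type, struct item and parse_pattern() are hypothetical and are not the real rte_flow definitions); it only shows that an END-terminated walk with a "default" branch rejects an item the driver has never heard of.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for rte_flow item types; not the real DPDK enum. */
enum item_type {
	ITEM_END = 0,
	ITEM_VOID,
	ITEM_ETH,
	ITEM_ANY_GUEST_PORTS, /* sketch of a new item unknown to this driver */
};

struct item {
	enum item_type type;
};

/*
 * Walk an END-terminated pattern the way a typical driver parser might.
 * Returns 0 if every item is supported, -ENOTSUP otherwise.
 */
static int
parse_pattern(const struct item *pattern)
{
	const struct item *it;

	assert(pattern != NULL);

	for (it = pattern; it->type != ITEM_END; it++) {
		switch (it->type) {
		case ITEM_VOID:
			break;
		case ITEM_ETH:
			/* real parsing of spec / mask would go here */
			break;
		default:
			/* unknown item: rejected with no driver update */
			return -ENOTSUP;
		}
	}
	return 0;
}
```

With this shape, a pattern carrying ITEM_ANY_GUEST_PORTS is turned down by a driver that was written before the item existed.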
> This new field targets async API which was added recently. No impact on sync API.
Rongwei, I see your point. The problem with it, however, is that even
if you describe it in comments, the code won't prevent non-sync API
from seeing this attribute in "struct rte_flow_attr".
As I say, "struct rte_flow_attr" has been here for ages.
When one adds a flow rule in a sync way, they fill out
the very same structure. And the user may set this new
argument to non-zero by mistake. Yes, you may argue
that the app developer should be smart enough to
read your comment before the struct member which
says that this field is for a-sync only. Right.
But that's not the only scenario. The field may
become non-zero because of some other mistake in
the program which, for example, leads to the
struct memory being corrupted in one way or
another. That's why the PMD has to validate flow rules...
I am very confused.
If the memory is corrupted or set mistakenly, we should fix it.
Memory corruption leads to unpredictable errors, especially under multi-threading.
So, the PMD must detect this inconsistency somehow and throw an error.
With your approach (attribute), the PMDs have to be updated to have
these checks. With the item approach that I suggest, updating the
PMDs is obviously not needed. Am I missing something? Let's discuss.
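A minimal sketch of the attribute side of this argument, under stated assumptions: struct flow_attr below is a hypothetical mirror of the layout proposed in the patch (it is not the real struct rte_flow_attr), and both validate functions are invented for illustration. The point is that a driver written before the field existed only checks "reserved", which no longer covers the two new bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the attribute layout proposed in the patch. */
struct flow_attr {
	uint32_t group;
	uint32_t priority;
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
	uint32_t transfer_mode:2; /* new field, carved out of "reserved" */
	uint32_t reserved:27;
};

/* A driver written before the patch: it knows nothing of transfer_mode. */
static int
validate_attr_legacy(const struct flow_attr *attr)
{
	assert(attr != NULL);

	if (attr->reserved != 0)
		return -EINVAL;
	return 0; /* a non-zero transfer_mode is silently accepted */
}

/* The same driver after being updated to reject what it cannot honour. */
static int
validate_attr_updated(const struct flow_attr *attr)
{
	assert(attr != NULL);

	if (attr->reserved != 0)
		return -EINVAL;
	if (attr->transfer_mode != 0)
		return -ENOTSUP;
	return 0;
}
```

The legacy check passes a rule with transfer_mode set to 2, so the validation result is only reliable once the extra check is added.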
I am afraid the PMD still needs to check for pattern conflicts between "IS_ANY_***PORTS" and "port_id"/"represented_port".
May I know if the attribute bits are fully occupied on your side for some special purpose?
> I don't predict any effort on the existing PMD behavior.
I see your point. But how is this expressed in code?
As I explain above, consistency checks are what
flow validate API is for. New argument means
new checks. That's it.
As the commit log says, and as I have mentioned multiple times:
if the user chooses to use an advanced feature, they should read the manual carefully and take responsibility.
> But agree with you: we should emphasize it's only for async mode.
It's better to express this in code. So that the problem (if any)
can be detected programmatically and not just from reading comments.
From my point of view, the easiest way to have this done is to
add items instead of attributes, = no need to update PMDs.
We have readme, code snippet already.
Even if the user sets the attribute in the sync API, nothing should happen since there is no underlayer support (it still behaves like the current TRANSFER domain).
Unless the PMD has a bug, but it is always good to fix bugs, right?
>
>>
>> So, if you introduce two new item types to solve your problem, then you won't
>> have to update existing PMDs. If the vendor wants to support the new items
>> (say, MLX or SFC), they'll update their code to accept the items. But other
>> vendors will not do anything. If the user tries to pass such an item to a vendor
>> which doesn't support the feature, the "default" case will just throw an error.
>>
>> This is what I mean when pointing out such difference between adding an
>> attribute VS adding new item types.
>>
>>>> 5) as in (4), a new attribute is not good from documentation
>>>> standpoint; please search for "represented_port = Y" in
>>>> documentation = this way, all supported items are
>>>> easily defined for various NIC vendors, but the
>>>> same isn't true for attributes = there is no
>>>> way to indicate supported attributes in doc.
>>>>
>>>> If points (1 - 5) make sense to you, then, if I may be so bold, I'd
>>>> like to suggest that the idea of adding a new attribute be abandoned.
>>>> Instead, I'd like to suggest adding new items:
>>>>
>>>> (the names are just sketch, for sure, it should be discussed)
>>>>
>>>> ANY_PHY_PORTS { switch_domain_id }
>>>> = match packets entering the embedded switch from *whatever*
>>>> physical ports belonging to the given switch domain
>>>>
>>> How many PHY_PORTS can one switch have, per your thought? Can I treat
>> the PHY_PORTS as the { switch_domain_id } owner as transfer_proxy()?
>>
>> A single physical NIC board is supposed to have a single embedded switch
>> engine. Hence, if the NIC board has, in example, two or four physical ports,
>> these will be the physical ports of the switch. That's it.
>>
>> As for the transfer proxy, please see my explanations above.
>> It's not *always* reliable to tell whether two given ethdevs belong to the same
>> physical NIC board or not.
>>
>> Switch domain ID is the right criterion (for applications).
>>
>>>> ANY_GUEST_PORTS { switch_domain_id }
>>>> = match packets entering the embedded switch from *whatever*
>>>> guest ports (VFs, PFs, etc.) belonging to the given
>>>> switch domain
>>>>
>>>> The field "switch_domain_id" is required to tell one physical board /
>>>> vendor from another (as I explained in point (3)).
>>>> The application can query this parameter from ethdev's switch info:
>>>> please see "struct rte_eth_switch_info".
>>>>
>>>> What's your opinion?
>>>>
>>> How can we handle ANY_PHY_PORTS/ ANY_GUEST_PORTS ' relationship
>> with REPRESENTED_PORT if conflicts?
>>> Need future tuning.
>>
>> And if you carry on with "vf_orig" / "wire_orig" approach, you will inevitably
>> have the very same problem: possible conflict with items like
>> REPRESENTED_PORT. So does it matter? Yes, checks need to be done by PMDs
>> when parsing patterns.
>>
>>> Like I said before, offloaded rules can't cross different NIC vendors'
>> "switch_domain_id".
>>> If user probes multiple NICs in one application, application should take care
>> of packet forwarding.
>>> Also application should be aware which ports belong to which NICs.
>>
>> Yes, perhaps, domain ID is not needed in the new items.
>> But the application still must keep track of switch domain IDs itself so it knows
>> which rules to manage via which ethdevs.
>>
>> Any other opinions?
> ANY_PHY_PORTS/ ANY_GUEST_PORTS looks like a super set of ports.
So does the new attribute, doesn't it?
> This raises another challenge: "why can't we use REPRESENTED_PORT with mask" or "combine several REPRESENTED_PORT together"?
This problem has been here for many other items, including now deprecated
items PF, VF and PHY_PORT. Yes, theoretically, when the PMD looks through
the pattern, it has to check that its items do not overlap / contradict.
That's kind of OK, isn't it? The PMD has to check things after all...
For example, no one prevents user from submitting a pattern
with several adjacent items ETH in it. The PMD is supposed
to turn such request down.
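As an illustration of such a consistency check, here is a small standalone sketch. The types and the function name (enum pat_item_type, struct pat_item, check_no_adjacent_eth()) are hypothetical simplifications, not DPDK definitions. It rejects two ETH items in a row, treating VOID as transparent, while still allowing a second ETH after other items, as in a tunnel pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical simplified item types; not the real rte_flow enum. */
enum pat_item_type {
	PAT_ITEM_END = 0,
	PAT_ITEM_VOID,
	PAT_ITEM_ETH,
	PAT_ITEM_IPV4,
};

struct pat_item {
	enum pat_item_type type;
};

/*
 * Reject patterns where two ETH items follow each other directly.
 * VOID items are skipped, so ETH / VOID / ETH is rejected as well.
 * Returns 0 on success, -EINVAL on a contradictory pattern.
 */
static int
check_no_adjacent_eth(const struct pat_item *pattern)
{
	enum pat_item_type prev = PAT_ITEM_END; /* nothing seen yet */
	const struct pat_item *it;

	assert(pattern != NULL);

	for (it = pattern; it->type != PAT_ITEM_END; it++) {
		if (it->type == PAT_ITEM_VOID)
			continue;
		if (it->type == PAT_ITEM_ETH && prev == PAT_ITEM_ETH)
			return -EINVAL;
		prev = it->type;
	}
	return 0;
}
```

A check between a port-group item and REPRESENTED_PORT could follow the same shape: remember what has been seen so far and refuse the combination that cannot be satisfied.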
>>
>>>>>
>>>>>>>
>>>>>>>> For example, the diff below adds the attributes to "table"
>>>>>>>> commands in testpmd but does not add them to regular (non-table)
>>>>>>>> commands like "flow create". Why?
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> "table" command limits pattern_template to single direction or
>>>>>>> bidirection
>>>>>> per user specified attribute.
>>>>>>
>>>>>> As I say above, the same effect can be achieved by adding item
>>>>>> REPRESENTED_PORT to the corresponding pattern template.
>>>>> See above.
>>>>>>
>>>>>>> "rule" command must tight with one "table_id", so the rule will
>>>>>>> inherit the
>>>>>> "table" direction property, no need to specify again.
>>>>>>
>>>>>> You might've misunderstood. I do not talk about "rule" command
>>>>>> coupled with some "table". What I talk about is regular, NON-async
>>>>>> flow insertion commands.
>>>>>>
>>>>>> Please take a look at section "/* Validate/create attributes. */"
>>>>>> in file "app/test-pmd/cmdline_flow.c". When one adds a new flow
>>>>>> attribute, they should reflect it the same way as VC_INGRESS,
>>>> VC_TRANSFER, etc.
>>>>>>
>>>>>> That's it.
>>>>> We don't intend to pass this to sync API. The above code example is
>>>>> for sync
>>>> API.
>>>>
>>>> So I understand. But there's one slight problem: in your patch, you
>>>> add the new attributes to the structure which is *shared* between
>>>> sync and async use case scenarios. If one adds an attribute to this
>>>> structure, they have to provide accessors for it in all sync-related
>>>> commands in testpmd, but your patch does not do that.
>>>>
>>> Like the title said, "creating transfer table" is the ASYNC operation.
>>> We have limited the scope of this patch. Sync API will be another story.
>>> Maybe we can add one more sentence to emphasize async API again.
>>
>> No-no-no. There might be slight misunderstanding. I understand that you are
>> limiting the scope of your patch by saying this and this.
>> That's OK. What I'm trying to point out is the fact that your patch nevertheless
>> touches the COMMON part of the flow API which is shared between two
>> approaches (sync and async).
> Yeah, you are right, we should emphasize it for async API not sync in the code and comments.
>>
>> Imagine a reader that does not know anything about the async approach.
>> He just opens the file in vim and goes directly to struct rte_flow_attr.
>> And, over there, he sees the new attribute "wire_orig". He then immediately
>> assumes that these attributes can be used in testpmd. Now the reader opens
>> testpmd and tries to insert a flow rule using the sync approach:
>>
>> flow create priority 0 transfer vf_orig pattern / ... / end actions drop
>>
>
> This is a wrong statement.
> If the user has no idea about cmdline usage, he should rely on "tab indication", not on guessing.
>
> The command prefix "flow" is now bifurcated into sync and async, and the user may use any keyword combination.
> He will get "argument error" if it's not good unless he knows what he is doing.
> Again: we should emphasize it's for async API only.
OK, even if this example is not good enough, I still believe that
it is not right to introduce new match criteria in the form of
rule attributes. Match criteria belong in the pattern.
>
>> And doing so will be a failure, because your patch does not add the new
>> attribute keyword to sync flow rule syntax parser. That's it.
>>
>> Once again, I should emphasize: the reader MAY know nothing about the async
>> approach. But if the attribute is present in "struct rte_flow_attr", it
>> immediately means that it is available everywhere. Both sync and async.
>>
>> So, with this in mind, your attempt to limit the scope of the patch to async-only
>> rules looks a little bit artificial. It's not correct from the *formal* standpoint.
>>
>>>
>>>> In other words, it is wrong to assume that "struct rte_flow_attr"
>>>> only applies to async approach. It had been introduced long before
>>>> the async flow design was added to DPDK. That's it.
>>>>
>>>>>>
>>>>>> But, as I say, I still believe that the new attributes aren't needed.
>>>>> I think we are not on the same page for now. Can we reach agreement
>>>>> on the same matching criteria first?
>>>>>>>
>>>>>>>>> It helps to save underlayer memory also on insertion rate.
>>>>>>>>
>>>>>>>> Which memory? Host memory? NIC memory? Term "underlayer" is
>>>> vague.
>>>>>>>> I suggest that the commit message be revised to first explain how
>>>>>>>> such memory is spent currently, then explain why this is not
>>>>>>>> optimal and, finally, which way the patch is supposed to improve
>>>>>>>> that. I.e. be more
>>>>>> specific.
>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> For large scalable rules, HW (depends on implementation) always
>>>>>>> needs
>>>>>> memory to hold the rules' patterns and actions, either from NIC or
>>>>>> from
>>>> host.
>>>>>>> The memory footprint highly depends on "user rules' complexity",
>>>>>>> also diff
>>>>>> between NICs.
>>>>>>> ~50% memory saving is expected if one-direction is cut.
>>>>>>
>>>>>> Regardless of this talk, this explanation should probably be
>>>>>> present in the commit description.
>>>>>>
>>>>> This number may differ with different NICs or implementation. We
>>>>> can't say
>>>> it for sure.
>>>>
>>>> Not an exact number, of course, but a brief explanation of:
>>>> a) what is wrong / not optimal in the current design;
>>> Please check the commit log: transfer has the capability to match
>> bi-direction traffic no matter what ports.
>>>> b) how it is observed in customer deployments;
>>> Customers have requirements to save resources, and their offloaded rules
>> are direction aware.
>>>> c) why the proposed patch is a good solution.
>>> New attributes provide the way to remove one direction and save underlayer
>> resource.
>>> All of the above can be found in the commit log.
>>
>> I understand all of that, but my point is, the existing commit message is way
>> too brief. Yes, it mentions that SOME customers have SOME deployments, but
>> it does not shed light on which specifics these deployments have. For example,
>> back in the day, when items PORT_REPRESENTOR and REPRESENTED_PORT
>> were added, the cover letter for that patch series provided details of
>> deployment specifics (application: OvS, scenario: full offload rules).
>>
>> So, it's always better to expand on such specifics so that the reader has full
>> picture in their head and doesn't need to look elsewhere.
>> Not all readers of the commit message will be happy to delve into our
>> discussions on the mailing list to get the gist.
>>
> The approaches diverge. The pattern item approach will attract another discussion thread, right?
As I said, match criteria belong in flow pattern. I recognise the
importance of the problem that you're looking to solve. It's very
good that you care to address it, but what this patch tries to do
is to add more match criteria in the form of new attributes with
rather questionable names... There's room for improvement.
When I say that new features should not confuse readers, I mean
a very basic thing: readers know that match criteria all sit
in the pattern. And they refer to the pattern item enum in
the code and in documentation to learn about criteria,
while "struct rte_flow_attr" is an unusual place from
which to learn about match criteria.
> We should get a conclusion and reflect in the commit changes&logs, and it's easy for others to absorb.
Yes, but before we get to that, perhaps it pays to hear
more feedback from other reviewers. Thomas? Ori? Andrew?
>>>
>>>>
>>>
>>>>>>>
>>>>>>>>> By default, the transfer domain is bi-direction, and no behavior
>> changes.
>>>>>>>>>
>>>>>>>>> 1. Match wire origin only
>>>>>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>>>>>>> 2. Match vf origin only
>>>>>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>>>>>>>
>>>>>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>>>>>>>> ---
>>>>>>>>> app/test-pmd/cmdline_flow.c | 26
>> +++++++++++++++++++++
>>>>>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>>>>>>>> lib/ethdev/rte_flow.h | 9 ++++++-
>>>>>>>>> 3 files changed, 36 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/app/test-pmd/cmdline_flow.c
>>>>>>>>> b/app/test-pmd/cmdline_flow.c index 7f50028eb7..b25b595e82
>>>>>>>>> 100644
>>>>>>>>> --- a/app/test-pmd/cmdline_flow.c
>>>>>>>>> +++ b/app/test-pmd/cmdline_flow.c
>>>>>>>>> @@ -177,6 +177,8 @@ enum index {
>>>>>>>>> TABLE_INGRESS,
>>>>>>>>> TABLE_EGRESS,
>>>>>>>>> TABLE_TRANSFER,
>>>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>>>> TABLE_RULES_NUMBER,
>>>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>>>> @@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] =
>> {
>>>>>>>>> TABLE_INGRESS,
>>>>>>>>> TABLE_EGRESS,
>>>>>>>>> TABLE_TRANSFER,
>>>>>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>>>>>> + TABLE_TRANSFER_VF_ORIG,
>>>>>>>>> TABLE_RULES_NUMBER,
>>>>>>>>> TABLE_PATTERN_TEMPLATE,
>>>>>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>>>>>> @@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
>>>>>>>>> .next = NEXT(next_table_attr),
>>>>>>>>> .call = parse_table,
>>>>>>>>> },
>>>>>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>>>>>>>> + .name = "wire_orig",
>>>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>>>
>>>>>>>> This does not explain the "wire" aspect. It's too broad.
>>>>>>>>
>>>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>>>> + .call = parse_table,
>>>>>>>>> + },
>>>>>>>>> + [TABLE_TRANSFER_VF_ORIG] = {
>>>>>>>>> + .name = "vf_orig",
>>>>>>>>> + .help = "affect rule direction to transfer",
>>>>>>>>
>>>>>>>> This explanation simply duplicates that of the "wire_orig".
>>>>>>>> It does not explain the "vf" part. Should be more specific.
>>>>>>>>
>>>>>>>>> + .next = NEXT(next_table_attr),
>>>>>>>>> + .call = parse_table,
>>>>>>>>> + },
>>>>>>>>> [TABLE_RULES_NUMBER] = {
>>>>>>>>> .name = "rules_number",
>>>>>>>>> .help = "number of rules in table", @@ -8894,6
>>>>>>>>> +8910,16 @@ parse_table(struct context *ctx, const struct token
>>>>>>>>> +*token,
>>>>>>>>> case TABLE_TRANSFER:
>>>>>>>>> out->args.table.attr.flow_attr.transfer = 1;
>>>>>>>>> return len;
>>>>>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>>>> + return -1;
>>>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 1;
>>>>>>>>> + return len;
>>>>>>>>> + case TABLE_TRANSFER_VF_ORIG:
>>>>>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>>>>>> + return -1;
>>>>>>>>> + out->args.table.attr.flow_attr.transfer_mode = 2;
>>>>>>>>> + return len;
>>>>>>>>> default:
>>>>>>>>> return -1;
>>>>>>>>> }
>>>>>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> index 330e34427d..603b7988dd 100644
>>>>>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>>>>>> @@ -3332,7 +3332,8 @@ It is bound to
>>>>>>>> ``rte_flow_template_table_create()``::
>>>>>>>>>
>>>>>>>>> flow template_table {port_id} create
>>>>>>>>> [table_id {id}] [group {group_id}]
>>>>>>>>> - [priority {level}] [ingress] [egress] [transfer]
>>>>>>>>> + [priority {level}] [ingress] [egress]
>>>>>>>>> + [transfer [vf_orig] [wire_orig]]
>>>>>>>>
>>>>>>>> Is it correct? Shouldn't it rather be [transfer] [vf_orig]
>>>>>>>> [wire_orig] ?
>>>>>>>>
>>>>>>>>> rules_number {number}
>>>>>>>>> pattern_template {pattern_template_id}
>>>>>>>>> actions_template {actions_template_id} diff --git
>>>>>>>>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
>>>>>>>>> a79f1e7ef0..512b08d817 100644
>>>>>>>>> --- a/lib/ethdev/rte_flow.h
>>>>>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>>>>>> @@ -130,7 +130,14 @@ struct rte_flow_attr {
>>>>>>>>> * through a suitable port. @see rte_flow_pick_transfer_proxy().
>>>>>>>>> */
>>>>>>>>> uint32_t transfer:1;
>>>>>>>>> - uint32_t reserved:29; /**< Reserved, must be zero. */
>>>>>>>>> + /**
>>>>>>>>> + * 0 means bidirection,
>>>>>>>>> + * 0x1 origin uplink,
>>>>>>>>
>>>>>>>> What does "uplink" mean? It's too vague. Hardly a good term.
>>>>
>>>> I believe this comment should be reworked, in case the idea of having
>>>> an extra attribute persists.
>>>>
>>>>>>>>
>>>>>>>>> + * 0x2 origin vport,
>>>>>>>>
>>>>>>>> What does "origin vport" mean? Hardly a good term as well.
>>>>
>>>> I still believe this explanation is way too brief and needs to be
>>>> reworked to provide more details, to define the use case for the attribute
>> more specifically.
>>>>
>>>>>>>>
>>>>>>>>> + * N/A both set.
>>>>>>>>
>>>>>>>> What's this?
>>>>
>>>> The question stands.
>>>>
>>>>>>>>
>>>>>>>>> + */
>>>>>>>>> + uint32_t transfer_mode:2;
>>>>>>>>> + uint32_t reserved:27; /**< Reserved, must be zero. */
>>>>>>>>> };
>>>>>>>>>
>>>>>>>>> /**
>>>>>>>>> --
>>>>>>>>> 2.27.0
>>>>>>>>>
>>>>>>>>
>>>>>>>> Since the attributes are added to generic 'struct rte_flow_attr',
>>>>>>>> non-table
>>>>>>>> (synchronous) flow rules are supposed to support them, too. If
>>>>>>>> that is indeed the case, then I'm afraid such proposal does not
>>>>>>>> agree with the existing items PORT_REPRESENTOR and
>> REPRESENTED_PORT.
>>>> They
>>>>>>>> do exactly the same thing, but they are designed to be way more
>>>>>>>> generic. Why
>>>>>> not use them?
>>>>>>
>>>>>> The question stands.
>>>>>>
>>>>>>>>
>>>>>>>> Ivan
>>>>>>>
>>>>>>
>>>>>> Ivan
>>>>>
>>>
>>
>> Thank you.
>
Thanks,
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 8:18 ` Thomas Monjalon
@ 2022-09-15 9:42 ` Ivan Malov
0 siblings, 0 replies; 96+ messages in thread
From: Ivan Malov @ 2022-09-15 9:42 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Ori Kam, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Thomas,
On Thu, 15 Sep 2022, Thomas Monjalon wrote:
> 15/09/2022 09:47, Ivan Malov:
>> As I said, match criteria belong in flow pattern. I recognise the
>> importance of the problem that you're looking to solve. It's very
>> good that you care to address it, but what this patch tries to do
>> is to add more match criteria in the form of new attributes with
>> rather questionable names... There's room for improvement.
>>
>> When I say that new features should not confuse readers, I mean
>> a very basic thing: readers know that match criteria all sit
>> in the pattern. And they refer to the pattern item enum in
>> the code and in documentation to learn about criteria,
>> while "struct rte_flow_attr" is an unusual place from
>> which to learn about match criteria.
>>
>>> We should reach a conclusion and reflect it in the commit changes and logs, so that it's easy for others to absorb.
>>
>> Yes, but before we get to that, perhaps it pays to hear
>> more feedback from other reviewers. Thomas? Ori? Andrew?
>
> Sorry I did not read all.
OK, I will attempt to summarise it to some extent
in my next response to Rongwei which is to follow.
> I think the main question is about the use of attributes.
> I refer to this commit of Ivan last year which was agreed:
>
> ethdev: deprecate direction attributes in transfer flows
>
> Attributes "ingress" and "egress" can only apply unambiguously
> to non-"transfer" flows. In "transfer" flows, the standpoint
> is effectively shifted to the embedded switch. There can be
> many different endpoints connected to the switch, so the
> use of "ingress" / "egress" does not shed light on which
> endpoints precisely can be considered as traffic sources.
>
> Add relevant deprecation notices and suggest the use of precise
> traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT).
>
> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
> Acked-by: Ori Kam <orika@nvidia.com>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
>
> So +1 for using only pattern items as matching criteria.
Thank you.
>
>
>
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 8:48 ` Rongwei Liu
@ 2022-09-15 10:59 ` Ivan Malov
2022-09-15 11:16 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-15 10:59 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
In this reply, I do not include the previous mail because the amount
of inline commentary has gone haywire over the past couple of days.
Let's re-iterate.
But before I get to that, I'd like to offer a fresh perspective:
Perhaps, if we all agree that term "vport" means an endpoint which
can stand for any "port" except for physical one, then it should
be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
But that's tricky, of course. I don't have a way with naming,
so more opinions are welcome and very-very desirable here.
So:
1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
primitives are in fact additional match criteria?
..
To me, it looks so. If they are match criteria, then they belong
in match pattern, that is, they should be expressed as new items.
For "transfer" rules, the *existing* attributes are: "group"
and "priority". As you may note, these are clearly not match
criteria. They control the look-up order. So, to this day,
there're no match criteria in DPDK expressed as attributes.
If these "wire_orig" / "vf_orig" are going to be introduced
as attributes, that should be backed with strong motivation.
2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
won't do? Or, which problems do you think they may inflict?
..
Previously, you explained why REPRESENTED_PORT would not
fit your needs. And I understand your point: to async API,
two pattern templates which both have item REPRESENTED_PORT
in them cannot be clearly distinguished and are in fact the
same set of criteria (provided that all other items are also
the same and have the same masks). Templates are, well,
templates (or shapes) of the rules to come later and
do not include exact "spec" for the "ethdev_id".
Got it.
But that's not going to be the case with items ANY_PHYS_PORTS and
ANY_VPORTS, is it? In one async table template, the user submits
item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
In another template, the user submits item ANY_VPORTS to
state that they want to match only traffic transmitted by
software endpoints (DPDK ethdevs, guest VFs, etc.)
connected to the switch.
In this example, the PMD will clearly see that the two templates
differ. So it will be able to allocate separate resources, each
one "cutting one half of traffic" (as per your concept).
3) In your most recent response, you suggested that one might have
had the attributes occupied for some other purposes. To me,
they're not. Neither I nor my closest colleagues have
any plans on them. When I advocate using item approach
over the attribute approach, I do this to ensure
a) clarity of the API contract and b) robustness.
4) Also, in your response, you suggested that I might have
confused item mask and spec. That is not the case.
If we agree that the switch domain ID is unneeded in
the new items, then these items will have no
fields in them (just as item PF had none
before it was deprecated).
No fields in new items => no field masks.
So what's the problem then?
5) With regard to our talk about identifying the relationship
between ethdevs and switch domains, you said that the user
could know the difference from the very beginning:
/sysfs/ .... /PF_BDF/sriov_num
That is true for the user who starts the application, but
this knowledge is hard to obtain from the application
perspective = it's hard to automate.
This is why ethdevs are able to advertise their domain IDs.
And, as I explained, looking at domain ID to understand
port relationship is valid, whilst looking at proxy IDs
to achieve the same goal is not. Proxy port IDs only
serve the purpose of finding an entry point for
managing flows. That has slightly different
meaning, but this subtle difference is important.
6) As for the confusion over the difference between fixing
bugs and making the code robust by extra checks:
Yes, I agree that the programmer who writes the
application must be intelligent enough to use
flow primitives the proper way. Yes, the user
who starts the application also should tread
carefully. But that does not prevent some
mistakes in other parts of code from
corrupting various chunks of memory,
including, for example, flow attrs.
You say that such mistakes have to be "just fixed"
as any other bugs. Right. But how much time will
the programmer spend to identify the bugs?
If the PMDs do all the checks (as with attributes),
the hypothetical bug will manifest itself much
earlier. That will simplify debugging by a lot...
So, my point is that it's still better to ensure
that new flow primitives have all necessary
checks in place. For attributes, it is
required to add them separately.
For items, as I explained, it might not be necessary
in the majority of cases simply because of the
switch (item->type) { case } structure.
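The argument in point 6 can be illustrated with a minimal C sketch. It is
only an illustration under stated assumptions: every sketch_* name below is
hypothetical and none of them is a DPDK definition, and the bit-field merely
mirrors the transfer_mode layout proposed in the patch. It shows a parser
written before the new primitives existed: an unknown pattern item falls into
the default case and is rejected, whereas an unknown attribute bit goes
unnoticed unless a check is added explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for pattern item types; NOT the rte_flow enum. */
enum sketch_item_type {
	SKETCH_ITEM_END = 0,
	SKETCH_ITEM_ETH,
	SKETCH_ITEM_IPV4,
	SKETCH_ITEM_ANY_PHYS_PORTS, /* only proposed in this thread */
	SKETCH_ITEM_ANY_VPORTS,     /* only proposed in this thread */
};

struct sketch_item {
	enum sketch_item_type type;
};

/* Hypothetical attribute struct mirroring the bit-field from the patch. */
struct sketch_attr {
	uint32_t transfer:1;
	uint32_t transfer_mode:2; /* 0 = both directions, 1 and 2 = one side */
	uint32_t reserved:29;
};

/* Pattern parser of a PMD that only knows about ETH and IPV4. */
int
sketch_old_pmd_parse_pattern(const struct sketch_item *items)
{
	assert(items != NULL);
	for (; items->type != SKETCH_ITEM_END; items++) {
		switch (items->type) {
		case SKETCH_ITEM_ETH:
		case SKETCH_ITEM_IPV4:
			break;
		default:
			/* Items the PMD has never heard of end up here. */
			return -ENOTSUP;
		}
	}
	return 0;
}

/* Attribute parser of the same PMD: it never looks at transfer_mode. */
int
sketch_old_pmd_parse_attr(const struct sketch_attr *attr)
{
	assert(attr != NULL);
	return attr->transfer ? 0 : -EINVAL;
}

/* The extra check each PMD would need in order to reject the new bits. */
int
sketch_checked_parse_attr(const struct sketch_attr *attr)
{
	assert(attr != NULL);
	if (attr->transfer_mode != 0)
		return -ENOTSUP;
	return sketch_old_pmd_parse_attr(attr);
}
```

With a pattern that starts with SKETCH_ITEM_ANY_VPORTS, the old pattern parser
returns -ENOTSUP without any change to its code. With transfer_mode set to 1,
the old attribute parser still returns 0, and only the variant with the added
check rejects the request.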
So, these are some of my points to explain why the
attribute approach is untenable. To me, attributes
are something global, which demands checks in all
flow-capable PMDs. Items seem better because they
are don't cares to all PMDs which are unaware of
the async concept. So, even if someone does not
implement the async concept or does not like
the new item names, they can turn a blind
eye to this - with attributes, they can't.
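Point 2 above can also be sketched in C. This is a rough illustration, not
PMD code: the sketch_tmpl_* names are hypothetical, ANY_PHYS_PORTS and
ANY_VPORTS are only proposed in this thread, and the single "mask" word is an
assumed simplification of per-item masks. A template carries item types and
masks but no "spec", so two templates that both use REPRESENTED_PORT compare
as equal, while templates that use the two proposed items do not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; the last two exist only as a proposal. */
enum sketch_tmpl_type {
	SKETCH_TMPL_END = 0,
	SKETCH_TMPL_ETH,
	SKETCH_TMPL_REPRESENTED_PORT,
	SKETCH_TMPL_ANY_PHYS_PORTS,
	SKETCH_TMPL_ANY_VPORTS,
};

/* One template entry: item type plus mask; the spec comes later, per rule. */
struct sketch_tmpl_item {
	enum sketch_tmpl_type type;
	uint32_t mask; /* assumed simplification of the real per-field masks */
};

/* Compare two END-terminated templates item by item. */
bool
sketch_templates_equal(const struct sketch_tmpl_item *a,
		       const struct sketch_tmpl_item *b)
{
	assert(a != NULL && b != NULL);
	for (;; a++, b++) {
		if (a->type != b->type || a->mask != b->mask)
			return false;
		if (a->type == SKETCH_TMPL_END)
			return true;
	}
}
```

In this model, a driver that keys its resource allocation on the template
contents can tell a "wire only" template from a "vports only" one, which is
not possible when both templates contain the same REPRESENTED_PORT item with
the same mask.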
Thank you.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 10:59 ` Ivan Malov
@ 2022-09-15 11:16 ` Thomas Monjalon
2022-09-20 9:41 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-09-15 11:16 UTC (permalink / raw)
To: Rongwei Liu, Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam, Aman Singh, Yuying Zhang,
Andrew Rybchenko, dev, Raslan Darawsheh
15/09/2022 12:59, Ivan Malov:
> Hi Rongwei,
>
> In this reply, I do not include the previous mail because the amount
> of inline commentary has gone haywire over the past couple of days.
> Let's re-iterate.
>
> But before I get to that, I'd like to offer a fresh perspective:
>
> Perhaps, if we all agree that term "vport" means an endpoint which
> can stand for any "port" except for physical one, then it should
> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
The opposite of "physical" is "virtual" indeed.
> But that's tricky, of course. I don't have a way with naming,
> so more opinions are welcome and very-very desirable here.
>
> So:
>
> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
> primitives are in fact yet another match criteria?
>
> ..
>
> To me, it looks so. If they are match criteria, then they belong
> in match pattern, that is, they should be expressed as new items.
>
> For "transfer" rules, the *existing* attributes are: "group"
> and "priority". As you may note, these are clearly not match
> criteria. They control the look-up order. So, to this day,
> there're no match criteria in DPDK expressed as attributes.
>
> If these "wire_orig" / "vf_orig" are going to be introduced
> as attributes, that should be backed with strong motivation.
I prefer we keep matching in a single place, not in attributes.
> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
> won't do? Or, which problems do you think they may inflict?
>
> ..
>
> Previously, you explained why REPRESENTED_PORT would not
> fit your needs. And I understand your point: to async API,
> two pattern templates which both have item REPRESENTED_PORT
> in them cannot be clearly distinguished and are in fact the
> same set of criteria (provided that all other items are also
> the same and have the same masks). Templates are, well,
> templates (or shapes) of the rules to come later and
> do not include exact "spec" for the "ethdev_id".
> Got it.
>
> But that's not going to be the case with items ANY_PHYS_PORTS and
> ANY_VPORTS, is it? In one async table template, the user submits
> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
> In another template, the user submits item ANY_VPORTS to
> state that they want to match only traffic transmitted by
> software endpoints (DPDK ethdevs, guest VFs, etc.)
> connected to the switch.
>
> In this example, the PMD will clearly see that the two templates
> differ. So it will be able to allocate separate resources, each
> one "cutting one half of traffic" (as per your concept).
>
> 3) In your most recent response, you suggested that one might have
> had the attributes occupied for some other purposes. To me,
> they're not. Neither me nor my closest colleagues have
> any plans on them. When I advocate using item approach
> over the attribute approach, I do this to ensure
> a) clarity of the API contract and b) robustness.
>
> 4) Also, in your response, you suggested that I might have
> confused item mask and spec. That is not the case.
> If we agree, that switch domain ID is unneeded in
> the new items, then these items will have no
> fields in them (like item PF had not had any
> before it was deprecated).
>
> No fields in new items => no field masks.
> So what's the problem then?
>
> 5) With regard to our talk about identifying the relationship
> between ethdevs and switch domains, you said that the user
> could know the difference from the very beginning:
> /sysfs/ .... /PF_BDF/sriov_num
>
> That is true for the user who starts the application, but
> this knowledge is hard to obtain from the application
> perspective = it's hard to automate.
>
> This is why ethdevs are able to advertise their domain IDs.
> And, as I explained, looking at domain ID to understand
namely rte_eth_dev_info.switch_info.domain_id
> port relationship is valid, whilst looking at proxy IDs
> to achieve the same goal is not. Proxy port IDs only
> serve the purpose of finding an entry point for
> managing flows. That has slightly different
> meaning, but this subtle difference is important.
There is also a concept of sibling ports
to get all ports belonging to the same hardware.
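A small C sketch of grouping ports by switch domain, for illustration only.
The sketch_port struct and the helper are hypothetical; in a real application
the domain_id value would be read from rte_eth_dev_info.switch_info.domain_id
after rte_eth_dev_info_get(). The "invalid" constant is assumed to mirror
RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID. */
#define SKETCH_DOMAIN_ID_INVALID UINT16_MAX

/* Hypothetical snapshot of what an application learned about one ethdev. */
struct sketch_port {
	uint16_t port_id;
	uint16_t domain_id; /* copied from switch_info.domain_id */
};

/*
 * Collect the IDs of all ports that belong to the given switch domain.
 * Returns the number of matching ports; at most out_cap IDs are stored.
 */
size_t
sketch_ports_in_domain(const struct sketch_port *ports, size_t n,
		       uint16_t domain_id, uint16_t *out, size_t out_cap)
{
	size_t found = 0;

	assert(ports != NULL || n == 0);
	if (domain_id == SKETCH_DOMAIN_ID_INVALID)
		return 0; /* port is not part of any switch domain */
	for (size_t i = 0; i < n; i++) {
		if (ports[i].domain_id != domain_id)
			continue;
		if (out != NULL && found < out_cap)
			out[found] = ports[i].port_id;
		found++;
	}
	return found;
}
```

The helper compares domain IDs only; it deliberately does not look at
transfer proxy port IDs, in line with the distinction made in point 5.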
> 6) As for the confusion over the difference between fixing
> bugs and making the code robust by extra checks:
>
> Yes, I agree that the programmer who writes the
> application must be intelligent enough to use
> flow primitives the proper way. Yes, the user
> who starts the application also should tread
> carefully. But that does not prevent some
> mistakes in other parts of code from
> corrupting various chunks of memory,
> including, for example, flow attrs.
>
> You say that such mistakes have to be "just fixed"
> as any other bugs. Right. But how much time will
> the programmer spend to identify the bugs?
>
> If the PMDs do all the checks (as with attributes),
> the hypothetical bug will manifest itself much
> earlier. That will simplify debugging by a lot...
>
> So, my point is that it's still better to ensure
> that new flow primitives have all necessary
> checks in place. For attributes, it is
> required to add them separately.
If flow insertion is done in a fast path,
such checks may be skipped.
> For items, as I explained, it might not be necessary
> in the majority of cases simply because of the
> switch (item->type) { case } structure.
>
> So, these are some of my points to explain why the
> attribute approach is untenable. To me, attributes
> are something global, which demands checks in all
> flow-capable PMDs. Items seem better because they
> are don't cares to all PMDs which are unaware of
> the async concept. So, even if someone does not
> implement the async concept or does not like
> the new item names, they can turn a blind
> eye to this - with attributes, they can't.
>
> Thank you.
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-15 11:16 ` Thomas Monjalon
@ 2022-09-20 9:41 ` Ori Kam
2022-09-20 12:45 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Ori Kam @ 2022-09-20 9:41 UTC (permalink / raw)
To: NBU-Contact-Thomas Monjalon (EXTERNAL), Rongwei Liu, Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Aman Singh, Yuying Zhang,
Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ivan, Thomas and Rongwei
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, 15 September 2022 14:16
>
> 15/09/2022 12:59, Ivan Malov:
> > Hi Rongwei,
> >
> > In this reply, I do not include the previous mail because the amount
> > of inline commentary has gone haywire over the past couple of days.
> > Let's re-iterate.
> >
> > But before I get to that, I'd like to offer a fresh perspective:
> >
> > Perhaps, if we all agree that term "vport" means an endpoint which
> > can stand for any "port" except for physical one, then it should
> > be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
>
> The opposite of "physical" is "virtual" indeed.
>
> > But that's tricky, of course. I don't have a way with naming,
> > so more opinions are welcome and very-very desirable here.
> >
> > So:
> >
> > 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
> > primitives are in fact yet another match criteria?
> >
> > ..
> >
> > To me, it looks so. If they are match criteria, then they belong
> > in match pattern, that is, they should be expressed as new items.
> >
> > For "transfer" rules, the *existing* attributes are: "group"
> > and "priority". As you may note, these are clearly not match
> > criteria. They control the look-up order. So, to this day,
> > there're no match criteria in DPDK expressed as attributes.
> >
> > If these "wire_orig" / "vf_orig" are going to be introduced
> > as attributes, that should be backed with strong motivation.
>
> I prefer we keep matching in a single place, not in attributes.
>
I think we are talking about two different features.
Feature 1:
Allow matching on all vports that are not wire
Feature 2:
Save allocation space and allow fast insertion.
In this case, the matching is not on all vports; it can cover just some of the vports,
but it will never include the wire port.
For example:
port 0 - wire
ports 1,2,3,4,5 - vports
the application wants to insert only these rules:
represented_port(port_id=2) / eth / ipv4 (src==xx)
represented_port(port_id=4) / eth / ipv4 (src==xx)
represented_port(port_id=4) / eth / ipv4 (src==yy)
For feature 1 I fully agree with you Ivan, this should be added as an item.
For feature 2 I think Rongwei's suggestion is the better option.
If I understand correctly, the idea is to give a hint to the PMD on where to allocate memory
and how to insert the rules most optimally. Since this is shared by all rules, it makes more sense
to add it as an attribute, just like we don’t have an ingress item (maybe we should?)
Ivan, we have the items RTE_FLOW_ITEM_TYPE_PF and RTE_FLOW_ITEM_TYPE_VF, which are deprecated.
Do you want to un-deprecate them?
To summarize, if the PMD can use such a hint during rule creation and save memory, I vote
to allow it.
If the idea is to match on all vports, then it should be an item.
>
> > 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
> > won't do? Or, which problems do you think they may inflict?
> >
> > ..
> >
> > Previously, you explained why REPRESENTED_PORT would not
> > fit your needs. And I understand your point: to async API,
> > two pattern templates which both have item REPRESENTED_PORT
> > in them cannot be clearly distinguished and are in fact the
> > same set of criteria (provided that all other items are also
> > the same and have the same masks). Templates are, well,
> > templates (or shapes) of the rules to come later and
> > do not include exact "spec" for the "ethdev_id".
> > Got it.
> >
> > But that's not going to be the case with items ANY_PHYS_PORTS and
> > ANY_VPORTS, is it? In one async table template, the user submits
> > item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
> > In another template, the user submits item ANY_VPORTS to
> > state that they want to match only traffic transmitted by
> > software endpoints (DPDK ethdevs, guest VFs, etc.)
> > connected to the switch.
> >
> > In this example, the PMD will clearly see that the two templates
> > differ. So it will be able to allocate separate resources, each
> > one "cutting one half of traffic" (as per your concept).
> >
> > 3) In your most recent response, you suggested that one might have
> > had the attributes occupied for some other purposes. To me,
> > they're not. Neither me nor my closest colleagues have
> > any plans on them. When I advocate using item approach
> > over the attribute approach, I do this to ensure
> > a) clarity of the API contract and b) robustness.
If something is shared for all rules in the same table, it should be a table
property.
> >
> > 4) Also, in your response, you suggested that I might have
> > confused item mask and spec. That is not the case.
> > If we agree, that switch domain ID is unneeded in
> > the new items, then these items will have no
> > fields in them (like item PF had not had any
> > before it was deprecated).
> >
> > No fields in new items => no field masks.
> > So what's the problem then?
> >
> > 5) With regard to our talk about identifying the relationship
> > between ethdevs and switch domains, you said that the user
> > could know the difference from the very beginning:
> > /sysfs/ .... /PF_BDF/sriov_num
> >
> > That is true for the user who starts the application, but
> > this knowledge is hard to obtain from the application
> > perspective = it's hard to automate.
> >
> > This is why ethdevs are able to advertise their domain IDs.
> > And, as I explained, looking at domain ID to understand
>
> namely rte_eth_dev_info.switch_info.domain_id
>
> > port relationship is valid, whilst looking at proxy IDs
> > to achieve the same goal is not. Proxy port IDs only
> > serve the purpose of finding an entry point for
> > managing flows. That has slightly different
> > meaning, but this subtle difference is important.
>
> There is also a concept of sibling ports
> to get all ports belonging to the same hardware.
>
>
> > 6) As for the confusion over the difference between fixing
> > bugs and making the code robust by extra checks:
> >
> > Yes, I agree that the programmer who writes the
> > application must be intelligent enough to use
> > flow primitives the proper way. Yes, the user
> > who starts the application also should tread
> > carefully. But that does not prevent some
> > mistakes in other parts of code from
> > corrupting various chunks of memory,
> > including, for example, flow attrs.
> >
> > You say that such mistakes have to be "just fixed"
> > as any other bugs. Right. But how much time will
> > the programmer spend to identify the bugs?
> >
> > If the PMDs do all the checks (as with attributes),
> > the hypothetical bug will manifest itself much
> > earlier. That will simplify debugging by a lot...
> >
> > So, my point is that it's still better to ensure
> > that new flow primitives have all necessary
> > checks in place. For attributes, it is
> > required to add them separately.
>
> If flow insertion is done in a fast path,
> such checks may be skipped.
The idea is that all rules in this table will share the same configuration,
so there is no reason to repeat it for each rule. This is why
the rule attributes were moved to the table struct and are not given per rule.
>
> > For items, as I explained, it might not be necessary
> > in the majority of cases simply because of the
> > switch (item->type) { case } structure.
> >
> > So, these are some of my points to explain why the
> > attribute approach is untenable. To me, attributes
> > are something global, which demands checks in all
> > flow-capable PMDs. Items seem better because they
> > are don't cares to all PMDs which are unaware of
> > the async concept. So, even if someone does not
> > implement the async concept or does not like
> > the new item names, they can turn a blind
> > eye to this - with attributes, they can't.
> >
Good point.
Maybe we should add hints in the attribute,
for example, hint_only_wire. In this case it will be clear that
the PMD may ignore this, and it should be fully documented that this is not a mandatory field.
What do you think?
> > Thank you.
>
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-20 9:41 ` Ori Kam
@ 2022-09-20 12:45 ` Ivan Malov
2022-09-20 13:59 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-20 12:45 UTC (permalink / raw)
To: Ori Kam
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ori,
On Tue, 20 Sep 2022, Ori Kam wrote:
> Hi Ivan, Thomas and Rongwei
>
>> -----Original Message-----
>> From: Thomas Monjalon <thomas@monjalon.net>
>> Sent: Thursday, 15 September 2022 14:16
>>
>> 15/09/2022 12:59, Ivan Malov:
>>> Hi Rongwei,
>>>
>>> In this reply, I do not include the previous mail because the amount
>>> of inline commentary has gone haywire over the past couple of days.
>>> Let's re-iterate.
>>>
>>> But before I get to that, I'd like to offer a fresh perspective:
>>>
>>> Perhaps, if we all agree that term "vport" means an endpoint which
>>> can stand for any "port" except for physical one, then it should
>>> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
>>
>> The opposite of "physical" is "virtual" indeed.
>>
>>> But that's tricky, of course. I don't have a way with naming,
>>> so more opinions are welcome and very-very desirable here.
>>>
>>> So:
>>>
>>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
>>> primitives are in fact yet another match criteria?
>>>
>>> ..
>>>
>>> To me, it looks so. If they are match criteria, then they belong
>>> in match pattern, that is, they should be expressed as new items.
>>>
>>> For "transfer" rules, the *existing* attributes are: "group"
>>> and "priority". As you may note, these are clearly not match
>>> criteria. They control the look-up order. So, to this day,
>>> there're no match criteria in DPDK expressed as attributes.
>>>
>>> If these "wire_orig" / "vf_orig" are going to be introduced
>>> as attributes, that should be backed with strong motivation.
>>
>> I prefer we keep matching in a single place, not in attributes.
>>
>
> I think we are talking about two different features.
> Feature 1:
> Allow matching on all vports that are not wire
> Feature 2:
> Save allocation space and allow fast insertion.
> In this case, the matching is not on all vports it can be just part of the vports
> but it will never be the wire port.
> For example:
> port 0 - wire
> ports 1,2,3,4,5 - vports
> the application wants to insert only these rules:
> represented_port(port_id=2) / eth / ipv4 (src==xx)
> represented_port(port_id=4) / eth / ipv4 (src==xx)
> represented_port(port_id=4) / eth / ipv4 (src==yy)
>
> For feature 1 I fully agree with you Ivan, this should be added as an item.
Thank you.
> For feature 2 I think Rongwei's suggestion is the better option.
> If I understand correctly the idea is to give hint to the PMD on where to allocate memory
> and how to insert the rules most optimally. Since this is shared for all rules it makes more sense
> to add it as an attribute, just like we don’t have an ingress item (maybe we should?)
But isn't pattern template also supposed to be shared for all rules
in the table? I.e., the user creates an async flow table and submits
a flow "shape" (which consists of attrs, pattern template and action
template). So why should "giving a hint" via an item template be
considered worse than doing so via an attribute?
As for "ingress" item, - no, one should not add such. We have had
many discussions concerning this bit in the past. Ingress/egress
are non-transfer terms. They belong in the scope of vNIC / ethdev
filtering, not to embedded switch rules.
In my opinion, in the embedded switch, one should either point to
some precise switch ports (using REPRESENTOR / REPRESENTED items)
or use another kind of item to refer to a "super set" of ports
which have something in common ("all wire ports", "all NON-wire ports").
>
> Ivan we have the item RTE_FLOW_ITEM_TYPE_PF and RTE_FLOW_ITEM_TYPE_VF which are deprecated,
> So do you want to un-deprecate them?
No. These items are deprecated because:
a) their names suggest that application knows whether an ethdev
sits on top of a PF or that the application has some
knowledge of existence of particular VFs, but in
reality applications should not be worried about
the underlying function type = to them, all
ethdevs are just representors of something,
and if the application needs to refer to
VFs (or other PFs, - doesn't matter), it
should do that via REPRESENTOR items;
b) such items would duplicate REPRESENTOR / REPRESENTED.
>
> To summarize, if the PMD can use such a hint during rule creation and save memory, I vote
> to allow it.
> if the idea is to match on all vports then it should be an item.
But such a hint would effectively be a match criterion, too, right?
So, in fact it's a combined use case: a match criterion which is
flexible enough to be a "hint" = i.e. the PMD can see it when
processing the pattern *template* and treat it as a hint.
>
>>
>>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
>>> won't do? Or, which problems do you think they may inflict?
>>>
>>> ..
>>>
>>> Previously, you explained why REPRESENTED_PORT would not
>>> fit your needs. And I understand your point: to async API,
>>> two pattern templates which both have item REPRESENTED_PORT
>>> in them cannot be clearly distinguished and are in fact the
>>> same set of criteria (provided that all other items are also
>>> the same and have the same masks). Templates are, well,
>>> templates (or shapes) of the rules to come later and
>>> do not include exact "spec" for the "ethdev_id".
>>> Got it.
>>>
>>> But that's not going to be the case with items ANY_PHYS_PORTS and
>>> ANY_VPORTS, is it? In one async table template, the user submits
>>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
>>> In another template, the user submits item ANY_VPORTS to
>>> state that they want to match only traffic transmitted by
>>> software endpoints (DPDK ethdevs, guest VFs, etc.)
>>> connected to the switch.
>>>
>>> In this example, the PMD will clearly see that the two templates
>>> differ. So it will be able to allocate separate resources, each
>>> one "cutting one half of traffic" (as per your concept).
>>>
>>> 3) In your most recent response, you suggested that one might have
>>> had the attributes occupied for some other purposes. To me,
>>> they're not. Neither me nor my closest colleagues have
>>> any plans on them. When I advocate using item approach
>>> over the attribute approach, I do this to ensure
>>> a) clarity of the API contract and b) robustness.
>
> If something is shared for all rules in the same table, it should be a table
> property.
But the whole pattern *template* is also a table property, isn't it?
>
>>>
>>> 4) Also, in your response, you suggested that I might have
>>> confused item mask and spec. That is not the case.
>>> If we agree, that switch domain ID is unneeded in
>>> the new items, then these items will have no
>>> fields in them (like item PF had not had any
>>> before it was deprecated).
>>>
>>> No fields in new items => no field masks.
>>> So what's the problem then?
>>>
>>> 5) With regard to our talk about identifying the relationship
>>> between ethdevs and switch domains, you said that the user
>>> could know the difference from the very beginning:
>>> /sysfs/ .... /PF_BDF/sriov_num
>>>
>>> That is true for the user who starts the application, but
>>> this knowledge is hard to obtain from the application
>>> perspective = it's hard to automate.
>>>
>>> This is why ethdevs are able to advertise their domain IDs.
>>> And, as I explained, looking at domain ID to understand
>>
>> namely rte_eth_dev_info.switch_info.domain_id
>>
>>> port relationship is valid, whilst looking at proxy IDs
>>> to achieve the same goal is not. Proxy port IDs only
>>> serve the purpose of finding an entry point for
>>> managing flows. That has slightly different
>>> meaning, but this subtle difference is important.
>>
>> There is also a concept of sibling ports
>> to get all ports belonging to the same hardware.
>>
>>
>>> 6) As for the confusion over the difference between fixing
>>> bugs and making the code robust by extra checks:
>>>
>>> Yes, I agree that the programmer who writes the
>>> application must be intelligent enough to use
>>> flow primitives the proper way. Yes, the user
>>> who starts the application also should tread
>>> carefully. But that does not prevent some
>>> mistakes in other parts of code from
>>> corrupting various chunks of memory,
>>> including, for example, flow attrs.
>>>
>>> You say that such mistakes have to be "just fixed"
>>> as any other bugs. Right. But how much time will
>>> the programmer spend to identify the bugs?
>>>
>>> If the PMDs do all the checks (as with attributes),
>>> the hypothetical bug will manifest itself much
>>> earlier. That will simplify debugging by a lot...
>>>
>>> So, my point is that it's still better to ensure
>>> that new flow primitives have all necessary
>>> checks in place. For attributes, it is
>>> required to add them separately.
>>
>> If flow insertion is done in a fast path,
>> such checks may be skipped.
>
> The idea is that all rules in this table will share the same configuration,
> there is no reason to say everything again for each rule. This is why
> the rule attributes were moved to the table struct and not per rule.
>
>>
>>> For items, as I explained, it might not be necessary
>>> in the majority of cases simply because of the
>>> switch (item->type) { case } structure.
>>>
>>> So, these are some of my points to explain why the
>>> attribute approach is untenable. To me, attributes
>>> are something global, which demands checks in all
>>> flow-capable PMDs. Items seem better because they
>>> are don't cares to all PMDs which are unaware of
>>> the async concept. So, even if someone does not
>>> implement the async concept or does not like
>>> the new item names, they can turn a blind
>>> eye to this - with attributes, they can't.
>>>
>
> Good point,
> Maybe we should add hints in the attribute,
> for example, hint_only_wire in this case it will be clear that
> PMD may ignore this, and it should be fully documented that this is not a mandatory field.
> What do you think?
Theoretically, making terminology softer (like with the word "hint")
could make things easier for vendors who may find the new feature
confusing or something like that. But if, in reality, this hint
is indeed another match criterion (see my comments above), then
in no event shall the prefix "hint" be an excuse for this
criterion not being expressed as a pattern item.
Please hear me out: I don't mean to sound arrogant, - just trying
to understand why expressing the new bit as an item can't be
efficient enough for the async flow approach.
>
>>> Thank you.
>>
>>
>
>
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-20 12:45 ` Ivan Malov
@ 2022-09-20 13:59 ` Ori Kam
2022-09-20 15:28 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Ori Kam @ 2022-09-20 13:59 UTC (permalink / raw)
To: Ivan Malov
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ivan,
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Tuesday, 20 September 2022 15:46
>
> Hi Ori,
>
> On Tue, 20 Sep 2022, Ori Kam wrote:
>
> > Hi Ivan, Thomas and Rongwei
> >
> >> -----Original Message-----
> >> From: Thomas Monjalon <thomas@monjalon.net>
> >> Sent: Thursday, 15 September 2022 14:16
> >>
> >> 15/09/2022 12:59, Ivan Malov:
> >>> Hi Rongwei,
> >>>
> >>> In this reply, I do not include the previous mail because the amount
> >>> of inline commentary has gone haywire over the past couple of days.
> >>> Let's re-iterate.
> >>>
> >>> But before I get to that, I'd like to offer a fresh perspective:
> >>>
> >>> Perhaps, if we all agree that term "vport" means an endpoint which
> >>> can stand for any "port" except for physical one, then it should
> >>> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
> >>
> >> The opposite of "physical" is "virtual" indeed.
> >>
> >>> But that's tricky, of course. I don't have a way with naming,
> >>> so more opinions are welcome and very-very desirable here.
> >>>
> >>> So:
> >>>
> >>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
> >>> primitives are in fact yet another match criteria?
> >>>
> >>> ..
> >>>
> >>> To me, it looks so. If they are match criteria, then they belong
> >>> in match pattern, that is, they should be expressed as new items.
> >>>
> >>> For "transfer" rules, the *existing* attributes are: "group"
> >>> and "priority". As you may note, these are clearly not match
> >>> criteria. They control the look-up order. So, to this day,
> >>> there're no match criteria in DPDK expressed as attributes.
> >>>
> >>> If these "wire_orig" / "vf_orig" are going to be introduced
> >>> as attributes, that should be backed with strong motivation.
> >>
> >> I prefer we keep matching in a single place, not in attributes.
> >>
> >
> > I think we are talking about two different features.
> > Feature 1:
> > Allow matching on all vports that are not wire
> > Feature 2:
> > Save allocation space and allow fast insertion.
> > In this case, the matching is not on all vports it can be just part of the vports
> > but it will never be the wire port.
> > For example:
> > port 0 - wire
> > ports 1,2,3,4,5 - vports
> > the application wants to insert only those rules:
> > represented_port(port_id=2) / eth / ipv4 (src==xx)
> > represented_port(port_id=4) / eth / ipv4 (src==xx)
> > represented_port(port_id=4) / eth / ipv4 (src==yy)
> >
> > For feature 1 I fully agree with you Ivan, this should be added as an item.
>
> Thank you.
>
> > For feature 2 I think Rongwei's suggestion is the better option.
> > If I understand correctly the idea is to give hint to the PMD on where to allocate memory
> > and how to insert the rules most optimally. Since this is shared for all rules it makes more sense
> > to add it as an attribute, just like we don’t have an ingress item (maybe we should?)
>
> But isn't pattern template also supposed to be shared for all rules
> in the table? I.e., the user creates an async flow table and submits
> a flow "shape" (which consists of attrs, pattern template and action
> template). So why should "giving a hint" via an item template be
> considered worse than doing so via an attribute?
>
The same item template may be used elsewhere. For example, the pattern
eth / ipv4(src, dst) / udp(sport, dport) can be used in a number of different
tables.
I think the main difference between us is that, from my point of view, this value only says
where to allocate resources and how to insert the rules more efficiently. It is not related to matching.
From Nvidia's viewpoint, we need this information so that we can allocate the resources in the correct
place and avoid inserting duplicate rules.
I agree that we can get the same result by using an item, but that is incorrect since we are not matching on it.
Part of the idea of the template API is to give the PMD as many hints as possible so that insertion can be optimized.
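To make the distinction concrete, here is a minimal sketch in plain C. All type and function names are hypothetical stand-ins, not the DPDK definitions, and the "two resource copies for a bi-directional table" figure is only an assumption used to model the saving described in the patch. It shows one pattern template shared by two tables, with the direction hint stored in the table, so no rule has to carry an extra item:

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT DPDK types. */
enum xfer_hint {
	XFER_HINT_BIDIR = 0,  /* default: rules may apply to both directions */
	XFER_HINT_WIRE_ORIG,  /* table only holds rules for wire-originated traffic */
	XFER_HINT_VPORT_ORIG, /* table only holds rules for vport-originated traffic */
};

enum item_type { ITEM_END = 0, ITEM_ETH, ITEM_IPV4, ITEM_UDP };

struct pattern_template {
	const enum item_type *items; /* ITEM_END-terminated list */
};

struct table {
	enum xfer_hint hint;               /* fixed once, at table creation */
	const struct pattern_template *pt; /* may be shared with other tables */
};

/* Assumption: a bi-directional table needs one resource copy per direction. */
static unsigned int
table_copies_per_rule(const struct table *t)
{
	return t->hint == XFER_HINT_BIDIR ? 2 : 1;
}

/* Number of items every rule inserted into this table has to carry. */
static size_t
table_items_per_rule(const struct table *t)
{
	size_t n = 0;

	while (t->pt->items[n] != ITEM_END)
		n++;
	return n;
}
```

In this model the hint changes the per-rule resource count but leaves the per-rule item count untouched, which is the property argued for above.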
> As for "ingress" item, - no, one should not add such. We have had
> many discussions concerning this bit in the past. Ingress/egress
> are non-transfer terms. They belong in the scope of vNIC / ethdev
> filtering, not to embedded switch rules.
>
> In my opinion, in the embedded switch, one should either point to
> some precise switch ports (using REPRESENTOR / REPRESENTED items)
> or use another kind of item to refer to a "super set" of ports
> which have something in common ("all wire ports", "all NON-wire ports").
>
But this is my point: we don't want all wire ports or all NON-wire ports; we just know that in this table
we will have only non-wire or only wire ports.
> >
> > Ivan we have the item RTE_FLOW_ITEM_TYPE_PF and RTE_FLOW_ITEM_TYPE_VF which are deprecated,
> > So do you want to un-deprecate them?
>
> No. These items are deprecated because:
>
> a) their names suggest that application knows whether an ethdev
> sits on top of a PF or that the application has some
> knowledge of existence of particular VFs, but in
> reality applications should not be worried of
> the underlying function type = to them, all
> ethdevs are just representors of something,
> and if the application needs to refer to
> VFs (or other PFs, - doesn't matter), it
> should do that via REPRESENTOR items;
>
> b) such items would duplicate REPRESENTOR / REPRESENTED.
>
Agree with everything you say.
> >
> > To summarize, if PMD can use such a hint during rule creation and save memory, I vote
> > to allow it.
> > if the idea is to match on all vports then it should be an item.
>
> But such a hint would effectively be a match criterion, too, right?
> So, in fact it's a combined use case: a match criterion which is
> flexible enough to be a "hint" = i.e. the PMD can see it when
> processing the pattern *template* and treat it as a hint.
>
Yes, but it is an implicit match, just like saying ingress or egress: it has meaning beyond the
matching. In addition, there is no reason to add an extra item to each rule we create just
to enable something that is fixed at table creation.
An extra item in the pattern template means an extra item for each rule.
I know we can avoid this and optimize the code, but why add something that no one needs
after table creation?
> >
> >>
> >>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
> >>> won't do? Or, which problems do you think they may inflict?
> >>>
> >>> ..
> >>>
> >>> Previously, you explained why REPRESENTED_PORT would not
> >>> fit your needs. And I understand your point: to async API,
> >>> two pattern templates which both have item REPRESENTED_PORT
> >>> in them cannot be clearly distinguished and are in fact the
> >>> same set of criteria (provided that all other items are also
> >>> the same and have the same masks). Templates are, well,
> >>> templates (or shapes) of the rules to come later and
> >>> do not include exact "spec" for the "ethdev_id".
> >>> Got it.
> >>>
> >>> But that's not going to be the case with items ANY_PHYS_PORTS and
> >>> ANY_VPORTS, is it? In one async table template, the user submits
> >>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
> >>> In another template, the user submits item ANY_VPORTS to
> >>> state that they want to match only traffic transmitted
> >>> software endpoints (DPDK ethdevs, guest VFs, etc.)
> >>> connected to the switch.
> >>>
> >>> In this example, the PMD will clearly see that the two templates
> >>> differ. So it will be able to allocate separate resources, each
> >>> one "cutting one half of traffic" (as per your concept).
> >>>
> >>> 3) In your most recent response, you suggested that one might have
> >>> had the attributes occupied for some other purposes. To me,
> >>> they're not. Neither me nor my closest colleagues have
> >>> any plans on them. When I advocate using item approach
> >>> over the attribute approach, I do this to ensure
> >>> a) clarity of the API contract and b) robustness.
> >
> > If something is shared for all rules in the same table, it should be a table
> > property.
>
> But the whole pattern *template* is also a table property, isn't it?
>
Like I said above, the pattern template can be used in all domains; that is why
there is a split between table and pattern. In addition, each table may have
a number of pattern templates.
> >
> >>>
> >>> 4) Also, in your response, you suggested that I might have
> >>> confused item mask and spec. That is not the case.
> >>> If we agree, that switch domain ID is unneeded in
> >>> the new items, then these items will have no
> >>> fields in them (like item PF had not had any
> >>> before it was deprecated).
> >>>
> >>> No fields in new items => no field masks.
> >>> So what's the problem then?
> >>>
> >>> 5) With regard to our talk about identifying the relationship
> >>> between ethdevs and switch domains, you said that the user
> >>> could know the difference from the very beginning:
> >>> /sysfs/ .... /PF_BDF/sriov_num
> >>>
> >>> That is true for the user who starts the application, but
> >>> this knowledge is hard to obtain from the application
> >>> perspective = it's hard to automate.
> >>>
> >>> This is why ethdevs are able to advertise their domain IDs.
> >>> And, as I explained, looking at domain ID to understand
> >>
> >> namely rte_eth_dev_info.switch_info.domain_id
> >>
> >>> port relationship is valid, whilst looking at proxy IDs
> >>> to achieve the same goal is not. Proxy port IDs only
> >>> serve the purpose of finding an entry point for
> >>> managing flows. That has slightly different
> >>> meaning, but this subtle difference is important.
> >>
> >> There is also a concept of sibling ports
> >> to get all ports belonging to the same hardware.
> >>
> >>
> >>> 6) As for the confusion over the difference between fixing
> >>> bugs and making the code robust by extra checks:
> >>>
> >>> Yes, I agree that the programmer who writes the
> >>> application must be intelligent enough to use
> >>> flow primitives the proper way. Yes, the user
> >>> who starts the application also should tread
> >>> carefully. But that does not prevent some
> >>> mistakes in other parts of code from
> >>> corrupting various chunks of memory,
> >>> including, for example, flow attrs.
> >>>
> >>> You say that such mistakes have to be "just fixed"
> >>> as any other bugs. Right. But how much time will
> >>> the programmer spend to identify the bugs?
> >>>
> >>> If the PMDs do all the checks (as with attributes),
> >>> the hypothetical bug will manifest itself much
> >>> earlier. That will simplify debugging by a lot...
> >>>
> >>> So, my point is that it's still better to ensure
> >>> that new flow primitives have all necessary
> >>> checks in place. For attributes, it is
> >>> required to add them separately.
> >>
> >> If flow insertion is done in a fast path,
> >> such checks may be skipped.
> >
> > The idea is that all rules in this table will share the same configuration,
> > there is no reason to say everything again for each rule. This is why
> > the rule attributes were moved to the table struct and not per rule.
> >
> >>
> >>> For items, as I explained, it might not be necessary
> >>> in the majority of cases simply because of the
> >>> switch (item->type) { case } structure.
> >>>
> >>> So, these are some of my points to explain why the
> >>> attribute approach is untenable. To me, attributes
> >>> are something global, which demands checks in all
> >>> flow-capable PMDs. Items seem better because they
> >>> are don't cares to all PMDs which are unaware of
> >>> the async concept. So, even if someone does not
> >>> implement the async concept or does not like
> >>> the new item names, they can turn a blind
> >>> eye to this - with attributes, they can't.
> >>>
> >
> > Good point,
> > Maybe we should add hints in the attribute,
> > for example, hint_only_wire in this case it will be clear that
> > PMD may ignore this, and it should be fully documented that this is not a mandatory field.
> > What do you think?
>
> Theoretically, making terminology softer (like with the word "hint")
> could make things easier for vendors who may find the new feature
> confusing or something like that. But if, in reality, this hint
> is indeed another match criterion (see my comments above), then
> in no event shall the prefix "hint" be an excuse for this
> criterion not being expressed as a pattern item.
>
Please see my response above. This is the point: it is much more than matching.
> Please hear me out: I don't mean to sound arrogant, - just trying
> to understand why expressing the new bit as an item can't be
> efficient enough for the async flow approach.
>
I don't think you are arrogant, and I hope that you see that I do understand your comments.
That said, I hope I have explained why I think it is better to have it as a table attribute and not as an
item. (We are not matching on it; it helps the PMD allocate the table at the best location and avoid
duplication of rules.)
If you wish, we can have a short phone call and discuss this.
Best,
Ori
> >
> >>> Thank you.
> >>
> >>
> >
> >
>
> Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-20 13:59 ` Ori Kam
@ 2022-09-20 15:28 ` Ivan Malov
2022-09-21 7:34 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-20 15:28 UTC (permalink / raw)
To: Ori Kam
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ori,
On Tue, 20 Sep 2022, Ori Kam wrote:
> Hi Ivan,
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, 20 September 2022 15:46
>>
>> Hi Ori,
>>
>> On Tue, 20 Sep 2022, Ori Kam wrote:
>>
>>> Hi Ivan, Thomas and Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>> Sent: Thursday, 15 September 2022 14:16
>>>>
>>>> 15/09/2022 12:59, Ivan Malov:
>>>>> Hi Rongwei,
>>>>>
>>>>> In this reply, I do not include the previous mail because the amount
>>>>> of inline commentary has gone haywire over the past couple of days.
>>>>> Let's re-iterate.
>>>>>
>>>>> But before I get to that, I'd like to offer a fresh perspective:
>>>>>
>>>>> Perhaps, if we all agree that term "vport" means an endpoint which
>>>>> can stand for any "port" except for physical one, then it should
>>>>> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
>>>>
>>>> The opposite of "physical" is "virtual" indeed.
>>>>
>>>>> But that's tricky, of course. I don't have a way with naming,
>>>>> so more opinions are welcome and very-very desirable here.
>>>>>
>>>>> So:
>>>>>
>>>>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
>>>>> primitives are in fact yet another match criteria?
>>>>>
>>>>> ..
>>>>>
>>>>> To me, it looks so. If they are match criteria, then they belong
>>>>> in match pattern, that is, they should be expressed as new items.
>>>>>
>>>>> For "transfer" rules, the *existing* attributes are: "group"
>>>>> and "priority". As you may note, these are clearly not match
>>>>> criteria. They control the look-up order. So, to this day,
>>>>> there're no match criteria in DPDK expressed as attributes.
>>>>>
>>>>> If these "wire_orig" / "vf_orig" are going to be introduced
>>>>> as attributes, that should be backed with strong motivation.
>>>>
>>>> I prefer we keep matching in a single place, not in attributes.
>>>>
>>>
>>> I think we are talking about two different features.
>>> Feature 1:
>>> Allow matching on all vports that are not wire
>>> Feature 2:
>>> Save allocation space and allow fast insertion.
>>> In this case, the matching is not on all vports it can be just part of the vports
>>> but it will never be the wire port.
>>> For example:
>>> port 0 - wire
>>> ports 1,2,3,4,5 - vports
>>> the application wants to insert only those rules:
>>> represented_port(port_id=2) / eth / ipv4 (src==xx)
>>> represented_port(port_id=4) / eth / ipv4 (src==xx)
>>> represented_port(port_id=4) / eth / ipv4 (src==yy)
>>>
>>> For feature 1 I fully agree with you Ivan, this should be added as an item.
>>
>> Thank you.
>>
>>> For feature 2 I think Rongwei's suggestion is the better option.
>>> If I understand correctly the idea is to give hint to the PMD on where to allocate memory
>>> and how to insert the rules most optimally. Since this is shared for all rules it makes more sense
>>> to add it as an attribute, just like we don’t have an ingress item (maybe we should?)
>>
>> But isn't pattern template also supposed to be shared for all rules
>> in the table? I.e., the user creates an async flow table and submits
>> a flow "shape" (which consists of attrs, pattern template and action
>> template). So why should "giving a hint" via an item template be
>> considered worse than doing so via an attribute?
>>
>
> The same item template may be used elsewhere. For example, the pattern
> eth / ipv4(src, dst) / udp(sport, dport) can be used in a number of different
> tables.
In my understanding, the user may want to create flow table A
and use pattern template A' for it, which is as follows:
any_vports / eth / ipv4 / udp
The PMD can see this item and treat it exactly the same
way as it could treat such attribute ("where to allocate
resources, etc.").
Then the user may want to create flow table B and
use pattern template B' for it:
any_phy_ports / eth / ipv4 / udp
Once again, the PMD can clearly see the difference between
the A' and B' templates and, this time, allocate resources
the other way (as per efficiency requirements).
By saying "can be used on number of different tables", do you mean
that it is important to make the *network* part of the pattern
shareable between flow tables? I.e. are you saying that
templates A' and B' cause resource duplication just
because of the same *network* part in your case?
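The A' / B' comparison above can be sketched as follows. This is only an illustration under stated assumptions: the item names ANY_PHY_PORTS and ANY_VPORTS are the ones proposed in this thread and do not exist in DPDK, and the enum and function names are hypothetical. The point is that a PMD walking the pattern template could derive the allocation decision from the items alone:

```c
#include <stddef.h>

/* Hypothetical item identifiers; not part of rte_flow. */
enum tmpl_item {
	TMPL_ITEM_END = 0,
	TMPL_ITEM_ANY_PHY_PORTS, /* proposed in this thread only */
	TMPL_ITEM_ANY_VPORTS,    /* proposed in this thread only */
	TMPL_ITEM_ETH,
	TMPL_ITEM_IPV4,
	TMPL_ITEM_UDP,
};

enum alloc_domain { ALLOC_BOTH = 0, ALLOC_WIRE_SIDE, ALLOC_VPORT_SIDE };

/*
 * Walk an END-terminated template and decide where table resources
 * would be allocated. Templates without either item fall back to
 * allocating for both directions.
 */
static enum alloc_domain
template_alloc_domain(const enum tmpl_item *items)
{
	for (size_t i = 0; items[i] != TMPL_ITEM_END; i++) {
		switch (items[i]) {
		case TMPL_ITEM_ANY_PHY_PORTS:
			return ALLOC_WIRE_SIDE;
		case TMPL_ITEM_ANY_VPORTS:
			return ALLOC_VPORT_SIDE;
		default:
			break; /* network items do not affect placement */
		}
	}
	return ALLOC_BOTH;
}
```

Two templates that share the same eth / ipv4 / udp part but differ in the leading item would yield different placements in this model.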
> I think that the main difference between us is that from my point of view this value is just
> where to allocate resources / how to better insert the rule. It is not related to matching.
To me, it *is* a match criterion which, at the same time, serves
as a value indicating how resources should be allocated.
But above all, it is a match criterion.
If it refers to a group of ports in order to ditch "the other half"
of traffic from consideration (like Rongwei explained), then it
looks like a match criterion.
> From Nvidia viewpoint we need this information so we can allocate the resource at the correct
> place and avoid inserting duplication of rules.
I see.
> I agree that by using the item we can get the same results, but it is incorrect since we are not matching on it.
If one provides item UDP in the pattern and does not match on any UDP
fields, doing so nevertheless *is* matching on a particular packet type.
The same seemingly goes for the new attribute / item. If it is
provided, then the user doesn't want the rule to affect
packets coming from certain ports (i.e. from wire).
So it still sounds like matching.
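The UDP argument can be illustrated with a small self-contained model. The structures below are hypothetical and much simpler than the real flow items; the only claim is the logic: an item with no spec and no mask still rejects every packet of a different type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical packet and item models (not DPDK types). */
enum l4_proto { L4_NONE = 0, L4_TCP, L4_UDP };

struct pkt {
	enum l4_proto l4;
	uint16_t dport;
};

struct udp_item {
	const uint16_t *dport_spec; /* NULL: no field value requested */
	const uint16_t *dport_mask; /* NULL: no field is compared */
};

static bool
udp_item_matches(const struct udp_item *it, const struct pkt *p)
{
	/* The mere presence of the item filters by packet type. */
	if (p->l4 != L4_UDP)
		return false;
	/* No spec or mask: every UDP packet matches. */
	if (it->dport_spec == NULL || it->dport_mask == NULL)
		return true;
	/* Otherwise compare only the masked bits of the field. */
	return ((p->dport ^ *it->dport_spec) & *it->dport_mask) == 0;
}
```

An item with both pointers set to NULL accepts any UDP packet and rejects TCP, so it acts as a match criterion even though no field is compared.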
> Part of the idea of template API is to give as many hints as possible to the PMD so the insertion will be optimized.
I see.
>
>
>> As for "ingress" item, - no, one should not add such. We have had
>> many discussions concerning this bit in the past. Ingress/egress
>> are non-transfer terms. They belong in the scope of vNIC / ethdev
>> filtering, not to embedded switch rules.
>>
>> In my opinion, in the embedded switch, one should either point to
>> some precise switch ports (using REPRESENTOR / REPRESENTED items)
>> or use another kind of item to refer to a "super set" of ports
>> which have something in common ("all wire ports", "all NON-wire ports").
>>
>
> But this is my point we don't want all wire ports or all NON-wire ports, we just know that in this table
> we will have only non-wire / wire ports.
But how do these two viewpoints contradict each other?
>
>>>
>>> Ivan we have the item RTE_FLOW_ITEM_TYPE_PF and RTE_FLOW_ITEM_TYPE_VF which are deprecated,
>>> So do you want to un-deprecate them?
>>
>> No. These items are deprecated because:
>>
>> a) their names suggest that application knows whether an ethdev
>> sits on top of a PF or that the application has some
>> knowledge of existence of particular VFs, but in
>> reality applications should not be worried of
>> the underlying function type = to them, all
>> ethdevs are just representors of something,
>> and if the application needs to refer to
>> VFs (or other PFs, - doesn't matter), it
>> should do that via REPRESENTOR items;
>>
>> b) such items would duplicate REPRESENTOR / REPRESENTED.
>>
> Agree with everything you say.
Great, we're on the same page regarding this bit.
>
>>>
>>> To summarize, if PMD can use such a hint during rule creation and save memory, I vote
>>> to allow it.
>>> if the idea is to match on all vports then it should be an item.
>>
>> But such a hint would effectively be a match criterion, too, right?
>> So, in fact it's a combined use case: a match criterion which is
>> flexible enough to be a "hint" = i.e. the PMD can see it when
>> processing the pattern *template* and treat it as a hint.
>>
>
> Yes, but it is an implicit match, just like saying ingress or egress: it has meaning beyond the
> matching. In addition, there is no reason to add an extra item to each rule we create, just
> to enable something that is fixed during the table creation.
> Extra item in pattern template means extra item for each rule.
> I know we can avoid this and optimize the code but why add something that no one needs
> after table creation?
Good question. But if there is a way to make such an optimisation
concise enough to avoid confusion, then there should be no problem
with preferring the pattern approach over the attribute approach.
>
>
>>>
>>>>
>>>>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
>>>>> won't do? Or, which problems do you think they may inflict?
>>>>>
>>>>> ..
>>>>>
>>>>> Previously, you explained why REPRESENTED_PORT would not
>>>>> fit your needs. And I understand your point: to async API,
>>>>> two pattern templates which both have item REPRESENTED_PORT
>>>>> in them cannot be clearly distinguished and are in fact the
>>>>> same set of criteria (provided that all other items are also
>>>>> the same and have the same masks). Templates are, well,
>>>>> templates (or shapes) of the rules to come later and
>>>>> do not include exact "spec" for the "ethdev_id".
>>>>> Got it.
>>>>>
>>>>> But that's not going to be the case with items ANY_PHYS_PORTS and
>>>>> ANY_VPORTS, is it? In one async table template, the user submits
>>>>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
>>>>> In another template, the user submits item ANY_VPORTS to
>>>>> state that they want to match only traffic transmitted
>>>>> software endpoints (DPDK ethdevs, guest VFs, etc.)
>>>>> connected to the switch.
>>>>>
>>>>> In this example, the PMD will clearly see that the two templates
>>>>> differ. So it will be able to allocate separate resources, each
>>>>> one "cutting one half of traffic" (as per your concept).
>>>>>
>>>>> 3) In your most recent response, you suggested that one might have
>>>>> had the attributes occupied for some other purposes. To me,
>>>>> they're not. Neither me nor my closest colleagues have
>>>>> any plans on them. When I advocate using item approach
>>>>> over the attribute approach, I do this to ensure
>>>>> a) clarity of the API contract and b) robustness.
>>>
>>> If something is shared for all rules in the same table, it should be a table
>>> property.
>>
>> But the whole pattern *template* is also a table property, isn't it?
>>
>
> Like I said above the pattern template can be used in all domains that is why
> there is a split between table and pattern, in addition to that each table may have
> number of pattern templates.
This is a valuable clarification. However, even if the attribute approach
seems OK after this explanation, I still don't understand
why it is required to add this attribute to the generic "struct
rte_flow_attr" and not just to the *table* attr.
The generic "struct rte_flow_attr" is used both for the async and
the sync (regular) approach. So why add something to the generic
struct that is never going to make sense for sync flows?
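As a sketch of the alternative placement, consider the layout below. The struct and field names are hypothetical stand-ins rather than the actual DPDK definitions; the assumption is simply that a table-level attribute struct embeds the generic attributes and can carry table-only fields next to them, so sync rules never see the new field:

```c
#include <stdint.h>

/* Stand-in for the generic attributes shared by sync and async flows. */
struct flow_attr {
	uint32_t group;
	uint32_t priority;
	uint32_t transfer:1;
};

/* Hypothetical table-only direction property. */
enum table_direction {
	TABLE_DIR_BIDIR = 0, /* default, no behaviour change */
	TABLE_DIR_WIRE_ORIG,
	TABLE_DIR_VPORT_ORIG,
};

/* Hypothetical table attributes: generic part plus table-only fields. */
struct table_attr {
	struct flow_attr flow_attr;     /* same meaning as for sync rules */
	uint32_t nb_flows;              /* table capacity */
	enum table_direction direction; /* only meaningful for async tables */
};

/* Returns 0 if the combination is coherent, -1 otherwise. */
static int
table_attr_validate(const struct table_attr *ta)
{
	/* A direction restriction only makes sense in the transfer domain. */
	if (ta->direction != TABLE_DIR_BIDIR && !ta->flow_attr.transfer)
		return -1;
	return 0;
}
```

With this layout, PMDs that implement only the sync path would not need any new check on the generic attributes.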
>
>>>
>>>>>
>>>>> 4) Also, in your response, you suggested that I might have
>>>>> confused item mask and spec. That is not the case.
>>>>> If we agree, that switch domain ID is unneeded in
>>>>> the new items, then these items will have no
>>>>> fields in them (like item PF had not had any
>>>>> before it was deprecated).
>>>>>
>>>>> No fields in new items => no field masks.
>>>>> So what's the problem then?
>>>>>
>>>>> 5) With regard to our talk about identifying the relationship
>>>>> between ethdevs and switch domains, you said that the user
>>>>> could know the difference from the very beginning:
>>>>> /sysfs/ .... /PF_BDF/sriov_num
>>>>>
>>>>> That is true for the user who starts the application, but
>>>>> this knowledge is hard to obtain from the application
>>>>> perspective = it's hard to automate.
>>>>>
>>>>> This is why ethdevs are able to advertise their domain IDs.
>>>>> And, as I explained, looking at domain ID to understand
>>>>
>>>> namely rte_eth_dev_info.switch_info.domain_id
>>>>
>>>>> port relationship is valid, whilst looking at proxy IDs
>>>>> to achieve the same goal is not. Proxy port IDs only
>>>>> serve the purpose of finding an entry point for
>>>>> managing flows. That has slightly different
>>>>> meaning, but this subtle difference is important.
>>>>
>>>> There is also a concept of sibling ports
>>>> to get all ports belonging to the same hardware.
>>>>
>>>>
>>>>> 6) As for the confusion over the difference between fixing
>>>>> bugs and making the code robust by extra checks:
>>>>>
>>>>> Yes, I agree that the programmer who writes the
>>>>> application must be intelligent enough to use
>>>>> flow primitives the proper way. Yes, the user
>>>>> who starts the application also should tread
>>>>> carefully. But that does not prevent some
>>>>> mistakes in other parts of code from
>>>>> corrupting various chunks of memory,
>>>>> including, for example, flow attrs.
>>>>>
>>>>> You say that such mistakes have to be "just fixed"
>>>>> as any other bugs. Right. But how much time will
>>>>> the programmer spend to identify the bugs?
>>>>>
>>>>> If the PMDs do all the checks (as with attributes),
>>>>> the hypothetical bug will manifest itself much
>>>>> earlier. That will simplify debugging by a lot...
>>>>>
>>>>> So, my point is that it's still better to ensure
>>>>> that new flow primitives have all necessary
>>>>> checks in place. For attributes, it is
>>>>> required to add them separately.
>>>>
>>>> If flow insertion is done in a fast path,
>>>> such checks may be skipped.
>>>
>>> The idea is that all rules in this table will share the same configuration,
>>> there is no reason to say everything again for each rule. This is why
>>> the rule attributes were moved to the table struct and not per rule.
>>>
>>>>
>>>>> For items, as I explained, it might not be necessary
>>>>> in the majority of cases simply because of the
>>>>> switch (item->type) { case } structure.
>>>>>
>>>>> So, these are some of my points to explain why the
>>>>> attribute approach is untenable. To me, attributes
>>>>> are something global, which demands checks in all
>>>>> flow-capable PMDs. Items seem better because they
>>>>> are don't cares to all PMDs which are unaware of
>>>>> the async concept. So, even if someone does not
>>>>> implement the async concept or does not like
>>>>> the new item names, they can turn a blind
>>>>> eye to this - with attributes, they can't.
>>>>>
>>>
>>> Good point,
>>> Maybe we should add hints in the attribute,
>>> for example, hint_only_wire in this case it will be clear that
>>> PMD may ignore this, and it should be fully documented that this is not a mandatory field.
>>> What do you think?
>>
>> Theoretically, making terminology softer (like with the word "hint")
>> could make things easier for vendors who may find the new feature
>> confusing or something like that. But if, in reality, this hint
>> is indeed another match criterion (see my comments above), then
>> in no event shall the prefix "hint" be an excuse for this
>> criterion not being expressed as a pattern item.
>>
>
> Please see my response above. This is the point it is much more than matching.
>
>> Please hear me out: I don't mean to sound arrogant, - just trying
>> to understand why expressing the new bit as an item can't be
>> efficient enough for the async flow approach.
>>
>
> I don't think you are arrogant, and I hope that you see that I do understand your comments.
> saying that I hope I explained why I think it is better to have it as a table attribute and not as an
> item. (We are not matching on it, this helps the PMD allocate the table at the best location and avoid
> duplication of rules)
>
> If you wish, we can have a short phone call and discuss this.
>
> Best,
> Ori
>
>
>>>
>>>>> Thank you.
>>>>
>>>>
>>>
>>>
>>
>> Ivan
>
Thanks,
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-20 15:28 ` Ivan Malov
@ 2022-09-21 7:34 ` Ori Kam
2022-09-21 8:39 ` Andrew Rybchenko
2022-09-21 9:04 ` Ivan Malov
0 siblings, 2 replies; 96+ messages in thread
From: Ori Kam @ 2022-09-21 7:34 UTC (permalink / raw)
To: Ivan Malov
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ivan,
PSB my comments.
In any case, I'm afraid we are in a deadlock.
I understand your viewpoint, but I don't think it is the correct
one for the feature suggested here,
for all the reasons I listed.
So from my viewpoint, the patch is Acked.
If you wish, as I suggested before, we can have a meeting with
Rongwei and anyone else who is interested and close this subject.
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Tuesday, 20 September 2022 18:28
> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
> table
>
> Hi Ori,
>
> On Tue, 20 Sep 2022, Ori Kam wrote:
>
> > Hi Ivan,
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> >> Sent: Tuesday, 20 September 2022 15:46
> >>
> >> Hi Ori,
> >>
> >> On Tue, 20 Sep 2022, Ori Kam wrote:
> >>
> >>> Hi Ivan, Thomas and Rongwei
> >>>
> >>>> -----Original Message-----
> >>>> From: Thomas Monjalon <thomas@monjalon.net>
> >>>> Sent: Thursday, 15 September 2022 14:16
> >>>>
> >>>> 15/09/2022 12:59, Ivan Malov:
> >>>>> Hi Rongwei,
> >>>>>
> >>>>> In this reply, I do not include the previous mail because the amount
> >>>>> of inline commentary has gone haywire over the past couple of days.
> >>>>> Let's re-iterate.
> >>>>>
> >>>>> But before I get to that, I'd like to offer a fresh perspective:
> >>>>>
> >>>>> Perhaps, if we all agree that term "vport" means an endpoint which
> >>>>> can stand for any "port" except for physical one, then it should
> >>>>> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
> >>>>
> >>>> The opposite of "physical" is "virtual" indeed.
> >>>>
> >>>>> But that's tricky, of course. I don't have a way with naming,
> >>>>> so more opinions are welcome and very-very desirable here.
> >>>>>
> >>>>> So:
> >>>>>
> >>>>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
> >>>>> primitives are in fact yet another match criteria?
> >>>>>
> >>>>> ..
> >>>>>
> >>>>> To me, it looks so. If they are match criteria, then they belong
> >>>>> in match pattern, that is, they should be expressed as new items.
> >>>>>
> >>>>> For "transfer" rules, the *existing* attributes are: "group"
> >>>>> and "priority". As you may note, these are clearly not match
> >>>>> criteria. They control the look-up order. So, to this day,
> >>>>> there're no match criteria in DPDK expressed as attributes.
> >>>>>
> >>>>> If these "wire_orig" / "vf_orig" are going to be introduced
> >>>>> as attributes, that should be backed with strong motivation.
> >>>>
> >>>> I prefer we keep matching in a single place, not in attributes.
> >>>>
> >>>
> >>> I think we are talking about two different features.
> >>> Feature 1:
> >>> Allow matching on all vports that are not wire
> >>> Feature 2:
> >>> Save allocation space and allow fast insertion.
> >>> In this case, the matching is not on all vports; it can be on just some of the
> >>> vports, but it will never be the wire port.
> >>> For example:
> >>> port 0 - wire
> >>> ports 1,2,3,4,5 - vports
> >>> the application wants to insert only these rules:
> >>> represented_port(port_id=2) / eth / ipv4 (src==xx)
> >>> represented_port(port_id=4) / eth / ipv4 (src==xx)
> >>> represented_port(port_id=4) / eth / ipv4 (src==yy)
> >>>
> >>> For feature 1 I fully agree with you Ivan, this should be added as an item.
> >>
> >> Thank you.
> >>
> >>> For feature 2 I think Rongwei's suggestion is the better option.
> >>> If I understand correctly, the idea is to give a hint to the PMD on where to
> >>> allocate memory and how to insert the rules most optimally. Since this is
> >>> shared for all rules, it makes more sense to add it as an attribute, just
> >>> like we don’t have an ingress item (maybe we should?)
> >>
> >> But isn't pattern template also supposed to be shared for all rules
> >> in the table? I.e., the user creates an async flow table and submits
> >> a flow "shape" (which consists of attrs, pattern template and action
> >> template). So why should "giving a hint" via an item template be
> >> considered worse than doing so via an attribute?
> >>
> >
> > The same item template may be used elsewhere; for example, the following
> > pattern, eth / ipv4(src, dst) / udp(sport, dport), can be used in a number of
> > different tables.
>
> In my understanding, the user may want to create flow table A
> and use pattern template A' for it, which is as follows:
>
> any_vports / eth / ipv4 / udp
>
> The PMD can see this item and treat it exactly the same
> way as it could treat such attribute ("where to allocate
> resources, etc.").
>
> Then the user may want to create flow table B and
> use pattern template B' for it:
>
> any_phy_ports / eth / ipv4 / udp
>
> Once again, the PMD can clearly see the difference between
> the A' and B' templates and, this time, allocate resources
> the other way (as per efficiency requirements).
>
Yes, but again you select all vports, and this is not what the application wants.
Assuming port 0 is wire and DPDK ports 1,2,3,4,5 are vports,
the application wants to insert the following rules:
Represented_port(id=2) / eth / ipv4 / udp
Represented_port(id=5) / eth / ipv4 / udp
As you can see, the application doesn’t want all ports, just some vports, but for sure not the
wire port.
I agree that we can go with your approach, but it isn't correct: why should the application
insert:
any_phy_ports / Represented_port(id=2) / eth / ipv4 / udp
any_phy_ports / Represented_port(id=2) / eth / ipv4 / udp
> By saying "can be used on number of different tables", do you mean
> that it is important to make the *network* part of the pattern
> shareable between flow tables? I.e. are you saying that
> templates A' and B' cause resource duplication just
> because of the same *network* part in your case?
>
I'm saying that adding the item will mean that the application can't reuse the pattern template it created.
If the application created a 5-tuple template,
it can reuse it in ingress tables, egress tables and FDB tables; there is no need to create
extra pattern templates.
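To make the reuse argument concrete, here is a minimal sketch in plain C. It is a toy model, not the rte_flow API: every type and function name below (toy_pattern_template, toy_table, toy_table_create and so on) is hypothetical and exists only for illustration. It shows one eth / ipv4 / udp template being referenced by several tables, with the domain and the origin hint kept on the table side.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these types are the real rte_flow definitions. */
enum toy_item { TOY_ITEM_END = 0, TOY_ITEM_ETH, TOY_ITEM_IPV4, TOY_ITEM_UDP };
enum toy_domain { TOY_INGRESS, TOY_EGRESS, TOY_TRANSFER };
enum toy_origin_hint { TOY_HINT_ANY, TOY_HINT_WIRE, TOY_HINT_VPORT };

/* A pattern template: the "shape" of the match, with no port item in it. */
struct toy_pattern_template {
	const enum toy_item *items; /* terminated by TOY_ITEM_END */
};

/* A table: domain and origin hint live here; the template is only referenced. */
struct toy_table {
	enum toy_domain domain;
	enum toy_origin_hint hint;
	const struct toy_pattern_template *pt;
};

/* One shared eth / ipv4 / udp template, built once. */
const struct toy_pattern_template *toy_ipv4_udp_template(void)
{
	static const enum toy_item items[] = {
		TOY_ITEM_ETH, TOY_ITEM_IPV4, TOY_ITEM_UDP, TOY_ITEM_END
	};
	static const struct toy_pattern_template pt = { items };

	return &pt;
}

/* Count the items that precede the terminator. */
size_t toy_template_len(const struct toy_pattern_template *pt)
{
	size_t n = 0;

	assert(pt != NULL && pt->items != NULL);
	while (pt->items[n] != TOY_ITEM_END)
		n++;
	return n;
}

/* Create a table that refers to an existing template instead of copying it. */
struct toy_table toy_table_create(enum toy_domain domain,
				  enum toy_origin_hint hint,
				  const struct toy_pattern_template *pt)
{
	struct toy_table t = { domain, hint, pt };

	assert(pt != NULL);
	return t;
}
```

In this model, an ingress table and a transfer table with a vport-only hint can both point at the same template object. If the origin were an item inside the template, the two tables would need two different templates.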
> > I think that the main difference between us is that, from my point of view,
> > this value is just where to allocate resources / how to better insert the rule.
> > It is not related to matching.
>
> To me, it *is* the match criterion which, at the same time, serves
> as a value indicating the way how resources should be allocated.
> But before all, it is a match criterion.
>
It depends on how you define matching, but just as ingress / egress is not matching,
the same goes here.
> If it refers to a group of ports = in order to ditch "the other half"
> of traffic from consideration (like Rongwei explained), then it
> looks like a match criterion.
>
See the comment above. Also, in this case it relates to allocation and insertion;
the matching is a side product.
> > From Nvidia's viewpoint, we need this information so we can allocate the
> > resource at the correct place and avoid inserting duplicate rules.
>
> I see.
>
> > I agree that by using the item we can get the same results, but it is incorrect
> > since we are not matching on it.
>
> If one provides item UDP in the pattern and does not match on any UDP
> fields, doing so nevertheless *is* matching on particular packet type.
>
Yes, it matches all UDP, but again, in this case the idea is not to match all vports
but just to tell the PMD that there will be only vports arriving at this table.
> The same seemingly goes for the new attribute / item. If it is
> provided, then the user doesn't want the rule to affect
> packets coming from certain ports (i.e. from wire).
>
> So still sounds like matching.
>
> > Part of the idea of the template API is to give as many hints as possible to the
> > PMD so the insertion will be optimized.
>
> I see.
>
> >
> >
> >> As for "ingress" item, - no, one should not add such. We have had
> >> many discussions concerning this bit in the past. Ingress/egress
> >> are non-transfer terms. They belong in the scope of vNIC / ethdev
> >> filtering, not to embedded switch rules.
> >>
> >> In my opinion, in the embedded switch, one should either point to
> >> some precise switch ports (using REPRESENTOR / REPRESENTED items)
> >> or use another kind of item to refer to a "super set" of ports
> >> which have something in common ("all wire ports", "all NON-wire ports").
> >>
> >
> > But this is my point: we don't want all wire ports or all NON-wire ports, we
> > just know that in this table we will have only non-wire / wire ports.
>
> But how do these two viewpoints contradict each other?
>
Which viewpoints?
> >
> >>>
> >>> Ivan, we have the items RTE_FLOW_ITEM_TYPE_PF and
> >>> RTE_FLOW_ITEM_TYPE_VF, which are deprecated.
> >>> So do you want to un-deprecate them?
> >>
> >> No. These items are deprecated because:
> >>
> >> a) their names suggest that application knows whether an ethdev
> >> sits on top of a PF or that the application has some
> >> knowledge of existence of particular VFs, but in
> >> reality applications should not be worried of
> >> the underlying function type = to them, all
> >> ethdevs are just representors of something,
> >> and if the application needs to refer to
> >> VFs (or other PFs, - doesn't matter), it
> >> should do that via REPRESENTOR items;
> >>
> >> b) such items would duplicate REPRESENTOR / REPRESENTED.
> >>
> > Agree with everything you say.
>
> Great we're on the same page regarding this bit.
>
> >
> >>>
> >>> To summarize, if the PMD can use such a hint during rule creation and save
> >>> memory, I vote to allow it.
> >>> If the idea is to match on all vports, then it should be an item.
> >>
> >> But such a hint would effectively be a match criterion, too, right?
> >> So, in fact it's a combined use case: a match criterion which is
> >> flexible enough to be a "hint" = i.e. the PMD can see it when
> >> processing the pattern *template* and treat it as a hint.
> >>
> >
> > Yes, but it is an implicit match, just like saying ingress / egress; it has
> > meaning beyond the matching. In addition, there is no reason to add an extra
> > item for each rule we create, just to enable something that is fixed during
> > table creation.
> > An extra item in the pattern template means an extra item for each rule.
> > I know we can avoid this and optimize the code, but why add something
> > that no one needs after table creation?
>
> Good question. But, in case some way exists to make such optimisation
> laconic enough to avoid confusion etc., then it should be no problem
> in preferring the pattern approach over attribute approach.
>
> >
> >
> >>>
> >>>>
> >>>>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
> >>>>> won't do? Or, which problems do you think they may inflict?
> >>>>>
> >>>>> ..
> >>>>>
> >>>>> Previously, you explained why REPRESENTED_PORT would not
> >>>>> fit your needs. And I understand your point: to async API,
> >>>>> two pattern templates which both have item REPRESENTED_PORT
> >>>>> in them cannot be clearly distinguished and are in fact the
> >>>>> same set of criteria (provided that all other items are also
> >>>>> the same and have the same masks). Templates are, well,
> >>>>> templates (or shapes) of the rules to come later and
> >>>>> do not include exact "spec" for the "ethdev_id".
> >>>>> Got it.
> >>>>>
> >>>>> But that's not going to be the case with items ANY_PHYS_PORTS and
> >>>>> ANY_VPORTS, is it? In one async table template, the user submits
> >>>>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
> >>>>> In another template, the user submits item ANY_VPORTS to
> >>>>> state that they want to match only traffic transmitted by
> >>>>> software endpoints (DPDK ethdevs, guest VFs, etc.)
> >>>>> connected to the switch.
> >>>>>
> >>>>> In this example, the PMD will clearly see that the two templates
> >>>>> differ. So it will be able to allocate separate resources, each
> >>>>> one "cutting one half of traffic" (as per your concept).
> >>>>>
> >>>>> 3) In your most recent response, you suggested that one might have
> >>>>> had the attributes occupied for some other purposes. To me,
> >>>>> they're not. Neither me nor my closest colleagues have
> >>>>> any plans on them. When I advocate using item approach
> >>>>> over the attribute approach, I do this to ensure
> >>>>> a) clarity of the API contract and b) robustness.
> >>>
> >>> If something is shared for all rules in the same table, it should be a table
> >>> property.
> >>
> >> But the whole pattern *template* is also a table property, isn't it?
> >>
> >
> > Like I said above, the pattern template can be used in all domains; that is
> > why there is a split between table and pattern. In addition, each table may
> > have a number of pattern templates.
>
> This is a valuable clarification. However, even if the attribute way
> may seem OK after this explanation, then I still don't understand
> why it is required to add this attribute to the generic "struct
> rte_flow_attr" and not just to the *table* attr.
>
> Generic "struct rte_flow_attr" is used both for async and
> sync (regular) approach. So why add something to generic
> struct which is never going to make sense to sync flows?
>
I guess we can move it to the table attribute, but I think that even
in the standard rte_flow API this can avoid duplicate insertion.
> >
> >>>
> >>>>>
> >>>>> 4) Also, in your response, you suggested that I might have
> >>>>> confused item mask and spec. That is not the case.
> >>>>> If we agree, that switch domain ID is unneeded in
> >>>>> the new items, then these items will have no
> >>>>> fields in them (like item PF had not had any
> >>>>> before it was deprecated).
> >>>>>
> >>>>> No fields in new items => no field masks.
> >>>>> So what's the problem then?
> >>>>>
> >>>>> 5) With regard to our talk about identifying the relationship
> >>>>> between ethdevs and switch domains, you said that the user
> >>>>> could know the difference from the very beginning:
> >>>>> /sysfs/ .... /PF_BDF/sriov_num
> >>>>>
> >>>>> That is true for the user who starts the application, but
> >>>>> this knowledge is hard to obtain from the application
> >>>>> perspective = it's hard to automate.
> >>>>>
> >>>>> This is why ethdevs are able to advertise their domain IDs.
> >>>>> And, as I explained, looking at domain ID to understand
> >>>>
> >>>> namely rte_eth_dev_info.switch_info.domain_id
> >>>>
> >>>>> port relationship is valid, whilst looking at proxy IDs
> >>>>> to achieve the same goal is not. Proxy port IDs only
> >>>>> serve the purpose of finding an entry point for
> >>>>> managing flows. That has slightly different
> >>>>> meaning, but this subtle difference is important.
> >>>>
> >>>> There is also a concept of sibling ports
> >>>> to get all ports belonging to the same hardware.
> >>>>
> >>>>
> >>>>> 6) As for the confusion over the difference between fixing
> >>>>> bugs and making the code robust by extra checks:
> >>>>>
> >>>>> Yes, I agree that the programmer who writes the
> >>>>> application must be intelligent enough to use
> >>>>> flow primitives the proper way. Yes, the user
> >>>>> who starts the application also should tread
> >>>>> carefully. But that does not prevent some
> >>>>> mistakes in other parts of code from
> >>>>> corrupting various chunks of memory,
> >>>>> including, for example, flow attrs.
> >>>>>
> >>>>> You say that such mistakes have to be "just fixed"
> >>>>> as any other bugs. Right. But how much time will
> >>>>> the programmer spend to identify the bugs?
> >>>>>
> >>>>> If the PMDs do all the checks (as with attributes),
> >>>>> the hypothetical bug will manifest itself much
> >>>>> earlier. That will simplify debugging by a lot...
> >>>>>
> >>>>> So, my point is that it's still better to ensure
> >>>>> that new flow primitives have all necessary
> >>>>> checks in place. For attributes, it is
> >>>>> required to add them separately.
> >>>>
> >>>> If flow insertion is done in a fast path,
> >>>> such checks may be skipped.
> >>>
> >>> The idea is that all rules in this table will share the same configuration,
> >>> there is no reason to say everything again for each rule. This is why
> >>> the rule attributes were moved to the table struct and not per rule.
> >>>
> >>>>
> >>>>> For items, as I explained, it might not be necessary
> >>>>> in the majority of cases simply because of the
> >>>>> switch (item->type) { case } structure.
> >>>>>
> >>>>> So, these are some of my points to explain why the
> >>>>> attribute approach is untenable. To me, attributes
> >>>>> are something global, which demands checks in all
> >>>>> flow-capable PMDs. Items seem better because they
> >>>>> are don't cares to all PMDs which are unaware of
> >>>>> the async concept. So, even if someone does not
> >>>>> implement the async concept or does not like
> >>>>> the new item names, they can turn a blind
> >>>>> eye to this - with attributes, they can't.
> >>>>>
> >>>
> >>> Good point,
> >>> Maybe we should add hints in the attribute,
> >>> for example, hint_only_wire; in this case it will be clear that
> >>> the PMD may ignore this, and it should be fully documented that this is not a
> >>> mandatory field.
> >>> What do you think?
> >>
> >> Theoretically, making terminology softer (like with the word "hint")
> >> could make things easier for vendors who may find the new feature
> >> confusing or something like that. But if, in reality, this hint
> >> is indeed another match criterion (see my comments above), then
> >> in no event shall the prefix "hint" be an excuse for this
> >> criterion not being expressed as a pattern item.
> >>
> >
> > Please see my response above. This is the point: it is much more than
> > matching.
> >
> >> Please hear me out: I don't mean to sound arrogant, - just trying
> >> to understand why expressing the new bit as an item can't be
> >> efficient enough for the async flow approach.
> >>
> >
> > I don't think you are arrogant, and I hope that you see that I do understand
> > your comments.
> > That said, I hope I explained why I think it is better to have it as a table
> > attribute and not as an item. (We are not matching on it; this helps the PMD
> > allocate the table at the best location and avoid duplication of rules.)
> >
> > If you wish, we can have a short phone call and discuss this.
> >
> > Best,
> > Ori
> >
> >
> >>>
> >>>>> Thank you.
> >>>>
> >>>>
> >>>
> >>>
> >>
> >> Ivan
> >
>
> Thanks,
> Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 7:34 ` Ori Kam
@ 2022-09-21 8:39 ` Andrew Rybchenko
2022-09-21 9:04 ` Ivan Malov
1 sibling, 0 replies; 96+ messages in thread
From: Andrew Rybchenko @ 2022-09-21 8:39 UTC (permalink / raw)
To: Ori Kam, Ivan Malov
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh
Hi Ori,
On 9/21/22 10:34, Ori Kam wrote:
> Hi Ivan,
>
> PSB my comments.
>
> In any case, I'm afraid we are in a deadlock.
> I understand your viewpoint, but I don't think it is the correct
> one for the feature suggested here,
> for all the reasons I listed.
>
> So from my viewpoint, the patch is Acked.
> If you wish, as I suggested before, we can have a meeting with
> Rongwei and anyone else who is interested and close this subject.
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, 20 September 2022 18:28
>> Subject: RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> Hi Ori,
>>
>> On Tue, 20 Sep 2022, Ori Kam wrote:
>>
>>> Hi Ivan,
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Tuesday, 20 September 2022 15:46
>>>>
>>>> Hi Ori,
>>>>
>>>> On Tue, 20 Sep 2022, Ori Kam wrote:
>>>>
>>>>> Hi Ivan, Thomas and Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>>>> Sent: Thursday, 15 September 2022 14:16
>>>>>>
>>>>>> 15/09/2022 12:59, Ivan Malov:
>>>>>>> Hi Rongwei,
>>>>>>>
>>>>>>> In this reply, I do not include the previous mail because the amount
>>>>>>> of inline commentary has gone haywire over the past couple of days.
>>>>>>> Let's re-iterate.
>>>>>>>
>>>>>>> But before I get to that, I'd like to offer a fresh perspective:
>>>>>>>
>>>>>>> Perhaps, if we all agree that term "vport" means an endpoint which
>>>>>>> can stand for any "port" except for physical one, then it should
>>>>>>> be possible to use term ANY_VPORTS rather than ANY_GUEST_PORTS.
>>>>>>
>>>>>> The opposite of "physical" is "virtual" indeed.
>>>>>>
>>>>>>> But that's tricky, of course. I don't have a way with naming,
>>>>>>> so more opinions are welcome and very-very desirable here.
>>>>>>>
>>>>>>> So:
>>>>>>>
>>>>>>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
>>>>>>> primitives are in fact yet another match criteria?
>>>>>>>
>>>>>>> ..
>>>>>>>
>>>>>>> To me, it looks so. If they are match criteria, then they belong
>>>>>>> in match pattern, that is, they should be expressed as new items.
>>>>>>>
>>>>>>> For "transfer" rules, the *existing* attributes are: "group"
>>>>>>> and "priority". As you may note, these are clearly not match
>>>>>>> criteria. They control the look-up order. So, to this day,
>>>>>>> there're no match criteria in DPDK expressed as attributes.
>>>>>>>
>>>>>>> If these "wire_orig" / "vf_orig" are going to be introduced
>>>>>>> as attributes, that should be backed with strong motivation.
>>>>>>
>>>>>> I prefer we keep matching in a single place, not in attributes.
>>>>>>
>>>>>
>>>>> I think we are talking about two different features.
>>>>> Feature 1:
>>>>> Allow matching on all vports that are not wire
It is good that we share understanding here.
I.e. the feature is about matching.
>>>>> Feature 2:
>>>>> Save allocation space and allow fast insertion.
>>>>> In this case, the matching is not on all vports; it can be on just some of the
>>>>> vports, but it will never be the wire port.
>>>>> For example:
>>>>> port 0 - wire
>>>>> ports 1,2,3,4,5 - vports
>>>>> the application wants to insert only these rules:
>>>>> represented_port(port_id=2) / eth / ipv4 (src==xx)
>>>>> represented_port(port_id=4) / eth / ipv4 (src==xx)
>>>>> represented_port(port_id=4) / eth / ipv4 (src==yy)
>>>>>
>>>>> For feature 1 I fully agree with you Ivan, this should be added as an item.
>>>>
>>>> Thank you.
>>>>
>>>>> For feature 2 I think Rongwei's suggestion is the better option.
>>>>> If I understand correctly, the idea is to give a hint to the PMD on where to
>>>>> allocate memory and how to insert the rules most optimally. Since this is
>>>>> shared for all rules, it makes more sense
>>>>> to add it as an attribute,
Hm, if I want to match on IPv4-UDP source port only in the
table, may I add an attribute for it? What makes the direction
matching criterion so special that it deserves an attribute?
Jokes aside, I perfectly realize that adding a new
attribute is simple. It is simple from the implementation point
of view. But that does not make it right from the overall design
point of view. IMHO the pattern is responsible for matching in
the RTE flow API, and all matching criteria should be there.
As for optimizations, I believe they are doable in a different
way. Just create a table and use a flow rule which matches on
the direction and jumps to the table. I guess you have everything
you need in that case.
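A rough sketch of that alternative, as a toy model in plain C. Nothing below is the rte_flow API; toy_jump_rule, toy_root_lookup and the enum names are made up for illustration. In real rte_flow terms the root rules would presumably carry a pattern item for the direction plus RTE_FLOW_ACTION_TYPE_JUMP, but that mapping is an assumption here. The point is only that the direction is matched once in the root group, so each target table sees traffic from a single origin.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these are not rte_flow types. */
enum toy_pkt_origin { TOY_PKT_FROM_WIRE, TOY_PKT_FROM_VPORT };

/* A root-group rule: match on traffic origin, then jump to another group. */
struct toy_jump_rule {
	enum toy_pkt_origin match_origin;
	unsigned int target_group;
};

/*
 * Look up the first root rule matching the packet origin.
 * Returns the target group, or 0 (stay in the root group) if nothing matches.
 */
unsigned int toy_root_lookup(const struct toy_jump_rule *rules, size_t nb_rules,
			     enum toy_pkt_origin pkt_origin)
{
	size_t i;

	assert(rules != NULL || nb_rules == 0);
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].match_origin == pkt_origin)
			return rules[i].target_group;
	}
	return 0;
}
```

With two root rules (wire to group 1, vport to group 2), the table in group 2 never sees wire traffic, which is the property the attribute is meant to convey.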
>>>>>
>>>>> just like we don’t have an ingress item (maybe we should?)
>>>>
>>>> But isn't pattern template also supposed to be shared for all rules
>>>> in the table? I.e., the user creates an async flow table and submits
>>>> a flow "shape" (which consists of attrs, pattern template and action
>>>> template). So why should "giving a hint" via an item template be
>>>> considered worse than doing so via an attribute?
>>>>
>>>
>>> The same item template may be used elsewhere; for example, the following
>>> pattern, eth / ipv4(src, dst) / udp(sport, dport), can be used in a number of
>>> different tables.
>>
>> In my understanding, the user may want to create flow table A
>> and use pattern template A' for it, which is as follows:
>>
>> any_vports / eth / ipv4 / udp
>>
>> The PMD can see this item and treat it exactly the same
>> way as it could treat such attribute ("where to allocate
>> resources, etc.").
>>
>> Then the user may want to create flow table B and
>> use pattern template B' for it:
>>
>> any_phy_ports / eth / ipv4 / udp
>>
>> Once again, the PMD can clearly see the difference between
>> the A' and B' templates and, this time, allocate resources
>> the other way (as per efficiency requirements).
>>
>
> Yes, but again you select all vports, and this is not what the application wants.
> Assuming port 0 is wire and DPDK ports 1,2,3,4,5 are vports,
> the application wants to insert the following rules:
> Represented_port(id=2) / eth / ipv4 / udp
> Represented_port(id=5) / eth / ipv4 / udp
>
> As you can see, the application doesn’t want all ports, just some vports, but for sure not the
> wire port.
>
> I agree that we can go with your approach, but it isn't correct: why should the application
> insert:
> any_phy_ports / Represented_port(id=2) / eth / ipv4 / udp
> any_phy_ports / Represented_port(id=2) / eth / ipv4 / udp
>
>> By saying "can be used on number of different tables", do you mean
>> that it is important to make the *network* part of the pattern
>> shareable between flow tables? I.e. are you saying that
>> templates A' and B' cause resource duplication just
>> because of the same *network* part in your case?
>>
>
> I'm saying that adding the item will mean that the application can't reuse the pattern template it created.
> If the application created a 5-tuple template,
> it can reuse it in ingress tables, egress tables and FDB tables; there is no need to create
> extra pattern templates.
>
>>> I think that the main difference between us is that, from my point of view,
>>> this value is just where to allocate resources / how to better insert the rule.
>>> It is not related to matching.
>>
>> To me, it *is* the match criterion which, at the same time, serves
>> as a value indicating the way how resources should be allocated.
>> But before all, it is a match criterion.
>>
>
> It depends on how you define matching, but just as ingress / egress is not matching,
> the same goes here.
>
>> If it refers to a group of ports = in order to ditch "the other half"
>> of traffic from consideration (like Rongwei explained), then it
>> looks like a match criterion.
>>
>
> See the comment above. Also, in this case it relates to allocation and insertion;
> the matching is a side product.
>
>>> From Nvidia's viewpoint, we need this information so we can allocate the
>>> resource at the correct place and avoid inserting duplicate rules.
>>
>> I see.
>>
>>> I agree that by using the item we can get the same results, but it is incorrect
>>> since we are not matching on it.
>>
>> If one provides item UDP in the pattern and does not match on any UDP
>> fields, doing so nevertheless *is* matching on particular packet type.
>>
> Yes, it matches all UDP, but again, in this case the idea is not to match all vports
> but just to tell the PMD that there will be only vports arriving at this table.
>
>> The same seemingly goes for the new attribute / item. If it is
>> provided, then the user doesn't want the rule to affect
>> packets coming from certain ports (i.e. from wire).
>>
>> So still sounds like matching.
>>
>>> Part of the idea of the template API is to give as many hints as possible to the
>>> PMD so the insertion will be optimized.
>>
>> I see.
>>
>>>
>>>
>>>> As for "ingress" item, - no, one should not add such. We have had
>>>> many discussions concerning this bit in the past. Ingress/egress
>>>> are non-transfer terms. They belong in the scope of vNIC / ethdev
>>>> filtering, not to embedded switch rules.
>>>>
>>>> In my opinion, in the embedded switch, one should either point to
>>>> some precise switch ports (using REPRESENTOR / REPRESENTED items)
>>>> or use another kind of item to refer to a "super set" of ports
>>>> which have something in common ("all wire ports", "all NON-wire ports").
>>>>
>>>
>>> But this is my point: we don't want all wire ports or all NON-wire ports, we
>>> just know that in this table we will have only non-wire / wire ports.
>>
>> But how do these two viewpoints contradict each other?
>>
>
> Which viewpoints?
>
>>>
>>>>>
>>>>> Ivan, we have the items RTE_FLOW_ITEM_TYPE_PF and
>>>>> RTE_FLOW_ITEM_TYPE_VF, which are deprecated.
>>>>> So do you want to un-deprecate them?
>>>>
>>>> No. These items are deprecated because:
>>>>
>>>> a) their names suggest that application knows whether an ethdev
>>>> sits on top of a PF or that the application has some
>>>> knowledge of existence of particular VFs, but in
>>>> reality applications should not be worried of
>>>> the underlying function type = to them, all
>>>> ethdevs are just representors of something,
>>>> and if the application needs to refer to
>>>> VFs (or other PFs, - doesn't matter), it
>>>> should do that via REPRESENTOR items;
>>>>
>>>> b) such items would duplicate REPRESENTOR / REPRESENTED.
>>>>
>>> Agree with everything you say.
>>
>> Great we're on the same page regarding this bit.
>>
>>>
>>>>>
>>>>> To summarize, if the PMD can use such a hint during rule creation and save
>>>>> memory, I vote to allow it.
>>>>> If the idea is to match on all vports, then it should be an item.
>>>>
>>>> But such a hint would effectively be a match criterion, too, right?
>>>> So, in fact it's a combined use case: a match criterion which is
>>>> flexible enough to be a "hint" = i.e. the PMD can see it when
>>>> processing the pattern *template* and treat it as a hint.
>>>>
>>>
>>> Yes, but it is an implicit match, just like saying ingress / egress; it has
>>> meaning beyond the matching. In addition, there is no reason to add an extra
>>> item for each rule we create, just to enable something that is fixed during
>>> table creation.
>>> An extra item in the pattern template means an extra item for each rule.
>>> I know we can avoid this and optimize the code, but why add something
>>> that no one needs after table creation?
>>
>> Good question. But, in case some way exists to make such optimisation
>> laconic enough to avoid confusion etc., then it should be no problem
>> in preferring the pattern approach over attribute approach.
>>
>>>
>>>
>>>>>
>>>>>>
>>>>>>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and "ANY_VPORTS"
>>>>>>> won't do? Or, which problems do you think they may inflict?
>>>>>>>
>>>>>>> ..
>>>>>>>
>>>>>>> Previously, you explained why REPRESENTED_PORT would not
>>>>>>> fit your needs. And I understand your point: to async API,
>>>>>>> two pattern templates which both have item REPRESENTED_PORT
>>>>>>> in them cannot be clearly distinguished and are in fact the
>>>>>>> same set of criteria (provided that all other items are also
>>>>>>> the same and have the same masks). Templates are, well,
>>>>>>> templates (or shapes) of the rules to come later and
>>>>>>> do not include exact "spec" for the "ethdev_id".
>>>>>>> Got it.
>>>>>>>
>>>>>>> But that's not going to be the case with items ANY_PHYS_PORTS and
>>>>>>> ANY_VPORTS, is it? In one async table template, the user submits
>>>>>>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
>>>>>>> In another template, the user submits item ANY_VPORTS to
>>>>>>> state that they want to match only traffic transmitted by
>>>>>>> software endpoints (DPDK ethdevs, guest VFs, etc.)
>>>>>>> connected to the switch.
>>>>>>>
>>>>>>> In this example, the PMD will clearly see that the two templates
>>>>>>> differ. So it will be able to allocate separate resources, each
>>>>>>> one "cutting one half of traffic" (as per your concept).
>>>>>>>
>>>>>>> 3) In your most recent response, you suggested that one might have
>>>>>>> had the attributes occupied for some other purposes. To me,
>>>>>>> they're not. Neither me nor my closest colleagues have
>>>>>>> any plans on them. When I advocate using item approach
>>>>>>> over the attribute approach, I do this to ensure
>>>>>>> a) clarity of the API contract and b) robustness.
>>>>>
>>>>> If something is shared for all rules in the same table, it should be a table
>>>>> property.
>>>>
>>>> But the whole pattern *template* is also a table property, isn't it?
>>>>
>>>
>>> Like I said above the pattern template can be used in all domains that is
>> why
>>> there is a split between table and pattern, in addition to that each table may
>> have
>>> number of pattern templates.
>>
>> This is a valuable clarification. However, even if the attribute way
>> may seem OK after this explanation, then I still don't understand
>> why it is required to add this attribute to the generic "struct
>> rte_flow_attr" and not just to the *table* attr.
>>
>> Generic "struct rte_flow_attr" is used both for async and
>> sync (regular) approach. So why add something to generic
>> struct which is never going to make sense to sync flows?
>>
>
> I guess we can move it to the table attribute, but I think that even
> in standard rte_flow API this can save duplicate insertion.
>
>>>
>>>>>
>>>>>>>
>>>>>>> 4) Also, in your response, you suggested that I might have
>>>>>>> confused item mask and spec. That is not the case.
>>>>>>> If we agree, that switch domain ID is unneeded in
>>>>>>> the new items, then these items will have no
>>>>>>> fields in them (like item PF had not had any
>>>>>>> before it was deprecated).
>>>>>>>
>>>>>>> No fields in new items => no field masks.
>>>>>>> So what's the problem then?
>>>>>>>
>>>>>>> 5) With regard to our talk about identifying the relationship
>>>>>>> between ethdevs and switch domains, you said that the user
>>>>>>> could know the difference from the very beginning:
>>>>>>> /sysfs/ .... /PF_BDF/sriov_num
>>>>>>>
>>>>>>> That is true for the user who starts the application, but
>>>>>>> this knowledge is hard to obtain from the application
>>>>>>> perspective = it's hard to automate.
>>>>>>>
>>>>>>> This is why ethdevs are able to advertise their domain IDs.
>>>>>>> And, as I explained, looking at domain ID to understand
>>>>>>
>>>>>> namely rte_eth_dev_info.switch_info.domain_id
>>>>>>
>>>>>>> port relationship is valid, whilst looking at proxy IDs
>>>>>>> to achieve the same goal is not. Proxy port IDs only
>>>>>>> serve the purpose of finding an entry point for
>>>>>>> managing flows. That has slightly different
>>>>>>> meaning, but this subtle difference is important.
>>>>>>
>>>>>> There is also a concept of sibling ports
>>>>>> to get all ports belonging to the same hardware.
>>>>>>
>>>>>>
>>>>>>> 6) As for the confusion over the difference between fixing
>>>>>>> bugs and making the code robust by extra checks:
>>>>>>>
>>>>>>> Yes, I agree that the programmer who writes the
>>>>>>> application must be intelligent enough to use
>>>>>>> flow primitives the proper way. Yes, the user
>>>>>>> who starts the application also should tread
>>>>>>> carefully. But that does not prevent some
>>>>>>> mistakes in other parts of code from
>>>>>>> corrupting various chunks of memory,
>>>>>>> including, for example, flow attrs.
>>>>>>>
>>>>>>> You say that such mistakes have to be "just fixed"
>>>>>>> as any other bugs. Right. But how much time will
>>>>>>> the programmer spend to identify the bugs?
>>>>>>>
>>>>>>> If the PMDs do all the checks (as with attributes),
>>>>>>> the hypothetical bug will manifest itself much
>>>>>>> earlier. That will simplify debugging by a lot...
>>>>>>>
>>>>>>> So, my point is that it's still better to ensure
>>>>>>> that new flow primitives have all necessary
>>>>>>> checks in place. For attributes, it is
>>>>>>> required to add them separately.
>>>>>>
>>>>>> If flow insertion is done in a fast path,
>>>>>> such checks may be skipped.
>>>>>
>>>>> The idea is that all rules in this table will share the same configuration,
>>>>> there is no reason to say everything again for each rule. This is why
>>>>> the rule attributes were moved to the table struct and not per rule.
>>>>>
>>>>>>
>>>>>>> For items, as I explained, it might not be necessary
>>>>>>> in the majority of cases simply because of the
>>>>>>> switch (item->type) { case } structure.
>>>>>>>
>>>>>>> So, these are some of my points to explain why the
>>>>>>> attribute approach is untenable. To me, attributes
>>>>>>> are something global, which demands checks in all
>>>>>>> flow-capable PMDs. Items seem better because they
>>>>>>> are don't cares to all PMDs which are unaware of
>>>>>>> the async concept. So, even if someone does not
>>>>>>> implement the async concept or does not like
>>>>>>> the new item names, they can turn a blind
>>>>>>> eye to this - with attributes, they can't.
>>>>>>>
>>>>>
>>>>> Good point,
>>>>> Maybe we should add hints in the attribute,
>>>>> for example, hint_only_wire in this case it will be clear that
>>>>> PMD may ignore this, and it should be fully documented that this is not a
>>>> mandatory field.
>>>>> What do you think?
>>>>
>>>> Theoretically, making terminology softer (like with the word "hint")
>>>> could make things easier for vendors who may find the new feature
>>>> confusing or something like that. But if, in reality, this hint
>>>> is indeed another match criterion (see my comments above), then
>>>> in no event shall the prefix "hint" be an excuse for this
>>>> criterion not being expressed as a pattern item.
>>>>
>>>
>>> Please see my response above. This is the point it is much more than
>> matching.
>>>
>>>> Please hear me out: I don't mean to sound arrogant, - just trying
>>>> to understand why expressing the new bit as an item can't be
>>>> efficient enough for the async flow approach.
>>>>
>>>
>>> I don't think you are arrogant, and I hope that you see that I do understand
>> your comments.
>>> saying that I hope I explained why I think it is better to have it as a table
>> attribute and not as an
>>> item. (We are not matching on it, this helps the PMD allocate the table at
>> the best location and avoid
>>> duplication of rules)
>>>
>>> If you wish, we can have a short phone call and discuss this.
>>>
>>> Best,
>>> Ori
>>>
>>>
>>>>>
>>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> Ivan
>>>
>>
>> Thanks,
>> Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 7:34 ` Ori Kam
2022-09-21 8:39 ` Andrew Rybchenko
@ 2022-09-21 9:04 ` Ivan Malov
2022-09-21 9:40 ` Thomas Monjalon
1 sibling, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-21 9:04 UTC (permalink / raw)
To: Ori Kam
Cc: NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh
Hi Ori,
On Wed, 21 Sep 2022, Ori Kam wrote:
> Hi Ivan,
>
> PSB my comments.
>
> In any case, I'm afraid we are in a deadlock.
Hope we're not in fact.
> I understand your viewpoint, I don't think it is the correct
> one for the feature suggested here.
> For all the reasons I listed.
Ori, your two most recent replies are indeed valuable clarifications.
Now it's clear to me that your intention is to match on exact ports,
as usual, but this time with a hint for the flow table. Got it.
In your response, you say that matching on ALL vports is not what
the use case needs. OK, I understood. But please note that the
item name does not say "ALL", it says "ANY".
OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
and then provides item REPRESENTED_PORT, these two items do not
contradict each other. Item VPORTS_ONLY defines the scope of some
kind, then the following item, REPRESENTED_PORT, makes it narrower.
And, in documentation, one can say clearly that the user *may*
omit item VPORTS_ONLY in the exact rule pattern provided that
they have already submitted this item as part of the template.
It's like with match items IPV4 / UDP. Item IPV4 does not
contradict item UDP. They supplement each other. Same way,
VPORTS_ONLY says one thing (PHYS_PORTS do NOT match),
and then REPRESENTED_PORT clarifies it and specifies
which exact VPORT shall match. Isn't that acceptable?
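
To make the "scope first, exact port second" reading concrete, here is a
minimal sketch in plain C. It is not DPDK code: the item names VPORTS_ONLY
and PHY_PORTS_ONLY are only the ones proposed in this mail, and every
sketch_* type, the from_wire flag and the function name are hypothetical
stand-ins chosen for illustration. It only shows that each item narrows the
match and that the two kinds of item do not contradict each other.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; the two *_ONLY items are merely proposed here. */
enum sketch_item_type {
	SKETCH_ITEM_END = 0,
	SKETCH_ITEM_VPORTS_ONLY,      /* scope: traffic from non-wire ports */
	SKETCH_ITEM_PHY_PORTS_ONLY,   /* scope: traffic from wire ports */
	SKETCH_ITEM_REPRESENTED_PORT, /* exact port inside the scope */
};

struct sketch_item {
	enum sketch_item_type type;
	uint16_t port_id; /* meaningful for REPRESENTED_PORT only */
};

/* Traffic source as the embedded switch would see it (assumed model). */
struct sketch_pkt_src {
	uint16_t port_id;
	bool from_wire;
};

/* Every item must hold; each item can only narrow the match further. */
bool
sketch_pattern_match(const struct sketch_item *pattern,
		     const struct sketch_pkt_src *src)
{
	const struct sketch_item *it;

	for (it = pattern; it->type != SKETCH_ITEM_END; it++) {
		switch (it->type) {
		case SKETCH_ITEM_VPORTS_ONLY:
			if (src->from_wire)
				return false;
			break;
		case SKETCH_ITEM_PHY_PORTS_ONLY:
			if (!src->from_wire)
				return false;
			break;
		case SKETCH_ITEM_REPRESENTED_PORT:
			if (src->port_id != it->port_id)
				return false;
			break;
		default:
			return false; /* unknown item: reject */
		}
	}
	return true;
}
```

With the pattern "VPORTS_ONLY / REPRESENTED_PORT(2)", traffic from vport 2
matches, whereas traffic from vport 4 or from the wire does not.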
>
> So from my viewpoint, the patch is Acked.
> If you wish as I suggested before, we can have a meeting with
> Rongwei and anyone else who is interested and close this subject.
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>> Sent: Tuesday, 20 September 2022 18:28
> RE: [PATCH v1] ethdev: add direction info when creating the transfer
>> table
>>
>> Hi Ori,
>>
>> On Tue, 20 Sep 2022, Ori Kam wrote:
>>
>>> Hi Ivan,
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@oktetlabs.ru>
>>>> Sent: Tuesday, 20 September 2022 15:46
>>>>
>>>> Hi Ori,
>>>>
>>>> On Tue, 20 Sep 2022, Ori Kam wrote:
>>>>
>>>>> Hi Ivan, Thomas and Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Thomas Monjalon <thomas@monjalon.net>
>>>>>> Sent: Thursday, 15 September 2022 14:16
>>>>>>
>>>>>> 15/09/2022 12:59, Ivan Malov:
>>>>>>> Hi Rongwei,
>>>>>>>
>>>>>>> In this reply, I do not include the previous mail because the amount
>>>>>>> of inline commentary has gone haywire over the past couple of days.
>>>>>>> Let's re-iterate.
>>>>>>>
>>>>>>> But before I get to that, I'd like to offer a fresh perspective:
>>>>>>>
>>>>>>> Perhaps, if we all agree that term "vport" means an endpoint which
>>>>>>> can stand for any "port" except for physical one, then it should
>>>>>>> be possible to use term ANY_VPORTS rather than
>> ANY_GUEST_PORTS.
>>>>>>
>>>>>> The opposite of "physical" is "virtual" indeed.
>>>>>>
>>>>>>> But that's tricky, of course. I don't have a way with naming,
>>>>>>> so more opinions are welcome and very-very desirable here.
>>>>>>>
>>>>>>> So:
>>>>>>>
>>>>>>> 1) Do you agree that, in your proposal, the new "wire_orig" / "vf_orig"
>>>>>>> primitives are in fact yet another match criteria?
>>>>>>>
>>>>>>> ..
>>>>>>>
>>>>>>> To me, it looks so. If they are match criteria, then they belong
>>>>>>> in match pattern, that is, they should be expressed as new items.
>>>>>>>
>>>>>>> For "transfer" rules, the *existing* attributes are: "group"
>>>>>>> and "priority". As you may note, these are clearly not match
>>>>>>> criteria. They control the look-up order. So, to this day,
>>>>>>> there're no match criteria in DPDK expressed as attributes.
>>>>>>>
>>>>>>> If these "wire_orig" / "vf_orig" are going to be introduced
>>>>>>> as attributes, that should be backed with strong motivation.
>>>>>>
>>>>>> I prefer we keep matching in a single place, not in attributes.
>>>>>>
>>>>>
>>>>> I think we are talking about two different features.
>>>>> Feature 1:
>>>>> Allow matching on all vports that are not wire
>>>>> Feature 2:
>>>>> Save allocation space and allow fast insertion.
>>>>> In this case, the matching is not on all vports it can be just part of the
>> vports
>>>>> but it will never be the wire port.
>>>>> For example:
>>>>> port 0 - wire
>>>>> ports 1,2,3,4,5 - vports
>>>>> the application wants to insert only these rules:
>>>>> represented_port(port_id=2) / eth / ipv4 (src==xx)
>>>>> represented_port(port_id=4) / eth / ipv4 (src==xx)
>>>>> represented_port(port_id=4) / eth / ipv4 (src==yy)
>>>>>
>>>>> For feature 1 I fully agree with you Ivan, this should be added as an item.
>>>>
>>>> Thank you.
>>>>
>>>>> For feature 2 I think Rongwei's suggestion is the better option.
>>>>> If I understand correctly the idea is to give hint to the PMD on where to
>>>> allocate memory
>>>>> and how to insert the rules most optimally. Since this is shared for all
>> rules it
>>>> makes more sense
>>>>> to add it as an attribute, just like we don’t have an ingress item (maybe
>> we
>>>> should?)
>>>>
>>>> But isn't pattern template also supposed to be shared for all rules
>>>> in the table? I.e., the user creates an async flow table and submits
>>>> a flow "shape" (which consists of attrs, pattern template and action
>>>> template). So why should "giving a hint" via an item template be
>>>> considered worse than doing so via an attribute?
>>>>
>>>
>>> The same item template maybe used elsewhere, for example, the
>> following
>>> pattern eth / ipv4(src, dst) / udp(sport, dport), can be used on number of
>> different
>>> tables.
>>
>> In my understanding, the user may want to create flow table A
>> and use pattern template A' for it, which is as follows:
>>
>> any_vports / eth / ipv4 / udp
>>
>> The PMD can see this item and treat it exactly the same
>> way as it could treat such attribute ("where to allocate
>> resources, etc.").
>>
>> Then the user may want to create flow table B and
>> use pattern template B' for it:
>>
>> any_phy_ports / eth / ipv4 / udp
>>
>> Once again, the PMD can clearly see the difference between
>> the A' and B' templates and, this time, allocate resources
>> the other way (as per efficiency requirements).
>>
>
> Yes, but again you select all vports, this is not what the application wants
> the application wants to insert the following rules:
> Assuming port 0 is wire and DPDK ports 1,2,3,4,5 are vports.
> Represented_port(id=2) / eth / ipv4/ udp
> Represented_port(id=5) / eth / ipv4/ udp
>
> As you can see the application doesn’t want all ports just some vports but for sure not the
> wire port.
>
> I agree that we can go with your approach, but it isn't correct since why application should
> insert:
> any_phy_ports / Represented_port(id=2) / eth / ipv4/ udp
> any_phy_ports / Represented_port(id=2) / eth / ipv4/ udp
Thanks for the explanation. Now I get the idea, yes.
But the rules which you list and say they're incorrect
are in fact correct. Please see my thoughts above.
>
>> By saying "can be used on number of different tables", do you mean
>> that it is important to make the *network* part of the pattern
>> shareable between flow tables? I.e. are you saying that
>> templates A' and B' cause resource duplication just
>> because of the same *network* part in your case?
>>
>
> I'm saying that adding the item will mean that the application can't reuse the pattern template it created.
> if the application created 5 tuple template.
> It can reuse it in ingress tables, egress tables, FDB tables there is no need to create
> extra pattern templates.
I'd say, if the application has to create two separate templates which
only differ in the first item (VPORTS / PHYS_PORTS), then it should be
pretty much acceptable. Yes, theoretically, if something can be shared,
then why not indeed share it, but, on the other hand, sharing logic
can be error prone. Keeping templates with this kind of item in
them separate could potentially make code more robust, I think.
No strong opinion here. Andrew? Thomas?
>
>>> I think that the main difference between us is that from my point of view
>> this value is just
>>> where to allocate resources / how to better insert the rule. It is not related
>> to matching.
>>
>> To me, it *is* the match criterion which, at the same time, serves
>> as a value indicating the way how resources should be allocated.
>> But before all, it is a match criterion.
>>
>
> Depends on how you define matching, but just like ingress / egress is not matching
> the same goes here.
>
>> If it refers to a group of ports = in order to ditch "the other half"
>> of traffic from consideration (like Rongwei explained), then it
>> looks like a match criterion.
>>
>
> See above comment, also this case it relates to allocation and insertion,
> the matching is side product.
>
>>> From Nvidia viewpoint we need this information so we can allocate the
>> resource at the correct
>>> place and avoid inserting duplication of rules.
>>
>> I see.
>>
>>> I agree that by using the item we can get the same results, but it is incorrect
>> since we are not matching on it.
>>
>> If one provides item UDP in the pattern and does not match on any UDP
>> fields, doing so nevertheless *is* matching on particular packet type.
>>
> Yes, it matches all UDP, while in this case the idea is not to match all vports
> but just to tell the PMD that there will be only vports arriving to this table.
Please see above. This time, I do understand the idea. But I do not
propose to say that "ALL" vports should match. I never suggested
to use word "ALL". Only "ANY". Now I see it can rather be "ONLY"
or something like that. I say that this item defines the scope
(broad match), and the following one, REPRESENTED_PORT, will
define an exact port to match that belongs in this scope.
>
>> The same seemingly goes for the new attribute / item. If it is
>> provided, then the user doesn't want the rule to affect
>> packets coming from certain ports (i.e. from wire).
>>
>> So still sounds like matching.
>>
>>> Part of the idea of template API is to give as many hints as possible to the
>> PMD so the insertion will be optimized.
>>
>> I see.
>>
>>>
>>>
>>>> As for "ingress" item, - no, one should not add such. We have had
>>>> many discussions concerning this bit in the past. Ingress/egress
>>>> are non-transfer terms. They belong in the scope of vNIC / ethdev
>>>> filtering, not to embedded switch rules.
>>>>
>>>> In my opinion, in the embedded switch, one should either point to
>>>> some precise switch ports (using REPRESENTOR / REPRESENTED items)
>>>> or use another kind of item to refer to a "super set" of ports
>>>> which have something in common ("all wire ports", "all NON-wire ports").
>>>>
>>>
>>> But this is my point we don't want all wire ports or all NON-wire ports, we
>> just know that in this table
>>> we will have only non-wire / wire ports.
>>
>> But how do these two viewpoints contradict each other?
>>
>
> Which viewpoints?
>
>>>
>>>>>
>>>>> Ivan we have the item RTE_FLOW_ITEM_TYPE_PF and
>>>> RTE_FLOW_ITEM_TYPE_VF which are deprecated,
>>>>> So do you want to un-deprecate them?
>>>>
>>>> No. These items are deprecated because:
>>>>
>>>> a) their names suggest that application knows whether an ethdev
>>>> sits on top of a PF or that the application has some
>>>> knowledge of existence of particular VFs, but in
>>>> reality applications should not be worried of
>>>> the underlying function type = to them, all
>>>> ethdevs are just representors of something,
>>>> and if the application needs to refer to
>>>> VFs (or other PFs, - doesn't matter), it
>>>> should do that via REPRESENTOR items;
>>>>
>>>> b) such items would duplicate REPRESENTOR / REPRESENTED.
>>>>
>>> Agree with everything you say.
>>
>> Great we're on the same page regarding this bit.
>>
>>>
>>>>>
>>>>> To summarize, if PMD can use such an hint during rule creation and save
>>>> memory, I vote
>>>>> to allow it.
>>>>> if the idea is to match on all vports then it should be an item.
>>>>
>>>> But such a hint would effectively be a match criterion, too, right?
>>>> So, in fact it's a combined use case: a match criterion which is
>>>> flexible enough to be a "hint" = i.e. the PMD can see it when
>>>> processing the pattern *template* and treat it as a hint.
>>>>
>>>
>>> Yes, but it is an implicit match, just like saying ingress. Egress it has meaning
>> above the
>>> matching. In addition, there is no reason to add extra item for each rule we
>> create, just
>>> to enable something that is fixed during the table creation.
>>> Extra item in pattern template means extra item for each rule.
>>> I know we can avoid this and optimize the code but why add something
>> that no one needs
>>> after table creation?
>>
>> Good question. But, in case some way exists to make such optimisation
>> laconic enough to avoid confusion etc., then it should be no problem
>> in preferring the pattern approach over attribute approach.
>>
>>>
>>>
>>>>>
>>>>>>
>>>>>>> 2) From your viewpoint, why items "ANY_PHYS_PORTS" and
>>>>>> "ANY_VPORTS"
>>>>>>> won't do? Or, which problems do you think they may inflict?
>>>>>>>
>>>>>>> ..
>>>>>>>
>>>>>>> Previously, you explained why REPRESENTED_PORT would not
>>>>>>> fit your needs. And I understand your point: to async API,
>>>>>>> two pattern templates which both have item REPRESENTED_PORT
>>>>>>> in them cannot be clearly distinguished and are in fact the
>>>>>>> same set of criteria (provided that all other items are also
>>>>>>> the same and have the same masks). Templates are, well,
>>>>>>> templates (or shapes) of the rules to come later and
>>>>>>> do not include exact "spec" for the "ethdev_id".
>>>>>>> Got it.
>>>>>>>
>>>>>>> But that's not going to be the case with items ANY_PHYS_PORTS
>> and
>>>>>>> ANY_VPORTS, is it? In one async table template, the user submits
>>>>>>> item ANY_PHYS_PORTS (instead of table attribute "wire_orig").
>>>>>>> In another template, the user submits item ANY_VPORTS to
>>>>>>> state that they want to match only traffic transmitted
>>>>>>> by software endpoints (DPDK ethdevs, guest VFs, etc.)
>>>>>>> connected to the switch.
>>>>>>>
>>>>>>> In this example, the PMD will clearly see that the two templates
>>>>>>> differ. So it will be able to allocate separate resources, each
>>>>>>> one "cutting one half of traffic" (as per your concept).
>>>>>>>
>>>>>>> 3) In your most recent response, you suggested that one might have
>>>>>>> had the attributes occupied for some other purposes. To me,
>>>>>>> they're not. Neither me nor my closest colleagues have
>>>>>>> any plans on them. When I advocate using item approach
>>>>>>> over the attribute approach, I do this to ensure
>>>>>>> a) clarity of the API contract and b) robustness.
>>>>>
>>>>> If something is shared for all rules in the same table, it should be a table
>>>>> property.
>>>>
>>>> But the whole pattern *template* is also a table property, isn't it?
>>>>
>>>
>>> Like I said above the pattern template can be used in all domains that is
>> why
>>> there is a split between table and pattern, in addition to that each table may
>> have
>>> number of pattern templates.
>>
>> This is a valuable clarification. However, even if the attribute way
>> may seem OK after this explanation, then I still don't understand
>> why it is required to add this attribute to the generic "struct
>> rte_flow_attr" and not just to the *table* attr.
>>
>> Generic "struct rte_flow_attr" is used both for async and
>> sync (regular) approach. So why add something to generic
>> struct which is never going to make sense to sync flows?
>>
>
> I guess we can move it to the table attribute, but I think that even
> in standard rte_flow API this can save duplicate insertion.
Could you please expand on the standard (non-Async) rule insertion?
In which way can it save resources? Just an example, to get the idea.
Also, please note, that, during our talk with Rongwei, I failed
to explain that, if this new attribute can indeed be used not
only for Async flows, but also for standard (sync) ones,
then testpmd diff should also extend testpmd parser for
regular flows, i.e. the user should be able to write
flow create 0 transfer new_attr_here pattern ... ...
(or validate).
I'm afraid the current code only allows specifying the attribute
in commands which work exclusively for async tables.
That does not seem right.
>
>>>
>>>>>
>>>>>>>
>>>>>>> 4) Also, in your response, you suggested that I might have
>>>>>>> confused item mask and spec. That is not the case.
>>>>>>> If we agree, that switch domain ID is unneeded in
>>>>>>> the new items, then these items will have no
>>>>>>> fields in them (like item PF had not had any
>>>>>>> before it was deprecated).
>>>>>>>
>>>>>>> No fields in new items => no field masks.
>>>>>>> So what's the problem then?
>>>>>>>
>>>>>>> 5) With regard to our talk about identifying the relationship
>>>>>>> between ethdevs and switch domains, you said that the user
>>>>>>> could know the difference from the very beginning:
>>>>>>> /sysfs/ .... /PF_BDF/sriov_num
>>>>>>>
>>>>>>> That is true for the user who starts the application, but
>>>>>>> this knowledge is hard to obtain from the application
>>>>>>> perspective = it's hard to automate.
>>>>>>>
>>>>>>> This is why ethdevs are able to advertise their domain IDs.
>>>>>>> And, as I explained, looking at domain ID to understand
>>>>>>
>>>>>> namely rte_eth_dev_info.switch_info.domain_id
>>>>>>
>>>>>>> port relationship is valid, whilst looking at proxy IDs
>>>>>>> to achieve the same goal is not. Proxy port IDs only
>>>>>>> serve the purpose of finding an entry point for
>>>>>>> managing flows. That has slightly different
>>>>>>> meaning, but this subtle difference is important.
>>>>>>
>>>>>> There is also a concept of sibling ports
>>>>>> to get all ports belonging to the same hardware.
>>>>>>
>>>>>>
>>>>>>> 6) As for the confusion over the difference between fixing
>>>>>>> bugs and making the code robust by extra checks:
>>>>>>>
>>>>>>> Yes, I agree that the programmer who writes the
>>>>>>> application must be intelligent enough to use
>>>>>>> flow primitives the proper way. Yes, the user
>>>>>>> who starts the application also should tread
>>>>>>> carefully. But that does not prevent some
>>>>>>> mistakes in other parts of code from
>>>>>>> corrupting various chunks of memory,
>>>>>>> including, for example, flow attrs.
>>>>>>>
>>>>>>> You say that such mistakes have to be "just fixed"
>>>>>>> as any other bugs. Right. But how much time will
>>>>>>> the programmer spend to identify the bugs?
>>>>>>>
>>>>>>> If the PMDs do all the checks (as with attributes),
>>>>>>> the hypothetical bug will manifest itself much
>>>>>>> earlier. That will simplify debugging by a lot...
>>>>>>>
>>>>>>> So, my point is that it's still better to ensure
>>>>>>> that new flow primitives have all necessary
>>>>>>> checks in place. For attributes, it is
>>>>>>> required to add them separately.
>>>>>>
>>>>>> If flow insertion is done in a fast path,
>>>>>> such checks may be skipped.
>>>>>
>>>>> The idea is that all rules in this table will share the same configuration,
>>>>> there is no reason to say everything again for each rule. This is why
>>>>> the rule attributes were moved to the table struct and not per rule.
>>>>>
>>>>>>
>>>>>>> For items, as I explained, it might not be necessary
>>>>>>> in the majority of cases simply because of the
>>>>>>> switch (item->type) { case } structure.
>>>>>>>
>>>>>>> So, these are some of my points to explain why the
>>>>>>> attribute approach is untenable. To me, attributes
>>>>>>> are something global, which demands checks in all
>>>>>>> flow-capable PMDs. Items seem better because they
>>>>>>> are don't cares to all PMDs which are unaware of
>>>>>>> the async concept. So, even if someone does not
>>>>>>> implement the async concept or does not like
>>>>>>> the new item names, they can turn a blind
>>>>>>> eye to this - with attributes, they can't.
>>>>>>>
>>>>>
>>>>> Good point,
>>>>> Maybe we should add hints in the attribute,
>>>>> for example, hint_only_wire in this case it will be clear that
>>>>> PMD may ignore this, and it should be fully documented that this is not a
>>>> mandatory field.
>>>>> What do you think?
>>>>
>>>> Theoretically, making terminology softer (like with the word "hint")
>>>> could make things easier for vendors who may find the new feature
>>>> confusing or something like that. But if, in reality, this hint
>>>> is indeed another match criterion (see my comments above), then
>>>> in no event shall the prefix "hint" be an excuse for this
>>>> criterion not being expressed as a pattern item.
>>>>
>>>
>>> Please see my response above. This is the point it is much more than
>> matching.
>>>
>>>> Please hear me out: I don't mean to sound arrogant, - just trying
>>>> to understand why expressing the new bit as an item can't be
>>>> efficient enough for the async flow approach.
>>>>
>>>
>>> I don't think you are arrogant, and I hope that you see that I do understand
>> your comments.
>>> saying that I hope I explained why I think it is better to have it as a table
>> attribute and not as an
>>> item. (We are not matching on it, this helps the PMD allocate the table at
>> the best location and avoid
>>> duplication of rules)
>>>
>>> If you wish, we can have a short phone call and discuss this.
>>>
>>> Best,
>>> Ori
>>>
>>>
>>>>>
>>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> Ivan
>>>
>>
>> Thanks,
>> Ivan
>
Ivan
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 9:04 ` Ivan Malov
@ 2022-09-21 9:40 ` Thomas Monjalon
2022-09-21 10:04 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-09-21 9:40 UTC (permalink / raw)
To: Ori Kam, Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, Andrew Rybchenko, dev, Raslan Darawsheh, jerinj,
ajit.khaparde
21/09/2022 11:04, Ivan Malov:
> Now it's clear to me that your intention is to match on exact ports,
> as usual, but this time with a hint for the flow table. Got it.
>
> In your response, you say that matching on ALL vports is not what
> the use case needs. OK, I understood. But please note that the
> item name does not say "ALL", it says "ANY".
>
> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
> and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
> and then provides item REPRESENTED_PORT, these two items do not
> contradict each other. Item VPORTS_ONLY defines the scope of some
> kind, then the following item, REPRESENTED_PORT, makes it narrower.
>
> And, in documentation, one can say clearly that the user *may*
> omit item VPORTS_ONLY in the exact rule pattern provided that
> they have already submitted this item as part of the template.
I think the problem that Rongwei & Ori are trying to solve
is to allocate resources for the templates table in the right place.
A table can have multiple templates.
If all rules/templates for this table are dedicated to virtual ports,
then the table will be allocated in a place managing only virtual ports.
This allocation decision must be taken at table creation,
whereas rules will be created later.
In order to do this specific table allocation for vports,
we need to restrict all templates of the table to be "vports only".
I hope it makes things clearer.
Now the question is how to achieve this? Solutions are:
1/ give a hint to the table allocation
2/ insert a pattern item in all templates of the table
I don't see any other solution. Please propose if there are more options.
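
The difference between the two solutions can be sketched in plain C. This
is not DPDK code: all sketch_* names are hypothetical, and a pattern
template is reduced to the single scope value it would carry. With
solution 1 the direction is simply handed over as a table hint, so nothing
has to be inspected. With solution 2 it has to be inferred at table
creation, and the table can be narrowed only if every template agrees.

```c
#include <stddef.h>

enum sketch_direction {
	SKETCH_DIR_BOTH = 0,   /* default: bi-directional table */
	SKETCH_DIR_WIRE_ORIG,  /* only traffic coming from the wire */
	SKETCH_DIR_VPORT_ORIG, /* only traffic coming from vports */
};

/*
 * A pattern template reduced to the one property relevant here:
 * the scope item (if any) that it carries. Assumed model.
 */
struct sketch_template {
	enum sketch_direction scope;
};

/*
 * Solution 2: infer the table direction from its templates.
 * The table may be narrowed only if all templates agree;
 * otherwise it has to stay bi-directional.
 */
enum sketch_direction
sketch_dir_from_templates(const struct sketch_template *tmpl, size_t n)
{
	size_t i;

	if (n == 0)
		return SKETCH_DIR_BOTH;
	for (i = 1; i < n; i++)
		if (tmpl[i].scope != tmpl[0].scope)
			return SKETCH_DIR_BOTH;
	return tmpl[0].scope;
}
```

A table with two "vports only" templates yields SKETCH_DIR_VPORT_ORIG,
while mixing a vport template with a wire template falls back to
SKETCH_DIR_BOTH.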
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 9:40 ` Thomas Monjalon
@ 2022-09-21 10:04 ` Andrew Rybchenko
2022-09-21 12:41 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-09-21 10:04 UTC (permalink / raw)
To: Thomas Monjalon, Ori Kam, Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
On 9/21/22 12:40, Thomas Monjalon wrote:
> 21/09/2022 11:04, Ivan Malov:
>> Now it's clear to me that your intention is to match on exact ports,
>> as usual, but this time with a hint for the flow table. Got it.
>>
>> In your response, you say that matching on ALL vports is not what
>> the use case needs. OK, I understood. But please note that the
>> item name does not say "ALL", it says "ANY".
>>
>> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
>> and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
>> and then provides item REPRESENTED_PORT, these two items do not
>> contradict each other. Item VPORTS_ONLY defines the scope of some
>> kind, then the following item, REPRESENTED_PORT, makes it narrower.
>>
>> And, in documentation, one can say clearly that the user *may*
>> omit item VPORTS_ONLY in the exact rule pattern provided that
>> they have already submitted this item as part of the template.
>
> I think the problem that Rongwei & Ori are trying to solve
> is to allocate resources for the templates table in the right place.
> A table can have multiple templates.
> If all rules/templates for this table are dedicated to virtual ports,
> then the table will be allocated in a place managing only virtual ports.
> This allocation decision must be taken at table creation,
> whereas rules will be created later.
> In order to do this specific table allocation for vports,
> we need to restrict all templates of the table to be "vports only".
>
> I hope it makes things clearer.
> Now the question is how to achieve this? Solutions are:
>
> 1/ give a hint to the table allocation
> 2/ insert a pattern item in all templates of the table
>
> I don't see any other solution. Please propose if there are more options.
>
>
See my mail
3/ use a jump rule which ensures that all traffic meets our
expectations
It means that the table creation could be postponed. Or the
table could be pre-configured at the point of creation and
finalized when we know that all traffic will be from wires
or from vports. Yes, it complicates internals to achieve
the optimization.
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 10:04 ` Andrew Rybchenko
@ 2022-09-21 12:41 ` Ori Kam
2022-09-21 12:51 ` Morten Brørup
0 siblings, 1 reply; 96+ messages in thread
From: Ori Kam @ 2022-09-21 12:41 UTC (permalink / raw)
To: Andrew Rybchenko, NBU-Contact-Thomas Monjalon (EXTERNAL), Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi All,
To avoid multiple threads, I will answer only in this thread, since I assume
everyone is clear about the issue.
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>
> On 9/21/22 12:40, Thomas Monjalon wrote:
> > 21/09/2022 11:04, Ivan Malov:
> >> Now it's clear to me that your intention is to match on exact ports,
> >> as usual, but this time with a hint for the flow table. Got it.
> >>
> >> In your response, you say that matching on ALL vports is not what
> >> the use case needs. OK, I understood. But please note that the
> >> item name does not say "ALL", it says "ANY".
> >>
> >> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
> >> and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
> >> and then provides item REPRESENTED_PORT, these two items do not
> >> contradict each other. Item VPORTS_ONLY defines the scope of some
> >> kind, then the following item, REPRESENTED_PORT, makes it narrower.
> >>
> >> And, in documentation, one can say clearly that the user *may*
> >> omit item VPORTS_ONLY in the exact rule pattern provided that
> >> they have already submitted this item as part of the template.
> >
> > I think the problem that Rongwei & Ori are trying to solve
> > is to allocate resources for the templates table in the right place.
> > A table can have multiple templates.
> > If all rules/templates for this table are dedicated to virtual ports,
> > then the table will be allocated in a place managing only virtual ports.
> > This allocation decision must be taken at table creation,
> > whereas rules will be created later.
> > In order to do this specific table allocation for vports,
> > we need to restrict all templates of the table to be "vports only".
> >
> > I hope it makes things clearer.
> > Now the question is how to achieve this? Solutions are:
> >
> > 1/ give a hint to the table allocation
> > 2/ insert a pattern item in all templates of the table
> >
> > I don't see any other solution. Please propose if there are more options.
> >
> >
>
> See my mail
>
> 3/ use jump rule which ensures that all traffic meets out
> expectations
>
> It means that the table creation could be postponed. Or the
> table could be per-configured at the point of creation and
> finalized when we know that all traffic will be from wires
> or from vports. Yes, it complicates internals to achieve
> the optimization.
Sorry Andrew, your suggestion is not a valid one, for the following reasons:
1. Table creation can't be postponed; this is a key idea of the rte_flow template API.
2. We can never know what rules will be inserted if the application doesn't tell us.
How can we know this is the last rule? What do we do with the first rule?
3. I don't see how jumping helps; it worsens the issue: when you jump to a table,
how does the PMD know if this table should have only wire or only vports?
I agree with Thomas that there are two valid options. I vote for the hint, since the
idea of the feature is to tell the PMD where this resource should be allocated.
Best,
Ori
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 12:41 ` Ori Kam
@ 2022-09-21 12:51 ` Morten Brørup
2022-09-22 7:39 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Morten Brørup @ 2022-09-21 12:51 UTC (permalink / raw)
To: Ori Kam, Andrew Rybchenko, NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
> From: Ori Kam [mailto:orika@nvidia.com]
> Sent: Wednesday, 21 September 2022 14.41
>
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >
> > On 9/21/22 12:40, Thomas Monjalon wrote:
> > > 21/09/2022 11:04, Ivan Malov:
> > >> Now it's clear to me that your intention is to match on exact
> ports,
> > >> as usual, but this time with a hint for the flow table. Got it.
> > >>
> > >> In your response, you say that matching on ALL vports is not what
> > >> the use case needs. OK, I understood. But please note that the
> > >> item name does not say "ALL", it says "ANY".
> > >>
> > >> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
> > >> and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
> > >> and then provides item REPRESENTED_PORT, these two items do not
> > >> contradict each other. Item VPORTS_ONLY defines the scope of some
> > >> kind, then the following item, REPRESENTED_PORT, makes it
> narrower.
> > >>
> > >> And, in documentation, one can say clearly that the user *may*
> > >> omit item VPORTS_ONLY in the exact rule pattern provided that
> > >> they have already submitted this item as part of the template.
> > >
> > > I think the problem that Rongwei & Ori are trying to solve
> > > is to allocate resources for the templates table in the right
> place.
> > > A table can have multiple templates.
> > > If all rules/templates for this table are dedicated to virtual
> ports,
> > > then the table will be allocated in a place managing only virtual
> ports.
> > > This allocation decision must be taken at table creation,
> > > whereas rules will be created later.
> > > In order to do this specific table allocation for vports,
> > > we need to restrict all templates of the table to be "vports only".
> > >
> > > I hope it makes things clearer.
> > > Now the question is how to achieve this? Solutions are:
> > >
> > > 1/ give a hint to the table allocation
> > > 2/ insert a pattern item in all templates of the table
> > >
> > > I don't see any other solution. Please propose if there are more
> options.
> > >
> > >
> >
> > See my mail
> >
> > 3/ use jump rule which ensures that all traffic meets out
> > expectations
> >
> > It means that the table creation could be postponed. Or the
> > table could be per-configured at the point of creation and
> > finalized when we know that all traffic will be from wires
> > or from vports. Yes, it complicates internals to achieve
> > the optimization.
>
> Sorry Andrew your suggestion is not a valid one for the following
> reasons:
> 1. table creation can't be postponed this is a key idea of the rte_flow
> template API.
> 2. we can never know what rules will be inserted if the application
> doesn't tell us.
> how can we know this is the last rule? What do we do with the
> first rule?
> 3. I don't see how jumping helps since it worsens the issue when you
> jump to a table,
> how does the PMD know if this table should have only wire or only
> vports?
>
> I agree with Thomas, there are two valid options, I vote for the hint
> since this is the
> feature idea to tell the PMD where this resource should be allocated.
This is an optimization; I agree with Ori that a hint is appropriate, like the MBUF_FAST_FREE hint on TX queues.
No need to add more complexity by requiring the driver to recognize that the pattern is present in all templates. (And perhaps also remove that pattern when applying the templates.)
>
> Best,
> Ori
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-21 12:51 ` Morten Brørup
@ 2022-09-22 7:39 ` Andrew Rybchenko
2022-09-22 10:06 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-09-22 7:39 UTC (permalink / raw)
To: Morten Brørup, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
On 9/21/22 15:51, Morten Brørup wrote:
>> From: Ori Kam [mailto:orika@nvidia.com]
>> Sent: Wednesday, 21 September 2022 14.41
>>
>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>
>>> On 9/21/22 12:40, Thomas Monjalon wrote:
>>>> 21/09/2022 11:04, Ivan Malov:
>>>>> Now it's clear to me that your intention is to match on exact
>> ports,
>>>>> as usual, but this time with a hint for the flow table. Got it.
>>>>>
>>>>> In your response, you say that matching on ALL vports is not what
>>>>> the use case needs. OK, I understood. But please note that the
>>>>> item name does not say "ALL", it says "ANY".
>>>>>
>>>>> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
>>>>> and "PHY_PORTS_ONLY". This way, if user provides item VPORTS_ONLY
>>>>> and then provides item REPRESENTED_PORT, these two items do not
>>>>> contradict each other. Item VPORTS_ONLY defines the scope of some
>>>>> kind, then the following item, REPRESENTED_PORT, makes it
>> narrower.
>>>>>
>>>>> And, in documentation, one can say clearly that the user *may*
>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
>>>>> they have already submitted this item as part of the template.
>>>>
>>>> I think the problem that Rongwei & Ori are trying to solve
>>>> is to allocate resources for the templates table in the right
>> place.
>>>> A table can have multiple templates.
>>>> If all rules/templates for this table are dedicated to virtual
>> ports,
>>>> then the table will be allocated in a place managing only virtual
>> ports.
>>>> This allocation decision must be taken at table creation,
>>>> whereas rules will be created later.
>>>> In order to do this specific table allocation for vports,
>>>> we need to restrict all templates of the table to be "vports only".
>>>>
>>>> I hope it makes things clearer.
>>>> Now the question is how to achieve this? Solutions are:
>>>>
>>>> 1/ give a hint to the table allocation
>>>> 2/ insert a pattern item in all templates of the table
>>>>
>>>> I don't see any other solution. Please propose if there are more
>> options.
>>>>
>>>>
>>>
>>> See my mail
>>>
>>> 3/ use jump rule which ensures that all traffic meets out
>>> expectations
>>>
>>> It means that the table creation could be postponed. Or the
>>> table could be per-configured at the point of creation and
>>> finalized when we know that all traffic will be from wires
>>> or from vports. Yes, it complicates internals to achieve
>>> the optimization.
>>
>> Sorry Andrew your suggestion is not a valid one for the following
>> reasons:
>> 1. table creation can't be postponed this is a key idea of the rte_flow
>> template API.
I guess nobody cares if it delays insertion on the first rule
only. Anyway, see below.
>> 2. we can never know what rules will be inserted if the application
>> doesn't tell us.
>> how can we know this is the last rule? What do we do with the
>> first rule?
>> 3. I don't see how jumping helps since it worsens the issue when you
>> jump to a table,
>> how does the PMD know if this table should have only wire or only
>> vports?
Jump rules say so. PMD can analyze these rules.
Maybe we just need an attribute saying that all jump rules
to the table are configured and further attempts to reconfigure
will be rejected?
>>
>> I agree with Thomas, there are two valid options, I vote for the hint
>> since this is the
>> feature idea to tell the PMD where this resource should be allocated.
>
> This is an optimization; I agree with Ori that a hint is appropriate, like the MBUF_FAST_FREE hint on TX queues.
>
> No need to add more complexity by requiring the driver to recognize that the pattern is present in all templates. (And perhaps also remove that pattern when applying the templates.)
What makes this part of the matching criteria so special
that it is allowed to have a dedicated hint attribute?
Maybe we can have a really generic solution where any
part of the matching criteria could provide such hints?
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 7:39 ` Andrew Rybchenko
@ 2022-09-22 10:06 ` Ori Kam
2022-09-22 10:31 ` Andrew Rybchenko
2022-09-22 12:43 ` Ivan Malov
0 siblings, 2 replies; 96+ messages in thread
From: Ori Kam @ 2022-09-22 10:06 UTC (permalink / raw)
To: Andrew Rybchenko, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Andrew,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Thursday, 22 September 2022 10:39
>
> On 9/21/22 15:51, Morten Brørup wrote:
> >> From: Ori Kam [mailto:orika@nvidia.com]
> >> Sent: Wednesday, 21 September 2022 14.41
> >>
> >>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>
> >>> On 9/21/22 12:40, Thomas Monjalon wrote:
> >>>> 21/09/2022 11:04, Ivan Malov:
> >>>>> Now it's clear to me that your intention is to match on exact
> >> ports,
> >>>>> as usual, but this time with a hint for the flow table. Got it.
> >>>>>
> >>>>> In your response, you say that matching on ALL vports is not what
> >>>>> the use case needs. OK, I understood. But please note that the
> >>>>> item name does not say "ALL", it says "ANY".
> >>>>>
> >>>>> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
> >>>>> and "PHY_PORTS_ONLY". This way, if user provides item
> VPORTS_ONLY
> >>>>> and then provides item REPRESENTED_PORT, these two items do not
> >>>>> contradict each other. Item VPORTS_ONLY defines the scope of some
> >>>>> kind, then the following item, REPRESENTED_PORT, makes it
> >> narrower.
> >>>>>
> >>>>> And, in documentation, one can say clearly that the user *may*
> >>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
> >>>>> they have already submitted this item as part of the template.
> >>>>
> >>>> I think the problem that Rongwei & Ori are trying to solve
> >>>> is to allocate resources for the templates table in the right
> >> place.
> >>>> A table can have multiple templates.
> >>>> If all rules/templates for this table are dedicated to virtual
> >> ports,
> >>>> then the table will be allocated in a place managing only virtual
> >> ports.
> >>>> This allocation decision must be taken at table creation,
> >>>> whereas rules will be created later.
> >>>> In order to do this specific table allocation for vports,
> >>>> we need to restrict all templates of the table to be "vports only".
> >>>>
> >>>> I hope it makes things clearer.
> >>>> Now the question is how to achieve this? Solutions are:
> >>>>
> >>>> 1/ give a hint to the table allocation
> >>>> 2/ insert a pattern item in all templates of the table
> >>>>
> >>>> I don't see any other solution. Please propose if there are more
> >> options.
> >>>>
> >>>>
> >>>
> >>> See my mail
> >>>
> >>> 3/ use jump rule which ensures that all traffic meets out
> >>> expectations
> >>>
> >>> It means that the table creation could be postponed. Or the
> >>> table could be per-configured at the point of creation and
> >>> finalized when we know that all traffic will be from wires
> >>> or from vports. Yes, it complicates internals to achieve
> >>> the optimization.
> >>
> >> Sorry Andrew your suggestion is not a valid one for the following
> >> reasons:
> >> 1. table creation can't be postponed this is a key idea of the rte_flow
> >> template API.
>
> I guess nobody cares if it delays insertion on the first rule
> only. Anyway, see below.
>
> >> 2. we can never know what rules will be inserted if the application
> >> doesn't tell us.
> >> how can we know this is the last rule? What do we do with the
> >> first rule?
> >> 3. I don't see how jumping helps since it worsens the issue when you
> >> jump to a table,
> >> how does the PMD know if this table should have only wire or only
> >> vports?
>
> Jump rules say so. PMD can analyze there rules.
> May be just need an attribute saying that all jump rules
> to the table are configured and further attempts to reconfigure
> will be rejected?
>
The idea is that the PMD will not analyze rules. That is why we have the table
and template.
Sorry, I don't understand what attribute can be in a jump. The jump is just
to a table. It can't say anything about the destination table.
All this patch does is add an attribute to a table to say where this
table should be located.
> >>
> >> I agree with Thomas, there are two valid options, I vote for the hint
> >> since this is the
> >> feature idea to tell the PMD where this resource should be allocated.
> >
> > This is an optimization; I agree with Ori that a hint is appropriate, like the
> MBUF_FAST_FREE hint on TX queues.
> >
> > No need to add more complexity by requiring the driver to recognize that
> the pattern is present in all templates. (And perhaps also remove that
> pattern when applying the templates.)
>
> What does the part of the matching criteria so special
> that it is allowed to have dedicated hint attribute?
>
> May be we can have really generic solution when any
> part of the matching criteria could provide such hints?
That is the point I keep returning to: it is not matching!
This is about which HW resource the table should be allocated on.
Think about ingress/egress/transfer: why are they not in the pattern?
They say where rules should be offloaded; they are different domains.
We have the same elsewhere: for example, at action creation we can state in which
domain the action should be created. If the application selects a number of domains,
it may mean that extra resources will be allocated.
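As a very simplified model of that resource argument (an assumption for illustration only; the enum values and the function are hypothetical and not part of the patch or of any PMD), a table that must serve both traffic origins needs one underlying resource per origin, while a table restricted to one origin needs only one:

```c
#include <assert.h>

/* Hypothetical direction values; the encoding is an assumption. */
enum dir_hint { DIR_BIDIR = 0, DIR_WIRE_ORIG, DIR_VF_ORIG };

/*
 * Simplified model: one underlying resource per traffic origin the
 * table has to serve, so only the default (bi-direction) needs two.
 */
unsigned int
table_resources_needed(enum dir_hint hint)
{
	assert(hint == DIR_BIDIR || hint == DIR_WIRE_ORIG ||
	       hint == DIR_VF_ORIG);
	return hint == DIR_BIDIR ? 2u : 1u;
}
```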
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 10:06 ` Ori Kam
@ 2022-09-22 10:31 ` Andrew Rybchenko
2022-09-22 13:00 ` Ori Kam
2022-09-22 12:43 ` Ivan Malov
1 sibling, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-09-22 10:31 UTC (permalink / raw)
To: Ori Kam, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
On 9/22/22 13:06, Ori Kam wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Thursday, 22 September 2022 10:39
>>
>> On 9/21/22 15:51, Morten Brørup wrote:
>>>> From: Ori Kam [mailto:orika@nvidia.com]
>>>> Sent: Wednesday, 21 September 2022 14.41
>>>>
>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>
>>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
>>>>>> 21/09/2022 11:04, Ivan Malov:
>>>>>>> Now it's clear to me that your intention is to match on exact
>>>> ports,
>>>>>>> as usual, but this time with a hint for the flow table. Got it.
>>>>>>>
>>>>>>> In your response, you say that matching on ALL vports is not what
>>>>>>> the use case needs. OK, I understood. But please note that the
>>>>>>> item name does not say "ALL", it says "ANY".
>>>>>>>
>>>>>>> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
>>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
>> VPORTS_ONLY
>>>>>>> and then provides item REPRESENTED_PORT, these two items do not
>>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of some
>>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
>>>> narrower.
>>>>>>>
>>>>>>> And, in documentation, one can say clearly that the user *may*
>>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
>>>>>>> they have already submitted this item as part of the template.
>>>>>>
>>>>>> I think the problem that Rongwei & Ori are trying to solve
>>>>>> is to allocate resources for the templates table in the right
>>>> place.
>>>>>> A table can have multiple templates.
>>>>>> If all rules/templates for this table are dedicated to virtual
>>>> ports,
>>>>>> then the table will be allocated in a place managing only virtual
>>>> ports.
>>>>>> This allocation decision must be taken at table creation,
>>>>>> whereas rules will be created later.
>>>>>> In order to do this specific table allocation for vports,
>>>>>> we need to restrict all templates of the table to be "vports only".
>>>>>>
>>>>>> I hope it makes things clearer.
>>>>>> Now the question is how to achieve this? Solutions are:
>>>>>>
>>>>>> 1/ give a hint to the table allocation
>>>>>> 2/ insert a pattern item in all templates of the table
>>>>>>
>>>>>> I don't see any other solution. Please propose if there are more
>>>> options.
>>>>>>
>>>>>>
>>>>>
>>>>> See my mail
>>>>>
>>>>> 3/ use jump rule which ensures that all traffic meets out
>>>>> expectations
>>>>>
>>>>> It means that the table creation could be postponed. Or the
>>>>> table could be per-configured at the point of creation and
>>>>> finalized when we know that all traffic will be from wires
>>>>> or from vports. Yes, it complicates internals to achieve
>>>>> the optimization.
>>>>
>>>> Sorry Andrew your suggestion is not a valid one for the following
>>>> reasons:
>>>> 1. table creation can't be postponed this is a key idea of the rte_flow
>>>> template API.
>>
>> I guess nobody cares if it delays insertion on the first rule
>> only. Anyway, see below.
>>
>>>> 2. we can never know what rules will be inserted if the application
>>>> doesn't tell us.
>>>> how can we know this is the last rule? What do we do with the
>>>> first rule?
>>>> 3. I don't see how jumping helps since it worsens the issue when you
>>>> jump to a table,
>>>> how does the PMD know if this table should have only wire or only
>>>> vports?
>>
>> Jump rules say so. PMD can analyze there rules.
>> May be just need an attribute saying that all jump rules
>> to the table are configured and further attempts to reconfigure
>> will be rejected?
>>
>
> The idea is the PMD will not analyze rules. That is why we have the table
> and template.
> Sorry, I don't understand what attribute can be in jump? The jump is just
> to table. It can't say anything about the table destination table.
> This is all this patch adds the attribute to a table to say where this
> table should be located.
>
>>>>
>>>> I agree with Thomas, there are two valid options, I vote for the hint
>>>> since this is the
>>>> feature idea to tell the PMD where this resource should be allocated.
>>>
>>> This is an optimization; I agree with Ori that a hint is appropriate, like the
>> MBUF_FAST_FREE hint on TX queues.
>>>
>>> No need to add more complexity by requiring the driver to recognize that
>> the pattern is present in all templates. (And perhaps also remove that
>> pattern when applying the templates.)
>>
>> What does the part of the matching criteria so special
>> that it is allowed to have dedicated hint attribute?
>>
>> May be we can have really generic solution when any
>> part of the matching criteria could provide such hints?
>
> That is the point I keep returning to, it is not matching!
> This is on which HW resource the table should be allocated.
Sorry, but it is just your HW details that you have different
location/resources for rules which apply on packets coming
from wire and coming from host (vports).
> Think about ingress/egress/transfer why are they not in the pattern?
We have no ingress/egress in transfer domain any more because
it is ambiguous.
Transfer itself is really a different domain, logically and
from a privileges point of view. That's why it is important to
distinguish it.
Ingress and egress in the non-transfer case are natively bound
to the two main functions of the driver: transmit (egress rules)
and receive (ingress rules). In general, it is a matching
criterion as well, but because of its nature (explained
above) it is simply handy to distinguish it from the very
beginning.
> They are where rules should be offloaded, they are different domain.
We have just two domains: transfer and non-transfer.
> Like we have elsewhere for example in action create we can state on which
> domain the action should be created. If the application selects a number of domains
> it may mean that extra resources will be allocated.
Three more points:
1/ If it is just a hint, it is optional for PMD to
support/handle it. It means that it MUST NOT impose any
limitations on matching. If so, if you want a rule to
be applied on packets coming from wire, you still MUST
specify it in the pattern.
So, it does not sound like a hint in your case.
2/ struct rte_flow_attr is used for really all rules.
How should a new attribute be interpreted in non-transfer
rules? Similar to ingress/egress? Duplication?
Or even harder (if it is NOT a hint): should it really
enforce matching of packets coming from wire (i.e. not
a different vport)? Not sure that it is doable or even
makes sense.
We can say that the attribute may be used for the transfer
rules only. If so, it MUST be checked on ethdev level
since it is a generic rule.
3/ struct rte_flow_attr is used for sync and async rules.
As I understand you're using it for async rules only.
Does it make sense for sync rules?
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 10:06 ` Ori Kam
2022-09-22 10:31 ` Andrew Rybchenko
@ 2022-09-22 12:43 ` Ivan Malov
2022-09-22 14:46 ` Ori Kam
1 sibling, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2022-09-22 12:43 UTC (permalink / raw)
To: Ori Kam
Cc: Andrew Rybchenko, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Ori,
On Thu, 22 Sep 2022, Ori Kam wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Thursday, 22 September 2022 10:39
>>
>> On 9/21/22 15:51, Morten Brørup wrote:
>>>> From: Ori Kam [mailto:orika@nvidia.com]
>>>> Sent: Wednesday, 21 September 2022 14.41
>>>>
>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>
>>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
>>>>>> 21/09/2022 11:04, Ivan Malov:
>>>>>>> Now it's clear to me that your intention is to match on exact
>>>> ports,
>>>>>>> as usual, but this time with a hint for the flow table. Got it.
>>>>>>>
>>>>>>> In your response, you say that matching on ALL vports is not what
>>>>>>> the use case needs. OK, I understood. But please note that the
>>>>>>> item name does not say "ALL", it says "ANY".
>>>>>>>
>>>>>>> OK. Say, "ANY" is also confusing. Let's then name it "VPORTS_ONLY"
>>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
>> VPORTS_ONLY
>>>>>>> and then provides item REPRESENTED_PORT, these two items do not
>>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of some
>>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
>>>> narrower.
>>>>>>>
>>>>>>> And, in documentation, one can say clearly that the user *may*
>>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
>>>>>>> they have already submitted this item as part of the template.
>>>>>>
>>>>>> I think the problem that Rongwei & Ori are trying to solve
>>>>>> is to allocate resources for the templates table in the right
>>>> place.
>>>>>> A table can have multiple templates.
>>>>>> If all rules/templates for this table are dedicated to virtual
>>>> ports,
>>>>>> then the table will be allocated in a place managing only virtual
>>>> ports.
>>>>>> This allocation decision must be taken at table creation,
>>>>>> whereas rules will be created later.
>>>>>> In order to do this specific table allocation for vports,
>>>>>> we need to restrict all templates of the table to be "vports only".
>>>>>>
>>>>>> I hope it makes things clearer.
>>>>>> Now the question is how to achieve this? Solutions are:
>>>>>>
>>>>>> 1/ give a hint to the table allocation
>>>>>> 2/ insert a pattern item in all templates of the table
>>>>>>
>>>>>> I don't see any other solution. Please propose if there are more
>>>> options.
>>>>>>
>>>>>>
>>>>>
>>>>> See my mail
>>>>>
>>>>> 3/ use jump rule which ensures that all traffic meets out
>>>>> expectations
>>>>>
>>>>> It means that the table creation could be postponed. Or the
>>>>> table could be per-configured at the point of creation and
>>>>> finalized when we know that all traffic will be from wires
>>>>> or from vports. Yes, it complicates internals to achieve
>>>>> the optimization.
>>>>
>>>> Sorry Andrew your suggestion is not a valid one for the following
>>>> reasons:
>>>> 1. table creation can't be postponed this is a key idea of the rte_flow
>>>> template API.
>>
>> I guess nobody cares if it delays insertion on the first rule
>> only. Anyway, see below.
>>
>>>> 2. we can never know what rules will be inserted if the application
>>>> doesn't tell us.
>>>> how can we know this is the last rule? What do we do with the
>>>> first rule?
>>>> 3. I don't see how jumping helps since it worsens the issue when you
>>>> jump to a table,
>>>> how does the PMD know if this table should have only wire or only
>>>> vports?
>>
>> Jump rules say so. PMD can analyze there rules.
>> May be just need an attribute saying that all jump rules
>> to the table are configured and further attempts to reconfigure
>> will be rejected?
>>
>
> The idea is the PMD will not analyze rules. That is why we have the table
> and template.
PMDs will not analyze **rules**, yes. But that does not dismiss the
need to analyze **tables** and **templates** when they are created.
I.e. table/template creation is some sort of "cold"/"slow" path.
The PMD sees the item in the pattern and translates it to the
internal representation of the table. Just like it **would**
do in case of the attribute approach. But when the rules
are inserted (**hot** async path), the PMD should just
collect exact "spec" values from the pattern without
analyzing it, as per the previously learned template.
From the HW resource usage perspective (in your case),
why isn't such design good enough?
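A rough sketch of that split, with hypothetical types only (this is not a real PMD or rte_flow API): the template is inspected once on the cold path, and the hot path merely copies "spec" values in template order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not a real PMD or rte_flow API. */
enum item_kind {
	KIND_END = 0,
	KIND_VPORTS_ONLY,
	KIND_REPRESENTED_PORT,
	KIND_ETH_TYPE,
};

struct item {
	enum item_kind kind;
	uint32_t spec; /* exact value; ignored in templates */
};

#define MAX_FIELDS 8

struct compiled_template {
	bool vports_only; /* learned once, at template creation */
	size_t nb_fields; /* number of value-carrying items */
};

/* Cold path: inspect the template pattern and record what was learned. */
void
template_compile(const struct item *pattern, struct compiled_template *out)
{
	assert(pattern != NULL && out != NULL);
	out->vports_only = false;
	out->nb_fields = 0;
	for (; pattern->kind != KIND_END; pattern++) {
		if (pattern->kind == KIND_VPORTS_ONLY)
			out->vports_only = true;
		else if (out->nb_fields < MAX_FIELDS)
			out->nb_fields++;
	}
}

/* Hot path: no analysis, just copy spec values in template order. */
size_t
rule_collect_specs(const struct compiled_template *t,
		   const struct item *pattern, uint32_t *values)
{
	size_t n = 0;

	assert(t != NULL && pattern != NULL && values != NULL);
	for (; pattern->kind != KIND_END && n < t->nb_fields; pattern++) {
		if (pattern->kind == KIND_VPORTS_ONLY)
			continue; /* scope already fixed by the template */
		values[n++] = pattern->spec;
	}
	return n;
}
```

Here the rule pattern may repeat or omit the scope item; either way the table scope was already decided when the template was compiled.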
> Sorry, I don't understand what attribute can be in jump? The jump is just
> to table. It can't say anything about the table destination table.
> This is all this patch adds the attribute to a table to say where this
> table should be located.
>
>>>>
>>>> I agree with Thomas, there are two valid options, I vote for the hint
>>>> since this is the
>>>> feature idea to tell the PMD where this resource should be allocated.
>>>
>>> This is an optimization; I agree with Ori that a hint is appropriate, like the
>> MBUF_FAST_FREE hint on TX queues.
>>>
>>> No need to add more complexity by requiring the driver to recognize that
>> the pattern is present in all templates. (And perhaps also remove that
>> pattern when applying the templates.)
>>
>> What does the part of the matching criteria so special
>> that it is allowed to have dedicated hint attribute?
>>
>> May be we can have really generic solution when any
>> part of the matching criteria could provide such hints?
>
> That is the point I keep returning to, it is not matching!
Let's face it: these attributes are in fact matching, which,
in the case of MLX5, is translated into resource properties.
I.e., to MLX5 (internally!), these attributes are indeed
not matching but separate resource allocation. Got it.
But what about other vendors? I doubt anyone can
say for sure that others' internals work the same way...
> This is on which HW resource the table should be allocated.
> Think about ingress/egress/transfer why are they not in the pattern?
- ingress/egress only apply to non-transfer rules
and serve to catch either incoming or outgoing
traffic of the single "door" (ethdev)
(furthermore, these attributes had been defined
long before the transfer concept was added, so
even if we NOW realise these attributes **could**
have been expressed in the form of items, I'm
afraid it's no use crying over spilt milk)
- transfer is not in pattern because it is not
a match criterion; it is in fact the indication
of which **match engine** to use: either the
one of the embedded switch or the one of
the vNIC / ethdev
> They are where rules should be offloaded, they are different domain.
It's OK to say that the generic concept of "embedded switch level",
or "transfer domain", is, in the case of MLX5, in turn split
into two different HW domains (these are vendor-specific
internals), but it's not OK to assume that the same
separation is also valid for other vendors.
> Like we have elsewhere for example in action create we can state on which
> domain the action should be created. If the application selects a number of domains
> it may mean that extra resources will be allocated.
Could you please expand on this / give an example?
Just for me to check whether my point of view
could be wrong based on the example or not.
>
>
>
>
>
Ivan
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 10:31 ` Andrew Rybchenko
@ 2022-09-22 13:00 ` Ori Kam
2022-09-23 7:25 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Ori Kam @ 2022-09-22 13:00 UTC (permalink / raw)
To: Andrew Rybchenko, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Andrew,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Thursday, 22 September 2022 13:31
>
> On 9/22/22 13:06, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Thursday, 22 September 2022 10:39
> >>
> >> On 9/21/22 15:51, Morten Brørup wrote:
> >>>> From: Ori Kam [mailto:orika@nvidia.com]
> >>>> Sent: Wednesday, 21 September 2022 14.41
> >>>>
> >>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>>>
> >>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
> >>>>>> 21/09/2022 11:04, Ivan Malov:
> >>>>>>> Now it's clear to me that your intention is to match on exact
> >>>> ports,
> >>>>>>> as usual, but this time with a hint for the flow table. Got it.
> >>>>>>>
> >>>>>>> In your response, you say that matching on ALL vports is not what
> >>>>>>> the use case needs. OK, I understood. But please note that the
> >>>>>>> item name does not say "ALL", it says "ANY".
> >>>>>>>
> >>>>>>> OK. Say, "ANY" is also confusing. Let's then name it
> "VPORTS_ONLY"
> >>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
> >> VPORTS_ONLY
> >>>>>>> and then provides item REPRESENTED_PORT, these two items do
> not
> >>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of
> some
> >>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
> >>>> narrower.
> >>>>>>>
> >>>>>>> And, in documentation, one can say clearly that the user *may*
> >>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
> >>>>>>> they have already submitted this item as part of the template.
> >>>>>>
> >>>>>> I think the problem that Rongwei & Ori are trying to solve
> >>>>>> is to allocate resources for the templates table in the right
> >>>> place.
> >>>>>> A table can have multiple templates.
> >>>>>> If all rules/templates for this table are dedicated to virtual
> >>>> ports,
> >>>>>> then the table will be allocated in a place managing only virtual
> >>>> ports.
> >>>>>> This allocation decision must be taken at table creation,
> >>>>>> whereas rules will be created later.
> >>>>>> In order to do this specific table allocation for vports,
> >>>>>> we need to restrict all templates of the table to be "vports only".
> >>>>>>
> >>>>>> I hope it makes things clearer.
> >>>>>> Now the question is how to achieve this? Solutions are:
> >>>>>>
> >>>>>> 1/ give a hint to the table allocation
> >>>>>> 2/ insert a pattern item in all templates of the table
> >>>>>>
> >>>>>> I don't see any other solution. Please propose if there are more
> >>>> options.
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> See my mail
> >>>>>
> >>>>> 3/ use jump rule which ensures that all traffic meets out
> >>>>> expectations
> >>>>>
> >>>>> It means that the table creation could be postponed. Or the
> >>>>> table could be per-configured at the point of creation and
> >>>>> finalized when we know that all traffic will be from wires
> >>>>> or from vports. Yes, it complicates internals to achieve
> >>>>> the optimization.
> >>>>
> >>>> Sorry Andrew your suggestion is not a valid one for the following
> >>>> reasons:
> >>>> 1. table creation can't be postponed this is a key idea of the rte_flow
> >>>> template API.
> >>
> >> I guess nobody cares if it delays insertion on the first rule
> >> only. Anyway, see below.
> >>
> >>>> 2. we can never know what rules will be inserted if the application
> >>>> doesn't tell us.
> >>>> how can we know this is the last rule? What do we do with the
> >>>> first rule?
> >>>> 3. I don't see how jumping helps since it worsens the issue when you
> >>>> jump to a table,
> >>>> how does the PMD know if this table should have only wire or only
> >>>> vports?
> >>
> >> Jump rules say so. PMD can analyze there rules.
> >> May be just need an attribute saying that all jump rules
> >> to the table are configured and further attempts to reconfigure
> >> will be rejected?
> >>
> >
> > The idea is the PMD will not analyze rules. That is why we have the table
> > and template.
> > Sorry, I don't understand what attribute can be in jump? The jump is just
> > to table. It can't say anything about the table destination table.
> > This is all this patch adds the attribute to a table to say where this
> > table should be located.
> >
> >>>>
> >>>> I agree with Thomas, there are two valid options, I vote for the hint
> >>>> since this is the
> >>>> feature idea to tell the PMD where this resource should be allocated.
> >>>
> >>> This is an optimization; I agree with Ori that a hint is appropriate, like the
> >> MBUF_FAST_FREE hint on TX queues.
> >>>
> >>> No need to add more complexity by requiring the driver to recognize
> that
> >> the pattern is present in all templates. (And perhaps also remove that
> >> pattern when applying the templates.)
> >>
> >> What does the part of the matching criteria so special
> >> that it is allowed to have dedicated hint attribute?
> >>
> >> May be we can have really generic solution when any
> >> part of the matching criteria could provide such hints?
> >
> > That is the point I keep returning to, it is not matching!
> > This is on which HW resource the table should be allocated.
>
> Sorry, but it is just your HW details that you have different
> location/resources for rules which apply on packets coming
> from wire and coming from host (vports).
>
Right, other HW may have this issue too, and this patch
can help there, but currently it solves something in Nvidia HW.
The template API is all about giving hints, and some of the hints can be used
only by some PMDs.
I promise to support any vendor that has some way to optimize its PMD;
the hint may be named differently or live in a different place, but not all PMDs
are equal, and each one needs its own hints.
> > Think about ingress/egress/transfer why are they not in the pattern?
>
> We have no ingress/egress in transfer domain any more because
> it is ambiguous.
>
Yes, that is why the names are wire and non-wire.
> Transfer itself is really a different domain. Logically and
> from privileges point of view. That's why it is important to
> distinguish it.
>
> Ingress and egress in non-transfer case are natively bound
> to two main functions of the driver: transmit (egress rules)
> and receive (ingress rules). In general, it is a matching
> criteria as well, but because of its nature (explained
> above) it is simply handy to distinguish it from the very
> beginning.
>
I agree with you, but this is just to show that even if something can be treated
as matching, that is not necessarily the best way to look at it.
> > They are where rules should be offloaded, they are different domain.
>
> We have just two domains: transfer and non-transfer.
>
> > Like we have elsewhere for example in action create we can state on which
> > domain the action should be created. If the application selects a number of
> domains
> > it may mean that extra resources will be allocated.>
>
> Two more points:
>
> 1/ If it is just a hint, it is optional for PMD to
> support/handle it. It means that it MUST NOT impose any
> limitations on matching. If so, if you want a rule to
> be applied on packets coming from wire, you still MUST
> specify it in the pattern.
> So, it does not sound like a hint in your case.
Right, it is optional. If the application doesn't give this hint,
the PMD will create tables for both wire and non-wire,
just like it does now.
>
> 2/ struct rte_flow_attr is used for really all rules.
> How a new attribute should be interpreted in non-transfer
> rules? Similar to ingress/egress? Duplication?
> Or even harder (if it is NOT a hint): should it really
> enforce matching of packets coming from wire (i.e. not
> a different vport)? Not sure that it is doable or even
> make sense.
> We can say that the attribute may be used for the transfer
> rules only. If so, it MUST be checked on ethdev level
> since it is a generic rule.
>
From my point of view, it should be handled only in the transfer case;
I think the original commit message also states this.
Why should we validate it? We don't validate whether the application
sets transfer + ingress/egress, or just ingress + egress, or none.
> 3/ struct rte_flow_attr is used for sync and async rules.
> As I understand you're using it for async rules only.
> Does it make sense for sync rules?
Yes, it can save on insertion, since even the sync API, which doesn't have this bit,
duplicates the rule.
But if pressed, we can move it to the table attributes. Do you think that would be better?
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 12:43 ` Ivan Malov
@ 2022-09-22 14:46 ` Ori Kam
0 siblings, 0 replies; 96+ messages in thread
From: Ori Kam @ 2022-09-22 14:46 UTC (permalink / raw)
To: Ivan Malov
Cc: Andrew Rybchenko, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Ivan,
> -----Original Message-----
> From: Ivan Malov <ivan.malov@oktetlabs.ru>
> Sent: Thursday, 22 September 2022 15:43
>
> Hi Ori,
>
> On Thu, 22 Sep 2022, Ori Kam wrote:
>
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Thursday, 22 September 2022 10:39
> >>
> >> On 9/21/22 15:51, Morten Brørup wrote:
> >>>> From: Ori Kam [mailto:orika@nvidia.com]
> >>>> Sent: Wednesday, 21 September 2022 14.41
> >>>>
> >>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>>>
> >>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
> >>>>>> 21/09/2022 11:04, Ivan Malov:
> >>>>>>> Now it's clear to me that your intention is to match on exact
> >>>> ports,
> >>>>>>> as usual, but this time with a hint for the flow table. Got it.
> >>>>>>>
> >>>>>>> In your response, you say that matching on ALL vports is not what
> >>>>>>> the use case needs. OK, I understood. But please note that the
> >>>>>>> item name does not say "ALL", it says "ANY".
> >>>>>>>
> >>>>>>> OK. Say, "ANY" is also confusing. Let's then name it
> "VPORTS_ONLY"
> >>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
> >> VPORTS_ONLY
> >>>>>>> and then provides item REPRESENTED_PORT, these two items do
> not
> >>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of
> some
> >>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
> >>>> narrower.
> >>>>>>>
> >>>>>>> And, in documentation, one can say clearly that the user *may*
> >>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
> >>>>>>> they have already submitted this item as part of the template.
> >>>>>>
> >>>>>> I think the problem that Rongwei & Ori are trying to solve
> >>>>>> is to allocate resources for the templates table in the right
> >>>> place.
> >>>>>> A table can have multiple templates.
> >>>>>> If all rules/templates for this table are dedicated to virtual
> >>>> ports,
> >>>>>> then the table will be allocated in a place managing only virtual
> >>>> ports.
> >>>>>> This allocation decision must be taken at table creation,
> >>>>>> whereas rules will be created later.
> >>>>>> In order to do this specific table allocation for vports,
> >>>>>> we need to restrict all templates of the table to be "vports only".
> >>>>>>
> >>>>>> I hope it makes things clearer.
> >>>>>> Now the question is how to achieve this? Solutions are:
> >>>>>>
> >>>>>> 1/ give a hint to the table allocation
> >>>>>> 2/ insert a pattern item in all templates of the table
> >>>>>>
> >>>>>> I don't see any other solution. Please propose if there are more
> >>>> options.
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> See my mail
> >>>>>
> >>>>> 3/ use jump rule which ensures that all traffic meets out
> >>>>> expectations
> >>>>>
> >>>>> It means that the table creation could be postponed. Or the
> >>>>> table could be per-configured at the point of creation and
> >>>>> finalized when we know that all traffic will be from wires
> >>>>> or from vports. Yes, it complicates internals to achieve
> >>>>> the optimization.
> >>>>
> >>>> Sorry Andrew your suggestion is not a valid one for the following
> >>>> reasons:
> >>>> 1. table creation can't be postponed this is a key idea of the rte_flow
> >>>> template API.
> >>
> >> I guess nobody cares if it delays insertion on the first rule
> >> only. Anyway, see below.
> >>
> >>>> 2. we can never know what rules will be inserted if the application
> >>>> doesn't tell us.
> >>>> how can we know this is the last rule? What do we do with the
> >>>> first rule?
> >>>> 3. I don't see how jumping helps since it worsens the issue when you
> >>>> jump to a table,
> >>>> how does the PMD know if this table should have only wire or only
> >>>> vports?
> >>
> >> Jump rules say so. PMD can analyze there rules.
> >> May be just need an attribute saying that all jump rules
> >> to the table are configured and further attempts to reconfigure
> >> will be rejected?
> >>
> >
> > The idea is the PMD will not analyze rules. That is why we have the table
> > and template.
>
> PMDs will not analyze **rules**, yes. But that does not dismiss the
> need to analyze **tables** and **templates** when they are created.
> I.e. table/template creation is some sort of "cold"/"slow" path.
> The PMD sees the item in the pattern and translates it to the
> internal representation of the table. Just like it **would**
> do in case of the attribute approach. But when the rules
> are inserted (**hot** async path), the PMD should just
> collect exact "spec" values from the pattern without
> analyzing it, as per the previously learned template.
>
Right, so why should we force the application to give us, for each rule, what
we both agree is just a hint? The PMD will not use it,
so why give it?
> From the HW resource usage perspective (in your case),
> why isn't such design good enough?
>
In my first reply I told you that both ways can work.
Speaking as a SW developer (not as an Nvidia guy), I think that since this is an attribute
of the table and, just like you said, the only place we use it is during table creation,
the correct place for it is in the table and not in the pattern.
Pure SW design.
> > Sorry, I don't understand what attribute can be in jump? The jump is just
> > to table. It can't say anything about the table destination table.
> > This is all this patch adds the attribute to a table to say where this
> > table should be located.
> >
> >>>>
> >>>> I agree with Thomas, there are two valid options, I vote for the hint
> >>>> since this is the
> >>>> feature idea to tell the PMD where this resource should be allocated.
> >>>
> >>> This is an optimization; I agree with Ori that a hint is appropriate, like the
> >> MBUF_FAST_FREE hint on TX queues.
> >>>
> >>> No need to add more complexity by requiring the driver to recognize
> that
> >> the pattern is present in all templates. (And perhaps also remove that
> >> pattern when applying the templates.)
> >>
> >> What does the part of the matching criteria so special
> >> that it is allowed to have dedicated hint attribute?
> >>
> >> May be we can have really generic solution when any
> >> part of the matching criteria could provide such hints?
> >
> > That is the point I keep returning to, it is not matching!
>
> Let's face it: these attributes are in fact matching, which,
> in the case of MLX5, is translated into resource properties.
> I.e., to MLX5 (internally!), these attributes are indeed
> not matching but separate resource allocation. Got it.
>
Happy to hear.
> But what about other vendors? I guess, hardly can someone
> say for sure that others' internals work the same way...
>
Maybe some do; please see my answer to Andrew.
The idea is that we in DPDK want the best insertion performance for all vendors:
any vendor that thinks it can get a perf boost from a hint given by
the application will get my support. This is the idea of the template API and
fast insertion.
> > This is on which HW resource the table should be allocated.
> > Think about ingress/egress/transfer why are they not in the pattern?
>
> - ingres/egress only applies to non-transfer rules
> and serves to catch either incoming or outcoming
> traffic of the single "door" (ethdev)
>
> (furthermore, these attributes had been defined
> long before the transfer concept was added, so
> even if we NOW realise these attributes **could**
> have been expressed in the form of items, I'm
> afraid it's no use crying over spilt milk)
>
> - transfer is not in pattern because it is not
> a match criterion; it is in fact the indication
> of which **match engine** to use: either the
> one of the embedded switch or the one of
> the vNIC / ethdev
>
This was just to make the point that there are cases where,
even if something could be expressed as an item, that may not be the best
way.
In any case, as in my reply above: from a pure SW point of view,
I think it is more correct to have it as an attribute.
> > They are where rules should be offloaded, they are different domain.
>
> It's OK to say that generic concept of "embedded switch level",
> or "transfer domain", in the case of MLX5, is in turn split
> into two different HW domains, - it's vendor-specific
> internals, - but it's not OK to assume that the same
> separation is also valid for other vendors.
>
I never said it is true for all vendors; I guess it is for some, but for sure not for all of them.
Just like I'm sure not all vendors will use the other hints.
> > Like we have elsewhere for example in action create we can state on which
> > domain the action should be created. If the application selects a number of
> domains
> > it may mean that extra resources will be allocated.
>
> Could you please expand on this / give an example?
> Just for me to check whether my point of view
> could be wrong based on the example or not.
>
Let's look at rte_flow_action_handle_create: one of the conf parameters is ingress/egress/transfer.
The application may mark an action to be used only in ingress, or in ingress + egress.
If the application selects ingress + egress, it is possible that the insertion rate and PPS
will be lower.
I hope this makes it clearer.
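
A rough way to picture the cost argument above. This is a sketch only: the
struct is a local stand-in modeled on the three direction bits of the
action conf, not the DPDK definition, and the "one copy of the action per
selected domain" cost model is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the direction bits of an action conf. */
struct action_conf_sketch {
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
};

/* Assumed cost model: the driver keeps one copy of the action per
 * selected domain, so every extra bit set costs extra resources. */
unsigned int action_copies(const struct action_conf_sketch *conf)
{
	assert(conf != NULL);
	return (unsigned int)(conf->ingress + conf->egress + conf->transfer);
}
```

Under that assumption, an action created with only ingress set needs one
copy, while ingress + egress needs two. The more precisely the application
states the domain, the less the driver has to allocate.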
Best,
Ori
> >
> >
> >
> >
> >
>
> Ivan
* Re: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-22 13:00 ` Ori Kam
@ 2022-09-23 7:25 ` Andrew Rybchenko
2022-09-23 16:11 ` Ori Kam
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-09-23 7:25 UTC (permalink / raw)
To: Ori Kam, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Ori,
On 9/22/22 16:00, Ori Kam wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Thursday, 22 September 2022 13:31
>>
>> On 9/22/22 13:06, Ori Kam wrote:
>>> Hi Andrew,
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> Sent: Thursday, 22 September 2022 10:39
>>>>
>>>> On 9/21/22 15:51, Morten Brørup wrote:
>>>>>> From: Ori Kam [mailto:orika@nvidia.com]
>>>>>> Sent: Wednesday, 21 September 2022 14.41
>>>>>>
>>>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>>>>>
>>>>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
>>>>>>>> 21/09/2022 11:04, Ivan Malov:
>>>>>>>>> Now it's clear to me that your intention is to match on exact
>>>>>> ports,
>>>>>>>>> as usual, but this time with a hint for the flow table. Got it.
>>>>>>>>>
>>>>>>>>> In your response, you say that matching on ALL vports is not what
>>>>>>>>> the use case needs. OK, I understood. But please note that the
>>>>>>>>> item name does not say "ALL", it says "ANY".
>>>>>>>>>
>>>>>>>>> OK. Say, "ANY" is also confusing. Let's then name it
>> "VPORTS_ONLY"
>>>>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
>>>> VPORTS_ONLY
>>>>>>>>> and then provides item REPRESENTED_PORT, these two items do
>> not
>>>>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of
>> some
>>>>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
>>>>>> narrower.
>>>>>>>>>
>>>>>>>>> And, in documentation, one can say clearly that the user *may*
>>>>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
>>>>>>>>> they have already submitted this item as part of the template.
>>>>>>>>
>>>>>>>> I think the problem that Rongwei & Ori are trying to solve
>>>>>>>> is to allocate resources for the templates table in the right
>>>>>> place.
>>>>>>>> A table can have multiple templates.
>>>>>>>> If all rules/templates for this table are dedicated to virtual
>>>>>> ports,
>>>>>>>> then the table will be allocated in a place managing only virtual
>>>>>> ports.
>>>>>>>> This allocation decision must be taken at table creation,
>>>>>>>> whereas rules will be created later.
>>>>>>>> In order to do this specific table allocation for vports,
>>>>>>>> we need to restrict all templates of the table to be "vports only".
>>>>>>>>
>>>>>>>> I hope it makes things clearer.
>>>>>>>> Now the question is how to achieve this? Solutions are:
>>>>>>>>
>>>>>>>> 1/ give a hint to the table allocation
>>>>>>>> 2/ insert a pattern item in all templates of the table
>>>>>>>>
>>>>>>>> I don't see any other solution. Please propose if there are more
>>>>>> options.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> See my mail
>>>>>>>
>>>>>>> 3/ use jump rule which ensures that all traffic meets out
>>>>>>> expectations
>>>>>>>
>>>>>>> It means that the table creation could be postponed. Or the
>>>>>>> table could be per-configured at the point of creation and
>>>>>>> finalized when we know that all traffic will be from wires
>>>>>>> or from vports. Yes, it complicates internals to achieve
>>>>>>> the optimization.
>>>>>>
>>>>>> Sorry Andrew your suggestion is not a valid one for the following
>>>>>> reasons:
>>>>>> 1. table creation can't be postponed this is a key idea of the rte_flow
>>>>>> template API.
>>>>
>>>> I guess nobody cares if it delays insertion on the first rule
>>>> only. Anyway, see below.
>>>>
>>>>>> 2. we can never know what rules will be inserted if the application
>>>>>> doesn't tell us.
>>>>>> how can we know this is the last rule? What do we do with the
>>>>>> first rule?
>>>>>> 3. I don't see how jumping helps since it worsens the issue when you
>>>>>> jump to a table,
>>>>>> how does the PMD know if this table should have only wire or only
>>>>>> vports?
>>>>
>>>> Jump rules say so. PMD can analyze there rules.
>>>> May be just need an attribute saying that all jump rules
>>>> to the table are configured and further attempts to reconfigure
>>>> will be rejected?
>>>>
>>>
>>> The idea is the PMD will not analyze rules. That is why we have the table
>>> and template.
>>> Sorry, I don't understand what attribute can be in jump? The jump is just
>>> to table. It can't say anything about the table destination table.
>>> This is all this patch adds the attribute to a table to say where this
>>> table should be located.
>>>
>>>>>>
>>>>>> I agree with Thomas, there are two valid options, I vote for the hint
>>>>>> since this is the
>>>>>> feature idea to tell the PMD where this resource should be allocated.
>>>>>
>>>>> This is an optimization; I agree with Ori that a hint is appropriate, like the
>>>> MBUF_FAST_FREE hint on TX queues.
>>>>>
>>>>> No need to add more complexity by requiring the driver to recognize
>> that
>>>> the pattern is present in all templates. (And perhaps also remove that
>>>> pattern when applying the templates.)
>>>>
>>>> What does the part of the matching criteria so special
>>>> that it is allowed to have dedicated hint attribute?
>>>>
>>>> May be we can have really generic solution when any
>>>> part of the matching criteria could provide such hints?
>>>
>>> That is the point I keep returning to, it is not matching!
>>> This is on which HW resource the table should be allocated.
>>
>> Sorry, but it is just your HW details that you have different
>> location/resources for rules which apply on packets coming
>> from wire and coming from host (vports).
>>
>
>
> Right, maybe other HW may have this issue, and this patch
> can help them but currently, this patch solves something in Nvidia HW.
> Template API is all about giving hints, some of the hints can be used
> only buy some PMDs.
> I promise that any vendor that has some way to optimize its PMD
> I will support, may differently name or different place but not all PMD
> are equal, each one needs its hints.
>
>
>>> Think about ingress/egress/transfer why are they not in the pattern?
>>
>> We have no ingress/egress in transfer domain any more because
>> it is ambiguous.
>>
>
> Yes that is why the name is wire and non wire,
My question here is why the application really needs to know it.
Why does it make a difference?
IMHO, for a VM which uses some function, everything coming to it
is from the logical wire.
Of course, since it is the transfer layer, we are talking about an
application like OvS. Maybe OvS knows the difference...
>
>> Transfer itself is really a different domain. Logically and
>> from privileges point of view. That's why it is important to
>> distinguish it.
>>
>> Ingress and egress in non-transfer case are natively bound
>> to two main functions of the driver: transmit (egress rules)
>> and receive (ingress rules). In general, it is a matching
>> criteria as well, but because of its nature (explained
>> above) it is simply handy to distinguish it from the very
>> beginning.
>>
>
> I agree with you, but this is just to show that even if something can be treated
> as matching it is not the best way to look at it that way.
>
>>> They are where rules should be offloaded, they are different domain.
>>
>> We have just two domains: transfer and non-transfer.
>>
>>> Like we have elsewhere for example in action create we can state on which
>>> domain the action should be created. If the application selects a number of
>> domains
>>> it may mean that extra resources will be allocated.>
>>
>> Two more points:
>>
>> 1/ If it is just a hint, it is optional for PMD to
>> support/handle it. It means that it MUST NOT impose any
>> limitations on matching. If so, if you want a rule to
>> be applied on packets coming from wire, you still MUST
>> specify it in the pattern.
>> So, it does not sound like a hint in your case.
>
> Right it is optional if the application doesn't give this hint
> the PMD will create just like it does now tables for both wire and
> non wire.
Let's make it clear here and in the attribute documentation.
You're talking about one side, but there is another one.
If it is just a hint which is optional to specify/interpret,
it does not impose any matching criteria. So, if a rule does
not specify a source, the rule must be applied to traffic
coming from both wire and non-wire. In order to limit
sources, we still need matching criteria anyway, i.e. a pattern
item. Moreover, if a PMD supports the hint and the matching
criteria contradict it, flow rule insertion must fail.
Could you please confirm that we share this understanding?
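
The semantics above could be sketched as follows. The names are
hypothetical and this is not proposed ethdev code; it only restates the
two points: the hint never changes what a rule matches, and a PMD which
honours the hint rejects a rule whose pattern contradicts it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Traffic origin as seen by the embedded switch (hypothetical names). */
enum origin { ORIGIN_ANY, ORIGIN_WIRE, ORIGIN_VPORT };

/* Matching depends on the pattern only; the table hint plays no role. */
bool rule_applies_to(enum origin rule_match, enum origin pkt_from)
{
	assert(pkt_from != ORIGIN_ANY); /* a packet has a concrete origin */
	return rule_match == ORIGIN_ANY || rule_match == pkt_from;
}

/* Insertion check: only a PMD that uses the hint can see a contradiction.
 * A rule with no source under a wire-only or vport-only hint is accepted
 * here; how that case should behave is not settled in this thread. */
int rule_validate(bool pmd_uses_hint, enum origin table_hint,
		  enum origin rule_match)
{
	if (!pmd_uses_hint || table_hint == ORIGIN_ANY)
		return 0;
	if (rule_match != ORIGIN_ANY && rule_match != table_hint)
		return -EINVAL;
	return 0;
}
```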
>
>>
>> 2/ struct rte_flow_attr is used for really all rules.
>> How a new attribute should be interpreted in non-transfer
>> rules? Similar to ingress/egress? Duplication?
>> Or even harder (if it is NOT a hint): should it really
>> enforce matching of packets coming from wire (i.e. not
>> a different vport)? Not sure that it is doable or even
>> make sense.
>> We can say that the attribute may be used for the transfer
>> rules only. If so, it MUST be checked on ethdev level
>> since it is a generic rule.
>>
>
> From my point of view, it should be treated only in case of transfer,
> I think it also stated in the original commit this way.
> Why should we validate it?
Because it is a generic limitation. Otherwise each PMD must
check it. The check is required to avoid misuse.
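
A generic check of that kind might look roughly like the sketch below.
The struct is a local stand-in, not struct rte_flow_attr. transfer_mode is
the member name from the patch description, while its width and value
encoding here are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the proposed transfer_mode member. */
enum {
	MODE_BIDIR = 0,     /* default: both directions */
	MODE_WIRE_ORIG = 1, /* traffic from the wire only */
	MODE_VF_ORIG = 2,   /* traffic from VFs only */
};

/* Local stand-in for the relevant flow attribute bits. */
struct flow_attr_sketch {
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
	uint32_t transfer_mode:2;
};

/* Generic-layer check, done once so that each PMD need not repeat it:
 * a non-default transfer_mode is meaningful only for transfer rules. */
int flow_attr_check(const struct flow_attr_sketch *attr)
{
	assert(attr != NULL);
	if (attr->transfer_mode > MODE_VF_ORIG)
		return -EINVAL;
	if (!attr->transfer && attr->transfer_mode != MODE_BIDIR)
		return -EINVAL;
	return 0;
}
```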
> We don't validate if the application
> set transfer + ingress/egress or just ingress+egress or non.
We're going to add corresponding checks on ethdev when we
finalize deprecation of ingress/egress in transfer rules.
>
>
>> 3/ struct rte_flow_attr is used for sync and async rules.
>> As I understand you're using it for async rules only.
>> Does it make sense for sync rules?
>
> Yes, it can save insertion. Since even the sync API since it doesn't have this bit
> duplicate the rule.
Sorry, I don't understand the above.
> But if pressed we can move it to the table attribute, do you think it will be better?
If it is not a generic thing, IMHO it is better to put it in the
table attributes.
Andrew.
* RE: [PATCH v1] ethdev: add direction info when creating the transfer table
2022-09-23 7:25 ` Andrew Rybchenko
@ 2022-09-23 16:11 ` Ori Kam
0 siblings, 0 replies; 96+ messages in thread
From: Ori Kam @ 2022-09-23 16:11 UTC (permalink / raw)
To: Andrew Rybchenko, Morten Brørup,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Ivan Malov
Cc: Rongwei Liu, Matan Azrad, Slava Ovsiienko, Aman Singh,
Yuying Zhang, dev, Raslan Darawsheh, jerinj, ajit.khaparde
Hi Andrew,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, 23 September 2022 10:26
>
> Hi Ori,
>
> On 9/22/22 16:00, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Thursday, 22 September 2022 13:31
> >>
> >> On 9/22/22 13:06, Ori Kam wrote:
> >>> Hi Andrew,
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>> Sent: Thursday, 22 September 2022 10:39
> >>>>
> >>>> On 9/21/22 15:51, Morten Brørup wrote:
> >>>>>> From: Ori Kam [mailto:orika@nvidia.com]
> >>>>>> Sent: Wednesday, 21 September 2022 14.41
> >>>>>>
> >>>>>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>>>>>>
> >>>>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
> >>>>>>>> 21/09/2022 11:04, Ivan Malov:
> >>>>>>>>> Now it's clear to me that your intention is to match on exact
> >>>>>> ports,
> >>>>>>>>> as usual, but this time with a hint for the flow table. Got it.
> >>>>>>>>>
> >>>>>>>>> In your response, you say that matching on ALL vports is not
> what
> >>>>>>>>> the use case needs. OK, I understood. But please note that the
> >>>>>>>>> item name does not say "ALL", it says "ANY".
> >>>>>>>>>
> >>>>>>>>> OK. Say, "ANY" is also confusing. Let's then name it
> >> "VPORTS_ONLY"
> >>>>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
> >>>> VPORTS_ONLY
> >>>>>>>>> and then provides item REPRESENTED_PORT, these two items
> do
> >> not
> >>>>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of
> >> some
> >>>>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
> >>>>>> narrower.
> >>>>>>>>>
> >>>>>>>>> And, in documentation, one can say clearly that the user *may*
> >>>>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
> >>>>>>>>> they have already submitted this item as part of the template.
> >>>>>>>>
> >>>>>>>> I think the problem that Rongwei & Ori are trying to solve
> >>>>>>>> is to allocate resources for the templates table in the right
> >>>>>> place.
> >>>>>>>> A table can have multiple templates.
> >>>>>>>> If all rules/templates for this table are dedicated to virtual
> >>>>>> ports,
> >>>>>>>> then the table will be allocated in a place managing only virtual
> >>>>>> ports.
> >>>>>>>> This allocation decision must be taken at table creation,
> >>>>>>>> whereas rules will be created later.
> >>>>>>>> In order to do this specific table allocation for vports,
> >>>>>>>> we need to restrict all templates of the table to be "vports only".
> >>>>>>>>
> >>>>>>>> I hope it makes things clearer.
> >>>>>>>> Now the question is how to achieve this? Solutions are:
> >>>>>>>>
> >>>>>>>> 1/ give a hint to the table allocation
> >>>>>>>> 2/ insert a pattern item in all templates of the table
> >>>>>>>>
> >>>>>>>> I don't see any other solution. Please propose if there are more
> >>>>>> options.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> See my mail
> >>>>>>>
> >>>>>>> 3/ use jump rule which ensures that all traffic meets out
> >>>>>>> expectations
> >>>>>>>
> >>>>>>> It means that the table creation could be postponed. Or the
> >>>>>>> table could be per-configured at the point of creation and
> >>>>>>> finalized when we know that all traffic will be from wires
> >>>>>>> or from vports. Yes, it complicates internals to achieve
> >>>>>>> the optimization.
> >>>>>>
> >>>>>> Sorry Andrew your suggestion is not a valid one for the following
> >>>>>> reasons:
> >>>>>> 1. table creation can't be postponed this is a key idea of the rte_flow
> >>>>>> template API.
> >>>>
> >>>> I guess nobody cares if it delays insertion on the first rule
> >>>> only. Anyway, see below.
> >>>>
> >>>>>> 2. we can never know what rules will be inserted if the application
> >>>>>> doesn't tell us.
> >>>>>> how can we know this is the last rule? What do we do with the
> >>>>>> first rule?
> >>>>>> 3. I don't see how jumping helps since it worsens the issue when you
> >>>>>> jump to a table,
> >>>>>> how does the PMD know if this table should have only wire or
> only
> >>>>>> vports?
> >>>>
> >>>> Jump rules say so. PMD can analyze there rules.
> >>>> May be just need an attribute saying that all jump rules
> >>>> to the table are configured and further attempts to reconfigure
> >>>> will be rejected?
> >>>>
> >>>
> >>> The idea is the PMD will not analyze rules. That is why we have the table
> >>> and template.
> >>> Sorry, I don't understand what attribute can be in jump? The jump is just
> >>> to table. It can't say anything about the table destination table.
> >>> This is all this patch adds the attribute to a table to say where this
> >>> table should be located.
> >>>
> >>>>>>
> >>>>>> I agree with Thomas, there are two valid options, I vote for the hint
> >>>>>> since this is the
> >>>>>> feature idea to tell the PMD where this resource should be
> allocated.
> >>>>>
> >>>>> This is an optimization; I agree with Ori that a hint is appropriate, like
> the
> >>>> MBUF_FAST_FREE hint on TX queues.
> >>>>>
> >>>>> No need to add more complexity by requiring the driver to recognize
> >> that
> >>>> the pattern is present in all templates. (And perhaps also remove that
> >>>> pattern when applying the templates.)
> >>>>
> >>>> What does the part of the matching criteria so special
> >>>> that it is allowed to have dedicated hint attribute?
> >>>>
> >>>> May be we can have really generic solution when any
> >>>> part of the matching criteria could provide such hints?
> >>>
> >>> That is the point I keep returning to, it is not matching!
> >>> This is on which HW resource the table should be allocated.
> >>
> >> Sorry, but it is just your HW details that you have different
> >> location/resources for rules which apply on packets coming
> >> from wire and coming from host (vports).
> >>
> >
> >
> > Right, maybe other HW may have this issue, and this patch
> > can help them but currently, this patch solves something in Nvidia HW.
> > Template API is all about giving hints, some of the hints can be used
> > only buy some PMDs.
> > I promise that any vendor that has some way to optimize its PMD
> > I will support, may differently name or different place but not all PMD
> > are equal, each one needs its hints.
> >
> >
> >>> Think about ingress/egress/transfer why are they not in the pattern?
> >>
> >> We have no ingress/egress in transfer domain any more because
> >> it is ambiguous.
> >>
> >
> > Yes that is why the name is wire and non wire,
>
> My question here is why application really needs to know it.
> Why does it make the difference?
> IMHO for a VM which uses some function everything coming to it
> is from the logical wire.
> Of course since it is a transfer layer, we are talking about an
> application like OvS. May be OvS knows the difference...
>
We are talking about the application that controls the switch,
for example OVS; we are not talking about an application that can
control only the NIC (ingress/egress).
In any case, the idea of the template API is to optimize applications
that have prior knowledge and can give hints to the PMD so that
they get a better insertion rate and resource allocation.
If the application doesn't know, that is also O.K., but it will not
get the full performance boost.
> >
> >> Transfer itself is really a different domain. Logically and
> >> from privileges point of view. That's why it is important to
> >> distinguish it.
> >>
> >> Ingress and egress in non-transfer case are natively bound
> >> to two main functions of the driver: transmit (egress rules)
> >> and receive (ingress rules). In general, it is a matching
> >> criteria as well, but because of its nature (explained
> >> above) it is simply handy to distinguish it from the very
> >> beginning.
> >>
> >
> > I agree with you, but this is just to show that even if something can be
> treated
> > as matching it is not the best way to look at it that way.
> >
> >>> They are where rules should be offloaded, they are different domain.
> >>
> >> We have just two domains: transfer and non-transfer.
> >>
> >>> Like we have elsewhere for example in action create we can state on
> which
> >>> domain the action should be created. If the application selects a number
> of
> >> domains
> >>> it may mean that extra resources will be allocated.>
> >>
> >> Two more points:
> >>
> >> 1/ If it is just a hint, it is optional for PMD to
> >> support/handle it. It means that it MUST NOT impose any
> >> limitations on matching. If so, if you want a rule to
> >> be applied on packets coming from wire, you still MUST
> >> specify it in the pattern.
> >> So, it does not sound like a hint in your case.
> >
> > Right it is optional if the application doesn't give this hint
> > the PMD will create just like it does now tables for both wire and
> > non wire.
>
> Let's make it clear here and in the attribute documentation.
> You're taking about one side, but there is an another one.
> If it is just a hint which is optional to specify/interpret,
> it does not impose any matching criteria. So, if a rule does
> not specify source, the rule must be applied on traffic
> coming from both wire and not-wire. In order to limit
> souses, we still need matching criteria anyway - i.e. pattern
> item. Moreover, if a PMD supports the hint and matching
> criteria contradicts it, flow rule insertion must fail.
>
> Could you please confirm that we share our understanding here.
>
I fully agree that it should be clearly stated that this is only a hint.
I'm not sure I agree with you on the second part, which says a rule must be applied
to both sides if there is no source.
If the application gives this hint, then it is bound by it.
For example, it can match only on IPv4 in group X,
while in group X-1 it sets a rule that moves traffic from the wire to group X,
so the application knows that group X holds only traffic that arrived from the
wire, without matching on it.
One can look at the relaxed_matching attribute, which states that the
PMD should match only on fields that have a non-zero mask
and not on the pattern order.
It is the same when the application sets relaxed matching and matches on UDP dport = 100:
the PMD/HW will not verify that the packet is really a UDP packet and will just check the
dport field.
Like everything else, when the application gives a hint it should know that the hint binds
it. It can't say "I give a hint" and then do the reverse of the given hint.
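The group X-1 / group X example above can be sketched in plain C. This is only a toy model under stated assumptions: it does not use the DPDK API, and the packet structure and function names are hypothetical, introduced purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical packet model: where the packet entered the switch. */
enum pkt_origin { ORIGIN_WIRE, ORIGIN_VPORT };

struct pkt {
	enum pkt_origin origin;
	bool is_ipv4;
};

/* Group X-1: the only rule jumps wire-origin packets to group X. */
bool group_prev_jumps_to_x(const struct pkt *p)
{
	return p->origin == ORIGIN_WIRE;
}

/* Group X: the rule matches on IPv4 only, with no source item. */
bool group_x_matches(const struct pkt *p)
{
	return p->is_ipv4;
}

/* A packet hits the group X rule only if it was steered there first. */
bool pipeline_hits_group_x(const struct pkt *p)
{
	return group_prev_jumps_to_x(p) && group_x_matches(p);
}
```

In this model, group X sees only wire-origin traffic because of the jump rule in group X-1, not because of anything in the group X pattern, which is the situation the hint is meant to describe.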
> >
> >>
> >> 2/ struct rte_flow_attr is used for really all rules.
> >> How a new attribute should be interpreted in non-transfer
> >> rules? Similar to ingress/egress? Duplication?
> >> Or even harder (if it is NOT a hint): should it really
> >> enforce matching of packets coming from wire (i.e. not
> >> a different vport)? Not sure that it is doable or even
> >> make sense.
> >> We can say that the attribute may be used for the transfer
> >> rules only. If so, it MUST be checked on ethdev level
> >> since it is a generic rule.
> >>
> >
> > From my point of view, it should be treated only in case of transfer,
> > I think it also stated in the original commit this way.
> > Why should we validate it?
>
> Because it is a generic limitation. Otherwise each PMD must
> check it. The check is required to avoid misusage.
>
> > We don't validate if the application
> > set transfer + ingress/egress or just ingress+egress or non.
>
> We're going to add corresponding checks on ethdev when we
> finalize deprecation of ingress/egress in transfer rules.
>
I will nack this patch,
since it adds extra checks that will result in performance degradation.
If the application does something against the documentation
that doesn't cause a system crash, DPDK should not validate it,
just as you don't validate anything in the data path.
> >
> >
> >> 3/ struct rte_flow_attr is used for sync and async rules.
> >> As I understand you're using it for async rules only.
> >> Does it make sense for sync rules?
> >
> > Yes, it can save insertion. Since even the sync API since it doesn't have this
> bit
> > duplicate the rule.
>
> Sorry I don't understand above.
>
Sorry, something broke in my sentence.
Even with the standard API this hint can help, since with it the PMD can insert
the rule on only one HW resource and doesn't need to duplicate it.
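As a rough illustration of the saving described here, a minimal sketch in plain C follows. It assumes exactly two hardware-side resources (one for wire-origin and one for vport-origin traffic), as described in this thread for one vendor; the enum and function names are hypothetical and not part of DPDK.

```c
#include <assert.h>

/* Hypothetical origin hint for a transfer table. */
enum origin_hint { HINT_NONE, HINT_WIRE_ORIG, HINT_VPORT_ORIG };

/*
 * Copies of one transfer rule in this model: without a hint the rule is
 * duplicated on both resources, with a hint it lands on only one.
 */
int hw_copies_per_rule(enum origin_hint hint)
{
	return hint == HINT_NONE ? 2 : 1;
}

/* Total hardware entries consumed by nb_rules rules under a given hint. */
unsigned long hw_entries(unsigned long nb_rules, enum origin_hint hint)
{
	return nb_rules * (unsigned long)hw_copies_per_rule(hint);
}
```

Under these assumptions a table of 1000 rules consumes 2000 entries without a hint and 1000 with one; real numbers depend entirely on the PMD and the hardware.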
> > But if pressed we can move it to the table attribute, do you think it will be
> better?
>
> If it is not a generic thing, IMHO it is better to put in
> table attributes.
>
So just to make sure I understand correctly, you prefer that all
non-generic attributes be only in the template API?
I'm O.K. with that.
> Andrew.
Thanks,
Ori
* [PATCH v3] ethdev: add hint when creating async transfer table
2022-09-13 14:33 ` Ivan Malov
2022-09-14 5:16 ` Rongwei Liu
@ 2022-09-28 9:24 ` Rongwei Liu
2022-10-04 8:31 ` Andrew Rybchenko
1 sibling, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-09-28 9:24 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
A transfer domain rule is able to match traffic of wire or VF
origin, which means underlying resources for two directions.
In customer deployments, a single flow table usually matches traffic
of only one direction: either from wire or from VF.
Introduce one new member transfer_mode into rte_flow_template_table_attr
to indicate the flow table direction property: from wire, from VF
or bidirectional (default).
It helps to save underlying memory and improves the insertion rate, and this
new field doesn't expose any matching criteria.
By default, the transfer domain matches bidirectional traffic, so
there is no behavior change.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
flow template_table 0 create group 0 priority 0 transfer vf_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 7 ++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
lib/ethdev/rte_flow.h | 7 ++++++
4 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index d97be6fe98..beacb0a802 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -178,6 +178,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1142,6 +1144,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2882,6 +2886,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VF_ORIG] = {
+ .name = "vf_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8895,6 +8911,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.transfer_mode = 1;
+ return len;
+ case TABLE_TRANSFER_VF_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.transfer_mode = 2;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 588914b231..de8d6836d6 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3748,6 +3748,13 @@ the maximum number of flow rules is defined at table creation time.
Any flow rule creation beyond the maximum table size is rejected.
Application may create another table to accommodate more rules in this case.
+Attribute: transfer_mode
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+This is an optional table attribute and is meaningless if `Attribute: Transfer`
+is not specified. It doesn't expose any matching criteria; it is just a hint
+telling the PMD where to bind the rules.
+
.. code-block:: c
struct rte_flow_template_table *
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 5fbec06c35..9a97c5f513 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3125,7 +3125,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vf_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 017f690798..a546e9a72b 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5233,6 +5233,13 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * 0 means bidirection,
+ * 0x1 origin uplink,
+ * 0x2 origin vport,
+ * N/A both set.
+ */
+ uint32_t transfer_mode:2;
};
/**
--
2.31.1
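The v3 parser above stores the literals 1 and 2 in the 2-bit transfer_mode field. A minimal sketch of how those values could be named and checked is given below. It is self-contained and does not include rte_flow.h; the constant names, the model structure and the validation helper are hypothetical, and rejecting a hint on a non-transfer table mirrors only the testpmd check in this patch, not any ethdev behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the values documented in the v3 header comment. */
enum transfer_mode_hint {
	TRANSFER_MODE_BIDIR = 0x0,      /* default: both directions */
	TRANSFER_MODE_WIRE_ORIG = 0x1,  /* origin uplink */
	TRANSFER_MODE_VPORT_ORIG = 0x2, /* origin vport */
};

/* Stand-in for the relevant part of the table attributes. */
struct table_attr_model {
	uint32_t nb_flows;
	uint32_t transfer_mode:2;
};

/* Returns true when the hint value is usable for this table. */
bool transfer_mode_is_valid(const struct table_attr_model *attr, bool transfer)
{
	if (attr->transfer_mode == 0x3)
		return false; /* both bits set is documented as N/A */
	if (attr->transfer_mode != TRANSFER_MODE_BIDIR && !transfer)
		return false; /* hint is meaningless outside transfer */
	return true;
}
```

Named constants make the "both set" case explicit, which the raw assignments of 1 and 2 leave implicit.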
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-09-28 9:24 ` [PATCH v3] ethdev: add hint when creating async " Rongwei Liu
@ 2022-10-04 8:31 ` Andrew Rybchenko
2022-11-04 10:42 ` [PATCH v4] ethdev: add special flags " Rongwei Liu
` (4 more replies)
0 siblings, 5 replies; 96+ messages in thread
From: Andrew Rybchenko @ 2022-10-04 8:31 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, thomas, Aman Singh,
Yuying Zhang, Ferruh Yigit
Cc: dev, rasland
On 9/28/22 12:24, Rongwei Liu wrote:
> The transfer domain rule is able to match traffic wire/vf
> origin and it means two directions' underlayer resource.
>
> In customer deployments, they usually match only one direction
> traffic in single flow table: either from wire or from vf.
>
> Introduce one new member transfer_mode into rte_flow_template_table_attr
> to indicate the flow table direction property: from wire, from vf
> or bi-direction(default).
>
> It helps to save underlayer memory also on insertion rate, and this
> new field doesn't expose any matching criteira.
>
> By default, the transfer domain is to match bi-direction traffic, and
> no behavior changed.
>
> 1. Match wire origin only
> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> 2. Match vf origin only
> flow template_table 0 create group 0 priority 0 transfer vf_orig...
Since wire_orig and vf_orig are just optional hints and not
all PMDs are obliged to handle them, they do not impose any
matching criteria. So, the examples above are misleading and you
need to add pattern items to highlight that the corresponding rules
are really wire_orig or vf_orig.
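The concern can be sketched with a toy model in plain C. It assumes that a PMD honoring the wire hint allocates nothing for vport-origin traffic, so such packets miss the table, and that a PMD ignoring the hint behaves as if it were absent. The function and parameter names are hypothetical and nothing here is DPDK API.

```c
#include <assert.h>
#include <stdbool.h>

enum pkt_origin { ORIGIN_WIRE, ORIGIN_VPORT };

/*
 * Does a rule in a table created with the wire hint hit a packet?
 * All other match fields are assumed to match.
 */
bool rule_hits(bool pmd_honors_wire_hint, bool pattern_has_wire_item,
	       enum pkt_origin origin)
{
	if (origin == ORIGIN_WIRE)
		return true;
	/* vport-origin packet from here on */
	if (pattern_has_wire_item)
		return false; /* excluded by the pattern on any PMD */
	/* no source item: the result depends on whether the hint is honored */
	return !pmd_honors_wire_hint;
}
```

Without a source item the outcome for vport-origin packets differs between PMDs in this model, while with an explicit item it is the same everywhere.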
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Ori Kam <orika@nvidia.com>
[snip]
* [PATCH v4] ethdev: add special flags when creating async transfer table
2022-10-04 8:31 ` Andrew Rybchenko
@ 2022-11-04 10:42 ` Rongwei Liu
2022-11-04 10:44 ` Rongwei Liu
` (3 subsequent siblings)
4 siblings, 0 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-11-04 10:42 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
A transfer domain rule is able to match traffic of wire or vport
origin, which correspond to two kinds of underlying resources.
Wire means traffic arriving from the uplink port, while vport means
traffic initiated from a VF/SF.
In customer deployments, a single flow table usually matches only one
kind of traffic: either from wire or from vport.
The PMD can save significant resources if a special hint is passed from
the rte layer.
There are two possible approaches, using IPv4 as an example:
1. Use pattern item.
pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
"ANY_VPORT" needs to be present in each async rule even if it's
just a hint. No value to match.
2. Add special flags into table_attr. It will be:
template_table 0 create table_id 0 group 1 transfer vport_orig
Approach 1 needs to specify the pattern in each flow rule, which wastes
memory and is not end-user friendly.
This patch takes the 2nd approach and introduces one new member
specialize into rte_flow_template_table_attr to indicate the async flow
table matching optimization: from wire, from vport.
It helps to save underlying memory and improves the insertion rate.
By default, there is no hint, so the behavior of the transfer domain
doesn't change.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vport origin only
flow template_table 0 create group 0 priority 0 transfer vport_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 ++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
lib/ethdev/rte_flow.h | 32 +++++++++++++++++++++
4 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..15f2af9b40 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..1eab12796f 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Special optional flags for template table attribute.
+ * Each bit stands for a table specialization
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+enum rte_flow_template_table_specialize {
+ /**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
+ /**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
+};
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ */
+ enum rte_flow_template_table_specialize specialize;
};
/**
--
2.27.0
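The testpmd handling of the two v4 tokens can be sketched outside DPDK as follows. This is a self-contained model, not the real code: the MODEL_* macros stand in for RTE_BIT32() and the RTE_FLOW_TRANSFER_* flags, and the structure and helper name are hypothetical. Like parse_table() in the patch, the helper refuses the hint unless transfer was set first, and it assigns the flag rather than OR-ing it in.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for RTE_BIT32() and the two flags added by the patch. */
#define MODEL_BIT32(n) (UINT32_C(1) << (n))
#define MODEL_TRANSFER_WIRE_ORIG MODEL_BIT32(0)
#define MODEL_TRANSFER_VPORT_ORIG MODEL_BIT32(1)

/* Stand-in for the relevant part of the template table attributes. */
struct table_attr_model {
	unsigned int transfer:1;
	uint32_t specialize;
};

/* Mirrors the parser: -1 if transfer is not set, else store the hint. */
int set_origin_hint(struct table_attr_model *attr, uint32_t flag)
{
	if (!attr->transfer)
		return -1;
	attr->specialize = flag; /* plain assignment: the last token wins */
	return 0;
}
```

Because the flag is assigned, giving both tokens leaves only the last one set in this model, so a table never ends up with both origin bits at once.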
* [PATCH v4] ethdev: add special flags when creating async transfer table
2022-10-04 8:31 ` Andrew Rybchenko
2022-11-04 10:42 ` [PATCH v4] ethdev: add special flags " Rongwei Liu
@ 2022-11-04 10:44 ` Rongwei Liu
2022-11-08 11:39 ` Andrew Rybchenko
2022-11-06 10:02 ` [PATCH v3] ethdev: add hint " Andrew Rybchenko
` (2 subsequent siblings)
4 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-11-04 10:44 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
A transfer domain rule is able to match traffic of wire or vport
origin, which correspond to two kinds of underlying resources.
Wire means traffic arriving from the uplink port, while vport means
traffic initiated from a VF/SF.
In customer deployments, a single flow table usually matches only one
kind of traffic: either from wire or from vport.
The PMD can save significant resources if a special hint is passed from
the rte layer.
There are two possible approaches, using IPv4 as an example:
1. Use pattern item.
pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
"ANY_VPORT" needs to be present in each async rule even if it's
just a hint. No value to match.
2. Add special flags into table_attr. It will be:
template_table 0 create table_id 0 group 1 transfer vport_orig
Approach 1 needs to specify the pattern in each flow rule, which wastes
memory and is not end-user friendly.
This patch takes the 2nd approach and introduces one new member
specialize into rte_flow_template_table_attr to indicate the async flow
table matching optimization: from wire, from vport.
It helps to save underlying memory and improves the insertion rate.
By default, there is no hint, so the behavior of the transfer domain
doesn't change.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vport origin only
flow template_table 0 create group 0 priority 0 transfer vport_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 ++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
lib/ethdev/rte_flow.h | 32 +++++++++++++++++++++
4 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..15f2af9b40 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..1eab12796f 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Special optional flags for template table attribute.
+ * Each bit stands for a table specialization
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+enum rte_flow_template_table_specialize {
+ /**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
+ /**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
+};
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ */
+ enum rte_flow_template_table_specialize specialize;
};
/**
--
2.27.0
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-10-04 8:31 ` Andrew Rybchenko
2022-11-04 10:42 ` [PATCH v4] ethdev: add special flags " Rongwei Liu
2022-11-04 10:44 ` Rongwei Liu
@ 2022-11-06 10:02 ` Andrew Rybchenko
2022-11-07 1:58 ` Rongwei Liu
2022-11-08 9:19 ` Thomas Monjalon
2022-11-09 8:11 ` [PATCH v5] ethdev: add special flags when creating async transfer table Rongwei Liu
2022-11-09 8:13 ` Rongwei Liu
4 siblings, 2 replies; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-06 10:02 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, thomas, Aman Singh,
Yuying Zhang, Ferruh Yigit
Cc: dev, rasland
On 10/4/22 11:31, Andrew Rybchenko wrote:
> On 9/28/22 12:24, Rongwei Liu wrote:
>> The transfer domain rule is able to match traffic wire/vf
>> origin and it means two directions' underlayer resource.
>>
>> In customer deployments, they usually match only one direction
>> traffic in single flow table: either from wire or from vf.
>>
>> Introduce one new member transfer_mode into rte_flow_template_table_attr
>> to indicate the flow table direction property: from wire, from vf
>> or bi-direction(default).
>>
>> It helps to save underlayer memory also on insertion rate, and this
>> new field doesn't expose any matching criteria.
>>
>> By default, the transfer domain is to match bi-direction traffic, and
>> no behavior changed.
>>
>> 1. Match wire origin only
>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>> 2. Match vf origin only
>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>
> Since wire_orig and vf_orig are just optional hints and not
> all PMDs are obliged to handle it, it does not impose any
> matching criteria. So, example above are misleading and you
> need to add pattern items to highlight that corresponding rules
> are really wire_orig or vf_orig.
I'm sorry, but I still don't see how it is addressed in v4.
>
>>
>> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
>> Acked-by: Ori Kam <orika@nvidia.com>
>
> [snip]
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v3] ethdev: add hint when creating async transfer table
2022-11-06 10:02 ` [PATCH v3] ethdev: add hint " Andrew Rybchenko
@ 2022-11-07 1:58 ` Rongwei Liu
2022-11-08 9:19 ` Thomas Monjalon
1 sibling, 0 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-11-07 1:58 UTC (permalink / raw)
To: Andrew Rybchenko, Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Thomas Monjalon
Cc: dev, Raslan Darawsheh
BR
Rongwei
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Sunday, November 6, 2022 18:03
> To: Rongwei Liu <rongweil@nvidia.com>; Matan Azrad <matan@nvidia.com>;
> Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman
> Singh <aman.deep.singh@intel.com>; Yuying Zhang
> <yuying.zhang@intel.com>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v3] ethdev: add hint when creating async transfer table
>
> On 10/4/22 11:31, Andrew Rybchenko wrote:
> > On 9/28/22 12:24, Rongwei Liu wrote:
> >> The transfer domain rule is able to match traffic wire/vf origin and
> >> it means two directions' underlayer resource.
> >>
> >> In customer deployments, they usually match only one direction
> >> traffic in single flow table: either from wire or from vf.
> >>
> >> Introduce one new member transfer_mode into
> >> rte_flow_template_table_attr to indicate the flow table direction
> >> property: from wire, from vf or bi-direction(default).
> >>
> >> It helps to save underlayer memory also on insertion rate, and this
> >> new field doesn't expose any matching criteria.
> >>
> >> By default, the transfer domain is to match bi-direction traffic, and
> >> no behavior changed.
> >>
> >> 1. Match wire origin only
> >> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >> 2. Match vf origin only
> >> flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >
> > Since wire_orig and vf_orig are just optional hints and not all PMDs
> > are obliged to handle it, it does not impose any matching criteria.
> > So, example above are misleading and you need to add pattern items to
> > highlight that corresponding rules are really wire_orig or vf_orig.
>
> I'm sorry, but I still don't see how it is addressed in v4.
@Thomas Monjalon Could you share some thoughts? Thanks.
>
> >
> >>
> >> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> >> Acked-by: Ori Kam <orika@nvidia.com>
> >
> > [snip]
> >
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-11-06 10:02 ` [PATCH v3] ethdev: add hint " Andrew Rybchenko
2022-11-07 1:58 ` Rongwei Liu
@ 2022-11-08 9:19 ` Thomas Monjalon
2022-11-08 9:35 ` Andrew Rybchenko
1 sibling, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-08 9:19 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
06/11/2022 11:02, Andrew Rybchenko:
> On 10/4/22 11:31, Andrew Rybchenko wrote:
> > On 9/28/22 12:24, Rongwei Liu wrote:
> >> The transfer domain rule is able to match traffic wire/vf
> >> origin and it means two directions' underlayer resource.
> >>
> >> In customer deployments, they usually match only one direction
> >> traffic in single flow table: either from wire or from vf.
Customer deployment is not an argument.
> >> Introduce one new member transfer_mode into rte_flow_template_table_attr
> >> to indicate the flow table direction property: from wire, from vf
> >> or bi-direction(default).
The origin is not a direction.
We should update this sentence.
> >> It helps to save underlayer memory also on insertion rate, and this
> >> new field doesn't expose any matching criteria.
Should be reworded.
> >> By default, the transfer domain is to match bi-direction traffic, and
> >> no behavior changed.
This sentence is confusing, it should be removed.
> >> 1. Match wire origin only
> >> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >> 2. Match vf origin only
> >> flow template_table 0 create group 0 priority 0 transfer vf_orig...
This testpmd example needs to be introduced with a sentence.
> > Since wire_orig and vf_orig are just optional hints and not
> > all PMDs are obliged to handle it, it does not impose any
> > matching criteria.
Yes
> > So, example above are misleading and you
> > need to add pattern items to highlight that corresponding rules
> > are really wire_orig or vf_orig.
This is template table creation, so I don't think there is more to add.
What do you have in mind?
> I'm sorry, but I still don't see how it is addressed in v4.
I think the documentation in v4 is pretty clear.
Do you see something in the doc which is confusing?
The commit message needs rewording though.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-11-08 9:19 ` Thomas Monjalon
@ 2022-11-08 9:35 ` Andrew Rybchenko
2022-11-08 11:18 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-08 9:35 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
On 11/8/22 12:19, Thomas Monjalon wrote:
> 06/11/2022 11:02, Andrew Rybchenko:
>> On 10/4/22 11:31, Andrew Rybchenko wrote:
>>> On 9/28/22 12:24, Rongwei Liu wrote:
>>>> The transfer domain rule is able to match traffic wire/vf
>>>> origin and it means two directions' underlayer resource.
>>>>
>>>> In customer deployments, they usually match only one direction
>>>> traffic in single flow table: either from wire or from vf.
>
> Customer deployment is not an argument.
>
>>>> Introduce one new member transfer_mode into rte_flow_template_table_attr
>>>> to indicate the flow table direction property: from wire, from vf
>>>> or bi-direction(default).
>
> The origin is not a direction.
> We should update this sentence.
>
>>>> It helps to save underlayer memory also on insertion rate, and this
>>>> new field doesn't expose any matching criteria.
>
> Should be reworded.
>
>>>> By default, the transfer domain is to match bi-direction traffic, and
>>>> no behavior changed.
>
> This sentence is confusing, it should be removed.
>
>>>> 1. Match wire origin only
>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>> 2. Match vf origin only
>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>
> This testpmd example needs to be introduced with a sentence.
>
>>> Since wire_orig and vf_orig are just optional hints and not
>>> all PMDs are obliged to handle it, it does not impose any
>>> matching criteria.
>
> Yes
>
>>> So, example above are misleading and you
>>> need to add pattern items to highlight that corresponding rules
>>> are really wire_orig or vf_orig.
>
> This is template table creation, so I don't think there is more to add.
> What do you have in mind?
>
Since the origin is just a hint and does not impose any matching
criteria, the example must make clear that the corresponding
rules need pattern items which define the origin.
>> I'm sorry, but I still don't see how it is addressed in v4.
>
> I think the documentation in v4 is pretty clear.
> Do you see something in the doc which is confusing?
> The commit message needs rewording though.
>
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-11-08 9:35 ` Andrew Rybchenko
@ 2022-11-08 11:18 ` Thomas Monjalon
2022-11-08 11:48 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-08 11:18 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
08/11/2022 10:35, Andrew Rybchenko:
> On 11/8/22 12:19, Thomas Monjalon wrote:
> > 06/11/2022 11:02, Andrew Rybchenko:
> >> On 10/4/22 11:31, Andrew Rybchenko wrote:
> >>> On 9/28/22 12:24, Rongwei Liu wrote:
> >>>> The transfer domain rule is able to match traffic wire/vf
> >>>> origin and it means two directions' underlayer resource.
> >>>>
> >>>> In customer deployments, they usually match only one direction
> >>>> traffic in single flow table: either from wire or from vf.
> >
> > Customer deployment is not an argument.
> >
> >>>> Introduce one new member transfer_mode into rte_flow_template_table_attr
> >>>> to indicate the flow table direction property: from wire, from vf
> >>>> or bi-direction(default).
> >
> > The origin is not a direction.
> > We should update this sentence.
> >
> >>>> It helps to save underlayer memory also on insertion rate, and this
> >>>> new field doesn't expose any matching criteria.
> >
> > Should be reworded.
> >
> >>>> By default, the transfer domain is to match bi-direction traffic, and
> >>>> no behavior changed.
> >
> > This sentence is confusing, it should be removed.
> >
> >>>> 1. Match wire origin only
> >>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> >>>> 2. Match vf origin only
> >>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
> >
> > This testpmd example needs to be introduced with a sentence.
> >
> >>> Since wire_orig and vf_orig are just optional hints and not
> >>> all PMDs are obliged to handle it, it does not impose any
> >>> matching criteria.
> >
> > Yes
> >
> >>> So, example above are misleading and you
> >>> need to add pattern items to highlight that corresponding rules
> >>> are really wire_orig or vf_orig.
> >
> > This is template table creation, so I don't think there is more to add.
> > What do you have in mind?
> >
>
> Since origin is just a hint which does not impose any matching
> criteria it must be highlighted in example that corresponding
> rules must have some pattern items defining corresponding
> origin.
Yes, we could talk about the corresponding rules in the commit message.
What do you think of the explanations in the doc?
> >> I'm sorry, but I still don't see how it is addressed in v4.
> >
> > I think the documentation in v4 is pretty clear.
> > Do you see something in the doc which is confusing?
> > The commit message needs rewording though.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-04 10:44 ` Rongwei Liu
@ 2022-11-08 11:39 ` Andrew Rybchenko
2022-11-08 11:47 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-08 11:39 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, thomas, Aman Singh,
Yuying Zhang, Ferruh Yigit
Cc: dev, rasland
On 11/4/22 13:44, Rongwei Liu wrote:
> The transfer domain rule is able to match traffic wire/vport
> origin which are corresponding to two kinds of underlayer resources.
>
> Wire means traffic arrives from the uplink port while vport means
> traffic initiated from VF/SF.
>
> In customer deployments, they usually match only one kind of
> traffic in single flow table: either from wire or from vport.
> PMD can save significant resources if passing special hint from rte
> layer.
>
> There are two possible approaches, using IPv4 as an example:
> 1. Use pattern item.
> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> "ANY_VPORT" needs to be present in each async rule even if it's
> just a hint. No value to match.
>
> 2. Add special flags into table_attr. It will be:
> template_table 0 create table_id 0 group 1 transfer vf_orig
>
> Approach 1 needs to specify the pattern in each flow rules which wastes
> memory and not end user friendly.
> This patch takes the 2nd approach and introduce one new member
> specialize into rte_flow_table_attr to indicate async flow table matching
> optimization: from wire, from vport.
>
> It helps to save underlayer memory and also on insertion rate.
>
> By default, there is no hint, so the behavior of the transfer domain
> doesn't change.
>
> 1. Match wire origin only
> flow template_table 0 create group 0 priority 0 transfer wire_orig...
> 2. Match vf origin only
> flow template_table 0 create group 0 priority 0 transfer vport_orig...
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Ori Kam <orika@nvidia.com>
>
> v2: Move the new field to template table attribute.
> v4: Mark it as optional and clear the concept.
> ---
> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++
> doc/guides/prog_guide/rte_flow.rst | 15 ++++++++++
> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
> lib/ethdev/rte_flow.h | 32 +++++++++++++++++++++
> 4 files changed, 75 insertions(+), 1 deletion(-)
>
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 88108498e0..15f2af9b40 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -184,6 +184,8 @@ enum index {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VPORT_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VPORT_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
> .next = NEXT(next_table_attr),
> .call = parse_table,
> },
> + [TABLE_TRANSFER_WIRE_ORIG] = {
> + .name = "wire_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> + [TABLE_TRANSFER_VPORT_ORIG] = {
> + .name = "vport_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> [TABLE_RULES_NUMBER] = {
> .name = "rules_number",
> .help = "number of rules in table",
> @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
> case TABLE_TRANSFER:
> out->args.table.attr.flow_attr.transfer = 1;
> return len;
> + case TABLE_TRANSFER_WIRE_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
> + return len;
> + case TABLE_TRANSFER_VPORT_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
> + return len;
> default:
> return -1;
> }
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 3e6242803d..d9ca041ae4 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> &actions_templates, nb_actions_templ,
> &error);
>
> +Table Attribute: Specialize
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Application can help optimizing underlayer resources and insertion rate
> +by specializing template table.
> +Specialization is done by providing hints
> +in the template table attribute ``specialize``.
> +
> +This attribute is not mandatory for each PMD to implement.
> +If a hint is not supported, it will be silently ignored,
> +and no special optimization is done.
> +
> +If a table is specialized, the application should make sure the rules
> +comply with the table attribute.
> +
> Asynchronous operations
> -----------------------
>
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 96c5ae0fe4..b3238415f4 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
>
> flow template_table {port_id} create
> [table_id {id}] [group {group_id}]
> - [priority {level}] [ingress] [egress] [transfer]
> + [priority {level}] [ingress] [egress]
> + [transfer [vport_orig] [wire_orig]]
> rules_number {number}
> pattern_template {pattern_template_id}
> actions_template {actions_template_id}
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index 8858b56428..1eab12796f 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> */
> struct rte_flow_template_table;
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Special optional flags for template table attribute.
> + * Each bit stands for a table specialization
> + * offering a potential optimization at PMD layer.
> + * PMD can ignore the unsupported bits silently.
> + */
> +enum rte_flow_template_table_specialize {
> + /**
> + * Specialize table for transfer flows which come only from wire.
> + * It allows PMD not to allocate resources for non-wire originated traffic.
> + * This bit is not a matching criteria, just an optimization hint.
> + * Flow rules which match non-wire originated traffic will be missed
> + * if the hint is supported.
> + */
> + RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
> + /**
> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
> + * It allows PMD not to allocate resources for non-vport originated traffic.
> + * This bit is not a matching criteria, just an optimization hint.
> + * Flow rules which match non-vport originated traffic will be missed
> + * if the hint is supported.
> + */
> + RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
> +};
> +
> /**
> * @warning
> * @b EXPERIMENTAL: this API may change without prior notice.
> @@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
> * Maximum number of flow rules that this table holds.
> */
> uint32_t nb_flows;
> + /**
> + * Optional hint flags for PMD optimization.
> + */
> + enum rte_flow_template_table_specialize specialize;
IMHO it is not 100% correct to use an enum for flags, since
RTE_FLOW_TRANSFER_WIRE_ORIG | RTE_FLOW_TRANSFER_VPORT_ORIG
is not an enum member. uint32_t is a better option here, since
the bits are defined with RTE_BIT32. The enum should then be
mentioned in the field description.
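To illustrate the point about flags, here is a minimal standalone sketch, not the patch itself. All SKETCH_/sketch_ names are hypothetical stand-ins (SKETCH_BIT32 mimics RTE_BIT32) so that it builds without DPDK headers. It keeps the hints in a uint32_t bitmask, where a combination of both bits is an ordinary value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for DPDK's RTE_BIT32(); an assumption made for this sketch. */
#define SKETCH_BIT32(nr) (UINT32_C(1) << (nr))

/* Bit positions mirror the two hints proposed in the patch. */
#define SKETCH_TRANSFER_WIRE_ORIG  SKETCH_BIT32(0)
#define SKETCH_TRANSFER_VPORT_ORIG SKETCH_BIT32(1)
#define SKETCH_KNOWN_HINTS \
	(SKETCH_TRANSFER_WIRE_ORIG | SKETCH_TRANSFER_VPORT_ORIG)

/* Hypothetical, trimmed-down table attribute with only the fields used here. */
struct sketch_table_attr {
	uint32_t nb_flows;
	uint32_t specialize; /* bitmask of SKETCH_TRANSFER_* flags */
};

/* OR one or more hint bits into the attribute; unknown bits are rejected. */
static inline void
sketch_attr_add_hint(struct sketch_table_attr *attr, uint32_t hints)
{
	assert((hints & ~SKETCH_KNOWN_HINTS) == 0);
	attr->specialize |= hints;
}

/* Return true if every bit of 'hint' is set in the attribute. */
static inline bool
sketch_attr_has_hint(const struct sketch_table_attr *attr, uint32_t hint)
{
	return (attr->specialize & hint) == hint;
}
```

With a uint32_t field, setting both bits simply stores a combined bitmask, and no value ever falls outside a declared enumerator list.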
> };
>
> /**
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-08 11:39 ` Andrew Rybchenko
@ 2022-11-08 11:47 ` Andrew Rybchenko
2022-11-08 13:29 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-08 11:47 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, thomas, Aman Singh,
Yuying Zhang, Ferruh Yigit
Cc: dev, rasland
On 11/8/22 14:39, Andrew Rybchenko wrote:
> On 11/4/22 13:44, Rongwei Liu wrote:
>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>> index 8858b56428..1eab12796f 100644
>> --- a/lib/ethdev/rte_flow.h
>> +++ b/lib/ethdev/rte_flow.h
>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>> */
>> struct rte_flow_template_table;
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change without prior notice.
>> + *
>> + * Special optional flags for template table attribute.
>> + * Each bit stands for a table specialization
>> + * offering a potential optimization at PMD layer.
>> + * PMD can ignore the unsupported bits silently.
>> + */
>> +enum rte_flow_template_table_specialize {
>> + /**
>> + * Specialize table for transfer flows which come only from wire.
>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>> + * This bit is not a matching criteria, just an optimization hint.
>> + * Flow rules which match non-wire originated traffic will be missed
>> + * if the hint is supported.
Sorry, but if so, the hint changes behavior.
Let's consider a rule which matches both VF-originated and
wire-originated traffic. Will the rule be missed (ignored)
regardless of whether the hint is supported or not?
I.e. it will not apply to wire-originated traffic as well.
>> + */
>> + RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
>> + /**
>> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
>> + * It allows PMD not to allocate resources for non-vport originated traffic.
>> + * This bit is not a matching criteria, just an optimization hint.
>> + * Flow rules which match non-vport originated traffic will be missed
>> + * if the hint is supported.
>> + */
>> + RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
>> +};
>> +
>> /**
>> * @warning
>> * @b EXPERIMENTAL: this API may change without prior notice.
>> @@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
>> * Maximum number of flow rules that this table holds.
>> */
>> uint32_t nb_flows;
>> + /**
>> + * Optional hint flags for PMD optimization.
>> + */
>> + enum rte_flow_template_table_specialize specialize;
>
>
> IMHO it is not 100% correct to use enum for flag since
> RTE_FLOW_TRANSFER_WIRE_ORIG | RTE_FLOW_TRANSFER_VPORT_ORIG
> is not the enum member. uint32_t is a better option here since
> bits are defined as RTE_BIT32. enum should be mentioned in the
> description.
>
>> };
>> /**
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v3] ethdev: add hint when creating async transfer table
2022-11-08 11:18 ` Thomas Monjalon
@ 2022-11-08 11:48 ` Andrew Rybchenko
2022-11-14 8:47 ` [PATCH v6] ethdev: add special flags " Rongwei Liu
2022-11-14 11:59 ` [PATCH v7] " Rongwei Liu
0 siblings, 2 replies; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-08 11:48 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
On 11/8/22 14:18, Thomas Monjalon wrote:
> 08/11/2022 10:35, Andrew Rybchenko:
>> On 11/8/22 12:19, Thomas Monjalon wrote:
>>> 06/11/2022 11:02, Andrew Rybchenko:
>>>> On 10/4/22 11:31, Andrew Rybchenko wrote:
>>>>> On 9/28/22 12:24, Rongwei Liu wrote:
>>>>>> The transfer domain rule is able to match traffic wire/vf
>>>>>> origin and it means two directions' underlayer resource.
>>>>>>
>>>>>> In customer deployments, they usually match only one direction
>>>>>> traffic in single flow table: either from wire or from vf.
>>>
>>> Customer deployment is not an argument.
>>>
>>>>>> Introduce one new member transfer_mode into rte_flow_template_table_attr
>>>>>> to indicate the flow table direction property: from wire, from vf
>>>>>> or bi-direction(default).
>>>
>>> The origin is not a direction.
>>> We should update this sentence.
>>>
>>>>>> It helps to save underlayer memory also on insertion rate, and this
>>>>>> new field doesn't expose any matching criteria.
>>>
>>> Should be reworded.
>>>
>>>>>> By default, the transfer domain is to match bi-direction traffic, and
>>>>>> no behavior changed.
>>>
>>> This sentence is confusing, it should be removed.
>>>
>>>>>> 1. Match wire origin only
>>>>>> flow template_table 0 create group 0 priority 0 transfer wire_orig...
>>>>>> 2. Match vf origin only
>>>>>> flow template_table 0 create group 0 priority 0 transfer vf_orig...
>>>
>>> This testpmd example needs to be introduced with a sentence.
>>>
>>>>> Since wire_orig and vf_orig are just optional hints and not
>>>>> all PMDs are obliged to handle it, it does not impose any
>>>>> matching criteria.
>>>
>>> Yes
>>>
>>>>> So, example above are misleading and you
>>>>> need to add pattern items to highlight that corresponding rules
>>>>> are really wire_orig or vf_orig.
>>>
>>> This is template table creation, so I don't think there is more to add.
>>> What do you have in mind?
>>>
>>
>> Since origin is just a hint which does not impose any matching
>> criteria it must be highlighted in example that corresponding
>> rules must have some pattern items defining corresponding
>> origin.
>
> Yes we could talk about corresponding rules in the commit message.
>
> What do you think of the explanations in the doc?
I've replied on v4.
>
>>>> I'm sorry, but I still don't see how it is addressed in v4.
>>>
>>> I think the documentation in v4 is pretty clear.
>>> Do you see something in the doc which is confusing?
>>> The commit message needs rewording though.
>
>
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-08 11:47 ` Andrew Rybchenko
@ 2022-11-08 13:29 ` Thomas Monjalon
2022-11-08 14:38 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-08 13:29 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
08/11/2022 12:47, Andrew Rybchenko:
> On 11/8/22 14:39, Andrew Rybchenko wrote:
> > On 11/4/22 13:44, Rongwei Liu wrote:
> >> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >> index 8858b56428..1eab12796f 100644
> >> --- a/lib/ethdev/rte_flow.h
> >> +++ b/lib/ethdev/rte_flow.h
> >> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> >> */
> >> struct rte_flow_template_table;
> >> +/**
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * Special optional flags for template table attribute.
> >> + * Each bit stands for a table specialization
> >> + * offering a potential optimization at PMD layer.
> >> + * PMD can ignore the unsupported bits silently.
> >> + */
> >> +enum rte_flow_template_table_specialize {
> >> + /**
> >> + * Specialize table for transfer flows which come only from wire.
> >> + * It allows PMD not to allocate resources for non-wire originated traffic.
> >> + * This bit is not a matching criteria, just an optimization hint.
> >> + * Flow rules which match non-wire originated traffic will be missed
> >> + * if the hint is supported.
>
> Sorry, but if so, the hint changes behavior.
Yes the hint may change behaviour.
> Let's consider a rule which matches both VF originating and
> wire originating traffic. Will the rule be missed (ignored)
> regardless if the hint is supported or not?
If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
the PMD may assume the table won't be used for traffic
which is not coming from wire ports.
As a consequence, the table may be implemented on the path
of wire traffic only.
In this case, the traffic coming from virtual ports
won't be affected by this table.
To answer the question, a rule matching both virtual and wire traffic
will be applied in a table affecting only wire traffic,
so it will still apply (not completely ignored).
If you really want to manage both types of traffic in this table,
you must not use such a hint.
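The behaviour described above can be modelled with a small standalone sketch. This is only an assumption about how a PMD that honours the hint might place the table, with hypothetical sketch_ names and no DPDK dependency; it is not taken from any driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hint bits, mirroring the two flags under discussion. */
#define SKETCH_WIRE_ORIG  (UINT32_C(1) << 0)
#define SKETCH_VPORT_ORIG (UINT32_C(1) << 1)

/* Where a packet entered the transfer domain. */
enum sketch_origin { SKETCH_FROM_WIRE, SKETCH_FROM_VPORT };

/*
 * Return true if traffic of the given origin traverses the table.
 * 'hint_honored' tells whether the (imaginary) PMD acts on the hint
 * or silently ignores it.
 */
static inline bool
sketch_table_sees(uint32_t specialize, bool hint_honored,
		  enum sketch_origin origin)
{
	uint32_t origin_bit;

	assert(origin == SKETCH_FROM_WIRE || origin == SKETCH_FROM_VPORT);
	origin_bit = (origin == SKETCH_FROM_WIRE) ?
		     SKETCH_WIRE_ORIG : SKETCH_VPORT_ORIG;

	/* No hint, or a PMD ignoring it: the table sits on both paths. */
	if (!hint_honored || specialize == 0)
		return true;

	/* Honoured hint: the table exists only on the hinted path(s). */
	return (specialize & origin_bit) != 0;
}
```

In this model a rule written to match both origins, inserted into a wire-only table, still applies to wire traffic, while vport traffic never reaches the table when the hint is honoured.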
> I.e. it will not apply to wire originated traffic as well.
>
> >> + */
> >> + RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
> >> + /**
> >> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
> >> + * It allows PMD not to allocate resources for non-vport originated traffic.
> >> + * This bit is not a matching criteria, just an optimization hint.
> >> + * Flow rules which match non-vport originated traffic will be missed
> >> + * if the hint is supported.
> >> + */
> >> + RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
> >> +};
> >> +
> >> /**
> >> * @warning
> >> * @b EXPERIMENTAL: this API may change without prior notice.
> >> @@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
> >> * Maximum number of flow rules that this table holds.
> >> */
> >> uint32_t nb_flows;
> >> + /**
> >> + * Optional hint flags for PMD optimization.
> >> + */
> >> + enum rte_flow_template_table_specialize specialize;
> >
> >
> > IMHO it is not 100% correct to use enum for flag since
> > RTE_FLOW_TRANSFER_WIRE_ORIG | RTE_FLOW_TRANSFER_VPORT_ORIG
> > is not the enum member. uint32_t is a better option here since
> > bits are defined as RTE_BIT32. enum should be mentioned in the
> > description.
I agree, let's not use enum.
Instead we can mention the prefix of the defines in the comments.
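As a rough illustration of that direction, here is a standalone sketch of flag bits kept as defines and stored in a uint32_t field. The SKETCH_* names and struct sketch_table_attr are hypothetical stand-ins, not the DPDK definitions; only the pattern (bit defines ORed into a plain integer) follows the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hint bits (not the DPDK names). */
#define SKETCH_TRANSFER_WIRE_ORIG  (UINT32_C(1) << 0)
#define SKETCH_TRANSFER_VPORT_ORIG (UINT32_C(1) << 1)
#define SKETCH_SPECIALIZE_MASK \
	(SKETCH_TRANSFER_WIRE_ORIG | SKETCH_TRANSFER_VPORT_ORIG)

/* Hypothetical table attribute: the flags live in a plain uint32_t. */
struct sketch_table_attr {
	uint32_t nb_flows;
	uint32_t specialize; /* OR of SKETCH_TRANSFER_* bits, 0 means no hint */
};

/* Add one or more hint bits; bits outside the known mask trip the assert. */
static inline void
sketch_attr_add_hint(struct sketch_table_attr *attr, uint32_t bits)
{
	assert((bits & ~SKETCH_SPECIALIZE_MASK) == 0);
	attr->specialize |= bits;
}

/* Return 1 if all of the given hint bits are set, 0 otherwise. */
static inline int
sketch_attr_has_hint(const struct sketch_table_attr *attr, uint32_t bits)
{
	return (attr->specialize & bits) == bits;
}
```

With an integer field, the combination of both bits is an ordinary value of the field type, which is not the case when the field is declared as the enum itself.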
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-08 13:29 ` Thomas Monjalon
@ 2022-11-08 14:38 ` Andrew Rybchenko
2022-11-08 15:25 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-08 14:38 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
On 11/8/22 16:29, Thomas Monjalon wrote:
> 08/11/2022 12:47, Andrew Rybchenko:
>> On 11/8/22 14:39, Andrew Rybchenko wrote:
>>> On 11/4/22 13:44, Rongwei Liu wrote:
>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>> index 8858b56428..1eab12796f 100644
>>>> --- a/lib/ethdev/rte_flow.h
>>>> +++ b/lib/ethdev/rte_flow.h
>>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>>>> */
>>>> struct rte_flow_template_table;
>>>> +/**
>>>> + * @warning
>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>>> + *
>>>> + * Special optional flags for template table attribute.
>>>> + * Each bit stands for a table specialization
>>>> + * offering a potential optimization at PMD layer.
>>>> + * PMD can ignore the unsupported bits silently.
>>>> + */
>>>> +enum rte_flow_template_table_specialize {
>>>> + /**
>>>> + * Specialize table for transfer flows which come only from wire.
>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>> + * Flow rules which match non-wire originated traffic will be missed
>>>> + * if the hint is supported.
>>
>> Sorry, but if so, the hint changes behavior.
>
> Yes the hint may change behaviour.
>
>> Let's consider a rule which matches both VF originating and
>> wire originating traffic. Will the rule be missed (ignored)
>> regardless if the hint is supported or not?
>
> If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
> the PMD may assume the table won't be used for traffic
> which is not coming from wire ports.
> As a consequence, the table may be implemented on the path
> of wire traffic only.
> In this case, the traffic coming from virtual ports
> won't be affected by this table.
> To answer the question, a rule matching both virtual and wire traffic
> will be applied in a table affecting only wire traffic,
> so it will still apply (not completely ignored).
If so, it is not a hint. It becomes a matching criterion
which should be in the pattern, as we discussed.
>
> If you really want to manage both types of traffic in this table,
> you must not use such hint.
>
>> I.e. it will not apply to wire originated traffic as well.
>>
>>>> + */
>>>> + RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
>>>> + /**
>>>> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
>>>> + * It allows PMD not to allocate resources for non-vport originated traffic.
>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>> + * Flow rules which match non-vport originated traffic will be missed
>>>> + * if the hint is supported.
>>>> + */
>>>> + RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
>>>> +};
>>>> +
>>>> /**
>>>> * @warning
>>>> * @b EXPERIMENTAL: this API may change without prior notice.
>>>> @@ -5201,6 +5229,10 @@ struct rte_flow_template_table_attr {
>>>> * Maximum number of flow rules that this table holds.
>>>> */
>>>> uint32_t nb_flows;
>>>> + /**
>>>> + * Optional hint flags for PMD optimization.
>>>> + */
>>>> + enum rte_flow_template_table_specialize specialize;
>>>
>>>
>>> IMHO it is not 100% correct to use enum for flag since
>>> RTE_FLOW_TRANSFER_WIRE_ORIG | RTE_FLOW_TRANSFER_VPORT_ORIG
>>> is not the enum member. uint32_t is a better option here since
>>> bits are defined as RTE_BIT32. enum should be mentioned in the
>>> description.
>
> I agree, let's not use enum.
> Instead we can mention the prefix of the defines in the comments.
>
>
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-08 14:38 ` Andrew Rybchenko
@ 2022-11-08 15:25 ` Thomas Monjalon
2022-11-09 8:53 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-08 15:25 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
08/11/2022 15:38, Andrew Rybchenko:
> On 11/8/22 16:29, Thomas Monjalon wrote:
> > 08/11/2022 12:47, Andrew Rybchenko:
> >> On 11/8/22 14:39, Andrew Rybchenko wrote:
> >>> On 11/4/22 13:44, Rongwei Liu wrote:
> >>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>>> index 8858b56428..1eab12796f 100644
> >>>> --- a/lib/ethdev/rte_flow.h
> >>>> +++ b/lib/ethdev/rte_flow.h
> >>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> >>>> */
> >>>> struct rte_flow_template_table;
> >>>> +/**
> >>>> + * @warning
> >>>> + * @b EXPERIMENTAL: this API may change without prior notice.
> >>>> + *
> >>>> + * Special optional flags for template table attribute.
> >>>> + * Each bit stands for a table specialization
> >>>> + * offering a potential optimization at PMD layer.
> >>>> + * PMD can ignore the unsupported bits silently.
> >>>> + */
> >>>> +enum rte_flow_template_table_specialize {
> >>>> + /**
> >>>> + * Specialize table for transfer flows which come only from wire.
> >>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
> >>>> + * This bit is not a matching criteria, just an optimization hint.
> >>>> + * Flow rules which match non-wire originated traffic will be missed
> >>>> + * if the hint is supported.
> >>
> >> Sorry, but if so, the hint changes behavior.
> >
> > Yes the hint may change behaviour.
> >
> >> Let's consider a rule which matches both VF originating and
> >> wire originating traffic. Will the rule be missed (ignored)
> >> regardless if the hint is supported or not?
> >
> > If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
> > the PMD may assume the table won't be used for traffic
> > which is not coming from wire ports.
> > As a consequence, the table may be implemented on the path
> > of wire traffic only.
> > In this case, the traffic coming from virtual ports
> > won't be affected by this table.
> > To answer the question, a rule matching both virtual and wire traffic
> > will be applied in a table affecting only wire traffic,
> > so it will still apply (not completely ignored).
>
> If so, it is not a hint. It becomes matching criteria
> which should be in pattern as we discussed.
It is not a strict matching because the PMD is free to support it or not.
^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v5] ethdev: add special flags when creating async transfer table
2022-10-04 8:31 ` Andrew Rybchenko
` (2 preceding siblings ...)
2022-11-06 10:02 ` [PATCH v3] ethdev: add hint " Andrew Rybchenko
@ 2022-11-09 8:11 ` Rongwei Liu
2022-11-09 8:13 ` Rongwei Liu
4 siblings, 0 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-11-09 8:11 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
A transfer domain rule can match traffic originating from the wire
or from a vport, which correspond to two kinds of underlying resources.
Wire means traffic arriving from the uplink port, while vport means
traffic initiated from a VF/SF.
In customer deployments, a single flow table usually matches only
one kind of traffic: either from wire or from vport.
The PMD can save significant resources if a special hint is passed
from the rte layer.
There are two possible approaches, using IPv4 as an example:
1. Use a pattern item.
pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
"ANY_VPORT" needs to be present in each async rule even though it is
just a hint with no value to match.
2. Add special flags into table_attr. It will be:
template_table 0 create table_id 0 group 1 transfer vport_orig
Approach 1 requires the pattern item in every flow rule, which wastes
memory and is not end-user friendly.
This patch takes the 2nd approach and introduces one new member,
specialize, into rte_flow_template_table_attr to indicate the async
flow table matching optimization: from wire or from vport.
It helps to save underlying memory and improves the insertion rate.
By default, there is no hint, so the behavior of the transfer domain
doesn't change.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vport origin only
flow template_table 0 create group 0 priority 0 transfer vport_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
---
app/test-pmd/cmdline_flow.c | 26 ++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 +++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
lib/ethdev/rte_flow.h | 34 +++++++++++++++++++++
4 files changed, 77 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..15f2af9b40 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..b3b462d0cd 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Special optional flags for template table attribute.
+ * Each bit stands for a table specialization
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+enum rte_flow_template_table_specialize {
+ /**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
+ /**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
+};
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5229,12 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ * The value should be picked up from
+ * enumeration rte_flow_template_table_specialize.
+ */
+ uint32_t specialize;
};
/**
--
2.27.0
^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v5] ethdev: add special flags when creating async transfer table
2022-10-04 8:31 ` Andrew Rybchenko
` (3 preceding siblings ...)
2022-11-09 8:11 ` [PATCH v5] ethdev: add special flags when creating async transfer table Rongwei Liu
@ 2022-11-09 8:13 ` Rongwei Liu
2022-11-09 8:31 ` Thomas Monjalon
4 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2022-11-09 8:13 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
A transfer domain rule can match traffic originating from the wire
or from a vport, which correspond to two kinds of underlying resources.
Wire means traffic arriving from the uplink port, while vport means
traffic initiated from a VF/SF.
In customer deployments, a single flow table usually matches only
one kind of traffic: either from wire or from vport.
The PMD can save significant resources if a special hint is passed
from the rte layer.
There are two possible approaches, using IPv4 as an example:
1. Use a pattern item.
pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
"ANY_VPORT" needs to be present in each async rule even though it is
just a hint with no value to match.
2. Add special flags into table_attr. It will be:
template_table 0 create table_id 0 group 1 transfer vport_orig
Approach 1 requires the pattern item in every flow rule, which wastes
memory and is not end-user friendly.
This patch takes the 2nd approach and introduces one new member,
specialize, into rte_flow_template_table_attr to indicate the async
flow table matching optimization: from wire or from vport.
It helps to save underlying memory and improves the insertion rate.
By default, there is no hint, so the behavior of the transfer domain
doesn't change.
1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vport origin only
flow template_table 0 create group 0 priority 0 transfer vport_orig...
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
---
app/test-pmd/cmdline_flow.c | 26 ++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 +++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 +-
lib/ethdev/rte_flow.h | 34 +++++++++++++++++++++
4 files changed, 77 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..15f2af9b40 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..b3b462d0cd 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Special optional flags for template table attribute.
+ * Each bit stands for a table specialization
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+enum rte_flow_template_table_specialize {
+ /**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_WIRE_ORIG = RTE_BIT32(0),
+ /**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+ RTE_FLOW_TRANSFER_VPORT_ORIG = RTE_BIT32(1),
+};
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5229,12 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ * The value should be picked up from
+ * enumeration rte_flow_template_table_specialize.
+ */
+ uint32_t specialize;
};
/**
--
2.27.0
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5] ethdev: add special flags when creating async transfer table
2022-11-09 8:13 ` Rongwei Liu
@ 2022-11-09 8:31 ` Thomas Monjalon
0 siblings, 0 replies; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-09 8:31 UTC (permalink / raw)
To: Rongwei Liu
Cc: matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko, dev, rasland
09/11/2022 09:13, Rongwei Liu:
> v2: Move the new field to template table attribute.
> v4: Mark it as optional and clear the concept.
> v5: Change specialize type to uint32_t.
There are more changes to do (replace enum with defines and update the commit message).
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-08 15:25 ` Thomas Monjalon
@ 2022-11-09 8:53 ` Andrew Rybchenko
2022-11-09 9:03 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-09 8:53 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
On 11/8/22 18:25, Thomas Monjalon wrote:
> 08/11/2022 15:38, Andrew Rybchenko:
>> On 11/8/22 16:29, Thomas Monjalon wrote:
>>> 08/11/2022 12:47, Andrew Rybchenko:
>>>> On 11/8/22 14:39, Andrew Rybchenko wrote:
>>>>> On 11/4/22 13:44, Rongwei Liu wrote:
>>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>>> index 8858b56428..1eab12796f 100644
>>>>>> --- a/lib/ethdev/rte_flow.h
>>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>>>>>> */
>>>>>> struct rte_flow_template_table;
>>>>>> +/**
>>>>>> + * @warning
>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>>>>> + *
>>>>>> + * Special optional flags for template table attribute.
>>>>>> + * Each bit stands for a table specialization
>>>>>> + * offering a potential optimization at PMD layer.
>>>>>> + * PMD can ignore the unsupported bits silently.
>>>>>> + */
>>>>>> +enum rte_flow_template_table_specialize {
>>>>>> + /**
>>>>>> + * Specialize table for transfer flows which come only from wire.
>>>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>>>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>>>> + * Flow rules which match non-wire originated traffic will be missed
>>>>>> + * if the hint is supported.
>>>>
>>>> Sorry, but if so, the hint changes behavior.
>>>
>>> Yes the hint may change behaviour.
>>>
>>>> Let's consider a rule which matches both VF originating and
>>>> wire originating traffic. Will the rule be missed (ignored)
>>>> regardless if the hint is supported or not?
>>>
>>> If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
>>> the PMD may assume the table won't be used for traffic
>>> which is not coming from wire ports.
>>> As a consequence, the table may be implemented on the path
>>> of wire traffic only.
>>> In this case, the traffic coming from virtual ports
>>> won't be affected by this table.
>>> To answer the question, a rule matching both virtual and wire traffic
>>> will be applied in a table affecting only wire traffic,
>>> so it will still apply (not completely ignored).
>>
>> If so, it is not a hint. It becomes matching criteria
>> which should be in pattern as we discussed.
>
> It is not a strict matching because the PMD is free to support it or not.
It cannot be an optional matching criterion. Matching criteria must
always be mandatory. Otherwise the application does not know what
to expect, and behaviour may legitimately vary across
vendors.
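To make the portability concern concrete, here is a toy model. Everything in it is hypothetical (the sketch_* names and the two-path view of the switch are assumptions for illustration, not DPDK API); it only shows how a silently ignorable hint can change which packets a table sees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hint bit and packet origins, for illustration only. */
#define SKETCH_HINT_WIRE_ORIG (UINT32_C(1) << 0)

enum sketch_origin { SKETCH_FROM_WIRE, SKETCH_FROM_VPORT };

struct sketch_table {
	uint32_t specialize;  /* hint requested by the application */
	bool pmd_honors_hint; /* whether this toy PMD implements the hint */
};

/*
 * Return true if a packet of the given origin is looked up in the table.
 * A toy PMD that honors the wire hint places the table on the wire path
 * only; one that ignores the hint keeps the table on both paths.
 */
static bool
sketch_table_sees(const struct sketch_table *t, enum sketch_origin origin)
{
	assert(t != NULL);
	if (t->pmd_honors_hint && (t->specialize & SKETCH_HINT_WIRE_ORIG))
		return origin == SKETCH_FROM_WIRE;
	return true;
}
```

With the same table attribute, wire traffic reaches the table in both cases, while vport traffic reaches it only when the hint is ignored. An application that sends only wire traffic to the table sees no difference between the two.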
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-09 8:53 ` Andrew Rybchenko
@ 2022-11-09 9:03 ` Thomas Monjalon
2022-11-09 9:36 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-09 9:03 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
09/11/2022 09:53, Andrew Rybchenko:
> On 11/8/22 18:25, Thomas Monjalon wrote:
> > 08/11/2022 15:38, Andrew Rybchenko:
> >> On 11/8/22 16:29, Thomas Monjalon wrote:
> >>> 08/11/2022 12:47, Andrew Rybchenko:
> >>>> On 11/8/22 14:39, Andrew Rybchenko wrote:
> >>>>> On 11/4/22 13:44, Rongwei Liu wrote:
> >>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>>>>> index 8858b56428..1eab12796f 100644
> >>>>>> --- a/lib/ethdev/rte_flow.h
> >>>>>> +++ b/lib/ethdev/rte_flow.h
> >>>>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> >>>>>> */
> >>>>>> struct rte_flow_template_table;
> >>>>>> +/**
> >>>>>> + * @warning
> >>>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
> >>>>>> + *
> >>>>>> + * Special optional flags for template table attribute.
> >>>>>> + * Each bit stands for a table specialization
> >>>>>> + * offering a potential optimization at PMD layer.
> >>>>>> + * PMD can ignore the unsupported bits silently.
> >>>>>> + */
> >>>>>> +enum rte_flow_template_table_specialize {
> >>>>>> + /**
> >>>>>> + * Specialize table for transfer flows which come only from wire.
> >>>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
> >>>>>> + * This bit is not a matching criteria, just an optimization hint.
> >>>>>> + * Flow rules which match non-wire originated traffic will be missed
> >>>>>> + * if the hint is supported.
> >>>>
> >>>> Sorry, but if so, the hint changes behavior.
> >>>
> >>> Yes the hint may change behaviour.
> >>>
> >>>> Let's consider a rule which matches both VF originating and
> >>>> wire originating traffic. Will the rule be missed (ignored)
> >>>> regardless if the hint is supported or not?
> >>>
> >>> If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
> >>> the PMD may assume the table won't be used for traffic
> >>> which is not coming from wire ports.
> >>> As a consequence, the table may be implemented on the path
> >>> of wire traffic only.
> >>> In this case, the traffic coming from virtual ports
> >>> won't be affected by this table.
> >>> To answer the question, a rule matching both virtual and wire traffic
> >>> will be applied in a table affecting only wire traffic,
> >>> so it will still apply (not completely ignored).
> >>
> >> If so, it is not a hint. It becomes matching criteria
> >> which should be in pattern as we discussed.
> >
> > It is not a strict matching because the PMD is free to support it or not.
>
> It cannot be optional matching criteria. Matching criteria must
> be always mandatory. Otherwise application does not know what
> to expect and behaviour may legitimately vary on different
> vendors.
I think you are taking it in the wrong direction.
The idea is not to have it as a criterion.
Let me explain again:
If an application is using a flow table to manage flows
which *always* come from the same type of port (wire or virtual),
then the application can give this information to the driver.
With this assumption coming from the application,
the driver may do some optimizations.
Now about what is explained above:
If the application gives such a hint
but does not respect its own assumption,
then confusion happens.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-09 9:03 ` Thomas Monjalon
@ 2022-11-09 9:36 ` Andrew Rybchenko
2022-11-09 10:50 ` Thomas Monjalon
0 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2022-11-09 9:36 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
On 11/9/22 12:03, Thomas Monjalon wrote:
> 09/11/2022 09:53, Andrew Rybchenko:
>> On 11/8/22 18:25, Thomas Monjalon wrote:
>>> 08/11/2022 15:38, Andrew Rybchenko:
>>>> On 11/8/22 16:29, Thomas Monjalon wrote:
>>>>> 08/11/2022 12:47, Andrew Rybchenko:
>>>>>> On 11/8/22 14:39, Andrew Rybchenko wrote:
>>>>>>> On 11/4/22 13:44, Rongwei Liu wrote:
>>>>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>>>>> index 8858b56428..1eab12796f 100644
>>>>>>>> --- a/lib/ethdev/rte_flow.h
>>>>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>>>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>>>>>>>> */
>>>>>>>> struct rte_flow_template_table;
>>>>>>>> +/**
>>>>>>>> + * @warning
>>>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>>>>>>> + *
>>>>>>>> + * Special optional flags for template table attribute.
>>>>>>>> + * Each bit stands for a table specialization
>>>>>>>> + * offering a potential optimization at PMD layer.
>>>>>>>> + * PMD can ignore the unsupported bits silently.
>>>>>>>> + */
>>>>>>>> +enum rte_flow_template_table_specialize {
>>>>>>>> + /**
>>>>>>>> + * Specialize table for transfer flows which come only from wire.
>>>>>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>>>>>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>>>>>> + * Flow rules which match non-wire originated traffic will be missed
>>>>>>>> + * if the hint is supported.
>>>>>>
>>>>>> Sorry, but if so, the hint changes behavior.
>>>>>
>>>>> Yes the hint may change behaviour.
>>>>>
>>>>>> Let's consider a rule which matches both VF originating and
>>>>>> wire originating traffic. Will the rule be missed (ignored)
>>>>>> regardless if the hint is supported or not?
>>>>>
>>>>> If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
>>>>> the PMD may assume the table won't be used for traffic
>>>>> which is not coming from wire ports.
>>>>> As a consequence, the table may be implemented on the path
>>>>> of wire traffic only.
>>>>> In this case, the traffic coming from virtual ports
>>>>> won't be affected by this table.
>>>>> To answer the question, a rule matching both virtual and wire traffic
>>>>> will be applied in a table affecting only wire traffic,
>>>>> so it will still apply (not completely ignored).
>>>>
>>>> If so, it is not a hint. It becomes matching criteria
>>>> which should be in pattern as we discussed.
>>>
>>> It is not a strict matching because the PMD is free to support it or not.
>>
>> It cannot be optional matching criteria. Matching criteria must
>> be always mandatory. Otherwise application does not know what
>> to expect and behaviour may legitimately vary on different
>> vendors.
>
> I think you take it in the wrong direction.
> The idea is not to have it as a criteria.
> Let me explain again:
>
> If an application is using a flow table to manage flows
> which *always* come from the same type of port (wire or virtual),

What guarantees it? Is a jump table used, so that the jump rule
must guarantee it? Or does the pattern have a corresponding unit?

It is very thin ice and I'm ready to bet money that finally
it will be used as a matching criteria intentionally or not
intentionally. Simply because it works as matching criteria
on, for example, Mellanox. I.e. if rules from table with
corresponding hint are programmed to HW which applies these
rules on traffic from wire only - effectively it is a matching
criteria. And it will be used this way. And it will not be
portable to other HW which does not support the hint.
So, we're making an API which is very easy to misuse if not
to say more.

You know better if it is OK or not to rely on liable users
in the case of DPDK.

It would be much safer if we do not rely on application in this
case, introduce a new pattern item to specify origin and
require PMD to check that pattern has either a new pattern item
or corresponding REPRESENTED_PORT/PORT_REPRESENTOR pattern
item.

I realize that my concerns could be not valid and it is just
a paranoia. Just add your ack and let's move forward.

> then the application can give this information to the driver.
> With this assumption coming from the application,
> the driver may do some optimizations.
>
> Now about what is explained above:
> If the application gives such a hint
> but does not respect its own assumption,
> then confusion happens.
>
>
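
A minimal sketch of the check proposed in this message, where a PMD would
accept a pattern only if it names the traffic origin explicitly. The enum
and the function are hypothetical stand-ins defined locally for
illustration; they are not the rte_flow definitions, and nothing in this
thread says that any PMD performs such a check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for pattern item types (hypothetical, not the rte_flow enum). */
enum sk_item {
	SK_ITEM_END = 0,
	SK_ITEM_ETH,
	SK_ITEM_IPV4,
	SK_ITEM_PORT_REPRESENTOR,
	SK_ITEM_REPRESENTED_PORT,
};

/*
 * Return true if the END-terminated pattern contains an item that states
 * the traffic origin, so matching does not depend on a table hint.
 */
static bool
sk_pattern_has_origin_item(const enum sk_item *pattern)
{
	assert(pattern != NULL);
	for (; *pattern != SK_ITEM_END; pattern++) {
		if (*pattern == SK_ITEM_PORT_REPRESENTOR ||
		    *pattern == SK_ITEM_REPRESENTED_PORT)
			return true;
	}
	return false;
}
```

With such a check, a pattern like "eth / ipv4" would be rejected for a
specialized table, while the same pattern prefixed with an origin item
would pass on any hardware, whether or not the hint is honoured.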
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v4] ethdev: add special flags when creating async transfer table
2022-11-09 9:36 ` Andrew Rybchenko
@ 2022-11-09 10:50 ` Thomas Monjalon
0 siblings, 0 replies; 96+ messages in thread
From: Thomas Monjalon @ 2022-11-09 10:50 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Ferruh Yigit, dev, rasland
09/11/2022 10:36, Andrew Rybchenko:
> On 11/9/22 12:03, Thomas Monjalon wrote:
> > 09/11/2022 09:53, Andrew Rybchenko:
> >> On 11/8/22 18:25, Thomas Monjalon wrote:
> >>> 08/11/2022 15:38, Andrew Rybchenko:
> >>>> On 11/8/22 16:29, Thomas Monjalon wrote:
> >>>>> 08/11/2022 12:47, Andrew Rybchenko:
> >>>>>> On 11/8/22 14:39, Andrew Rybchenko wrote:
> >>>>>>> On 11/4/22 13:44, Rongwei Liu wrote:
> >>>>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>>>>>>> index 8858b56428..1eab12796f 100644
> >>>>>>>> --- a/lib/ethdev/rte_flow.h
> >>>>>>>> +++ b/lib/ethdev/rte_flow.h
> >>>>>>>> @@ -5186,6 +5186,34 @@ rte_flow_actions_template_destroy(uint16_t
> >>>>>>>> port_id,
> >>>>>>>> */
> >>>>>>>> struct rte_flow_template_table;
> >>>>>>>> +/**
> >>>>>>>> + * @warning
> >>>>>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
> >>>>>>>> + *
> >>>>>>>> + * Special optional flags for template table attribute.
> >>>>>>>> + * Each bit stands for a table specialization
> >>>>>>>> + * offering a potential optimization at PMD layer.
> >>>>>>>> + * PMD can ignore the unsupported bits silently.
> >>>>>>>> + */
> >>>>>>>> +enum rte_flow_template_table_specialize {
> >>>>>>>> + /**
> >>>>>>>> + * Specialize table for transfer flows which come only from wire.
> >>>>>>>> + * It allows PMD not to allocate resources for non-wire
> >>>>>>>> originated traffic.
> >>>>>>>> + * This bit is not a matching criteria, just an optimization hint.
> >>>>>>>> + * Flow rules which match non-wire originated traffic will be missed
> >>>>>>>> + * if the hint is supported.
> >>>>>>
> >>>>>> Sorry, but if so, the hint changes behavior.
> >>>>>
> >>>>> Yes the hint may change behaviour.
> >>>>>
> >>>>>> Let's consider a rule which matches both VF originating and
> >>>>>> wire originating traffic. Will the rule be missed (ignored)
> >>>>>> regardless if the hint is supported or not?
> >>>>>
> >>>>> If the hint RTE_FLOW_TRANSFER_WIRE_ORIG is used,
> >>>>> the PMD may assume the table won't be used for traffic
> >>>>> which is not coming from wire ports.
> >>>>> As a consequence, the table may be implemented on the path
> >>>>> of wire traffic only.
> >>>>> In this case, the traffic coming from virtual ports
> >>>>> won't be affected by this table.
> >>>>> To answer the question, a rule matching both virtual and wire traffic
> >>>>> will be applied in a table affecting only wire traffic,
> >>>>> so it will still apply (not completely ignored).
> >>>>
> >>>> If so, it is not a hint. It becomes matching criteria
> >>>> which should be in pattern as we discussed.
> >>>
> >>> It is not a strict matching because the PMD is free to support it or not.
> >>
> >> It cannot be optional matching criteria. Matching criteria must
> >> be always mandatory. Otherwise application does not know what
> >> to expect and behaviour may legitimately vary on different
> >> vendors.
> >
> > I think you take it in the wrong direction.
> > The idea is not to have it as a criteria.
> > Let me explain again:
> >
> > If an application is using a flow table to manage flows
> > which *always* come from the same type of port (wire or virtual),
>
> What does guarantee it? Is it used a jump-table and jump rule
> must guarantee it? Or has pattern corresponding unit?
>
> It is very thin ice and I'm ready to bet money that finally
> it will be used as a matching criteria intentionally or not
> intentionally. Simply because it works as matching criteria
> on, for example, Mellanox. I.e. if rules from table with
> corresponding hint are programmed to HW which applies these
> rules on traffic from wire only - effectively it is a matching
> criteria. And it will be used this way. And it will be not
> portable to other HW which does not support the hint.
> So, we're making an API which is very easy to misuse if not
> to say more.
I completely understand your concern (I have the same).
In other words, if the application misuses the hint,
it will become non-portable.
That's why I made sure to highlight such misuse consequence
in the API comments.
> You know better if it is OK or not to rely on liable users
> in the case of DPDK.
I do not rely on users, and I don't want to block innovation.
That's why I want to make sure all is explained and clear,
so freedom comes with responsibility.
> It would be much safer if we do not rely on application in this
> case, introduce a new pattern item to specify origin and
> require PMD to check that pattern has either a new pattern item
> or corresponding REPRESENTED_PORT/PORT_REPRESENTOR pattern
> item.
Safer is not often compatible with fastest :)
> I realize that my concerns could be not valid and it is just
> a paranoia. Just add your ack and let's move forward.
Let's wait for other opinions.
> > then the application can give this information to the driver.
> > With this assumption coming from the application,
> > the driver may do some optimizations.
> >
> > Now about what is explained above:
> > If the application gives such a hint
> > but does not respect its own assumption,
> > then confusion happens.
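
The portability concern discussed above can be modelled in a few lines.
This is only a sketch under stated assumptions: the macro names, the
origin enum and the pmd_honors_hint flag are local stand-ins invented for
illustration, and the bit positions simply follow the flags proposed in
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring the proposed hint bits (not the DPDK macros). */
#define SK_BIT32(n) (UINT32_C(1) << (n))
#define SK_SPECIALIZE_WIRE_ORIG  SK_BIT32(0)
#define SK_SPECIALIZE_VPORT_ORIG SK_BIT32(1)

static_assert(SK_SPECIALIZE_WIRE_ORIG != SK_SPECIALIZE_VPORT_ORIG,
	      "hint bits must be distinct");

/* Hypothetical classification of where a packet entered the switch. */
enum sk_origin { SK_FROM_WIRE, SK_FROM_VPORT };

/*
 * Model whether a table is on the path of a packet.
 * A PMD that ignores the hint keeps the table on both paths;
 * a PMD that honours it may place the table on one path only.
 */
static bool
sk_table_sees_packet(uint32_t specialize, bool pmd_honors_hint,
		     enum sk_origin origin)
{
	if (!pmd_honors_hint || specialize == 0)
		return true;
	if (origin == SK_FROM_WIRE)
		return (specialize & SK_SPECIALIZE_WIRE_ORIG) != 0;
	return (specialize & SK_SPECIALIZE_VPORT_ORIG) != 0;
}
```

In this model, a vport packet sent to a table created with the wire hint
is processed on a PMD that ignores the hint and skipped on a PMD that
honours it. An application that keeps its own assumption (only wire
traffic reaches that table) sees the same result in both cases.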
^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v6] ethdev: add special flags when creating async transfer table
2022-11-08 11:48 ` Andrew Rybchenko
@ 2022-11-14 8:47 ` Rongwei Liu
2022-11-14 11:59 ` [PATCH v7] " Rongwei Liu
1 sibling, 0 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-11-14 8:47 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
In case flow rules match only one kind of traffic in a flow table,
then optimization can be done via allocation of this table.
Such optimization is possible only if the application gives a hint
about its usage of the table during initial configuration.

The transfer domain rules may process traffic from wire or vport,
which may correspond to two kinds of underlayer resources.
That's why the first two hints introduced in this patch are about
wire and vport traffic specialization.
Wire means traffic arrives from the uplink port while vport means
traffic initiated from VF/SF.

There are two possible approaches for providing the hints.
Using IPv4 as an example:
1. Use pattern item in both template table and flow rules.

   pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
   async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end

   "ANY_VPORT" needs to be present in each flow rule even if it's
   just a hint. No value to match because matching is already done by
   IPv4 item.

2. Add special flags into table_attr.

   template_table 0 create table_id 0 group 1 transfer vport_orig

Approach 1 needs to specify the pattern in each flow rule which wastes
memory and is not user friendly.
This patch takes the 2nd approach and introduces one new member
"specialize" into rte_flow_table_attr to indicate possible flow table
optimization.

By default, there is no hint, so the behavior of the transfer domain
doesn't change.
There is no guarantee that the hint will be used by the PMD.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
v6: Change the flags to macros and re-construct the commit log.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
4 files changed, 71 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..15f2af9b40 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..c27b48c5c1 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**@{@name Special optional flags for template table attribute
+ * Each bit is a hint for table specialization,
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+/**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
+/**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
+/**@}*/
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
+ */
+ uint32_t specialize;
};
/**
--
2.27.0
^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v7] ethdev: add special flags when creating async transfer table
2022-11-08 11:48 ` Andrew Rybchenko
2022-11-14 8:47 ` [PATCH v6] ethdev: add special flags " Rongwei Liu
@ 2022-11-14 11:59 ` Rongwei Liu
2023-01-17 15:13 ` Ferruh Yigit
` (3 more replies)
1 sibling, 4 replies; 96+ messages in thread
From: Rongwei Liu @ 2022-11-14 11:59 UTC (permalink / raw)
To: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko
Cc: dev, rasland
In case flow rules match only one kind of traffic in a flow table,
then optimization can be done via allocation of this table.
Such optimization is possible only if the application gives a hint
about its usage of the table during initial configuration.

The transfer domain rules may process traffic from wire or vport,
which may correspond to two kinds of underlayer resources.
That's why the first two hints introduced in this patch are about
wire and vport traffic specialization.
Wire means traffic arrives from the uplink port while vport means
traffic initiated from VF/SF.

There are two possible approaches for providing the hints.
Using IPv4 as an example:
1. Use pattern item in both template table and flow rules.

   pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
   async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end

   "ANY_VPORT" needs to be present in each flow rule even if it's
   just a hint. No value to match because matching is already done by
   IPv4 item.

2. Add special flags into table_attr.

   template_table 0 create table_id 0 group 1 transfer vport_orig

Approach 1 needs to specify the pattern in each flow rule which wastes
memory and is not user friendly.
This patch takes the 2nd approach and introduces one new member
"specialize" into rte_flow_table_attr to indicate possible flow table
optimization.

By default, there is no hint, so the behavior of the transfer domain
doesn't change.
There is no guarantee that the hint will be used by the PMD.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
v2: Move the new field to template table attribute.
v4: Mark it as optional and clear the concept.
v5: Change specialize type to uint32_t.
v6: Change the flags to macros and re-construct the commit log.
v7: Fix build failure.
---
app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
4 files changed, 71 insertions(+), 1 deletion(-)
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 88108498e0..62197f2618 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -184,6 +184,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+ TABLE_TRANSFER_WIRE_ORIG,
+ TABLE_TRANSFER_VPORT_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+ [TABLE_TRANSFER_WIRE_ORIG] = {
+ .name = "wire_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
+ [TABLE_TRANSFER_VPORT_ORIG] = {
+ .name = "vport_orig",
+ .help = "affect rule direction to transfer",
+ .next = NEXT(next_table_attr),
+ .call = parse_table,
+ },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+ case TABLE_TRANSFER_WIRE_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
+ return len;
+ case TABLE_TRANSFER_VPORT_ORIG:
+ if (!out->args.table.attr.flow_attr.transfer)
+ return -1;
+ out->args.table.attr.specialize = RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
+ return len;
default:
return -1;
}
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e6242803d..d9ca041ae4 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Table Attribute: Specialize
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Application can help optimizing underlayer resources and insertion rate
+by specializing template table.
+Specialization is done by providing hints
+in the template table attribute ``specialize``.
+
+This attribute is not mandatory for each PMD to implement.
+If a hint is not supported, it will be silently ignored,
+and no special optimization is done.
+
+If a table is specialized, the application should make sure the rules
+comply with the table attribute.
+
Asynchronous operations
-----------------------
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 96c5ae0fe4..b3238415f4 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
- [priority {level}] [ingress] [egress] [transfer]
+ [priority {level}] [ingress] [egress]
+ [transfer [vport_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 8858b56428..c27b48c5c1 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
*/
struct rte_flow_template_table;
+/**@{@name Special optional flags for template table attribute
+ * Each bit is a hint for table specialization,
+ * offering a potential optimization at PMD layer.
+ * PMD can ignore the unsupported bits silently.
+ */
+/**
+ * Specialize table for transfer flows which come only from wire.
+ * It allows PMD not to allocate resources for non-wire originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-wire originated traffic will be missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
+/**
+ * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
+ * It allows PMD not to allocate resources for non-vport originated traffic.
+ * This bit is not a matching criteria, just an optimization hint.
+ * Flow rules which match non-vport originated traffic will be missed
+ * if the hint is supported.
+ */
+#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
+/**@}*/
+
/**
* @warning
* @b EXPERIMENTAL: this API may change without prior notice.
@@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
* Maximum number of flow rules that this table holds.
*/
uint32_t nb_flows;
+ /**
+ * Optional hint flags for PMD optimization.
+ * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
+ */
+ uint32_t specialize;
};
/**
--
2.27.0
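
A minimal sketch of how an application might fill the new field before
creating a table. The structures and macros below are local stand-ins that
mirror only the members visible in this patch (flow_attr.transfer,
nb_flows, specialize); they are not the DPDK headers, and
sk_table_add_hint() is a hypothetical helper. It refuses the hint unless
the table is in the transfer domain, as the testpmd parser in this patch
does, but it ORs the bit in rather than assigning it. Whether a PMD
accepts both bits together is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins using the same bit positions as the macros in the patch. */
#define SK_BIT32(n) (UINT32_C(1) << (n))
#define SK_SPECIALIZE_TRANSFER_WIRE_ORIG  SK_BIT32(0)
#define SK_SPECIALIZE_TRANSFER_VPORT_ORIG SK_BIT32(1)

/* Reduced mirror of the flow attribute: only the transfer bit is modelled. */
struct sk_flow_attr {
	uint32_t transfer:1;
};

/* Reduced mirror of the template table attribute with the new member. */
struct sk_table_attr {
	struct sk_flow_attr flow_attr;
	uint32_t nb_flows;
	uint32_t specialize; /* optional hint flags, 0 means no hint */
};

/*
 * Add a specialization hint to a table attribute.
 * Returns 0 on success, -1 if the table is not a transfer table.
 */
static int
sk_table_add_hint(struct sk_table_attr *attr, uint32_t hint)
{
	assert(attr != NULL);
	if (!attr->flow_attr.transfer)
		return -1;
	attr->specialize |= hint;
	return 0;
}
```

With the real headers, the filled attribute would then be passed to
rte_flow_template_table_create(); leaving specialize at 0 keeps the
default behaviour of the transfer domain.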
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2022-11-14 11:59 ` [PATCH v7] " Rongwei Liu
@ 2023-01-17 15:13 ` Ferruh Yigit
2023-01-17 17:01 ` Ferruh Yigit
2023-01-18 7:30 ` Andrew Rybchenko
2023-01-18 7:28 ` Andrew Rybchenko
` (2 subsequent siblings)
3 siblings, 2 replies; 96+ messages in thread
From: Ferruh Yigit @ 2023-01-17 15:13 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Andrew Rybchenko, Ivan Malov, thomas
Cc: dev, rasland
On 11/14/2022 11:59 AM, Rongwei Liu wrote:
> In case flow rules match only one kind of traffic in a flow table,
> then optimization can be done via allocation of this table.
> Such optimization is possible only if the application gives a hint
> about its usage of the table during initial configuration.
>
> The transfer domain rules may process traffic from wire or vport,
> which may correspond to two kinds of underlayer resources.
> That's why the first two hints introduced in this patch are about
> wire and vport traffic specialization.
> Wire means traffic arrives from the uplink port while vport means
> traffic initiated from VF/SF.
>
> There are two possible approaches for providing the hints.
> Using IPv4 as an example:
> 1. Use pattern item in both template table and flow rules.
>
> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>
> "ANY_VPORT" needs to be present in each flow rule even if it's
> just a hint. No value to match because matching is already done by
> IPv4 item.
>
> 2. Add special flags into table_attr.
>
> template_table 0 create table_id 0 group 1 transfer vport_orig
>
> Approach 1 needs to specify the pattern in each flow rule which wastes
> memory and is not user friendly.
> This patch takes the 2nd approach and introduces one new member
> "specialize" into rte_flow_table_attr to indicate possible flow table
> optimization.
>
> By default, there is no hint, so the behavior of the transfer domain
> doesn't change.
> There is no guarantee that the hint will be used by the PMD.
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Ori Kam <orika@nvidia.com>
Hi Andrew, Ivan,
Do you have any objections or comments on the latest version?
If not, I will proceed with the patch.
Thanks,
ferruh
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-17 15:13 ` Ferruh Yigit
@ 2023-01-17 17:01 ` Ferruh Yigit
2023-01-18 2:50 ` Rongwei Liu
2023-01-18 7:30 ` Andrew Rybchenko
1 sibling, 1 reply; 96+ messages in thread
From: Ferruh Yigit @ 2023-01-17 17:01 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
Andrew Rybchenko, Ivan Malov, thomas
Cc: dev, rasland
On 1/17/2023 3:13 PM, Ferruh Yigit wrote:
> On 11/14/2022 11:59 AM, Rongwei Liu wrote:
>> In case flow rules match only one kind of traffic in a flow table,
>> then optimization can be done via allocation of this table.
>> Such optimization is possible only if the application gives a hint
>> about its usage of the table during initial configuration.
>>
>> The transfer domain rules may process traffic from wire or vport,
>> which may correspond to two kinds of underlayer resources.
>> That's why the first two hints introduced in this patch are about
>> wire and vport traffic specialization.
>> Wire means traffic arrives from the uplink port while vport means
>> traffic initiated from VF/SF.
>>
>> There are two possible approaches for providing the hints.
>> Using IPv4 as an example:
>> 1. Use pattern item in both template table and flow rules.
>>
>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>>
>> "ANY_VPORT" needs to be present in each flow rule even if it's
>> just a hint. No value to match because matching is already done by
>> IPv4 item.
>>
>> 2. Add special flags into table_attr.
>>
>> template_table 0 create table_id 0 group 1 transfer vport_orig
>>
>> Approach 1 needs to specify the pattern in each flow rule which wastes
>> memory and is not user friendly.
>> This patch takes the 2nd approach and introduces one new member
>> "specialize" into rte_flow_table_attr to indicate possible flow table
>> optimization.
>>
>> By default, there is no hint, so the behavior of the transfer domain
>> doesn't change.
>> There is no guarantee that the hint will be used by the PMD.
>>
>> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
>> Acked-by: Ori Kam <orika@nvidia.com>
>
> Hi Andrew, Ivan,
>
> Do you have objection/comment to latest version, if not I will proceed
> with patch?
>
BTW, there is an implementation of this flag in some driver, right?
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-17 17:01 ` Ferruh Yigit
@ 2023-01-18 2:50 ` Rongwei Liu
0 siblings, 0 replies; 96+ messages in thread
From: Rongwei Liu @ 2023-01-18 2:50 UTC (permalink / raw)
To: Ferruh Yigit, Matan Azrad, Slava Ovsiienko, Ori Kam, Aman Singh,
Yuying Zhang, Andrew Rybchenko, Ivan Malov,
NBU-Contact-Thomas Monjalon (EXTERNAL)
Cc: dev, Raslan Darawsheh
Hi Ferruh,
BR
Rongwei
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, January 18, 2023 01:02
> To: Rongwei Liu <rongweil@nvidia.com>; Matan Azrad <matan@nvidia.com>;
> Slava Ovsiienko <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>;
> Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang
> <yuying.zhang@intel.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; Ivan Malov <ivan.malov@oktetlabs.ru>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [PATCH v7] ethdev: add special flags when creating async transfer
> table
>
> External email: Use caution opening links or attachments
>
>
> On 1/17/2023 3:13 PM, Ferruh Yigit wrote:
> > On 11/14/2022 11:59 AM, Rongwei Liu wrote:
> >> In case flow rules match only one kind of traffic in a flow table,
> >> then optimization can be done via allocation of this table.
> >> Such optimization is possible only if the application gives a hint
> >> about its usage of the table during initial configuration.
> >>
> >> The transfer domain rules may process traffic from wire or vport,
> >> which may correspond to two kinds of underlayer resources.
> >> That's why the first two hints introduced in this patch are about
> >> wire and vport traffic specialization.
> >> Wire means traffic arrives from the uplink port while vport means
> >> traffic initiated from VF/SF.
> >>
> >> There are two possible approaches for providing the hints.
> >> Using IPv4 as an example:
> >> 1. Use pattern item in both template table and flow rules.
> >>
> >> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> >> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> >>
> >> "ANY_VPORT" needs to be present in each flow rule even if it's
> >> just a hint. No value to match because matching is already done by
> >> IPv4 item.
> >>
> >> 2. Add special flags into table_attr.
> >>
> >> template_table 0 create table_id 0 group 1 transfer vport_orig
> >>
> >> Approach 1 needs to specify the pattern in each flow rule which
> >> wastes memory and is not user friendly.
> >> This patch takes the 2nd approach and introduces one new member
> >> "specialize" into rte_flow_table_attr to indicate possible flow table
> >> optimization.
> >>
> >> By default, there is no hint, so the behavior of the transfer domain
> >> doesn't change.
> >> There is no guarantee that the hint will be used by the PMD.
> >>
> >> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> >> Acked-by: Ori Kam <orika@nvidia.com>
> >
> > Hi Andrew, Ivan,
> >
> > Do you have objection/comment to latest version, if not I will proceed
> > with patch?
> >
>
> BTW, there is an implementation of this flag in some driver, right?
Yes, the NVIDIA NIC has an implementation ready. We will pass the new RTE table attribute to the PMD once the API is accepted.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2022-11-14 11:59 ` [PATCH v7] " Rongwei Liu
2023-01-17 15:13 ` Ferruh Yigit
@ 2023-01-18 7:28 ` Andrew Rybchenko
2023-01-18 16:18 ` Thomas Monjalon
2023-01-30 0:00 ` Ivan Malov
2023-02-02 11:19 ` [PATCH v8] ethdev: add optimization hints in flow template table Rongwei Liu
3 siblings, 1 reply; 96+ messages in thread
From: Andrew Rybchenko @ 2023-01-18 7:28 UTC (permalink / raw)
To: Rongwei Liu, matan, viacheslavo, orika, thomas, Aman Singh,
Yuying Zhang, Ferruh Yigit
Cc: dev, rasland
On 11/14/22 14:59, Rongwei Liu wrote:
> In case flow rules match only one kind of traffic in a flow table,
> then optimization can be done via allocation of this table.
> Such optimization is possible only if the application gives a hint
> about its usage of the table during initial configuration.
>
> The transfer domain rules may process traffic from wire or vport,
> which may correspond to two kinds of underlayer resources.
> That's why the first two hints introduced in this patch are about
> wire and vport traffic specialization.
> Wire means traffic arrives from the uplink port while vport means
> traffic initiated from VF/SF.
>
> There are two possible approaches for providing the hints.
> Using IPv4 as an example:
> 1. Use pattern item in both template table and flow rules.
>
> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>
> "ANY_VPORT" needs to be present in each flow rule even if it's
> just a hint. No value to match because matching is already done by
> IPv4 item.
>
> 2. Add special flags into table_attr.
>
> template_table 0 create table_id 0 group 1 transfer vport_orig
>
> Approach 1 needs to specify the pattern in each flow rule which wastes
> memory and is not user friendly.
> This patch takes the 2nd approach and introduces one new member
> "specialize" into rte_flow_table_attr to indicate possible flow table
> optimization.
The above description is misleading. It alternates options (1)
and (2), but in fact (2) requires (1) as well.
(2) is simply done on different level - much earlier, before
flow rules creation. Since resources allocation is assumed to
be done on table creation, we need to know the purpose of the
table in advance to optimize resources allocation.
Since (2) is *not a matching criteria*, but just a hint, (1)
flow rules must have matching criteria anyway.
>
> By default, there is no hint, so the behavior of the transfer domain
> doesn't change.
> There is no guarantee that the hint will be used by the PMD.
>
> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
> Acked-by: Ori Kam <orika@nvidia.com>
[snip]
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 3e6242803d..d9ca041ae4 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> &actions_templates, nb_actions_templ,
> &error);
>
> +Table Attribute: Specialize
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Application can help optimizing underlayer resources and insertion rate
> +by specializing template table.
> +Specialization is done by providing hints
> +in the template table attribute ``specialize``.
> +
> +This attribute is not mandatory for each PMD to implement.
> +If a hint is not supported, it will be silently ignored,
> +and no special optimization is done.
> +
> +If a table is specialized, the application should make sure the rules
> +comply with the table attribute.
If a table is specialized, the application must make sure that
all flow rules added to the table have pattern which implies
corresponding matching criteria. For example if a table is
specialized to be wire-origin only, pattern should have
represented port item with ethdev which corresponds to a
physical port (or any other item which matches packets
coming from wire only).
[snip]
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-17 15:13 ` Ferruh Yigit
2023-01-17 17:01 ` Ferruh Yigit
@ 2023-01-18 7:30 ` Andrew Rybchenko
1 sibling, 0 replies; 96+ messages in thread
From: Andrew Rybchenko @ 2023-01-18 7:30 UTC (permalink / raw)
To: Ferruh Yigit, Rongwei Liu, matan, viacheslavo, orika, Aman Singh,
Yuying Zhang, Ivan Malov, thomas
Cc: dev, rasland
On 1/17/23 18:13, Ferruh Yigit wrote:
> On 11/14/2022 11:59 AM, Rongwei Liu wrote:
>> In case flow rules match only one kind of traffic in a flow table,
>> then optimization can be done via allocation of this table.
>> Such optimization is possible only if the application gives a hint
>> about its usage of the table during initial configuration.
>>
>> The transfer domain rules may process traffic from wire or vport,
>> which may correspond to two kinds of underlayer resources.
>> That's why the first two hints introduced in this patch are about
>> wire and vport traffic specialization.
>> Wire means traffic arrives from the uplink port while vport means
>> traffic initiated from VF/SF.
>>
>> There are two possible approaches for providing the hints.
>> Using IPv4 as an example:
>> 1. Use pattern item in both template table and flow rules.
>>
>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>>
>> "ANY_VPORT" needs to be present in each flow rule even if it's
>> just a hint. No value to match because matching is already done by
>> IPv4 item.
>>
>> 2. Add special flags into table_attr.
>>
>> template_table 0 create table_id 0 group 1 transfer vport_orig
>>
>> Approach 1 needs to specify the pattern in each flow rule which wastes
>> memory and is not user friendly.
>> This patch takes the 2nd approach and introduces one new member
>> "specialize" into rte_flow_table_attr to indicate possible flow table
>> optimization.
>>
>> By default, there is no hint, so the behavior of the transfer domain
>> doesn't change.
>> There is no guarantee that the hint will be used by the PMD.
>>
>> Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
>> Acked-by: Ori Kam <orika@nvidia.com>
>
> Hi Andrew, Ivan,
>
> Do you have objection/comment to latest version, if not I will proceed
> with patch?
>
> Thanks,
> ferruh
Hi Ferruh,
Sorry, but I'm still unhappy with the description.
See my reply.
Thanks,
Andrew.
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-18 7:28 ` Andrew Rybchenko
@ 2023-01-18 16:18 ` Thomas Monjalon
2023-02-01 10:17 ` Andrew Rybchenko
0 siblings, 1 reply; 96+ messages in thread
From: Thomas Monjalon @ 2023-01-18 16:18 UTC (permalink / raw)
To: Ferruh Yigit, Andrew Rybchenko
Cc: Rongwei Liu, matan, viacheslavo, orika, Aman Singh, Yuying Zhang,
dev, rasland, jerinj
18/01/2023 08:28, Andrew Rybchenko:
> On 11/14/22 14:59, Rongwei Liu wrote:
> > In case flow rules match only one kind of traffic in a flow table,
> > then optimization can be done via allocation of this table.
> > Such optimization is possible only if the application gives a hint
> > about its usage of the table during initial configuration.
> >
> > The transfer domain rules may process traffic from wire or vport,
> > which may correspond to two kinds of underlayer resources.
> > That's why the first two hints introduced in this patch are about
> > wire and vport traffic specialization.
> > Wire means traffic arrives from the uplink port while vport means
> > traffic initiated from VF/SF.
> >
> > There are two possible approaches for providing the hints.
> > Using IPv4 as an example:
> > 1. Use pattern item in both template table and flow rules.
> >
> > pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> > async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> >
> > "ANY_VPORT" needs to be present in each flow rule even if it's
> > just a hint. No value to match because matching is already done by
> > IPv4 item.
> >
> > 2. Add special flags into table_attr.
> >
> > template_table 0 create table_id 0 group 1 transfer vport_orig
> >
> > Approach 1 needs to specify the pattern in each flow rule which wastes
> > memory and is not user friendly.
> > This patch takes the 2nd approach and introduces one new member
> > "specialize" into rte_flow_table_attr to indicate possible flow table
> > optimization.
>
> The above description is misleading. It alternates options (1)
> and (2), but in fact (2) requires (1) as well.
Yes, the above description may be misleading,
and it seems you have been misled :)
I will explain below why the option (2) doesn't require (1).
I think we should apply the same example to both cases to make it clear:
1. Use pattern item in both template table and flow rules:
template table 3 = transfer pattern ANY_VPORT / eth / ipv4 src is 255.255.255.255 / end
flow rule = template_table 3 pattern ANY_VPORT / eth / ipv4 src is 1.1.1.1 / end
The pattern template 3 will be used only to match flows coming from vports.
ANY_VPORT needs to be present in each flow rule.
ANY_VPORT matching is redundant with IP src 1.1.1.1 because
the user knows 1.1.1.1 is the IP of a vport.
2. Add specialization flag into template table attribute:
template table 3 = transfer VPORT_ORIG pattern eth / ipv4 src is 255.255.255.255 / end
flow rule = template_table 3 pattern eth / ipv4 src is 1.1.1.1 / end
The pattern template 3 can be used only to match flows coming from vports.
> (2) is simply done on different level - much earlier, before
> flow rules creation. Since resources allocation is assumed to
> be done on table creation, we need to know the purpose of the
> table in advance to optimize resources allocation.
Actually in both cases we get the hint at template table creation.
But in solution 2 we are not creating a redundant pattern matching,
and we don't need to check it in flow rules, so it is more efficient.
> Since (2) is *not a matching criteria*, but just a hint, (1)
> flow rules must have matching criteria anyway.
No we don't need the matching criteria ANY_VPORT with solution (2)
because we are already matching on an IP src which is a vport.
> > +Table Attribute: Specialize
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Application can help optimizing underlayer resources and insertion rate
> > +by specializing template table.
> > +Specialization is done by providing hints
> > +in the template table attribute ``specialize``.
> > +
> > +This attribute is not mandatory for each PMD to implement.
> > +If a hint is not supported, it will be silently ignored,
> > +and no special optimization is done.
> > +
> > +If a table is specialized, the application should make sure the rules
> > +comply with the table attribute.
>
> If a table is specialized, the application must make sure that
> all flow rules added to the table have pattern which implies
> corresponding matching criteria. For example if a table is
> specialized to be wire-origin only, pattern should have
> represented port item with ethdev which corresponds to a
> physical port (or any other item which matches packets
> coming from wire only).
No need of a matching criteria strictly mapping the hint.
Here the hint is SPECIALIZE_TRANSFER_VPORT_ORIG
and the rules can match on an IP src which is assigned to a vport.
So there is no need to strictly match the vport itself in the rule.
Hope this makes things clear.
We can improve the commit log as I wrote above.
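To illustrate the "hint, not a matching criterion" point, here is a minimal sketch of how a PMD might consume the table attribute. Everything in it is a local stand-in and an assumption: the HINT_* macros only mirror the bit positions of the two flags proposed in the v7 patch, and origins_to_allocate() is a hypothetical helper, not part of rte_flow or of any driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit positions of the two v7 flags. */
#define HINT_TRANSFER_WIRE_ORIG  (UINT32_C(1) << 0)
#define HINT_TRANSFER_VPORT_ORIG (UINT32_C(1) << 1)
#define HINT_ALL (HINT_TRANSFER_WIRE_ORIG | HINT_TRANSFER_VPORT_ORIG)

/*
 * Hypothetical PMD-side helper: return the set of traffic origins
 * for which a transfer table needs underlying resources.
 */
static uint32_t
origins_to_allocate(uint32_t specialize, bool pmd_supports_hint)
{
	uint32_t hint = specialize & HINT_ALL;

	/* No hint, or a PMD that ignores it: allocate for both origins. */
	if (!pmd_supports_hint || hint == 0)
		return HINT_ALL;
	/* Hint honoured: allocate only for the origins the table declared. */
	return hint;
}
```

With this model, a table created with the vport hint on a supporting PMD gets resources for one origin only, while the same call on a PMD without support behaves exactly as before the patch. The flow rules themselves are unchanged in both cases.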
^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7] ethdev: add special flags when creating async transfer table
2022-11-14 11:59 ` [PATCH v7] " Rongwei Liu
2023-01-17 15:13 ` Ferruh Yigit
2023-01-18 7:28 ` Andrew Rybchenko
@ 2023-01-30 0:00 ` Ivan Malov
2023-01-30 2:34 ` Rongwei Liu
2023-02-02 11:19 ` [PATCH v8] ethdev: add optimization hints in flow template table Rongwei Liu
3 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2023-01-30 0:00 UTC (permalink / raw)
To: Rongwei Liu
Cc: matan, viacheslavo, orika, thomas, Aman Singh, Yuying Zhang,
Ferruh Yigit, Andrew Rybchenko, dev, rasland
Hi Rongwei,
Thanks for persevering. I have no strong opinion, but, at least, the
fact that the new flags are no longer meant for use in rte_flow_attr,
which is clearly not the right place for such, is an improvement.
However, let's take a closer look at the current patch, shall we?
But, before we get to that, I'd like to kindly request that you
provide a more concrete example of how this feature is supposed
to be used. Are there some real-life application examples?
Also, to me, it's still unclear how an application can obtain
the knowledge of this hint in the first instance. For example,
can Open vSwitch somehow tell ethdevs representing physical
ports from ones representing "vports" (host endpoints)?
How does it know which attribute to specify?
For the rest of my notes, PSB.
On Mon, 14 Nov 2022, Rongwei Liu wrote:
> In case flow rules match only one kind of traffic in a flow table,
> then optimization can be done via allocation of this table.
This wording might confuse readers. Consider rephrasing it, please:
If multiple flow rules share a common set of match masks, then
they might belong in a flow table which can be pre-allocated.
> Such optimization is possible only if the application gives a hint
> about its usage of the table during initial configuration.
>
> The transfer domain rules may process traffic from wire or vport,
> which may correspond to two kinds of underlayer resources.
Why name it a "vport"? Why not "host"?
host = packets generated by any of the host's "vport"s
wire = packets arriving at the NIC from the network
> That's why the first two hints introduced in this patch are about
> wire and vport traffic specialization.
> Wire means traffic arrives from the uplink port while vport means
> traffic initiated from VF/SF.
By the sound of it, the meaning is confined to just VFs/SFs.
What if the user wants to match packets coming from PFs?
>
> There are two possible approaches for providing the hints.
> Using IPv4 as an example:
> 1. Use pattern item in both template table and flow rules.
>
> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>
> "ANY_VPORT" needs to be present in each flow rule even if it's
> just a hint. No value to match because matching is already done by
> IPv4 item.
Why no value to match on? How does it prevent rogue tenants
from spoofing network headers? If the application receives
a packet on a particular vport's representor, then it may
strictly specify item represented_port pointing to that
vport so that only packets from that vport match.
Why isn't security a consideration?
>
> 2. Add special flags into table_attr.
>
> template_table 0 create table_id 0 group 1 transfer vport_orig
>
> Approach 1 needs to specify the pattern in each flow rule which wastes
> memory and is not user friendly.
What if the user has to insert a group of rules which not only
have the same set of match masks but also share exactly the
same match spec values for a limited subset of network
items (for example, those of an encap. header)? This
way, a subset of network item specs can remain fixed
across many rules. Does that count as wasting memory?
If yes, then the problem does not concern just a single pair
of attributes, but rather deserves a more versatile solution
like some sort of indirect grouping of constant item specs.
Have you considered such options?
> This patch takes the 2nd approach and introduces one new member
> "specialize" into rte_flow_table_attr to indicate possible flow table
> optimization.
The name "specialize" might have some drawbacks:
- spelling difference (specialise/specialize)
- in grep output, will mix with flows' "spec"
- quite long
- not a noun
Why not "scope"? Or something like that?
>
> By default, there is no hint, so the behavior of the transfer domain
> doesn't change.
> There is no guarantee that the hint will be used by the PMD.
>
> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> Acked-by: Ori Kam <orika at nvidia.com>
>
> v2: Move the new field to template table attribute.
> v4: Mark it as optional and clear the concept.
> v5: Change specialize type to uint32_t.
> v6: Change the flags to macros and re-construct the commit log.
> v7: Fix build failure.
> ---
> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
> doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
> 4 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 88108498e0..62197f2618 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -184,6 +184,8 @@ enum index {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VPORT_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
> TABLE_INGRESS,
> TABLE_EGRESS,
> TABLE_TRANSFER,
> + TABLE_TRANSFER_WIRE_ORIG,
> + TABLE_TRANSFER_VPORT_ORIG,
> TABLE_RULES_NUMBER,
> TABLE_PATTERN_TEMPLATE,
> TABLE_ACTIONS_TEMPLATE,
> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
> .next = NEXT(next_table_attr),
> .call = parse_table,
> },
> + [TABLE_TRANSFER_WIRE_ORIG] = {
> + .name = "wire_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> + [TABLE_TRANSFER_VPORT_ORIG] = {
> + .name = "vport_orig",
> + .help = "affect rule direction to transfer",
> + .next = NEXT(next_table_attr),
> + .call = parse_table,
> + },
> [TABLE_RULES_NUMBER] = {
> .name = "rules_number",
> .help = "number of rules in table",
> @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token
> *token,
> case TABLE_TRANSFER:
> out->args.table.attr.flow_attr.transfer = 1;
> return len;
> + case TABLE_TRANSFER_WIRE_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.specialize =
> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
> + return len;
> + case TABLE_TRANSFER_VPORT_ORIG:
> + if (!out->args.table.attr.flow_attr.transfer)
> + return -1;
> + out->args.table.attr.specialize =
> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
> + return len;
> default:
> return -1;
> }
> diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> index 3e6242803d..d9ca041ae4 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> &actions_templates, nb_actions_templ,
> &error);
>
> +Table Attribute: Specialize
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Application can help optimizing underlayer resources and insertion rate
> +by specializing template table.
> +Specialization is done by providing hints
> +in the template table attribute ``specialize``.
> +
> +This attribute is not mandatory for each PMD to implement.
> +If a hint is not supported, it will be silently ignored,
> +and no special optimization is done.
Silently ignoring the field does not sit well with the
application's possible intent to drop represented_port
match from the patterns. From my point of view, if the
application sets this attribute, it believes it can
rely on it, that is, packets coming from host won't
match if the attribute asks to match network
only, for instance. Has this been considered?
> +
> +If a table is specialized, the application should make sure the rules
> +comply with the table attribute.
How does the application enforce that? I would appreciate you explain it.
> +
> Asynchronous operations
> -----------------------
>
> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> index 96c5ae0fe4..b3238415f4 100644
> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
>
> flow template_table {port_id} create
> [table_id {id}] [group {group_id}]
> - [priority {level}] [ingress] [egress] [transfer]
> + [priority {level}] [ingress] [egress]
> + [transfer [vport_orig] [wire_orig]]
> rules_number {number}
> pattern_template {pattern_template_id}
> actions_template {actions_template_id}
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index 8858b56428..c27b48c5c1 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> */
> struct rte_flow_template_table;
>
> +/**@{@name Special optional flags for template table attribute
> + * Each bit is a hint for table specialization,
> + * offering a potential optimization at PMD layer.
> + * PMD can ignore the unsupported bits silently.
> + */
> +/**
> + * Specialize table for transfer flows which come only from wire.
> + * It allows PMD not to allocate resources for non-wire originated traffic.
> + * This bit is not a matching criteria, just an optimization hint.
You intended to spell "criterion", I take it. And still, it *is*
a match criterion. I'm not denying the possible need to have
this criterion at the earliest processing stage. That might
be OK, but I still have a hunch that this is too specific.
Please see my comment above about wasting memory.
I guess this type of criterion is not the only
one that may need to be provided as a "hint".
> + * Flow rules which match non-wire originated traffic will be missed
> + * if the hint is supported.
And what if it's unsupported? Is it indeed OK to silently ignore it?
> + */
> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
To me, TRANSFER looks redundant as this bit is already supposed
to be ticked in the "struct rte_flow_attr flow_attr" field
of the "struct rte_flow_template_table_attr".
> +/**
> + * Specialize table for transfer flows which come only from vport (e.g. VF,
> SF).
And PF?
> + * It allows PMD not to allocate resources for non-vport originated traffic.
> + * This bit is not a matching criteria, just an optimization hint.
> + * Flow rules which match non-vport originated traffic will be missed
> + * if the hint is supported.
> + */
> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
> +/**@}*/
> +
> /**
> * @warning
> * @b EXPERIMENTAL: this API may change without prior notice.
> @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
> * Maximum number of flow rules that this table holds.
> */
> uint32_t nb_flows;
> + /**
> + * Optional hint flags for PMD optimization.
> + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
> + */
> + uint32_t specialize;
Why not "scope" or something?
> };
>
> /**
> --
> 2.27.0
>
Thank you.
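The spoofing concern raised above can be made concrete with a small model. This is only a sketch under stated assumptions: struct pkt, struct rule and rule_matches() are hypothetical, simplified stand-ins and do not correspond to any rte_flow structure or to how a real NIC evaluates rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet: the port it entered the switch from, and its IPv4 source. */
struct pkt {
	uint16_t origin_port;
	uint32_t ipv4_src;
};

/* Hypothetical rule: IPv4 source match plus an optional origin match. */
struct rule {
	bool match_origin;    /* true if the rule carries a represented_port-like item */
	uint16_t origin_port; /* only meaningful when match_origin is true */
	uint32_t ipv4_src;
};

static bool
rule_matches(const struct rule *r, const struct pkt *p)
{
	/* An explicit origin item rejects packets from any other port. */
	if (r->match_origin && r->origin_port != p->origin_port)
		return false;
	return r->ipv4_src == p->ipv4_src;
}
```

In this model, a rule that matches only on the IPv4 source also accepts a packet sent from a different vport with a forged source address, whereas a rule that additionally pins the origin port does not.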
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-30 0:00 ` Ivan Malov
@ 2023-01-30 2:34 ` Rongwei Liu
2023-01-30 7:40 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2023-01-30 2:34 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Ivan,
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@arknetworks.am>
> Sent: Monday, January 30, 2023 08:00
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: Re: [PATCH v7] ethdev: add special flags when creating async transfer
> table
>
>
> Hi Rongwei,
>
> Thanks for persevering. I have no strong opinion, but, at least, the fact that the
> new flags are no longer meant for use in rte_flow_attr, which is clearly not
> the right place for such, is an improvement.
>
Thanks for the suggestion. It has been moved to rte_flow_table_attr now, and it's dedicated to the async API.
> However, let's take a closer look at the current patch, shall we?
>
> But, before we get to that, I'd like to kindly request that you provide a more
> concrete example of how this feature is supposed to be used. Are there some
> real-life application examples?
>
Sure.
> Also, to me, it's still unclear how an application can obtain the knowledge of
> this hint in the first instance. For example, can Open vSwitch somehow tell
> ethdevs representing physical ports from ones representing "vports" (host
> endpoints)?
> How does it know which attribute to specify?
>
The hint should be initiated by the application, since the application knows its traffic pattern, which is closely tied to the deployment.
Let's use VxLAN encap/decap as an example:
1. Traffic from the wire should match the VxLAN pattern, be decapsulated, and then be sent to different vports.
flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end
flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4
2. Traffic from vports should be encapsulated with a different VxLAN header and sent to the wire.
flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5
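Note that in the commands above "wire_orig" and "vport_orig" come after "transfer". The parse_table() hunk in the patch rejects either keyword unless the transfer attribute is already set, and it assigns the flag rather than OR-ing it. A minimal sketch of that check, using a simplified stand-in structure (struct table_attr and table_attr_set_hint() are illustrative assumptions, not testpmd code):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit positions of the two v7 flags. */
#define HINT_TRANSFER_WIRE_ORIG  (UINT32_C(1) << 0)
#define HINT_TRANSFER_VPORT_ORIG (UINT32_C(1) << 1)

/* Simplified stand-in for the template table attribute. */
struct table_attr {
	unsigned int transfer:1; /* set by the "transfer" keyword */
	uint32_t specialize;     /* set by "wire_orig" / "vport_orig" */
};

/* Returns 0 on success, -1 if the hint is given without "transfer". */
static int
table_attr_set_hint(struct table_attr *attr, uint32_t hint)
{
	if (!attr->transfer)
		return -1;
	/* Plain assignment, as in the quoted hunk: the last keyword wins. */
	attr->specialize = hint;
	return 0;
}
```

So "transfer wire_orig" is accepted, "wire_orig" alone is a parse error, and with this assignment-based logic giving both keywords would leave only the later one in effect.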
> For the rest of my notes, PSB.
>
> On Mon, 14 Nov 2022, Rongwei Liu wrote:
>
> > In case flow rules match only one kind of traffic in a flow table,
> > then optimization can be done via allocation of this table.
>
> This wording might confuse readers. Consider rephrasing it, please:
> If multiple flow rules share a common set of match masks, then they might
> belong in a flow table which can be pre-allocated.
>
> > Such optimization is possible only if the application gives a hint
> > about its usage of the table during initial configuration.
> >
> > The transfer domain rules may process traffic from wire or vport,
> > which may correspond to two kinds of underlayer resources.
>
> Why name it a "vport"? Why not "host"?
>
> host = packets generated by any of the host's "vport"s wire = packets arriving
> at the NIC from the network
"vport" is short for "virtual port" and covers VF/SF for now.
In my view, it's clearer and maps to DPDK port probing/management.
>
> > That's why the first two hints introduced in this patch are about wire
> > and vport traffic specialization.
> > Wire means traffic arrives from the uplink port while vport means
> > traffic initiated from VF/SF.
>
> By the sound of it, the meaning is confined to just VFs/SFs.
> What if the user wants to match packets coming from PFs?
>
It should be "wire_orig".
> >
> > There are two possible approaches for providing the hints.
> > Using IPv4 as an example:
> > 1. Use pattern item in both template table and flow rules.
> >
> > pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> > async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> >
> > "ANY_VPORT" needs to be present in each flow rule even if it's just
> > a hint. No value to match because matching is already done by
> > IPv4 item.
>
> Why no value to match on? How does it prevent rogue tenants from spoofing
> network headers? If the application receives a packet on a particular vport's
> representor, then it may strictly specify item represented_port pointing to that
> vport so that only packets from that vport match.
>
> Why isn't security a consideration?
>
There is some misunderstanding here. "ANY_VPORT" is the approach (a new matching item without a value) suggested by you.
I was explaining that it would need to be applied to each flow rule even though it is only a flag with no value.
> >
> > 2. Add special flags into table_attr.
> >
> > template_table 0 create table_id 0 group 1 transfer vport_orig
> >
> > Approach 1 needs to specify the pattern in each flow rule which wastes
> > memory and is not user friendly.
>
> What if the user has to insert a group of rules which not only have the same
> set of match masks but also share exactly the same match spec values for a
> limited subset of network items (for example, those of an encap. header)? This
> way, a subset of network item specs can remain fixed across many rules. Does
> that count as wasting memory?
>
As I understand it, you are talking about mixing multiple specs and masks.
This patch provides a hint and makes no assumption about the matching patterns.
I think the matching pattern is entirely controlled by the application layer.
I said "wasting memory" because your approach has to be repeated in each rule, while this patch only needs to set table_attr once.
It has nothing to do with the matching pattern.
> If yes, then the problem does not concern just a single pair of attributes, but
> rather deserves a more versatile solution like some sort of indirect grouping of
> constant item specs.
> Have you considered such options?
See above.
>
> > This patch takes the 2nd approach and introduces one new member
> > "specialize" into rte_flow_table_attr to indicate possible flow table
> > optimization.
>
> The name "specialize" might have some drawbacks:
> - spelling difference (specialise/specialize)
> - in grep output, will mix with flows' "spec"
> - quite long
> - not a noun
>
> Why not "scope"? Or something like that?
>
It means a special optimization for the PMD. "scope" is more vague.
> >
> > By default, there is no hint, so the behavior of the transfer domain
> > doesn't change.
> > There is no guarantee that the hint will be used by the PMD.
> >
> > Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> > Acked-by: Ori Kam <orika at nvidia.com>
> >
> > v2: Move the new field to template table attribute.
> > v4: Mark it as optional and clear the concept.
> > v5: Change specialize type to uint32_t.
> > v6: Change the flags to macros and re-construct the commit log.
> > v7: Fix build failure.
> > ---
> > app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
> > doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
> > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> > lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
> > 4 files changed, 71 insertions(+), 1 deletion(-)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 88108498e0..62197f2618 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -184,6 +184,8 @@ enum index {
> > TABLE_INGRESS,
> > TABLE_EGRESS,
> > TABLE_TRANSFER,
> > + TABLE_TRANSFER_WIRE_ORIG,
> > + TABLE_TRANSFER_VPORT_ORIG,
> > TABLE_RULES_NUMBER,
> > TABLE_PATTERN_TEMPLATE,
> > TABLE_ACTIONS_TEMPLATE,
> > @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
> > TABLE_INGRESS,
> > TABLE_EGRESS,
> > TABLE_TRANSFER,
> > + TABLE_TRANSFER_WIRE_ORIG,
> > + TABLE_TRANSFER_VPORT_ORIG,
> > TABLE_RULES_NUMBER,
> > TABLE_PATTERN_TEMPLATE,
> > TABLE_ACTIONS_TEMPLATE,
> > @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
> > .next = NEXT(next_table_attr),
> > .call = parse_table,
> > },
> > + [TABLE_TRANSFER_WIRE_ORIG] = {
> > + .name = "wire_orig",
> > + .help = "affect rule direction to transfer",
> > + .next = NEXT(next_table_attr),
> > + .call = parse_table,
> > + },
> > + [TABLE_TRANSFER_VPORT_ORIG] = {
> > + .name = "vport_orig",
> > + .help = "affect rule direction to transfer",
> > + .next = NEXT(next_table_attr),
> > + .call = parse_table,
> > + },
> > [TABLE_RULES_NUMBER] = {
> > .name = "rules_number",
> > .help = "number of rules in table",
> > @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
> > case TABLE_TRANSFER:
> > out->args.table.attr.flow_attr.transfer = 1;
> > return len;
> > + case TABLE_TRANSFER_WIRE_ORIG:
> > + if (!out->args.table.attr.flow_attr.transfer)
> > + return -1;
> > + out->args.table.attr.specialize =
> > RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
> > + return len;
> > + case TABLE_TRANSFER_VPORT_ORIG:
> > + if (!out->args.table.attr.flow_attr.transfer)
> > + return -1;
> > + out->args.table.attr.specialize =
> > RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
> > + return len;
> > default:
> > return -1;
> > }
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index 3e6242803d..d9ca041ae4 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> > &actions_templates, nb_actions_templ,
> > &error);
> >
> > +Table Attribute: Specialize
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Application can help optimizing underlayer resources and insertion
> > +rate by specializing template table.
> > +Specialization is done by providing hints in the template table
> > +attribute ``specialize``.
> > +
> > +This attribute is not mandatory for each PMD to implement.
> > +If a hint is not supported, it will be silently ignored, and no
> > +special optimization is done.
>
> Silently ignoring the field does not sit well with the application's possible intent
> to drop represented_port match from the patterns. From my point of view, if
> the application sets this attribute, it believes it can rely on it, that is, packets
> coming from host won't match if the attribute asks to match network only, for
> instance. Has this been considered?
>
> > +
> > +If a table is specialized, the application should make sure the rules
> > +comply with the table attribute.
>
> How does the application enforce that? I would appreciate you explain it.
>
> > +
> > Asynchronous operations
> > -----------------------
> >
> > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > index 96c5ae0fe4..b3238415f4 100644
> > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
> >
> > flow template_table {port_id} create
> > [table_id {id}] [group {group_id}]
> > - [priority {level}] [ingress] [egress] [transfer]
> > + [priority {level}] [ingress] [egress]
> > + [transfer [vport_orig] [wire_orig]]
> > rules_number {number}
> > pattern_template {pattern_template_id}
> > actions_template {actions_template_id}
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index 8858b56428..c27b48c5c1 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> > */
> > struct rte_flow_template_table;
> >
> > +/**@{@name Special optional flags for template table attribute
> > + * Each bit is a hint for table specialization,
> > + * offering a potential optimization at PMD layer.
> > + * PMD can ignore the unsupported bits silently.
> > + */
> > +/**
> > + * Specialize table for transfer flows which come only from wire.
> > + * It allows PMD not to allocate resources for non-wire originated traffic.
> > + * This bit is not a matching criteria, just an optimization hint.
>
> You intended to spell "criterion", I take it. And still, it *is* a match criterion.
> I'm not denying the possible need to have this criterion at the earliest
> processing stage. That might be OK, but I still have a hunch that this is too
> specific.
> Please see my comment above about wasting memory.
> I guess this type of criterion is not the only one that may need to be provided
> as a "hint".
>
> > + * Flow rules which match non-wire originated traffic will be missed
> > + * if the hint is supported.
>
> And what if it's unsupported? Is it indeed OK to silently ignore it?
>
> > + */
> > +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
>
> Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
>
> To me, TRANSFER looks redundant as this bit is already supposed to be ticked
> in the "struct rte_flow_attr flow_attr" field of the "struct
> rte_flow_template_table_attr".
>
> > +/**
> > + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
>
> And PF?
>
> > + * It allows PMD not to allocate resources for non-vport originated traffic.
> > + * This bit is not a matching criteria, just an optimization hint.
> > + * Flow rules which match non-vport originated traffic will be missed
> > + * if the hint is supported.
> > + */
> > +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
>
> Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
>
> > +/**@}*/
> > +
> > /**
> > * @warning
> > * @b EXPERIMENTAL: this API may change without prior notice.
> > @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
> > * Maximum number of flow rules that this table holds.
> > */
> > uint32_t nb_flows;
> > + /**
> > + * Optional hint flags for PMD optimization.
> > + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
> > + */
> > + uint32_t specialize;
>
> Why not "scope" or something?
>
> > };
> >
> > /**
> > --
> > 2.27.0
> >
>
> Thank you.
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-30 2:34 ` Rongwei Liu
@ 2023-01-30 7:40 ` Ivan Malov
2023-01-30 14:49 ` Rongwei Liu
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2023-01-30 7:40 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
For my responses, PSB.
By the way, now that you mention things like wasting memory and insertion
optimisations, are there any comparative figures showing the effect
of this hint on insertion performance / memory footprint?
Some "before" / "after" examples would really be helpful.
After all, I'm not objecting to this patch. But I believe that other
reviewers' concerns should nevertheless be addressed.
On Mon, 30 Jan 2023, Rongwei Liu wrote:
> Hi Ivan,
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@arknetworks.am>
>> Sent: Monday, January 30, 2023 08:00
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>> <rasland@nvidia.com>
>> Subject: Re: [PATCH v7] ethdev: add special flags when creating async transfer
>> table
>>
>> Hi Rongwei,
>>
>> Thanks for persevering. I have no strong opinion, but, at least, the fact that the
>> new flags are no longer meant for use in rte_flow_attr, which is clearly not
>> the right place for such, is an improvement.
>>
> Thanks for the suggestion. It has been moved to rte_flow_table_attr now and it's dedicated to the async API.
>> However, let's take a closer look at the current patch, shall we?
>>
>> But, before we get to that, I'd like to kindly request that you provide a more
>> concrete example of how this feature is supposed to be used. Are there some
>> real-life application examples?
>>
> Sure.
>> Also, to me, it's still unclear how an application can obtain the knowledge of
>> this hint in the first instance. For example, can Open vSwitch somehow tell
>> ethdevs representing physical ports from ones representing "vports" (host
>> endpoints)?
>> How does it know which attribute to specify?
>>
> The hint should be initiated by the application, and the application knows its traffic pattern, which is closely tied to the deployment.
> Let's use VxLAN encap/decap as an example:
> 1. Traffic from wire should be VxLAN pattern and do the decap, then send to different vports.
> flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end
> flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
> flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4
>
> 2. Traffic from vports should be encap with different VxLAN header and send to wire.
> flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
> flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5
>
>> For the rest of my notes, PSB.
>>
>> On Mon, 14 Nov 2022, Rongwei Liu wrote:
>>
>>> In case flow rules match only one kind of traffic in a flow table,
>>> then optimization can be done via allocation of this table.
>>
>> This wording might confuse readers. Consider rephrasing it, please:
>> If multiple flow rules share a common set of match masks, then they might
>> belong in a flow table which can be pre-allocated.
>>
>>> Such optimization is possible only if the application gives a hint
>>> about its usage of the table during initial configuration.
>>>
>>> The transfer domain rules may process traffic from wire or vport,
>>> which may correspond to two kinds of underlayer resources.
>>
>> Why name it a "vport"? Why not "host"?
>>
>> host = packets generated by any of the host's "vport"s wire = packets arriving
>> at the NIC from the network
> Vport is short for "virtual port" and covers "VF/SF" for now.
> To my mind, it's clearer and maps to DPDK port probing/management.
I understand that "host" might not be a brilliant name.
If "vport" stands for every port of the NIC that is not a network port,
then this name might be OK to me, but why doesn't it cover PFs? A PF is
clearly not a network / physical port. Why just VF/SF then? Where does
that "for now" decision come from? Just wondering.
>>
>>> That's why the first two hints introduced in this patch are about wire
>>> and vport traffic specialization.
>>> Wire means traffic arrives from the uplink port while vport means
>>> traffic initiated from VF/SF.
>>
>> By the sound of it, the meaning is confined to just VFs/SFs.
>> What if the user wants to match packets coming from PFs?
>>
> It should be "wire_orig".
Forgive me, but that does not sound correct. Say, there's an application
and it has a PF plugged into it: ethdev index 0. And the application
transmits packets using rte_eth_tx_burst() from that port.
You say that these packets can be matched via "wire_orig".
But they do not come from the wire. They come from PF...
>>>
>>> There are two possible approaches for providing the hints.
>>> Using IPv4 as an example:
>>> 1. Use pattern item in both template table and flow rules.
>>>
>>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
>>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>>>
>>> "ANY_VPORT" needs to be present in each flow rule even if it's just
>>> a hint. No value to match because matching is already done by
>>> IPv4 item.
>>
>> Why no value to match on? How does it prevent rogue tenants from spoofing
>> network headers? If the application receives a packet on a particular vport's
>> representor, then it may strictly specify item represented_port pointing to that
>> vport so that only packets from that vport match.
>>
>> Why isn't security a consideration?
>>
> There is some misunderstanding here. "ANY_VPORT" is the approach (new matching item without value) suggested by you.
I'm not talking about ANY_VPORT in this particular paragraph.
There's item "represented_port" mentioned over there. I'm just asking
about this "already done by IPv4 item" bit. Yes, it matches on the
header but not on the true origin of the packet (the logical port
of the NIC). If the app knows through which logical port the packet
enters the NIC, why not match on it for security?
> I was explaining we need to apply it to each flow rule even if it's only a flag and no value.
That's clear. But PSB.
>>>
>>> 2. Add special flags into table_attr.
>>>
>>> template_table 0 create table_id 0 group 1 transfer vport_orig
>>>
>>> Approach 1 needs to specify the pattern in each flow rule which wastes
>>> memory and is not user friendly.
>>
>> What if the user has to insert a group of rules which not only have the same
>> set of match masks but also share exactly the same match spec values for a
>> limited subset of network items (for example, those of an encap. header)? This
>> way, a subset of network item specs can remain fixed across many rules. Does
>> that count as wasting memory?
>>
> Per my understanding, you are talking about "multiple spec and mask mixing".
Say, there's a group of rules, and each of them matches on exactly
the same encap. header (the same in all rules), but different
internal match field values. So, why don't these "fixed"
encap. header items deserve being "optimised" somehow,
the same way as this "wire_orig" does?
If the application knows for sure that there will be packets with exactly
the same encap. header, - that forms this special knowledge that can be
used during init times to help the PMD optimise resource allocation.
Isn't that so? Don't these items deserve some form of a "hint"?
> We provide a hint in this patch and make no assumption about the matching patterns.
So I understand. My point is, certain portions of matching patterns
may be "fixed" = entirely the same (masks and specs) in all rules
of a table. Why not give PMD a "hint" about them, too?
> I think matching pattern is totally controlled by application layer.
So is the "direction" spec: the app layer has item represented_port
to control that. But, still, we're here to discuss a hint for that.
Why does the new hint aim exclusively at optimising out this
specific meta item? Why isn't it possible to care about a
generic portion of "known in advance" all-the-same items?
> "wasting memory " because your approach needs to scatter in each rule while this patch only needs to set table_attr once.
> It has no relation to the matching pattern at all.
The slight problem with your proposal is that for some reason only
one type of a match criterion deserves a hint moved to the attrs.
Whilst in reality the application may know in advance that
certain subsets of items will not only have the same masks
in all rules but also totally the same specs. If that is
a valid use case, why doesn't it deserve the same (more
generic) optimisation / a hint? Just wondering...
Or has that been addressed already somehow?
>> If yes, then the problem does not concern just a single pair of attributes, but
>> rather deserves a more versatile solution like some sort of indirect grouping of
>> constant item specs.
>> Have you considered such options?
> See above.
>>
>>> This patch takes the 2nd approach and introduces one new member
>>> "specialize" into rte_flow_table_attr to indicate possible flow table
>>> optimization.
>>
>> The name "specialize" might have some drawbacks:
>> - spelling difference (specialise/specialize)
>> - in grep output, will mix with flows' "spec"
>> - quite long
>> - not a noun
>>
>> Why not "scope"? Or something like that?
>>
> It means special optimization to PMD. "scope" is more rogue.
Why is it "rogue"? Scope is something limiting the point of view.
So are the suggested flags. Flag "wire_origin" (or whatever it
can be named eventually) limits the scope of matching. No?
>>>
>>> By default, there is no hint, so the behavior of the transfer domain
>>> doesn't change.
>>> There is no guarantee that the hint will be used by the PMD.
>>>
>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>> Acked-by: Ori Kam <orika at nvidia.com>
>>>
>>> v2: Move the new field to template table attribute.
>>> v4: Mark it as optional and clear the concept.
>>> v5: Change specialize type to uint32_t.
>>> v6: Change the flags to macros and re-construct the commit log.
>>> v7: Fix build failure.
>>> ---
>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
>>> doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>> lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
>>> 4 files changed, 71 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>>> index 88108498e0..62197f2618 100644
>>> --- a/app/test-pmd/cmdline_flow.c
>>> +++ b/app/test-pmd/cmdline_flow.c
>>> @@ -184,6 +184,8 @@ enum index {
>>> TABLE_INGRESS,
>>> TABLE_EGRESS,
>>> TABLE_TRANSFER,
>>> + TABLE_TRANSFER_WIRE_ORIG,
>>> + TABLE_TRANSFER_VPORT_ORIG,
>>> TABLE_RULES_NUMBER,
>>> TABLE_PATTERN_TEMPLATE,
>>> TABLE_ACTIONS_TEMPLATE,
>>> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
>>> TABLE_INGRESS,
>>> TABLE_EGRESS,
>>> TABLE_TRANSFER,
>>> + TABLE_TRANSFER_WIRE_ORIG,
>>> + TABLE_TRANSFER_VPORT_ORIG,
>>> TABLE_RULES_NUMBER,
>>> TABLE_PATTERN_TEMPLATE,
>>> TABLE_ACTIONS_TEMPLATE,
>>> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
>>> .next = NEXT(next_table_attr),
>>> .call = parse_table,
>>> },
>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>> + .name = "wire_orig",
>>> + .help = "affect rule direction to transfer",
>>> + .next = NEXT(next_table_attr),
>>> + .call = parse_table,
>>> + },
>>> + [TABLE_TRANSFER_VPORT_ORIG] = {
>>> + .name = "vport_orig",
>>> + .help = "affect rule direction to transfer",
>>> + .next = NEXT(next_table_attr),
>>> + .call = parse_table,
>>> + },
>>> [TABLE_RULES_NUMBER] = {
>>> .name = "rules_number",
>>> .help = "number of rules in table",
>>> @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
>>> case TABLE_TRANSFER:
>>> out->args.table.attr.flow_attr.transfer = 1;
>>> return len;
>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>> + if (!out->args.table.attr.flow_attr.transfer)
>>> + return -1;
>>> + out->args.table.attr.specialize = RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
>>> + return len;
>>> + case TABLE_TRANSFER_VPORT_ORIG:
>>> + if (!out->args.table.attr.flow_attr.transfer)
>>> + return -1;
>>> + out->args.table.attr.specialize = RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
>>> + return len;
>>> default:
>>> return -1;
>>> }
>>> diff --git a/doc/guides/prog_guide/rte_flow.rst
>>> b/doc/guides/prog_guide/rte_flow.rst
>>> index 3e6242803d..d9ca041ae4 100644
>>> --- a/doc/guides/prog_guide/rte_flow.rst
>>> +++ b/doc/guides/prog_guide/rte_flow.rst
>>> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
>>> &actions_templates, nb_actions_templ,
>>> &error);
>>>
>>> +Table Attribute: Specialize
>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> +
>>> +Application can help optimizing underlayer resources and insertion
>>> +rate by specializing template table.
>>> +Specialization is done by providing hints in the template table
>>> +attribute ``specialize``.
>>> +
>>> +This attribute is not mandatory for each PMD to implement.
>>> +If a hint is not supported, it will be silently ignored, and no
>>> +special optimization is done.
>>
>> Silently ignoring the field does not sit well with the application's possible intent
>> to drop represented_port match from the patterns. From my point of view, if
>> the application sets this attribute, it believes it can rely on it, that is, packets
>> coming from host won't match if the attribute asks to match network only, for
>> instance. Has this been considered?
>>
>>> +
>>> +If a table is specialized, the application should make sure the rules
>>> +comply with the table attribute.
>>
>> How does the application enforce that? I would appreciate you explain it.
>>
>>> +
>>> Asynchronous operations
>>> -----------------------
>>>
>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> index 96c5ae0fe4..b3238415f4 100644
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
>>>
>>> flow template_table {port_id} create
>>> [table_id {id}] [group {group_id}]
>>> - [priority {level}] [ingress] [egress] [transfer]
>>> + [priority {level}] [ingress] [egress]
>>> + [transfer [vport_orig] [wire_orig]]
>>> rules_number {number}
>>> pattern_template {pattern_template_id}
>>> actions_template {actions_template_id}
>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>> index 8858b56428..c27b48c5c1 100644
>>> --- a/lib/ethdev/rte_flow.h
>>> +++ b/lib/ethdev/rte_flow.h
>>> @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>>> */
>>> struct rte_flow_template_table;
>>>
>>> +/**@{@name Special optional flags for template table attribute
>>> + * Each bit is a hint for table specialization,
>>> + * offering a potential optimization at PMD layer.
>>> + * PMD can ignore the unsupported bits silently.
>>> + */
>>> +/**
>>> + * Specialize table for transfer flows which come only from wire.
>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>>> + * This bit is not a matching criteria, just an optimization hint.
>>
>> You intended to spell "criterion", I take it. And still, it *is* a match criterion.
>> I'm not denying the possible need to have this criterion at the earliest
>> processing stage. That might be OK, but I still have a hunch that this is too
>> specific.
>> Please see my comment above about wasting memory.
>> I guess this type of criterion is not the only one that may need to be provided
>> as a "hint".
>>
>>> + * Flow rules which match non-wire originated traffic will be missed
>>> + * if the hint is supported.
>>
>> And what if it's unsupported? Is it indeed OK to silently ignore it?
>>
>>> + */
>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
>>
>> Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
>>
>> To me, TRANSFER looks redundant as this bit is already supposed to be ticked
>> in the "struct rte_flow_attr flow_attr" field of the "struct
>> rte_flow_template_table_attr".
>>
>>> +/**
>>> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
>>
>> And PF?
>>
>>> + * It allows PMD not to allocate resources for non-vport originated traffic.
>>> + * This bit is not a matching criteria, just an optimization hint.
>>> + * Flow rules which match non-vport originated traffic will be missed
>>> + * if the hint is supported.
>>> + */
>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
>>
>> Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
>>
>>> +/**@}*/
>>> +
>>> /**
>>> * @warning
>>> * @b EXPERIMENTAL: this API may change without prior notice.
>>> @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
>>> * Maximum number of flow rules that this table holds.
>>> */
>>> uint32_t nb_flows;
>>> + /**
>>> + * Optional hint flags for PMD optimization.
>>> + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
>>> + */
>>> + uint32_t specialize;
>>
>> Why not "scope" or something?
>>
>>> };
>>>
>>> /**
>>> --
>>> 2.27.0
>>>
>>
>> Thank you.
>
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-30 7:40 ` Ivan Malov
@ 2023-01-30 14:49 ` Rongwei Liu
2023-01-30 23:00 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2023-01-30 14:49 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Ivan,
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@arknetworks.am>
> Sent: Monday, January 30, 2023 15:40
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: RE: [PATCH v7] ethdev: add special flags when creating async transfer
> table
>
> Hi Rongwei,
>
> For my responses, PSB.
>
> By the way, now that you mention things like wasting memory and insertion
> optimisations, are there any comparative figures showing the effect of this hint
> on insertion performance / memory footprint?
> Some "before" / "after" examples would really be helpful.
>
Good to hear we have almost reached agreement.
First, the hint has nothing to do with matching; it only affects PMD resource management.
In my local test, it saves around 50% of the memory in the VxLAN encap/decap example case.
The insertion rate improves only very slightly.
> After all, I'm not objecting to this patch. But I believe that other reviewers'
> concerns should nevertheless be addressed.
Let me try to show the hint is useful.
>
> On Mon, 30 Jan 2023, Rongwei Liu wrote:
>
> > Hi Ivan,
> >
> > BR
> > Rongwei
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@arknetworks.am>
> >> Sent: Monday, January 30, 2023 08:00
> >> To: Rongwei Liu <rongweil@nvidia.com>
> >> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> >> <rasland@nvidia.com>
> >> Subject: Re: [PATCH v7] ethdev: add special flags when creating async
> >> transfer table
> >>
> >> Hi Rongwei,
> >>
> >> Thanks for persevering. I have no strong opinion, but, at least, the
> >> fact that the new flags are no longer meant for use in rte_flow_attr,
> >> which is clearly not the right place for such, is an improvement.
> >>
> > Thanks for the suggestion. It has been moved to rte_flow_table_attr now and
> > it's dedicated to the async API.
> >> However, let's take a closer look at the current patch, shall we?
> >>
> >> But, before we get to that, I'd like to kindly request that you
> >> provide a more concrete example of how this feature is supposed to be
> >> used. Are there some real-life application examples?
> >>
> > Sure.
> >> Also, to me, it's still unclear how an application can obtain the
> >> knowledge of this hint in the first instance. For example, can Open
> >> vSwitch somehow tell ethdevs representing physical ports from ones
> >> representing "vports" (host endpoints)?
> >> How does it know which attribute to specify?
> >>
> > The hint should be initiated by the application, and the application knows its
> > traffic pattern, which is closely tied to the deployment.
> > Let's use VxLAN encap/decap as an example:
> > 1. Traffic from wire should be VxLAN pattern and do the decap, then send to different vports.
> > flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end
> > flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
> > flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4
> >
> > 2. Traffic from vports should be encap with different VxLAN header and send to wire.
> > flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
> > flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5
> >
> >> For the rest of my notes, PSB.
> >>
> >> On Mon, 14 Nov 2022, Rongwei Liu wrote:
> >>
> >>> In case flow rules match only one kind of traffic in a flow table,
> >>> then optimization can be done via allocation of this table.
> >>
> >> This wording might confuse readers. Consider rephrasing it, please:
> >> If multiple flow rules share a common set of match masks, then they
> >> might belong in a flow table which can be pre-allocated.
> >>
> >>> Such optimization is possible only if the application gives a hint
> >>> about its usage of the table during initial configuration.
> >>>
> >>> The transfer domain rules may process traffic from wire or vport,
> >>> which may correspond to two kinds of underlayer resources.
> >>
> >> Why name it a "vport"? Why not "host"?
> >>
> >> host = packets generated by any of the host's "vport"s wire = packets
> >> arriving at the NIC from the network
> > Vport is short for "virtual port" and covers "VF/SF" for now.
> > To my mind, it's clearer and maps to DPDK port probing/management.
>
> I understand that "host" might not be a brilliant name.
>
> If "vport" stands for every port of the NIC that is not a network port, then this
> name might be OK to me, but why doesn't it cover PFs? A PF is clearly not a
> network / physical port. Why just VF/SF then? Where does that "for now"
> decision come from? Just wondering.
>
"For now" stands for my understanding. DPDK is always in evolution, right?
You are right, PF should be included in the "vport" concept.
> >>
> >>> That's why the first two hints introduced in this patch are about
> >>> wire and vport traffic specialization.
> >>> Wire means traffic arrives from the uplink port while vport means
> >>> traffic initiated from VF/SF.
> >>
> >> By the sound of it, the meaning is confined to just VFs/SFs.
> >> What if the user wants to match packets coming from PFs?
> >>
> > It should be "wire_orig".
>
> Forgive me, but that does not sound correct. Say, there's an application and it
> has a PF plugged into it: ethdev index 0. And the application transmits packets
> using rte_eth_tx_burst() from that port.
> You say that these packets can be matched via "wire_orig".
> But they do not come from the wire. They come from PF...
Hmm. My mistake.
This may highly depend on the PMD implementation. Basically, a PF's traffic may contain
both "from wire"/"wire_orig" and "from local"/"vport_orig".
That's why we emphasize it's optional for the PMD.
>
> >>>
> >>> There are two possible approaches for providing the hints.
> >>> Using IPv4 as an example:
> >>> 1. Use pattern item in both template table and flow rules.
> >>>
> >>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> >>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> >>>
> >>> "ANY_VPORT" needs to be present in each flow rule even if it's
> >>> just a hint. No value to match because matching is already done by
> >>> IPv4 item.
> >>
> >> Why no value to match on? How does it prevent rogue tenants from
> >> spoofing network headers? If the application receives a packet on a
> >> particular vport's representor, then it may strictly specify item
> >> represented_port pointing to that vport so that only packets from that
> >> vport match.
> >>
> >> Why isn't security a consideration?
> >>
> > There is some misunderstanding here. "ANY_VPORT" is the approach (new
> > matching item without value) suggested by you.
> I'm not talking about ANY_VPORT in this particular paragraph.
>
> There's item "represented_port" mentioned over there. I'm just asking about
> this "already done by IPv4 item" bit. Yes, it matches on the header but not on
> the true origin of the packet (the logical port of the NIC). If the app knows
> through which logical port the packet enters the NIC, why not match on it for
> security?
>
A hint is not a match; it indicates how to manage the underlying steering resources.
If "vport_orig" is present, the PMD will apply the steering logic only to vport traffic.
The resource is allocated with the async table, before any rule is inserted. That already covers the security considerations.
Matching on "represented_port" has to be programmed in each rule; consider a port range like indexes "5-10".
The hint tells the PMD to take care only of traffic from vports, regardless of the port index.
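
As a rough illustration of the port-range point, here is a minimal sketch in C.
The helper names are hypothetical (they are not part of rte_flow or of any PMD),
and it assumes the application wants the same set of flows for every port in
the range. It only counts rules; it says nothing about how a real PMD lays out
its resources.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only (not rte_flow or PMD code).
 * They count how many rules the application inserts to cover the same
 * set of flows for every port in an index range.
 */

/* One rule per (port, flow) pair when each rule matches on represented_port. */
static uint32_t
rules_with_port_match(uint16_t first_port, uint16_t last_port, uint32_t nb_flows)
{
	uint32_t nb_ports = (uint32_t)(last_port - first_port) + 1;

	return nb_ports * nb_flows;
}

/* One rule per flow when the table-level hint already narrows the origin. */
static uint32_t
rules_with_table_hint(uint32_t nb_flows)
{
	return nb_flows;
}
```

For ports 5-10 and 100 flows, the first helper counts 600 rules and the second
counts 100. Note that the two are not equivalent in scope: with the hint, the
rules apply to traffic from any vport, not only from ports 5-10.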
> > I was explaining we need to apply it to each flow rule even if it's only a flag
> and no value.
>
> That's clear. But PSB.
>
> >>>
> >>> 2. Add special flags into table_attr.
> >>>
> >>> template_table 0 create table_id 0 group 1 transfer vport_orig
> >>>
> >>> Approach 1 needs to specify the pattern in each flow rule which
> >>> wastes memory and is not user friendly.
> >>
> >> What if the user has to insert a group of rules which not only have
> >> the same set of match masks but also share exactly the same match
> >> spec values for a limited subset of network items (for example, those
> >> of an encap. header)? This way, a subset of network item specs can
> >> remain fixed across many rules. Does that count as wasting memory?
> >>
> > Per my understanding, you are talking "multiple spec and mask mixing".
>
> Say, there's a group of rules, and each of them matches on exactly the same
> encap. header (the same in all rules), but different internal match field values.
> So, why don't these "fixed"
> encap. header items deserve being "optimised" somehow, the same way as
> this "wire_orig" does?
We are back to the original point. The async approach tries to pre-allocate resources and speed up insertion.
Resources are allocated at the async table stage, where we only have mask information.
The matching values are passed in with each rule. I guess you are suggesting optimizing per matching value, right?
That needs dynamic calculation for each rule and wastes resources in the async table (the table allocates resources for all possible values).
>
> If the application knows for sure that there will be packets with exactly the
> same encap. header, - that forms this special knowledge that can be used
> during init times to help the PMD optimise resource allocation.
> Isn't that so? Don't these items deserve some form of a "hint"?
>
It may deserve some kind of "hint". But see above: these hints are per rule, and resource allocation happens before the rules.
> > We provide a hint in this patch and no assumption on the matching patterns.
>
> So I understand. My point is, certain portions of matching patterns may be
> "fixed" = entirely the same (masks and specs) in all rules of a table. Why not
> give PMD a "hint" about them, too?
>
> > I think matching pattern is totally controlled by application layer.
>
> So is the "direction" spec: the app layer has item represented_port to control
> that. But, still, we're here to discuss a hint for that.
> Why does the new hint aim exclusively at optimising out this specific meta
> item? Why isn't it possible to care about a generic portion of "know in
> advance" all-the-same items?
" generic portion of know in advance" is some still kind of dynamic approach, right?
Imagine a situation: DPDK has 10 VFs, and each VF may have different VxLAN encap headers.
This hint approach works for all 10 VFs at once.
In public cloud deployments, each VF/SF may map to a different user, but the underlay is almost the same (GRE/VxLAN..., differing in field values).
>
> > "wasting memory " because your approach needs to scatter in each rule
> while this patch only needs to set table_attr once.
> > It has no relation to the matching pattern at all.
>
> The slight problem with your proposal is that for some reason only one type of
> a match criterion deserves a hint moved to the attrs.
> Whilst in reality the application may know in advance that certain subsets of
> items will not only have the same masks in all rules but also totally the same
> specs. If that is a valid use case, why doesn't it deserve the same (more
> generic) optimisation / a hint? Just wondering...
> Or has that been addressed already somehow?
>
Believe me, the hint already helps us save significant resources.
In my view, your proposal is totally valid in the sync approach, but please check my response:
the async approach tries to allocate resources in advance and speed up insertion as much as possible.
> >> If yes, then the problem does not concern just a single pair of
> >> attributes, but rather deserves a more versatile solution like some
> >> sort of indirect grouping of constant item specs.
> >> Have you considered such options?
> > See above.
> >>
> >>> This patch takes the 2nd approach and introduces one new member
> >>> "specialize" into rte_flow_table_attr to indicate possible flow
> >>> table optimization.
> >>
> >> The name "specialize" might have some drawbacks:
> >> - spelling difference (specialise/specialize)
> >> - in grep output, will mix with flows' "spec"
> >> - quite long
> >> - not a noun
> >>
> >> Why not "scope"? Or something like that?
> >>
> > It means special optimization to PMD. "scope" is more rogue.
>
> Why is it "rogue"? Scope is something limiting the point of view.
> So are the suggested flags. Flag "wire_origin" (or whatever it can be named
> eventually) limits the scope of matching. No?
>
The hint won't interfere with matching. It has no knowledge of matching.
Instead, it only controls matching resources. "wire_orig" tells the PMD to allocate HW resources for traffic from the wire only.
Then traffic from vports is silently ignored. The hint doesn't know what is matched or how many fields are involved.
> >>>
> >>> By default, there is no hint, so the behavior of the transfer domain
> >>> doesn't change.
> >>> There is no guarantee that the hint will be used by the PMD.
> >>>
> >>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> >>> Acked-by: Ori Kam <orika at nvidia.com>
> >>>
> >>> v2: Move the new field to template table attribute.
> >>> v4: Mark it as optional and clear the concept.
> >>> v5: Change specialize type to uint32_t.
> >>> v6: Change the flags to macros and re-construct the commit log.
> >>> v7: Fix build failure.
> >>> ---
> >>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
> >>> doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
> >>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> >>> lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
> >>> 4 files changed, 71 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/app/test-pmd/cmdline_flow.c
> >>> b/app/test-pmd/cmdline_flow.c index 88108498e0..62197f2618 100644
> >>> --- a/app/test-pmd/cmdline_flow.c
> >>> +++ b/app/test-pmd/cmdline_flow.c
> >>> @@ -184,6 +184,8 @@ enum index {
> >>> TABLE_INGRESS,
> >>> TABLE_EGRESS,
> >>> TABLE_TRANSFER,
> >>> + TABLE_TRANSFER_WIRE_ORIG,
> >>> + TABLE_TRANSFER_VPORT_ORIG,
> >>> TABLE_RULES_NUMBER,
> >>> TABLE_PATTERN_TEMPLATE,
> >>> TABLE_ACTIONS_TEMPLATE,
> >>> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
> >>> TABLE_INGRESS,
> >>> TABLE_EGRESS,
> >>> TABLE_TRANSFER,
> >>> + TABLE_TRANSFER_WIRE_ORIG,
> >>> + TABLE_TRANSFER_VPORT_ORIG,
> >>> TABLE_RULES_NUMBER,
> >>> TABLE_PATTERN_TEMPLATE,
> >>> TABLE_ACTIONS_TEMPLATE,
> >>> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
> >>> .next = NEXT(next_table_attr),
> >>> .call = parse_table,
> >>> },
> >>> + [TABLE_TRANSFER_WIRE_ORIG] = {
> >>> + .name = "wire_orig",
> >>> + .help = "affect rule direction to transfer",
> >>> + .next = NEXT(next_table_attr),
> >>> + .call = parse_table,
> >>> + },
> >>> + [TABLE_TRANSFER_VPORT_ORIG] = {
> >>> + .name = "vport_orig",
> >>> + .help = "affect rule direction to transfer",
> >>> + .next = NEXT(next_table_attr),
> >>> + .call = parse_table,
> >>> + },
> >>> [TABLE_RULES_NUMBER] = {
> >>> .name = "rules_number",
> >>> .help = "number of rules in table", @@ -8993,6
> >>> +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
> >>> case TABLE_TRANSFER:
> >>> out->args.table.attr.flow_attr.transfer = 1;
> >>> return len;
> >>> + case TABLE_TRANSFER_WIRE_ORIG:
> >>> + if (!out->args.table.attr.flow_attr.transfer)
> >>> + return -1;
> >>> + out->args.table.attr.specialize =
> >>> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
> >>> + return len;
> >>> + case TABLE_TRANSFER_VPORT_ORIG:
> >>> + if (!out->args.table.attr.flow_attr.transfer)
> >>> + return -1;
> >>> + out->args.table.attr.specialize =
> >>> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
> >>> + return len;
> >>> default:
> >>> return -1;
> >>> }
> >>> diff --git a/doc/guides/prog_guide/rte_flow.rst
> >>> b/doc/guides/prog_guide/rte_flow.rst
> >>> index 3e6242803d..d9ca041ae4 100644
> >>> --- a/doc/guides/prog_guide/rte_flow.rst
> >>> +++ b/doc/guides/prog_guide/rte_flow.rst
> >>> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> >>> &actions_templates, nb_actions_templ,
> >>> &error);
> >>>
> >>> +Table Attribute: Specialize
> >>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>> +
> >>> +Application can help optimizing underlayer resources and insertion
> >>> +rate by specializing template table.
> >>> +Specialization is done by providing hints in the template table
> >>> +attribute ``specialize``.
> >>> +
> >>> +This attribute is not mandatory for each PMD to implement.
> >>> +If a hint is not supported, it will be silently ignored, and no
> >>> +special optimization is done.
> >>
> >> Silently ignoring the field does not sit well with the application's
> >> possible intent to drop represented_port match from the patterns.
> >> From my point of view, if the application sets this attribute, it
> >> believes it can rely on it, that is, packets coming from host won't
> >> match if the attribute asks to match network only, for instance. Has this
> been considered?
> >>
> >>> +
> >>> +If a table is specialized, the application should make sure the
> >>> +rules comply with the table attribute.
> >>
> >> How does the application enforce that? I would appreciate you explain it.
> >>
> >>> +
> >>> Asynchronous operations
> >>> -----------------------
> >>>
> >>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> index 96c5ae0fe4..b3238415f4 100644
> >>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>> @@ -3145,7 +3145,8 @@ It is bound to
> >> ``rte_flow_template_table_create()``::
> >>>
> >>> flow template_table {port_id} create
> >>> [table_id {id}] [group {group_id}]
> >>> - [priority {level}] [ingress] [egress] [transfer]
> >>> + [priority {level}] [ingress] [egress]
> >>> + [transfer [vport_orig] [wire_orig]]
> >>> rules_number {number}
> >>> pattern_template {pattern_template_id}
> >>> actions_template {actions_template_id} diff --git
> >>> a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> >>> 8858b56428..c27b48c5c1 100644
> >>> --- a/lib/ethdev/rte_flow.h
> >>> +++ b/lib/ethdev/rte_flow.h
> >>> @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t
> >>> port_id, */ struct rte_flow_template_table;
> >>>
> >>> +/**@{@name Special optional flags for template table attribute
> >>> + * Each bit is a hint for table specialization,
> >>> + * offering a potential optimization at PMD layer.
> >>> + * PMD can ignore the unsupported bits silently.
> >>> + */
> >>> +/**
> >>> + * Specialize table for transfer flows which come only from wire.
> >>> + * It allows PMD not to allocate resources for non-wire originated traffic.
> >>> + * This bit is not a matching criteria, just an optimization hint.
> >>
> >> You intended to spell "criterion", I take it. And still, it *is* a match criterion.
> >> I'm not denying the possible need to have this criterion at the
> >> earliest processing stage. That might be OK, but I still have a hunch
> >> that this is too specific.
> >> Please see my comment above about wasting memory.
> >> I guess this type of criterion is not the only one that may need to
> >> be provided as a "hint".
> >>
> >>> + * Flow rules which match non-wire originated traffic will be
> >>> + missed
> >>> + * if the hint is supported.
> >>
> >> And what if it's unsupported? Is it indeed OK to silently ignore it?
> >>
> >>> + */
> >>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG
> >> RTE_BIT32(0)
> >>
> >> Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
> >>
> >> To me, TRANSFER looks redundant as this bit is already supposed to be
> >> ticked in the "struct rte_flow_attr flow_attr" field of the "struct
> >> rte_flow_template_table_attr".
> >>
> >>> +/**
> >>> + * Specialize table for transfer flows which come only from vport
> >>> +(e.g. VF,
> >>> SF).
> >>
> >> And PF?
> >>
> >>> + * It allows PMD not to allocate resources for non-vport originated traffic.
> >>> + * This bit is not a matching criteria, just an optimization hint.
> >>> + * Flow rules which match non-vport originated traffic will be
> >>> +missed
> >>> + * if the hint is supported.
> >>> + */
> >>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG
> >> RTE_BIT32(1)
> >>
> >> Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
> >>
> >>> +/**@}*/
> >>> +
> >>> /**
> >>> * @warning
> >>> * @b EXPERIMENTAL: this API may change without prior notice.
> >>> @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
> >>> * Maximum number of flow rules that this table holds.
> >>> */
> >>> uint32_t nb_flows;
> >>> + /**
> >>> + * Optional hint flags for PMD optimization.
> >>> + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
> >>> + */
> >>> + uint32_t specialize;
> >>
> >> Why not "scope" or something?
> >>
> >>> };
> >>>
> >>> /**
> >>> --
> >>> 2.27.0
> >>>
> >>
> >> Thank you.
> >
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-30 14:49 ` Rongwei Liu
@ 2023-01-30 23:00 ` Ivan Malov
2023-01-31 3:06 ` Rongwei Liu
0 siblings, 1 reply; 96+ messages in thread
From: Ivan Malov @ 2023-01-30 23:00 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
Thanks for the professional attitude.
Hope this discussion gets us on the
same page. Please see below.
On Mon, 30 Jan 2023, Rongwei Liu wrote:
> HI Ivan
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@arknetworks.am>
>> Sent: Monday, January 30, 2023 15:40
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>> <rasland@nvidia.com>
>> Subject: RE: [PATCH v7] ethdev: add special flags when creating async transfer
>> table
>>
>>
>> Hi Rongwei,
>>
>> For my responses, PSB.
>>
>> By the way, now that you mention things like wasting memory and insertion
>> optimisations, are there any comparative figures to see the effect of this hint
>> on insertion performance / memory footprint?
>> Some "before" / "after" examples would really be helpful.
>>
> Good to hear we have almost reached agreement.
Very well.
The key point here is that one may agree that some optimisations
are indeed needed, yes. I don't deny the fact that some vendors
might have issues with how the API maps to the HW capabilities.
Yes, some undesirable resource overhead shall be avoided, but
the high level hints for that have to be designed with care.
> First, the hint has nothing to do with matching; it only affects PMD resource management.
You say "PMD resource management". For the flow management, that's mostly
vendor-specific, I take it. Let me explain. The application, for instance,
can control the number of Tx descriptors in the queue during setup stage.
Tx descriptors are a common type of HW resource, hence the explicit
control for it available to applications. For flow library, it's
not like that. Different vendors have different "underlayer"
representations, they may vary drastically.
I take it, in the case of the HW you're working with, this hint indeed
maps to something that is entirely resource-related and which does not
belong in this specific vendor's match criteria. I 100% understand
that, in your case, these are separate. But the point is that, at
the high (vendor-neutral) programming level, such a hint
is in fact a match criterion. Because it tells the driver to
limit the scope of matching to just "from net"/"from vport",
the same way other metadata items do (represented_port).
The only difference is that it refers to a group of
unspecified ports which have something in common.
So, although I don't strongly object to having some hints like this one
in the generic API, I nevertheless disagree with describing this as
just "resource-specific" and not being a match criterion. It's just
not always the case. It might not be valid for *all* NIC vendors.
> In my local test, it can save around 50% memory in the VxLAN encap/decap example case.
Forgive me in case this has been already discussed; where's that memory?
I mean, is it some sort of general-purpose memory? Or some HW-specific
table capacity overhead? I'm trying to understand how the feature
will be useful to other vendors, or how common this problem is.
> The insertion rate improves only very slightly.
>> After all, I'm not objecting to this patch. But I believe that other reviewers'
>> concerns should nevertheless be addressed anyway.
> Let me try to show the hint is useful.
>>
>> On Mon, 30 Jan 2023, Rongwei Liu wrote:
>>
>>> Hi Ivan,
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@arknetworks.am>
>>>> Sent: Monday, January 30, 2023 08:00
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>>>> <rasland@nvidia.com>
>>>> Subject: Re: [PATCH v7] ethdev: add special flags when creating async
>>>> transfer table
>>>>
>>>>
>>>> Hi Rongwei,
>>>>
>>>> Thanks for persevering. I have no strong opinion, but, at least, the
>>>> fact that the new flags are no longer meant for use in rte_flow_attr,
>>>> which is clearly not the right place for such, is an improvement.
>>>>
>>> Thanks for the suggestion; it has been moved to rte_flow_table_attr now and it's
>>> dedicated to the async API.
>>>> However, let's take a closer look at the current patch, shall we?
>>>>
>>>> But, before we get to that, I'd like to kindly request that you
>>>> provide a more concrete example of how this feature is supposed to be
>>>> used. Are there some real-life application examples?
>>>>
>>> Sure.
>>>> Also, to me, it's still unclear how an application can obtain the
>>>> knowledge of this hint in the first instance. For example, can Open
>>>> vSwitch somehow tell ethdevs representing physical ports from ones
>>>> representing "vports" (host endpoints)?
>>>> How does it know which attribute to specify?
>>>>
>>> The hint should be initiated by the application, and the application knows its traffic
>>> pattern, which depends heavily on the deployment.
>>> Let's use VxLAN encap/decap as an example:
>>> 1. Traffic from wire should be VxLAN pattern and do the decap, then send to
>>> different vports.
>>> flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end
>>> flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
>>> flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4
>>>
>>> 2. Traffic from vports should be encap with different VxLAN header and send
>>> to wire.
>>> flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
>>> flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5
>>>
>>>> For the rest of my notes, PSB.
>>>>
>>>> On Mon, 14 Nov 2022, Rongwei Liu wrote:
>>>>
>>>>> In case flow rules match only one kind of traffic in a flow table,
>>>>> then optimization can be done via allocation of this table.
>>>>
>>>> This wording might confuse readers. Consider rephrasing it, please:
>>>> If multiple flow rules share a common set of match masks, then they
>>>> might belong in a flow table which can be pre-allocated.
>>>>
>>>>> Such optimization is possible only if the application gives a hint
>>>>> about its usage of the table during initial configuration.
>>>>>
>>>>> The transfer domain rules may process traffic from wire or vport,
>>>>> which may correspond to two kinds of underlayer resources.
>>>>
>>>> Why name it a "vport"? Why not "host"?
>>>>
>>>> host = packets generated by any of the host's "vport"s wire = packets
>>>> arriving at the NIC from the network
>>> Vport is short for "virtual port" and covers VF/SF for now.
>>> In my view, it's clearer and maps to DPDK port probing/management.
>>
>> I understand that "host" might not be a brilliant name.
>>
>> If "vport" stands for every port of the NIC that is not a network port, then this
>> name might be OK to me, but why doesn't it cover PFs? A PF is clearly not a
>> network / physical port. Why just VF/SF then? Where does that "for now"
>> decision come from? Just wondering.
>>
> "For now" reflects my current understanding. DPDK is always evolving, right?
> You are right, PF should be included in the "vport" concept.
>>>>
>>>>> That's why the first two hints introduced in this patch are about
>>>>> wire and vport traffic specialization.
>>>>> Wire means traffic arrives from the uplink port while vport means
>>>>> traffic initiated from VF/SF.
>>>>
>>>> By the sound of it, the meaning is confined to just VFs/SFs.
>>>> What if the user wants to match packets coming from PFs?
>>>>
>>> It should be "wire_orig".
>>
>> Forgive me, but that does not sound correct. Say, there's an application and it
>> has a PF plugged into it: ethdev index 0. And the application transmits packets
>> using rte_eth_tx_burst() from that port.
>> You say that these packets can be matched via "wire_orig".
>> But they do not come from the wire. They come from PF...
> Hmm. My mistake.
> This may depend heavily on the PMD implementation. Basically, a PF's traffic may contain
> both "from wire"/"wire_orig" and "from local"/"vport_orig" packets.
> That's why we emphasize it's optional for the PMD.
>>
>>>>>
>>>>> There are two possible approaches for providing the hints.
>>>>> Using IPv4 as an example:
>>>>> 1. Use pattern item in both template table and flow rules.
>>>>>
>>>>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
>>>>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>>>>>
>>>>> "ANY_VPORT" needs to be present in each flow rule even if it's
>>>>> just a hint. No value to match because matching is already done by
>>>>> IPv4 item.
>>>>
>>>> Why no value to match on? How does it prevent rogue tenants from
>>>> spoofing network headers? If the application receives a packet on a
>>>> particular vport's representor, then it may strictly specify item
>>>> represented_port pointing to that vport so that only packets from that vport
>> match.
>>>>
>>>> Why isn't security a consideration?
>>>>
>>> There is some misunderstanding here. "ANY_VPORT" is the approach (new
>> matching item without value) suggested by you.
>> I'm not talking about ANY_VPORT in this particular paragraph.
>>
>> There's item "represented_port" mentioned over there. I'm just asking about
>> this "already done by IPv4 item" bit. Yes, it matches on the header but not on
>> the true origin of the packet (the logical port of the NIC). If the app knows
>> which logical port the packet ingresses the NIC, why not match on it for
>> security?
>>
> A hint is not a match; it indicates how to manage the underlying steering resources.
> If "vport_orig" is present, the PMD will apply the steering logic only to vport traffic.
> The resource is allocated with the async table, before any rule is inserted. That already covers the security considerations.
> Matching on "represented_port" has to be programmed in each rule; consider a port range like indexes "5-10".
> The hint tells the PMD to take care only of traffic from vports, regardless of the port index.
>
>>> I was explaining we need to apply it to each flow rule even if it's only a flag
>> and no value.
>>
>> That's clear. But PSB.
>>
>>>>>
>>>>> 2. Add special flags into table_attr.
>>>>>
>>>>> template_table 0 create table_id 0 group 1 transfer vport_orig
>>>>>
>>>>> Approach 1 needs to specify the pattern in each flow rule which
>>>>> wastes memory and is not user friendly.
>>>>
>>>> What if the user has to insert a group of rules which not only have
>>>> the same set of match masks but also share exactly the same match
>>>> spec values for a limited subset of network items (for example, those
>>>> of an encap. header)? This way, a subset of network item specs can
>>>> remain fixed across many rules. Does that count as wasting memory?
>>>>
>>> Per my understanding, you are talking "multiple spec and mask mixing".
>>
>> Say, there's a group of rules, and each of them matches on exactly the same
>> encap. header (the same in all rules), but different internal match field values.
>> So, why don't these "fixed"
>> encap. header items deserve being "optimised" somehow, the same way as
>> this "wire_orig" does?
> We are back to the original point. The async approach tries to pre-allocate resources and speed up insertion.
> Resources are allocated at the async table stage, where we only have mask information.
> The matching values are passed in with each rule. I guess you are suggesting optimizing per matching value, right?
> That needs dynamic calculation for each rule and wastes resources in the async table (the table allocates resources for all possible values).
>>
>> If the application knows for sure that there will be packets with exactly the
>> same encap. header, - that forms this special knowledge that can be used
>> during init times to help the PMD optimise resource allocation.
>> Isn't that so? Don't these items deserve some form of a "hint"?
>>
> It may deserve some kind of "hint". But see above: these hints are per rule, and resource allocation happens before the rules.
That's not per rule. Perhaps I should've worded it differently.
Suppose, an application has to insert many flow rules, each
of which has match items A and B. Item A not only has the
same mask in *all* rule instances, but also the same spec.
On the other hand, item B only has the same mask in all
the rules, but its spec is different for each rule.
In this example, the application can allocate a template
with items A and B, but that only provides a fixed mask
for them. And the application will HAVE to provide item
A with exactly the same spec in all rule instances. The
PMD, in turn, will HAVE to process this item every time,
being unable to see it's in fact the same at all times.
To me, this sounds very similar to how you described the
need to always provide item ANY_VPORT in each rule thus
facing some waste of memory and parsing difficulties.
If the application knows that a certain item (or a certain
fraction of items) is going to be entirely the same (mask +
spec) across all the rules, why shouldn't it be able to
express this as a hint to the PMD? Why shouldn't it be
able to avoid providing such items in every new flow
rule instance? The same way the "vport_orig" works.
I'm not demanding that you re-implement or re-design this.
Just trying to find out whether such a problem can indeed
be acknowledged. Or has it been solved already? If not,
then perhaps it pays to just discuss whether solving
it can be combined with this "vport_orig" solution.
What do you think? What do others think?
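
To make the item A / item B example more tangible, here is a minimal sketch
in C. Everything in it is hypothetical: the structure and function names are
made up for illustration, each item is reduced to a single 32-bit field, and
nothing like "tmpl_fixed_a" exists in rte_flow today. It only shows that the
match decision stays the same when the constant spec of item A is recorded
once with the table instead of being repeated in every rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model: items A and B are one 32-bit field each. */

/* What a template provides today: masks only. */
struct tmpl_masks {
	uint32_t mask_a;
	uint32_t mask_b;
};

/* Today: every rule repeats spec A although it never changes. */
struct rule_full {
	uint32_t spec_a;
	uint32_t spec_b;
};

/* Hypothetical hint: the table records the constant spec A once. */
struct tmpl_fixed_a {
	struct tmpl_masks masks;
	uint32_t spec_a;
};

/* With such a hint, a rule would carry only the varying spec B. */
struct rule_b_only {
	uint32_t spec_b;
};

/* Match packet fields (a, b) when both specs come from the rule. */
static bool
match_full(const struct tmpl_masks *t, const struct rule_full *r,
	   uint32_t a, uint32_t b)
{
	return (a & t->mask_a) == (r->spec_a & t->mask_a) &&
	       (b & t->mask_b) == (r->spec_b & t->mask_b);
}

/* Same decision when spec A comes from the table instead of the rule. */
static bool
match_fixed_a(const struct tmpl_fixed_a *t, const struct rule_b_only *r,
	      uint32_t a, uint32_t b)
{
	return (a & t->masks.mask_a) == (t->spec_a & t->masks.mask_a) &&
	       (b & t->masks.mask_b) == (r->spec_b & t->masks.mask_b);
}
```

In this toy model the per-rule data shrinks from two fields to one. Whether
a real PMD could turn that into a saving in HW resources is exactly the open
question; the sketch does not claim that it can.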
>>> We provide a hint in this patch and no assumption on the matching patterns.
>>
>> So I understand. My point is, certain portions of matching patterns may be
>> "fixed" = entirely the same (masks and specs) in all rules of a table. Why not
>> give PMD a "hint" about them, too?
>>
>>> I think matching pattern is totally controlled by application layer.
>>
>> So is the "direction" spec: the app layer has item represented_port to control
>> that. But, still, we're here to discuss a hint for that.
>> Why does the new hint aim exclusively at optimising out this specific meta
>> item? Why isn't it possible to care about a generic portion of "know in
>> advance" all-the-same items?
> " generic portion of know in advance" is some still kind of dynamic approach, right?
> Imagine a situation: DPDK has 10 VFs, and each VF may have different VxLAN encap headers.
> This hint approach works for all 10 VFs at once.
> In public cloud deployments, each VF/SF may map to a different user, but the underlay is almost the same (GRE/VxLAN..., differing in field values).
>>
>>> "wasting memory " because your approach needs to scatter in each rule
>> while this patch only needs to set table_attr once.
>>> It has no relation to the matching pattern at all.
>>
>> The slight problem with your proposal is that for some reason only one type of
>> a match criterion deserves a hint moved to the attrs.
>> Whilst in reality the application may know in advance that certain subsets of
>> items will not only have the same masks in all rules but also totally the same
>> specs. If that is a valid use case, why doesn't it deserve the same (more
>> generic) optimisation / a hint? Just wondering...
>> Or has that been addressed already somehow?
>>
> Believe me, the hint already helps us save significant resources.
I'm not arguing it can be helpful. You're working round the clock
to offer a solution, - that's fine and is greatly appreciated.
But what I'm trying to say is that it looks like the problem
might manifest itself for other type of knowledge that also
may deserve a hint. Hence the questions. Hence the offer to
think of covering more match criteria, not just net/vport.
> In my view, your proposal is totally valid in the sync approach, but please check my response:
> the async approach tries to allocate resources in advance and speed up insertion as much as possible.
So if it's valid in sync approach, then why can't it be valid in the
async one? And I guess it can reflect positively on the insertion
rate, too. Why limit this "hint" approach to just one aspect then?
I'm sure we're close to understanding each other here.
Yes, "vport_orig" is just a one-bit knowledge, and
seems innocent to add as a hint, but why not make
it possible to have a hint for an arbitrary set
of always-the-same match criteria?
In this case, nobody will ever argue about whether a hint
is a match criterion or if it's not. It will be quite
a generic instrument, potentially useful to vendors.
I'm afraid I can't think of an immediate example of
such usefulness, but at least it will appear as
generic as possible from the API perspective.
>>>> If yes, then the problem does not concern just a single pair of
>>>> attributes, but rather deserves a more versatile solution like some
>>>> sort of indirect grouping of constant item specs.
>>>> Have you considered such options?
>>> See above.
>>>>
>>>>> This patch takes the 2nd approach and introduces one new member
>>>>> "specialize" into rte_flow_table_attr to indicate possible flow
>>>>> table optimization.
>>>>
>>>> The name "specialize" might have some drawbacks:
>>>> - spelling difference (specialise/specialize)
>>>> - in grep output, will mix with flows' "spec"
>>>> - quite long
>>>> - not a noun
>>>>
>>>> Why not "scope"? Or something like that?
>>>>
>>> It means special optimization to PMD. "scope" is more rogue.
>>
>> Why is it "rogue"? Scope is something limiting the point of view.
>> So are the suggested flags. Flag "wire_origin" (or whatever it can be named
>> eventually) limits the scope of matching. No?
>>
> Hint won't interfere with matching. It has no knowledge of matching.
Does specifying "orig_vport" actually provide a *choice* for the
packet origin? Does it filter out everything else? If yes,
then, alas, it *is* matching. Because matching is
choosing something of interest. Let's face it.
As I said above, I do acknowledge the fact that, for some vendors,
this match criterion internally goes to a different HW aspect
that is separate from matching on, for example, IPv4 addresses.
That's OK. But for some vendors, this might be just a regular
match criterion internally. So let's describe it with care.
> Instead, it only controls matching resources. "wire_orig" tells PMD to allocate HW resource for traffic from wire only.
If it controls "matching resources", it's indeed affiliated with
matching then. Look. When the application creates a template, it
tells the PMD that it is going to match on this, this and this.
Masks... No exact values; they will come at a later stage. But,
with this "wire_orig", the application tells the PMD that not
only will it match on *some* "direction", but it actually
provides a SPEC for that. If it indicates bit "wire_orig",
that equals setting a "mask" for the "direction enum"
AND a "spec" (WIRE). Isn't that the case?
If it is, then please see my above concerns about possibly having
similar need to provide exact-spec hints for other items as well.
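To make the mask/spec argument above concrete, here is a minimal C sketch.
It is illustrative only: the "direction enum", the HINT_* bits and all the
function names are hypothetical stand-ins, not DPDK definitions. It assumes
a PMD that honours the hint, and shows that a table-level hint then admits
exactly the same packets as a fully masked item with a fixed spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet origin ("direction enum"); not a DPDK type. */
enum pkt_origin { ORIGIN_WIRE = 0, ORIGIN_VPORT = 1 };

/* Stand-ins for the two hint bits proposed in the patch. */
#define HINT_WIRE_ORIG  (UINT32_C(1) << 0)
#define HINT_VPORT_ORIG (UINT32_C(1) << 1)

/* Classic masked match: only bits set in the mask are compared. */
static bool match_masked(uint32_t value, uint32_t spec, uint32_t mask)
{
	return (value & mask) == (spec & mask);
}

/* Table-level view: an honoured hint admits only one origin. */
static bool table_admits(uint32_t hint, enum pkt_origin origin)
{
	if (hint == HINT_WIRE_ORIG)
		return origin == ORIGIN_WIRE;
	if (hint == HINT_VPORT_ORIG)
		return origin == ORIGIN_VPORT;
	return true; /* no hint: both directions are admitted */
}

/* Item-level view: the same hint written as a full mask plus a fixed spec. */
static bool item_admits(uint32_t hint, enum pkt_origin origin)
{
	bool single = (hint == HINT_WIRE_ORIG || hint == HINT_VPORT_ORIG);
	uint32_t mask = single ? 1u : 0u;
	uint32_t spec = (hint == HINT_VPORT_ORIG) ? ORIGIN_VPORT : ORIGIN_WIRE;

	return match_masked((uint32_t)origin, spec, mask);
}

/* Both views agree for every hint/origin combination. */
static void check_equivalence(void)
{
	const uint32_t hints[] = { 0, HINT_WIRE_ORIG, HINT_VPORT_ORIG };
	const enum pkt_origin origins[] = { ORIGIN_WIRE, ORIGIN_VPORT };

	for (size_t h = 0; h < sizeof(hints) / sizeof(hints[0]); h++)
		for (size_t o = 0; o < sizeof(origins) / sizeof(origins[0]); o++)
			assert(table_admits(hints[h], origins[o]) ==
			       item_admits(hints[h], origins[o]));
}
```

Whether a given PMD implements the hint as a separate resource domain or as
a regular match field is vendor-specific; the sketch only covers the
behaviour visible to the application.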
> Then traffic from vport is silently ignored. The hint doesn't know what is matched or how many fields are involved.
>>>>>
>>>>> By default, there is no hint, so the behavior of the transfer domain
>>>>> doesn't change.
>>>>> There is no guarantee that the hint will be used by the PMD.
>>>>>
>>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
>>>>> Acked-by: Ori Kam <orika at nvidia.com>
>>>>>
>>>>> v2: Move the new field to template table attribute.
>>>>> v4: Mark it as optional and clear the concept.
>>>>> v5: Change specialize type to uint32_t.
>>>>> v6: Change the flags to macros and re-construct the commit log.
>>>>> v7: Fix build failure.
>>>>> ---
>>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
>>>>> doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
>>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
>>>>> lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
>>>>> 4 files changed, 71 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>>>>> index 88108498e0..62197f2618 100644
>>>>> --- a/app/test-pmd/cmdline_flow.c
>>>>> +++ b/app/test-pmd/cmdline_flow.c
>>>>> @@ -184,6 +184,8 @@ enum index {
>>>>> TABLE_INGRESS,
>>>>> TABLE_EGRESS,
>>>>> TABLE_TRANSFER,
>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>> + TABLE_TRANSFER_VPORT_ORIG,
>>>>> TABLE_RULES_NUMBER,
>>>>> TABLE_PATTERN_TEMPLATE,
>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
>>>>> TABLE_INGRESS,
>>>>> TABLE_EGRESS,
>>>>> TABLE_TRANSFER,
>>>>> + TABLE_TRANSFER_WIRE_ORIG,
>>>>> + TABLE_TRANSFER_VPORT_ORIG,
>>>>> TABLE_RULES_NUMBER,
>>>>> TABLE_PATTERN_TEMPLATE,
>>>>> TABLE_ACTIONS_TEMPLATE,
>>>>> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
>>>>> .next = NEXT(next_table_attr),
>>>>> .call = parse_table,
>>>>> },
>>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
>>>>> + .name = "wire_orig",
>>>>> + .help = "affect rule direction to transfer",
>>>>> + .next = NEXT(next_table_attr),
>>>>> + .call = parse_table,
>>>>> + },
>>>>> + [TABLE_TRANSFER_VPORT_ORIG] = {
>>>>> + .name = "vport_orig",
>>>>> + .help = "affect rule direction to transfer",
>>>>> + .next = NEXT(next_table_attr),
>>>>> + .call = parse_table,
>>>>> + },
>>>>> [TABLE_RULES_NUMBER] = {
>>>>> .name = "rules_number",
>>>>> .help = "number of rules in table",
>>>>> @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
>>>>> case TABLE_TRANSFER:
>>>>> out->args.table.attr.flow_attr.transfer = 1;
>>>>> return len;
>>>>> + case TABLE_TRANSFER_WIRE_ORIG:
>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>> + return -1;
>>>>> + out->args.table.attr.specialize =
>>>>> + RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
>>>>> + return len;
>>>>> + case TABLE_TRANSFER_VPORT_ORIG:
>>>>> + if (!out->args.table.attr.flow_attr.transfer)
>>>>> + return -1;
>>>>> + out->args.table.attr.specialize =
>>>>> + RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
>>>>> + return len;
>>>>> default:
>>>>> return -1;
>>>>> }
>>>>> diff --git a/doc/guides/prog_guide/rte_flow.rst
>>>>> b/doc/guides/prog_guide/rte_flow.rst
>>>>> index 3e6242803d..d9ca041ae4 100644
>>>>> --- a/doc/guides/prog_guide/rte_flow.rst
>>>>> +++ b/doc/guides/prog_guide/rte_flow.rst
>>>>> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
>>>>> &actions_templates, nb_actions_templ,
>>>>> &error);
>>>>>
>>>>> +Table Attribute: Specialize
>>>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> +
>>>>> +Application can help optimizing underlayer resources and insertion
>>>>> +rate by specializing template table.
>>>>> +Specialization is done by providing hints in the template table
>>>>> +attribute ``specialize``.
>>>>> +
>>>>> +This attribute is not mandatory for each PMD to implement.
>>>>> +If a hint is not supported, it will be silently ignored, and no
>>>>> +special optimization is done.
>>>>
>>>> Silently ignoring the field does not sit well with the application's
>>>> possible intent to drop represented_port match from the patterns.
>>>> From my point of view, if the application sets this attribute, it
>>>> believes it can rely on it, that is, packets coming from host won't
>>>> match if the attribute asks to match network only, for instance. Has this
>>>> been considered?
>>>>
>>>>> +
>>>>> +If a table is specialized, the application should make sure the
>>>>> +rules comply with the table attribute.
>>>>
>>>> How does the application enforce that? I would appreciate you explain it.
>>>>
>>>>> +
>>>>> Asynchronous operations
>>>>> -----------------------
>>>>>
>>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> index 96c5ae0fe4..b3238415f4 100644
>>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>>>> @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
>>>>>
>>>>> flow template_table {port_id} create
>>>>> [table_id {id}] [group {group_id}]
>>>>> - [priority {level}] [ingress] [egress] [transfer]
>>>>> + [priority {level}] [ingress] [egress]
>>>>> + [transfer [vport_orig] [wire_orig]]
>>>>> rules_number {number}
>>>>> pattern_template {pattern_template_id}
>>>>> actions_template {actions_template_id}
>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>> index 8858b56428..c27b48c5c1 100644
>>>>> --- a/lib/ethdev/rte_flow.h
>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>> @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
>>>>> */
>>>>> struct rte_flow_template_table;
>>>>>
>>>>> +/**@{@name Special optional flags for template table attribute
>>>>> + * Each bit is a hint for table specialization,
>>>>> + * offering a potential optimization at PMD layer.
>>>>> + * PMD can ignore the unsupported bits silently.
>>>>> + */
>>>>> +/**
>>>>> + * Specialize table for transfer flows which come only from wire.
>>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
>>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>>
>>>> You intended to spell "criterion", I take it. And still, it *is* a match criterion.
>>>> I'm not denying the possible need to have this criterion at the
>>>> earliest processing stage. That might be OK, but I still have a hunch
>>>> that this is too specific.
>>>> Please see my comment above about wasting memory.
>>>> I guess this type of criterion is not the only one that may need to
>>>> be provided as a "hint".
>>>>
>>>>> + * Flow rules which match non-wire originated traffic will be missed
>>>>> + * if the hint is supported.
>>>>
>>>> And what if it's unsupported? Is it indeed OK to silently ignore it?
>>>>
>>>>> + */
>>>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
>>>>
>>>> Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
>>>>
>>>> To me, TRANSFER looks redundant as this bit is already supposed to be
>>>> ticked in the "struct rte_flow_attr flow_attr" field of the "struct
>>>> rte_flow_template_table_attr".
>>>>
>>>>> +/**
>>>>> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
>>>>
>>>> And PF?
>>>>
>>>>> + * It allows PMD not to allocate resources for non-vport originated traffic.
>>>>> + * This bit is not a matching criteria, just an optimization hint.
>>>>> + * Flow rules which match non-vport originated traffic will be missed
>>>>> + * if the hint is supported.
>>>>> + */
>>>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
>>>>
>>>> Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
>>>>
>>>>> +/**@}*/
>>>>> +
>>>>> /**
>>>>> * @warning
>>>>> * @b EXPERIMENTAL: this API may change without prior notice.
>>>>> @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
>>>>> * Maximum number of flow rules that this table holds.
>>>>> */
>>>>> uint32_t nb_flows;
>>>>> + /**
>>>>> + * Optional hint flags for PMD optimization.
>>>>> + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
>>>>> + */
>>>>> + uint32_t specialize;
>>>>
>>>> Why not "scope" or something?
>>>>
>>>>> };
>>>>>
>>>>> /**
>>>>> --
>>>>> 2.27.0
>>>>>
>>>>
>>>> Thank you.
>>>
>
Thank you.
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-30 23:00 ` Ivan Malov
@ 2023-01-31 3:06 ` Rongwei Liu
2023-01-31 5:30 ` Ivan Malov
0 siblings, 1 reply; 96+ messages in thread
From: Rongwei Liu @ 2023-01-31 3:06 UTC (permalink / raw)
To: Ivan Malov
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Ivan
BR
Rongwei
> -----Original Message-----
> From: Ivan Malov <ivan.malov@arknetworks.am>
> Sent: Tuesday, January 31, 2023 07:00
> To: Rongwei Liu <rongweil@nvidia.com>
> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: RE: [PATCH v7] ethdev: add special flags when creating async transfer
> table
>
> Hi Rongwei,
>
> Thanks for the professional attitude.
> Hope this discussion gets us on the
> same page. Please see below.
Thanks for the suggestion and comments. Hope everything goes well.
>
> On Mon, 30 Jan 2023, Rongwei Liu wrote:
>
> > HI Ivan
> >
> > BR
> > Rongwei
> >
> >> -----Original Message-----
> >> From: Ivan Malov <ivan.malov@arknetworks.am>
> >> Sent: Monday, January 30, 2023 15:40
> >> To: Rongwei Liu <rongweil@nvidia.com>
> >> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> >> <rasland@nvidia.com>
> >> Subject: RE: [PATCH v7] ethdev: add special flags when creating async
> >> transfer table
> >>
> >> Hi Rongwei,
> >>
> >> For my responses, PSB.
> >>
> >> By the way, now you mention things like wasting memory and insertion
> >> optimisations, are there any comparative figures to see the effect
> >> of this hint on insertion performance / memory footprint?
> >> Some "before" / "after" examples would really be helpful.
> >>
> > Good to hear we have almost reached agreement.
>
> Very well.
>
> The key point here is that one may agree that some optimisations are indeed
> needed, yes. I don't deny the fact that some vendors might have issues with
> how the API maps to the HW capabilities.
> Yes, some undesirable resource overhead shall be avoided, but the high level
> hints for that have to be designed with care.
>
Totally agree. That's why we emphasize "optional for PMD" and "application should take care of the hint".
> > First, the hint has nothing to do with matching; it only affects PMD resource
> > management.
>
> You say "PMD resource management". For the flow management, that's
> mostly vendor-specific, I take it. Let me explain. The application, for instance,
> can control the number of Tx descriptors in the queue during setup stage.
> Tx descriptors are a common type of HW resource, hence the explicit control
> for it available to applications. For flow library, it's not like that. Different
> vendors have different "underlayer"
> representations, they may vary drastically.
The resource I mentioned is about "steering logic", not the SW datapath.
With flow rule offloading, the hardware has to store the steering logic in memory it can reach, whether that memory is embedded in the chip or mapped from the host OS.
>
> I take it, in the case of the HW you're working with, this hint indeed maps to
> something that is entirely resource-related and which does not belong in this
> specific vendor's match criteria. I 100% understand that, in your case, these
> are separate. But the point is that, on the high-level programming level
> (vendor-neutral), such a hint is in fact a match criterion. Because it tells the
> driver to limit the scope of matching to just "from net"/"from vport", the same
> way other metadata items do (represented_port).
> The only difference is that it refers to a group of unspecified ports which have
> something in common.
>
" a group of unspecified ports" means dynamic and flexible, right. IMO it's valid and fits sync flow perfectly.
But in async, when allocating resources (table creation), the group info is still unknown. We don't want to scatter it into each rule insertion.
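As a rough illustration of the "set once at table creation" idea, here is a
minimal C sketch. The struct and macros below are local stand-ins that only
mirror the names proposed in this patch (the real ones would live in
rte_flow.h), and table_attr_set_hint() is a hypothetical helper that copies
the check testpmd does in parse_table(): the hint is accepted only for
transfer tables. The rules_number of 128 is taken from the VxLAN example.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the names proposed in this patch. */
#define SPECIALIZE_TRANSFER_WIRE_ORIG  (UINT32_C(1) << 0)
#define SPECIALIZE_TRANSFER_VPORT_ORIG (UINT32_C(1) << 1)

/* Simplified stand-in for the template table attribute. */
struct table_attr {
	unsigned int transfer:1; /* stands in for flow_attr.transfer */
	uint32_t nb_flows;       /* maximum number of rules in the table */
	uint32_t specialize;     /* optional hint bits, 0 means no hint */
};

/*
 * Hypothetical helper: record the hint once, at table creation time.
 * Returns 0 on success, -1 if the table is not a transfer table.
 */
static int table_attr_set_hint(struct table_attr *attr, uint32_t hint)
{
	if (!attr->transfer)
		return -1;
	attr->specialize = hint;
	return 0;
}

/* Two tables as in the VxLAN example: decap from wire, encap from vports. */
static void vxlan_tables_example(void)
{
	struct table_attr decap = { .transfer = 1, .nb_flows = 128 };
	struct table_attr encap = { .transfer = 1, .nb_flows = 128 };

	assert(table_attr_set_hint(&decap, SPECIALIZE_TRANSFER_WIRE_ORIG) == 0);
	assert(table_attr_set_hint(&encap, SPECIALIZE_TRANSFER_VPORT_ORIG) == 0);

	/* Rules inserted later carry no direction item; the table attr has it. */
	assert(decap.specialize == SPECIALIZE_TRANSFER_WIRE_ORIG);
	assert(encap.specialize == SPECIALIZE_TRANSFER_VPORT_ORIG);
}
```

The point of the sketch is only where the information lives: in the table
attribute, known before any rule is inserted, rather than repeated in every
rule's pattern.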
> So, although I don't strongly object having some hints like this one in the
> generic API, I nevertheless disagree with describing this as just "resource-
> specific" and not being a match criterion. It's just not always the case. It might
> not be valid for *all* NIC vendors.
>
Agree, not valid for *all* NIC vendors.
> > In my local test, it can save around 50% memory in the VxLAN encap/decap
> > example case.
>
> Forgive me in case this has been already discussed; where's that memory?
> I mean, is it some sort of general-purpose memory? Or some HW-specific
> table capacity overhead? I'm trying to understand how the feature will be
> useful to other vendors, or how common this problem is.
>
See above. HW always needs memory to store offloaded rules, whether embedded in the chip or borrowed from the OS.
> > Insertion rate shows only a very small improvement.
> >> After all, I'm not objecting to this patch. But I believe that other reviewers'
> >> concerns should nevertheless be addressed anyway.
> > Let me try to show the hint is useful.
> >>
> >> On Mon, 30 Jan 2023, Rongwei Liu wrote:
> >>
> >>> Hi Ivan,
> >>>
> >>> BR
> >>> Rongwei
> >>>
> >>>> -----Original Message-----
> >>>> From: Ivan Malov <ivan.malov@arknetworks.am>
> >>>> Sent: Monday, January 30, 2023 08:00
> >>>> To: Rongwei Liu <rongweil@nvidia.com>
> >>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
> >>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
> >>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
> >>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
> >>>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
> >>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
> >>>> <rasland@nvidia.com>
> >>>> Subject: Re: [PATCH v7] ethdev: add special flags when creating
> >>>> async transfer table
> >>>>
> >>>> Hi Rongwei,
> >>>>
> >>>> Thanks for persevering. I have no strong opinion, but, at least,
> >>>> the fact that the new flags are no longer meant for use in
> >>>> rte_flow_attr, which is clearly not the right place for such, is an
> >>>> improvement.
> >>>>
> >>> Thanks for the suggestion, moved it to rte_flow_table_attr now and it's
> >>> dedicated to the async API.
> >>>> However, let's take a closer look at the current patch, shall we?
> >>>>
> >>>> But, before we get to that, I'd like to kindly request that you
> >>>> provide a more concrete example of how this feature is supposed to
> >>>> be used. Are there some real-life application examples?
> >>>>
> >>> Sure.
> >>>> Also, to me, it's still unclear how an application can obtain the
> >>>> knowledge of this hint in the first instance. For example, can Open
> >>>> vSwitch somehow tell ethdevs representing physical ports from ones
> >>>> representing "vports" (host endpoints)?
> >>>> How does it know which attribute to specify?
> >>>>
> >>> The hint should be initiated by the application, and the application knows its
> >>> traffic pattern, which highly relates to deployment.
> >>> Let's use VxLAN encap/decap as an example:
> >>> 1. Traffic from wire should match the VxLAN pattern and be decapsulated, then
> >>> sent to different vports.
> >>> flow pattern_template 0 create transfer relaxed no pattern_template_id 4 template represented_port ethdev_port_id is 0 / eth / ipv4 / udp / vxlan / tag index is 0 data is 0x33 / end
> >>> flow actions_template 0 create transfer actions_template_id 4 template raw_decap index 0 / represented_port ethdev_port_id 1 / end mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
> >>> flow template_table 0 create group 1 priority 0 transfer wire_orig table_id 4 rules_number 128 pattern_template 4 actions_template 4
> >>>
> >>> 2. Traffic from vports should be encapsulated with different VxLAN headers
> >>> and sent to wire.
> >>> flow actions_template 1 create transfer actions_template_id 5 template raw_encap index 0 / represented_port ethdev_port_id 0 / end mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
> >>> flow template_table 0 create group 1 priority 0 transfer vport_orig table_id 5 rules_number 128 pattern_template 4 actions_template 5
> >>>
> >>>> For the rest of my notes, PSB.
> >>>>
> >>>> On Mon, 14 Nov 2022, Rongwei Liu wrote:
> >>>>
> >>>>> In case flow rules match only one kind of traffic in a flow table,
> >>>>> then optimization can be done via allocation of this table.
> >>>>
> >>>> This wording might confuse readers. Consider rephrasing it, please:
> >>>> If multiple flow rules share a common set of match masks, then they
> >>>> might belong in a flow table which can be pre-allocated.
> >>>>
> >>>>> Such optimization is possible only if the application gives a hint
> >>>>> about its usage of the table during initial configuration.
> >>>>>
> >>>>> The transfer domain rules may process traffic from wire or vport,
> >>>>> which may correspond to two kinds of underlayer resources.
> >>>>
> >>>> Why name it a "vport"? Why not "host"?
> >>>>
> >>>> host = packets generated by any of the host's "vport"s wire =
> >>>> packets arriving at the NIC from the network
> >>> Vport is "virtual port" for short and contains "VF/SF" for now.
> >>> Per my thoughts, it's clearer and maps to DPDK port probing/management.
> >>
> >> I understand that "host" might not be a brilliant name.
> >>
> >> If "vport" stands for every port of the NIC that is not a network
> >> port, then this name might be OK to me, but why doesn't it cover PFs?
> >> A PF is clearly not a network / physical port. Why just VF/SF then? Where
> >> does that "for now" decision come from? Just wondering.
> >>
> > "For now" stands for my understanding. DPDK is always in evolution, right?
> > You are right, PF should be included in the "vport" concept.
> >>>>
> >>>>> That's why the first two hints introduced in this patch are about
> >>>>> wire and vport traffic specialization.
> >>>>> Wire means traffic arrives from the uplink port while vport means
> >>>>> traffic initiated from VF/SF.
> >>>>
> >>>> By the sound of it, the meaning is confined to just VFs/SFs.
> >>>> What if the user wants to match packets coming from PFs?
> >>>>
> >>> It should be "wire_orig".
> >>
> >> Forgive me, but that does not sound correct. Say, there's an
> >> application and it has a PF plugged into it: ethdev index 0. And the
> >> application transmits packets using rte_eth_tx_burst() from that port.
> >> You say that these packets can be matched via "wire_orig".
> >> But they do not come from the wire. They come from PF...
> > Hmm. My mistake.
> > This may highly depend on PMD implementation. Basically, PFs' traffic
> > may contain "from wire"/"wire_orig" and "from local"/"vport_orig".
> > That's why we emphasize it's optional for PMD.
> >>
> >>>>>
> >>>>> There are two possible approaches for providing the hints.
> >>>>> Using IPv4 as an example:
> >>>>> 1. Use pattern item in both template table and flow rules.
> >>>>>
> >>>>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
> >>>>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
> >>>>>
> >>>>> "ANY_VPORT" needs to be present in each flow rule even if it's
> >>>>> just a hint. No value to match because matching is already done by
> >>>>> IPv4 item.
> >>>>
> >>>> Why no value to match on? How does it prevent rogue tenants from
> >>>> spoofing network headers? If the application receives a packet on a
> >>>> particular vport's representor, then it may strictly specify item
> >>>> represented_port pointing to that vport so that only packets from
> >>>> that vport
> >> match.
> >>>>
> >>>> Why isn't security a consideration?
> >>>>
> >>> There is some misunderstanding here. "ANY_VPORT" is the approach
> >>> (new matching item without value) suggested by you.
> >> I'm not talking about ANY_VPORT in this particular paragraph.
> >>
> >> There's item "represented_port" mentioned over there. I'm just asking
> >> about this "already done by IPv4 item" bit. Yes, it matches on the
> >> header but not on the true origin of the packet (the logical port of
> >> the NIC). If the app knows which logical port the packet ingresses
> >> the NIC, why not match on it for security?
> >>
> > A hint is not a match; it implies how to manage the underlayer steering
> > resource.
> > If "vport_orig" is present, the PMD will only apply the steering logic to vport
> > traffic.
> > The resource is allocated in the async table before each rule. That already
> > covers security considerations.
> > Matching on "represented_port" needs to program each rule, considering a
> > port range like index "5-10".
> > The hint tells the PMD only to take care of traffic from vports, regardless of
> > the port index.
> >
> >>> I was explaining we need to apply it to each flow rule even if it's
> >>> only a flag and no value.
> >>
> >> That's clear. But PSB.
> >>
> >>>>>
> >>>>> 2. Add special flags into table_attr.
> >>>>>
> >>>>> template_table 0 create table_id 0 group 1 transfer vport_orig
> >>>>>
> >>>>> Approach 1 needs to specify the pattern in each flow rule which
> >>>>> wastes memory and is not user friendly.
> >>>>
> >>>> What if the user has to insert a group of rules which not only have
> >>>> the same set of match masks but also share exactly the same match
> >>>> spec values for a limited subset of network items (for example,
> >>>> those of an encap. header)? This way, a subset of network item
> >>>> specs can remain fixed across many rules. Does that count as wasting
> >>>> memory?
> >>>>
> >>> Per my understanding, you are talking "multiple spec and mask mixing".
> >>
> >> Say, there's a group of rules, and each of them matches on exactly
> >> the same encap. header (the same in all rules), but different internal match
> >> field values.
> >> So, why don't these "fixed"
> >> encap. header items deserve being "optimised" somehow, the same way
> >> as this "wire_orig" does?
> > We are back to the original point. The async approach is trying to pre-allocate
> > resources and speed up insertion.
> > Resources are allocated at the async table stage, where we only have mask
> > information.
> > The matching value is passed in with each rule. I guess you are suggesting to
> > optimize per different matching values, right?
> > This needs dynamic calculations for each rule and wastes resources in the
> > async table (the table allocates resources for all possible values).
> >>
> >> If the application knows for sure that there will be packets with
> >> exactly the same encap. header, - that forms this special knowledge
> >> that can be used during init times to help the PMD optimise resource
> >> allocation.
> >> Isn't that so? Don't these items deserve some form of a "hint"?
> >>
> > It can deserve some kind of "hint". But see above, these hints are per rule
> > and resource allocation happens before rules.
>
> That's not per rule. Perhaps I should've worded it differently.
>
> Suppose, an application has to insert many flow rules, each of which has
> match items A and B. Item A not only has the same mask in *all* rule
> instances, but also the same spec.
> On the other hand, item B only has the same mask in all the rules, but its spec
> is different for each rule.
>
> In this example, the application can allocate a template with items A and B,
> but that only provides a fixed mask for them. And the application will HAVE to
> provide item A with exactly the same spec in all rule instances. The PMD, in
> turn, will HAVE to process this item every time, being unable to see it's in fact
> the same at all times.
>
> To me, this sounds very similar to how you described the need to always
> provide item ANY_VPORT in each rule thus facing some waste of memory and
> parsing difficulties.
>
> If the application knows that a certain item (or a certain fraction of items) is
> going to be entirely the same (mask +
> spec) across all the rules, why shouldn't it be able to express this as a hint to
> the PMD? Why shouldn't it be able to avoid providing such items in every new
> flow rule instance? The same way the "vport_orig" works.
>
> I'm not demanding that you re-implement or re-design this.
> Just trying to find out whether such a problem can indeed be acknowledged.
> Or has it been solved already? If not, then perhaps it pays to just discuss
> whether solving it can be combined with this "vport_orig" solution.
>
> What do you think? What do others think?
>
> >>> We provide a hint in this patch and no assumption on the matching
> >>> patterns.
> >>
> >> So I understand. My point is, certain portions of matching patterns
> >> may be "fixed" = entirely the same (masks and specs) in all rules of
> >> a table. Why not give PMD a "hint" about them, too?
> >>
> >>> I think matching pattern is totally controlled by application layer.
> >>
> >> So is the "direction" spec: the app layer has item represented_port
> >> to control that. But, still, we're here to discuss a hint for that.
> >> Why does the new hint aim exclusively at optimising out this specific
> >> meta item? Why isn't it possible to care about a generic portion of
> >> "know in advance" all-the-same items?
> > "Generic portion of know in advance" is still some kind of dynamic approach,
> > right?
> > Imagine a situation. DPDK has 10 VFs, and each VF may have different VxLAN
> > encap headers.
> > This hint approach can work for all 10 VFs at once.
> > In public cloud deployments, each VF/SF may map to a different user, but the
> > underlay is almost the same (GRE/VxLAN... differing in field values).
> >>
> >>> "Wasting memory" because your approach needs to scatter it into each
> >>> rule, while this patch only needs to set table_attr once.
> >>> It has no relation to the matching pattern at all.
> >>
> >> The slight problem with your proposal is that for some reason only
> >> one type of a match criterion deserves a hint moved to the attrs.
> >> Whilst in reality the application may know in advance that certain
> >> subsets of items will not only have the same masks in all rules but
> >> also totally the same specs. If that is a valid use case, why doesn't
> >> it deserve the same (more
> >> generic) optimisation / a hint? Just wondering...
> >> Or has that been addressed already somehow?
> >>
> > Believe me, the hint helps us to save significant resources already.
>
> I'm not arguing it can be helpful. You're working round the clock to offer a
> solution, - that's fine and is greatly appreciated.
> But what I'm trying to say is that it looks like the problem might manifest itself
> for other type of knowledge that also may deserve a hint. Hence the questions.
> Hence the offer to think of covering more match criteria, not just net/vport.
>
> > In my view, your proposal is totally valid in the sync approach, but
> > please check my response: async is trying to allocate resources in advance
> > and speed up insertion ASAP.
>
> So if it's valid in sync approach, then why can't it be valid in the async one?
> And I guess it can reflect positively on the insertion rate, too. Why limit this
> "hint" approach to just one aspect then?
>
> I'm sure we're close to understanding each other here.
> Yes, "orig_vport" is just a one-bit knowledge, and seems innocent to add as a
> hint, but why not make it possible to have a hint for an arbitrary set of always-
> the-same match criteria?
>
> In this case, nobody will ever argue of whether a hint is a match criterion or if
> it's not. It will be quite a generic instrument, potentially useful to vendors.
> I'm afraid I can't think of an immediate example of such usefulness, but at
> least it will appear as generic as possible from the API perspective.
>
> >>>> If yes, then the problem does not concern just a single pair of
> >>>> attributes, but rather deserves a more versatile solution like some
> >>>> sort of indirect grouping of constant item specs.
> >>>> Have you considered such options?
> >>> See above.
> >>>>
> >>>>> This patch takes the 2nd approach and introduces one new member
> >>>>> "specialize" into rte_flow_table_attr to indicate possible flow
> >>>>> table optimization.
> >>>>
> >>>> The name "specialize" might have some drawbacks:
> >>>> - spelling difference (specialise/specialize)
> >>>> - in grep output, will mix with flows' "spec"
> >>>> - quite long
> >>>> - not a noun
> >>>>
> >>>> Why not "scope"? Or something like that?
> >>>>
> >>> It means special optimization to PMD. "scope" is more rogue.
> >>
> >> Why is it "rogue"? Scope is something limiting the point of view.
> >> So are the suggested flags. Flag "wire_origin" (or whatever it can be
> >> named
> >> eventually) limits the scope of matching. No?
> >>
> > Hint won't interfere with matching. It has no knowledge of matching.
>
> Does specifying "orig_vport" actually provide a *choice* for the packet origin?
> Does it filter out everything else? If yes, then, alas, it *is* matching. Because
> matching is choosing something of interest. Let's face it.
>
> As I said above, I do acknowledge the fact that, for some vendors, this match
> criterion, internally goes to a different HW aspect that is separate from
> matching on, for example, IPv4 addresses.
> That's OK. But for some vendors, this might be just a regular match criterion
> internally. So let's describe it with care.
>
> > Instead, it only controls matching resources. "wire_orig" tells PMD to
> allocate HW resource for traffic from wire only.
>
> If it controls "matching resources", it's indeed affiliated with matching then.
> Look. When the application creates a template, it tells the PMD that it is going
> to match on this, this and this.
> Masks... No exact values; they will come at a later stage. But, with this
> "wire_orig", the application tells the PMD that not only will it match on
> *some* "direction", but it actually provides a SPEC for that. If it indicates bit
> "wire_orig", that equals setting a "mask" for the "direction enum"
> AND a "spec" (WIRE). Isn't that the case?
>
> If it is, then please see my above concerns about possibly having similar need
> to provide exact-spec hints for other items as well.
>
> > Then traffic from vport is silently ignored. The hint doesn't know what is
> matched or how many fields are involved.
> >>>>>
> >>>>> By default, there is no hint, so the behavior of the transfer
> >>>>> domain doesn't change.
> >>>>> There is no guarantee that the hint will be used by the PMD.
> >>>>>
> >>>>> Signed-off-by: Rongwei Liu <rongweil at nvidia.com>
> >>>>> Acked-by: Ori Kam <orika at nvidia.com>
> >>>>>
> >>>>> v2: Move the new field to template table attribute.
> >>>>> v4: Mark it as optional and clear the concept.
> >>>>> v5: Change specialize type to uint32_t.
> >>>>> v6: Change the flags to macros and re-construct the commit log.
> >>>>> v7: Fix build failure.
> >>>>> ---
> >>>>> app/test-pmd/cmdline_flow.c | 26 +++++++++++++++++++
> >>>>> doc/guides/prog_guide/rte_flow.rst | 15 +++++++++++
> >>>>> doc/guides/testpmd_app_ug/testpmd_funcs.rst | 3 ++-
> >>>>> lib/ethdev/rte_flow.h | 28 +++++++++++++++++++++
> >>>>> 4 files changed, 71 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/app/test-pmd/cmdline_flow.c
> >>>>> b/app/test-pmd/cmdline_flow.c index 88108498e0..62197f2618 100644
> >>>>> --- a/app/test-pmd/cmdline_flow.c
> >>>>> +++ b/app/test-pmd/cmdline_flow.c
> >>>>> @@ -184,6 +184,8 @@ enum index {
> >>>>> TABLE_INGRESS,
> >>>>> TABLE_EGRESS,
> >>>>> TABLE_TRANSFER,
> >>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>> + TABLE_TRANSFER_VPORT_ORIG,
> >>>>> TABLE_RULES_NUMBER,
> >>>>> TABLE_PATTERN_TEMPLATE,
> >>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>> @@ -1158,6 +1160,8 @@ static const enum index next_table_attr[] = {
> >>>>> TABLE_INGRESS,
> >>>>> TABLE_EGRESS,
> >>>>> TABLE_TRANSFER,
> >>>>> + TABLE_TRANSFER_WIRE_ORIG,
> >>>>> + TABLE_TRANSFER_VPORT_ORIG,
> >>>>> TABLE_RULES_NUMBER,
> >>>>> TABLE_PATTERN_TEMPLATE,
> >>>>> TABLE_ACTIONS_TEMPLATE,
> >>>>> @@ -2933,6 +2937,18 @@ static const struct token token_list[] = {
> >>>>> .next = NEXT(next_table_attr),
> >>>>> .call = parse_table,
> >>>>> },
> >>>>> + [TABLE_TRANSFER_WIRE_ORIG] = {
> >>>>> + .name = "wire_orig",
> >>>>> + .help = "affect rule direction to transfer",
> >>>>> + .next = NEXT(next_table_attr),
> >>>>> + .call = parse_table,
> >>>>> + },
> >>>>> + [TABLE_TRANSFER_VPORT_ORIG] = {
> >>>>> + .name = "vport_orig",
> >>>>> + .help = "affect rule direction to transfer",
> >>>>> + .next = NEXT(next_table_attr),
> >>>>> + .call = parse_table,
> >>>>> + },
> >>>>> [TABLE_RULES_NUMBER] = {
> >>>>> .name = "rules_number",
> >>>>> .help = "number of rules in table",
> >>>>> @@ -8993,6 +9009,16 @@ parse_table(struct context *ctx, const struct token *token,
> >>>>> case TABLE_TRANSFER:
> >>>>> out->args.table.attr.flow_attr.transfer = 1;
> >>>>> return len;
> >>>>> + case TABLE_TRANSFER_WIRE_ORIG:
> >>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>> + return -1;
> >>>>> + out->args.table.attr.specialize =
> >>>>> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG;
> >>>>> + return len;
> >>>>> + case TABLE_TRANSFER_VPORT_ORIG:
> >>>>> + if (!out->args.table.attr.flow_attr.transfer)
> >>>>> + return -1;
> >>>>> + out->args.table.attr.specialize =
> >>>>> RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
> >>>>> + return len;
> >>>>> default:
> >>>>> return -1;
> >>>>> }
> >>>>> diff --git a/doc/guides/prog_guide/rte_flow.rst
> >>>>> b/doc/guides/prog_guide/rte_flow.rst
> >>>>> index 3e6242803d..d9ca041ae4 100644
> >>>>> --- a/doc/guides/prog_guide/rte_flow.rst
> >>>>> +++ b/doc/guides/prog_guide/rte_flow.rst
> >>>>> @@ -3605,6 +3605,21 @@ and pattern and actions templates are created.
> >>>>> &actions_templates, nb_actions_templ,
> >>>>> &error);
> >>>>>
> >>>>> +Table Attribute: Specialize
> >>>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>>>> +
> >>>>> +Application can help optimizing underlayer resources and
> >>>>> +insertion rate by specializing template table.
> >>>>> +Specialization is done by providing hints in the template table
> >>>>> +attribute ``specialize``.
> >>>>> +
> >>>>> +This attribute is not mandatory for each PMD to implement.
> >>>>> +If a hint is not supported, it will be silently ignored, and no
> >>>>> +special optimization is done.
> >>>>
> >>>> Silently ignoring the field does not sit well with the
> >>>> application's possible intent to drop represented_port match from the
> patterns.
> >>>> From my point of view, if the application sets this attribute, it
> >>>> believes it can rely on it, that is, packets coming from host won't
> >>>> match if the attribute asks to match network only, for instance.
> >>>> Has this been considered?
> >>>>
> >>>>> +
> >>>>> +If a table is specialized, the application should make sure the
> >>>>> +rules comply with the table attribute.
> >>>>
> >>>> How does the application enforce that? I would appreciate you explain it.
> >>>>
> >>>>> +
> >>>>> Asynchronous operations
> >>>>> -----------------------
> >>>>>
> >>>>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> index 96c5ae0fe4..b3238415f4 100644
> >>>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> >>>>> @@ -3145,7 +3145,8 @@ It is bound to ``rte_flow_template_table_create()``::
> >>>>>
> >>>>> flow template_table {port_id} create
> >>>>> [table_id {id}] [group {group_id}]
> >>>>> - [priority {level}] [ingress] [egress] [transfer]
> >>>>> + [priority {level}] [ingress] [egress]
> >>>>> + [transfer [vport_orig] [wire_orig]]
> >>>>> rules_number {number}
> >>>>> pattern_template {pattern_template_id}
> >>>>> actions_template {actions_template_id}
> >>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>>>> index 8858b56428..c27b48c5c1 100644
> >>>>> --- a/lib/ethdev/rte_flow.h
> >>>>> +++ b/lib/ethdev/rte_flow.h
> >>>>> @@ -5186,6 +5186,29 @@ rte_flow_actions_template_destroy(uint16_t port_id,
> >>>>> */
> >>>>> struct rte_flow_template_table;
> >>>>>
> >>>>> +/**@{@name Special optional flags for template table attribute
> >>>>> + * Each bit is a hint for table specialization,
> >>>>> + * offering a potential optimization at PMD layer.
> >>>>> + * PMD can ignore the unsupported bits silently.
> >>>>> + */
> >>>>> +/**
> >>>>> + * Specialize table for transfer flows which come only from wire.
> >>>>> + * It allows PMD not to allocate resources for non-wire originated traffic.
> >>>>> + * This bit is not a matching criteria, just an optimization hint.
> >>>>
> >>>> You intended to spell "criterion", I take it. And still, it *is* a match
> criterion.
> >>>> I'm not denying the possible need to have this criterion at the
> >>>> earliest processing stage. That might be OK, but I still have a
> >>>> hunch that this is too specific.
> >>>> Please see my comment above about wasting memory.
> >>>> I guess this type of criterion is not the only one that may need to
> >>>> be provided as a "hint".
> >>>>
> >>>>> + * Flow rules which match non-wire originated traffic will be missed
> >>>>> + * if the hint is supported.
> >>>>
> >>>> And what if it's unsupported? Is it indeed OK to silently ignore it?
> >>>>
> >>>>> + */
> >>>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG RTE_BIT32(0)
> >>>>
> >>>> Why not RTE_FLOW_TABLE_SCOPE_FROM_WIRE ?
> >>>>
> >>>> To me, TRANSFER looks redundant as this bit is already supposed to
> >>>> be ticked in the "struct rte_flow_attr flow_attr" field of the
> >>>> "struct rte_flow_template_table_attr".
> >>>>
> >>>>> +/**
> >>>>> + * Specialize table for transfer flows which come only from vport (e.g. VF, SF).
> >>>>
> >>>> And PF?
> >>>>
> >>>>> + * It allows PMD not to allocate resources for non-vport originated traffic.
> >>>>> + * This bit is not a matching criteria, just an optimization hint.
> >>>>> + * Flow rules which match non-vport originated traffic will be missed
> >>>>> + * if the hint is supported.
> >>>>> + */
> >>>>> +#define RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG RTE_BIT32(1)
> >>>>
> >>>> Why not RTE_FLOW_TABLE_SCOPE_FROM_HOST ?
> >>>>
> >>>>> +/**@}*/
> >>>>> +
> >>>>> /**
> >>>>> * @warning
> >>>>> * @b EXPERIMENTAL: this API may change without prior notice.
> >>>>> @@ -5201,6 +5224,11 @@ struct rte_flow_template_table_attr {
> >>>>> * Maximum number of flow rules that this table holds.
> >>>>> */
> >>>>> uint32_t nb_flows;
> >>>>> + /**
> >>>>> + * Optional hint flags for PMD optimization.
> >>>>> + * Value is composed with RTE_FLOW_TABLE_SPECIALIZE_*.
> >>>>> + */
> >>>>> + uint32_t specialize;
> >>>>
> >>>> Why not "scope" or something?
> >>>>
> >>>>> };
> >>>>>
> >>>>> /**
> >>>>> --
> >>>>> 2.27.0
> >>>>>
> >>>>
> >>>> Thank you.
> >>>
> >
>
> Thank you.
^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v7] ethdev: add special flags when creating async transfer table
2023-01-31 3:06 ` Rongwei Liu
@ 2023-01-31 5:30 ` Ivan Malov
2023-01-31 6:14 ` Rongwei Liu
2023-02-01 10:12 ` Thomas Monjalon
0 siblings, 2 replies; 96+ messages in thread
From: Ivan Malov @ 2023-01-31 5:30 UTC (permalink / raw)
To: Rongwei Liu
Cc: Matan Azrad, Slava Ovsiienko, Ori Kam,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Aman Singh, Yuying Zhang, Ferruh Yigit, Andrew Rybchenko, dev,
Raslan Darawsheh
Hi Rongwei,
OK, I hear ya. Thanks for persevering.
I still hope the community will comment on the possibility to
provide a hint mechanism for always-the-same match items,
with the perspective of becoming more versatile. Other
than that, your current patch might be OK, but, again,
I think other reviewers' comments (if any) shall
be addressed. But no strong objections from me.
By the way, for this "specialise" field, in your opinion,
which extra flags could emerge in future / would be nice
to have? I mean, is there any concept of what can be
added to this field's namespace and what can't be?
Thank you.
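
To make the semantics debated in this thread concrete, here is a minimal sketch in C, not the patch itself. It assumes a reduced stand-in for struct rte_flow_template_table_attr and local copies of the two proposed flags; every SKETCH_/sketch_ name is hypothetical. The application-side helper mirrors the testpmd parse_table() check from the patch, and the driver-side helper models "PMD can ignore the unsupported bits silently" as a plain mask, which is an assumption about how a driver might do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for RTE_BIT32() and the two flags proposed in the patch.
 * The SKETCH_ names are hypothetical and exist only for this example. */
#define SKETCH_BIT32(nr) (UINT32_C(1) << (nr))
#define SKETCH_SPECIALIZE_TRANSFER_WIRE_ORIG  SKETCH_BIT32(0)
#define SKETCH_SPECIALIZE_TRANSFER_VPORT_ORIG SKETCH_BIT32(1)

/* Hypothetical, reduced stand-in for struct rte_flow_template_table_attr. */
struct sketch_table_attr {
	int transfer;        /* models flow_attr.transfer */
	uint32_t nb_flows;   /* maximum number of rules in the table */
	uint32_t specialize; /* optional hint bits */
};

/* Application side: mirrors the testpmd parse_table() check in the patch,
 * which rejects a direction hint unless "transfer" was already given. */
static int
sketch_set_hint(struct sketch_table_attr *attr, uint32_t hint)
{
	assert(attr != NULL);
	if (!attr->transfer)
		return -1;
	attr->specialize = hint;
	return 0;
}

/* Driver side (assumption): keep the hint bits the driver understands and
 * silently drop the rest, as the proposed documentation allows. */
static uint32_t
sketch_effective_hints(const struct sketch_table_attr *attr, uint32_t supported)
{
	assert(attr != NULL);
	return attr->specialize & supported;
}
```

In this model a driver that supports neither bit ends up with zero effective hints, so the table behaves like a plain transfer table. That is the case the question about silently ignored hints is concerned with: the application cannot tell from the attribute alone whether the restriction was applied.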
On Tue, 31 Jan 2023, Rongwei Liu wrote:
> HI Ivan
>
> BR
> Rongwei
>
>> -----Original Message-----
>> From: Ivan Malov <ivan.malov@arknetworks.am>
>> Sent: Tuesday, January 31, 2023 07:00
>> To: Rongwei Liu <rongweil@nvidia.com>
>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>> <rasland@nvidia.com>
>> Subject: RE: [PATCH v7] ethdev: add special flags when creating async transfer
>> table
>>
>> External email: Use caution opening links or attachments
>>
>>
>> Hi Rongwei,
>>
>> Thanks for the professional attitude.
>> Hope this discussion gets us on the
>> same page. Please see below.
> Thanks for the suggestion and comments. Hope everything goes well.
>>
>> On Mon, 30 Jan 2023, Rongwei Liu wrote:
>>
>>> HI Ivan
>>>
>>> BR
>>> Rongwei
>>>
>>>> -----Original Message-----
>>>> From: Ivan Malov <ivan.malov@arknetworks.am>
>>>> Sent: Monday, January 30, 2023 15:40
>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>>>> <rasland@nvidia.com>
>>>> Subject: RE: [PATCH v7] ethdev: add special flags when creating async
>>>> transfer table
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> Hi Rongwei,
>>>>
>>>> For my responses, PSB.
>>>>
>>>> By the way, now that you mention things like wasting memory and insertion
>>>> optimisations, are there any comparative figures to see the effect
>>>> of this hint on insertion performance / memory footprint?
>>>> Some "before" / "after" examples would really be helpful.
>>>>
>>> Good to hear we have almost reached agreement.
>>
>> Very well.
>>
>> The key point here is that one may agree that some optimisations are indeed
>> needed, yes. I don't deny the fact that some vendors might have issues with
>> how the API maps to the HW capabilities.
>> Yes, some undesirable resource overhead shall be avoided, but the high level
>> hints for that have to be designed with care.
>>
> Totally agree. That's why we emphasize "optional for PMD" and "application should take care of hint".
>>> First, the hint is not related to matching; it only affects PMD resource
>> management.
>>
>> You say "PMD resource management". For the flow management, that's
>> mostly vendor-specific, I take it. Let me explain. The application, for instance,
>> can control the number of Tx descriptors in the queue during setup stage.
>> Tx descriptors are a common type of HW resource, hence the explicit control
>> for it available to applications. For flow library, it's not like that. Different
>> vendors have different "underlayer"
>> representations, they may vary drastically.
> The resource I mentioned is about "steering logic", not the SW datapath.
> With flow rules offloading, hardware should store the steering logic in memory it can reach, whether that is embedded in the chip or mapped from the host OS.
>>
>> I take it, in the case of the HW you're working with, this hint indeed maps to
>> something that is entirely resource-related and which does not belong in this
>> specific vendor's match criteria. I 100% understand that, in your case, these
>> are separate. But the point is that, on the high-level programming level
>> (vendor-neutral), such a hint is in fact a match criterion. Because it tells the
>> driver to limit the scope of matching to just "from net"/"from vport", the same
>> way other metadata items do (represented_port).
>> The only difference is that it refers to a group of unspecified ports which have
>> something in common.
>>
> "A group of unspecified ports" means dynamic and flexible, right? IMO it's valid and fits sync flow perfectly.
> But in async, when allocating resources (table creation), the group info is still unknown. We don't want to scatter it into each rule insertion.
>> So, although I don't strongly object having some hints like this one in the
>> generic API, I nevertheless disagree with describing this as just "resource-
>> specific" and not being a match criterion. It's just not always the case. It might
>> not be valid for *all* NIC vendors.
>>
> Agree, not valid for *all* NIC vendors.
>>> In my local test, it can save around 50% memory in the VxLAN encap/decap
>> example case.
>>
>> Forgive me in case this has been already discussed; where's that memory?
>> I mean, is it some sort of general-purpose memory? Or some HW-specific
>> table capacity overhead? I'm trying to understand how the feature will be
>> useful to other vendors, or how common this problem is.
>>
> See above. HW always needs memory to store offloaded rules no matter embedded in chip or borrowed from OS.
>>> Insertion rate improves only very slightly.
>>>> After all, I'm not objecting to this patch. But I believe that other reviewers'
>>>> concerns should nevertheless be addressed.
>>> Let me try to show the hint is useful.
>>>>
>>>> On Mon, 30 Jan 2023, Rongwei Liu wrote:
>>>>
>>>>> Hi Ivan,
>>>>>
>>>>> BR
>>>>> Rongwei
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ivan Malov <ivan.malov@arknetworks.am>
>>>>>> Sent: Monday, January 30, 2023 08:00
>>>>>> To: Rongwei Liu <rongweil@nvidia.com>
>>>>>> Cc: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko
>>>>>> <viacheslavo@nvidia.com>; Ori Kam <orika@nvidia.com>; NBU-Contact-
>>>>>> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Aman Singh
>>>>>> <aman.deep.singh@intel.com>; Yuying Zhang <yuying.zhang@intel.com>;
>>>>>> Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko
>>>>>> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Raslan Darawsheh
>>>>>> <rasland@nvidia.com>
>>>>>> Subject: Re: [PATCH v7] ethdev: add special flags when creating
>>>>>> async transfer table
>>>>>>
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> Hi Rongwei,
>>>>>>
>>>>>> Thanks for persevering. I have no strong opinion, but, at least,
>>>>>> the fact that the new flags are no longer meant for use in
>>>>>> rte_flow_attr, which is clearly not the right place for such, is an
>> improvement.
>>>>>>
>>>>> Thanks for the suggestion; it is moved to rte_flow_table_attr now and it's
>>>> dedicated to the async API.
>>>>>> However, let's take a closer look at the current patch, shall we?
>>>>>>
>>>>>> But, before we get to that, I'd like to kindly request that you
>>>>>> provide a more concrete example of how this feature is supposed to
>>>>>> be used. Are there some real-life application examples?
>>>>>>
>>>>> Sure.
>>>>>> Also, to me, it's still unclear how an application can obtain the
>>>>>> knowledge of this hint in the first instance. For example, can Open
>>>>>> vSwitch somehow tell ethdevs representing physical ports from ones
>>>>>> representing "vports" (host endpoints)?
>>>>>> How does it know which attribute to specify?
>>>>>>
>>>>> The hint should be initiated by the application, which knows its traffic
>>>>> pattern; that pattern highly relates to the deployment.
>>>>> Let's use VxLAN encap/decap as an example:
>>>>> 1. Traffic from wire should be VxLAN pattern and do the decap, then
>>>>> send to different vports.
>>>>> flow pattern_template 0 create transfer relaxed no pattern_template_id 4
>>>>>   template represented_port ethdev_port_id is 0 / eth / ipv4 / udp /
>>>>>   vxlan / tag index is 0 data is 0x33 / end
>>>>> flow actions_template 0 create transfer actions_template_id 4
>>>>>   template raw_decap index 0 / represented_port ethdev_port_id 1 / end
>>>>>   mask raw_decap index 0 / represented_port ethdev_port_id 1 / end
>>>>> flow template_table 0 create group 1 priority 0 transfer wire_orig
>>>>>   table_id 4 rules_number 128 pattern_template 4 actions_template 4
>>>>>
>>>>> 2. Traffic from vports should be encap with different VxLAN header
>>>>> and send to wire.
>>>>> flow actions_template 1 create transfer actions_template_id 5
>>>>>   template raw_encap index 0 / represented_port ethdev_port_id 0 / end
>>>>>   mask raw_encap index 0 / represented_port ethdev_port_id 0 / end
>>>>> flow template_table 0 create group 1 priority 0 transfer vport_orig
>>>>>   table_id 5 rules_number 128 pattern_template 4 actions_template 5
>>>>>
>>>>>> For the rest of my notes, PSB.
>>>>>>
>>>>>> On Mon, 14 Nov 2022, Rongwei Liu wrote:
>>>>>>
>>>>>>> In case flow rules match only one kind of traffic in a flow table,
>>>>>>> then optimization can be done via allocation of this table.
>>>>>>
>>>>>> This wording might confuse readers. Consider rephrasing it, please:
>>>>>> If multiple flow rules share a common set of match masks, then they
>>>>>> might belong in a flow table which can be pre-allocated.
>>>>>>
>>>>>>> Such optimization is possible only if the application gives a hint
>>>>>>> about its usage of the table during initial configuration.
>>>>>>>
>>>>>>> The transfer domain rules may process traffic from wire or vport,
>>>>>>> which may correspond to two kinds of underlayer resources.
>>>>>>
>>>>>> Why name it a "vport"? Why not "host"?
>>>>>>
>>>>>> host = packets generated by any of the host's "vport"s
>>>>>> wire = packets arriving at the NIC from the network
>>>>> Vport is "virtual port" for short and contains "VF/SF" for now.
>>>>> In my view, it's clearer and maps to DPDK port probing/management.
>>>>
>>>> I understand that "host" might not be a brilliant name.
>>>>
>>>> If "vport" stands for every port of the NIC that is not a network
>>>> port, then this name might be OK to me, but why doesn't it cover PFs?
>>>> A PF is clearly not a network / physical port. Why just VF/SF then? Where
>> does that "for now"
>>>> decision come from? Just wondering.
>>>>
>>> "For now" stands for my understanding. DPDK is always in evolution, right?
>>> You are right, PF should be included in the "vport" concept.
>>>>>>
>>>>>>> That's why the first two hints introduced in this patch are about
>>>>>>> wire and vport traffic specialization.
>>>>>>> Wire means traffic arrives from the uplink port while vport means
>>>>>>> traffic initiated from VF/SF.
>>>>>>
>>>>>> By the sound of it, the meaning is confined to just VFs/SFs.
>>>>>> What if the user wants to match packets coming from PFs?
>>>>>>
>>>>> It should be "wire_orig".
>>>>
>>>> Forgive me, but that does not sound correct. Say, there's an
>>>> application and it has a PF plugged into it: ethdev index 0. And the
>>>> application transmits packets using rte_eth_tx_burst() from that port.
>>>> You say that these packets can be matched via "wire_orig".
>>>> But they do not come from the wire. They come from PF...
>>> Hmm. My mistake.
>>> This may highly depend on PMD implementation. Basically, PFs' traffic
>>> may contain "from wire"/"wire_orig" and "from local"/"vport_orig".
>>> That's why we emphasize it's optional for PMD.
>>>>
>>>>>>>
>>>>>>> There are two possible approaches for providing the hints.
>>>>>>> Using IPv4 as an example:
>>>>>>> 1. Use pattern item in both template table and flow rules.
>>>>>>>
>>>>>>> pattern_template: pattern ANY_VPORT / eth / ipv4 is 1.1.1.1 / end
>>>>>>> async flow create: pattern ANY_VPORT / eth / ipv4 is 1.1.1.2 / end
>>>>>>>
>>>>>>> "ANY_VPORT" needs to be present in each flow rule even if it's
>>>>>>> just a hint. No value to match because matching is already done by
>>>>>>> IPv4 item.
>>>>>>
>>>>>> Why no value to match on? How does it prevent rogue tenants from
>>>>>> spoofing network headers? If the application receives a packet on a
>>>>>> particular vport's representor, then it may strictly specify item
>>>>>> represented_port pointing to that vport so that only packets from
>>>>>> that vport
>>>> match.
>>>>>>
>>>>>> Why isn't security a consideration?
>>>>>>
>>>>> There is some misunderstanding here. "ANY_VPORT" is the approach
>>>>> (new
>>>> matching item without value) suggested by you.
>>>> I'm not talking about ANY_VPORT in this particular paragraph.
>>>>
>>>> There's item "represented_port" mentioned over there. I'm just asking
>>>> about this "already done by IPv4 item" bit. Yes, it matches on the
>>>> header but not on the true origin of the packet (the logical port of
>>>> the NIC). If the app knows which logical port the packet ingresses
>>>> the NIC, why not match on it for security?
>>>>
>>> The hint is not matching; it implies how to manage the underlayer steering
>> resource.
>>> If "vport_orig" is present, PMD will only apply the steering logic to vport
>> traffic.
>>> The resource is allocated in the async table before each rule. This already covers
>> security considerations.
>>> Matching on "represented_port" needs to program each rule, considering a
>> port range like index "5-10".
>>> The hint tells PMD only to take care of traffic from vport regardless of the port
>> index.
>>>
>>>>> I was explaining we need to apply it to each flow rule even if it's
>>>>> only a flag
>>>> and no value.
>>>>
>>>> That's clear. But PSB.
>>>>
>>>>>>>
>>>>>>> 2. Add special flags into table_attr.
>>>>>>>
>>>>>>> template_table 0 create table_id 0 group 1 transfer vport_orig
>>>>>>>
>>>>>>> Approach 1 needs to specify the pattern in each flow rule which
>>>>>>> wastes memory and is not user friendly.
>>>>>>
>>>>>> What if the user has to insert a group of rules which not only have
>>>>>> the same set of match masks but also share exactly the same match
>>>>>> spec values for a limited subset of network items (for example,
>>>>>> those of an encap. header)? This way, a subset of network item
>>>>>> specs can remain fixed across many rules. Does that count as wasting
>> memory?
>>>>>>
>>>>>