From: Yongseok Koh <yskoh@mellanox.com>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] net/mlx5: support multiple groups and jump action
Date: Wed, 10 Oct 2018 02:47:41 +0000 [thread overview]
Message-ID: <20181010024731.GI9031@mtidpdk.mti.labs.mlnx> (raw)
In-Reply-To: <VI1PR05MB44312B9EB1CE808BAE4B63AAC3E70@VI1PR05MB4431.eurprd05.prod.outlook.com>
On Tue, Oct 09, 2018 at 05:50:30AM -0700, Shahaf Shuler wrote:
> Hi Koh,
> Few comments.
>
> Monday, October 8, 2018 9:06 PM¸ Yongseok Koh:
> > Subject: [PATCH v2] net/mlx5: support multiple groups and jump action
> >
> > rte_flow has 'group' attribute and 'jump' action in order to support multiple
> > groups. This feature is known as multi-table support ('chain' in linux TC
> > flower) in general because a group means a table of flows. Example
> > commands are:
> >
> > flow create 0 transfer priority 1 ingress
> > pattern eth / vlan vid is 100 / end
> > actions jump group 1 / end
> >
> > flow create 0 transfer priority 1 ingress
> > pattern eth / vlan vid is 200 / end
> > actions jump group 2 / end
> >
> > flow create 0 transfer group 1 priority 2 ingress
> > pattern eth / vlan vid is 100 /
> > ipv4 dst spec 192.168.40.0 dst prefix 24 / end
> > actions drop / end
> >
> > flow create 0 transfer group 1 priority 2 ingress
> > pattern end
> > actions of_pop_vlan / port_id id 1 / end
> >
> > flow create 0 transfer group 2 priority 2 ingress
> > pattern eth / vlan vid is 200 /
> > ipv4 dst spec 192.168.40.0 dst prefix 24 / end
> > actions of_pop_vlan / port_id id 2 / end
> >
> > flow create 0 transfer group 2 priority 2 ingress
> > pattern end
> > actions port_id id 2 / end
> >
> > With theses flows, if a packet having vlan 200 and src_ip as 192.168.40.1, this
> > packet will firstly hit the 1st flow. Then it will hit the 5th flow because of the
> > 'jump' action. As a result, the packet will be forwarded to port 2 (VF
> > representor) with vlan tag being stripped off. If the packet had vlan 100
> > instead, it would be dropped by the 3rd flow.
> >
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > ---
> >
> > v2:
> > * drop ethdev patch as it had already been fixed by Adrien's patch
> > * move TCF macros from mlx5_flow.h to mlx5_flow_tcf.c
> >
> > drivers/net/mlx5/Makefile | 10 +++++
> > drivers/net/mlx5/meson.build | 4 ++
> > drivers/net/mlx5/mlx5_flow.h | 1 +
> > drivers/net/mlx5/mlx5_flow_tcf.c | 82
> > +++++++++++++++++++++++++++++++++++-----
> > 4 files changed, 88 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile index
> > ca1de9f21..92bae9dfc 100644
> > --- a/drivers/net/mlx5/Makefile
> > +++ b/drivers/net/mlx5/Makefile
> > @@ -347,6 +347,16 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-
> > config-h.sh
> > enum TCA_VLAN_PUSH_VLAN_PRIORITY \
> > $(AUTOCONF_OUTPUT)
> > $Q sh -- '$<' '$@' \
> > + HAVE_TCA_CHAIN \
> > + linux/rtnetlink.h \
> > + enum TCA_CHAIN \
> > + $(AUTOCONF_OUTPUT)
> > + $Q sh -- '$<' '$@' \
> > + HAVE_TC_ACT_GOTO_CHAIN \
> > + linux/pkt_cls.h \
> > + define TC_ACT_GOTO_CHAIN \
> > + $(AUTOCONF_OUTPUT)
> > + $Q sh -- '$<' '$@' \
> > HAVE_SUPPORTED_40000baseKR4_Full \
> > /usr/include/linux/ethtool.h \
> > define SUPPORTED_40000baseKR4_Full \
> > diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
> > index fd93ac162..696624838 100644
> > --- a/drivers/net/mlx5/meson.build
> > +++ b/drivers/net/mlx5/meson.build
> > @@ -182,6 +182,10 @@ if build
> > 'TCA_FLOWER_KEY_VLAN_ETH_TYPE' ],
> > [ 'HAVE_TC_ACT_VLAN', 'linux/tc_act/tc_vlan.h',
> > 'TCA_VLAN_PUSH_VLAN_PRIORITY' ],
> > + [ 'HAVE_TCA_CHAIN', 'linux/rtnetlink.h',
> > + 'TCA_CHAIN' ],
> > + [ 'HAVE_TC_ACT_GOTO_CHAIN', 'linux/pkt_cls.h',
> > + 'TC_ACT_GOTO_CHAIN' ],
>
> Please keep the dictionary order according to the linux header for the search.
Okay.
> > [ 'HAVE_RDMA_NL_NLDEV', 'rdma/rdma_netlink.h',
> > 'RDMA_NL_NLDEV' ],
> > [ 'HAVE_RDMA_NLDEV_CMD_GET', 'rdma/rdma_netlink.h',
> > diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> > index 12de841e8..3ed0ddc58 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -78,6 +78,7 @@
> > #define MLX5_FLOW_ACTION_OF_PUSH_VLAN (1u << 8) #define
> > MLX5_FLOW_ACTION_OF_SET_VLAN_VID (1u << 9) #define
> > MLX5_FLOW_ACTION_OF_SET_VLAN_PCP (1u << 10)
> > +#define MLX5_FLOW_ACTION_JUMP (1u << 11)
> >
> > #define MLX5_FLOW_FATE_ACTIONS \
> > (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE |
> > MLX5_FLOW_ACTION_RSS) diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c
> > b/drivers/net/mlx5/mlx5_flow_tcf.c
> > index 91f6ef678..fbc4c2bb7 100644
> > --- a/drivers/net/mlx5/mlx5_flow_tcf.c
> > +++ b/drivers/net/mlx5/mlx5_flow_tcf.c
> > @@ -148,6 +148,12 @@ struct tc_vlan {
> > #ifndef HAVE_TCA_FLOWER_KEY_VLAN_ETH_TYPE #define
> > TCA_FLOWER_KEY_VLAN_ETH_TYPE 25 #endif
> > +#ifndef HAVE_TCA_CHAIN
> > +#define TCA_CHAIN 11
> > +#endif
> > +#ifndef HAVE_TC_ACT_GOTO_CHAIN
> > +#define TC_ACT_GOTO_CHAIN 0x20000000
> > +#endif
> >
> > #ifndef IPV6_ADDR_LEN
> > #define IPV6_ADDR_LEN 16
> > @@ -225,7 +231,13 @@ struct flow_tcf_ptoi {
> > unsigned int ifindex; /**< Network interface index. */ };
> >
> > -#define MLX5_TCF_FATE_ACTIONS (MLX5_FLOW_ACTION_DROP |
> > MLX5_FLOW_ACTION_PORT_ID)
> > +/* Due to a limitation on driver/FW. */ #define
> > MLX5_TCF_GROUP_ID_MAX 3
> > +#define MLX5_TCF_GROUP_PRIORITY_MAX 14
>
> I guess there is no way to query those and trial and error is overkill for this first implementation.
Well, not a huge task. If you (or FW/drv team) think this max value is likely
increased in the near future (before the next LTS version), then it isn't a bad
idea to add such code now. Let me know.
> > +
> > +#define MLX5_TCF_FATE_ACTIONS \
> > + (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_PORT_ID | \
> > + MLX5_FLOW_ACTION_JUMP)
> > #define MLX5_TCF_VLAN_ACTIONS \
> > (MLX5_FLOW_ACTION_OF_POP_VLAN |
> > MLX5_FLOW_ACTION_OF_PUSH_VLAN | \
> > MLX5_FLOW_ACTION_OF_SET_VLAN_VID |
> > MLX5_FLOW_ACTION_OF_SET_VLAN_PCP) @@ -370,14 +382,25 @@
> > flow_tcf_validate_attributes(const struct rte_flow_attr *attr,
> > struct rte_flow_error *error)
> > {
> > /*
> > - * Supported attributes: no groups, some priorities and ingress only.
> > - * Don't care about transfer as it is the caller's problem.
> > + * Supported attributes: groups, some priorities and ingress only.
> > + * group is supported only if kernel supports chain. Don't care about
> > + * transfer as it is the caller's problem.
> > */
> > - if (attr->group)
> > + if (attr->group > MLX5_TCF_GROUP_ID_MAX)
> > return rte_flow_error_set(error, ENOTSUP,
> >
> > RTE_FLOW_ERROR_TYPE_ATTR_GROUP, attr,
> > - "groups are not supported");
> > - if (attr->priority > 0xfffe)
> > + "group ID larger than "
> > +
> > RTE_STR(MLX5_TCF_GROUP_ID_MAX)
> > + " isn't supported");
> > + else if (attr->group > 0 &&
> > + attr->priority > MLX5_TCF_GROUP_PRIORITY_MAX)
> > + return rte_flow_error_set(error, ENOTSUP,
> > +
> > RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
> > + attr,
> > + "lowest priority level is "
>
> Lowest or highest?
Lowest it is. Consistent with the error message of no-multi-group case below.
> > +
> > RTE_STR(MLX5_TCF_GROUP_PRIORITY_MAX)
> > + " when group is configured");
> > + else if (attr->priority > 0xfffe)
> > return rte_flow_error_set(error, ENOTSUP,
> >
> > RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
> > attr,
>
> [... ]
>
> > flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
> > mnl_attr_nest_end(nlh, na_act);
> > mnl_attr_nest_end(nlh, na_act_index);
> > break;
> > + case RTE_FLOW_ACTION_TYPE_JUMP:
> > + conf.jump = actions->conf;
> > + na_act_index =
> > + mnl_attr_nest_start(nlh,
> > na_act_index_cur++);
> > + assert(na_act_index);
> > + mnl_attr_put_strz(nlh, TCA_ACT_KIND, "gact");
> > + na_act = mnl_attr_nest_start(nlh,
> > TCA_ACT_OPTIONS);
> > + assert(na_act);
> > + mnl_attr_put(nlh, TCA_GACT_PARMS,
> > + sizeof(struct tc_gact),
> > + &(struct tc_gact){
> > + .action = TC_ACT_GOTO_CHAIN |
> > + conf.jump->group,
> > + });
> > + mnl_attr_nest_end(nlh, na_act);
> > + mnl_attr_nest_end(nlh, na_act_index);
> > + break;
> > case RTE_FLOW_ACTION_TYPE_DROP:
> > na_act_index =
> > mnl_attr_nest_start(nlh,
> > na_act_index_cur++);
>
>
> We also spoke about that for group > 0 the flow items can start from the middle. E.g. on table 0 we have flow rule that redirect all eth/ip to group 1, and on group 1 the rules can start with udp or tcp.
> Is this possible today w/o any code change?
Not possible. It needs more changes. Complexity of the additional code depends
on a set of limitations we make. In the simplest way, we can unconditionally
allow such flows if TCF allows it. But if we need smarter way, we would have to
add much more validation code. For example, in a case where grp 0 has "item
eth/ip proto is udp / aciton goto grp 1" and grp 1 has "item tcp ...", we should
decide whether this is a violation or not. IIRC, that's why we decided to not
allow such flows during the design review. Users have to specify full items
starting from 'eth' with the current implementation.
Will submit v3 with the change in Makefile and meson.build. But if you think
there's need to add additional features like I answered above, let me know so
that I can submit v4.
Thanks,
Yongseok
next prev parent reply other threads:[~2018-10-10 2:47 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-03 1:56 [dpdk-dev] [PATCH 0/2] " Yongseok Koh
2018-10-03 1:56 ` [dpdk-dev] [PATCH 1/2] ethdev: add jump action to description table Yongseok Koh
2018-10-03 7:12 ` Andrew Rybchenko
2018-10-03 7:31 ` Yongseok Koh
2018-10-03 1:56 ` [dpdk-dev] [PATCH 2/2] net/mlx5: support multiple groups and jump action Yongseok Koh
2018-10-08 18:06 ` [dpdk-dev] [PATCH v2] " Yongseok Koh
2018-10-09 8:18 ` Ori Kam
2018-10-09 12:50 ` Shahaf Shuler
2018-10-10 2:47 ` Yongseok Koh [this message]
2018-10-10 5:35 ` Shahaf Shuler
2018-10-10 2:54 ` [dpdk-dev] [PATCH v3] " Yongseok Koh
2018-10-12 8:42 ` [dpdk-dev] [PATCH v4] " Yongseok Koh
2018-10-14 5:55 ` Shahaf Shuler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181010024731.GI9031@mtidpdk.mti.labs.mlnx \
--to=yskoh@mellanox.com \
--cc=dev@dpdk.org \
--cc=shahafs@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).