From: Kevin Traynor
To: Viacheslav Ovsiienko
Cc: Shahaf Shuler, dpdk stable
Date: Thu, 7 Feb 2019 13:25:09 +0000
Message-Id: <20190207132614.20538-3-ktraynor@redhat.com>
In-Reply-To: <20190207132614.20538-1-ktraynor@redhat.com>
References: <20190207132614.20538-1-ktraynor@redhat.com>
Subject: [dpdk-stable] patch 'net/mlx5: support tunnel inner items on E-Switch' has been queued to LTS release 18.11.1
List-Id: patches for DPDK stable branches

Hi,

FYI, your patch has been queued to LTS release 18.11.1

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 02/14/19. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch.
If there were code changes for rebasing (ie: not only metadata diffs), please
double check that the rebase was correctly done.

Thanks.

Kevin Traynor

---
From 12e14a863374ed6b22eec4da32e127146dabfea3 Mon Sep 17 00:00:00 2001
From: Viacheslav Ovsiienko
Date: Thu, 27 Dec 2018 15:34:43 +0000
Subject: [PATCH] net/mlx5: support tunnel inner items on E-Switch

[ upstream commit 78f5341d71cdb8a2a157081214a214d00586fb37 ]

This patch updates the translation routine for the E-Switch Flows.
Inner tunnel pattern items are translated into Netlink message,
support for tunnel inner IP addresses (v4 or v6), IP protocol,
and TCP and UDP ports is added.

We are going to support Flows matching with outer tunnel items
and not containing the explicit tunnel decap action (this one
might be drop, redirect or table jump, for example). So we can
not rely on presence of tunnel decap action in the list to decide
whether the Flow is for tunnel, instead we will use the presence
of tunnel item. Item translation is rebound to presence of
tunnel items, instead of relying on decap action.

There is no way to tell kernel driver the outer address type
(IPv4 or IPv6) but specify the address flower key. The outer
address key is put on Netlink with zero mask if there is no RTE
item is specified in the list.

Signed-off-by: Viacheslav Ovsiienko
Acked-by: Shahaf Shuler
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 174 ++++++++++++++++++++++---------
 1 file changed, 125 insertions(+), 49 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index 5fc50c2b5..688422da6 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -464,5 +464,7 @@ static const union {
 	struct rte_flow_item_udp udp;
 	struct rte_flow_item_vxlan vxlan;
-} flow_tcf_mask_empty;
+} flow_tcf_mask_empty = {
+	{0},
+};
 
 /** Supported masks for known item types. */
@@ -2293,4 +2295,6 @@ flow_tcf_validate(struct rte_eth_dev *dev,
  * @param[in] items
  *   Pointer to the list of items.
+ * @param[out] action_flags
+ *   Pointer to the detected actions.
  *
  * @return
@@ -2299,5 +2303,6 @@ flow_tcf_validate(struct rte_eth_dev *dev,
 static int
 flow_tcf_get_items_size(const struct rte_flow_attr *attr,
-			const struct rte_flow_item items[])
+			const struct rte_flow_item items[],
+			uint64_t *action_flags)
 {
 	int size = 0;
@@ -2350,4 +2355,14 @@ flow_tcf_get_items_size(const struct rte_flow_attr *attr,
 		case RTE_FLOW_ITEM_TYPE_VXLAN:
 			size += SZ_NLATTR_TYPE_OF(uint32_t);
+			/*
+			 * There might be no VXLAN decap action in the action
+			 * list, nonetheless the VXLAN tunnel flow requires
+			 * the decap structure to be correctly applied to
+			 * VXLAN device, set the flag to create the structure.
+			 * Translation routine will not put the decap action
+			 * in the Netlink message if there is no actual action
+			 * in the list.
+			 */
+			*action_flags |= MLX5_FLOW_ACTION_VXLAN_DECAP;
 			break;
 		default:
@@ -2598,5 +2613,5 @@ flow_tcf_prepare(const struct rte_flow_attr *attr,
 	uint8_t *sp, *tun = NULL;
 
-	size += flow_tcf_get_items_size(attr, items);
+	size += flow_tcf_get_items_size(attr, items, &action_flags);
 	size += flow_tcf_get_actions_and_size(actions, &action_flags);
 	dev_flow = rte_zmalloc(__func__, size, MNL_ALIGNTO);
@@ -3002,4 +3017,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 	bool vlan_eth_type_set = 0;
 	bool ip_proto_set = 0;
+	bool tunnel_outer = 0;
 	struct nlattr *na_flower;
 	struct nlattr *na_flower_act;
@@ -3015,4 +3031,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 	case FLOW_TCF_TUNACT_VXLAN_DECAP:
 		decap.vxlan = dev_flow->tcf.vxlan_decap;
+		tunnel_outer = 1;
 		break;
 	case FLOW_TCF_TUNACT_VXLAN_ENCAP:
@@ -3069,5 +3086,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			break;
 		case RTE_FLOW_ITEM_TYPE_ETH:
-			item_flags |= (item_flags & MLX5_FLOW_LAYER_VXLAN) ?
+			item_flags |= (item_flags & MLX5_FLOW_LAYER_TUNNEL) ?
 				      MLX5_FLOW_LAYER_INNER_L2 :
 				      MLX5_FLOW_LAYER_OUTER_L2;
@@ -3082,10 +3099,9 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 				break;
 			spec.eth = items->spec;
-			if (decap.vxlan &&
-			    !(item_flags & MLX5_FLOW_LAYER_VXLAN)) {
+			if (tunnel_outer) {
 				DRV_LOG(WARNING,
-					"outer L2 addresses cannot be forced"
-					" for vxlan decapsulation, parameter"
-					" ignored");
+					"outer L2 addresses cannot be"
+					" forced is outer ones for tunnel,"
+					" parameter is ignored");
 				break;
 			}
@@ -3116,4 +3132,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(!encap.hdr);
 			assert(!decap.hdr);
+			assert(!tunnel_outer);
 			item_flags |= MLX5_FLOW_LAYER_OUTER_VLAN;
 			mask.vlan = flow_tcf_item_mask
@@ -3150,5 +3167,7 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			break;
 		case RTE_FLOW_ITEM_TYPE_IPV4:
-			item_flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV4;
+			item_flags |= (item_flags & MLX5_FLOW_LAYER_TUNNEL) ?
+				      MLX5_FLOW_LAYER_INNER_L3_IPV4 :
+				      MLX5_FLOW_LAYER_OUTER_L3_IPV4;
 			mask.ipv4 = flow_tcf_item_mask
 				(items, &rte_flow_item_ipv4_mask,
@@ -3159,5 +3178,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(mask.ipv4);
 			spec.ipv4 = items->spec;
-			if (!decap.vxlan) {
+			if (!tunnel_outer) {
 				if (!eth_type_set ||
 				    (!vlan_eth_type_set && vlan_present))
@@ -3170,23 +3189,44 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 				eth_type_set = 1;
 				vlan_eth_type_set = 1;
-				if (mask.ipv4 == &flow_tcf_mask_empty.ipv4)
+			}
+			if (!tunnel_outer && mask.ipv4->hdr.next_proto_id) {
+				/*
+				 * No way to set IP protocol for outer tunnel
+				 * layers. Usually it is fixed, for example,
+				 * to UDP for VXLAN/GPE.
+				 */
+				assert(spec.ipv4); /* Mask is not empty. */
+				mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO,
+						spec.ipv4->hdr.next_proto_id);
+				ip_proto_set = 1;
+			}
+			if (mask.ipv4 == &flow_tcf_mask_empty.ipv4 ||
+			    (!mask.ipv4->hdr.src_addr &&
+			     !mask.ipv4->hdr.dst_addr)) {
+				if (!tunnel_outer)
 					break;
-				if (mask.ipv4->hdr.next_proto_id) {
-					mnl_attr_put_u8
-						(nlh, TCA_FLOWER_KEY_IP_PROTO,
-						 spec.ipv4->hdr.next_proto_id);
-					ip_proto_set = 1;
-				}
-			} else {
-				assert(mask.ipv4 != &flow_tcf_mask_empty.ipv4);
+				/*
+				 * For tunnel outer we must set outer IP key
+				 * anyway, even if the specification/mask is
+				 * empty. There is no other way to tell
+				 * kernel about the outer layer protocol.
+				 */
+				mnl_attr_put_u32
+					(nlh, TCA_FLOWER_KEY_ENC_IPV4_SRC,
+					 mask.ipv4->hdr.src_addr);
+				mnl_attr_put_u32
+					(nlh, TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,
+					 mask.ipv4->hdr.src_addr);
+				assert(dev_flow->tcf.nlsize >= nlh->nlmsg_len);
+				break;
 			}
 			if (mask.ipv4->hdr.src_addr) {
 				mnl_attr_put_u32
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_IPV4_SRC :
 					 TCA_FLOWER_KEY_IPV4_SRC,
 					 spec.ipv4->hdr.src_addr);
 				mnl_attr_put_u32
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK :
 					 TCA_FLOWER_KEY_IPV4_SRC_MASK,
@@ -3195,10 +3235,10 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			if (mask.ipv4->hdr.dst_addr) {
 				mnl_attr_put_u32
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_IPV4_DST :
 					 TCA_FLOWER_KEY_IPV4_DST,
 					 spec.ipv4->hdr.dst_addr);
 				mnl_attr_put_u32
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_IPV4_DST_MASK :
 					 TCA_FLOWER_KEY_IPV4_DST_MASK,
@@ -3207,6 +3247,10 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(dev_flow->tcf.nlsize >= nlh->nlmsg_len);
 			break;
-		case RTE_FLOW_ITEM_TYPE_IPV6:
-			item_flags |= MLX5_FLOW_LAYER_OUTER_L3_IPV6;
+		case RTE_FLOW_ITEM_TYPE_IPV6: {
+			bool ipv6_src, ipv6_dst;
+
+			item_flags |= (item_flags & MLX5_FLOW_LAYER_TUNNEL) ?
+				      MLX5_FLOW_LAYER_INNER_L3_IPV6 :
+				      MLX5_FLOW_LAYER_OUTER_L3_IPV6;
 			mask.ipv6 = flow_tcf_item_mask
 				(items, &rte_flow_item_ipv6_mask,
@@ -3217,5 +3261,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(mask.ipv6);
 			spec.ipv6 = items->spec;
-			if (!decap.vxlan) {
+			if (!tunnel_outer) {
 				if (!eth_type_set ||
 				    (!vlan_eth_type_set && vlan_present))
@@ -3228,22 +3272,48 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 				eth_type_set = 1;
 				vlan_eth_type_set = 1;
-				if (mask.ipv6 == &flow_tcf_mask_empty.ipv6)
+			}
+			if (!tunnel_outer && mask.ipv6->hdr.proto) {
+				/*
+				 * No way to set IP protocol for outer tunnel
+				 * layers. Usually it is fixed, for example,
+				 * to UDP for VXLAN/GPE.
+				 */
+				assert(spec.ipv6); /* Mask is not empty. */
+				mnl_attr_put_u8(nlh, TCA_FLOWER_KEY_IP_PROTO,
+						spec.ipv6->hdr.proto);
+				ip_proto_set = 1;
+			}
+			ipv6_dst = !IN6_IS_ADDR_UNSPECIFIED
+						(mask.ipv6->hdr.dst_addr);
+			ipv6_src = !IN6_IS_ADDR_UNSPECIFIED
+						(mask.ipv6->hdr.src_addr);
+			if (mask.ipv6 == &flow_tcf_mask_empty.ipv6 ||
+			    (!ipv6_dst && !ipv6_src)) {
+				if (!tunnel_outer)
 					break;
-				if (mask.ipv6->hdr.proto) {
-					mnl_attr_put_u8
-						(nlh, TCA_FLOWER_KEY_IP_PROTO,
-						 spec.ipv6->hdr.proto);
-					ip_proto_set = 1;
-				}
-			} else {
-				assert(mask.ipv6 != &flow_tcf_mask_empty.ipv6);
+				/*
+				 * For tunnel outer we must set outer IP key
+				 * anyway, even if the specification/mask is
+				 * empty. There is no other way to tell
+				 * kernel about the outer layer protocol.
+				 */
+				mnl_attr_put(nlh,
+					     TCA_FLOWER_KEY_ENC_IPV6_SRC,
+					     IPV6_ADDR_LEN,
+					     mask.ipv6->hdr.src_addr);
+				mnl_attr_put(nlh,
+					     TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,
+					     IPV6_ADDR_LEN,
+					     mask.ipv6->hdr.src_addr);
+				assert(dev_flow->tcf.nlsize >= nlh->nlmsg_len);
+				break;
 			}
-			if (!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.src_addr)) {
-				mnl_attr_put(nlh, decap.vxlan ?
+			if (ipv6_src) {
+				mnl_attr_put(nlh, tunnel_outer ?
 					     TCA_FLOWER_KEY_ENC_IPV6_SRC :
 					     TCA_FLOWER_KEY_IPV6_SRC,
 					     IPV6_ADDR_LEN,
 					     spec.ipv6->hdr.src_addr);
-				mnl_attr_put(nlh, decap.vxlan ?
+				mnl_attr_put(nlh, tunnel_outer ?
 					     TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK :
 					     TCA_FLOWER_KEY_IPV6_SRC_MASK,
@@ -3251,11 +3321,11 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 					     mask.ipv6->hdr.src_addr);
 			}
-			if (!IN6_IS_ADDR_UNSPECIFIED(mask.ipv6->hdr.dst_addr)) {
-				mnl_attr_put(nlh, decap.vxlan ?
+			if (ipv6_dst) {
+				mnl_attr_put(nlh, tunnel_outer ?
 					     TCA_FLOWER_KEY_ENC_IPV6_DST :
 					     TCA_FLOWER_KEY_IPV6_DST,
 					     IPV6_ADDR_LEN,
 					     spec.ipv6->hdr.dst_addr);
-				mnl_attr_put(nlh, decap.vxlan ?
+				mnl_attr_put(nlh, tunnel_outer ?
 					     TCA_FLOWER_KEY_ENC_IPV6_DST_MASK :
 					     TCA_FLOWER_KEY_IPV6_DST_MASK,
@@ -3265,6 +3335,9 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(dev_flow->tcf.nlsize >= nlh->nlmsg_len);
 			break;
+		}
 		case RTE_FLOW_ITEM_TYPE_UDP:
-			item_flags |= MLX5_FLOW_LAYER_OUTER_L4_UDP;
+			item_flags |= (item_flags & MLX5_FLOW_LAYER_TUNNEL) ?
+				      MLX5_FLOW_LAYER_INNER_L4_UDP :
+				      MLX5_FLOW_LAYER_OUTER_L4_UDP;
 			mask.udp = flow_tcf_item_mask
 				(items, &rte_flow_item_udp_mask,
@@ -3275,5 +3348,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			assert(mask.udp);
 			spec.udp = items->spec;
-			if (!decap.vxlan) {
+			if (!tunnel_outer) {
 				if (!ip_proto_set)
 					mnl_attr_put_u8
@@ -3290,10 +3363,10 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			if (mask.udp->hdr.src_port) {
 				mnl_attr_put_u16
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_UDP_SRC_PORT :
 					 TCA_FLOWER_KEY_UDP_SRC,
 					 spec.udp->hdr.src_port);
 				mnl_attr_put_u16
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK :
 					 TCA_FLOWER_KEY_UDP_SRC_MASK,
@@ -3302,10 +3375,10 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			if (mask.udp->hdr.dst_port) {
 				mnl_attr_put_u16
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_UDP_DST_PORT :
 					 TCA_FLOWER_KEY_UDP_DST,
 					 spec.udp->hdr.dst_port);
 				mnl_attr_put_u16
-					(nlh, decap.vxlan ?
+					(nlh, tunnel_outer ?
 					 TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK :
 					 TCA_FLOWER_KEY_UDP_DST_MASK,
@@ -3315,5 +3388,7 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 			break;
 		case RTE_FLOW_ITEM_TYPE_TCP:
-			item_flags |= MLX5_FLOW_LAYER_OUTER_L4_TCP;
+			item_flags |= (item_flags & MLX5_FLOW_LAYER_TUNNEL) ?
+				      MLX5_FLOW_LAYER_INNER_L4_TCP :
+				      MLX5_FLOW_LAYER_OUTER_L4_TCP;
 			mask.tcp = flow_tcf_item_mask
 				(items, &rte_flow_item_tcp_mask,
@@ -3359,4 +3434,5 @@ flow_tcf_translate(struct rte_eth_dev *dev, struct mlx5_flow *dev_flow,
 		case RTE_FLOW_ITEM_TYPE_VXLAN:
 			assert(decap.vxlan);
+			tunnel_outer = 0;
 			item_flags |= MLX5_FLOW_LAYER_VXLAN;
 			spec.vxlan = items->spec;
-- 
2.19.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2019-02-07 13:19:55.559309596 +0000
+++ 0003-net-mlx5-support-tunnel-inner-items-on-E-Switch.patch	2019-02-07 13:19:55.000000000 +0000
@@ -1,8 +1,10 @@
-From 78f5341d71cdb8a2a157081214a214d00586fb37 Mon Sep 17 00:00:00 2001
+From 12e14a863374ed6b22eec4da32e127146dabfea3 Mon Sep 17 00:00:00 2001
 From: Viacheslav Ovsiienko
 Date: Thu, 27 Dec 2018 15:34:43 +0000
 Subject: [PATCH] net/mlx5: support tunnel inner items on E-Switch
 
+[ upstream commit 78f5341d71cdb8a2a157081214a214d00586fb37 ]
+
 This patch updates the translation routine for the E-Switch Flows.
 Inner tunnel pattern items are translated into Netlink message,
 support for tunnel inner IP addresses (v4 or v6), IP protocol,
@@ -21,8 +23,6 @@
 address key is put on Netlink with zero mask if there is no RTE
 item is specified in the list.
 
-Cc: stable@dpdk.org
-
 Signed-off-by: Viacheslav Ovsiienko
 Acked-by: Shahaf Shuler
 ---
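
For anyone double-checking the rebase, the inner/outer classification that the commit message describes can be sketched in isolation. This is only an illustration under stated assumptions, not driver code: every sk_* name and bit value below is a hypothetical stand-in for the MLX5_FLOW_LAYER_* flags and TCA_FLOWER_KEY_* attributes touched by the patch, and only the IPv4 source key is modelled. The caller is assumed to start with tunnel_outer set when the pattern contains a VXLAN item, much as the patch sets it for FLOW_TCF_TUNACT_VXLAN_DECAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the MLX5_FLOW_LAYER_* bits (values made up). */
#define SK_LAYER_OUTER_L3_IPV4 (UINT64_C(1) << 0)
#define SK_LAYER_INNER_L3_IPV4 (UINT64_C(1) << 1)
#define SK_LAYER_VXLAN         (UINT64_C(1) << 2)
#define SK_LAYER_TUNNEL        SK_LAYER_VXLAN

/* Hypothetical pattern item types and flower keys, for illustration only. */
enum sk_item { SK_ITEM_IPV4, SK_ITEM_VXLAN };
enum sk_key { SK_KEY_NONE, SK_KEY_IPV4_SRC, SK_KEY_ENC_IPV4_SRC };

struct sk_state {
	uint64_t item_flags; /* layers seen so far */
	bool tunnel_outer;   /* true until the tunnel item is reached */
};

/*
 * Classify one pattern item and report which key family it would use.
 * Items seen before the tunnel item are outer layers and map to the
 * "ENC" keys; items seen after it are inner layers and map to the
 * regular keys.
 */
static enum sk_key
sk_translate_item(struct sk_state *st, enum sk_item item)
{
	switch (item) {
	case SK_ITEM_IPV4:
		st->item_flags |= (st->item_flags & SK_LAYER_TUNNEL) ?
				  SK_LAYER_INNER_L3_IPV4 :
				  SK_LAYER_OUTER_L3_IPV4;
		return st->tunnel_outer ? SK_KEY_ENC_IPV4_SRC :
					  SK_KEY_IPV4_SRC;
	case SK_ITEM_VXLAN:
		/* Everything after the tunnel item is an inner layer. */
		st->tunnel_outer = false;
		st->item_flags |= SK_LAYER_VXLAN;
		return SK_KEY_NONE;
	}
	return SK_KEY_NONE;
}
```

Feeding the sketch the pattern ipv4 / vxlan / ipv4 with tunnel_outer initially true selects the ENC key and the outer flag for the first IPv4 item, then the regular key and the inner flag for the second, which is the behaviour the item_flags and tunnel_outer changes in the patch aim for.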