DPDK patches and discussions
* [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch
@ 2018-12-29 19:55 Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 1/5] net/mlx5: optimize neigh and local encap rules search Viacheslav Ovsiienko
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

This patchset simplifies management of the virtual VXLAN tunnel devices.
The previous design used VXLAN devices attached to the outer interface for
encapsulation rules. The new design uses unattached devices, which allows
a single VXLAN device to serve both encapsulation and decapsulation rules
and removes the UDP port sharing issues.

The patchset also introduces minor changes in VXLAN device management that
allow the driver to compile and operate on some old kernels (for example
the original RH7.2 kernel 3.10.0-327), which do not support VXLAN device
metadata.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Viacheslav Ovsiienko (5):
  net/mlx5: optimize neigh and local encap rules search
  net/mlx5: introduce encapsulation rules container
  net/mlx5: switch encap rules to use container
  net/mlx5: switch to detached VXLAN network devices
  net/mlx5: add RH7.2 VXLAN device metadata workaround

 drivers/net/mlx5/mlx5_flow_tcf.c | 270 +++++++++++++++++++++++----------------
 1 file changed, 159 insertions(+), 111 deletions(-)

-- 
1.8.3.1


* [dpdk-dev] [PATCH 1/5] net/mlx5: optimize neigh and local encap rules search
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
@ 2018-12-29 19:55 ` Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 2/5] net/mlx5: introduce encapsulation rules container Viacheslav Ovsiienko
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

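This patch removes unnecessary local variables and optimizes the
local and neigh encapsulation rules search. The search loops leave
the rule pointer NULL when nothing matches, so the separate "found"
flag is redundant.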

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)
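
The flag removal in this patch relies on a property of LIST_FOREACH from
<sys/queue.h>: the iterator is NULL once the list is exhausted. A minimal
standalone sketch of that idiom follows. The demo_rule type and
demo_rule_find() are hypothetical simplifications for illustration, not
the driver's structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for struct tcf_neigh_rule. */
struct demo_rule {
	LIST_ENTRY(demo_rule) next;
	uint32_t dst;    /* IPv4 destination the rule was created for. */
	uint32_t refcnt; /* Number of flows sharing the rule. */
};

LIST_HEAD(demo_rule_list, demo_rule);

/*
 * Return the rule matching dst, or NULL if there is none.
 * LIST_FOREACH leaves the iterator NULL when the list is exhausted,
 * so the pointer itself tells whether the search succeeded.
 */
struct demo_rule *
demo_rule_find(struct demo_rule_list *head, uint32_t dst)
{
	struct demo_rule *rule = NULL;

	assert(head != NULL);
	LIST_FOREACH(rule, head, next) {
		if (rule->dst == dst)
			break;
	}
	return rule;
}
```

A caller can then test the returned pointer directly, which mirrors the
"if (rule)" check in the diff below.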

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index 33ebddd..b4734a0 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -4781,8 +4781,7 @@ struct tcf_nlcb_context {
 		     struct rte_flow_error *error)
 {
 	const struct flow_tcf_vxlan_encap *encap = dev_flow->tcf.vxlan_encap;
-	struct tcf_local_rule *rule;
-	bool found = false;
+	struct tcf_local_rule *rule = NULL;
 	int ret;
 
 	assert(encap);
@@ -4793,7 +4792,6 @@ struct tcf_nlcb_context {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV4_SRC &&
 			    encap->ipv4.src == rule->ipv4.src &&
 			    encap->ipv4.dst == rule->ipv4.dst) {
-				found = true;
 				break;
 			}
 		}
@@ -4806,12 +4804,11 @@ struct tcf_nlcb_context {
 					    sizeof(encap->ipv6.src)) &&
 			    !memcmp(&encap->ipv6.dst, &rule->ipv6.dst,
 					    sizeof(encap->ipv6.dst))) {
-				found = true;
 				break;
 			}
 		}
 	}
-	if (found) {
+	if (rule) {
 		if (enable) {
 			rule->refcnt++;
 			return 0;
@@ -4890,8 +4887,7 @@ struct tcf_nlcb_context {
 		     struct rte_flow_error *error)
 {
 	const struct flow_tcf_vxlan_encap *encap = dev_flow->tcf.vxlan_encap;
-	struct tcf_neigh_rule *rule;
-	bool found = false;
+	struct tcf_neigh_rule *rule = NULL;
 	int ret;
 
 	assert(encap);
@@ -4901,7 +4897,6 @@ struct tcf_nlcb_context {
 		LIST_FOREACH(rule, &vtep->neigh, next) {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV4_DST &&
 			    encap->ipv4.dst == rule->ipv4.dst) {
-				found = true;
 				break;
 			}
 		}
@@ -4912,12 +4907,11 @@ struct tcf_nlcb_context {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV6_DST &&
 			    !memcmp(&encap->ipv6.dst, &rule->ipv6.dst,
 						sizeof(encap->ipv6.dst))) {
-				found = true;
 				break;
 			}
 		}
 	}
-	if (found) {
+	if (rule) {
 		if (memcmp(&encap->eth.dst, &rule->eth,
 			   sizeof(encap->eth.dst))) {
 			DRV_LOG(WARNING, "Destination MAC differs"
-- 
1.8.3.1


* [dpdk-dev] [PATCH 2/5] net/mlx5: introduce encapsulation rules container
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 1/5] net/mlx5: optimize neigh and local encap rules search Viacheslav Ovsiienko
@ 2018-12-29 19:55 ` Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 3/5] net/mlx5: switch encap rules to use container Viacheslav Ovsiienko
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

Currently the VXLAN encapsulation neigh/local rules are stored
in lists contained in the VTEP device structure. The
encapsulation VTEP device is attached to the outer interface,
and the stored rules relate to this underlying interface. We
are going to use unattached VXLAN devices for encapsulation
(the kernel does not use the attached interface to find the
egress one), so we need a structure other than the VTEP one
to keep the interface-related neigh/local rules. This patch
introduces the internal tcf_irule structure and its
acquire/release methods.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 108 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 107 insertions(+), 1 deletion(-)
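
The acquire/release pair described above follows a common
reference-counted lookup-or-create pattern. The sketch below shows that
pattern in isolation. It is a simplified illustration under stated
assumptions: the demo_* names are hypothetical, calloc()/free() stand in
for rte_zmalloc()/rte_free(), and the interface cleanup calls and the
neigh/local lists of the real container are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for struct tcf_irule. */
struct demo_irule {
	LIST_ENTRY(demo_irule) next;
	uint32_t refcnt;
	unsigned int ifouter; /* Outer interface index, the lookup key. */
};

static LIST_HEAD(, demo_irule) demo_iface_list =
	LIST_HEAD_INITIALIZER(demo_iface_list);

/* Find the container for ifouter or create it; take one reference. */
struct demo_irule *
demo_irule_acquire(unsigned int ifouter)
{
	struct demo_irule *iface;

	assert(ifouter);
	LIST_FOREACH(iface, &demo_iface_list, next) {
		if (iface->ifouter == ifouter)
			break;
	}
	if (iface) {
		/* Container already exists, just add a reference. */
		iface->refcnt++;
		return iface;
	}
	iface = calloc(1, sizeof(*iface));
	if (!iface)
		return NULL;
	iface->ifouter = ifouter;
	iface->refcnt = 1;
	LIST_INSERT_HEAD(&demo_iface_list, iface, next);
	return iface;
}

/* Drop one reference; unlink and free the container on the last one. */
void
demo_irule_release(struct demo_irule *iface)
{
	assert(iface->refcnt);
	if (--iface->refcnt == 0) {
		LIST_REMOVE(iface, next);
		free(iface);
	}
}
```

Two flows on the same outer interface would share one container, and
the container disappears only when the last of them releases it.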

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index b4734a0..a6dca08 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -431,6 +431,15 @@ struct tcf_local_rule {
 	};
 };
 
+/** Outer interface VXLAN encapsulation rules container. */
+struct tcf_irule {
+	LIST_ENTRY(tcf_irule) next;
+	LIST_HEAD(, tcf_neigh_rule) neigh;
+	LIST_HEAD(, tcf_local_rule) local;
+	uint32_t refcnt;
+	unsigned int ifouter; /**< Own interface index. */
+};
+
 /** VXLAN virtual netdev. */
 struct tcf_vtep {
 	LIST_ENTRY(tcf_vtep) next;
@@ -458,6 +467,7 @@ struct flow_tcf_vxlan_decap {
 
 struct flow_tcf_vxlan_encap {
 	struct flow_tcf_tunnel_hdr hdr;
+	struct tcf_irule *iface;
 	uint32_t mask;
 	uint8_t ip_tos;
 	uint8_t ip_ttl_hop;
@@ -4971,11 +4981,90 @@ struct tcf_nlcb_context {
 	return 0;
 }
 
+/* VXLAN encap rule database for outer interfaces. */
+static  LIST_HEAD(, tcf_irule) iface_list_vxlan = LIST_HEAD_INITIALIZER();
+
 /* VTEP device list is shared between PMD port instances. */
 static LIST_HEAD(, tcf_vtep) vtep_list_vxlan = LIST_HEAD_INITIALIZER();
 static pthread_mutex_t vtep_list_mutex = PTHREAD_MUTEX_INITIALIZER;
 
 /**
+ * Acquire the VXLAN encap rules container for specified interface.
+ * First looks for the container in the list of existing ones, creates
+ * and initializes a new container if no existing one is found.
+ *
+ * @param[in] tcf
+ *   Context object initialized by mlx5_flow_tcf_context_create().
+ * @param[in] ifouter
+ *   Network interface index to create VXLAN encap rules on.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   Rule container pointer on success,
+ *   NULL otherwise and rte_errno is set.
+ */
+static struct tcf_irule*
+flow_tcf_encap_irule_acquire(struct mlx5_flow_tcf_context *tcf,
+			     unsigned int ifouter,
+			     struct rte_flow_error *error)
+{
+	struct tcf_irule *iface;
+
+	/* Look whether the container for encap rules is created. */
+	assert(ifouter);
+	LIST_FOREACH(iface, &iface_list_vxlan, next) {
+		if (iface->ifouter == ifouter)
+			break;
+	}
+	if (iface) {
+		/* Container already exists, just increment the reference. */
+		iface->refcnt++;
+		return iface;
+	}
+	/* Not found, we should create the new container. */
+	iface = rte_zmalloc(__func__, sizeof(*iface),
+			    alignof(struct tcf_irule));
+	if (!iface) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				   "unable to allocate memory for container");
+		return NULL;
+	}
+	*iface = (struct tcf_irule){
+			.local = LIST_HEAD_INITIALIZER(),
+			.neigh = LIST_HEAD_INITIALIZER(),
+			.ifouter = ifouter,
+			.refcnt = 1,
+	};
+	/* Interface cleanup for new container created. */
+	flow_tcf_encap_iface_cleanup(tcf, ifouter);
+	flow_tcf_encap_local_cleanup(tcf, ifouter);
+	flow_tcf_encap_neigh_cleanup(tcf, ifouter);
+	LIST_INSERT_HEAD(&iface_list_vxlan, iface, next);
+	return iface;
+}
+
+/**
+ * Releases VXLAN encap rules container by pointer. Decrements the
+ * reference counter and deletes the container if counter is zero.
+ *
+ * @param[in] iface
+ *   VXLAN rule container pointer to release.
+ */
+static void
+flow_tcf_encap_irule_release(struct tcf_irule *iface)
+{
+	assert(iface->refcnt);
+	if (--iface->refcnt == 0) {
+		/* Reference counter is zero, delete the container. */
+		assert(LIST_EMPTY(&iface->local));
+		assert(LIST_EMPTY(&iface->neigh));
+		LIST_REMOVE(iface, next);
+		rte_free(iface);
+	}
+}
+
+/**
  * Deletes VTEP network device.
  *
  * @param[in] tcf
@@ -5247,6 +5336,7 @@ struct tcf_nlcb_context {
 {
 	static uint16_t encap_port = MLX5_VXLAN_PORT_MIN - 1;
 	struct tcf_vtep *vtep;
+	struct tcf_irule *iface;
 	int ret;
 
 	assert(ifouter);
@@ -5296,6 +5386,13 @@ struct tcf_nlcb_context {
 	}
 	assert(vtep->ifouter == ifouter);
 	assert(vtep->ifindex);
+	iface = flow_tcf_encap_irule_acquire(tcf, ifouter, error);
+	if (!iface) {
+		if (--vtep->refcnt == 0)
+			flow_tcf_vtep_delete(tcf, vtep);
+		return NULL;
+	}
+	dev_flow->tcf.vxlan_encap->iface = iface;
 	/* Create local ipaddr with peer to specify the outer IPs. */
 	ret = flow_tcf_encap_local(tcf, vtep, dev_flow, true, error);
 	if (!ret) {
@@ -5306,6 +5403,8 @@ struct tcf_nlcb_context {
 					     dev_flow, false, error);
 	}
 	if (ret) {
+		dev_flow->tcf.vxlan_encap->iface = NULL;
+		flow_tcf_encap_irule_release(iface);
 		if (--vtep->refcnt == 0)
 			flow_tcf_vtep_delete(tcf, vtep);
 		return NULL;
@@ -5378,11 +5477,18 @@ struct tcf_nlcb_context {
 	switch (dev_flow->tcf.tunnel->type) {
 	case FLOW_TCF_TUNACT_VXLAN_DECAP:
 		break;
-	case FLOW_TCF_TUNACT_VXLAN_ENCAP:
+	case FLOW_TCF_TUNACT_VXLAN_ENCAP: {
+		struct tcf_irule *iface;
+
 		/* Remove the encap ancillary rules first. */
+		iface = dev_flow->tcf.vxlan_encap->iface;
+		assert(iface);
 		flow_tcf_encap_neigh(tcf, vtep, dev_flow, false, NULL);
 		flow_tcf_encap_local(tcf, vtep, dev_flow, false, NULL);
+		flow_tcf_encap_irule_release(iface);
+		dev_flow->tcf.vxlan_encap->iface = NULL;
 		break;
+	}
 	default:
 		assert(false);
 		DRV_LOG(WARNING, "Unsupported tunnel type");
-- 
1.8.3.1


* [dpdk-dev] [PATCH 3/5] net/mlx5: switch encap rules to use container
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 1/5] net/mlx5: optimize neigh and local encap rules search Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 2/5] net/mlx5: introduce encapsulation rules container Viacheslav Ovsiienko
@ 2018-12-29 19:55 ` Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 4/5] net/mlx5: switch to detached VXLAN network devices Viacheslav Ovsiienko
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The VXLAN encapsulation neigh/local rules now use the
newly introduced structure, which keeps the rule lists
related to the specified outer interface, instead of
the attached VTEP structure. This frees the VTEP
structure from keeping the rules for the interface.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 42 ++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index a6dca08..b99e322 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -4771,8 +4771,8 @@ struct tcf_nlcb_context {
  *
  * @param[in] tcf
  *   Libmnl socket context object.
- * @param[in] vtep
- *   VTEP object, contains rule database and ifouter index.
+ * @param[in] iface
+ *   Object, contains rule database and ifouter index.
  * @param[in] dev_flow
  *   Flow object, contains the tunnel parameters (for encap only).
  * @param[in] enable
@@ -4785,7 +4785,7 @@ struct tcf_nlcb_context {
  */
 static int
 flow_tcf_encap_local(struct mlx5_flow_tcf_context *tcf,
-		     struct tcf_vtep *vtep,
+		     struct tcf_irule *iface,
 		     struct mlx5_flow *dev_flow,
 		     bool enable,
 		     struct rte_flow_error *error)
@@ -4798,7 +4798,7 @@ struct tcf_nlcb_context {
 	assert(encap->hdr.type == FLOW_TCF_TUNACT_VXLAN_ENCAP);
 	if (encap->mask & FLOW_TCF_ENCAP_IPV4_SRC) {
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV4_DST);
-		LIST_FOREACH(rule, &vtep->local, next) {
+		LIST_FOREACH(rule, &iface->local, next) {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV4_SRC &&
 			    encap->ipv4.src == rule->ipv4.src &&
 			    encap->ipv4.dst == rule->ipv4.dst) {
@@ -4808,7 +4808,7 @@ struct tcf_nlcb_context {
 	} else {
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV6_SRC);
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV6_DST);
-		LIST_FOREACH(rule, &vtep->local, next) {
+		LIST_FOREACH(rule, &iface->local, next) {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV6_SRC &&
 			    !memcmp(&encap->ipv6.src, &rule->ipv6.src,
 					    sizeof(encap->ipv6.src)) &&
@@ -4826,7 +4826,7 @@ struct tcf_nlcb_context {
 		if (!rule->refcnt || !--rule->refcnt) {
 			LIST_REMOVE(rule, next);
 			return flow_tcf_rule_local(tcf, encap,
-					vtep->ifouter, false, error);
+					iface->ifouter, false, error);
 		}
 		return 0;
 	}
@@ -4859,13 +4859,13 @@ struct tcf_nlcb_context {
 		memcpy(&rule->ipv6.src, &encap->ipv6.src, IPV6_ADDR_LEN);
 		memcpy(&rule->ipv6.dst, &encap->ipv6.dst, IPV6_ADDR_LEN);
 	}
-	ret = flow_tcf_rule_local(tcf, encap, vtep->ifouter, true, error);
+	ret = flow_tcf_rule_local(tcf, encap, iface->ifouter, true, error);
 	if (ret) {
 		rte_free(rule);
 		return ret;
 	}
 	rule->refcnt++;
-	LIST_INSERT_HEAD(&vtep->local, rule, next);
+	LIST_INSERT_HEAD(&iface->local, rule, next);
 	return 0;
 }
 
@@ -4877,8 +4877,8 @@ struct tcf_nlcb_context {
  *
  * @param[in] tcf
  *   Libmnl socket context object.
- * @param[in] vtep
- *   VTEP object, contains rule database and ifouter index.
+ * @param[in] iface
+ *   Object, contains rule database and ifouter index.
  * @param[in] dev_flow
  *   Flow object, contains the tunnel parameters (for encap only).
  * @param[in] enable
@@ -4891,7 +4891,7 @@ struct tcf_nlcb_context {
  */
 static int
 flow_tcf_encap_neigh(struct mlx5_flow_tcf_context *tcf,
-		     struct tcf_vtep *vtep,
+		     struct tcf_irule *iface,
 		     struct mlx5_flow *dev_flow,
 		     bool enable,
 		     struct rte_flow_error *error)
@@ -4904,7 +4904,7 @@ struct tcf_nlcb_context {
 	assert(encap->hdr.type == FLOW_TCF_TUNACT_VXLAN_ENCAP);
 	if (encap->mask & FLOW_TCF_ENCAP_IPV4_DST) {
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV4_SRC);
-		LIST_FOREACH(rule, &vtep->neigh, next) {
+		LIST_FOREACH(rule, &iface->neigh, next) {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV4_DST &&
 			    encap->ipv4.dst == rule->ipv4.dst) {
 				break;
@@ -4913,7 +4913,7 @@ struct tcf_nlcb_context {
 	} else {
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV6_SRC);
 		assert(encap->mask & FLOW_TCF_ENCAP_IPV6_DST);
-		LIST_FOREACH(rule, &vtep->neigh, next) {
+		LIST_FOREACH(rule, &iface->neigh, next) {
 			if (rule->mask & FLOW_TCF_ENCAP_IPV6_DST &&
 			    !memcmp(&encap->ipv6.dst, &rule->ipv6.dst,
 						sizeof(encap->ipv6.dst))) {
@@ -4940,7 +4940,7 @@ struct tcf_nlcb_context {
 		if (!rule->refcnt || !--rule->refcnt) {
 			LIST_REMOVE(rule, next);
 			return flow_tcf_rule_neigh(tcf, encap,
-						   vtep->ifouter,
+						   iface->ifouter,
 						   false, error);
 		}
 		return 0;
@@ -4971,13 +4971,13 @@ struct tcf_nlcb_context {
 		memcpy(&rule->ipv6.dst, &encap->ipv6.dst, IPV6_ADDR_LEN);
 	}
 	memcpy(&rule->eth, &encap->eth.dst, sizeof(rule->eth));
-	ret = flow_tcf_rule_neigh(tcf, encap, vtep->ifouter, true, error);
+	ret = flow_tcf_rule_neigh(tcf, encap, iface->ifouter, true, error);
 	if (ret) {
 		rte_free(rule);
 		return ret;
 	}
 	rule->refcnt++;
-	LIST_INSERT_HEAD(&vtep->neigh, rule, next);
+	LIST_INSERT_HEAD(&iface->neigh, rule, next);
 	return 0;
 }
 
@@ -5394,12 +5394,12 @@ struct tcf_nlcb_context {
 	}
 	dev_flow->tcf.vxlan_encap->iface = iface;
 	/* Create local ipaddr with peer to specify the outer IPs. */
-	ret = flow_tcf_encap_local(tcf, vtep, dev_flow, true, error);
+	ret = flow_tcf_encap_local(tcf, iface, dev_flow, true, error);
 	if (!ret) {
 		/* Create neigh rule to specify outer destination MAC. */
-		ret = flow_tcf_encap_neigh(tcf, vtep, dev_flow, true, error);
+		ret = flow_tcf_encap_neigh(tcf, iface, dev_flow, true, error);
 		if (ret)
-			flow_tcf_encap_local(tcf, vtep,
+			flow_tcf_encap_local(tcf, iface,
 					     dev_flow, false, error);
 	}
 	if (ret) {
@@ -5483,8 +5483,8 @@ struct tcf_nlcb_context {
 		/* Remove the encap ancillary rules first. */
 		iface = dev_flow->tcf.vxlan_encap->iface;
 		assert(iface);
-		flow_tcf_encap_neigh(tcf, vtep, dev_flow, false, NULL);
-		flow_tcf_encap_local(tcf, vtep, dev_flow, false, NULL);
+		flow_tcf_encap_neigh(tcf, iface, dev_flow, false, NULL);
+		flow_tcf_encap_local(tcf, iface, dev_flow, false, NULL);
 		flow_tcf_encap_irule_release(iface);
 		dev_flow->tcf.vxlan_encap->iface = NULL;
 		break;
-- 
1.8.3.1


* [dpdk-dev] [PATCH 4/5] net/mlx5: switch to detached VXLAN network devices
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
                   ` (2 preceding siblings ...)
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 3/5] net/mlx5: switch encap rules to use container Viacheslav Ovsiienko
@ 2018-12-29 19:55 ` Viacheslav Ovsiienko
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add RH7.2 VXLAN device metadata workaround Viacheslav Ovsiienko
  2019-01-13 12:19 ` [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Shahaf Shuler
  5 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The current design uses VXLAN virtual devices attached
to the outer network interface for encapsulation. The
kernel allows the use of non-attached devices, so now we
can create an unattached device and use it both for
encapsulation and decapsulation. Device management becomes
simpler, and fewer VXLAN devices are created and used.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 73 ++++++----------------------------------
 1 file changed, 11 insertions(+), 62 deletions(-)
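
With detached devices the VTEP is identified by its UDP port alone, so
encapsulation and decapsulation flows on the same port can share one
device. The sketch below illustrates that lookup under stated
assumptions: all demo_* names are hypothetical, ntohs() stands in for
rte_be_to_cpu_16(), calloc() replaces the netlink device creation, and
the way the decap path obtains its port is assumed for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for struct tcf_vtep. */
struct demo_vtep {
	LIST_ENTRY(demo_vtep) next;
	uint32_t refcnt;
	uint16_t port; /* UDP port in host byte order, the only key. */
};

static LIST_HEAD(, demo_vtep) demo_vtep_list =
	LIST_HEAD_INITIALIZER(demo_vtep_list);

/* Shared lookup-or-create: no outer interface is involved anymore. */
struct demo_vtep *
demo_vtep_acquire(uint16_t port)
{
	struct demo_vtep *vtep;

	LIST_FOREACH(vtep, &demo_vtep_list, next) {
		if (vtep->port == port)
			break;
	}
	if (vtep) {
		vtep->refcnt++;
		return vtep;
	}
	vtep = calloc(1, sizeof(*vtep));
	if (!vtep)
		return NULL;
	vtep->port = port;
	vtep->refcnt = 1;
	LIST_INSERT_HEAD(&demo_vtep_list, vtep, next);
	return vtep;
}

/* Encap path: the flow keeps the UDP destination port big-endian. */
struct demo_vtep *
demo_encap_vtep_acquire(uint16_t udp_dst_be)
{
	return demo_vtep_acquire(ntohs(udp_dst_be));
}

/* Decap path (assumed here to pass the port in host byte order). */
struct demo_vtep *
demo_decap_vtep_acquire(uint16_t port)
{
	return demo_vtep_acquire(port);
}
```

Because the key no longer includes the outer interface, the port
scanning loop over a driver-private UDP range, removed in the diff
below, is not needed.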

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index b99e322..7f9a76c 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -443,11 +443,8 @@ struct tcf_irule {
 /** VXLAN virtual netdev. */
 struct tcf_vtep {
 	LIST_ENTRY(tcf_vtep) next;
-	LIST_HEAD(, tcf_neigh_rule) neigh;
-	LIST_HEAD(, tcf_local_rule) local;
 	uint32_t refcnt;
 	unsigned int ifindex; /**< Own interface index. */
-	unsigned int ifouter; /**< Index of device attached to. */
 	uint16_t port;
 	uint8_t created;
 };
@@ -5109,11 +5106,6 @@ struct tcf_nlcb_context {
  *
  * @param[in] tcf
  *   Context object initialized by mlx5_flow_tcf_context_create().
- * @param[in] ifouter
- *   Outer interface to attach new-created VXLAN device
- *   If zero the VXLAN device will not be attached to any device.
- *   These VTEPs are used for decapsulation and can be precreated
- *   and shared between processes.
  * @param[in] port
  *   UDP port of created VTEP device.
  * @param[out] error
@@ -5126,7 +5118,6 @@ struct tcf_nlcb_context {
 #ifdef HAVE_IFLA_VXLAN_COLLECT_METADATA
 static struct tcf_vtep*
 flow_tcf_vtep_create(struct mlx5_flow_tcf_context *tcf,
-		     unsigned int ifouter,
 		     uint16_t port, struct rte_flow_error *error)
 {
 	struct tcf_vtep *vtep;
@@ -5156,8 +5147,6 @@ struct tcf_nlcb_context {
 	}
 	*vtep = (struct tcf_vtep){
 			.port = port,
-			.local = LIST_HEAD_INITIALIZER(),
-			.neigh = LIST_HEAD_INITIALIZER(),
 	};
 	memset(buf, 0, sizeof(buf));
 	nlh = mnl_nlmsg_put_header(buf);
@@ -5175,8 +5164,6 @@ struct tcf_nlcb_context {
 	assert(na_info);
 	mnl_attr_put_strz(nlh, IFLA_INFO_KIND, "vxlan");
 	na_vxlan = mnl_attr_nest_start(nlh, IFLA_INFO_DATA);
-	if (ifouter)
-		mnl_attr_put_u32(nlh, IFLA_VXLAN_LINK, ifouter);
 	assert(na_vxlan);
 	mnl_attr_put_u8(nlh, IFLA_VXLAN_COLLECT_METADATA, 1);
 	mnl_attr_put_u8(nlh, IFLA_VXLAN_UDP_ZERO_CSUM6_RX, 1);
@@ -5190,7 +5177,7 @@ struct tcf_nlcb_context {
 		DRV_LOG(WARNING,
 			"netlink: VTEP %s create failure (%d)",
 			name, rte_errno);
-		if (rte_errno != EEXIST || ifouter)
+		if (rte_errno != EEXIST)
 			/*
 			 * Some unhandled error occurred or device is
 			 * for encapsulation and cannot be shared.
@@ -5216,7 +5203,6 @@ struct tcf_nlcb_context {
 		goto error;
 	}
 	vtep->ifindex = ret;
-	vtep->ifouter = ifouter;
 	memset(buf, 0, sizeof(buf));
 	nlh = mnl_nlmsg_put_header(buf);
 	nlh->nlmsg_type = RTM_NEWLINK;
@@ -5254,7 +5240,6 @@ struct tcf_nlcb_context {
 #else
 static struct tcf_vtep*
 flow_tcf_vtep_create(struct mlx5_flow_tcf_context *tcf __rte_unused,
-		     unsigned int ifouter __rte_unused,
 		     uint16_t port __rte_unused,
 		     struct rte_flow_error *error)
 {
@@ -5293,13 +5278,6 @@ struct tcf_nlcb_context {
 		if (vtep->port == port)
 			break;
 	}
-	if (vtep && vtep->ifouter) {
-		rte_flow_error_set(error, -errno,
-				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-				   "Failed to create decap VTEP with specified"
-				   " UDP port, atatched device exists");
-		return NULL;
-	}
 	if (vtep) {
 		/* Device exists, just increment the reference counter. */
 		vtep->refcnt++;
@@ -5307,7 +5285,7 @@ struct tcf_nlcb_context {
 		return vtep;
 	}
 	/* No decapsulation device exists, try to create the new one. */
-	vtep = flow_tcf_vtep_create(tcf, 0, port, error);
+	vtep = flow_tcf_vtep_create(tcf, port, error);
 	if (vtep)
 		LIST_INSERT_HEAD(&vtep_list_vxlan, vtep, next);
 	return vtep;
@@ -5319,7 +5297,7 @@ struct tcf_nlcb_context {
  * @param[in] tcf
  *   Context object initialized by mlx5_flow_tcf_context_create().
  * @param[in] ifouter
- *   Network interface index to attach VXLAN encap device to.
+ *   Network interface index to create VXLAN encap rules on.
  * @param[in] dev_flow
  *   Flow tcf object with tunnel structure pointer set.
  * @param[out] error
@@ -5331,60 +5309,31 @@ struct tcf_nlcb_context {
 static struct tcf_vtep*
 flow_tcf_encap_vtep_acquire(struct mlx5_flow_tcf_context *tcf,
 			    unsigned int ifouter,
-			    struct mlx5_flow *dev_flow __rte_unused,
+			    struct mlx5_flow *dev_flow,
 			    struct rte_flow_error *error)
 {
-	static uint16_t encap_port = MLX5_VXLAN_PORT_MIN - 1;
+	static uint16_t port;
 	struct tcf_vtep *vtep;
 	struct tcf_irule *iface;
 	int ret;
 
 	assert(ifouter);
-	/* Look whether the attached VTEP for encap is created. */
+	/* Look whether the VTEP for specified port is created. */
+	port = rte_be_to_cpu_16(dev_flow->tcf.vxlan_encap->udp.dst);
 	LIST_FOREACH(vtep, &vtep_list_vxlan, next) {
-		if (vtep->ifouter == ifouter)
+		if (vtep->port == port)
 			break;
 	}
 	if (vtep) {
 		/* VTEP already exists, just increment the reference. */
 		vtep->refcnt++;
 	} else {
-		uint16_t pcnt;
-
-		/* Not found, we should create the new attached VTEP. */
-		flow_tcf_encap_iface_cleanup(tcf, ifouter);
-		flow_tcf_encap_local_cleanup(tcf, ifouter);
-		flow_tcf_encap_neigh_cleanup(tcf, ifouter);
-		for (pcnt = 0; pcnt <= (MLX5_VXLAN_PORT_MAX
-				     - MLX5_VXLAN_PORT_MIN); pcnt++) {
-			encap_port++;
-			/* Wraparound the UDP port index. */
-			if (encap_port < MLX5_VXLAN_PORT_MIN ||
-			    encap_port > MLX5_VXLAN_PORT_MAX)
-				encap_port = MLX5_VXLAN_PORT_MIN;
-			/* Check whether UDP port is in already in use. */
-			LIST_FOREACH(vtep, &vtep_list_vxlan, next) {
-				if (vtep->port == encap_port)
-					break;
-			}
-			if (vtep) {
-				/* Port is in use, try the next one. */
-				vtep = NULL;
-				continue;
-			}
-			vtep = flow_tcf_vtep_create(tcf, ifouter,
-						    encap_port, error);
-			if (vtep) {
-				LIST_INSERT_HEAD(&vtep_list_vxlan, vtep, next);
-				break;
-			}
-			if (rte_errno != EEXIST)
-				break;
-		}
+		/* Not found, we should create the new VTEP. */
+		vtep = flow_tcf_vtep_create(tcf, port, error);
 		if (!vtep)
 			return NULL;
+		LIST_INSERT_HEAD(&vtep_list_vxlan, vtep, next);
 	}
-	assert(vtep->ifouter == ifouter);
 	assert(vtep->ifindex);
 	iface = flow_tcf_encap_irule_acquire(tcf, ifouter, error);
 	if (!iface) {
-- 
1.8.3.1


* [dpdk-dev] [PATCH 5/5] net/mlx5: add RH7.2 VXLAN device metadata workaround
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
                   ` (3 preceding siblings ...)
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 4/5] net/mlx5: switch to detached VXLAN network devices Viacheslav Ovsiienko
@ 2018-12-29 19:55 ` Viacheslav Ovsiienko
  2019-01-13 12:19 ` [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Shahaf Shuler
  5 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Ovsiienko @ 2018-12-29 19:55 UTC (permalink / raw)
  To: shahafs; +Cc: dev

RH7.2 with kernel 3.10.0-327 does not support VXLAN
device metadata: the IFLA_VXLAN_COLLECT_METADATA key is
neither defined nor supported. In that case we must specify
the VNI parameter explicitly. The kernel actually ignores
this value, because the applied rules are processed by the
mlx5 kernel driver and the actual VNI from the rules is used.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index 7f9a76c..72bad85 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -351,9 +351,8 @@ struct tc_tunnel_key {
 #define TCA_ACT_MAX_PRIO 32
 #endif
 
-/** UDP port range of VXLAN devices created by driver. */
-#define MLX5_VXLAN_PORT_MIN 30000
-#define MLX5_VXLAN_PORT_MAX 60000
+/** Parameters of VXLAN devices created by driver. */
+#define MLX5_VXLAN_DEFAULT_VNI	1
 #define MLX5_VXLAN_DEVICE_PFX "vmlx_"
 
 /** Tunnel action type, used for @p type in header structure. */
@@ -5115,7 +5114,6 @@ struct tcf_nlcb_context {
  * Pointer to created device structure on success,
  * NULL otherwise and rte_errno is set.
  */
-#ifdef HAVE_IFLA_VXLAN_COLLECT_METADATA
 static struct tcf_vtep*
 flow_tcf_vtep_create(struct mlx5_flow_tcf_context *tcf,
 		     uint16_t port, struct rte_flow_error *error)
@@ -5165,10 +5163,24 @@ struct tcf_nlcb_context {
 	mnl_attr_put_strz(nlh, IFLA_INFO_KIND, "vxlan");
 	na_vxlan = mnl_attr_nest_start(nlh, IFLA_INFO_DATA);
 	assert(na_vxlan);
+#ifdef HAVE_IFLA_VXLAN_COLLECT_METADATA
+	/*
+	 * RH 7.2 does not support metadata for tunnel device.
+	 * It does not matter because we are going to use the
+	 * hardware offload by mlx5 driver.
+	 */
 	mnl_attr_put_u8(nlh, IFLA_VXLAN_COLLECT_METADATA, 1);
+#endif
 	mnl_attr_put_u8(nlh, IFLA_VXLAN_UDP_ZERO_CSUM6_RX, 1);
 	mnl_attr_put_u8(nlh, IFLA_VXLAN_LEARNING, 0);
 	mnl_attr_put_u16(nlh, IFLA_VXLAN_PORT, vxlan_port);
+#ifndef HAVE_IFLA_VXLAN_COLLECT_METADATA
+	/*
+	 *  We must specify VNI explicitly if metadata not supported.
+	 *  Note, VNI is transferred with native endianness format.
+	 */
+	mnl_attr_put_u32(nlh, IFLA_VXLAN_ID, MLX5_VXLAN_DEFAULT_VNI);
+#endif
 	mnl_attr_nest_end(nlh, na_vxlan);
 	mnl_attr_nest_end(nlh, na_info);
 	assert(sizeof(buf) >= nlh->nlmsg_len);
@@ -5237,19 +5249,6 @@ struct tcf_nlcb_context {
 	rte_free(vtep);
 	return NULL;
 }
-#else
-static struct tcf_vtep*
-flow_tcf_vtep_create(struct mlx5_flow_tcf_context *tcf __rte_unused,
-		     uint16_t port __rte_unused,
-		     struct rte_flow_error *error)
-{
-	rte_flow_error_set(error, ENOTSUP,
-			   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
-			   "netlink: failed to create VTEP, "
-			   "vxlan metadata are not supported by kernel");
-	return NULL;
-}
-#endif /* HAVE_IFLA_VXLAN_COLLECT_METADATA */
 
 /**
  * Acquire target interface index for VXLAN tunneling decapsulation.
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch
  2018-12-29 19:55 [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices management for E-Switch Viacheslav Ovsiienko
                   ` (4 preceding siblings ...)
  2018-12-29 19:55 ` [dpdk-dev] [PATCH 5/5] net/mlx5: add RH7.2 VXLAN device metadata workaround Viacheslav Ovsiienko
@ 2019-01-13 12:19 ` Shahaf Shuler
  5 siblings, 0 replies; 7+ messages in thread
From: Shahaf Shuler @ 2019-01-13 12:19 UTC (permalink / raw)
  To: Slava Ovsiienko; +Cc: dev

Saturday, December 29, 2018 9:56 PM, Viacheslav Ovsiienko:
> Subject: [dpdk-dev] [PATCH 0/5] net/mlx5: simplify VXLAN devices
> management for E-Switch
> 
> This patchset simplifies the virtual VXLAN tunnel devices management.
> Previous design used the VXLAN devices attached to outer interface for
> encapsulation rules. The new design uses the unattached devices, it allows
> use the single VXLAN device both for encapsulation and decapsulation rules
> and removes UDP port sharing issues.
> 
> Also patchset introduces the minor changes in VXLAN device management
> allowing to be compiled and operate on some old kernels (for example RH7.2
> original kernel 3.10.327), which do not support VXLAN device metadata.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

Applied to next-net-mlx, thanks. 

