DPDK patches and discussions
* [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support
@ 2016-01-29 10:31 Adrien Mazarguil
  2016-01-29 10:31 ` [dpdk-dev] [PATCH 1/5] mlx5: refactor special flows handling Adrien Mazarguil
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:31 UTC
  To: dev

To preserve compatibility with Mellanox OFED 3.1, the flow director and RX
VLAN stripping code is only enabled when compiled against Mellanox OFED 3.2.

Yaacov Hazan (5):
  mlx5: refactor special flows handling
  mlx5: add special flows (broadcast and IPv6 multicast)
  mlx5: make flow steering rule generator more generic
  mlx5: add support for flow director
  mlx5: add support for RX VLAN stripping

 drivers/net/mlx5/Makefile       |   6 +
 drivers/net/mlx5/mlx5.c         |  39 +-
 drivers/net/mlx5/mlx5.h         |  19 +-
 drivers/net/mlx5/mlx5_defs.h    |  14 +
 drivers/net/mlx5/mlx5_ethdev.c  |   3 +-
 drivers/net/mlx5/mlx5_fdir.c    | 890 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mac.c     |  10 +-
 drivers/net/mlx5/mlx5_rxmode.c  | 350 ++++++++--------
 drivers/net/mlx5/mlx5_rxq.c     |  80 +++-
 drivers/net/mlx5/mlx5_rxtx.c    |  27 ++
 drivers/net/mlx5/mlx5_rxtx.h    |  51 ++-
 drivers/net/mlx5/mlx5_trigger.c |  21 +-
 drivers/net/mlx5/mlx5_vlan.c    | 104 +++++
 13 files changed, 1388 insertions(+), 226 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c

-- 
2.1.4


* [dpdk-dev] [PATCH 1/5] mlx5: refactor special flows handling
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
@ 2016-01-29 10:31 ` Adrien Mazarguil
  2016-01-29 10:31 ` [dpdk-dev] [PATCH 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:31 UTC
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.

The new function priv_rehash_flows() enables or disables the relevant flows
in one place and can be called from any context.
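
As a rough illustration of the table-driven approach (a simplified sketch, not
the driver code: every example_* name below is hypothetical, and plain booleans
stand in for the verbs flow handles), a single rehash routine can derive the
state of each flow from the requested modes and a static table:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical queue and flow types used only by this sketch. */
enum example_queue_type {
	EXAMPLE_QUEUE_TCPV4,
	EXAMPLE_QUEUE_UDPV4,
	EXAMPLE_QUEUE_ETH,
};

enum example_flow_type {
	EXAMPLE_FLOW_PROMISC,
	EXAMPLE_FLOW_ALLMULTI,
	EXAMPLE_FLOW_MAX,
};

#define EXAMPLE_QUEUES_N 3

struct example_queue {
	enum example_queue_type type;
	bool flow[EXAMPLE_FLOW_MAX]; /* Stand-in for installed flow handles. */
};

struct example_dev {
	bool requested[EXAMPLE_FLOW_MAX]; /* Modes requested by the user. */
	struct example_queue queues[EXAMPLE_QUEUES_N];
};

/* One table entry per special flow: the queue types it applies to. */
static const unsigned int example_flow_queue_types[EXAMPLE_FLOW_MAX] = {
	[EXAMPLE_FLOW_PROMISC] =
		1u << EXAMPLE_QUEUE_TCPV4 |
		1u << EXAMPLE_QUEUE_UDPV4 |
		1u << EXAMPLE_QUEUE_ETH,
	[EXAMPLE_FLOW_ALLMULTI] =
		1u << EXAMPLE_QUEUE_UDPV4 |
		1u << EXAMPLE_QUEUE_ETH,
};

/* Single entry point: bring every queue in line with the requested modes. */
static void
example_rehash_flows(struct example_dev *dev)
{
	unsigned int t;
	unsigned int q;

	for (t = 0; t != EXAMPLE_FLOW_MAX; ++t) {
		for (q = 0; q != EXAMPLE_QUEUES_N; ++q) {
			struct example_queue *queue = &dev->queues[q];
			bool relevant;

			assert(queue->type <= EXAMPLE_QUEUE_ETH);
			relevant = !!(example_flow_queue_types[t] &
				      (1u << queue->type));
			/* Enabled only if requested and relevant here. */
			queue->flow[t] = dev->requested[t] && relevant;
		}
	}
}
```

Callers only update a request flag and invoke the rehash routine; adding a new
special flow then amounts to adding one table entry.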

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |   4 +-
 drivers/net/mlx5/mlx5.h         |   6 +-
 drivers/net/mlx5/mlx5_defs.h    |   3 +
 drivers/net/mlx5/mlx5_rxmode.c  | 321 ++++++++++++++++++----------------------
 drivers/net/mlx5/mlx5_rxq.c     |  33 ++++-
 drivers/net/mlx5/mlx5_rxtx.h    |  29 +++-
 drivers/net/mlx5/mlx5_trigger.c |  14 +-
 7 files changed, 210 insertions(+), 200 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 821ee0f..4180842 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -88,8 +88,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	      ((priv->ctx != NULL) ? priv->ctx->device->name : ""));
 	/* In case mlx5_dev_stop() has not been called. */
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b84d31d..bc0c7e2 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -194,13 +194,11 @@ int mlx5_dev_rss_reta_update(struct rte_eth_dev *,
 
 /* mlx5_rxmode.c */
 
-int priv_promiscuous_enable(struct priv *);
+int priv_special_flow_enable(struct priv *, enum hash_rxq_flow_type);
+void priv_special_flow_disable(struct priv *, enum hash_rxq_flow_type);
 void mlx5_promiscuous_enable(struct rte_eth_dev *);
-void priv_promiscuous_disable(struct priv *);
 void mlx5_promiscuous_disable(struct rte_eth_dev *);
-int priv_allmulticast_enable(struct priv *);
 void mlx5_allmulticast_enable(struct rte_eth_dev *);
-void priv_allmulticast_disable(struct priv *);
 void mlx5_allmulticast_disable(struct rte_eth_dev *);
 
 /* mlx5_stats.c */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index ae5eda9..1f2a010 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -43,6 +43,9 @@
 /* Maximum number of simultaneous VLAN filters. */
 #define MLX5_MAX_VLAN_IDS 128
 
+/* Maximum number of special flows. */
+#define MLX5_MAX_SPECIAL_FLOWS 2
+
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
 
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 096fd18..b2ed17e 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -58,31 +58,96 @@
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
-static void hash_rxq_promiscuous_disable(struct hash_rxq *);
-static void hash_rxq_allmulticast_disable(struct hash_rxq *);
+/* Initialization data for special flows. */
+static const struct special_flow_init special_flow_init[] = {
+	[HASH_RXQ_FLOW_TYPE_PROMISC] = {
+		.dst_mac_val = "\x00\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x00\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_TCPV4 |
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_TCPV6 |
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+	[HASH_RXQ_FLOW_TYPE_ALLMULTI] = {
+		.dst_mac_val = "\x01\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x01\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+};
 
 /**
- * Enable promiscuous mode in a hash RX queue.
+ * Enable a special flow in a hash RX queue.
  *
  * @param hash_rxq
  *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 static int
-hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
+hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
+			     enum hash_rxq_flow_type flow_type)
 {
 	struct ibv_exp_flow *flow;
 	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
 	struct ibv_exp_flow_attr *attr = &data->attr;
+	struct ibv_exp_flow_spec_eth *spec = &data->spec;
+	const uint8_t *mac;
+	const uint8_t *mask;
 
-	if (hash_rxq->promisc_flow != NULL)
+	/* Check if flow is relevant for this hash_rxq. */
+	if (!(special_flow_init[flow_type].hash_types & (1 << hash_rxq->type)))
+		return 0;
+	/* Check if flow already exists. */
+	if (hash_rxq->special_flow[flow_type] != NULL)
 		return 0;
-	DEBUG("%p: enabling promiscuous mode", (void *)hash_rxq);
-	/* Promiscuous flows only differ from normal flows by not filtering
-	 * on specific MAC addresses. */
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
 	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	/* The first specification must be Ethernet. */
+	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec->size == sizeof(*spec));
+
+	mac = special_flow_init[flow_type].dst_mac_val;
+	mask = special_flow_init[flow_type].dst_mac_mask;
+	*spec = (struct ibv_exp_flow_spec_eth){
+		.type = IBV_EXP_FLOW_SPEC_ETH,
+		.size = sizeof(*spec),
+		.val = {
+			.dst_mac = {
+				mac[0], mac[1], mac[2],
+				mac[3], mac[4], mac[5],
+			},
+		},
+		.mask = {
+			.dst_mac = {
+				mask[0], mask[1], mask[2],
+				mask[3], mask[4], mask[5],
+			},
+		},
+	};
+
 	errno = 0;
 	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
 	if (flow == NULL) {
@@ -94,38 +159,63 @@ hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
 			return errno;
 		return EINVAL;
 	}
-	hash_rxq->promisc_flow = flow;
-	DEBUG("%p: promiscuous mode enabled", (void *)hash_rxq);
+	hash_rxq->special_flow[flow_type] = flow;
+	DEBUG("%p: special flow %s (%d) enabled",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
 	return 0;
 }
 
 /**
- * Enable promiscuous mode in all hash RX queues.
+ * Disable a special flow in a hash RX queue.
+ *
+ * @param hash_rxq
+ *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+static void
+hash_rxq_special_flow_disable(struct hash_rxq *hash_rxq,
+			      enum hash_rxq_flow_type flow_type)
+{
+	if (hash_rxq->special_flow[flow_type] == NULL)
+		return;
+	DEBUG("%p: disabling special flow %s (%d)",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+	claim_zero(ibv_exp_destroy_flow(hash_rxq->special_flow[flow_type]));
+	hash_rxq->special_flow[flow_type] = NULL;
+	DEBUG("%p: special flow %s (%d) disabled",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+}
+
+/**
+ * Enable a special flow in all hash RX queues.
  *
  * @param priv
  *   Private structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 int
-priv_promiscuous_enable(struct priv *priv)
+priv_special_flow_enable(struct priv *priv, enum hash_rxq_flow_type flow_type)
 {
 	unsigned int i;
 
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_PROMISC))
+	if (!priv_allow_flow_type(priv, flow_type))
 		return 0;
 	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
 		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
 		int ret;
 
-		ret = hash_rxq_promiscuous_enable(hash_rxq);
+		ret = hash_rxq_special_flow_enable(hash_rxq, flow_type);
 		if (!ret)
 			continue;
 		/* Failure, rollback. */
 		while (i != 0) {
 			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_promiscuous_disable(hash_rxq);
+			hash_rxq_special_flow_disable(hash_rxq, flow_type);
 		}
 		return ret;
 	}
@@ -133,6 +223,26 @@ priv_promiscuous_enable(struct priv *priv)
 }
 
 /**
+ * Disable a special flow in all hash RX queues.
+ *
+ * @param priv
+ *   Private structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+void
+priv_special_flow_disable(struct priv *priv, enum hash_rxq_flow_type flow_type)
+{
+	unsigned int i;
+
+	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
+		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
+
+		hash_rxq_special_flow_disable(hash_rxq, flow_type);
+	}
+}
+
+/**
  * DPDK callback to enable promiscuous mode.
  *
  * @param dev
@@ -146,49 +256,14 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->promisc_req = 1;
-	ret = priv_promiscuous_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable promiscuous mode: %s", strerror(ret));
-	else {
-		priv_mac_addrs_disable(priv);
-		priv_allmulticast_disable(priv);
-	}
+		ERROR("error while enabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable promiscuous mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_promiscuous_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->promisc_flow == NULL)
-		return;
-	DEBUG("%p: disabling promiscuous mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->promisc_flow));
-	hash_rxq->promisc_flow = NULL;
-	DEBUG("%p: promiscuous mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable promiscuous mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_promiscuous_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_promiscuous_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable promiscuous mode.
  *
  * @param dev
@@ -198,105 +273,18 @@ void
 mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->promisc_req = 0;
-	priv_promiscuous_disable(priv);
-	priv_mac_addrs_enable(priv);
-	priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Enable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-static int
-hash_rxq_allmulticast_enable(struct hash_rxq *hash_rxq)
-{
-	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
-	struct ibv_exp_flow_attr *attr = &data->attr;
-	struct ibv_exp_flow_spec_eth *spec = &data->spec;
-
-	if (hash_rxq->allmulti_flow != NULL)
-		return 0;
-	DEBUG("%p: enabling allmulticast mode", (void *)hash_rxq);
-	/*
-	 * No padding must be inserted by the compiler between attr and spec.
-	 * This layout is expected by libibverbs.
-	 */
-	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
-	*spec = (struct ibv_exp_flow_spec_eth){
-		.type = IBV_EXP_FLOW_SPEC_ETH,
-		.size = sizeof(*spec),
-		.val = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-		.mask = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-	};
-	errno = 0;
-	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
-	if (flow == NULL) {
-		/* It's not clear whether errno is always set in this case. */
-		ERROR("%p: flow configuration failed, errno=%d: %s",
-		      (void *)hash_rxq, errno,
-		      (errno ? strerror(errno) : "Unknown error"));
-		if (errno)
-			return errno;
-		return EINVAL;
-	}
-	hash_rxq->allmulti_flow = flow;
-	DEBUG("%p: allmulticast mode enabled", (void *)hash_rxq);
-	return 0;
-}
-
-/**
- * Enable allmulti mode in most hash RX queues.
- * TCP queues are exempted to save resources.
- *
- * @param priv
- *   Private structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-int
-priv_allmulticast_enable(struct priv *priv)
-{
-	unsigned int i;
-
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI))
-		return 0;
-	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
-		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
-		int ret;
-
-		/* allmulticast not relevant for TCP. */
-		if (hash_rxq->type == HASH_RXQ_TCPV4)
-			continue;
-		ret = hash_rxq_allmulticast_enable(hash_rxq);
-		if (!ret)
-			continue;
-		/* Failure, rollback. */
-		while (i != 0) {
-			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_allmulticast_disable(hash_rxq);
-		}
-		return ret;
-	}
-	return 0;
-}
-
-/**
  * DPDK callback to enable allmulti mode.
  *
  * @param dev
@@ -310,45 +298,14 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->allmulti_req = 1;
-	ret = priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable allmulticast mode: %s", strerror(ret));
+		ERROR("error while enabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_allmulticast_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->allmulti_flow == NULL)
-		return;
-	DEBUG("%p: disabling allmulticast mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->allmulti_flow));
-	hash_rxq->allmulti_flow = NULL;
-	DEBUG("%p: allmulticast mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable allmulti mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_allmulticast_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_allmulticast_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable allmulti mode.
  *
  * @param dev
@@ -358,9 +315,13 @@ void
 mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->allmulti_req = 0;
-	priv_allmulticast_disable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 37b4efd..11f447a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -534,8 +534,8 @@ priv_destroy_hash_rxqs(struct priv *priv)
 		assert(hash_rxq->priv == priv);
 		assert(hash_rxq->qp != NULL);
 		/* Also check that there are no remaining flows. */
-		assert(hash_rxq->allmulti_flow == NULL);
-		assert(hash_rxq->promisc_flow == NULL);
+		for (j = 0; (j != RTE_DIM(hash_rxq->special_flow)); ++j)
+			assert(hash_rxq->special_flow[j] == NULL);
 		for (j = 0; (j != RTE_DIM(hash_rxq->mac_flow)); ++j)
 			for (k = 0; (k != RTE_DIM(hash_rxq->mac_flow[j])); ++k)
 				assert(hash_rxq->mac_flow[j][k] == NULL);
@@ -586,6 +586,35 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 }
 
 /**
+ * Automatically enable/disable flows according to configuration.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+priv_rehash_flows(struct priv *priv)
+{
+	unsigned int i;
+
+	for (i = 0; (i != RTE_DIM((*priv->hash_rxqs)[0].special_flow)); ++i)
+		if (!priv_allow_flow_type(priv, i)) {
+			priv_special_flow_disable(priv, i);
+		} else {
+			int ret = priv_special_flow_enable(priv, i);
+
+			if (ret)
+				return ret;
+		}
+	if (priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_MAC))
+		return priv_mac_addrs_enable(priv);
+	priv_mac_addrs_disable(priv);
+	return 0;
+}
+
+/**
  * Allocate RX queue elements with scattered packets support.
  *
  * @param rxq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e1e1925..983e6a4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -176,20 +176,42 @@ struct ind_table_init {
 	unsigned int hash_types_n;
 };
 
+/* Initialization data for special flows. */
+struct special_flow_init {
+	uint8_t dst_mac_val[6];
+	uint8_t dst_mac_mask[6];
+	unsigned int hash_types;
+};
+
 enum hash_rxq_flow_type {
-	HASH_RXQ_FLOW_TYPE_MAC,
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
+#ifndef NDEBUG
+static inline const char *
+hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
+{
+	switch (flow_type) {
+	case HASH_RXQ_FLOW_TYPE_PROMISC:
+		return "promiscuous";
+	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
+		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_MAC:
+		return "MAC";
+	}
+	return NULL;
+}
+#endif /* NDEBUG */
+
 struct hash_rxq {
 	struct priv *priv; /* Back pointer to private data. */
 	struct ibv_qp *qp; /* Hash RX QP. */
 	enum hash_rxq_type type; /* Hash RX queue type. */
 	/* MAC flow steering rules, one per VLAN ID. */
 	struct ibv_exp_flow *mac_flow[MLX5_MAX_MAC_ADDRESSES][MLX5_MAX_VLAN_IDS];
-	struct ibv_exp_flow *promisc_flow; /* Promiscuous flow. */
-	struct ibv_exp_flow *allmulti_flow; /* Multicast flow. */
+	struct ibv_exp_flow *special_flow[MLX5_MAX_SPECIAL_FLOWS];
 };
 
 /* TX element. */
@@ -247,6 +269,7 @@ size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
+int priv_rehash_flows(struct priv *);
 void rxq_cleanup(struct rxq *);
 int rxq_rehash(struct rte_eth_dev *, struct rxq *);
 int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ff1203d..d9f7d00 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -72,11 +72,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DEBUG("%p: allocating and configuring hash RX queues", (void *)dev);
 	err = priv_create_hash_rxqs(priv);
 	if (!err)
-		err = priv_promiscuous_enable(priv);
-	if (!err)
-		err = priv_mac_addrs_enable(priv);
-	if (!err)
-		err = priv_allmulticast_enable(priv);
+		err = priv_rehash_flows(priv);
 	if (!err)
 		priv->started = 1;
 	else {
@@ -84,8 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
-		priv_allmulticast_disable(priv);
-		priv_promiscuous_disable(priv);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
@@ -113,8 +109,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-- 
2.1.4


* [dpdk-dev] [PATCH 2/5] mlx5: add special flows (broadcast and IPv6 multicast)
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  2016-01-29 10:31 ` [dpdk-dev] [PATCH 1/5] mlx5: refactor special flows handling Adrien Mazarguil
@ 2016-01-29 10:31 ` Adrien Mazarguil
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:31 UTC
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Until now, broadcast frames were handled like unicast. Moving the related
flow to the special flows table frees up the unicast MAC entry that was
previously reserved for broadcast.

The same method is used to handle IPv6 multicast frames.
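
The value/mask pairs involved can be illustrated in software (a sketch only,
assuming the usual semantics where a frame matches when its destination MAC
equals the value on every masked bit; in the driver the matching is performed
through flow steering rules, and all example_* names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ETHER_ADDR_LEN 6

/* Value/mask pair, in the spirit of dst_mac_val and dst_mac_mask. */
struct example_mac_match {
	uint8_t val[EXAMPLE_ETHER_ADDR_LEN];
	uint8_t mask[EXAMPLE_ETHER_ADDR_LEN];
};

/* Broadcast: all 48 bits must be set. */
static const struct example_mac_match example_broadcast = {
	.val = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
	.mask = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
};

/* IPv6 multicast: only the 33:33 prefix is compared. */
static const struct example_mac_match example_ipv6_multicast = {
	.val = { 0x33, 0x33, 0x00, 0x00, 0x00, 0x00 },
	.mask = { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 },
};

/* Return true when dst equals the value on every bit set in the mask. */
static bool
example_mac_matches(const struct example_mac_match *match,
		    const uint8_t dst[EXAMPLE_ETHER_ADDR_LEN])
{
	size_t i;

	assert(match != NULL);
	for (i = 0; i != EXAMPLE_ETHER_ADDR_LEN; ++i)
		if ((dst[i] & match->mask[i]) !=
		    (match->val[i] & match->mask[i]))
			return false;
	return true;
}
```

With these entries, a frame sent to 33:33:00:00:00:01 matches the IPv6
multicast pair but not the broadcast pair, since the broadcast mask compares
all six bytes.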

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |  7 +++----
 drivers/net/mlx5/mlx5_defs.h    |  2 +-
 drivers/net/mlx5/mlx5_ethdev.c  |  3 +--
 drivers/net/mlx5/mlx5_mac.c     |  6 ++----
 drivers/net/mlx5/mlx5_rxmode.c  | 24 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c     | 10 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.h    |  6 ++++++
 drivers/net/mlx5/mlx5_trigger.c |  4 ++++
 8 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 4180842..6afc873 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -90,6 +90,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
@@ -415,13 +417,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		     mac.addr_bytes[0], mac.addr_bytes[1],
 		     mac.addr_bytes[2], mac.addr_bytes[3],
 		     mac.addr_bytes[4], mac.addr_bytes[5]);
-		/* Register MAC and broadcast addresses. */
+		/* Register MAC address. */
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
-		claim_zero(priv_mac_addr_add(priv, (RTE_DIM(priv->mac) - 1),
-					     &(const uint8_t [ETHER_ADDR_LEN])
-					     { "\xff\xff\xff\xff\xff\xff" }));
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 1f2a010..a7440e7 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -44,7 +44,7 @@
 #define MLX5_MAX_VLAN_IDS 128
 
 /* Maximum number of special flows. */
-#define MLX5_MAX_SPECIAL_FLOWS 2
+#define MLX5_MAX_SPECIAL_FLOWS 4
 
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 1159fa3..6704382 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -501,8 +501,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		max = 65535;
 	info->max_rx_queues = max;
 	info->max_tx_queues = max;
-	/* Last array entry is reserved for broadcast. */
-	info->max_mac_addrs = (RTE_DIM(priv->mac) - 1);
+	info->max_mac_addrs = RTE_DIM(priv->mac);
 	info->rx_offload_capa =
 		(priv->hw_csum ?
 		 (DEV_RX_OFFLOAD_IPV4_CKSUM |
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index e37ce06..1a0b974 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -212,8 +212,7 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_del(priv, index);
 end:
@@ -479,8 +478,7 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_add(priv, index,
 			  (const uint8_t (*)[ETHER_ADDR_LEN])
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index b2ed17e..6ee7ce3 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -88,6 +88,30 @@ static const struct special_flow_init special_flow_init[] = {
 			1 << HASH_RXQ_ETH |
 			0,
 	},
+	[HASH_RXQ_FLOW_TYPE_BROADCAST] = {
+		.dst_mac_val = "\xff\xff\xff\xff\xff\xff",
+		.dst_mac_mask = "\xff\xff\xff\xff\xff\xff",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#ifdef HAVE_FLOW_SPEC_IPV6
+	[HASH_RXQ_FLOW_TYPE_IPV6MULTI] = {
+		.dst_mac_val = "\x33\x33\x00\x00\x00\x00",
+		.dst_mac_mask = "\xff\xff\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#endif /* HAVE_FLOW_SPEC_IPV6 */
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 11f447a..27e3bcc 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -579,8 +579,18 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 		return !!priv->promisc_req;
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return !!priv->allmulti_req;
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+		/* If allmulti is enabled, broadcast and ipv6multi
+		 * are unnecessary. */
+		return !priv->allmulti_req;
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return 1;
+	default:
+		/* Unsupported flow type is not allowed. */
+		return 0;
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 983e6a4..d5a5019 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -186,6 +186,8 @@ struct special_flow_init {
 enum hash_rxq_flow_type {
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_BROADCAST,
+	HASH_RXQ_FLOW_TYPE_IPV6MULTI,
 	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
@@ -198,6 +200,10 @@ hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
 		return "promiscuous";
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+		return "broadcast";
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+		return "IPv6 multicast";
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return "MAC";
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index d9f7d00..90b8068 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -80,6 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
@@ -109,6 +111,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
-- 
2.1.4


* [dpdk-dev] [PATCH 3/5] mlx5: make flow steering rule generator more generic
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  2016-01-29 10:31 ` [dpdk-dev] [PATCH 1/5] mlx5: refactor special flows handling Adrien Mazarguil
  2016-01-29 10:31 ` [dpdk-dev] [PATCH 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
@ 2016-01-29 10:32 ` Adrien Mazarguil
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director Adrien Mazarguil
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:32 UTC
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Upcoming flow director support will reuse this function to generate filter
rules.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_mac.c    |  4 ++--
 drivers/net/mlx5/mlx5_rxmode.c |  5 +++--
 drivers/net/mlx5/mlx5_rxq.c    | 16 ++++++++--------
 drivers/net/mlx5/mlx5_rxtx.h   |  4 ++--
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 1a0b974..f952afc 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -241,7 +241,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	const uint8_t (*mac)[ETHER_ADDR_LEN] =
 			(const uint8_t (*)[ETHER_ADDR_LEN])
 			priv->mac[mac_index].addr_bytes;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	unsigned int vlan_enabled = !!priv->vlan_filter_n;
@@ -256,7 +256,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 6ee7ce3..9ac7a41 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -129,8 +129,9 @@ static int
 hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 			     enum hash_rxq_flow_type flow_type)
 {
+	struct priv *priv = hash_rxq->priv;
 	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	const uint8_t *mac;
@@ -148,7 +149,7 @@ hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 27e3bcc..641ee07 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -210,27 +210,27 @@ const size_t rss_hash_default_key_len = sizeof(rss_hash_default_key);
  * information from hash_rxq_init[]. Nothing is written to flow_attr when
  * flow_attr_size is not large enough, but the required size is still returned.
  *
- * @param[in] hash_rxq
- *   Pointer to hash RX queue.
+ * @param priv
+ *   Pointer to private structure.
  * @param[out] flow_attr
  *   Pointer to flow attribute structure to fill. Note that the allocated
  *   area must be larger and large enough to hold all flow specifications.
  * @param flow_attr_size
  *   Entire size of flow_attr and trailing room for flow specifications.
+ * @param type
+ *   Hash RX queue type to use for flow steering rule.
  *
  * @return
  *   Total size of the flow attribute buffer. No errors are defined.
  */
 size_t
-hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
-		   struct ibv_exp_flow_attr *flow_attr,
-		   size_t flow_attr_size)
+priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
+	       size_t flow_attr_size, enum hash_rxq_type type)
 {
 	size_t offset = sizeof(*flow_attr);
-	enum hash_rxq_type type = hash_rxq->type;
 	const struct hash_rxq_init *init = &hash_rxq_init[type];
 
-	assert(hash_rxq->priv != NULL);
+	assert(priv != NULL);
 	assert((size_t)type < RTE_DIM(hash_rxq_init));
 	do {
 		offset += init->flow_spec.hdr.size;
@@ -244,7 +244,7 @@ hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
 		.priority = init->flow_priority,
 		.num_of_specs = 0,
-		.port = hash_rxq->priv->port,
+		.port = priv->port,
 		.flags = 0,
 	};
 	do {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index d5a5019..c42bb8d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -270,8 +270,8 @@ extern const unsigned int hash_rxq_init_n;
 extern uint8_t rss_hash_default_key[];
 extern const size_t rss_hash_default_key_len;
 
-size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
-			  size_t);
+size_t priv_flow_attr(struct priv *, struct ibv_exp_flow_attr *,
+		      size_t, enum hash_rxq_type);
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
-- 
2.1.4

* [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                   ` (2 preceding siblings ...)
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
@ 2016-01-29 10:32 ` Adrien Mazarguil
  2016-02-17 17:13   ` Bruce Richardson
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:32 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).

This feature requires MLNX_OFED 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
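Reader's note: the patch assigns flow director rules a priority of 2
(Ethernet only), 1 (IP) or 0 (TCP/UDP), and shifts regular flow priorities by
3 so they never collide. The sketch below restates that scheme on its own. It
assumes lower values take precedence, as the "reserved for flow director"
comment in mlx5_rxq.c suggests; the demo_* names are hypothetical and only
mirror a subset of the hash RX queue types.

```c
#include <assert.h>

/* Hypothetical, reduced mirror of the hash RX queue types. */
enum demo_rxq_type {
	DEMO_RXQ_TCPV4,
	DEMO_RXQ_UDPV4,
	DEMO_RXQ_IPV4,
	DEMO_RXQ_ETH,
};

/* Number of priority levels reserved for flow director rules. */
#define DEMO_FDIR_PRIORITIES 3

/* Priority of a flow director rule: more specific rules get lower values. */
unsigned int
demo_fdir_priority(enum demo_rxq_type type)
{
	switch (type) {
	case DEMO_RXQ_TCPV4:
	case DEMO_RXQ_UDPV4:
		return 0; /* Ethernet + IP + TCP/UDP specifications. */
	case DEMO_RXQ_IPV4:
		return 1; /* Ethernet + IP specifications. */
	case DEMO_RXQ_ETH:
		break;
	}
	return 2; /* Ethernet specification only. */
}

/* Priority of a regular flow, moved past the reserved range. */
unsigned int
demo_regular_priority(unsigned int flow_priority)
{
	return flow_priority + DEMO_FDIR_PRIORITIES;
}
```

With this layout, any flow director rule compares lower than any regular
flow, whatever the regular flow's original priority was.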
 drivers/net/mlx5/Makefile       |   6 +
 drivers/net/mlx5/mlx5.c         |  12 +
 drivers/net/mlx5/mlx5.h         |  10 +
 drivers/net/mlx5/mlx5_defs.h    |  11 +
 drivers/net/mlx5/mlx5_fdir.c    | 890 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c     |   6 +
 drivers/net/mlx5/mlx5_rxtx.h    |   7 +
 drivers/net/mlx5/mlx5_trigger.c |   3 +
 8 files changed, 945 insertions(+)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c
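
Reader's note: filters are looked up by comparing whole descriptors with
memcmp(), which is only reliable because each descriptor is zeroed with
memset() before its fields are set (otherwise padding bytes could differ).
A reduced sketch of that lookup and of duplicate rejection follows. The
demo_* names and the shortened descriptor are hypothetical; the list uses the
same <sys/queue.h> macros as the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, reduced descriptor; padding may precede "type". */
struct demo_desc {
	uint16_t dst_port;
	uint8_t mac[6];
	uint16_t vlan_tag;
	int type;
};

struct demo_filter {
	LIST_ENTRY(demo_filter) next;
	struct demo_desc desc;
};

LIST_HEAD(demo_filter_list, demo_filter);

/* Zero the whole descriptor first so padding bytes compare equal. */
void
demo_desc_init(struct demo_desc *desc, uint16_t dst_port, uint16_t vlan_tag)
{
	memset(desc, 0, sizeof(*desc));
	desc->dst_port = dst_port;
	desc->vlan_tag = vlan_tag;
}

/* Return the filter whose descriptor matches byte for byte, or NULL. */
struct demo_filter *
demo_find(struct demo_filter_list *list, const struct demo_desc *desc)
{
	struct demo_filter *filter;

	LIST_FOREACH(filter, list, next)
		if (!memcmp(desc, &filter->desc, sizeof(*desc)))
			return filter;
	return NULL;
}

/* Add a filter; duplicates are rejected. Returns 0 or an errno value. */
int
demo_add(struct demo_filter_list *list, const struct demo_desc *desc)
{
	struct demo_filter *filter;

	if (demo_find(list, desc) != NULL)
		return EINVAL;
	filter = calloc(1, sizeof(*filter));
	if (filter == NULL)
		return ENOMEM;
	/* memcpy() rather than assignment, to carry the zeroed padding. */
	memcpy(&filter->desc, desc, sizeof(filter->desc));
	LIST_INSERT_HEAD(list, filter, next);
	return 0;
}
```

Adding the same descriptor twice returns EINVAL on the second attempt, which
matches the "duplicate filters are currently unsupported" behavior below.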

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 698f072..46a17e0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -52,6 +52,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rxmode.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vlan.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_fdir.c
 
 # Dependencies.
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += lib/librte_ether
@@ -125,6 +126,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		infiniband/verbs.h \
+		enum IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 6afc873..2646b3f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -94,6 +94,11 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+
+	/* Remove flow director elements. */
+	priv_fdir_disable(priv);
+	priv_fdir_delete_filters_list(priv);
+
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
@@ -169,6 +174,9 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
 	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
+#ifdef MLX5_FDIR_SUPPORT
+	.filter_ctrl = mlx5_dev_filter_ctrl,
+#endif /* MLX5_FDIR_SUPPORT */
 };
 
 static struct {
@@ -421,6 +429,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
+		/* Initialize flow director filters list. */
+		err = fdir_init_filters_list(priv);
+		if (err)
+			goto port_error;
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bc0c7e2..fdd2504 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -120,6 +120,7 @@ struct priv {
 	struct rte_intr_handle intr_handle; /* Interrupt handler. */
 	unsigned int (*reta_idx)[]; /* RETA index table. */
 	unsigned int reta_idx_n; /* RETA index size. */
+	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
@@ -215,4 +216,13 @@ int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
 int mlx5_dev_start(struct rte_eth_dev *);
 void mlx5_dev_stop(struct rte_eth_dev *);
 
+/* mlx5_fdir.c */
+
+int fdir_init_filters_list(struct priv *);
+void priv_fdir_delete_filters_list(struct priv *);
+void priv_fdir_disable(struct priv *);
+void priv_fdir_enable(struct priv *);
+int mlx5_dev_filter_ctrl(struct rte_eth_dev *, enum rte_filter_type,
+			 enum rte_filter_op, void *);
+
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index a7440e7..195440c 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -34,6 +34,8 @@
 #ifndef RTE_PMD_MLX5_DEFS_H_
 #define RTE_PMD_MLX5_DEFS_H_
 
+#include "mlx5_autoconf.h"
+
 /* Reported driver name. */
 #define MLX5_DRIVER_NAME "librte_pmd_mlx5"
 
@@ -84,4 +86,13 @@
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
+/*
+ * Extended flow priorities necessary to support flow director are available
+ * since MLNX_OFED 3.2. Considering this version adds support for VLAN
+ * offloads as well, their availability means flow director can be used.
+ */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+#define MLX5_FDIR_SUPPORT 1
+#endif
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
new file mode 100644
index 0000000..20021b4
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -0,0 +1,890 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2015 6WIND S.A.
+ *   Copyright 2015 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <assert.h>
+#include <stdint.h>
+#include <string.h>
+#include <errno.h>
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <infiniband/verbs.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* DPDK headers don't like -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_ethdev.h>
+#include <rte_common.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+#include "mlx5.h"
+#include "mlx5_rxtx.h"
+
+struct fdir_flow_desc {
+	uint16_t dst_port;
+	uint16_t src_port;
+	uint32_t src_ip[4];
+	uint32_t dst_ip[4];
+	uint8_t mac[6];
+	uint16_t vlan_tag;
+	enum hash_rxq_type type;
+};
+
+struct mlx5_fdir_filter {
+	LIST_ENTRY(mlx5_fdir_filter) next;
+	uint16_t queue; /* RX queue assigned on FDIR match. */
+	struct fdir_flow_desc desc;
+	struct ibv_exp_flow *flow;
+};
+
+LIST_HEAD(fdir_filter_list, mlx5_fdir_filter);
+
+/**
+ * Convert struct rte_eth_fdir_filter to mlx5 filter descriptor.
+ *
+ * @param[in] fdir_filter
+ *   DPDK filter structure to convert.
+ * @param[out] desc
+ *   Resulting mlx5 filter descriptor.
+ * @param mode
+ *   Flow director mode.
+ */
+static void
+fdir_filter_to_flow_desc(const struct rte_eth_fdir_filter *fdir_filter,
+			 struct fdir_flow_desc *desc, enum rte_fdir_mode mode)
+{
+	/* Initialize descriptor. */
+	memset(desc, 0, sizeof(*desc));
+
+	/* Set VLAN ID. */
+	desc->vlan_tag = fdir_filter->input.flow_ext.vlan_tci;
+
+	/* Set MAC address. */
+	if (mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		rte_memcpy(desc->mac,
+			   fdir_filter->input.flow.mac_vlan_flow.mac_addr.addr_bytes,
+			   sizeof(desc->mac));
+		desc->type = HASH_RXQ_ETH;
+		return;
+	}
+
+	/* Set hash RX queue type. */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+		desc->type = HASH_RXQ_UDPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->type = HASH_RXQ_TCPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->type = HASH_RXQ_IPV4;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+		desc->type = HASH_RXQ_UDPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->type = HASH_RXQ_TCPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		desc->type = HASH_RXQ_IPV6;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+
+	/* Set flow values */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->src_port = fdir_filter->input.flow.udp4_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp4_flow.dst_port;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->src_ip[0] = fdir_filter->input.flow.ip4_flow.src_ip;
+		desc->dst_ip[0] = fdir_filter->input.flow.ip4_flow.dst_ip;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->src_port = fdir_filter->input.flow.udp6_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp6_flow.dst_port;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		rte_memcpy(desc->src_ip,
+			   fdir_filter->input.flow.ipv6_flow.src_ip,
+			   sizeof(desc->src_ip));
+		rte_memcpy(desc->dst_ip,
+			   fdir_filter->input.flow.ipv6_flow.dst_ip,
+			   sizeof(desc->dst_ip));
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+}
+
+/**
+ * Create flow director steering rule for a specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create a steering rule for.
+ * @param fdir_queue
+ *   Flow director queue for matching packets.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_flow_add(struct priv *priv,
+		   struct mlx5_fdir_filter *mlx5_fdir_filter,
+		   struct fdir_queue *fdir_queue)
+{
+	struct ibv_exp_flow *flow;
+	struct fdir_flow_desc *desc = &mlx5_fdir_filter->desc;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	struct rte_eth_fdir_masks *mask = &priv->dev->data->dev_conf.fdir_conf.mask;
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, desc->type));
+	struct ibv_exp_flow_attr *attr = &data->attr;
+	uintptr_t spec_offset = (uintptr_t)&data->spec;
+	struct ibv_exp_flow_spec_eth *spec_eth;
+	struct ibv_exp_flow_spec_ipv4 *spec_ipv4;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	struct ibv_exp_flow_spec_ipv6 *spec_ipv6;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	struct ibv_exp_flow_spec_tcp_udp *spec_tcp_udp;
+	unsigned int i;
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset);
+	priv_flow_attr(priv, attr, sizeof(data), desc->type);
+
+	/* Set Ethernet spec */
+	spec_eth = (struct ibv_exp_flow_spec_eth *)spec_offset;
+
+	/* The first specification must be Ethernet. */
+	assert(spec_eth->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec_eth->size == sizeof(*spec_eth));
+
+	/* VLAN ID */
+	spec_eth->val.vlan_tag = desc->vlan_tag;
+	spec_eth->mask.vlan_tag = mask->vlan_tci_mask;
+
+	/* Update priority */
+	attr->priority = 2;
+
+	if (fdir_mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		/* MAC Address */
+		rte_memcpy(spec_eth->val.dst_mac,
+			   desc->mac,
+			   sizeof(spec_eth->val.dst_mac));
+		/* The mask is per byte mask */
+		for (i = 0; i < sizeof(spec_eth->mask.dst_mac); i++)
+			spec_eth->mask.dst_mac[i] = mask->mac_addr_byte_mask;
+
+		goto create_flow;
+	}
+
+	switch (desc->type) {
+	case HASH_RXQ_IPV4:
+	case HASH_RXQ_UDPV4:
+	case HASH_RXQ_TCPV4:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv4 = (struct ibv_exp_flow_spec_ipv4 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv4->type == IBV_EXP_FLOW_SPEC_IPV4);
+		assert(spec_ipv4->size == sizeof(*spec_ipv4));
+
+		spec_ipv4->val.src_ip = desc->src_ip[0];
+		spec_ipv4->val.dst_ip = desc->dst_ip[0];
+		spec_ipv4->mask.src_ip = mask->ipv4_mask.src_ip;
+		spec_ipv4->mask.dst_ip = mask->ipv4_mask.dst_ip;
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV4)
+			goto create_flow;
+
+		spec_offset += spec_ipv4->size;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_IPV6:
+	case HASH_RXQ_UDPV6:
+	case HASH_RXQ_TCPV6:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv6 = (struct ibv_exp_flow_spec_ipv6 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv6->type == IBV_EXP_FLOW_SPEC_IPV6);
+		assert(spec_ipv6->size == sizeof(*spec_ipv6));
+
+		rte_memcpy(spec_ipv6->val.src_ip,
+			   desc->src_ip,
+			   sizeof(spec_ipv6->val.src_ip));
+		rte_memcpy(spec_ipv6->val.dst_ip,
+			   desc->dst_ip,
+			   sizeof(spec_ipv6->val.dst_ip));
+		rte_memcpy(spec_ipv6->mask.src_ip,
+			   mask->ipv6_mask.src_ip,
+			   sizeof(spec_ipv6->mask.src_ip));
+		rte_memcpy(spec_ipv6->mask.dst_ip,
+			   mask->ipv6_mask.dst_ip,
+			   sizeof(spec_ipv6->mask.dst_ip));
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV6)
+			goto create_flow;
+
+		spec_offset += spec_ipv6->size;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		ERROR("invalid flow attribute type");
+		return EINVAL;
+	}
+
+	/* Set TCP/UDP flow specification. */
+	spec_tcp_udp = (struct ibv_exp_flow_spec_tcp_udp *)spec_offset;
+
+	/* The third specification must be TCP/UDP. */
+	assert(spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_TCP ||
+	       spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_UDP);
+	assert(spec_tcp_udp->size == sizeof(*spec_tcp_udp));
+
+	spec_tcp_udp->val.src_port = desc->src_port;
+	spec_tcp_udp->val.dst_port = desc->dst_port;
+	spec_tcp_udp->mask.src_port = mask->src_port_mask;
+	spec_tcp_udp->mask.dst_port = mask->dst_port_mask;
+
+	/* Update priority */
+	attr->priority = 0;
+
+create_flow:
+
+	errno = 0;
+	flow = ibv_exp_create_flow(fdir_queue->qp, attr);
+	if (flow == NULL) {
+		/* It's not clear whether errno is always set in this case. */
+		ERROR("%p: flow director configuration failed, errno=%d: %s",
+		      (void *)priv, errno,
+		      (errno ? strerror(errno) : "Unknown error"));
+		if (errno)
+			return errno;
+		return EINVAL;
+	}
+
+	DEBUG("%p: added flow director rule (%p)", (void *)priv, (void *)flow);
+	mlx5_fdir_filter->flow = flow;
+	return 0;
+}
+
+/**
+ * Get flow director queue for a specific RX queue, create it in case
+ * it does not exist.
+ *
+ * @param priv
+ *   Private structure.
+ * @param idx
+ *   RX queue index.
+ *
+ * @return
+ *   Related flow director queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_get_fdir_queue(struct priv *priv, uint16_t idx)
+{
+	struct fdir_queue *fdir_queue = &(*priv->rxqs)[idx]->fdir_queue;
+	struct ibv_exp_rwq_ind_table *ind_table = NULL;
+	struct ibv_qp *qp = NULL;
+	struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
+	struct ibv_exp_rx_hash_conf hash_conf;
+	struct ibv_exp_qp_init_attr qp_init_attr;
+	int err = 0;
+
+	/* Return immediately if it has already been created. */
+	if (fdir_queue->qp != NULL)
+		return fdir_queue;
+
+	ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
+		.pd = priv->pd,
+		.log_ind_tbl_size = 0,
+		.ind_tbl = &((*priv->rxqs)[idx]->wq),
+		.comp_mask = 0,
+	};
+
+	errno = 0;
+	ind_table = ibv_exp_create_rwq_ind_table(priv->ctx,
+						 &ind_init_attr);
+	if (ind_table == NULL) {
+		/* Not clear whether errno is set. */
+		err = (errno ? errno : EINVAL);
+		ERROR("RX indirection table creation failed with error %d: %s",
+		      err, strerror(err));
+		goto error;
+	}
+
+	/* Create fdir_queue qp. */
+	hash_conf = (struct ibv_exp_rx_hash_conf){
+		.rx_hash_function = IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
+		.rx_hash_key_len = rss_hash_default_key_len,
+		.rx_hash_key = rss_hash_default_key,
+		.rx_hash_fields_mask = 0,
+		.rwq_ind_tbl = ind_table,
+	};
+	qp_init_attr = (struct ibv_exp_qp_init_attr){
+		.max_inl_recv = 0, /* Currently not supported. */
+		.qp_type = IBV_QPT_RAW_PACKET,
+		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
+			      IBV_EXP_QP_INIT_ATTR_RX_HASH),
+		.pd = priv->pd,
+		.rx_hash_conf = &hash_conf,
+		.port_num = priv->port,
+	};
+
+	qp = ibv_exp_create_qp(priv->ctx, &qp_init_attr);
+	if (qp == NULL) {
+		err = (errno ? errno : EINVAL);
+		ERROR("hash RX QP creation failure: %s", strerror(err));
+		goto error;
+	}
+
+	fdir_queue->ind_table = ind_table;
+	fdir_queue->qp = qp;
+
+	return fdir_queue;
+
+error:
+	if (qp != NULL)
+		claim_zero(ibv_destroy_qp(qp));
+
+	if (ind_table != NULL)
+		claim_zero(ibv_exp_destroy_rwq_ind_table(ind_table));
+
+	return NULL;
+}
+
+/**
+ * Enable flow director filter and create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create steering rule for.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_enable(struct priv *priv,
+			struct mlx5_fdir_filter *mlx5_fdir_filter)
+{
+	struct fdir_queue *fdir_queue;
+
+	/* Check if flow already exists. */
+	if (mlx5_fdir_filter->flow != NULL)
+		return 0;
+
+	/* Get fdir_queue for specific queue. */
+	fdir_queue = priv_get_fdir_queue(priv, mlx5_fdir_filter->queue);
+
+	if (fdir_queue == NULL) {
+		ERROR("failed to create flow director rxq for queue %d",
+		      mlx5_fdir_filter->queue);
+		return EINVAL;
+	}
+
+	/* Create flow */
+	return priv_fdir_flow_add(priv, mlx5_fdir_filter, fdir_queue);
+}
+
+/**
+ * Initialize flow director filters list.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+fdir_init_filters_list(struct priv *priv)
+{
+	/* Filter list initialization should be done only once. */
+	if (priv->fdir_filter_list)
+		return 0;
+
+	/* Create filters list. */
+	priv->fdir_filter_list =
+		rte_calloc(__func__, 1, sizeof(*priv->fdir_filter_list), 0);
+
+	if (priv->fdir_filter_list == NULL) {
+		int err = ENOMEM;
+
+		ERROR("cannot allocate flow director filter list: %s",
+		      strerror(err));
+		return err;
+	}
+
+	LIST_INIT(priv->fdir_filter_list);
+
+	return 0;
+}
+
+/**
+ * Remove all flow director filters and delete list.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_delete_filters_list(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	void *prev = NULL;
+
+	/* Run on every fdir filter and delete it */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Deallocate previous element safely. */
+		rte_free(prev);
+
+		/* Only valid elements should be in the list. */
+		assert(mlx5_fdir_filter != NULL);
+
+		/* Remove element from list. */
+		LIST_REMOVE(mlx5_fdir_filter, next);
+
+		/* Destroy flow handle. */
+		if (mlx5_fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(mlx5_fdir_filter->flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+
+		prev = mlx5_fdir_filter;
+	}
+
+	rte_free(prev);
+	rte_free(priv->fdir_filter_list);
+	priv->fdir_filter_list = NULL;
+}
+
+/**
+ * Disable flow director, remove all steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_disable(struct priv *priv)
+{
+	unsigned int i;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	struct fdir_queue *fdir_queue;
+
+	/* Run on every flow director filter and destroy flow handle. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+
+		/* Destroy flow handle */
+		if (mlx5_fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(mlx5_fdir_filter->flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+	}
+
+	/* Run on every RX queue to destroy related flow director QP and
+	 * indirection table. */
+	for (i = 0; (i != priv->rxqs_n); i++) {
+		fdir_queue = &(*priv->rxqs)[i]->fdir_queue;
+
+		if (fdir_queue->qp != NULL) {
+			claim_zero(ibv_destroy_qp(fdir_queue->qp));
+			fdir_queue->qp = NULL;
+		}
+
+		if (fdir_queue->ind_table != NULL) {
+			claim_zero(ibv_exp_destroy_rwq_ind_table(fdir_queue->ind_table));
+			fdir_queue->ind_table = NULL;
+		}
+	}
+}
+
+/**
+ * Enable flow director, create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_enable(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	/* Run on every fdir filter and create flow handle */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+
+		priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+	}
+}
+
+/**
+ * Find specific filter in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to find.
+ *
+ * @return
+ *   Filter element if found, otherwise NULL.
+ */
+static struct mlx5_fdir_filter *
+priv_find_filter_in_list(struct priv *priv,
+			 const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct fdir_flow_desc desc;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+
+	/* Get flow director filter to look for. */
+	fdir_filter_to_flow_desc(fdir_filter, &desc, fdir_mode);
+
+	/* Look for the requested element. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list. */
+		assert(mlx5_fdir_filter != NULL);
+
+		/* Return matching filter. */
+		if (!memcmp(&desc, &mlx5_fdir_filter->desc, sizeof(desc)))
+			return mlx5_fdir_filter;
+	}
+
+	/* Filter not found */
+	return NULL;
+}
+
+/**
+ * Add new flow director filter and store it in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to add.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_add(struct priv *priv,
+		     const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	int err = 0;
+
+	/* Validate queue number */
+	if (fdir_filter->action.rx_queue >= priv->rxqs_n) {
+		ERROR("invalid queue number %d", fdir_filter->action.rx_queue);
+		return EINVAL;
+	}
+
+	/* Duplicate filters are currently unsupported. */
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		ERROR("filter already exists");
+		return EINVAL;
+	}
+
+	/* Create new flow director filter. */
+	mlx5_fdir_filter =
+		rte_calloc(__func__, 1, sizeof(*mlx5_fdir_filter), 0);
+	if (mlx5_fdir_filter == NULL) {
+		err = ENOMEM;
+		ERROR("cannot allocate flow director filter: %s",
+		      strerror(err));
+		return err;
+	}
+
+	/* Set queue. */
+	mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+	/* Convert to mlx5 filter descriptor. */
+	fdir_filter_to_flow_desc(fdir_filter,
+				 &mlx5_fdir_filter->desc, fdir_mode);
+
+	/* Insert new filter into list. */
+	LIST_INSERT_HEAD(priv->fdir_filter_list, mlx5_fdir_filter, next);
+
+	DEBUG("%p: flow director filter %p added",
+	      (void *)priv, (void *)mlx5_fdir_filter);
+
+	/* Enable filter immediately if device is started. */
+	if (priv->started)
+		err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+	return err;
+}
+
+/**
+ * Update queue for specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be updated.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_update(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		int err = 0;
+
+		/* Update queue number. */
+		mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+		/* Destroy flow handle. */
+		if (mlx5_fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(mlx5_fdir_filter->flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+		DEBUG("%p: flow director filter %p updated",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Enable filter if device is started. */
+		if (priv->started)
+			err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+		return err;
+	}
+
+	/* Filter not found, create it. */
+	DEBUG("%p: filter not found for update, creating new filter",
+	      (void *)priv);
+	return priv_fdir_filter_add(priv, fdir_filter);
+}
+
+/**
+ * Delete specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be deleted.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_delete(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		/* Remove element from list. */
+		LIST_REMOVE(mlx5_fdir_filter, next);
+
+		/* Destroy flow handle. */
+		if (mlx5_fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(mlx5_fdir_filter->flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+
+		DEBUG("%p: flow director filter %p deleted",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Delete filter. */
+		rte_free(mlx5_fdir_filter);
+
+		return 0;
+	}
+
+	ERROR("%p: flow director delete failed, cannot find filter",
+	      (void *)priv);
+	return EINVAL;
+}
+
+/**
+ * Get flow director information.
+ *
+ * @param priv
+ *   Private structure.
+ * @param[out] fdir_info
+ *   Resulting flow director information.
+ */
+static void
+priv_fdir_info_get(struct priv *priv, struct rte_eth_fdir_info *fdir_info)
+{
+	struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+
+	fdir_info->mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	fdir_info->guarant_spc = 0;
+
+	rte_memcpy(&(fdir_info->mask), mask, sizeof(fdir_info->mask));
+
+	fdir_info->max_flexpayload = 0;
+	fdir_info->flow_types_mask[0] = 0;
+
+	fdir_info->flex_payload_unit = 0;
+	fdir_info->max_flex_payload_segment_num = 0;
+	fdir_info->flex_payload_limit = 0;
+	memset(&fdir_info->flex_conf, 0, sizeof(fdir_info->flex_conf));
+}
+
+/**
+ * Deal with flow director operations.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_ctrl_func(struct priv *priv, enum rte_filter_op filter_op, void *arg)
+{
+	enum rte_fdir_mode fdir_mode =
+		priv->dev->data->dev_conf.fdir_conf.mode;
+	int ret = 0;
+
+	if (filter_op == RTE_ETH_FILTER_NOP)
+		return 0;
+
+	if (fdir_mode != RTE_FDIR_MODE_PERFECT &&
+	    fdir_mode != RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		ERROR("%p: flow director mode %d not supported",
+		      (void *)priv, fdir_mode);
+		return EINVAL;
+	}
+
+	if (arg == NULL)
+		return EINVAL;
+
+	switch (filter_op) {
+	case RTE_ETH_FILTER_ADD:
+		ret = priv_fdir_filter_add(priv, arg);
+		break;
+	case RTE_ETH_FILTER_UPDATE:
+		ret = priv_fdir_filter_update(priv, arg);
+		break;
+	case RTE_ETH_FILTER_DELETE:
+		ret = priv_fdir_filter_delete(priv, arg);
+		break;
+	case RTE_ETH_FILTER_INFO:
+		priv_fdir_info_get(priv, arg);
+		break;
+	default:
+		DEBUG("%p: unknown operation %u", (void *)priv, filter_op);
+		ret = EINVAL;
+		break;
+	}
+	return ret;
+}
+
+/**
+ * Manage filter operations.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param filter_type
+ *   Filter type.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, negative errno value on failure.
+ */
+int
+mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
+		     enum rte_filter_type filter_type,
+		     enum rte_filter_op filter_op,
+		     void *arg)
+{
+	int ret = -EINVAL;
+	struct priv *priv = dev->data->dev_private;
+
+	switch (filter_type) {
+	case RTE_ETH_FILTER_FDIR:
+		priv_lock(priv);
+		ret = -priv_fdir_ctrl_func(priv, filter_op, arg);
+		priv_unlock(priv);
+		break;
+	default:
+		ERROR("%p: filter type (%d) not supported",
+		      (void *)dev, filter_type);
+		break;
+	}
+
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 641ee07..9a70f32 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /* Initialization data for hash RX queues. */
@@ -242,7 +243,12 @@ priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
 	init = &hash_rxq_init[type];
 	*flow_attr = (struct ibv_exp_flow_attr){
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
+#ifdef MLX5_FDIR_SUPPORT
+		/* Priorities < 3 are reserved for flow director. */
+		.priority = init->flow_priority + 3,
+#else /* MLX5_FDIR_SUPPORT */
 		.priority = init->flow_priority,
+#endif /* MLX5_FDIR_SUPPORT */
 		.num_of_specs = 0,
 		.port = priv->port,
 		.flags = 0,
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index c42bb8d..b2f72f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -93,6 +93,12 @@ struct rxq_elt {
 	struct rte_mbuf *buf; /* SGE buffer. */
 };
 
+/* Flow director queue structure. */
+struct fdir_queue {
+	struct ibv_qp *qp; /* Associated RX QP. */
+	struct ibv_exp_rwq_ind_table *ind_table; /* Indirection table. */
+};
+
 struct priv;
 
 /* RX queue descriptor. */
@@ -118,6 +124,7 @@ struct rxq {
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
+	struct fdir_queue fdir_queue; /* Flow director queue. */
 };
 
 /* Hash RX queue types. */
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 90b8068..db7890f 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -87,6 +87,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
+	if (dev->data->dev_conf.fdir_conf.mode != RTE_FDIR_MODE_NONE)
+		priv_fdir_enable(priv);
 	priv_dev_interrupt_handler_install(priv, dev);
 	priv_unlock(priv);
 	return -err;
@@ -117,6 +119,7 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+	priv_fdir_disable(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv->started = 0;
 	priv_unlock(priv);
-- 
2.1.4

* [dpdk-dev] [PATCH 5/5] mlx5: add support for RX VLAN stripping
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                   ` (3 preceding siblings ...)
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director Adrien Mazarguil
@ 2016-01-29 10:32 ` Adrien Mazarguil
  2016-02-17 17:14 ` [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
  6 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-01-29 10:32 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.
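
As an illustration of what an application can do with the reported tag, the
following is a minimal sketch (not part of this patch) that decodes an
802.1Q TCI value such as the one stored in the mbuf vlan_tci field. The
names vlan_tci_fields and vlan_tci_decode() are hypothetical, and the value
is assumed to be in host byte order.

```c
#include <stdint.h>

/* Fields of an 802.1Q Tag Control Information (TCI) word. */
struct vlan_tci_fields {
	uint8_t pcp; /* Priority Code Point, 3 bits. */
	uint8_t dei; /* Drop Eligible Indicator, 1 bit. */
	uint16_t vid; /* VLAN identifier, 12 bits. */
};

/*
 * Hypothetical helper: split a TCI (assumed to be in host byte order)
 * into its three fields.
 */
static struct vlan_tci_fields
vlan_tci_decode(uint16_t tci)
{
	struct vlan_tci_fields f = {
		.pcp = (uint8_t)(tci >> 13),
		.dei = (uint8_t)((tci >> 12) & 0x1),
		.vid = (uint16_t)(tci & 0x0fff),
	};

	return f;
}
```

An application would typically call such a helper only when the mbuf flags
indicate that a VLAN tag was reported for the packet.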

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c      |  16 ++++++-
 drivers/net/mlx5/mlx5.h      |   3 ++
 drivers/net/mlx5/mlx5_rxq.c  |  15 ++++++-
 drivers/net/mlx5/mlx5_rxtx.c |  27 +++++++++++
 drivers/net/mlx5/mlx5_rxtx.h |   5 +++
 drivers/net/mlx5/mlx5_vlan.c | 104 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 168 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 2646b3f..94b4d80 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -170,6 +170,10 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.mac_addr_remove = mlx5_mac_addr_remove,
 	.mac_addr_add = mlx5_mac_addr_add,
 	.mtu_set = mlx5_dev_set_mtu,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	.vlan_strip_queue_set = mlx5_vlan_strip_queue_set,
+	.vlan_offload_set = mlx5_vlan_offload_set,
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	.reta_update = mlx5_dev_rss_reta_update,
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
@@ -324,7 +328,11 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #ifdef HAVE_EXP_QUERY_DEVICE
 		exp_device_attr.comp_mask =
 			IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS |
-			IBV_EXP_DEVICE_ATTR_RX_HASH;
+			IBV_EXP_DEVICE_ATTR_RX_HASH |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0;
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
 		DEBUG("using port %u (%08" PRIx32 ")", port, test);
@@ -395,6 +403,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 		DEBUG("maximum RX indirection table size is %u",
 		      priv->ind_table_max_size);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		priv->hw_vlan_strip = !!(exp_device_attr.wq_vlan_offloads_cap &
+					 IBV_EXP_RECEIVE_WQ_CVLAN_STRIP);
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+		DEBUG("VLAN stripping is %ssupported",
+		      (priv->hw_vlan_strip ? "" : "not "));
 
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index fdd2504..585760b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -101,6 +101,7 @@ struct priv {
 	unsigned int allmulti_req:1; /* All multicast mode requested. */
 	unsigned int hw_csum:1; /* Checksum offload is supported. */
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
@@ -210,6 +211,8 @@ void mlx5_stats_reset(struct rte_eth_dev *);
 /* mlx5_vlan.c */
 
 int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
+void mlx5_vlan_offload_set(struct rte_eth_dev *, int);
+void mlx5_vlan_strip_queue_set(struct rte_eth_dev *, uint16_t, int);
 
 /* mlx5_trigger.c */
 
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 9a70f32..c79ce5c 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1224,6 +1224,8 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      priv->device_attr.max_qp_wr);
 	DEBUG("priv->device_attr.max_sge is %d",
 	      priv->device_attr.max_sge);
+	/* Configure VLAN stripping. */
+	tmpl.vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
 	attr.wq = (struct ibv_exp_wq_init_attr){
 		.wq_context = NULL, /* Could be useful in the future. */
 		.wq_type = IBV_EXP_WQT_RQ,
@@ -1238,8 +1240,18 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				 MLX5_PMD_SGE_WR_N),
 		.pd = priv->pd,
 		.cq = tmpl.cq,
-		.comp_mask = IBV_EXP_CREATE_WQ_RES_DOMAIN,
+		.comp_mask =
+			IBV_EXP_CREATE_WQ_RES_DOMAIN |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_CREATE_WQ_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0,
 		.res_domain = tmpl.rd,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		.vlan_offloads = (tmpl.vlan_strip ?
+				  IBV_EXP_RECEIVE_WQ_CVLAN_STRIP :
+				  0),
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
@@ -1262,6 +1274,7 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	DEBUG("%p: RTE port ID: %u", (void *)rxq, tmpl.port_id);
 	attr.params = (struct ibv_exp_query_intf_params){
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
+		.intf_version = 1,
 		.intf = IBV_EXP_INTF_CQ,
 		.obj = tmpl.cq,
 	};
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fa5e648..7585570 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_utils.h"
 #include "mlx5_rxtx.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /**
@@ -713,12 +714,19 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		unsigned int seg_headroom = RTE_PKTMBUF_HEADROOM;
 		unsigned int j = 0;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(elts_head < rxq->elts_n);
 		assert(rxq->elts_head < rxq->elts_n);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -840,6 +848,12 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		PKT_LEN(pkt_buf) = pkt_buf_len;
 		pkt_buf->packet_type = rxq_cq_to_pkt_type(flags);
 		pkt_buf->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			pkt_buf->ol_flags |= PKT_RX_VLAN_PKT;
+			pkt_buf->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = pkt_buf;
@@ -910,6 +924,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		struct rte_mbuf *seg = elt->buf;
 		struct rte_mbuf *rep;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(seg != NULL);
@@ -921,8 +936,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		 */
 		rte_prefetch0(seg);
 		rte_prefetch0(&seg->cacheline1);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -989,6 +1010,12 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		DATA_LEN(seg) = len;
 		seg->packet_type = rxq_cq_to_pkt_type(flags);
 		seg->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			seg->ol_flags |= PKT_RX_VLAN_PKT;
+			seg->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = seg;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index b2f72f8..fde0ca2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -109,7 +109,11 @@ struct rxq {
 	struct ibv_cq *cq; /* Completion Queue. */
 	struct ibv_exp_wq *wq; /* Work Queue. */
 	struct ibv_exp_wq_family *if_wq; /* WQ burst interface. */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_cq_family_v1 *if_cq; /* CQ interface. */
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	unsigned int port_id; /* Port ID for incoming packets. */
 	unsigned int elts_n; /* (*elts)[] length. */
 	unsigned int elts_head; /* Current index in (*elts)[]. */
@@ -120,6 +124,7 @@ struct rxq {
 	unsigned int sp:1; /* Use scattered RX elements. */
 	unsigned int csum:1; /* Enable checksum offloading. */
 	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	uint32_t mb_len; /* Length of a mp-issued mbuf. */
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_vlan.c b/drivers/net/mlx5/mlx5_vlan.c
index 3a07ad1..fa9e3b8 100644
--- a/drivers/net/mlx5/mlx5_vlan.c
+++ b/drivers/net/mlx5/mlx5_vlan.c
@@ -48,6 +48,7 @@
 
 #include "mlx5_utils.h"
 #include "mlx5.h"
+#include "mlx5_autoconf.h"
 
 /**
  * Configure a VLAN filter.
@@ -127,3 +128,106 @@ mlx5_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 	assert(ret >= 0);
 	return -ret;
 }
+
+/**
+ * Set/reset VLAN stripping for a specific queue.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param idx
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+static void
+priv_vlan_strip_queue_set(struct priv *priv, uint16_t idx, int on)
+{
+	struct rxq *rxq = (*priv->rxqs)[idx];
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_wq_attr mod;
+	uint16_t vlan_offloads =
+		(on ? IBV_EXP_RECEIVE_WQ_CVLAN_STRIP : 0) |
+		0;
+	int err;
+
+	DEBUG("set VLAN offloads 0x%x for port %d queue %d",
+	      vlan_offloads, rxq->port_id, idx);
+	mod = (struct ibv_exp_wq_attr){
+		.attr_mask = IBV_EXP_WQ_ATTR_VLAN_OFFLOADS,
+		.vlan_offloads = vlan_offloads,
+	};
+
+	err = ibv_exp_modify_wq(rxq->wq, &mod);
+	if (err) {
+		ERROR("%p: failed to modify stripping mode: %s",
+		      (void *)priv, strerror(err));
+		return;
+	}
+
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+
+	/* Update related bits in RX queue. */
+	rxq->vlan_strip = !!on;
+}
+
+/**
+ * Callback to set/reset VLAN stripping for a specific queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param queue
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+void
+mlx5_vlan_strip_queue_set(struct rte_eth_dev *dev, uint16_t queue, int on)
+{
+	struct priv *priv = dev->data->dev_private;
+
+	/* Validate HW support. */
+	if (!priv->hw_vlan_strip) {
+		ERROR("VLAN stripping is not supported");
+		return;
+	}
+
+	/* Validate queue number. */
+	if (queue >= priv->rxqs_n) {
+		ERROR("VLAN stripping, invalid queue number %d", queue);
+		return;
+	}
+
+	priv_lock(priv);
+	priv_vlan_strip_queue_set(priv, queue, on);
+	priv_unlock(priv);
+}
+
+/**
+ * Callback to set/reset VLAN offloads for a port.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mask
+ *   VLAN offload bit mask.
+ */
+void
+mlx5_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+
+	if (mask & ETH_VLAN_STRIP_MASK) {
+		int hw_vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
+
+		if (!priv->hw_vlan_strip) {
+			ERROR("VLAN stripping is not supported");
+			return;
+		}
+
+		/* Run on every RX queue and set/reset VLAN stripping. */
+		priv_lock(priv);
+		for (i = 0; (i != priv->rxqs_n); i++)
+			priv_vlan_strip_queue_set(priv, i, hw_vlan_strip);
+		priv_unlock(priv);
+	}
+}
-- 
2.1.4

* Re: [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director Adrien Mazarguil
@ 2016-02-17 17:13   ` Bruce Richardson
  2016-02-18 16:10     ` Adrien Mazarguil
  0 siblings, 1 reply; 26+ messages in thread
From: Bruce Richardson @ 2016-02-17 17:13 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Fri, Jan 29, 2016 at 11:32:01AM +0100, Adrien Mazarguil wrote:
> From: Yaacov Hazan <yaacovh@mellanox.com>
> 
> Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
> RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).
> 
> This feature requires MLNX_OFED 3.2.
> 
> Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
Hi Adrien, Yaacov,

this patch raises a lot of warnings (17) with checkpatch. Could you please
check whether this number can be reduced?

Regards,
/Bruce

* Re: [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                   ` (4 preceding siblings ...)
  2016-01-29 10:32 ` [dpdk-dev] [PATCH 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
@ 2016-02-17 17:14 ` Bruce Richardson
  2016-02-18 16:27   ` Adrien Mazarguil
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
  6 siblings, 1 reply; 26+ messages in thread
From: Bruce Richardson @ 2016-02-17 17:14 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Fri, Jan 29, 2016 at 11:31:57AM +0100, Adrien Mazarguil wrote:
> To preserve compatibility with Mellanox OFED 3.1, flow director and RX VLAN
> stripping code is only enabled if compiled with 3.2.
> 
This description would seem to imply that a documentation update is necessary,
or at minimum a release note update, along with the code changes.

Regards,
/Bruce

* Re: [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-02-17 17:13   ` Bruce Richardson
@ 2016-02-18 16:10     ` Adrien Mazarguil
  2016-02-23 15:13       ` Bruce Richardson
  0 siblings, 1 reply; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-18 16:10 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hi Bruce,

On Wed, Feb 17, 2016 at 05:13:44PM +0000, Bruce Richardson wrote:
> On Fri, Jan 29, 2016 at 11:32:01AM +0100, Adrien Mazarguil wrote:
> > From: Yaacov Hazan <yaacovh@mellanox.com>
> > 
> > Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
> > RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).
> > 
> > This feature requires MLNX_OFED 3.2.
> > 
> > Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > ---
> Hi Adrien, Yaacov,
> 
> this patch raises a lot of warnings (17) with checkpatch. Could you please
> check whether this number can be reduced?

We actually ran checkpatch before submitting that patch; the remaining
warnings were left on purpose. Not sure if we have the same checkpatch
configuration, but they should fall into the following categories:

- "WARNING: return of an errno should typically be negative" - All return
  values are documented in the code. Since this PMD uses syscalls in its
  control path, it uses positive errno values internally for
  consistency. Public callback functions however return negative error
  values.

- "WARNING: line over 80 characters" - Well, although I'm a big fan of the
  80 characters limit, breaking those would have made the code harder to
  read.

- "WARNING: Missing a blank line after declarations" - It's actually a
  declaration through a macro, there is no missing blank line.

- "WARNING: networking block comments don't use an empty /* line" - Not sure
  if we really care? I don't particularly mind.

- "CHECK: Comparison to NULL could be written "!" - I do not mind either,
  writing the full check seems clearer to me.

- "CHECK: Unnecessary parentheses around fdir_info->mask" - Looks like a
  valid, although minor, issue.

Please tell me which of these still need to be fixed.
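
To illustrate the errno convention from the first item above, here is a
minimal sketch, not the driver's actual code. internal_op() and
public_callback() are hypothetical names: internal helpers return positive
errno values, and the public callback negates them once at the API boundary.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical internal helper: 0 on success, positive errno on failure. */
static int
internal_op(const void *arg)
{
	if (arg == NULL)
		return EINVAL;
	return 0;
}

/* Hypothetical public callback: 0 on success, negative errno on failure. */
int
public_callback(const void *arg)
{
	int ret = internal_op(arg);

	/* Negate once, at the boundary between internal and public code. */
	return -ret;
}
```

Keeping the sign flip in a single place makes it easier to check that no
internal (positive) value leaks through a public entry point.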

-- 
Adrien Mazarguil
6WIND

* Re: [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support
  2016-02-17 17:14 ` [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
@ 2016-02-18 16:27   ` Adrien Mazarguil
  0 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-18 16:27 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Wed, Feb 17, 2016 at 05:14:49PM +0000, Bruce Richardson wrote:
> On Fri, Jan 29, 2016 at 11:31:57AM +0100, Adrien Mazarguil wrote:
> > To preserve compatibility with Mellanox OFED 3.1, flow director and RX VLAN
> > stripping code is only enabled if compiled with 3.2.
> > 
> This description would seem to imply that a documentation update is necessary,
> or at minimum a release note update, along with the code changes.

We initially intended to submit documentation updates separately with
several other changes. Will update the series.

-- 
Adrien Mazarguil
6WIND

* [dpdk-dev] [PATCH v2 0/5] Add flow director and RX VLAN stripping support
  2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                   ` (5 preceding siblings ...)
  2016-02-17 17:14 ` [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
@ 2016-02-22 18:02 ` Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 1/5] mlx5: refactor special flows handling Adrien Mazarguil
                     ` (5 more replies)
  6 siblings, 6 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev

To preserve compatibility with Mellanox OFED 3.1, flow director and RX VLAN
stripping code is only enabled if compiled with 3.2.

Changes in v2:
- Rebased patchset on top of dpdk-next-net/rel_16_04.
- Fixed trivial compilation warnings (positive errnos are left on purpose).
- Updated documentation and release notes for flow director and RX VLAN
  stripping features.
- Fixed missing Mellanox OFED >= 3.2 check for CQ family query interface
  version.

Yaacov Hazan (5):
  mlx5: refactor special flows handling
  mlx5: add special flows (broadcast and IPv6 multicast)
  mlx5: make flow steering rule generator more generic
  mlx5: add support for flow director
  mlx5: add support for RX VLAN stripping

 doc/guides/nics/mlx5.rst               |  16 +-
 doc/guides/rel_notes/release_16_04.rst |  13 +
 drivers/net/mlx5/Makefile              |   6 +
 drivers/net/mlx5/mlx5.c                |  39 +-
 drivers/net/mlx5/mlx5.h                |  19 +-
 drivers/net/mlx5/mlx5_defs.h           |  14 +
 drivers/net/mlx5/mlx5_ethdev.c         |   3 +-
 drivers/net/mlx5/mlx5_fdir.c           | 910 +++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mac.c            |  10 +-
 drivers/net/mlx5/mlx5_rxmode.c         | 350 ++++++-------
 drivers/net/mlx5/mlx5_rxq.c            |  82 ++-
 drivers/net/mlx5/mlx5_rxtx.c           |  27 +
 drivers/net/mlx5/mlx5_rxtx.h           |  51 +-
 drivers/net/mlx5/mlx5_trigger.c        |  21 +-
 drivers/net/mlx5/mlx5_vlan.c           | 104 ++++
 15 files changed, 1438 insertions(+), 227 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c

-- 
2.1.4

* [dpdk-dev] [PATCH v2 1/5] mlx5: refactor special flows handling
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
@ 2016-02-22 18:02   ` Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.

New function priv_rehash_flows() implements the logic to enable/disable
relevant flows in one place from any context.
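
The table-driven approach can be sketched as follows. This is a simplified,
self-contained illustration rather than the code in this patch: the enum,
struct and flow_match() helper are hypothetical, and the match is done in
software here whereas the driver programs equivalent value/mask pairs into
hardware flow rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow types, loosely modeled on the patch. */
enum flow_type {
	FLOW_TYPE_PROMISC,
	FLOW_TYPE_ALLMULTI,
	FLOW_TYPE_MAX,
};

/* One table entry per special flow: destination MAC value and mask. */
struct flow_init {
	uint8_t dst_mac_val[6];
	uint8_t dst_mac_mask[6];
};

static const struct flow_init flow_init[FLOW_TYPE_MAX] = {
	[FLOW_TYPE_PROMISC] = {
		/* All-zero mask: any destination MAC matches. */
		.dst_mac_val = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
		.dst_mac_mask = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
	},
	[FLOW_TYPE_ALLMULTI] = {
		/* Only the group (multicast) bit is compared. */
		.dst_mac_val = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00 },
		.dst_mac_mask = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00 },
	},
};

/* Return 1 if dst_mac matches the given special flow, 0 otherwise. */
static int
flow_match(enum flow_type type, const uint8_t dst_mac[6])
{
	size_t i;

	for (i = 0; i != 6; ++i) {
		uint8_t mask = flow_init[type].dst_mac_mask[i];
		uint8_t val = flow_init[type].dst_mac_val[i];

		if ((dst_mac[i] & mask) != (val & mask))
			return 0;
	}
	return 1;
}
```

Adding another special flow then only requires a new enum value and table
entry instead of another pair of enable/disable functions.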

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |   4 +-
 drivers/net/mlx5/mlx5.h         |   6 +-
 drivers/net/mlx5/mlx5_defs.h    |   3 +
 drivers/net/mlx5/mlx5_rxmode.c  | 321 ++++++++++++++++++----------------------
 drivers/net/mlx5/mlx5_rxq.c     |  33 ++++-
 drivers/net/mlx5/mlx5_rxtx.h    |  29 +++-
 drivers/net/mlx5/mlx5_trigger.c |  14 +-
 7 files changed, 210 insertions(+), 200 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 30d88b5..52bf4b2 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -88,8 +88,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	      ((priv->ctx != NULL) ? priv->ctx->device->name : ""));
 	/* In case mlx5_dev_stop() has not been called. */
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 2f9a594..1c69bfa 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -195,13 +195,11 @@ int mlx5_dev_rss_reta_update(struct rte_eth_dev *,
 
 /* mlx5_rxmode.c */
 
-int priv_promiscuous_enable(struct priv *);
+int priv_special_flow_enable(struct priv *, enum hash_rxq_flow_type);
+void priv_special_flow_disable(struct priv *, enum hash_rxq_flow_type);
 void mlx5_promiscuous_enable(struct rte_eth_dev *);
-void priv_promiscuous_disable(struct priv *);
 void mlx5_promiscuous_disable(struct rte_eth_dev *);
-int priv_allmulticast_enable(struct priv *);
 void mlx5_allmulticast_enable(struct rte_eth_dev *);
-void priv_allmulticast_disable(struct priv *);
 void mlx5_allmulticast_disable(struct rte_eth_dev *);
 
 /* mlx5_stats.c */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index bb82c9a..1ec14ef 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -43,6 +43,9 @@
 /* Maximum number of simultaneous VLAN filters. */
 #define MLX5_MAX_VLAN_IDS 128
 
+/* Maximum number of special flows. */
+#define MLX5_MAX_SPECIAL_FLOWS 2
+
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
 
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 096fd18..b2ed17e 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -58,31 +58,96 @@
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
-static void hash_rxq_promiscuous_disable(struct hash_rxq *);
-static void hash_rxq_allmulticast_disable(struct hash_rxq *);
+/* Initialization data for special flows. */
+static const struct special_flow_init special_flow_init[] = {
+	[HASH_RXQ_FLOW_TYPE_PROMISC] = {
+		.dst_mac_val = "\x00\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x00\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_TCPV4 |
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_TCPV6 |
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+	[HASH_RXQ_FLOW_TYPE_ALLMULTI] = {
+		.dst_mac_val = "\x01\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x01\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+};
 
 /**
- * Enable promiscuous mode in a hash RX queue.
+ * Enable a special flow in a hash RX queue.
  *
  * @param hash_rxq
  *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 static int
-hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
+hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
+			     enum hash_rxq_flow_type flow_type)
 {
 	struct ibv_exp_flow *flow;
 	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
 	struct ibv_exp_flow_attr *attr = &data->attr;
+	struct ibv_exp_flow_spec_eth *spec = &data->spec;
+	const uint8_t *mac;
+	const uint8_t *mask;
 
-	if (hash_rxq->promisc_flow != NULL)
+	/* Check if flow is relevant for this hash_rxq. */
+	if (!(special_flow_init[flow_type].hash_types & (1 << hash_rxq->type)))
+		return 0;
+	/* Check if flow already exists. */
+	if (hash_rxq->special_flow[flow_type] != NULL)
 		return 0;
-	DEBUG("%p: enabling promiscuous mode", (void *)hash_rxq);
-	/* Promiscuous flows only differ from normal flows by not filtering
-	 * on specific MAC addresses. */
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
 	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	/* The first specification must be Ethernet. */
+	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec->size == sizeof(*spec));
+
+	mac = special_flow_init[flow_type].dst_mac_val;
+	mask = special_flow_init[flow_type].dst_mac_mask;
+	*spec = (struct ibv_exp_flow_spec_eth){
+		.type = IBV_EXP_FLOW_SPEC_ETH,
+		.size = sizeof(*spec),
+		.val = {
+			.dst_mac = {
+				mac[0], mac[1], mac[2],
+				mac[3], mac[4], mac[5],
+			},
+		},
+		.mask = {
+			.dst_mac = {
+				mask[0], mask[1], mask[2],
+				mask[3], mask[4], mask[5],
+			},
+		},
+	};
+
 	errno = 0;
 	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
 	if (flow == NULL) {
@@ -94,38 +159,63 @@ hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
 			return errno;
 		return EINVAL;
 	}
-	hash_rxq->promisc_flow = flow;
-	DEBUG("%p: promiscuous mode enabled", (void *)hash_rxq);
+	hash_rxq->special_flow[flow_type] = flow;
+	DEBUG("%p: enabling special flow %s (%d)",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
 	return 0;
 }
 
 /**
- * Enable promiscuous mode in all hash RX queues.
+ * Disable a special flow in a hash RX queue.
+ *
+ * @param hash_rxq
+ *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+static void
+hash_rxq_special_flow_disable(struct hash_rxq *hash_rxq,
+			      enum hash_rxq_flow_type flow_type)
+{
+	if (hash_rxq->special_flow[flow_type] == NULL)
+		return;
+	DEBUG("%p: disabling special flow %s (%d)",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+	claim_zero(ibv_exp_destroy_flow(hash_rxq->special_flow[flow_type]));
+	hash_rxq->special_flow[flow_type] = NULL;
+	DEBUG("%p: special flow %s (%d) disabled",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+}
+
+/**
+ * Enable a special flow in all hash RX queues.
  *
  * @param priv
  *   Private structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 int
-priv_promiscuous_enable(struct priv *priv)
+priv_special_flow_enable(struct priv *priv, enum hash_rxq_flow_type flow_type)
 {
 	unsigned int i;
 
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_PROMISC))
+	if (!priv_allow_flow_type(priv, flow_type))
 		return 0;
 	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
 		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
 		int ret;
 
-		ret = hash_rxq_promiscuous_enable(hash_rxq);
+		ret = hash_rxq_special_flow_enable(hash_rxq, flow_type);
 		if (!ret)
 			continue;
 		/* Failure, rollback. */
 		while (i != 0) {
 			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_promiscuous_disable(hash_rxq);
+			hash_rxq_special_flow_disable(hash_rxq, flow_type);
 		}
 		return ret;
 	}
@@ -133,6 +223,26 @@ priv_promiscuous_enable(struct priv *priv)
 }
 
 /**
+ * Disable a special flow in all hash RX queues.
+ *
+ * @param priv
+ *   Private structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+void
+priv_special_flow_disable(struct priv *priv, enum hash_rxq_flow_type flow_type)
+{
+	unsigned int i;
+
+	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
+		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
+
+		hash_rxq_special_flow_disable(hash_rxq, flow_type);
+	}
+}
+
+/**
  * DPDK callback to enable promiscuous mode.
  *
  * @param dev
@@ -146,49 +256,14 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->promisc_req = 1;
-	ret = priv_promiscuous_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable promiscuous mode: %s", strerror(ret));
-	else {
-		priv_mac_addrs_disable(priv);
-		priv_allmulticast_disable(priv);
-	}
+		ERROR("error while enabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable promiscuous mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_promiscuous_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->promisc_flow == NULL)
-		return;
-	DEBUG("%p: disabling promiscuous mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->promisc_flow));
-	hash_rxq->promisc_flow = NULL;
-	DEBUG("%p: promiscuous mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable promiscuous mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_promiscuous_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_promiscuous_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable promiscuous mode.
  *
  * @param dev
@@ -198,105 +273,18 @@ void
 mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->promisc_req = 0;
-	priv_promiscuous_disable(priv);
-	priv_mac_addrs_enable(priv);
-	priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Enable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-static int
-hash_rxq_allmulticast_enable(struct hash_rxq *hash_rxq)
-{
-	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
-	struct ibv_exp_flow_attr *attr = &data->attr;
-	struct ibv_exp_flow_spec_eth *spec = &data->spec;
-
-	if (hash_rxq->allmulti_flow != NULL)
-		return 0;
-	DEBUG("%p: enabling allmulticast mode", (void *)hash_rxq);
-	/*
-	 * No padding must be inserted by the compiler between attr and spec.
-	 * This layout is expected by libibverbs.
-	 */
-	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
-	*spec = (struct ibv_exp_flow_spec_eth){
-		.type = IBV_EXP_FLOW_SPEC_ETH,
-		.size = sizeof(*spec),
-		.val = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-		.mask = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-	};
-	errno = 0;
-	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
-	if (flow == NULL) {
-		/* It's not clear whether errno is always set in this case. */
-		ERROR("%p: flow configuration failed, errno=%d: %s",
-		      (void *)hash_rxq, errno,
-		      (errno ? strerror(errno) : "Unknown error"));
-		if (errno)
-			return errno;
-		return EINVAL;
-	}
-	hash_rxq->allmulti_flow = flow;
-	DEBUG("%p: allmulticast mode enabled", (void *)hash_rxq);
-	return 0;
-}
-
-/**
- * Enable allmulti mode in most hash RX queues.
- * TCP queues are exempted to save resources.
- *
- * @param priv
- *   Private structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-int
-priv_allmulticast_enable(struct priv *priv)
-{
-	unsigned int i;
-
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI))
-		return 0;
-	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
-		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
-		int ret;
-
-		/* allmulticast not relevant for TCP. */
-		if (hash_rxq->type == HASH_RXQ_TCPV4)
-			continue;
-		ret = hash_rxq_allmulticast_enable(hash_rxq);
-		if (!ret)
-			continue;
-		/* Failure, rollback. */
-		while (i != 0) {
-			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_allmulticast_disable(hash_rxq);
-		}
-		return ret;
-	}
-	return 0;
-}
-
-/**
  * DPDK callback to enable allmulti mode.
  *
  * @param dev
@@ -310,45 +298,14 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->allmulti_req = 1;
-	ret = priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable allmulticast mode: %s", strerror(ret));
+		ERROR("error while enabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_allmulticast_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->allmulti_flow == NULL)
-		return;
-	DEBUG("%p: disabling allmulticast mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->allmulti_flow));
-	hash_rxq->allmulti_flow = NULL;
-	DEBUG("%p: allmulticast mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable allmulti mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_allmulticast_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_allmulticast_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable allmulti mode.
  *
  * @param dev
@@ -358,9 +315,13 @@ void
 mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->allmulti_req = 0;
-	priv_allmulticast_disable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ebbe186..166516a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -534,8 +534,8 @@ priv_destroy_hash_rxqs(struct priv *priv)
 		assert(hash_rxq->priv == priv);
 		assert(hash_rxq->qp != NULL);
 		/* Also check that there are no remaining flows. */
-		assert(hash_rxq->allmulti_flow == NULL);
-		assert(hash_rxq->promisc_flow == NULL);
+		for (j = 0; (j != RTE_DIM(hash_rxq->special_flow)); ++j)
+			assert(hash_rxq->special_flow[j] == NULL);
 		for (j = 0; (j != RTE_DIM(hash_rxq->mac_flow)); ++j)
 			for (k = 0; (k != RTE_DIM(hash_rxq->mac_flow[j])); ++k)
 				assert(hash_rxq->mac_flow[j][k] == NULL);
@@ -586,6 +586,35 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 }
 
 /**
+ * Automatically enable/disable flows according to configuration.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+priv_rehash_flows(struct priv *priv)
+{
+	unsigned int i;
+
+	for (i = 0; (i != RTE_DIM((*priv->hash_rxqs)[0].special_flow)); ++i)
+		if (!priv_allow_flow_type(priv, i)) {
+			priv_special_flow_disable(priv, i);
+		} else {
+			int ret = priv_special_flow_enable(priv, i);
+
+			if (ret)
+				return ret;
+		}
+	if (priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_MAC))
+		return priv_mac_addrs_enable(priv);
+	priv_mac_addrs_disable(priv);
+	return 0;
+}
+
+/**
  * Allocate RX queue elements with scattered packets support.
  *
  * @param rxq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e1e1925..983e6a4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -176,20 +176,42 @@ struct ind_table_init {
 	unsigned int hash_types_n;
 };
 
+/* Initialization data for special flows. */
+struct special_flow_init {
+	uint8_t dst_mac_val[6];
+	uint8_t dst_mac_mask[6];
+	unsigned int hash_types;
+};
+
 enum hash_rxq_flow_type {
-	HASH_RXQ_FLOW_TYPE_MAC,
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
+#ifndef NDEBUG
+static inline const char *
+hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
+{
+	switch (flow_type) {
+	case HASH_RXQ_FLOW_TYPE_PROMISC:
+		return "promiscuous";
+	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
+		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_MAC:
+		return "MAC";
+	}
+	return NULL;
+}
+#endif /* NDEBUG */
+
 struct hash_rxq {
 	struct priv *priv; /* Back pointer to private data. */
 	struct ibv_qp *qp; /* Hash RX QP. */
 	enum hash_rxq_type type; /* Hash RX queue type. */
 	/* MAC flow steering rules, one per VLAN ID. */
 	struct ibv_exp_flow *mac_flow[MLX5_MAX_MAC_ADDRESSES][MLX5_MAX_VLAN_IDS];
-	struct ibv_exp_flow *promisc_flow; /* Promiscuous flow. */
-	struct ibv_exp_flow *allmulti_flow; /* Multicast flow. */
+	struct ibv_exp_flow *special_flow[MLX5_MAX_SPECIAL_FLOWS];
 };
 
 /* TX element. */
@@ -247,6 +269,7 @@ size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
+int priv_rehash_flows(struct priv *);
 void rxq_cleanup(struct rxq *);
 int rxq_rehash(struct rte_eth_dev *, struct rxq *);
 int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ff1203d..d9f7d00 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -72,11 +72,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DEBUG("%p: allocating and configuring hash RX queues", (void *)dev);
 	err = priv_create_hash_rxqs(priv);
 	if (!err)
-		err = priv_promiscuous_enable(priv);
-	if (!err)
-		err = priv_mac_addrs_enable(priv);
-	if (!err)
-		err = priv_allmulticast_enable(priv);
+		err = priv_rehash_flows(priv);
 	if (!err)
 		priv->started = 1;
 	else {
@@ -84,8 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
-		priv_allmulticast_disable(priv);
-		priv_promiscuous_disable(priv);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
@@ -113,8 +109,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-- 
2.1.4
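
Note on the rollback loop in priv_special_flow_enable() above: the pattern is to enable the flow on each queue in order and, on the first failure, disable it again on every queue already processed before returning the error. A minimal standalone sketch of that pattern follows. The names `struct queue`, `queue_enable()`, `queue_disable()` and `queues_enable_all()` are hypothetical stand-ins, not driver symbols, and the `fail` field only simulates an error from the verbs layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash RX queue. */
struct queue {
	int flow_enabled; /* Nonzero once the flow is installed. */
	int fail; /* Nonzero: enabling fails with this errno-like value. */
};

static int
queue_enable(struct queue *q)
{
	if (q->flow_enabled)
		return 0; /* Already enabled, nothing to do. */
	if (q->fail)
		return q->fail; /* Simulated failure. */
	q->flow_enabled = 1;
	return 0;
}

static void
queue_disable(struct queue *q)
{
	q->flow_enabled = 0;
}

/* Enable the flow on all n queues; undo earlier ones on failure. */
static int
queues_enable_all(struct queue *queues, size_t n)
{
	size_t i;

	assert(queues != NULL || n == 0);
	for (i = 0; i != n; ++i) {
		int ret = queue_enable(&queues[i]);

		if (!ret)
			continue;
		/* Failure, rollback queues [0, i). */
		while (i != 0)
			queue_disable(&queues[--i]);
		return ret;
	}
	return 0;
}
```

As in the patch, the rollback walks backwards from the failing index, so the caller sees either every queue enabled or none.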


* [dpdk-dev] [PATCH v2 2/5] mlx5: add special flows (broadcast and IPv6 multicast)
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 1/5] mlx5: refactor special flows handling Adrien Mazarguil
@ 2016-02-22 18:02   ` Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Until now, broadcast frames were handled like unicast. Moving the related
flow to the special flows table frees up the related unicast MAC entry.

The same method is used to handle IPv6 multicast frames.
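
For illustration, the two new entries can be read as destination MAC value/mask pairs. The sketch below assumes the usual value/mask semantics (a frame matches when its destination address equals the value on every masked bit); that is an assumption about how the rule is applied, not something this patch states. `struct mac_rule` and `mac_rule_match()` are hypothetical names; the byte values are the ones used in special_flow_init[].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6

/* Hypothetical value/mask pair for a destination MAC address. */
struct mac_rule {
	uint8_t val[MAC_LEN];
	uint8_t mask[MAC_LEN];
};

/* Broadcast: all six bytes must be 0xff. */
static const struct mac_rule rule_broadcast = {
	.val = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
	.mask = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
};

/* IPv6 multicast: only the 33:33 prefix is compared. */
static const struct mac_rule rule_ipv6_multicast = {
	.val = { 0x33, 0x33, 0x00, 0x00, 0x00, 0x00 },
	.mask = { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 },
};

/* Return 1 when dst equals the rule value on every masked bit. */
static int
mac_rule_match(const struct mac_rule *rule, const uint8_t dst[MAC_LEN])
{
	size_t i;

	assert(rule != NULL && dst != NULL);
	for (i = 0; i != MAC_LEN; ++i)
		if ((dst[i] & rule->mask[i]) != (rule->val[i] & rule->mask[i]))
			return 0;
	return 1;
}
```

Under that reading, 33:33:00:00:00:01 matches the IPv6 multicast rule but not the broadcast one, and a unicast address matches neither.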

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |  7 +++----
 drivers/net/mlx5/mlx5_defs.h    |  2 +-
 drivers/net/mlx5/mlx5_ethdev.c  |  3 +--
 drivers/net/mlx5/mlx5_mac.c     |  6 ++----
 drivers/net/mlx5/mlx5_rxmode.c  | 24 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c     | 10 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.h    |  6 ++++++
 drivers/net/mlx5/mlx5_trigger.c |  4 ++++
 8 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 52bf4b2..cf7c4a5 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -90,6 +90,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
@@ -416,13 +418,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		     mac.addr_bytes[0], mac.addr_bytes[1],
 		     mac.addr_bytes[2], mac.addr_bytes[3],
 		     mac.addr_bytes[4], mac.addr_bytes[5]);
-		/* Register MAC and broadcast addresses. */
+		/* Register MAC address. */
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
-		claim_zero(priv_mac_addr_add(priv, (RTE_DIM(priv->mac) - 1),
-					     &(const uint8_t [ETHER_ADDR_LEN])
-					     { "\xff\xff\xff\xff\xff\xff" }));
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 1ec14ef..67c3948 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -44,7 +44,7 @@
 #define MLX5_MAX_VLAN_IDS 128
 
 /* Maximum number of special flows. */
-#define MLX5_MAX_SPECIAL_FLOWS 2
+#define MLX5_MAX_SPECIAL_FLOWS 4
 
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 1159fa3..6704382 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -501,8 +501,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		max = 65535;
 	info->max_rx_queues = max;
 	info->max_tx_queues = max;
-	/* Last array entry is reserved for broadcast. */
-	info->max_mac_addrs = (RTE_DIM(priv->mac) - 1);
+	info->max_mac_addrs = RTE_DIM(priv->mac);
 	info->rx_offload_capa =
 		(priv->hw_csum ?
 		 (DEV_RX_OFFLOAD_IPV4_CKSUM |
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index b1f34d9..a1a7ff5 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -212,8 +212,7 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_del(priv, index);
 end:
@@ -479,8 +478,7 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_add(priv, index,
 			  (const uint8_t (*)[ETHER_ADDR_LEN])
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index b2ed17e..6ee7ce3 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -88,6 +88,30 @@ static const struct special_flow_init special_flow_init[] = {
 			1 << HASH_RXQ_ETH |
 			0,
 	},
+	[HASH_RXQ_FLOW_TYPE_BROADCAST] = {
+		.dst_mac_val = "\xff\xff\xff\xff\xff\xff",
+		.dst_mac_mask = "\xff\xff\xff\xff\xff\xff",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#ifdef HAVE_FLOW_SPEC_IPV6
+	[HASH_RXQ_FLOW_TYPE_IPV6MULTI] = {
+		.dst_mac_val = "\x33\x33\x00\x00\x00\x00",
+		.dst_mac_mask = "\xff\xff\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#endif /* HAVE_FLOW_SPEC_IPV6 */
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 166516a..fcf192a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -579,8 +579,18 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 		return !!priv->promisc_req;
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return !!priv->allmulti_req;
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+		/* If allmulti is enabled, broadcast and ipv6multi
+		 * are unnecessary. */
+		return !priv->allmulti_req;
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return 1;
+	default:
+		/* Unsupported flow type is not allowed. */
+		return 0;
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 983e6a4..d5a5019 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -186,6 +186,8 @@ struct special_flow_init {
 enum hash_rxq_flow_type {
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_BROADCAST,
+	HASH_RXQ_FLOW_TYPE_IPV6MULTI,
 	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
@@ -198,6 +200,10 @@ hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
 		return "promiscuous";
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+		return "broadcast";
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+		return "IPv6 multicast";
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return "MAC";
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index d9f7d00..90b8068 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -80,6 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
@@ -109,6 +111,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
-- 
2.1.4
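
Note on the priv_allow_flow_type() hunk above: broadcast and IPv6 multicast destination addresses both have the multicast bit (0x01 of the first byte) set, so an allmulticast rule already covers them, which is why those two flow types are refused while allmulti is requested. The sketch below models only the switch visible in the hunk. Any check that precedes the switch in the real function is not shown in the hunk and is not modelled here. `enum flow_type`, `struct rx_mode` and `flow_type_allowed()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the flow types handled by the switch. */
enum flow_type {
	FLOW_PROMISC,
	FLOW_ALLMULTI,
	FLOW_BROADCAST,
	FLOW_IPV6MULTI,
	FLOW_MAC,
};

/* Hypothetical subset of the private structure. */
struct rx_mode {
	unsigned int promisc_req:1; /* Promiscuous mode requested. */
	unsigned int allmulti_req:1; /* All multicast mode requested. */
};

/* Return nonzero when flows of the given type should be installed. */
static int
flow_type_allowed(const struct rx_mode *mode, enum flow_type type)
{
	assert(mode != NULL);
	switch (type) {
	case FLOW_PROMISC:
		return !!mode->promisc_req;
	case FLOW_ALLMULTI:
		return !!mode->allmulti_req;
	case FLOW_BROADCAST:
	case FLOW_IPV6MULTI:
		/* Redundant once allmulti is on. */
		return !mode->allmulti_req;
	case FLOW_MAC:
		return 1;
	default:
		/* Unsupported flow type is not allowed. */
		return 0;
	}
}
```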


* [dpdk-dev] [PATCH v2 3/5] mlx5: make flow steering rule generator more generic
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 1/5] mlx5: refactor special flows handling Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
@ 2016-02-22 18:02   ` Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for flow director Adrien Mazarguil
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Upcoming flow director support will reuse this function to generate filter
rules.
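
For readers unfamiliar with the calling convention being generalized: the function is called once with a NULL buffer to learn the required size, then again with a buffer of that size to fill it in; per its documentation, nothing is written when the buffer is too small. A reduced sketch of that convention follows. `struct attr_hdr`, `spec_size_by_type[]` and `fill_attr()` are hypothetical and the sizes are made up; the real function derives them from hash_rxq_init[].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: trailing specification bytes needed per type. */
static const size_t spec_size_by_type[] = { 8, 24, 40 };

/* Hypothetical attribute header followed by trailing specifications. */
struct attr_hdr {
	unsigned int type;
	unsigned int num_of_specs;
};

/*
 * Fill attr for the given type. Always return the total size required;
 * write nothing when attr is NULL or attr_size is too small.
 */
static size_t
fill_attr(struct attr_hdr *attr, size_t attr_size, unsigned int type)
{
	size_t need;

	assert(type < sizeof(spec_size_by_type) / sizeof(spec_size_by_type[0]));
	need = sizeof(*attr) + spec_size_by_type[type];
	if (attr == NULL || attr_size < need)
		return need;
	memset(attr, 0, need);
	attr->type = type;
	attr->num_of_specs = 1;
	return need;
}
```

Taking the type as a plain parameter, as this patch does, lets a caller that has no hash RX queue at hand size and fill a rule for any queue type.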

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_mac.c    |  4 ++--
 drivers/net/mlx5/mlx5_rxmode.c |  5 +++--
 drivers/net/mlx5/mlx5_rxq.c    | 16 ++++++++--------
 drivers/net/mlx5/mlx5_rxtx.h   |  4 ++--
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index a1a7ff5..edb05ad 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -241,7 +241,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	const uint8_t (*mac)[ETHER_ADDR_LEN] =
 			(const uint8_t (*)[ETHER_ADDR_LEN])
 			priv->mac[mac_index].addr_bytes;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	unsigned int vlan_enabled = !!priv->vlan_filter_n;
@@ -256,7 +256,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 6ee7ce3..9ac7a41 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -129,8 +129,9 @@ static int
 hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 			     enum hash_rxq_flow_type flow_type)
 {
+	struct priv *priv = hash_rxq->priv;
 	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	const uint8_t *mac;
@@ -148,7 +149,7 @@ hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index fcf192a..36910b2 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -210,27 +210,27 @@ const size_t rss_hash_default_key_len = sizeof(rss_hash_default_key);
  * information from hash_rxq_init[]. Nothing is written to flow_attr when
  * flow_attr_size is not large enough, but the required size is still returned.
  *
- * @param[in] hash_rxq
- *   Pointer to hash RX queue.
+ * @param priv
+ *   Pointer to private structure.
  * @param[out] flow_attr
  *   Pointer to flow attribute structure to fill. Note that the allocated
  *   area must be larger and large enough to hold all flow specifications.
  * @param flow_attr_size
  *   Entire size of flow_attr and trailing room for flow specifications.
+ * @param type
+ *   Hash RX queue type to use for flow steering rule.
  *
  * @return
  *   Total size of the flow attribute buffer. No errors are defined.
  */
 size_t
-hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
-		   struct ibv_exp_flow_attr *flow_attr,
-		   size_t flow_attr_size)
+priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
+	       size_t flow_attr_size, enum hash_rxq_type type)
 {
 	size_t offset = sizeof(*flow_attr);
-	enum hash_rxq_type type = hash_rxq->type;
 	const struct hash_rxq_init *init = &hash_rxq_init[type];
 
-	assert(hash_rxq->priv != NULL);
+	assert(priv != NULL);
 	assert((size_t)type < RTE_DIM(hash_rxq_init));
 	do {
 		offset += init->flow_spec.hdr.size;
@@ -244,7 +244,7 @@ hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
 		.priority = init->flow_priority,
 		.num_of_specs = 0,
-		.port = hash_rxq->priv->port,
+		.port = priv->port,
 		.flags = 0,
 	};
 	do {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index d5a5019..c42bb8d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -270,8 +270,8 @@ extern const unsigned int hash_rxq_init_n;
 extern uint8_t rss_hash_default_key[];
 extern const size_t rss_hash_default_key_len;
 
-size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
-			  size_t);
+size_t priv_flow_attr(struct priv *, struct ibv_exp_flow_attr *,
+		      size_t, enum hash_rxq_type);
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
-- 
2.1.4


* [dpdk-dev] [PATCH v2 4/5] mlx5: add support for flow director
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
                     ` (2 preceding siblings ...)
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
@ 2016-02-22 18:02   ` Adrien Mazarguil
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev; +Cc: Raslan Darawsheh

From: Yaacov Hazan <yaacovh@mellanox.com>

Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).

This feature requires MLNX_OFED >= 3.2.
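
The version requirement is enforced at build time: a generated header defines a macro when the installed verbs headers provide a given symbol, and the driver only wires up the flow director callback when that macro is present. A reduced sketch of this kind of gating follows. `HAVE_NEW_VERBS_FEATURE`, `FDIR_SUPPORT`, `struct dev_ops` and the `dev_*()` functions are hypothetical stand-ins, not the driver's symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * HAVE_NEW_VERBS_FEATURE is hypothetical; it stands for a macro that a
 * configure step would define when the installed headers are recent enough.
 */
#ifdef HAVE_NEW_VERBS_FEATURE
#define FDIR_SUPPORT 1
#endif

/* Hypothetical operations table. */
struct dev_ops {
	int (*start)(void);
	int (*filter_ctrl)(int op); /* Left NULL when unsupported. */
};

static int
dev_start(void)
{
	return 0;
}

#ifdef FDIR_SUPPORT
static int
dev_filter_ctrl(int op)
{
	(void)op;
	return 0;
}
#endif /* FDIR_SUPPORT */

static const struct dev_ops ops = {
	.start = dev_start,
#ifdef FDIR_SUPPORT
	.filter_ctrl = dev_filter_ctrl,
#endif /* FDIR_SUPPORT */
};

/* Callers probe the callback instead of assuming it exists. */
static int
dev_supports_filter_ctrl(const struct dev_ops *o)
{
	assert(o != NULL);
	return o->filter_ctrl != NULL;
}
```

Built without the macro, the callback stays NULL and the same source still compiles against the older headers.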

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
---
 doc/guides/nics/mlx5.rst               |  14 +-
 doc/guides/rel_notes/release_16_04.rst |   7 +
 drivers/net/mlx5/Makefile              |   6 +
 drivers/net/mlx5/mlx5.c                |  12 +
 drivers/net/mlx5/mlx5.h                |  10 +
 drivers/net/mlx5/mlx5_defs.h           |  11 +
 drivers/net/mlx5/mlx5_fdir.c           | 910 +++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c            |   6 +
 drivers/net/mlx5/mlx5_rxtx.h           |   7 +
 drivers/net/mlx5/mlx5_trigger.c        |   3 +
 10 files changed, 985 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b2a12ce..8981068 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -86,6 +86,7 @@ Features
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
+- Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
 
 Limitations
 -----------
@@ -214,7 +215,8 @@ DPDK and must be installed separately:
 
 Currently supported by DPDK:
 
-- Mellanox OFED **3.1-1.0.3** or **3.1-1.5.7.1** depending on usage.
+- Mellanox OFED **3.1-1.0.3**, **3.1-1.5.7.1** or **3.2-1.0.1.1** depending
+  on usage.
 
     The following features are supported with version **3.1-1.5.7.1** and
     above only:
@@ -223,6 +225,11 @@ Currently supported by DPDK:
     - RX checksum offloads.
     - IBM POWER8.
 
+    The following features are supported with version **3.2-1.0.1.1** and
+    above only:
+
+    - Flow director.
+
 - Minimum firmware version:
 
   With MLNX_OFED **3.1-1.0.3**:
@@ -235,6 +242,11 @@ Currently supported by DPDK:
   - ConnectX-4: **12.13.0144**
   - ConnectX-4 Lx: **14.13.0144**
 
+  With MLNX_OFED **3.2-1.0.1.1**:
+
+  - ConnectX-4: **12.14.1100**
+  - ConnectX-4 Lx: **14.14.1100**
+
 Getting Mellanox OFED
 ~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index cb8d432..b3ced63 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -66,6 +66,13 @@ This section should contain new features added in this release. Sample format:
 
 * **szedata2: Add functions for setting link up/down.**
 
+* **mlx5: flow director support.**
+
+  Added flow director support (RTE_FDIR_MODE_PERFECT and
+  RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 698f072..46a17e0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -52,6 +52,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rxmode.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vlan.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_fdir.c
 
 # Dependencies.
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += lib/librte_ether
@@ -125,6 +126,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		infiniband/verbs.h \
+		enum IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index cf7c4a5..43e24ff 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -94,6 +94,11 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+
+	/* Remove flow director elements. */
+	priv_fdir_disable(priv);
+	priv_fdir_delete_filters_list(priv);
+
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
@@ -170,6 +175,9 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
 	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
+#ifdef MLX5_FDIR_SUPPORT
+	.filter_ctrl = mlx5_dev_filter_ctrl,
+#endif /* MLX5_FDIR_SUPPORT */
 };
 
 static struct {
@@ -422,6 +430,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
+		/* Initialize FD filters list. */
+		err = fdir_init_filters_list(priv);
+		if (err)
+			goto port_error;
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1c69bfa..8019ee3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -120,6 +120,7 @@ struct priv {
 	struct rte_intr_handle intr_handle; /* Interrupt handler. */
 	unsigned int (*reta_idx)[]; /* RETA index table. */
 	unsigned int reta_idx_n; /* RETA index size. */
+	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
@@ -216,4 +217,13 @@ int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
 int mlx5_dev_start(struct rte_eth_dev *);
 void mlx5_dev_stop(struct rte_eth_dev *);
 
+/* mlx5_fdir.c */
+
+int fdir_init_filters_list(struct priv *);
+void priv_fdir_delete_filters_list(struct priv *);
+void priv_fdir_disable(struct priv *);
+void priv_fdir_enable(struct priv *);
+int mlx5_dev_filter_ctrl(struct rte_eth_dev *, enum rte_filter_type,
+			 enum rte_filter_op, void *);
+
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 67c3948..5b00d8e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -34,6 +34,8 @@
 #ifndef RTE_PMD_MLX5_DEFS_H_
 #define RTE_PMD_MLX5_DEFS_H_
 
+#include "mlx5_autoconf.h"
+
 /* Reported driver name. */
 #define MLX5_DRIVER_NAME "librte_pmd_mlx5"
 
@@ -84,4 +86,13 @@
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
+/*
+ * Extended flow priorities necessary to support flow director are available
+ * since MLNX_OFED 3.2. Considering this version adds support for VLAN
+ * offloads as well, their availability means flow director can be used.
+ */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+#define MLX5_FDIR_SUPPORT 1
+#endif
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
new file mode 100644
index 0000000..6402c62
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -0,0 +1,910 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2015 6WIND S.A.
+ *   Copyright 2015 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <assert.h>
+#include <stdint.h>
+#include <string.h>
+#include <errno.h>
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <infiniband/verbs.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* DPDK headers don't like -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_ethdev.h>
+#include <rte_common.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+#include "mlx5.h"
+#include "mlx5_rxtx.h"
+
+struct fdir_flow_desc {
+	uint16_t dst_port;
+	uint16_t src_port;
+	uint32_t src_ip[4];
+	uint32_t dst_ip[4];
+	uint8_t	mac[6];
+	uint16_t vlan_tag;
+	enum hash_rxq_type type;
+};
+
+struct mlx5_fdir_filter {
+	LIST_ENTRY(mlx5_fdir_filter) next;
+	uint16_t queue; /* RX queue that receives matching packets. */
+	struct fdir_flow_desc desc;
+	struct ibv_exp_flow *flow;
+};
+
+LIST_HEAD(fdir_filter_list, mlx5_fdir_filter);
+
+/**
+ * Convert struct rte_eth_fdir_filter to mlx5 filter descriptor.
+ *
+ * @param[in] fdir_filter
+ *   DPDK filter structure to convert.
+ * @param[out] desc
+ *   Resulting mlx5 filter descriptor.
+ * @param mode
+ *   Flow director mode.
+ */
+static void
+fdir_filter_to_flow_desc(const struct rte_eth_fdir_filter *fdir_filter,
+			 struct fdir_flow_desc *desc, enum rte_fdir_mode mode)
+{
+	/* Initialize descriptor. */
+	memset(desc, 0, sizeof(*desc));
+
+	/* Set VLAN ID. */
+	desc->vlan_tag = fdir_filter->input.flow_ext.vlan_tci;
+
+	/* Set MAC address. */
+	if (mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		rte_memcpy(desc->mac,
+			   fdir_filter->input.flow.mac_vlan_flow.mac_addr.
+				addr_bytes,
+			   sizeof(desc->mac));
+		desc->type = HASH_RXQ_ETH;
+		return;
+	}
+
+	/* Set flow type. */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+		desc->type = HASH_RXQ_UDPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->type = HASH_RXQ_TCPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->type = HASH_RXQ_IPV4;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+		desc->type = HASH_RXQ_UDPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->type = HASH_RXQ_TCPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		desc->type = HASH_RXQ_IPV6;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+
+	/* Set flow values */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->src_port = fdir_filter->input.flow.udp4_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp4_flow.dst_port;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->src_ip[0] = fdir_filter->input.flow.ip4_flow.src_ip;
+		desc->dst_ip[0] = fdir_filter->input.flow.ip4_flow.dst_ip;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->src_port = fdir_filter->input.flow.udp6_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp6_flow.dst_port;
+		/* Fall through. */
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		rte_memcpy(desc->src_ip,
+			   fdir_filter->input.flow.ipv6_flow.src_ip,
+			   sizeof(desc->src_ip));
+		rte_memcpy(desc->dst_ip,
+			   fdir_filter->input.flow.ipv6_flow.dst_ip,
+			   sizeof(desc->dst_ip));
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+}
+
+/**
+ * Create flow director steering rule for a specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create a steering rule for.
+ * @param fdir_queue
+ *   Flow director queue for matching packets.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_flow_add(struct priv *priv,
+		   struct mlx5_fdir_filter *mlx5_fdir_filter,
+		   struct fdir_queue *fdir_queue)
+{
+	struct ibv_exp_flow *flow;
+	struct fdir_flow_desc *desc = &mlx5_fdir_filter->desc;
+	enum rte_fdir_mode fdir_mode =
+		priv->dev->data->dev_conf.fdir_conf.mode;
+	struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, desc->type));
+	struct ibv_exp_flow_attr *attr = &data->attr;
+	uintptr_t spec_offset = (uintptr_t)&data->spec;
+	struct ibv_exp_flow_spec_eth *spec_eth;
+	struct ibv_exp_flow_spec_ipv4 *spec_ipv4;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	struct ibv_exp_flow_spec_ipv6 *spec_ipv6;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	struct ibv_exp_flow_spec_tcp_udp *spec_tcp_udp;
+	unsigned int i;
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset);
+	priv_flow_attr(priv, attr, sizeof(data), desc->type);
+
+	/* Set Ethernet spec */
+	spec_eth = (struct ibv_exp_flow_spec_eth *)spec_offset;
+
+	/* The first specification must be Ethernet. */
+	assert(spec_eth->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec_eth->size == sizeof(*spec_eth));
+
+	/* VLAN ID */
+	spec_eth->val.vlan_tag = desc->vlan_tag;
+	spec_eth->mask.vlan_tag = mask->vlan_tci_mask;
+
+	/* Update priority */
+	attr->priority = 2;
+
+	if (fdir_mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		/* MAC Address */
+		rte_memcpy(spec_eth->val.dst_mac,
+			   desc->mac,
+			   sizeof(spec_eth->val.dst_mac));
+		/* The same byte mask applies to every MAC address byte. */
+		for (i = 0; i < sizeof(spec_eth->mask.dst_mac); i++)
+			spec_eth->mask.dst_mac[i] = mask->mac_addr_byte_mask;
+
+		goto create_flow;
+	}
+
+	switch (desc->type) {
+	case HASH_RXQ_IPV4:
+	case HASH_RXQ_UDPV4:
+	case HASH_RXQ_TCPV4:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv4 = (struct ibv_exp_flow_spec_ipv4 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv4->type == IBV_EXP_FLOW_SPEC_IPV4);
+		assert(spec_ipv4->size == sizeof(*spec_ipv4));
+
+		spec_ipv4->val.src_ip = desc->src_ip[0];
+		spec_ipv4->val.dst_ip = desc->dst_ip[0];
+		spec_ipv4->mask.src_ip = mask->ipv4_mask.src_ip;
+		spec_ipv4->mask.dst_ip = mask->ipv4_mask.dst_ip;
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV4)
+			goto create_flow;
+
+		spec_offset += spec_ipv4->size;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_IPV6:
+	case HASH_RXQ_UDPV6:
+	case HASH_RXQ_TCPV6:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv6 = (struct ibv_exp_flow_spec_ipv6 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv6->type == IBV_EXP_FLOW_SPEC_IPV6);
+		assert(spec_ipv6->size == sizeof(*spec_ipv6));
+
+		rte_memcpy(spec_ipv6->val.src_ip,
+			   desc->src_ip,
+			   sizeof(spec_ipv6->val.src_ip));
+		rte_memcpy(spec_ipv6->val.dst_ip,
+			   desc->dst_ip,
+			   sizeof(spec_ipv6->val.dst_ip));
+		rte_memcpy(spec_ipv6->mask.src_ip,
+			   mask->ipv6_mask.src_ip,
+			   sizeof(spec_ipv6->mask.src_ip));
+		rte_memcpy(spec_ipv6->mask.dst_ip,
+			   mask->ipv6_mask.dst_ip,
+			   sizeof(spec_ipv6->mask.dst_ip));
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV6)
+			goto create_flow;
+
+		spec_offset += spec_ipv6->size;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		ERROR("invalid flow attribute type");
+		return EINVAL;
+	}
+
+	/* Set TCP/UDP flow specification. */
+	spec_tcp_udp = (struct ibv_exp_flow_spec_tcp_udp *)spec_offset;
+
+	/* The third specification must be TCP/UDP. */
+	assert(spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_TCP ||
+	       spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_UDP);
+	assert(spec_tcp_udp->size == sizeof(*spec_tcp_udp));
+
+	spec_tcp_udp->val.src_port = desc->src_port;
+	spec_tcp_udp->val.dst_port = desc->dst_port;
+	spec_tcp_udp->mask.src_port = mask->src_port_mask;
+	spec_tcp_udp->mask.dst_port = mask->dst_port_mask;
+
+	/* Update priority */
+	attr->priority = 0;
+
+create_flow:
+
+	errno = 0;
+	flow = ibv_exp_create_flow(fdir_queue->qp, attr);
+	if (flow == NULL) {
+		/* It's not clear whether errno is always set in this case. */
+		ERROR("%p: flow director configuration failed, errno=%d: %s",
+		      (void *)priv, errno,
+		      (errno ? strerror(errno) : "Unknown error"));
+		if (errno)
+			return errno;
+		return EINVAL;
+	}
+
+	DEBUG("%p: added flow director rule (%p)", (void *)priv, (void *)flow);
+	mlx5_fdir_filter->flow = flow;
+	return 0;
+}
+
+/**
+ * Get flow director queue for a specific RX queue, create it in case
+ * it does not exist.
+ *
+ * @param priv
+ *   Private structure.
+ * @param idx
+ *   RX queue index.
+ *
+ * @return
+ *   Related flow director queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_get_fdir_queue(struct priv *priv, uint16_t idx)
+{
+	struct fdir_queue *fdir_queue = &(*priv->rxqs)[idx]->fdir_queue;
+	struct ibv_exp_rwq_ind_table *ind_table = NULL;
+	struct ibv_qp *qp = NULL;
+	struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
+	struct ibv_exp_rx_hash_conf hash_conf;
+	struct ibv_exp_qp_init_attr qp_init_attr;
+	int err = 0;
+
+	/* Return immediately if it has already been created. */
+	if (fdir_queue->qp != NULL)
+		return fdir_queue;
+
+	ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
+		.pd = priv->pd,
+		.log_ind_tbl_size = 0,
+		.ind_tbl = &((*priv->rxqs)[idx]->wq),
+		.comp_mask = 0,
+	};
+
+	errno = 0;
+	ind_table = ibv_exp_create_rwq_ind_table(priv->ctx,
+						 &ind_init_attr);
+	if (ind_table == NULL) {
+		/* Not clear whether errno is set. */
+		err = (errno ? errno : EINVAL);
+		ERROR("RX indirection table creation failed with error %d: %s",
+		      err, strerror(err));
+		goto error;
+	}
+
+	/* Create fdir_queue qp. */
+	hash_conf = (struct ibv_exp_rx_hash_conf){
+		.rx_hash_function = IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
+		.rx_hash_key_len = rss_hash_default_key_len,
+		.rx_hash_key = rss_hash_default_key,
+		.rx_hash_fields_mask = 0,
+		.rwq_ind_tbl = ind_table,
+	};
+	qp_init_attr = (struct ibv_exp_qp_init_attr){
+		.max_inl_recv = 0, /* Currently not supported. */
+		.qp_type = IBV_QPT_RAW_PACKET,
+		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
+			      IBV_EXP_QP_INIT_ATTR_RX_HASH),
+		.pd = priv->pd,
+		.rx_hash_conf = &hash_conf,
+		.port_num = priv->port,
+	};
+
+	qp = ibv_exp_create_qp(priv->ctx, &qp_init_attr);
+	if (qp == NULL) {
+		err = (errno ? errno : EINVAL);
+		ERROR("hash RX QP creation failure: %s", strerror(err));
+		goto error;
+	}
+
+	fdir_queue->ind_table = ind_table;
+	fdir_queue->qp = qp;
+
+	return fdir_queue;
+
+error:
+	if (qp != NULL)
+		claim_zero(ibv_destroy_qp(qp));
+
+	if (ind_table != NULL)
+		claim_zero(ibv_exp_destroy_rwq_ind_table(ind_table));
+
+	return NULL;
+}
+
+/**
+ * Enable flow director filter and create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create steering rule for.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_enable(struct priv *priv,
+			struct mlx5_fdir_filter *mlx5_fdir_filter)
+{
+	struct fdir_queue *fdir_queue;
+
+	/* Check if flow already exists. */
+	if (mlx5_fdir_filter->flow != NULL)
+		return 0;
+
+	/* Get fdir_queue for specific queue. */
+	fdir_queue = priv_get_fdir_queue(priv, mlx5_fdir_filter->queue);
+
+	if (fdir_queue == NULL) {
+		ERROR("failed to create flow director rxq for queue %d",
+		      mlx5_fdir_filter->queue);
+		return EINVAL;
+	}
+
+	/* Create flow */
+	return priv_fdir_flow_add(priv, mlx5_fdir_filter, fdir_queue);
+}
+
+/**
+ * Initialize flow director filters list.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+fdir_init_filters_list(struct priv *priv)
+{
+	/* Filter list initialization should be done only once. */
+	if (priv->fdir_filter_list)
+		return 0;
+
+	/* Create filters list. */
+	priv->fdir_filter_list =
+		rte_calloc(__func__, 1, sizeof(*priv->fdir_filter_list), 0);
+
+	if (priv->fdir_filter_list == NULL) {
+		int err = ENOMEM;
+
+		ERROR("cannot allocate flow director filter list: %s",
+		      strerror(err));
+		return err;
+	}
+
+	LIST_INIT(priv->fdir_filter_list);
+
+	return 0;
+}
+
+/**
+ * Remove all flow director filters and delete list.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_delete_filters_list(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	void *prev = NULL;
+
+	/* Iterate over all flow director filters and delete them. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		struct ibv_exp_flow *flow;
+
+		/* Deallocate previous element safely. */
+		rte_free(prev);
+
+		/* Only valid elements should be in the list. */
+		assert(mlx5_fdir_filter != NULL);
+		flow = mlx5_fdir_filter->flow;
+
+		/* Remove element from list. */
+		LIST_REMOVE(mlx5_fdir_filter, next);
+
+		/* Destroy flow handle. */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+
+		prev = mlx5_fdir_filter;
+	}
+
+	rte_free(prev);
+	rte_free(priv->fdir_filter_list);
+	priv->fdir_filter_list = NULL;
+}
+
+/**
+ * Disable flow director, remove all steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_disable(struct priv *priv)
+{
+	unsigned int i;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	struct fdir_queue *fdir_queue;
+
+	/* Run on every flow director filter and destroy flow handle. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		struct ibv_exp_flow *flow;
+
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+		flow = mlx5_fdir_filter->flow;
+
+		/* Destroy flow handle */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+	}
+
+	/* Run on every RX queue to destroy related flow director QP and
+	 * indirection table. */
+	for (i = 0; (i != priv->rxqs_n); i++) {
+		fdir_queue = &(*priv->rxqs)[i]->fdir_queue;
+
+		if (fdir_queue->qp != NULL) {
+			claim_zero(ibv_destroy_qp(fdir_queue->qp));
+			fdir_queue->qp = NULL;
+		}
+
+		if (fdir_queue->ind_table != NULL) {
+			claim_zero(ibv_exp_destroy_rwq_ind_table
+				   (fdir_queue->ind_table));
+			fdir_queue->ind_table = NULL;
+		}
+	}
+}
+
+/**
+ * Enable flow director, create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_enable(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	/* Run on every fdir filter and create flow handle */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+
+		priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+	}
+}
+
+/**
+ * Find specific filter in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to find.
+ *
+ * @return
+ *   Filter element if found, otherwise NULL.
+ */
+static struct mlx5_fdir_filter *
+priv_find_filter_in_list(struct priv *priv,
+			 const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct fdir_flow_desc desc;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+
+	/* Get flow director filter to look for. */
+	fdir_filter_to_flow_desc(fdir_filter, &desc, fdir_mode);
+
+	/* Look for the requested element. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list. */
+		assert(mlx5_fdir_filter != NULL);
+
+		/* Return matching filter. */
+		if (!memcmp(&desc, &mlx5_fdir_filter->desc, sizeof(desc)))
+			return mlx5_fdir_filter;
+	}
+
+	/* Filter not found */
+	return NULL;
+}
+
+/**
+ * Add new flow director filter and store it in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to add.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_add(struct priv *priv,
+		     const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	int err = 0;
+
+	/* Validate queue number. */
+	if (fdir_filter->action.rx_queue >= priv->rxqs_n) {
+		ERROR("invalid queue number %d", fdir_filter->action.rx_queue);
+		return EINVAL;
+	}
+
+	/* Duplicate filters are currently unsupported. */
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		ERROR("filter already exists");
+		return EINVAL;
+	}
+
+	/* Create new flow director filter. */
+	mlx5_fdir_filter =
+		rte_calloc(__func__, 1, sizeof(*mlx5_fdir_filter), 0);
+	if (mlx5_fdir_filter == NULL) {
+		err = ENOMEM;
+		ERROR("cannot allocate flow director filter: %s",
+		      strerror(err));
+		return err;
+	}
+
+	/* Set queue. */
+	mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+	/* Convert to mlx5 filter descriptor. */
+	fdir_filter_to_flow_desc(fdir_filter,
+				 &mlx5_fdir_filter->desc, fdir_mode);
+
+	/* Insert new filter into list. */
+	LIST_INSERT_HEAD(priv->fdir_filter_list, mlx5_fdir_filter, next);
+
+	DEBUG("%p: flow director filter %p added",
+	      (void *)priv, (void *)mlx5_fdir_filter);
+
+	/* Enable filter immediately if device is started. */
+	if (priv->started)
+		err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+	return err;
+}
+
+/**
+ * Update queue for specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be updated.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_update(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	/* Validate queue number. */
+	if (fdir_filter->action.rx_queue >= priv->rxqs_n) {
+		ERROR("invalid queue number %d", fdir_filter->action.rx_queue);
+		return EINVAL;
+	}
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		struct ibv_exp_flow *flow = mlx5_fdir_filter->flow;
+		int err = 0;
+
+		/* Update queue number. */
+		mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+		/* Destroy flow handle. */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+		DEBUG("%p: flow director filter %p updated",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Enable filter if device is started. */
+		if (priv->started)
+			err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+		return err;
+	}
+
+	/* Filter not found, create it. */
+	DEBUG("%p: filter not found for update, creating new filter",
+	      (void *)priv);
+	return priv_fdir_filter_add(priv, fdir_filter);
+}
+
+/**
+ * Delete specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be deleted.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_delete(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		struct ibv_exp_flow *flow = mlx5_fdir_filter->flow;
+
+		/* Remove element from list. */
+		LIST_REMOVE(mlx5_fdir_filter, next);
+
+		/* Destroy flow handle. */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+
+		DEBUG("%p: flow director filter %p deleted",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Delete filter. */
+		rte_free(mlx5_fdir_filter);
+
+		return 0;
+	}
+
+	ERROR("%p: flow director delete failed, cannot find filter",
+	      (void *)priv);
+	return EINVAL;
+}
+
+/**
+ * Get flow director information.
+ *
+ * @param priv
+ *   Private structure.
+ * @param[out] fdir_info
+ *   Resulting flow director information.
+ */
+static void
+priv_fdir_info_get(struct priv *priv, struct rte_eth_fdir_info *fdir_info)
+{
+	struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+
+	fdir_info->mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	fdir_info->guarant_spc = 0;
+
+	rte_memcpy(&fdir_info->mask, mask, sizeof(fdir_info->mask));
+
+	fdir_info->max_flexpayload = 0;
+	fdir_info->flow_types_mask[0] = 0;
+
+	fdir_info->flex_payload_unit = 0;
+	fdir_info->max_flex_payload_segment_num = 0;
+	fdir_info->flex_payload_limit = 0;
+	memset(&fdir_info->flex_conf, 0, sizeof(fdir_info->flex_conf));
+}
+
+/**
+ * Deal with flow director operations.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_ctrl_func(struct priv *priv, enum rte_filter_op filter_op, void *arg)
+{
+	enum rte_fdir_mode fdir_mode =
+		priv->dev->data->dev_conf.fdir_conf.mode;
+	int ret = 0;
+
+	if (filter_op == RTE_ETH_FILTER_NOP)
+		return 0;
+
+	if (fdir_mode != RTE_FDIR_MODE_PERFECT &&
+	    fdir_mode != RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		ERROR("%p: flow director mode %d not supported",
+		      (void *)priv, fdir_mode);
+		return EINVAL;
+	}
+
+	if (arg == NULL)
+		return EINVAL;
+
+	switch (filter_op) {
+	case RTE_ETH_FILTER_ADD:
+		ret = priv_fdir_filter_add(priv, arg);
+		break;
+	case RTE_ETH_FILTER_UPDATE:
+		ret = priv_fdir_filter_update(priv, arg);
+		break;
+	case RTE_ETH_FILTER_DELETE:
+		ret = priv_fdir_filter_delete(priv, arg);
+		break;
+	case RTE_ETH_FILTER_INFO:
+		priv_fdir_info_get(priv, arg);
+		break;
+	default:
+		DEBUG("%p: unknown operation %u", (void *)priv, filter_op);
+		ret = EINVAL;
+		break;
+	}
+	return ret;
+}
+
+/**
+ * Manage filter operations.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param filter_type
+ *   Filter type.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, negative errno value on failure.
+ */
+int
+mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
+		     enum rte_filter_type filter_type,
+		     enum rte_filter_op filter_op,
+		     void *arg)
+{
+	int ret = -EINVAL;
+	struct priv *priv = dev->data->dev_private;
+
+	switch (filter_type) {
+	case RTE_ETH_FILTER_FDIR:
+		priv_lock(priv);
+		ret = priv_fdir_ctrl_func(priv, filter_op, arg);
+		priv_unlock(priv);
+		break;
+	default:
+		ERROR("%p: filter type (%d) not supported",
+		      (void *)dev, filter_type);
+		break;
+	}
+
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 36910b2..093f4e5 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /* Initialization data for hash RX queues. */
@@ -242,7 +243,12 @@ priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
 	init = &hash_rxq_init[type];
 	*flow_attr = (struct ibv_exp_flow_attr){
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
+#ifdef MLX5_FDIR_SUPPORT
+		/* Priorities < 3 are reserved for flow director. */
+		.priority = init->flow_priority + 3,
+#else /* MLX5_FDIR_SUPPORT */
 		.priority = init->flow_priority,
+#endif /* MLX5_FDIR_SUPPORT */
 		.num_of_specs = 0,
 		.port = priv->port,
 		.flags = 0,
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index c42bb8d..b2f72f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -93,6 +93,12 @@ struct rxq_elt {
 	struct rte_mbuf *buf; /* SGE buffer. */
 };
 
+/* Flow director queue structure. */
+struct fdir_queue {
+	struct ibv_qp *qp; /* Associated RX QP. */
+	struct ibv_exp_rwq_ind_table *ind_table; /* Indirection table. */
+};
+
 struct priv;
 
 /* RX queue descriptor. */
@@ -118,6 +124,7 @@ struct rxq {
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
+	struct fdir_queue fdir_queue; /* Flow director queue. */
 };
 
 /* Hash RX queue types. */
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 90b8068..db7890f 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -87,6 +87,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
+	if (dev->data->dev_conf.fdir_conf.mode != RTE_FDIR_MODE_NONE)
+		priv_fdir_enable(priv);
 	priv_dev_interrupt_handler_install(priv, dev);
 	priv_unlock(priv);
 	return -err;
@@ -117,6 +119,7 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+	priv_fdir_disable(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv->started = 0;
 	priv_unlock(priv);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread
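
The flow director patch above reserves flow priorities below 3 for its own rules and shifts every regular hash RX queue rule up by 3 (see the priv_flow_attr() hunk in mlx5_rxq.c). Within the reserved range, priv_fdir_flow_add() picks 2 for Ethernet-only (MAC/VLAN) rules, 1 for IP rules and 0 for TCP/UDP rules. The following is only a minimal sketch of that numbering, not driver code: the enum, macro and function names are hypothetical, and it assumes the Verbs convention that a numerically lower priority takes precedence.

```c
#include <assert.h>

/* Hypothetical names; only the numeric values come from the patch. */
enum fdir_match_depth {
	FDIR_MATCH_L2, /* Ethernet (MAC/VLAN) only. */
	FDIR_MATCH_L3, /* Ethernet + IPv4/IPv6. */
	FDIR_MATCH_L4, /* Ethernet + IP + TCP/UDP. */
};

/* Number of priorities reserved for flow director rules. */
#define FDIR_RESERVED_PRIORITIES 3

/* Priority of a flow director rule: the deeper the match, the lower
 * the value (assumed to mean higher precedence). */
static unsigned int
fdir_rule_priority(enum fdir_match_depth depth)
{
	switch (depth) {
	case FDIR_MATCH_L4:
		return 0;
	case FDIR_MATCH_L3:
		return 1;
	case FDIR_MATCH_L2:
		return 2;
	}
	assert(0 && "unknown match depth");
	return 2;
}

/* Priority of a regular (non flow director) rule when flow director
 * support is compiled in: shifted past the reserved range. */
static unsigned int
regular_rule_priority(unsigned int flow_priority)
{
	return flow_priority + FDIR_RESERVED_PRIORITIES;
}
```

With this layout, any flow director rule sorts ahead of any regular rule, and a more specific flow director rule sorts ahead of a less specific one.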

* [dpdk-dev] [PATCH v2 5/5] mlx5: add support for RX VLAN stripping
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
                     ` (3 preceding siblings ...)
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for flow director Adrien Mazarguil
@ 2016-02-22 18:02   ` Adrien Mazarguil
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:02 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.
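
For readers unfamiliar with the offload, the sketch below shows in software what stripping an 802.1Q header amounts to: the 4-byte tag (TPID 0x8100 followed by the TCI) is removed from the frame and the TCI is handed back separately, much as the hardware reports it through the mbuf. This is an illustration only, not code from this patch; the function name is hypothetical and the frame is assumed to be an untruncated Ethernet frame carrying at most one C-VLAN tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Strip a single 802.1Q tag from an Ethernet frame in place.
 *
 * Layout before: dst MAC (0-5), src MAC (6-11), TPID (12-13),
 * TCI (14-15), inner EtherType (16-17), payload.
 *
 * Returns 1 and stores the TCI (host byte order) if a tag was
 * stripped, 0 if the frame is untagged or too short.
 */
static int
vlan_strip_sketch(uint8_t *frame, size_t *len, uint16_t *tci)
{
	assert(frame != NULL && len != NULL && tci != NULL);
	/* Need both MAC addresses, the tag and the inner EtherType. */
	if (*len < 18)
		return 0;
	/* Only C-VLAN tags (TPID 0x8100) are handled. */
	if (frame[12] != 0x81 || frame[13] != 0x00)
		return 0;
	*tci = (uint16_t)((frame[14] << 8) | frame[15]);
	/* Close the 4-byte gap left by the tag. */
	memmove(frame + 12, frame + 16, *len - 16);
	*len -= 4;
	return 1;
}
```

With the offload enabled the NIC does the equivalent work on receive, so the application sees an untagged frame plus the saved TCI.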

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/nics/mlx5.rst               |   2 +
 doc/guides/rel_notes/release_16_04.rst |   6 ++
 drivers/net/mlx5/mlx5.c                |  16 ++++-
 drivers/net/mlx5/mlx5.h                |   3 +
 drivers/net/mlx5/mlx5_rxq.c            |  17 +++++-
 drivers/net/mlx5/mlx5_rxtx.c           |  27 +++++++++
 drivers/net/mlx5/mlx5_rxtx.h           |   5 ++
 drivers/net/mlx5/mlx5_vlan.c           | 104 +++++++++++++++++++++++++++++++++
 8 files changed, 178 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 8981068..8422206 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -83,6 +83,7 @@ Features
 - Configurable RETA table.
 - Support for multiple MAC addresses.
 - VLAN filtering.
+- RX VLAN stripping.
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
@@ -229,6 +230,7 @@ Currently supported by DPDK:
     above only:
 
     - Flow director.
+    - RX VLAN stripping.
 
 - Minimum firmware version:
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index b3ced63..4d2f76c 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -73,6 +73,12 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: RX VLAN stripping support.**
+
+  Added support for RX VLAN stripping.
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 43e24ff..575420e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -171,6 +171,10 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.mac_addr_add = mlx5_mac_addr_add,
 	.mac_addr_set = mlx5_mac_addr_set,
 	.mtu_set = mlx5_dev_set_mtu,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	.vlan_strip_queue_set = mlx5_vlan_strip_queue_set,
+	.vlan_offload_set = mlx5_vlan_offload_set,
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	.reta_update = mlx5_dev_rss_reta_update,
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
@@ -325,7 +329,11 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #ifdef HAVE_EXP_QUERY_DEVICE
 		exp_device_attr.comp_mask =
 			IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS |
-			IBV_EXP_DEVICE_ATTR_RX_HASH;
+			IBV_EXP_DEVICE_ATTR_RX_HASH |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0;
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
 		DEBUG("using port %u (%08" PRIx32 ")", port, test);
@@ -396,6 +404,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 		DEBUG("maximum RX indirection table size is %u",
 		      priv->ind_table_max_size);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		priv->hw_vlan_strip = !!(exp_device_attr.wq_vlan_offloads_cap &
+					 IBV_EXP_RECEIVE_WQ_CVLAN_STRIP);
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+		DEBUG("VLAN stripping is %ssupported",
+		      (priv->hw_vlan_strip ? "" : "not "));
 
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8019ee3..8442016 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -101,6 +101,7 @@ struct priv {
 	unsigned int allmulti_req:1; /* All multicast mode requested. */
 	unsigned int hw_csum:1; /* Checksum offload is supported. */
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
@@ -211,6 +212,8 @@ void mlx5_stats_reset(struct rte_eth_dev *);
 /* mlx5_vlan.c */
 
 int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
+void mlx5_vlan_offload_set(struct rte_eth_dev *, int);
+void mlx5_vlan_strip_queue_set(struct rte_eth_dev *, uint16_t, int);
 
 /* mlx5_trigger.c */
 
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 093f4e5..573ad8f 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1224,6 +1224,8 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      priv->device_attr.max_qp_wr);
 	DEBUG("priv->device_attr.max_sge is %d",
 	      priv->device_attr.max_sge);
+	/* Configure VLAN stripping. */
+	tmpl.vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
 	attr.wq = (struct ibv_exp_wq_init_attr){
 		.wq_context = NULL, /* Could be useful in the future. */
 		.wq_type = IBV_EXP_WQT_RQ,
@@ -1238,8 +1240,18 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				 MLX5_PMD_SGE_WR_N),
 		.pd = priv->pd,
 		.cq = tmpl.cq,
-		.comp_mask = IBV_EXP_CREATE_WQ_RES_DOMAIN,
+		.comp_mask =
+			IBV_EXP_CREATE_WQ_RES_DOMAIN |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_CREATE_WQ_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0,
 		.res_domain = tmpl.rd,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		.vlan_offloads = (tmpl.vlan_strip ?
+				  IBV_EXP_RECEIVE_WQ_CVLAN_STRIP :
+				  0),
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
@@ -1262,6 +1274,9 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	DEBUG("%p: RTE port ID: %u", (void *)rxq, tmpl.port_id);
 	attr.params = (struct ibv_exp_query_intf_params){
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		.intf_version = 1,
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		.intf = IBV_EXP_INTF_CQ,
 		.obj = tmpl.cq,
 	};
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fa5e648..7585570 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_utils.h"
 #include "mlx5_rxtx.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /**
@@ -713,12 +714,19 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		unsigned int seg_headroom = RTE_PKTMBUF_HEADROOM;
 		unsigned int j = 0;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(elts_head < rxq->elts_n);
 		assert(rxq->elts_head < rxq->elts_n);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -840,6 +848,12 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		PKT_LEN(pkt_buf) = pkt_buf_len;
 		pkt_buf->packet_type = rxq_cq_to_pkt_type(flags);
 		pkt_buf->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			pkt_buf->ol_flags |= PKT_RX_VLAN_PKT;
+			pkt_buf->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = pkt_buf;
@@ -910,6 +924,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		struct rte_mbuf *seg = elt->buf;
 		struct rte_mbuf *rep;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(seg != NULL);
@@ -921,8 +936,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		 */
 		rte_prefetch0(seg);
 		rte_prefetch0(&seg->cacheline1);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -989,6 +1010,12 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		DATA_LEN(seg) = len;
 		seg->packet_type = rxq_cq_to_pkt_type(flags);
 		seg->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			seg->ol_flags |= PKT_RX_VLAN_PKT;
+			seg->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = seg;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index b2f72f8..fde0ca2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -109,7 +109,11 @@ struct rxq {
 	struct ibv_cq *cq; /* Completion Queue. */
 	struct ibv_exp_wq *wq; /* Work Queue. */
 	struct ibv_exp_wq_family *if_wq; /* WQ burst interface. */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_cq_family_v1 *if_cq; /* CQ interface. */
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	unsigned int port_id; /* Port ID for incoming packets. */
 	unsigned int elts_n; /* (*elts)[] length. */
 	unsigned int elts_head; /* Current index in (*elts)[]. */
@@ -120,6 +124,7 @@ struct rxq {
 	unsigned int sp:1; /* Use scattered RX elements. */
 	unsigned int csum:1; /* Enable checksum offloading. */
 	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	uint32_t mb_len; /* Length of a mp-issued mbuf. */
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_vlan.c b/drivers/net/mlx5/mlx5_vlan.c
index 3a07ad1..fa9e3b8 100644
--- a/drivers/net/mlx5/mlx5_vlan.c
+++ b/drivers/net/mlx5/mlx5_vlan.c
@@ -48,6 +48,7 @@
 
 #include "mlx5_utils.h"
 #include "mlx5.h"
+#include "mlx5_autoconf.h"
 
 /**
  * Configure a VLAN filter.
@@ -127,3 +128,106 @@ mlx5_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 	assert(ret >= 0);
 	return -ret;
 }
+
+/**
+ * Set/reset VLAN stripping for a specific queue.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param idx
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+static void
+priv_vlan_strip_queue_set(struct priv *priv, uint16_t idx, int on)
+{
+	struct rxq *rxq = (*priv->rxqs)[idx];
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_wq_attr mod;
+	uint16_t vlan_offloads =
+		(on ? IBV_EXP_RECEIVE_WQ_CVLAN_STRIP : 0) |
+		0;
+	int err;
+
+	DEBUG("set VLAN offloads 0x%x for port %d queue %d",
+	      vlan_offloads, rxq->port_id, idx);
+	mod = (struct ibv_exp_wq_attr){
+		.attr_mask = IBV_EXP_WQ_ATTR_VLAN_OFFLOADS,
+		.vlan_offloads = vlan_offloads,
+	};
+
+	err = ibv_exp_modify_wq(rxq->wq, &mod);
+	if (err) {
+		ERROR("%p: failed to modify stripping mode: %s",
+		      (void *)priv, strerror(err));
+		return;
+	}
+
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+
+	/* Update related bits in RX queue. */
+	rxq->vlan_strip = !!on;
+}
+
+/**
+ * Callback to set/reset VLAN stripping for a specific queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param queue
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+void
+mlx5_vlan_strip_queue_set(struct rte_eth_dev *dev, uint16_t queue, int on)
+{
+	struct priv *priv = dev->data->dev_private;
+
+	/* Validate hw support */
+	if (!priv->hw_vlan_strip) {
+		ERROR("VLAN stripping is not supported");
+		return;
+	}
+
+	/* Validate queue number */
+	if (queue >= priv->rxqs_n) {
+		ERROR("VLAN stripping, invalid queue number %d", queue);
+		return;
+	}
+
+	priv_lock(priv);
+	priv_vlan_strip_queue_set(priv, queue, on);
+	priv_unlock(priv);
+}
+
+/**
+ * Callback to set/reset VLAN offloads for a port.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mask
+ *   VLAN offload bit mask.
+ */
+void
+mlx5_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+
+	if (mask & ETH_VLAN_STRIP_MASK) {
+		int hw_vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
+
+		if (!priv->hw_vlan_strip) {
+			ERROR("VLAN stripping is not supported");
+			return;
+		}
+
+		/* Run on every RX queue and set/reset VLAN stripping. */
+		priv_lock(priv);
+		for (i = 0; (i != priv->rxqs_n); i++)
+			priv_vlan_strip_queue_set(priv, i, hw_vlan_strip);
+		priv_unlock(priv);
+	}
+}
-- 
2.1.4


* Re: [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-02-18 16:10     ` Adrien Mazarguil
@ 2016-02-23 15:13       ` Bruce Richardson
  2016-02-23 17:14         ` Thomas Monjalon
  0 siblings, 1 reply; 26+ messages in thread
From: Bruce Richardson @ 2016-02-23 15:13 UTC (permalink / raw)
  To: dev

On Thu, Feb 18, 2016 at 05:10:16PM +0100, Adrien Mazarguil wrote:
> Hi Bruce,
> 
> On Wed, Feb 17, 2016 at 05:13:44PM +0000, Bruce Richardson wrote:
> > On Fri, Jan 29, 2016 at 11:32:01AM +0100, Adrien Mazarguil wrote:
> > > From: Yaacov Hazan <yaacovh@mellanox.com>
> > > 
> > > Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
> > > RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).
> > > 
> > > This feature requires MLNX_OFED 3.2.
> > > 
> > > Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
> > > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > ---
> > Hi Adrien, Yaacov,
> > 
> > this patch raises a lot of warnings (17) with checkpatch. Can you perhaps look
> > to see if this number can be reduced.
> 
> We actually did it before submitting that patch, there is indeed a bunch of
> remaining warnings that have been left on purpose. Not sure if we have the
> same configuration for checkpatch, but they should fall into the following
> categories:
> 
> - "WARNING: return of an errno should typically be negative" - All return
>   values are documented in the code. Since this PMD uses syscalls in its
>   control path, it uses positive errno values internally for
>   consistency. Public callback functions however return negative error
>   values.
> 
> - "WARNING: line over 80 characters" - Well, although I'm a big fan of the
>   80 characters limit, breaking those would have made the code harder to
>   read.
> 
> - "WARNING: Missing a blank line after declarations" - It's actually a
>   declaration through a macro, there is no missing blank line.
> 
> - "WARNING: networking block comments don't use an empty /* line" - Not sure
>   if we really care? I don't particularly mind.
> 
> - "CHECK: Comparison to NULL could be written "!" - I do not mind either,
>   writing the full check seems clearer to me.
> 
> - "CHECK: Unnecessary parentheses around fdir_info->mask" - Looks like a
>   valid, although minor error.
> 
> Please tell me which of these still need to be fixed.
> 
> -- 
Hi Adrien,

Sorry for the delayed reply, I was out for a couple of days.

As none of the above are errors, I'm not going to mandate that they be fixed
before I merge in the patch, so long as you as maintainer are happy with them.

My request mainly came about because of the sheer number of warnings that were
being flagged. To keep the codebase clean requires constant discipline, so I
don't like seeing patches where 17 warnings are flagged. I was hoping since
a new rev of the set was likely anyway that some steps could be taken to reduce
that number.

Thomas, any thoughts here, since I'm still "learning the ropes" as committer?
Do you have any rules-of-thumb or guidelines as regards checkpatch warnings? The
contributor guide only seems to cover running checkpatch, not anything about
what to do with the output.

Regards,
/Bruce


* Re: [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-02-23 15:13       ` Bruce Richardson
@ 2016-02-23 17:14         ` Thomas Monjalon
  2016-02-23 17:38           ` Adrien Mazarguil
  0 siblings, 1 reply; 26+ messages in thread
From: Thomas Monjalon @ 2016-02-23 17:14 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

2016-02-23 15:13, Bruce Richardson:
> On Thu, Feb 18, 2016 at 05:10:16PM +0100, Adrien Mazarguil wrote:
> > Hi Bruce,
> > 
> > On Wed, Feb 17, 2016 at 05:13:44PM +0000, Bruce Richardson wrote:
> > > Hi Adrien, Yaacov,
> > > 
> > > this patch raises a lot of warnings (17) with checkpatch. Can you perhaps look
> > > to see if this number can be reduced.
> > 
> > We actually did it before submitting that patch, there is indeed a bunch of
> > remaining warnings that have been left on purpose. Not sure if we have the
> > same configuration for checkpatch, but they should fall into the following
> > categories:
> > 
> > - "WARNING: return of an errno should typically be negative" - All return
> >   values are documented in the code. Since this PMD uses syscalls in its
> >   control path, it uses positive errno values internally for
> >   consistency. Public callback functions however return negative error
> >   values.
> > 
> > - "WARNING: line over 80 characters" - Well, although I'm a big fan of the
> >   80 characters limit, breaking those would have made the code harder to
> >   read.
> > 
> > - "WARNING: Missing a blank line after declarations" - It's actually a
> >   declaration through a macro, there is no missing blank line.
> > 
> > - "WARNING: networking block comments don't use an empty /* line" - Not sure
> >   if we really care? I don't particularly mind.
> > 
> > - "CHECK: Comparison to NULL could be written "!" - I do not mind either,
> >   writing the full check seems clearer to me.
> > 
> > - "CHECK: Unnecessary parentheses around fdir_info->mask" - Looks like a
> >   valid, although minor error.
> > 
> > Please tell me which of these still need to be fixed.
> > 
> Hi Adrien,
> 
> sorry for the delayed reply, I was out for a couple of days.
> 
> As none of the above are errors, I'm not going to mandate that they be fixed
> before I merge in the patch, so long as you as maintainer are happy with them.
> 
> My request mainly came about because of the sheer number of warnings that were
> being flagged. To keep the codebase clean requires constant discipline, so I
> don't like seeing patches where 17 warnings are flagged. I was hoping since
> a new rev of the set was likely anyway that some steps could be taken to reduce
> that number.
> 
> Thomas, any thoughts here, since I'm still "learning the ropes" as committer. 
> Do you have any rules-of-thumb or guidelines as regards checkpatch warnings? The
> contributor guide only seems to cover running checkpatch, not anything about
> what to do with the output.

I totally agree with you, Bruce.
Everybody must make some effort to keep consistency and avoid coding style
exceptions. Some code areas are not yet fully compliant with the rules,
depending on their history and... their maintainers ;)
I think we can tolerate some exceptions like for the 80 char limit.
Some checks may be disabled after discussion (networking block comments?).
Other checks deserve to be followed more strictly (e.g. negative errno).
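
To illustrate the errno convention debated above, here is a minimal sketch.
It assumes hypothetical helper names (example_priv_set_value() and
example_dev_set_value()) that are not part of the mlx5 PMD. The internal
helper returns a positive errno value, and the public callback negates it
before returning, much like the "return -ret;" at the end of
mlx5_vlan_filter_set().

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical internal helper (not an mlx5 function).
 * Returns 0 on success, a positive errno value on failure.
 */
static int
example_priv_set_value(int *slot, int value)
{
	if (slot == NULL)
		return EINVAL;
	if (value < 0)
		return ERANGE;
	*slot = value;
	return 0;
}

/*
 * Hypothetical public callback (not an mlx5 function).
 * Returns 0 on success, a negative errno value on failure.
 */
int
example_dev_set_value(int *slot, int value)
{
	int ret = example_priv_set_value(slot, value);

	/* Internal code is positive, the public API reports it negated. */
	return -ret;
}
```

With this split, checkpatch would presumably still flag the positive returns
in the internal helper, which matches the remaining warnings discussed in
this thread.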


* Re: [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director
  2016-02-23 17:14         ` Thomas Monjalon
@ 2016-02-23 17:38           ` Adrien Mazarguil
  0 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-02-23 17:38 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev

On Tue, Feb 23, 2016 at 06:14:19PM +0100, Thomas Monjalon wrote:
> 2016-02-23 15:13, Bruce Richardson:
> > On Thu, Feb 18, 2016 at 05:10:16PM +0100, Adrien Mazarguil wrote:
> > > Hi Bruce,
> > > 
> > > On Wed, Feb 17, 2016 at 05:13:44PM +0000, Bruce Richardson wrote:
> > > > Hi Adrien, Yaacov,
> > > > 
> > > > this patch raises a lot of warnings (17) with checkpatch. Can you perhaps look
> > > > to see if this number can be reduced.
> > > 
> > > We actually did it before submitting that patch, there is indeed a bunch of
> > > remaining warnings that have been left on purpose. Not sure if we have the
> > > same configuration for checkpatch, but they should fall into the following
> > > categories:
> > > 
> > > - "WARNING: return of an errno should typically be negative" - All return
> > >   values are documented in the code. Since this PMD uses syscalls in its
> > >   control path, it uses positive errno values internally for
> > >   consistency. Public callback functions however return negative error
> > >   values.
> > > 
> > > - "WARNING: line over 80 characters" - Well, although I'm a big fan of the
> > >   80 characters limit, breaking those would have made the code harder to
> > >   read.
> > > 
> > > - "WARNING: Missing a blank line after declarations" - It's actually a
> > >   declaration through a macro, there is no missing blank line.
> > > 
> > > - "WARNING: networking block comments don't use an empty /* line" - Not sure
> > >   if we really care? I don't particularly mind.
> > > 
> > > - "CHECK: Comparison to NULL could be written "!" - I do not mind either,
> > >   writing the full check seems clearer to me.
> > > 
> > > - "CHECK: Unnecessary parentheses around fdir_info->mask" - Looks like a
> > >   valid, although minor error.
> > > 
> > > Please tell me which of these still need to be fixed.
> > > 
> > Hi Adrien,
> > 
> > sorry for the delayed reply, I was out for a couple of days.
> > 
> > As none of the above are errors, I'm not going to mandate that they be fixed
> > before I merge in the patch, so long as you as maintainer are happy with them.
> > 
> > My request mainly came about because of the sheer number of warnings that were
> > being flagged. To keep the codebase clean requires constant discipline, so I
> > don't like seeing patches where 17 warnings are flagged. I was hoping since
> > a new rev of the set was likely anyway that some steps could be taken to reduce
> > that number.
> > 
> > Thomas, any thoughts here, since I'm still "learning the ropes" as committer. 
> > Do you have any rules-of-thumb or guidelines as regards checkpatch warnings? The
> > contributor guide only seems to cover running checkpatch, not anything about
> > what to do with the output.
> 
> I totally agree with you, Bruce.
> Everybody must make some effort to keep consistency and avoid coding style
> exceptions. Some code areas are not yet fully compliant with the rules,
> depending on their history and... their maintainers ;)
> I think we can tolerate some exceptions like for the 80 char limit.
> Some checks may be disabled after discussion (networking block comments?).

Actually you can ignore this one and NULL comparison checks - they do not
show up when using scripts/checkpatches.sh (I was previously using
checkpatch.pl directly).

I've fixed and sent updated versions of these patchsets anyway, with errno
warnings still present since it's a design decision. We can discuss it if
necessary.

> Other checks deserve to be followed more strictly (e.g. negative errno).

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support
  2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
                     ` (4 preceding siblings ...)
  2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
@ 2016-03-03 14:26   ` Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 1/5] mlx5: refactor special flows handling Adrien Mazarguil
                       ` (5 more replies)
  5 siblings, 6 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev

To preserve compatibility with Mellanox OFED 3.1, flow director and RX VLAN
stripping code is only enabled if compiled with 3.2.
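
A minimal sketch of how such compile-time gating can look. It assumes a
hypothetical EXAMPLE_HAVE_VLAN_OFFLOADS macro standing in for the
HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS check used in the patches, and the
EXAMPLE_ATTR_* flag values are made up for illustration. The trailing 0 lets
every optional flag end with "|" whichever blocks are compiled in:

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EXAMPLE_ATTR_CAP_FLAGS (UINT32_C(1) << 0)
#define EXAMPLE_ATTR_RX_HASH (UINT32_C(1) << 1)
#define EXAMPLE_ATTR_VLAN_OFFLOADS (UINT32_C(1) << 2)

/*
 * A build-time probe would normally define this when the installed
 * headers provide the feature. It is left undefined here.
 */
/* #define EXAMPLE_HAVE_VLAN_OFFLOADS */

/* Build the capability mask from whatever was detected at build time. */
static uint32_t
example_comp_mask(void)
{
	return EXAMPLE_ATTR_CAP_FLAGS |
	       EXAMPLE_ATTR_RX_HASH |
#ifdef EXAMPLE_HAVE_VLAN_OFFLOADS
	       EXAMPLE_ATTR_VLAN_OFFLOADS |
#endif /* EXAMPLE_HAVE_VLAN_OFFLOADS */
	       0;
}
```

When the macro is undefined, the optional flag simply drops out of the
expression and the rest compiles unchanged.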

Changes in v3:
- Fixed flow registration issue caused by missing masks in flow rules.
- Fixed packet duplication with overlapping FDIR rules.
- Added FDIR flush command support.
- Updated Mellanox OFED prerequisite to 3.2-2.0.0.0.

Changes in v2:
- Rebased patchset on top of dpdk-next-net/rel_16_04.
- Fixed trivial compilation warnings (positive errnos are left on purpose).
- Updated documentation and release notes for flow director and RX VLAN
  stripping features.
- Fixed missing Mellanox OFED >= 3.2 check for CQ family query interface
  version.

Yaacov Hazan (5):
  mlx5: refactor special flows handling
  mlx5: add special flows (broadcast and IPv6 multicast)
  mlx5: make flow steering rule generator more generic
  mlx5: add support for flow director
  mlx5: add support for RX VLAN stripping

 doc/guides/nics/mlx5.rst               |  16 +-
 doc/guides/rel_notes/release_16_04.rst |  13 +
 drivers/net/mlx5/Makefile              |   6 +
 drivers/net/mlx5/mlx5.c                |  39 +-
 drivers/net/mlx5/mlx5.h                |  19 +-
 drivers/net/mlx5/mlx5_defs.h           |  14 +
 drivers/net/mlx5/mlx5_ethdev.c         |   3 +-
 drivers/net/mlx5/mlx5_fdir.c           | 980 +++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mac.c            |  10 +-
 drivers/net/mlx5/mlx5_rxmode.c         | 350 ++++++------
 drivers/net/mlx5/mlx5_rxq.c            |  82 ++-
 drivers/net/mlx5/mlx5_rxtx.c           |  27 +
 drivers/net/mlx5/mlx5_rxtx.h           |  51 +-
 drivers/net/mlx5/mlx5_trigger.c        |  21 +-
 drivers/net/mlx5/mlx5_vlan.c           | 104 ++++
 15 files changed, 1508 insertions(+), 227 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c

-- 
2.1.4


* [dpdk-dev] [PATCH v3 1/5] mlx5: refactor special flows handling
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
@ 2016-03-03 14:26     ` Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.

New function priv_rehash_flows() implements the logic to enable/disable
relevant flows in one place from any context.
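
As a rough sketch of the table-driven approach (all example_* and EXAMPLE_*
names are hypothetical simplifications, not the driver's actual types), each
special flow gets one entry holding a destination MAC value, its mask, and a
bitmask of the queue types it applies to:

```c
#include <stdint.h>

/* Hypothetical, simplified queue and flow types. */
enum example_queue_type {
	EXAMPLE_Q_TCPV4,
	EXAMPLE_Q_UDPV4,
	EXAMPLE_Q_IPV4,
	EXAMPLE_Q_ETH,
};

enum example_flow_type {
	EXAMPLE_FLOW_PROMISC,
	EXAMPLE_FLOW_ALLMULTI,
	EXAMPLE_FLOW_MAX,
};

struct example_flow_init {
	uint8_t dst_mac_val[6];
	uint8_t dst_mac_mask[6];
	unsigned int queue_types; /* Bitmask of example_queue_type. */
};

/* One entry per special flow, indexed by flow type. */
static const struct example_flow_init example_flow_init[EXAMPLE_FLOW_MAX] = {
	[EXAMPLE_FLOW_PROMISC] = {
		/* All-zero mask: match any destination MAC. */
		.dst_mac_val = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
		.dst_mac_mask = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
		.queue_types =
			1u << EXAMPLE_Q_TCPV4 |
			1u << EXAMPLE_Q_UDPV4 |
			1u << EXAMPLE_Q_IPV4 |
			1u << EXAMPLE_Q_ETH,
	},
	[EXAMPLE_FLOW_ALLMULTI] = {
		/* Match only the multicast bit of the first byte. */
		.dst_mac_val = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00 },
		.dst_mac_mask = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00 },
		/* As in the patch, no TCP queue types for multicast. */
		.queue_types =
			1u << EXAMPLE_Q_UDPV4 |
			1u << EXAMPLE_Q_IPV4 |
			1u << EXAMPLE_Q_ETH,
	},
};

/* Return 1 if the flow applies to the given queue type, 0 otherwise. */
static int
example_flow_is_relevant(enum example_flow_type flow,
			 enum example_queue_type queue)
{
	return !!(example_flow_init[flow].queue_types & (1u << queue));
}
```

A single generic enable/disable routine can then loop over queues and skip
those for which the check returns 0, instead of duplicating one function per
mode.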

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |   4 +-
 drivers/net/mlx5/mlx5.h         |   6 +-
 drivers/net/mlx5/mlx5_defs.h    |   3 +
 drivers/net/mlx5/mlx5_rxmode.c  | 321 ++++++++++++++++++----------------------
 drivers/net/mlx5/mlx5_rxq.c     |  33 ++++-
 drivers/net/mlx5/mlx5_rxtx.h    |  29 +++-
 drivers/net/mlx5/mlx5_trigger.c |  14 +-
 7 files changed, 210 insertions(+), 200 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 30d88b5..52bf4b2 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -88,8 +88,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	      ((priv->ctx != NULL) ? priv->ctx->device->name : ""));
 	/* In case mlx5_dev_stop() has not been called. */
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 2f9a594..1c69bfa 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -195,13 +195,11 @@ int mlx5_dev_rss_reta_update(struct rte_eth_dev *,
 
 /* mlx5_rxmode.c */
 
-int priv_promiscuous_enable(struct priv *);
+int priv_special_flow_enable(struct priv *, enum hash_rxq_flow_type);
+void priv_special_flow_disable(struct priv *, enum hash_rxq_flow_type);
 void mlx5_promiscuous_enable(struct rte_eth_dev *);
-void priv_promiscuous_disable(struct priv *);
 void mlx5_promiscuous_disable(struct rte_eth_dev *);
-int priv_allmulticast_enable(struct priv *);
 void mlx5_allmulticast_enable(struct rte_eth_dev *);
-void priv_allmulticast_disable(struct priv *);
 void mlx5_allmulticast_disable(struct rte_eth_dev *);
 
 /* mlx5_stats.c */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index bb82c9a..1ec14ef 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -43,6 +43,9 @@
 /* Maximum number of simultaneous VLAN filters. */
 #define MLX5_MAX_VLAN_IDS 128
 
+/* Maximum number of special flows. */
+#define MLX5_MAX_SPECIAL_FLOWS 2
+
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
 
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 096fd18..b2ed17e 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -58,31 +58,96 @@
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
-static void hash_rxq_promiscuous_disable(struct hash_rxq *);
-static void hash_rxq_allmulticast_disable(struct hash_rxq *);
+/* Initialization data for special flows. */
+static const struct special_flow_init special_flow_init[] = {
+	[HASH_RXQ_FLOW_TYPE_PROMISC] = {
+		.dst_mac_val = "\x00\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x00\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_TCPV4 |
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_TCPV6 |
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+	[HASH_RXQ_FLOW_TYPE_ALLMULTI] = {
+		.dst_mac_val = "\x01\x00\x00\x00\x00\x00",
+		.dst_mac_mask = "\x01\x00\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+};
 
 /**
- * Enable promiscuous mode in a hash RX queue.
+ * Enable a special flow in a hash RX queue.
  *
  * @param hash_rxq
  *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 static int
-hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
+hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
+			     enum hash_rxq_flow_type flow_type)
 {
 	struct ibv_exp_flow *flow;
 	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
 	struct ibv_exp_flow_attr *attr = &data->attr;
+	struct ibv_exp_flow_spec_eth *spec = &data->spec;
+	const uint8_t *mac;
+	const uint8_t *mask;
 
-	if (hash_rxq->promisc_flow != NULL)
+	/* Check if flow is relevant for this hash_rxq. */
+	if (!(special_flow_init[flow_type].hash_types & (1 << hash_rxq->type)))
+		return 0;
+	/* Check if flow already exists. */
+	if (hash_rxq->special_flow[flow_type] != NULL)
 		return 0;
-	DEBUG("%p: enabling promiscuous mode", (void *)hash_rxq);
-	/* Promiscuous flows only differ from normal flows by not filtering
-	 * on specific MAC addresses. */
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
 	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	/* The first specification must be Ethernet. */
+	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec->size == sizeof(*spec));
+
+	mac = special_flow_init[flow_type].dst_mac_val;
+	mask = special_flow_init[flow_type].dst_mac_mask;
+	*spec = (struct ibv_exp_flow_spec_eth){
+		.type = IBV_EXP_FLOW_SPEC_ETH,
+		.size = sizeof(*spec),
+		.val = {
+			.dst_mac = {
+				mac[0], mac[1], mac[2],
+				mac[3], mac[4], mac[5],
+			},
+		},
+		.mask = {
+			.dst_mac = {
+				mask[0], mask[1], mask[2],
+				mask[3], mask[4], mask[5],
+			},
+		},
+	};
+
 	errno = 0;
 	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
 	if (flow == NULL) {
@@ -94,38 +159,63 @@ hash_rxq_promiscuous_enable(struct hash_rxq *hash_rxq)
 			return errno;
 		return EINVAL;
 	}
-	hash_rxq->promisc_flow = flow;
-	DEBUG("%p: promiscuous mode enabled", (void *)hash_rxq);
+	hash_rxq->special_flow[flow_type] = flow;
+	DEBUG("%p: enabling special flow %s (%d)",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
 	return 0;
 }
 
 /**
- * Enable promiscuous mode in all hash RX queues.
+ * Disable a special flow in a hash RX queue.
+ *
+ * @param hash_rxq
+ *   Pointer to hash RX queue structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+static void
+hash_rxq_special_flow_disable(struct hash_rxq *hash_rxq,
+			      enum hash_rxq_flow_type flow_type)
+{
+	if (hash_rxq->special_flow[flow_type] == NULL)
+		return;
+	DEBUG("%p: disabling special flow %s (%d)",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+	claim_zero(ibv_exp_destroy_flow(hash_rxq->special_flow[flow_type]));
+	hash_rxq->special_flow[flow_type] = NULL;
+	DEBUG("%p: special flow %s (%d) disabled",
+	      (void *)hash_rxq, hash_rxq_flow_type_str(flow_type), flow_type);
+}
+
+/**
+ * Enable a special flow in all hash RX queues.
  *
  * @param priv
  *   Private structure.
+ * @param flow_type
+ *   Special flow type.
  *
  * @return
  *   0 on success, errno value on failure.
  */
 int
-priv_promiscuous_enable(struct priv *priv)
+priv_special_flow_enable(struct priv *priv, enum hash_rxq_flow_type flow_type)
 {
 	unsigned int i;
 
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_PROMISC))
+	if (!priv_allow_flow_type(priv, flow_type))
 		return 0;
 	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
 		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
 		int ret;
 
-		ret = hash_rxq_promiscuous_enable(hash_rxq);
+		ret = hash_rxq_special_flow_enable(hash_rxq, flow_type);
 		if (!ret)
 			continue;
 		/* Failure, rollback. */
 		while (i != 0) {
 			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_promiscuous_disable(hash_rxq);
+			hash_rxq_special_flow_disable(hash_rxq, flow_type);
 		}
 		return ret;
 	}
@@ -133,6 +223,26 @@ priv_promiscuous_enable(struct priv *priv)
 }
 
 /**
+ * Disable a special flow in all hash RX queues.
+ *
+ * @param priv
+ *   Private structure.
+ * @param flow_type
+ *   Special flow type.
+ */
+void
+priv_special_flow_disable(struct priv *priv, enum hash_rxq_flow_type flow_type)
+{
+	unsigned int i;
+
+	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
+		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
+
+		hash_rxq_special_flow_disable(hash_rxq, flow_type);
+	}
+}
+
+/**
  * DPDK callback to enable promiscuous mode.
  *
  * @param dev
@@ -146,49 +256,14 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->promisc_req = 1;
-	ret = priv_promiscuous_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable promiscuous mode: %s", strerror(ret));
-	else {
-		priv_mac_addrs_disable(priv);
-		priv_allmulticast_disable(priv);
-	}
+		ERROR("error while enabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable promiscuous mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_promiscuous_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->promisc_flow == NULL)
-		return;
-	DEBUG("%p: disabling promiscuous mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->promisc_flow));
-	hash_rxq->promisc_flow = NULL;
-	DEBUG("%p: promiscuous mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable promiscuous mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_promiscuous_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_promiscuous_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable promiscuous mode.
  *
  * @param dev
@@ -198,105 +273,18 @@ void
 mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->promisc_req = 0;
-	priv_promiscuous_disable(priv);
-	priv_mac_addrs_enable(priv);
-	priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling promiscuous mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Enable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-static int
-hash_rxq_allmulticast_enable(struct hash_rxq *hash_rxq)
-{
-	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
-	struct ibv_exp_flow_attr *attr = &data->attr;
-	struct ibv_exp_flow_spec_eth *spec = &data->spec;
-
-	if (hash_rxq->allmulti_flow != NULL)
-		return 0;
-	DEBUG("%p: enabling allmulticast mode", (void *)hash_rxq);
-	/*
-	 * No padding must be inserted by the compiler between attr and spec.
-	 * This layout is expected by libibverbs.
-	 */
-	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
-	*spec = (struct ibv_exp_flow_spec_eth){
-		.type = IBV_EXP_FLOW_SPEC_ETH,
-		.size = sizeof(*spec),
-		.val = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-		.mask = {
-			.dst_mac = "\x01\x00\x00\x00\x00\x00",
-		},
-	};
-	errno = 0;
-	flow = ibv_exp_create_flow(hash_rxq->qp, attr);
-	if (flow == NULL) {
-		/* It's not clear whether errno is always set in this case. */
-		ERROR("%p: flow configuration failed, errno=%d: %s",
-		      (void *)hash_rxq, errno,
-		      (errno ? strerror(errno) : "Unknown error"));
-		if (errno)
-			return errno;
-		return EINVAL;
-	}
-	hash_rxq->allmulti_flow = flow;
-	DEBUG("%p: allmulticast mode enabled", (void *)hash_rxq);
-	return 0;
-}
-
-/**
- * Enable allmulti mode in most hash RX queues.
- * TCP queues are exempted to save resources.
- *
- * @param priv
- *   Private structure.
- *
- * @return
- *   0 on success, errno value on failure.
- */
-int
-priv_allmulticast_enable(struct priv *priv)
-{
-	unsigned int i;
-
-	if (!priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI))
-		return 0;
-	for (i = 0; (i != priv->hash_rxqs_n); ++i) {
-		struct hash_rxq *hash_rxq = &(*priv->hash_rxqs)[i];
-		int ret;
-
-		/* allmulticast not relevant for TCP. */
-		if (hash_rxq->type == HASH_RXQ_TCPV4)
-			continue;
-		ret = hash_rxq_allmulticast_enable(hash_rxq);
-		if (!ret)
-			continue;
-		/* Failure, rollback. */
-		while (i != 0) {
-			hash_rxq = &(*priv->hash_rxqs)[--i];
-			hash_rxq_allmulticast_disable(hash_rxq);
-		}
-		return ret;
-	}
-	return 0;
-}
-
-/**
  * DPDK callback to enable allmulti mode.
  *
  * @param dev
@@ -310,45 +298,14 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 
 	priv_lock(priv);
 	priv->allmulti_req = 1;
-	ret = priv_allmulticast_enable(priv);
+	ret = priv_rehash_flows(priv);
 	if (ret)
-		ERROR("cannot enable allmulticast mode: %s", strerror(ret));
+		ERROR("error while enabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
 
 /**
- * Disable allmulti mode in a hash RX queue.
- *
- * @param hash_rxq
- *   Pointer to hash RX queue structure.
- */
-static void
-hash_rxq_allmulticast_disable(struct hash_rxq *hash_rxq)
-{
-	if (hash_rxq->allmulti_flow == NULL)
-		return;
-	DEBUG("%p: disabling allmulticast mode", (void *)hash_rxq);
-	claim_zero(ibv_exp_destroy_flow(hash_rxq->allmulti_flow));
-	hash_rxq->allmulti_flow = NULL;
-	DEBUG("%p: allmulticast mode disabled", (void *)hash_rxq);
-}
-
-/**
- * Disable allmulti mode in all hash RX queues.
- *
- * @param priv
- *   Private structure.
- */
-void
-priv_allmulticast_disable(struct priv *priv)
-{
-	unsigned int i;
-
-	for (i = 0; (i != priv->hash_rxqs_n); ++i)
-		hash_rxq_allmulticast_disable(&(*priv->hash_rxqs)[i]);
-}
-
-/**
  * DPDK callback to disable allmulti mode.
  *
  * @param dev
@@ -358,9 +315,13 @@ void
 mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
+	int ret;
 
 	priv_lock(priv);
 	priv->allmulti_req = 0;
-	priv_allmulticast_disable(priv);
+	ret = priv_rehash_flows(priv);
+	if (ret)
+		ERROR("error while disabling allmulticast mode: %s",
+		      strerror(ret));
 	priv_unlock(priv);
 }
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ebbe186..166516a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -534,8 +534,8 @@ priv_destroy_hash_rxqs(struct priv *priv)
 		assert(hash_rxq->priv == priv);
 		assert(hash_rxq->qp != NULL);
 		/* Also check that there are no remaining flows. */
-		assert(hash_rxq->allmulti_flow == NULL);
-		assert(hash_rxq->promisc_flow == NULL);
+		for (j = 0; (j != RTE_DIM(hash_rxq->special_flow)); ++j)
+			assert(hash_rxq->special_flow[j] == NULL);
 		for (j = 0; (j != RTE_DIM(hash_rxq->mac_flow)); ++j)
 			for (k = 0; (k != RTE_DIM(hash_rxq->mac_flow[j])); ++k)
 				assert(hash_rxq->mac_flow[j][k] == NULL);
@@ -586,6 +586,35 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 }
 
 /**
+ * Automatically enable/disable flows according to configuration.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+priv_rehash_flows(struct priv *priv)
+{
+	unsigned int i;
+
+	for (i = 0; (i != RTE_DIM((*priv->hash_rxqs)[0].special_flow)); ++i)
+		if (!priv_allow_flow_type(priv, i)) {
+			priv_special_flow_disable(priv, i);
+		} else {
+			int ret = priv_special_flow_enable(priv, i);
+
+			if (ret)
+				return ret;
+		}
+	if (priv_allow_flow_type(priv, HASH_RXQ_FLOW_TYPE_MAC))
+		return priv_mac_addrs_enable(priv);
+	priv_mac_addrs_disable(priv);
+	return 0;
+}
+
+/**
  * Allocate RX queue elements with scattered packets support.
  *
  * @param rxq
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e1e1925..983e6a4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -176,20 +176,42 @@ struct ind_table_init {
 	unsigned int hash_types_n;
 };
 
+/* Initialization data for special flows. */
+struct special_flow_init {
+	uint8_t dst_mac_val[6];
+	uint8_t dst_mac_mask[6];
+	unsigned int hash_types;
+};
+
 enum hash_rxq_flow_type {
-	HASH_RXQ_FLOW_TYPE_MAC,
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
+#ifndef NDEBUG
+static inline const char *
+hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
+{
+	switch (flow_type) {
+	case HASH_RXQ_FLOW_TYPE_PROMISC:
+		return "promiscuous";
+	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
+		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_MAC:
+		return "MAC";
+	}
+	return NULL;
+}
+#endif /* NDEBUG */
+
 struct hash_rxq {
 	struct priv *priv; /* Back pointer to private data. */
 	struct ibv_qp *qp; /* Hash RX QP. */
 	enum hash_rxq_type type; /* Hash RX queue type. */
 	/* MAC flow steering rules, one per VLAN ID. */
 	struct ibv_exp_flow *mac_flow[MLX5_MAX_MAC_ADDRESSES][MLX5_MAX_VLAN_IDS];
-	struct ibv_exp_flow *promisc_flow; /* Promiscuous flow. */
-	struct ibv_exp_flow *allmulti_flow; /* Multicast flow. */
+	struct ibv_exp_flow *special_flow[MLX5_MAX_SPECIAL_FLOWS];
 };
 
 /* TX element. */
@@ -247,6 +269,7 @@ size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
+int priv_rehash_flows(struct priv *);
 void rxq_cleanup(struct rxq *);
 int rxq_rehash(struct rte_eth_dev *, struct rxq *);
 int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ff1203d..d9f7d00 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -72,11 +72,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DEBUG("%p: allocating and configuring hash RX queues", (void *)dev);
 	err = priv_create_hash_rxqs(priv);
 	if (!err)
-		err = priv_promiscuous_enable(priv);
-	if (!err)
-		err = priv_mac_addrs_enable(priv);
-	if (!err)
-		err = priv_allmulticast_enable(priv);
+		err = priv_rehash_flows(priv);
 	if (!err)
 		priv->started = 1;
 	else {
@@ -84,8 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
-		priv_allmulticast_disable(priv);
-		priv_promiscuous_disable(priv);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
@@ -113,8 +109,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
-	priv_allmulticast_disable(priv);
-	priv_promiscuous_disable(priv);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread
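
Note on the enable/rollback logic in patch 1/5: priv_special_flow_enable()
walks all hash RX queues and, on the first failure, disables the flow again
in the queues it already configured. The following is only a minimal sketch
of that pattern, not driver code. struct toy_queue, toy_queue_enable(),
toy_queue_disable() and toy_enable_all() are hypothetical stand-ins for
struct hash_rxq and the hash_rxq_special_flow_*() helpers, and the "fail"
field only exists to simulate an ibv_exp_create_flow() error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define N_QUEUES 4

/* Hypothetical stand-in for struct hash_rxq: one flow flag per queue. */
struct toy_queue {
	int flow_enabled; /* Nonzero once the flow is configured. */
	int fail; /* Force enable to fail, for demonstration only. */
};

/* Enable the flow in one queue; 0 on success, errno value on failure. */
static int
toy_queue_enable(struct toy_queue *q)
{
	if (q->flow_enabled)
		return 0;
	if (q->fail)
		return EINVAL;
	q->flow_enabled = 1;
	return 0;
}

/* Disable the flow in one queue; harmless if it was never enabled. */
static void
toy_queue_disable(struct toy_queue *q)
{
	q->flow_enabled = 0;
}

/* Enable the flow in all queues, undoing partial work on failure. */
static int
toy_enable_all(struct toy_queue *queues, unsigned int n)
{
	unsigned int i;

	for (i = 0; (i != n); ++i) {
		int ret = toy_queue_enable(&queues[i]);

		if (!ret)
			continue;
		/* Failure, roll back queues [0, i). */
		while (i != 0)
			toy_queue_disable(&queues[--i]);
		return ret;
	}
	return 0;
}
```

Because disabling is idempotent, a caller such as priv_rehash_flows() can
invoke the disable path unconditionally for flow types that are no longer
allowed, without tracking which queues were actually configured.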

* [dpdk-dev] [PATCH v3 2/5] mlx5: add special flows (broadcast and IPv6 multicast)
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 1/5] mlx5: refactor special flows handling Adrien Mazarguil
@ 2016-03-03 14:26     ` Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Until now, the broadcast address was registered like a unicast address, in
the last entry of the MAC table. Moving the related flow to the special flows
table frees up that MAC entry.

The same method is used to handle IPv6 multicast frames.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.c         |  7 +++----
 drivers/net/mlx5/mlx5_defs.h    |  2 +-
 drivers/net/mlx5/mlx5_ethdev.c  |  3 +--
 drivers/net/mlx5/mlx5_mac.c     |  6 ++----
 drivers/net/mlx5/mlx5_rxmode.c  | 24 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c     | 10 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.h    |  6 ++++++
 drivers/net/mlx5/mlx5_trigger.c |  4 ++++
 8 files changed, 51 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 52bf4b2..cf7c4a5 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -90,6 +90,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
 	/* Prevent crashes when queues are still in use. */
@@ -416,13 +418,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		     mac.addr_bytes[0], mac.addr_bytes[1],
 		     mac.addr_bytes[2], mac.addr_bytes[3],
 		     mac.addr_bytes[4], mac.addr_bytes[5]);
-		/* Register MAC and broadcast addresses. */
+		/* Register MAC address. */
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
-		claim_zero(priv_mac_addr_add(priv, (RTE_DIM(priv->mac) - 1),
-					     &(const uint8_t [ETHER_ADDR_LEN])
-					     { "\xff\xff\xff\xff\xff\xff" }));
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 1ec14ef..67c3948 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -44,7 +44,7 @@
 #define MLX5_MAX_VLAN_IDS 128
 
 /* Maximum number of special flows. */
-#define MLX5_MAX_SPECIAL_FLOWS 2
+#define MLX5_MAX_SPECIAL_FLOWS 4
 
 /* Request send completion once in every 64 sends, might be less. */
 #define MLX5_PMD_TX_PER_COMP_REQ 64
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 1159fa3..6704382 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -501,8 +501,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		max = 65535;
 	info->max_rx_queues = max;
 	info->max_tx_queues = max;
-	/* Last array entry is reserved for broadcast. */
-	info->max_mac_addrs = (RTE_DIM(priv->mac) - 1);
+	info->max_mac_addrs = RTE_DIM(priv->mac);
 	info->rx_offload_capa =
 		(priv->hw_csum ?
 		 (DEV_RX_OFFLOAD_IPV4_CKSUM |
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index b1f34d9..a1a7ff5 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -212,8 +212,7 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_del(priv, index);
 end:
@@ -479,8 +478,7 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
 	      (void *)dev, index);
-	/* Last array entry is reserved for broadcast. */
-	if (index >= (RTE_DIM(priv->mac) - 1))
+	if (index >= RTE_DIM(priv->mac))
 		goto end;
 	priv_mac_addr_add(priv, index,
 			  (const uint8_t (*)[ETHER_ADDR_LEN])
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index b2ed17e..6ee7ce3 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -88,6 +88,30 @@ static const struct special_flow_init special_flow_init[] = {
 			1 << HASH_RXQ_ETH |
 			0,
 	},
+	[HASH_RXQ_FLOW_TYPE_BROADCAST] = {
+		.dst_mac_val = "\xff\xff\xff\xff\xff\xff",
+		.dst_mac_mask = "\xff\xff\xff\xff\xff\xff",
+		.hash_types =
+			1 << HASH_RXQ_UDPV4 |
+			1 << HASH_RXQ_IPV4 |
+#ifdef HAVE_FLOW_SPEC_IPV6
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#ifdef HAVE_FLOW_SPEC_IPV6
+	[HASH_RXQ_FLOW_TYPE_IPV6MULTI] = {
+		.dst_mac_val = "\x33\x33\x00\x00\x00\x00",
+		.dst_mac_mask = "\xff\xff\x00\x00\x00\x00",
+		.hash_types =
+			1 << HASH_RXQ_UDPV6 |
+			1 << HASH_RXQ_IPV6 |
+			1 << HASH_RXQ_ETH |
+			0,
+	},
+#endif /* HAVE_FLOW_SPEC_IPV6 */
 };
 
 /**
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 166516a..fcf192a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -579,8 +579,18 @@ priv_allow_flow_type(struct priv *priv, enum hash_rxq_flow_type type)
 		return !!priv->promisc_req;
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return !!priv->allmulti_req;
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+		/* If allmulti is enabled, broadcast and ipv6multi
+		 * are unnecessary. */
+		return !priv->allmulti_req;
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return 1;
+	default:
+		/* Unsupported flow type is not allowed. */
+		return 0;
 	}
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 983e6a4..d5a5019 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -186,6 +186,8 @@ struct special_flow_init {
 enum hash_rxq_flow_type {
 	HASH_RXQ_FLOW_TYPE_PROMISC,
 	HASH_RXQ_FLOW_TYPE_ALLMULTI,
+	HASH_RXQ_FLOW_TYPE_BROADCAST,
+	HASH_RXQ_FLOW_TYPE_IPV6MULTI,
 	HASH_RXQ_FLOW_TYPE_MAC,
 };
 
@@ -198,6 +200,10 @@ hash_rxq_flow_type_str(enum hash_rxq_flow_type flow_type)
 		return "promiscuous";
 	case HASH_RXQ_FLOW_TYPE_ALLMULTI:
 		return "allmulticast";
+	case HASH_RXQ_FLOW_TYPE_BROADCAST:
+		return "broadcast";
+	case HASH_RXQ_FLOW_TYPE_IPV6MULTI:
+		return "IPv6 multicast";
 	case HASH_RXQ_FLOW_TYPE_MAC:
 		return "MAC";
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index d9f7d00..90b8068 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -80,6 +80,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		      " %s",
 		      (void *)priv, strerror(err));
 		/* Rollback. */
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 		priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 		priv_mac_addrs_disable(priv);
@@ -109,6 +111,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 		return;
 	}
 	DEBUG("%p: cleaning up and destroying hash RX queues", (void *)dev);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
+	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_BROADCAST);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_ALLMULTI);
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread
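
Note on the dst_mac_val/dst_mac_mask pairs added in patch 2/5: the broadcast
entry uses a full mask, while the IPv6 multicast entry only masks the first
two bytes (33:33). The sketch below models such a value/mask comparison in
software. It assumes the usual bitwise semantics (a frame matches when its
destination equals the value on every masked bit); the real matching is done
by the device once the rule is handed to ibv_exp_create_flow(). MAC_LEN,
mac_matches() and the four tables are hypothetical names used for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define MAC_LEN 6

/* Same values as the special_flow_init[] entries in the patch. */
static const uint8_t bcast_val[MAC_LEN] =
	{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
static const uint8_t bcast_mask[MAC_LEN] =
	{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
static const uint8_t ipv6mc_val[MAC_LEN] =
	{ 0x33, 0x33, 0x00, 0x00, 0x00, 0x00 };
static const uint8_t ipv6mc_mask[MAC_LEN] =
	{ 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 };

/*
 * Hypothetical helper: return 1 when dst equals val on all bits selected
 * by mask, 0 otherwise.
 */
static int
mac_matches(const uint8_t dst[MAC_LEN], const uint8_t val[MAC_LEN],
	    const uint8_t mask[MAC_LEN])
{
	unsigned int i;

	for (i = 0; (i != MAC_LEN); ++i)
		if ((dst[i] & mask[i]) != (val[i] & mask[i]))
			return 0;
	return 1;
}
```

With these tables, a destination such as 33:33:ff:12:34:56 matches the IPv6
multicast entry but not the broadcast one, and ff:ff:ff:ff:ff:ff does not
match the IPv6 multicast entry because its first two bytes differ.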

* [dpdk-dev] [PATCH v3 3/5] mlx5: make flow steering rule generator more generic
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 1/5] mlx5: refactor special flows handling Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
@ 2016-03-03 14:26     ` Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for flow director Adrien Mazarguil
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

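Rename hash_rxq_flow_attr() to priv_flow_attr() and make it take the hash RX
queue type as a parameter instead of a hash RX queue pointer. Upcoming flow
director support will reuse this function to generate filter rules.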

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_mac.c    |  4 ++--
 drivers/net/mlx5/mlx5_rxmode.c |  5 +++--
 drivers/net/mlx5/mlx5_rxq.c    | 16 ++++++++--------
 drivers/net/mlx5/mlx5_rxtx.h   |  4 ++--
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index a1a7ff5..edb05ad 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -241,7 +241,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	const uint8_t (*mac)[ETHER_ADDR_LEN] =
 			(const uint8_t (*)[ETHER_ADDR_LEN])
 			priv->mac[mac_index].addr_bytes;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	unsigned int vlan_enabled = !!priv->vlan_filter_n;
@@ -256,7 +256,7 @@ hash_rxq_add_mac_flow(struct hash_rxq *hash_rxq, unsigned int mac_index,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 6ee7ce3..9ac7a41 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -129,8 +129,9 @@ static int
 hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 			     enum hash_rxq_flow_type flow_type)
 {
+	struct priv *priv = hash_rxq->priv;
 	struct ibv_exp_flow *flow;
-	FLOW_ATTR_SPEC_ETH(data, hash_rxq_flow_attr(hash_rxq, NULL, 0));
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, hash_rxq->type));
 	struct ibv_exp_flow_attr *attr = &data->attr;
 	struct ibv_exp_flow_spec_eth *spec = &data->spec;
 	const uint8_t *mac;
@@ -148,7 +149,7 @@ hash_rxq_special_flow_enable(struct hash_rxq *hash_rxq,
 	 * This layout is expected by libibverbs.
 	 */
 	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec);
-	hash_rxq_flow_attr(hash_rxq, attr, sizeof(data));
+	priv_flow_attr(priv, attr, sizeof(data), hash_rxq->type);
 	/* The first specification must be Ethernet. */
 	assert(spec->type == IBV_EXP_FLOW_SPEC_ETH);
 	assert(spec->size == sizeof(*spec));
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index fcf192a..36910b2 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -210,27 +210,27 @@ const size_t rss_hash_default_key_len = sizeof(rss_hash_default_key);
  * information from hash_rxq_init[]. Nothing is written to flow_attr when
  * flow_attr_size is not large enough, but the required size is still returned.
  *
- * @param[in] hash_rxq
- *   Pointer to hash RX queue.
+ * @param priv
+ *   Pointer to private structure.
  * @param[out] flow_attr
  *   Pointer to flow attribute structure to fill. Note that the allocated
  *   area must be larger and large enough to hold all flow specifications.
  * @param flow_attr_size
  *   Entire size of flow_attr and trailing room for flow specifications.
+ * @param type
+ *   Hash RX queue type to use for flow steering rule.
  *
  * @return
  *   Total size of the flow attribute buffer. No errors are defined.
  */
 size_t
-hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
-		   struct ibv_exp_flow_attr *flow_attr,
-		   size_t flow_attr_size)
+priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
+	       size_t flow_attr_size, enum hash_rxq_type type)
 {
 	size_t offset = sizeof(*flow_attr);
-	enum hash_rxq_type type = hash_rxq->type;
 	const struct hash_rxq_init *init = &hash_rxq_init[type];
 
-	assert(hash_rxq->priv != NULL);
+	assert(priv != NULL);
 	assert((size_t)type < RTE_DIM(hash_rxq_init));
 	do {
 		offset += init->flow_spec.hdr.size;
@@ -244,7 +244,7 @@ hash_rxq_flow_attr(const struct hash_rxq *hash_rxq,
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
 		.priority = init->flow_priority,
 		.num_of_specs = 0,
-		.port = hash_rxq->priv->port,
+		.port = priv->port,
 		.flags = 0,
 	};
 	do {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index d5a5019..c42bb8d 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -270,8 +270,8 @@ extern const unsigned int hash_rxq_init_n;
 extern uint8_t rss_hash_default_key[];
 extern const size_t rss_hash_default_key_len;
 
-size_t hash_rxq_flow_attr(const struct hash_rxq *, struct ibv_exp_flow_attr *,
-			  size_t);
+size_t priv_flow_attr(struct priv *, struct ibv_exp_flow_attr *,
+		      size_t, enum hash_rxq_type);
 int priv_create_hash_rxqs(struct priv *);
 void priv_destroy_hash_rxqs(struct priv *);
 int priv_allow_flow_type(struct priv *, enum hash_rxq_flow_type);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 26+ messages in thread
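
Note on how priv_flow_attr() is called in patch 3/5: per its documentation,
nothing is written when the buffer is too small, but the required size is
still returned. Callers therefore invoke it once with (NULL, 0) to size the
buffer and a second time to fill it. The sketch below shows that two-call
pattern on a simplified model. struct toy_attr, struct toy_spec, toy_specs[],
toy_flow_attr() and toy_flow_attr_alloc() are hypothetical; the driver itself
sizes an on-stack buffer through FLOW_ATTR_SPEC_ETH() rather than malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the verbs structures. */
struct toy_attr {
	unsigned int num_of_specs;
	size_t size; /* Total size of attr plus trailing specs. */
};

struct toy_spec {
	unsigned int type;
	size_t size;
};

/* Specifications appended after the attribute header. */
static const struct toy_spec toy_specs[] = {
	{ 1, sizeof(struct toy_spec) },
	{ 2, sizeof(struct toy_spec) },
};

/*
 * Fill attr and its trailing specs when attr_size is large enough.
 * The required size is returned in all cases; attr is left untouched
 * (and may be NULL) when attr_size is too small.
 */
static size_t
toy_flow_attr(struct toy_attr *attr, size_t attr_size)
{
	size_t need = sizeof(*attr) + sizeof(toy_specs);

	if (attr_size < need)
		return need;
	attr->num_of_specs =
		(unsigned int)(sizeof(toy_specs) / sizeof(toy_specs[0]));
	attr->size = need;
	/* Specs are laid out right after the header, without padding. */
	memcpy(attr + 1, toy_specs, sizeof(toy_specs));
	return need;
}

/* Usage: query the size, allocate, then fill. Caller frees the result. */
static struct toy_attr *
toy_flow_attr_alloc(void)
{
	size_t need = toy_flow_attr(NULL, 0);
	struct toy_attr *attr = malloc(need);

	if (attr != NULL)
		toy_flow_attr(attr, need);
	return attr;
}
```

Passing the queue type explicitly, as the patch does, keeps this sizing logic
independent of any particular hash RX queue object.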

* [dpdk-dev] [PATCH v3 4/5] mlx5: add support for flow director
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                       ` (2 preceding siblings ...)
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
@ 2016-03-03 14:26     ` Adrien Mazarguil
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
  2016-03-09 16:11     ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev; +Cc: Raslan Darawsheh

From: Yaacov Hazan <yaacovh@mellanox.com>

Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
---
 doc/guides/nics/mlx5.rst               |  14 +-
 doc/guides/rel_notes/release_16_04.rst |   7 +
 drivers/net/mlx5/Makefile              |   6 +
 drivers/net/mlx5/mlx5.c                |  12 +
 drivers/net/mlx5/mlx5.h                |  10 +
 drivers/net/mlx5/mlx5_defs.h           |  11 +
 drivers/net/mlx5/mlx5_fdir.c           | 980 +++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxq.c            |   6 +
 drivers/net/mlx5/mlx5_rxtx.h           |   7 +
 drivers/net/mlx5/mlx5_trigger.c        |   3 +
 10 files changed, 1055 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/mlx5/mlx5_fdir.c

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b2a12ce..6bd452e 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -86,6 +86,7 @@ Features
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
+- Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
 
 Limitations
 -----------
@@ -214,7 +215,8 @@ DPDK and must be installed separately:
 
 Currently supported by DPDK:
 
-- Mellanox OFED **3.1-1.0.3** or **3.1-1.5.7.1** depending on usage.
+- Mellanox OFED **3.1-1.0.3**, **3.1-1.5.7.1** or **3.2-2.0.0.0** depending
+  on usage.
 
     The following features are supported with version **3.1-1.5.7.1** and
     above only:
@@ -223,6 +225,11 @@ Currently supported by DPDK:
     - RX checksum offloads.
     - IBM POWER8.
 
+    The following features are supported with version **3.2-2.0.0.0** and
+    above only:
+
+    - Flow director.
+
 - Minimum firmware version:
 
   With MLNX_OFED **3.1-1.0.3**:
@@ -235,6 +242,11 @@ Currently supported by DPDK:
   - ConnectX-4: **12.13.0144**
   - ConnectX-4 Lx: **14.13.0144**
 
+  With MLNX_OFED **3.2-2.0.0.0**:
+
+  - ConnectX-4: **12.14.2036**
+  - ConnectX-4 Lx: **14.14.2036**
+
 Getting Mellanox OFED
 ~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 73494f9..c6c76d6 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -74,6 +74,13 @@ This section should contain new features added in this release. Sample format:
 
 * **szedata2: Add functions for setting link up/down.**
 
+* **mlx5: flow director support.**
+
+  Added flow director support (RTE_FDIR_MODE_PERFECT and
+  RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 698f072..46a17e0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -52,6 +52,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rxmode.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_vlan.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_fdir.c
 
 # Dependencies.
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += lib/librte_ether
@@ -125,6 +126,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		infiniband/verbs.h \
+		enum IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index cf7c4a5..43e24ff 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -94,6 +94,11 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_IPV6MULTI);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+
+	/* Remove flow director elements. */
+	priv_fdir_disable(priv);
+	priv_fdir_delete_filters_list(priv);
+
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
@@ -170,6 +175,9 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
 	.rss_hash_conf_get = mlx5_rss_hash_conf_get,
+#ifdef MLX5_FDIR_SUPPORT
+	.filter_ctrl = mlx5_dev_filter_ctrl,
+#endif /* MLX5_FDIR_SUPPORT */
 };
 
 static struct {
@@ -422,6 +430,10 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		claim_zero(priv_mac_addr_add(priv, 0,
 					     (const uint8_t (*)[ETHER_ADDR_LEN])
 					     mac.addr_bytes));
+		/* Initialize FD filters list. */
+		err = fdir_init_filters_list(priv);
+		if (err)
+			goto port_error;
 #ifndef NDEBUG
 		{
 			char ifname[IF_NAMESIZE];
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1c69bfa..8019ee3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -120,6 +120,7 @@ struct priv {
 	struct rte_intr_handle intr_handle; /* Interrupt handler. */
 	unsigned int (*reta_idx)[]; /* RETA index table. */
 	unsigned int reta_idx_n; /* RETA index size. */
+	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
@@ -216,4 +217,13 @@ int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
 int mlx5_dev_start(struct rte_eth_dev *);
 void mlx5_dev_stop(struct rte_eth_dev *);
 
+/* mlx5_fdir.c */
+
+int fdir_init_filters_list(struct priv *);
+void priv_fdir_delete_filters_list(struct priv *);
+void priv_fdir_disable(struct priv *);
+void priv_fdir_enable(struct priv *);
+int mlx5_dev_filter_ctrl(struct rte_eth_dev *, enum rte_filter_type,
+			 enum rte_filter_op, void *);
+
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 67c3948..5b00d8e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -34,6 +34,8 @@
 #ifndef RTE_PMD_MLX5_DEFS_H_
 #define RTE_PMD_MLX5_DEFS_H_
 
+#include "mlx5_autoconf.h"
+
 /* Reported driver name. */
 #define MLX5_DRIVER_NAME "librte_pmd_mlx5"
 
@@ -84,4 +86,13 @@
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
+/*
+ * Extended flow priorities necessary to support flow director are available
+ * since MLNX_OFED 3.2. Considering this version adds support for VLAN
+ * offloads as well, their availability means flow director can be used.
+ */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+#define MLX5_FDIR_SUPPORT 1
+#endif
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
new file mode 100644
index 0000000..63e43ad
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -0,0 +1,980 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright 2015 6WIND S.A.
+ *   Copyright 2015 Mellanox.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of 6WIND S.A. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <assert.h>
+#include <stdint.h>
+#include <string.h>
+#include <errno.h>
+
+/* Verbs header. */
+/* ISO C doesn't support unnamed structs/unions, disabling -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <infiniband/verbs.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+/* DPDK headers don't like -pedantic. */
+#ifdef PEDANTIC
+#pragma GCC diagnostic ignored "-pedantic"
+#endif
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_ethdev.h>
+#include <rte_common.h>
+#ifdef PEDANTIC
+#pragma GCC diagnostic error "-pedantic"
+#endif
+
+#include "mlx5.h"
+#include "mlx5_rxtx.h"
+
+struct fdir_flow_desc {
+	uint16_t dst_port;
+	uint16_t src_port;
+	uint32_t src_ip[4];
+	uint32_t dst_ip[4];
+	uint8_t	mac[6];
+	uint16_t vlan_tag;
+	enum hash_rxq_type type;
+};
+
+struct mlx5_fdir_filter {
+	LIST_ENTRY(mlx5_fdir_filter) next;
+	uint16_t queue; /* RX queue that matching packets are steered to. */
+	struct fdir_flow_desc desc;
+	struct ibv_exp_flow *flow;
+};
+
+LIST_HEAD(fdir_filter_list, mlx5_fdir_filter);
+
+/**
+ * Convert struct rte_eth_fdir_filter to mlx5 filter descriptor.
+ *
+ * @param[in] fdir_filter
+ *   DPDK filter structure to convert.
+ * @param[out] desc
+ *   Resulting mlx5 filter descriptor.
+ * @param mode
+ *   Flow director mode.
+ */
+static void
+fdir_filter_to_flow_desc(const struct rte_eth_fdir_filter *fdir_filter,
+			 struct fdir_flow_desc *desc, enum rte_fdir_mode mode)
+{
+	/* Initialize descriptor. */
+	memset(desc, 0, sizeof(*desc));
+
+	/* Set VLAN ID. */
+	desc->vlan_tag = fdir_filter->input.flow_ext.vlan_tci;
+
+	/* Set MAC address. */
+	if (mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		rte_memcpy(desc->mac,
+			   fdir_filter->input.flow.mac_vlan_flow.mac_addr.
+				addr_bytes,
+			   sizeof(desc->mac));
+		desc->type = HASH_RXQ_ETH;
+		return;
+	}
+
+	/* Set hash RX queue type. */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+		desc->type = HASH_RXQ_UDPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->type = HASH_RXQ_TCPV4;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->type = HASH_RXQ_IPV4;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+		desc->type = HASH_RXQ_UDPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->type = HASH_RXQ_TCPV6;
+		break;
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		desc->type = HASH_RXQ_IPV6;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+
+	/* Set flow values */
+	switch (fdir_filter->input.flow_type) {
+	case RTE_ETH_FLOW_NONFRAG_IPV4_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV4_TCP:
+		desc->src_port = fdir_filter->input.flow.udp4_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp4_flow.dst_port;
+	case RTE_ETH_FLOW_NONFRAG_IPV4_OTHER:
+		desc->src_ip[0] = fdir_filter->input.flow.ip4_flow.src_ip;
+		desc->dst_ip[0] = fdir_filter->input.flow.ip4_flow.dst_ip;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case RTE_ETH_FLOW_NONFRAG_IPV6_UDP:
+	case RTE_ETH_FLOW_NONFRAG_IPV6_TCP:
+		desc->src_port = fdir_filter->input.flow.udp6_flow.src_port;
+		desc->dst_port = fdir_filter->input.flow.udp6_flow.dst_port;
+		/* Fall through. */
+	case RTE_ETH_FLOW_NONFRAG_IPV6_OTHER:
+		rte_memcpy(desc->src_ip,
+			   fdir_filter->input.flow.ipv6_flow.src_ip,
+			   sizeof(desc->src_ip));
+		rte_memcpy(desc->dst_ip,
+			   fdir_filter->input.flow.ipv6_flow.dst_ip,
+			   sizeof(desc->dst_ip));
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+}
+
+/**
+ * Check if two flow descriptors overlap according to configured mask.
+ *
+ * @param priv
+ *   Private structure that provides flow director mask.
+ * @param desc1
+ *   First flow descriptor to compare.
+ * @param desc2
+ *   Second flow descriptor to compare.
+ *
+ * @return
+ *   Nonzero if descriptors overlap.
+ */
+static int
+priv_fdir_overlap(const struct priv *priv,
+		  const struct fdir_flow_desc *desc1,
+		  const struct fdir_flow_desc *desc2)
+{
+	const struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+	unsigned int i;
+
+	if (desc1->type != desc2->type)
+		return 0;
+	/* Only compare bits selected by the mask. */
+	for (i = 0; i != RTE_DIM(desc1->mac); ++i)
+		if ((desc1->mac[i] & mask->mac_addr_byte_mask) !=
+		    (desc2->mac[i] & mask->mac_addr_byte_mask))
+			return 0;
+	if (((desc1->src_port & mask->src_port_mask) !=
+	     (desc2->src_port & mask->src_port_mask)) ||
+	    ((desc1->dst_port & mask->dst_port_mask) !=
+	     (desc2->dst_port & mask->dst_port_mask)))
+		return 0;
+	switch (desc1->type) {
+	case HASH_RXQ_IPV4:
+	case HASH_RXQ_UDPV4:
+	case HASH_RXQ_TCPV4:
+		if (((desc1->src_ip[0] & mask->ipv4_mask.src_ip) !=
+		     (desc2->src_ip[0] & mask->ipv4_mask.src_ip)) ||
+		    ((desc1->dst_ip[0] & mask->ipv4_mask.dst_ip) !=
+		     (desc2->dst_ip[0] & mask->ipv4_mask.dst_ip)))
+			return 0;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_IPV6:
+	case HASH_RXQ_UDPV6:
+	case HASH_RXQ_TCPV6:
+		for (i = 0; i != RTE_DIM(desc1->src_ip); ++i)
+			if (((desc1->src_ip[i] & mask->ipv6_mask.src_ip[i]) !=
+			     (desc2->src_ip[i] & mask->ipv6_mask.src_ip[i])) ||
+			    ((desc1->dst_ip[i] & mask->ipv6_mask.dst_ip[i]) !=
+			     (desc2->dst_ip[i] & mask->ipv6_mask.dst_ip[i])))
+				return 0;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		break;
+	}
+	return 1;
+}
+
+/**
+ * Create flow director steering rule for a specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create a steering rule for.
+ * @param fdir_queue
+ *   Flow director queue for matching packets.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_flow_add(struct priv *priv,
+		   struct mlx5_fdir_filter *mlx5_fdir_filter,
+		   struct fdir_queue *fdir_queue)
+{
+	struct ibv_exp_flow *flow;
+	struct fdir_flow_desc *desc = &mlx5_fdir_filter->desc;
+	enum rte_fdir_mode fdir_mode =
+		priv->dev->data->dev_conf.fdir_conf.mode;
+	struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+	FLOW_ATTR_SPEC_ETH(data, priv_flow_attr(priv, NULL, 0, desc->type));
+	struct ibv_exp_flow_attr *attr = &data->attr;
+	uintptr_t spec_offset = (uintptr_t)&data->spec;
+	struct ibv_exp_flow_spec_eth *spec_eth;
+	struct ibv_exp_flow_spec_ipv4 *spec_ipv4;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	struct ibv_exp_flow_spec_ipv6 *spec_ipv6;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	struct ibv_exp_flow_spec_tcp_udp *spec_tcp_udp;
+	struct mlx5_fdir_filter *iter_fdir_filter;
+	unsigned int i;
+
+	/* Abort if an existing flow overlaps this one to avoid packet
+	 * duplication, even if it targets another queue. */
+	LIST_FOREACH(iter_fdir_filter, priv->fdir_filter_list, next)
+		if ((iter_fdir_filter != mlx5_fdir_filter) &&
+		    (iter_fdir_filter->flow != NULL) &&
+		    (priv_fdir_overlap(priv,
+				       &mlx5_fdir_filter->desc,
+				       &iter_fdir_filter->desc)))
+			return EEXIST;
+
+	/*
+	 * No padding must be inserted by the compiler between attr and spec.
+	 * This layout is expected by libibverbs.
+	 */
+	assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset);
+	priv_flow_attr(priv, attr, sizeof(data), desc->type);
+
+	/* Set Ethernet spec */
+	spec_eth = (struct ibv_exp_flow_spec_eth *)spec_offset;
+
+	/* The first specification must be Ethernet. */
+	assert(spec_eth->type == IBV_EXP_FLOW_SPEC_ETH);
+	assert(spec_eth->size == sizeof(*spec_eth));
+
+	/* VLAN ID */
+	spec_eth->val.vlan_tag = desc->vlan_tag & mask->vlan_tci_mask;
+	spec_eth->mask.vlan_tag = mask->vlan_tci_mask;
+
+	/* Update priority */
+	attr->priority = 2;
+
+	if (fdir_mode == RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		/* MAC Address */
+		for (i = 0; i != RTE_DIM(spec_eth->mask.dst_mac); ++i) {
+			spec_eth->val.dst_mac[i] =
+				desc->mac[i] & mask->mac_addr_byte_mask;
+			spec_eth->mask.dst_mac[i] = mask->mac_addr_byte_mask;
+		}
+		goto create_flow;
+	}
+
+	switch (desc->type) {
+	case HASH_RXQ_IPV4:
+	case HASH_RXQ_UDPV4:
+	case HASH_RXQ_TCPV4:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv4 = (struct ibv_exp_flow_spec_ipv4 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv4->type == IBV_EXP_FLOW_SPEC_IPV4);
+		assert(spec_ipv4->size == sizeof(*spec_ipv4));
+
+		spec_ipv4->val.src_ip =
+			desc->src_ip[0] & mask->ipv4_mask.src_ip;
+		spec_ipv4->val.dst_ip =
+			desc->dst_ip[0] & mask->ipv4_mask.dst_ip;
+		spec_ipv4->mask.src_ip = mask->ipv4_mask.src_ip;
+		spec_ipv4->mask.dst_ip = mask->ipv4_mask.dst_ip;
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV4)
+			goto create_flow;
+
+		spec_offset += spec_ipv4->size;
+		break;
+#ifdef HAVE_FLOW_SPEC_IPV6
+	case HASH_RXQ_IPV6:
+	case HASH_RXQ_UDPV6:
+	case HASH_RXQ_TCPV6:
+		spec_offset += spec_eth->size;
+
+		/* Set IP spec */
+		spec_ipv6 = (struct ibv_exp_flow_spec_ipv6 *)spec_offset;
+
+		/* The second specification must be IP. */
+		assert(spec_ipv6->type == IBV_EXP_FLOW_SPEC_IPV6);
+		assert(spec_ipv6->size == sizeof(*spec_ipv6));
+
+		for (i = 0; i != RTE_DIM(desc->src_ip); ++i) {
+			((uint32_t *)spec_ipv6->val.src_ip)[i] =
+				desc->src_ip[i] & mask->ipv6_mask.src_ip[i];
+			((uint32_t *)spec_ipv6->val.dst_ip)[i] =
+				desc->dst_ip[i] & mask->ipv6_mask.dst_ip[i];
+		}
+		rte_memcpy(spec_ipv6->mask.src_ip,
+			   mask->ipv6_mask.src_ip,
+			   sizeof(spec_ipv6->mask.src_ip));
+		rte_memcpy(spec_ipv6->mask.dst_ip,
+			   mask->ipv6_mask.dst_ip,
+			   sizeof(spec_ipv6->mask.dst_ip));
+
+		/* Update priority */
+		attr->priority = 1;
+
+		if (desc->type == HASH_RXQ_IPV6)
+			goto create_flow;
+
+		spec_offset += spec_ipv6->size;
+		break;
+#endif /* HAVE_FLOW_SPEC_IPV6 */
+	default:
+		ERROR("invalid flow attribute type");
+		return EINVAL;
+	}
+
+	/* Set TCP/UDP flow specification. */
+	spec_tcp_udp = (struct ibv_exp_flow_spec_tcp_udp *)spec_offset;
+
+	/* The third specification must be TCP/UDP. */
+	assert(spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_TCP ||
+	       spec_tcp_udp->type == IBV_EXP_FLOW_SPEC_UDP);
+	assert(spec_tcp_udp->size == sizeof(*spec_tcp_udp));
+
+	spec_tcp_udp->val.src_port = desc->src_port & mask->src_port_mask;
+	spec_tcp_udp->val.dst_port = desc->dst_port & mask->dst_port_mask;
+	spec_tcp_udp->mask.src_port = mask->src_port_mask;
+	spec_tcp_udp->mask.dst_port = mask->dst_port_mask;
+
+	/* Update priority */
+	attr->priority = 0;
+
+create_flow:
+
+	errno = 0;
+	flow = ibv_exp_create_flow(fdir_queue->qp, attr);
+	if (flow == NULL) {
+		/* It's not clear whether errno is always set in this case. */
+		ERROR("%p: flow director configuration failed, errno=%d: %s",
+		      (void *)priv, errno,
+		      (errno ? strerror(errno) : "Unknown error"));
+		if (errno)
+			return errno;
+		return EINVAL;
+	}
+
+	DEBUG("%p: added flow director rule (%p)", (void *)priv, (void *)flow);
+	mlx5_fdir_filter->flow = flow;
+	return 0;
+}
+
+/**
+ * Get flow director queue for a specific RX queue, create it in case
+ * it does not exist.
+ *
+ * @param priv
+ *   Private structure.
+ * @param idx
+ *   RX queue index.
+ *
+ * @return
+ *   Related flow director queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_get_fdir_queue(struct priv *priv, uint16_t idx)
+{
+	struct fdir_queue *fdir_queue = &(*priv->rxqs)[idx]->fdir_queue;
+	struct ibv_exp_rwq_ind_table *ind_table = NULL;
+	struct ibv_qp *qp = NULL;
+	struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
+	struct ibv_exp_rx_hash_conf hash_conf;
+	struct ibv_exp_qp_init_attr qp_init_attr;
+	int err = 0;
+
+	/* Return immediately if it has already been created. */
+	if (fdir_queue->qp != NULL)
+		return fdir_queue;
+
+	ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
+		.pd = priv->pd,
+		.log_ind_tbl_size = 0,
+		.ind_tbl = &((*priv->rxqs)[idx]->wq),
+		.comp_mask = 0,
+	};
+
+	errno = 0;
+	ind_table = ibv_exp_create_rwq_ind_table(priv->ctx,
+						 &ind_init_attr);
+	if (ind_table == NULL) {
+		/* Not clear whether errno is set. */
+		err = (errno ? errno : EINVAL);
+		ERROR("RX indirection table creation failed with error %d: %s",
+		      err, strerror(err));
+		goto error;
+	}
+
+	/* Create fdir_queue qp. */
+	hash_conf = (struct ibv_exp_rx_hash_conf){
+		.rx_hash_function = IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
+		.rx_hash_key_len = rss_hash_default_key_len,
+		.rx_hash_key = rss_hash_default_key,
+		.rx_hash_fields_mask = 0,
+		.rwq_ind_tbl = ind_table,
+	};
+	qp_init_attr = (struct ibv_exp_qp_init_attr){
+		.max_inl_recv = 0, /* Currently not supported. */
+		.qp_type = IBV_QPT_RAW_PACKET,
+		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
+			      IBV_EXP_QP_INIT_ATTR_RX_HASH),
+		.pd = priv->pd,
+		.rx_hash_conf = &hash_conf,
+		.port_num = priv->port,
+	};
+
+	qp = ibv_exp_create_qp(priv->ctx, &qp_init_attr);
+	if (qp == NULL) {
+		err = (errno ? errno : EINVAL);
+		ERROR("hash RX QP creation failure: %s", strerror(err));
+		goto error;
+	}
+
+	fdir_queue->ind_table = ind_table;
+	fdir_queue->qp = qp;
+
+	return fdir_queue;
+
+error:
+	if (qp != NULL)
+		claim_zero(ibv_destroy_qp(qp));
+
+	if (ind_table != NULL)
+		claim_zero(ibv_exp_destroy_rwq_ind_table(ind_table));
+
+	return NULL;
+}
+
+/**
+ * Enable flow director filter and create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ * @param mlx5_fdir_filter
+ *   Filter to create steering rule for.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_enable(struct priv *priv,
+			struct mlx5_fdir_filter *mlx5_fdir_filter)
+{
+	struct fdir_queue *fdir_queue;
+
+	/* Check if flow already exists. */
+	if (mlx5_fdir_filter->flow != NULL)
+		return 0;
+
+	/* Get fdir_queue for specific queue. */
+	fdir_queue = priv_get_fdir_queue(priv, mlx5_fdir_filter->queue);
+
+	if (fdir_queue == NULL) {
+		ERROR("failed to create flow director rxq for queue %d",
+		      mlx5_fdir_filter->queue);
+		return EINVAL;
+	}
+
+	/* Create flow */
+	return priv_fdir_flow_add(priv, mlx5_fdir_filter, fdir_queue);
+}
+
+/**
+ * Initialize flow director filters list.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+fdir_init_filters_list(struct priv *priv)
+{
+	/* Filter list initialization should be done only once. */
+	if (priv->fdir_filter_list)
+		return 0;
+
+	/* Create filters list. */
+	priv->fdir_filter_list =
+		rte_calloc(__func__, 1, sizeof(*priv->fdir_filter_list), 0);
+
+	if (priv->fdir_filter_list == NULL) {
+		int err = ENOMEM;
+
+		ERROR("cannot allocate flow director filter list: %s",
+		      strerror(err));
+		return err;
+	}
+
+	LIST_INIT(priv->fdir_filter_list);
+
+	return 0;
+}
+
+/**
+ * Flush all filters.
+ *
+ * @param priv
+ *   Private structure.
+ */
+static void
+priv_fdir_filter_flush(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	while ((mlx5_fdir_filter = LIST_FIRST(priv->fdir_filter_list))) {
+		struct ibv_exp_flow *flow = mlx5_fdir_filter->flow;
+
+		DEBUG("%p: flushing flow director filter %p",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+		LIST_REMOVE(mlx5_fdir_filter, next);
+		if (flow != NULL)
+			claim_zero(ibv_exp_destroy_flow(flow));
+		rte_free(mlx5_fdir_filter);
+	}
+}
+
+/**
+ * Remove all flow director filters and delete list.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_delete_filters_list(struct priv *priv)
+{
+	priv_fdir_filter_flush(priv);
+	rte_free(priv->fdir_filter_list);
+	priv->fdir_filter_list = NULL;
+}
+
+/**
+ * Disable flow director, remove all steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_disable(struct priv *priv)
+{
+	unsigned int i;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	struct fdir_queue *fdir_queue;
+
+	/* Destroy the flow handle of every flow director filter. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		struct ibv_exp_flow *flow;
+
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+		flow = mlx5_fdir_filter->flow;
+
+		/* Destroy flow handle */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+	}
+
+	/* Destroy the flow director QP and indirection table of every
+	 * RX queue. */
+	for (i = 0; (i != priv->rxqs_n); i++) {
+		fdir_queue = &(*priv->rxqs)[i]->fdir_queue;
+
+		if (fdir_queue->qp != NULL) {
+			claim_zero(ibv_destroy_qp(fdir_queue->qp));
+			fdir_queue->qp = NULL;
+		}
+
+		if (fdir_queue->ind_table != NULL) {
+			claim_zero(ibv_exp_destroy_rwq_ind_table
+				   (fdir_queue->ind_table));
+			fdir_queue->ind_table = NULL;
+		}
+	}
+}
+
+/**
+ * Enable flow director, create steering rules.
+ *
+ * @param priv
+ *   Private structure.
+ */
+void
+priv_fdir_enable(struct priv *priv)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	/* Create a flow handle for every flow director filter. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list */
+		assert(mlx5_fdir_filter != NULL);
+
+		priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+	}
+}
+
+/**
+ * Find specific filter in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to find.
+ *
+ * @return
+ *   Filter element if found, otherwise NULL.
+ */
+static struct mlx5_fdir_filter *
+priv_find_filter_in_list(struct priv *priv,
+			 const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct fdir_flow_desc desc;
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+
+	/* Get flow director filter to look for. */
+	fdir_filter_to_flow_desc(fdir_filter, &desc, fdir_mode);
+
+	/* Look for the requested element. */
+	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
+		/* Only valid elements should be in the list. */
+		assert(mlx5_fdir_filter != NULL);
+
+		/* Return matching filter. */
+		if (!memcmp(&desc, &mlx5_fdir_filter->desc, sizeof(desc)))
+			return mlx5_fdir_filter;
+	}
+
+	/* Filter not found */
+	return NULL;
+}
+
+/**
+ * Add new flow director filter and store it in list.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Flow director filter to add.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_add(struct priv *priv,
+		     const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+	enum rte_fdir_mode fdir_mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	int err = 0;
+
+	/* Validate queue number. */
+	if (fdir_filter->action.rx_queue >= priv->rxqs_n) {
+		ERROR("invalid queue number %d", fdir_filter->action.rx_queue);
+		return EINVAL;
+	}
+
+	/* Duplicate filters are currently unsupported. */
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		ERROR("filter already exists");
+		return EINVAL;
+	}
+
+	/* Create new flow director filter. */
+	mlx5_fdir_filter =
+		rte_calloc(__func__, 1, sizeof(*mlx5_fdir_filter), 0);
+	if (mlx5_fdir_filter == NULL) {
+		err = ENOMEM;
+		ERROR("cannot allocate flow director filter: %s",
+		      strerror(err));
+		return err;
+	}
+
+	/* Set queue. */
+	mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+	/* Convert to mlx5 filter descriptor. */
+	fdir_filter_to_flow_desc(fdir_filter,
+				 &mlx5_fdir_filter->desc, fdir_mode);
+
+	/* Insert new filter into list. */
+	LIST_INSERT_HEAD(priv->fdir_filter_list, mlx5_fdir_filter, next);
+
+	DEBUG("%p: flow director filter %p added",
+	      (void *)priv, (void *)mlx5_fdir_filter);
+
+	/* Enable filter immediately if device is started. */
+	if (priv->started)
+		err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+	return err;
+}
+
+/**
+ * Update queue for specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be updated.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_update(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	/* Validate queue number. */
+	if (fdir_filter->action.rx_queue >= priv->rxqs_n) {
+		ERROR("invalid queue number %d", fdir_filter->action.rx_queue);
+		return EINVAL;
+	}
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		struct ibv_exp_flow *flow = mlx5_fdir_filter->flow;
+		int err = 0;
+
+		/* Update queue number. */
+		mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+
+		/* Destroy flow handle. */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+		DEBUG("%p: flow director filter %p updated",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Enable filter if device is started. */
+		if (priv->started)
+			err = priv_fdir_filter_enable(priv, mlx5_fdir_filter);
+
+		return err;
+	}
+
+	/* Filter not found, create it. */
+	DEBUG("%p: filter not found for update, creating new filter",
+	      (void *)priv);
+	return priv_fdir_filter_add(priv, fdir_filter);
+}
+
+/**
+ * Delete specific filter.
+ *
+ * @param priv
+ *   Private structure.
+ * @param fdir_filter
+ *   Filter to be deleted.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_filter_delete(struct priv *priv,
+			const struct rte_eth_fdir_filter *fdir_filter)
+{
+	struct mlx5_fdir_filter *mlx5_fdir_filter;
+
+	mlx5_fdir_filter = priv_find_filter_in_list(priv, fdir_filter);
+	if (mlx5_fdir_filter != NULL) {
+		struct ibv_exp_flow *flow = mlx5_fdir_filter->flow;
+
+		/* Remove element from list. */
+		LIST_REMOVE(mlx5_fdir_filter, next);
+
+		/* Destroy flow handle. */
+		if (flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(flow));
+			mlx5_fdir_filter->flow = NULL;
+		}
+
+		DEBUG("%p: flow director filter %p deleted",
+		      (void *)priv, (void *)mlx5_fdir_filter);
+
+		/* Delete filter. */
+		rte_free(mlx5_fdir_filter);
+
+		return 0;
+	}
+
+	ERROR("%p: flow director delete failed, cannot find filter",
+	      (void *)priv);
+	return EINVAL;
+}
+
+/**
+ * Get flow director information.
+ *
+ * @param priv
+ *   Private structure.
+ * @param[out] fdir_info
+ *   Resulting flow director information.
+ */
+static void
+priv_fdir_info_get(struct priv *priv, struct rte_eth_fdir_info *fdir_info)
+{
+	struct rte_eth_fdir_masks *mask =
+		&priv->dev->data->dev_conf.fdir_conf.mask;
+
+	fdir_info->mode = priv->dev->data->dev_conf.fdir_conf.mode;
+	fdir_info->guarant_spc = 0;
+
+	rte_memcpy(&fdir_info->mask, mask, sizeof(fdir_info->mask));
+
+	fdir_info->max_flexpayload = 0;
+	fdir_info->flow_types_mask[0] = 0;
+
+	fdir_info->flex_payload_unit = 0;
+	fdir_info->max_flex_payload_segment_num = 0;
+	fdir_info->flex_payload_limit = 0;
+	memset(&fdir_info->flex_conf, 0, sizeof(fdir_info->flex_conf));
+}
+
+/**
+ * Deal with flow director operations.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_fdir_ctrl_func(struct priv *priv, enum rte_filter_op filter_op, void *arg)
+{
+	enum rte_fdir_mode fdir_mode =
+		priv->dev->data->dev_conf.fdir_conf.mode;
+	int ret = 0;
+
+	if (filter_op == RTE_ETH_FILTER_NOP)
+		return 0;
+
+	if (fdir_mode != RTE_FDIR_MODE_PERFECT &&
+	    fdir_mode != RTE_FDIR_MODE_PERFECT_MAC_VLAN) {
+		ERROR("%p: flow director mode %d not supported",
+		      (void *)priv, fdir_mode);
+		return EINVAL;
+	}
+
+	switch (filter_op) {
+	case RTE_ETH_FILTER_ADD:
+		ret = priv_fdir_filter_add(priv, arg);
+		break;
+	case RTE_ETH_FILTER_UPDATE:
+		ret = priv_fdir_filter_update(priv, arg);
+		break;
+	case RTE_ETH_FILTER_DELETE:
+		ret = priv_fdir_filter_delete(priv, arg);
+		break;
+	case RTE_ETH_FILTER_FLUSH:
+		priv_fdir_filter_flush(priv);
+		break;
+	case RTE_ETH_FILTER_INFO:
+		priv_fdir_info_get(priv, arg);
+		break;
+	default:
+		DEBUG("%p: unknown operation %u", (void *)priv, filter_op);
+		ret = EINVAL;
+		break;
+	}
+	return ret;
+}
+
+/**
+ * Manage filter operations.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param filter_type
+ *   Filter type.
+ * @param filter_op
+ *   Operation to perform.
+ * @param arg
+ *   Pointer to operation-specific structure.
+ *
+ * @return
+ *   0 on success, negative errno value on failure.
+ */
+int
+mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
+		     enum rte_filter_type filter_type,
+		     enum rte_filter_op filter_op,
+		     void *arg)
+{
+	int ret = -EINVAL;
+	struct priv *priv = dev->data->dev_private;
+
+	switch (filter_type) {
+	case RTE_ETH_FILTER_FDIR:
+		priv_lock(priv);
+		ret = -priv_fdir_ctrl_func(priv, filter_op, arg);
+		priv_unlock(priv);
+		break;
+	default:
+		ERROR("%p: filter type (%d) not supported",
+		      (void *)dev, filter_type);
+		break;
+	}
+
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 36910b2..093f4e5 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /* Initialization data for hash RX queues. */
@@ -242,7 +243,12 @@ priv_flow_attr(struct priv *priv, struct ibv_exp_flow_attr *flow_attr,
 	init = &hash_rxq_init[type];
 	*flow_attr = (struct ibv_exp_flow_attr){
 		.type = IBV_EXP_FLOW_ATTR_NORMAL,
+#ifdef MLX5_FDIR_SUPPORT
+		/* Priorities < 3 are reserved for flow director. */
+		.priority = init->flow_priority + 3,
+#else /* MLX5_FDIR_SUPPORT */
 		.priority = init->flow_priority,
+#endif /* MLX5_FDIR_SUPPORT */
 		.num_of_specs = 0,
 		.port = priv->port,
 		.flags = 0,
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index c42bb8d..b2f72f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -93,6 +93,12 @@ struct rxq_elt {
 	struct rte_mbuf *buf; /* SGE buffer. */
 };
 
+/* Flow director queue structure. */
+struct fdir_queue {
+	struct ibv_qp *qp; /* Associated RX QP. */
+	struct ibv_exp_rwq_ind_table *ind_table; /* Indirection table. */
+};
+
 struct priv;
 
 /* RX queue descriptor. */
@@ -118,6 +124,7 @@ struct rxq {
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
+	struct fdir_queue fdir_queue; /* Flow director queue. */
 };
 
 /* Hash RX queue types. */
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 90b8068..db7890f 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -87,6 +87,8 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 		priv_mac_addrs_disable(priv);
 		priv_destroy_hash_rxqs(priv);
 	}
+	if (dev->data->dev_conf.fdir_conf.mode != RTE_FDIR_MODE_NONE)
+		priv_fdir_enable(priv);
 	priv_dev_interrupt_handler_install(priv, dev);
 	priv_unlock(priv);
 	return -err;
@@ -117,6 +119,7 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	priv_special_flow_disable(priv, HASH_RXQ_FLOW_TYPE_PROMISC);
 	priv_mac_addrs_disable(priv);
 	priv_destroy_hash_rxqs(priv);
+	priv_fdir_disable(priv);
 	priv_dev_interrupt_handler_uninstall(priv, dev);
 	priv->started = 0;
 	priv_unlock(priv);
-- 
2.1.4
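
Reader's note on the overlap check in the patch above: priv_fdir_overlap()
treats two descriptors as overlapping when they are identical on every bit
selected by the configured mask, so that two rules can never match the same
packet. The following is a minimal standalone sketch of that idea for the
IPv4 case only. The sketch_* names and the flattened structures are
hypothetical stand-ins and not part of the driver; the real code also
compares the queue type, MAC bytes and IPv6 addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct fdir_flow_desc. */
struct sketch_desc {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Hypothetical stand-in for the flow director mask (set bit = compared). */
struct sketch_mask {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Return nonzero if a and b are identical on every masked bit. */
static int
sketch_overlap(const struct sketch_mask *mask,
	       const struct sketch_desc *a,
	       const struct sketch_desc *b)
{
	assert(mask != NULL && a != NULL && b != NULL);
	if ((a->src_ip & mask->src_ip) != (b->src_ip & mask->src_ip))
		return 0;
	if ((a->dst_ip & mask->dst_ip) != (b->dst_ip & mask->dst_ip))
		return 0;
	if ((a->src_port & mask->src_port) != (b->src_port & mask->src_port))
		return 0;
	if ((a->dst_port & mask->dst_port) != (b->dst_port & mask->dst_port))
		return 0;
	/* No masked field differs: both rules would match the same packets. */
	return 1;
}
```

With a mask that ignores the source port and the low byte of the source
address, two descriptors differing only in those bits still overlap, which
is the case where the patch refuses the second rule with EEXIST.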

* [dpdk-dev] [PATCH v3 5/5] mlx5: add support for RX VLAN stripping
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                       ` (3 preceding siblings ...)
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for flow director Adrien Mazarguil
@ 2016-03-03 14:26     ` Adrien Mazarguil
  2016-03-09 16:11     ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
  5 siblings, 0 replies; 26+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:26 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
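
Reader's note, not part of the commit: the message above says the hardware
strips the 802.1Q header and reports it through the mbuf. As a rough
illustration of what that offload saves the CPU from doing, here is a
standalone software sketch of the same operation on a raw frame buffer.
The sketch_* names are hypothetical and this is not the driver's code;
with the offload enabled, the application would be expected to find the
tag in the mbuf's vlan_tci field instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_TYPE_VLAN 0x8100 /* 802.1Q TPID. */

/*
 * Strip the 802.1Q tag from frame in place.
 *
 * Store the TCI in *tci and return the new length, or return len
 * unchanged (leaving *tci untouched) when the frame carries no tag.
 */
static size_t
sketch_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	assert(frame != NULL && tci != NULL);
	/* Two MAC addresses (12 bytes), TPID, TCI and inner EtherType. */
	if (len < 18)
		return len;
	if (((frame[12] << 8) | frame[13]) != SKETCH_ETHER_TYPE_VLAN)
		return len;
	*tci = (uint16_t)((frame[14] << 8) | frame[15]);
	/* Close the 4-byte gap left by TPID and TCI. */
	memmove(&frame[12], &frame[16], len - 16);
	return len - 4;
}
```

The MAC addresses stay where they are; only the bytes following the tag
are moved, so the inner EtherType ends up at offset 12 as in an untagged
frame.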
 doc/guides/nics/mlx5.rst               |   2 +
 doc/guides/rel_notes/release_16_04.rst |   6 ++
 drivers/net/mlx5/mlx5.c                |  16 ++++-
 drivers/net/mlx5/mlx5.h                |   3 +
 drivers/net/mlx5/mlx5_rxq.c            |  17 +++++-
 drivers/net/mlx5/mlx5_rxtx.c           |  27 +++++++++
 drivers/net/mlx5/mlx5_rxtx.h           |   5 ++
 drivers/net/mlx5/mlx5_vlan.c           | 104 +++++++++++++++++++++++++++++++++
 8 files changed, 178 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 6bd452e..edfbf1f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -83,6 +83,7 @@ Features
 - Configurable RETA table.
 - Support for multiple MAC addresses.
 - VLAN filtering.
+- RX VLAN stripping.
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
@@ -229,6 +230,7 @@ Currently supported by DPDK:
     above only:
 
     - Flow director.
+    - RX VLAN stripping.
 
 - Minimum firmware version:
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index c6c76d6..c69e55e 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -81,6 +81,12 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: RX VLAN stripping support.**
+
+  Added support for RX VLAN stripping.
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 43e24ff..575420e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -171,6 +171,10 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.mac_addr_add = mlx5_mac_addr_add,
 	.mac_addr_set = mlx5_mac_addr_set,
 	.mtu_set = mlx5_dev_set_mtu,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	.vlan_strip_queue_set = mlx5_vlan_strip_queue_set,
+	.vlan_offload_set = mlx5_vlan_offload_set,
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	.reta_update = mlx5_dev_rss_reta_update,
 	.reta_query = mlx5_dev_rss_reta_query,
 	.rss_hash_update = mlx5_rss_hash_update,
@@ -325,7 +329,11 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #ifdef HAVE_EXP_QUERY_DEVICE
 		exp_device_attr.comp_mask =
 			IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS |
-			IBV_EXP_DEVICE_ATTR_RX_HASH;
+			IBV_EXP_DEVICE_ATTR_RX_HASH |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0;
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
 		DEBUG("using port %u (%08" PRIx32 ")", port, test);
@@ -396,6 +404,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 		DEBUG("maximum RX indirection table size is %u",
 		      priv->ind_table_max_size);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		priv->hw_vlan_strip = !!(exp_device_attr.wq_vlan_offloads_cap &
+					 IBV_EXP_RECEIVE_WQ_CVLAN_STRIP);
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+		DEBUG("VLAN stripping is %ssupported",
+		      (priv->hw_vlan_strip ? "" : "not "));
 
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8019ee3..8442016 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -101,6 +101,7 @@ struct priv {
 	unsigned int allmulti_req:1; /* All multicast mode requested. */
 	unsigned int hw_csum:1; /* Checksum offload is supported. */
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
@@ -211,6 +212,8 @@ void mlx5_stats_reset(struct rte_eth_dev *);
 /* mlx5_vlan.c */
 
 int mlx5_vlan_filter_set(struct rte_eth_dev *, uint16_t, int);
+void mlx5_vlan_offload_set(struct rte_eth_dev *, int);
+void mlx5_vlan_strip_queue_set(struct rte_eth_dev *, uint16_t, int);
 
 /* mlx5_trigger.c */
 
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 093f4e5..573ad8f 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1224,6 +1224,8 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      priv->device_attr.max_qp_wr);
 	DEBUG("priv->device_attr.max_sge is %d",
 	      priv->device_attr.max_sge);
+	/* Configure VLAN stripping. */
+	tmpl.vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
 	attr.wq = (struct ibv_exp_wq_init_attr){
 		.wq_context = NULL, /* Could be useful in the future. */
 		.wq_type = IBV_EXP_WQT_RQ,
@@ -1238,8 +1240,18 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				 MLX5_PMD_SGE_WR_N),
 		.pd = priv->pd,
 		.cq = tmpl.cq,
-		.comp_mask = IBV_EXP_CREATE_WQ_RES_DOMAIN,
+		.comp_mask =
+			IBV_EXP_CREATE_WQ_RES_DOMAIN |
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+			IBV_EXP_CREATE_WQ_VLAN_OFFLOADS |
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+			0,
 		.res_domain = tmpl.rd,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		.vlan_offloads = (tmpl.vlan_strip ?
+				  IBV_EXP_RECEIVE_WQ_CVLAN_STRIP :
+				  0),
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
@@ -1262,6 +1274,9 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	DEBUG("%p: RTE port ID: %u", (void *)rxq, tmpl.port_id);
 	attr.params = (struct ibv_exp_query_intf_params){
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		.intf_version = 1,
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		.intf = IBV_EXP_INTF_CQ,
 		.obj = tmpl.cq,
 	};
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fa5e648..7585570 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -62,6 +62,7 @@
 #include "mlx5.h"
 #include "mlx5_utils.h"
 #include "mlx5_rxtx.h"
+#include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
 
 /**
@@ -713,12 +714,19 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		unsigned int seg_headroom = RTE_PKTMBUF_HEADROOM;
 		unsigned int j = 0;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(elts_head < rxq->elts_n);
 		assert(rxq->elts_head < rxq->elts_n);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -840,6 +848,12 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		PKT_LEN(pkt_buf) = pkt_buf_len;
 		pkt_buf->packet_type = rxq_cq_to_pkt_type(flags);
 		pkt_buf->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			pkt_buf->ol_flags |= PKT_RX_VLAN_PKT;
+			pkt_buf->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = pkt_buf;
@@ -910,6 +924,7 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		struct rte_mbuf *seg = elt->buf;
 		struct rte_mbuf *rep;
 		uint32_t flags;
+		uint16_t vlan_tci;
 
 		/* Sanity checks. */
 		assert(seg != NULL);
@@ -921,8 +936,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		 */
 		rte_prefetch0(seg);
 		rte_prefetch0(&seg->cacheline1);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		ret = rxq->if_cq->poll_length_flags_cvlan(rxq->cq, NULL, NULL,
+							  &flags, &vlan_tci);
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		ret = rxq->if_cq->poll_length_flags(rxq->cq, NULL, NULL,
 						    &flags);
+		(void)vlan_tci;
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 		if (unlikely(ret < 0)) {
 			struct ibv_wc wc;
 			int wcs_n;
@@ -989,6 +1010,12 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		DATA_LEN(seg) = len;
 		seg->packet_type = rxq_cq_to_pkt_type(flags);
 		seg->ol_flags = rxq_cq_to_ol_flags(rxq, flags);
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+		if (flags & IBV_EXP_CQ_RX_CVLAN_STRIPPED_V1) {
+			seg->ol_flags |= PKT_RX_VLAN_PKT;
+			seg->vlan_tci = vlan_tci;
+		}
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 
 		/* Return packet. */
 		*(pkts++) = seg;
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index b2f72f8..fde0ca2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -109,7 +109,11 @@ struct rxq {
 	struct ibv_cq *cq; /* Completion Queue. */
 	struct ibv_exp_wq *wq; /* Work Queue. */
 	struct ibv_exp_wq_family *if_wq; /* WQ burst interface. */
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_cq_family_v1 *if_cq; /* CQ interface. */
+#else /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	unsigned int port_id; /* Port ID for incoming packets. */
 	unsigned int elts_n; /* (*elts)[] length. */
 	unsigned int elts_head; /* Current index in (*elts)[]. */
@@ -120,6 +124,7 @@ struct rxq {
 	unsigned int sp:1; /* Use scattered RX elements. */
 	unsigned int csum:1; /* Enable checksum offloading. */
 	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
+	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
 	uint32_t mb_len; /* Length of a mp-issued mbuf. */
 	struct mlx5_rxq_stats stats; /* RX queue counters. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_vlan.c b/drivers/net/mlx5/mlx5_vlan.c
index 3a07ad1..fa9e3b8 100644
--- a/drivers/net/mlx5/mlx5_vlan.c
+++ b/drivers/net/mlx5/mlx5_vlan.c
@@ -48,6 +48,7 @@
 
 #include "mlx5_utils.h"
 #include "mlx5.h"
+#include "mlx5_autoconf.h"
 
 /**
  * Configure a VLAN filter.
@@ -127,3 +128,106 @@ mlx5_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 	assert(ret >= 0);
 	return -ret;
 }
+
+/**
+ * Set/reset VLAN stripping for a specific queue.
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param idx
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+static void
+priv_vlan_strip_queue_set(struct priv *priv, uint16_t idx, int on)
+{
+	struct rxq *rxq = (*priv->rxqs)[idx];
+#ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
+	struct ibv_exp_wq_attr mod;
+	uint16_t vlan_offloads =
+		(on ? IBV_EXP_RECEIVE_WQ_CVLAN_STRIP : 0) |
+		0;
+	int err;
+
+	DEBUG("set VLAN offloads 0x%x for port %d queue %d",
+	      vlan_offloads, rxq->port_id, idx);
+	mod = (struct ibv_exp_wq_attr){
+		.attr_mask = IBV_EXP_WQ_ATTR_VLAN_OFFLOADS,
+		.vlan_offloads = vlan_offloads,
+	};
+
+	err = ibv_exp_modify_wq(rxq->wq, &mod);
+	if (err) {
+		ERROR("%p: failed to modify VLAN stripping mode: %s",
+		      (void *)priv, strerror(err));
+		return;
+	}
+
+#endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+
+	/* Update related bits in RX queue. */
+	rxq->vlan_strip = !!on;
+}
+
+/**
+ * Callback to set/reset VLAN stripping for a specific queue.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param queue
+ *   RX queue index.
+ * @param on
+ *   Enable/disable VLAN stripping.
+ */
+void
+mlx5_vlan_strip_queue_set(struct rte_eth_dev *dev, uint16_t queue, int on)
+{
+	struct priv *priv = dev->data->dev_private;
+
+	/* Validate hw support */
+	if (!priv->hw_vlan_strip) {
+		ERROR("VLAN stripping is not supported");
+		return;
+	}
+
+	/* Validate queue number */
+	if (queue >= priv->rxqs_n) {
+		ERROR("VLAN stripping, invalid queue number %d", queue);
+		return;
+	}
+
+	priv_lock(priv);
+	priv_vlan_strip_queue_set(priv, queue, on);
+	priv_unlock(priv);
+}
+
+/**
+ * Callback to set/reset VLAN offloads for a port.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mask
+ *   VLAN offload bit mask.
+ */
+void
+mlx5_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+	struct priv *priv = dev->data->dev_private;
+	unsigned int i;
+
+	if (mask & ETH_VLAN_STRIP_MASK) {
+		int hw_vlan_strip = dev->data->dev_conf.rxmode.hw_vlan_strip;
+
+		if (!priv->hw_vlan_strip) {
+			ERROR("VLAN stripping is not supported");
+			return;
+		}
+
+		/* Run on every RX queue and set/reset VLAN stripping. */
+		priv_lock(priv);
+		for (i = 0; (i != priv->rxqs_n); i++)
+			priv_vlan_strip_queue_set(priv, i, hw_vlan_strip);
+		priv_unlock(priv);
+	}
+}
-- 
2.1.4


* Re: [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support
  2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
                       ` (4 preceding siblings ...)
  2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
@ 2016-03-09 16:11     ` Bruce Richardson
  5 siblings, 0 replies; 26+ messages in thread
From: Bruce Richardson @ 2016-03-09 16:11 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Thu, Mar 03, 2016 at 03:26:39PM +0100, Adrien Mazarguil wrote:
> To preserve compatibility with Mellanox OFED 3.1, flow director and RX VLAN
> stripping code is only enabled if compiled with 3.2.
> 
> Changes in v3:
> - Fixed flow registration issue caused by missing masks in flow rules.
> - Fixed packet duplication with overlapping FDIR rules.
> - Added FDIR flush command support.
> - Updated Mellanox OFED prerequisite to 3.2-2.0.0.0.
> 
> Changes in v2:
> - Rebased patchset on top of dpdk-next-net/rel_16_04.
> - Fixed trivial compilation warnings (positive errnos are left on purpose).
> - Updated documentation and release notes for flow director and RX VLAN
>   stripping features.
> - Fixed missing Mellanox OFED >= 3.2 check for CQ family query interface
>   version.
> 
> Yaacov Hazan (5):
>   mlx5: refactor special flows handling
>   mlx5: add special flows (broadcast and IPv6 multicast)
>   mlx5: make flow steering rule generator more generic
>   mlx5: add support for flow director
>   mlx5: add support for RX VLAN stripping
> 
Applied to dpdk-next-net/rel_16_04

Thanks,
/Bruce


end of thread, other threads:[~2016-03-09 16:11 UTC | newest]

Thread overview: 26+ messages
2016-01-29 10:31 [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
2016-01-29 10:31 ` [dpdk-dev] [PATCH 1/5] mlx5: refactor special flows handling Adrien Mazarguil
2016-01-29 10:31 ` [dpdk-dev] [PATCH 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
2016-01-29 10:32 ` [dpdk-dev] [PATCH 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
2016-01-29 10:32 ` [dpdk-dev] [PATCH 4/5] mlx5: add support for flow director Adrien Mazarguil
2016-02-17 17:13   ` Bruce Richardson
2016-02-18 16:10     ` Adrien Mazarguil
2016-02-23 15:13       ` Bruce Richardson
2016-02-23 17:14         ` Thomas Monjalon
2016-02-23 17:38           ` Adrien Mazarguil
2016-01-29 10:32 ` [dpdk-dev] [PATCH 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
2016-02-17 17:14 ` [dpdk-dev] [PATCH 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
2016-02-18 16:27   ` Adrien Mazarguil
2016-02-22 18:02 ` [dpdk-dev] [PATCH v2 " Adrien Mazarguil
2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 1/5] mlx5: refactor special flows handling Adrien Mazarguil
2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for flow director Adrien Mazarguil
2016-02-22 18:02   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
2016-03-03 14:26   ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Adrien Mazarguil
2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 1/5] mlx5: refactor special flows handling Adrien Mazarguil
2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 2/5] mlx5: add special flows (broadcast and IPv6 multicast) Adrien Mazarguil
2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 3/5] mlx5: make flow steering rule generator more generic Adrien Mazarguil
2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for flow director Adrien Mazarguil
2016-03-03 14:26     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add support for RX VLAN stripping Adrien Mazarguil
2016-03-09 16:11     ` [dpdk-dev] [PATCH v3 0/5] Add flow director and RX VLAN stripping support Bruce Richardson
