DPDK patches and discussions
* [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5
@ 2016-02-22 18:19 Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 1/4] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:19 UTC
  To: dev

This patchset adds to mlx5 a few features available in mlx4 (TX from
secondary processes) or provided by Verbs (support for HW packet padding,
TX VLAN insertion).

Release notes and documentation are updated accordingly.

Note: should be applied after "Assorted fixes for mlx4 and mlx5".

Olga Shern (1):
  mlx5: add support for HW packet padding

Or Ami (2):
  mlx5: add callbacks to support link (up / down) changes
  mlx5: allow operation in secondary processes

Yaacov Hazan (1):
  mlx5: add VLAN insertion offload

 config/common_linuxapp                 |   1 +
 doc/guides/nics/mlx5.rst               |  25 ++-
 doc/guides/rel_notes/release_16_04.rst |  17 ++
 drivers/net/mlx5/Makefile              |  14 ++
 drivers/net/mlx5/mlx5.c                |  63 ++++++-
 drivers/net/mlx5/mlx5.h                |  18 ++
 drivers/net/mlx5/mlx5_defs.h           |   9 +
 drivers/net/mlx5/mlx5_ethdev.c         | 299 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  56 ++++++
 drivers/net/mlx5/mlx5_rxtx.c           | 109 ++++++++++--
 drivers/net/mlx5/mlx5_rxtx.h           |  21 +++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  65 ++++++-
 16 files changed, 683 insertions(+), 40 deletions(-)

-- 
2.1.4


* [dpdk-dev] [PATCH 1/4] mlx5: add callbacks to support link (up / down) changes
  2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
@ 2016-02-22 18:19 ` Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 2/4] mlx5: allow operation in secondary processes Adrien Mazarguil
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:19 UTC
  To: dev

From: Or Ami <ora@mellanox.com>

Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.
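
The effect on burst functions can be modelled outside the driver. The sketch
below is not mlx5 code: struct fake_dev, normal_burst, removed_burst and
fake_set_link are hypothetical stand-ins for the device structure, the regular
burst functions, the "removed" stubs and priv_set_link(). It only shows the
pointer swap that makes a downed link a no-op for the data path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst callback type (the real ones take rte_mbuf arrays). */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t n);

/* Hypothetical stand-in for the device: only the two burst pointers. */
struct fake_dev {
	burst_fn rx_pkt_burst;
	burst_fn tx_pkt_burst;
};

/* Regular burst: pretends every packet was handled. */
uint16_t
normal_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Stub installed while the link is down: handles nothing. */
uint16_t
removed_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/* Swap both burst pointers according to the requested link state. */
void
fake_set_link(struct fake_dev *dev, int up)
{
	assert(dev != NULL);
	dev->rx_pkt_burst = up ? normal_burst : removed_burst;
	dev->tx_pkt_burst = up ? normal_burst : removed_burst;
}
```

An application that keeps calling the burst pointers after
fake_set_link(dev, 0) simply gets 0 packets back until the link is brought up
again.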

Signed-off-by: Or Ami <ora@mellanox.com>
---
 doc/guides/rel_notes/release_16_04.rst |  4 ++
 drivers/net/mlx5/mlx5.c                |  2 +
 drivers/net/mlx5/mlx5.h                |  2 +
 drivers/net/mlx5/mlx5_ethdev.c         | 85 ++++++++++++++++++++++++++++++++++
 4 files changed, 93 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 4750205..cbb736f 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -79,6 +79,10 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: Added link up/down callbacks.**
+
+  Implemented callbacks to bring link up and down.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ad69ec2..14ac4ba 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -148,6 +148,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.dev_configure = mlx5_dev_configure,
 	.dev_start = mlx5_dev_start,
 	.dev_stop = mlx5_dev_stop,
+	.dev_set_link_down = mlx5_set_link_down,
+	.dev_set_link_up = mlx5_set_link_up,
 	.dev_close = mlx5_dev_close,
 	.promiscuous_enable = mlx5_promiscuous_enable,
 	.promiscuous_disable = mlx5_promiscuous_disable,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 43b24fb..9a3f240 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -168,6 +168,8 @@ void mlx5_dev_link_status_handler(void *);
 void mlx5_dev_interrupt_handler(struct rte_intr_handle *, void *);
 void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
+int mlx5_set_link_down(struct rte_eth_dev *dev);
+int mlx5_set_link_up(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6704382..f609e0f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -968,3 +968,88 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 					   dev);
 	}
 }
+
+/**
+ * Change the link state (UP / DOWN).
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param up
+ *   Nonzero for link up, otherwise link down.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_set_link(struct priv *priv, int up)
+{
+	struct rte_eth_dev *dev = priv->dev;
+	int err;
+	unsigned int i;
+
+	if (up) {
+		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
+		if (err)
+			return err;
+		for (i = 0; i < priv->rxqs_n; i++)
+			if ((*priv->rxqs)[i]->sp)
+				break;
+		/* Check if an sp queue exists.
+		 * Note: Some old frames might be received.
+		 */
+		if (i == priv->rxqs_n)
+			dev->rx_pkt_burst = mlx5_rx_burst;
+		else
+			dev->rx_pkt_burst = mlx5_rx_burst_sp;
+		dev->tx_pkt_burst = mlx5_tx_burst;
+	} else {
+		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
+		if (err)
+			return err;
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+	}
+	return 0;
+}
+
+/**
+ * DPDK callback to bring the link DOWN.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_down(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 0);
+	priv_unlock(priv);
+	return err;
+}
+
+/**
+ * DPDK callback to bring the link UP.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_up(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 1);
+	priv_unlock(priv);
+	return err;
+}
-- 
2.1.4


* [dpdk-dev] [PATCH 2/4] mlx5: allow operation in secondary processes
  2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 1/4] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
@ 2016-02-22 18:19 ` Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 3/4] mlx5: add support for HW packet padding Adrien Mazarguil
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:19 UTC
  To: dev

From: Or Ami <ora@mellanox.com>

Secondary processes are expected to use queues and other resources
allocated by the primary. However, Verbs resources can only be shared
between processes when inherited through fork().

This limitation can be worked around for TX by configuring separate queues
from secondary processes.
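
The workaround relies on a one-time setup performed from the data path: the
first TX burst issued by a secondary process creates its own queues, then
hands over to the regular burst function. The following is a minimal
self-contained model of that trampoline, not the driver code. struct
fake_port, fake_tx_burst and fake_tx_burst_setup are hypothetical names, and
queue creation is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fake_port;
typedef uint16_t (*tx_fn)(struct fake_port *port, uint16_t n);

/* Hypothetical per-process port state. */
struct fake_port {
	tx_fn tx_pkt_burst;       /* Current TX callback. */
	int local_queues_ready;   /* Nonzero once local queues exist. */
	unsigned int setup_calls; /* How many times setup actually ran. */
};

/* Regular TX burst: only valid once local queues exist. */
uint16_t
fake_tx_burst(struct fake_port *port, uint16_t n)
{
	assert(port->local_queues_ready);
	return n;
}

/* Initial TX callback: set up local queues once, then delegate. */
uint16_t
fake_tx_burst_setup(struct fake_port *port, uint16_t n)
{
	assert(port != NULL);
	if (!port->local_queues_ready) {
		/* Stands in for creating separate queues in this process. */
		port->setup_calls++;
		port->local_queues_ready = 1;
	}
	/* Later bursts bypass the setup path entirely. */
	port->tx_pkt_burst = fake_tx_burst;
	return port->tx_pkt_burst(port, n);
}
```

Installing fake_tx_burst_setup as the initial callback keeps the setup cost
out of every burst except the first one.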

Signed-off-by: Or Ami <ora@mellanox.com>
---
 doc/guides/nics/mlx5.rst               |   3 +-
 doc/guides/rel_notes/release_16_04.rst |   4 +
 drivers/net/mlx5/mlx5.c                |  42 +++++--
 drivers/net/mlx5/mlx5.h                |  12 ++
 drivers/net/mlx5/mlx5_ethdev.c         | 202 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  46 ++++++++
 drivers/net/mlx5/mlx5_rxtx.h           |   8 ++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  50 +++++++-
 12 files changed, 378 insertions(+), 15 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 8422206..df07146 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -88,6 +88,7 @@ Features
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
 - Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+- Secondary process TX is supported.
 
 Limitations
 -----------
@@ -96,7 +97,7 @@ Limitations
 - Inner RSS for VXLAN frames is not supported yet.
 - Port statistics through software counters only.
 - Hardware checksum offloads for VXLAN inner header are not supported yet.
-- Secondary processes are not supported yet.
+- Secondary process RX is not supported.
 
 Configuration
 -------------
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index cbb736f..08f7592 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -83,6 +83,10 @@ This section should contain new features added in this release. Sample format:
 
   Implemented callbacks to bring link up and down.
 
+* **mlx5: Added support for operation in secondary processes.**
+
+  Implemented TX support in secondary processes (like mlx4).
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 14ac4ba..998e6f0 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -78,7 +78,7 @@
 static void
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	void *tmp;
 	unsigned int i;
 
@@ -483,18 +483,44 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			goto port_error;
 		}
 
-		eth_dev->data->dev_private = priv;
-		eth_dev->pci_dev = pci_dev;
+		/* Secondary processes have to use local storage for their
+		 * private data as well as a copy of eth_dev->data, but this
+		 * pointer must not be modified before burst functions are
+		 * actually called. */
+		if (mlx5_is_secondary()) {
+			struct mlx5_secondary_data *sd =
+				&mlx5_secondary_data[eth_dev->data->port_id];
+			sd->primary_priv = eth_dev->data->dev_private;
+			if (sd->primary_priv == NULL) {
+				ERROR("no private data for port %u",
+						eth_dev->data->port_id);
+				err = EINVAL;
+				goto port_error;
+			}
+			sd->shared_dev_data = eth_dev->data;
+			rte_spinlock_init(&sd->lock);
+			memcpy(sd->data.name, sd->shared_dev_data->name,
+				   sizeof(sd->data.name));
+			sd->data.dev_private = priv;
+			sd->data.rx_mbuf_alloc_failed = 0;
+			sd->data.mtu = ETHER_MTU;
+			sd->data.port_id = sd->shared_dev_data->port_id;
+			sd->data.mac_addrs = priv->mac;
+			eth_dev->tx_pkt_burst = mlx5_tx_burst_secondary_setup;
+			eth_dev->rx_pkt_burst = mlx5_rx_burst_secondary_setup;
+		} else {
+			eth_dev->data->dev_private = priv;
+			eth_dev->data->rx_mbuf_alloc_failed = 0;
+			eth_dev->data->mtu = ETHER_MTU;
+			eth_dev->data->mac_addrs = priv->mac;
+		}
 
+		eth_dev->pci_dev = pci_dev;
 		rte_eth_copy_pci_info(eth_dev, pci_dev);
-
 		eth_dev->driver = &mlx5_driver;
-		eth_dev->data->rx_mbuf_alloc_failed = 0;
-		eth_dev->data->mtu = ETHER_MTU;
-
 		priv->dev = eth_dev;
 		eth_dev->dev_ops = &mlx5_dev_ops;
-		eth_dev->data->mac_addrs = priv->mac;
+
 		TAILQ_INIT(&eth_dev->link_intr_cbs);
 
 		/* Bring Ethernet device up. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9a3f240..bad9283 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,7 @@
 #include <rte_ethdev.h>
 #include <rte_spinlock.h>
 #include <rte_interrupts.h>
+#include <rte_errno.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -125,6 +126,14 @@ struct priv {
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
+/* Local storage for secondary process data. */
+struct mlx5_secondary_data {
+	struct rte_eth_dev_data data; /* Local device data. */
+	struct priv *primary_priv; /* Private structure from primary. */
+	struct rte_eth_dev_data *shared_dev_data; /* Shared device data. */
+	rte_spinlock_t lock; /* Port configuration lock. */
+} mlx5_secondary_data[RTE_MAX_ETHPORTS];
+
 /**
  * Lock private structure to protect it from concurrent access in the
  * control path.
@@ -152,6 +161,8 @@ priv_unlock(struct priv *priv)
 
 /* mlx5_ethdev.c */
 
+struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
+int mlx5_is_secondary(void);
 int priv_get_ifname(const struct priv *, char (*)[IF_NAMESIZE]);
 int priv_ifreq(const struct priv *, int req, struct ifreq *);
 int priv_get_mtu(struct priv *, uint16_t *);
@@ -170,6 +181,7 @@ void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+struct priv *mlx5_secondary_data_setup(struct priv *priv);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f609e0f..6b674a2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -59,6 +59,7 @@
 #include <rte_common.h>
 #include <rte_interrupts.h>
 #include <rte_alarm.h>
+#include <rte_malloc.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -68,6 +69,38 @@
 #include "mlx5_utils.h"
 
 /**
+ * Return private structure associated with an Ethernet device.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Pointer to private structure.
+ */
+struct priv *
+mlx5_get_priv(struct rte_eth_dev *dev)
+{
+	struct mlx5_secondary_data *sd;
+
+	if (!mlx5_is_secondary())
+		return dev->data->dev_private;
+	sd = &mlx5_secondary_data[dev->data->port_id];
+	return sd->data.dev_private;
+}
+
+/**
+ * Check if running as a secondary process.
+ *
+ * @return
+ *   Nonzero if running as a secondary process.
+ */
+inline int
+mlx5_is_secondary(void)
+{
+	return rte_eal_process_type() != RTE_PROC_PRIMARY;
+}
+
+/**
  * Get interface name from private structure.
  *
  * @param[in] priv
@@ -464,6 +497,9 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	ret = dev_configure(dev);
 	assert(ret >= 0);
@@ -482,7 +518,7 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 void
 mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	unsigned int max;
 	char ifname[IF_NAMESIZE];
 
@@ -536,7 +572,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 static int
 mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
 		.cmd = ETHTOOL_GSET
 	};
@@ -585,7 +621,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 int
 mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	int ret;
 
 	priv_lock(priv);
@@ -620,6 +656,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 	uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
 		mlx5_rx_burst;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	/* Set kernel interface MTU first. */
 	if (priv_set_mtu(priv, mtu)) {
@@ -694,6 +733,9 @@ mlx5_dev_get_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	priv_lock(priv);
 	if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
@@ -742,6 +784,9 @@ mlx5_dev_set_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	ethpause.autoneg = fc_conf->autoneg;
 	if (((fc_conf->mode & RTE_FC_FULL) == RTE_FC_FULL) ||
@@ -1053,3 +1098,154 @@ mlx5_set_link_up(struct rte_eth_dev *dev)
 	priv_unlock(priv);
 	return err;
 }
+
+/**
+ * Configure secondary process queues from a private data pointer (primary
+ * or secondary) and update burst callbacks. Can take place only once.
+ *
+ * All queues must have been previously created by the primary process to
+ * avoid undefined behavior.
+ *
+ * @param priv
+ *   Private data pointer from either primary or secondary process.
+ *
+ * @return
+ *   Private data pointer from secondary process, NULL in case of error.
+ */
+struct priv *
+mlx5_secondary_data_setup(struct priv *priv)
+{
+	unsigned int port_id = 0;
+	struct mlx5_secondary_data *sd;
+	void **tx_queues;
+	void **rx_queues;
+	unsigned int nb_tx_queues;
+	unsigned int nb_rx_queues;
+	unsigned int i;
+
+	/* priv must be valid at this point. */
+	assert(priv != NULL);
+	/* priv->dev must also be valid but may point to local memory from
+	 * another process, possibly with the same address and must not
+	 * be dereferenced yet. */
+	assert(priv->dev != NULL);
+	/* Determine port ID by finding out where priv comes from. */
+	while (1) {
+		sd = &mlx5_secondary_data[port_id];
+		rte_spinlock_lock(&sd->lock);
+		/* Primary process? */
+		if (sd->primary_priv == priv)
+			break;
+		/* Secondary process? */
+		if (sd->data.dev_private == priv)
+			break;
+		rte_spinlock_unlock(&sd->lock);
+		if (++port_id == RTE_DIM(mlx5_secondary_data))
+			port_id = 0;
+	}
+	/* Switch to secondary private structure. If private data has already
+	 * been updated by another thread, there is nothing else to do. */
+	priv = sd->data.dev_private;
+	if (priv->dev->data == &sd->data)
+		goto end;
+	/* Sanity checks. Secondary private structure is supposed to point
+	 * to local eth_dev, itself still pointing to the shared device data
+	 * structure allocated by the primary process. */
+	assert(sd->shared_dev_data != &sd->data);
+	assert(sd->data.nb_tx_queues == 0);
+	assert(sd->data.tx_queues == NULL);
+	assert(sd->data.nb_rx_queues == 0);
+	assert(sd->data.rx_queues == NULL);
+	assert(priv != sd->primary_priv);
+	assert(priv->dev->data == sd->shared_dev_data);
+	assert(priv->txqs_n == 0);
+	assert(priv->txqs == NULL);
+	assert(priv->rxqs_n == 0);
+	assert(priv->rxqs == NULL);
+	nb_tx_queues = sd->shared_dev_data->nb_tx_queues;
+	nb_rx_queues = sd->shared_dev_data->nb_rx_queues;
+	/* Allocate local storage for queues. */
+	tx_queues = rte_zmalloc("secondary ethdev->tx_queues",
+				sizeof(sd->data.tx_queues[0]) * nb_tx_queues,
+				RTE_CACHE_LINE_SIZE);
+	rx_queues = rte_zmalloc("secondary ethdev->rx_queues",
+				sizeof(sd->data.rx_queues[0]) * nb_rx_queues,
+				RTE_CACHE_LINE_SIZE);
+	if (tx_queues == NULL || rx_queues == NULL)
+		goto error;
+	/* Lock to prevent control operations during setup. */
+	priv_lock(priv);
+	/* TX queues. */
+	for (i = 0; i != nb_tx_queues; ++i) {
+		struct txq *primary_txq = (*sd->primary_priv->txqs)[i];
+		struct txq *txq;
+
+		if (primary_txq == NULL)
+			continue;
+		txq = rte_calloc_socket("TXQ", 1, sizeof(*txq), 0,
+					primary_txq->socket);
+		if (txq != NULL) {
+			if (txq_setup(priv->dev,
+				      txq,
+				      primary_txq->elts_n * MLX5_PMD_SGE_WR_N,
+				      primary_txq->socket,
+				      NULL) == 0) {
+				txq->stats.idx = primary_txq->stats.idx;
+				tx_queues[i] = txq;
+				continue;
+			}
+			rte_free(txq);
+		}
+		while (i) {
+			txq = tx_queues[--i];
+			txq_cleanup(txq);
+			rte_free(txq);
+		}
+		goto error;
+	}
+	/* RX queues. */
+	for (i = 0; i != nb_rx_queues; ++i) {
+		struct rxq *primary_rxq = (*sd->primary_priv->rxqs)[i];
+
+		if (primary_rxq == NULL)
+			continue;
+		/* Not supported yet. */
+		rx_queues[i] = NULL;
+	}
+	/* Update everything. */
+	priv->txqs = (void *)tx_queues;
+	priv->txqs_n = nb_tx_queues;
+	priv->rxqs = (void *)rx_queues;
+	priv->rxqs_n = nb_rx_queues;
+	sd->data.rx_queues = rx_queues;
+	sd->data.tx_queues = tx_queues;
+	sd->data.nb_rx_queues = nb_rx_queues;
+	sd->data.nb_tx_queues = nb_tx_queues;
+	sd->data.dev_link = sd->shared_dev_data->dev_link;
+	sd->data.mtu = sd->shared_dev_data->mtu;
+	memcpy(sd->data.rx_queue_state, sd->shared_dev_data->rx_queue_state,
+	       sizeof(sd->data.rx_queue_state));
+	memcpy(sd->data.tx_queue_state, sd->shared_dev_data->tx_queue_state,
+	       sizeof(sd->data.tx_queue_state));
+	sd->data.dev_flags = sd->shared_dev_data->dev_flags;
+	/* Use local data from now on. */
+	rte_mb();
+	priv->dev->data = &sd->data;
+	rte_mb();
+	priv->dev->tx_pkt_burst = mlx5_tx_burst;
+	priv->dev->rx_pkt_burst = removed_rx_burst;
+	priv_unlock(priv);
+end:
+	/* More sanity checks. */
+	assert(priv->dev->tx_pkt_burst == mlx5_tx_burst);
+	assert(priv->dev->rx_pkt_burst == removed_rx_burst);
+	assert(priv->dev->data == &sd->data);
+	rte_spinlock_unlock(&sd->lock);
+	return priv;
+error:
+	priv_unlock(priv);
+	rte_free(tx_queues);
+	rte_free(rx_queues);
+	rte_spinlock_unlock(&sd->lock);
+	return NULL;
+}
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index edb05ad..c9cea48 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -209,6 +209,9 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
@@ -474,6 +477,9 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	(void)vmdq;
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 2bc005e..3a55f63 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -396,6 +396,9 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -417,6 +420,9 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 0;
 	ret = priv_rehash_flows(priv);
@@ -438,6 +444,9 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -459,6 +468,9 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 0;
 	ret = priv_rehash_flows(priv);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index df6fd92..3d84f41 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1395,6 +1395,9 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct rxq *rxq = (*priv->rxqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -1453,6 +1456,9 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (rxq == NULL)
 		return;
 	priv = rxq->priv;
@@ -1468,3 +1474,43 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	rte_free(rxq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for RX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal RX burst callback.
+ *
+ * @param dpdk_rxq
+ *   Generic pointer to RX queue structure.
+ * @param[out] pkts
+ *   Array to store received packets.
+ * @param pkts_n
+ *   Maximum number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully received (<= pkts_n).
+ */
+uint16_t
+mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct rxq *rxq = dpdk_rxq;
+	struct priv *priv = mlx5_secondary_data_setup(rxq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->rxqs_n; ++index)
+		if (((*primary_priv->rxqs)[index] == rxq) ||
+		    ((*priv->rxqs)[index] == rxq))
+			break;
+	if (index == priv->rxqs_n)
+		return 0;
+	rxq = (*priv->rxqs)[index];
+	return priv->dev->rx_pkt_burst(rxq, pkts, pkts_n);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6ac1a5a..6a0087e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -309,13 +309,21 @@ int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
 int mlx5_rx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_rxconf *, struct rte_mempool *);
 void mlx5_rx_queue_release(void *);
+uint16_t mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
+
 
 /* mlx5_txq.c */
 
 void txq_cleanup(struct txq *);
+int txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
+	  unsigned int socket, const struct rte_eth_txconf *conf);
+
 int mlx5_tx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_txconf *);
 void mlx5_tx_queue_release(void *);
+uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
 
 /* mlx5_rxtx.c */
 
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 6d1a600..2d3cb51 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -55,7 +55,7 @@
 void
 mlx5_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct rte_eth_stats tmp = {0};
 	unsigned int i;
 	unsigned int idx;
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index b5ca7d4..e9b9a29 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -64,6 +64,9 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int err;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	if (priv->started) {
 		priv_unlock(priv);
@@ -104,6 +107,9 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	if (!priv->started) {
 		priv_unlock(priv);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 3364fca..6700af4 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -255,11 +255,11 @@ txq_cleanup(struct txq *txq)
  * @return
  *   0 on success, errno value on failure.
  */
-static int
+int
 txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	  unsigned int socket, const struct rte_eth_txconf *conf)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct txq tmpl = {
 		.priv = priv,
 		.socket = socket
@@ -464,6 +464,9 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct txq *txq = (*priv->txqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -519,6 +522,9 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (txq == NULL)
 		return;
 	priv = txq->priv;
@@ -534,3 +540,43 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	rte_free(txq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for TX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal TX burst callback.
+ *
+ * @param dpdk_txq
+ *   Generic pointer to TX queue structure.
+ * @param[in] pkts
+ *   Packets to transmit.
+ * @param pkts_n
+ *   Number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully transmitted (<= pkts_n).
+ */
+uint16_t
+mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct txq *txq = dpdk_txq;
+	struct priv *priv = mlx5_secondary_data_setup(txq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->txqs_n; ++index)
+		if (((*primary_priv->txqs)[index] == txq) ||
+		    ((*priv->txqs)[index] == txq))
+			break;
+	if (index == priv->txqs_n)
+		return 0;
+	txq = (*priv->txqs)[index];
+	return priv->dev->tx_pkt_burst(txq, pkts, pkts_n);
+}
-- 
2.1.4


* [dpdk-dev] [PATCH 3/4] mlx5: add support for HW packet padding
  2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 1/4] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 2/4] mlx5: allow operation in secondary processes Adrien Mazarguil
@ 2016-02-22 18:19 ` Adrien Mazarguil
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 4/4] mlx5: add VLAN insertion offload Adrien Mazarguil
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
  4 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:19 UTC
  To: dev

From: Olga Shern <olgas@mellanox.com>

Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.

When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.

In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.

This is disabled by default since this can also decrease performance for
unaligned packet sizes.
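
The size arithmetic behind this trade-off can be sketched as follows. This is
an illustrative model only, assuming a 64-byte cache line and a 4-byte CRC;
bytes_written() is a hypothetical helper, not part of the driver or of Verbs,
and real hardware behavior may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u /* Assumed cache line size in bytes. */
#define CRC_LEN 4u     /* Ethernet CRC length in bytes. */

/*
 * Bytes written for one received packet, given its length including CRC,
 * whether the CRC is stripped, and whether HW padding is enabled.
 */
uint32_t
bytes_written(uint32_t len_with_crc, int crc_strip, int padding)
{
	uint32_t len;

	assert(len_with_crc >= CRC_LEN);
	len = crc_strip ? len_with_crc - CRC_LEN : len_with_crc;
	if (padding) /* Round up to the next multiple of CACHE_LINE. */
		len = (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
	return len;
}
```

With CRC stripping, a 64-byte frame is written as 60 bytes without padding
and as 64 bytes with it. A 70-byte frame goes from 66 to 128 bytes, which is
the kind of unaligned case where padding writes more data.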

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/nics/mlx5.rst               | 14 ++++++++++++++
 doc/guides/rel_notes/release_16_04.rst |  5 +++++
 drivers/net/mlx5/Makefile              |  5 +++++
 drivers/net/mlx5/mlx5.c                | 19 +++++++++++++++++++
 drivers/net/mlx5/mlx5.h                |  4 ++++
 drivers/net/mlx5/mlx5_rxq.c            | 10 ++++++++++
 6 files changed, 57 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index df07146..66fe0d9 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -155,6 +155,20 @@ Environment variables
   lower performance when there is no backpressure, it is not enabled by
   default.
 
+- ``MLX5_PMD_ENABLE_PADDING``
+
+  Enables HW packet padding in PCI bus transactions.
+
+  When packet size is cache aligned and CRC stripping is enabled, 4 fewer
+  bytes are written to the PCI bus. Enabling padding makes such packets
+  aligned again.
+
+  In cases where PCI bandwidth is the bottleneck, padding can improve
+  performance by 10%.
+
+  This is disabled by default since this can also decrease performance for
+  unaligned packet sizes.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 08f7592..a3a30fd 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -87,6 +87,11 @@ This section should contain new features added in this release. Sample format:
 
   Implemented TX support in secondary processes (like mlx4).
 
+* **mlx5: Added optional packet padding by HW.**
+
+  Added an option to make PCI bus transactions rounded to multiple of 64
+  bytes for better cache alignment.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 7076ae3..712c0a9 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -137,6 +137,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CQ_RX_TCP_PACKET \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
+		infiniband/verbs.h \
+		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 998e6f0..8baef28 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -68,6 +68,25 @@
 #include "mlx5_defs.h"
 
 /**
+ * Retrieve integer value from environment variable.
+ *
+ * @param[in] name
+ *   Environment variable name.
+ *
+ * @return
+ *   Integer value, 0 if the variable is not set.
+ */
+int
+mlx5_getenv_int(const char *name)
+{
+	const char *val = getenv(name);
+
+	if (val == NULL)
+		return 0;
+	return atoi(val);
+}
+
+/**
  * DPDK callback to close the device.
  *
  * Destroy all queues and objects, free memory.
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bad9283..436b70b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -159,6 +159,10 @@ priv_unlock(struct priv *priv)
 	rte_spinlock_unlock(&priv->lock);
 }
 
+/* mlx5.c */
+
+int mlx5_getenv_int(const char *);
+
 /* mlx5_ethdev.c */
 
 struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 3d84f41..0efa7a3 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1258,6 +1258,16 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				  0),
 #endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
+
+#ifdef HAVE_EXP_CREATE_WQ_FLAG_RX_END_PADDING
+	if (mlx5_getenv_int("MLX5_PMD_ENABLE_PADDING")) {
+		INFO("%p: packet padding is enabled on queue %p",
+		     (void *)dev, (void *)rxq);
+		attr.wq.flags = IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING;
+		attr.wq.comp_mask |= IBV_EXP_CREATE_WQ_FLAGS;
+	}
+#endif /* HAVE_EXP_CREATE_WQ_FLAG_RX_END_PADDING */
+
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
 		ret = (errno ? errno : EINVAL);
-- 
2.1.4
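
As a rough illustration of the alignment argument in the commit message
above, the standalone sketch below computes how many bytes a received frame
occupies on the PCI bus. It is not part of the patch. The 64-byte rounding
is taken from the release note and the 4-byte CRC from the commit message;
pci_write_len() and its parameters are hypothetical names, and rounding up
to the next multiple of 64 is an assumption about how the HW pads.

```c
#include <stddef.h>

#define CACHE_LINE 64u /* Assumed cache line size (from the release note). */
#define CRC_LEN 4u     /* Ethernet CRC length. */

/*
 * Hypothetical helper, not an mlx5 or Verbs function: number of bytes
 * written over PCI for a frame of wire_len bytes (CRC included).
 */
size_t
pci_write_len(size_t wire_len, int crc_strip, int padding)
{
	size_t len = wire_len;

	/* CRC stripping removes the trailing 4 bytes before the write. */
	if (crc_strip && len >= CRC_LEN)
		len -= CRC_LEN;
	/* Padding rounds the write up to a whole number of cache lines. */
	if (padding)
		len = (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
	return len;
}
```

With CRC stripping, a 128-byte frame is written as 124 bytes without padding
and 128 bytes with it. A 100-byte frame goes from 96 to 128 bytes, which is
the kind of extra traffic that may explain the slowdown mentioned for
unaligned packet sizes.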

* [dpdk-dev] [PATCH 4/4] mlx5: add VLAN insertion offload
  2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
                   ` (2 preceding siblings ...)
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 3/4] mlx5: add support for HW packet padding Adrien Mazarguil
@ 2016-02-22 18:19 ` Adrien Mazarguil
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
  4 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-02-22 18:19 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

VLAN insertion is done in software by the PMD by default unless
CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION is enabled and Verbs provides
support for hardware insertion.

When enabled, this option improves performance when VLAN insertion is
requested. However, ConnectX-4 Lx boards can then no longer take advantage
of multi-packet send optimizations.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 config/common_linuxapp                 |   1 +
 doc/guides/nics/mlx5.rst               |   8 +++
 doc/guides/rel_notes/release_16_04.rst |   4 ++
 drivers/net/mlx5/Makefile              |   9 +++
 drivers/net/mlx5/mlx5_defs.h           |   9 +++
 drivers/net/mlx5/mlx5_ethdev.c         |  12 ++--
 drivers/net/mlx5/mlx5_rxtx.c           | 109 +++++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_rxtx.h           |  13 ++++
 drivers/net/mlx5/mlx5_txq.c            |  15 ++++-
 9 files changed, 155 insertions(+), 25 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 7b5e49f..793d262 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -220,6 +220,7 @@ CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
+CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION=n
 
 #
 # Compile burst-oriented Broadcom PMD driver
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 66fe0d9..1ca1534 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,6 +84,7 @@ Features
 - Support for multiple MAC addresses.
 - VLAN filtering.
 - RX VLAN stripping.
+- TX VLAN insertion.
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
@@ -142,6 +143,13 @@ These options can be modified in the ``.config`` file.
 
   This value is always 1 for RX queues since they use a single MP.
 
+- ``CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION`` (default **n**)
+
+  Use Verbs instead of PMD implementation for VLAN insertion. Disabled by
+  default since it prevents ConnectX-4 Lx adapters from taking advantage of
+  multi-packet send optimizations, otherwise provides better performance
+  when VLAN insertion is requested.
+
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index a3a30fd..906d835 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -92,6 +92,10 @@ This section should contain new features added in this release. Sample format:
   Added an option to make PCI bus transactions rounded to multiple of 64
   bytes for better cache alignment.
 
+* **mlx5: TX VLAN insertion support.**
+
+  Added support for TX VLAN insertion.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 712c0a9..a260c52 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -101,6 +101,10 @@ ifdef CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE
 CFLAGS += -DMLX5_PMD_TX_MP_CACHE=$(CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE)
 endif
 
+ifeq ($(CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION),y)
+CFLAGS += -DMLX5_VERBS_VLAN_INSERTION
+endif
+
 include $(RTE_SDK)/mk/rte.lib.mk
 
 # Generate and clean-up mlx5_autoconf.h.
@@ -142,6 +146,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_VLAN_INSERTION \
+		infiniband/verbs.h \
+		enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 5b00d8e..fb8db2e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -95,4 +95,13 @@
 #define MLX5_FDIR_SUPPORT 1
 #endif
 
+/*
+ * Prevent compilation when HW VLAN insertion is requested by configuration
+ * but not supported by Verbs.
+ */
+#if defined(MLX5_VERBS_VLAN_INSERTION) && !defined(HAVE_VERBS_VLAN_INSERTION)
+#error CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION \
+	enabled in configuration but not supported by libibverbs.
+#endif
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6b674a2..66115d2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -544,12 +544,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		  DEV_RX_OFFLOAD_UDP_CKSUM |
 		  DEV_RX_OFFLOAD_TCP_CKSUM) :
 		 0);
-	info->tx_offload_capa =
-		(priv->hw_csum ?
-		 (DEV_TX_OFFLOAD_IPV4_CKSUM |
-		  DEV_TX_OFFLOAD_UDP_CKSUM |
-		  DEV_TX_OFFLOAD_TCP_CKSUM) :
-		 0);
+	info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+	if (priv->hw_csum)
+		info->tx_offload_capa |=
+			(DEV_TX_OFFLOAD_IPV4_CKSUM |
+			 DEV_TX_OFFLOAD_UDP_CKSUM |
+			 DEV_TX_OFFLOAD_TCP_CKSUM);
 	if (priv_get_ifname(priv, &ifname) == 0)
 		info->if_index = if_nametoindex(ifname);
 	/* FIXME: RETA update/query API expects the callee to know the size of
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 4919189..9fc535e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -333,6 +333,40 @@ txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
 	txq_mp2mr(txq, mp);
 }
 
+#ifndef MLX5_VERBS_VLAN_INSERTION
+
+/**
+ * Insert VLAN to specific packet, using the mbuf's headroom space.
+ *
+ * @param buf
+ *   Buffer to insert the vlan.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static inline int
+insert_vlan_sw(struct rte_mbuf *buf)
+{
+	uintptr_t addr;
+	uint32_t vlan;
+	uint16_t head_room_len = rte_pktmbuf_headroom(buf);
+
+	if (head_room_len < 4)
+		return EINVAL;
+
+	addr = rte_pktmbuf_mtod(buf, uintptr_t);
+	vlan = htonl(0x81000000 | buf->vlan_tci);
+	memmove((void *)(addr - 4), (void *)addr, 12);
+	memcpy((void *)(addr + 8), &vlan, sizeof(vlan));
+
+	SET_DATA_OFF(buf, head_room_len - 4);
+	DATA_LEN(buf) += 4;
+
+	return 0;
+}
+
+#endif /* !MLX5_VERBS_VLAN_INSERTION */
+
 #if MLX5_PMD_SGE_WR_N > 1
 
 /**
@@ -554,6 +588,14 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (RTE_ETH_IS_TUNNEL_PKT(buf->packet_type))
 				send_flags |= IBV_EXP_QP_BURST_TUNNEL;
 		}
+#ifndef MLX5_VERBS_VLAN_INSERTION
+		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
+			err = insert_vlan_sw(buf);
+
+			if (unlikely(err))
+				goto stop;
+		}
+#endif /* !MLX5_VERBS_VLAN_INSERTION */
 		if (likely(segs == 1)) {
 			uintptr_t addr;
 			uint32_t length;
@@ -577,13 +619,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			}
 			/* Put packet into send queue. */
 #if MLX5_PMD_MAX_INLINE > 0
-			if (length <= txq->max_inline)
-				err = txq->send_pending_inline
-					(txq->qp,
-					 (void *)addr,
-					 length,
-					 send_flags);
-			else
+			if (length <= txq->max_inline) {
+#ifdef MLX5_VERBS_VLAN_INSERTION
+				if (buf->ol_flags & PKT_TX_VLAN_PKT)
+					err = txq->send_pending_inline_vlan
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+					err = txq->send_pending_inline
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags);
+			} else
 #endif
 			{
 				/* Retrieve Memory Region key for this
@@ -597,12 +649,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 					elt->buf = NULL;
 					goto stop;
 				}
-				err = txq->send_pending
-					(txq->qp,
-					 addr,
-					 length,
-					 lkey,
-					 send_flags);
+#ifdef MLX5_VERBS_VLAN_INSERTION
+				if (buf->ol_flags & PKT_TX_VLAN_PKT)
+					err = txq->send_pending_vlan
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+					err = txq->send_pending
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags);
 			}
 			if (unlikely(err))
 				goto stop;
@@ -619,11 +682,21 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (ret.length == (unsigned int)-1)
 				goto stop;
 			/* Put SG list into send queue. */
-			err = txq->send_pending_sg_list
-				(txq->qp,
-				 sges,
-				 ret.num,
-				 send_flags);
+#ifdef MLX5_VERBS_VLAN_INSERTION
+			if (buf->ol_flags & PKT_TX_VLAN_PKT)
+				err = txq->send_pending_sg_list_vlan
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags,
+					 &buf->vlan_tci);
+			else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+				err = txq->send_pending_sg_list
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags);
 			if (unlikely(err))
 				goto stop;
 #ifdef MLX5_PMD_SOFT_COUNTERS
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6a0087e..7306d18 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -254,11 +254,20 @@ struct txq {
 	struct priv *priv; /* Back pointer to private data. */
 	int32_t (*poll_cnt)(struct ibv_cq *cq, uint32_t max);
 	int (*send_pending)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_vlan)();
+#endif
 #if MLX5_PMD_MAX_INLINE > 0
 	int (*send_pending_inline)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_inline_vlan)();
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	int (*send_pending_sg_list)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_sg_list_vlan)();
+#endif
 #endif
 	int (*send_flush)(struct ibv_qp *qp);
 	struct ibv_cq *cq; /* Completion Queue. */
@@ -282,7 +291,11 @@ struct txq {
 	/* Elements used only for init part are here. */
 	linear_t (*elts_linear)[]; /* Linearized buffers. */
 	struct ibv_mr *mr_linear; /* Memory Region for linearized buffers. */
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	struct ibv_exp_qp_burst_family_v1 *if_qp; /* QP burst interface. */
+#else
 	struct ibv_exp_qp_burst_family *if_qp; /* QP burst interface. */
+#endif
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 6700af4..c643cf4 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -400,7 +400,11 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
 		.intf = IBV_EXP_INTF_QP_BURST,
 		.obj = tmpl.qp,
-#ifdef HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR
+#ifdef HAVE_VERBS_VLAN_INSERTION
+		.intf_version = 1,
+#endif
+#if defined(HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR) && \
+	!defined(MLX5_VERBS_VLAN_INSERTION)
 		/* Multi packet send WR can only be used outside of VF. */
 		.family_flags =
 			(!priv->vf ?
@@ -422,11 +426,20 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	txq->poll_cnt = txq->if_cq->poll_cnt;
 #if MLX5_PMD_MAX_INLINE > 0
 	txq->send_pending_inline = txq->if_qp->send_pending_inline;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_inline_vlan = txq->if_qp->send_pending_inline_vlan;
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	txq->send_pending_sg_list = txq->if_qp->send_pending_sg_list;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_sg_list_vlan = txq->if_qp->send_pending_sg_list_vlan;
+#endif
 #endif
 	txq->send_pending = txq->if_qp->send_pending;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_vlan = txq->if_qp->send_pending_vlan;
+#endif
 	txq->send_flush = txq->if_qp->send_flush;
 	DEBUG("%p: txq updated with %p", (void *)txq, (void *)&tmpl);
 	/* Pre-register known mempools. */
-- 
2.1.4
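
The software path in insert_vlan_sw() above can be pictured on a raw byte
buffer, without any DPDK dependency. The sketch below is only an
illustration under stated assumptions: vlan_insert_raw() and its parameters
are hypothetical names, the headroom is modelled as a plain offset into the
buffer, and the tag bytes are written one by one instead of using htonl().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_TAG_LEN 4u
#define MAC_ADDRS_LEN 12u /* Destination + source MAC addresses. */

/*
 * Hypothetical stand-in for insert_vlan_sw() working on a raw buffer.
 *
 * buf is the start of the buffer, *off the offset of the frame inside it
 * (i.e. the headroom size), *len the frame length and tci the VLAN TCI in
 * host byte order.
 *
 * Returns 0 on success, EINVAL if the headroom or frame is too short.
 */
int
vlan_insert_raw(uint8_t *buf, size_t *off, size_t *len, uint16_t tci)
{
	uint8_t *frame;

	if (*off < VLAN_TAG_LEN || *len < MAC_ADDRS_LEN)
		return EINVAL;
	frame = buf + *off;
	/* Shift both MAC addresses 4 bytes back into the headroom. */
	memmove(frame - VLAN_TAG_LEN, frame, MAC_ADDRS_LEN);
	/* Fill the freed gap with TPID 0x8100 and the TCI, big-endian. */
	frame[8] = 0x81;
	frame[9] = 0x00;
	frame[10] = (uint8_t)(tci >> 8);
	frame[11] = (uint8_t)(tci & 0xff);
	/* The frame now starts 4 bytes earlier and is 4 bytes longer. */
	*off -= VLAN_TAG_LEN;
	*len += VLAN_TAG_LEN;
	return 0;
}
```

The original EtherType and payload are not moved: only the 12 address bytes
are copied, so the cost stays constant regardless of packet size.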

* [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5
  2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
                   ` (3 preceding siblings ...)
  2016-02-22 18:19 ` [dpdk-dev] [PATCH 4/4] mlx5: add VLAN insertion offload Adrien Mazarguil
@ 2016-03-03 14:27 ` Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
                     ` (5 more replies)
  4 siblings, 6 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

This patchset adds to mlx5 a few features available in mlx4 (TX from
secondary processes) or provided by Verbs (support for HW packet padding,
TX VLAN insertion).

Release notes and documentation are updated accordingly.

Note: should be applied after "Assorted fixes for mlx4 and mlx5".

Changes in v2:
- Added support for CRC stripping configuration.
- Updated packet padding feature macro and made cosmetic changes to its
  implementation to match CRC stripping's.
- Updated release notes about packet padding.
- Updated TX VLAN insertion documentation.

Olga Shern (2):
  mlx5: add RX CRC stripping configuration
  mlx5: add support for HW packet padding

Or Ami (2):
  mlx5: add callbacks to support link (up / down) changes
  mlx5: allow operation in secondary processes

Yaacov Hazan (1):
  mlx5: add VLAN insertion offload

 config/common_linuxapp                 |   1 +
 doc/guides/nics/mlx5.rst               |  28 ++-
 doc/guides/rel_notes/release_16_04.rst |  27 +++
 drivers/net/mlx5/Makefile              |  19 +++
 drivers/net/mlx5/mlx5.c                |  79 ++++++++-
 drivers/net/mlx5/mlx5.h                |  20 +++
 drivers/net/mlx5/mlx5_defs.h           |   9 +
 drivers/net/mlx5/mlx5_ethdev.c         | 299 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  85 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.c           | 115 ++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h           |  22 +++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  65 ++++++-
 16 files changed, 753 insertions(+), 42 deletions(-)

-- 
2.1.4

* [dpdk-dev] [PATCH v2 1/5] mlx5: add callbacks to support link (up / down) changes
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
@ 2016-03-03 14:27   ` Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

From: Or Ami <ora@mellanox.com>

Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.

Signed-off-by: Or Ami <ora@mellanox.com>
---
 doc/guides/rel_notes/release_16_04.rst |  4 ++
 drivers/net/mlx5/mlx5.c                |  2 +
 drivers/net/mlx5/mlx5.h                |  2 +
 drivers/net/mlx5/mlx5_ethdev.c         | 85 ++++++++++++++++++++++++++++++++++
 4 files changed, 93 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 8669515..5e43d50 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -87,6 +87,10 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: Added link up/down callbacks.**
+
+  Implemented callbacks to bring link up and down.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ad69ec2..14ac4ba 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -148,6 +148,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.dev_configure = mlx5_dev_configure,
 	.dev_start = mlx5_dev_start,
 	.dev_stop = mlx5_dev_stop,
+	.dev_set_link_down = mlx5_set_link_down,
+	.dev_set_link_up = mlx5_set_link_up,
 	.dev_close = mlx5_dev_close,
 	.promiscuous_enable = mlx5_promiscuous_enable,
 	.promiscuous_disable = mlx5_promiscuous_disable,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 43b24fb..9a3f240 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -168,6 +168,8 @@ void mlx5_dev_link_status_handler(void *);
 void mlx5_dev_interrupt_handler(struct rte_intr_handle *, void *);
 void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
+int mlx5_set_link_down(struct rte_eth_dev *dev);
+int mlx5_set_link_up(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6704382..f609e0f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -968,3 +968,88 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 					   dev);
 	}
 }
+
+/**
+ * Change the link state (UP / DOWN).
+ *
+ * @param priv
+ *   Pointer to private structure.
+ * @param up
+ *   Nonzero for link up, otherwise link down.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_set_link(struct priv *priv, int up)
+{
+	struct rte_eth_dev *dev = priv->dev;
+	int err;
+	unsigned int i;
+
+	if (up) {
+		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
+		if (err)
+			return err;
+		for (i = 0; i < priv->rxqs_n; i++)
+			if ((*priv->rxqs)[i]->sp)
+				break;
+		/* Check if an sp queue exists.
+		 * Note: Some old frames might be received.
+		 */
+		if (i == priv->rxqs_n)
+			dev->rx_pkt_burst = mlx5_rx_burst;
+		else
+			dev->rx_pkt_burst = mlx5_rx_burst_sp;
+		dev->tx_pkt_burst = mlx5_tx_burst;
+	} else {
+		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
+		if (err)
+			return err;
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+	}
+	return 0;
+}
+
+/**
+ * DPDK callback to bring the link DOWN.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_down(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 0);
+	priv_unlock(priv);
+	return err;
+}
+
+/**
+ * DPDK callback to bring the link UP.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_up(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 1);
+	priv_unlock(priv);
+	return err;
+}
-- 
2.1.4
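
The callback swap performed by priv_set_link() above can be reduced to the
following standalone sketch. It is a simplified model, not the driver code:
struct toy_dev and the toy_*() functions are hypothetical, and the stub
returning 0 is an assumption about what removed_rx_burst() and
removed_tx_burst() do, since their bodies are not part of this patch.

```c
#include <stdint.h>

/* Hypothetical, simplified device holding only the two burst callbacks. */
struct toy_dev {
	uint16_t (*rx_burst)(void *queue, void **pkts, uint16_t n);
	uint16_t (*tx_burst)(void *queue, void **pkts, uint16_t n);
};

/* Stand-in for a real burst function: pretend all n packets were handled. */
uint16_t
toy_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Stub installed while the link is down: never handles any packet. */
uint16_t
toy_removed_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/* Swap both callbacks according to the requested link state. */
void
toy_set_link(struct toy_dev *dev, int up)
{
	dev->rx_burst = up ? toy_burst : toy_removed_burst;
	dev->tx_burst = up ? toy_burst : toy_removed_burst;
}
```

Swapping function pointers keeps the data path free of any extra "is the
link up?" test; an application calling the burst functions after link down
simply sees zero packets processed.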

* [dpdk-dev] [PATCH v2 2/5] mlx5: allow operation in secondary processes
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
@ 2016-03-03 14:27   ` Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

From: Or Ami <ora@mellanox.com>

Secondary processes are expected to use queues and other resources
allocated by the primary. However, Verbs resources can only be shared
between processes when inherited through fork().

This limitation can be worked around for TX by configuring separate queues
from secondary processes.

Signed-off-by: Or Ami <ora@mellanox.com>
---
 doc/guides/nics/mlx5.rst               |   3 +-
 doc/guides/rel_notes/release_16_04.rst |   4 +
 drivers/net/mlx5/mlx5.c                |  42 +++++--
 drivers/net/mlx5/mlx5.h                |  12 ++
 drivers/net/mlx5/mlx5_ethdev.c         | 202 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  46 ++++++++
 drivers/net/mlx5/mlx5_rxtx.h           |   8 ++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  50 +++++++-
 12 files changed, 378 insertions(+), 15 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index edfbf1f..f0d8a7e 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -88,6 +88,7 @@ Features
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
 - Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+- Secondary process TX is supported.
 
 Limitations
 -----------
@@ -96,7 +97,7 @@ Limitations
 - Inner RSS for VXLAN frames is not supported yet.
 - Port statistics through software counters only.
 - Hardware checksum offloads for VXLAN inner header are not supported yet.
-- Secondary processes are not supported yet.
+- Secondary process RX is not supported.
 
 Configuration
 -------------
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 5e43d50..49eed7e 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -91,6 +91,10 @@ This section should contain new features added in this release. Sample format:
 
   Implemented callbacks to bring link up and down.
 
+* **mlx5: Added support for operation in secondary processes.**
+
+  Implemented TX support in secondary processes (like mlx4).
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 14ac4ba..998e6f0 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -78,7 +78,7 @@
 static void
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	void *tmp;
 	unsigned int i;
 
@@ -483,18 +483,44 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			goto port_error;
 		}
 
-		eth_dev->data->dev_private = priv;
-		eth_dev->pci_dev = pci_dev;
+		/* Secondary processes have to use local storage for their
+		 * private data as well as a copy of eth_dev->data, but this
+		 * pointer must not be modified before burst functions are
+		 * actually called. */
+		if (mlx5_is_secondary()) {
+			struct mlx5_secondary_data *sd =
+				&mlx5_secondary_data[eth_dev->data->port_id];
+			sd->primary_priv = eth_dev->data->dev_private;
+			if (sd->primary_priv == NULL) {
+				ERROR("no private data for port %u",
+						eth_dev->data->port_id);
+				err = EINVAL;
+				goto port_error;
+			}
+			sd->shared_dev_data = eth_dev->data;
+			rte_spinlock_init(&sd->lock);
+			memcpy(sd->data.name, sd->shared_dev_data->name,
+				   sizeof(sd->data.name));
+			sd->data.dev_private = priv;
+			sd->data.rx_mbuf_alloc_failed = 0;
+			sd->data.mtu = ETHER_MTU;
+			sd->data.port_id = sd->shared_dev_data->port_id;
+			sd->data.mac_addrs = priv->mac;
+			eth_dev->tx_pkt_burst = mlx5_tx_burst_secondary_setup;
+			eth_dev->rx_pkt_burst = mlx5_rx_burst_secondary_setup;
+		} else {
+			eth_dev->data->dev_private = priv;
+			eth_dev->data->rx_mbuf_alloc_failed = 0;
+			eth_dev->data->mtu = ETHER_MTU;
+			eth_dev->data->mac_addrs = priv->mac;
+		}
 
+		eth_dev->pci_dev = pci_dev;
 		rte_eth_copy_pci_info(eth_dev, pci_dev);
-
 		eth_dev->driver = &mlx5_driver;
-		eth_dev->data->rx_mbuf_alloc_failed = 0;
-		eth_dev->data->mtu = ETHER_MTU;
-
 		priv->dev = eth_dev;
 		eth_dev->dev_ops = &mlx5_dev_ops;
-		eth_dev->data->mac_addrs = priv->mac;
+
 		TAILQ_INIT(&eth_dev->link_intr_cbs);
 
 		/* Bring Ethernet device up. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9a3f240..bad9283 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,7 @@
 #include <rte_ethdev.h>
 #include <rte_spinlock.h>
 #include <rte_interrupts.h>
+#include <rte_errno.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -125,6 +126,14 @@ struct priv {
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
+/* Local storage for secondary process data. */
+struct mlx5_secondary_data {
+	struct rte_eth_dev_data data; /* Local device data. */
+	struct priv *primary_priv; /* Private structure from primary. */
+	struct rte_eth_dev_data *shared_dev_data; /* Shared device data. */
+	rte_spinlock_t lock; /* Port configuration lock. */
+} mlx5_secondary_data[RTE_MAX_ETHPORTS];
+
 /**
  * Lock private structure to protect it from concurrent access in the
  * control path.
@@ -152,6 +161,8 @@ priv_unlock(struct priv *priv)
 
 /* mlx5_ethdev.c */
 
+struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
+int mlx5_is_secondary(void);
 int priv_get_ifname(const struct priv *, char (*)[IF_NAMESIZE]);
 int priv_ifreq(const struct priv *, int req, struct ifreq *);
 int priv_get_mtu(struct priv *, uint16_t *);
@@ -170,6 +181,7 @@ void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+struct priv *mlx5_secondary_data_setup(struct priv *priv);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f609e0f..6b674a2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -59,6 +59,7 @@
 #include <rte_common.h>
 #include <rte_interrupts.h>
 #include <rte_alarm.h>
+#include <rte_malloc.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -68,6 +69,38 @@
 #include "mlx5_utils.h"
 
 /**
+ * Return private structure associated with an Ethernet device.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Pointer to private structure.
+ */
+struct priv *
+mlx5_get_priv(struct rte_eth_dev *dev)
+{
+	struct mlx5_secondary_data *sd;
+
+	if (!mlx5_is_secondary())
+		return dev->data->dev_private;
+	sd = &mlx5_secondary_data[dev->data->port_id];
+	return sd->data.dev_private;
+}
+
+/**
+ * Check if running as a secondary process.
+ *
+ * @return
+ *   Nonzero if running as a secondary process.
+ */
+inline int
+mlx5_is_secondary(void)
+{
+	return rte_eal_process_type() != RTE_PROC_PRIMARY;
+}
+
+/**
  * Get interface name from private structure.
  *
  * @param[in] priv
@@ -464,6 +497,9 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	ret = dev_configure(dev);
 	assert(ret >= 0);
@@ -482,7 +518,7 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 void
 mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	unsigned int max;
 	char ifname[IF_NAMESIZE];
 
@@ -536,7 +572,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 static int
 mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
 		.cmd = ETHTOOL_GSET
 	};
@@ -585,7 +621,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 int
 mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	int ret;
 
 	priv_lock(priv);
@@ -620,6 +656,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 	uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
 		mlx5_rx_burst;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	/* Set kernel interface MTU first. */
 	if (priv_set_mtu(priv, mtu)) {
@@ -694,6 +733,9 @@ mlx5_dev_get_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	priv_lock(priv);
 	if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
@@ -742,6 +784,9 @@ mlx5_dev_set_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	ethpause.autoneg = fc_conf->autoneg;
 	if (((fc_conf->mode & RTE_FC_FULL) == RTE_FC_FULL) ||
@@ -1053,3 +1098,154 @@ mlx5_set_link_up(struct rte_eth_dev *dev)
 	priv_unlock(priv);
 	return err;
 }
+
+/**
+ * Configure secondary process queues from a private data pointer (primary
+ * or secondary) and update burst callbacks. Can take place only once.
+ *
+ * All queues must have been previously created by the primary process to
+ * avoid undefined behavior.
+ *
+ * @param priv
+ *   Private data pointer from either primary or secondary process.
+ *
+ * @return
+ *   Private data pointer from secondary process, NULL in case of error.
+ */
+struct priv *
+mlx5_secondary_data_setup(struct priv *priv)
+{
+	unsigned int port_id = 0;
+	struct mlx5_secondary_data *sd;
+	void **tx_queues;
+	void **rx_queues;
+	unsigned int nb_tx_queues;
+	unsigned int nb_rx_queues;
+	unsigned int i;
+
+	/* priv must be valid at this point. */
+	assert(priv != NULL);
+	/* priv->dev must also be valid but may point to local memory from
+	 * another process, possibly with the same address and must not
+	 * be dereferenced yet. */
+	assert(priv->dev != NULL);
+	/* Determine port ID by finding out where priv comes from. */
+	while (1) {
+		sd = &mlx5_secondary_data[port_id];
+		rte_spinlock_lock(&sd->lock);
+		/* Primary process? */
+		if (sd->primary_priv == priv)
+			break;
+		/* Secondary process? */
+		if (sd->data.dev_private == priv)
+			break;
+		rte_spinlock_unlock(&sd->lock);
+		if (++port_id == RTE_DIM(mlx5_secondary_data))
+			port_id = 0;
+	}
+	/* Switch to secondary private structure. If private data has already
+	 * been updated by another thread, there is nothing else to do. */
+	priv = sd->data.dev_private;
+	if (priv->dev->data == &sd->data)
+		goto end;
+	/* Sanity checks. Secondary private structure is supposed to point
+	 * to local eth_dev, itself still pointing to the shared device data
+	 * structure allocated by the primary process. */
+	assert(sd->shared_dev_data != &sd->data);
+	assert(sd->data.nb_tx_queues == 0);
+	assert(sd->data.tx_queues == NULL);
+	assert(sd->data.nb_rx_queues == 0);
+	assert(sd->data.rx_queues == NULL);
+	assert(priv != sd->primary_priv);
+	assert(priv->dev->data == sd->shared_dev_data);
+	assert(priv->txqs_n == 0);
+	assert(priv->txqs == NULL);
+	assert(priv->rxqs_n == 0);
+	assert(priv->rxqs == NULL);
+	nb_tx_queues = sd->shared_dev_data->nb_tx_queues;
+	nb_rx_queues = sd->shared_dev_data->nb_rx_queues;
+	/* Lock to prevent control operations during setup; taken before
+	 * allocation so that the error path always releases a held lock. */
+	priv_lock(priv);
+	tx_queues = rte_zmalloc("secondary ethdev->tx_queues",
+				sizeof(sd->data.tx_queues[0]) * nb_tx_queues,
+				RTE_CACHE_LINE_SIZE);
+	rx_queues = rte_zmalloc("secondary ethdev->rx_queues",
+				sizeof(sd->data.rx_queues[0]) * nb_rx_queues,
+				RTE_CACHE_LINE_SIZE);
+	if (tx_queues == NULL || rx_queues == NULL)
+		goto error;
+	/* TX queues. */
+	for (i = 0; i != nb_tx_queues; ++i) {
+		struct txq *primary_txq = (*sd->primary_priv->txqs)[i];
+		struct txq *txq;
+
+		if (primary_txq == NULL)
+			continue;
+		txq = rte_calloc_socket("TXQ", 1, sizeof(*txq), 0,
+					primary_txq->socket);
+		if (txq != NULL) {
+			if (txq_setup(priv->dev,
+				      txq,
+				      primary_txq->elts_n * MLX5_PMD_SGE_WR_N,
+				      primary_txq->socket,
+				      NULL) == 0) {
+				txq->stats.idx = primary_txq->stats.idx;
+				tx_queues[i] = txq;
+				continue;
+			}
+			rte_free(txq);
+		}
+		while (i) {
+			txq = tx_queues[--i];
+			txq_cleanup(txq);
+			rte_free(txq);
+		}
+		goto error;
+	}
+	/* RX queues. */
+	for (i = 0; i != nb_rx_queues; ++i) {
+		struct rxq *primary_rxq = (*sd->primary_priv->rxqs)[i];
+
+		if (primary_rxq == NULL)
+			continue;
+		/* Not supported yet. */
+		rx_queues[i] = NULL;
+	}
+	/* Update everything. */
+	priv->txqs = (void *)tx_queues;
+	priv->txqs_n = nb_tx_queues;
+	priv->rxqs = (void *)rx_queues;
+	priv->rxqs_n = nb_rx_queues;
+	sd->data.rx_queues = rx_queues;
+	sd->data.tx_queues = tx_queues;
+	sd->data.nb_rx_queues = nb_rx_queues;
+	sd->data.nb_tx_queues = nb_tx_queues;
+	sd->data.dev_link = sd->shared_dev_data->dev_link;
+	sd->data.mtu = sd->shared_dev_data->mtu;
+	memcpy(sd->data.rx_queue_state, sd->shared_dev_data->rx_queue_state,
+	       sizeof(sd->data.rx_queue_state));
+	memcpy(sd->data.tx_queue_state, sd->shared_dev_data->tx_queue_state,
+	       sizeof(sd->data.tx_queue_state));
+	sd->data.dev_flags = sd->shared_dev_data->dev_flags;
+	/* Use local data from now on. */
+	rte_mb();
+	priv->dev->data = &sd->data;
+	rte_mb();
+	priv->dev->tx_pkt_burst = mlx5_tx_burst;
+	priv->dev->rx_pkt_burst = removed_rx_burst;
+	priv_unlock(priv);
+end:
+	/* More sanity checks. */
+	assert(priv->dev->tx_pkt_burst == mlx5_tx_burst);
+	assert(priv->dev->rx_pkt_burst == removed_rx_burst);
+	assert(priv->dev->data == &sd->data);
+	rte_spinlock_unlock(&sd->lock);
+	return priv;
+error:
+	priv_unlock(priv);
+	rte_free(tx_queues);
+	rte_free(rx_queues);
+	rte_spinlock_unlock(&sd->lock);
+	return NULL;
+}
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index edb05ad..c9cea48 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -209,6 +209,9 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
@@ -474,6 +477,9 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	(void)vmdq;
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 2bc005e..3a55f63 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -396,6 +396,9 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -417,6 +420,9 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 0;
 	ret = priv_rehash_flows(priv);
@@ -438,6 +444,9 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -459,6 +468,9 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 0;
 	ret = priv_rehash_flows(priv);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index df6fd92..3d84f41 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1395,6 +1395,9 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct rxq *rxq = (*priv->rxqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -1453,6 +1456,9 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (rxq == NULL)
 		return;
 	priv = rxq->priv;
@@ -1468,3 +1474,43 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	rte_free(rxq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for RX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal RX burst callback.
+ *
+ * @param dpdk_rxq
+ *   Generic pointer to RX queue structure.
+ * @param[out] pkts
+ *   Array to store received packets.
+ * @param pkts_n
+ *   Maximum number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully received (<= pkts_n).
+ */
+uint16_t
+mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct rxq *rxq = dpdk_rxq;
+	struct priv *priv = mlx5_secondary_data_setup(rxq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->rxqs_n; ++index)
+		if (((*primary_priv->rxqs)[index] == rxq) ||
+		    ((*priv->rxqs)[index] == rxq))
+			break;
+	if (index == priv->rxqs_n)
+		return 0;
+	rxq = (*priv->rxqs)[index];
+	return priv->dev->rx_pkt_burst(rxq, pkts, pkts_n);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6ac1a5a..6a0087e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -309,13 +309,21 @@ int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
 int mlx5_rx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_rxconf *, struct rte_mempool *);
 void mlx5_rx_queue_release(void *);
+uint16_t mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
+
 
 /* mlx5_txq.c */
 
 void txq_cleanup(struct txq *);
+int txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
+	  unsigned int socket, const struct rte_eth_txconf *conf);
+
 int mlx5_tx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_txconf *);
 void mlx5_tx_queue_release(void *);
+uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
 
 /* mlx5_rxtx.c */
 
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 6d1a600..2d3cb51 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -55,7 +55,7 @@
 void
 mlx5_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct rte_eth_stats tmp = {0};
 	unsigned int i;
 	unsigned int idx;
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index b5ca7d4..e9b9a29 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -64,6 +64,9 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int err;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	if (priv->started) {
 		priv_unlock(priv);
@@ -104,6 +107,9 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	if (!priv->started) {
 		priv_unlock(priv);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 3364fca..6700af4 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -255,11 +255,11 @@ txq_cleanup(struct txq *txq)
  * @return
  *   0 on success, errno value on failure.
  */
-static int
+int
 txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	  unsigned int socket, const struct rte_eth_txconf *conf)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct txq tmpl = {
 		.priv = priv,
 		.socket = socket
@@ -464,6 +464,9 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct txq *txq = (*priv->txqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -519,6 +522,9 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (txq == NULL)
 		return;
 	priv = txq->priv;
@@ -534,3 +540,43 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	rte_free(txq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for TX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal TX burst callback.
+ *
+ * @param dpdk_txq
+ *   Generic pointer to TX queue structure.
+ * @param[in] pkts
+ *   Packets to transmit.
+ * @param pkts_n
+ *   Number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully transmitted (<= pkts_n).
+ */
+uint16_t
+mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct txq *txq = dpdk_txq;
+	struct priv *priv = mlx5_secondary_data_setup(txq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->txqs_n; ++index)
+		if (((*primary_priv->txqs)[index] == txq) ||
+		    ((*priv->txqs)[index] == txq))
+			break;
+	if (index == priv->txqs_n)
+		return 0;
+	txq = (*priv->txqs)[index];
+	return priv->dev->tx_pkt_burst(txq, pkts, pkts_n);
+}
-- 
2.1.4
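
The doc comment of mlx5_tx_burst_secondary_setup() in the patch above
describes a callback that configures queues if necessary before reverting
to the normal TX burst callback. The following is a minimal sketch of that
"set up on first call, then forward" pattern in plain C with no DPDK
dependency. All names (fake_dev, fake_setup, real_tx_burst,
tx_burst_first_call) are hypothetical and only illustrate the idea; this is
not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fake_dev;
typedef uint16_t (*burst_fn)(struct fake_dev *dev, uint16_t pkts_n);

/* Hypothetical stand-in for an ethdev: one burst callback plus a counter. */
struct fake_dev {
	burst_fn tx_burst;   /* Callback invoked by the application. */
	unsigned int setups; /* Number of times setup actually ran. */
};

/* Regular burst function: pretend every packet is sent. */
static uint16_t
real_tx_burst(struct fake_dev *dev, uint16_t pkts_n)
{
	(void)dev;
	return pkts_n;
}

/* One-time setup: does its work and installs the regular callback. */
static int
fake_setup(struct fake_dev *dev)
{
	assert(dev != NULL);
	if (dev->tx_burst == real_tx_burst)
		return 0; /* Already done, nothing else to do. */
	++dev->setups;
	dev->tx_burst = real_tx_burst;
	return 0;
}

/* Initial callback: set up on first use, then forward the burst. */
static uint16_t
tx_burst_first_call(struct fake_dev *dev, uint16_t pkts_n)
{
	if (fake_setup(dev) != 0)
		return 0;
	return dev->tx_burst(dev, pkts_n);
}
```

A device initialised with tx_burst = tx_burst_first_call pays the setup cost
on its first burst only; later bursts go straight to real_tx_burst.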

* [dpdk-dev] [PATCH v2 3/5] mlx5: add RX CRC stripping configuration
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
@ 2016-03-03 14:27   ` Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

From: Olga Shern <olgas@mellanox.com>

Until now, CRC was always stripped by hardware. Stripping can now be
configured, provided MLNX_OFED >= 3.2 is installed.

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/nics/mlx5.rst               |  2 ++
 doc/guides/rel_notes/release_16_04.rst |  6 ++++++
 drivers/net/mlx5/Makefile              |  5 +++++
 drivers/net/mlx5/mlx5.c                |  7 +++++++
 drivers/net/mlx5/mlx5.h                |  1 +
 drivers/net/mlx5/mlx5_rxq.c            | 24 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c           |  6 ++++--
 drivers/net/mlx5/mlx5_rxtx.h           |  1 +
 8 files changed, 50 insertions(+), 2 deletions(-)
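
The RX burst functions in this patch hide the FCS from the application by
subtracting (rxq->crc_present << 2) from the completion length. A minimal
sketch of that arithmetic follows; struct rxq_sketch and rx_len() are
hypothetical names, reduced to what the example needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced RX queue: only the flag this sketch needs. */
struct rxq_sketch {
	unsigned int crc_present:1; /* 1 when hardware leaves the FCS in place. */
};

/* Length reported to the application for a completion of `ret` bytes:
 * the 4-byte FCS is hidden when it is present in the buffer. */
static uint32_t
rx_len(const struct rxq_sketch *rxq, uint32_t ret)
{
	uint32_t fcs = (uint32_t)rxq->crc_present << 2; /* 0 or 4. */

	assert(ret >= fcs);
	return ret - fcs;
}
```

Since crc_present is a one-bit field, the shift yields either 0 or 4, so no
branch is needed in the data path.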

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index f0d8a7e..8b63f3f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,6 +84,7 @@ Features
 - Support for multiple MAC addresses.
 - VLAN filtering.
 - RX VLAN stripping.
+- RX CRC stripping configuration.
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
@@ -232,6 +233,7 @@ Currently supported by DPDK:
 
     - Flow director.
     - RX VLAN stripping.
+    - RX CRC stripping configuration.
 
 - Minimum firmware version:
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 49eed7e..01fb7ed 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -95,6 +95,12 @@ This section should contain new features added in this release. Sample format:
 
   Implemented TX support in secondary processes (like mlx4).
 
+* **mlx5: Added RX CRC stripping configuration.**
+
+  Until now, CRC was always stripped. It can now be configured.
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 7076ae3..cc6de2d 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -137,6 +137,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CQ_RX_TCP_PACKET \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_FCS \
+		infiniband/verbs.h \
+		enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 998e6f0..acfb365 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -417,6 +417,13 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		DEBUG("VLAN stripping is %ssupported",
 		      (priv->hw_vlan_strip ? "" : "not "));
 
+#ifdef HAVE_VERBS_FCS
+		priv->hw_fcs_strip = !!(exp_device_attr.exp_device_cap_flags &
+					IBV_EXP_DEVICE_SCATTER_FCS);
+#endif /* HAVE_VERBS_FCS */
+		DEBUG("FCS stripping configuration is %ssupported",
+		      (priv->hw_fcs_strip ? "" : "not "));
+
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 #endif /* HAVE_EXP_QUERY_DEVICE */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bad9283..9690827 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -103,6 +103,7 @@ struct priv {
 	unsigned int hw_csum:1; /* Checksum offload is supported. */
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
+	unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 3d84f41..19a1119 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1258,6 +1258,30 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				  0),
 #endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
+
+#ifdef HAVE_VERBS_FCS
+	/* By default, FCS (CRC) is stripped by hardware. */
+	if (dev->data->dev_conf.rxmode.hw_strip_crc) {
+		tmpl.crc_present = 0;
+	} else if (priv->hw_fcs_strip) {
+		/* Ask HW/Verbs to leave CRC in place when supported. */
+		attr.wq.flags |= IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS;
+		attr.wq.comp_mask |= IBV_EXP_CREATE_WQ_FLAGS;
+		tmpl.crc_present = 1;
+	} else {
+		WARN("%p: CRC stripping has been disabled but will still"
+		     " be performed by hardware, make sure MLNX_OFED and"
+		     " firmware are up to date",
+		     (void *)dev);
+		tmpl.crc_present = 0;
+	}
+	DEBUG("%p: CRC stripping is %s, %u bytes will be subtracted from"
+	      " incoming frames to hide it",
+	      (void *)dev,
+	      tmpl.crc_present ? "disabled" : "enabled",
+	      tmpl.crc_present << 2);
+#endif /* HAVE_VERBS_FCS */
+
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
 		ret = (errno ? errno : EINVAL);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 4919189..34340d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -828,7 +828,8 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		}
 		if (ret == 0)
 			break;
-		len = ret;
+		assert(ret >= (rxq->crc_present << 2));
+		len = ret - (rxq->crc_present << 2);
 		pkt_buf_len = len;
 		/*
 		 * Replace spent segments with new ones, concatenate and
@@ -1040,7 +1041,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		}
 		if (ret == 0)
 			break;
-		len = ret;
+		assert(ret >= (rxq->crc_present << 2));
+		len = ret - (rxq->crc_present << 2);
 		rep = __rte_mbuf_raw_alloc(rxq->mp);
 		if (unlikely(rep == NULL)) {
 			/*
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6a0087e..61be3e4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -116,6 +116,7 @@ struct rxq {
 	unsigned int csum:1; /* Enable checksum offloading. */
 	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
+	unsigned int crc_present:1; /* CRC must be subtracted. */
 	union {
 		struct rxq_elt_sp (*sp)[]; /* Scattered RX elements. */
 		struct rxq_elt (*no_sp)[]; /* RX elements. */
-- 
2.1.4

* [dpdk-dev] [PATCH v2 4/5] mlx5: add support for HW packet padding
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
                     ` (2 preceding siblings ...)
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
@ 2016-03-03 14:27   ` Adrien Mazarguil
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

From: Olga Shern <olgas@mellanox.com>

Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.

When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.

In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.

This is disabled by default since this can also decrease performance for
unaligned packet sizes.

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
 doc/guides/nics/mlx5.rst               | 14 ++++++++++++++
 doc/guides/rel_notes/release_16_04.rst |  7 +++++++
 drivers/net/mlx5/Makefile              |  5 +++++
 drivers/net/mlx5/mlx5.c                | 28 ++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                |  5 +++++
 drivers/net/mlx5/mlx5_rxq.c            | 15 +++++++++++++++
 6 files changed, 74 insertions(+)
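
The alignment argument in the commit message can be checked with a little
arithmetic. The sketch below assumes a 64-byte cache line and assumes that
padding rounds the written length up to the next cache line multiple, as the
release note suggests; CACHE_LINE, written_len() and padded_len() are
hypothetical names, not driver symbols.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u /* Assumed cache line size for this sketch. */

/* Bytes written for a frame of wire_len bytes (FCS included) once the
 * 4-byte FCS has been stripped. */
static uint32_t
written_len(uint32_t wire_len)
{
	assert(wire_len >= 4);
	return wire_len - 4;
}

/* Round len up to the next multiple of the cache line size. */
static uint32_t
padded_len(uint32_t len)
{
	return (len + CACHE_LINE - 1) & ~(CACHE_LINE - 1);
}
```

Under these assumptions a 64-byte frame is written as 60 bytes without
padding and 64 bytes with it, whereas a 100-byte frame grows from 96 to 128
bytes, which is consistent with padding being disabled by default.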

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 8b63f3f..9df30be 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -156,6 +156,20 @@ Environment variables
   lower performance when there is no backpressure, it is not enabled by
   default.
 
+- ``MLX5_PMD_ENABLE_PADDING``
+
+  Enables HW packet padding in PCI bus transactions.
+
+  When packet size is cache aligned and CRC stripping is enabled, 4 fewer
+  bytes are written to the PCI bus. Enabling padding makes such packets
+  aligned again.
+
+  In cases where PCI bandwidth is the bottleneck, padding can improve
+  performance by 10%.
+
+  This is disabled by default since this can also decrease performance for
+  unaligned packet sizes.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 01fb7ed..6bcfad1 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -101,6 +101,13 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: Added optional packet padding by HW.**
+
+  Added an option to round PCI bus transactions to a multiple of the
+  cache line size for better alignment.
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index cc6de2d..a6a3cab 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -142,6 +142,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_RX_END_PADDING \
+		infiniband/verbs.h \
+		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index acfb365..94eefb9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -68,6 +68,25 @@
 #include "mlx5_defs.h"
 
 /**
+ * Retrieve integer value from environment variable.
+ *
+ * @param[in] name
+ *   Environment variable name.
+ *
+ * @return
+ *   Integer value, 0 if the variable is not set.
+ */
+int
+mlx5_getenv_int(const char *name)
+{
+	const char *val = getenv(name);
+
+	if (val == NULL)
+		return 0;
+	return atoi(val);
+}
+
+/**
  * DPDK callback to close the device.
  *
  * Destroy all queues and objects, free memory.
@@ -332,6 +351,9 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
 			IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS |
 #endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+#ifdef HAVE_VERBS_RX_END_PADDING
+			IBV_EXP_DEVICE_ATTR_RX_PAD_END_ALIGN |
+#endif /* HAVE_VERBS_RX_END_PADDING */
 			0;
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
@@ -424,6 +446,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		DEBUG("FCS stripping configuration is %ssupported",
 		      (priv->hw_fcs_strip ? "" : "not "));
 
+#ifdef HAVE_VERBS_RX_END_PADDING
+		priv->hw_padding = !!exp_device_attr.rx_pad_end_addr_align;
+#endif /* HAVE_VERBS_RX_END_PADDING */
+		DEBUG("hardware RX end alignment padding is %ssupported",
+		      (priv->hw_padding ? "" : "not "));
+
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 #endif /* HAVE_EXP_QUERY_DEVICE */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9690827..1904d54 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -104,6 +104,7 @@ struct priv {
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
 	unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
+	unsigned int hw_padding:1; /* End alignment padding is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
@@ -160,6 +161,10 @@ priv_unlock(struct priv *priv)
 	rte_spinlock_unlock(&priv->lock);
 }
 
+/* mlx5.c */
+
+int mlx5_getenv_int(const char *);
+
 /* mlx5_ethdev.c */
 
 struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 19a1119..c8af77f 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1282,6 +1282,21 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      tmpl.crc_present << 2);
 #endif /* HAVE_VERBS_FCS */
 
+#ifdef HAVE_VERBS_RX_END_PADDING
+	if (!mlx5_getenv_int("MLX5_PMD_ENABLE_PADDING"))
+		; /* Nothing else to do. */
+	else if (priv->hw_padding) {
+		INFO("%p: enabling packet padding on queue %p",
+		     (void *)dev, (void *)rxq);
+		attr.wq.flags |= IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING;
+		attr.wq.comp_mask |= IBV_EXP_CREATE_WQ_FLAGS;
+	} else
+		WARN("%p: packet padding has been requested but is not"
+		     " supported, make sure MLNX_OFED and firmware are"
+		     " up to date",
+		     (void *)dev);
+#endif /* HAVE_VERBS_RX_END_PADDING */
+
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
 		ret = (errno ? errno : EINVAL);
-- 
2.1.4
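
Since mlx5_getenv_int() in the patch above relies on atoi(), it may help to
see which values of MLX5_PMD_ENABLE_PADDING count as "enabled". The sketch
below mirrors the same logic under hypothetical names (parse_int_or_zero,
getenv_int_sketch), with the parsing split out so it can be exercised
without touching the environment.

```c
#include <assert.h>
#include <stdlib.h>

/* Convert a possibly missing string to int, 0 when it is NULL. */
static int
parse_int_or_zero(const char *val)
{
	if (val == NULL)
		return 0;
	return atoi(val);
}

/* Same behavior as the helper in the patch, under a hypothetical name. */
static int
getenv_int_sketch(const char *name)
{
	assert(name != NULL);
	return parse_int_or_zero(getenv(name));
}
```

With this logic any non-zero integer enables the option, while an unset
variable, "0", or a non-numeric string such as "yes" leaves it disabled.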

* [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
                     ` (3 preceding siblings ...)
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
@ 2016-03-03 14:27   ` Adrien Mazarguil
  2016-03-11 15:24     ` Bruce Richardson
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
  5 siblings, 1 reply; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-03 14:27 UTC (permalink / raw)
  To: dev

From: Yaacov Hazan <yaacovh@mellanox.com>

VLAN insertion is done in software by the PMD by default unless
CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION is enabled and Verbs provides
support for hardware insertion.

When enabled, this option improves performance when VLAN insertion is
requested. However, ConnectX-4 Lx boards can then no longer take advantage
of multi-packet send optimizations.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 config/common_linuxapp                 |   1 +
 doc/guides/nics/mlx5.rst               |   9 +++
 doc/guides/rel_notes/release_16_04.rst |   6 ++
 drivers/net/mlx5/Makefile              |   9 +++
 drivers/net/mlx5/mlx5_defs.h           |   9 +++
 drivers/net/mlx5/mlx5_ethdev.c         |  12 ++--
 drivers/net/mlx5/mlx5_rxtx.c           | 109 +++++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_rxtx.h           |  13 ++++
 drivers/net/mlx5/mlx5_txq.c            |  15 ++++-
 9 files changed, 158 insertions(+), 25 deletions(-)
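
The software fallback in this patch (compiled when
MLX5_VERBS_VLAN_INSERTION is not defined) is documented as inserting the
VLAN tag using the mbuf's headroom. The sketch below shows one way such an
insertion can work on a plain byte buffer: the two MAC addresses move 4
bytes towards the front and an 802.1Q tag is written in the gap. The names
pkt_sketch and insert_vlan_sketch are hypothetical, and the -1 error return
is a simplification of the errno value the patch documents; this is not the
driver's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDRS_LEN 12u /* Destination + source MAC addresses. */
#define VLAN_TAG_LEN 4u     /* TPID (2 bytes) + TCI (2 bytes). */

/* Hypothetical, reduced packet buffer: the frame starts at
 * storage[headroom], leaving `headroom` free bytes in front of it. */
struct pkt_sketch {
	uint8_t storage[128];
	uint32_t headroom;
	uint32_t len;
};

/* Insert an 802.1Q tag carrying `tci` right after the MAC addresses.
 * Returns 0 on success, -1 when the headroom or frame is too small. */
static int
insert_vlan_sketch(struct pkt_sketch *pkt, uint16_t tci)
{
	uint8_t *old_start;
	uint8_t *new_start;

	assert(pkt != NULL);
	if (pkt->headroom < VLAN_TAG_LEN || pkt->len < ETHER_ADDRS_LEN)
		return -1;
	old_start = pkt->storage + pkt->headroom;
	new_start = old_start - VLAN_TAG_LEN;
	/* Move both MAC addresses 4 bytes towards the front. */
	memmove(new_start, old_start, ETHER_ADDRS_LEN);
	/* Write TPID 0x8100 and the TCI in network byte order. */
	new_start[12] = 0x81;
	new_start[13] = 0x00;
	new_start[14] = (uint8_t)(tci >> 8);
	new_start[15] = (uint8_t)(tci & 0xff);
	pkt->headroom -= VLAN_TAG_LEN;
	pkt->len += VLAN_TAG_LEN;
	return 0;
}
```

Only 12 bytes are moved regardless of packet size; the original EtherType
and payload stay where they are.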

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 7b5e49f..793d262 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -220,6 +220,7 @@ CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
 CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
+CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION=n
 
 #
 # Compile burst-oriented Broadcom PMD driver
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 9df30be..e391518 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,6 +84,7 @@ Features
 - Support for multiple MAC addresses.
 - VLAN filtering.
 - RX VLAN stripping.
+- TX VLAN insertion.
 - RX CRC stripping configuration.
 - Promiscuous mode.
 - Multicast promiscuous mode.
@@ -143,6 +144,13 @@ These options can be modified in the ``.config`` file.
 
   This value is always 1 for RX queues since they use a single MP.
 
+- ``CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION`` (default **n**)
+
+  Use Verbs instead of the PMD implementation for VLAN insertion. This
+  provides better performance when VLAN insertion is requested, but is
+  disabled by default since it prevents ConnectX-4 Lx adapters from taking
+  advantage of multi-packet send optimizations.
+
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~
 
@@ -247,6 +255,7 @@ Currently supported by DPDK:
 
     - Flow director.
     - RX VLAN stripping.
+    - TX VLAN insertion.
     - RX CRC stripping configuration.
 
 - Minimum firmware version:
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 6bcfad1..238ef84 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -108,6 +108,12 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **mlx5: TX VLAN insertion support.**
+
+  Added support for TX VLAN insertion.
+
+  Only available with Mellanox OFED >= 3.2.
+
 
 Resolved Issues
 ---------------
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index a6a3cab..7d24fd2 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -101,6 +101,10 @@ ifdef CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE
 CFLAGS += -DMLX5_PMD_TX_MP_CACHE=$(CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE)
 endif
 
+ifeq ($(CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION),y)
+CFLAGS += -DMLX5_VERBS_VLAN_INSERTION
+endif
+
 include $(RTE_SDK)/mk/rte.lib.mk
 
 # Generate and clean-up mlx5_autoconf.h.
@@ -147,6 +151,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_VLAN_INSERTION \
+		infiniband/verbs.h \
+		enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 5b00d8e..fb8db2e 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -95,4 +95,13 @@
 #define MLX5_FDIR_SUPPORT 1
 #endif
 
+/*
+ * Prevent compilation when HW VLAN insertion is requested by configuration
+ * but not supported by Verbs.
+ */
+#if defined(MLX5_VERBS_VLAN_INSERTION) && !defined(HAVE_VERBS_VLAN_INSERTION)
+#error CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION \
+	enabled in configuration but not supported by libibverbs.
+#endif
+
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6b674a2..66115d2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -544,12 +544,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		  DEV_RX_OFFLOAD_UDP_CKSUM |
 		  DEV_RX_OFFLOAD_TCP_CKSUM) :
 		 0);
-	info->tx_offload_capa =
-		(priv->hw_csum ?
-		 (DEV_TX_OFFLOAD_IPV4_CKSUM |
-		  DEV_TX_OFFLOAD_UDP_CKSUM |
-		  DEV_TX_OFFLOAD_TCP_CKSUM) :
-		 0);
+	info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+	if (priv->hw_csum)
+		info->tx_offload_capa |=
+			(DEV_TX_OFFLOAD_IPV4_CKSUM |
+			 DEV_TX_OFFLOAD_UDP_CKSUM |
+			 DEV_TX_OFFLOAD_TCP_CKSUM);
 	if (priv_get_ifname(priv, &ifname) == 0)
 		info->if_index = if_nametoindex(ifname);
 	/* FIXME: RETA update/query API expects the callee to know the size of
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 34340d2..819060e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -333,6 +333,40 @@ txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
 	txq_mp2mr(txq, mp);
 }
 
+#ifndef MLX5_VERBS_VLAN_INSERTION
+
+/**
+ * Insert VLAN to specific packet, using the mbuf's headroom space.
+ *
+ * @param buf
+ *   Buffer to insert the vlan.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static inline int
+insert_vlan_sw(struct rte_mbuf *buf)
+{
+	uintptr_t addr;
+	uint32_t vlan;
+	uint16_t head_room_len = rte_pktmbuf_headroom(buf);
+
+	if (head_room_len < 4)
+		return EINVAL;
+
+	addr = rte_pktmbuf_mtod(buf, uintptr_t);
+	vlan = htonl(0x81000000 | buf->vlan_tci);
+	memmove((void *)(addr - 4), (void *)addr, 12);
+	memcpy((void *)(addr + 8), &vlan, sizeof(vlan));
+
+	SET_DATA_OFF(buf, head_room_len - 4);
+	DATA_LEN(buf) += 4;
+
+	return 0;
+}
+
+#endif /* !MLX5_VERBS_VLAN_INSERTION */
+
 #if MLX5_PMD_SGE_WR_N > 1
 
 /**
@@ -554,6 +588,14 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (RTE_ETH_IS_TUNNEL_PKT(buf->packet_type))
 				send_flags |= IBV_EXP_QP_BURST_TUNNEL;
 		}
+#ifndef MLX5_VERBS_VLAN_INSERTION
+		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
+			err = insert_vlan_sw(buf);
+
+			if (unlikely(err))
+				goto stop;
+		}
+#endif /* !MLX5_VERBS_VLAN_INSERTION */
 		if (likely(segs == 1)) {
 			uintptr_t addr;
 			uint32_t length;
@@ -577,13 +619,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			}
 			/* Put packet into send queue. */
 #if MLX5_PMD_MAX_INLINE > 0
-			if (length <= txq->max_inline)
-				err = txq->send_pending_inline
-					(txq->qp,
-					 (void *)addr,
-					 length,
-					 send_flags);
-			else
+			if (length <= txq->max_inline) {
+#ifdef MLX5_VERBS_VLAN_INSERTION
+				if (buf->ol_flags & PKT_TX_VLAN_PKT)
+					err = txq->send_pending_inline_vlan
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+					err = txq->send_pending_inline
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags);
+			} else
 #endif
 			{
 				/* Retrieve Memory Region key for this
@@ -597,12 +649,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 					elt->buf = NULL;
 					goto stop;
 				}
-				err = txq->send_pending
-					(txq->qp,
-					 addr,
-					 length,
-					 lkey,
-					 send_flags);
+#ifdef MLX5_VERBS_VLAN_INSERTION
+				if (buf->ol_flags & PKT_TX_VLAN_PKT)
+					err = txq->send_pending_vlan
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+					err = txq->send_pending
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags);
 			}
 			if (unlikely(err))
 				goto stop;
@@ -619,11 +682,21 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (ret.length == (unsigned int)-1)
 				goto stop;
 			/* Put SG list into send queue. */
-			err = txq->send_pending_sg_list
-				(txq->qp,
-				 sges,
-				 ret.num,
-				 send_flags);
+#ifdef MLX5_VERBS_VLAN_INSERTION
+			if (buf->ol_flags & PKT_TX_VLAN_PKT)
+				err = txq->send_pending_sg_list_vlan
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags,
+					 &buf->vlan_tci);
+			else
+#endif /* MLX5_VERBS_VLAN_INSERTION */
+				err = txq->send_pending_sg_list
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags);
 			if (unlikely(err))
 				goto stop;
 #ifdef MLX5_PMD_SOFT_COUNTERS
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 61be3e4..dd9aa25 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -255,11 +255,20 @@ struct txq {
 	struct priv *priv; /* Back pointer to private data. */
 	int32_t (*poll_cnt)(struct ibv_cq *cq, uint32_t max);
 	int (*send_pending)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_vlan)();
+#endif
 #if MLX5_PMD_MAX_INLINE > 0
 	int (*send_pending_inline)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_inline_vlan)();
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	int (*send_pending_sg_list)();
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	int (*send_pending_sg_list_vlan)();
+#endif
 #endif
 	int (*send_flush)(struct ibv_qp *qp);
 	struct ibv_cq *cq; /* Completion Queue. */
@@ -283,7 +292,11 @@ struct txq {
 	/* Elements used only for init part are here. */
 	linear_t (*elts_linear)[]; /* Linearized buffers. */
 	struct ibv_mr *mr_linear; /* Memory Region for linearized buffers. */
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	struct ibv_exp_qp_burst_family_v1 *if_qp; /* QP burst interface. */
+#else
 	struct ibv_exp_qp_burst_family *if_qp; /* QP burst interface. */
+#endif
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 6700af4..c643cf4 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -400,7 +400,11 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
 		.intf = IBV_EXP_INTF_QP_BURST,
 		.obj = tmpl.qp,
-#ifdef HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR
+#ifdef HAVE_VERBS_VLAN_INSERTION
+		.intf_version = 1,
+#endif
+#if defined(HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR) && \
+	!defined(MLX5_VERBS_VLAN_INSERTION)
 		/* Multi packet send WR can only be used outside of VF. */
 		.family_flags =
 			(!priv->vf ?
@@ -422,11 +426,20 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	txq->poll_cnt = txq->if_cq->poll_cnt;
 #if MLX5_PMD_MAX_INLINE > 0
 	txq->send_pending_inline = txq->if_qp->send_pending_inline;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_inline_vlan = txq->if_qp->send_pending_inline_vlan;
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	txq->send_pending_sg_list = txq->if_qp->send_pending_sg_list;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_sg_list_vlan = txq->if_qp->send_pending_sg_list_vlan;
+#endif
 #endif
 	txq->send_pending = txq->if_qp->send_pending;
+#ifdef MLX5_VERBS_VLAN_INSERTION
+	txq->send_pending_vlan = txq->if_qp->send_pending_vlan;
+#endif
 	txq->send_flush = txq->if_qp->send_flush;
 	DEBUG("%p: txq updated with %p", (void *)txq, (void *)&tmpl);
 	/* Pre-register known mempools. */
-- 
2.1.4
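
Note on the software path: insert_vlan_sw() above shifts the two MAC
addresses four bytes into the mbuf headroom and writes the 802.1Q tag into
the gap. The sketch below repeats that byte manipulation on a plain buffer
so it can be tried without DPDK. It is only an illustration: vlan_insert(),
its parameters and the two macros are hypothetical names, not part of the
patch or of any DPDK API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_TAG_LEN 4     /* TPID (0x8100) + TCI */
#define ETHER_ADDRS_LEN 12 /* destination MAC + source MAC */

/*
 * Hypothetical helper, not taken from the patch.
 *
 * frame points at the first byte of the Ethernet header and headroom is
 * the number of writable bytes available immediately before it.
 *
 * Returns the new start of the frame (4 bytes earlier), or NULL when the
 * headroom is too small, in which case the buffer is left untouched.
 */
static uint8_t *
vlan_insert(uint8_t *frame, size_t headroom, uint16_t tci)
{
	/* Tag in network byte order: 0x8100 followed by the TCI. */
	uint32_t tag = htonl(UINT32_C(0x81000000) | tci);
	uint8_t *start;

	assert(frame != NULL);
	if (headroom < VLAN_TAG_LEN)
		return NULL;
	start = frame - VLAN_TAG_LEN;
	/* Move both MAC addresses back; the regions overlap, hence memmove. */
	memmove(start, frame, ETHER_ADDRS_LEN);
	/* The tag lands between the source MAC and the original EtherType. */
	memcpy(start + ETHER_ADDRS_LEN, &tag, sizeof(tag));
	return start;
}
```

The caller is expected to grow the frame length by 4 after a successful
call, which is what the DATA_LEN() update does in the patch.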


* Re: [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
@ 2016-03-11 15:24     ` Bruce Richardson
  2016-03-16 13:45       ` Adrien Mazarguil
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2016-03-11 15:24 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Thu, Mar 03, 2016 at 03:27:59PM +0100, Adrien Mazarguil wrote:
> From: Yaacov Hazan <yaacovh@mellanox.com>
> 
> VLAN insertion is done in software by the PMD by default unless
> CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION is enabled and Verbs provides
> support for hardware insertion.
> 
> When enabled, this option improves performance when VLAN insertion is
> requested, however ConnectX-4 Lx boards cannot take advantage of
> multi-packet send optimizations anymore.
> 
> Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  config/common_linuxapp                 |   1 +
>  doc/guides/nics/mlx5.rst               |   9 +++
>  doc/guides/rel_notes/release_16_04.rst |   6 ++
>  drivers/net/mlx5/Makefile              |   9 +++
>  drivers/net/mlx5/mlx5_defs.h           |   9 +++
>  drivers/net/mlx5/mlx5_ethdev.c         |  12 ++--
>  drivers/net/mlx5/mlx5_rxtx.c           | 109 +++++++++++++++++++++++++++------
>  drivers/net/mlx5/mlx5_rxtx.h           |  13 ++++
>  drivers/net/mlx5/mlx5_txq.c            |  15 ++++-
>  9 files changed, 158 insertions(+), 25 deletions(-)
> 
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 7b5e49f..793d262 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -220,6 +220,7 @@ CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
>  CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
>  CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
>  CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
> +CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION=n
>
New build time configuration options are no longer allowed in DPDK, as they can't
be used in binary distributions and make testing harder. This should be made
a run-time option instead.

/Bruce


* Re: [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload
  2016-03-11 15:24     ` Bruce Richardson
@ 2016-03-16 13:45       ` Adrien Mazarguil
  0 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-16 13:45 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

On Fri, Mar 11, 2016 at 03:24:43PM +0000, Bruce Richardson wrote:
> On Thu, Mar 03, 2016 at 03:27:59PM +0100, Adrien Mazarguil wrote:
> > From: Yaacov Hazan <yaacovh@mellanox.com>
> > 
> > VLAN insertion is done in software by the PMD by default unless
> > CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION is enabled and Verbs provides
> > support for hardware insertion.
> > 
> > When enabled, this option improves performance when VLAN insertion is
> > requested, however ConnectX-4 Lx boards cannot take advantage of
> > multi-packet send optimizations anymore.
> > 
> > Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > ---
> >  config/common_linuxapp                 |   1 +
> >  doc/guides/nics/mlx5.rst               |   9 +++
> >  doc/guides/rel_notes/release_16_04.rst |   6 ++
> >  drivers/net/mlx5/Makefile              |   9 +++
> >  drivers/net/mlx5/mlx5_defs.h           |   9 +++
> >  drivers/net/mlx5/mlx5_ethdev.c         |  12 ++--
> >  drivers/net/mlx5/mlx5_rxtx.c           | 109 +++++++++++++++++++++++++++------
> >  drivers/net/mlx5/mlx5_rxtx.h           |  13 ++++
> >  drivers/net/mlx5/mlx5_txq.c            |  15 ++++-
> >  9 files changed, 158 insertions(+), 25 deletions(-)
> > 
> > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > index 7b5e49f..793d262 100644
> > --- a/config/common_linuxapp
> > +++ b/config/common_linuxapp
> > @@ -220,6 +220,7 @@ CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
> >  CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4
> >  CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0
> >  CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
> > +CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION=n
> >
> New build time configuration options are no longer allowed in DPDK, as they can't
> be used in binary distributions and make testing harder. This should be made
> a run-time option instead.

OK, it was done as a performance improvement for a specific case, I will
submit an updated patchset without this option.

-- 
Adrien Mazarguil
6WIND


* [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5
  2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
                     ` (4 preceding siblings ...)
  2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
@ 2016-03-17 15:38   ` Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
                       ` (5 more replies)
  5 siblings, 6 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev

This patchset adds to mlx5 a few features available in mlx4 (TX from
secondary processes) or provided by Verbs (support for HW packet padding,
TX VLAN insertion).

Release notes and documentation are updated accordingly.

Changes in v3:
- Removed compilation option for TX VLAN insertion, the method to use is now
  determined at runtime.
- Modified release notes slightly.

Changes in v2:
- Added support for CRC stripping configuration.
- Updated packet padding feature macro and made cosmetic changes to its
  implementation to match CRC stripping's.
- Updated release notes about packet padding.
- Updated TX VLAN insertion documentation.

Olga Shern (2):
  mlx5: add RX CRC stripping configuration
  mlx5: add support for HW packet padding

Or Ami (2):
  mlx5: add callbacks to support link (up / down) changes
  mlx5: allow operation in secondary processes

Yaacov Hazan (1):
  mlx5: add VLAN insertion offload

 doc/guides/nics/mlx5.rst               |  21 ++-
 doc/guides/rel_notes/release_16_04.rst |  27 +++
 drivers/net/mlx5/Makefile              |  15 ++
 drivers/net/mlx5/mlx5.c                |  91 ++++++++--
 drivers/net/mlx5/mlx5.h                |  21 +++
 drivers/net/mlx5/mlx5_ethdev.c         | 299 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  85 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.c           | 118 ++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h           |  22 +++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  66 +++++++-
 14 files changed, 746 insertions(+), 45 deletions(-)

-- 
2.1.4


* [dpdk-dev] [PATCH v3 1/5] mlx5: add callbacks to support link (up / down) changes
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
@ 2016-03-17 15:38     ` Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev; +Cc: Or Ami

From: Or Ami <ora@mellanox.com>

Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.

Signed-off-by: Or Ami <ora@mellanox.com>
---
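
Note: the burst function swap described above can be sketched outside of
DPDK as follows. This is a simplified illustration under assumptions, not
the driver code: struct port, burst_fn, real_burst(), null_burst() and
port_set_link() are hypothetical stand-ins for rte_eth_dev, the mlx5 burst
functions, the removed_*_burst() stubs and priv_set_link().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt; /* opaque packet type, stands in for struct rte_mbuf */

typedef uint16_t (*burst_fn)(void *queue, struct pkt **pkts, uint16_t n);

/* Hypothetical port object holding the data path entry points. */
struct port {
	burst_fn rx_burst;
	burst_fn tx_burst;
};

/* Stand-in for a working burst function: pretends all packets went through. */
static uint16_t
real_burst(void *queue, struct pkt **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Stub installed while the link is down: never handles any packet. */
static uint16_t
null_burst(void *queue, struct pkt **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/* Switch both entry points according to the requested link state. */
static void
port_set_link(struct port *p, int up)
{
	assert(p != NULL);
	p->rx_burst = up ? real_burst : null_burst;
	p->tx_burst = up ? real_burst : null_burst;
}
```

Applications keep calling through the same pointers, so nothing changes on
their side; a burst on a downed link simply reports zero packets.
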
 doc/guides/rel_notes/release_16_04.rst |  4 ++
 drivers/net/mlx5/mlx5.c                |  2 +
 drivers/net/mlx5/mlx5.h                |  2 +
 drivers/net/mlx5/mlx5_ethdev.c         | 85 ++++++++++++++++++++++++++++++++++
 4 files changed, 93 insertions(+)

diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 5f9eb3e..a011f0b 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -130,6 +130,10 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **Added mlx5 link up/down callbacks.**
+
+  Implemented callbacks to bring link up and down.
+
 * **Added af_packet dynamic removal function.**
 
   Af_packet device can now be detached using API, like other PMD devices.
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ad69ec2..14ac4ba 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -148,6 +148,8 @@ static const struct eth_dev_ops mlx5_dev_ops = {
 	.dev_configure = mlx5_dev_configure,
 	.dev_start = mlx5_dev_start,
 	.dev_stop = mlx5_dev_stop,
+	.dev_set_link_down = mlx5_set_link_down,
+	.dev_set_link_up = mlx5_set_link_up,
 	.dev_close = mlx5_dev_close,
 	.promiscuous_enable = mlx5_promiscuous_enable,
 	.promiscuous_disable = mlx5_promiscuous_disable,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 43b24fb..9a3f240 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -168,6 +168,8 @@ void mlx5_dev_link_status_handler(void *);
 void mlx5_dev_interrupt_handler(struct rte_intr_handle *, void *);
 void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
+int mlx5_set_link_down(struct rte_eth_dev *dev);
+int mlx5_set_link_up(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6704382..f609e0f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -968,3 +968,88 @@ priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_dev *dev)
 					   dev);
 	}
 }
+
+/**
+ * Change the link state (UP / DOWN).
+ *
+ * @param priv
+ *   Pointer to private data structure.
+ * @param up
+ *   Nonzero for link up, otherwise link down.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static int
+priv_set_link(struct priv *priv, int up)
+{
+	struct rte_eth_dev *dev = priv->dev;
+	int err;
+	unsigned int i;
+
+	if (up) {
+		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
+		if (err)
+			return err;
+		for (i = 0; i < priv->rxqs_n; i++)
+			if ((*priv->rxqs)[i]->sp)
+				break;
+		/* Check if an sp queue exists.
+		 * Note: Some old frames might be received.
+		 */
+		if (i == priv->rxqs_n)
+			dev->rx_pkt_burst = mlx5_rx_burst;
+		else
+			dev->rx_pkt_burst = mlx5_rx_burst_sp;
+		dev->tx_pkt_burst = mlx5_tx_burst;
+	} else {
+		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
+		if (err)
+			return err;
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+	}
+	return 0;
+}
+
+/**
+ * DPDK callback to bring the link DOWN.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_down(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 0);
+	priv_unlock(priv);
+	return err;
+}
+
+/**
+ * DPDK callback to bring the link UP.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+int
+mlx5_set_link_up(struct rte_eth_dev *dev)
+{
+	struct priv *priv = dev->data->dev_private;
+	int err;
+
+	priv_lock(priv);
+	err = priv_set_link(priv, 1);
+	priv_unlock(priv);
+	return err;
+}
-- 
2.1.4


* [dpdk-dev] [PATCH v3 2/5] mlx5: allow operation in secondary processes
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
@ 2016-03-17 15:38     ` Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev; +Cc: Or Ami

From: Or Ami <ora@mellanox.com>

Secondary processes are expected to use queues and other resources
allocated by the primary, however Verbs resources can only be shared
between processes when inherited through fork().

This limitation can be worked around for TX by configuring separate queues
from secondary processes.

Signed-off-by: Or Ami <ora@mellanox.com>
---
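
Note: in secondary processes the burst pointers initially refer to
*_secondary_setup functions, so process-local queues are only created once
the data path is first used. The bodies of those functions are not part of
this excerpt; the sketch below is a guess at the general "set up on first
call, then replace yourself" pattern, with hypothetical names throughout
(struct port, tx_burst_setup(), tx_burst_real()). It ignores the locking
the driver performs through sd->lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct port;

typedef uint16_t (*burst_fn)(struct port *p, uint16_t n);

/* Hypothetical per-process port state. */
struct port {
	burst_fn tx_burst;        /* current TX entry point */
	int queues_ready;         /* set once local queues exist */
	unsigned int setup_calls; /* how many times setup ran */
};

/* Regular data path: only usable after local queues have been created. */
static uint16_t
tx_burst_real(struct port *p, uint16_t n)
{
	assert(p != NULL);
	return p->queues_ready ? n : 0;
}

/* First-call trampoline: perform setup, swap the pointer, then forward. */
static uint16_t
tx_burst_setup(struct port *p, uint16_t n)
{
	assert(p != NULL);
	/* A real driver would allocate its own TX queues at this point. */
	p->queues_ready = 1;
	p->setup_calls++;
	/* Later calls go straight to the regular burst function. */
	p->tx_burst = tx_burst_real;
	return p->tx_burst(p, n);
}
```

Only the first burst pays the setup cost; once the pointer has been swapped
the trampoline is never entered again.
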
 doc/guides/nics/mlx5.rst               |   3 +-
 doc/guides/rel_notes/release_16_04.rst |   4 +
 drivers/net/mlx5/mlx5.c                |  42 +++++--
 drivers/net/mlx5/mlx5.h                |  12 ++
 drivers/net/mlx5/mlx5_ethdev.c         | 202 ++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_mac.c            |   6 +
 drivers/net/mlx5/mlx5_rxmode.c         |  12 ++
 drivers/net/mlx5/mlx5_rxq.c            |  46 ++++++++
 drivers/net/mlx5/mlx5_rxtx.h           |   8 ++
 drivers/net/mlx5/mlx5_stats.c          |   2 +-
 drivers/net/mlx5/mlx5_trigger.c        |   6 +
 drivers/net/mlx5/mlx5_txq.c            |  50 +++++++-
 12 files changed, 378 insertions(+), 15 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index edfbf1f..f0d8a7e 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -88,6 +88,7 @@ Features
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
 - Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+- Secondary process TX is supported.
 
 Limitations
 -----------
@@ -96,7 +97,7 @@ Limitations
 - Inner RSS for VXLAN frames is not supported yet.
 - Port statistics through software counters only.
 - Hardware checksum offloads for VXLAN inner header are not supported yet.
-- Secondary processes are not supported yet.
+- Secondary process RX is not supported.
 
 Configuration
 -------------
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index a011f0b..ceef9b7 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -134,6 +134,10 @@ This section should contain new features added in this release. Sample format:
 
   Implemented callbacks to bring link up and down.
 
+* **Added mlx5 support for operation in secondary processes.**
+
+  Implemented TX support in secondary processes (like mlx4).
+
 * **Added af_packet dynamic removal function.**
 
   Af_packet device can now be detached using API, like other PMD devices.
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 14ac4ba..998e6f0 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -78,7 +78,7 @@
 static void
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	void *tmp;
 	unsigned int i;
 
@@ -483,18 +483,44 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 			goto port_error;
 		}
 
-		eth_dev->data->dev_private = priv;
-		eth_dev->pci_dev = pci_dev;
+		/* Secondary processes have to use local storage for their
+		 * private data as well as a copy of eth_dev->data, but this
+		 * pointer must not be modified before burst functions are
+		 * actually called. */
+		if (mlx5_is_secondary()) {
+			struct mlx5_secondary_data *sd =
+				&mlx5_secondary_data[eth_dev->data->port_id];
+			sd->primary_priv = eth_dev->data->dev_private;
+			if (sd->primary_priv == NULL) {
+				ERROR("no private data for port %u",
+						eth_dev->data->port_id);
+				err = EINVAL;
+				goto port_error;
+			}
+			sd->shared_dev_data = eth_dev->data;
+			rte_spinlock_init(&sd->lock);
+			memcpy(sd->data.name, sd->shared_dev_data->name,
+				   sizeof(sd->data.name));
+			sd->data.dev_private = priv;
+			sd->data.rx_mbuf_alloc_failed = 0;
+			sd->data.mtu = ETHER_MTU;
+			sd->data.port_id = sd->shared_dev_data->port_id;
+			sd->data.mac_addrs = priv->mac;
+			eth_dev->tx_pkt_burst = mlx5_tx_burst_secondary_setup;
+			eth_dev->rx_pkt_burst = mlx5_rx_burst_secondary_setup;
+		} else {
+			eth_dev->data->dev_private = priv;
+			eth_dev->data->rx_mbuf_alloc_failed = 0;
+			eth_dev->data->mtu = ETHER_MTU;
+			eth_dev->data->mac_addrs = priv->mac;
+		}
 
+		eth_dev->pci_dev = pci_dev;
 		rte_eth_copy_pci_info(eth_dev, pci_dev);
-
 		eth_dev->driver = &mlx5_driver;
-		eth_dev->data->rx_mbuf_alloc_failed = 0;
-		eth_dev->data->mtu = ETHER_MTU;
-
 		priv->dev = eth_dev;
 		eth_dev->dev_ops = &mlx5_dev_ops;
-		eth_dev->data->mac_addrs = priv->mac;
+
 		TAILQ_INIT(&eth_dev->link_intr_cbs);
 
 		/* Bring Ethernet device up. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9a3f240..bad9283 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,7 @@
 #include <rte_ethdev.h>
 #include <rte_spinlock.h>
 #include <rte_interrupts.h>
+#include <rte_errno.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -125,6 +126,14 @@ struct priv {
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
+/* Local storage for secondary process data. */
+struct mlx5_secondary_data {
+	struct rte_eth_dev_data data; /* Local device data. */
+	struct priv *primary_priv; /* Private structure from primary. */
+	struct rte_eth_dev_data *shared_dev_data; /* Shared device data. */
+	rte_spinlock_t lock; /* Port configuration lock. */
+} mlx5_secondary_data[RTE_MAX_ETHPORTS];
+
 /**
  * Lock private structure to protect it from concurrent access in the
  * control path.
@@ -152,6 +161,8 @@ priv_unlock(struct priv *priv)
 
 /* mlx5_ethdev.c */
 
+struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
+int mlx5_is_secondary(void);
 int priv_get_ifname(const struct priv *, char (*)[IF_NAMESIZE]);
 int priv_ifreq(const struct priv *, int req, struct ifreq *);
 int priv_get_mtu(struct priv *, uint16_t *);
@@ -170,6 +181,7 @@ void priv_dev_interrupt_handler_uninstall(struct priv *, struct rte_eth_dev *);
 void priv_dev_interrupt_handler_install(struct priv *, struct rte_eth_dev *);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
+struct priv *mlx5_secondary_data_setup(struct priv *priv);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f609e0f..6b674a2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -59,6 +59,7 @@
 #include <rte_common.h>
 #include <rte_interrupts.h>
 #include <rte_alarm.h>
+#include <rte_malloc.h>
 #ifdef PEDANTIC
 #pragma GCC diagnostic error "-pedantic"
 #endif
@@ -68,6 +69,38 @@
 #include "mlx5_utils.h"
 
 /**
+ * Return private structure associated with an Ethernet device.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Pointer to private structure.
+ */
+struct priv *
+mlx5_get_priv(struct rte_eth_dev *dev)
+{
+	struct mlx5_secondary_data *sd;
+
+	if (!mlx5_is_secondary())
+		return dev->data->dev_private;
+	sd = &mlx5_secondary_data[dev->data->port_id];
+	return sd->data.dev_private;
+}
+
+/**
+ * Check if running as a secondary process.
+ *
+ * @return
+ *   Nonzero if running as a secondary process.
+ */
+inline int
+mlx5_is_secondary(void)
+{
+	return rte_eal_process_type() != RTE_PROC_PRIMARY;
+}
+
+/**
  * Get interface name from private structure.
  *
  * @param[in] priv
@@ -464,6 +497,9 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	ret = dev_configure(dev);
 	assert(ret >= 0);
@@ -482,7 +518,7 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 void
 mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	unsigned int max;
 	char ifname[IF_NAMESIZE];
 
@@ -536,7 +572,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 static int
 mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
 		.cmd = ETHTOOL_GSET
 	};
@@ -585,7 +621,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 int
 mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	int ret;
 
 	priv_lock(priv);
@@ -620,6 +656,9 @@ mlx5_dev_set_mtu(struct rte_eth_dev *dev, uint16_t mtu)
 	uint16_t (*rx_func)(void *, struct rte_mbuf **, uint16_t) =
 		mlx5_rx_burst;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	/* Set kernel interface MTU first. */
 	if (priv_set_mtu(priv, mtu)) {
@@ -694,6 +733,9 @@ mlx5_dev_get_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	priv_lock(priv);
 	if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
@@ -742,6 +784,9 @@ mlx5_dev_set_flow_ctrl(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
 	};
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	ifr.ifr_data = &ethpause;
 	ethpause.autoneg = fc_conf->autoneg;
 	if (((fc_conf->mode & RTE_FC_FULL) == RTE_FC_FULL) ||
@@ -1053,3 +1098,154 @@ mlx5_set_link_up(struct rte_eth_dev *dev)
 	priv_unlock(priv);
 	return err;
 }
+
+/**
+ * Configure secondary process queues from a private data pointer (primary
+ * or secondary) and update burst callbacks. Can take place only once.
+ *
+ * All queues must have been previously created by the primary process to
+ * avoid undefined behavior.
+ *
+ * @param priv
+ *   Private data pointer from either primary or secondary process.
+ *
+ * @return
+ *   Private data pointer from secondary process, NULL in case of error.
+ */
+struct priv *
+mlx5_secondary_data_setup(struct priv *priv)
+{
+	unsigned int port_id = 0;
+	struct mlx5_secondary_data *sd;
+	void **tx_queues;
+	void **rx_queues;
+	unsigned int nb_tx_queues;
+	unsigned int nb_rx_queues;
+	unsigned int i;
+
+	/* priv must be valid at this point. */
+	assert(priv != NULL);
+	/* priv->dev must also be valid but may point to local memory from
+	 * another process, possibly with the same address and must not
+	 * be dereferenced yet. */
+	assert(priv->dev != NULL);
+	/* Determine port ID by finding out where priv comes from. */
+	while (1) {
+		sd = &mlx5_secondary_data[port_id];
+		rte_spinlock_lock(&sd->lock);
+		/* Primary process? */
+		if (sd->primary_priv == priv)
+			break;
+		/* Secondary process? */
+		if (sd->data.dev_private == priv)
+			break;
+		rte_spinlock_unlock(&sd->lock);
+		if (++port_id == RTE_DIM(mlx5_secondary_data))
+			port_id = 0;
+	}
+	/* Switch to secondary private structure. If private data has already
+	 * been updated by another thread, there is nothing else to do. */
+	priv = sd->data.dev_private;
+	if (priv->dev->data == &sd->data)
+		goto end;
+	/* Sanity checks. Secondary private structure is supposed to point
+	 * to local eth_dev, itself still pointing to the shared device data
+	 * structure allocated by the primary process. */
+	assert(sd->shared_dev_data != &sd->data);
+	assert(sd->data.nb_tx_queues == 0);
+	assert(sd->data.tx_queues == NULL);
+	assert(sd->data.nb_rx_queues == 0);
+	assert(sd->data.rx_queues == NULL);
+	assert(priv != sd->primary_priv);
+	assert(priv->dev->data == sd->shared_dev_data);
+	assert(priv->txqs_n == 0);
+	assert(priv->txqs == NULL);
+	assert(priv->rxqs_n == 0);
+	assert(priv->rxqs == NULL);
+	nb_tx_queues = sd->shared_dev_data->nb_tx_queues;
+	nb_rx_queues = sd->shared_dev_data->nb_rx_queues;
+	/* Allocate local storage for queues. */
+	tx_queues = rte_zmalloc("secondary ethdev->tx_queues",
+				sizeof(sd->data.tx_queues[0]) * nb_tx_queues,
+				RTE_CACHE_LINE_SIZE);
+	rx_queues = rte_zmalloc("secondary ethdev->rx_queues",
+				sizeof(sd->data.rx_queues[0]) * nb_rx_queues,
+				RTE_CACHE_LINE_SIZE);
+	/* Lock to prevent control operations during setup. */
+	priv_lock(priv);
+	if (tx_queues == NULL || rx_queues == NULL)
+		goto error;
+	/* TX queues. */
+	for (i = 0; i != nb_tx_queues; ++i) {
+		struct txq *primary_txq = (*sd->primary_priv->txqs)[i];
+		struct txq *txq;
+
+		if (primary_txq == NULL)
+			continue;
+		txq = rte_calloc_socket("TXQ", 1, sizeof(*txq), 0,
+					primary_txq->socket);
+		if (txq != NULL) {
+			if (txq_setup(priv->dev,
+				      txq,
+				      primary_txq->elts_n * MLX5_PMD_SGE_WR_N,
+				      primary_txq->socket,
+				      NULL) == 0) {
+				txq->stats.idx = primary_txq->stats.idx;
+				tx_queues[i] = txq;
+				continue;
+			}
+			rte_free(txq);
+		}
+		while (i) {
+			txq = tx_queues[--i];
+			txq_cleanup(txq);
+			rte_free(txq);
+		}
+		goto error;
+	}
+	/* RX queues. */
+	for (i = 0; i != nb_rx_queues; ++i) {
+		struct rxq *primary_rxq = (*sd->primary_priv->rxqs)[i];
+
+		if (primary_rxq == NULL)
+			continue;
+		/* Not supported yet. */
+		rx_queues[i] = NULL;
+	}
+	/* Update everything. */
+	priv->txqs = (void *)tx_queues;
+	priv->txqs_n = nb_tx_queues;
+	priv->rxqs = (void *)rx_queues;
+	priv->rxqs_n = nb_rx_queues;
+	sd->data.rx_queues = rx_queues;
+	sd->data.tx_queues = tx_queues;
+	sd->data.nb_rx_queues = nb_rx_queues;
+	sd->data.nb_tx_queues = nb_tx_queues;
+	sd->data.dev_link = sd->shared_dev_data->dev_link;
+	sd->data.mtu = sd->shared_dev_data->mtu;
+	memcpy(sd->data.rx_queue_state, sd->shared_dev_data->rx_queue_state,
+	       sizeof(sd->data.rx_queue_state));
+	memcpy(sd->data.tx_queue_state, sd->shared_dev_data->tx_queue_state,
+	       sizeof(sd->data.tx_queue_state));
+	sd->data.dev_flags = sd->shared_dev_data->dev_flags;
+	/* Use local data from now on. */
+	rte_mb();
+	priv->dev->data = &sd->data;
+	rte_mb();
+	priv->dev->tx_pkt_burst = mlx5_tx_burst;
+	priv->dev->rx_pkt_burst = removed_rx_burst;
+	priv_unlock(priv);
+end:
+	/* More sanity checks. */
+	assert(priv->dev->tx_pkt_burst == mlx5_tx_burst);
+	assert(priv->dev->rx_pkt_burst == removed_rx_burst);
+	assert(priv->dev->data == &sd->data);
+	rte_spinlock_unlock(&sd->lock);
+	return priv;
+error:
+	priv_unlock(priv);
+	rte_free(tx_queues);
+	rte_free(rx_queues);
+	rte_spinlock_unlock(&sd->lock);
+	return NULL;
+}
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index edb05ad..c9cea48 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -209,6 +209,9 @@ mlx5_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	DEBUG("%p: removing MAC address from index %" PRIu32,
 	      (void *)dev, index);
@@ -474,6 +477,9 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	(void)vmdq;
 	priv_lock(priv);
 	DEBUG("%p: adding MAC address at index %" PRIu32,
diff --git a/drivers/net/mlx5/mlx5_rxmode.c b/drivers/net/mlx5/mlx5_rxmode.c
index 2bc005e..3a55f63 100644
--- a/drivers/net/mlx5/mlx5_rxmode.c
+++ b/drivers/net/mlx5/mlx5_rxmode.c
@@ -396,6 +396,9 @@ mlx5_promiscuous_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -417,6 +420,9 @@ mlx5_promiscuous_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->promisc_req = 0;
 	ret = priv_rehash_flows(priv);
@@ -438,6 +444,9 @@ mlx5_allmulticast_enable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 1;
 	ret = priv_rehash_flows(priv);
@@ -459,6 +468,9 @@ mlx5_allmulticast_disable(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int ret;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	priv->allmulti_req = 0;
 	ret = priv_rehash_flows(priv);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index df6fd92..3d84f41 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1395,6 +1395,9 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct rxq *rxq = (*priv->rxqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -1453,6 +1456,9 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (rxq == NULL)
 		return;
 	priv = rxq->priv;
@@ -1468,3 +1474,43 @@ mlx5_rx_queue_release(void *dpdk_rxq)
 	rte_free(rxq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for RX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal RX burst callback.
+ *
+ * @param dpdk_rxq
+ *   Generic pointer to RX queue structure.
+ * @param[out] pkts
+ *   Array to store received packets.
+ * @param pkts_n
+ *   Maximum number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully received (<= pkts_n).
+ */
+uint16_t
+mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct rxq *rxq = dpdk_rxq;
+	struct priv *priv = mlx5_secondary_data_setup(rxq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->rxqs_n; ++index)
+		if (((*primary_priv->rxqs)[index] == rxq) ||
+		    ((*priv->rxqs)[index] == rxq))
+			break;
+	if (index == priv->rxqs_n)
+		return 0;
+	rxq = (*priv->rxqs)[index];
+	return priv->dev->rx_pkt_burst(rxq, pkts, pkts_n);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6ac1a5a..6a0087e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -309,13 +309,21 @@ int rxq_setup(struct rte_eth_dev *, struct rxq *, uint16_t, unsigned int,
 int mlx5_rx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_rxconf *, struct rte_mempool *);
 void mlx5_rx_queue_release(void *);
+uint16_t mlx5_rx_burst_secondary_setup(void *dpdk_rxq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
+
 
 /* mlx5_txq.c */
 
 void txq_cleanup(struct txq *);
+int txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
+	  unsigned int socket, const struct rte_eth_txconf *conf);
+
 int mlx5_tx_queue_setup(struct rte_eth_dev *, uint16_t, uint16_t, unsigned int,
 			const struct rte_eth_txconf *);
 void mlx5_tx_queue_release(void *);
+uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n);
 
 /* mlx5_rxtx.c */
 
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 6d1a600..2d3cb51 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -55,7 +55,7 @@
 void
 mlx5_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct rte_eth_stats tmp = {0};
 	unsigned int i;
 	unsigned int idx;
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index b5ca7d4..e9b9a29 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -64,6 +64,9 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	struct priv *priv = dev->data->dev_private;
 	int err;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	if (priv->started) {
 		priv_unlock(priv);
@@ -104,6 +107,9 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 
+	if (mlx5_is_secondary())
+		return;
+
 	priv_lock(priv);
 	if (!priv->started) {
 		priv_unlock(priv);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 3364fca..6700af4 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -255,11 +255,11 @@ txq_cleanup(struct txq *txq)
  * @return
  *   0 on success, errno value on failure.
  */
-static int
+int
 txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	  unsigned int socket, const struct rte_eth_txconf *conf)
 {
-	struct priv *priv = dev->data->dev_private;
+	struct priv *priv = mlx5_get_priv(dev);
 	struct txq tmpl = {
 		.priv = priv,
 		.socket = socket
@@ -464,6 +464,9 @@ mlx5_tx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	struct txq *txq = (*priv->txqs)[idx];
 	int ret;
 
+	if (mlx5_is_secondary())
+		return -E_RTE_SECONDARY;
+
 	priv_lock(priv);
 	DEBUG("%p: configuring queue %u for %u descriptors",
 	      (void *)dev, idx, desc);
@@ -519,6 +522,9 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	struct priv *priv;
 	unsigned int i;
 
+	if (mlx5_is_secondary())
+		return;
+
 	if (txq == NULL)
 		return;
 	priv = txq->priv;
@@ -534,3 +540,43 @@ mlx5_tx_queue_release(void *dpdk_txq)
 	rte_free(txq);
 	priv_unlock(priv);
 }
+
+/**
+ * DPDK callback for TX in secondary processes.
+ *
+ * This function configures all queues from primary process information
+ * if necessary before reverting to the normal TX burst callback.
+ *
+ * @param dpdk_txq
+ *   Generic pointer to TX queue structure.
+ * @param[in] pkts
+ *   Packets to transmit.
+ * @param pkts_n
+ *   Number of packets in array.
+ *
+ * @return
+ *   Number of packets successfully transmitted (<= pkts_n).
+ */
+uint16_t
+mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
+			      uint16_t pkts_n)
+{
+	struct txq *txq = dpdk_txq;
+	struct priv *priv = mlx5_secondary_data_setup(txq->priv);
+	struct priv *primary_priv;
+	unsigned int index;
+
+	if (priv == NULL)
+		return 0;
+	primary_priv =
+		mlx5_secondary_data[priv->dev->data->port_id].primary_priv;
+	/* Look for queue index in both private structures. */
+	for (index = 0; index != priv->txqs_n; ++index)
+		if (((*primary_priv->txqs)[index] == txq) ||
+		    ((*priv->txqs)[index] == txq))
+			break;
+	if (index == priv->txqs_n)
+		return 0;
+	txq = (*priv->txqs)[index];
+	return priv->dev->tx_pkt_burst(txq, pkts, pkts_n);
+}
-- 
2.1.4


* [dpdk-dev] [PATCH v3 3/5] mlx5: add RX CRC stripping configuration
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
@ 2016-03-17 15:38     ` Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev; +Cc: Olga Shern

From: Olga Shern <olgas@mellanox.com>

Until now, CRC was always stripped by hardware. Stripping can now be
configured, provided MLNX_OFED >= 3.2 is installed.

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
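
Note on the length adjustment: when the CRC is left in the buffer, both RX
burst functions subtract 4 bytes from the length returned by hardware so
that the reported packet length does not include it. The following is only
a minimal standalone sketch of that arithmetic; rx_payload_len() is a
hypothetical helper name and is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not driver code): derive the
 * length reported to the application from the length returned by
 * hardware. crc_present is 1 when the 4-byte FCS was left in the buffer.
 */
static inline uint32_t
rx_payload_len(uint32_t hw_len, unsigned int crc_present)
{
	uint32_t fcs_len = (uint32_t)crc_present << 2; /* 0 or 4 bytes. */

	/* A frame that still carries its FCS cannot be shorter than it. */
	assert(hw_len >= fcs_len);
	return hw_len - fcs_len;
}
```

With crc_present set to 0 the length is passed through unchanged, which
matches the behavior before this patch.
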
 doc/guides/nics/mlx5.rst               |  2 ++
 doc/guides/rel_notes/release_16_04.rst |  6 ++++++
 drivers/net/mlx5/Makefile              |  5 +++++
 drivers/net/mlx5/mlx5.c                |  7 +++++++
 drivers/net/mlx5/mlx5.h                |  1 +
 drivers/net/mlx5/mlx5_rxq.c            | 24 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c           |  6 ++++--
 drivers/net/mlx5/mlx5_rxtx.h           |  1 +
 8 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index f0d8a7e..8b63f3f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,6 +84,7 @@ Features
 - Support for multiple MAC addresses.
 - VLAN filtering.
 - RX VLAN stripping.
+- RX CRC stripping configuration.
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
@@ -232,6 +233,7 @@ Currently supported by DPDK:
 
     - Flow director.
     - RX VLAN stripping.
+    - RX CRC stripping configuration.
 
 - Minimum firmware version:
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index ceef9b7..a498ef7 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -138,6 +138,12 @@ This section should contain new features added in this release. Sample format:
 
   Implemented TX support in secondary processes (like mlx4).
 
+* **Added mlx5 RX CRC stripping configuration.**
+
+  Until now, CRC was always stripped. It can now be configured.
+
+  Only available with Mellanox OFED >= 3.2.
+
 * **Added af_packet dynamic removal function.**
 
   Af_packet device can now be detached using API, like other PMD devices.
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 7076ae3..cc6de2d 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -137,6 +137,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CQ_RX_TCP_PACKET \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_FCS \
+		infiniband/verbs.h \
+		enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 998e6f0..acfb365 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -417,6 +417,13 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		DEBUG("VLAN stripping is %ssupported",
 		      (priv->hw_vlan_strip ? "" : "not "));
 
+#ifdef HAVE_VERBS_FCS
+		priv->hw_fcs_strip = !!(exp_device_attr.exp_device_cap_flags &
+					IBV_EXP_DEVICE_SCATTER_FCS);
+#endif /* HAVE_VERBS_FCS */
+		DEBUG("FCS stripping configuration is %ssupported",
+		      (priv->hw_fcs_strip ? "" : "not "));
+
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 #endif /* HAVE_EXP_QUERY_DEVICE */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bad9283..9690827 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -103,6 +103,7 @@ struct priv {
 	unsigned int hw_csum:1; /* Checksum offload is supported. */
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
+	unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 3d84f41..19a1119 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1258,6 +1258,30 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 				  0),
 #endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
 	};
+
+#ifdef HAVE_VERBS_FCS
+	/* By default, FCS (CRC) is stripped by hardware. */
+	if (dev->data->dev_conf.rxmode.hw_strip_crc) {
+		tmpl.crc_present = 0;
+	} else if (priv->hw_fcs_strip) {
+		/* Ask HW/Verbs to leave CRC in place when supported. */
+		attr.wq.flags |= IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS;
+		attr.wq.comp_mask |= IBV_EXP_CREATE_WQ_FLAGS;
+		tmpl.crc_present = 1;
+	} else {
+		WARN("%p: CRC stripping has been disabled but will still"
+		     " be performed by hardware, make sure MLNX_OFED and"
+		     " firmware are up to date",
+		     (void *)dev);
+		tmpl.crc_present = 0;
+	}
+	DEBUG("%p: CRC stripping is %s, %u bytes will be subtracted from"
+	      " incoming frames to hide it",
+	      (void *)dev,
+	      tmpl.crc_present ? "disabled" : "enabled",
+	      tmpl.crc_present << 2);
+#endif /* HAVE_VERBS_FCS */
+
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
 		ret = (errno ? errno : EINVAL);
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 4919189..34340d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -828,7 +828,8 @@ mlx5_rx_burst_sp(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		}
 		if (ret == 0)
 			break;
-		len = ret;
+		assert(ret >= (rxq->crc_present << 2));
+		len = ret - (rxq->crc_present << 2);
 		pkt_buf_len = len;
 		/*
 		 * Replace spent segments with new ones, concatenate and
@@ -1040,7 +1041,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		}
 		if (ret == 0)
 			break;
-		len = ret;
+		assert(ret >= (rxq->crc_present << 2));
+		len = ret - (rxq->crc_present << 2);
 		rep = __rte_mbuf_raw_alloc(rxq->mp);
 		if (unlikely(rep == NULL)) {
 			/*
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 6a0087e..61be3e4 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -116,6 +116,7 @@ struct rxq {
 	unsigned int csum:1; /* Enable checksum offloading. */
 	unsigned int csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int vlan_strip:1; /* Enable VLAN stripping. */
+	unsigned int crc_present:1; /* CRC must be subtracted. */
 	union {
 		struct rxq_elt_sp (*sp)[]; /* Scattered RX elements. */
 		struct rxq_elt (*no_sp)[]; /* RX elements. */
-- 
2.1.4


* [dpdk-dev] [PATCH v3 4/5] mlx5: add support for HW packet padding
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
                       ` (2 preceding siblings ...)
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
@ 2016-03-17 15:38     ` Adrien Mazarguil
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
  2016-03-22 16:35     ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Bruce Richardson
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev; +Cc: Olga Shern

From: Olga Shern <olgas@mellanox.com>

Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.

When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.

In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.

This is disabled by default since this can also decrease performance for
unaligned packet sizes.

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
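
Note on the alignment argument: the sketch below is a rough arithmetic
model of the description above, not a statement about what the hardware
actually does. It assumes a 64-byte cache line and a frame length that
includes the 4-byte CRC; pci_write_len() and CACHE_LINE are hypothetical
names. getenv_int() mirrors the logic of mlx5_getenv_int() from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE 64u /* Assumed cache line size, illustration only. */

/* Same logic as mlx5_getenv_int(): an unset variable yields 0. */
static int
getenv_int(const char *name)
{
	const char *val = getenv(name);

	return (val == NULL) ? 0 : atoi(val);
}

/*
 * Hypothetical model: number of bytes written for one frame.
 * frame_len includes the 4-byte CRC.
 */
static size_t
pci_write_len(size_t frame_len, int crc_stripped, int padding)
{
	size_t len;

	assert(frame_len >= 4);
	len = frame_len - (crc_stripped ? 4 : 0);
	if (padding) /* Round up to the next cache line boundary. */
		len = (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
	return len;
}
```

In this model a 64-byte frame with CRC stripping shrinks to 60 bytes and
padding brings it back to 64, while a 69-byte frame grows from 65 to 128
bytes, which is the kind of overhead that motivates keeping padding off by
default.
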
 doc/guides/nics/mlx5.rst               | 14 ++++++++++++++
 doc/guides/rel_notes/release_16_04.rst |  7 +++++++
 drivers/net/mlx5/Makefile              |  5 +++++
 drivers/net/mlx5/mlx5.c                | 28 ++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                |  5 +++++
 drivers/net/mlx5/mlx5_rxq.c            | 15 +++++++++++++++
 6 files changed, 74 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 8b63f3f..9df30be 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -156,6 +156,20 @@ Environment variables
   lower performance when there is no backpressure, it is not enabled by
   default.
 
+- ``MLX5_PMD_ENABLE_PADDING``
+
+  Enables HW packet padding in PCI bus transactions.
+
+  When packet size is cache aligned and CRC stripping is enabled, 4 fewer
+  bytes are written to the PCI bus. Enabling padding makes such packets
+  aligned again.
+
+  In cases where PCI bandwidth is the bottleneck, padding can improve
+  performance by 10%.
+
+  This is disabled by default since this can also decrease performance for
+  unaligned packet sizes.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index a498ef7..8eb423f 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -144,6 +144,13 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **Added mlx5 optional packet padding by HW.**
+
+  Added an option to round PCI bus transactions up to a multiple of the
+  cache line size for better alignment.
+
+  Only available with Mellanox OFED >= 3.2.
+
 * **Added af_packet dynamic removal function.**
 
   Af_packet device can now be detached using API, like other PMD devices.
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index cc6de2d..a6a3cab 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -142,6 +142,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CREATE_WQ_FLAG_SCATTER_FCS \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_RX_END_PADDING \
+		infiniband/verbs.h \
+		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index acfb365..94eefb9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -68,6 +68,25 @@
 #include "mlx5_defs.h"
 
 /**
+ * Retrieve integer value from environment variable.
+ *
+ * @param[in] name
+ *   Environment variable name.
+ *
+ * @return
+ *   Integer value, 0 if the variable is not set.
+ */
+int
+mlx5_getenv_int(const char *name)
+{
+	const char *val = getenv(name);
+
+	if (val == NULL)
+		return 0;
+	return atoi(val);
+}
+
+/**
  * DPDK callback to close the device.
  *
  * Destroy all queues and objects, free memory.
@@ -332,6 +351,9 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #ifdef HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS
 			IBV_EXP_DEVICE_ATTR_VLAN_OFFLOADS |
 #endif /* HAVE_EXP_DEVICE_ATTR_VLAN_OFFLOADS */
+#ifdef HAVE_VERBS_RX_END_PADDING
+			IBV_EXP_DEVICE_ATTR_RX_PAD_END_ALIGN |
+#endif /* HAVE_VERBS_RX_END_PADDING */
 			0;
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
@@ -424,6 +446,12 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		DEBUG("FCS stripping configuration is %ssupported",
 		      (priv->hw_fcs_strip ? "" : "not "));
 
+#ifdef HAVE_VERBS_RX_END_PADDING
+		priv->hw_padding = !!exp_device_attr.rx_pad_end_addr_align;
+#endif /* HAVE_VERBS_RX_END_PADDING */
+		DEBUG("hardware RX end alignment padding is %ssupported",
+		      (priv->hw_padding ? "" : "not "));
+
 #else /* HAVE_EXP_QUERY_DEVICE */
 		priv->ind_table_max_size = RSS_INDIRECTION_TABLE_SIZE;
 #endif /* HAVE_EXP_QUERY_DEVICE */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9690827..1904d54 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -104,6 +104,7 @@ struct priv {
 	unsigned int hw_csum_l2tun:1; /* Same for L2 tunnels. */
 	unsigned int hw_vlan_strip:1; /* VLAN stripping is supported. */
 	unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
+	unsigned int hw_padding:1; /* End alignment padding is supported. */
 	unsigned int vf:1; /* This is a VF device. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
@@ -160,6 +161,10 @@ priv_unlock(struct priv *priv)
 	rte_spinlock_unlock(&priv->lock);
 }
 
+/* mlx5.c */
+
+int mlx5_getenv_int(const char *);
+
 /* mlx5_ethdev.c */
 
 struct priv *mlx5_get_priv(struct rte_eth_dev *dev);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 19a1119..c8af77f 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1282,6 +1282,21 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      tmpl.crc_present << 2);
 #endif /* HAVE_VERBS_FCS */
 
+#ifdef HAVE_VERBS_RX_END_PADDING
+	if (!mlx5_getenv_int("MLX5_PMD_ENABLE_PADDING"))
+		; /* Nothing else to do. */
+	else if (priv->hw_padding) {
+		INFO("%p: enabling packet padding on queue %p",
+		     (void *)dev, (void *)rxq);
+		attr.wq.flags |= IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING;
+		attr.wq.comp_mask |= IBV_EXP_CREATE_WQ_FLAGS;
+	} else
+		WARN("%p: packet padding has been requested but is not"
+		     " supported, make sure MLNX_OFED and firmware are"
+		     " up to date",
+		     (void *)dev);
+#endif /* HAVE_VERBS_RX_END_PADDING */
+
 	tmpl.wq = ibv_exp_create_wq(priv->ctx, &attr.wq);
 	if (tmpl.wq == NULL) {
 		ret = (errno ? errno : EINVAL);
-- 
2.1.4


* [dpdk-dev] [PATCH v3 5/5] mlx5: add VLAN insertion offload
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
                       ` (3 preceding siblings ...)
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
@ 2016-03-17 15:38     ` Adrien Mazarguil
  2016-03-22 16:35     ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Bruce Richardson
  5 siblings, 0 replies; 20+ messages in thread
From: Adrien Mazarguil @ 2016-03-17 15:38 UTC (permalink / raw)
  To: dev; +Cc: Yaacov Hazan

From: Yaacov Hazan <yaacovh@mellanox.com>

VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
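
Note on the software fallback: it shifts the two MAC addresses 4 bytes into
the headroom and writes the 802.1Q tag (TPID 0x8100 followed by the TCI) in
the gap. The sketch below shows the same byte manipulation on a plain
buffer so it can be tried outside DPDK; struct frame and
frame_insert_vlan() are hypothetical stand-ins for the mbuf and for
insert_vlan_sw(), not driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: the frame starts at buf + off. */
struct frame {
	uint8_t buf[128];
	uint16_t off; /* Headroom available before the frame. */
	uint16_t len; /* Frame length in bytes. */
};

/* Insert an 802.1Q tag; returns 0 on success, EINVAL without headroom. */
static int
frame_insert_vlan(struct frame *f, uint16_t tci)
{
	uint8_t *start;

	if (f->off < 4)
		return EINVAL;
	assert(f->len >= 12);
	assert((size_t)f->off + f->len <= sizeof(f->buf));
	start = f->buf + f->off - 4;
	/* Move destination and source MAC addresses 4 bytes backward. */
	memmove(start, start + 4, 12);
	/* Write TPID and TCI in network byte order. */
	start[12] = 0x81;
	start[13] = 0x00;
	start[14] = (uint8_t)(tci >> 8);
	start[15] = (uint8_t)(tci & 0xff);
	f->off -= 4;
	f->len += 4;
	return 0;
}
```

The original EtherType and payload are left where they were; only the
12 address bytes move, which keeps the per-packet cost small.
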
 doc/guides/nics/mlx5.rst               |   2 +
 doc/guides/rel_notes/release_16_04.rst |   6 ++
 drivers/net/mlx5/Makefile              |   5 ++
 drivers/net/mlx5/mlx5.c                |  12 +++-
 drivers/net/mlx5/mlx5.h                |   1 +
 drivers/net/mlx5/mlx5_ethdev.c         |  12 ++--
 drivers/net/mlx5/mlx5_rxtx.c           | 112 +++++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_rxtx.h           |  13 ++++
 drivers/net/mlx5/mlx5_txq.c            |  16 ++++-
 9 files changed, 151 insertions(+), 28 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 9df30be..925cb9e 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,6 +84,7 @@ Features
 - Support for multiple MAC addresses.
 - VLAN filtering.
 - RX VLAN stripping.
+- TX VLAN insertion.
 - RX CRC stripping configuration.
 - Promiscuous mode.
 - Multicast promiscuous mode.
@@ -247,6 +248,7 @@ Currently supported by DPDK:
 
     - Flow director.
     - RX VLAN stripping.
+    - TX VLAN insertion.
     - RX CRC stripping configuration.
 
 - Minimum firmware version:
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 8eb423f..087eb25 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -151,6 +151,12 @@ This section should contain new features added in this release. Sample format:
 
   Only available with Mellanox OFED >= 3.2.
 
+* **Added mlx5 TX VLAN insertion support.**
+
+  Added support for TX VLAN insertion.
+
+  Only available with Mellanox OFED >= 3.2.
+
 * **Added af_packet dynamic removal function.**
 
   Af_packet device can now be detached using API, like other PMD devices.
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index a6a3cab..7e6d589 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -147,6 +147,11 @@ mlx5_autoconf.h: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/verbs.h \
 		enum IBV_EXP_CREATE_WQ_FLAG_RX_END_PADDING \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_VLAN_INSERTION \
+		infiniband/verbs.h \
+		enum IBV_EXP_RECEIVE_WQ_CVLAN_INSERTION \
+		$(AUTOCONF_OUTPUT)
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 94eefb9..ea4b6e3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -260,6 +260,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 	struct ibv_context *attr_ctx = NULL;
 	struct ibv_device_attr device_attr;
 	unsigned int vf;
+	unsigned int mps;
 	int idx;
 	int i;
 
@@ -305,8 +306,14 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		       PCI_DEVICE_ID_MELLANOX_CONNECTX4VF) ||
 		      (pci_dev->id.device_id ==
 		       PCI_DEVICE_ID_MELLANOX_CONNECTX4LXVF));
-		INFO("PCI information matches, using device \"%s\" (VF: %s)",
-		     list[i]->name, (vf ? "true" : "false"));
+		/* Multi-packet send is only supported by ConnectX-4 Lx PF. */
+		mps = (pci_dev->id.device_id ==
+		       PCI_DEVICE_ID_MELLANOX_CONNECTX4LX);
+		INFO("PCI information matches, using device \"%s\" (VF: %s,"
+		     " MPS: %s)",
+		     list[i]->name,
+		     vf ? "true" : "false",
+		     mps ? "true" : "false");
 		attr_ctx = ibv_open_device(list[i]);
 		err = errno;
 		break;
@@ -457,6 +464,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 #endif /* HAVE_EXP_QUERY_DEVICE */
 
 		priv->vf = vf;
+		priv->mps = mps;
 		/* Allocate and register default RSS hash keys. */
 		priv->rss_conf = rte_calloc(__func__, hash_rxq_init_n,
 					    sizeof((*priv->rss_conf)[0]), 0);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 1904d54..d012f50 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -106,6 +106,7 @@ struct priv {
 	unsigned int hw_fcs_strip:1; /* FCS stripping is supported. */
 	unsigned int hw_padding:1; /* End alignment padding is supported. */
 	unsigned int vf:1; /* This is a VF device. */
+	unsigned int mps:1; /* Whether multi-packet send is supported. */
 	unsigned int pending_alarm:1; /* An alarm is pending. */
 	/* RX/TX queues. */
 	unsigned int rxqs_n; /* RX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 6b674a2..66115d2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -544,12 +544,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		  DEV_RX_OFFLOAD_UDP_CKSUM |
 		  DEV_RX_OFFLOAD_TCP_CKSUM) :
 		 0);
-	info->tx_offload_capa =
-		(priv->hw_csum ?
-		 (DEV_TX_OFFLOAD_IPV4_CKSUM |
-		  DEV_TX_OFFLOAD_UDP_CKSUM |
-		  DEV_TX_OFFLOAD_TCP_CKSUM) :
-		 0);
+	info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+	if (priv->hw_csum)
+		info->tx_offload_capa |=
+			(DEV_TX_OFFLOAD_IPV4_CKSUM |
+			 DEV_TX_OFFLOAD_UDP_CKSUM |
+			 DEV_TX_OFFLOAD_TCP_CKSUM);
 	if (priv_get_ifname(priv, &ifname) == 0)
 		info->if_index = if_nametoindex(ifname);
 	/* FIXME: RETA update/query API expects the callee to know the size of
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 34340d2..dfe852f 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -333,6 +333,36 @@ txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
 	txq_mp2mr(txq, mp);
 }
 
+/**
+ * Insert VLAN using mbuf headroom space.
+ *
+ * @param buf
+ *   Buffer for VLAN insertion.
+ *
+ * @return
+ *   0 on success, errno value on failure.
+ */
+static inline int
+insert_vlan_sw(struct rte_mbuf *buf)
+{
+	uintptr_t addr;
+	uint32_t vlan;
+	uint16_t head_room_len = rte_pktmbuf_headroom(buf);
+
+	if (head_room_len < 4)
+		return EINVAL;
+
+	addr = rte_pktmbuf_mtod(buf, uintptr_t);
+	vlan = htonl(0x81000000 | buf->vlan_tci);
+	memmove((void *)(addr - 4), (void *)addr, 12);
+	memcpy((void *)(addr + 8), &vlan, sizeof(vlan));
+
+	SET_DATA_OFF(buf, head_room_len - 4);
+	DATA_LEN(buf) += 4;
+
+	return 0;
+}
+
 #if MLX5_PMD_SGE_WR_N > 1
 
 /**
@@ -534,6 +564,9 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		unsigned int sent_size = 0;
 #endif
 		uint32_t send_flags = 0;
+#ifdef HAVE_VERBS_VLAN_INSERTION
+		int insert_vlan = 0;
+#endif /* HAVE_VERBS_VLAN_INSERTION */
 
 		if (i + 1 < max)
 			rte_prefetch0(buf_next);
@@ -554,6 +587,18 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (RTE_ETH_IS_TUNNEL_PKT(buf->packet_type))
 				send_flags |= IBV_EXP_QP_BURST_TUNNEL;
 		}
+		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
+#ifdef HAVE_VERBS_VLAN_INSERTION
+			if (!txq->priv->mps)
+				insert_vlan = 1;
+			else
+#endif /* HAVE_VERBS_VLAN_INSERTION */
+			{
+				err = insert_vlan_sw(buf);
+				if (unlikely(err))
+					goto stop;
+			}
+		}
 		if (likely(segs == 1)) {
 			uintptr_t addr;
 			uint32_t length;
@@ -577,13 +622,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			}
 			/* Put packet into send queue. */
 #if MLX5_PMD_MAX_INLINE > 0
-			if (length <= txq->max_inline)
-				err = txq->send_pending_inline
-					(txq->qp,
-					 (void *)addr,
-					 length,
-					 send_flags);
-			else
+			if (length <= txq->max_inline) {
+#ifdef HAVE_VERBS_VLAN_INSERTION
+				if (insert_vlan)
+					err = txq->send_pending_inline_vlan
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* HAVE_VERBS_VLAN_INSERTION */
+					err = txq->send_pending_inline
+						(txq->qp,
+						 (void *)addr,
+						 length,
+						 send_flags);
+			} else
 #endif
 			{
 				/* Retrieve Memory Region key for this
@@ -597,12 +652,23 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 					elt->buf = NULL;
 					goto stop;
 				}
-				err = txq->send_pending
-					(txq->qp,
-					 addr,
-					 length,
-					 lkey,
-					 send_flags);
+#ifdef HAVE_VERBS_VLAN_INSERTION
+				if (insert_vlan)
+					err = txq->send_pending_vlan
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags,
+						 &buf->vlan_tci);
+				else
+#endif /* HAVE_VERBS_VLAN_INSERTION */
+					err = txq->send_pending
+						(txq->qp,
+						 addr,
+						 length,
+						 lkey,
+						 send_flags);
 			}
 			if (unlikely(err))
 				goto stop;
@@ -619,11 +685,21 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 			if (ret.length == (unsigned int)-1)
 				goto stop;
 			/* Put SG list into send queue. */
-			err = txq->send_pending_sg_list
-				(txq->qp,
-				 sges,
-				 ret.num,
-				 send_flags);
+#ifdef HAVE_VERBS_VLAN_INSERTION
+			if (insert_vlan)
+				err = txq->send_pending_sg_list_vlan
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags,
+					 &buf->vlan_tci);
+			else
+#endif /* HAVE_VERBS_VLAN_INSERTION */
+				err = txq->send_pending_sg_list
+					(txq->qp,
+					 sges,
+					 ret.num,
+					 send_flags);
 			if (unlikely(err))
 				goto stop;
 #ifdef MLX5_PMD_SOFT_COUNTERS
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 61be3e4..0e2b607 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -255,11 +255,20 @@ struct txq {
 	struct priv *priv; /* Back pointer to private data. */
 	int32_t (*poll_cnt)(struct ibv_cq *cq, uint32_t max);
 	int (*send_pending)();
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	int (*send_pending_vlan)();
+#endif
 #if MLX5_PMD_MAX_INLINE > 0
 	int (*send_pending_inline)();
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	int (*send_pending_inline_vlan)();
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	int (*send_pending_sg_list)();
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	int (*send_pending_sg_list_vlan)();
+#endif
 #endif
 	int (*send_flush)(struct ibv_qp *qp);
 	struct ibv_cq *cq; /* Completion Queue. */
@@ -283,7 +292,11 @@ struct txq {
 	/* Elements used only for init part are here. */
 	linear_t (*elts_linear)[]; /* Linearized buffers. */
 	struct ibv_mr *mr_linear; /* Memory Region for linearized buffers. */
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	struct ibv_exp_qp_burst_family_v1 *if_qp; /* QP burst interface. */
+#else
 	struct ibv_exp_qp_burst_family *if_qp; /* QP burst interface. */
+#endif
 	struct ibv_exp_cq_family *if_cq; /* CQ interface. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
 	unsigned int socket; /* CPU socket ID for allocations. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 6700af4..ce2bb42 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -400,10 +400,13 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 		.intf_scope = IBV_EXP_INTF_GLOBAL,
 		.intf = IBV_EXP_INTF_QP_BURST,
 		.obj = tmpl.qp,
+#ifdef HAVE_VERBS_VLAN_INSERTION
+		.intf_version = 1,
+#endif
 #ifdef HAVE_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR
-		/* Multi packet send WR can only be used outside of VF. */
+		/* Enable multi-packet send if supported. */
 		.family_flags =
-			(!priv->vf ?
+			(priv->mps ?
 			 IBV_EXP_QP_BURST_CREATE_ENABLE_MULTI_PACKET_SEND_WR :
 			 0),
 #endif
@@ -422,11 +425,20 @@ txq_setup(struct rte_eth_dev *dev, struct txq *txq, uint16_t desc,
 	txq->poll_cnt = txq->if_cq->poll_cnt;
 #if MLX5_PMD_MAX_INLINE > 0
 	txq->send_pending_inline = txq->if_qp->send_pending_inline;
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	txq->send_pending_inline_vlan = txq->if_qp->send_pending_inline_vlan;
+#endif
 #endif
 #if MLX5_PMD_SGE_WR_N > 1
 	txq->send_pending_sg_list = txq->if_qp->send_pending_sg_list;
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	txq->send_pending_sg_list_vlan = txq->if_qp->send_pending_sg_list_vlan;
+#endif
 #endif
 	txq->send_pending = txq->if_qp->send_pending;
+#ifdef HAVE_VERBS_VLAN_INSERTION
+	txq->send_pending_vlan = txq->if_qp->send_pending_vlan;
+#endif
 	txq->send_flush = txq->if_qp->send_flush;
 	DEBUG("%p: txq updated with %p", (void *)txq, (void *)&tmpl);
 	/* Pre-register known mempools. */
-- 
2.1.4
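
For readers following the software fallback path in the patch above, here is a
minimal standalone sketch of the same technique as insert_vlan_sw(): shift the
two 6-byte MAC addresses 4 bytes into the headroom, then write the 802.1Q tag
(TPID 0x8100 followed by the TCI) into the gap. It is only an illustration
under stated assumptions, not the driver code: struct pkt and pkt_insert_vlan()
are hypothetical stand-ins for struct rte_mbuf and the mbuf accessors, and the
minimum-length check is an addition that the patch does not have.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: a flat buffer with headroom. */
struct pkt {
	uint8_t storage[128]; /* Backing memory (headroom + frame). */
	uint16_t data_off;    /* Offset of the first frame byte (headroom). */
	uint16_t data_len;    /* Frame length in bytes. */
	uint16_t vlan_tci;    /* Tag control information to insert. */
};

/* Insert an 802.1Q tag in software; 0 on success, errno value on failure. */
static int
pkt_insert_vlan(struct pkt *p)
{
	uint8_t *addr;
	uint32_t vlan;

	assert(p != NULL);
	/* Need 4 bytes of headroom and at least both MAC addresses. */
	if (p->data_off < 4 || p->data_len < 12)
		return EINVAL;
	addr = p->storage + p->data_off;
	/* TPID 0x8100 in the upper 16 bits, TCI in the lower 16 bits. */
	vlan = htonl(UINT32_C(0x81000000) | p->vlan_tci);
	/* Move destination and source MAC addresses 4 bytes backwards. */
	memmove(addr - 4, addr, 12);
	/* The tag lands right after the relocated MAC addresses. */
	memcpy(addr + 8, &vlan, sizeof(vlan));
	p->data_off = (uint16_t)(p->data_off - 4);
	p->data_len = (uint16_t)(p->data_len + 4);
	return 0;
}
```

The EtherType and payload stay where they are; only the 12 address bytes move,
which is why the copy cost is constant regardless of frame size.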

* Re: [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5
  2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
                       ` (4 preceding siblings ...)
  2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
@ 2016-03-22 16:35     ` Bruce Richardson
  5 siblings, 0 replies; 20+ messages in thread
From: Bruce Richardson @ 2016-03-22 16:35 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: dev

On Thu, Mar 17, 2016 at 04:38:53PM +0100, Adrien Mazarguil wrote:
> This patchset adds to mlx5 a few features available in mlx4 (TX from
> secondary processes) or provided by Verbs (support for HW packet padding,
> TX VLAN insertion).
> 
> Release notes and documentation are updated accordingly.
> 
> Changes in v3:
> - Removed compilation option for TX VLAN insertion, the method to use is now
>   determined at runtime.
> - Modified release notes slightly.
> 
> Changes in v2:
> - Added support for CRC stripping configuration.
> - Updated packet padding feature macro and made cosmetic changes to its
>   implementation to match CRC stripping's.
> - Updated release notes about packet padding.
> - Updated TX VLAN insertion documentation.
> 
> Olga Shern (2):
>   mlx5: add RX CRC stripping configuration
>   mlx5: add support for HW packet padding
> 
> Or Ami (2):
>   mlx5: add callbacks to support link (up / down) changes
>   mlx5: allow operation in secondary processes
> 
> Yaacov Hazan (1):
>   mlx5: add VLAN insertion offload
>
Applied to dpdk-next-net/rel_16_04.

/Bruce

end of thread, other threads:[~2016-03-22 16:35 UTC | newest]

Thread overview: 20+ messages
2016-02-22 18:19 [dpdk-dev] [PATCH 0/4] Implement missing features in mlx5 Adrien Mazarguil
2016-02-22 18:19 ` [dpdk-dev] [PATCH 1/4] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
2016-02-22 18:19 ` [dpdk-dev] [PATCH 2/4] mlx5: allow operation in secondary processes Adrien Mazarguil
2016-02-22 18:19 ` [dpdk-dev] [PATCH 3/4] mlx5: add support for HW packet padding Adrien Mazarguil
2016-02-22 18:19 ` [dpdk-dev] [PATCH 4/4] mlx5: add VLAN insertion offload Adrien Mazarguil
2016-03-03 14:27 ` [dpdk-dev] [PATCH v2 0/5] Implement missing features in mlx5 Adrien Mazarguil
2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
2016-03-03 14:27   ` [dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
2016-03-11 15:24     ` Bruce Richardson
2016-03-16 13:45       ` Adrien Mazarguil
2016-03-17 15:38   ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Adrien Mazarguil
2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 1/5] mlx5: add callbacks to support link (up / down) changes Adrien Mazarguil
2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 2/5] mlx5: allow operation in secondary processes Adrien Mazarguil
2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 3/5] mlx5: add RX CRC stripping configuration Adrien Mazarguil
2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 4/5] mlx5: add support for HW packet padding Adrien Mazarguil
2016-03-17 15:38     ` [dpdk-dev] [PATCH v3 5/5] mlx5: add VLAN insertion offload Adrien Mazarguil
2016-03-22 16:35     ` [dpdk-dev] [PATCH v3 0/5] Implement missing features in mlx5 Bruce Richardson
