patches for DPDK stable branches
* [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Adrien Mazarguil

Patchset of fixes taken from the 16.11 master branch and rebased on top of
the 16.07.1 tag.

Adrien Mazarguil (1):
  net/mlx5: fix Rx VLAN offload capability report

Nélio Laranjeiro (6):
  net/mlx5: support Mellanox OFED 3.4
  net/mlx5: re-factorize functions
  net/mlx5: fix inline logic
  net/mlx5: fix Rx function selection
  net/mlx5: fix link speed capability information
  net/mlx5: fix support for newer link speeds

Olga Shern (1):
  net/mlx5: fix link status report

Olivier Gournet (1):
  net/mlx5: fix initialization in secondary process

Raslan Darawsheh (1):
  net/mlx5: fix removing VLAN filter

Sagi Grimberg (1):
  net/mlx5: fix possible NULL dereference in Rx path

Yaacov Hazan (3):
  net/mlx5: fix inconsistent return value in flow director
  net/mlx5: refactor allocation of flow director queues
  net/mlx5: fix flow director drop mode

 doc/guides/nics/mlx5.rst       |   3 +-
 drivers/net/mlx5/Makefile      |  20 ++
 drivers/net/mlx5/mlx5.c        |   1 +
 drivers/net/mlx5/mlx5.h        |   4 +
 drivers/net/mlx5/mlx5_ethdev.c | 159 +++++++++++--
 drivers/net/mlx5/mlx5_fdir.c   | 270 +++++++++++++++-------
 drivers/net/mlx5/mlx5_prm.h    |   6 +
 drivers/net/mlx5/mlx5_rxq.c    |   4 +-
 drivers/net/mlx5/mlx5_rxtx.c   | 505 ++++++++++-------------------------------
 drivers/net/mlx5/mlx5_rxtx.h   |   7 +-
 drivers/net/mlx5/mlx5_txq.c    |   9 +-
 drivers/net/mlx5/mlx5_vlan.c   |   3 +-
 12 files changed, 491 insertions(+), 500 deletions(-)

-- 
2.1.4


* [dpdk-stable] [PATCH 01/14] net/mlx5: support Mellanox OFED 3.4
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

Some macros were renamed in Mellanox OFED 3.4. Probe for the new
MLX5_OPCODE_TSO name at build time and, when it is absent, define it from
MLX5_OPCODE_LSO_MPW for compatibility with OFED 3.3.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/Makefile    | 5 +++++
 drivers/net/mlx5/mlx5_prm.h  | 6 ++++++
 drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index f6d3938..2c13c30 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -116,6 +116,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/mlx5_hw.h \
 		enum MLX5_ETH_VLAN_INLINE_HEADER_SIZE \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_MLX5_OPCODE_TSO \
+		infiniband/mlx5_hw.h \
+		enum MLX5_OPCODE_TSO \
+		$(AUTOCONF_OUTPUT)
 
 # Create mlx5_autoconf.h or update it in case it differs from the new one.
 
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 4383009..e23d5cb 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -44,6 +44,8 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#include "mlx5_autoconf.h"
+
 /* Get CQE owner bit. */
 #define MLX5_CQE_OWNER(op_own) ((op_own) & MLX5_CQE_OWNER_MASK)
 
@@ -71,6 +73,10 @@
 /* Room for inline data in multi-packet WQE. */
 #define MLX5_MWQE64_INL_DATA 28
 
+#ifndef HAVE_VERBS_MLX5_OPCODE_TSO
+#define MLX5_OPCODE_TSO MLX5_OPCODE_LSO_MPW /* Compat with OFED 3.3. */
+#endif
+
 /* Subset of struct mlx5_wqe_eth_seg. */
 struct mlx5_wqe_eth_seg_small {
 	uint32_t rsvd0;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index a13cbc7..cc62e78 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -908,7 +908,7 @@ mlx5_mpw_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length)
 	mpw->wqe->mpw.eseg.rsvd2 = 0;
 	mpw->wqe->mpw.ctrl.data[0] = htonl((MLX5_OPC_MOD_MPW << 24) |
 					   (txq->wqe_ci << 8) |
-					   MLX5_OPCODE_LSO_MPW);
+					   MLX5_OPCODE_TSO);
 	mpw->wqe->mpw.ctrl.data[2] = 0;
 	mpw->wqe->mpw.ctrl.data[3] = 0;
 	mpw->data.dseg[0] = &mpw->wqe->mpw.dseg[0];
@@ -1107,7 +1107,7 @@ mlx5_mpw_inline_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length)
 	mpw->wqe = &(*txq->wqes)[idx];
 	mpw->wqe->mpw_inl.ctrl.data[0] = htonl((MLX5_OPC_MOD_MPW << 24) |
 					       (txq->wqe_ci << 8) |
-					       MLX5_OPCODE_LSO_MPW);
+					       MLX5_OPCODE_TSO);
 	mpw->wqe->mpw_inl.ctrl.data[2] = 0;
 	mpw->wqe->mpw_inl.ctrl.data[3] = 0;
 	mpw->wqe->mpw_inl.eseg.mss = htons(length);
-- 
2.1.4


* [dpdk-stable] [PATCH 02/14] net/mlx5: fix possible NULL dereference in Rx path
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Sagi Grimberg, Adrien Mazarguil

From: Sagi Grimberg <sagi@grimberg.me>

The user is allowed to call ->rx_pkt_burst() even when no free
mbufs are left in the pool. In this scenario, allocating a replacement
(rep) mbuf fails on the first iteration, where pkt is still NULL, and
the cleanup code then dereferences the NULL pkt (refcount reset and
free).

Fix this by checking pkt before freeing it.
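
A minimal standalone sketch of the guard this fix introduces (simplified
segment type, not the driver's actual mbuf handling):

    #include <stdlib.h>

    struct seg {
        struct seg *next;
    };

    /* Free a partially assembled chain. pkt may legitimately be NULL
     * when allocation failed before the first segment was built; bail
     * out silently in that case instead of dereferencing it. */
    static void
    drop_partial_chain(struct seg *pkt)
    {
        if (pkt == NULL)
            return;
        do {
            struct seg *next = pkt->next;

            free(pkt);
            pkt = next;
        } while (pkt != NULL);
    }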

Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index cc62e78..59e8183 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1572,6 +1572,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		rte_prefetch0(wqe);
 		rep = rte_mbuf_raw_alloc(rxq->mp);
 		if (unlikely(rep == NULL)) {
+			++rxq->stats.rx_nombuf;
+			if (!pkt) {
+				/*
+				 * no buffers before we even started,
+				 * bail out silently.
+				 */
+				break;
+			}
 			while (pkt != seg) {
 				assert(pkt != (*rxq->elts)[idx]);
 				seg = NEXT(pkt);
@@ -1579,7 +1587,6 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 				__rte_mbuf_raw_free(pkt);
 				pkt = seg;
 			}
-			++rxq->stats.rx_nombuf;
 			break;
 		}
 		if (!pkt) {
-- 
2.1.4


* [dpdk-stable] [PATCH 03/14] net/mlx5: fix inconsistent return value in flow director
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

In DPDK, return values are negative errno codes on failure.
Since internal functions in the mlx5 driver return positive
errno values, they must be negated before being returned to
the DPDK layer.
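
As a standalone illustration of the convention (hypothetical helper
names, not the driver's actual ones):

    #include <errno.h>

    /* Internal helpers return 0 on success or a positive errno. */
    static int
    priv_do_op(int arg)
    {
        if (arg < 0)
            return EINVAL;
        return 0;
    }

    /* DPDK-facing callbacks report failure as a negative errno,
     * hence the negation at the boundary. */
    int
    dev_op(int arg)
    {
        int ret = priv_do_op(arg);

        return -ret; /* 0 stays 0, EINVAL becomes -EINVAL */
    }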

Fixes: 76f5c99e6840 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
---
 drivers/net/mlx5/mlx5_fdir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 070edde..0372936 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -955,7 +955,7 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 		     enum rte_filter_op filter_op,
 		     void *arg)
 {
-	int ret = -EINVAL;
+	int ret = EINVAL;
 	struct priv *priv = dev->data->dev_private;
 
 	switch (filter_type) {
@@ -970,5 +970,5 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 		break;
 	}
 
-	return ret;
+	return -ret;
 }
-- 
2.1.4


* [dpdk-stable] [PATCH 04/14] net/mlx5: fix Rx VLAN offload capability report
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Adrien Mazarguil

From: Adrien Mazarguil <adrien.mazarguil@6wind.com>

This capability is implemented but not reported.

Fixes: f3db9489188a ("mlx5: support Rx VLAN stripping")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 17588a5..b3b7820 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -583,7 +583,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		 (DEV_RX_OFFLOAD_IPV4_CKSUM |
 		  DEV_RX_OFFLOAD_UDP_CKSUM |
 		  DEV_RX_OFFLOAD_TCP_CKSUM) :
-		 0);
+		 0) |
+		(priv->hw_vlan_strip ? DEV_RX_OFFLOAD_VLAN_STRIP : 0);
 	if (!priv->mps)
 		info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
 	if (priv->hw_csum)
-- 
2.1.4


* [dpdk-stable] [PATCH 05/14] net/mlx5: fix removing VLAN filter
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Raslan Darawsheh, Adrien Mazarguil

From: Raslan Darawsheh <rdarawsheh@asaltech.com>

memmove was given the number of elements following index i as its byte
count, while it should move that number of elements multiplied by the size
of each element.
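
A standalone sketch of the distinction (the array type is chosen for
illustration only):

    #include <stdint.h>
    #include <string.h>

    /* Remove entry i from an array holding n valid uint16_t elements.
     * memmove() takes a byte count, so the number of elements after i
     * must be multiplied by the element size. */
    static void
    remove_entry(uint16_t *arr, unsigned int n, unsigned int i)
    {
        memmove(&arr[i], &arr[i + 1],
                sizeof(arr[0]) * (n - i - 1));
        arr[n - 1] = 0;
    }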

Fixes: e9086978 ("mlx5: support VLAN filtering")

Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
---
 drivers/net/mlx5/mlx5_vlan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_vlan.c b/drivers/net/mlx5/mlx5_vlan.c
index 64e599d..1b0fa40 100644
--- a/drivers/net/mlx5/mlx5_vlan.c
+++ b/drivers/net/mlx5/mlx5_vlan.c
@@ -87,7 +87,8 @@ vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 		--priv->vlan_filter_n;
 		memmove(&priv->vlan_filter[i],
 			&priv->vlan_filter[i + 1],
-			priv->vlan_filter_n - i);
+			sizeof(priv->vlan_filter[i]) *
+			(priv->vlan_filter_n - i));
 		priv->vlan_filter[priv->vlan_filter_n] = 0;
 	} else {
 		assert(i == priv->vlan_filter_n);
-- 
2.1.4


* [dpdk-stable] [PATCH 06/14] net/mlx5: refactor allocation of flow director queues
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

This is done to prepare support for drop queues, which are not related to
existing Rx queues and need to be managed separately.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.h      |   1 +
 drivers/net/mlx5/mlx5_fdir.c | 229 ++++++++++++++++++++++++++++---------------
 drivers/net/mlx5/mlx5_rxq.c  |   2 +
 drivers/net/mlx5/mlx5_rxtx.h |   4 +-
 4 files changed, 156 insertions(+), 80 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 65241a6..b1b657d 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -257,6 +257,7 @@ void mlx5_dev_stop(struct rte_eth_dev *);
 
 /* mlx5_fdir.c */
 
+void priv_fdir_queue_destroy(struct priv *, struct fdir_queue *);
 int fdir_init_filters_list(struct priv *);
 void priv_fdir_delete_filters_list(struct priv *);
 void priv_fdir_disable(struct priv *);
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 0372936..4a82dc9 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -400,6 +400,145 @@ create_flow:
 }
 
 /**
+ * Destroy a flow director queue.
+ *
+ * @param fdir_queue
+ *   Flow director queue to be destroyed.
+ */
+void
+priv_fdir_queue_destroy(struct priv *priv, struct fdir_queue *fdir_queue)
+{
+	struct mlx5_fdir_filter *fdir_filter;
+
+	/* Disable filter flows still applying to this queue. */
+	LIST_FOREACH(fdir_filter, priv->fdir_filter_list, next) {
+		unsigned int idx = fdir_filter->queue;
+		struct rxq_ctrl *rxq_ctrl =
+			container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
+
+		assert(idx < priv->rxqs_n);
+		if (fdir_queue == rxq_ctrl->fdir_queue &&
+		    fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(fdir_filter->flow));
+			fdir_filter->flow = NULL;
+		}
+	}
+	assert(fdir_queue->qp);
+	claim_zero(ibv_destroy_qp(fdir_queue->qp));
+	assert(fdir_queue->ind_table);
+	claim_zero(ibv_exp_destroy_rwq_ind_table(fdir_queue->ind_table));
+	if (fdir_queue->wq)
+		claim_zero(ibv_exp_destroy_wq(fdir_queue->wq));
+	if (fdir_queue->cq)
+		claim_zero(ibv_destroy_cq(fdir_queue->cq));
+#ifndef NDEBUG
+	memset(fdir_queue, 0x2a, sizeof(*fdir_queue));
+#endif
+	rte_free(fdir_queue);
+}
+
+/**
+ * Create a flow director queue.
+ *
+ * @param priv
+ *   Private structure.
+ * @param wq
+ *   Work queue to route matched packets to, NULL if one needs to
+ *   be created.
+ *
+ * @return
+ *   Related flow director queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_fdir_queue_create(struct priv *priv, struct ibv_exp_wq *wq,
+		       unsigned int socket)
+{
+	struct fdir_queue *fdir_queue;
+
+	fdir_queue = rte_calloc_socket(__func__, 1, sizeof(*fdir_queue),
+				       0, socket);
+	if (!fdir_queue) {
+		ERROR("cannot allocate flow director queue");
+		return NULL;
+	}
+	assert(priv->pd);
+	assert(priv->ctx);
+	if (!wq) {
+		fdir_queue->cq = ibv_exp_create_cq(
+			priv->ctx, 1, NULL, NULL, 0,
+			&(struct ibv_exp_cq_init_attr){
+				.comp_mask = 0,
+			});
+		if (!fdir_queue->cq) {
+			ERROR("cannot create flow director CQ");
+			goto error;
+		}
+		fdir_queue->wq = ibv_exp_create_wq(
+			priv->ctx,
+			&(struct ibv_exp_wq_init_attr){
+				.wq_type = IBV_EXP_WQT_RQ,
+				.max_recv_wr = 1,
+				.max_recv_sge = 1,
+				.pd = priv->pd,
+				.cq = fdir_queue->cq,
+			});
+		if (!fdir_queue->wq) {
+			ERROR("cannot create flow director WQ");
+			goto error;
+		}
+		wq = fdir_queue->wq;
+	}
+	fdir_queue->ind_table = ibv_exp_create_rwq_ind_table(
+		priv->ctx,
+		&(struct ibv_exp_rwq_ind_table_init_attr){
+			.pd = priv->pd,
+			.log_ind_tbl_size = 0,
+			.ind_tbl = &wq,
+			.comp_mask = 0,
+		});
+	if (!fdir_queue->ind_table) {
+		ERROR("cannot create flow director indirection table");
+		goto error;
+	}
+	fdir_queue->qp = ibv_exp_create_qp(
+		priv->ctx,
+		&(struct ibv_exp_qp_init_attr){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_EXP_QP_INIT_ATTR_PD |
+				IBV_EXP_QP_INIT_ATTR_PORT |
+				IBV_EXP_QP_INIT_ATTR_RX_HASH,
+			.pd = priv->pd,
+			.rx_hash_conf = &(struct ibv_exp_rx_hash_conf){
+				.rx_hash_function =
+					IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_hash_default_key_len,
+				.rx_hash_key = rss_hash_default_key,
+				.rx_hash_fields_mask = 0,
+				.rwq_ind_tbl = fdir_queue->ind_table,
+			},
+			.port_num = priv->port,
+		});
+	if (!fdir_queue->qp) {
+		ERROR("cannot create flow director hash RX QP");
+		goto error;
+	}
+	return fdir_queue;
+error:
+	assert(fdir_queue);
+	assert(!fdir_queue->qp);
+	if (fdir_queue->ind_table)
+		claim_zero(ibv_exp_destroy_rwq_ind_table
+			   (fdir_queue->ind_table));
+	if (fdir_queue->wq)
+		claim_zero(ibv_exp_destroy_wq(fdir_queue->wq));
+	if (fdir_queue->cq)
+		claim_zero(ibv_destroy_cq(fdir_queue->cq));
+	rte_free(fdir_queue);
+	return NULL;
+}
+
+/**
  * Get flow director queue for a specific RX queue, create it in case
  * it does not exist.
  *
@@ -416,74 +555,15 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 {
 	struct rxq_ctrl *rxq_ctrl =
 		container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
-	struct fdir_queue *fdir_queue = &rxq_ctrl->fdir_queue;
-	struct ibv_exp_rwq_ind_table *ind_table = NULL;
-	struct ibv_qp *qp = NULL;
-	struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
-	struct ibv_exp_rx_hash_conf hash_conf;
-	struct ibv_exp_qp_init_attr qp_init_attr;
-	int err = 0;
-
-	/* Return immediately if it has already been created. */
-	if (fdir_queue->qp != NULL)
-		return fdir_queue;
-
-	ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
-		.pd = priv->pd,
-		.log_ind_tbl_size = 0,
-		.ind_tbl = &rxq_ctrl->wq,
-		.comp_mask = 0,
-	};
+	struct fdir_queue *fdir_queue = rxq_ctrl->fdir_queue;
 
-	errno = 0;
-	ind_table = ibv_exp_create_rwq_ind_table(priv->ctx,
-						 &ind_init_attr);
-	if (ind_table == NULL) {
-		/* Not clear whether errno is set. */
-		err = (errno ? errno : EINVAL);
-		ERROR("RX indirection table creation failed with error %d: %s",
-		      err, strerror(err));
-		goto error;
-	}
-
-	/* Create fdir_queue qp. */
-	hash_conf = (struct ibv_exp_rx_hash_conf){
-		.rx_hash_function = IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
-		.rx_hash_key_len = rss_hash_default_key_len,
-		.rx_hash_key = rss_hash_default_key,
-		.rx_hash_fields_mask = 0,
-		.rwq_ind_tbl = ind_table,
-	};
-	qp_init_attr = (struct ibv_exp_qp_init_attr){
-		.max_inl_recv = 0, /* Currently not supported. */
-		.qp_type = IBV_QPT_RAW_PACKET,
-		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
-			      IBV_EXP_QP_INIT_ATTR_RX_HASH),
-		.pd = priv->pd,
-		.rx_hash_conf = &hash_conf,
-		.port_num = priv->port,
-	};
-
-	qp = ibv_exp_create_qp(priv->ctx, &qp_init_attr);
-	if (qp == NULL) {
-		err = (errno ? errno : EINVAL);
-		ERROR("hash RX QP creation failure: %s", strerror(err));
-		goto error;
+	assert(rxq_ctrl->wq);
+	if (fdir_queue == NULL) {
+		fdir_queue = priv_fdir_queue_create(priv, rxq_ctrl->wq,
+						    rxq_ctrl->socket);
+		rxq_ctrl->fdir_queue = fdir_queue;
 	}
-
-	fdir_queue->ind_table = ind_table;
-	fdir_queue->qp = qp;
-
 	return fdir_queue;
-
-error:
-	if (qp != NULL)
-		claim_zero(ibv_destroy_qp(qp));
-
-	if (ind_table != NULL)
-		claim_zero(ibv_exp_destroy_rwq_ind_table(ind_table));
-
-	return NULL;
 }
 
 /**
@@ -601,7 +681,6 @@ priv_fdir_disable(struct priv *priv)
 {
 	unsigned int i;
 	struct mlx5_fdir_filter *mlx5_fdir_filter;
-	struct fdir_queue *fdir_queue;
 
 	/* Run on every flow director filter and destroy flow handle. */
 	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
@@ -618,23 +697,15 @@ priv_fdir_disable(struct priv *priv)
 		}
 	}
 
-	/* Run on every RX queue to destroy related flow director QP and
-	 * indirection table. */
+	/* Destroy flow director context in each RX queue. */
 	for (i = 0; (i != priv->rxqs_n); i++) {
 		struct rxq_ctrl *rxq_ctrl =
 			container_of((*priv->rxqs)[i], struct rxq_ctrl, rxq);
 
-		fdir_queue = &rxq_ctrl->fdir_queue;
-		if (fdir_queue->qp != NULL) {
-			claim_zero(ibv_destroy_qp(fdir_queue->qp));
-			fdir_queue->qp = NULL;
-		}
-
-		if (fdir_queue->ind_table != NULL) {
-			claim_zero(ibv_exp_destroy_rwq_ind_table
-				   (fdir_queue->ind_table));
-			fdir_queue->ind_table = NULL;
-		}
+		if (!rxq_ctrl->fdir_queue)
+			continue;
+		priv_fdir_queue_destroy(priv, rxq_ctrl->fdir_queue);
+		rxq_ctrl->fdir_queue = NULL;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 99027d2..8e02440 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -745,6 +745,8 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 
 	DEBUG("cleaning up %p", (void *)rxq_ctrl);
 	rxq_free_elts(rxq_ctrl);
+	if (rxq_ctrl->fdir_queue != NULL)
+		priv_fdir_queue_destroy(rxq_ctrl->priv, rxq_ctrl->fdir_queue);
 	if (rxq_ctrl->if_wq != NULL) {
 		assert(rxq_ctrl->priv != NULL);
 		assert(rxq_ctrl->priv->ctx != NULL);
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 952f88c..f68149e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -87,6 +87,8 @@ struct mlx5_txq_stats {
 struct fdir_queue {
 	struct ibv_qp *qp; /* Associated RX QP. */
 	struct ibv_exp_rwq_ind_table *ind_table; /* Indirection table. */
+	struct ibv_exp_wq *wq; /* Work queue. */
+	struct ibv_cq *cq; /* Completion queue. */
 };
 
 struct priv;
@@ -128,7 +130,7 @@ struct rxq_ctrl {
 	struct ibv_cq *cq; /* Completion Queue. */
 	struct ibv_exp_wq *wq; /* Work Queue. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
-	struct fdir_queue fdir_queue; /* Flow director queue. */
+	struct fdir_queue *fdir_queue; /* Flow director queue. */
 	struct ibv_mr *mr; /* Memory Region (for mp). */
 	struct ibv_exp_wq_family *if_wq; /* WQ burst interface. */
 	struct ibv_exp_cq_family_v1 *if_cq; /* CQ interface. */
-- 
2.1.4


* [dpdk-stable] [PATCH 07/14] net/mlx5: fix flow director drop mode
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

Rejected packets were routed to a polled queue. This patch routes them to
a dummy queue which is not polled.

Fixes: 76f5c99e6840 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 doc/guides/nics/mlx5.rst     |  3 ++-
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_fdir.c | 41 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 5c10cd3..8923173 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,7 +84,8 @@ Features
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
-- Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+- Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and
+  RTE_ETH_FDIR_REJECT).
 - Secondary process TX is supported.
 - KVM and VMware ESX SR-IOV modes are supported.
 
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b1b657d..9e8ac7e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -134,6 +134,7 @@ struct priv {
 	unsigned int (*reta_idx)[]; /* RETA index table. */
 	unsigned int reta_idx_n; /* RETA index size. */
 	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
+	struct fdir_queue *fdir_drop_queue; /* Flow director drop queue. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 4a82dc9..1acf682 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -75,6 +75,7 @@ struct fdir_flow_desc {
 struct mlx5_fdir_filter {
 	LIST_ENTRY(mlx5_fdir_filter) next;
 	uint16_t queue; /* Queue assigned to if FDIR match. */
+	enum rte_eth_fdir_behavior behavior;
 	struct fdir_flow_desc desc;
 	struct ibv_exp_flow *flow;
 };
@@ -567,6 +568,33 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 }
 
 /**
+ * Get flow director drop queue. Create it if it does not exist.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   Flow director drop queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_get_fdir_drop_queue(struct priv *priv)
+{
+	struct fdir_queue *fdir_queue = priv->fdir_drop_queue;
+
+	if (fdir_queue == NULL) {
+		unsigned int socket = SOCKET_ID_ANY;
+
+		/* Select a known NUMA socket if possible. */
+		if (priv->rxqs_n && (*priv->rxqs)[0])
+			socket = container_of((*priv->rxqs)[0],
+					      struct rxq_ctrl, rxq)->socket;
+		fdir_queue = priv_fdir_queue_create(priv, NULL, socket);
+		priv->fdir_drop_queue = fdir_queue;
+	}
+	return fdir_queue;
+}
+
+/**
  * Enable flow director filter and create steering rules.
  *
  * @param priv
@@ -588,7 +616,11 @@ priv_fdir_filter_enable(struct priv *priv,
 		return 0;
 
 	/* Get fdir_queue for specific queue. */
-	fdir_queue = priv_get_fdir_queue(priv, mlx5_fdir_filter->queue);
+	if (mlx5_fdir_filter->behavior == RTE_ETH_FDIR_REJECT)
+		fdir_queue = priv_get_fdir_drop_queue(priv);
+	else
+		fdir_queue = priv_get_fdir_queue(priv,
+						 mlx5_fdir_filter->queue);
 
 	if (fdir_queue == NULL) {
 		ERROR("failed to create flow director rxq for queue %d",
@@ -707,6 +739,10 @@ priv_fdir_disable(struct priv *priv)
 		priv_fdir_queue_destroy(priv, rxq_ctrl->fdir_queue);
 		rxq_ctrl->fdir_queue = NULL;
 	}
+	if (priv->fdir_drop_queue) {
+		priv_fdir_queue_destroy(priv, priv->fdir_drop_queue);
+		priv->fdir_drop_queue = NULL;
+	}
 }
 
 /**
@@ -807,8 +843,9 @@ priv_fdir_filter_add(struct priv *priv,
 		return err;
 	}
 
-	/* Set queue. */
+	/* Set action parameters. */
 	mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+	mlx5_fdir_filter->behavior = fdir_filter->action.behavior;
 
 	/* Convert to mlx5 filter descriptor. */
 	fdir_filter_to_flow_desc(fdir_filter,
-- 
2.1.4


* [dpdk-stable] [PATCH 08/14] net/mlx5: re-factorize functions
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

Rework the logic of wqe_write() and wqe_write_vlan(), which are pretty
similar, into a single function.
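
The merged function handles the VLAN case by patching the copied header
rather than taking a separate code path; a standalone sketch of that
splice (a plain buffer stands in for the WQE inline header):

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    /* Insert an 802.1Q tag at offset 12 of a 16-byte inline Ethernet
     * header copy, right after the destination and source MAC
     * addresses. The caller compensates by rewinding the data pointer
     * by 4 bytes and growing the length accordingly. */
    static void
    insert_vlan(uint8_t hdr[16], uint16_t vlan_tci)
    {
        uint32_t vlan = htonl(0x81000000 | vlan_tci);

        memcpy(hdr + 12, &vlan, sizeof(vlan));
    }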

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 98 ++++++++++----------------------------------
 1 file changed, 22 insertions(+), 76 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 59e8183..47d6d68 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -290,8 +290,8 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  *   Pointer to TX queue structure.
  * @param wqe
  *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
+ * @param buf
+ *   Buffer.
  * @param length
  *   Packet length.
  * @param lkey
@@ -299,54 +299,24 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  */
 static inline void
 mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe *wqe,
-	       uintptr_t addr, uint32_t length, uint32_t lkey)
-{
-	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
-	wqe->wqe.ctrl.data[2] = 0;
-	wqe->wqe.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(MLX5_ETH_INLINE_HEADER_SIZE);
-	/* Copy the first 16 bytes into inline header. */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->wqe.eseg.inline_hdr_start,
-		   (uint8_t *)(uintptr_t)addr,
-		   MLX5_ETH_INLINE_HEADER_SIZE);
-	addr += MLX5_ETH_INLINE_HEADER_SIZE;
-	length -= MLX5_ETH_INLINE_HEADER_SIZE;
-	/* Store remaining data in data segment. */
-	wqe->wqe.dseg.byte_count = htonl(length);
-	wqe->wqe.dseg.lkey = lkey;
-	wqe->wqe.dseg.addr = htonll(addr);
-	/* Increment consumer index. */
-	++txq->wqe_ci;
-}
-
-/**
- * Write a regular WQE with VLAN.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- * @param vlan_tci
- *   VLAN field to insert in packet.
- */
-static inline void
-mlx5_wqe_write_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
-		    uintptr_t addr, uint32_t length, uint32_t lkey,
-		    uint16_t vlan_tci)
+	       struct rte_mbuf *buf, uint32_t length, uint32_t lkey)
 {
-	uint32_t vlan = htonl(0x81000000 | vlan_tci);
-
+	uintptr_t addr = rte_pktmbuf_mtod(buf, uintptr_t);
+
+	rte_mov16((uint8_t *)&wqe->wqe.eseg.inline_hdr_start,
+		  (uint8_t *)addr);
+	addr += 16;
+	length -= 16;
+	/* Need to insert VLAN ? */
+	if (buf->ol_flags & PKT_TX_VLAN_PKT) {
+		uint32_t vlan = htonl(0x81000000 | buf->vlan_tci);
+
+		memcpy((uint8_t *)&wqe->wqe.eseg.inline_hdr_start + 12,
+		       &vlan, sizeof(vlan));
+		addr -= sizeof(vlan);
+		length += sizeof(vlan);
+	}
+	/* Write the WQE. */
 	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
 	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
 	wqe->wqe.ctrl.data[2] = 0;
@@ -355,20 +325,7 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
 	wqe->inl.eseg.rsvd1 = 0;
 	wqe->inl.eseg.mss = 0;
 	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(MLX5_ETH_VLAN_INLINE_HEADER_SIZE);
-	/*
-	 * Copy 12 bytes of source & destination MAC address.
-	 * Copy 4 bytes of VLAN.
-	 * Copy 2 bytes of Ether type.
-	 */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->wqe.eseg.inline_hdr_start,
-		   (uint8_t *)(uintptr_t)addr, 12);
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->wqe.eseg.inline_hdr_start + 12),
-		   &vlan, sizeof(vlan));
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->wqe.eseg.inline_hdr_start + 16),
-		   (uint8_t *)((uintptr_t)addr + 12), 2);
-	addr += MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	length -= MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
+	wqe->wqe.eseg.inline_hdr_sz = htons(16);
 	/* Store remaining data in data segment. */
 	wqe->wqe.dseg.byte_count = htonl(length);
 	wqe->wqe.dseg.lkey = lkey;
@@ -609,7 +566,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 	do {
 		struct rte_mbuf *buf = *(pkts++);
 		unsigned int elts_head_next;
-		uintptr_t addr;
 		uint32_t length;
 		uint32_t lkey;
 		unsigned int segs_n = buf->nb_segs;
@@ -631,8 +587,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		rte_prefetch0(wqe);
 		if (pkts_n)
 			rte_prefetch0(*pkts);
-		/* Retrieve buffer information. */
-		addr = rte_pktmbuf_mtod(buf, uintptr_t);
 		length = DATA_LEN(buf);
 		/* Update element. */
 		(*txq->elts)[elts_head] = buf;
@@ -642,11 +596,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						       volatile void *));
 		/* Retrieve Memory Region key for this memory pool. */
 		lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-		if (buf->ol_flags & PKT_TX_VLAN_PKT)
-			mlx5_wqe_write_vlan(txq, wqe, addr, length, lkey,
-					    buf->vlan_tci);
-		else
-			mlx5_wqe_write(txq, wqe, addr, length, lkey);
+		mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		/* Should we enable HW CKSUM offload */
 		if (buf->ol_flags &
 		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -810,11 +760,7 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		} else {
 			/* Retrieve Memory Region key for this memory pool. */
 			lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			if (buf->ol_flags & PKT_TX_VLAN_PKT)
-				mlx5_wqe_write_vlan(txq, wqe, addr, length,
-						    lkey, buf->vlan_tci);
-			else
-				mlx5_wqe_write(txq, wqe, addr, length, lkey);
+			mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		}
 		while (--segs_n) {
 			/*
-- 
2.1.4


* [dpdk-stable] [PATCH 09/14] net/mlx5: fix inline logic
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu
  Cc: Nélio Laranjeiro, Adrien Mazarguil, Vasily Philipov

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

To perform well, the NIC expects large packets to be referenced through a
pointer to a cache-aligned address. The old inline code could break this
assumption, which hurts performance.
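
The new logic caps the inlined part so that the remaining data segment
starts on a cache-line boundary; a standalone sketch of that computation
(constant and names are illustrative, and max_inline is a non-zero
multiple of the cache line size, as in the driver):

    #include <stdint.h>

    #define CACHE_LINE_SIZE 64u

    /* Inline at most max_inline bytes of the packet at addr, stopping
     * on a cache-line boundary so that the pointer left in the data
     * segment is cache aligned for large packets. */
    static uint32_t
    inline_len(uintptr_t addr, uint32_t length, uint32_t max_inline)
    {
        uintptr_t end = (addr + max_inline) &
                        ~(uintptr_t)(CACHE_LINE_SIZE - 1);

        return ((end - addr) > length) ? length : (uint32_t)(end - addr);
    }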

Fixes: 2a66cf378954 ("net/mlx5: support inline send")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
---
 drivers/net/mlx5/mlx5_ethdev.c |   4 -
 drivers/net/mlx5/mlx5_rxtx.c   | 422 ++++++++++-------------------------------
 drivers/net/mlx5/mlx5_rxtx.h   |   3 +-
 drivers/net/mlx5/mlx5_txq.c    |   9 +-
 4 files changed, 103 insertions(+), 335 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index b3b7820..205ce9c 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1398,10 +1398,6 @@ priv_select_tx_function(struct priv *priv)
 	} else if ((priv->sriov == 0) && priv->mps) {
 		priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw;
 		DEBUG("selected MPW TX function");
-	} else if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
-		priv->dev->tx_pkt_burst = mlx5_tx_burst_inline;
-		DEBUG("selected inline TX function (%u >= %u queues)",
-		      priv->txqs_n, priv->txqs_inline);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 47d6d68..bb76f2c 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -294,179 +294,99 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  *   Buffer.
  * @param length
  *   Packet length.
- * @param lkey
- *   Memory region lkey.
+ *
+ * @return ds
+ *   Number of DS elements consumed.
  */
-static inline void
+static inline unsigned int
 mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe *wqe,
-	       struct rte_mbuf *buf, uint32_t length, uint32_t lkey)
+	       struct rte_mbuf *buf, uint32_t length)
 {
+	uintptr_t raw = (uintptr_t)&wqe->wqe.eseg.inline_hdr_start;
+	uint16_t ds;
+	uint16_t pkt_inline_sz = 16;
 	uintptr_t addr = rte_pktmbuf_mtod(buf, uintptr_t);
+	struct mlx5_wqe_data_seg *dseg = NULL;
 
-	rte_mov16((uint8_t *)&wqe->wqe.eseg.inline_hdr_start,
-		  (uint8_t *)addr);
-	addr += 16;
+	assert(length >= 16);
+	/* Start the known and common part of the WQE structure. */
+	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
+	wqe->wqe.ctrl.data[2] = 0;
+	wqe->wqe.ctrl.data[3] = 0;
+	wqe->wqe.eseg.rsvd0 = 0;
+	wqe->wqe.eseg.rsvd1 = 0;
+	wqe->wqe.eseg.mss = 0;
+	wqe->wqe.eseg.rsvd2 = 0;
+	/* Start by copying the Ethernet Header. */
+	rte_mov16((uint8_t *)raw, (uint8_t *)addr);
 	length -= 16;
-	/* Need to insert VLAN ? */
+	addr += 16;
+	/* Replace the Ethernet type by the VLAN if necessary. */
 	if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 		uint32_t vlan = htonl(0x81000000 | buf->vlan_tci);
 
-		memcpy((uint8_t *)&wqe->wqe.eseg.inline_hdr_start + 12,
+		memcpy((uint8_t *)(raw + 16 - sizeof(vlan)),
 		       &vlan, sizeof(vlan));
 		addr -= sizeof(vlan);
 		length += sizeof(vlan);
 	}
-	/* Write the WQE. */
-	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
-	wqe->wqe.ctrl.data[2] = 0;
-	wqe->wqe.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(16);
-	/* Store remaining data in data segment. */
-	wqe->wqe.dseg.byte_count = htonl(length);
-	wqe->wqe.dseg.lkey = lkey;
-	wqe->wqe.dseg.addr = htonll(addr);
-	/* Increment consumer index. */
-	++txq->wqe_ci;
-}
-
-/**
- * Write a inline WQE.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- */
-static inline void
-mlx5_wqe_write_inline(struct txq *txq, volatile union mlx5_wqe *wqe,
-		      uintptr_t addr, uint32_t length)
-{
-	uint32_t size;
-	uint16_t wqe_cnt = txq->wqe_n - 1;
-	uint16_t wqe_ci = txq->wqe_ci + 1;
-
-	/* Copy the first 16 bytes into inline header. */
-	rte_memcpy((void *)(uintptr_t)wqe->inl.eseg.inline_hdr_start,
-		   (void *)(uintptr_t)addr,
-		   MLX5_ETH_INLINE_HEADER_SIZE);
-	addr += MLX5_ETH_INLINE_HEADER_SIZE;
-	length -= MLX5_ETH_INLINE_HEADER_SIZE;
-	size = 3 + ((4 + length + 15) / 16);
-	wqe->inl.byte_cnt = htonl(length | MLX5_INLINE_SEG);
-	rte_memcpy((void *)(uintptr_t)&wqe->inl.data[0],
-		   (void *)addr, MLX5_WQE64_INL_DATA);
-	addr += MLX5_WQE64_INL_DATA;
-	length -= MLX5_WQE64_INL_DATA;
-	while (length) {
-		volatile union mlx5_wqe *wqe_next =
-			&(*txq->wqes)[wqe_ci & wqe_cnt];
-		uint32_t copy_bytes = (length > sizeof(*wqe)) ?
-				      sizeof(*wqe) :
-				      length;
-
-		rte_mov64((uint8_t *)(uintptr_t)&wqe_next->data[0],
-			  (uint8_t *)addr);
-		addr += copy_bytes;
-		length -= copy_bytes;
-		++wqe_ci;
-	}
-	assert(size < 64);
-	wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
-	wqe->inl.ctrl.data[2] = 0;
-	wqe->inl.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->inl.eseg.inline_hdr_sz = htons(MLX5_ETH_INLINE_HEADER_SIZE);
-	/* Increment consumer index. */
-	txq->wqe_ci = wqe_ci;
-}
-
-/**
- * Write a inline WQE with VLAN.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- * @param vlan_tci
- *   VLAN field to insert in packet.
- */
-static inline void
-mlx5_wqe_write_inline_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
-			   uintptr_t addr, uint32_t length, uint16_t vlan_tci)
-{
-	uint32_t size;
-	uint32_t wqe_cnt = txq->wqe_n - 1;
-	uint16_t wqe_ci = txq->wqe_ci + 1;
-	uint32_t vlan = htonl(0x81000000 | vlan_tci);
-
-	/*
-	 * Copy 12 bytes of source & destination MAC address.
-	 * Copy 4 bytes of VLAN.
-	 * Copy 2 bytes of Ether type.
-	 */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start,
-		   (uint8_t *)addr, 12);
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start + 12,
-		   &vlan, sizeof(vlan));
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->inl.eseg.inline_hdr_start + 16),
-		   (uint8_t *)(addr + 12), 2);
-	addr += MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	length -= MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	size = (sizeof(wqe->inl.ctrl.ctrl) +
-		sizeof(wqe->inl.eseg) +
-		sizeof(wqe->inl.byte_cnt) +
-		length + 15) / 16;
-	wqe->inl.byte_cnt = htonl(length | MLX5_INLINE_SEG);
-	rte_memcpy((void *)(uintptr_t)&wqe->inl.data[0],
-		   (void *)addr, MLX5_WQE64_INL_DATA);
-	addr += MLX5_WQE64_INL_DATA;
-	length -= MLX5_WQE64_INL_DATA;
-	while (length) {
-		volatile union mlx5_wqe *wqe_next =
-			&(*txq->wqes)[wqe_ci & wqe_cnt];
-		uint32_t copy_bytes = (length > sizeof(*wqe)) ?
-				      sizeof(*wqe) :
-				      length;
-
-		rte_mov64((uint8_t *)(uintptr_t)&wqe_next->data[0],
-			  (uint8_t *)addr);
-		addr += copy_bytes;
-		length -= copy_bytes;
-		++wqe_ci;
+	/* Inline if enough room. */
+	if (txq->max_inline != 0) {
+		uintptr_t end = (uintptr_t)&(*txq->wqes)[txq->wqe_n];
+		uint16_t max_inline = txq->max_inline * RTE_CACHE_LINE_SIZE;
+		uint16_t room;
+
+		raw += 16;
+		room = end - (uintptr_t)raw;
+		if (room > max_inline) {
+			uintptr_t addr_end = (addr + max_inline) &
+				~(RTE_CACHE_LINE_SIZE - 1);
+			uint16_t copy_b = ((addr_end - addr) > length) ?
+					  length :
+					  (addr_end - addr);
+
+			rte_memcpy((void *)raw, (void *)addr, copy_b);
+			addr += copy_b;
+			length -= copy_b;
+			pkt_inline_sz += copy_b;
+			/* Sanity check. */
+			assert(addr <= addr_end);
+		}
+		/* Store the inlined packet size in the WQE. */
+		wqe->wqe.eseg.inline_hdr_sz = htons(pkt_inline_sz);
+		/*
+		 * 2 DWORDs consumed by the WQE header + 1 DSEG +
+		 * the size of the inline part of the packet.
+		 */
+		ds = 2 + ((pkt_inline_sz - 2 + 15) / 16);
+		if (length > 0) {
+			dseg = (struct mlx5_wqe_data_seg *)
+				((uintptr_t)wqe + (ds * 16));
+			if ((uintptr_t)dseg >= end)
+				dseg = (struct mlx5_wqe_data_seg *)
+					((uintptr_t)&(*txq->wqes)[0]);
+			goto use_dseg;
+		}
+	} else {
+		/* Add the remaining packet as a simple ds. */
+		ds = 3;
+		/*
+		 * No inline has been done in the packet, only the Ethernet
+		 * Header has been stored.
+		 */
+		wqe->wqe.eseg.inline_hdr_sz = htons(16);
+		dseg = (struct mlx5_wqe_data_seg *)
+			((uintptr_t)wqe + (ds * 16));
+use_dseg:
+		*dseg = (struct mlx5_wqe_data_seg) {
+			.addr = htonll(addr),
+			.byte_count = htonl(length),
+			.lkey = txq_mp2mr(txq, txq_mb2mp(buf)),
+		};
+		++ds;
 	}
-	assert(size < 64);
-	wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
-	wqe->inl.ctrl.data[2] = 0;
-	wqe->inl.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->inl.eseg.inline_hdr_sz = htons(MLX5_ETH_VLAN_INLINE_HEADER_SIZE);
-	/* Increment consumer index. */
-	txq->wqe_ci = wqe_ci;
+	wqe->wqe.ctrl.data[1] = htonl(txq->qp_num_8s | ds);
+	return ds;
 }
 
 /**
@@ -567,7 +487,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		struct rte_mbuf *buf = *(pkts++);
 		unsigned int elts_head_next;
 		uint32_t length;
-		uint32_t lkey;
 		unsigned int segs_n = buf->nb_segs;
 		volatile struct mlx5_wqe_data_seg *dseg;
 		unsigned int ds = sizeof(*wqe) / 16;
@@ -583,8 +502,8 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		elts_head_next = (elts_head + 1) & (elts_n - 1);
 		wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
-		dseg = &wqe->wqe.dseg;
-		rte_prefetch0(wqe);
+		tx_prefetch_wqe(txq, txq->wqe_ci);
+		tx_prefetch_wqe(txq, txq->wqe_ci + 1);
 		if (pkts_n)
 			rte_prefetch0(*pkts);
 		length = DATA_LEN(buf);
@@ -594,9 +513,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		if (pkts_n)
 			rte_prefetch0(rte_pktmbuf_mtod(*pkts,
 						       volatile void *));
-		/* Retrieve Memory Region key for this memory pool. */
-		lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-		mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		/* Should we enable HW CKSUM offload */
 		if (buf->ol_flags &
 		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -606,6 +522,11 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		} else {
 			wqe->wqe.eseg.cs_flags = 0;
 		}
+		ds = mlx5_wqe_write(txq, wqe, buf, length);
+		if (segs_n == 1)
+			goto skip_segs;
+		dseg = (volatile struct mlx5_wqe_data_seg *)
+			(((uintptr_t)wqe) + ds * 16);
 		while (--segs_n) {
 			/*
 			 * Spill on next WQE when the current one does not have
@@ -636,11 +557,13 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		/* Update DS field in WQE. */
 		wqe->wqe.ctrl.data[1] &= htonl(0xffffffc0);
 		wqe->wqe.ctrl.data[1] |= htonl(ds & 0x3f);
-		elts_head = elts_head_next;
+skip_segs:
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		/* Increment sent bytes counter. */
 		txq->stats.obytes += length;
 #endif
+		/* Increment consumer index. */
+		txq->wqe_ci += (ds + 3) / 4;
 		elts_head = elts_head_next;
 		++i;
 	} while (pkts_n);
@@ -669,162 +592,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 }
 
 /**
- * DPDK callback for TX with inline support.
- *
- * @param dpdk_txq
- *   Generic pointer to TX queue structure.
- * @param[in] pkts
- *   Packets to transmit.
- * @param pkts_n
- *   Number of packets in array.
- *
- * @return
- *   Number of packets successfully transmitted (<= pkts_n).
- */
-uint16_t
-mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
-{
-	struct txq *txq = (struct txq *)dpdk_txq;
-	uint16_t elts_head = txq->elts_head;
-	const unsigned int elts_n = txq->elts_n;
-	unsigned int i = 0;
-	unsigned int j = 0;
-	unsigned int max;
-	unsigned int comp;
-	volatile union mlx5_wqe *wqe = NULL;
-	unsigned int max_inline = txq->max_inline;
-
-	if (unlikely(!pkts_n))
-		return 0;
-	/* Prefetch first packet cacheline. */
-	tx_prefetch_cqe(txq, txq->cq_ci);
-	tx_prefetch_cqe(txq, txq->cq_ci + 1);
-	rte_prefetch0(*pkts);
-	/* Start processing. */
-	txq_complete(txq);
-	max = (elts_n - (elts_head - txq->elts_tail));
-	if (max > elts_n)
-		max -= elts_n;
-	do {
-		struct rte_mbuf *buf = *(pkts++);
-		unsigned int elts_head_next;
-		uintptr_t addr;
-		uint32_t length;
-		uint32_t lkey;
-		unsigned int segs_n = buf->nb_segs;
-		volatile struct mlx5_wqe_data_seg *dseg;
-		unsigned int ds = sizeof(*wqe) / 16;
-
-		/*
-		 * Make sure there is enough room to store this packet and
-		 * that one ring entry remains unused.
-		 */
-		assert(segs_n);
-		if (max < segs_n + 1)
-			break;
-		max -= segs_n;
-		--pkts_n;
-		elts_head_next = (elts_head + 1) & (elts_n - 1);
-		wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
-		dseg = &wqe->wqe.dseg;
-		tx_prefetch_wqe(txq, txq->wqe_ci);
-		tx_prefetch_wqe(txq, txq->wqe_ci + 1);
-		if (pkts_n)
-			rte_prefetch0(*pkts);
-		/* Should we enable HW CKSUM offload */
-		if (buf->ol_flags &
-		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
-			wqe->inl.eseg.cs_flags =
-				MLX5_ETH_WQE_L3_CSUM |
-				MLX5_ETH_WQE_L4_CSUM;
-		} else {
-			wqe->inl.eseg.cs_flags = 0;
-		}
-		/* Retrieve buffer information. */
-		addr = rte_pktmbuf_mtod(buf, uintptr_t);
-		length = DATA_LEN(buf);
-		/* Update element. */
-		(*txq->elts)[elts_head] = buf;
-		/* Prefetch next buffer data. */
-		if (pkts_n)
-			rte_prefetch0(rte_pktmbuf_mtod(*pkts,
-						       volatile void *));
-		if ((length <= max_inline) && (segs_n == 1)) {
-			if (buf->ol_flags & PKT_TX_VLAN_PKT)
-				mlx5_wqe_write_inline_vlan(txq, wqe,
-							   addr, length,
-							   buf->vlan_tci);
-			else
-				mlx5_wqe_write_inline(txq, wqe, addr, length);
-			goto skip_segs;
-		} else {
-			/* Retrieve Memory Region key for this memory pool. */
-			lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			mlx5_wqe_write(txq, wqe, buf, length, lkey);
-		}
-		while (--segs_n) {
-			/*
-			 * Spill on next WQE when the current one does not have
-			 * enough room left. Size of WQE must a be a multiple
-			 * of data segment size.
-			 */
-			assert(!(sizeof(*wqe) % sizeof(*dseg)));
-			if (!(ds % (sizeof(*wqe) / 16)))
-				dseg = (volatile void *)
-					&(*txq->wqes)[txq->wqe_ci++ &
-						      (txq->wqe_n - 1)];
-			else
-				++dseg;
-			++ds;
-			buf = buf->next;
-			assert(buf);
-			/* Store segment information. */
-			dseg->byte_count = htonl(DATA_LEN(buf));
-			dseg->lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			dseg->addr = htonll(rte_pktmbuf_mtod(buf, uintptr_t));
-			(*txq->elts)[elts_head_next] = buf;
-			elts_head_next = (elts_head_next + 1) & (elts_n - 1);
-#ifdef MLX5_PMD_SOFT_COUNTERS
-			length += DATA_LEN(buf);
-#endif
-			++j;
-		}
-		/* Update DS field in WQE. */
-		wqe->inl.ctrl.data[1] &= htonl(0xffffffc0);
-		wqe->inl.ctrl.data[1] |= htonl(ds & 0x3f);
-skip_segs:
-		elts_head = elts_head_next;
-#ifdef MLX5_PMD_SOFT_COUNTERS
-		/* Increment sent bytes counter. */
-		txq->stats.obytes += length;
-#endif
-		++i;
-	} while (pkts_n);
-	/* Take a shortcut if nothing must be sent. */
-	if (unlikely(i == 0))
-		return 0;
-	/* Check whether completion threshold has been reached. */
-	comp = txq->elts_comp + i + j;
-	if (comp >= MLX5_TX_COMP_THRESH) {
-		/* Request completion on last WQE. */
-		wqe->inl.ctrl.data[2] = htonl(8);
-		/* Save elts_head in unused "immediate" field of WQE. */
-		wqe->inl.ctrl.data[3] = elts_head;
-		txq->elts_comp = 0;
-	} else {
-		txq->elts_comp = comp;
-	}
-#ifdef MLX5_PMD_SOFT_COUNTERS
-	/* Increment sent packets counter. */
-	txq->stats.opackets += i;
-#endif
-	/* Ring QP doorbell. */
-	mlx5_tx_dbrec(txq);
-	txq->elts_head = elts_head;
-	return i;
-}
-
-/**
  * Open a MPW session.
  *
  * @param txq
@@ -1114,7 +881,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 	unsigned int j = 0;
 	unsigned int max;
 	unsigned int comp;
-	unsigned int inline_room = txq->max_inline;
+	unsigned int inline_room = txq->max_inline * RTE_CACHE_LINE_SIZE;
 	struct mlx5_mpw mpw = {
 		.state = MLX5_MPW_STATE_CLOSED,
 	};
@@ -1168,7 +935,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 			    (length > inline_room) ||
 			    (mpw.wqe->mpw_inl.eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
-				inline_room = txq->max_inline;
+				inline_room =
+					txq->max_inline * RTE_CACHE_LINE_SIZE;
 			}
 		}
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -1184,7 +952,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
-			assert(inline_room == txq->max_inline);
+			assert(inline_room ==
+			       txq->max_inline * RTE_CACHE_LINE_SIZE);
 #if defined(MLX5_PMD_SOFT_COUNTERS) || !defined(NDEBUG)
 			length = 0;
 #endif
@@ -1249,7 +1018,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 			++j;
 			if (mpw.pkts_n == MLX5_MPW_DSEG_MAX) {
 				mlx5_mpw_inline_close(txq, &mpw);
-				inline_room = txq->max_inline;
+				inline_room =
+					txq->max_inline * RTE_CACHE_LINE_SIZE;
 			} else {
 				inline_room -= length;
 			}
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index f68149e..05779ef 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -249,7 +249,7 @@ struct txq {
 	uint16_t wqe_n; /* Number of WQ elements. */
 	uint16_t bf_offset; /* Blueflame offset. */
 	uint16_t bf_buf_size; /* Blueflame size. */
-	uint16_t max_inline; /* Maximum size to inline in a WQE. */
+	uint16_t max_inline; /* Multiple of RTE_CACHE_LINE_SIZE to inline. */
 	uint32_t qp_num_8s; /* QP number shifted by 8. */
 	volatile struct mlx5_cqe (*cqes)[]; /* Completion queue. */
 	volatile union mlx5_wqe (*wqes)[]; /* Work queue. */
@@ -314,7 +314,6 @@ uint16_t mlx5_tx_burst_secondary_setup(void *, struct rte_mbuf **, uint16_t);
 /* mlx5_rxtx.c */
 
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
-uint16_t mlx5_tx_burst_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_tx_burst_mpw(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_tx_burst_mpw_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 476ce79..e4510ef 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -338,9 +338,12 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl *txq_ctrl,
 		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
 			      IBV_EXP_QP_INIT_ATTR_RES_DOMAIN),
 	};
-	if (priv->txq_inline && priv->txqs_n >= priv->txqs_inline) {
-		tmpl.txq.max_inline = priv->txq_inline;
-		attr.init.cap.max_inline_data = tmpl.txq.max_inline;
+	if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+		tmpl.txq.max_inline =
+			((priv->txq_inline + (RTE_CACHE_LINE_SIZE - 1)) /
+			 RTE_CACHE_LINE_SIZE);
+		attr.init.cap.max_inline_data =
+			tmpl.txq.max_inline * RTE_CACHE_LINE_SIZE;
 	}
 	tmpl.qp = ibv_exp_create_qp(priv->ctx, &attr.init);
 	if (tmpl.qp == NULL) {
-- 
2.1.4


* [dpdk-stable] [PATCH 10/14] net/mlx5: fix Rx function selection
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

mlx5_rx_queue_setup() was setting the Rx function by itself instead of
using priv_select_rx_function(), which exists for that purpose.

Fixes: cdab90cb5c8d ("net/mlx5: add Tx/Rx burst function selection wrapper")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_rxq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 8e02440..2d98c99 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1261,7 +1261,7 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 		      (void *)dev, (void *)rxq_ctrl);
 		(*priv->rxqs)[idx] = &rxq_ctrl->rxq;
 		/* Update receive callback. */
-		dev->rx_pkt_burst = mlx5_rx_burst;
+		priv_select_rx_function(priv);
 	}
 	priv_unlock(priv);
 	return -ret;
-- 
2.1.4


* [dpdk-stable] [PATCH 11/14] net/mlx5: fix link status report
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC
  To: stable, Yuanhan Liu; +Cc: Olga Shern, Adrien Mazarguil

From: Olga Shern <olgas@mellanox.com>

This commit fixes the link status report on device start-up when the
LSC (link status change) callback is configured.

Fixes: 62072098b54e ("mlx5: support setting link up or down")

Signed-off-by: Olga Shern <olgas@mellanox.com>
---
 drivers/net/mlx5/mlx5.c        | 1 +
 drivers/net/mlx5/mlx5.h        | 1 +
 drivers/net/mlx5/mlx5_ethdev.c | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dc86c37..9448374 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -668,6 +668,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
 		/* Bring Ethernet device up. */
 		DEBUG("forcing Ethernet interface up");
 		priv_set_flags(priv, ~IFF_UP, IFF_UP);
+		mlx5_link_update_unlocked(priv->dev, 1);
 		continue;
 
 port_error:
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9e8ac7e..82172e4 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -187,6 +187,7 @@ int priv_set_flags(struct priv *, unsigned int, unsigned int);
 int mlx5_dev_configure(struct rte_eth_dev *);
 void mlx5_dev_infos_get(struct rte_eth_dev *, struct rte_eth_dev_info *);
 const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+int mlx5_link_update_unlocked(struct rte_eth_dev *, int);
 int mlx5_link_update(struct rte_eth_dev *, int);
 int mlx5_dev_set_mtu(struct rte_eth_dev *, uint16_t);
 int mlx5_dev_get_flow_ctrl(struct rte_eth_dev *, struct rte_eth_fc_conf *);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 205ce9c..7edbcd9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -638,7 +638,7 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)
  * @param wait_to_complete
  *   Wait for request completion (ignored).
  */
-static int
+int
 mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
 	struct priv *priv = mlx5_get_priv(dev);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH 12/14] net/mlx5: fix initialization in secondary process
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (10 preceding siblings ...)
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 11/14] net/mlx5: fix link status report Nelio Laranjeiro
@ 2016-11-08 10:36 ` Nelio Laranjeiro
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 13/14] net/mlx5: fix link speed capability information Nelio Laranjeiro
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Olivier Gournet, Adrien Mazarguil

From: Olivier Gournet <ogournet@corp.free.fr>

The changes introduced by previous commits (those listed in the Fixes
lines) made secondary processes attempt to reinitialize the Tx queue
structures of the primary process instead of their own, for which they
also did not allocate enough memory, leading to crashes.
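
The sizing part of the fix is worth spelling out: the allocation must
cover the control structure plus the trailing array of element
pointers. A self-contained sketch of the arithmetic (names simplified
from the driver):

    #include <stdlib.h>

    struct mbuf;                      /* stand-in for struct rte_mbuf */

    struct txq_ctrl_sketch {
            unsigned int elts_n;      /* log2 of the number of elements */
            struct mbuf *elts[];      /* storage following the struct */
    };

    static struct txq_ctrl_sketch *
    txq_ctrl_alloc(unsigned int elts_n)
    {
            /* Control structure plus 2^elts_n element pointers. */
            return calloc(1, sizeof(struct txq_ctrl_sketch) +
                             (1u << elts_n) * sizeof(struct mbuf *));
    }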

Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
Fixes: 21c8bb4928c9 ("net/mlx5: split Tx queue structure")

Signed-off-by: Olivier Gournet <ogournet@corp.free.fr>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_ethdev.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7edbcd9..29d2aec 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1309,11 +1309,13 @@ mlx5_secondary_data_setup(struct priv *priv)
 			continue;
 		primary_txq_ctrl = container_of(primary_txq,
 						struct txq_ctrl, txq);
-		txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
+		txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl) +
+					     (1 << primary_txq->elts_n) *
+					     sizeof(struct rte_mbuf *), 0,
 					     primary_txq_ctrl->socket);
 		if (txq_ctrl != NULL) {
 			if (txq_ctrl_setup(priv->dev,
-					   primary_txq_ctrl,
+					   txq_ctrl,
 					   primary_txq->elts_n,
 					   primary_txq_ctrl->socket,
 					   NULL) == 0) {
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH 13/14] net/mlx5: fix link speed capability information
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (11 preceding siblings ...)
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 12/14] net/mlx5: fix initialization in secondary process Nelio Laranjeiro
@ 2016-11-08 10:36 ` Nelio Laranjeiro
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 14/14] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

Make hard-coded values dynamic to return correct link speed capabilities
(not all ConnectX-4 NICs support everything).
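
With this change, applications querying the device see only the speeds
the port actually supports. A usage sketch against the 16.07-era ethdev
API (port validity checks omitted):

    #include <rte_ethdev.h>

    /* Check whether a port can run at 40G before fixing its speed. */
    static int
    port_supports_40g(uint8_t port_id)
    {
            struct rte_eth_dev_info info;

            rte_eth_dev_info_get(port_id, &info);
            return (info.speed_capa & ETH_LINK_SPEED_40G) != 0;
    }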

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 25 +++++++++++++++----------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 82172e4..79b7a60 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -135,6 +135,7 @@ struct priv {
 	unsigned int reta_idx_n; /* RETA index size. */
 	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
 	struct fdir_queue *fdir_drop_queue; /* Flow director drop queue. */
+	uint32_t link_speed_capa; /* Link speed capabilities. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 29d2aec..614085c 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -600,15 +600,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	 * size if it is not fixed.
 	 * The API should be updated to solve this problem. */
 	info->reta_size = priv->ind_table_max_size;
-	info->speed_capa =
-			ETH_LINK_SPEED_1G |
-			ETH_LINK_SPEED_10G |
-			ETH_LINK_SPEED_20G |
-			ETH_LINK_SPEED_25G |
-			ETH_LINK_SPEED_40G |
-			ETH_LINK_SPEED_50G |
-			ETH_LINK_SPEED_56G |
-			ETH_LINK_SPEED_100G;
+	info->speed_capa = priv->link_speed_capa;
 	priv_unlock(priv);
 }
 
@@ -643,7 +635,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
 	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
-		.cmd = ETHTOOL_GSET
+		.cmd = ETHTOOL_GSET /* Deprecated since Linux v4.5. */
 	};
 	struct ifreq ifr;
 	struct rte_eth_link dev_link;
@@ -668,6 +660,19 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 		dev_link.link_speed = 0;
 	else
 		dev_link.link_speed = link_speed;
+	priv->link_speed_capa = 0;
+	if (edata.supported & SUPPORTED_Autoneg)
+		priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+	if (edata.supported & (SUPPORTED_1000baseT_Full |
+			       SUPPORTED_1000baseKX_Full))
+		priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+	if (edata.supported & SUPPORTED_10000baseKR_Full)
+		priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+	if (edata.supported & (SUPPORTED_40000baseKR4_Full |
+			       SUPPORTED_40000baseCR4_Full |
+			       SUPPORTED_40000baseSR4_Full |
+			       SUPPORTED_40000baseLR4_Full))
+		priv->link_speed_capa |= ETH_LINK_SPEED_40G;
 	dev_link.link_duplex = ((edata.duplex == DUPLEX_HALF) ?
 				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
 	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH 14/14] net/mlx5: fix support for newer link speeds
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (12 preceding siblings ...)
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 13/14] net/mlx5: fix link speed capability information Nelio Laranjeiro
@ 2016-11-08 10:36 ` Nelio Laranjeiro
  2016-11-09  5:39 ` [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Yuanhan Liu
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-08 10:36 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

Not all speed capabilities can be reported properly before Linux 4.8
(the 25G, 50G and 100G speeds are missing). Moreover, the API to
retrieve them only exists since Linux 4.5. This commit thus implements
compatibility code for all versions.
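
The resulting control flow is a plain try-new-then-fall-back, shown
here in simplified form (function names shortened; the real ones are in
the diff below):

    struct rte_eth_dev;

    /* ETHTOOL_GLINKSETTINGS path, Linux >= 4.5. */
    extern int link_update_gs(struct rte_eth_dev *, int);
    /* Legacy ETHTOOL_GSET path, deprecated since Linux 4.5. */
    extern int link_update_gset(struct rte_eth_dev *, int);

    static int
    link_update(struct rte_eth_dev *dev, int wait_to_complete)
    {
            int ret = link_update_gs(dev, wait_to_complete);

            if (ret < 0)
                    ret = link_update_gset(dev, wait_to_complete);
            return ret;
    }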

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/Makefile      |  15 +++++
 drivers/net/mlx5/mlx5_ethdev.c | 123 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 2c13c30..cf87f0b 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -121,6 +121,21 @@ mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/mlx5_hw.h \
 		enum MLX5_OPCODE_TSO \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_25G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_25000baseCR_Full_BIT \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_50G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_100G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT \
+		$(AUTOCONF_OUTPUT)
 
 # Create mlx5_autoconf.h or update it in case it differs from the new one.
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 614085c..e9ffb4b 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -623,15 +623,15 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 }
 
 /**
- * DPDK callback to retrieve physical link information (unlocked version).
+ * Retrieve physical link information (unlocked version using legacy ioctl).
  *
  * @param dev
  *   Pointer to Ethernet device structure.
  * @param wait_to_complete
  *   Wait for request completion (ignored).
  */
-int
-mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
+static int
+mlx5_link_update_unlocked_gset(struct rte_eth_dev *dev, int wait_to_complete)
 {
 	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
@@ -687,6 +687,123 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 }
 
 /**
+ * Retrieve physical link information (unlocked version using new ioctl from
+ * Linux 4.5).
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param wait_to_complete
+ *   Wait for request completion (ignored).
+ */
+static int
+mlx5_link_update_unlocked_gs(struct rte_eth_dev *dev, int wait_to_complete)
+{
+#ifdef ETHTOOL_GLINKSETTINGS
+	struct priv *priv = mlx5_get_priv(dev);
+	struct ethtool_link_settings edata = {
+		.cmd = ETHTOOL_GLINKSETTINGS,
+	};
+	struct ifreq ifr;
+	struct rte_eth_link dev_link;
+	uint64_t sc;
+
+	(void)wait_to_complete;
+	if (priv_ifreq(priv, SIOCGIFFLAGS, &ifr)) {
+		WARN("ioctl(SIOCGIFFLAGS) failed: %s", strerror(errno));
+		return -1;
+	}
+	memset(&dev_link, 0, sizeof(dev_link));
+	dev_link.link_status = ((ifr.ifr_flags & IFF_UP) &&
+				(ifr.ifr_flags & IFF_RUNNING));
+	ifr.ifr_data = (void *)&edata;
+	if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
+		DEBUG("ioctl(SIOCETHTOOL, ETHTOOL_GLINKSETTINGS) failed: %s",
+		      strerror(errno));
+		return -1;
+	}
+	dev_link.link_speed = edata.speed;
+	sc = edata.link_mode_masks[0] |
+		((uint64_t)edata.link_mode_masks[1] << 32);
+	priv->link_speed_capa = 0;
+	/* Link speeds available in kernel v4.5. */
+	if (sc & ETHTOOL_LINK_MODE_Autoneg_BIT)
+		priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+	if (sc & (ETHTOOL_LINK_MODE_1000baseT_Full_BIT |
+		  ETHTOOL_LINK_MODE_1000baseKX_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+	if (sc & (ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT |
+		  ETHTOOL_LINK_MODE_10000baseKR_Full_BIT |
+		  ETHTOOL_LINK_MODE_10000baseR_FEC_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+	if (sc & (ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT |
+		  ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_20G;
+	if (sc & (ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_40G;
+	if (sc & (ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_56G;
+	/* Link speeds available in kernel v4.6. */
+#ifdef HAVE_ETHTOOL_LINK_MODE_25G
+	if (sc & (ETHTOOL_LINK_MODE_25000baseCR_Full_BIT |
+		  ETHTOOL_LINK_MODE_25000baseKR_Full_BIT |
+		  ETHTOOL_LINK_MODE_25000baseSR_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_25G;
+#endif
+#ifdef HAVE_ETHTOOL_LINK_MODE_50G
+	if (sc & (ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT |
+		  ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_50G;
+#endif
+#ifdef HAVE_ETHTOOL_LINK_MODE_100G
+	if (sc & (ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_100G;
+#endif
+	dev_link.link_duplex = ((edata.duplex == DUPLEX_HALF) ?
+				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
+	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
+				  ETH_LINK_SPEED_FIXED);
+	if (memcmp(&dev_link, &dev->data->dev_link, sizeof(dev_link))) {
+		/* Link status changed. */
+		dev->data->dev_link = dev_link;
+		return 0;
+	}
+#else
+	(void)dev;
+	(void)wait_to_complete;
+#endif
+	/* Link status is still the same. */
+	return -1;
+}
+
+/**
+ * DPDK callback to retrieve physical link information (unlocked version).
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param wait_to_complete
+ *   Wait for request completion (ignored).
+ */
+int
+mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
+{
+	int ret;
+
+	ret = mlx5_link_update_unlocked_gs(dev, wait_to_complete);
+	if (ret < 0)
+		ret = mlx5_link_update_unlocked_gset(dev, wait_to_complete);
+	return ret;
+}
+
+/**
  * DPDK callback to retrieve physical link information.
  *
  * @param dev
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (13 preceding siblings ...)
  2016-11-08 10:36 ` [dpdk-stable] [PATCH 14/14] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
@ 2016-11-09  5:39 ` Yuanhan Liu
  2016-11-09  9:38   ` Nélio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 00/12] " Nelio Laranjeiro
                   ` (12 subsequent siblings)
  27 siblings, 1 reply; 31+ messages in thread
From: Yuanhan Liu @ 2016-11-09  5:39 UTC (permalink / raw)
  To: Nelio Laranjeiro; +Cc: stable, Adrien Mazarguil

On Tue, Nov 08, 2016 at 11:36:41AM +0100, Nelio Laranjeiro wrote:
> Patchset of fixes from 16.11 master branch rebased on top of 16.07.1 tag.

Hi Nelio,

I appreciate your help here! However, would you do the rebase on top of
the 16.07 branch? I have already cherry-picked a few patches there for
16.07.2:

    [yliu@yliu-dev ~/dpdk]$ git log --oneline v16.07.1..HEAD -- drivers/net/mlx5
    23641e8 net/mlx5: fix link status report
    da181ce net/mlx5: fix hash key size retrieval
    c1097f4 net/mlx5: fix Rx function selection

Besides, would you add a heading line before the commit log body if it's
a commit cherry-picked from the master branch?

    [ upstream commit xxx ]

And add the following if it can't be applied cleanly and needs
backporting effort?

    [ backported from upstream commit xxx ]

Thanks.

	--yliu

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch
  2016-11-09  5:39 ` [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Yuanhan Liu
@ 2016-11-09  9:38   ` Nélio Laranjeiro
  0 siblings, 0 replies; 31+ messages in thread
From: Nélio Laranjeiro @ 2016-11-09  9:38 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: stable, Adrien Mazarguil

On Wed, Nov 09, 2016 at 01:39:18PM +0800, Yuanhan Liu wrote:
> On Tue, Nov 08, 2016 at 11:36:41AM +0100, Nelio Laranjeiro wrote:
> > Patchset of fixes from 16.11 master branch rebased on top of 16.07.1 tag.
> 
> Hi Nelio,
> 
> I appreciate your help here! However, would you do the rebase on top of
> the 16.07 branch? I have already cherry-picked a few patches there for
> 16.07.2:
> 
>     [yliu@yliu-dev ~/dpdk]$ git log --oneline v16.07.1..HEAD -- drivers/net/mlx5
>     23641e8 net/mlx5: fix link status report
>     da181ce net/mlx5: fix hash key size retrieval
>     c1097f4 net/mlx5: fix Rx function selection
> 
> Besides, would you add a heading line before the commit log body if it's
> a commit cherry-picked from the master branch?
> 
>     [ upstream commit xxx ]
> 
> And add the following if it can't be applied cleanly and needs
> backporting effort?
> 
>     [ backported from upstream commit xxx ]
> 
> Thanks.
> 
> 	--yliu

Hi Yuanhan,

I am preparing a v2 with your requests.

Regards,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 00/12] Patches for 16.07.2 stable branch
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (14 preceding siblings ...)
  2016-11-09  5:39 ` [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Yuanhan Liu
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09 11:01   ` Yuanhan Liu
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 01/12] net/mlx5: support Mellanox OFED 3.4 Nelio Laranjeiro
                   ` (11 subsequent siblings)
  27 siblings, 1 reply; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Adrien Mazarguil

Patchset of fixes from 16.11 master branch rebased on top of 16.07.1 tag.

Changes in V2:

 - Rebase on top of stable/16.07.
 - added backport/upstream in commit logs for each commit.

Adrien Mazarguil (1):
  net/mlx5: fix Rx VLAN offload capability report

Nélio Laranjeiro (5):
  net/mlx5: support Mellanox OFED 3.4
  net/mlx5: re-factorize functions
  net/mlx5: fix inline logic
  net/mlx5: fix link speed capability information
  net/mlx5: fix support for newer link speeds

Olivier Gournet (1):
  net/mlx5: fix initialization in secondary process

Raslan Darawsheh (1):
  net/mlx5: fix removing VLAN filter

Sagi Grimberg (1):
  net/mlx5: fix possible NULL dereference in Rx path

Yaacov Hazan (3):
  net/mlx5: fix inconsistent return value in flow director
  net/mlx5: refactor allocation of flow director queues
  net/mlx5: fix flow director drop mode

 doc/guides/nics/mlx5.rst       |   3 +-
 drivers/net/mlx5/Makefile      |  20 ++
 drivers/net/mlx5/mlx5.h        |   3 +
 drivers/net/mlx5/mlx5_ethdev.c | 161 +++++++++++--
 drivers/net/mlx5/mlx5_fdir.c   | 270 +++++++++++++++-------
 drivers/net/mlx5/mlx5_prm.h    |   6 +
 drivers/net/mlx5/mlx5_rxq.c    |   2 +
 drivers/net/mlx5/mlx5_rxtx.c   | 505 ++++++++++-------------------------------
 drivers/net/mlx5/mlx5_rxtx.h   |   7 +-
 drivers/net/mlx5/mlx5_txq.c    |   9 +-
 drivers/net/mlx5/mlx5_vlan.c   |   3 +-
 11 files changed, 489 insertions(+), 500 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 01/12] net/mlx5: support Mellanox OFED 3.4
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (15 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 00/12] " Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 02/12] net/mlx5: fix possible NULL dereference in Rx path Nelio Laranjeiro
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

[ backported from upstream commit c904ae25feb4de94683963a774f726cff5b08a0c ]

Some macros are renamed by Mellanox OFED 3.4.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/Makefile    | 5 +++++
 drivers/net/mlx5/mlx5_prm.h  | 6 ++++++
 drivers/net/mlx5/mlx5_rxtx.c | 4 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index f6d3938..2c13c30 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -116,6 +116,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/mlx5_hw.h \
 		enum MLX5_ETH_VLAN_INLINE_HEADER_SIZE \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_VERBS_MLX5_OPCODE_TSO \
+		infiniband/mlx5_hw.h \
+		enum MLX5_OPCODE_TSO \
+		$(AUTOCONF_OUTPUT)
 
 # Create mlx5_autoconf.h or update it in case it differs from the new one.
 
diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
index 4383009..e23d5cb 100644
--- a/drivers/net/mlx5/mlx5_prm.h
+++ b/drivers/net/mlx5/mlx5_prm.h
@@ -44,6 +44,8 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#include "mlx5_autoconf.h"
+
 /* Get CQE owner bit. */
 #define MLX5_CQE_OWNER(op_own) ((op_own) & MLX5_CQE_OWNER_MASK)
 
@@ -71,6 +73,10 @@
 /* Room for inline data in multi-packet WQE. */
 #define MLX5_MWQE64_INL_DATA 28
 
+#ifndef HAVE_VERBS_MLX5_OPCODE_TSO
+#define MLX5_OPCODE_TSO MLX5_OPCODE_LSO_MPW /* Compat with OFED 3.3. */
+#endif
+
 /* Subset of struct mlx5_wqe_eth_seg. */
 struct mlx5_wqe_eth_seg_small {
 	uint32_t rsvd0;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index a13cbc7..cc62e78 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -908,7 +908,7 @@ mlx5_mpw_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length)
 	mpw->wqe->mpw.eseg.rsvd2 = 0;
 	mpw->wqe->mpw.ctrl.data[0] = htonl((MLX5_OPC_MOD_MPW << 24) |
 					   (txq->wqe_ci << 8) |
-					   MLX5_OPCODE_LSO_MPW);
+					   MLX5_OPCODE_TSO);
 	mpw->wqe->mpw.ctrl.data[2] = 0;
 	mpw->wqe->mpw.ctrl.data[3] = 0;
 	mpw->data.dseg[0] = &mpw->wqe->mpw.dseg[0];
@@ -1107,7 +1107,7 @@ mlx5_mpw_inline_new(struct txq *txq, struct mlx5_mpw *mpw, uint32_t length)
 	mpw->wqe = &(*txq->wqes)[idx];
 	mpw->wqe->mpw_inl.ctrl.data[0] = htonl((MLX5_OPC_MOD_MPW << 24) |
 					       (txq->wqe_ci << 8) |
-					       MLX5_OPCODE_LSO_MPW);
+					       MLX5_OPCODE_TSO);
 	mpw->wqe->mpw_inl.ctrl.data[2] = 0;
 	mpw->wqe->mpw_inl.ctrl.data[3] = 0;
 	mpw->wqe->mpw_inl.eseg.mss = htons(length);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 02/12] net/mlx5: fix possible NULL dereference in Rx path
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (16 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 01/12] net/mlx5: support Mellanox OFED 3.4 Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 03/12] net/mlx5: fix inconsistent return value in flow director Nelio Laranjeiro
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Sagi Grimberg, Adrien Mazarguil

From: Sagi Grimberg <sagi@grimberg.me>

[ upstream commit 15a756b63734c1d95e3f8d922e67d11fb32b025b ]

The user is allowed to call ->rx_pkt_burst() even when no free mbufs
are left in the pool. In this scenario we fail to allocate a
replacement (rep) mbuf on the first iteration, where pkt is still
NULL, and would then dereference a NULL pkt (resetting its refcount
and freeing it).

Fix this by checking the pkt before freeing it.
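
A reduced sketch of the fixed error path (alloc() and free_one() are
hypothetical stand-ins for the mbuf pool calls; the driver walks the
partially built segment chain):

    struct seg { struct seg *next; };

    extern struct seg *alloc(void);
    extern void free_one(struct seg *);

    static void
    on_alloc_failure(struct seg **pkt)
    {
            if (*pkt == NULL)
                    return;             /* nothing allocated yet */
            while (*pkt != NULL) {      /* release the partial chain */
                    struct seg *next = (*pkt)->next;

                    free_one(*pkt);
                    *pkt = next;
            }
    }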

Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index cc62e78..59e8183 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1572,6 +1572,14 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		rte_prefetch0(wqe);
 		rep = rte_mbuf_raw_alloc(rxq->mp);
 		if (unlikely(rep == NULL)) {
+			++rxq->stats.rx_nombuf;
+			if (!pkt) {
+				/*
+				 * no buffers before we even started,
+				 * bail out silently.
+				 */
+				break;
+			}
 			while (pkt != seg) {
 				assert(pkt != (*rxq->elts)[idx]);
 				seg = NEXT(pkt);
@@ -1579,7 +1587,6 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
 				__rte_mbuf_raw_free(pkt);
 				pkt = seg;
 			}
-			++rxq->stats.rx_nombuf;
 			break;
 		}
 		if (!pkt) {
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 03/12] net/mlx5: fix inconsistent return value in flow director
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (17 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 02/12] net/mlx5: fix possible NULL dereference in Rx path Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 04/12] net/mlx5: fix Rx VLAN offload capability report Nelio Laranjeiro
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

[ upstream commit c502d05197e35dc2840fdf5892f6310c8cc4b0fd ]

The return value in DPDK is negative errno on failure. Since internal
functions in the mlx5 driver return positive errno values, they must
be negated before being returned to the DPDK layer.
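
A self-contained sketch of the convention being enforced:

    #include <errno.h>

    /* Internal helpers follow the positive-errno convention... */
    static int
    internal_op(int ok)
    {
            return ok ? 0 : EINVAL;
    }

    /* ...so the DPDK-facing entry point negates once at the boundary:
     * 0 stays 0, EINVAL becomes -EINVAL. */
    int
    dpdk_facing_op(int ok)
    {
            return -internal_op(ok);
    }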

Fixes: 76f5c99 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
---
 drivers/net/mlx5/mlx5_fdir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 070edde..0372936 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -955,7 +955,7 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 		     enum rte_filter_op filter_op,
 		     void *arg)
 {
-	int ret = -EINVAL;
+	int ret = EINVAL;
 	struct priv *priv = dev->data->dev_private;
 
 	switch (filter_type) {
@@ -970,5 +970,5 @@ mlx5_dev_filter_ctrl(struct rte_eth_dev *dev,
 		break;
 	}
 
-	return ret;
+	return -ret;
 }
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 04/12] net/mlx5: fix Rx VLAN offload capability report
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (18 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 03/12] net/mlx5: fix inconsistent return value in flow director Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 05/12] net/mlx5: fix removing VLAN filter Nelio Laranjeiro
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Adrien Mazarguil

From: Adrien Mazarguil <adrien.mazarguil@6wind.com>

[ upstream commit f08b6e71042d90f8f080e4a4b48cb4d29bfe4fd9 ]

This capability is implemented but not reported.
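
Once reported, applications can detect the offload in the usual way
(sketch, 16.07-era API):

    #include <rte_ethdev.h>

    static int
    port_reports_vlan_strip(uint8_t port_id)
    {
            struct rte_eth_dev_info info;

            rte_eth_dev_info_get(port_id, &info);
            return (info.rx_offload_capa & DEV_RX_OFFLOAD_VLAN_STRIP) != 0;
    }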

Fixes: f3db9489188a ("mlx5: support Rx VLAN stripping")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index d113b91..cdb0dd5 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -583,7 +583,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 		 (DEV_RX_OFFLOAD_IPV4_CKSUM |
 		  DEV_RX_OFFLOAD_UDP_CKSUM |
 		  DEV_RX_OFFLOAD_TCP_CKSUM) :
-		 0);
+		 0) |
+		(priv->hw_vlan_strip ? DEV_RX_OFFLOAD_VLAN_STRIP : 0);
 	if (!priv->mps)
 		info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
 	if (priv->hw_csum)
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 05/12] net/mlx5: fix removing VLAN filter
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (19 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 04/12] net/mlx5: fix Rx VLAN offload capability report Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 06/12] net/mlx5: refactor allocation of flow director queues Nelio Laranjeiro
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Raslan Darawsheh, Adrien Mazarguil

From: Raslan Darawsheh <rdarawsheh@asaltech.com>

[ upstream commit 70d32d3afebfdd827f5f949f588f56d9e7563520 ]

memmove() was given only the number of elements after index i as its
byte count, while it should move that number of elements multiplied by
the size of each element.
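
The bug pattern in isolation, for reference (any element type wider
than one byte triggers it):

    #include <string.h>
    #include <stdint.h>

    /* Remove element i from an array holding n uint16_t entries:
     * memmove() takes a byte count, so the number of tail elements
     * must be scaled by the element size. */
    static void
    remove_at(uint16_t *arr, unsigned int n, unsigned int i)
    {
            memmove(&arr[i], &arr[i + 1],
                    sizeof(arr[0]) * (n - 1 - i)); /* not (n - 1 - i) */
    }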

Fixes: e9086978 ("mlx5: support VLAN filtering")

Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
---
 drivers/net/mlx5/mlx5_vlan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_vlan.c b/drivers/net/mlx5/mlx5_vlan.c
index 64e599d..1b0fa40 100644
--- a/drivers/net/mlx5/mlx5_vlan.c
+++ b/drivers/net/mlx5/mlx5_vlan.c
@@ -87,7 +87,8 @@ vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 		--priv->vlan_filter_n;
 		memmove(&priv->vlan_filter[i],
 			&priv->vlan_filter[i + 1],
-			priv->vlan_filter_n - i);
+			sizeof(priv->vlan_filter[i]) *
+			(priv->vlan_filter_n - i));
 		priv->vlan_filter[priv->vlan_filter_n] = 0;
 	} else {
 		assert(i == priv->vlan_filter_n);
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 06/12] net/mlx5: refactor allocation of flow director queues
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (20 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 05/12] net/mlx5: fix removing VLAN filter Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 07/12] net/mlx5: fix flow director drop mode Nelio Laranjeiro
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

[ upstream commit f5d9b9990d9c7a4b9a0523ae6975923629d67a14 ]

This is done to prepare support for drop queues, which are not related to
existing Rx queues and need to be managed separately.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5.h      |   1 +
 drivers/net/mlx5/mlx5_fdir.c | 229 ++++++++++++++++++++++++++++---------------
 drivers/net/mlx5/mlx5_rxq.c  |   2 +
 drivers/net/mlx5/mlx5_rxtx.h |   4 +-
 4 files changed, 156 insertions(+), 80 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 932c638..9d78dcd 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -258,6 +258,7 @@ void mlx5_dev_stop(struct rte_eth_dev *);
 
 /* mlx5_fdir.c */
 
+void priv_fdir_queue_destroy(struct priv *, struct fdir_queue *);
 int fdir_init_filters_list(struct priv *);
 void priv_fdir_delete_filters_list(struct priv *);
 void priv_fdir_disable(struct priv *);
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 0372936..4a82dc9 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -400,6 +400,145 @@ create_flow:
 }
 
 /**
+ * Destroy a flow director queue.
+ *
+ * @param fdir_queue
+ *   Flow director queue to be destroyed.
+ */
+void
+priv_fdir_queue_destroy(struct priv *priv, struct fdir_queue *fdir_queue)
+{
+	struct mlx5_fdir_filter *fdir_filter;
+
+	/* Disable filter flows still applying to this queue. */
+	LIST_FOREACH(fdir_filter, priv->fdir_filter_list, next) {
+		unsigned int idx = fdir_filter->queue;
+		struct rxq_ctrl *rxq_ctrl =
+			container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
+
+		assert(idx < priv->rxqs_n);
+		if (fdir_queue == rxq_ctrl->fdir_queue &&
+		    fdir_filter->flow != NULL) {
+			claim_zero(ibv_exp_destroy_flow(fdir_filter->flow));
+			fdir_filter->flow = NULL;
+		}
+	}
+	assert(fdir_queue->qp);
+	claim_zero(ibv_destroy_qp(fdir_queue->qp));
+	assert(fdir_queue->ind_table);
+	claim_zero(ibv_exp_destroy_rwq_ind_table(fdir_queue->ind_table));
+	if (fdir_queue->wq)
+		claim_zero(ibv_exp_destroy_wq(fdir_queue->wq));
+	if (fdir_queue->cq)
+		claim_zero(ibv_destroy_cq(fdir_queue->cq));
+#ifndef NDEBUG
+	memset(fdir_queue, 0x2a, sizeof(*fdir_queue));
+#endif
+	rte_free(fdir_queue);
+}
+
+/**
+ * Create a flow director queue.
+ *
+ * @param priv
+ *   Private structure.
+ * @param wq
+ *   Work queue to route matched packets to, NULL if one needs to
+ *   be created.
+ *
+ * @return
+ *   Related flow director queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_fdir_queue_create(struct priv *priv, struct ibv_exp_wq *wq,
+		       unsigned int socket)
+{
+	struct fdir_queue *fdir_queue;
+
+	fdir_queue = rte_calloc_socket(__func__, 1, sizeof(*fdir_queue),
+				       0, socket);
+	if (!fdir_queue) {
+		ERROR("cannot allocate flow director queue");
+		return NULL;
+	}
+	assert(priv->pd);
+	assert(priv->ctx);
+	if (!wq) {
+		fdir_queue->cq = ibv_exp_create_cq(
+			priv->ctx, 1, NULL, NULL, 0,
+			&(struct ibv_exp_cq_init_attr){
+				.comp_mask = 0,
+			});
+		if (!fdir_queue->cq) {
+			ERROR("cannot create flow director CQ");
+			goto error;
+		}
+		fdir_queue->wq = ibv_exp_create_wq(
+			priv->ctx,
+			&(struct ibv_exp_wq_init_attr){
+				.wq_type = IBV_EXP_WQT_RQ,
+				.max_recv_wr = 1,
+				.max_recv_sge = 1,
+				.pd = priv->pd,
+				.cq = fdir_queue->cq,
+			});
+		if (!fdir_queue->wq) {
+			ERROR("cannot create flow director WQ");
+			goto error;
+		}
+		wq = fdir_queue->wq;
+	}
+	fdir_queue->ind_table = ibv_exp_create_rwq_ind_table(
+		priv->ctx,
+		&(struct ibv_exp_rwq_ind_table_init_attr){
+			.pd = priv->pd,
+			.log_ind_tbl_size = 0,
+			.ind_tbl = &wq,
+			.comp_mask = 0,
+		});
+	if (!fdir_queue->ind_table) {
+		ERROR("cannot create flow director indirection table");
+		goto error;
+	}
+	fdir_queue->qp = ibv_exp_create_qp(
+		priv->ctx,
+		&(struct ibv_exp_qp_init_attr){
+			.qp_type = IBV_QPT_RAW_PACKET,
+			.comp_mask =
+				IBV_EXP_QP_INIT_ATTR_PD |
+				IBV_EXP_QP_INIT_ATTR_PORT |
+				IBV_EXP_QP_INIT_ATTR_RX_HASH,
+			.pd = priv->pd,
+			.rx_hash_conf = &(struct ibv_exp_rx_hash_conf){
+				.rx_hash_function =
+					IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
+				.rx_hash_key_len = rss_hash_default_key_len,
+				.rx_hash_key = rss_hash_default_key,
+				.rx_hash_fields_mask = 0,
+				.rwq_ind_tbl = fdir_queue->ind_table,
+			},
+			.port_num = priv->port,
+		});
+	if (!fdir_queue->qp) {
+		ERROR("cannot create flow director hash RX QP");
+		goto error;
+	}
+	return fdir_queue;
+error:
+	assert(fdir_queue);
+	assert(!fdir_queue->qp);
+	if (fdir_queue->ind_table)
+		claim_zero(ibv_exp_destroy_rwq_ind_table
+			   (fdir_queue->ind_table));
+	if (fdir_queue->wq)
+		claim_zero(ibv_exp_destroy_wq(fdir_queue->wq));
+	if (fdir_queue->cq)
+		claim_zero(ibv_destroy_cq(fdir_queue->cq));
+	rte_free(fdir_queue);
+	return NULL;
+}
+
+/**
  * Get flow director queue for a specific RX queue, create it in case
  * it does not exist.
  *
@@ -416,74 +555,15 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 {
 	struct rxq_ctrl *rxq_ctrl =
 		container_of((*priv->rxqs)[idx], struct rxq_ctrl, rxq);
-	struct fdir_queue *fdir_queue = &rxq_ctrl->fdir_queue;
-	struct ibv_exp_rwq_ind_table *ind_table = NULL;
-	struct ibv_qp *qp = NULL;
-	struct ibv_exp_rwq_ind_table_init_attr ind_init_attr;
-	struct ibv_exp_rx_hash_conf hash_conf;
-	struct ibv_exp_qp_init_attr qp_init_attr;
-	int err = 0;
-
-	/* Return immediately if it has already been created. */
-	if (fdir_queue->qp != NULL)
-		return fdir_queue;
-
-	ind_init_attr = (struct ibv_exp_rwq_ind_table_init_attr){
-		.pd = priv->pd,
-		.log_ind_tbl_size = 0,
-		.ind_tbl = &rxq_ctrl->wq,
-		.comp_mask = 0,
-	};
+	struct fdir_queue *fdir_queue = rxq_ctrl->fdir_queue;
 
-	errno = 0;
-	ind_table = ibv_exp_create_rwq_ind_table(priv->ctx,
-						 &ind_init_attr);
-	if (ind_table == NULL) {
-		/* Not clear whether errno is set. */
-		err = (errno ? errno : EINVAL);
-		ERROR("RX indirection table creation failed with error %d: %s",
-		      err, strerror(err));
-		goto error;
-	}
-
-	/* Create fdir_queue qp. */
-	hash_conf = (struct ibv_exp_rx_hash_conf){
-		.rx_hash_function = IBV_EXP_RX_HASH_FUNC_TOEPLITZ,
-		.rx_hash_key_len = rss_hash_default_key_len,
-		.rx_hash_key = rss_hash_default_key,
-		.rx_hash_fields_mask = 0,
-		.rwq_ind_tbl = ind_table,
-	};
-	qp_init_attr = (struct ibv_exp_qp_init_attr){
-		.max_inl_recv = 0, /* Currently not supported. */
-		.qp_type = IBV_QPT_RAW_PACKET,
-		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
-			      IBV_EXP_QP_INIT_ATTR_RX_HASH),
-		.pd = priv->pd,
-		.rx_hash_conf = &hash_conf,
-		.port_num = priv->port,
-	};
-
-	qp = ibv_exp_create_qp(priv->ctx, &qp_init_attr);
-	if (qp == NULL) {
-		err = (errno ? errno : EINVAL);
-		ERROR("hash RX QP creation failure: %s", strerror(err));
-		goto error;
+	assert(rxq_ctrl->wq);
+	if (fdir_queue == NULL) {
+		fdir_queue = priv_fdir_queue_create(priv, rxq_ctrl->wq,
+						    rxq_ctrl->socket);
+		rxq_ctrl->fdir_queue = fdir_queue;
 	}
-
-	fdir_queue->ind_table = ind_table;
-	fdir_queue->qp = qp;
-
 	return fdir_queue;
-
-error:
-	if (qp != NULL)
-		claim_zero(ibv_destroy_qp(qp));
-
-	if (ind_table != NULL)
-		claim_zero(ibv_exp_destroy_rwq_ind_table(ind_table));
-
-	return NULL;
 }
 
 /**
@@ -601,7 +681,6 @@ priv_fdir_disable(struct priv *priv)
 {
 	unsigned int i;
 	struct mlx5_fdir_filter *mlx5_fdir_filter;
-	struct fdir_queue *fdir_queue;
 
 	/* Run on every flow director filter and destroy flow handle. */
 	LIST_FOREACH(mlx5_fdir_filter, priv->fdir_filter_list, next) {
@@ -618,23 +697,15 @@ priv_fdir_disable(struct priv *priv)
 		}
 	}
 
-	/* Run on every RX queue to destroy related flow director QP and
-	 * indirection table. */
+	/* Destroy flow director context in each RX queue. */
 	for (i = 0; (i != priv->rxqs_n); i++) {
 		struct rxq_ctrl *rxq_ctrl =
 			container_of((*priv->rxqs)[i], struct rxq_ctrl, rxq);
 
-		fdir_queue = &rxq_ctrl->fdir_queue;
-		if (fdir_queue->qp != NULL) {
-			claim_zero(ibv_destroy_qp(fdir_queue->qp));
-			fdir_queue->qp = NULL;
-		}
-
-		if (fdir_queue->ind_table != NULL) {
-			claim_zero(ibv_exp_destroy_rwq_ind_table
-				   (fdir_queue->ind_table));
-			fdir_queue->ind_table = NULL;
-		}
+		if (!rxq_ctrl->fdir_queue)
+			continue;
+		priv_fdir_queue_destroy(priv, rxq_ctrl->fdir_queue);
+		rxq_ctrl->fdir_queue = NULL;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 514e06f..2d98c99 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -745,6 +745,8 @@ rxq_cleanup(struct rxq_ctrl *rxq_ctrl)
 
 	DEBUG("cleaning up %p", (void *)rxq_ctrl);
 	rxq_free_elts(rxq_ctrl);
+	if (rxq_ctrl->fdir_queue != NULL)
+		priv_fdir_queue_destroy(rxq_ctrl->priv, rxq_ctrl->fdir_queue);
 	if (rxq_ctrl->if_wq != NULL) {
 		assert(rxq_ctrl->priv != NULL);
 		assert(rxq_ctrl->priv->ctx != NULL);
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 952f88c..f68149e 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -87,6 +87,8 @@ struct mlx5_txq_stats {
 struct fdir_queue {
 	struct ibv_qp *qp; /* Associated RX QP. */
 	struct ibv_exp_rwq_ind_table *ind_table; /* Indirection table. */
+	struct ibv_exp_wq *wq; /* Work queue. */
+	struct ibv_cq *cq; /* Completion queue. */
 };
 
 struct priv;
@@ -128,7 +130,7 @@ struct rxq_ctrl {
 	struct ibv_cq *cq; /* Completion Queue. */
 	struct ibv_exp_wq *wq; /* Work Queue. */
 	struct ibv_exp_res_domain *rd; /* Resource Domain. */
-	struct fdir_queue fdir_queue; /* Flow director queue. */
+	struct fdir_queue *fdir_queue; /* Flow director queue. */
 	struct ibv_mr *mr; /* Memory Region (for mp). */
 	struct ibv_exp_wq_family *if_wq; /* WQ burst interface. */
 	struct ibv_exp_cq_family_v1 *if_cq; /* CQ interface. */
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 07/12] net/mlx5: fix flow director drop mode
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (21 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 06/12] net/mlx5: refactor allocation of flow director queues Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 08/12] net/mlx5: re-factorize functions Nelio Laranjeiro
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Yaacov Hazan, Adrien Mazarguil

From: Yaacov Hazan <yaacovh@mellanox.com>

[ upstream commit ade188a51664a32eea120a995ce84661294cc8af ]

Rejected packets were routed to a polled queue. This patch routes them
to a dummy queue which is not polled.
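
How an application requests this behavior, as a hedged usage sketch
against the 16.07-era filter API (match fields and error handling
omitted):

    #include <rte_ethdev.h>

    static int
    add_reject_rule(uint8_t port_id, struct rte_eth_fdir_filter *f)
    {
            f->action.behavior = RTE_ETH_FDIR_REJECT;
            return rte_eth_dev_filter_ctrl(port_id, RTE_ETH_FILTER_FDIR,
                                           RTE_ETH_FILTER_ADD, f);
    }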

Fixes: 76f5c99e6840 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 doc/guides/nics/mlx5.rst     |  3 ++-
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_fdir.c | 41 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 5c10cd3..8923173 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -84,7 +84,8 @@ Features
 - Promiscuous mode.
 - Multicast promiscuous mode.
 - Hardware checksum offloads.
-- Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN).
+- Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and
+  RTE_ETH_FDIR_REJECT).
 - Secondary process TX is supported.
 - KVM and VMware ESX SR-IOV modes are supported.
 
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9d78dcd..82172e4 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -134,6 +134,7 @@ struct priv {
 	unsigned int (*reta_idx)[]; /* RETA index table. */
 	unsigned int reta_idx_n; /* RETA index size. */
 	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
+	struct fdir_queue *fdir_drop_queue; /* Flow director drop queue. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_fdir.c b/drivers/net/mlx5/mlx5_fdir.c
index 4a82dc9..1acf682 100644
--- a/drivers/net/mlx5/mlx5_fdir.c
+++ b/drivers/net/mlx5/mlx5_fdir.c
@@ -75,6 +75,7 @@ struct fdir_flow_desc {
 struct mlx5_fdir_filter {
 	LIST_ENTRY(mlx5_fdir_filter) next;
 	uint16_t queue; /* Queue assigned to if FDIR match. */
+	enum rte_eth_fdir_behavior behavior;
 	struct fdir_flow_desc desc;
 	struct ibv_exp_flow *flow;
 };
@@ -567,6 +568,33 @@ priv_get_fdir_queue(struct priv *priv, uint16_t idx)
 }
 
 /**
+ * Get the flow director drop queue. Create it if it does not exist.
+ *
+ * @param priv
+ *   Private structure.
+ *
+ * @return
+ *   Flow director drop queue on success, NULL otherwise.
+ */
+static struct fdir_queue *
+priv_get_fdir_drop_queue(struct priv *priv)
+{
+	struct fdir_queue *fdir_queue = priv->fdir_drop_queue;
+
+	if (fdir_queue == NULL) {
+		unsigned int socket = SOCKET_ID_ANY;
+
+		/* Select a known NUMA socket if possible. */
+		if (priv->rxqs_n && (*priv->rxqs)[0])
+			socket = container_of((*priv->rxqs)[0],
+					      struct rxq_ctrl, rxq)->socket;
+		fdir_queue = priv_fdir_queue_create(priv, NULL, socket);
+		priv->fdir_drop_queue = fdir_queue;
+	}
+	return fdir_queue;
+}
+
+/**
  * Enable flow director filter and create steering rules.
  *
  * @param priv
@@ -588,7 +616,11 @@ priv_fdir_filter_enable(struct priv *priv,
 		return 0;
 
 	/* Get fdir_queue for specific queue. */
-	fdir_queue = priv_get_fdir_queue(priv, mlx5_fdir_filter->queue);
+	if (mlx5_fdir_filter->behavior == RTE_ETH_FDIR_REJECT)
+		fdir_queue = priv_get_fdir_drop_queue(priv);
+	else
+		fdir_queue = priv_get_fdir_queue(priv,
+						 mlx5_fdir_filter->queue);
 
 	if (fdir_queue == NULL) {
 		ERROR("failed to create flow director rxq for queue %d",
@@ -707,6 +739,10 @@ priv_fdir_disable(struct priv *priv)
 		priv_fdir_queue_destroy(priv, rxq_ctrl->fdir_queue);
 		rxq_ctrl->fdir_queue = NULL;
 	}
+	if (priv->fdir_drop_queue) {
+		priv_fdir_queue_destroy(priv, priv->fdir_drop_queue);
+		priv->fdir_drop_queue = NULL;
+	}
 }
 
 /**
@@ -807,8 +843,9 @@ priv_fdir_filter_add(struct priv *priv,
 		return err;
 	}
 
-	/* Set queue. */
+	/* Set action parameters. */
 	mlx5_fdir_filter->queue = fdir_filter->action.rx_queue;
+	mlx5_fdir_filter->behavior = fdir_filter->action.behavior;
 
 	/* Convert to mlx5 filter descriptor. */
 	fdir_filter_to_flow_desc(fdir_filter,
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 08/12] net/mlx5: re-factorize functions
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (22 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 07/12] net/mlx5: fix flow director drop mode Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 09/12] net/mlx5: fix inline logic Nelio Laranjeiro
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

[ upstream commit d772d4408d3f71d89ab2c50bf02c553c9e11d4db ]

Rework the logic of wqe_write() and wqe_write_vlan(), which are very
similar, in order to keep a single function.
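
The VLAN-specific part of the duplicated code boils down to splicing a
tag in after the MAC addresses. A standalone sketch of that step (not
the WQE-writing code itself; out must have room for len + 4 bytes):

    #include <string.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <arpa/inet.h>

    /* Copy a frame while inserting an 802.1Q tag: 12 MAC bytes, the
     * 4-byte tag, then the original EtherType and payload. */
    static size_t
    insert_vlan_tag(uint8_t *out, const uint8_t *frame, size_t len,
                    uint16_t vlan_tci)
    {
            uint32_t tag = htonl(0x81000000 | vlan_tci);

            memcpy(out, frame, 12);
            memcpy(out + 12, &tag, sizeof(tag));
            memcpy(out + 16, frame + 12, len - 12);
            return len + sizeof(tag);
    }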

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5_rxtx.c | 98 ++++++++++----------------------------------
 1 file changed, 22 insertions(+), 76 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 59e8183..47d6d68 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -290,8 +290,8 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  *   Pointer to TX queue structure.
  * @param wqe
  *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
+ * @param buf
+ *   Buffer.
  * @param length
  *   Packet length.
  * @param lkey
@@ -299,54 +299,24 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  */
 static inline void
 mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe *wqe,
-	       uintptr_t addr, uint32_t length, uint32_t lkey)
-{
-	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
-	wqe->wqe.ctrl.data[2] = 0;
-	wqe->wqe.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(MLX5_ETH_INLINE_HEADER_SIZE);
-	/* Copy the first 16 bytes into inline header. */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->wqe.eseg.inline_hdr_start,
-		   (uint8_t *)(uintptr_t)addr,
-		   MLX5_ETH_INLINE_HEADER_SIZE);
-	addr += MLX5_ETH_INLINE_HEADER_SIZE;
-	length -= MLX5_ETH_INLINE_HEADER_SIZE;
-	/* Store remaining data in data segment. */
-	wqe->wqe.dseg.byte_count = htonl(length);
-	wqe->wqe.dseg.lkey = lkey;
-	wqe->wqe.dseg.addr = htonll(addr);
-	/* Increment consumer index. */
-	++txq->wqe_ci;
-}
-
-/**
- * Write a regular WQE with VLAN.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- * @param vlan_tci
- *   VLAN field to insert in packet.
- */
-static inline void
-mlx5_wqe_write_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
-		    uintptr_t addr, uint32_t length, uint32_t lkey,
-		    uint16_t vlan_tci)
+	       struct rte_mbuf *buf, uint32_t length, uint32_t lkey)
 {
-	uint32_t vlan = htonl(0x81000000 | vlan_tci);
-
+	uintptr_t addr = rte_pktmbuf_mtod(buf, uintptr_t);
+
+	rte_mov16((uint8_t *)&wqe->wqe.eseg.inline_hdr_start,
+		  (uint8_t *)addr);
+	addr += 16;
+	length -= 16;
+	/* Need to insert VLAN ? */
+	if (buf->ol_flags & PKT_TX_VLAN_PKT) {
+		uint32_t vlan = htonl(0x81000000 | buf->vlan_tci);
+
+		memcpy((uint8_t *)&wqe->wqe.eseg.inline_hdr_start + 12,
+		       &vlan, sizeof(vlan));
+		addr -= sizeof(vlan);
+		length += sizeof(vlan);
+	}
+	/* Write the WQE. */
 	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
 	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
 	wqe->wqe.ctrl.data[2] = 0;
@@ -355,20 +325,7 @@ mlx5_wqe_write_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
 	wqe->inl.eseg.rsvd1 = 0;
 	wqe->inl.eseg.mss = 0;
 	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(MLX5_ETH_VLAN_INLINE_HEADER_SIZE);
-	/*
-	 * Copy 12 bytes of source & destination MAC address.
-	 * Copy 4 bytes of VLAN.
-	 * Copy 2 bytes of Ether type.
-	 */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->wqe.eseg.inline_hdr_start,
-		   (uint8_t *)(uintptr_t)addr, 12);
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->wqe.eseg.inline_hdr_start + 12),
-		   &vlan, sizeof(vlan));
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->wqe.eseg.inline_hdr_start + 16),
-		   (uint8_t *)((uintptr_t)addr + 12), 2);
-	addr += MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	length -= MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
+	wqe->wqe.eseg.inline_hdr_sz = htons(16);
 	/* Store remaining data in data segment. */
 	wqe->wqe.dseg.byte_count = htonl(length);
 	wqe->wqe.dseg.lkey = lkey;
@@ -609,7 +566,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 	do {
 		struct rte_mbuf *buf = *(pkts++);
 		unsigned int elts_head_next;
-		uintptr_t addr;
 		uint32_t length;
 		uint32_t lkey;
 		unsigned int segs_n = buf->nb_segs;
@@ -631,8 +587,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		rte_prefetch0(wqe);
 		if (pkts_n)
 			rte_prefetch0(*pkts);
-		/* Retrieve buffer information. */
-		addr = rte_pktmbuf_mtod(buf, uintptr_t);
 		length = DATA_LEN(buf);
 		/* Update element. */
 		(*txq->elts)[elts_head] = buf;
@@ -642,11 +596,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 						       volatile void *));
 		/* Retrieve Memory Region key for this memory pool. */
 		lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-		if (buf->ol_flags & PKT_TX_VLAN_PKT)
-			mlx5_wqe_write_vlan(txq, wqe, addr, length, lkey,
-					    buf->vlan_tci);
-		else
-			mlx5_wqe_write(txq, wqe, addr, length, lkey);
+		mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		/* Should we enable HW CKSUM offload */
 		if (buf->ol_flags &
 		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -810,11 +760,7 @@ mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		} else {
 			/* Retrieve Memory Region key for this memory pool. */
 			lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			if (buf->ol_flags & PKT_TX_VLAN_PKT)
-				mlx5_wqe_write_vlan(txq, wqe, addr, length,
-						    lkey, buf->vlan_tci);
-			else
-				mlx5_wqe_write(txq, wqe, addr, length, lkey);
+			mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		}
 		while (--segs_n) {
 			/*
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 09/12] net/mlx5: fix inline logic
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (23 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 08/12] net/mlx5: re-factorize functions Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 10/12] net/mlx5: fix initialization in secondary process Nelio Laranjeiro
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu
  Cc: Nélio Laranjeiro, Adrien Mazarguil, Vasily Philipov

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

[ upstream commit 0e8679fcddc45902cd8aa1d0fbfa542fee11b074 ]

For best performance the NIC expects large packets to be referenced by
a pointer to a cache-aligned address; the old inline code could break
this assumption, which hurts performance.
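
On the Tx queue setup side this translates into rounding the
configured inline size up to whole cache lines; the arithmetic in
isolation (64-byte cache line assumed, RTE_CACHE_LINE_SIZE in the
driver):

    #define CACHE_LINE 64u

    /* Cache lines needed to hold txq_inline bytes; max_inline_data is
     * then this count multiplied back by CACHE_LINE. */
    static unsigned int
    inline_cache_lines(unsigned int txq_inline)
    {
            return (txq_inline + CACHE_LINE - 1) / CACHE_LINE;
    }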

Fixes: 2a66cf378954 ("net/mlx5: support inline send")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
---
 drivers/net/mlx5/mlx5_ethdev.c |   4 -
 drivers/net/mlx5/mlx5_rxtx.c   | 422 ++++++++++-------------------------------
 drivers/net/mlx5/mlx5_rxtx.h   |   3 +-
 drivers/net/mlx5/mlx5_txq.c    |   9 +-
 4 files changed, 103 insertions(+), 335 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index cdb0dd5..a3e8427 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1401,10 +1401,6 @@ priv_select_tx_function(struct priv *priv)
 	} else if ((priv->sriov == 0) && priv->mps) {
 		priv->dev->tx_pkt_burst = mlx5_tx_burst_mpw;
 		DEBUG("selected MPW TX function");
-	} else if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
-		priv->dev->tx_pkt_burst = mlx5_tx_burst_inline;
-		DEBUG("selected inline TX function (%u >= %u queues)",
-		      priv->txqs_n, priv->txqs_inline);
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 47d6d68..bb76f2c 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -294,179 +294,99 @@ txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
  *   Buffer.
  * @param length
  *   Packet length.
- * @param lkey
- *   Memory region lkey.
+ *
+ * @return ds
+ *   Number of DS elements consumed.
  */
-static inline void
+static inline unsigned int
 mlx5_wqe_write(struct txq *txq, volatile union mlx5_wqe *wqe,
-	       struct rte_mbuf *buf, uint32_t length, uint32_t lkey)
+	       struct rte_mbuf *buf, uint32_t length)
 {
+	uintptr_t raw = (uintptr_t)&wqe->wqe.eseg.inline_hdr_start;
+	uint16_t ds;
+	uint16_t pkt_inline_sz = 16;
 	uintptr_t addr = rte_pktmbuf_mtod(buf, uintptr_t);
+	struct mlx5_wqe_data_seg *dseg = NULL;
 
-	rte_mov16((uint8_t *)&wqe->wqe.eseg.inline_hdr_start,
-		  (uint8_t *)addr);
-	addr += 16;
+	assert(length >= 16);
+	/* Start the known and common part of the WQE structure. */
+	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
+	wqe->wqe.ctrl.data[2] = 0;
+	wqe->wqe.ctrl.data[3] = 0;
+	wqe->wqe.eseg.rsvd0 = 0;
+	wqe->wqe.eseg.rsvd1 = 0;
+	wqe->wqe.eseg.mss = 0;
+	wqe->wqe.eseg.rsvd2 = 0;
+	/* Start by copying the Ethernet Header. */
+	rte_mov16((uint8_t *)raw, (uint8_t *)addr);
 	length -= 16;
-	/* Need to insert VLAN ? */
+	addr += 16;
+	/* Replace the Ethernet type by the VLAN if necessary. */
 	if (buf->ol_flags & PKT_TX_VLAN_PKT) {
 		uint32_t vlan = htonl(0x81000000 | buf->vlan_tci);
 
-		memcpy((uint8_t *)&wqe->wqe.eseg.inline_hdr_start + 12,
+		memcpy((uint8_t *)(raw + 16 - sizeof(vlan)),
 		       &vlan, sizeof(vlan));
 		addr -= sizeof(vlan);
 		length += sizeof(vlan);
 	}
-	/* Write the WQE. */
-	wqe->wqe.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->wqe.ctrl.data[1] = htonl((txq->qp_num_8s) | 4);
-	wqe->wqe.ctrl.data[2] = 0;
-	wqe->wqe.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->wqe.eseg.inline_hdr_sz = htons(16);
-	/* Store remaining data in data segment. */
-	wqe->wqe.dseg.byte_count = htonl(length);
-	wqe->wqe.dseg.lkey = lkey;
-	wqe->wqe.dseg.addr = htonll(addr);
-	/* Increment consumer index. */
-	++txq->wqe_ci;
-}
-
-/**
- * Write a inline WQE.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- */
-static inline void
-mlx5_wqe_write_inline(struct txq *txq, volatile union mlx5_wqe *wqe,
-		      uintptr_t addr, uint32_t length)
-{
-	uint32_t size;
-	uint16_t wqe_cnt = txq->wqe_n - 1;
-	uint16_t wqe_ci = txq->wqe_ci + 1;
-
-	/* Copy the first 16 bytes into inline header. */
-	rte_memcpy((void *)(uintptr_t)wqe->inl.eseg.inline_hdr_start,
-		   (void *)(uintptr_t)addr,
-		   MLX5_ETH_INLINE_HEADER_SIZE);
-	addr += MLX5_ETH_INLINE_HEADER_SIZE;
-	length -= MLX5_ETH_INLINE_HEADER_SIZE;
-	size = 3 + ((4 + length + 15) / 16);
-	wqe->inl.byte_cnt = htonl(length | MLX5_INLINE_SEG);
-	rte_memcpy((void *)(uintptr_t)&wqe->inl.data[0],
-		   (void *)addr, MLX5_WQE64_INL_DATA);
-	addr += MLX5_WQE64_INL_DATA;
-	length -= MLX5_WQE64_INL_DATA;
-	while (length) {
-		volatile union mlx5_wqe *wqe_next =
-			&(*txq->wqes)[wqe_ci & wqe_cnt];
-		uint32_t copy_bytes = (length > sizeof(*wqe)) ?
-				      sizeof(*wqe) :
-				      length;
-
-		rte_mov64((uint8_t *)(uintptr_t)&wqe_next->data[0],
-			  (uint8_t *)addr);
-		addr += copy_bytes;
-		length -= copy_bytes;
-		++wqe_ci;
-	}
-	assert(size < 64);
-	wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
-	wqe->inl.ctrl.data[2] = 0;
-	wqe->inl.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->inl.eseg.inline_hdr_sz = htons(MLX5_ETH_INLINE_HEADER_SIZE);
-	/* Increment consumer index. */
-	txq->wqe_ci = wqe_ci;
-}
-
-/**
- * Write a inline WQE with VLAN.
- *
- * @param txq
- *   Pointer to TX queue structure.
- * @param wqe
- *   Pointer to the WQE to fill.
- * @param addr
- *   Buffer data address.
- * @param length
- *   Packet length.
- * @param lkey
- *   Memory region lkey.
- * @param vlan_tci
- *   VLAN field to insert in packet.
- */
-static inline void
-mlx5_wqe_write_inline_vlan(struct txq *txq, volatile union mlx5_wqe *wqe,
-			   uintptr_t addr, uint32_t length, uint16_t vlan_tci)
-{
-	uint32_t size;
-	uint32_t wqe_cnt = txq->wqe_n - 1;
-	uint16_t wqe_ci = txq->wqe_ci + 1;
-	uint32_t vlan = htonl(0x81000000 | vlan_tci);
-
-	/*
-	 * Copy 12 bytes of source & destination MAC address.
-	 * Copy 4 bytes of VLAN.
-	 * Copy 2 bytes of Ether type.
-	 */
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start,
-		   (uint8_t *)addr, 12);
-	rte_memcpy((uint8_t *)(uintptr_t)wqe->inl.eseg.inline_hdr_start + 12,
-		   &vlan, sizeof(vlan));
-	rte_memcpy((uint8_t *)((uintptr_t)wqe->inl.eseg.inline_hdr_start + 16),
-		   (uint8_t *)(addr + 12), 2);
-	addr += MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	length -= MLX5_ETH_VLAN_INLINE_HEADER_SIZE - sizeof(vlan);
-	size = (sizeof(wqe->inl.ctrl.ctrl) +
-		sizeof(wqe->inl.eseg) +
-		sizeof(wqe->inl.byte_cnt) +
-		length + 15) / 16;
-	wqe->inl.byte_cnt = htonl(length | MLX5_INLINE_SEG);
-	rte_memcpy((void *)(uintptr_t)&wqe->inl.data[0],
-		   (void *)addr, MLX5_WQE64_INL_DATA);
-	addr += MLX5_WQE64_INL_DATA;
-	length -= MLX5_WQE64_INL_DATA;
-	while (length) {
-		volatile union mlx5_wqe *wqe_next =
-			&(*txq->wqes)[wqe_ci & wqe_cnt];
-		uint32_t copy_bytes = (length > sizeof(*wqe)) ?
-				      sizeof(*wqe) :
-				      length;
-
-		rte_mov64((uint8_t *)(uintptr_t)&wqe_next->data[0],
-			  (uint8_t *)addr);
-		addr += copy_bytes;
-		length -= copy_bytes;
-		++wqe_ci;
+	/* Inline if enough room. */
+	if (txq->max_inline != 0) {
+		uintptr_t end = (uintptr_t)&(*txq->wqes)[txq->wqe_n];
+		uint16_t max_inline = txq->max_inline * RTE_CACHE_LINE_SIZE;
+		uint16_t room;
+
+		raw += 16;
+		room = end - (uintptr_t)raw;
+		if (room > max_inline) {
+			uintptr_t addr_end = (addr + max_inline) &
+				~(RTE_CACHE_LINE_SIZE - 1);
+			uint16_t copy_b = ((addr_end - addr) > length) ?
+					  length :
+					  (addr_end - addr);
+
+			rte_memcpy((void *)raw, (void *)addr, copy_b);
+			addr += copy_b;
+			length -= copy_b;
+			pkt_inline_sz += copy_b;
+			/* Sanity check. */
+			assert(addr <= addr_end);
+		}
+		/* Store the inlined packet size in the WQE. */
+		wqe->wqe.eseg.inline_hdr_sz = htons(pkt_inline_sz);
+		/*
+		 * 2 DWORDs consumed by the WQE header + 1 DSEG +
+		 * the size of the inline part of the packet.
+		 */
+		ds = 2 + ((pkt_inline_sz - 2 + 15) / 16);
+		if (length > 0) {
+			dseg = (struct mlx5_wqe_data_seg *)
+				((uintptr_t)wqe + (ds * 16));
+			if ((uintptr_t)dseg >= end)
+				dseg = (struct mlx5_wqe_data_seg *)
+					((uintptr_t)&(*txq->wqes)[0]);
+			goto use_dseg;
+		}
+	} else {
+		/* Add the remaining packet as a simple ds. */
+		ds = 3;
+		/*
+		 * No inlining has been done in the packet; only the
+		 * Ethernet header has been stored.
+		 */
+		wqe->wqe.eseg.inline_hdr_sz = htons(16);
+		dseg = (struct mlx5_wqe_data_seg *)
+			((uintptr_t)wqe + (ds * 16));
+use_dseg:
+		*dseg = (struct mlx5_wqe_data_seg) {
+			.addr = htonll(addr),
+			.byte_count = htonl(length),
+			.lkey = txq_mp2mr(txq, txq_mb2mp(buf)),
+		};
+		++ds;
 	}
-	assert(size < 64);
-	wqe->inl.ctrl.data[0] = htonl((txq->wqe_ci << 8) | MLX5_OPCODE_SEND);
-	wqe->inl.ctrl.data[1] = htonl(txq->qp_num_8s | size);
-	wqe->inl.ctrl.data[2] = 0;
-	wqe->inl.ctrl.data[3] = 0;
-	wqe->inl.eseg.rsvd0 = 0;
-	wqe->inl.eseg.rsvd1 = 0;
-	wqe->inl.eseg.mss = 0;
-	wqe->inl.eseg.rsvd2 = 0;
-	wqe->inl.eseg.inline_hdr_sz = htons(MLX5_ETH_VLAN_INLINE_HEADER_SIZE);
-	/* Increment consumer index. */
-	txq->wqe_ci = wqe_ci;
+	wqe->wqe.ctrl.data[1] = htonl(txq->qp_num_8s | ds);
+	return ds;
 }
 
 /**
@@ -567,7 +487,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		struct rte_mbuf *buf = *(pkts++);
 		unsigned int elts_head_next;
 		uint32_t length;
-		uint32_t lkey;
 		unsigned int segs_n = buf->nb_segs;
 		volatile struct mlx5_wqe_data_seg *dseg;
 		unsigned int ds = sizeof(*wqe) / 16;
@@ -583,8 +502,8 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		--pkts_n;
 		elts_head_next = (elts_head + 1) & (elts_n - 1);
 		wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
-		dseg = &wqe->wqe.dseg;
-		rte_prefetch0(wqe);
+		tx_prefetch_wqe(txq, txq->wqe_ci);
+		tx_prefetch_wqe(txq, txq->wqe_ci + 1);
 		if (pkts_n)
 			rte_prefetch0(*pkts);
 		length = DATA_LEN(buf);
@@ -594,9 +513,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		if (pkts_n)
 			rte_prefetch0(rte_pktmbuf_mtod(*pkts,
 						       volatile void *));
-		/* Retrieve Memory Region key for this memory pool. */
-		lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-		mlx5_wqe_write(txq, wqe, buf, length, lkey);
 		/* Should we enable HW CKSUM offload */
 		if (buf->ol_flags &
 		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
@@ -606,6 +522,11 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		} else {
 			wqe->wqe.eseg.cs_flags = 0;
 		}
+		ds = mlx5_wqe_write(txq, wqe, buf, length);
+		if (segs_n == 1)
+			goto skip_segs;
+		dseg = (volatile struct mlx5_wqe_data_seg *)
+			(((uintptr_t)wqe) + ds * 16);
 		while (--segs_n) {
 			/*
 			 * Spill on next WQE when the current one does not have
@@ -636,11 +557,13 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 		/* Update DS field in WQE. */
 		wqe->wqe.ctrl.data[1] &= htonl(0xffffffc0);
 		wqe->wqe.ctrl.data[1] |= htonl(ds & 0x3f);
-		elts_head = elts_head_next;
+skip_segs:
 #ifdef MLX5_PMD_SOFT_COUNTERS
 		/* Increment sent bytes counter. */
 		txq->stats.obytes += length;
 #endif
+		/* Increment consumer index. */
+		txq->wqe_ci += (ds + 3) / 4;
 		elts_head = elts_head_next;
 		++i;
 	} while (pkts_n);
@@ -669,162 +592,6 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
 }
 
 /**
- * DPDK callback for TX with inline support.
- *
- * @param dpdk_txq
- *   Generic pointer to TX queue structure.
- * @param[in] pkts
- *   Packets to transmit.
- * @param pkts_n
- *   Number of packets in array.
- *
- * @return
- *   Number of packets successfully transmitted (<= pkts_n).
- */
-uint16_t
-mlx5_tx_burst_inline(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
-{
-	struct txq *txq = (struct txq *)dpdk_txq;
-	uint16_t elts_head = txq->elts_head;
-	const unsigned int elts_n = txq->elts_n;
-	unsigned int i = 0;
-	unsigned int j = 0;
-	unsigned int max;
-	unsigned int comp;
-	volatile union mlx5_wqe *wqe = NULL;
-	unsigned int max_inline = txq->max_inline;
-
-	if (unlikely(!pkts_n))
-		return 0;
-	/* Prefetch first packet cacheline. */
-	tx_prefetch_cqe(txq, txq->cq_ci);
-	tx_prefetch_cqe(txq, txq->cq_ci + 1);
-	rte_prefetch0(*pkts);
-	/* Start processing. */
-	txq_complete(txq);
-	max = (elts_n - (elts_head - txq->elts_tail));
-	if (max > elts_n)
-		max -= elts_n;
-	do {
-		struct rte_mbuf *buf = *(pkts++);
-		unsigned int elts_head_next;
-		uintptr_t addr;
-		uint32_t length;
-		uint32_t lkey;
-		unsigned int segs_n = buf->nb_segs;
-		volatile struct mlx5_wqe_data_seg *dseg;
-		unsigned int ds = sizeof(*wqe) / 16;
-
-		/*
-		 * Make sure there is enough room to store this packet and
-		 * that one ring entry remains unused.
-		 */
-		assert(segs_n);
-		if (max < segs_n + 1)
-			break;
-		max -= segs_n;
-		--pkts_n;
-		elts_head_next = (elts_head + 1) & (elts_n - 1);
-		wqe = &(*txq->wqes)[txq->wqe_ci & (txq->wqe_n - 1)];
-		dseg = &wqe->wqe.dseg;
-		tx_prefetch_wqe(txq, txq->wqe_ci);
-		tx_prefetch_wqe(txq, txq->wqe_ci + 1);
-		if (pkts_n)
-			rte_prefetch0(*pkts);
-		/* Should we enable HW CKSUM offload */
-		if (buf->ol_flags &
-		    (PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM | PKT_TX_UDP_CKSUM)) {
-			wqe->inl.eseg.cs_flags =
-				MLX5_ETH_WQE_L3_CSUM |
-				MLX5_ETH_WQE_L4_CSUM;
-		} else {
-			wqe->inl.eseg.cs_flags = 0;
-		}
-		/* Retrieve buffer information. */
-		addr = rte_pktmbuf_mtod(buf, uintptr_t);
-		length = DATA_LEN(buf);
-		/* Update element. */
-		(*txq->elts)[elts_head] = buf;
-		/* Prefetch next buffer data. */
-		if (pkts_n)
-			rte_prefetch0(rte_pktmbuf_mtod(*pkts,
-						       volatile void *));
-		if ((length <= max_inline) && (segs_n == 1)) {
-			if (buf->ol_flags & PKT_TX_VLAN_PKT)
-				mlx5_wqe_write_inline_vlan(txq, wqe,
-							   addr, length,
-							   buf->vlan_tci);
-			else
-				mlx5_wqe_write_inline(txq, wqe, addr, length);
-			goto skip_segs;
-		} else {
-			/* Retrieve Memory Region key for this memory pool. */
-			lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			mlx5_wqe_write(txq, wqe, buf, length, lkey);
-		}
-		while (--segs_n) {
-			/*
-			 * Spill on next WQE when the current one does not have
-			 * enough room left. Size of WQE must a be a multiple
-			 * of data segment size.
-			 */
-			assert(!(sizeof(*wqe) % sizeof(*dseg)));
-			if (!(ds % (sizeof(*wqe) / 16)))
-				dseg = (volatile void *)
-					&(*txq->wqes)[txq->wqe_ci++ &
-						      (txq->wqe_n - 1)];
-			else
-				++dseg;
-			++ds;
-			buf = buf->next;
-			assert(buf);
-			/* Store segment information. */
-			dseg->byte_count = htonl(DATA_LEN(buf));
-			dseg->lkey = txq_mp2mr(txq, txq_mb2mp(buf));
-			dseg->addr = htonll(rte_pktmbuf_mtod(buf, uintptr_t));
-			(*txq->elts)[elts_head_next] = buf;
-			elts_head_next = (elts_head_next + 1) & (elts_n - 1);
-#ifdef MLX5_PMD_SOFT_COUNTERS
-			length += DATA_LEN(buf);
-#endif
-			++j;
-		}
-		/* Update DS field in WQE. */
-		wqe->inl.ctrl.data[1] &= htonl(0xffffffc0);
-		wqe->inl.ctrl.data[1] |= htonl(ds & 0x3f);
-skip_segs:
-		elts_head = elts_head_next;
-#ifdef MLX5_PMD_SOFT_COUNTERS
-		/* Increment sent bytes counter. */
-		txq->stats.obytes += length;
-#endif
-		++i;
-	} while (pkts_n);
-	/* Take a shortcut if nothing must be sent. */
-	if (unlikely(i == 0))
-		return 0;
-	/* Check whether completion threshold has been reached. */
-	comp = txq->elts_comp + i + j;
-	if (comp >= MLX5_TX_COMP_THRESH) {
-		/* Request completion on last WQE. */
-		wqe->inl.ctrl.data[2] = htonl(8);
-		/* Save elts_head in unused "immediate" field of WQE. */
-		wqe->inl.ctrl.data[3] = elts_head;
-		txq->elts_comp = 0;
-	} else {
-		txq->elts_comp = comp;
-	}
-#ifdef MLX5_PMD_SOFT_COUNTERS
-	/* Increment sent packets counter. */
-	txq->stats.opackets += i;
-#endif
-	/* Ring QP doorbell. */
-	mlx5_tx_dbrec(txq);
-	txq->elts_head = elts_head;
-	return i;
-}
-
-/**
  * Open a MPW session.
  *
  * @param txq
@@ -1114,7 +881,7 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 	unsigned int j = 0;
 	unsigned int max;
 	unsigned int comp;
-	unsigned int inline_room = txq->max_inline;
+	unsigned int inline_room = txq->max_inline * RTE_CACHE_LINE_SIZE;
 	struct mlx5_mpw mpw = {
 		.state = MLX5_MPW_STATE_CLOSED,
 	};
@@ -1168,7 +935,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 			    (length > inline_room) ||
 			    (mpw.wqe->mpw_inl.eseg.cs_flags != cs_flags)) {
 				mlx5_mpw_inline_close(txq, &mpw);
-				inline_room = txq->max_inline;
+				inline_room =
+					txq->max_inline * RTE_CACHE_LINE_SIZE;
 			}
 		}
 		if (mpw.state == MLX5_MPW_STATE_CLOSED) {
@@ -1184,7 +952,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 		/* Multi-segment packets must be alone in their MPW. */
 		assert((segs_n == 1) || (mpw.pkts_n == 0));
 		if (mpw.state == MLX5_MPW_STATE_OPENED) {
-			assert(inline_room == txq->max_inline);
+			assert(inline_room ==
+			       txq->max_inline * RTE_CACHE_LINE_SIZE);
 #if defined(MLX5_PMD_SOFT_COUNTERS) || !defined(NDEBUG)
 			length = 0;
 #endif
@@ -1249,7 +1018,8 @@ mlx5_tx_burst_mpw_inline(void *dpdk_txq, struct rte_mbuf **pkts,
 			++j;
 			if (mpw.pkts_n == MLX5_MPW_DSEG_MAX) {
 				mlx5_mpw_inline_close(txq, &mpw);
-				inline_room = txq->max_inline;
+				inline_room =
+					txq->max_inline * RTE_CACHE_LINE_SIZE;
 			} else {
 				inline_room -= length;
 			}
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index f68149e..05779ef 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -249,7 +249,7 @@ struct txq {
 	uint16_t wqe_n; /* Number of WQ elements. */
 	uint16_t bf_offset; /* Blueflame offset. */
 	uint16_t bf_buf_size; /* Blueflame size. */
-	uint16_t max_inline; /* Maximum size to inline in a WQE. */
+	uint16_t max_inline; /* Multiple of RTE_CACHE_LINE_SIZE to inline. */
 	uint32_t qp_num_8s; /* QP number shifted by 8. */
 	volatile struct mlx5_cqe (*cqes)[]; /* Completion queue. */
 	volatile union mlx5_wqe (*wqes)[]; /* Work queue. */
@@ -314,7 +314,6 @@ uint16_t mlx5_tx_burst_secondary_setup(void *, struct rte_mbuf **, uint16_t);
 /* mlx5_rxtx.c */
 
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
-uint16_t mlx5_tx_burst_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_tx_burst_mpw(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_tx_burst_mpw_inline(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 476ce79..e4510ef 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -338,9 +338,12 @@ txq_ctrl_setup(struct rte_eth_dev *dev, struct txq_ctrl *txq_ctrl,
 		.comp_mask = (IBV_EXP_QP_INIT_ATTR_PD |
 			      IBV_EXP_QP_INIT_ATTR_RES_DOMAIN),
 	};
-	if (priv->txq_inline && priv->txqs_n >= priv->txqs_inline) {
-		tmpl.txq.max_inline = priv->txq_inline;
-		attr.init.cap.max_inline_data = tmpl.txq.max_inline;
+	if (priv->txq_inline && (priv->txqs_n >= priv->txqs_inline)) {
+		tmpl.txq.max_inline =
+			((priv->txq_inline + (RTE_CACHE_LINE_SIZE - 1)) /
+			 RTE_CACHE_LINE_SIZE);
+		attr.init.cap.max_inline_data =
+			tmpl.txq.max_inline * RTE_CACHE_LINE_SIZE;
 	}
 	tmpl.qp = ibv_exp_create_qp(priv->ctx, &attr.init);
 	if (tmpl.qp == NULL) {
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 10/12] net/mlx5: fix initialization in secondary process
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (24 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 09/12] net/mlx5: fix inline logic Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 11/12] net/mlx5: fix link speed capability information Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 12/12] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Olivier Gournet, Adrien Mazarguil

From: Olivier Gournet <ogournet@corp.free.fr>

[ backported from upstream commit 69491883cb3c10be73b55316b813909dcc61d8c5 ]

The changes introduced by the previous commits (the ones in the Fixes
lines) made secondary processes attempt to reinitialize the Tx queue
structures of the primary process instead of their own, for which they
also did not allocate enough memory, leading to crashes.
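
The fix below relies on the control structure being allocated together
with its trailing mbuf pointer ring; a minimal sketch of that pattern,
with plain calloc() standing in for rte_calloc_socket() and purely
illustrative names:

#include <stdlib.h>

/* A control structure followed by its element ring in one block. */
struct ring_ctrl {
	unsigned int elts_n; /* log2 of the number of elements */
	void *elts[];        /* flexible array sized at allocation time */
};

static struct ring_ctrl *
ring_alloc(unsigned int elts_n)
{
	/* One allocation: header plus 2^elts_n element pointers. */
	struct ring_ctrl *ctrl =
		calloc(1, sizeof(*ctrl) + (1u << elts_n) * sizeof(void *));

	if (ctrl != NULL)
		ctrl->elts_n = elts_n;
	return ctrl;
}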

Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
Fixes: 21c8bb4928c9 ("net/mlx5: split Tx queue structure")

Signed-off-by: Olivier Gournet <ogournet@corp.free.fr>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx5/mlx5_ethdev.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index a3e8427..90ad467 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1312,11 +1312,13 @@ mlx5_secondary_data_setup(struct priv *priv)
 			continue;
 		primary_txq_ctrl = container_of(primary_txq,
 						struct txq_ctrl, txq);
-		txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl), 0,
+		txq_ctrl = rte_calloc_socket("TXQ", 1, sizeof(*txq_ctrl) +
+					     (1 << primary_txq->elts_n) *
+					     sizeof(struct rte_mbuf *), 0,
 					     primary_txq_ctrl->socket);
 		if (txq_ctrl != NULL) {
 			if (txq_ctrl_setup(priv->dev,
-					   primary_txq_ctrl,
+					   txq_ctrl,
 					   primary_txq->elts_n,
 					   primary_txq_ctrl->socket,
 					   NULL) == 0) {
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 11/12] net/mlx5: fix link speed capability information
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (25 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 10/12] net/mlx5: fix initialization in secondary process Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 12/12] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

[ upstream commit 75ef62a9430116e7a0d00fbec68af05a5333c116 ]

Make the hard-coded values dynamic so that the correct link speed
capabilities are returned (not all ConnectX-4 NICs support every speed).
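
The approach is to derive the capability mask from what ethtool reports
instead of hard-coding it; a reduced sketch, where the SPEED_CAPA_*
values are illustrative stand-ins for DPDK's ETH_LINK_SPEED_* flags:

#include <stdint.h>
#include <linux/ethtool.h>

/* Illustrative stand-ins for the DPDK ETH_LINK_SPEED_* flags. */
#define SPEED_CAPA_AUTONEG (UINT32_C(1) << 0)
#define SPEED_CAPA_1G      (UINT32_C(1) << 1)
#define SPEED_CAPA_10G     (UINT32_C(1) << 2)

static uint32_t
speed_capa_from_supported(uint32_t supported)
{
	uint32_t capa = 0;

	if (supported & SUPPORTED_Autoneg)
		capa |= SPEED_CAPA_AUTONEG;
	if (supported & (SUPPORTED_1000baseT_Full |
			 SUPPORTED_1000baseKX_Full))
		capa |= SPEED_CAPA_1G;
	if (supported & SUPPORTED_10000baseKR_Full)
		capa |= SPEED_CAPA_10G;
	return capa;
}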

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 25 +++++++++++++++----------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 82172e4..79b7a60 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -135,6 +135,7 @@ struct priv {
 	unsigned int reta_idx_n; /* RETA index size. */
 	struct fdir_filter_list *fdir_filter_list; /* Flow director rules. */
 	struct fdir_queue *fdir_drop_queue; /* Flow director drop queue. */
+	uint32_t link_speed_capa; /* Link speed capabilities. */
 	rte_spinlock_t lock; /* Lock for control functions. */
 };
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 90ad467..20c1c8a 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -603,15 +603,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	info->hash_key_size = ((*priv->rss_conf) ?
 			       (*priv->rss_conf)[0]->rss_key_len :
 			       0);
-	info->speed_capa =
-			ETH_LINK_SPEED_1G |
-			ETH_LINK_SPEED_10G |
-			ETH_LINK_SPEED_20G |
-			ETH_LINK_SPEED_25G |
-			ETH_LINK_SPEED_40G |
-			ETH_LINK_SPEED_50G |
-			ETH_LINK_SPEED_56G |
-			ETH_LINK_SPEED_100G;
+	info->speed_capa = priv->link_speed_capa;
 	priv_unlock(priv);
 }
 
@@ -646,7 +638,7 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 {
 	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
-		.cmd = ETHTOOL_GSET
+		.cmd = ETHTOOL_GSET /* Deprecated since Linux v4.5. */
 	};
 	struct ifreq ifr;
 	struct rte_eth_link dev_link;
@@ -671,6 +663,19 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 		dev_link.link_speed = 0;
 	else
 		dev_link.link_speed = link_speed;
+	priv->link_speed_capa = 0;
+	if (edata.supported & SUPPORTED_Autoneg)
+		priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+	if (edata.supported & (SUPPORTED_1000baseT_Full |
+			       SUPPORTED_1000baseKX_Full))
+		priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+	if (edata.supported & SUPPORTED_10000baseKR_Full)
+		priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+	if (edata.supported & (SUPPORTED_40000baseKR4_Full |
+			       SUPPORTED_40000baseCR4_Full |
+			       SUPPORTED_40000baseSR4_Full |
+			       SUPPORTED_40000baseLR4_Full))
+		priv->link_speed_capa |= ETH_LINK_SPEED_40G;
 	dev_link.link_duplex = ((edata.duplex == DUPLEX_HALF) ?
 				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
 	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-stable] [PATCH v2 12/12] net/mlx5: fix support for newer link speeds
  2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
                   ` (26 preceding siblings ...)
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 11/12] net/mlx5: fix link speed capability information Nelio Laranjeiro
@ 2016-11-09  9:57 ` Nelio Laranjeiro
  27 siblings, 0 replies; 31+ messages in thread
From: Nelio Laranjeiro @ 2016-11-09  9:57 UTC (permalink / raw)
  To: stable, Yuanhan Liu; +Cc: Nélio Laranjeiro, Adrien Mazarguil

From: Nélio Laranjeiro <nelio.laranjeiro@6wind.com>

[ upstream commit 188408719888c997e813fb9fbb11e9029f1f1191 ]

Not all speed capabilities can be reported properly before Linux 4.8 (the
25G, 50G and 100G speeds are missing). Moreover, the API to retrieve them
only exists since Linux 4.5, so this commit implements compatibility code
for all versions.
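
The compatibility scheme boils down to trying the new request first and
falling back to the legacy one when it fails; a standalone sketch of
that control flow, with stub queries standing in for the SIOCETHTOOL
ioctls issued by the patch:

#include <stdio.h>
#include <linux/ethtool.h>

static int
query_glinksettings(unsigned int *speed)
{
#ifdef ETHTOOL_GLINKSETTINGS
	*speed = 40000; /* placeholder for the real ioctl result */
	return 0;
#else
	(void)speed;
	return -1; /* request unknown to these kernel headers */
#endif
}

static int
query_gset(unsigned int *speed)
{
	*speed = 10000; /* placeholder for the real ioctl result */
	return 0;
}

int
main(void)
{
	unsigned int speed;

	/* Prefer the Linux 4.5+ interface; fall back to the legacy one. */
	if (query_glinksettings(&speed) < 0 && query_gset(&speed) < 0)
		return 1;
	printf("link speed: %u Mb/s\n", speed);
	return 0;
}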

Fixes: e274f5732225 ("ethdev: add speed capabilities")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
---
 drivers/net/mlx5/Makefile      |  15 +++++
 drivers/net/mlx5/mlx5_ethdev.c | 123 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 2c13c30..cf87f0b 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -121,6 +121,21 @@ mlx5_autoconf.h.new: $(RTE_SDK)/scripts/auto-config-h.sh
 		infiniband/mlx5_hw.h \
 		enum MLX5_OPCODE_TSO \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_25G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_25000baseCR_Full_BIT \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_50G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT \
+		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_ETHTOOL_LINK_MODE_100G \
+		/usr/include/linux/ethtool.h \
+		enum ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT \
+		$(AUTOCONF_OUTPUT)
 
 # Create mlx5_autoconf.h or update it in case it differs from the new one.
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 20c1c8a..ba1ec2a 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -626,15 +626,15 @@ mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 }
 
 /**
- * DPDK callback to retrieve physical link information (unlocked version).
+ * Retrieve physical link information (unlocked version using legacy ioctl).
  *
  * @param dev
  *   Pointer to Ethernet device structure.
  * @param wait_to_complete
  *   Wait for request completion (ignored).
  */
-int
-mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
+static int
+mlx5_link_update_unlocked_gset(struct rte_eth_dev *dev, int wait_to_complete)
 {
 	struct priv *priv = mlx5_get_priv(dev);
 	struct ethtool_cmd edata = {
@@ -690,6 +690,123 @@ mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
 }
 
 /**
+ * Retrieve physical link information (unlocked version using new ioctl from
+ * Linux 4.5).
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param wait_to_complete
+ *   Wait for request completion (ignored).
+ */
+static int
+mlx5_link_update_unlocked_gs(struct rte_eth_dev *dev, int wait_to_complete)
+{
+#ifdef ETHTOOL_GLINKSETTINGS
+	struct priv *priv = mlx5_get_priv(dev);
+	struct ethtool_link_settings edata = {
+		.cmd = ETHTOOL_GLINKSETTINGS,
+	};
+	struct ifreq ifr;
+	struct rte_eth_link dev_link;
+	uint64_t sc;
+
+	(void)wait_to_complete;
+	if (priv_ifreq(priv, SIOCGIFFLAGS, &ifr)) {
+		WARN("ioctl(SIOCGIFFLAGS) failed: %s", strerror(errno));
+		return -1;
+	}
+	memset(&dev_link, 0, sizeof(dev_link));
+	dev_link.link_status = ((ifr.ifr_flags & IFF_UP) &&
+				(ifr.ifr_flags & IFF_RUNNING));
+	ifr.ifr_data = (void *)&edata;
+	if (priv_ifreq(priv, SIOCETHTOOL, &ifr)) {
+		DEBUG("ioctl(SIOCETHTOOL, ETHTOOL_GLINKSETTINGS) failed: %s",
+		      strerror(errno));
+		return -1;
+	}
+	dev_link.link_speed = edata.speed;
+	sc = edata.link_mode_masks[0] |
+		((uint64_t)edata.link_mode_masks[1] << 32);
+	priv->link_speed_capa = 0;
+	/* Link speeds available in kernel v4.5. */
+	if (sc & ETHTOOL_LINK_MODE_Autoneg_BIT)
+		priv->link_speed_capa |= ETH_LINK_SPEED_AUTONEG;
+	if (sc & (ETHTOOL_LINK_MODE_1000baseT_Full_BIT |
+		  ETHTOOL_LINK_MODE_1000baseKX_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_1G;
+	if (sc & (ETHTOOL_LINK_MODE_10000baseKX4_Full_BIT |
+		  ETHTOOL_LINK_MODE_10000baseKR_Full_BIT |
+		  ETHTOOL_LINK_MODE_10000baseR_FEC_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_10G;
+	if (sc & (ETHTOOL_LINK_MODE_20000baseMLD2_Full_BIT |
+		  ETHTOOL_LINK_MODE_20000baseKR2_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_20G;
+	if (sc & (ETHTOOL_LINK_MODE_40000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_40000baseLR4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_40G;
+	if (sc & (ETHTOOL_LINK_MODE_56000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_56000baseLR4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_56G;
+	/* Link speeds available in kernel v4.6. */
+#ifdef HAVE_ETHTOOL_LINK_MODE_25G
+	if (sc & (ETHTOOL_LINK_MODE_25000baseCR_Full_BIT |
+		  ETHTOOL_LINK_MODE_25000baseKR_Full_BIT |
+		  ETHTOOL_LINK_MODE_25000baseSR_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_25G;
+#endif
+#ifdef HAVE_ETHTOOL_LINK_MODE_50G
+	if (sc & (ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT |
+		  ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_50G;
+#endif
+#ifdef HAVE_ETHTOOL_LINK_MODE_100G
+	if (sc & (ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT |
+		  ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT))
+		priv->link_speed_capa |= ETH_LINK_SPEED_100G;
+#endif
+	dev_link.link_duplex = ((edata.duplex == DUPLEX_HALF) ?
+				ETH_LINK_HALF_DUPLEX : ETH_LINK_FULL_DUPLEX);
+	dev_link.link_autoneg = !(dev->data->dev_conf.link_speeds &
+				  ETH_LINK_SPEED_FIXED);
+	if (memcmp(&dev_link, &dev->data->dev_link, sizeof(dev_link))) {
+		/* Link status changed. */
+		dev->data->dev_link = dev_link;
+		return 0;
+	}
+#else
+	(void)dev;
+	(void)wait_to_complete;
+#endif
+	/* Link status is still the same. */
+	return -1;
+}
+
+/**
+ * DPDK callback to retrieve physical link information (unlocked version).
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param wait_to_complete
+ *   Wait for request completion (ignored).
+ */
+int
+mlx5_link_update_unlocked(struct rte_eth_dev *dev, int wait_to_complete)
+{
+	int ret;
+
+	ret = mlx5_link_update_unlocked_gs(dev, wait_to_complete);
+	if (ret < 0)
+		ret = mlx5_link_update_unlocked_gset(dev, wait_to_complete);
+	return ret;
+}
+
+/**
  * DPDK callback to retrieve physical link information.
  *
  * @param dev
-- 
2.1.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-stable] [PATCH v2 00/12] Patches for 16.07.2 stable branch
  2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 00/12] " Nelio Laranjeiro
@ 2016-11-09 11:01   ` Yuanhan Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Yuanhan Liu @ 2016-11-09 11:01 UTC (permalink / raw)
  To: Nelio Laranjeiro; +Cc: stable, Adrien Mazarguil

On Wed, Nov 09, 2016 at 10:57:39AM +0100, Nelio Laranjeiro wrote:
> Patchset of fixes from 16.11 master branch rebased on top of 16.07.1 tag.
> 
> Changes in V2:
> 
>  - Rebase on top of stable/16.07.
>  - added backport/upstream in commit logs for each commit.

Nelio,

Thanks! And all applied to dpdk-stable 16.07 branch.

	--yliu

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-11-09 11:00 UTC | newest]

Thread overview: 31+ messages
2016-11-08 10:36 [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 01/14] net/mlx5: support Mellanox OFED 3.4 Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 02/14] net/mlx5: fix possible NULL dereference in Rx path Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 03/14] net/mlx5: fix inconsistent return value in flow director Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 04/14] net/mlx5: fix Rx VLAN offload capability report Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 05/14] net/mlx5: fix removing VLAN filter Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 06/14] net/mlx5: refactor allocation of flow director queues Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 07/14] net/mlx5: fix flow director drop mode Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 08/14] net/mlx5: re-factorize functions Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 09/14] net/mlx5: fix inline logic Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 10/14] net/mlx5: fix Rx function selection Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 11/14] net/mlx5: fix link status report Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 12/14] net/mlx5: fix initialization in secondary process Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 13/14] net/mlx5: fix link speed capability information Nelio Laranjeiro
2016-11-08 10:36 ` [dpdk-stable] [PATCH 14/14] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
2016-11-09  5:39 ` [dpdk-stable] [PATCH 00/14] Patches for 16.07.2 stable branch Yuanhan Liu
2016-11-09  9:38   ` Nélio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 00/12] " Nelio Laranjeiro
2016-11-09 11:01   ` Yuanhan Liu
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 01/12] net/mlx5: support Mellanox OFED 3.4 Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 02/12] net/mlx5: fix possible NULL dereference in Rx path Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 03/12] net/mlx5: fix inconsistent return value in flow director Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 04/12] net/mlx5: fix Rx VLAN offload capability report Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 05/12] net/mlx5: fix removing VLAN filter Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 06/12] net/mlx5: refactor allocation of flow director queues Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 07/12] net/mlx5: fix flow director drop mode Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 08/12] net/mlx5: re-factorize functions Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 09/12] net/mlx5: fix inline logic Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 10/12] net/mlx5: fix initialization in secondary process Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 11/12] net/mlx5: fix link speed capability information Nelio Laranjeiro
2016-11-09  9:57 ` [dpdk-stable] [PATCH v2 12/12] net/mlx5: fix support for newer link speeds Nelio Laranjeiro
