patches for DPDK stable branches
 help / color / mirror / Atom feed
From: Bing Zhao <bingz@nvidia.com>
To: <stable@dpdk.org>, <christian.ehrhardt@canonical.com>
Cc: <viacheslavo@nvidia.com>, <matan@nvidia.com>
Subject: [dpdk-stable] [PATCH 19.11 5/6] common/mlx5: use new port query API if available
Date: Mon, 16 Aug 2021 19:29:51 +0300	[thread overview]
Message-ID: <20210816162952.1931473-6-bingz@nvidia.com> (raw)
In-Reply-To: <20210816162952.1931473-1-bingz@nvidia.com>

From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

[ upstream commit d0cf77e8c2b64319057f5f629a7a595ce6e8b556 ]

In order to get E-Switch vport identifiers the mlx5 PMD relies
on two approaches:
  [a] use port query API if it is provided by rdma-core library
  [b] otherwise, deduce vport ids from the related VF index
The latter is not reliable and may not work with newer kernel
drivers and in some configurations (LAG), causing E-Switch
malfunction. Hence, engaging the port query API is highly
desirable.

Depending on rdma-core version the port query API is:
  - very old OFED versions have no query API (approach [b])
  - rdma-core OFED < 5.5 provides mlx5dv_query_devx_port,
    HAVE_MLX5DV_DR_DEVX_PORT flag is defined (approach [a])
  - rdma-core OFED >= 5.5 has mlx5dv_query_port, flag
    HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])
  - future OFED versions might remove mlx5dv_query_devx_port
    and HAVE_MLX5DV_DR_DEVX_PORT will not be defined
  - Upstream rdma-core < v35 has no port query API (approach [b])
  - Upstream rdma-core >= v35 has  mlx5dv_query_port, flag
    HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])

In order to support the new mlx5dv_query_port routine, the
conditional compilation flag HAVE_MLX5DV_DR_DEVX_PORT_V35
is introduced by this patch. The flag HAVE_MLX5DV_DR_DEVX_PORT
is kept for compatibility with previous rdma-core versions.

Despite this patch is not a bugfix (it follows the introduced API
variation in underlying library), it resolves the compatibility
issue and is highly desired to be ported to DPDK LTS.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/Makefile    |  5 +++
 drivers/net/mlx5/meson.build |  2 ++
 drivers/net/mlx5/mlx5.c      | 60 ++++++++++++++++--------------------
 drivers/net/mlx5/mlx5_glue.c | 55 +++++++++++++++++++++++++++------
 drivers/net/mlx5/mlx5_glue.h | 16 +++++++++-
 5 files changed, 94 insertions(+), 44 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 3719f0f11e..d62bfd6875 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -178,6 +178,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
 		infiniband/mlx5dv.h \
 		func mlx5dv_query_devx_port \
 		$(AUTOCONF_OUTPUT)
+	$Q sh -- '$<' '$@' \
+		HAVE_MLX5DV_DR_DEVX_PORT_V35 \
+		infiniband/mlx5dv.h \
+		func mlx5dv_query_port \
+		$(AUTOCONF_OUTPUT)
 	$Q sh -- '$<' '$@' \
 		HAVE_MLX5DV_DR_CREATE_DEST_IB_PORT \
 		infiniband/mlx5dv.h \
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index 3336c58ce5..81126d6a1b 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -138,6 +138,8 @@ if build
 		'IBV_WQ_FLAG_RX_END_PADDING' ],
 		[ 'HAVE_MLX5DV_DR_DEVX_PORT', 'infiniband/mlx5dv.h',
 		'mlx5dv_query_devx_port' ],
+		[ 'HAVE_MLX5DV_DR_DEVX_PORT_V35', 'infiniband/mlx5dv.h',
+		'mlx5dv_query_port' ],
 		[ 'HAVE_MLX5DV_DR_CREATE_DEST_IB_PORT', 'infiniband/mlx5dv.h',
 		'mlx5dv_dr_action_create_dest_ib_port' ],
 		[ 'HAVE_IBV_DEVX_OBJ', 'infiniband/mlx5dv.h',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 4696a1f2d1..18fccecf3d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2175,9 +2175,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	int own_domain_id = 0;
 	uint16_t port_id;
 	unsigned int i;
-#ifdef HAVE_MLX5DV_DR_DEVX_PORT
-	struct mlx5dv_devx_port devx_port = { .comp_mask = 0 };
-#endif
+	struct mlx5_port_info vport_info = { .query_flags = 0 };
 
 	/* Determine if this port representor is supposed to be spawned. */
 	if (switch_info->representor && dpdk_dev->devargs) {
@@ -2411,28 +2409,26 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->vport_meta_tag = 0;
 	priv->vport_meta_mask = 0;
 	priv->pf_bond = spawn->pf_bond;
-#ifdef HAVE_MLX5DV_DR_DEVX_PORT
 	/*
-	 * The DevX port query API is implemented. E-Switch may use
-	 * either vport or reg_c[0] metadata register to match on
-	 * vport index. The engaged part of metadata register is
-	 * defined by mask.
+	 * If we have E-Switch we should determine the vport attributes.
+	 * E-Switch may use either source vport field or reg_c[0] metadata
+	 * register to match on vport index. The engaged part of metadata
+	 * register is defined by mask.
 	 */
 	if (switch_info->representor || switch_info->master) {
-		devx_port.comp_mask = MLX5DV_DEVX_PORT_VPORT |
-				      MLX5DV_DEVX_PORT_MATCH_REG_C_0;
-		err = mlx5_glue->devx_port_query(sh->ctx, spawn->ibv_port,
-						 &devx_port);
+		err = mlx5_glue->devx_port_query(sh->ctx,
+						 spawn->ibv_port,
+						 &vport_info);
 		if (err) {
 			DRV_LOG(WARNING,
 				"can't query devx port %d on device %s",
 				spawn->ibv_port, spawn->ibv_dev->name);
-			devx_port.comp_mask = 0;
+			vport_info.query_flags = 0;
 		}
 	}
-	if (devx_port.comp_mask & MLX5DV_DEVX_PORT_MATCH_REG_C_0) {
-		priv->vport_meta_tag = devx_port.reg_c_0.value;
-		priv->vport_meta_mask = devx_port.reg_c_0.mask;
+	if (vport_info.query_flags & MLX5_PORT_QUERY_REG_C0) {
+		priv->vport_meta_tag = vport_info.vport_meta_tag;
+		priv->vport_meta_mask = vport_info.vport_meta_mask;
 		if (!priv->vport_meta_mask) {
 			DRV_LOG(ERR, "vport zero mask for port %d"
 				     " on bonding device %s",
@@ -2448,8 +2444,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			goto error;
 		}
 	}
-	if (devx_port.comp_mask & MLX5DV_DEVX_PORT_VPORT) {
-		priv->vport_id = devx_port.vport_num;
+	if (vport_info.query_flags & MLX5_PORT_QUERY_VPORT) {
+		priv->vport_id = vport_info.vport_id;
 	} else if (spawn->pf_bond >= 0 &&
 			(switch_info->representor || switch_info->master)) {
 		DRV_LOG(ERR, "can't deduce vport index for port %d"
@@ -2458,25 +2454,21 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		err = ENOTSUP;
 		goto error;
 	} else {
-		/* Suppose vport index in compatible way. */
+		/*
+		 * Suppose vport index in compatible way. Kernel/rdma_core
+		 * support single E-Switch per PF configurations only and
+		 * vport_id field contains the vport index for associated VF,
+		 * which is deduced from representor port name.
+		 * For example, let's have the IB device port 10, it has
+		 * attached network device eth0, which has port name attribute
+		 * pf0vf2, we can deduce the VF number as 2, and set vport index
+		 * as 3 (2+1). This assigning schema should be changed if the
+		 * multiple E-Switch instances per PF configurations or/and PCI
+		 * subfunctions are added.
+		 */
 		priv->vport_id = switch_info->representor ?
 				 switch_info->port_name + 1 : -1;
 	}
-#else
-	/*
-	 * Kernel/rdma_core support single E-Switch per PF configurations
-	 * only and vport_id field contains the vport index for
-	 * associated VF, which is deduced from representor port name.
-	 * For example, let's have the IB device port 10, it has
-	 * attached network device eth0, which has port name attribute
-	 * pf0vf2, we can deduce the VF number as 2, and set vport index
-	 * as 3 (2+1). This assigning schema should be changed if the
-	 * multiple E-Switch instances per PF configurations or/and PCI
-	 * subfunctions are added.
-	 */
-	priv->vport_id = switch_info->representor ?
-			 switch_info->port_name + 1 : -1;
-#endif
 	/* representor_id field keeps the unmodified VF index. */
 	priv->representor_id = switch_info->representor ?
 			       switch_info->port_name : -1;
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 1553a9b41c..33f604364a 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -1025,17 +1025,54 @@ mlx5_glue_devx_qp_query(struct ibv_qp *qp,
 static int
 mlx5_glue_devx_port_query(struct ibv_context *ctx,
 			  uint32_t port_num,
-			  struct mlx5dv_devx_port *mlx5_devx_port)
-{
+			  struct mlx5_port_info *info)
+{
+	int err = 0;
+
+	info->query_flags = 0;
+#ifdef HAVE_MLX5DV_DR_DEVX_PORT_V35
+	/* The DevX port query API is implemented (rdma-core v35 and above). */
+	struct mlx5_ib_uapi_query_port devx_port;
+
+	memset(&devx_port, 0, sizeof(devx_port));
+	err = mlx5dv_query_port(ctx, port_num, &devx_port);
+	if (err)
+		return err;
+	if (devx_port.flags & MLX5DV_QUERY_PORT_VPORT_REG_C0) {
+		info->vport_meta_tag = devx_port.reg_c0.value;
+		info->vport_meta_mask = devx_port.reg_c0.mask;
+		info->query_flags |= MLX5_PORT_QUERY_REG_C0;
+	}
+	if (devx_port.flags & MLX5DV_QUERY_PORT_VPORT) {
+		info->vport_id = devx_port.vport;
+		info->query_flags |= MLX5_PORT_QUERY_VPORT;
+	}
+#else
 #ifdef HAVE_MLX5DV_DR_DEVX_PORT
-	return mlx5dv_query_devx_port(ctx, port_num, mlx5_devx_port);
+	/* The legacy DevX port query API is implemented (prior v35). */
+	struct mlx5dv_devx_port devx_port = {
+		.comp_mask = MLX5DV_DEVX_PORT_VPORT |
+			     MLX5DV_DEVX_PORT_MATCH_REG_C_0
+	};
+
+	err = mlx5dv_query_devx_port(ctx, port_num, &devx_port);
+	if (err)
+		return err;
+	if (devx_port.comp_mask & MLX5DV_DEVX_PORT_MATCH_REG_C_0) {
+		info->vport_meta_tag = devx_port.reg_c_0.value;
+		info->vport_meta_mask = devx_port.reg_c_0.mask;
+		info->query_flags |= MLX5_PORT_QUERY_REG_C0;
+	}
+	if (devx_port.comp_mask & MLX5DV_DEVX_PORT_VPORT) {
+		info->vport_id = devx_port.vport_num;
+		info->query_flags |= MLX5_PORT_QUERY_VPORT;
+	}
 #else
-	(void)ctx;
-	(void)port_num;
-	(void)mlx5_devx_port;
-	errno = ENOTSUP;
-	return errno;
-#endif
+	RTE_SET_USED(ctx);
+	RTE_SET_USED(port_num);
+#endif /* HAVE_MLX5DV_DR_DEVX_PORT */
+#endif /* HAVE_MLX5DV_DR_DEVX_PORT_V35 */
+	return err;
 }
 
 alignas(RTE_CACHE_LINE_SIZE)
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 9895e55974..a639ab28b0 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -81,6 +81,20 @@ struct mlx5dv_dr_domain;
 struct mlx5dv_devx_port;
 #endif
 
+#ifndef HAVE_MLX5DV_DR_DEVX_PORT_V35
+struct mlx5dv_port;
+#endif
+
+#define MLX5_PORT_QUERY_VPORT (1u << 0)
+#define MLX5_PORT_QUERY_REG_C0 (1u << 1)
+
+struct mlx5_port_info {
+	uint16_t query_flags;
+	uint16_t vport_id; /* Associated VF vport index (if any). */
+	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
+	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
+};
+
 #ifndef HAVE_MLX5_DR_CREATE_ACTION_FLOW_METER
 struct mlx5dv_dr_flow_meter_attr;
 #endif
@@ -255,7 +269,7 @@ struct mlx5_glue {
 			     void *out, size_t outlen);
 	int (*devx_port_query)(struct ibv_context *ctx,
 			       uint32_t port_num,
-			       struct mlx5dv_devx_port *mlx5_devx_port);
+			       struct mlx5_port_info *info);
 };
 
 extern const struct mlx5_glue *mlx5_glue;
-- 
2.21.0


  parent reply	other threads:[~2021-08-16 16:30 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16 16:29 [dpdk-stable] [PATCH 19.11 0/6] Cumulative fixes for stable 19.11 Bing Zhao
2021-08-16 16:29 ` [dpdk-stable] [PATCH 19.11 1/6] app/testpmd: fix offloads for newly attached port Bing Zhao
2021-08-16 16:29 ` [dpdk-stable] [PATCH 19.11 2/6] common/mlx5: fix compatibility with OFED port query API Bing Zhao
2021-08-16 16:29 ` [dpdk-stable] [PATCH 19.11 3/6] net/mlx5: fix switchdev mode recognition Bing Zhao
2021-08-16 16:29 ` [dpdk-stable] [PATCH 19.11 4/6] net/mlx5: fix RoCE LAG bond device probing Bing Zhao
2021-08-16 16:29 ` Bing Zhao [this message]
2021-08-16 16:29 ` [dpdk-stable] [PATCH 19.11 6/6] net/mlx5: fix multi-segment inline for the first segments Bing Zhao
2021-08-17 11:55   ` Christian Ehrhardt
2021-08-17 13:55     ` Bing Zhao
2021-08-17  9:42 ` [dpdk-stable] [PATCH 19.11 0/6] Cumulative fixes for stable 19.11 Christian Ehrhardt
2021-08-17 13:54 ` [dpdk-stable] [PATCH 19.11 v2] net/mlx5: fix multi-segment inline for the first segments Bing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210816162952.1931473-6-bingz@nvidia.com \
    --to=bingz@nvidia.com \
    --cc=christian.ehrhardt@canonical.com \
    --cc=matan@nvidia.com \
    --cc=stable@dpdk.org \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).