DPDK patches and discussions
* [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation
@ 2021-11-03 10:17 michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 1/3] common/mlx5: fix MR search non inline function michaelba
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: michaelba @ 2021-11-03 10:17 UTC (permalink / raw)
  To: dev; +Cc: Matan Azrad, Thomas Monjalon, Michael Baum

From: Michael Baum <michaelba@oss.nvidia.com>

Sharing the MR management in the common part caused a drop in Tx
performance. This series restores the performance in several ways.

Michael Baum (3):
  common/mlx5: fix MR search non inline function
  common/mlx5: fix redundant parameter in search MR function
  common/mlx5: make MR managmant port-agnostic for MP

 drivers/common/mlx5/mlx5_common.c        |  10 +-
 drivers/common/mlx5/mlx5_common.h        |   7 -
 drivers/common/mlx5/mlx5_common_mp.c     |  38 ++--
 drivers/common/mlx5/mlx5_common_mp.h     |  44 +++--
 drivers/common/mlx5/mlx5_common_mr.c     | 210 ++++++-----------------
 drivers/common/mlx5/mlx5_common_mr.h     |  70 ++++++--
 drivers/common/mlx5/version.map          |   4 +-
 drivers/compress/mlx5/mlx5_compress.c    |   4 +-
 drivers/crypto/mlx5/mlx5_crypto.c        |  24 ++-
 drivers/net/mlx5/linux/mlx5_mp_os.c      |  39 ++---
 drivers/net/mlx5/mlx5_rx.h               |  10 +-
 drivers/net/mlx5/mlx5_rxq.c              |   6 +-
 drivers/net/mlx5/mlx5_trigger.c          |   4 +-
 drivers/net/mlx5/mlx5_tx.h               |  27 +--
 drivers/net/mlx5/mlx5_txq.c              |   3 +-
 drivers/regex/mlx5/mlx5_regex_control.c  |   3 +-
 drivers/regex/mlx5/mlx5_regex_fastpath.c |  29 +---
 17 files changed, 204 insertions(+), 328 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH 1/3] common/mlx5: fix MR search non inline function
  2021-11-03 10:17 [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation michaelba
@ 2021-11-03 10:17 ` michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 2/3] common/mlx5: fix redundant parameter in search MR function michaelba
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: michaelba @ 2021-11-03 10:17 UTC (permalink / raw)
  To: dev
  Cc: Matan Azrad, Thomas Monjalon, Michael Baum, Viacheslav Ovsiienko,
	Dmitry Kozlyuk

From: Michael Baum <michaelba@oss.nvidia.com>

Memory region (MR) management has recently been shared between drivers,
including the cache search in the data plane.
The initial search in the queue's local linear cache usually yields a
result, so there is normally no need to search the other caches.

The function that performs the local-cache search receives a pointer to
the device as a parameter. The pointer is not needed for the search
itself, only for the subsequent lookups (which, as mentioned, usually do
not happen). Passing the device pointer to the function and maintaining
it takes some time and slightly degrades performance.

Add the pointer to the device as a field of the mr_ctrl structure. The
field is updated in the control path and used only when needed during
the search.
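
For illustration, a minimal sketch of how a driver datapath would use
the helper once it is inlined (hypothetical caller name and includes;
the mlx5_mr_mb2mr() signature and the mlx5_wqe_dseg fields follow the
code in this series):

#include <rte_byteorder.h>
#include <rte_mbuf.h>
#include <mlx5_prm.h>
#include <mlx5_common_mr.h>

static __rte_always_inline uint32_t
example_dseg_set(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
		 struct mlx5_mr_ctrl *mr_ctrl, struct mlx5_wqe_dseg *dseg,
		 struct rte_mbuf *mb)
{
	/* Inline probe of the queue's linear cache; cdev and mp_id are
	 * only dereferenced on the rare miss path (mlx5_mr_mb2mr_bh()).
	 */
	dseg->lkey = mlx5_mr_mb2mr(cdev, mp_id, mr_ctrl, mb);
	dseg->bcount = rte_cpu_to_be_32(rte_pktmbuf_data_len(mb));
	dseg->pbuf = rte_cpu_to_be_64(rte_pktmbuf_mtod(mb, uintptr_t));
	return dseg->lkey;
}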

Fixes: fc59a1ec556b ("common/mlx5: share MR mempool registration")

Signed-off-by: Michael Baum <michaelba@oss.nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common.h    |  7 ----
 drivers/common/mlx5/mlx5_common_mr.c | 51 +----------------------
 drivers/common/mlx5/mlx5_common_mr.h | 61 +++++++++++++++++++++++++++-
 drivers/common/mlx5/version.map      |  2 +-
 4 files changed, 61 insertions(+), 60 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h
index 744c6a72b3..e16f3aa713 100644
--- a/drivers/common/mlx5/mlx5_common.h
+++ b/drivers/common/mlx5/mlx5_common.h
@@ -417,13 +417,6 @@ void
 mlx5_dev_mempool_unregister(struct mlx5_common_device *cdev,
 			    struct rte_mempool *mp);
 
-/* mlx5_common_mr.c */
-
-__rte_internal
-uint32_t
-mlx5_mr_mb2mr(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
-	      struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf);
-
 /* mlx5_common_os.c */
 
 int mlx5_os_open_device(struct mlx5_common_device *cdev, uint32_t classes);
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 53a3e8565d..903ed0652c 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1859,23 +1859,7 @@ mlx5_mr_mempool2mr_bh(struct mlx5_mr_share_cache *share_cache,
 	return lkey;
 }
 
-/**
- * Bottom-half of LKey search on. If supported, lookup for the address from
- * the mempool. Otherwise, search in old mechanism caches.
- *
- * @param cdev
- *   Pointer to mlx5 device.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
- * @param mr_ctrl
- *   Pointer to per-queue MR control structure.
- * @param mb
- *   Pointer to mbuf.
- *
- * @return
- *   Searched LKey on success, UINT32_MAX on no match.
- */
-static uint32_t
+uint32_t
 mlx5_mr_mb2mr_bh(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
 		 struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb)
 {
@@ -1908,36 +1892,3 @@ mlx5_mr_mb2mr_bh(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
 	return mlx5_mr_addr2mr_bh(cdev->pd, mp_id, &cdev->mr_scache, mr_ctrl,
 				  addr, cdev->config.mr_ext_memseg_en);
 }
-
-/**
- * Query LKey from a packet buffer.
- *
- * @param cdev
- *   Pointer to the mlx5 device structure.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
- * @param mr_ctrl
- *   Pointer to per-queue MR control structure.
- * @param mbuf
- *   Pointer to mbuf.
- *
- * @return
- *   Searched LKey on success, UINT32_MAX on no match.
- */
-uint32_t
-mlx5_mr_mb2mr(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
-	      struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf)
-{
-	uint32_t lkey;
-
-	/* Check generation bit to see if there's any change on existing MRs. */
-	if (unlikely(*mr_ctrl->dev_gen_ptr != mr_ctrl->cur_gen))
-		mlx5_mr_flush_local_cache(mr_ctrl);
-	/* Linear search on MR cache array. */
-	lkey = mlx5_mr_lookup_lkey(mr_ctrl->cache, &mr_ctrl->mru,
-				   MLX5_MR_CACHE_N, (uintptr_t)mbuf->buf_addr);
-	if (likely(lkey != UINT32_MAX))
-		return lkey;
-	/* Take slower bottom-half on miss. */
-	return mlx5_mr_mb2mr_bh(cdev, mp_id, mr_ctrl, mbuf);
-}
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index e74f81641c..8771c7d02b 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -62,6 +62,8 @@ struct mlx5_mr_btree {
 	struct mr_cache_entry (*table)[];
 } __rte_packed;
 
+struct mlx5_common_device;
+
 /* Per-queue MR control descriptor. */
 struct mlx5_mr_ctrl {
 	uint32_t *dev_gen_ptr; /* Generation number of device to poll. */
@@ -160,6 +162,63 @@ mlx5_mr_lookup_lkey(struct mr_cache_entry *lkp_tbl, uint16_t *cached_idx,
 	return UINT32_MAX;
 }
 
+__rte_internal
+void mlx5_mr_flush_local_cache(struct mlx5_mr_ctrl *mr_ctrl);
+
+/**
+ * Bottom-half of LKey search on. If supported, lookup for the address from
+ * the mempool. Otherwise, search in old mechanism caches.
+ *
+ * @param cdev
+ *   Pointer to mlx5 device.
+ * @param mp_id
+ *   Multi-process identifier, may be NULL for the primary process.
+ * @param mr_ctrl
+ *   Pointer to per-queue MR control structure.
+ * @param mb
+ *   Pointer to mbuf.
+ *
+ * @return
+ *   Searched LKey on success, UINT32_MAX on no match.
+ */
+__rte_internal
+uint32_t mlx5_mr_mb2mr_bh(struct mlx5_common_device *cdev,
+			  struct mlx5_mp_id *mp_id,
+			  struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb);
+
+/**
+ * Query LKey from a packet buffer.
+ *
+ * @param cdev
+ *   Pointer to the mlx5 device structure.
+ * @param mp_id
+ *   Multi-process identifier, may be NULL for the primary process.
+ * @param mr_ctrl
+ *   Pointer to per-queue MR control structure.
+ * @param mbuf
+ *   Pointer to mbuf.
+ *
+ * @return
+ *   Searched LKey on success, UINT32_MAX on no match.
+ */
+static __rte_always_inline uint32_t
+mlx5_mr_mb2mr(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
+	      struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf)
+{
+	uint32_t lkey;
+
+	/* Check generation bit to see if there's any change on existing MRs. */
+	if (unlikely(*mr_ctrl->dev_gen_ptr != mr_ctrl->cur_gen))
+		mlx5_mr_flush_local_cache(mr_ctrl);
+	/* Linear search on MR cache array. */
+	lkey = mlx5_mr_lookup_lkey(mr_ctrl->cache, &mr_ctrl->mru,
+				   MLX5_MR_CACHE_N, (uintptr_t)mbuf->buf_addr);
+	if (likely(lkey != UINT32_MAX))
+		return lkey;
+	/* Take slower bottom-half on miss. */
+	return mlx5_mr_mb2mr_bh(cdev, mp_id, mr_ctrl, mbuf);
+}
+
 /* mlx5_common_mr.c */
 
 __rte_internal
@@ -176,8 +235,6 @@ void mlx5_mr_release_cache(struct mlx5_mr_share_cache *mr_cache);
 int mlx5_mr_create_cache(struct mlx5_mr_share_cache *share_cache, int socket);
 void mlx5_mr_dump_cache(struct mlx5_mr_share_cache *share_cache __rte_unused);
 void mlx5_mr_rebuild_cache(struct mlx5_mr_share_cache *share_cache);
-__rte_internal
-void mlx5_mr_flush_local_cache(struct mlx5_mr_ctrl *mr_ctrl);
 void mlx5_free_mr_by_addr(struct mlx5_mr_share_cache *share_cache,
 			  const char *ibdev_name, const void *addr, size_t len);
 int mlx5_mr_insert_cache(struct mlx5_mr_share_cache *share_cache,
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index 0ea8325f9a..f059dba7d6 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -112,7 +112,7 @@ INTERNAL {
 	mlx5_mr_create_primary;
 	mlx5_mr_ctrl_init;
 	mlx5_mr_flush_local_cache;
-	mlx5_mr_mb2mr;
+	mlx5_mr_mb2mr_bh;
 
 	mlx5_nl_allmulti; # WINDOWS_NO_EXPORT
 	mlx5_nl_ifindex; # WINDOWS_NO_EXPORT
-- 
2.25.1



* [dpdk-dev] [PATCH 2/3] common/mlx5: fix redundant parameter in search MR function
  2021-11-03 10:17 [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 1/3] common/mlx5: fix MR search non inline function michaelba
@ 2021-11-03 10:17 ` michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 3/3] common/mlx5: make MR managmant port-agnostic for MP michaelba
  2021-11-07 13:17 ` [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation Thomas Monjalon
  3 siblings, 0 replies; 5+ messages in thread
From: michaelba @ 2021-11-03 10:17 UTC (permalink / raw)
  To: dev
  Cc: Matan Azrad, Thomas Monjalon, Michael Baum, Viacheslav Ovsiienko,
	Dmitry Kozlyuk

From: Michael Baum <michaelba@oss.nvidia.com>

Memory region (MR) management has recently been shared between drivers,
including the cache search in the data plane.
The initial search in the queue's local linear cache usually yields a
result, so there is normally no need to search the next-level caches.

The function that performs the local-cache search gets a pointer to the
device as a parameter, which is not needed for the search itself, only
for the subsequent lookups (which, as mentioned, usually do not happen).
Passing the device pointer to the function and maintaining it takes
some time and has some impact on performance.

Add the pointer to the device as a field of the mr_ctrl structure. The
field is updated in the control path and used only when needed during
the search.
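
For illustration, a minimal sketch of the resulting split between
control path and data path (hypothetical function names and includes;
mlx5_mr_ctrl_init() and mlx5_mr_mb2mr() follow the signatures changed
in this patch):

#include <rte_mbuf.h>
#include <mlx5_common_mr.h>

static int
example_queue_setup(struct mlx5_mr_ctrl *mr_ctrl,
		    struct mlx5_common_device *cdev, int socket)
{
	/* Control path: the device pointer is stored once in mr_ctrl. */
	return mlx5_mr_ctrl_init(mr_ctrl, cdev, socket);
}

static __rte_always_inline uint32_t
example_tx_lkey(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb)
{
	/* Data path: no device argument to pass and maintain per packet.
	 * A zero (NULL) mp_id is enough outside the net PMD, as in the
	 * compress/crypto/regex callers updated below.
	 */
	return mlx5_mr_mb2mr(mr_ctrl, mb, 0);
}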

Fixes: fc59a1ec556b ("common/mlx5: share MR mempool registration")

Signed-off-by: Michael Baum <michaelba@oss.nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_mr.c     | 14 +++++++-----
 drivers/common/mlx5/mlx5_common_mr.h     | 28 ++++++++++-------------
 drivers/compress/mlx5/mlx5_compress.c    |  4 ++--
 drivers/crypto/mlx5/mlx5_crypto.c        | 24 +++++++++-----------
 drivers/net/mlx5/mlx5_rx.h               | 10 ++------
 drivers/net/mlx5/mlx5_rxq.c              |  3 +--
 drivers/net/mlx5/mlx5_tx.h               |  3 +--
 drivers/net/mlx5/mlx5_txq.c              |  3 +--
 drivers/regex/mlx5/mlx5_regex_control.c  |  3 +--
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 29 ++++--------------------
 10 files changed, 43 insertions(+), 78 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 903ed0652c..003d358f96 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -292,8 +292,8 @@ mlx5_mr_btree_dump(struct mlx5_mr_btree *bt __rte_unused)
  *
  * @param mr_ctrl
  *   Pointer to MR control structure.
- * @param dev_gen_ptr
- *   Pointer to generation number of global cache.
+ * @param cdev
+ *   Pointer to the mlx5 device structure.
  * @param socket
  *   NUMA socket on which memory must be allocated.
  *
@@ -301,15 +301,16 @@ mlx5_mr_btree_dump(struct mlx5_mr_btree *bt __rte_unused)
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
-mlx5_mr_ctrl_init(struct mlx5_mr_ctrl *mr_ctrl, uint32_t *dev_gen_ptr,
+mlx5_mr_ctrl_init(struct mlx5_mr_ctrl *mr_ctrl, struct mlx5_common_device *cdev,
 		  int socket)
 {
 	if (mr_ctrl == NULL) {
 		rte_errno = EINVAL;
 		return -rte_errno;
 	}
+	mr_ctrl->cdev = cdev;
 	/* Save pointer of global generation number to check memory event. */
-	mr_ctrl->dev_gen_ptr = dev_gen_ptr;
+	mr_ctrl->dev_gen_ptr = &cdev->mr_scache.dev_gen;
 	/* Initialize B-tree and allocate memory for bottom-half cache table. */
 	return mlx5_mr_btree_init(&mr_ctrl->cache_bh, MLX5_MR_BTREE_CACHE_N,
 				  socket);
@@ -1860,11 +1861,12 @@ mlx5_mr_mempool2mr_bh(struct mlx5_mr_share_cache *share_cache,
 }
 
 uint32_t
-mlx5_mr_mb2mr_bh(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
-		 struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb)
+mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb,
+		 struct mlx5_mp_id *mp_id)
 {
 	uint32_t lkey;
 	uintptr_t addr = (uintptr_t)mb->buf_addr;
+	struct mlx5_common_device *cdev = mr_ctrl->cdev;
 
 	if (cdev->config.mr_mempool_reg_en) {
 		struct rte_mempool *mp = NULL;
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index 8771c7d02b..f65974b8a9 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -66,6 +66,7 @@ struct mlx5_common_device;
 
 /* Per-queue MR control descriptor. */
 struct mlx5_mr_ctrl {
+	struct mlx5_common_device *cdev; /* Pointer to the mlx5 common device.*/
 	uint32_t *dev_gen_ptr; /* Generation number of device to poll. */
 	uint32_t cur_gen; /* Generation number saved to flush caches. */
 	uint16_t mru; /* Index of last hit entry in top-half cache. */
@@ -169,41 +170,36 @@ void mlx5_mr_flush_local_cache(struct mlx5_mr_ctrl *mr_ctrl);
  * Bottom-half of LKey search on. If supported, lookup for the address from
  * the mempool. Otherwise, search in old mechanism caches.
  *
- * @param cdev
- *   Pointer to mlx5 device.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  * @param mr_ctrl
  *   Pointer to per-queue MR control structure.
  * @param mb
  *   Pointer to mbuf.
+ * @param mp_id
+ *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 __rte_internal
-uint32_t mlx5_mr_mb2mr_bh(struct mlx5_common_device *cdev,
-			  struct mlx5_mp_id *mp_id,
-			  struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb);
+uint32_t mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
+			  struct mlx5_mp_id *mp_id);
 
 /**
  * Query LKey from a packet buffer.
  *
- * @param cdev
- *   Pointer to the mlx5 device structure.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  * @param mr_ctrl
  *   Pointer to per-queue MR control structure.
  * @param mbuf
  *   Pointer to mbuf.
+ * @param mp_id
+ *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 static __rte_always_inline uint32_t
-mlx5_mr_mb2mr(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
-	      struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf)
+mlx5_mr_mb2mr(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
+	      struct mlx5_mp_id *mp_id)
 {
 	uint32_t lkey;
 
@@ -216,14 +212,14 @@ mlx5_mr_mb2mr(struct mlx5_common_device *cdev, struct mlx5_mp_id *mp_id,
 	if (likely(lkey != UINT32_MAX))
 		return lkey;
 	/* Take slower bottom-half on miss. */
-	return mlx5_mr_mb2mr_bh(cdev, mp_id, mr_ctrl, mbuf);
+	return mlx5_mr_mb2mr_bh(mr_ctrl, mbuf, mp_id);
 }
 
 /* mlx5_common_mr.c */
 
 __rte_internal
-int mlx5_mr_ctrl_init(struct mlx5_mr_ctrl *mr_ctrl, uint32_t *dev_gen_ptr,
-		      int socket);
+int mlx5_mr_ctrl_init(struct mlx5_mr_ctrl *mr_ctrl,
+		      struct mlx5_common_device *cdev, int socket);
 __rte_internal
 void mlx5_mr_btree_free(struct mlx5_mr_btree *bt);
 void mlx5_mr_btree_dump(struct mlx5_mr_btree *bt __rte_unused);
diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c
index c4081c5f7d..5cf6d647af 100644
--- a/drivers/compress/mlx5/mlx5_compress.c
+++ b/drivers/compress/mlx5/mlx5_compress.c
@@ -205,7 +205,7 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id,
 		return -rte_errno;
 	}
 	dev->data->queue_pairs[qp_id] = qp;
-	if (mlx5_mr_ctrl_init(&qp->mr_ctrl, &priv->cdev->mr_scache.dev_gen,
+	if (mlx5_mr_ctrl_init(&qp->mr_ctrl, priv->cdev,
 			      priv->dev_config.socket_id)) {
 		DRV_LOG(ERR, "Cannot allocate MR Btree for qp %u.",
 			(uint32_t)qp_id);
@@ -471,7 +471,7 @@ mlx5_compress_dseg_set(struct mlx5_compress_qp *qp,
 	uintptr_t addr = rte_pktmbuf_mtod_offset(mbuf, uintptr_t, offset);
 
 	dseg->bcount = rte_cpu_to_be_32(len);
-	dseg->lkey = mlx5_mr_mb2mr(qp->priv->cdev, 0, &qp->mr_ctrl, mbuf);
+	dseg->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
 	dseg->pbuf = rte_cpu_to_be_64(addr);
 	return dseg->lkey;
 }
diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index f430d8cde0..1740dba003 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -305,9 +305,9 @@ mlx5_crypto_get_block_size(struct rte_crypto_op *op)
 }
 
 static __rte_always_inline uint32_t
-mlx5_crypto_klm_set(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp,
-		      struct rte_mbuf *mbuf, struct mlx5_wqe_dseg *klm,
-		      uint32_t offset, uint32_t *remain)
+mlx5_crypto_klm_set(struct mlx5_crypto_qp *qp, struct rte_mbuf *mbuf,
+		    struct mlx5_wqe_dseg *klm, uint32_t offset,
+		    uint32_t *remain)
 {
 	uint32_t data_len = (rte_pktmbuf_data_len(mbuf) - offset);
 	uintptr_t addr = rte_pktmbuf_mtod_offset(mbuf, uintptr_t, offset);
@@ -317,22 +317,21 @@ mlx5_crypto_klm_set(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp,
 	*remain -= data_len;
 	klm->bcount = rte_cpu_to_be_32(data_len);
 	klm->pbuf = rte_cpu_to_be_64(addr);
-	klm->lkey = mlx5_mr_mb2mr(priv->cdev, 0, &qp->mr_ctrl, mbuf);
+	klm->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
 	return klm->lkey;
 
 }
 
 static __rte_always_inline uint32_t
-mlx5_crypto_klms_set(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp,
-		     struct rte_crypto_op *op, struct rte_mbuf *mbuf,
-		     struct mlx5_wqe_dseg *klm)
+mlx5_crypto_klms_set(struct mlx5_crypto_qp *qp, struct rte_crypto_op *op,
+		     struct rte_mbuf *mbuf, struct mlx5_wqe_dseg *klm)
 {
 	uint32_t remain_len = op->sym->cipher.data.length;
 	uint32_t nb_segs = mbuf->nb_segs;
 	uint32_t klm_n = 1u;
 
 	/* First mbuf needs to take the cipher offset. */
-	if (unlikely(mlx5_crypto_klm_set(priv, qp, mbuf, klm,
+	if (unlikely(mlx5_crypto_klm_set(qp, mbuf, klm,
 		     op->sym->cipher.data.offset, &remain_len) == UINT32_MAX)) {
 		op->status = RTE_CRYPTO_OP_STATUS_ERROR;
 		return 0;
@@ -344,7 +343,7 @@ mlx5_crypto_klms_set(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp,
 			op->status = RTE_CRYPTO_OP_STATUS_INVALID_ARGS;
 			return 0;
 		}
-		if (unlikely(mlx5_crypto_klm_set(priv, qp, mbuf, ++klm, 0,
+		if (unlikely(mlx5_crypto_klm_set(qp, mbuf, ++klm, 0,
 						 &remain_len) == UINT32_MAX)) {
 			op->status = RTE_CRYPTO_OP_STATUS_ERROR;
 			return 0;
@@ -370,7 +369,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 	uint32_t ds;
 	bool ipl = op->sym->m_dst == NULL || op->sym->m_dst == op->sym->m_src;
 	/* Set UMR WQE. */
-	uint32_t klm_n = mlx5_crypto_klms_set(priv, qp, op,
+	uint32_t klm_n = mlx5_crypto_klms_set(qp, op,
 				   ipl ? op->sym->m_src : op->sym->m_dst, klms);
 
 	if (unlikely(klm_n == 0))
@@ -396,8 +395,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 	cseg = RTE_PTR_ADD(cseg, priv->umr_wqe_size);
 	klms = RTE_PTR_ADD(cseg, sizeof(struct mlx5_rdma_write_wqe));
 	if (!ipl) {
-		klm_n = mlx5_crypto_klms_set(priv, qp, op, op->sym->m_src,
-					     klms);
+		klm_n = mlx5_crypto_klms_set(qp, op, op->sym->m_src, klms);
 		if (unlikely(klm_n == 0))
 			return 0;
 	} else {
@@ -643,7 +641,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id,
 		DRV_LOG(ERR, "Failed to create QP.");
 		goto error;
 	}
-	if (mlx5_mr_ctrl_init(&qp->mr_ctrl, &priv->cdev->mr_scache.dev_gen,
+	if (mlx5_mr_ctrl_init(&qp->mr_ctrl, priv->cdev,
 			      priv->dev_config.socket_id) != 0) {
 		DRV_LOG(ERR, "Cannot allocate MR Btree for qp %u.",
 			(uint32_t)qp_id);
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 4952fe1455..322f234628 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -282,7 +282,6 @@ static __rte_always_inline uint32_t
 mlx5_rx_addr2mr(struct mlx5_rxq_data *rxq, uintptr_t addr)
 {
 	struct mlx5_mr_ctrl *mr_ctrl = &rxq->mr_ctrl;
-	struct mlx5_rxq_ctrl *rxq_ctrl;
 	struct rte_mempool *mp;
 	uint32_t lkey;
 
@@ -291,14 +290,9 @@ mlx5_rx_addr2mr(struct mlx5_rxq_data *rxq, uintptr_t addr)
 				   MLX5_MR_CACHE_N, addr);
 	if (likely(lkey != UINT32_MAX))
 		return lkey;
-	/*
-	 * Slower search in the mempool database on miss.
-	 * During queue creation rxq->sh is not yet set, so we use rxq_ctrl.
-	 */
-	rxq_ctrl = container_of(rxq, struct mlx5_rxq_ctrl, rxq);
 	mp = mlx5_rxq_mprq_enabled(rxq) ? rxq->mprq_mp : rxq->mp;
-	return mlx5_mr_mempool2mr_bh(&rxq_ctrl->priv->sh->cdev->mr_scache,
-				     mr_ctrl, mp, addr);
+	return mlx5_mr_mempool2mr_bh(&mr_ctrl->cdev->mr_scache, mr_ctrl,
+				     mp, addr);
 }
 
 #define mlx5_rx_mb2mr(rxq, mb) mlx5_rx_addr2mr(rxq, (uintptr_t)((mb)->buf_addr))
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 4f02fe02b9..1fc2f0e0c1 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1455,8 +1455,7 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 		goto error;
 	}
 	tmpl->type = MLX5_RXQ_TYPE_STANDARD;
-	if (mlx5_mr_ctrl_init(&tmpl->rxq.mr_ctrl,
-			      &priv->sh->cdev->mr_scache.dev_gen, socket)) {
+	if (mlx5_mr_ctrl_init(&tmpl->rxq.mr_ctrl, priv->sh->cdev, socket)) {
 		/* rte_errno is already set. */
 		goto error;
 	}
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index ea20213a40..7fed0e7cb9 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -368,10 +368,9 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf *mb)
 	struct mlx5_mr_ctrl *mr_ctrl = &txq->mr_ctrl;
 	struct mlx5_txq_ctrl *txq_ctrl =
 			container_of(txq, struct mlx5_txq_ctrl, txq);
-	struct mlx5_priv *priv = txq_ctrl->priv;
 
 	/* Take slower bottom-half on miss. */
-	return mlx5_mr_mb2mr(priv->sh->cdev, &priv->mp_id, mr_ctrl, mb);
+	return mlx5_mr_mb2mr(mr_ctrl, mb, &txq_ctrl->priv->mp_id);
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index e2a38d980a..e9ab7fa266 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1134,8 +1134,7 @@ mlx5_txq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 		rte_errno = ENOMEM;
 		return NULL;
 	}
-	if (mlx5_mr_ctrl_init(&tmpl->txq.mr_ctrl,
-			      &priv->sh->cdev->mr_scache.dev_gen, socket)) {
+	if (mlx5_mr_ctrl_init(&tmpl->txq.mr_ctrl, priv->sh->cdev, socket)) {
 		/* rte_errno is already set. */
 		goto error;
 	}
diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c
index 50c966a022..e40b1f20d9 100644
--- a/drivers/regex/mlx5/mlx5_regex_control.c
+++ b/drivers/regex/mlx5/mlx5_regex_control.c
@@ -242,8 +242,7 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind,
 		nb_sq_config++;
 	}
 
-	ret = mlx5_mr_ctrl_init(&qp->mr_ctrl, &priv->cdev->mr_scache.dev_gen,
-				rte_socket_id());
+	ret = mlx5_mr_ctrl_init(&qp->mr_ctrl, priv->cdev, rte_socket_id());
 	if (ret) {
 		DRV_LOG(ERR, "Error setting up mr btree");
 		goto err_btree;
diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c
index adb5343a46..943cb9c19e 100644
--- a/drivers/regex/mlx5/mlx5_regex_fastpath.c
+++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c
@@ -109,26 +109,6 @@ set_wqe_ctrl_seg(struct mlx5_wqe_ctrl_seg *seg, uint16_t pi, uint8_t opcode,
 	seg->imm = imm;
 }
 
-/**
- * Query LKey from a packet buffer for QP. If not found, add the mempool.
- *
- * @param priv
- *   Pointer to the priv object.
- * @param mr_ctrl
- *   Pointer to per-queue MR control structure.
- * @param mbuf
- *   Pointer to source mbuf, to search in.
- *
- * @return
- *   Searched LKey on success, UINT32_MAX on no match.
- */
-static inline uint32_t
-mlx5_regex_mb2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl,
-		 struct rte_mbuf *mbuf)
-{
-	return mlx5_mr_mb2mr(priv->cdev, 0, mr_ctrl, mbuf);
-}
-
 static inline void
 __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj,
 	   struct rte_regex_ops *op, struct mlx5_regex_job *job,
@@ -180,7 +160,7 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 	struct mlx5_klm klm;
 
 	klm.byte_count = rte_pktmbuf_data_len(op->mbuf);
-	klm.mkey = mlx5_regex_mb2mr(priv, &qp->mr_ctrl, op->mbuf);
+	klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, op->mbuf, 0);
 	klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t);
 	__prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm);
 	qp_obj->db_pi = qp_obj->pi;
@@ -349,9 +329,8 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 			while (mbuf) {
 				addr = rte_pktmbuf_mtod(mbuf, uintptr_t);
 				/* Build indirect mkey seg's KLM. */
-				mkey_klm->mkey = mlx5_regex_mb2mr(priv,
-								  &qp->mr_ctrl,
-								  mbuf);
+				mkey_klm->mkey = mlx5_mr_mb2mr(&qp->mr_ctrl,
+							       mbuf, 0);
 				mkey_klm->address = rte_cpu_to_be_64(addr);
 				mkey_klm->byte_count = rte_cpu_to_be_32
 						(rte_pktmbuf_data_len(mbuf));
@@ -368,7 +347,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 			klm.byte_count = scatter_size;
 		} else {
 			/* The single mubf case. Build the KLM directly. */
-			klm.mkey = mlx5_regex_mb2mr(priv, &qp->mr_ctrl, mbuf);
+			klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
 			klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t);
 			klm.byte_count = rte_pktmbuf_data_len(mbuf);
 		}
-- 
2.25.1



* [dpdk-dev] [PATCH 3/3] common/mlx5: make MR managmant port-agnostic for MP
  2021-11-03 10:17 [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 1/3] common/mlx5: fix MR search non inline function michaelba
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 2/3] common/mlx5: fix redundant parameter in search MR function michaelba
@ 2021-11-03 10:17 ` michaelba
  2021-11-07 13:17 ` [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation Thomas Monjalon
  3 siblings, 0 replies; 5+ messages in thread
From: michaelba @ 2021-11-03 10:17 UTC (permalink / raw)
  To: dev
  Cc: Matan Azrad, Thomas Monjalon, Michael Baum, Viacheslav Ovsiienko,
	Dmitry Kozlyuk

From: Michael Baum <michaelba@oss.nvidia.com>

In the multi-process mechanism, there are operations that the secondary
process does not perform itself but asks the primary process to perform
on its behalf.
A dedicated IPC API passes the parameters required for the specific
operation, together with a structure called mp_id that carries the port
number through which the primary process finds the relevant ETH device.

One of the operations performed through this mechanism is the creation
of a memory region, where the secondary process sends the virtual
address as a parameter along with the mp_id structure holding the port
number.
However, now that memory region management is shared between the
drivers, neither the port number nor the ETH device is relevant to it,
so there is no need to keep passing the mp_id structure between the
processes.

Remove the use of the mp_id structure from all MR management requests,
and instead add to the operation parameters a pointer to the common
device, which contains everything needed to create/register an MR.
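
For illustration, a minimal sketch of the secondary-process side after
this change (hypothetical function name and includes; the request
prototypes follow the diff below and carry only the common-device
pointer, with no port number or ETH device lookup):

#include <rte_errno.h>
#include <rte_mempool.h>
#include <mlx5_common_mp.h>
#include <mlx5_common_mr.h>

static int
example_secondary_setup(struct mlx5_common_device *cdev,
			struct rte_mempool *mp, uintptr_t missing_addr)
{
	/* Register a mempool through the primary process; cdev replaces
	 * the old share_cache/pd/mp_id triple.
	 */
	if (mlx5_mp_req_mempool_reg(cdev, mp, true) < 0)
		return -rte_errno;
	/* Request creation of a single MR the same way; the IPC message
	 * is built port-agnostic and carries only cdev and the address.
	 */
	return mlx5_mp_req_mr_create(cdev, missing_addr);
}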

Fixes: 9f1d636f3ef08 ("common/mlx5: share MR management")

Signed-off-by: Michael Baum <michaelba@oss.nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common.c        |  10 +-
 drivers/common/mlx5/mlx5_common_mp.c     |  38 +++---
 drivers/common/mlx5/mlx5_common_mp.h     |  44 +++++--
 drivers/common/mlx5/mlx5_common_mr.c     | 149 +++++++----------------
 drivers/common/mlx5/mlx5_common_mr.h     |  27 ++--
 drivers/common/mlx5/version.map          |   2 +-
 drivers/compress/mlx5/mlx5_compress.c    |   2 +-
 drivers/crypto/mlx5/mlx5_crypto.c        |   2 +-
 drivers/net/mlx5/linux/mlx5_mp_os.c      |  39 +++---
 drivers/net/mlx5/mlx5_rxq.c              |   3 +-
 drivers/net/mlx5/mlx5_trigger.c          |   4 +-
 drivers/net/mlx5/mlx5_tx.h               |  26 +---
 drivers/regex/mlx5/mlx5_regex_fastpath.c |   6 +-
 13 files changed, 131 insertions(+), 221 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common.c b/drivers/common/mlx5/mlx5_common.c
index e6ff045c95..1c36212a04 100644
--- a/drivers/common/mlx5/mlx5_common.c
+++ b/drivers/common/mlx5/mlx5_common.c
@@ -318,10 +318,7 @@ static int
 mlx5_dev_mempool_register(struct mlx5_common_device *cdev,
 			  struct rte_mempool *mp)
 {
-	struct mlx5_mp_id mp_id;
-
-	mlx5_mp_id_init(&mp_id, 0);
-	return mlx5_mr_mempool_register(&cdev->mr_scache, cdev->pd, mp, &mp_id);
+	return mlx5_mr_mempool_register(cdev, mp);
 }
 
 /**
@@ -336,10 +333,7 @@ void
 mlx5_dev_mempool_unregister(struct mlx5_common_device *cdev,
 			    struct rte_mempool *mp)
 {
-	struct mlx5_mp_id mp_id;
-
-	mlx5_mp_id_init(&mp_id, 0);
-	if (mlx5_mr_mempool_unregister(&cdev->mr_scache, mp, &mp_id) < 0)
+	if (mlx5_mr_mempool_unregister(cdev, mp) < 0)
 		DRV_LOG(WARNING, "Failed to unregister mempool %s for PD %p: %s",
 			mp->name, cdev->pd, rte_strerror(rte_errno));
 }
diff --git a/drivers/common/mlx5/mlx5_common_mp.c b/drivers/common/mlx5/mlx5_common_mp.c
index 6dfc5535e0..536d61f66c 100644
--- a/drivers/common/mlx5/mlx5_common_mp.c
+++ b/drivers/common/mlx5/mlx5_common_mp.c
@@ -16,8 +16,8 @@
 /**
  * Request Memory Region creation to the primary process.
  *
- * @param[in] mp_id
- *   ID of the MP process.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param addr
  *   Target virtual address to register.
  *
@@ -25,23 +25,24 @@
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 int
-mlx5_mp_req_mr_create(struct mlx5_mp_id *mp_id, uintptr_t addr)
+mlx5_mp_req_mr_create(struct mlx5_common_device *cdev, uintptr_t addr)
 {
 	struct rte_mp_msg mp_req;
 	struct rte_mp_msg *mp_res;
 	struct rte_mp_reply mp_rep;
 	struct mlx5_mp_param *req = (struct mlx5_mp_param *)mp_req.param;
+	struct mlx5_mp_arg_mr_manage *arg = &req->args.mr_manage;
 	struct mlx5_mp_param *res;
 	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
 	int ret;
 
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_SECONDARY);
-	mp_init_msg(mp_id, &mp_req, MLX5_MP_REQ_CREATE_MR);
-	req->args.addr = addr;
+	mp_init_port_agnostic_msg(&mp_req, MLX5_MP_REQ_CREATE_MR);
+	arg->addr = addr;
+	arg->cdev = cdev;
 	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
 	if (ret) {
-		DRV_LOG(ERR, "port %u request to primary process failed",
-			mp_id->port_id);
+		DRV_LOG(ERR, "Create MR request to primary process failed.");
 		return -rte_errno;
 	}
 	MLX5_ASSERT(mp_rep.nb_received == 1);
@@ -55,27 +56,22 @@ mlx5_mp_req_mr_create(struct mlx5_mp_id *mp_id, uintptr_t addr)
 }
 
 /**
- * @param mp_id
- *   ID of the MP process.
- * @param share_cache
- *   Shared MR cache.
- * @param pd
- *   Protection domain.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param mempool
  *   Mempool to register or unregister.
  * @param reg
  *   True to register the mempool, False to unregister.
  */
 int
-mlx5_mp_req_mempool_reg(struct mlx5_mp_id *mp_id,
-			struct mlx5_mr_share_cache *share_cache, void *pd,
+mlx5_mp_req_mempool_reg(struct mlx5_common_device *cdev,
 			struct rte_mempool *mempool, bool reg)
 {
 	struct rte_mp_msg mp_req;
 	struct rte_mp_msg *mp_res;
 	struct rte_mp_reply mp_rep;
 	struct mlx5_mp_param *req = (struct mlx5_mp_param *)mp_req.param;
-	struct mlx5_mp_arg_mempool_reg *arg = &req->args.mempool_reg;
+	struct mlx5_mp_arg_mr_manage *arg = &req->args.mr_manage;
 	struct mlx5_mp_param *res;
 	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
 	enum mlx5_mp_req_type type;
@@ -84,14 +80,14 @@ mlx5_mp_req_mempool_reg(struct mlx5_mp_id *mp_id,
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_SECONDARY);
 	type = reg ? MLX5_MP_REQ_MEMPOOL_REGISTER :
 		     MLX5_MP_REQ_MEMPOOL_UNREGISTER;
-	mp_init_msg(mp_id, &mp_req, type);
-	arg->share_cache = share_cache;
-	arg->pd = pd;
+	mp_init_port_agnostic_msg(&mp_req, type);
 	arg->mempool = mempool;
+	arg->cdev = cdev;
 	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
 	if (ret) {
-		DRV_LOG(ERR, "port %u request to primary process failed",
-			mp_id->port_id);
+		DRV_LOG(ERR,
+			"Mempool %sregister request to primary process failed.",
+			reg ? "" : "un");
 		return -rte_errno;
 	}
 	MLX5_ASSERT(mp_rep.nb_received == 1);
diff --git a/drivers/common/mlx5/mlx5_common_mp.h b/drivers/common/mlx5/mlx5_common_mp.h
index 2276dc921c..b1e3a41a20 100644
--- a/drivers/common/mlx5/mlx5_common_mp.h
+++ b/drivers/common/mlx5/mlx5_common_mp.h
@@ -35,22 +35,24 @@ struct mlx5_mp_arg_queue_id {
 	uint16_t queue_id; /* DPDK queue ID. */
 };
 
-struct mlx5_mp_arg_mempool_reg {
-	struct mlx5_mr_share_cache *share_cache;
-	void *pd; /* NULL for MLX5_MP_REQ_MEMPOOL_UNREGISTER */
-	struct rte_mempool *mempool;
+struct mlx5_mp_arg_mr_manage {
+	struct mlx5_common_device *cdev;
+	union {
+		struct rte_mempool *mempool;
+		/* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER */
+		uintptr_t addr; /* MLX5_MP_REQ_CREATE_MR */
+	};
 };
 
-/* Pameters for IPC. */
+/* Parameters for IPC. */
 struct mlx5_mp_param {
 	enum mlx5_mp_req_type type;
 	int port_id;
 	int result;
 	RTE_STD_C11
 	union {
-		uintptr_t addr; /* MLX5_MP_REQ_CREATE_MR */
-		struct mlx5_mp_arg_mempool_reg mempool_reg;
-		/* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER */
+		struct mlx5_mp_arg_mr_manage mr_manage;
+		/* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER, MLX5_MP_REQ_CREATE_MR */
 		struct mlx5_mp_arg_queue_state_modify state_modify;
 		/* MLX5_MP_REQ_QUEUE_STATE_MODIFY */
 		struct mlx5_mp_arg_queue_id queue_id;
@@ -101,6 +103,25 @@ mp_init_msg(struct mlx5_mp_id *mp_id, struct rte_mp_msg *msg,
 	param->port_id = mp_id->port_id;
 }
 
+/**
+ * Initialize IPC port-agnostic message.
+ *
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_port_agnostic_msg(struct rte_mp_msg *msg, enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+}
+
 __rte_internal
 int mlx5_mp_init_primary(const char *name, const rte_mp_t primary_action);
 __rte_internal
@@ -110,11 +131,10 @@ int mlx5_mp_init_secondary(const char *name, const rte_mp_t secondary_action);
 __rte_internal
 void mlx5_mp_uninit_secondary(const char *name);
 __rte_internal
-int mlx5_mp_req_mr_create(struct mlx5_mp_id *mp_id, uintptr_t addr);
+int mlx5_mp_req_mr_create(struct mlx5_common_device *cdev, uintptr_t addr);
 __rte_internal
-int mlx5_mp_req_mempool_reg(struct mlx5_mp_id *mp_id,
-			struct mlx5_mr_share_cache *share_cache, void *pd,
-			struct rte_mempool *mempool, bool reg);
+int mlx5_mp_req_mempool_reg(struct mlx5_common_device *cdev,
+			    struct rte_mempool *mempool, bool reg);
 __rte_internal
 int mlx5_mp_req_queue_state_modify(struct mlx5_mp_id *mp_id,
 				   struct mlx5_mp_arg_queue_state_modify *sm);
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 003d358f96..93c8dc9042 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -591,10 +591,8 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
  * list is on the shared memory, following LKey lookup should succeed unless the
  * request fails.
  *
- * @param pd
- *   Pointer to pd of a device (net, regex, vdpa,...).
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param share_cache
  *   Pointer to a global shared MR cache.
  * @param[out] entry
@@ -602,31 +600,22 @@ mr_find_contig_memsegs_cb(const struct rte_memseg_list *msl,
  *   created. If failed to create one, this will not be updated.
  * @param addr
  *   Target virtual address to register.
- * @param mr_ext_memseg_en
- *   Configurable flag about external memory segment enable or not.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on failure and rte_errno is set.
  */
 static uint32_t
-mlx5_mr_create_secondary(void *pd __rte_unused,
-			 struct mlx5_mp_id *mp_id,
+mlx5_mr_create_secondary(struct mlx5_common_device *cdev,
 			 struct mlx5_mr_share_cache *share_cache,
-			 struct mr_cache_entry *entry, uintptr_t addr,
-			 unsigned int mr_ext_memseg_en __rte_unused)
+			 struct mr_cache_entry *entry, uintptr_t addr)
 {
 	int ret;
 
-	if (mp_id == NULL) {
-		rte_errno = EINVAL;
-		return UINT32_MAX;
-	}
-	DRV_LOG(DEBUG, "port %u requesting MR creation for address (%p)",
-	      mp_id->port_id, (void *)addr);
-	ret = mlx5_mp_req_mr_create(mp_id, addr);
+	DRV_LOG(DEBUG, "Requesting MR creation for address (%p)", (void *)addr);
+	ret = mlx5_mp_req_mr_create(cdev, addr);
 	if (ret) {
 		DRV_LOG(DEBUG, "Fail to request MR creation for address (%p)",
-		      (void *)addr);
+			(void *)addr);
 		return UINT32_MAX;
 	}
 	rte_rwlock_read_lock(&share_cache->rwlock);
@@ -636,8 +625,8 @@ mlx5_mr_create_secondary(void *pd __rte_unused,
 	MLX5_ASSERT(entry->lkey != UINT32_MAX);
 	rte_rwlock_read_unlock(&share_cache->rwlock);
 	DRV_LOG(DEBUG, "MR CREATED by primary process for %p:\n"
-	      "  [0x%" PRIxPTR ", 0x%" PRIxPTR "), lkey=0x%x",
-	      (void *)addr, entry->start, entry->end, entry->lkey);
+		"  [0x%" PRIxPTR ", 0x%" PRIxPTR "), lkey=0x%x",
+		(void *)addr, entry->start, entry->end, entry->lkey);
 	return entry->lkey;
 }
 
@@ -660,7 +649,7 @@ mlx5_mr_create_secondary(void *pd __rte_unused,
  * @return
  *   Searched LKey on success, UINT32_MAX on failure and rte_errno is set.
  */
-uint32_t
+static uint32_t
 mlx5_mr_create_primary(void *pd,
 		       struct mlx5_mr_share_cache *share_cache,
 		       struct mr_cache_entry *entry, uintptr_t addr,
@@ -888,10 +877,8 @@ mlx5_mr_create_primary(void *pd,
  * Create a new global Memory Region (MR) for a missing virtual address.
  * This can be called from primary and secondary process.
  *
- * @param pd
- *   Pointer to pd handle of a device (net, regex, vdpa,...).
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param share_cache
  *   Pointer to a global shared MR cache.
  * @param[out] entry
@@ -899,28 +886,24 @@ mlx5_mr_create_primary(void *pd,
  *   created. If failed to create one, this will not be updated.
  * @param addr
  *   Target virtual address to register.
- * @param mr_ext_memseg_en
- *   Configurable flag about external memory segment enable or not.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on failure and rte_errno is set.
  */
-static uint32_t
-mlx5_mr_create(void *pd, struct mlx5_mp_id *mp_id,
+uint32_t
+mlx5_mr_create(struct mlx5_common_device *cdev,
 	       struct mlx5_mr_share_cache *share_cache,
-	       struct mr_cache_entry *entry, uintptr_t addr,
-	       unsigned int mr_ext_memseg_en)
+	       struct mr_cache_entry *entry, uintptr_t addr)
 {
 	uint32_t ret = 0;
 
 	switch (rte_eal_process_type()) {
 	case RTE_PROC_PRIMARY:
-		ret = mlx5_mr_create_primary(pd, share_cache, entry,
-					     addr, mr_ext_memseg_en);
+		ret = mlx5_mr_create_primary(cdev->pd, share_cache, entry, addr,
+					     cdev->config.mr_ext_memseg_en);
 		break;
 	case RTE_PROC_SECONDARY:
-		ret = mlx5_mr_create_secondary(pd, mp_id, share_cache, entry,
-					       addr, mr_ext_memseg_en);
+		ret = mlx5_mr_create_secondary(cdev, share_cache, entry, addr);
 		break;
 	default:
 		break;
@@ -932,12 +915,6 @@ mlx5_mr_create(void *pd, struct mlx5_mp_id *mp_id,
  * Look up address in the global MR cache table. If not found, create a new MR.
  * Insert the found/created entry to local bottom-half cache table.
  *
- * @param pd
- *   Pointer to pd of a device (net, regex, vdpa,...).
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
- * @param share_cache
- *   Pointer to a global shared MR cache.
  * @param mr_ctrl
  *   Pointer to per-queue MR control structure.
  * @param[out] entry
@@ -945,19 +922,15 @@ mlx5_mr_create(void *pd, struct mlx5_mp_id *mp_id,
  *   created. If failed to create one, this is not written.
  * @param addr
  *   Search key.
- * @param mr_ext_memseg_en
- *   Configurable flag about external memory segment enable or not.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 static uint32_t
-mr_lookup_caches(void *pd, struct mlx5_mp_id *mp_id,
-		 struct mlx5_mr_share_cache *share_cache,
-		 struct mlx5_mr_ctrl *mr_ctrl,
-		 struct mr_cache_entry *entry, uintptr_t addr,
-		 unsigned int mr_ext_memseg_en)
+mr_lookup_caches(struct mlx5_mr_ctrl *mr_ctrl,
+		 struct mr_cache_entry *entry, uintptr_t addr)
 {
+	struct mlx5_mr_share_cache *share_cache = &mr_ctrl->cdev->mr_scache;
 	struct mlx5_mr_btree *bt = &mr_ctrl->cache_bh;
 	uint32_t lkey;
 	uint16_t idx;
@@ -982,8 +955,7 @@ mr_lookup_caches(void *pd, struct mlx5_mp_id *mp_id,
 	}
 	rte_rwlock_read_unlock(&share_cache->rwlock);
 	/* First time to see the address? Create a new MR. */
-	lkey = mlx5_mr_create(pd, mp_id, share_cache, entry, addr,
-			      mr_ext_memseg_en);
+	lkey = mlx5_mr_create(mr_ctrl->cdev, share_cache, entry, addr);
 	/*
 	 * Update the local cache if successfully created a new global MR. Even
 	 * if failed to create one, there's no action to take in this datapath
@@ -1000,27 +972,16 @@ mr_lookup_caches(void *pd, struct mlx5_mp_id *mp_id,
  * misses, search in the global MR cache table and update the new entry to
  * per-queue local caches.
  *
- * @param pd
- *   Pointer to pd of a device (net, regex, vdpa,...).
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
- * @param share_cache
- *   Pointer to a global shared MR cache.
  * @param mr_ctrl
  *   Pointer to per-queue MR control structure.
  * @param addr
  *   Search key.
- * @param mr_ext_memseg_en
- *   Configurable flag about external memory segment enable or not.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 static uint32_t
-mlx5_mr_addr2mr_bh(void *pd, struct mlx5_mp_id *mp_id,
-		   struct mlx5_mr_share_cache *share_cache,
-		   struct mlx5_mr_ctrl *mr_ctrl, uintptr_t addr,
-		   unsigned int mr_ext_memseg_en)
+mlx5_mr_addr2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, uintptr_t addr)
 {
 	uint32_t lkey;
 	uint16_t bh_idx = 0;
@@ -1038,8 +999,7 @@ mlx5_mr_addr2mr_bh(void *pd, struct mlx5_mp_id *mp_id,
 		 * and local cache_bh[] will be updated inside if possible.
 		 * Top-half cache entry will also be updated.
 		 */
-		lkey = mr_lookup_caches(pd, mp_id, share_cache, mr_ctrl,
-					repl, addr, mr_ext_memseg_en);
+		lkey = mr_lookup_caches(mr_ctrl, repl, addr);
 		if (unlikely(lkey == UINT32_MAX))
 			return UINT32_MAX;
 	}
@@ -1622,44 +1582,35 @@ mlx5_mr_mempool_register_primary(struct mlx5_mr_share_cache *share_cache,
 }
 
 static int
-mlx5_mr_mempool_register_secondary(struct mlx5_mr_share_cache *share_cache,
-				   void *pd, struct rte_mempool *mp,
-				   struct mlx5_mp_id *mp_id)
+mlx5_mr_mempool_register_secondary(struct mlx5_common_device *cdev,
+				   struct rte_mempool *mp)
 {
-	if (mp_id == NULL) {
-		rte_errno = EINVAL;
-		return -1;
-	}
-	return mlx5_mp_req_mempool_reg(mp_id, share_cache, pd, mp, true);
+	return mlx5_mp_req_mempool_reg(cdev, mp, true);
 }
 
 /**
  * Register the memory of a mempool in the protection domain.
  *
- * @param share_cache
- *   Shared MR cache of the protection domain.
- * @param pd
- *   Protection domain object.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param mp
  *   Mempool to register.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   0 on success, (-1) on failure and rte_errno is set.
  */
 int
-mlx5_mr_mempool_register(struct mlx5_mr_share_cache *share_cache, void *pd,
-			 struct rte_mempool *mp, struct mlx5_mp_id *mp_id)
+mlx5_mr_mempool_register(struct mlx5_common_device *cdev,
+			 struct rte_mempool *mp)
 {
 	if (mp->flags & RTE_MEMPOOL_F_NON_IO)
 		return 0;
 	switch (rte_eal_process_type()) {
 	case RTE_PROC_PRIMARY:
-		return mlx5_mr_mempool_register_primary(share_cache, pd, mp);
+		return mlx5_mr_mempool_register_primary(&cdev->mr_scache,
+							cdev->pd, mp);
 	case RTE_PROC_SECONDARY:
-		return mlx5_mr_mempool_register_secondary(share_cache, pd, mp,
-							  mp_id);
+		return mlx5_mr_mempool_register_secondary(cdev, mp);
 	default:
 		return -1;
 	}
@@ -1695,42 +1646,34 @@ mlx5_mr_mempool_unregister_primary(struct mlx5_mr_share_cache *share_cache,
 }
 
 static int
-mlx5_mr_mempool_unregister_secondary(struct mlx5_mr_share_cache *share_cache,
-				     struct rte_mempool *mp,
-				     struct mlx5_mp_id *mp_id)
+mlx5_mr_mempool_unregister_secondary(struct mlx5_common_device *cdev,
+				     struct rte_mempool *mp)
 {
-	if (mp_id == NULL) {
-		rte_errno = EINVAL;
-		return -1;
-	}
-	return mlx5_mp_req_mempool_reg(mp_id, share_cache, NULL, mp, false);
+	return mlx5_mp_req_mempool_reg(cdev, mp, false);
 }
 
 /**
  * Unregister the memory of a mempool from the protection domain.
  *
- * @param share_cache
- *   Shared MR cache of the protection domain.
+ * @param cdev
+ *   Pointer to the mlx5 common device.
  * @param mp
  *   Mempool to unregister.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   0 on success, (-1) on failure and rte_errno is set.
  */
 int
-mlx5_mr_mempool_unregister(struct mlx5_mr_share_cache *share_cache,
-			   struct rte_mempool *mp, struct mlx5_mp_id *mp_id)
+mlx5_mr_mempool_unregister(struct mlx5_common_device *cdev,
+			   struct rte_mempool *mp)
 {
 	if (mp->flags & RTE_MEMPOOL_F_NON_IO)
 		return 0;
 	switch (rte_eal_process_type()) {
 	case RTE_PROC_PRIMARY:
-		return mlx5_mr_mempool_unregister_primary(share_cache, mp);
+		return mlx5_mr_mempool_unregister_primary(&cdev->mr_scache, mp);
 	case RTE_PROC_SECONDARY:
-		return mlx5_mr_mempool_unregister_secondary(share_cache, mp,
-							    mp_id);
+		return mlx5_mr_mempool_unregister_secondary(cdev, mp);
 	default:
 		return -1;
 	}
@@ -1861,8 +1804,7 @@ mlx5_mr_mempool2mr_bh(struct mlx5_mr_share_cache *share_cache,
 }
 
 uint32_t
-mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb,
-		 struct mlx5_mp_id *mp_id)
+mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb)
 {
 	uint32_t lkey;
 	uintptr_t addr = (uintptr_t)mb->buf_addr;
@@ -1891,6 +1833,5 @@ mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mb,
 		}
 		/* Fallback for generic mechanism in corner cases. */
 	}
-	return mlx5_mr_addr2mr_bh(cdev->pd, mp_id, &cdev->mr_scache, mr_ctrl,
-				  addr, cdev->config.mr_ext_memseg_en);
+	return mlx5_mr_addr2mr_bh(mr_ctrl, addr);
 }
diff --git a/drivers/common/mlx5/mlx5_common_mr.h b/drivers/common/mlx5/mlx5_common_mr.h
index f65974b8a9..93903d8397 100644
--- a/drivers/common/mlx5/mlx5_common_mr.h
+++ b/drivers/common/mlx5/mlx5_common_mr.h
@@ -174,15 +174,12 @@ void mlx5_mr_flush_local_cache(struct mlx5_mr_ctrl *mr_ctrl);
  *   Pointer to per-queue MR control structure.
  * @param mb
  *   Pointer to mbuf.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 __rte_internal
-uint32_t mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
-			  struct mlx5_mp_id *mp_id);
+uint32_t mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf);
 
 /**
  * Query LKey from a packet buffer.
@@ -191,15 +188,12 @@ uint32_t mlx5_mr_mb2mr_bh(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
  *   Pointer to per-queue MR control structure.
  * @param mbuf
  *   Pointer to mbuf.
- * @param mp_id
- *   Multi-process identifier, may be NULL for the primary process.
  *
  * @return
  *   Searched LKey on success, UINT32_MAX on no match.
  */
 static __rte_always_inline uint32_t
-mlx5_mr_mb2mr(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
-	      struct mlx5_mp_id *mp_id)
+mlx5_mr_mb2mr(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf)
 {
 	uint32_t lkey;
 
@@ -212,7 +206,7 @@ mlx5_mr_mb2mr(struct mlx5_mr_ctrl *mr_ctrl, struct rte_mbuf *mbuf,
 	if (likely(lkey != UINT32_MAX))
 		return lkey;
 	/* Take slower bottom-half on miss. */
-	return mlx5_mr_mb2mr_bh(mr_ctrl, mbuf, mp_id);
+	return mlx5_mr_mb2mr_bh(mr_ctrl, mbuf);
 }
 
 /* mlx5_common_mr.c */
@@ -244,10 +238,9 @@ mlx5_create_mr_ext(void *pd, uintptr_t addr, size_t len, int socket_id,
 void mlx5_mr_free(struct mlx5_mr *mr, mlx5_dereg_mr_t dereg_mr_cb);
 __rte_internal
 uint32_t
-mlx5_mr_create_primary(void *pd,
-		       struct mlx5_mr_share_cache *share_cache,
-		       struct mr_cache_entry *entry, uintptr_t addr,
-		       unsigned int mr_ext_memseg_en);
+mlx5_mr_create(struct mlx5_common_device *cdev,
+	       struct mlx5_mr_share_cache *share_cache,
+	       struct mr_cache_entry *entry, uintptr_t addr);
 
 /* mlx5_common_verbs.c */
 
@@ -264,11 +257,11 @@ mlx5_os_set_reg_mr_cb(mlx5_reg_mr_t *reg_mr_cb, mlx5_dereg_mr_t *dereg_mr_cb);
 
 __rte_internal
 int
-mlx5_mr_mempool_register(struct mlx5_mr_share_cache *share_cache, void *pd,
-			 struct rte_mempool *mp, struct mlx5_mp_id *mp_id);
+mlx5_mr_mempool_register(struct mlx5_common_device *cdev,
+			 struct rte_mempool *mp);
 __rte_internal
 int
-mlx5_mr_mempool_unregister(struct mlx5_mr_share_cache *share_cache,
-			   struct rte_mempool *mp, struct mlx5_mp_id *mp_id);
+mlx5_mr_mempool_unregister(struct mlx5_common_device *cdev,
+			   struct rte_mempool *mp);
 
 #endif /* RTE_PMD_MLX5_COMMON_MR_H_ */
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index f059dba7d6..fff8d1f937 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -109,7 +109,7 @@ INTERNAL {
 
 	mlx5_mprq_buf_free_cb;
 	mlx5_mr_btree_free;
-	mlx5_mr_create_primary;
+	mlx5_mr_create;
 	mlx5_mr_ctrl_init;
 	mlx5_mr_flush_local_cache;
 	mlx5_mr_mb2mr_bh;
diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c
index 5cf6d647af..0e87d3b819 100644
--- a/drivers/compress/mlx5/mlx5_compress.c
+++ b/drivers/compress/mlx5/mlx5_compress.c
@@ -471,7 +471,7 @@ mlx5_compress_dseg_set(struct mlx5_compress_qp *qp,
 	uintptr_t addr = rte_pktmbuf_mtod_offset(mbuf, uintptr_t, offset);
 
 	dseg->bcount = rte_cpu_to_be_32(len);
-	dseg->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
+	dseg->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf);
 	dseg->pbuf = rte_cpu_to_be_64(addr);
 	return dseg->lkey;
 }
diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index 1740dba003..6b0ea87438 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -317,7 +317,7 @@ mlx5_crypto_klm_set(struct mlx5_crypto_qp *qp, struct rte_mbuf *mbuf,
 	*remain -= data_len;
 	klm->bcount = rte_cpu_to_be_32(data_len);
 	klm->pbuf = rte_cpu_to_be_64(addr);
-	klm->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
+	klm->lkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf);
 	return klm->lkey;
 
 }
diff --git a/drivers/net/mlx5/linux/mlx5_mp_os.c b/drivers/net/mlx5/linux/mlx5_mp_os.c
index 017a731b3f..edc5203dd6 100644
--- a/drivers/net/mlx5/linux/mlx5_mp_os.c
+++ b/drivers/net/mlx5/linux/mlx5_mp_os.c
@@ -34,24 +34,26 @@ mlx5_mp_os_handle_port_agnostic(const struct rte_mp_msg *mp_msg,
 	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
 	const struct mlx5_mp_param *param =
 		(const struct mlx5_mp_param *)mp_msg->param;
-	const struct mlx5_mp_arg_mempool_reg *mpr;
-	struct mlx5_mp_id mp_id;
+	const struct mlx5_mp_arg_mr_manage *mng = &param->args.mr_manage;
+	struct mr_cache_entry entry;
+	uint32_t lkey;
 
 	switch (param->type) {
+	case MLX5_MP_REQ_CREATE_MR:
+		mp_init_port_agnostic_msg(&mp_res, param->type);
+		lkey = mlx5_mr_create(mng->cdev, &mng->cdev->mr_scache, &entry,
+				      mng->addr);
+		if (lkey == UINT32_MAX)
+			res->result = -rte_errno;
+		return rte_mp_reply(&mp_res, peer);
 	case MLX5_MP_REQ_MEMPOOL_REGISTER:
-		mlx5_mp_id_init(&mp_id, param->port_id);
-		mp_init_msg(&mp_id, &mp_res, param->type);
-		mpr = &param->args.mempool_reg;
-		res->result = mlx5_mr_mempool_register(mpr->share_cache,
-						       mpr->pd, mpr->mempool,
-						       NULL);
+		mp_init_port_agnostic_msg(&mp_res, param->type);
+		res->result = mlx5_mr_mempool_register(mng->cdev, mng->mempool);
 		return rte_mp_reply(&mp_res, peer);
 	case MLX5_MP_REQ_MEMPOOL_UNREGISTER:
-		mlx5_mp_id_init(&mp_id, param->port_id);
-		mp_init_msg(&mp_id, &mp_res, param->type);
-		mpr = &param->args.mempool_reg;
-		res->result = mlx5_mr_mempool_unregister(mpr->share_cache,
-							 mpr->mempool, NULL);
+		mp_init_port_agnostic_msg(&mp_res, param->type);
+		res->result = mlx5_mr_mempool_unregister(mng->cdev,
+							 mng->mempool);
 		return rte_mp_reply(&mp_res, peer);
 	default:
 		return 1;
@@ -69,8 +71,6 @@ mlx5_mp_os_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
 	struct rte_eth_dev *dev;
 	struct mlx5_priv *priv;
 	struct mlx5_common_device *cdev;
-	struct mr_cache_entry entry;
-	uint32_t lkey;
 	int ret;
 
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_PRIMARY);
@@ -88,15 +88,6 @@ mlx5_mp_os_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
 	priv = dev->data->dev_private;
 	cdev = priv->sh->cdev;
 	switch (param->type) {
-	case MLX5_MP_REQ_CREATE_MR:
-		mp_init_msg(&priv->mp_id, &mp_res, param->type);
-		lkey = mlx5_mr_create_primary(cdev->pd, &cdev->mr_scache,
-					      &entry, param->args.addr,
-					      cdev->config.mr_ext_memseg_en);
-		if (lkey == UINT32_MAX)
-			res->result = -rte_errno;
-		ret = rte_mp_reply(&mp_res, peer);
-		break;
 	case MLX5_MP_REQ_VERBS_CMD_FD:
 		mp_init_msg(&priv->mp_id, &mp_res, param->type);
 		mp_res.num_fds = 1;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1fc2f0e0c1..5f6dae953b 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1247,8 +1247,7 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev)
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
-	ret = mlx5_mr_mempool_register(&priv->sh->cdev->mr_scache,
-				       priv->sh->cdev->pd, mp, &priv->mp_id);
+	ret = mlx5_mr_mempool_register(priv->sh->cdev, mp);
 	if (ret < 0 && rte_errno != EEXIST) {
 		ret = rte_errno;
 		DRV_LOG(ERR, "port %u failed to register a mempool for Multi-Packet RQ",
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index d916c8addc..172419ca56 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -147,9 +147,7 @@ mlx5_rxq_mempool_register(struct mlx5_rxq_ctrl *rxq_ctrl)
 	}
 	for (s = 0; s < rxq_ctrl->rxq.rxseg_n; s++) {
 		mp = rxq_ctrl->rxq.rxseg[s].mp;
-		ret = mlx5_mr_mempool_register(&priv->sh->cdev->mr_scache,
-					       priv->sh->cdev->pd, mp,
-					       &priv->mp_id);
+		ret = mlx5_mr_mempool_register(priv->sh->cdev, mp);
 		if (ret < 0 && rte_errno != EEXIST)
 			return ret;
 		rte_mempool_mem_iter(mp, mlx5_rxq_mempool_register_cb,
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index 7fed0e7cb9..0b9109a115 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -351,28 +351,6 @@ __mlx5_uar_write64(uint64_t val, void *addr, rte_spinlock_t *lock)
 #define mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, lock)
 #endif
 
-/**
- * Query LKey from a packet buffer for Tx.
- *
- * @param txq
- *   Pointer to Tx queue structure.
- * @param mb
- *   Pointer to mbuf.
- *
- * @return
- *   Searched LKey on success, UINT32_MAX on no match.
- */
-static __rte_always_inline uint32_t
-mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf *mb)
-{
-	struct mlx5_mr_ctrl *mr_ctrl = &txq->mr_ctrl;
-	struct mlx5_txq_ctrl *txq_ctrl =
-			container_of(txq, struct mlx5_txq_ctrl, txq);
-
-	/* Take slower bottom-half on miss. */
-	return mlx5_mr_mb2mr(mr_ctrl, mb, &txq_ctrl->priv->mp_id);
-}
-
 /**
  * Ring TX queue doorbell and flush the update if requested.
  *
@@ -1370,7 +1348,7 @@ mlx5_tx_dseg_ptr(struct mlx5_txq_data *__rte_restrict txq,
 {
 	MLX5_ASSERT(len);
 	dseg->bcount = rte_cpu_to_be_32(len);
-	dseg->lkey = mlx5_tx_mb2mr(txq, loc->mbuf);
+	dseg->lkey = mlx5_mr_mb2mr(&txq->mr_ctrl, loc->mbuf);
 	dseg->pbuf = rte_cpu_to_be_64((uintptr_t)buf);
 }
 
@@ -1406,7 +1384,7 @@ mlx5_tx_dseg_iptr(struct mlx5_txq_data *__rte_restrict txq,
 	MLX5_ASSERT(len);
 	if (len > MLX5_DSEG_MIN_INLINE_SIZE) {
 		dseg->bcount = rte_cpu_to_be_32(len);
-		dseg->lkey = mlx5_tx_mb2mr(txq, loc->mbuf);
+		dseg->lkey = mlx5_mr_mb2mr(&txq->mr_ctrl, loc->mbuf);
 		dseg->pbuf = rte_cpu_to_be_64((uintptr_t)buf);
 
 		return;
diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c
index 943cb9c19e..03875632a8 100644
--- a/drivers/regex/mlx5/mlx5_regex_fastpath.c
+++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c
@@ -160,7 +160,7 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 	struct mlx5_klm klm;
 
 	klm.byte_count = rte_pktmbuf_data_len(op->mbuf);
-	klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, op->mbuf, 0);
+	klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, op->mbuf);
 	klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t);
 	__prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm);
 	qp_obj->db_pi = qp_obj->pi;
@@ -330,7 +330,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 				addr = rte_pktmbuf_mtod(mbuf, uintptr_t);
 				/* Build indirect mkey seg's KLM. */
 				mkey_klm->mkey = mlx5_mr_mb2mr(&qp->mr_ctrl,
-							       mbuf, 0);
+							       mbuf);
 				mkey_klm->address = rte_cpu_to_be_64(addr);
 				mkey_klm->byte_count = rte_cpu_to_be_32
 						(rte_pktmbuf_data_len(mbuf));
@@ -347,7 +347,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 			klm.byte_count = scatter_size;
 		} else {
 			/* The single mubf case. Build the KLM directly. */
-			klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf, 0);
+			klm.mkey = mlx5_mr_mb2mr(&qp->mr_ctrl, mbuf);
 			klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t);
 			klm.byte_count = rte_pktmbuf_data_len(mbuf);
 		}
-- 
2.25.1



* Re: [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation
  2021-11-03 10:17 [dpdk-dev] [PATCH 0/3] mlx5: fix performance degradation michaelba
                   ` (2 preceding siblings ...)
  2021-11-03 10:17 ` [dpdk-dev] [PATCH 3/3] common/mlx5: make MR managmant port-agnostic for MP michaelba
@ 2021-11-07 13:17 ` Thomas Monjalon
  3 siblings, 0 replies; 5+ messages in thread
From: Thomas Monjalon @ 2021-11-07 13:17 UTC (permalink / raw)
  To: Michael Baum; +Cc: dev, Matan Azrad, xuemingl

> Michael Baum (3):
>   common/mlx5: fix MR search non inline function
>   common/mlx5: fix redundant parameter in search MR function
>   common/mlx5: make MR managmant port-agnostic for MP

Applied, thanks.

Please take care to retest, because the rebase after the "shared Rx queue"
feature was not obvious.



