Memory region (MR) lookup by address inside mempool MRs
did not account for the upper bound of an MR.
For mempools covered by multiple MRs this could return
a wrong MR LKey, typically resulting in an unrecoverable
TxQ failure:

    mlx5_net: Cannot change Tx QP state to INIT Invalid argument

Corresponding message from /var/log/dpdk_mlx5_port_X_txq_Y_index_Z*:

    Unexpected CQE error syndrome 0x04 CQN = 128 SQN = 4848
        wqe_counter = 0 wq_ci = 9 cq_ci = 122

This is likely to happen with --legacy-mem and IOVA-as-PA,
because EAL intentionally maps pages at non-adjacent PA
to non-adjacent VA in this mode, and MLX5 PMD works with VA.
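
The failure mode can be illustrated with a minimal standalone sketch.
It is not the PMD code: struct demo_mr, the demo_* functions and the
addresses are hypothetical stand-ins chosen for illustration. With two
non-adjacent regions, a lookup that checks only the lower bound matches
the first region for an address that belongs to the second one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an MR; not the real struct mlx5_pmd_mr. */
struct demo_mr {
	uintptr_t addr; /* start of the region (VA) */
	size_t len;     /* length of the region in bytes */
	uint32_t lkey;  /* key returned by the lookup */
};

/* Lookup that checks only the lower bound (the behavior being fixed). */
uint32_t
demo_lookup_unbounded(const struct demo_mr *mrs, unsigned int n,
		      uintptr_t addr)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		if (mrs[i].addr <= addr)
			return mrs[i].lkey;
	return UINT32_MAX;
}

/* Lookup that checks the half-open range [start, end). */
uint32_t
demo_lookup_bounded(const struct demo_mr *mrs, unsigned int n,
		    uintptr_t addr)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		uintptr_t start = mrs[i].addr;
		uintptr_t end = start + mrs[i].len;

		if (start <= addr && addr < end)
			return mrs[i].lkey;
	}
	return UINT32_MAX;
}

/* Two non-adjacent regions, as may happen with non-contiguous VA. */
const struct demo_mr demo_mrs[] = {
	{ 0x1000, 0x1000, 0xA },
	{ 0x8000, 0x1000, 0xB },
};

/* 0x8800 lies only in the second region. */
void
demo_self_check(void)
{
	/* Lower-bound-only check stops at the first region: wrong key. */
	assert(demo_lookup_unbounded(demo_mrs, 2, 0x8800) == 0xA);
	/* Range check skips the first region and finds the second. */
	assert(demo_lookup_bounded(demo_mrs, 2, 0x8800) == 0xB);
}
```

In the sketch, UINT32_MAX is an arbitrary "not found" marker and does
not reflect how the PMD reports a lookup miss.
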
Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
Cc: stable@dpdk.org
Reported-by: Wang Yunjian <wangyunjian@huawei.com>
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
drivers/common/mlx5/mlx5_common_mr.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
index 1537b5d428..5f7e4f6734 100644
--- a/drivers/common/mlx5/mlx5_common_mr.c
+++ b/drivers/common/mlx5/mlx5_common_mr.c
@@ -1834,12 +1834,13 @@ mlx5_mempool_reg_addr2mr(struct mlx5_mempool_reg *mpr, uintptr_t addr,
 	for (i = 0; i < mpr->mrs_n; i++) {
 		const struct mlx5_pmd_mr *mr = &mpr->mrs[i].pmd_mr;
-		uintptr_t mr_addr = (uintptr_t)mr->addr;
+		uintptr_t mr_start = (uintptr_t)mr->addr;
+		uintptr_t mr_end = mr_start + mr->len;
-		if (mr_addr <= addr) {
+		if (mr_start <= addr && addr < mr_end) {
 			lkey = rte_cpu_to_be_32(mr->lkey);
-			entry->start = mr_addr;
-			entry->end = mr_addr + mr->len;
+			entry->start = mr_start;
+			entry->end = mr_end;
 			entry->lkey = lkey;
 			break;
 		}
--
2.25.1