* [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object
@ 2021-09-03 14:21 Raja Zidane
  2021-09-03 14:21 ` [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations Raja Zidane
                   ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Raja Zidane @ 2021-09-03 14:21 UTC
To: dev

The mlx5 PMDs for the compress and regex classes use an MMO WQE operated
by the GGA engine in BF devices.
Currently, all the MMO WQEs are managed by the SQ object.
Starting from BF3, the queue of the MMO WQEs should be connected to the
GGA engine using a new configuration, mmo, that will be supported only in
the QP object.
The FW introduced new capabilities to define whether the mmo
configuration should be configured for the GGA queue.

Replace all the GGA queue objects with QPs, and set the mmo configuration
according to the new FW capabilities.

Raja Zidane (5):
  common/mlx5: share DevX QP operations
  common/mlx5: update new MMO HCA capabilities
  common/mlx5: add MMO configuration for the DevX QP
  compress/mlx5: refactor queue HW object
  regex/mlx5: refactor HW queue objects

 drivers/common/mlx5/mlx5_common_devx.c   | 144 ++++++++++++++++++++
 drivers/common/mlx5/mlx5_common_devx.h   |  23 ++++
 drivers/common/mlx5/mlx5_devx_cmds.c     |  17 ++-
 drivers/common/mlx5/mlx5_devx_cmds.h     |  13 +-
 drivers/common/mlx5/mlx5_prm.h           |  47 ++++++-
 drivers/common/mlx5/version.map          |   3 +
 drivers/compress/mlx5/mlx5_compress.c    |  71 +++++-----
 drivers/crypto/mlx5/mlx5_crypto.c        |  96 ++++----------
 drivers/crypto/mlx5/mlx5_crypto.h        |   5 +-
 drivers/regex/mlx5/mlx5_regex.c          |   6 +-
 drivers/regex/mlx5/mlx5_regex.h          |  16 ++-
 drivers/regex/mlx5/mlx5_regex_control.c  |  64 +++++----
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 159 +++++++++++------------
 drivers/vdpa/mlx5/mlx5_vdpa.h            |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c      |  53 ++------
 15 files changed, 435 insertions(+), 287 deletions(-)

-- 
2.17.1

* [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations
  2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane
@ 2021-09-03 14:21 ` Raja Zidane
  2021-09-12 16:36   ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane
  2021-09-03 14:21 ` [dpdk-dev] [PATCH 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Raja Zidane @ 2021-09-03 14:21 UTC
To: dev

Currently drivers using QP (vDPA, crypto and compress, regex soon) manage
their memory, creation, modification and destruction of the QP, in almost
identical code.
Move QP memory management, creation and destruction to common.
Add common function to change QP state to RTS.
Add user_index attribute to QP creation.
It's for better code maintenance and reuse.

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++++++++
 drivers/common/mlx5/mlx5_common_devx.h |  23 ++++
 drivers/common/mlx5/mlx5_devx_cmds.c   |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/version.map        |   3 +
 drivers/crypto/mlx5/mlx5_crypto.c      |  96 ++++-------------
 drivers/crypto/mlx5/mlx5_crypto.h      |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h          |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c    |  53 +++------
 9 files changed, 209 insertions(+), 122 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c
index 22c8d356c4..825f84b183 100644
--- a/drivers/common/mlx5/mlx5_common_devx.c
+++ b/drivers/common/mlx5/mlx5_common_devx.c
@@ -271,6 +271,115 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n,
 	return -rte_errno;
 }
 
+/**
+ * Destroy DevX Queue Pair.
+ *
+ * @param[in] qp
+ *   DevX QP to destroy.
+ */
+void
+mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp)
+{
+	if (qp->qp)
+		claim_zero(mlx5_devx_cmd_destroy(qp->qp));
+	if (qp->umem_obj)
+		claim_zero(mlx5_os_umem_dereg(qp->umem_obj));
+	if (qp->umem_buf)
+		mlx5_free((void *)(uintptr_t)qp->umem_buf);
+}
+
+/**
+ * Create Queue Pair using DevX API.
+ *
+ * Get a pointer to partially initialized attributes structure, and updates the
+ * following fields:
+ *   wq_umem_id
+ *   wq_umem_offset
+ *   dbr_umem_valid
+ *   dbr_umem_id
+ *   dbr_address
+ *   log_page_size
+ * All other fields are updated by caller.
+ *
+ * @param[in] ctx
+ *   Context returned from mlx5 open_device() glue function.
+ * @param[in/out] qp_obj
+ *   Pointer to QP to create.
+ * @param[in] log_wqbb_n
+ *   Log of number of WQBBs in queue.
+ * @param[in] attr
+ *   Pointer to QP attributes structure.
+ * @param[in] socket
+ *   Socket to use for allocation.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n,
+		    struct mlx5_devx_qp_attr *attr, int socket)
+{
+	struct mlx5_devx_obj *qp = NULL;
+	struct mlx5dv_devx_umem *umem_obj = NULL;
+	void *umem_buf = NULL;
+	size_t alignment = MLX5_WQE_BUF_ALIGNMENT;
+	uint32_t umem_size, umem_dbrec;
+	uint16_t qp_size = 1 << log_wqbb_n;
+	int ret;
+
+	if (alignment == (size_t)-1) {
+		DRV_LOG(ERR, "Failed to get WQE buf alignment.");
+		rte_errno = ENOMEM;
+		return -rte_errno;
+	}
+	/* Allocate memory buffer for WQEs and doorbell record. */
+	umem_size = MLX5_WQE_SIZE * qp_size;
+	umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE);
+	umem_size += MLX5_DBR_SIZE;
+	umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size,
+			       alignment, socket);
+	if (!umem_buf) {
+		DRV_LOG(ERR, "Failed to allocate memory for QP.");
+		rte_errno = ENOMEM;
+		return -rte_errno;
+	}
+	/* Register allocated buffer in user space with DevX. */
+	umem_obj = mlx5_os_umem_reg(ctx, (void *)(uintptr_t)umem_buf, umem_size,
+				    IBV_ACCESS_LOCAL_WRITE);
+	if (!umem_obj) {
+		DRV_LOG(ERR, "Failed to register umem for QP.");
+		rte_errno = errno;
+		goto error;
+	}
+	/* Fill attributes for SQ object creation. */
+	attr->wq_umem_id = mlx5_os_get_umem_id(umem_obj);
+	attr->wq_umem_offset = 0;
+	attr->dbr_umem_valid = 1;
+	attr->dbr_umem_id = attr->wq_umem_id;
+	attr->dbr_address = umem_dbrec;
+	attr->log_page_size = MLX5_LOG_PAGE_SIZE;
+	/* Create send queue object with DevX. */
+	qp = mlx5_devx_cmd_create_qp(ctx, attr);
+	if (!qp) {
+		DRV_LOG(ERR, "Can't create DevX QP object.");
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	qp_obj->umem_buf = umem_buf;
+	qp_obj->umem_obj = umem_obj;
+	qp_obj->qp = qp;
+	qp_obj->db_rec = RTE_PTR_ADD(qp_obj->umem_buf, umem_dbrec);
+	return 0;
+error:
+	ret = rte_errno;
+	if (umem_obj)
+		claim_zero(mlx5_os_umem_dereg(umem_obj));
+	if (umem_buf)
+		mlx5_free((void *)(uintptr_t)umem_buf);
+	rte_errno = ret;
+	return -rte_errno;
+}
+
 /**
  * Destroy DevX Receive Queue.
  *
@@ -385,3 +494,38 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size,
 	return -rte_errno;
 }
 
+
+/**
+ * Change QP state to RTS.
+ *
+ * @param[in] qp
+ *   DevX QP to change.
+ * @param[in] remote_qp_id
+ *   The remote QP ID for MLX5_CMD_OP_INIT2RTR_QP operation.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id)
+{
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RST2INIT_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).",
+			rte_errno);
+		return -1;
+	}
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_INIT2RTR_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).",
+			rte_errno);
+		return -1;
+	}
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RTR2RTS_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).",
+			rte_errno);
+		return -1;
+	}
+	return 0;
+}
diff --git a/drivers/common/mlx5/mlx5_common_devx.h b/drivers/common/mlx5/mlx5_common_devx.h
index aad0184e5a..f699405f69 100644
--- a/drivers/common/mlx5/mlx5_common_devx.h
+++ b/drivers/common/mlx5/mlx5_common_devx.h
@@ -33,6 +33,18 @@ struct mlx5_devx_sq {
 	volatile uint32_t *db_rec; /* The SQ doorbell record. */
 };
 
+/* DevX Queue Pair structure. */
+struct mlx5_devx_qp {
+	struct mlx5_devx_obj *qp; /* The QP DevX object. */
+	void *umem_obj; /* The QP umem object. */
+	union {
+		void *umem_buf;
+		struct mlx5_wqe *wqes; /* The QP ring buffer. */
+		struct mlx5_aso_wqe *aso_wqes;
+	};
+	volatile uint32_t *db_rec; /* The QP doorbell record. */
+};
+
 /* DevX Receive Queue structure. */
 struct mlx5_devx_rq {
 	struct mlx5_devx_obj *rq; /* The RQ DevX object. */
@@ -59,6 +71,14 @@ int mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj,
 			uint16_t log_wqbb_n,
 			struct mlx5_devx_create_sq_attr *attr, int socket);
 
+__rte_internal
+void mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp);
+
+__rte_internal
+int mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj,
+			uint16_t log_wqbb_n,
+			struct mlx5_devx_qp_attr *attr, int socket);
+
 __rte_internal
 void mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq);
 
@@ -67,4 +87,7 @@ int mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj,
 			uint32_t wqe_size, uint16_t log_wqbb_n,
 			struct mlx5_devx_create_rq_attr *attr, int socket);
 
+__rte_internal
+int mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id);
+
 #endif /* RTE_PMD_MLX5_COMMON_DEVX_H_ */
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 56407cc332..ac554cca05 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -2021,6 +2021,7 @@ mlx5_devx_cmd_create_qp(void *ctx,
 	MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC);
 	MLX5_SET(qpc, qpc, pd, attr->pd);
 	MLX5_SET(qpc, qpc, ts_format, attr->ts_format);
+	MLX5_SET(qpc, qpc, user_index, attr->user_index);
 	if (attr->uar_index) {
 		MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED);
 		MLX5_SET(qpc, qpc, uar_page, attr->uar_index);
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index e576e30f24..c071629904 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -397,6 +397,7 @@ struct mlx5_devx_qp_attr {
 	uint64_t dbr_address;
 	uint32_t wq_umem_id;
 	uint64_t wq_umem_offset;
+	uint32_t user_index:24;
 };
 
 struct mlx5_devx_virtio_q_couners_attr {
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index e5cb6b7060..d3c5040aac 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -67,6 +67,9 @@ INTERNAL {
 
 	mlx5_devx_get_out_command_status;
 
+	mlx5_devx_qp2rts;
+	mlx5_devx_qp_create;
+	mlx5_devx_qp_destroy;
 	mlx5_devx_rq_create;
 	mlx5_devx_rq_destroy;
 	mlx5_devx_sq_create;
diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index b3d5200ca3..1d91dc5737 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -257,12 +257,7 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id)
 {
 	struct mlx5_crypto_qp *qp = dev->data->queue_pairs[qp_id];
 
-	if (qp->qp_obj != NULL)
-		claim_zero(mlx5_devx_cmd_destroy(qp->qp_obj));
-	if (qp->umem_obj != NULL)
-		claim_zero(mlx5_glue->devx_umem_dereg(qp->umem_obj));
-	if (qp->umem_buf != NULL)
-		rte_free(qp->umem_buf);
+	mlx5_devx_qp_destroy(&qp->qp_obj);
 	mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh);
 	mlx5_devx_cq_destroy(&qp->cq_obj);
 	rte_free(qp);
@@ -270,34 +265,6 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id)
 	return 0;
 }
 
-static int
-mlx5_crypto_qp2rts(struct mlx5_crypto_qp *qp)
-{
-	/*
-	 * In Order to configure self loopback, when calling these functions the
-	 * remote QP id that is used is the id of the same QP.
-	 */
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RST2INIT_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).",
-			rte_errno);
-		return -1;
-	}
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_INIT2RTR_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).",
-			rte_errno);
-		return -1;
-	}
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RTR2RTS_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).",
-			rte_errno);
-		return -1;
-	}
-	return 0;
-}
-
 static __rte_noinline uint32_t
 mlx5_crypto_get_block_size(struct rte_crypto_op *op)
 {
@@ -452,7 +419,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 		memcpy(klms, &umr->kseg[0], sizeof(*klms) * klm_n);
 	}
 	ds = 2 + klm_n;
-	cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds);
+	cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds);
 	cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) |
 							MLX5_OPCODE_RDMA_WRITE);
 	ds = RTE_ALIGN(ds, 4);
@@ -461,7 +428,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 	if (priv->max_rdmar_ds > ds) {
 		cseg += ds;
 		ds = priv->max_rdmar_ds - ds;
-		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds);
+		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds);
 		cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) |
 							       MLX5_OPCODE_NOP);
 		qp->db_pi += ds >> 2; /* Here, DS is 4 aligned for sure. */
@@ -503,7 +470,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops,
 		return 0;
 	do {
 		op = *ops++;
-		umr = RTE_PTR_ADD(qp->umem_buf, priv->wqe_set_size * qp->pi);
+		umr = RTE_PTR_ADD(qp->qp_obj.umem_buf, priv->wqe_set_size * qp->pi);
 		if (unlikely(mlx5_crypto_wqe_set(priv, qp, op, umr) == 0)) {
 			qp->stats.enqueue_err_count++;
 			if (remain != nb_ops) {
@@ -517,7 +484,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	} while (--remain);
 	qp->stats.enqueued_count += nb_ops;
 	rte_io_wmb();
-	qp->db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi);
+	qp->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi);
 	rte_wmb();
 	mlx5_crypto_uar_write(*(volatile uint64_t *)qp->wqe, qp->priv);
 	rte_wmb();
@@ -583,7 +550,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp)
 	uint32_t i;
 
 	for (i = 0 ; i < qp->entries_n; i++) {
-		struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->umem_buf, i *
+		struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->qp_obj.umem_buf, i *
 							 priv->wqe_set_size);
 		struct mlx5_wqe_umr_cseg *ucseg = (struct mlx5_wqe_umr_cseg *)
 								     (cseg + 1);
@@ -593,7 +560,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp)
 		struct mlx5_wqe_rseg *rseg;
 
 		/* Init UMR WQE. */
-		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) |
+		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) |
 					 (priv->umr_wqe_size / MLX5_WSEG_SIZE));
 		cseg->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR <<
 				       MLX5_COMP_MODE_OFFSET);
@@ -628,7 +595,7 @@ mlx5_crypto_indirect_mkeys_prepare(struct mlx5_crypto_priv *priv,
 		.klm_num = RTE_ALIGN(priv->max_segs_num, 4),
 	};
 
-	for (umr = (struct mlx5_umr_wqe *)qp->umem_buf, i = 0;
+	for (umr = (struct mlx5_umr_wqe *)qp->qp_obj.umem_buf, i = 0;
 	   i < qp->entries_n; i++, umr = RTE_PTR_ADD(umr, priv->wqe_set_size)) {
 		attr.klm_array = (struct mlx5_klm *)&umr->kseg[0];
 		qp->mkey[i] = mlx5_devx_cmd_mkey_create(priv->ctx, &attr);
@@ -649,9 +616,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id,
 	struct mlx5_devx_qp_attr attr = {0};
 	struct mlx5_crypto_qp *qp;
 	uint16_t log_nb_desc = rte_log2_u32(qp_conf->nb_descriptors);
-	uint32_t umem_size = RTE_BIT32(log_nb_desc) *
-			      priv->wqe_set_size +
-			      sizeof(*qp->db_rec) * 2;
+	uint32_t ret;
 	uint32_t alloc_size = sizeof(*qp);
 	struct mlx5_devx_cq_attr cq_attr = {
 		.uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar),
@@ -675,18 +640,14 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id,
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
-	qp->umem_buf = rte_zmalloc_socket(__func__, umem_size, 4096, socket_id);
-	if (qp->umem_buf == NULL) {
-		DRV_LOG(ERR, "Failed to allocate QP umem.");
-		rte_errno = ENOMEM;
-		goto error;
-	}
-	qp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx,
-					       (void *)(uintptr_t)qp->umem_buf,
-					       umem_size,
-					       IBV_ACCESS_LOCAL_WRITE);
-	if (qp->umem_obj == NULL) {
-		DRV_LOG(ERR, "Failed to register QP umem.");
+	attr.pd = priv->pdn;
+	attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar);
+	attr.cqn = qp->cq_obj.cq->id;
+	attr.rq_size = 0;
+	attr.sq_size = RTE_BIT32(log_nb_desc);
+	ret = mlx5_devx_qp_create(priv->ctx, &qp->qp_obj, log_nb_desc, &attr, socket_id);
+	if(ret) {
+		DRV_LOG(ERR, "Failed to create QP.");
 		goto error;
 	}
 	if (mlx5_mr_btree_init(&qp->mr_ctrl.cache_bh, MLX5_MR_BTREE_CACHE_N,
@@ -697,24 +658,11 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id,
 		goto error;
 	}
 	qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen;
-	attr.pd = priv->pdn;
-	attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar);
-	attr.cqn = qp->cq_obj.cq->id;
-	attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE));
-	attr.rq_size = 0;
-	attr.sq_size = RTE_BIT32(log_nb_desc);
-	attr.dbr_umem_valid = 1;
-	attr.wq_umem_id = qp->umem_obj->umem_id;
-	attr.wq_umem_offset = 0;
-	attr.dbr_umem_id = qp->umem_obj->umem_id;
-	attr.dbr_address = RTE_BIT64(log_nb_desc) * priv->wqe_set_size;
-	qp->qp_obj = mlx5_devx_cmd_create_qp(priv->ctx, &attr);
-	if (qp->qp_obj == NULL) {
-		DRV_LOG(ERR, "Failed to create QP(%u).", rte_errno);
-		goto error;
-	}
-	qp->db_rec = RTE_PTR_ADD(qp->umem_buf, (uintptr_t)attr.dbr_address);
-	if (mlx5_crypto_qp2rts(qp))
+	/*
+	 * In Order to configure self loopback, when calling devx qp2rts the
+	 * remote QP id that is used is the id of the same QP.
+	 */
+	if (mlx5_devx_qp2rts(&qp->qp_obj, qp->qp_obj.qp->id))
 		goto error;
 	qp->mkey = (struct mlx5_devx_obj **)RTE_ALIGN((uintptr_t)(qp + 1),
 							   RTE_CACHE_LINE_SIZE);
diff --git a/drivers/crypto/mlx5/mlx5_crypto.h b/drivers/crypto/mlx5/mlx5_crypto.h
index d49b0001f0..013eed30b5 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.h
+++ b/drivers/crypto/mlx5/mlx5_crypto.h
@@ -43,11 +43,8 @@ struct mlx5_crypto_priv {
 struct mlx5_crypto_qp {
 	struct mlx5_crypto_priv *priv;
 	struct mlx5_devx_cq cq_obj;
-	struct mlx5_devx_obj *qp_obj;
+	struct mlx5_devx_qp qp_obj;
 	struct rte_cryptodev_stats stats;
-	struct mlx5dv_devx_umem *umem_obj;
-	void *umem_buf;
-	volatile uint32_t *db_rec;
 	struct rte_crypto_op **ops;
 	struct mlx5_devx_obj **mkey; /* WQE's indirect mekys. */
 	struct mlx5_mr_ctrl mr_ctrl;
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 2a04e36607..a27f3fdadb 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -54,10 +54,7 @@ struct mlx5_vdpa_cq {
 struct mlx5_vdpa_event_qp {
 	struct mlx5_vdpa_cq cq;
 	struct mlx5_devx_obj *fw_qp;
-	struct mlx5_devx_obj *sw_qp;
-	struct mlx5dv_devx_umem *umem_obj;
-	void *umem_buf;
-	volatile uint32_t *db_rec;
+	struct mlx5_devx_qp sw_qp;
 };
 
 struct mlx5_vdpa_query_mr {
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
index 3541c652ce..b557c93dd4 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
@@ -179,7 +179,7 @@ mlx5_vdpa_cq_poll(struct mlx5_vdpa_cq *cq)
 		cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci);
 		rte_io_wmb();
 		/* Ring SW QP doorbell record. */
-		eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size);
+		eqp->sw_qp.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size);
 	}
 	return comp;
 }
@@ -531,12 +531,7 @@ mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv)
 void
 mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp)
 {
-	if (eqp->sw_qp)
-		claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp));
-	if (eqp->umem_obj)
-		claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj));
-	if (eqp->umem_buf)
-		rte_free(eqp->umem_buf);
+	mlx5_devx_qp_destroy(&eqp->sw_qp);
 	if (eqp->fw_qp)
 		claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp));
 	mlx5_vdpa_cq_destroy(&eqp->cq);
@@ -547,36 +542,36 @@ static int
 mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp)
 {
 	if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP,
-					  eqp->sw_qp->id)) {
+					  eqp->sw_qp.qp->id)) {
 		DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).",
 			rte_errno);
 		return -1;
 	}
-	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP,
+	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RST2INIT_QP,
 					  eqp->fw_qp->id)) {
 		DRV_LOG(ERR, "Failed to modify SW QP to INIT state(%u).",
 			rte_errno);
 		return -1;
 	}
 	if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP,
-					  eqp->sw_qp->id)) {
+					  eqp->sw_qp.qp->id)) {
 		DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).",
 			rte_errno);
 		return -1;
 	}
-	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP,
+	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_INIT2RTR_QP,
 					  eqp->fw_qp->id)) {
 		DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).",
 			rte_errno);
 		return -1;
 	}
 	if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP,
-					  eqp->sw_qp->id)) {
+					  eqp->sw_qp.qp->id)) {
 		DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).",
 			rte_errno);
 		return -1;
 	}
-	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP,
+	if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RTR2RTS_QP,
 					  eqp->fw_qp->id)) {
 		DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).",
 			rte_errno);
@@ -591,8 +586,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n,
 {
 	struct mlx5_devx_qp_attr attr = {0};
 	uint16_t log_desc_n = rte_log2_u32(desc_n);
-	uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE +
-						       sizeof(*eqp->db_rec) * 2;
+	uint32_t ret;
 
 	if (mlx5_vdpa_event_qp_global_prepare(priv))
 		return -1;
@@ -605,42 +599,21 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n,
 		DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno);
 		goto error;
 	}
-	eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096);
-	if (!eqp->umem_buf) {
-		DRV_LOG(ERR, "Failed to allocate memory for SW QP.");
-		rte_errno = ENOMEM;
-		goto error;
-	}
-	eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx,
-					       (void *)(uintptr_t)eqp->umem_buf,
-					       umem_size,
-					       IBV_ACCESS_LOCAL_WRITE);
-	if (!eqp->umem_obj) {
-		DRV_LOG(ERR, "Failed to register umem for SW QP.");
-		goto error;
-	}
 	attr.uar_index = priv->uar->page_id;
 	attr.cqn = eqp->cq.cq_obj.cq->id;
-	attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE));
-	attr.rq_size = 1 << log_desc_n;
+	attr.rq_size = RTE_BIT32(log_desc_n);
 	attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE);
 	attr.sq_size = 0; /* No need SQ. */
-	attr.dbr_umem_valid = 1;
-	attr.wq_umem_id = eqp->umem_obj->umem_id;
-	attr.wq_umem_offset = 0;
-	attr.dbr_umem_id = eqp->umem_obj->umem_id;
 	attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format);
-	attr.dbr_address = RTE_BIT64(log_desc_n) * MLX5_WSEG_SIZE;
-	eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr);
-	if (!eqp->sw_qp) {
+	ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, SOCKET_ID_ANY);
+	if (ret) {
 		DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno);
 		goto error;
 	}
-	eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address);
 	if (mlx5_vdpa_qps2rts(eqp))
 		goto error;
 	/* First ringing. */
-	rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]);
+	rte_write32(rte_cpu_to_be_32(RTE_BIT32(log_desc_n)), &eqp->sw_qp.db_rec[0]);
 	return 0;
 error:
 	mlx5_vdpa_event_qp_destroy(eqp);
-- 
2.17.1

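The umem sizing done by mlx5_devx_qp_create() in the patch above (WQE ring first, doorbell record placed at the aligned end of the ring) can be looked at in isolation. The following is only a minimal standalone sketch, not the driver code: WQE_SIZE and DBR_SIZE are assumed stand-ins for MLX5_WQE_SIZE and MLX5_DBR_SIZE with illustrative values, and ALIGN_UP, struct qp_umem_layout and qp_umem_layout() are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for MLX5_WQE_SIZE and MLX5_DBR_SIZE (illustrative values). */
#define WQE_SIZE 64u
#define DBR_SIZE 64u

/* Round v up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(v, a) (((v) + ((a) - 1u)) & ~((a) - 1u))

/* Hypothetical result type: where the doorbell record lands in the umem. */
struct qp_umem_layout {
	uint32_t dbrec_offset; /* Offset of the doorbell record in the umem. */
	uint32_t total_size;   /* WQE ring size plus one doorbell record. */
};

/* Hypothetical helper mirroring the size computation in the patch. */
static struct qp_umem_layout
qp_umem_layout(uint16_t log_wqbb_n)
{
	struct qp_umem_layout l;
	uint32_t ring_size;

	/* A 32-bit WQE count keeps log_wqbb_n == 16 from wrapping to 0. */
	assert(log_wqbb_n <= 16);
	ring_size = WQE_SIZE * (UINT32_C(1) << log_wqbb_n);
	/* The doorbell record sits right after the ring, DBR_SIZE aligned. */
	l.dbrec_offset = ALIGN_UP(ring_size, DBR_SIZE);
	l.total_size = ring_size + DBR_SIZE;
	return l;
}
```

With these assumed constants, a queue of 16 WQEs (log_wqbb_n = 4) gives a 1024-byte ring, a doorbell record at offset 1024 and a 1088-byte umem. The sketch keeps the WQE count in 32 bits; the patch stores it in a uint16_t, which would wrap to 0 for log_wqbb_n = 16.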
* [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object
  2021-09-03 14:21 ` [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations Raja Zidane
@ 2021-09-12 16:36   ` Raja Zidane
  2021-09-12 16:36     ` [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations Raja Zidane
                       ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Raja Zidane @ 2021-09-12 16:36 UTC
To: dev

The mlx5 PMDs for the compress and regex classes use an MMO WQE operated
by the GGA engine in BF devices.
Currently, all the MMO WQEs are managed by the SQ object.
Starting from BF3, the queue of the MMO WQEs should be connected to the
GGA engine using a new configuration, mmo, that will be supported only in
the QP object.
The FW introduced new capabilities to define whether the mmo
configuration should be configured for the GGA queue.

Replace all the GGA queue objects with QPs, and set the mmo configuration
according to the new FW capabilities.

V2: fix coding style issues.

Raja Zidane (5):
  common/mlx5: share DevX QP operations
  common/mlx5: update new MMO HCA capabilities
  common/mlx5: add MMO configuration for the DevX QP
  compress/mlx5: refactor queue HW object
  regex/mlx5: refactor HW queue objects

 drivers/common/mlx5/mlx5_common_devx.c   | 144 +++++++++++++++++++
 drivers/common/mlx5/mlx5_common_devx.h   |  23 +++
 drivers/common/mlx5/mlx5_devx_cmds.c     |  23 ++-
 drivers/common/mlx5/mlx5_devx_cmds.h     |  13 +-
 drivers/common/mlx5/mlx5_prm.h           |  48 ++++++-
 drivers/common/mlx5/version.map          |   3 +
 drivers/compress/mlx5/mlx5_compress.c    |  73 +++++-----
 drivers/crypto/mlx5/mlx5_crypto.c        | 100 ++++---------
 drivers/crypto/mlx5/mlx5_crypto.h        |   5 +-
 drivers/regex/mlx5/mlx5_regex.c          |   7 +-
 drivers/regex/mlx5/mlx5_regex.h          |  16 ++-
 drivers/regex/mlx5/mlx5_regex_control.c  |  65 +++++----
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++-----------
 drivers/vdpa/mlx5/mlx5_vdpa.h            |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c      |  59 +++-----
 15 files changed, 460 insertions(+), 294 deletions(-)

-- 
2.17.1

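Patch 1/5 of this series adds a shared helper, mlx5_devx_qp2rts(), that walks a QP through RST2INIT, INIT2RTR and RTR2RTS and stops at the first failing step; the crypto driver passes the QP's own id as the remote id to get a self loopback. A rough standalone sketch of that sequence follows. It assumes nothing about the real DevX calls: struct fake_qp, the qp_state and qp_op enums, fake_modify_qp_state() and fake_qp2rts() are all hypothetical stand-ins for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative QP states and transitions; these are not the PRM opcodes. */
enum qp_state { QP_RST, QP_INIT, QP_RTR, QP_RTS };
enum qp_op { OP_RST2INIT, OP_INIT2RTR, OP_RTR2RTS };

/* Hypothetical QP model tracking only what the sketch needs. */
struct fake_qp {
	uint32_t id;
	uint32_t remote_id;
	enum qp_state state;
};

/* Hypothetical stand-in for mlx5_devx_cmd_modify_qp_state(): 0 on success. */
static int
fake_modify_qp_state(struct fake_qp *qp, enum qp_op op, uint32_t remote_qp_id)
{
	static const enum qp_state from[] = { QP_RST, QP_INIT, QP_RTR };
	static const enum qp_state to[] = { QP_INIT, QP_RTR, QP_RTS };

	if (qp->state != from[op])
		return -1; /* Transition not allowed from the current state. */
	qp->state = to[op];
	qp->remote_id = remote_qp_id;
	return 0;
}

/* Walk RST -> INIT -> RTR -> RTS, stopping at the first failure. */
static int
fake_qp2rts(struct fake_qp *qp, uint32_t remote_qp_id)
{
	static const enum qp_op ops[] = { OP_RST2INIT, OP_INIT2RTR, OP_RTR2RTS };
	size_t i;

	for (i = 0; i < sizeof(ops) / sizeof(ops[0]); i++)
		if (fake_modify_qp_state(qp, ops[i], remote_qp_id))
			return -1;
	return 0;
}
```

Calling fake_qp2rts(&qp, qp.id) on a QP in the reset state models the self-loopback case; calling it again on a QP that is already in RTS fails at the first step, since RST2INIT is only valid from the reset state in this model.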
* [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations
  2021-09-12 16:36   ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane
@ 2021-09-12 16:36     ` Raja Zidane
  2021-09-15  0:04       ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane
  2021-09-12 16:36     ` [dpdk-dev] [PATCH V2 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: Raja Zidane @ 2021-09-12 16:36 UTC
To: dev

Currently drivers using QP (vDPA, crypto and compress, regex soon) manage
their memory, creation, modification and destruction of the QP, in almost
identical code.
Move QP memory management, creation and destruction to common.
Add common function to change QP state to RTS.
Add user_index attribute to QP creation.
It's for better code maintenance and reuse.

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++++++++
 drivers/common/mlx5/mlx5_common_devx.h |  23 ++++
 drivers/common/mlx5/mlx5_devx_cmds.c   |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |   1 +
 drivers/common/mlx5/version.map        |   3 +
 drivers/crypto/mlx5/mlx5_crypto.c      | 100 +++++------------
 drivers/crypto/mlx5/mlx5_crypto.h      |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h          |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c    |  59 +++-------
 9 files changed, 216 insertions(+), 125 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c
index 22c8d356c4..825f84b183 100644
--- a/drivers/common/mlx5/mlx5_common_devx.c
+++ b/drivers/common/mlx5/mlx5_common_devx.c
@@ -271,6 +271,115 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n,
 	return -rte_errno;
 }
 
+/**
+ * Destroy DevX Queue Pair.
+ *
+ * @param[in] qp
+ *   DevX QP to destroy.
+ */
+void
+mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp)
+{
+	if (qp->qp)
+		claim_zero(mlx5_devx_cmd_destroy(qp->qp));
+	if (qp->umem_obj)
+		claim_zero(mlx5_os_umem_dereg(qp->umem_obj));
+	if (qp->umem_buf)
+		mlx5_free((void *)(uintptr_t)qp->umem_buf);
+}
+
+/**
+ * Create Queue Pair using DevX API.
+ *
+ * Get a pointer to partially initialized attributes structure, and updates the
+ * following fields:
+ *   wq_umem_id
+ *   wq_umem_offset
+ *   dbr_umem_valid
+ *   dbr_umem_id
+ *   dbr_address
+ *   log_page_size
+ * All other fields are updated by caller.
+ *
+ * @param[in] ctx
+ *   Context returned from mlx5 open_device() glue function.
+ * @param[in/out] qp_obj
+ *   Pointer to QP to create.
+ * @param[in] log_wqbb_n
+ *   Log of number of WQBBs in queue.
+ * @param[in] attr
+ *   Pointer to QP attributes structure.
+ * @param[in] socket
+ *   Socket to use for allocation.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n,
+		    struct mlx5_devx_qp_attr *attr, int socket)
+{
+	struct mlx5_devx_obj *qp = NULL;
+	struct mlx5dv_devx_umem *umem_obj = NULL;
+	void *umem_buf = NULL;
+	size_t alignment = MLX5_WQE_BUF_ALIGNMENT;
+	uint32_t umem_size, umem_dbrec;
+	uint16_t qp_size = 1 << log_wqbb_n;
+	int ret;
+
+	if (alignment == (size_t)-1) {
+		DRV_LOG(ERR, "Failed to get WQE buf alignment.");
+		rte_errno = ENOMEM;
+		return -rte_errno;
+	}
+	/* Allocate memory buffer for WQEs and doorbell record. */
+	umem_size = MLX5_WQE_SIZE * qp_size;
+	umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE);
+	umem_size += MLX5_DBR_SIZE;
+	umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size,
+			       alignment, socket);
+	if (!umem_buf) {
+		DRV_LOG(ERR, "Failed to allocate memory for QP.");
+		rte_errno = ENOMEM;
+		return -rte_errno;
+	}
+	/* Register allocated buffer in user space with DevX. */
+	umem_obj = mlx5_os_umem_reg(ctx, (void *)(uintptr_t)umem_buf, umem_size,
+				    IBV_ACCESS_LOCAL_WRITE);
+	if (!umem_obj) {
+		DRV_LOG(ERR, "Failed to register umem for QP.");
+		rte_errno = errno;
+		goto error;
+	}
+	/* Fill attributes for SQ object creation. */
+	attr->wq_umem_id = mlx5_os_get_umem_id(umem_obj);
+	attr->wq_umem_offset = 0;
+	attr->dbr_umem_valid = 1;
+	attr->dbr_umem_id = attr->wq_umem_id;
+	attr->dbr_address = umem_dbrec;
+	attr->log_page_size = MLX5_LOG_PAGE_SIZE;
+	/* Create send queue object with DevX. */
+	qp = mlx5_devx_cmd_create_qp(ctx, attr);
+	if (!qp) {
+		DRV_LOG(ERR, "Can't create DevX QP object.");
+		rte_errno = ENOMEM;
+		goto error;
+	}
+	qp_obj->umem_buf = umem_buf;
+	qp_obj->umem_obj = umem_obj;
+	qp_obj->qp = qp;
+	qp_obj->db_rec = RTE_PTR_ADD(qp_obj->umem_buf, umem_dbrec);
+	return 0;
+error:
+	ret = rte_errno;
+	if (umem_obj)
+		claim_zero(mlx5_os_umem_dereg(umem_obj));
+	if (umem_buf)
+		mlx5_free((void *)(uintptr_t)umem_buf);
+	rte_errno = ret;
+	return -rte_errno;
+}
+
 /**
  * Destroy DevX Receive Queue.
  *
@@ -385,3 +494,38 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size,
 	return -rte_errno;
 }
 
+
+/**
+ * Change QP state to RTS.
+ *
+ * @param[in] qp
+ *   DevX QP to change.
+ * @param[in] remote_qp_id
+ *   The remote QP ID for MLX5_CMD_OP_INIT2RTR_QP operation.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id)
+{
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RST2INIT_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).",
+			rte_errno);
+		return -1;
+	}
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_INIT2RTR_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).",
+			rte_errno);
+		return -1;
+	}
+	if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RTR2RTS_QP,
+					  remote_qp_id)) {
+		DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).",
+			rte_errno);
+		return -1;
+	}
+	return 0;
+}
diff --git a/drivers/common/mlx5/mlx5_common_devx.h b/drivers/common/mlx5/mlx5_common_devx.h
index aad0184e5a..f699405f69 100644
--- a/drivers/common/mlx5/mlx5_common_devx.h
+++ b/drivers/common/mlx5/mlx5_common_devx.h
@@ -33,6 +33,18 @@ struct mlx5_devx_sq {
 	volatile uint32_t *db_rec; /* The SQ doorbell record. */
 };
 
+/* DevX Queue Pair structure. */
+struct mlx5_devx_qp {
+	struct mlx5_devx_obj *qp; /* The QP DevX object. */
+	void *umem_obj; /* The QP umem object. */
+	union {
+		void *umem_buf;
+		struct mlx5_wqe *wqes; /* The QP ring buffer. */
+		struct mlx5_aso_wqe *aso_wqes;
+	};
+	volatile uint32_t *db_rec; /* The QP doorbell record. */
+};
+
 /* DevX Receive Queue structure. */
 struct mlx5_devx_rq {
 	struct mlx5_devx_obj *rq; /* The RQ DevX object. */
@@ -59,6 +71,14 @@ int mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj,
 			uint16_t log_wqbb_n,
 			struct mlx5_devx_create_sq_attr *attr, int socket);
 
+__rte_internal
+void mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp);
+
+__rte_internal
+int mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj,
+			uint16_t log_wqbb_n,
+			struct mlx5_devx_qp_attr *attr, int socket);
+
 __rte_internal
 void mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq);
 
@@ -67,4 +87,7 @@ int mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj,
 			uint32_t wqe_size, uint16_t log_wqbb_n,
 			struct mlx5_devx_create_rq_attr *attr, int socket);
 
+__rte_internal
+int mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id);
+
 #endif /* RTE_PMD_MLX5_COMMON_DEVX_H_ */
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 56407cc332..ac554cca05 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -2021,6 +2021,7 @@ mlx5_devx_cmd_create_qp(void *ctx,
 	MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC);
 	MLX5_SET(qpc, qpc, pd, attr->pd);
 	MLX5_SET(qpc, qpc, ts_format, attr->ts_format);
+	MLX5_SET(qpc, qpc, user_index, attr->user_index);
 	if (attr->uar_index) {
 		MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED);
 		MLX5_SET(qpc, qpc, uar_page, attr->uar_index);
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index e576e30f24..c071629904 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -397,6 +397,7 @@ struct mlx5_devx_qp_attr {
 	uint64_t dbr_address;
 	uint32_t wq_umem_id;
 	uint64_t wq_umem_offset;
+	uint32_t user_index:24;
 };
 
 struct mlx5_devx_virtio_q_couners_attr {
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index e5cb6b7060..d3c5040aac 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -67,6 +67,9 @@ INTERNAL {
 
 	mlx5_devx_get_out_command_status;
 
+	mlx5_devx_qp2rts;
+	mlx5_devx_qp_create;
+	mlx5_devx_qp_destroy;
 	mlx5_devx_rq_create;
 	mlx5_devx_rq_destroy;
 	mlx5_devx_sq_create;
diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index b3d5200ca3..b2d386c4db 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -257,12 +257,7 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id)
 {
 	struct mlx5_crypto_qp *qp = dev->data->queue_pairs[qp_id];
 
-	if (qp->qp_obj != NULL)
-		claim_zero(mlx5_devx_cmd_destroy(qp->qp_obj));
-	if (qp->umem_obj != NULL)
-		claim_zero(mlx5_glue->devx_umem_dereg(qp->umem_obj));
-	if (qp->umem_buf != NULL)
-		rte_free(qp->umem_buf);
+	mlx5_devx_qp_destroy(&qp->qp_obj);
 	mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh);
 	mlx5_devx_cq_destroy(&qp->cq_obj);
 	rte_free(qp);
@@ -270,34 +265,6 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id)
 	return 0;
 }
 
-static int
-mlx5_crypto_qp2rts(struct mlx5_crypto_qp *qp)
-{
-	/*
-	 * In Order to configure self loopback, when calling these functions the
-	 * remote QP id that is used is the id of the same QP.
-	 */
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RST2INIT_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).",
-			rte_errno);
-		return -1;
-	}
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_INIT2RTR_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).",
-			rte_errno);
-		return -1;
-	}
-	if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RTR2RTS_QP,
-					  qp->qp_obj->id)) {
-		DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).",
-			rte_errno);
-		return -1;
-	}
-	return 0;
-}
-
 static __rte_noinline uint32_t
 mlx5_crypto_get_block_size(struct rte_crypto_op *op)
 {
@@ -452,7 +419,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 		memcpy(klms, &umr->kseg[0], sizeof(*klms) * klm_n);
 	}
 	ds = 2 + klm_n;
-	cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds);
+	cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds);
 	cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) |
 							MLX5_OPCODE_RDMA_WRITE);
 	ds = RTE_ALIGN(ds, 4);
@@ -461,7 +428,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv,
 	if (priv->max_rdmar_ds > ds) {
 		cseg += ds;
 		ds = priv->max_rdmar_ds - ds;
-		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds);
+		cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds);
 		cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) |
 							       MLX5_OPCODE_NOP);
 		qp->db_pi += ds >> 2; /* Here, DS is 4 aligned for sure.
*/ @@ -503,7 +470,8 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, return 0; do { op = *ops++; - umr = RTE_PTR_ADD(qp->umem_buf, priv->wqe_set_size * qp->pi); + umr = RTE_PTR_ADD(qp->qp_obj.umem_buf, + priv->wqe_set_size * qp->pi); if (unlikely(mlx5_crypto_wqe_set(priv, qp, op, umr) == 0)) { qp->stats.enqueue_err_count++; if (remain != nb_ops) { @@ -517,7 +485,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); + qp->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); rte_wmb(); mlx5_crypto_uar_write(*(volatile uint64_t *)qp->wqe, qp->priv); rte_wmb(); @@ -583,8 +551,8 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) uint32_t i; for (i = 0 ; i < qp->entries_n; i++) { - struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->umem_buf, i * - priv->wqe_set_size); + struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->qp_obj.umem_buf, + i * priv->wqe_set_size); struct mlx5_wqe_umr_cseg *ucseg = (struct mlx5_wqe_umr_cseg *) (cseg + 1); struct mlx5_wqe_umr_bsf_seg *bsf = @@ -593,7 +561,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) struct mlx5_wqe_rseg *rseg; /* Init UMR WQE. 
*/ - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | (priv->umr_wqe_size / MLX5_WSEG_SIZE)); cseg->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); @@ -628,7 +596,7 @@ mlx5_crypto_indirect_mkeys_prepare(struct mlx5_crypto_priv *priv, .klm_num = RTE_ALIGN(priv->max_segs_num, 4), }; - for (umr = (struct mlx5_umr_wqe *)qp->umem_buf, i = 0; + for (umr = (struct mlx5_umr_wqe *)qp->qp_obj.umem_buf, i = 0; i < qp->entries_n; i++, umr = RTE_PTR_ADD(umr, priv->wqe_set_size)) { attr.klm_array = (struct mlx5_klm *)&umr->kseg[0]; qp->mkey[i] = mlx5_devx_cmd_mkey_create(priv->ctx, &attr); @@ -649,9 +617,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, struct mlx5_devx_qp_attr attr = {0}; struct mlx5_crypto_qp *qp; uint16_t log_nb_desc = rte_log2_u32(qp_conf->nb_descriptors); - uint32_t umem_size = RTE_BIT32(log_nb_desc) * - priv->wqe_set_size + - sizeof(*qp->db_rec) * 2; + uint32_t ret; uint32_t alloc_size = sizeof(*qp); struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), @@ -675,18 +641,15 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto error; } - qp->umem_buf = rte_zmalloc_socket(__func__, umem_size, 4096, socket_id); - if (qp->umem_buf == NULL) { - DRV_LOG(ERR, "Failed to allocate QP umem."); - rte_errno = ENOMEM; - goto error; - } - qp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)qp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (qp->umem_obj == NULL) { - DRV_LOG(ERR, "Failed to register QP umem."); + attr.pd = priv->pdn; + attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); + attr.cqn = qp->cq_obj.cq->id; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(log_nb_desc); + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp_obj, log_nb_desc, &attr, + socket_id); + if (ret) { + DRV_LOG(ERR, "Failed to create QP."); goto error; 
} if (mlx5_mr_btree_init(&qp->mr_ctrl.cache_bh, MLX5_MR_BTREE_CACHE_N, @@ -697,24 +660,11 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, goto error; } qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; - attr.pd = priv->pdn; - attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); - attr.cqn = qp->cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 0; - attr.sq_size = RTE_BIT32(log_nb_desc); - attr.dbr_umem_valid = 1; - attr.wq_umem_id = qp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = qp->umem_obj->umem_id; - attr.dbr_address = RTE_BIT64(log_nb_desc) * priv->wqe_set_size; - qp->qp_obj = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (qp->qp_obj == NULL) { - DRV_LOG(ERR, "Failed to create QP(%u).", rte_errno); - goto error; - } - qp->db_rec = RTE_PTR_ADD(qp->umem_buf, (uintptr_t)attr.dbr_address); - if (mlx5_crypto_qp2rts(qp)) + /* + * In Order to configure self loopback, when calling devx qp2rts the + * remote QP id that is used is the id of the same QP. + */ + if (mlx5_devx_qp2rts(&qp->qp_obj, qp->qp_obj.qp->id)) goto error; qp->mkey = (struct mlx5_devx_obj **)RTE_ALIGN((uintptr_t)(qp + 1), RTE_CACHE_LINE_SIZE); diff --git a/drivers/crypto/mlx5/mlx5_crypto.h b/drivers/crypto/mlx5/mlx5_crypto.h index d49b0001f0..013eed30b5 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.h +++ b/drivers/crypto/mlx5/mlx5_crypto.h @@ -43,11 +43,8 @@ struct mlx5_crypto_priv { struct mlx5_crypto_qp { struct mlx5_crypto_priv *priv; struct mlx5_devx_cq cq_obj; - struct mlx5_devx_obj *qp_obj; + struct mlx5_devx_qp qp_obj; struct rte_cryptodev_stats stats; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; struct rte_crypto_op **ops; struct mlx5_devx_obj **mkey; /* WQE's indirect mekys. 
*/ struct mlx5_mr_ctrl mr_ctrl; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 2a04e36607..a27f3fdadb 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -54,10 +54,7 @@ struct mlx5_vdpa_cq { struct mlx5_vdpa_event_qp { struct mlx5_vdpa_cq cq; struct mlx5_devx_obj *fw_qp; - struct mlx5_devx_obj *sw_qp; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; + struct mlx5_devx_qp sw_qp; }; struct mlx5_vdpa_query_mr { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c index 3541c652ce..bb6722839a 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -179,7 +179,7 @@ mlx5_vdpa_cq_poll(struct mlx5_vdpa_cq *cq) cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); rte_io_wmb(); /* Ring SW QP doorbell record. */ - eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); + eqp->sw_qp.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); } return comp; } @@ -531,12 +531,7 @@ mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) { - if (eqp->sw_qp) - claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); - if (eqp->umem_obj) - claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); - if (eqp->umem_buf) - rte_free(eqp->umem_buf); + mlx5_devx_qp_destroy(&eqp->sw_qp); if (eqp->fw_qp) claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); mlx5_vdpa_cq_destroy(&eqp->cq); @@ -547,36 +542,36 @@ static int mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) { if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_RST2INIT_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP 
to INIT state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_INIT2RTR_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RTR2RTS_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", rte_errno); @@ -591,8 +586,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, { struct mlx5_devx_qp_attr attr = {0}; uint16_t log_desc_n = rte_log2_u32(desc_n); - uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + - sizeof(*eqp->db_rec) * 2; + uint32_t ret; if (mlx5_vdpa_event_qp_global_prepare(priv)) return -1; @@ -605,42 +599,23 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); goto error; } - eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); - if (!eqp->umem_buf) { - DRV_LOG(ERR, "Failed to allocate memory for SW QP."); - rte_errno = ENOMEM; - goto error; - } - eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)eqp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (!eqp->umem_obj) { - DRV_LOG(ERR, "Failed to register umem for SW QP."); - goto error; - } attr.uar_index = priv->uar->page_id; attr.cqn = eqp->cq.cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 
1 << log_desc_n; + attr.rq_size = RTE_BIT32(log_desc_n); attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); attr.sq_size = 0; /* No need SQ. */ - attr.dbr_umem_valid = 1; - attr.wq_umem_id = eqp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = eqp->umem_obj->umem_id; attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_desc_n) * MLX5_WSEG_SIZE; - eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (!eqp->sw_qp) { + ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, + SOCKET_ID_ANY); + if (ret) { DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); goto error; } - eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); if (mlx5_vdpa_qps2rts(eqp)) goto error; /* First ringing. */ - rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + rte_write32(rte_cpu_to_be_32(RTE_BIT32(log_desc_n)), + &eqp->sw_qp.db_rec[0]); return 0; error: mlx5_vdpa_event_qp_destroy(eqp); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
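Note for readers of the archive: the crypto and vDPA code removed in the patch above sized a single registered buffer as a ring of entries followed by two 32-bit doorbell words, and pointed dbr_address at the first byte after the ring. The following is a minimal sketch of that arithmetic only, not code from the patch. The `sketch_` names are hypothetical, and the entry size is passed in as a parameter because each driver used its own stride (priv->wqe_set_size in crypto, MLX5_WSEG_SIZE in vDPA).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; it does not exist in DPDK. */
struct sketch_qp_umem_layout {
	uint64_t dbr_offset; /* Offset of the doorbell record inside the umem. */
	uint64_t umem_size;  /* Total number of bytes to allocate and register. */
};

/*
 * Mirror the per-driver computation that the patch removes:
 * a ring of 2^log_n entries of entry_size bytes each, starting at
 * offset 0, followed directly by two 32-bit doorbell words.
 */
static struct sketch_qp_umem_layout
sketch_qp_umem_layout_compute(uint16_t log_n, uint32_t entry_size)
{
	struct sketch_qp_umem_layout layout;

	/* The doorbell record sits right after the last ring entry. */
	layout.dbr_offset = (UINT64_C(1) << log_n) * entry_size;
	/* Two doorbell words, as in "sizeof(*db_rec) * 2" in the old code. */
	layout.umem_size = layout.dbr_offset + 2 * sizeof(uint32_t);
	return layout;
}
```

Calling sketch_qp_umem_layout_compute(4, 16) describes a 16-entry ring with a 16-byte stride; the stride value here is only a sample. The common helper introduced by this series computes its own layout from its own constants, so this sketch should be read as a description of the removed code rather than of the new helper.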
* [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-09-15 0:04 ` Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 1/5] common/mlx5: share DevX QP operations Raja Zidane ` (5 more replies) 0 siblings, 6 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:04 UTC (permalink / raw) To: dev The mlx5 PMDs for the compress and regex classes use an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, mmo, that is supported only in the QP object. The FW introduces new capabilities that indicate whether the mmo configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the mmo configuration according to the new FW capabilities. V2: fix checkpatch errors. V3: rebase. Raja Zidane (5): common/mlx5: share DevX QP operations common/mlx5: update new MMO HCA capabilities common/mlx5: add MMO configuration for the DevX QP compress/mlx5: refactor queue HW object regex/mlx5: refactor HW queue objects drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++ drivers/common/mlx5/mlx5_common_devx.h | 23 +++ drivers/common/mlx5/mlx5_devx_cmds.c | 23 ++- drivers/common/mlx5/mlx5_devx_cmds.h | 13 +- drivers/common/mlx5/mlx5_prm.h | 48 ++++++- drivers/common/mlx5/version.map | 3 + drivers/compress/mlx5/mlx5_compress.c | 73 +++++----- drivers/crypto/mlx5/mlx5_crypto.c | 102 ++++---------- drivers/crypto/mlx5/mlx5_crypto.h | 5 +- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- drivers/vdpa/mlx5/mlx5_vdpa.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 59 +++----- 15 files changed, 461 insertions(+), 295 
deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V3 1/5] common/mlx5: share DevX QP operations 2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane @ 2021-09-15 0:05 ` Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane ` (4 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:05 UTC (permalink / raw) To: dev Currently, the drivers that use a QP (vDPA, crypto and compress, with regex to follow) each manage the QP memory, creation, modification and destruction in almost identical code. Move QP memory management, creation and destruction to common. Add a common function to change the QP state to RTS. Add the user_index attribute to QP creation. This improves code maintenance and reuse. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++++++++ drivers/common/mlx5/mlx5_common_devx.h | 23 ++++ drivers/common/mlx5/mlx5_devx_cmds.c | 1 + drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/version.map | 3 + drivers/crypto/mlx5/mlx5_crypto.c | 102 +++++------------- drivers/crypto/mlx5/mlx5_crypto.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 59 +++------- 9 files changed, 217 insertions(+), 126 deletions(-) diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c index 22c8d356c4..825f84b183 100644 --- a/drivers/common/mlx5/mlx5_common_devx.c +++ b/drivers/common/mlx5/mlx5_common_devx.c @@ -271,6 +271,115 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, return -rte_errno; } +/** + * Destroy DevX Queue Pair. + * + * @param[in] qp + * DevX QP to destroy. 
+ */ +void +mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp) +{ + if (qp->qp) + claim_zero(mlx5_devx_cmd_destroy(qp->qp)); + if (qp->umem_obj) + claim_zero(mlx5_os_umem_dereg(qp->umem_obj)); + if (qp->umem_buf) + mlx5_free((void *)(uintptr_t)qp->umem_buf); +} + +/** + * Create Queue Pair using DevX API. + * + * Get a pointer to partially initialized attributes structure, and updates the + * following fields: + * wq_umem_id + * wq_umem_offset + * dbr_umem_valid + * dbr_umem_id + * dbr_address + * log_page_size + * All other fields are updated by caller. + * + * @param[in] ctx + * Context returned from mlx5 open_device() glue function. + * @param[in/out] qp_obj + * Pointer to QP to create. + * @param[in] log_wqbb_n + * Log of number of WQBBs in queue. + * @param[in] attr + * Pointer to QP attributes structure. + * @param[in] socket + * Socket to use for allocation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +int +mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket) +{ + struct mlx5_devx_obj *qp = NULL; + struct mlx5dv_devx_umem *umem_obj = NULL; + void *umem_buf = NULL; + size_t alignment = MLX5_WQE_BUF_ALIGNMENT; + uint32_t umem_size, umem_dbrec; + uint16_t qp_size = 1 << log_wqbb_n; + int ret; + + if (alignment == (size_t)-1) { + DRV_LOG(ERR, "Failed to get WQE buf alignment."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Allocate memory buffer for WQEs and doorbell record. */ + umem_size = MLX5_WQE_SIZE * qp_size; + umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE); + umem_size += MLX5_DBR_SIZE; + umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size, + alignment, socket); + if (!umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for QP."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Register allocated buffer in user space with DevX. 
*/ + umem_obj = mlx5_os_umem_reg(ctx, (void *)(uintptr_t)umem_buf, umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!umem_obj) { + DRV_LOG(ERR, "Failed to register umem for QP."); + rte_errno = errno; + goto error; + } + /* Fill attributes for SQ object creation. */ + attr->wq_umem_id = mlx5_os_get_umem_id(umem_obj); + attr->wq_umem_offset = 0; + attr->dbr_umem_valid = 1; + attr->dbr_umem_id = attr->wq_umem_id; + attr->dbr_address = umem_dbrec; + attr->log_page_size = MLX5_LOG_PAGE_SIZE; + /* Create send queue object with DevX. */ + qp = mlx5_devx_cmd_create_qp(ctx, attr); + if (!qp) { + DRV_LOG(ERR, "Can't create DevX QP object."); + rte_errno = ENOMEM; + goto error; + } + qp_obj->umem_buf = umem_buf; + qp_obj->umem_obj = umem_obj; + qp_obj->qp = qp; + qp_obj->db_rec = RTE_PTR_ADD(qp_obj->umem_buf, umem_dbrec); + return 0; +error: + ret = rte_errno; + if (umem_obj) + claim_zero(mlx5_os_umem_dereg(umem_obj)); + if (umem_buf) + mlx5_free((void *)(uintptr_t)umem_buf); + rte_errno = ret; + return -rte_errno; +} + /** * Destroy DevX Receive Queue. * @@ -385,3 +494,38 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, return -rte_errno; } + +/** + * Change QP state to RTS. + * + * @param[in] qp + * DevX QP to change. + * @param[in] remote_qp_id + * The remote QP ID for MLX5_CMD_OP_INIT2RTR_QP operation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int +mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id) +{ + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RST2INIT_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_INIT2RTR_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RTR2RTS_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", + rte_errno); + return -1; + } + return 0; +} diff --git a/drivers/common/mlx5/mlx5_common_devx.h b/drivers/common/mlx5/mlx5_common_devx.h index aad0184e5a..f699405f69 100644 --- a/drivers/common/mlx5/mlx5_common_devx.h +++ b/drivers/common/mlx5/mlx5_common_devx.h @@ -33,6 +33,18 @@ struct mlx5_devx_sq { volatile uint32_t *db_rec; /* The SQ doorbell record. */ }; +/* DevX Queue Pair structure. */ +struct mlx5_devx_qp { + struct mlx5_devx_obj *qp; /* The QP DevX object. */ + void *umem_obj; /* The QP umem object. */ + union { + void *umem_buf; + struct mlx5_wqe *wqes; /* The QP ring buffer. */ + struct mlx5_aso_wqe *aso_wqes; + }; + volatile uint32_t *db_rec; /* The QP doorbell record. */ +}; + /* DevX Receive Queue structure. */ struct mlx5_devx_rq { struct mlx5_devx_obj *rq; /* The RQ DevX object. 
*/ @@ -59,6 +71,14 @@ int mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, struct mlx5_devx_create_sq_attr *attr, int socket); +__rte_internal +void mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp); + +__rte_internal +int mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, + uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket); + __rte_internal void mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq); @@ -67,4 +87,7 @@ int mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, uint16_t log_wqbb_n, struct mlx5_devx_create_rq_attr *attr, int socket); +__rte_internal +int mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id); + #endif /* RTE_PMD_MLX5_COMMON_DEVX_H_ */ diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 56407cc332..ac554cca05 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2021,6 +2021,7 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC); MLX5_SET(qpc, qpc, pd, attr->pd); MLX5_SET(qpc, qpc, ts_format, attr->ts_format); + MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index e576e30f24..c071629904 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -397,6 +397,7 @@ struct mlx5_devx_qp_attr { uint64_t dbr_address; uint32_t wq_umem_id; uint64_t wq_umem_offset; + uint32_t user_index:24; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index e5cb6b7060..d3c5040aac 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -67,6 +67,9 @@ INTERNAL { mlx5_devx_get_out_command_status; + mlx5_devx_qp2rts; + 
mlx5_devx_qp_create; + mlx5_devx_qp_destroy; mlx5_devx_rq_create; mlx5_devx_rq_destroy; mlx5_devx_sq_create; diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c index e01be15ade..23b1569039 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.c +++ b/drivers/crypto/mlx5/mlx5_crypto.c @@ -257,12 +257,7 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id) { struct mlx5_crypto_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->qp_obj != NULL) - claim_zero(mlx5_devx_cmd_destroy(qp->qp_obj)); - if (qp->umem_obj != NULL) - claim_zero(mlx5_glue->devx_umem_dereg(qp->umem_obj)); - if (qp->umem_buf != NULL) - rte_free(qp->umem_buf); + mlx5_devx_qp_destroy(&qp->qp_obj); mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); mlx5_devx_cq_destroy(&qp->cq_obj); rte_free(qp); @@ -270,34 +265,6 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id) return 0; } -static int -mlx5_crypto_qp2rts(struct mlx5_crypto_qp *qp) -{ - /* - * In Order to configure self loopback, when calling these functions the - * remote QP id that is used is the id of the same QP. 
- */ - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RST2INIT_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_INIT2RTR_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RTR2RTS_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", - rte_errno); - return -1; - } - return 0; -} - static __rte_noinline uint32_t mlx5_crypto_get_block_size(struct rte_crypto_op *op) { @@ -452,7 +419,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, memcpy(klms, &umr->kseg[0], sizeof(*klms) * klm_n); } ds = 2 + klm_n; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_RDMA_WRITE); ds = RTE_ALIGN(ds, 4); @@ -461,7 +428,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, if (priv->max_rdmar_ds > ds) { cseg += ds; ds = priv->max_rdmar_ds - ds; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_NOP); qp->db_pi += ds >> 2; /* Here, DS is 4 aligned for sure. 
*/ @@ -503,7 +470,8 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, return 0; do { op = *ops++; - umr = RTE_PTR_ADD(qp->umem_buf, priv->wqe_set_size * qp->pi); + umr = RTE_PTR_ADD(qp->qp_obj.umem_buf, + priv->wqe_set_size * qp->pi); if (unlikely(mlx5_crypto_wqe_set(priv, qp, op, umr) == 0)) { qp->stats.enqueue_err_count++; if (remain != nb_ops) { @@ -517,7 +485,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); + qp->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); rte_wmb(); mlx5_crypto_uar_write(*(volatile uint64_t *)qp->wqe, qp->priv); rte_wmb(); @@ -583,8 +551,8 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) uint32_t i; for (i = 0 ; i < qp->entries_n; i++) { - struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->umem_buf, i * - priv->wqe_set_size); + struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->qp_obj.umem_buf, + i * priv->wqe_set_size); struct mlx5_wqe_umr_cseg *ucseg = (struct mlx5_wqe_umr_cseg *) (cseg + 1); struct mlx5_wqe_umr_bsf_seg *bsf = @@ -593,7 +561,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) struct mlx5_wqe_rseg *rseg; /* Init UMR WQE. 
*/ - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | (priv->umr_wqe_size / MLX5_WSEG_SIZE)); cseg->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); @@ -628,7 +596,7 @@ mlx5_crypto_indirect_mkeys_prepare(struct mlx5_crypto_priv *priv, .klm_num = RTE_ALIGN(priv->max_segs_num, 4), }; - for (umr = (struct mlx5_umr_wqe *)qp->umem_buf, i = 0; + for (umr = (struct mlx5_umr_wqe *)qp->qp_obj.umem_buf, i = 0; i < qp->entries_n; i++, umr = RTE_PTR_ADD(umr, priv->wqe_set_size)) { attr.klm_array = (struct mlx5_klm *)&umr->kseg[0]; qp->mkey[i] = mlx5_devx_cmd_mkey_create(priv->ctx, &attr); @@ -649,9 +617,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, struct mlx5_devx_qp_attr attr = {0}; struct mlx5_crypto_qp *qp; uint16_t log_nb_desc = rte_log2_u32(qp_conf->nb_descriptors); - uint32_t umem_size = RTE_BIT32(log_nb_desc) * - priv->wqe_set_size + - sizeof(*qp->db_rec) * 2; + uint32_t ret; uint32_t alloc_size = sizeof(*qp); struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), @@ -675,18 +641,16 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto error; } - qp->umem_buf = rte_zmalloc_socket(__func__, umem_size, 4096, socket_id); - if (qp->umem_buf == NULL) { - DRV_LOG(ERR, "Failed to allocate QP umem."); - rte_errno = ENOMEM; - goto error; - } - qp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)qp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (qp->umem_obj == NULL) { - DRV_LOG(ERR, "Failed to register QP umem."); + attr.pd = priv->pdn; + attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); + attr.cqn = qp->cq_obj.cq->id; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(log_nb_desc); + attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp_obj, log_nb_desc, &attr, + socket_id); + if 
(ret) { + DRV_LOG(ERR, "Failed to create QP."); goto error; } if (mlx5_mr_btree_init(&qp->mr_ctrl.cache_bh, MLX5_MR_BTREE_CACHE_N, @@ -697,25 +661,11 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, goto error; } qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; - attr.pd = priv->pdn; - attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); - attr.cqn = qp->cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 0; - attr.sq_size = RTE_BIT32(log_nb_desc); - attr.dbr_umem_valid = 1; - attr.wq_umem_id = qp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = qp->umem_obj->umem_id; - attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_nb_desc) * priv->wqe_set_size; - qp->qp_obj = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (qp->qp_obj == NULL) { - DRV_LOG(ERR, "Failed to create QP(%u).", rte_errno); - goto error; - } - qp->db_rec = RTE_PTR_ADD(qp->umem_buf, (uintptr_t)attr.dbr_address); - if (mlx5_crypto_qp2rts(qp)) + /* + * In Order to configure self loopback, when calling devx qp2rts the + * remote QP id that is used is the id of the same QP. + */ + if (mlx5_devx_qp2rts(&qp->qp_obj, qp->qp_obj.qp->id)) goto error; qp->mkey = (struct mlx5_devx_obj **)RTE_ALIGN((uintptr_t)(qp + 1), RTE_CACHE_LINE_SIZE); diff --git a/drivers/crypto/mlx5/mlx5_crypto.h b/drivers/crypto/mlx5/mlx5_crypto.h index d589e0ac3d..4d7e6d2d10 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.h +++ b/drivers/crypto/mlx5/mlx5_crypto.h @@ -44,11 +44,8 @@ struct mlx5_crypto_priv { struct mlx5_crypto_qp { struct mlx5_crypto_priv *priv; struct mlx5_devx_cq cq_obj; - struct mlx5_devx_obj *qp_obj; + struct mlx5_devx_qp qp_obj; struct rte_cryptodev_stats stats; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; struct rte_crypto_op **ops; struct mlx5_devx_obj **mkey; /* WQE's indirect mekys. 
*/ struct mlx5_mr_ctrl mr_ctrl; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 2a04e36607..a27f3fdadb 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -54,10 +54,7 @@ struct mlx5_vdpa_cq { struct mlx5_vdpa_event_qp { struct mlx5_vdpa_cq cq; struct mlx5_devx_obj *fw_qp; - struct mlx5_devx_obj *sw_qp; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; + struct mlx5_devx_qp sw_qp; }; struct mlx5_vdpa_query_mr { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c index 3541c652ce..bb6722839a 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -179,7 +179,7 @@ mlx5_vdpa_cq_poll(struct mlx5_vdpa_cq *cq) cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); rte_io_wmb(); /* Ring SW QP doorbell record. */ - eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); + eqp->sw_qp.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); } return comp; } @@ -531,12 +531,7 @@ mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) { - if (eqp->sw_qp) - claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); - if (eqp->umem_obj) - claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); - if (eqp->umem_buf) - rte_free(eqp->umem_buf); + mlx5_devx_qp_destroy(&eqp->sw_qp); if (eqp->fw_qp) claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); mlx5_vdpa_cq_destroy(&eqp->cq); @@ -547,36 +542,36 @@ static int mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) { if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_RST2INIT_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP 
to INIT state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_INIT2RTR_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RTR2RTS_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", rte_errno); @@ -591,8 +586,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, { struct mlx5_devx_qp_attr attr = {0}; uint16_t log_desc_n = rte_log2_u32(desc_n); - uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + - sizeof(*eqp->db_rec) * 2; + uint32_t ret; if (mlx5_vdpa_event_qp_global_prepare(priv)) return -1; @@ -605,42 +599,23 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); goto error; } - eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); - if (!eqp->umem_buf) { - DRV_LOG(ERR, "Failed to allocate memory for SW QP."); - rte_errno = ENOMEM; - goto error; - } - eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)eqp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (!eqp->umem_obj) { - DRV_LOG(ERR, "Failed to register umem for SW QP."); - goto error; - } attr.uar_index = priv->uar->page_id; attr.cqn = eqp->cq.cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 
1 << log_desc_n; + attr.rq_size = RTE_BIT32(log_desc_n); attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); attr.sq_size = 0; /* No need SQ. */ - attr.dbr_umem_valid = 1; - attr.wq_umem_id = eqp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = eqp->umem_obj->umem_id; attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_desc_n) * MLX5_WSEG_SIZE; - eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (!eqp->sw_qp) { + ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, + SOCKET_ID_ANY); + if (ret) { DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); goto error; } - eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); if (mlx5_vdpa_qps2rts(eqp)) goto error; /* First ringing. */ - rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + rte_write32(rte_cpu_to_be_32(RTE_BIT32(log_desc_n)), + &eqp->sw_qp.db_rec[0]); return 0; error: mlx5_vdpa_event_qp_destroy(eqp); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
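A note on what patch 1/5 takes out of the vDPA driver: the removed lines sized a single umem buffer holding the receive work queue followed by the doorbell record, and pointed db_rec at the first byte after the queue. That bookkeeping now sits behind mlx5_devx_qp_create(). Below is a minimal sketch of the arithmetic the removed lines performed. It assumes a 16-byte segment size as a stand-in for MLX5_WSEG_SIZE, and the sketch_* names are hypothetical, not part of the patch or of the common helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for MLX5_WSEG_SIZE; check mlx5_prm.h for the real value. */
#define SKETCH_WSEG_SIZE 16u

/* Hypothetical helper type: where things land in the single umem buffer. */
struct sketch_qp_layout {
	uint64_t dbr_offset; /* Doorbell record starts right after the WQ. */
	uint32_t umem_size;  /* WQ bytes plus two 32-bit doorbell words. */
};

/* Mirrors the removed umem_size and dbr_address computations. */
static struct sketch_qp_layout
sketch_qp_layout_calc(uint16_t log_desc_n)
{
	struct sketch_qp_layout l;

	/* Keep the total within 32 bits, as the removed uint32_t did. */
	assert(log_desc_n < 28);
	l.dbr_offset = (UINT64_C(1) << log_desc_n) * SKETCH_WSEG_SIZE;
	l.umem_size = (uint32_t)l.dbr_offset + 2u * (uint32_t)sizeof(uint32_t);
	return l;
}
```

With log_desc_n = 8 this gives a doorbell offset of 4096 bytes and a buffer of 4104 bytes, matching what the old code computed before it registered the umem.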
* [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities 2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-09-15 0:05 ` Raja Zidane 2021-09-22 19:48 ` Thomas Monjalon 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (3 subsequent siblings) 5 siblings, 1 reply; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:05 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align hca capabilities with new prm. Add support in devx interface for changes in HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 15 ++++++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ 3 files changed, 34 insertions(+), 12 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..00c78b1288 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,18 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_sq); + attr->mmo_decompress_sq_en = MLX5_GET(cmd_hca_cap, 
hcattr, + decompress_mmo_sq); + attr->mmo_dma_qp_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..070c538c8c 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xe]; + u8 regexp_mmo_sq[0x1]; + u8 
reserved_at_2b0[0x1]; u8 max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
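A quick way to review the mlx5_prm.h hunks in patch 2/5: every reserved field that gets split must keep its total bit width, otherwise all later fields in the structure shift. The sketch below only sums the widths that appear in the diff above. The sketch_* names are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the bit widths of consecutive PRM fields. */
static uint32_t
sketch_fields_width(const uint32_t *widths, size_t n)
{
	uint32_t total = 0;
	size_t i;

	assert(widths != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += widths[i];
	return total;
}

/* reserved_at_2a0[0x10] became reserved[0xe], regexp_mmo_sq, reserved[0x1]. */
static const uint32_t sketch_split_2a0[] = { 0xe, 0x1, 0x1 };
/* reserved_at_61f[0x1e1] became reserved[0x109], four *_mmo_qp bits, reserved[0xd4]. */
static const uint32_t sketch_split_61f[] = { 0x109, 0x1, 0x1, 0x1, 0x1, 0xd4 };
/* create_qp_in: reserved_at_40[0x40] became qpc_ext, reserved_at_41[0x3f]. */
static const uint32_t sketch_split_40[] = { 0x1, 0x3f };
```

All three splits add up to the width of the field they replace (0x10, 0x1e1 and 0x40 bits), so the layout after each hunk is unchanged.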
* Re: [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-22 19:48 ` Thomas Monjalon 0 siblings, 0 replies; 38+ messages in thread From: Thomas Monjalon @ 2021-09-22 19:48 UTC (permalink / raw) To: Raja Zidane; +Cc: dev, Matan Azrad 15/09/2021 02:05, Raja Zidane: > New MMO HCA capabilities were added and others were renamed. > Align hca capabilities with new prm. > Add support in devx interface for changes in HCA capabilities. > > Signed-off-by: Raja Zidane <rzidane@nvidia.com> > Acked-by: Matan Azrad <matan@nvidia.com> > --- > --- a/drivers/common/mlx5/mlx5_devx_cmds.h > +++ b/drivers/common/mlx5/mlx5_devx_cmds.h > @@ -173,9 +173,14 @@ struct mlx5_hca_attr { > uint32_t log_max_srq; > uint32_t log_max_srq_sz; > uint32_t rss_ind_tbl_cap; > - uint32_t mmo_dma_en:1; > - uint32_t mmo_compress_en:1; > - uint32_t mmo_decompress_en:1; > + uint32_t mmo_dma_sq_en:1; > + uint32_t mmo_compress_sq_en:1; > + uint32_t mmo_decompress_sq_en:1; > + uint32_t mmo_dma_qp_en:1; > + uint32_t mmo_compress_qp_en:1; > + uint32_t mmo_decompress_qp_en:1; > + uint32_t mmo_regex_qp_en:1; > + uint32_t mmo_regex_sq_en:1; It does not compile: struct mlx5_hca_attr has no member named mmo_compress_en ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V3 3/5] common/mlx5: add MMO configuration for the DevX QP 2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 1/5] common/mlx5: share DevX QP operations Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-15 0:05 ` Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 4/5] compress/mlx5: refactor queue HW object Raja Zidane ` (2 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:05 UTC (permalink / raw) To: dev A new configuration MMO was added to QP Context. If set, MMO WQEs are supported on this QP. For DMA MMO, supported only when dma_mmo_qp==1. For REGEXP MMO, supported only when regexp_mmo_qp==1. For COMPRESS MMO, supported only when compress_mmo_qp==1. For DECOMPRESS MMO, supported only when decompress_mmo_qp==1. Add support to DevX interface to set MMO bit. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 28 +++++++++++++++++++++++++++- 3 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 00c78b1288..eefb869b7d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2032,6 +2032,13 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if (attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, + in, qpc_extension_and_pas_list); + void *qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, + qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); 
MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 070c538c8c..cb28adb80a 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 pas[0][0x40]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,11 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits + qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V3 4/5] compress/mlx5: refactor queue HW object 2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane ` (2 preceding siblings ...) 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane @ 2021-09-15 0:05 ` Raja Zidane 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:05 UTC (permalink / raw) To: dev The mlx5 PMD for compress class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 73 +++++++++++++++------------ 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index c5e0a83a8c..5c5aa87a18 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -40,7 +40,7 @@ struct mlx5_compress_priv { void *uar; uint32_t pdn; /* Protection Domain number. */ uint8_t min_block_size; - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ /* Minimum huffman block size supported by the device. 
*/ struct ibv_pd *pd; struct rte_compressdev_config dev_config; @@ -48,6 +48,13 @@ struct mlx5_compress_priv { rte_spinlock_t xform_sl; struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ volatile uint64_t *uar_addr; + /* HCA caps*/ + uint32_t mmo_decomp_sq:1; + uint32_t mmo_decomp_qp:1; + uint32_t mmo_comp_sq:1; + uint32_t mmo_comp_qp:1; + uint32_t mmo_dma_sq:1; + uint32_t mmo_dma_qp:1; #ifndef RTE_ARCH_64 rte_spinlock_t uar32_sl; #endif /* RTE_ARCH_64 */ @@ -61,7 +68,7 @@ struct mlx5_compress_qp { struct mlx5_mr_ctrl mr_ctrl; int socket_id; struct mlx5_devx_cq cq; - struct mlx5_devx_sq sq; + struct mlx5_devx_qp qp; struct mlx5_pmd_mr opaque_mr; struct rte_comp_op **ops; struct mlx5_compress_priv *priv; @@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) { struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->sq.sq != NULL) - mlx5_devx_sq_destroy(&qp->sq); + if (qp->qp.qp != NULL) + mlx5_devx_qp_destroy(&qp->qp); if (qp->cq.cq != NULL) mlx5_devx_cq_destroy(&qp->cq); if (qp->opaque_mr.obj != NULL) { @@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) } static void -mlx5_compress_init_sq(struct mlx5_compress_qp *qp) +mlx5_compress_init_qp(struct mlx5_compress_qp *qp) { volatile struct mlx5_gga_wqe *restrict wqe = - (volatile struct mlx5_gga_wqe *)qp->sq.wqes; + (volatile struct mlx5_gga_wqe *)qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; - const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u); + const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u); const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS << MLX5_COMP_MODE_OFFSET); const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey); @@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), }; - struct 
mlx5_devx_create_sq_attr sq_attr = { + struct mlx5_devx_qp_attr qp_attr = { + .pd = priv->pdn, + .uar_index = mlx5_os_get_devx_uar_page_id(priv->uar), .user_index = qp_id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .pd = priv->pdn, - .uar_page = mlx5_os_get_devx_uar_page_id(priv->uar), - }, - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, }; uint32_t log_ops_n = rte_log2_u32(max_inflight_ops); uint32_t alloc_size = sizeof(*qp); @@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto err; } - sq_attr.cqn = qp->cq.cq->id; - sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format); - ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr, + qp_attr.cqn = qp->cq.cq->id; + qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + qp_attr.rq_size = 0; + qp_attr.sq_size = RTE_BIT32(log_ops_n); + qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp + && priv->mmo_dma_qp; + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr, socket_id); if (ret != 0) { - DRV_LOG(ERR, "Failed to create SQ."); + DRV_LOG(ERR, "Failed to create QP."); goto err; } - mlx5_compress_init_sq(qp); - ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr); - if (ret != 0) { - DRV_LOG(ERR, "Can't change SQ state to ready."); + mlx5_compress_init_qp(qp); + ret = mlx5_devx_qp2rts(&qp->qp, 0); + if (ret) goto err; - } /* Save pointer of global generation number to check memory event. 
*/ qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u", - (uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n); + (uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n); return 0; err: mlx5_compress_qp_release(dev, qp_id); @@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, { struct mlx5_compress_qp *qp = queue_pair; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes, *wqe; + qp->qp.wqes, *wqe; struct mlx5_compress_xform *xform; struct rte_comp_op *op; uint16_t mask = qp->entries_n - 1; @@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); + qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); rte_wmb(); mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv); rte_wmb(); @@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp, volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *) &qp->cq.cqes[idx]; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes; + qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; op->status = RTE_COMP_OP_STATUS_ERROR; @@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 || - att.mmo_dma_en == 0) { + ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || + att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); @@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct 
rte_device *dev) cdev->enqueue_burst = mlx5_compress_enqueue_burst; cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED; priv = cdev->data->dev_private; + priv->mmo_decomp_sq = att.mmo_decompress_sq_en; + priv->mmo_decomp_qp = att.mmo_decompress_qp_en; + priv->mmo_comp_sq = att.mmo_compress_sq_en; + priv->mmo_comp_qp = att.mmo_compress_qp_en; + priv->mmo_dma_sq = att.mmo_dma_sq_en; + priv->mmo_dma_qp = att.mmo_dma_qp_en; priv->ctx = ctx; priv->cdev = cdev; priv->min_block_size = att.compress_min_block_size; - priv->sq_ts_format = att.sq_ts_format; + priv->qp_ts_format = att.qp_ts_format; if (mlx5_compress_hw_global_prepare(priv) != 0) { rte_compressdev_pmd_destroy(priv->cdev); claim_zero(mlx5_glue->close_device(priv->ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V3 5/5] regex/mlx5: refactor HW queue objects 2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 4/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-09-15 0:05 ` Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-15 0:05 UTC (permalink / raw) To: dev The mlx5 PMD for regex class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 4 files changed, 133 insertions(+), 125 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index f17b6df47f..a3368749b9 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,8 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && + (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +165,9 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. 
*/ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. */ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..572ecc6d86 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp, 0, sizeof(*qp)); return 0; } @@ -131,45 +131,44 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, + log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, 
SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +223,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +237,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +265,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index 786718af53..18b01b6bf9 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * 
sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. */ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,42 +196,42 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & - MLX5_REGEX_MAX_WQE_INDEX); + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) + & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; rte_wmb(); } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,36 +287,36 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, - struct rte_regex_ops *op, struct mlx5_regex_job *job, - size_t pi, struct mlx5_klm *klm) +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, + struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, + struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, + size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0, len; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -328,7 +328,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -340,16 +340,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -383,22 +383,23 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, + &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -407,21 +408,22 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops 
be handled can't exceed nb_ops. */ if (nb_desc > nb_left) nb_desc = nb_left; else - queue->free_sqs &= ~(1 << sqid); - prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, + nb_desc); + send_doorbell(priv, qp_obj); nb_left -= nb_desc; } if (!nb_left) @@ -440,23 +442,25 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, job_id, i = 0; - - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - while (get_free(sq, priv->has_umr)) { - job_id = job_id_get(sqid, sq_size_get(sq), sq->pi); - prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]); + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, job_id, i = 0; + + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + while (get_free(qp_obj, priv->has_umr)) { + job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), + qp_obj->pi); + prep_one(priv, queue, qp_obj, ops[i], + &queue->jobs[job_id]); i++; if (unlikely(i == nb_ops)) { - send_doorbell(priv, sq); + send_doorbell(priv, qp_obj); goto out; } } - queue->free_sqs &= ~(1 << sqid); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + send_doorbell(priv, qp_obj); } out: @@ -566,21 +570,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, uint16_t wq_counter = (rte_be_to_cpu_16(cqe->wqe_counter) + 1) & MLX5_REGEX_MAX_WQE_INDEX; - size_t sqid = cqe->rsvd3[2]; - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; + size_t hw_qpid = cqe->rsvd3[2]; + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; /* UMR mode WQE counter move as WQE set(4 WQEBBS).*/ if (priv->has_umr) wq_counter >>= 2; - while (sq->ci != wq_counter) { + while (qp_obj->ci != wq_counter) 
{ if (unlikely(i == nb_ops)) { /* Return without updating cq->ci */ goto out; } - uint32_t job_id = job_id_get(sqid, sq_size_get(sq), - sq->ci); + uint32_t job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), qp_obj->ci); extract_result(ops[i], &queue->jobs[job_id]); - sq->ci = (sq->ci + 1) & (priv->has_umr ? + qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX); i++; @@ -588,7 +592,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, cq->ci = (cq->ci + 1) & 0xffffff; rte_wmb(); cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci); - queue->free_sqs |= (1 << sqid); + queue->free_qps |= (1 << hw_qpid); } out: @@ -597,15 +601,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, } static void -setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) +setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) { - size_t sqid, entry; + size_t hw_qpid, entry; uint32_t job_id; - for (sqid = 0; sqid < queue->nb_obj; sqid++) { - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes; - for (entry = 0 ; entry < sq_size_get(sq); entry++) { - job_id = sqid * sq_size_get(sq) + entry; + for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) { + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes; + for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) { + job_id = hw_qpid * qp_size_get(qp_obj) + entry; struct mlx5_regex_job *job = &queue->jobs[job_id]; /* Fill UMR WQE with NOP in advanced. 
*/ @@ -613,7 +617,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) set_wqe_ctrl_seg ((struct mlx5_wqe_ctrl_seg *)wqe, entry * 2, MLX5_OPCODE_NOP, 0, - sq->sq_obj.sq->id, 0, 12, 0, 0); + qp_obj->qp_obj.qp->id, 0, 12, 0, 0); wqe += MLX5_REGEX_UMR_WQE_SIZE; } set_metadata_seg((struct mlx5_wqe_metadata_seg *) @@ -627,7 +631,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) (uintptr_t)job->output); wqe += 64; } - queue->free_sqs |= 1 << sqid; + queue->free_qps |= 1 << hw_qpid; } } @@ -737,7 +741,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id) return err; } - setup_sqs(priv, qp); + setup_qps(priv, qp); if (priv->has_umr) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
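The index arithmetic that the regex fastpath patch above carries over from the SQ names to the QP names can be sketched in isolation. The following is a minimal, standalone illustration and not the driver code: every `sketch_` name is hypothetical, and the 16-bit value used for the maximum WQE index is an assumption made for this example. It mirrors three things from the diff: the free-slot computation in get_free(), the job slot computation in job_id_get(), and the producer index wrap, which uses a quarter of the index range in UMR mode because each operation occupies a set of 4 WQEBBs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: the WQE counter is 16 bits wide. */
#define SKETCH_MAX_WQE_INDEX 0xffffu

/* Hypothetical stand-in for the ring fields of the HW QP structure. */
struct sketch_hw_qp {
	uint32_t log_nb_desc; /* log2 of the number of descriptors */
	size_t pi;            /* producer index */
	size_t ci;            /* consumer index */
};

/* Ring size in descriptors, as in qp_size_get(). */
static inline uint32_t
sketch_qp_size_get(const struct sketch_hw_qp *qp)
{
	return 1U << qp->log_nb_desc;
}

/* Index mask: UMR mode counts WQE sets (4 WQEBBs each), so a quarter range. */
static inline size_t
sketch_index_mask(int has_umr)
{
	return has_umr ? (SKETCH_MAX_WQE_INDEX >> 2) : SKETCH_MAX_WQE_INDEX;
}

/* Free descriptors: ring size minus the in-flight count, as in get_free(). */
static inline int
sketch_get_free(const struct sketch_hw_qp *qp, int has_umr)
{
	size_t in_flight = (qp->pi - qp->ci) & sketch_index_mask(has_umr);

	return (int)(sketch_qp_size_get(qp) - in_flight);
}

/* Job slot for a ring index on HW QP number qid, as in job_id_get(). */
static inline uint32_t
sketch_job_id_get(uint32_t qid, size_t qp_size, size_t index)
{
	return (uint32_t)(qid * qp_size + (index & (qp_size - 1)));
}

/* Advance the producer index by ops entries with wrap-around. */
static inline size_t
sketch_pi_advance(size_t pi, size_t ops, int has_umr)
{
	return (pi + ops) & sketch_index_mask(has_umr);
}
```

Because the subtraction pi - ci is masked after it wraps, the in-flight count stays correct when the producer index has already wrapped and the consumer index has not, which is why the ring size must be a power of two no larger than the masked range.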
* [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object
  2021-09-15  0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane
                     ` (4 preceding siblings ...)
  2021-09-15  0:05   ` [dpdk-dev] [PATCH V3 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
@ 2021-09-28 12:16   ` Raja Zidane
  2021-09-28 12:16     ` [dpdk-dev] [PATCH V4 1/5] common/mlx5: share DevX QP operations Raja Zidane
                       ` (5 more replies)
  5 siblings, 6 replies; 38+ messages in thread
From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw)
  To: dev

The mlx5 PMDs for compress and regex classes use an MMO WQE operated by
the GGA engine in BF devices.
Currently, all the MMO WQEs are managed by the SQ object.
Starting from BF3, the queue of the MMO WQEs should be connected to the
GGA engine using a new configuration, mmo, that will be supported only
in the QP object.
The FW introduced new capabilities to define whether the mmo
configuration should be configured for the GGA queue.
Replace all the GGA queue objects to QP, set mmo configuration according
to the new FW capabilities.

V2: fix checkpatch errors.
V3: rebase.
V4: compilation error in commit 2/5.

Raja Zidane (5):
  common/mlx5: share DevX QP operations
  common/mlx5: update new MMO HCA capabilities
  common/mlx5: add MMO configuration for the DevX QP
  compress/mlx5: refactor queue HW object
  regex/mlx5: refactor HW queue objects

 drivers/common/mlx5/mlx5_common_devx.c   | 144 +++++++++++++++++++
 drivers/common/mlx5/mlx5_common_devx.h   |  23 +++
 drivers/common/mlx5/mlx5_devx_cmds.c     |  23 ++-
 drivers/common/mlx5/mlx5_devx_cmds.h     |  13 +-
 drivers/common/mlx5/mlx5_prm.h           |  48 ++++++-
 drivers/common/mlx5/version.map          |   3 +
 drivers/compress/mlx5/mlx5_compress.c    |  73 +++++-----
 drivers/crypto/mlx5/mlx5_crypto.c        | 102 ++++----------
 drivers/crypto/mlx5/mlx5_crypto.h        |   5 +-
 drivers/regex/mlx5/mlx5_regex.c          |   7 +-
 drivers/regex/mlx5/mlx5_regex.h          |  16 ++-
 drivers/regex/mlx5/mlx5_regex_control.c  |  65 +++++----
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++-----------
 drivers/vdpa/mlx5/mlx5_vdpa.h            |   5 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c      |  59 +++-----
 15 files changed, 461 insertions(+), 295 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V4 1/5] common/mlx5: share DevX QP operations 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane @ 2021-09-28 12:16 ` Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane ` (4 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw) To: dev Currently drivers using QP (vDPA, crypto and compress, regex soon) manage their memory, creation, modification and destruction of the QP, in almost identical code. Move QP memory management, creation and destruction to common. Add common function to change QP state to RTS. Add user_index attribute to QP creation. It's for better code maintenance and reuse. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++++++++ drivers/common/mlx5/mlx5_common_devx.h | 23 ++++ drivers/common/mlx5/mlx5_devx_cmds.c | 1 + drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/version.map | 3 + drivers/crypto/mlx5/mlx5_crypto.c | 102 +++++------------- drivers/crypto/mlx5/mlx5_crypto.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 59 +++------- 9 files changed, 217 insertions(+), 126 deletions(-) diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c index 22c8d356c4..825f84b183 100644 --- a/drivers/common/mlx5/mlx5_common_devx.c +++ b/drivers/common/mlx5/mlx5_common_devx.c @@ -271,6 +271,115 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, return -rte_errno; } +/** + * Destroy DevX Queue Pair. + * + * @param[in] qp + * DevX QP to destroy. 
+ */ +void +mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp) +{ + if (qp->qp) + claim_zero(mlx5_devx_cmd_destroy(qp->qp)); + if (qp->umem_obj) + claim_zero(mlx5_os_umem_dereg(qp->umem_obj)); + if (qp->umem_buf) + mlx5_free((void *)(uintptr_t)qp->umem_buf); +} + +/** + * Create Queue Pair using DevX API. + * + * Get a pointer to partially initialized attributes structure, and updates the + * following fields: + * wq_umem_id + * wq_umem_offset + * dbr_umem_valid + * dbr_umem_id + * dbr_address + * log_page_size + * All other fields are updated by caller. + * + * @param[in] ctx + * Context returned from mlx5 open_device() glue function. + * @param[in/out] qp_obj + * Pointer to QP to create. + * @param[in] log_wqbb_n + * Log of number of WQBBs in queue. + * @param[in] attr + * Pointer to QP attributes structure. + * @param[in] socket + * Socket to use for allocation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +int +mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket) +{ + struct mlx5_devx_obj *qp = NULL; + struct mlx5dv_devx_umem *umem_obj = NULL; + void *umem_buf = NULL; + size_t alignment = MLX5_WQE_BUF_ALIGNMENT; + uint32_t umem_size, umem_dbrec; + uint16_t qp_size = 1 << log_wqbb_n; + int ret; + + if (alignment == (size_t)-1) { + DRV_LOG(ERR, "Failed to get WQE buf alignment."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Allocate memory buffer for WQEs and doorbell record. */ + umem_size = MLX5_WQE_SIZE * qp_size; + umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE); + umem_size += MLX5_DBR_SIZE; + umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size, + alignment, socket); + if (!umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for QP."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Register allocated buffer in user space with DevX. 
*/ + umem_obj = mlx5_os_umem_reg(ctx, (void *)(uintptr_t)umem_buf, umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!umem_obj) { + DRV_LOG(ERR, "Failed to register umem for QP."); + rte_errno = errno; + goto error; + } + /* Fill attributes for SQ object creation. */ + attr->wq_umem_id = mlx5_os_get_umem_id(umem_obj); + attr->wq_umem_offset = 0; + attr->dbr_umem_valid = 1; + attr->dbr_umem_id = attr->wq_umem_id; + attr->dbr_address = umem_dbrec; + attr->log_page_size = MLX5_LOG_PAGE_SIZE; + /* Create send queue object with DevX. */ + qp = mlx5_devx_cmd_create_qp(ctx, attr); + if (!qp) { + DRV_LOG(ERR, "Can't create DevX QP object."); + rte_errno = ENOMEM; + goto error; + } + qp_obj->umem_buf = umem_buf; + qp_obj->umem_obj = umem_obj; + qp_obj->qp = qp; + qp_obj->db_rec = RTE_PTR_ADD(qp_obj->umem_buf, umem_dbrec); + return 0; +error: + ret = rte_errno; + if (umem_obj) + claim_zero(mlx5_os_umem_dereg(umem_obj)); + if (umem_buf) + mlx5_free((void *)(uintptr_t)umem_buf); + rte_errno = ret; + return -rte_errno; +} + /** * Destroy DevX Receive Queue. * @@ -385,3 +494,38 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, return -rte_errno; } + +/** + * Change QP state to RTS. + * + * @param[in] qp + * DevX QP to change. + * @param[in] remote_qp_id + * The remote QP ID for MLX5_CMD_OP_INIT2RTR_QP operation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int +mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id) +{ + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RST2INIT_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_INIT2RTR_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RTR2RTS_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", + rte_errno); + return -1; + } + return 0; +} diff --git a/drivers/common/mlx5/mlx5_common_devx.h b/drivers/common/mlx5/mlx5_common_devx.h index aad0184e5a..f699405f69 100644 --- a/drivers/common/mlx5/mlx5_common_devx.h +++ b/drivers/common/mlx5/mlx5_common_devx.h @@ -33,6 +33,18 @@ struct mlx5_devx_sq { volatile uint32_t *db_rec; /* The SQ doorbell record. */ }; +/* DevX Queue Pair structure. */ +struct mlx5_devx_qp { + struct mlx5_devx_obj *qp; /* The QP DevX object. */ + void *umem_obj; /* The QP umem object. */ + union { + void *umem_buf; + struct mlx5_wqe *wqes; /* The QP ring buffer. */ + struct mlx5_aso_wqe *aso_wqes; + }; + volatile uint32_t *db_rec; /* The QP doorbell record. */ +}; + /* DevX Receive Queue structure. */ struct mlx5_devx_rq { struct mlx5_devx_obj *rq; /* The RQ DevX object. 
*/ @@ -59,6 +71,14 @@ int mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, struct mlx5_devx_create_sq_attr *attr, int socket); +__rte_internal +void mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp); + +__rte_internal +int mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, + uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket); + __rte_internal void mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq); @@ -67,4 +87,7 @@ int mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, uint16_t log_wqbb_n, struct mlx5_devx_create_rq_attr *attr, int socket); +__rte_internal +int mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id); + #endif /* RTE_PMD_MLX5_COMMON_DEVX_H_ */ diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 56407cc332..ac554cca05 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2021,6 +2021,7 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC); MLX5_SET(qpc, qpc, pd, attr->pd); MLX5_SET(qpc, qpc, ts_format, attr->ts_format); + MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index e576e30f24..c071629904 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -397,6 +397,7 @@ struct mlx5_devx_qp_attr { uint64_t dbr_address; uint32_t wq_umem_id; uint64_t wq_umem_offset; + uint32_t user_index:24; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index e5cb6b7060..d3c5040aac 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -67,6 +67,9 @@ INTERNAL { mlx5_devx_get_out_command_status; + mlx5_devx_qp2rts; + 
mlx5_devx_qp_create; + mlx5_devx_qp_destroy; mlx5_devx_rq_create; mlx5_devx_rq_destroy; mlx5_devx_sq_create; diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c index e01be15ade..23b1569039 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.c +++ b/drivers/crypto/mlx5/mlx5_crypto.c @@ -257,12 +257,7 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id) { struct mlx5_crypto_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->qp_obj != NULL) - claim_zero(mlx5_devx_cmd_destroy(qp->qp_obj)); - if (qp->umem_obj != NULL) - claim_zero(mlx5_glue->devx_umem_dereg(qp->umem_obj)); - if (qp->umem_buf != NULL) - rte_free(qp->umem_buf); + mlx5_devx_qp_destroy(&qp->qp_obj); mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); mlx5_devx_cq_destroy(&qp->cq_obj); rte_free(qp); @@ -270,34 +265,6 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id) return 0; } -static int -mlx5_crypto_qp2rts(struct mlx5_crypto_qp *qp) -{ - /* - * In Order to configure self loopback, when calling these functions the - * remote QP id that is used is the id of the same QP. 
- */ - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RST2INIT_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_INIT2RTR_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RTR2RTS_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", - rte_errno); - return -1; - } - return 0; -} - static __rte_noinline uint32_t mlx5_crypto_get_block_size(struct rte_crypto_op *op) { @@ -452,7 +419,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, memcpy(klms, &umr->kseg[0], sizeof(*klms) * klm_n); } ds = 2 + klm_n; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_RDMA_WRITE); ds = RTE_ALIGN(ds, 4); @@ -461,7 +428,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, if (priv->max_rdmar_ds > ds) { cseg += ds; ds = priv->max_rdmar_ds - ds; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_NOP); qp->db_pi += ds >> 2; /* Here, DS is 4 aligned for sure. 
*/ @@ -503,7 +470,8 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, return 0; do { op = *ops++; - umr = RTE_PTR_ADD(qp->umem_buf, priv->wqe_set_size * qp->pi); + umr = RTE_PTR_ADD(qp->qp_obj.umem_buf, + priv->wqe_set_size * qp->pi); if (unlikely(mlx5_crypto_wqe_set(priv, qp, op, umr) == 0)) { qp->stats.enqueue_err_count++; if (remain != nb_ops) { @@ -517,7 +485,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); + qp->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); rte_wmb(); mlx5_crypto_uar_write(*(volatile uint64_t *)qp->wqe, qp->priv); rte_wmb(); @@ -583,8 +551,8 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) uint32_t i; for (i = 0 ; i < qp->entries_n; i++) { - struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->umem_buf, i * - priv->wqe_set_size); + struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->qp_obj.umem_buf, + i * priv->wqe_set_size); struct mlx5_wqe_umr_cseg *ucseg = (struct mlx5_wqe_umr_cseg *) (cseg + 1); struct mlx5_wqe_umr_bsf_seg *bsf = @@ -593,7 +561,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) struct mlx5_wqe_rseg *rseg; /* Init UMR WQE. 
*/ - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | (priv->umr_wqe_size / MLX5_WSEG_SIZE)); cseg->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); @@ -628,7 +596,7 @@ mlx5_crypto_indirect_mkeys_prepare(struct mlx5_crypto_priv *priv, .klm_num = RTE_ALIGN(priv->max_segs_num, 4), }; - for (umr = (struct mlx5_umr_wqe *)qp->umem_buf, i = 0; + for (umr = (struct mlx5_umr_wqe *)qp->qp_obj.umem_buf, i = 0; i < qp->entries_n; i++, umr = RTE_PTR_ADD(umr, priv->wqe_set_size)) { attr.klm_array = (struct mlx5_klm *)&umr->kseg[0]; qp->mkey[i] = mlx5_devx_cmd_mkey_create(priv->ctx, &attr); @@ -649,9 +617,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, struct mlx5_devx_qp_attr attr = {0}; struct mlx5_crypto_qp *qp; uint16_t log_nb_desc = rte_log2_u32(qp_conf->nb_descriptors); - uint32_t umem_size = RTE_BIT32(log_nb_desc) * - priv->wqe_set_size + - sizeof(*qp->db_rec) * 2; + uint32_t ret; uint32_t alloc_size = sizeof(*qp); struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), @@ -675,18 +641,16 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto error; } - qp->umem_buf = rte_zmalloc_socket(__func__, umem_size, 4096, socket_id); - if (qp->umem_buf == NULL) { - DRV_LOG(ERR, "Failed to allocate QP umem."); - rte_errno = ENOMEM; - goto error; - } - qp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)qp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (qp->umem_obj == NULL) { - DRV_LOG(ERR, "Failed to register QP umem."); + attr.pd = priv->pdn; + attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); + attr.cqn = qp->cq_obj.cq->id; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(log_nb_desc); + attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp_obj, log_nb_desc, &attr, + socket_id); + if 
(ret) { + DRV_LOG(ERR, "Failed to create QP."); goto error; } if (mlx5_mr_btree_init(&qp->mr_ctrl.cache_bh, MLX5_MR_BTREE_CACHE_N, @@ -697,25 +661,11 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, goto error; } qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; - attr.pd = priv->pdn; - attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); - attr.cqn = qp->cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 0; - attr.sq_size = RTE_BIT32(log_nb_desc); - attr.dbr_umem_valid = 1; - attr.wq_umem_id = qp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = qp->umem_obj->umem_id; - attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_nb_desc) * priv->wqe_set_size; - qp->qp_obj = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (qp->qp_obj == NULL) { - DRV_LOG(ERR, "Failed to create QP(%u).", rte_errno); - goto error; - } - qp->db_rec = RTE_PTR_ADD(qp->umem_buf, (uintptr_t)attr.dbr_address); - if (mlx5_crypto_qp2rts(qp)) + /* + * In Order to configure self loopback, when calling devx qp2rts the + * remote QP id that is used is the id of the same QP. + */ + if (mlx5_devx_qp2rts(&qp->qp_obj, qp->qp_obj.qp->id)) goto error; qp->mkey = (struct mlx5_devx_obj **)RTE_ALIGN((uintptr_t)(qp + 1), RTE_CACHE_LINE_SIZE); diff --git a/drivers/crypto/mlx5/mlx5_crypto.h b/drivers/crypto/mlx5/mlx5_crypto.h index d589e0ac3d..4d7e6d2d10 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.h +++ b/drivers/crypto/mlx5/mlx5_crypto.h @@ -44,11 +44,8 @@ struct mlx5_crypto_priv { struct mlx5_crypto_qp { struct mlx5_crypto_priv *priv; struct mlx5_devx_cq cq_obj; - struct mlx5_devx_obj *qp_obj; + struct mlx5_devx_qp qp_obj; struct rte_cryptodev_stats stats; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; struct rte_crypto_op **ops; struct mlx5_devx_obj **mkey; /* WQE's indirect mekys. 
*/ struct mlx5_mr_ctrl mr_ctrl; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 2a04e36607..a27f3fdadb 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -54,10 +54,7 @@ struct mlx5_vdpa_cq { struct mlx5_vdpa_event_qp { struct mlx5_vdpa_cq cq; struct mlx5_devx_obj *fw_qp; - struct mlx5_devx_obj *sw_qp; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; + struct mlx5_devx_qp sw_qp; }; struct mlx5_vdpa_query_mr { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c index 3541c652ce..bb6722839a 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -179,7 +179,7 @@ mlx5_vdpa_cq_poll(struct mlx5_vdpa_cq *cq) cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); rte_io_wmb(); /* Ring SW QP doorbell record. */ - eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); + eqp->sw_qp.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); } return comp; } @@ -531,12 +531,7 @@ mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) { - if (eqp->sw_qp) - claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); - if (eqp->umem_obj) - claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); - if (eqp->umem_buf) - rte_free(eqp->umem_buf); + mlx5_devx_qp_destroy(&eqp->sw_qp); if (eqp->fw_qp) claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); mlx5_vdpa_cq_destroy(&eqp->cq); @@ -547,36 +542,36 @@ static int mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) { if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_RST2INIT_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP 
to INIT state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_INIT2RTR_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RTR2RTS_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", rte_errno); @@ -591,8 +586,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, { struct mlx5_devx_qp_attr attr = {0}; uint16_t log_desc_n = rte_log2_u32(desc_n); - uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + - sizeof(*eqp->db_rec) * 2; + uint32_t ret; if (mlx5_vdpa_event_qp_global_prepare(priv)) return -1; @@ -605,42 +599,23 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); goto error; } - eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); - if (!eqp->umem_buf) { - DRV_LOG(ERR, "Failed to allocate memory for SW QP."); - rte_errno = ENOMEM; - goto error; - } - eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)eqp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (!eqp->umem_obj) { - DRV_LOG(ERR, "Failed to register umem for SW QP."); - goto error; - } attr.uar_index = priv->uar->page_id; attr.cqn = eqp->cq.cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 
1 << log_desc_n; + attr.rq_size = RTE_BIT32(log_desc_n); attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); attr.sq_size = 0; /* No need SQ. */ - attr.dbr_umem_valid = 1; - attr.wq_umem_id = eqp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = eqp->umem_obj->umem_id; attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_desc_n) * MLX5_WSEG_SIZE; - eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (!eqp->sw_qp) { + ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, + SOCKET_ID_ANY); + if (ret) { DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); goto error; } - eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); if (mlx5_vdpa_qps2rts(eqp)) goto error; /* First ringing. */ - rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + rte_write32(rte_cpu_to_be_32(RTE_BIT32(log_desc_n)), + &eqp->sw_qp.db_rec[0]); return 0; error: mlx5_vdpa_event_qp_destroy(eqp); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
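The SW QP setup that this patch removes from the vDPA driver sized a single umem as the receive WQE ring followed by two 32-bit doorbell records, and placed the doorbell record directly after the ring. The arithmetic can be sketched as below. This is only an illustration: `WSEG_SIZE` is an assumed stand-in for `MLX5_WSEG_SIZE` (taken here as 16 bytes), the helper names are hypothetical, and the common `mlx5_devx_qp_create()` helper may lay out its memory differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for MLX5_WSEG_SIZE (one WQE data segment, in bytes). */
#define WSEG_SIZE 16u

/* Offset of the doorbell record: it follows the 2^log_desc_n entry ring. */
static uint64_t
dbr_offset(uint16_t log_desc_n)
{
	assert(log_desc_n < 32); /* keep the shift well defined */
	return ((uint64_t)1 << log_desc_n) * WSEG_SIZE;
}

/* Total umem size: the WQE ring plus two 32-bit doorbell records. */
static uint64_t
umem_size(uint16_t log_desc_n)
{
	return dbr_offset(log_desc_n) + 2 * sizeof(uint32_t);
}
```

With 256 descriptors (`log_desc_n` of 8) the ring occupies 4096 bytes under this assumption, so the doorbell record sits at offset 4096 and the umem is 4104 bytes long.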
* [dpdk-dev] [PATCH V4 2/5] common/mlx5: update new MMO HCA capabilities 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-09-28 12:16 ` Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (3 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align hca capabilities with new prm. Add support in devx interface for changes in HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 15 ++++++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ drivers/compress/mlx5/mlx5_compress.c | 4 ++-- 4 files changed, 36 insertions(+), 14 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..00c78b1288 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,18 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_sq); + attr->mmo_decompress_sq_en = 
MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_sq); + attr->mmo_dma_qp_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..ec5f871c61 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xc]; + u8 
regexp_mmo_sq[0x1]; + u8 reserved_at_2b0[0x3]; u8 max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index c5e0a83a8c..1e03030510 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -813,8 +813,8 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 || - att.mmo_dma_en == 0) { + att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
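The PRM changes above carve new capability bits out of reserved fields, which only works if the replacement fields add up to the width of the field they replace. A small sketch of that check follows; the helper and array names are hypothetical, and the widths are copied from the two hunks in mlx5_prm.h.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bit widths of the fields that replace one reserved field. */
static unsigned int
bits_sum(const unsigned int *widths, size_t n)
{
	unsigned int total = 0;
	size_t i;

	assert(widths != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += widths[i];
	return total;
}

/* reserved_at_2a0[0x10] -> reserved[0xc], regexp_mmo_sq[0x1], reserved[0x3] */
static const unsigned int split_2a0[] = { 0xc, 0x1, 0x3 };

/* reserved_at_61f[0x1e1] -> reserved[0x109], four 1-bit caps, reserved[0xd4] */
static const unsigned int split_61f[] = { 0x109, 0x1, 0x1, 0x1, 0x1, 0xd4 };
```

Both splits preserve the original widths (0x10 and 0x1e1 bits). By the same arithmetic, and assuming the existing `reserved_at_` names reflect true offsets, the trailing reserved fields would start at bit 0x2ad and bit 0x72c, so the `reserved_at_2b0` and `reserved_at_624` names do not match their positions. The names do not affect the layout.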
* [dpdk-dev] [PATCH V4 3/5] common/mlx5: add MMO configuration for the DevX QP 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 1/5] common/mlx5: share DevX QP operations Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-28 12:16 ` Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 4/5] compress/mlx5: refactor queue HW object Raja Zidane ` (2 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw) To: dev A new configuration MMO was added to QP Context. If set, MMO WQEs are supported on this QP. For DMA MMO, supported only when dma_mmo_qp==1. For REGEXP MMO, supported only when regexp_mmo_qp==1. For COMPRESS MMO, supported only when compress_mmo_qp==1. For DECOMPRESS MMO, supported only when decompress_mmo_qp==1. Add support to DevX interface to set MMO bit. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 28 +++++++++++++++++++++++++++- 3 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 00c78b1288..eefb869b7d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2032,6 +2032,13 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if (attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, + in, qpc_extension_and_pas_list); + void *qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, + qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, 
MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index ec5f871c61..54e62aa153 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 pas[0][0x40]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,11 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits + qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
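The `mmo` bit added above sits at bit offset 2 of the QPC extension, after `reserved_at_0[0x2]`. The sketch below shows where that bit lands in memory, assuming the usual mlx5_ifc convention that bit offsets count from the most significant bit of a big-endian layout. `toy_set_bit` is a hypothetical stand-in for what `MLX5_SET(qpc_extension, qpc_ext, mmo, 1)` does to this one field; it is not the real macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit offset of mmo in qpc_extension: it follows reserved_at_0[0x2]. */
#define QPC_EXT_MMO_BIT_OFF 2u
/* Size of qpc_extension in bytes: (0x2 + 0x1 + 0x5fd) bits / 8. */
#define QPC_EXT_SIZE ((0x2u + 0x1u + 0x5fdu) / 8u)

/* Set a 1-bit field, counting bit offsets from the MSB of byte 0. */
static void
toy_set_bit(uint8_t *buf, size_t buf_len, unsigned int bit_off)
{
	assert(bit_off / 8 < buf_len);
	buf[bit_off / 8] |= (uint8_t)(0x80u >> (bit_off % 8));
}
```

Under that assumption the extension is 192 bytes long and setting `mmo` turns the first byte into 0x20, leaving every other byte untouched.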
* [dpdk-dev] [PATCH V4 4/5] compress/mlx5: refactor queue HW object 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane ` (2 preceding siblings ...) 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane @ 2021-09-28 12:16 ` Raja Zidane 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw) To: dev The mlx5 PMD for the compress class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that is supported only in the QP object. The FW introduced new capabilities that indicate whether the MMO configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 73 +++++++++++++++------------ 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index 1e03030510..5c5aa87a18 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -40,7 +40,7 @@ struct mlx5_compress_priv { void *uar; uint32_t pdn; /* Protection Domain number. */ uint8_t min_block_size; - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether QP supports timestamp formats. */ /* Minimum huffman block size supported by the device. 
*/ struct ibv_pd *pd; struct rte_compressdev_config dev_config; @@ -48,6 +48,13 @@ struct mlx5_compress_priv { rte_spinlock_t xform_sl; struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ volatile uint64_t *uar_addr; + /* HCA caps*/ + uint32_t mmo_decomp_sq:1; + uint32_t mmo_decomp_qp:1; + uint32_t mmo_comp_sq:1; + uint32_t mmo_comp_qp:1; + uint32_t mmo_dma_sq:1; + uint32_t mmo_dma_qp:1; #ifndef RTE_ARCH_64 rte_spinlock_t uar32_sl; #endif /* RTE_ARCH_64 */ @@ -61,7 +68,7 @@ struct mlx5_compress_qp { struct mlx5_mr_ctrl mr_ctrl; int socket_id; struct mlx5_devx_cq cq; - struct mlx5_devx_sq sq; + struct mlx5_devx_qp qp; struct mlx5_pmd_mr opaque_mr; struct rte_comp_op **ops; struct mlx5_compress_priv *priv; @@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) { struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->sq.sq != NULL) - mlx5_devx_sq_destroy(&qp->sq); + if (qp->qp.qp != NULL) + mlx5_devx_qp_destroy(&qp->qp); if (qp->cq.cq != NULL) mlx5_devx_cq_destroy(&qp->cq); if (qp->opaque_mr.obj != NULL) { @@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) } static void -mlx5_compress_init_sq(struct mlx5_compress_qp *qp) +mlx5_compress_init_qp(struct mlx5_compress_qp *qp) { volatile struct mlx5_gga_wqe *restrict wqe = - (volatile struct mlx5_gga_wqe *)qp->sq.wqes; + (volatile struct mlx5_gga_wqe *)qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; - const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u); + const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u); const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS << MLX5_COMP_MODE_OFFSET); const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey); @@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), }; - struct 
mlx5_devx_create_sq_attr sq_attr = { + struct mlx5_devx_qp_attr qp_attr = { + .pd = priv->pdn, + .uar_index = mlx5_os_get_devx_uar_page_id(priv->uar), .user_index = qp_id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .pd = priv->pdn, - .uar_page = mlx5_os_get_devx_uar_page_id(priv->uar), - }, - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, }; uint32_t log_ops_n = rte_log2_u32(max_inflight_ops); uint32_t alloc_size = sizeof(*qp); @@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto err; } - sq_attr.cqn = qp->cq.cq->id; - sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format); - ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr, + qp_attr.cqn = qp->cq.cq->id; + qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + qp_attr.rq_size = 0; + qp_attr.sq_size = RTE_BIT32(log_ops_n); + qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp + && priv->mmo_dma_qp; + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr, socket_id); if (ret != 0) { - DRV_LOG(ERR, "Failed to create SQ."); + DRV_LOG(ERR, "Failed to create QP."); goto err; } - mlx5_compress_init_sq(qp); - ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr); - if (ret != 0) { - DRV_LOG(ERR, "Can't change SQ state to ready."); + mlx5_compress_init_qp(qp); + ret = mlx5_devx_qp2rts(&qp->qp, 0); + if (ret) goto err; - } /* Save pointer of global generation number to check memory event. 
*/ qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u", - (uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n); + (uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n); return 0; err: mlx5_compress_qp_release(dev, qp_id); @@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, { struct mlx5_compress_qp *qp = queue_pair; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes, *wqe; + qp->qp.wqes, *wqe; struct mlx5_compress_xform *xform; struct rte_comp_op *op; uint16_t mask = qp->entries_n - 1; @@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); + qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); rte_wmb(); mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv); rte_wmb(); @@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp, volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *) &qp->cq.cqes[idx]; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes; + qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; op->status = RTE_COMP_OP_STATUS_ERROR; @@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || - att.mmo_dma_sq_en == 0) { + ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || + att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); @@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct 
rte_device *dev) cdev->enqueue_burst = mlx5_compress_enqueue_burst; cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED; priv = cdev->data->dev_private; + priv->mmo_decomp_sq = att.mmo_decompress_sq_en; + priv->mmo_decomp_qp = att.mmo_decompress_qp_en; + priv->mmo_comp_sq = att.mmo_compress_sq_en; + priv->mmo_comp_qp = att.mmo_compress_qp_en; + priv->mmo_dma_sq = att.mmo_dma_sq_en; + priv->mmo_dma_qp = att.mmo_dma_qp_en; priv->ctx = ctx; priv->cdev = cdev; priv->min_block_size = att.compress_min_block_size; - priv->sq_ts_format = att.sq_ts_format; + priv->qp_ts_format = att.qp_ts_format; if (mlx5_compress_hw_global_prepare(priv) != 0) { rte_compressdev_pmd_destroy(priv->cdev); claim_zero(mlx5_glue->close_device(priv->ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
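The probe check and the `qp_attr.mmo` assignment in this patch boil down to two predicates over the six cached capability bits. A minimal sketch of that logic follows; the struct and function names are hypothetical, and only the boolean conditions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the six capability bits cached by the patch. */
struct toy_mmo_caps {
	bool comp_sq, decomp_sq, dma_sq;
	bool comp_qp, decomp_qp, dma_qp;
};

/* Probe passes when either the full SQ set or the full QP set is present. */
static bool
toy_probe_ok(const struct toy_mmo_caps *c)
{
	assert(c != NULL);
	return (c->comp_sq && c->decomp_sq && c->dma_sq) ||
	       (c->comp_qp && c->decomp_qp && c->dma_qp);
}

/* The QP mmo attribute is requested only when all three QP caps are set. */
static bool
toy_qp_mmo(const struct toy_mmo_caps *c)
{
	assert(c != NULL);
	return c->decomp_qp && c->comp_qp && c->dma_qp;
}
```

A device that reports only the SQ capabilities still probes successfully but creates its QP with `mmo` cleared. A device that reports a partial mix, for example compress on QP and DMA on SQ only, is rejected.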
* [dpdk-dev] [PATCH V4 5/5] regex/mlx5: refactor HW queue objects 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 4/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-09-28 12:16 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-28 12:16 UTC (permalink / raw) To: dev The mlx5 PMD for the regex class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that is supported only in the QP object. The FW introduced new capabilities that indicate whether the MMO configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the MMO configuration according to the new FW capabilities. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 4 files changed, 133 insertions(+), 125 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index f17b6df47f..a3368749b9 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,8 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && + (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +165,9 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. 
*/ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether QP supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. */ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..572ecc6d86 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp_obj, 0, sizeof(*qp_obj)); return 0; } @@ -131,45 +131,44 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, + log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, 
SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +223,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +237,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +265,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index c79445ce7d..0833b2817e 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * 
sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. */ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,42 +196,42 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & - MLX5_REGEX_MAX_WQE_INDEX); + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) + & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; rte_wmb(); } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,37 +287,37 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, - struct rte_regex_ops *op, struct mlx5_regex_job *job, - size_t pi, struct mlx5_klm *klm) +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, + struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, + struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, + size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0; uint32_t len = 0; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -329,7 +329,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -341,16 +341,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -384,22 +384,23 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, + &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -408,21 +409,22 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops 
be handled can't exceed nb_ops. */ if (nb_desc > nb_left) nb_desc = nb_left; else - queue->free_sqs &= ~(1 << sqid); - prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, + nb_desc); + send_doorbell(priv, qp_obj); nb_left -= nb_desc; } if (!nb_left) @@ -441,23 +443,25 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, job_id, i = 0; - - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - while (get_free(sq, priv->has_umr)) { - job_id = job_id_get(sqid, sq_size_get(sq), sq->pi); - prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]); + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, job_id, i = 0; + + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + while (get_free(qp_obj, priv->has_umr)) { + job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), + qp_obj->pi); + prep_one(priv, queue, qp_obj, ops[i], + &queue->jobs[job_id]); i++; if (unlikely(i == nb_ops)) { - send_doorbell(priv, sq); + send_doorbell(priv, qp_obj); goto out; } } - queue->free_sqs &= ~(1 << sqid); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + send_doorbell(priv, qp_obj); } out: @@ -567,21 +571,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, uint16_t wq_counter = (rte_be_to_cpu_16(cqe->wqe_counter) + 1) & MLX5_REGEX_MAX_WQE_INDEX; - size_t sqid = cqe->rsvd3[2]; - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; + size_t hw_qpid = cqe->rsvd3[2]; + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; /* UMR mode WQE counter move as WQE set(4 WQEBBS).*/ if (priv->has_umr) wq_counter >>= 2; - while (sq->ci != wq_counter) { + while (qp_obj->ci != wq_counter) 
{ if (unlikely(i == nb_ops)) { /* Return without updating cq->ci */ goto out; } - uint32_t job_id = job_id_get(sqid, sq_size_get(sq), - sq->ci); + uint32_t job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), qp_obj->ci); extract_result(ops[i], &queue->jobs[job_id]); - sq->ci = (sq->ci + 1) & (priv->has_umr ? + qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX); i++; @@ -589,7 +593,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, cq->ci = (cq->ci + 1) & 0xffffff; rte_wmb(); cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci); - queue->free_sqs |= (1 << sqid); + queue->free_qps |= (1 << hw_qpid); } out: @@ -598,15 +602,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, } static void -setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) +setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) { - size_t sqid, entry; + size_t hw_qpid, entry; uint32_t job_id; - for (sqid = 0; sqid < queue->nb_obj; sqid++) { - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes; - for (entry = 0 ; entry < sq_size_get(sq); entry++) { - job_id = sqid * sq_size_get(sq) + entry; + for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) { + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes; + for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) { + job_id = hw_qpid * qp_size_get(qp_obj) + entry; struct mlx5_regex_job *job = &queue->jobs[job_id]; /* Fill UMR WQE with NOP in advanced. 
*/ @@ -614,7 +618,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) set_wqe_ctrl_seg ((struct mlx5_wqe_ctrl_seg *)wqe, entry * 2, MLX5_OPCODE_NOP, 0, - sq->sq_obj.sq->id, 0, 12, 0, 0); + qp_obj->qp_obj.qp->id, 0, 12, 0, 0); wqe += MLX5_REGEX_UMR_WQE_SIZE; } set_metadata_seg((struct mlx5_wqe_metadata_seg *) @@ -628,7 +632,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) (uintptr_t)job->output); wqe += 64; } - queue->free_sqs |= 1 << sqid; + queue->free_qps |= 1 << hw_qpid; } } @@ -738,7 +742,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id) return err; } - setup_sqs(priv, qp); + setup_qps(priv, qp); if (priv->has_umr) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
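The index arithmetic that the fast path above keeps unchanged across the SQ to QP rename (job_id_get(), get_free() and the doorbell record value written in send_doorbell()) can be sketched in isolation as below. This is a minimal standalone sketch, not the driver code: MAX_WQE_INDEX is assumed to mirror MLX5_REGEX_MAX_WQE_INDEX with the value 0xffff, and ring_free() and doorbell_index() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors MLX5_REGEX_MAX_WQE_INDEX (16-bit WQE counter). */
#define MAX_WQE_INDEX 0xffffu

/* Job slot for a ring index: one block of qp_size jobs per HW QP. */
static uint32_t
job_id_get(uint32_t qid, size_t qp_size, size_t index)
{
	/* The mask below is only valid for a power-of-two ring size. */
	assert(qp_size != 0 && (qp_size & (qp_size - 1)) == 0);
	return (uint32_t)(qid * qp_size + (index & (qp_size - 1)));
}

/* Free descriptors, as in get_free(): the counter range shrinks by 4 in UMR mode. */
static size_t
ring_free(size_t qp_size, size_t pi, size_t ci, int has_umr)
{
	size_t mask = has_umr ? (MAX_WQE_INDEX >> 2) : MAX_WQE_INDEX;

	/* Unsigned subtraction plus mask handles producer index wrap-around. */
	return qp_size - ((pi - ci) & mask);
}

/* Value stored in the send doorbell record (before byte swapping). */
static uint32_t
doorbell_index(size_t db_pi, int has_umr)
{
	/* In UMR mode each job is a set of 4 WQEBBs; point at the last one. */
	return (uint32_t)((has_umr ? db_pi * 4 + 3 : db_pi) & MAX_WQE_INDEX);
}
```

For example, with a 64-entry ring, HW QP 2 and ring index 70 map to job 134, and a producer index that has wrapped from 0xffff to 1 still reports two descriptors in flight.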
* [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane ` (4 preceding siblings ...) 2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 5/5] regex/mlx5: refactor HW queue objects Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 1/5] common/mlx5: update new MMO HCA capabilities Raja Zidane ` (5 more replies) 5 siblings, 6 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev The mlx5 PMDs for the compress and regex classes use an MMO WQE operated by the GGA engine in BF devices. Currently, all MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, mmo, that is supported only in the QP object. The FW introduced new capabilities that indicate whether the mmo configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the mmo configuration according to the new FW capabilities. V2: fix checkpatch errors. V3: rebase. V4: fix compilation error in commit 2/5. V5: rebase. Raja Zidane (5): common/mlx5: update new MMO HCA capabilities common/mlx5: add MMO configuration for the DevX QP compress/mlx5: refactor queue HW object regex/mlx5: refactor HW queue objects compress/mlx5: allow partial transformations support drivers/common/mlx5/mlx5_devx_cmds.c | 22 ++- drivers/common/mlx5/mlx5_devx_cmds.h | 12 +- drivers/common/mlx5/mlx5_prm.h | 48 ++++++- drivers/compress/mlx5/mlx5_compress.c | 128 +++++++++++------ drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 8 files changed, 287 insertions(+), 181 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V5 1/5] common/mlx5: update new MMO HCA capabilities 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 2/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (4 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align hca capabilities with new prm. Add support in devx interface for changes in HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 15 ++++++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ drivers/compress/mlx5/mlx5_compress.c | 4 ++-- 4 files changed, 36 insertions(+), 14 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..00c78b1288 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,18 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_sq); + attr->mmo_decompress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_sq); + attr->mmo_dma_qp_en = MLX5_GET(cmd_hca_cap, 
hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..ec5f871c61 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xc]; + u8 regexp_mmo_sq[0x1]; + u8 reserved_at_2b0[0x3]; u8 max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 
max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index c5e0a83a8c..1e03030510 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -813,8 +813,8 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 || - att.mmo_dma_en == 0) { + att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
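The mlx5_prm.h hunks above carve new capability bits out of reserved ranges, which only keeps the rest of the layout in place if the replacement widths add up to the width that was removed. The arithmetic can be checked with a small standalone sketch; split_width() and the three arrays are hypothetical helpers for illustration, and the widths are copied from the hunks.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bit widths of the fields that replace one reserved range. */
static unsigned int
split_width(const unsigned int *widths, size_t n)
{
	unsigned int total = 0;

	assert(widths != NULL);
	for (size_t i = 0; i < n; i++)
		total += widths[i];
	return total;
}

/* reserved_at_2a0[0x10] -> reserved[0xc], regexp_mmo_sq[0x1], reserved[0x3] */
static const unsigned int split_2a0[] = { 0xc, 0x1, 0x3 };

/* reserved_at_61f[0x1e1] -> reserved[0x109], four *_mmo_qp bits, reserved[0xd4] */
static const unsigned int split_61f[] = { 0x109, 0x1, 0x1, 0x1, 0x1, 0xd4 };

/* create_qp_in reserved_at_40[0x40] -> qpc_ext[0x1], reserved[0x3f] */
static const unsigned int split_create_qp_40[] = { 0x1, 0x3f };
```

All three splits preserve the original width (0x10, 0x1e1 and 0x40 bits). If the reserved_at_<offset> names are read as bit offsets, the trailing reserved fields would start at 0x2ad and 0x72c rather than at the offsets their names suggest; that is a naming detail only and does not change the layout.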
* [dpdk-dev] [PATCH V5 2/5] common/mlx5: add MMO configuration for the DevX QP 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 1/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 3/5] compress/mlx5: refactor queue HW object Raja Zidane ` (3 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev A new configuration MMO was added to QP Context. If set, MMO WQEs are supported on this QP. For DMA MMO, supported only when dma_mmo_qp==1. For REGEXP MMO, supported only when regexp_mmo_qp==1. For COMPRESS MMO, supported only when compress_mmo_qp==1. For DECOMPRESS MMO, supported only when decompress_mmo_qp==1. Add support to DevX interface to set MMO bit. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 28 +++++++++++++++++++++++++++- 3 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 00c78b1288..eefb869b7d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2032,6 +2032,13 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if (attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, + in, qpc_extension_and_pas_list); + void *qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, + qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > 
MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index ec5f871c61..54e62aa153 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 pas[0][0x40]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,11 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits + qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
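To see what MLX5_SET(qpc_extension, qpc_ext, mmo, 1) amounts to at the byte level, the sketch below sets a field given its bit offset and width. It assumes the usual mlx5_ifc convention, in which bit offset 0 is the most significant bit of a big-endian 32-bit dword; prm_set_field() is a hypothetical helper written for this note and is not the macro used by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Set a field of `width` bits at bit offset `bit_off` in a buffer laid out
 * as big-endian dwords (assumed mlx5_ifc convention, see lead-in).
 * The field must not cross a dword boundary.
 */
static void
prm_set_field(uint8_t *buf, unsigned int bit_off, unsigned int width,
	      uint32_t val)
{
	assert(buf != NULL);
	assert(width >= 1 && width <= 32 && (bit_off % 32) + width <= 32);

	uint8_t *p = buf + (size_t)(bit_off / 32) * 4;
	unsigned int shift = 32 - width - (bit_off % 32);
	uint32_t ones = width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
	uint32_t mask = ones << shift;
	/* Read the dword as big-endian, independent of host byte order. */
	uint32_t dw = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		      ((uint32_t)p[2] << 8) | (uint32_t)p[3];

	dw = (dw & ~mask) | ((val << shift) & mask);
	/* Write it back in big-endian order. */
	p[0] = (uint8_t)(dw >> 24);
	p[1] = (uint8_t)(dw >> 16);
	p[2] = (uint8_t)(dw >> 8);
	p[3] = (uint8_t)dw;
}
```

With the layout from the patch (reserved_at_0[0x2] followed by mmo[0x1]), the mmo field sits at bit offset 2, so under this convention setting it to 1 turns the first byte of the QPC extension into 0x20 and leaves every other byte untouched.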
* [dpdk-dev] [PATCH V5 3/5] compress/mlx5: refactor queue HW object 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 1/5] common/mlx5: update new MMO HCA capabilities Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 2/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 4/5] regex/mlx5: refactor HW queue objects Raja Zidane ` (2 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev The mlx5 PMD for compress class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 73 +++++++++++++++------------ 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index 1e03030510..5c5aa87a18 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -40,7 +40,7 @@ struct mlx5_compress_priv { void *uar; uint32_t pdn; /* Protection Domain number. */ uint8_t min_block_size; - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ /* Minimum huffman block size supported by the device. 
*/ struct ibv_pd *pd; struct rte_compressdev_config dev_config; @@ -48,6 +48,13 @@ struct mlx5_compress_priv { rte_spinlock_t xform_sl; struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ volatile uint64_t *uar_addr; + /* HCA caps*/ + uint32_t mmo_decomp_sq:1; + uint32_t mmo_decomp_qp:1; + uint32_t mmo_comp_sq:1; + uint32_t mmo_comp_qp:1; + uint32_t mmo_dma_sq:1; + uint32_t mmo_dma_qp:1; #ifndef RTE_ARCH_64 rte_spinlock_t uar32_sl; #endif /* RTE_ARCH_64 */ @@ -61,7 +68,7 @@ struct mlx5_compress_qp { struct mlx5_mr_ctrl mr_ctrl; int socket_id; struct mlx5_devx_cq cq; - struct mlx5_devx_sq sq; + struct mlx5_devx_qp qp; struct mlx5_pmd_mr opaque_mr; struct rte_comp_op **ops; struct mlx5_compress_priv *priv; @@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) { struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->sq.sq != NULL) - mlx5_devx_sq_destroy(&qp->sq); + if (qp->qp.qp != NULL) + mlx5_devx_qp_destroy(&qp->qp); if (qp->cq.cq != NULL) mlx5_devx_cq_destroy(&qp->cq); if (qp->opaque_mr.obj != NULL) { @@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) } static void -mlx5_compress_init_sq(struct mlx5_compress_qp *qp) +mlx5_compress_init_qp(struct mlx5_compress_qp *qp) { volatile struct mlx5_gga_wqe *restrict wqe = - (volatile struct mlx5_gga_wqe *)qp->sq.wqes; + (volatile struct mlx5_gga_wqe *)qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; - const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u); + const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u); const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS << MLX5_COMP_MODE_OFFSET); const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey); @@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), }; - struct 
mlx5_devx_create_sq_attr sq_attr = { + struct mlx5_devx_qp_attr qp_attr = { + .pd = priv->pdn, + .uar_index = mlx5_os_get_devx_uar_page_id(priv->uar), .user_index = qp_id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .pd = priv->pdn, - .uar_page = mlx5_os_get_devx_uar_page_id(priv->uar), - }, - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, }; uint32_t log_ops_n = rte_log2_u32(max_inflight_ops); uint32_t alloc_size = sizeof(*qp); @@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto err; } - sq_attr.cqn = qp->cq.cq->id; - sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format); - ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr, + qp_attr.cqn = qp->cq.cq->id; + qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + qp_attr.rq_size = 0; + qp_attr.sq_size = RTE_BIT32(log_ops_n); + qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp + && priv->mmo_dma_qp; + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr, socket_id); if (ret != 0) { - DRV_LOG(ERR, "Failed to create SQ."); + DRV_LOG(ERR, "Failed to create QP."); goto err; } - mlx5_compress_init_sq(qp); - ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr); - if (ret != 0) { - DRV_LOG(ERR, "Can't change SQ state to ready."); + mlx5_compress_init_qp(qp); + ret = mlx5_devx_qp2rts(&qp->qp, 0); + if (ret) goto err; - } /* Save pointer of global generation number to check memory event. 
*/ qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u", - (uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n); + (uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n); return 0; err: mlx5_compress_qp_release(dev, qp_id); @@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, { struct mlx5_compress_qp *qp = queue_pair; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes, *wqe; + qp->qp.wqes, *wqe; struct mlx5_compress_xform *xform; struct rte_comp_op *op; uint16_t mask = qp->entries_n - 1; @@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); + qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); rte_wmb(); mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv); rte_wmb(); @@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp, volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *) &qp->cq.cqes[idx]; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes; + qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; op->status = RTE_COMP_OP_STATUS_ERROR; @@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || - att.mmo_dma_sq_en == 0) { + ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || + att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); @@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct 
rte_device *dev) cdev->enqueue_burst = mlx5_compress_enqueue_burst; cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED; priv = cdev->data->dev_private; + priv->mmo_decomp_sq = att.mmo_decompress_sq_en; + priv->mmo_decomp_qp = att.mmo_decompress_qp_en; + priv->mmo_comp_sq = att.mmo_compress_sq_en; + priv->mmo_comp_qp = att.mmo_compress_qp_en; + priv->mmo_dma_sq = att.mmo_dma_sq_en; + priv->mmo_dma_qp = att.mmo_dma_qp_en; priv->ctx = ctx; priv->cdev = cdev; priv->min_block_size = att.compress_min_block_size; - priv->sq_ts_format = att.sq_ts_format; + priv->qp_ts_format = att.qp_ts_format; if (mlx5_compress_hw_global_prepare(priv) != 0) { rte_compressdev_pmd_destroy(priv->cdev); claim_zero(mlx5_glue->close_device(priv->ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
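The capability handling in this patch reduces to two predicates: probing succeeds when either the complete SQ set or the complete QP set of compress, decompress and DMA MMO capabilities is present, and the QP is created with mmo set only when the complete QP set is present. A minimal standalone sketch of that logic follows; struct mmo_caps and the function names are hypothetical stand-ins for the fields added to struct mlx5_compress_priv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the six HCA capability bits kept by the driver. */
struct mmo_caps {
	unsigned int comp_sq:1;
	unsigned int decomp_sq:1;
	unsigned int dma_sq:1;
	unsigned int comp_qp:1;
	unsigned int decomp_qp:1;
	unsigned int dma_qp:1;
};

static bool
caps_all_sq(const struct mmo_caps *c)
{
	assert(c != NULL);
	return c->comp_sq && c->decomp_sq && c->dma_sq;
}

static bool
caps_all_qp(const struct mmo_caps *c)
{
	assert(c != NULL);
	return c->comp_qp && c->decomp_qp && c->dma_qp;
}

/* Probe check: the inverse of the error condition in the probe hunk. */
static bool
compress_probe_ok(const struct mmo_caps *c)
{
	return caps_all_sq(c) || caps_all_qp(c);
}

/* Value assigned to qp_attr.mmo in the QP setup hunk. */
static bool
compress_qp_mmo(const struct mmo_caps *c)
{
	return caps_all_qp(c);
}
```

A device that reports only the three SQ capabilities therefore still probes and gets a QP without the mmo bit, while a device with a mixed, incomplete set (for example compress on SQ but decompress and DMA on QP only) is rejected.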
* [dpdk-dev] [PATCH V5 4/5] regex/mlx5: refactor HW queue objects 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane ` (2 preceding siblings ...) 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 3/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 5/5] compress/mlx5: allow partial transformations support Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev The mlx5 PMD for regex class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 4 files changed, 133 insertions(+), 125 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index 8866a4d0c6..5aa988be6d 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,8 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && + (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +165,9 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. 
*/ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. */ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..572ecc6d86 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp_obj, 0, sizeof(*qp_obj)); return 0; } @@ -131,45 +131,44 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, + log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, 
SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +223,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +237,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +265,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index c79445ce7d..0833b2817e 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * 
sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. */ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,42 +196,42 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & - MLX5_REGEX_MAX_WQE_INDEX); + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) + & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; rte_wmb(); } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,37 +287,37 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, - struct rte_regex_ops *op, struct mlx5_regex_job *job, - size_t pi, struct mlx5_klm *klm) +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, + struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, + struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, + size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0; uint32_t len = 0; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -329,7 +329,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -341,16 +341,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -384,22 +384,23 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, + &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -408,21 +409,22 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops 
be handled can't exceed nb_ops. */ if (nb_desc > nb_left) nb_desc = nb_left; else - queue->free_sqs &= ~(1 << sqid); - prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, + nb_desc); + send_doorbell(priv, qp_obj); nb_left -= nb_desc; } if (!nb_left) @@ -441,23 +443,25 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, job_id, i = 0; - - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - while (get_free(sq, priv->has_umr)) { - job_id = job_id_get(sqid, sq_size_get(sq), sq->pi); - prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]); + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, job_id, i = 0; + + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + while (get_free(qp_obj, priv->has_umr)) { + job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), + qp_obj->pi); + prep_one(priv, queue, qp_obj, ops[i], + &queue->jobs[job_id]); i++; if (unlikely(i == nb_ops)) { - send_doorbell(priv, sq); + send_doorbell(priv, qp_obj); goto out; } } - queue->free_sqs &= ~(1 << sqid); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + send_doorbell(priv, qp_obj); } out: @@ -567,21 +571,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, uint16_t wq_counter = (rte_be_to_cpu_16(cqe->wqe_counter) + 1) & MLX5_REGEX_MAX_WQE_INDEX; - size_t sqid = cqe->rsvd3[2]; - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; + size_t hw_qpid = cqe->rsvd3[2]; + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; /* UMR mode WQE counter move as WQE set(4 WQEBBS).*/ if (priv->has_umr) wq_counter >>= 2; - while (sq->ci != wq_counter) { + while (qp_obj->ci != wq_counter) 
{ if (unlikely(i == nb_ops)) { /* Return without updating cq->ci */ goto out; } - uint32_t job_id = job_id_get(sqid, sq_size_get(sq), - sq->ci); + uint32_t job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), qp_obj->ci); extract_result(ops[i], &queue->jobs[job_id]); - sq->ci = (sq->ci + 1) & (priv->has_umr ? + qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX); i++; @@ -589,7 +593,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, cq->ci = (cq->ci + 1) & 0xffffff; rte_wmb(); cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci); - queue->free_sqs |= (1 << sqid); + queue->free_qps |= (1 << hw_qpid); } out: @@ -598,15 +602,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, } static void -setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) +setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) { - size_t sqid, entry; + size_t hw_qpid, entry; uint32_t job_id; - for (sqid = 0; sqid < queue->nb_obj; sqid++) { - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes; - for (entry = 0 ; entry < sq_size_get(sq); entry++) { - job_id = sqid * sq_size_get(sq) + entry; + for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) { + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes; + for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) { + job_id = hw_qpid * qp_size_get(qp_obj) + entry; struct mlx5_regex_job *job = &queue->jobs[job_id]; /* Fill UMR WQE with NOP in advanced. 
*/ @@ -614,7 +618,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) set_wqe_ctrl_seg ((struct mlx5_wqe_ctrl_seg *)wqe, entry * 2, MLX5_OPCODE_NOP, 0, - sq->sq_obj.sq->id, 0, 12, 0, 0); + qp_obj->qp_obj.qp->id, 0, 12, 0, 0); wqe += MLX5_REGEX_UMR_WQE_SIZE; } set_metadata_seg((struct mlx5_wqe_metadata_seg *) @@ -628,7 +632,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) (uintptr_t)job->output); wqe += 64; } - queue->free_sqs |= 1 << sqid; + queue->free_qps |= 1 << hw_qpid; } } @@ -738,7 +742,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id) return err; } - setup_sqs(priv, qp); + setup_qps(priv, qp); if (priv->has_umr) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
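The fast-path helpers touched by the patch above (`qp_size_get()`, `get_free()`, `job_id_get()`) all rely on the same power-of-two ring arithmetic. A minimal standalone sketch of that arithmetic follows. The `sketch_*` names are hypothetical, and `SKETCH_MAX_WQE_INDEX` is an assumed stand-in for `MLX5_REGEX_MAX_WQE_INDEX`, taken here to be a 16-bit wrap; it is an illustration, not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for MLX5_REGEX_MAX_WQE_INDEX (16-bit wrap). */
#define SKETCH_MAX_WQE_INDEX 0xffffu

/* Number of descriptors in one hardware QP ring. */
static inline uint32_t
sketch_qp_size(uint16_t log_nb_desc)
{
	return 1U << log_nb_desc;
}

/*
 * Free slots = ring size minus in-flight entries. The producer/consumer
 * difference is masked so it stays correct across index wrap-around.
 * In UMR mode the indexes count WQE sets of 4, hence the narrower mask.
 */
static inline uint32_t
sketch_get_free(uint16_t log_nb_desc, size_t pi, size_t ci, int has_umr)
{
	uint32_t mask = has_umr ? (SKETCH_MAX_WQE_INDEX >> 2) :
				  SKETCH_MAX_WQE_INDEX;

	return sketch_qp_size(log_nb_desc) - (uint32_t)((pi - ci) & mask);
}

/* Flat job index: each hardware QP owns a contiguous block of jobs. */
static inline uint32_t
sketch_job_id(uint32_t qid, size_t qp_size, size_t index)
{
	/* The masking below only works for power-of-two ring sizes. */
	assert(qp_size != 0 && (qp_size & (qp_size - 1)) == 0);
	return (uint32_t)(qid * qp_size + (index & (qp_size - 1)));
}
```

Because every hardware QP maps to its own block of `qp_size` jobs, the rename from `sqid` to `hw_qpid` in the patch leaves the job layout unchanged.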
* [dpdk-dev] [PATCH V5 5/5] compress/mlx5: allow partial transformations support 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 4/5] regex/mlx5: refactor HW queue objects Raja Zidane @ 2021-09-30 5:44 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-30 5:44 UTC (permalink / raw) To: dev Currently compress, decompress and dma are allowed only when all 3 capabilities are on. A case where the user wants decompress offload, if decompress capability is on but one of compress, dma is off, is not allowed. Split compress/decompress/dma support check to allow partial transformations. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 61 ++++++++++++++++++++------- 1 file changed, 46 insertions(+), 15 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index 5c5aa87a18..e94e8fb0c6 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -291,17 +291,44 @@ mlx5_compress_xform_create(struct rte_compressdev *dev, struct mlx5_compress_xform *xfrm; uint32_t size; - if (xform->type == RTE_COMP_COMPRESS && xform->compress.level == - RTE_COMP_LEVEL_NONE) { - DRV_LOG(ERR, "Non-compressed block is not supported."); - return -ENOTSUP; - } - if ((xform->type == RTE_COMP_COMPRESS && xform->compress.hash_algo != - RTE_COMP_HASH_ALGO_NONE) || (xform->type == RTE_COMP_DECOMPRESS && - xform->decompress.hash_algo != RTE_COMP_HASH_ALGO_NONE)) { - DRV_LOG(ERR, "SHA is not supported."); + switch (xform->type) { + case RTE_COMP_COMPRESS: + if (xform->compress.algo == RTE_COMP_ALGO_NULL && + !priv->mmo_dma_qp && !priv->mmo_dma_sq) { + DRV_LOG(ERR, "Not enough capabilities to support 
DMA operation, maybe old FW/OFED version?"); + return -ENOTSUP; + } else if (!priv->mmo_comp_qp && !priv->mmo_comp_sq) { + DRV_LOG(ERR, "Not enough capabilities to support compress operation, maybe old FW/OFED version?"); + return -ENOTSUP; + } + if (xform->compress.level == RTE_COMP_LEVEL_NONE) { + DRV_LOG(ERR, "Non-compressed block is not supported."); + return -ENOTSUP; + } + if (xform->compress.hash_algo != RTE_COMP_HASH_ALGO_NONE) { + DRV_LOG(ERR, "SHA is not supported."); + return -ENOTSUP; + } + break; + case RTE_COMP_DECOMPRESS: + if (xform->decompress.algo == RTE_COMP_ALGO_NULL && + !priv->mmo_dma_qp && !priv->mmo_dma_sq) { + DRV_LOG(ERR, "Not enough capabilities to support DMA operation, maybe old FW/OFED version?"); + return -ENOTSUP; + } else if (!priv->mmo_decomp_qp && !priv->mmo_decomp_sq) { + DRV_LOG(ERR, "Not enough capabilities to support decompress operation, maybe old FW/OFED version?"); + return -ENOTSUP; + } + if (xform->decompress.hash_algo != RTE_COMP_HASH_ALGO_NONE) { + DRV_LOG(ERR, "SHA is not supported."); + return -ENOTSUP; + } + break; + default: + DRV_LOG(ERR, "Xform type should be compress/decompress"); return -ENOTSUP; } + xfrm = rte_zmalloc_socket(__func__, sizeof(*xfrm), 0, priv->dev_config.socket_id); if (xfrm == NULL) @@ -816,12 +843,16 @@ mlx5_compress_dev_probe(struct rte_device *dev) rte_errno = ENODEV; return -rte_errno; } - if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || - att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || - att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { - DRV_LOG(ERR, "Not enough capabilities to support compress " - "operations, maybe old FW/OFED version?"); + if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0) { + DRV_LOG(ERR, "Failed to query device capabilities"); + claim_zero(mlx5_glue->close_device(ctx)); + rte_errno = ENOTSUP; + return -ENOTSUP; + } + if (!att.mmo_decompress_qp_en && !att.mmo_decompress_sq_en + && 
!att.mmo_compress_qp_en && !att.mmo_compress_sq_en + && !att.mmo_dma_qp_en && !att.mmo_dma_sq_en) { + DRV_LOG(ERR, "Not enough capabilities to support compress operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); rte_errno = ENOTSUP; return -ENOTSUP; -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
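The commit message above describes splitting one all-or-nothing capability check into per-transformation checks. The sketch below is a simplified reading of that intent, not the patch's exact control flow: here a NULL algorithm depends only on the DMA capability, whereas the patch chains its checks with `else if`. The struct, enum and function names are hypothetical; the fields only loosely mirror the `priv->mmo_*` flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability flags, loosely mirroring priv->mmo_* fields. */
struct sketch_caps {
	bool comp_qp, comp_sq;
	bool decomp_qp, decomp_sq;
	bool dma_qp, dma_sq;
};

enum sketch_xform { SKETCH_COMPRESS, SKETCH_DECOMPRESS };

/*
 * Return 0 when the requested transformation is backed by at least one
 * capability (QP or SQ flavour), -ENOTSUP otherwise.
 * algo_null selects the DMA (plain copy) path.
 */
static int
sketch_xform_check(const struct sketch_caps *c, enum sketch_xform type,
		   bool algo_null)
{
	assert(c != NULL);
	if (algo_null)
		return (c->dma_qp || c->dma_sq) ? 0 : -ENOTSUP;
	switch (type) {
	case SKETCH_COMPRESS:
		return (c->comp_qp || c->comp_sq) ? 0 : -ENOTSUP;
	case SKETCH_DECOMPRESS:
		return (c->decomp_qp || c->decomp_sq) ? 0 : -ENOTSUP;
	}
	return -ENOTSUP;
}
```

With this shape, a device that exposes only the decompress capability can still serve decompress transformations, which is the case the commit message says was previously rejected.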
* [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane ` (4 preceding siblings ...) 2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 5/5] compress/mlx5: allow partial transformations support Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 1/5] common/mlx5: share DevX QP operations Raja Zidane ` (5 more replies) 5 siblings, 6 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev The mlx5 PMDs for the compress and regex classes use an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, mmo, that will be supported only in the QP object. The FW introduced new capabilities to define whether the mmo configuration should be configured for the GGA queue. Replace all the GGA queue objects with QPs and set the mmo configuration according to the new FW capabilities. V2: fix checkpatch errors. V3: rebase. V4: fix compilation error in commit 2/5. V5: rebase. V6: rebase on main. 
Raja Zidane (5): common/mlx5: share DevX QP operations common/mlx5: update new MMO HCA capabilities common/mlx5: add MMO configuration for the DevX QP compress/mlx5: refactor queue HW object regex/mlx5: refactor HW queue objects drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++ drivers/common/mlx5/mlx5_common_devx.h | 23 +++ drivers/common/mlx5/mlx5_devx_cmds.c | 23 ++- drivers/common/mlx5/mlx5_devx_cmds.h | 13 +- drivers/common/mlx5/mlx5_prm.h | 48 ++++++- drivers/common/mlx5/version.map | 3 + drivers/compress/mlx5/mlx5_compress.c | 73 +++++----- drivers/crypto/mlx5/mlx5_crypto.c | 102 ++++---------- drivers/crypto/mlx5/mlx5_crypto.h | 5 +- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- drivers/vdpa/mlx5/mlx5_vdpa.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 59 +++----- 15 files changed, 461 insertions(+), 295 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
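As a rough illustration of the cover letter's last point, the sketch below mirrors the regex probe check and the `attr.mmo` assignment from patch 4/5 of this series: the device is usable if any of the three regex capability bits is set and at least one engine is reported, and the QP is created with mmo only when the QP-side capability is present. The struct and function names are hypothetical, and the field layout is an assumption made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the HCA attributes queried at probe time. */
struct sketch_hca_caps {
	unsigned int regex:1;           /* Legacy RegEx capability. */
	unsigned int mmo_regex_sq_en:1; /* MMO RegEx allowed on SQ. */
	unsigned int mmo_regex_qp_en:1; /* MMO RegEx allowed on QP. */
	uint8_t num_engines;            /* Reported RegEx engines. */
};

/* Probe-time gate: any regex capability bit plus at least one engine. */
static bool
sketch_regex_supported(const struct sketch_hca_caps *caps)
{
	assert(caps != NULL);
	return (caps->regex || caps->mmo_regex_sq_en ||
		caps->mmo_regex_qp_en) && caps->num_engines != 0;
}

/* Value for the QP attribute's mmo field: follow the QP capability. */
static uint32_t
sketch_qp_mmo(const struct sketch_hca_caps *caps)
{
	assert(caps != NULL);
	return caps->mmo_regex_qp_en;
}
```

On a device that reports only the legacy capability, `sketch_qp_mmo()` returns 0, so the QP would be created without the new configuration.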
* [dpdk-dev] [PATCH V6 1/5] common/mlx5: share DevX QP operations 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane ` (4 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev Currently drivers using QP (vDPA, crypto and compress, regex soon) manage their memory, creation, modification and destruction of the QP, in almost identical code. Move QP memory management, creation and destruction to common. Add common function to change QP state to RTS. Add user_index attribute to QP creation. It's for better code maintenance and reuse. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_common_devx.c | 144 +++++++++++++++++++++++++ drivers/common/mlx5/mlx5_common_devx.h | 23 ++++ drivers/common/mlx5/mlx5_devx_cmds.c | 1 + drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/version.map | 3 + drivers/crypto/mlx5/mlx5_crypto.c | 102 +++++------------- drivers/crypto/mlx5/mlx5_crypto.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa.h | 5 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 59 +++------- 9 files changed, 217 insertions(+), 126 deletions(-) diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c index 22c8d356c4..825f84b183 100644 --- a/drivers/common/mlx5/mlx5_common_devx.c +++ b/drivers/common/mlx5/mlx5_common_devx.c @@ -271,6 +271,115 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, return -rte_errno; } +/** + * Destroy DevX Queue Pair. + * + * @param[in] qp + * DevX QP to destroy. 
+ */ +void +mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp) +{ + if (qp->qp) + claim_zero(mlx5_devx_cmd_destroy(qp->qp)); + if (qp->umem_obj) + claim_zero(mlx5_os_umem_dereg(qp->umem_obj)); + if (qp->umem_buf) + mlx5_free((void *)(uintptr_t)qp->umem_buf); +} + +/** + * Create Queue Pair using DevX API. + * + * Get a pointer to partially initialized attributes structure, and updates the + * following fields: + * wq_umem_id + * wq_umem_offset + * dbr_umem_valid + * dbr_umem_id + * dbr_address + * log_page_size + * All other fields are updated by caller. + * + * @param[in] ctx + * Context returned from mlx5 open_device() glue function. + * @param[in/out] qp_obj + * Pointer to QP to create. + * @param[in] log_wqbb_n + * Log of number of WQBBs in queue. + * @param[in] attr + * Pointer to QP attributes structure. + * @param[in] socket + * Socket to use for allocation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. + */ +int +mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket) +{ + struct mlx5_devx_obj *qp = NULL; + struct mlx5dv_devx_umem *umem_obj = NULL; + void *umem_buf = NULL; + size_t alignment = MLX5_WQE_BUF_ALIGNMENT; + uint32_t umem_size, umem_dbrec; + uint16_t qp_size = 1 << log_wqbb_n; + int ret; + + if (alignment == (size_t)-1) { + DRV_LOG(ERR, "Failed to get WQE buf alignment."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Allocate memory buffer for WQEs and doorbell record. */ + umem_size = MLX5_WQE_SIZE * qp_size; + umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE); + umem_size += MLX5_DBR_SIZE; + umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size, + alignment, socket); + if (!umem_buf) { + DRV_LOG(ERR, "Failed to allocate memory for QP."); + rte_errno = ENOMEM; + return -rte_errno; + } + /* Register allocated buffer in user space with DevX. 
*/ + umem_obj = mlx5_os_umem_reg(ctx, (void *)(uintptr_t)umem_buf, umem_size, + IBV_ACCESS_LOCAL_WRITE); + if (!umem_obj) { + DRV_LOG(ERR, "Failed to register umem for QP."); + rte_errno = errno; + goto error; + } + /* Fill attributes for SQ object creation. */ + attr->wq_umem_id = mlx5_os_get_umem_id(umem_obj); + attr->wq_umem_offset = 0; + attr->dbr_umem_valid = 1; + attr->dbr_umem_id = attr->wq_umem_id; + attr->dbr_address = umem_dbrec; + attr->log_page_size = MLX5_LOG_PAGE_SIZE; + /* Create send queue object with DevX. */ + qp = mlx5_devx_cmd_create_qp(ctx, attr); + if (!qp) { + DRV_LOG(ERR, "Can't create DevX QP object."); + rte_errno = ENOMEM; + goto error; + } + qp_obj->umem_buf = umem_buf; + qp_obj->umem_obj = umem_obj; + qp_obj->qp = qp; + qp_obj->db_rec = RTE_PTR_ADD(qp_obj->umem_buf, umem_dbrec); + return 0; +error: + ret = rte_errno; + if (umem_obj) + claim_zero(mlx5_os_umem_dereg(umem_obj)); + if (umem_buf) + mlx5_free((void *)(uintptr_t)umem_buf); + rte_errno = ret; + return -rte_errno; +} + /** * Destroy DevX Receive Queue. * @@ -385,3 +494,38 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, return -rte_errno; } + +/** + * Change QP state to RTS. + * + * @param[in] qp + * DevX QP to change. + * @param[in] remote_qp_id + * The remote QP ID for MLX5_CMD_OP_INIT2RTR_QP operation. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set. 
+ */ +int +mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id) +{ + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RST2INIT_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_INIT2RTR_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", + rte_errno); + return -1; + } + if (mlx5_devx_cmd_modify_qp_state(qp->qp, MLX5_CMD_OP_RTR2RTS_QP, + remote_qp_id)) { + DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", + rte_errno); + return -1; + } + return 0; +} diff --git a/drivers/common/mlx5/mlx5_common_devx.h b/drivers/common/mlx5/mlx5_common_devx.h index aad0184e5a..f699405f69 100644 --- a/drivers/common/mlx5/mlx5_common_devx.h +++ b/drivers/common/mlx5/mlx5_common_devx.h @@ -33,6 +33,18 @@ struct mlx5_devx_sq { volatile uint32_t *db_rec; /* The SQ doorbell record. */ }; +/* DevX Queue Pair structure. */ +struct mlx5_devx_qp { + struct mlx5_devx_obj *qp; /* The QP DevX object. */ + void *umem_obj; /* The QP umem object. */ + union { + void *umem_buf; + struct mlx5_wqe *wqes; /* The QP ring buffer. */ + struct mlx5_aso_wqe *aso_wqes; + }; + volatile uint32_t *db_rec; /* The QP doorbell record. */ +}; + /* DevX Receive Queue structure. */ struct mlx5_devx_rq { struct mlx5_devx_obj *rq; /* The RQ DevX object. 
*/ @@ -59,6 +71,14 @@ int mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n, struct mlx5_devx_create_sq_attr *attr, int socket); +__rte_internal +void mlx5_devx_qp_destroy(struct mlx5_devx_qp *qp); + +__rte_internal +int mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, + uint16_t log_wqbb_n, + struct mlx5_devx_qp_attr *attr, int socket); + __rte_internal void mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq); @@ -67,4 +87,7 @@ int mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size, uint16_t log_wqbb_n, struct mlx5_devx_create_rq_attr *attr, int socket); +__rte_internal +int mlx5_devx_qp2rts(struct mlx5_devx_qp *qp, uint32_t remote_qp_id); + #endif /* RTE_PMD_MLX5_COMMON_DEVX_H_ */ diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 56407cc332..ac554cca05 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2021,6 +2021,7 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, st, MLX5_QP_ST_RC); MLX5_SET(qpc, qpc, pd, attr->pd); MLX5_SET(qpc, qpc, ts_format, attr->ts_format); + MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index e576e30f24..c071629904 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -397,6 +397,7 @@ struct mlx5_devx_qp_attr { uint64_t dbr_address; uint32_t wq_umem_id; uint64_t wq_umem_offset; + uint32_t user_index:24; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index e5cb6b7060..d3c5040aac 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -67,6 +67,9 @@ INTERNAL { mlx5_devx_get_out_command_status; + mlx5_devx_qp2rts; + 
mlx5_devx_qp_create; + mlx5_devx_qp_destroy; mlx5_devx_rq_create; mlx5_devx_rq_destroy; mlx5_devx_sq_create; diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c index 682cf8b607..6a2f8b6ac6 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.c +++ b/drivers/crypto/mlx5/mlx5_crypto.c @@ -267,12 +267,7 @@ mlx5_crypto_qp_release(struct mlx5_crypto_qp *qp) { if (qp == NULL) return; - if (qp->qp_obj != NULL) - claim_zero(mlx5_devx_cmd_destroy(qp->qp_obj)); - if (qp->umem_obj != NULL) - claim_zero(mlx5_glue->devx_umem_dereg(qp->umem_obj)); - if (qp->umem_buf != NULL) - rte_free(qp->umem_buf); + mlx5_devx_qp_destroy(&qp->qp_obj); mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); mlx5_devx_cq_destroy(&qp->cq_obj); rte_free(qp); @@ -289,34 +284,6 @@ mlx5_crypto_queue_pair_release(struct rte_cryptodev *dev, uint16_t qp_id) return 0; } -static int -mlx5_crypto_qp2rts(struct mlx5_crypto_qp *qp) -{ - /* - * In Order to configure self loopback, when calling these functions the - * remote QP id that is used is the id of the same QP. 
- */ - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RST2INIT_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to INIT state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_INIT2RTR_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTR state(%u).", - rte_errno); - return -1; - } - if (mlx5_devx_cmd_modify_qp_state(qp->qp_obj, MLX5_CMD_OP_RTR2RTS_QP, - qp->qp_obj->id)) { - DRV_LOG(ERR, "Failed to modify QP to RTS state(%u).", - rte_errno); - return -1; - } - return 0; -} - static __rte_noinline uint32_t mlx5_crypto_get_block_size(struct rte_crypto_op *op) { @@ -471,7 +438,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, memcpy(klms, &umr->kseg[0], sizeof(*klms) * klm_n); } ds = 2 + klm_n; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_RDMA_WRITE); ds = RTE_ALIGN(ds, 4); @@ -480,7 +447,7 @@ mlx5_crypto_wqe_set(struct mlx5_crypto_priv *priv, if (priv->max_rdmar_ds > ds) { cseg += ds; ds = priv->max_rdmar_ds - ds; - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | ds); + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | ds); cseg->opcode = rte_cpu_to_be_32((qp->db_pi << 8) | MLX5_OPCODE_NOP); qp->db_pi += ds >> 2; /* Here, DS is 4 aligned for sure. 
*/ @@ -524,7 +491,8 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, do { idx = qp->pi & mask; op = *ops++; - umr = RTE_PTR_ADD(qp->umem_buf, priv->wqe_set_size * idx); + umr = RTE_PTR_ADD(qp->qp_obj.umem_buf, + priv->wqe_set_size * idx); if (unlikely(mlx5_crypto_wqe_set(priv, qp, op, umr) == 0)) { qp->stats.enqueue_err_count++; if (remain != nb_ops) { @@ -538,7 +506,7 @@ mlx5_crypto_enqueue_burst(void *queue_pair, struct rte_crypto_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); + qp->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->db_pi); rte_wmb(); mlx5_crypto_uar_write(*(volatile uint64_t *)qp->wqe, qp->priv); rte_wmb(); @@ -604,8 +572,8 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) uint32_t i; for (i = 0 ; i < qp->entries_n; i++) { - struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->umem_buf, i * - priv->wqe_set_size); + struct mlx5_wqe_cseg *cseg = RTE_PTR_ADD(qp->qp_obj.umem_buf, + i * priv->wqe_set_size); struct mlx5_wqe_umr_cseg *ucseg = (struct mlx5_wqe_umr_cseg *) (cseg + 1); struct mlx5_wqe_umr_bsf_seg *bsf = @@ -614,7 +582,7 @@ mlx5_crypto_qp_init(struct mlx5_crypto_priv *priv, struct mlx5_crypto_qp *qp) struct mlx5_wqe_rseg *rseg; /* Init UMR WQE. 
*/ - cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj->id << 8) | + cseg->sq_ds = rte_cpu_to_be_32((qp->qp_obj.qp->id << 8) | (priv->umr_wqe_size / MLX5_WSEG_SIZE)); cseg->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); @@ -649,7 +617,7 @@ mlx5_crypto_indirect_mkeys_prepare(struct mlx5_crypto_priv *priv, .klm_num = RTE_ALIGN(priv->max_segs_num, 4), }; - for (umr = (struct mlx5_umr_wqe *)qp->umem_buf, i = 0; + for (umr = (struct mlx5_umr_wqe *)qp->qp_obj.umem_buf, i = 0; i < qp->entries_n; i++, umr = RTE_PTR_ADD(umr, priv->wqe_set_size)) { attr.klm_array = (struct mlx5_klm *)&umr->kseg[0]; qp->mkey[i] = mlx5_devx_cmd_mkey_create(priv->ctx, &attr); @@ -672,9 +640,7 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, struct mlx5_devx_qp_attr attr = {0}; struct mlx5_crypto_qp *qp; uint16_t log_nb_desc = rte_log2_u32(qp_conf->nb_descriptors); - uint32_t umem_size = RTE_BIT32(log_nb_desc) * - priv->wqe_set_size + - sizeof(*qp->db_rec) * 2; + uint32_t ret; uint32_t alloc_size = sizeof(*qp); struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), @@ -698,18 +664,16 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto error; } - qp->umem_buf = rte_zmalloc_socket(__func__, umem_size, 4096, socket_id); - if (qp->umem_buf == NULL) { - DRV_LOG(ERR, "Failed to allocate QP umem."); - rte_errno = ENOMEM; - goto error; - } - qp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)qp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (qp->umem_obj == NULL) { - DRV_LOG(ERR, "Failed to register QP umem."); + attr.pd = priv->pdn; + attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); + attr.cqn = qp->cq_obj.cq->id; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(log_nb_desc); + attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp_obj, log_nb_desc, &attr, + socket_id); + if 
(ret) { + DRV_LOG(ERR, "Failed to create QP."); goto error; } if (mlx5_mr_btree_init(&qp->mr_ctrl.cache_bh, MLX5_MR_BTREE_CACHE_N, @@ -720,25 +684,11 @@ mlx5_crypto_queue_pair_setup(struct rte_cryptodev *dev, uint16_t qp_id, goto error; } qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; - attr.pd = priv->pdn; - attr.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar); - attr.cqn = qp->cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 0; - attr.sq_size = RTE_BIT32(log_nb_desc); - attr.dbr_umem_valid = 1; - attr.wq_umem_id = qp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = qp->umem_obj->umem_id; - attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_nb_desc) * priv->wqe_set_size; - qp->qp_obj = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (qp->qp_obj == NULL) { - DRV_LOG(ERR, "Failed to create QP(%u).", rte_errno); - goto error; - } - qp->db_rec = RTE_PTR_ADD(qp->umem_buf, (uintptr_t)attr.dbr_address); - if (mlx5_crypto_qp2rts(qp)) + /* + * In Order to configure self loopback, when calling devx qp2rts the + * remote QP id that is used is the id of the same QP. + */ + if (mlx5_devx_qp2rts(&qp->qp_obj, qp->qp_obj.qp->id)) goto error; qp->mkey = (struct mlx5_devx_obj **)RTE_ALIGN((uintptr_t)(qp + 1), RTE_CACHE_LINE_SIZE); diff --git a/drivers/crypto/mlx5/mlx5_crypto.h b/drivers/crypto/mlx5/mlx5_crypto.h index d589e0ac3d..4d7e6d2d10 100644 --- a/drivers/crypto/mlx5/mlx5_crypto.h +++ b/drivers/crypto/mlx5/mlx5_crypto.h @@ -44,11 +44,8 @@ struct mlx5_crypto_priv { struct mlx5_crypto_qp { struct mlx5_crypto_priv *priv; struct mlx5_devx_cq cq_obj; - struct mlx5_devx_obj *qp_obj; + struct mlx5_devx_qp qp_obj; struct rte_cryptodev_stats stats; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; struct rte_crypto_op **ops; struct mlx5_devx_obj **mkey; /* WQE's indirect mekys. 
*/ struct mlx5_mr_ctrl mr_ctrl; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index 2a04e36607..a27f3fdadb 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -54,10 +54,7 @@ struct mlx5_vdpa_cq { struct mlx5_vdpa_event_qp { struct mlx5_vdpa_cq cq; struct mlx5_devx_obj *fw_qp; - struct mlx5_devx_obj *sw_qp; - struct mlx5dv_devx_umem *umem_obj; - void *umem_buf; - volatile uint32_t *db_rec; + struct mlx5_devx_qp sw_qp; }; struct mlx5_vdpa_query_mr { diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c index 3541c652ce..bb6722839a 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -179,7 +179,7 @@ mlx5_vdpa_cq_poll(struct mlx5_vdpa_cq *cq) cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci); rte_io_wmb(); /* Ring SW QP doorbell record. */ - eqp->db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); + eqp->sw_qp.db_rec[0] = rte_cpu_to_be_32(cq->cq_ci + cq_size); } return comp; } @@ -531,12 +531,7 @@ mlx5_vdpa_cqe_event_unset(struct mlx5_vdpa_priv *priv) void mlx5_vdpa_event_qp_destroy(struct mlx5_vdpa_event_qp *eqp) { - if (eqp->sw_qp) - claim_zero(mlx5_devx_cmd_destroy(eqp->sw_qp)); - if (eqp->umem_obj) - claim_zero(mlx5_glue->devx_umem_dereg(eqp->umem_obj)); - if (eqp->umem_buf) - rte_free(eqp->umem_buf); + mlx5_devx_qp_destroy(&eqp->sw_qp); if (eqp->fw_qp) claim_zero(mlx5_devx_cmd_destroy(eqp->fw_qp)); mlx5_vdpa_cq_destroy(&eqp->cq); @@ -547,36 +542,36 @@ static int mlx5_vdpa_qps2rts(struct mlx5_vdpa_event_qp *eqp) { if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to INIT state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RST2INIT_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_RST2INIT_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP 
to INIT state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTR state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_INIT2RTR_QP, - eqp->fw_qp->id)) { + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, + MLX5_CMD_OP_INIT2RTR_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTR state(%u).", rte_errno); return -1; } if (mlx5_devx_cmd_modify_qp_state(eqp->fw_qp, MLX5_CMD_OP_RTR2RTS_QP, - eqp->sw_qp->id)) { + eqp->sw_qp.qp->id)) { DRV_LOG(ERR, "Failed to modify FW QP to RTS state(%u).", rte_errno); return -1; } - if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp, MLX5_CMD_OP_RTR2RTS_QP, + if (mlx5_devx_cmd_modify_qp_state(eqp->sw_qp.qp, MLX5_CMD_OP_RTR2RTS_QP, eqp->fw_qp->id)) { DRV_LOG(ERR, "Failed to modify SW QP to RTS state(%u).", rte_errno); @@ -591,8 +586,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, { struct mlx5_devx_qp_attr attr = {0}; uint16_t log_desc_n = rte_log2_u32(desc_n); - uint32_t umem_size = (1 << log_desc_n) * MLX5_WSEG_SIZE + - sizeof(*eqp->db_rec) * 2; + uint32_t ret; if (mlx5_vdpa_event_qp_global_prepare(priv)) return -1; @@ -605,42 +599,23 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); goto error; } - eqp->umem_buf = rte_zmalloc(__func__, umem_size, 4096); - if (!eqp->umem_buf) { - DRV_LOG(ERR, "Failed to allocate memory for SW QP."); - rte_errno = ENOMEM; - goto error; - } - eqp->umem_obj = mlx5_glue->devx_umem_reg(priv->ctx, - (void *)(uintptr_t)eqp->umem_buf, - umem_size, - IBV_ACCESS_LOCAL_WRITE); - if (!eqp->umem_obj) { - DRV_LOG(ERR, "Failed to register umem for SW QP."); - goto error; - } attr.uar_index = priv->uar->page_id; attr.cqn = eqp->cq.cq_obj.cq->id; - attr.log_page_size = rte_log2_u32(sysconf(_SC_PAGESIZE)); - attr.rq_size = 
1 << log_desc_n; + attr.rq_size = RTE_BIT32(log_desc_n); attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); attr.sq_size = 0; /* No need SQ. */ - attr.dbr_umem_valid = 1; - attr.wq_umem_id = eqp->umem_obj->umem_id; - attr.wq_umem_offset = 0; - attr.dbr_umem_id = eqp->umem_obj->umem_id; attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - attr.dbr_address = RTE_BIT64(log_desc_n) * MLX5_WSEG_SIZE; - eqp->sw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); - if (!eqp->sw_qp) { + ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, + SOCKET_ID_ANY); + if (ret) { DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); goto error; } - eqp->db_rec = RTE_PTR_ADD(eqp->umem_buf, (uintptr_t)attr.dbr_address); if (mlx5_vdpa_qps2rts(eqp)) goto error; /* First ringing. */ - rte_write32(rte_cpu_to_be_32(1 << log_desc_n), &eqp->db_rec[0]); + rte_write32(rte_cpu_to_be_32(RTE_BIT32(log_desc_n)), + &eqp->sw_qp.db_rec[0]); return 0; error: mlx5_vdpa_event_qp_destroy(eqp); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
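For readers following the refactor, the three-step state walk that mlx5_devx_qp2rts() factors out of the crypto driver can be sketched in isolation. This is not the driver code: every sketch_* name, the mock modify function and the opcode enum are hypothetical stand-ins for mlx5_devx_cmd_modify_qp_state() and the MLX5_CMD_OP_* values, chosen so the example builds without DPDK. It only illustrates the order of the transitions and that the same remote QP id is passed at each step (the QP's own id in the crypto self-loopback case).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the three DevX modify opcodes. */
enum sketch_qp_op {
	SKETCH_RST2INIT,
	SKETCH_INIT2RTR,
	SKETCH_RTR2RTS,
	SKETCH_OP_COUNT
};

/* Hypothetical QP model: 'state' counts the transitions applied so far. */
struct sketch_qp {
	uint32_t id;
	int state;
	uint32_t remote_id;
};

/* Mock modify command: accepts only the next transition in order. */
static int
sketch_modify_qp_state(struct sketch_qp *qp, enum sketch_qp_op op,
		       uint32_t remote_qp_id)
{
	if ((int)op != qp->state)
		return -1;
	qp->state++;
	qp->remote_id = remote_qp_id;
	return 0;
}

/* Walk RST -> INIT -> RTR -> RTS, stopping at the first failure. */
static int
sketch_qp2rts(struct sketch_qp *qp, uint32_t remote_qp_id)
{
	static const struct {
		enum sketch_qp_op op;
		const char *target;
	} steps[SKETCH_OP_COUNT] = {
		{ SKETCH_RST2INIT, "INIT" },
		{ SKETCH_INIT2RTR, "RTR" },
		{ SKETCH_RTR2RTS, "RTS" },
	};
	unsigned int i;

	for (i = 0; i < SKETCH_OP_COUNT; i++) {
		if (sketch_modify_qp_state(qp, steps[i].op, remote_qp_id)) {
			fprintf(stderr, "Failed to modify QP to %s state.\n",
				steps[i].target);
			return -1;
		}
	}
	return 0;
}
```

A table-driven loop is used here only to keep the sketch short; the patch itself keeps three explicit calls, which gives each failure its own log line without a lookup table.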
* [dpdk-dev] [PATCH V6 2/5] common/mlx5: update new MMO HCA capabilities 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (3 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align hca capabilities with new prm. Add support in devx interface for changes in HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 15 ++++++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ drivers/compress/mlx5/mlx5_compress.c | 4 ++-- 4 files changed, 36 insertions(+), 14 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..00c78b1288 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,18 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_sq); + attr->mmo_decompress_sq_en = 
MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_sq); + attr->mmo_dma_qp_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..ec5f871c61 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xc]; + u8 
regexp_mmo_sq[0x1]; + u8 reserved_at_2b0[0x3]; u8 max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index c5e0a83a8c..1e03030510 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -813,8 +813,8 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 || - att.mmo_dma_en == 0) { + att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
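When a reserved PRM field is split to make room for new capability bits, the new widths have to add up to the old one so that every later field keeps its offset. The sketch below re-does that arithmetic for the two mlx5_ifc_cmd_hca_cap_bits hunks in this patch. The widths are copied from the diff; the enum and function names are hypothetical and exist only for this check. The 0x728 offset is derived from the field name reserved_at_61f plus its new width, which assumes that name reflects the real offset.

```c
#include <assert.h>
#include <stdio.h>

/* Field widths in bits, copied from the mlx5_prm.h hunks above. */
enum {
	SKETCH_OLD_RSVD_2A0 = 0x10,      /* reserved_at_2a0 before the patch */
	SKETCH_RSVD_BEFORE_REGEXP = 0xc, /* reserved_at_2a0 after the patch */
	SKETCH_REGEXP_MMO_SQ = 0x1,
	SKETCH_RSVD_AFTER_REGEXP = 0x3,
	SKETCH_OLD_RSVD_61F = 0x1e1,     /* reserved_at_61f before the patch */
	SKETCH_RSVD_BEFORE_QP_CAPS = 0x109,
	SKETCH_QP_CAP_BITS = 4,          /* dma, regexp, compress, decompress */
	SKETCH_RSVD_AFTER_QP_CAPS = 0xd4,
};

/* Total width of the fields that replace the old reserved_at_2a0. */
static unsigned int
sketch_split_width_sq(void)
{
	return SKETCH_RSVD_BEFORE_REGEXP + SKETCH_REGEXP_MMO_SQ +
	       SKETCH_RSVD_AFTER_REGEXP;
}

/* Total width of the fields that replace the old reserved_at_61f. */
static unsigned int
sketch_split_width_qp(void)
{
	return SKETCH_RSVD_BEFORE_QP_CAPS + SKETCH_QP_CAP_BITS +
	       SKETCH_RSVD_AFTER_QP_CAPS;
}

/* Bit offset of dma_mmo_qp, the first of the four new QP capability bits. */
static unsigned int
sketch_first_qp_cap_offset(void)
{
	return 0x61f + SKETCH_RSVD_BEFORE_QP_CAPS;
}
```

Both splits preserve the original width, so the layout after the new bits is unchanged; only the reserved_at_* names in the patch no longer match the computed offsets, which is cosmetic.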
* [dpdk-dev] [PATCH V6 3/5] common/mlx5: add MMO configuration for the DevX QP 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 1/5] common/mlx5: share DevX QP operations Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 4/5] compress/mlx5: refactor queue HW object Raja Zidane ` (2 subsequent siblings) 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev A new configuration MMO was added to QP Context. If set, MMO WQEs are supported on this QP. For DMA MMO, supported only when dma_mmo_qp==1. For REGEXP MMO, supported only when regexp_mmo_qp==1. For COMPRESS MMO, supported only when compress_mmo_qp==1. For DECOMPRESS MMO, supported only when decompress_mmo_qp==1. Add support to DevX interface to set MMO bit. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 28 +++++++++++++++++++++++++++- 3 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 00c78b1288..eefb869b7d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2032,6 +2032,13 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if (attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, + in, qpc_extension_and_pas_list); + void *qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, + qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, 
MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index ec5f871c61..54e62aa153 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 pas[0][0x40]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,11 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits + qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
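To make the new mmo field concrete, here is a rough sketch of where the bit lands in the QPC extension. It assumes the usual mlx5_ifc convention that bit offset 0 is the most significant bit of the first byte. The sketch_* helpers are hypothetical and only mimic what MLX5_SET() does for a one-bit field; the widths (0x2 reserved, 0x1 mmo, 0x5fd reserved) come from struct mlx5_ifc_qpc_extension_bits above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 0x2 + 0x1 + 0x5fd bits, from mlx5_ifc_qpc_extension_bits. */
#define SKETCH_QPC_EXT_BITS 0x600
/* mmo follows two reserved bits. */
#define SKETCH_MMO_BIT_OFF 0x2

/* Set one bit, counting from the most significant bit of byte 0. */
static void
sketch_set_bit(uint8_t *buf, unsigned int bit_off)
{
	buf[bit_off / 8] |= (uint8_t)(0x80u >> (bit_off % 8));
}

/* Read one bit using the same numbering. */
static int
sketch_get_bit(const uint8_t *buf, unsigned int bit_off)
{
	return (buf[bit_off / 8] >> (7 - bit_off % 8)) & 1;
}
```

Under that assumption the mmo flag is the 0x20 bit of the first byte of the 192-byte extension that precedes the PAS list in the create_qp_in layout.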
* [dpdk-dev] [PATCH V6 4/5] compress/mlx5: refactor queue HW object 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane ` (2 preceding siblings ...) 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 2021-10-05 16:18 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Thomas Monjalon 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev The mlx5 PMD for the compress class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all MMO WQEs are managed by the SQ object. Starting from BF3, the queue of MMO WQEs should be connected to the GGA engine using a new configuration, MMO, which is supported only in the QP object. The FW introduced new capabilities that indicate whether the MMO configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 73 +++++++++++++++------------ 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index 1e03030510..5c5aa87a18 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -40,7 +40,7 @@ struct mlx5_compress_priv { void *uar; uint32_t pdn; /* Protection Domain number. */ uint8_t min_block_size; - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether QP supports timestamp formats. */ /* Minimum huffman block size supported by the device. 
*/ struct ibv_pd *pd; struct rte_compressdev_config dev_config; @@ -48,6 +48,13 @@ struct mlx5_compress_priv { rte_spinlock_t xform_sl; struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ volatile uint64_t *uar_addr; + /* HCA caps*/ + uint32_t mmo_decomp_sq:1; + uint32_t mmo_decomp_qp:1; + uint32_t mmo_comp_sq:1; + uint32_t mmo_comp_qp:1; + uint32_t mmo_dma_sq:1; + uint32_t mmo_dma_qp:1; #ifndef RTE_ARCH_64 rte_spinlock_t uar32_sl; #endif /* RTE_ARCH_64 */ @@ -61,7 +68,7 @@ struct mlx5_compress_qp { struct mlx5_mr_ctrl mr_ctrl; int socket_id; struct mlx5_devx_cq cq; - struct mlx5_devx_sq sq; + struct mlx5_devx_qp qp; struct mlx5_pmd_mr opaque_mr; struct rte_comp_op **ops; struct mlx5_compress_priv *priv; @@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) { struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->sq.sq != NULL) - mlx5_devx_sq_destroy(&qp->sq); + if (qp->qp.qp != NULL) + mlx5_devx_qp_destroy(&qp->qp); if (qp->cq.cq != NULL) mlx5_devx_cq_destroy(&qp->cq); if (qp->opaque_mr.obj != NULL) { @@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) } static void -mlx5_compress_init_sq(struct mlx5_compress_qp *qp) +mlx5_compress_init_qp(struct mlx5_compress_qp *qp) { volatile struct mlx5_gga_wqe *restrict wqe = - (volatile struct mlx5_gga_wqe *)qp->sq.wqes; + (volatile struct mlx5_gga_wqe *)qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; - const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u); + const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u); const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS << MLX5_COMP_MODE_OFFSET); const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey); @@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), }; - struct 
mlx5_devx_create_sq_attr sq_attr = { + struct mlx5_devx_qp_attr qp_attr = { + .pd = priv->pdn, + .uar_index = mlx5_os_get_devx_uar_page_id(priv->uar), .user_index = qp_id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .pd = priv->pdn, - .uar_page = mlx5_os_get_devx_uar_page_id(priv->uar), - }, - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, }; uint32_t log_ops_n = rte_log2_u32(max_inflight_ops); uint32_t alloc_size = sizeof(*qp); @@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto err; } - sq_attr.cqn = qp->cq.cq->id; - sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format); - ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr, + qp_attr.cqn = qp->cq.cq->id; + qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + qp_attr.rq_size = 0; + qp_attr.sq_size = RTE_BIT32(log_ops_n); + qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp + && priv->mmo_dma_qp; + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr, socket_id); if (ret != 0) { - DRV_LOG(ERR, "Failed to create SQ."); + DRV_LOG(ERR, "Failed to create QP."); goto err; } - mlx5_compress_init_sq(qp); - ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr); - if (ret != 0) { - DRV_LOG(ERR, "Can't change SQ state to ready."); + mlx5_compress_init_qp(qp); + ret = mlx5_devx_qp2rts(&qp->qp, 0); + if (ret) goto err; - } /* Save pointer of global generation number to check memory event. 
*/ qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u", - (uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n); + (uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n); return 0; err: mlx5_compress_qp_release(dev, qp_id); @@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, { struct mlx5_compress_qp *qp = queue_pair; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes, *wqe; + qp->qp.wqes, *wqe; struct mlx5_compress_xform *xform; struct rte_comp_op *op; uint16_t mask = qp->entries_n - 1; @@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); + qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); rte_wmb(); mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv); rte_wmb(); @@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp, volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *) &qp->cq.cqes[idx]; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes; + qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; op->status = RTE_COMP_OP_STATUS_ERROR; @@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || - att.mmo_dma_sq_en == 0) { + ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || + att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); @@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct 
rte_device *dev) cdev->enqueue_burst = mlx5_compress_enqueue_burst; cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED; priv = cdev->data->dev_private; + priv->mmo_decomp_sq = att.mmo_decompress_sq_en; + priv->mmo_decomp_qp = att.mmo_decompress_qp_en; + priv->mmo_comp_sq = att.mmo_compress_sq_en; + priv->mmo_comp_qp = att.mmo_compress_qp_en; + priv->mmo_dma_sq = att.mmo_dma_sq_en; + priv->mmo_dma_qp = att.mmo_dma_qp_en; priv->ctx = ctx; priv->cdev = cdev; priv->min_block_size = att.compress_min_block_size; - priv->sq_ts_format = att.sq_ts_format; + priv->qp_ts_format = att.qp_ts_format; if (mlx5_compress_hw_global_prepare(priv) != 0) { rte_compressdev_pmd_destroy(priv->cdev); claim_zero(mlx5_glue->close_device(priv->ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
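The capability logic in this patch has two parts: the probe is rejected unless either the full SQ set or the full QP set of MMO capabilities is present, and the QP mmo attribute is set only when all three QP capabilities are present. A compact sketch of that decision follows. The struct and function names are hypothetical and are not part of the driver; the fields stand in for the mmo_*_sq_en and mmo_*_qp_en bits of struct mlx5_hca_attr.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the six capability bits cached in the priv struct. */
struct sketch_mmo_caps {
	unsigned int comp_sq:1;
	unsigned int decomp_sq:1;
	unsigned int dma_sq:1;
	unsigned int comp_qp:1;
	unsigned int decomp_qp:1;
	unsigned int dma_qp:1;
};

/* True when compress, decompress and DMA MMO are all supported on SQ. */
static bool
sketch_sq_set_complete(const struct sketch_mmo_caps *c)
{
	return c->comp_sq && c->decomp_sq && c->dma_sq;
}

/* True when all three are supported on QP; this is the value given to mmo. */
static bool
sketch_qp_mmo(const struct sketch_mmo_caps *c)
{
	return c->comp_qp && c->decomp_qp && c->dma_qp;
}

/* The probe proceeds when at least one of the two sets is complete. */
static bool
sketch_probe_ok(const struct sketch_mmo_caps *c)
{
	return sketch_sq_set_complete(c) || sketch_qp_mmo(c);
}
```

Note that a device reporting a mixed, incomplete set on both sides fails the probe, matching the condition in mlx5_compress_dev_probe().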
* [dpdk-dev] [PATCH V6 5/5] regex/mlx5: refactor HW queue objects 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 4/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-10-05 12:27 ` Raja Zidane 2021-10-05 16:18 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Thomas Monjalon 5 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-10-05 12:27 UTC (permalink / raw) To: dev The mlx5 PMD for the regex class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all MMO WQEs are managed by the SQ object. Starting from BF3, the queue of MMO WQEs should be connected to the GGA engine using a new configuration, MMO, which is supported only in the QP object. The FW introduced new capabilities that indicate whether the MMO configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the MMO configuration according to the new FW capabilities. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 4 files changed, 133 insertions(+), 125 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index 8866a4d0c6..5aa988be6d 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,8 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && + (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +165,9 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. 
*/ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. */ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..572ecc6d86 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp, 0, sizeof(*qp)); return 0; } @@ -131,45 +131,44 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, + log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, 
SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +223,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +237,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +265,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index c79445ce7d..0833b2817e 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * 
sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. */ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,42 +196,42 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & - MLX5_REGEX_MAX_WQE_INDEX); + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) + & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; rte_wmb(); } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,37 +287,37 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, - struct rte_regex_ops *op, struct mlx5_regex_job *job, - size_t pi, struct mlx5_klm *klm) +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, + struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, + struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, + size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0; uint32_t len = 0; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -329,7 +329,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -341,16 +341,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -384,22 +384,23 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, + &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -408,21 +409,22 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops 
be handled can't exceed nb_ops. */ if (nb_desc > nb_left) nb_desc = nb_left; else - queue->free_sqs &= ~(1 << sqid); - prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, + nb_desc); + send_doorbell(priv, qp_obj); nb_left -= nb_desc; } if (!nb_left) @@ -441,23 +443,25 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, job_id, i = 0; - - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - while (get_free(sq, priv->has_umr)) { - job_id = job_id_get(sqid, sq_size_get(sq), sq->pi); - prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]); + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, job_id, i = 0; + + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + while (get_free(qp_obj, priv->has_umr)) { + job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), + qp_obj->pi); + prep_one(priv, queue, qp_obj, ops[i], + &queue->jobs[job_id]); i++; if (unlikely(i == nb_ops)) { - send_doorbell(priv, sq); + send_doorbell(priv, qp_obj); goto out; } } - queue->free_sqs &= ~(1 << sqid); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + send_doorbell(priv, qp_obj); } out: @@ -567,21 +571,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, uint16_t wq_counter = (rte_be_to_cpu_16(cqe->wqe_counter) + 1) & MLX5_REGEX_MAX_WQE_INDEX; - size_t sqid = cqe->rsvd3[2]; - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; + size_t hw_qpid = cqe->rsvd3[2]; + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; /* UMR mode WQE counter move as WQE set(4 WQEBBS).*/ if (priv->has_umr) wq_counter >>= 2; - while (sq->ci != wq_counter) { + while (qp_obj->ci != wq_counter) 
{ if (unlikely(i == nb_ops)) { /* Return without updating cq->ci */ goto out; } - uint32_t job_id = job_id_get(sqid, sq_size_get(sq), - sq->ci); + uint32_t job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), qp_obj->ci); extract_result(ops[i], &queue->jobs[job_id]); - sq->ci = (sq->ci + 1) & (priv->has_umr ? + qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX); i++; @@ -589,7 +593,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, cq->ci = (cq->ci + 1) & 0xffffff; rte_wmb(); cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci); - queue->free_sqs |= (1 << sqid); + queue->free_qps |= (1 << hw_qpid); } out: @@ -598,15 +602,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, } static void -setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) +setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) { - size_t sqid, entry; + size_t hw_qpid, entry; uint32_t job_id; - for (sqid = 0; sqid < queue->nb_obj; sqid++) { - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes; - for (entry = 0 ; entry < sq_size_get(sq); entry++) { - job_id = sqid * sq_size_get(sq) + entry; + for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) { + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes; + for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) { + job_id = hw_qpid * qp_size_get(qp_obj) + entry; struct mlx5_regex_job *job = &queue->jobs[job_id]; /* Fill UMR WQE with NOP in advanced. 
*/ @@ -614,7 +618,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) set_wqe_ctrl_seg ((struct mlx5_wqe_ctrl_seg *)wqe, entry * 2, MLX5_OPCODE_NOP, 0, - sq->sq_obj.sq->id, 0, 12, 0, 0); + qp_obj->qp_obj.qp->id, 0, 12, 0, 0); wqe += MLX5_REGEX_UMR_WQE_SIZE; } set_metadata_seg((struct mlx5_wqe_metadata_seg *) @@ -628,7 +632,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) (uintptr_t)job->output); wqe += 64; } - queue->free_sqs |= 1 << sqid; + queue->free_qps |= 1 << hw_qpid; } } @@ -738,7 +742,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id) return err; } - setup_sqs(priv, qp); + setup_qps(priv, qp); if (priv->has_umr) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
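For readers following the fast-path hunks of the regex patch, the ring bookkeeping that the rename touches (qp_size_get(), get_free(), job_id_get() and the free_qps bitmask) can be looked at in isolation. What follows is a minimal standalone sketch, not the driver code: struct sketch_hw_qp and every sketch_* name are hypothetical, and the value 0xffff for MLX5_REGEX_MAX_WQE_INDEX is an assumption, since that macro is defined outside this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration; the driver defines MLX5_REGEX_MAX_WQE_INDEX
 * in a header that is not part of this patch. */
#define SKETCH_MAX_WQE_INDEX 0xffffu

/* Hypothetical stand-in for struct mlx5_regex_hw_qp (DevX object omitted). */
struct sketch_hw_qp {
	uint16_t log_nb_desc; /* Log 2 number of descriptors. */
	size_t pi;            /* Producer index. */
	size_t ci;            /* Consumer index. */
};

/* Ring size, modeled on qp_size_get(). */
static uint32_t
sketch_qp_size(const struct sketch_hw_qp *qp)
{
	return 1U << qp->log_nb_desc;
}

/* Free descriptors, modeled on get_free(): size minus (pi - ci) under the
 * index mask, so the result stays correct when pi wraps past ci. */
static uint32_t
sketch_get_free(const struct sketch_hw_qp *qp, int has_umr)
{
	size_t mask = has_umr ? (SKETCH_MAX_WQE_INDEX >> 2) :
				SKETCH_MAX_WQE_INDEX;

	return sketch_qp_size(qp) - (uint32_t)((qp->pi - qp->ci) & mask);
}

/* Job slot, modeled on job_id_get(): each HW QP owns qp_size jobs. */
static uint32_t
sketch_job_id(uint32_t qid, uint32_t qp_size, size_t index)
{
	return qid * qp_size + (uint32_t)(index & (qp_size - 1));
}

/* Lowest set bit of the free mask, or -1 when no HW QP is free.
 * This is the same value the enqueue loop gets from ffs(mask) - 1. */
static int
sketch_first_free(uint32_t free_mask)
{
	int i;

	for (i = 0; i < 32; i++)
		if (free_mask & (UINT32_C(1) << i))
			return i;
	return -1;
}
```

With log_nb_desc = 3, pi = 5 and ci = 2 the sketch reports 5 free descriptors, and it reports the same 5 for pi = 1, ci = 0xfffe in non-UMR mode, because the subtraction is masked before it is compared to the ring size.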
* Re: [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane ` (4 preceding siblings ...) 2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 5/5] regex/mlx5: refactor HW queue objects Raja Zidane @ 2021-10-05 16:18 ` Thomas Monjalon 5 siblings, 0 replies; 38+ messages in thread From: Thomas Monjalon @ 2021-10-05 16:18 UTC (permalink / raw) To: Raja Zidane; +Cc: dev > Raja Zidane (5): > common/mlx5: share DevX QP operations > common/mlx5: update new MMO HCA capabilities > common/mlx5: add MMO configuration for the DevX QP > compress/mlx5: refactor queue HW object > regex/mlx5: refactor HW queue objects Applied, thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* [dpdk-dev] [PATCH V2 2/5] common/mlx5: update new MMO HCA capabilities 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-09-12 16:36 ` Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (2 subsequent siblings) 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-12 16:36 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align hca capabilities with new prm. Add support in devx interface for changes in HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 15 ++++++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ 3 files changed, 34 insertions(+), 12 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..00c78b1288 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,18 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_sq); + attr->mmo_decompress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_sq); 
+ attr->mmo_dma_qp_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, + decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..070c538c8c 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xe]; + u8 regexp_mmo_sq[0x1]; + u8 reserved_at_2b0[0x1]; u8 
max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
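The mlx5_prm.h hunks of this patch split existing reserved runs to make room for the new capability bits, so the widths on each side of a split have to add up to the old reserved width. A small compile-time check of that arithmetic is sketched below. The enum names are hypothetical; the numbers are copied from the hunks, and the starting offset 0x61f is taken from the field name reserved_at_61f, which is an assumption about the layout rather than something this patch states.

```c
#include <assert.h>

/* Field widths in bits, copied from the mlx5_prm.h hunks of this patch. */
enum {
	/* cmd_hca_cap, hunk at -1519: former reserved run of 0x10 bits. */
	OLD_RSVD_2A0 = 0x10,
	NEW_RSVD_2A0 = 0xe,
	REGEXP_MMO_SQ = 0x1,
	NEW_RSVD_AFTER_SQ = 0x1,
	/* cmd_hca_cap, hunk at -1632: former reserved run of 0x1e1 bits. */
	OLD_RSVD_61F = 0x1e1,
	NEW_RSVD_61F = 0x109,
	MMO_QP_BITS = 4, /* dma, regexp, compress, decompress */
	NEW_RSVD_TAIL = 0xd4,
	/* create_qp_in, hunk at -3244: former reserved run of 0x40 bits. */
	OLD_RSVD_40 = 0x40,
	QPC_EXT = 0x1,
	NEW_RSVD_41 = 0x3f,
};

/* Each split must preserve the total width of the run it replaces. */
static_assert(NEW_RSVD_2A0 + REGEXP_MMO_SQ + NEW_RSVD_AFTER_SQ ==
	      OLD_RSVD_2A0, "regexp_mmo_sq split changes the layout size");
static_assert(NEW_RSVD_61F + MMO_QP_BITS + NEW_RSVD_TAIL == OLD_RSVD_61F,
	      "*_mmo_qp split changes the layout size");
static_assert(QPC_EXT + NEW_RSVD_41 == OLD_RSVD_40,
	      "qpc_ext split changes the layout size");

/* Bit offset of dma_mmo_qp, assuming reserved_at_61f starts at bit 0x61f. */
static unsigned int
sketch_dma_mmo_qp_offset(void)
{
	return 0x61f + NEW_RSVD_61F;
}
```

Under these assumptions the four *_mmo_qp bits land at offsets 0x728 to 0x72b and the trailing reserved run starts at 0x72c. The suffix in reserved_at_624 does not match that offset, which looks harmless, since the accessor macros only ever refer to named non-reserved fields.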
* [dpdk-dev] [PATCH V2 3/5] common/mlx5: add MMO configuration for the DevX QP 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-12 16:36 ` Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 4/5] compress/mlx5: refactor queue HW object Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-12 16:36 UTC (permalink / raw) To: dev A new configuration MMO was added to QP Context. If set, MMO WQEs are supported on this QP. For DMA MMO, supported only when dma_mmo_qp==1. For REGEXP MMO, supported only when regexp_mmo_qp==1. For COMPRESS MMO, supported only when compress_mmo_qp==1. For DECOMPRESS MMO, supported only when decompress_mmo_qp==1. Add support to DevX interface to set MMO bit. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 28 +++++++++++++++++++++++++++- 3 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 00c78b1288..eefb869b7d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2032,6 +2032,13 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if (attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, + in, qpc_extension_and_pas_list); + void *qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, + qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 070c538c8c..cb28adb80a 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 
pas[0][0x40]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,11 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits + qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
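To make the effect of MLX5_SET(qpc_extension, qpc_ext, mmo, 1) on the command buffer concrete, here is a rough byte-level sketch. It is not the real macro: the sketch_* helpers are hypothetical, and it assumes the usual mlx5_ifc convention that bit offset 0 is the most significant bit of the first big-endian dword. The offset 0x2 and the total of 0x600 bits come from struct mlx5_ifc_qpc_extension_bits as added in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* From the patch: reserved_at_0[0x2], mmo[0x1], reserved_at_3[0x5fd]. */
#define SKETCH_QPC_EXT_BITS 0x600u
#define SKETCH_MMO_BIT_OFF 0x2u

static_assert(0x2 + 0x1 + 0x5fd == SKETCH_QPC_EXT_BITS,
	      "qpc_extension field widths must add up to 0x600 bits");

/* Set or clear one bit, counting offsets from the MSB of byte 0
 * (assumed mlx5_ifc bit numbering). */
static void
sketch_set_bit(uint8_t *buf, size_t bit_off, int value)
{
	uint8_t mask = (uint8_t)(0x80u >> (bit_off % 8));

	if (value)
		buf[bit_off / 8] |= mask;
	else
		buf[bit_off / 8] &= (uint8_t)~mask;
}

/* Read one bit back using the same numbering. */
static int
sketch_get_bit(const uint8_t *buf, size_t bit_off)
{
	return (int)((buf[bit_off / 8] >> (7 - bit_off % 8)) & 1u);
}

/* Build a zeroed QPC extension image and set mmo only when requested,
 * mirroring the "if (attr->mmo)" branch of the patch. */
static void
sketch_fill_qpc_ext(uint8_t ext[SKETCH_QPC_EXT_BITS / 8], int mmo)
{
	memset(ext, 0, SKETCH_QPC_EXT_BITS / 8);
	if (mmo)
		sketch_set_bit(ext, SKETCH_MMO_BIT_OFF, 1);
}
```

In this model a QP created with mmo set has 0x20 in the first byte of its extension block, and one created without it leaves the whole block zeroed. Working on bytes rather than 32-bit words keeps the sketch independent of host endianness.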
* [dpdk-dev] [PATCH V2 4/5] compress/mlx5: refactor queue HW object 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane ` (2 preceding siblings ...) 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane @ 2021-09-12 16:36 ` Raja Zidane 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-12 16:36 UTC (permalink / raw) To: dev The mlx5 PMD for compress class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/compress/mlx5/mlx5_compress.c | 73 +++++++++++++++------------ 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c index 883e720ec1..0f002195c6 100644 --- a/drivers/compress/mlx5/mlx5_compress.c +++ b/drivers/compress/mlx5/mlx5_compress.c @@ -40,7 +40,7 @@ struct mlx5_compress_priv { void *uar; uint32_t pdn; /* Protection Domain number. */ uint8_t min_block_size; - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ /* Minimum huffman block size supported by the device. 
*/ struct ibv_pd *pd; struct rte_compressdev_config dev_config; @@ -48,6 +48,13 @@ struct mlx5_compress_priv { rte_spinlock_t xform_sl; struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ volatile uint64_t *uar_addr; + /* HCA caps*/ + uint32_t mmo_decomp_sq:1; + uint32_t mmo_decomp_qp:1; + uint32_t mmo_comp_sq:1; + uint32_t mmo_comp_qp:1; + uint32_t mmo_dma_sq:1; + uint32_t mmo_dma_qp:1; #ifndef RTE_ARCH_64 rte_spinlock_t uar32_sl; #endif /* RTE_ARCH_64 */ @@ -61,7 +68,7 @@ struct mlx5_compress_qp { struct mlx5_mr_ctrl mr_ctrl; int socket_id; struct mlx5_devx_cq cq; - struct mlx5_devx_sq sq; + struct mlx5_devx_qp qp; struct mlx5_pmd_mr opaque_mr; struct rte_comp_op **ops; struct mlx5_compress_priv *priv; @@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) { struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id]; - if (qp->sq.sq != NULL) - mlx5_devx_sq_destroy(&qp->sq); + if (qp->qp.qp != NULL) + mlx5_devx_qp_destroy(&qp->qp); if (qp->cq.cq != NULL) mlx5_devx_cq_destroy(&qp->cq); if (qp->opaque_mr.obj != NULL) { @@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id) } static void -mlx5_compress_init_sq(struct mlx5_compress_qp *qp) +mlx5_compress_init_qp(struct mlx5_compress_qp *qp) { volatile struct mlx5_gga_wqe *restrict wqe = - (volatile struct mlx5_gga_wqe *)qp->sq.wqes; + (volatile struct mlx5_gga_wqe *)qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; - const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u); + const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u); const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS << MLX5_COMP_MODE_OFFSET); const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey); @@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, struct mlx5_devx_cq_attr cq_attr = { .uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar), }; - struct 
mlx5_devx_create_sq_attr sq_attr = { + struct mlx5_devx_qp_attr qp_attr = { + .pd = priv->pdn, + .uar_index = mlx5_os_get_devx_uar_page_id(priv->uar), .user_index = qp_id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .pd = priv->pdn, - .uar_page = mlx5_os_get_devx_uar_page_id(priv->uar), - }, - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, }; uint32_t log_ops_n = rte_log2_u32(max_inflight_ops); uint32_t alloc_size = sizeof(*qp); @@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id, DRV_LOG(ERR, "Failed to create CQ."); goto err; } - sq_attr.cqn = qp->cq.cq->id; - sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format); - ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr, + qp_attr.cqn = qp->cq.cq->id; + qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); + qp_attr.rq_size = 0; + qp_attr.sq_size = RTE_BIT32(log_ops_n); + qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp + && priv->mmo_dma_qp; + ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr, socket_id); if (ret != 0) { - DRV_LOG(ERR, "Failed to create SQ."); + DRV_LOG(ERR, "Failed to create QP."); goto err; } - mlx5_compress_init_sq(qp); - ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr); - if (ret != 0) { - DRV_LOG(ERR, "Can't change SQ state to ready."); + mlx5_compress_init_qp(qp); + ret = mlx5_devx_qp2rts(&qp->qp, 0); + if (ret) goto err; - } /* Save pointer of global generation number to check memory event. 
*/ qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen; DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u", - (uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n); + (uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n); return 0; err: mlx5_compress_qp_release(dev, qp_id); @@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, { struct mlx5_compress_qp *qp = queue_pair; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes, *wqe; + qp->qp.wqes, *wqe; struct mlx5_compress_xform *xform; struct rte_comp_op *op; uint16_t mask = qp->entries_n - 1; @@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops, } while (--remain); qp->stats.enqueued_count += nb_ops; rte_io_wmb(); - qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); + qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi); rte_wmb(); mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv); rte_wmb(); @@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp, volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *) &qp->cq.cqes[idx]; volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *) - qp->sq.wqes; + qp->qp.wqes; volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr; op->status = RTE_COMP_OP_STATUS_ERROR; @@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev) return -rte_errno; } if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 || - att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 || - att.mmo_dma_en == 0) { + ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 || + att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 || + att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) { DRV_LOG(ERR, "Not enough capabilities to support compress " "operations, maybe old FW/OFED version?"); claim_zero(mlx5_glue->close_device(ctx)); @@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct 
rte_device *dev) cdev->enqueue_burst = mlx5_compress_enqueue_burst; cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED; priv = cdev->data->dev_private; + priv->mmo_decomp_sq = att.mmo_decompress_sq_en; + priv->mmo_decomp_qp = att.mmo_decompress_qp_en; + priv->mmo_comp_sq = att.mmo_compress_sq_en; + priv->mmo_comp_qp = att.mmo_compress_qp_en; + priv->mmo_dma_sq = att.mmo_dma_sq_en; + priv->mmo_dma_qp = att.mmo_dma_qp_en; priv->ctx = ctx; priv->cdev = cdev; priv->min_block_size = att.compress_min_block_size; - priv->sq_ts_format = att.sq_ts_format; + priv->qp_ts_format = att.qp_ts_format; if (mlx5_compress_hw_global_prepare(priv) != 0) { rte_compressdev_pmd_destroy(priv->cdev); claim_zero(mlx5_glue->close_device(priv->ctx)); -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
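For readers following the capability handling in this patch, the probe-time decision can be sketched in isolation. This is only a minimal sketch, not driver code: struct mmo_caps and the helper names are hypothetical stand-ins for the six bits the patch copies from struct mlx5_hca_attr. It assumes, as the diff above suggests, that probing succeeds when either the complete SQ set or the complete QP set of capabilities is reported, and that the QP attribute mmo is set only when the complete QP set is present.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the six capability bits kept in the
 * compress private structure; the names are illustrative only. */
struct mmo_caps {
	unsigned int comp_sq:1;
	unsigned int decomp_sq:1;
	unsigned int dma_sq:1;
	unsigned int comp_qp:1;
	unsigned int decomp_qp:1;
	unsigned int dma_qp:1;
};

/* True when compress, decompress and DMA MMO are all allowed on an SQ. */
static bool
mmo_sq_full(const struct mmo_caps *c)
{
	return c->comp_sq && c->decomp_sq && c->dma_sq;
}

/* True when compress, decompress and DMA MMO are all allowed on a QP. */
static bool
mmo_qp_full(const struct mmo_caps *c)
{
	return c->comp_qp && c->decomp_qp && c->dma_qp;
}

/* Probe is accepted when at least one complete capability set exists. */
static bool
mmo_probe_ok(const struct mmo_caps *c)
{
	return mmo_sq_full(c) || mmo_qp_full(c);
}

/* Value that would be written to the QP attribute "mmo". */
static bool
mmo_qp_bit(const struct mmo_caps *c)
{
	return mmo_qp_full(c);
}
```

With only the three SQ bits set, mmo_probe_ok() returns true and mmo_qp_bit() returns false, which corresponds to creating the QP without the MMO configuration.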
* [dpdk-dev] [PATCH V2 5/5] regex/mlx5: refactor HW queue objects 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 4/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-09-12 16:36 ` Raja Zidane 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-12 16:36 UTC (permalink / raw) To: dev The mlx5 PMD for the regex class uses MMO WQEs operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that is supported only in the QP object. The FW introduced new capabilities that indicate whether the MMO configuration should be set for the GGA queue. Replace all the GGA queue objects with QPs and set the MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 7 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 65 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 170 ++++++++++++----------- 4 files changed, 133 insertions(+), 125 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index f17b6df47f..a3368749b9 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,8 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && + (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +165,9 @@ mlx5_regex_dev_probe(struct 
rte_device *rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. */ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. 
*/ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..572ecc6d86 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp_obj, 0, sizeof(*qp_obj)); return 0; } @@ -131,45 +131,44 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, + log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +223,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, 
uint16_t qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +237,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +265,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index 786718af53..18b01b6bf9 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. 
*/ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,42 +196,42 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & - MLX5_REGEX_MAX_WQE_INDEX); + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) + & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; rte_wmb(); } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,36 +287,36 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, - struct rte_regex_ops *op, struct mlx5_regex_job *job, - size_t pi, struct mlx5_klm *klm) +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, + struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, + struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, + size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0, len; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -328,7 +328,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -340,16 +340,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -383,22 +383,23 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, + &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -407,21 +408,22 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops 
be handled can't exceed nb_ops. */ if (nb_desc > nb_left) nb_desc = nb_left; else - queue->free_sqs &= ~(1 << sqid); - prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, + nb_desc); + send_doorbell(priv, qp_obj); nb_left -= nb_desc; } if (!nb_left) @@ -440,23 +442,25 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, job_id, i = 0; - - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - while (get_free(sq, priv->has_umr)) { - job_id = job_id_get(sqid, sq_size_get(sq), sq->pi); - prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]); + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, job_id, i = 0; + + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + while (get_free(qp_obj, priv->has_umr)) { + job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), + qp_obj->pi); + prep_one(priv, queue, qp_obj, ops[i], + &queue->jobs[job_id]); i++; if (unlikely(i == nb_ops)) { - send_doorbell(priv, sq); + send_doorbell(priv, qp_obj); goto out; } } - queue->free_sqs &= ~(1 << sqid); - send_doorbell(priv, sq); + queue->free_qps &= ~(1 << hw_qpid); + send_doorbell(priv, qp_obj); } out: @@ -566,21 +570,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, uint16_t wq_counter = (rte_be_to_cpu_16(cqe->wqe_counter) + 1) & MLX5_REGEX_MAX_WQE_INDEX; - size_t sqid = cqe->rsvd3[2]; - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; + size_t hw_qpid = cqe->rsvd3[2]; + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; /* UMR mode WQE counter move as WQE set(4 WQEBBS).*/ if (priv->has_umr) wq_counter >>= 2; - while (sq->ci != wq_counter) { + while (qp_obj->ci != wq_counter) 
{ if (unlikely(i == nb_ops)) { /* Return without updating cq->ci */ goto out; } - uint32_t job_id = job_id_get(sqid, sq_size_get(sq), - sq->ci); + uint32_t job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), qp_obj->ci); extract_result(ops[i], &queue->jobs[job_id]); - sq->ci = (sq->ci + 1) & (priv->has_umr ? + qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX); i++; @@ -588,7 +592,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, cq->ci = (cq->ci + 1) & 0xffffff; rte_wmb(); cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci); - queue->free_sqs |= (1 << sqid); + queue->free_qps |= (1 << hw_qpid); } out: @@ -597,15 +601,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id, } static void -setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) +setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) { - size_t sqid, entry; + size_t hw_qpid, entry; uint32_t job_id; - for (sqid = 0; sqid < queue->nb_obj; sqid++) { - struct mlx5_regex_sq *sq = &queue->sqs[sqid]; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes; - for (entry = 0 ; entry < sq_size_get(sq); entry++) { - job_id = sqid * sq_size_get(sq) + entry; + for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) { + struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid]; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes; + for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) { + job_id = hw_qpid * qp_size_get(qp_obj) + entry; struct mlx5_regex_job *job = &queue->jobs[job_id]; /* Fill UMR WQE with NOP in advanced. 
*/ @@ -613,7 +617,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) set_wqe_ctrl_seg ((struct mlx5_wqe_ctrl_seg *)wqe, entry * 2, MLX5_OPCODE_NOP, 0, - sq->sq_obj.sq->id, 0, 12, 0, 0); + qp_obj->qp_obj.qp->id, 0, 12, 0, 0); wqe += MLX5_REGEX_UMR_WQE_SIZE; } set_metadata_seg((struct mlx5_wqe_metadata_seg *) @@ -627,7 +631,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue) (uintptr_t)job->output); wqe += 64; } - queue->free_sqs |= 1 << sqid; + queue->free_qps |= 1 << hw_qpid; } } @@ -737,7 +741,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id) return err; } - setup_sqs(priv, qp); + setup_qps(priv, qp); if (priv->has_umr) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
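The index arithmetic that the renamed fast-path helpers rely on can be tried outside the driver. The following is a hedged sketch, not the driver code: SKETCH_MAX_WQE_INDEX is an assumed stand-in for MLX5_REGEX_MAX_WQE_INDEX (taken here as 0xffff), only the non-UMR case of get_free() is modelled, and pick_free_qp() is a hypothetical helper that mirrors the ffs() loop over free_qps.

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

/* Assumed stand-in for MLX5_REGEX_MAX_WQE_INDEX. */
#define SKETCH_MAX_WQE_INDEX 0xffffu

/* Ring size from its log2, as in qp_size_get(). */
static inline uint32_t
ring_size(uint16_t log_nb_desc)
{
	return 1U << log_nb_desc;
}

/* Job slot for a descriptor: each HW QP owns a contiguous block of
 * "size" jobs and the index wraps inside that block (size is a power
 * of two), as in job_id_get(). */
static inline uint32_t
job_id(uint32_t qid, size_t size, size_t index)
{
	return (uint32_t)(qid * size + (index & (size - 1)));
}

/* Free descriptors in the ring: size minus the in-flight count, where
 * the producer/consumer distance is taken modulo the WQE index range
 * (non-UMR case of get_free()). */
static inline uint32_t
ring_free(uint32_t size, size_t pi, size_t ci)
{
	return (uint32_t)(size - ((pi - ci) & SKETCH_MAX_WQE_INDEX));
}

/* Lowest HW QP index whose bit is set in the free mask, or -1 when no
 * bit is set. ffs() is 1-based, hence the decrement. */
static inline int
pick_free_qp(uint32_t free_mask)
{
	return ffs((int)free_mask) - 1;
}
```

For instance, job_id(2, 16, 17) gives 33: HW QP 2 owns jobs 32 to 47, and index 17 wraps to slot 1 of that block.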
* [dpdk-dev] [PATCH 2/5] common/mlx5: update new MMO HCA capabilities 2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations Raja Zidane @ 2021-09-03 14:21 ` Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane ` (2 subsequent siblings) 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-03 14:21 UTC (permalink / raw) To: dev New MMO HCA capabilities were added and others were renamed. Align the HCA capabilities with the new PRM. Add support in the DevX interface for the changed HCA capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 11 ++++++++--- drivers/common/mlx5/mlx5_devx_cmds.h | 11 ++++++++--- drivers/common/mlx5/mlx5_prm.h | 20 ++++++++++++++------ 3 files changed, 30 insertions(+), 12 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index ac554cca05..fbc2123883 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -858,9 +858,14 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->log_max_srq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_srq_sz); attr->reg_c_preserve = MLX5_GET(cmd_hca_cap, hcattr, reg_c_preserve); - attr->mmo_dma_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo); - attr->mmo_compress_en = MLX5_GET(cmd_hca_cap, hcattr, compress); - attr->mmo_decompress_en = MLX5_GET(cmd_hca_cap, hcattr, decompress); + attr->mmo_regex_qp_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_qp); + attr->mmo_regex_sq_en = MLX5_GET(cmd_hca_cap, hcattr, regexp_mmo_sq); + attr->mmo_dma_sq_en = MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_sq); + attr->mmo_compress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, compress_mmo_sq); + attr->mmo_decompress_sq_en = MLX5_GET(cmd_hca_cap, hcattr, decompress_mmo_sq); + attr->mmo_dma_qp_en 
= MLX5_GET(cmd_hca_cap, hcattr, dma_mmo_qp); + attr->mmo_compress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, compress_mmo_qp); + attr->mmo_decompress_qp_en = MLX5_GET(cmd_hca_cap, hcattr, decompress_mmo_qp); attr->compress_min_block_size = MLX5_GET(cmd_hca_cap, hcattr, compress_min_block_size); attr->log_max_mmo_dma = MLX5_GET(cmd_hca_cap, hcattr, log_dma_mmo_size); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index c071629904..b21df0fd9b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -173,9 +173,14 @@ struct mlx5_hca_attr { uint32_t log_max_srq; uint32_t log_max_srq_sz; uint32_t rss_ind_tbl_cap; - uint32_t mmo_dma_en:1; - uint32_t mmo_compress_en:1; - uint32_t mmo_decompress_en:1; + uint32_t mmo_dma_sq_en:1; + uint32_t mmo_compress_sq_en:1; + uint32_t mmo_decompress_sq_en:1; + uint32_t mmo_dma_qp_en:1; + uint32_t mmo_compress_qp_en:1; + uint32_t mmo_decompress_qp_en:1; + uint32_t mmo_regex_qp_en:1; + uint32_t mmo_regex_sq_en:1; uint32_t compress_min_block_size:4; uint32_t log_max_mmo_dma:5; uint32_t log_max_mmo_compress:5; diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index d361bcf90e..070c538c8c 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -1386,10 +1386,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 rtr2rts_qp_counters_set_id[0x1]; u8 rts2rts_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; - u8 dma_mmo[0x1]; + u8 dma_mmo_sq[0x1]; u8 compress_min_block_size[0x4]; - u8 compress[0x1]; - u8 decompress[0x1]; + u8 compress_mmo_sq[0x1]; + u8 decompress_mmo_sq[0x1]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; @@ -1519,7 +1519,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_lag_ports[0x4]; u8 reserved_at_280[0x10]; u8 max_wqe_sz_sq[0x10]; - u8 reserved_at_2a0[0x10]; + u8 reserved_at_2a0[0xe]; + u8 regexp_mmo_sq[0x1]; + u8 reserved_at_2b0[0x1]; u8 max_wqe_sz_rq[0x10]; u8 
max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; @@ -1632,7 +1634,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 num_vhca_ports[0x8]; u8 reserved_at_618[0x6]; u8 sw_owner_id[0x1]; - u8 reserved_at_61f[0x1e1]; + u8 reserved_at_61f[0x109]; + u8 dma_mmo_qp[0x1]; + u8 regexp_mmo_qp[0x1]; + u8 compress_mmo_qp[0x1]; + u8 decompress_mmo_qp[0x1]; + u8 reserved_at_624[0xd4]; }; struct mlx5_ifc_qos_cap_bits { @@ -3244,7 +3251,8 @@ struct mlx5_ifc_create_qp_in_bits { u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; - u8 reserved_at_40[0x40]; + u8 qpc_ext[0x1]; + u8 reserved_at_41[0x3f]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
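Because mlx5_prm.h describes the command layouts bit by bit, carving a new field out of a reserved range must keep the total width of that range unchanged, otherwise every later field shifts. A small compile-time check of the three splits in this patch is sketched below; the enum names are hypothetical and the widths are copied from the diff above, so treat it as an illustration rather than part of the patch.

```c
/* Widths in bits, copied from the mlx5_prm.h hunks above.
 * The enum names are illustrative only. */
enum {
	/* cmd_hca_cap: reserved_at_2a0[0x10] before the patch. */
	OLD_2A0_WIDTH = 0x10,
	/* reserved[0xe] + regexp_mmo_sq[0x1] + reserved[0x1] after. */
	NEW_2A0_WIDTH = 0xe + 0x1 + 0x1,

	/* cmd_hca_cap: reserved_at_61f[0x1e1] before the patch. */
	OLD_61F_WIDTH = 0x1e1,
	/* reserved[0x109] + four 1-bit *_mmo_qp fields + reserved[0xd4]. */
	NEW_61F_WIDTH = 0x109 + 4 * 0x1 + 0xd4,

	/* create_qp_in: reserved_at_40[0x40] before the patch. */
	OLD_QP_IN_40_WIDTH = 0x40,
	/* qpc_ext[0x1] + reserved_at_41[0x3f] after. */
	NEW_QP_IN_40_WIDTH = 0x1 + 0x3f,
};

/* Each split must preserve the width of the range it replaces. */
_Static_assert(NEW_2A0_WIDTH == OLD_2A0_WIDTH,
	       "regexp_mmo_sq split changes the cmd_hca_cap layout");
_Static_assert(NEW_61F_WIDTH == OLD_61F_WIDTH,
	       "*_mmo_qp split changes the cmd_hca_cap layout");
_Static_assert(NEW_QP_IN_40_WIDTH == OLD_QP_IN_40_WIDTH,
	       "qpc_ext split changes the create_qp_in layout");
```

All three assertions hold for the widths shown in the diff, so the fields that follow each split keep their original offsets.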
* [dpdk-dev] [PATCH 3/5] common/mlx5: add MMO configuration for the DevX QP 2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane @ 2021-09-03 14:21 ` Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 4/5] compress/mlx5: refactor queue HW object Raja Zidane 2021-09-03 14:21 ` [dpdk-dev] [PATCH 5/5] regex/mlx5: refactor HW queue objects Raja Zidane 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-03 14:21 UTC (permalink / raw) To: dev A new configuration bit, MMO, was added to the QP context. When it is set, MMO WQEs are supported on this QP. Each MMO type is supported only when the matching HCA capability is set: DMA MMO requires dma_mmo_qp==1, REGEXP MMO requires regexp_mmo_qp==1, COMPRESS MMO requires compress_mmo_qp==1, and DECOMPRESS MMO requires decompress_mmo_qp==1. Add support in the DevX interface for setting the MMO bit. 
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/common/mlx5/mlx5_devx_cmds.c | 5 +++++ drivers/common/mlx5/mlx5_devx_cmds.h | 1 + drivers/common/mlx5/mlx5_prm.h | 27 ++++++++++++++++++++++++++- 3 files changed, 32 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index fbc2123883..4678288207 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2028,6 +2028,11 @@ mlx5_devx_cmd_create_qp(void *ctx, MLX5_SET(qpc, qpc, ts_format, attr->ts_format); MLX5_SET(qpc, qpc, user_index, attr->user_index); if (attr->uar_index) { + if(attr->mmo) { + void *qpc_ext_and_pas_list = MLX5_ADDR_OF(create_qp_in, in, qpc_extension_and_pas_list); + void* qpc_ext = MLX5_ADDR_OF(qpc_extension_and_pas_list, qpc_ext_and_pas_list, qpc_data_extension); + MLX5_SET(qpc_extension, qpc_ext, mmo, 1); + } MLX5_SET(qpc, qpc, pm_state, MLX5_QP_PM_MIGRATED); MLX5_SET(qpc, qpc, uar_page, attr->uar_index); if (attr->log_page_size > MLX5_ADAPTER_PAGE_SHIFT) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b21df0fd9b..e149f8b4f5 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -403,6 +403,7 @@ struct mlx5_devx_qp_attr { uint32_t wq_umem_id; uint64_t wq_umem_offset; uint32_t user_index:24; + uint32_t mmo:1; }; struct mlx5_devx_virtio_q_couners_attr { diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 070c538c8c..4130b618c5 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -3243,6 +3243,28 @@ struct mlx5_ifc_create_qp_out_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_qpc_extension_bits { + u8 reserved_at_0[0x2]; + u8 mmo[0x1]; + u8 reserved_at_3[0x5fd]; +}; + +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_pas_list_bits { + u8 pas[0][0x40]; +}; 
+ +#ifdef PEDANTIC +#pragma GCC diagnostic ignored "-Wpedantic" +#endif +struct mlx5_ifc_qpc_extension_and_pas_list_bits { + struct mlx5_ifc_qpc_extension_bits qpc_data_extension; + u8 pas[0][0x40]; +}; + + #ifdef PEDANTIC #pragma GCC diagnostic ignored "-Wpedantic" #endif @@ -3260,7 +3282,10 @@ struct mlx5_ifc_create_qp_in_bits { u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; - u8 pas[0][0x40]; + union { + struct mlx5_ifc_qpc_pas_list_bits qpc_pas_list; + struct mlx5_ifc_qpc_extension_and_pas_list_bits qpc_extension_and_pas_list; + }; }; #ifdef PEDANTIC #pragma GCC diagnostic error "-Wpedantic" -- 2.17.1 ^ permalink raw reply [flat|nested] 38+ messages in thread
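To picture where the new flag ends up: the qpc_extension layout in 3/5 places mmo right after two reserved bits. Assuming the usual mlx5_ifc convention, where bit offsets count from the most significant bit of a big-endian buffer, the flag falls in the first byte of the extension. The following is a stand-alone sketch only. set_bit_msb_first() and qpc_ext_init() are hypothetical helpers written for this note; they stand in for the MLX5_SET macro, whose real implementation is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QPC_EXT_BITS 0x600u	/* 0x2 + 0x1 + 0x5fd, widths from the patch */
#define QPC_EXT_MMO_BIT 2u	/* mmo follows reserved_at_0[0x2] */

/* Set one bit in a byte buffer, numbering bits from the MSB of byte 0. */
static void
set_bit_msb_first(uint8_t *buf, size_t bit)
{
	buf[bit / 8] |= (uint8_t)(0x80u >> (bit % 8));
}

/* Zero an extension buffer and optionally mark it as MMO-enabled. */
static void
qpc_ext_init(uint8_t *ext, int mmo)
{
	memset(ext, 0, QPC_EXT_BITS / 8);
	if (mmo)
		set_bit_msb_first(ext, QPC_EXT_MMO_BIT);
}
```

In the patch itself the bit is written only when attr->mmo is set, and the callers in 4/5 and 5/5 derive attr->mmo from the *_mmo_qp capability bits added in 2/5.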
* [dpdk-dev] [PATCH 4/5] compress/mlx5: refactor queue HW object
  2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane
                   ` (2 preceding siblings ...)
  2021-09-03 14:21 ` [dpdk-dev] [PATCH 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
@ 2021-09-03 14:21 ` Raja Zidane
  2021-09-03 14:21 ` [dpdk-dev] [PATCH 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
  4 siblings, 0 replies; 38+ messages in thread
From: Raja Zidane @ 2021-09-03 14:21 UTC (permalink / raw)
  To: dev

The mlx5 PMD for compress class uses an MMO WQE operated by the GGA
engine in BF devices.
Currently, all the MMO WQEs are managed by the SQ object.
Starting from BF3, the queue of the MMO WQEs should be connected to the
GGA engine using a new configuration, MMO, that will be supported only
in the QP object.
The FW introduced new capabilities to define whether the MMO
configuration should be configured for the GGA queue.
Replace all the GGA queue objects with QPs and set the MMO configuration
according to the new FW capabilities.

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/compress/mlx5/mlx5_compress.c | 71 ++++++++++++++++-----------
 1 file changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/compress/mlx5/mlx5_compress.c b/drivers/compress/mlx5/mlx5_compress.c
index 883e720ec1..d65e1e14da 100644
--- a/drivers/compress/mlx5/mlx5_compress.c
+++ b/drivers/compress/mlx5/mlx5_compress.c
@@ -40,7 +40,7 @@ struct mlx5_compress_priv {
 	void *uar;
 	uint32_t pdn; /* Protection Domain number. */
 	uint8_t min_block_size;
-	uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */
+	uint8_t qp_ts_format; /* Whether QP supports timestamp formats. */
 	/* Minimum huffman block size supported by the device. */
 	struct ibv_pd *pd;
 	struct rte_compressdev_config dev_config;
@@ -48,6 +48,13 @@ struct mlx5_compress_priv {
 	rte_spinlock_t xform_sl;
 	struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */
 	volatile uint64_t *uar_addr;
+	/* HCA caps*/
+	uint32_t mmo_decomp_sq:1;
+	uint32_t mmo_decomp_qp:1;
+	uint32_t mmo_comp_sq:1;
+	uint32_t mmo_comp_qp:1;
+	uint32_t mmo_dma_sq:1;
+	uint32_t mmo_dma_qp:1;
 #ifndef RTE_ARCH_64
 	rte_spinlock_t uar32_sl;
 #endif /* RTE_ARCH_64 */
@@ -61,7 +68,7 @@ struct mlx5_compress_qp {
 	struct mlx5_mr_ctrl mr_ctrl;
 	int socket_id;
 	struct mlx5_devx_cq cq;
-	struct mlx5_devx_sq sq;
+	struct mlx5_devx_qp qp;
 	struct mlx5_pmd_mr opaque_mr;
 	struct rte_comp_op **ops;
 	struct mlx5_compress_priv *priv;
@@ -134,8 +141,8 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id)
 {
 	struct mlx5_compress_qp *qp = dev->data->queue_pairs[qp_id];
 
-	if (qp->sq.sq != NULL)
-		mlx5_devx_sq_destroy(&qp->sq);
+	if (qp->qp.qp != NULL)
+		mlx5_devx_qp_destroy(&qp->qp);
 	if (qp->cq.cq != NULL)
 		mlx5_devx_cq_destroy(&qp->cq);
 	if (qp->opaque_mr.obj != NULL) {
@@ -152,12 +159,12 @@ mlx5_compress_qp_release(struct rte_compressdev *dev, uint16_t qp_id)
 }
 
 static void
-mlx5_compress_init_sq(struct mlx5_compress_qp *qp)
+mlx5_compress_init_qp(struct mlx5_compress_qp *qp)
 {
 	volatile struct mlx5_gga_wqe *restrict wqe =
-				    (volatile struct mlx5_gga_wqe *)qp->sq.wqes;
+				    (volatile struct mlx5_gga_wqe *)qp->qp.wqes;
 	volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr;
-	const uint32_t sq_ds = rte_cpu_to_be_32((qp->sq.sq->id << 8) | 4u);
+	const uint32_t sq_ds = rte_cpu_to_be_32((qp->qp.qp->id << 8) | 4u);
 	const uint32_t flags = RTE_BE32(MLX5_COMP_ALWAYS <<
 					MLX5_COMP_MODE_OFFSET);
 	const uint32_t opaq_lkey = rte_cpu_to_be_32(qp->opaque_mr.lkey);
@@ -182,15 +189,10 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id,
 	struct mlx5_devx_cq_attr cq_attr = {
 		.uar_page_id = mlx5_os_get_devx_uar_page_id(priv->uar),
 	};
-	struct mlx5_devx_create_sq_attr sq_attr = {
+	struct mlx5_devx_qp_attr qp_attr = {
+		.pd = priv->pdn,
+		.uar_index = mlx5_os_get_devx_uar_page_id(priv->uar),
 		.user_index = qp_id,
-		.wq_attr = (struct mlx5_devx_wq_attr){
-			.pd = priv->pdn,
-			.uar_page = mlx5_os_get_devx_uar_page_id(priv->uar),
-		},
-	};
-	struct mlx5_devx_modify_sq_attr modify_attr = {
-		.state = MLX5_SQC_STATE_RDY,
 	};
 	uint32_t log_ops_n = rte_log2_u32(max_inflight_ops);
 	uint32_t alloc_size = sizeof(*qp);
@@ -242,24 +244,26 @@ mlx5_compress_qp_setup(struct rte_compressdev *dev, uint16_t qp_id,
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto err;
 	}
-	sq_attr.cqn = qp->cq.cq->id;
-	sq_attr.ts_format = mlx5_ts_format_conv(priv->sq_ts_format);
-	ret = mlx5_devx_sq_create(priv->ctx, &qp->sq, log_ops_n, &sq_attr,
+	qp_attr.cqn = qp->cq.cq->id;
+	qp_attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format);
+	qp_attr.rq_size = 0;
+	qp_attr.sq_size = RTE_BIT32(log_ops_n);
+	qp_attr.mmo = priv->mmo_decomp_qp && priv->mmo_comp_qp && priv->mmo_dma_qp;
+	ret = mlx5_devx_qp_create(priv->ctx, &qp->qp, log_ops_n, &qp_attr,
 				  socket_id);
 	if (ret != 0) {
-		DRV_LOG(ERR, "Failed to create SQ.");
+		DRV_LOG(ERR, "Failed to create QP.");
 		goto err;
 	}
-	mlx5_compress_init_sq(qp);
-	ret = mlx5_devx_cmd_modify_sq(qp->sq.sq, &modify_attr);
-	if (ret != 0) {
-		DRV_LOG(ERR, "Can't change SQ state to ready.");
+	mlx5_compress_init_qp(qp);
+	ret = mlx5_devx_qp2rts(&qp->qp, 0);
+	if (ret) {
 		goto err;
 	}
 	/* Save pointer of global generation number to check memory event. */
 	qp->mr_ctrl.dev_gen_ptr = &priv->mr_scache.dev_gen;
 	DRV_LOG(INFO, "QP %u: SQN=0x%X CQN=0x%X entries num = %u",
-		(uint32_t)qp_id, qp->sq.sq->id, qp->cq.cq->id, qp->entries_n);
+		(uint32_t)qp_id, qp->qp.qp->id, qp->cq.cq->id, qp->entries_n);
 	return 0;
 err:
 	mlx5_compress_qp_release(dev, qp_id);
@@ -508,7 +512,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops,
 {
 	struct mlx5_compress_qp *qp = queue_pair;
 	volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *)
-							      qp->sq.wqes, *wqe;
+							      qp->qp.wqes, *wqe;
 	struct mlx5_compress_xform *xform;
 	struct rte_comp_op *op;
 	uint16_t mask = qp->entries_n - 1;
@@ -563,7 +567,7 @@ mlx5_compress_enqueue_burst(void *queue_pair, struct rte_comp_op **ops,
 	} while (--remain);
 	qp->stats.enqueued_count += nb_ops;
 	rte_io_wmb();
-	qp->sq.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi);
+	qp->qp.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32(qp->pi);
 	rte_wmb();
 	mlx5_compress_uar_write(*(volatile uint64_t *)wqe, qp->priv);
 	rte_wmb();
@@ -598,7 +602,7 @@ mlx5_compress_cqe_err_handle(struct mlx5_compress_qp *qp,
 	volatile struct mlx5_err_cqe *cqe = (volatile struct mlx5_err_cqe *)
 							      &qp->cq.cqes[idx];
 	volatile struct mlx5_gga_wqe *wqes = (volatile struct mlx5_gga_wqe *)
-								    qp->sq.wqes;
+								    qp->qp.wqes;
 	volatile struct mlx5_gga_compress_opaque *opaq = qp->opaque_mr.addr;
 
 	op->status = RTE_COMP_OP_STATUS_ERROR;
@@ -813,8 +817,9 @@ mlx5_compress_dev_probe(struct rte_device *dev)
 		return -rte_errno;
 	}
 	if (mlx5_devx_cmd_query_hca_attr(ctx, &att) != 0 ||
-	    att.mmo_compress_en == 0 || att.mmo_decompress_en == 0 ||
-	    att.mmo_dma_en == 0) {
+	    ((att.mmo_compress_sq_en == 0 || att.mmo_decompress_sq_en == 0 ||
+	    att.mmo_dma_sq_en == 0) && (att.mmo_compress_qp_en == 0 ||
+	    att.mmo_decompress_qp_en == 0 || att.mmo_dma_qp_en == 0))) {
 		DRV_LOG(ERR, "Not enough capabilities to support compress "
 			"operations, maybe old FW/OFED version?");
 		claim_zero(mlx5_glue->close_device(ctx));
@@ -835,10 +840,16 @@ mlx5_compress_dev_probe(struct rte_device *dev)
 	cdev->enqueue_burst = mlx5_compress_enqueue_burst;
 	cdev->feature_flags = RTE_COMPDEV_FF_HW_ACCELERATED;
 	priv = cdev->data->dev_private;
+	priv->mmo_decomp_sq = att.mmo_decompress_sq_en;
+	priv->mmo_decomp_qp = att.mmo_decompress_qp_en;
+	priv->mmo_comp_sq = att.mmo_compress_sq_en;
+	priv->mmo_comp_qp = att.mmo_compress_qp_en;
+	priv->mmo_dma_sq = att.mmo_dma_sq_en;
+	priv->mmo_dma_qp = att.mmo_dma_qp_en;
 	priv->ctx = ctx;
 	priv->cdev = cdev;
 	priv->min_block_size = att.compress_min_block_size;
-	priv->sq_ts_format = att.sq_ts_format;
+	priv->qp_ts_format = att.qp_ts_format;
 	if (mlx5_compress_hw_global_prepare(priv) != 0) {
 		rte_compressdev_pmd_destroy(priv->cdev);
 		claim_zero(mlx5_glue->close_device(priv->ctx));
-- 
2.17.1

^ permalink raw reply	[flat|nested] 38+ messages in thread
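The probe condition in 4/5 is dense, so here is the same decision written out as small predicates. This is a sketch only: struct mmo_caps and the three helpers are hypothetical names for this note and do not exist in the driver; they mirror the six bits the patch caches in mlx5_compress_priv.

```c
#include <stdbool.h>

/* Hypothetical mirror of the six capability bits cached by the patch. */
struct mmo_caps {
	unsigned int comp_sq:1;
	unsigned int decomp_sq:1;
	unsigned int dma_sq:1;
	unsigned int comp_qp:1;
	unsigned int decomp_qp:1;
	unsigned int dma_qp:1;
};

/* True when compress, decompress and DMA MMO are all allowed on a QP. */
static bool
mmo_all_on_qp(const struct mmo_caps *c)
{
	return c->comp_qp && c->decomp_qp && c->dma_qp;
}

/* True when all three operations are allowed on an SQ. */
static bool
mmo_all_on_sq(const struct mmo_caps *c)
{
	return c->comp_sq && c->decomp_sq && c->dma_sq;
}

/* The probe rejects the device only if neither complete set is present. */
static bool
compress_probe_ok(const struct mmo_caps *c)
{
	return mmo_all_on_sq(c) || mmo_all_on_qp(c);
}
```

Read this way, a device that reports only the SQ bits still passes the probe, and qp_attr.mmo (the mmo_all_on_qp() case above) is then 0 for the QP that the patch always creates.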
* [dpdk-dev] [PATCH 5/5] regex/mlx5: refactor HW queue objects 2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane ` (3 preceding siblings ...) 2021-09-03 14:21 ` [dpdk-dev] [PATCH 4/5] compress/mlx5: refactor queue HW object Raja Zidane @ 2021-09-03 14:21 ` Raja Zidane 4 siblings, 0 replies; 38+ messages in thread From: Raja Zidane @ 2021-09-03 14:21 UTC (permalink / raw) To: dev The mlx5 PMD for regex class uses an MMO WQE operated by the GGA engine in BF devices. Currently, all the MMO WQEs are managed by the SQ object. Starting from BF3, the queue of the MMO WQEs should be connected to the GGA engine using a new configuration, MMO, that will be supported only in the QP object. The FW introduced new capabilities to define whether the MMO configuration should be configured for the GGA queue. Replace all the GGA queue objects to QP, set MMO configuration according to the new FW capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> --- drivers/regex/mlx5/mlx5_regex.c | 6 +- drivers/regex/mlx5/mlx5_regex.h | 16 ++- drivers/regex/mlx5/mlx5_regex_control.c | 64 +++++---- drivers/regex/mlx5/mlx5_regex_fastpath.c | 159 +++++++++++------------ 4 files changed, 123 insertions(+), 122 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index f17b6df47f..54d9fae03c 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -146,7 +146,7 @@ mlx5_regex_dev_probe(struct rte_device *rte_dev) DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; goto dev_error; - } else if (!attr.regex || attr.regexp_num_of_engines == 0) { + } else if (((!attr.regex) && (!attr.mmo_regex_sq_en) && (!attr.mmo_regex_qp_en)) || attr.regexp_num_of_engines == 0) { DRV_LOG(ERR, "Not enough capabilities to support RegEx, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; @@ -164,7 +164,9 @@ mlx5_regex_dev_probe(struct rte_device 
*rte_dev) rte_errno = ENOMEM; goto dev_error; } - priv->sq_ts_format = attr.sq_ts_format; + priv->mmo_regex_qp_cap = attr.mmo_regex_qp_en; + priv->mmo_regex_sq_cap = attr.mmo_regex_sq_en; + priv->qp_ts_format = attr.qp_ts_format; priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ ret = mlx5_devx_regex_register_read(priv->ctx, 0, diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 514f3408f9..2242d250a3 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -17,12 +17,12 @@ #include "mlx5_rxp.h" #include "mlx5_regex_utils.h" -struct mlx5_regex_sq { +struct mlx5_regex_hw_qp { uint16_t log_nb_desc; /* Log 2 number of desc for this object. */ - struct mlx5_devx_sq sq_obj; /* The SQ DevX object. */ + struct mlx5_devx_qp qp_obj; /* The QP DevX object. */ size_t pi, db_pi; size_t ci; - uint32_t sqn; + uint32_t qpn; }; struct mlx5_regex_cq { @@ -34,10 +34,10 @@ struct mlx5_regex_cq { struct mlx5_regex_qp { uint32_t flags; /* QP user flags. */ uint32_t nb_desc; /* Total number of desc for this qp. */ - struct mlx5_regex_sq *sqs; /* Pointer to sq array. */ - uint16_t nb_obj; /* Number of sq objects. */ + struct mlx5_regex_hw_qp *qps; /* Pointer to qp array. */ + uint16_t nb_obj; /* Number of qp objects. */ struct mlx5_regex_cq cq; /* CQ struct. */ - uint32_t free_sqs; + uint32_t free_qps; struct mlx5_regex_job *jobs; struct ibv_mr *metadata; struct ibv_mr *outputs; @@ -73,8 +73,10 @@ struct mlx5_regex_priv { /**< Called by memory event callback. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. */ uint8_t is_bf2; /* The device is BF2 device. */ - uint8_t sq_ts_format; /* Whether SQ supports timestamp formats. */ + uint8_t qp_ts_format; /* Whether SQ supports timestamp formats. */ uint8_t has_umr; /* The device supports UMR. 
*/ + uint32_t mmo_regex_qp_cap:1; + uint32_t mmo_regex_sq_cap:1; }; #ifdef HAVE_IBV_FLOW_DV_SUPPORT diff --git a/drivers/regex/mlx5/mlx5_regex_control.c b/drivers/regex/mlx5/mlx5_regex_control.c index 8ce2dabb55..5acb6579e9 100644 --- a/drivers/regex/mlx5/mlx5_regex_control.c +++ b/drivers/regex/mlx5/mlx5_regex_control.c @@ -106,12 +106,12 @@ regex_ctrl_create_cq(struct mlx5_regex_priv *priv, struct mlx5_regex_cq *cq) * 0 on success, a negative errno value otherwise and rte_errno is set. */ static int -regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) +regex_ctrl_destroy_hw_qp(struct mlx5_regex_qp *qp, uint16_t q_ind) { - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; - mlx5_devx_sq_destroy(&sq->sq_obj); - memset(sq, 0, sizeof(*sq)); + mlx5_devx_qp_destroy(&qp_obj->qp_obj); + memset(qp, 0, sizeof(*qp)); return 0; } @@ -131,45 +131,43 @@ regex_ctrl_destroy_sq(struct mlx5_regex_qp *qp, uint16_t q_ind) * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ static int -regex_ctrl_create_sq(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, +regex_ctrl_create_hw_qp(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, uint16_t q_ind, uint16_t log_nb_desc) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - struct mlx5_devx_create_sq_attr attr = { - .user_index = q_ind, + struct mlx5_devx_qp_attr attr = { .cqn = qp->cq.cq_obj.cq->id, - .wq_attr = (struct mlx5_devx_wq_attr){ - .uar_page = priv->uar->page_id, - }, - .ts_format = mlx5_ts_format_conv(priv->sq_ts_format), - }; - struct mlx5_devx_modify_sq_attr modify_attr = { - .state = MLX5_SQC_STATE_RDY, + .uar_index = priv->uar->page_id, + .ts_format = mlx5_ts_format_conv(priv->qp_ts_format), + .user_index = q_ind, }; - struct mlx5_regex_sq *sq = &qp->sqs[q_ind]; + struct mlx5_regex_hw_qp *qp_obj = &qp->qps[q_ind]; uint32_t pd_num = 0; int ret; - sq->log_nb_desc = log_nb_desc; - sq->sqn = q_ind; - sq->ci = 0; - sq->pi = 0; + qp_obj->log_nb_desc = log_nb_desc; + qp_obj->qpn = q_ind; + qp_obj->ci = 0; + qp_obj->pi = 0; ret = regex_get_pdn(priv->pd, &pd_num); if (ret) return ret; - attr.wq_attr.pd = pd_num; - ret = mlx5_devx_sq_create(priv->ctx, &sq->sq_obj, + attr.pd = pd_num; + attr.rq_size = 0; + attr.sq_size = RTE_BIT32(MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc)); + attr.mmo = priv->mmo_regex_qp_cap; + ret = mlx5_devx_qp_create(priv->ctx, &qp_obj->qp_obj, MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_nb_desc), &attr, SOCKET_ID_ANY); if (ret) { - DRV_LOG(ERR, "Can't create SQ object."); + DRV_LOG(ERR, "Can't create QP object."); rte_errno = ENOMEM; return -rte_errno; } - ret = mlx5_devx_cmd_modify_sq(sq->sq_obj.sq, &modify_attr); + ret = mlx5_devx_qp2rts(&qp_obj->qp_obj, 0); if (ret) { - DRV_LOG(ERR, "Can't change SQ state to ready."); - regex_ctrl_destroy_sq(qp, q_ind); + DRV_LOG(ERR, "Can't change QP state to RTS."); + regex_ctrl_destroy_hw_qp(qp, q_ind); rte_errno = ENOMEM; return -rte_errno; } @@ -224,10 +222,10 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t 
qp_ind, (1 << MLX5_REGEX_WQE_LOG_NUM(priv->has_umr, log_desc)); else qp->nb_obj = 1; - qp->sqs = rte_malloc(NULL, - qp->nb_obj * sizeof(struct mlx5_regex_sq), 64); - if (!qp->sqs) { - DRV_LOG(ERR, "Can't allocate sq array memory."); + qp->qps = rte_malloc(NULL, + qp->nb_obj * sizeof(struct mlx5_regex_hw_qp), 64); + if (!qp->qps) { + DRV_LOG(ERR, "Can't allocate qp array memory."); rte_errno = ENOMEM; return -rte_errno; } @@ -238,9 +236,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, goto err_cq; } for (i = 0; i < qp->nb_obj; i++) { - ret = regex_ctrl_create_sq(priv, qp, i, log_desc); + ret = regex_ctrl_create_hw_qp(priv, qp, i, log_desc); if (ret) { - DRV_LOG(ERR, "Can't create sq."); + DRV_LOG(ERR, "Can't create qp object."); goto err_btree; } nb_sq_config++; @@ -266,9 +264,9 @@ mlx5_regex_qp_setup(struct rte_regexdev *dev, uint16_t qp_ind, mlx5_mr_btree_free(&qp->mr_ctrl.cache_bh); err_btree: for (i = 0; i < nb_sq_config; i++) - regex_ctrl_destroy_sq(qp, i); + regex_ctrl_destroy_hw_qp(qp, i); regex_ctrl_destroy_cq(&qp->cq); err_cq: - rte_free(qp->sqs); + rte_free(qp->qps); return ret; } diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c index 786718af53..ab390567b8 100644 --- a/drivers/regex/mlx5/mlx5_regex_fastpath.c +++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c @@ -39,13 +39,13 @@ #define MLX5_REGEX_KLMS_SIZE \ ((MLX5_REGEX_MAX_KLM_NUM) * sizeof(struct mlx5_klm)) /* In WQE set mode, the pi should be quarter of the MLX5_REGEX_MAX_WQE_INDEX. 
*/ -#define MLX5_REGEX_UMR_SQ_PI_IDX(pi, ops) \ +#define MLX5_REGEX_UMR_QP_PI_IDX(pi, ops) \ (((pi) + (ops)) & (MLX5_REGEX_MAX_WQE_INDEX >> 2)) static inline uint32_t -sq_size_get(struct mlx5_regex_sq *sq) +qp_size_get(struct mlx5_regex_hw_qp *qp) { - return (1U << sq->log_nb_desc); + return (1U << qp->log_nb_desc); } static inline uint32_t @@ -144,11 +144,11 @@ mlx5_regex_addr2mr(struct mlx5_regex_priv *priv, struct mlx5_mr_ctrl *mr_ctrl, static inline void -__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +__prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ? @@ -168,13 +168,13 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F | RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F))) group0 = op->group_id0; - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; int ds = 4; /* ctrl + meta + input + output */ set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, (priv->has_umr ? 
(pi * 4 + 3) : pi), MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, - sq->sq_obj.sq->id, 0, ds, 0, 0); + qp_obj->qp_obj.qp->id, 0, ds, 0, 0); set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3, control); struct mlx5_wqe_data_seg *input_seg = @@ -188,7 +188,7 @@ __prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, static inline void prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops *op, + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops *op, struct mlx5_regex_job *job) { struct mlx5_klm klm; @@ -196,26 +196,26 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.byte_count = rte_pktmbuf_data_len(op->mbuf); klm.mkey = mlx5_regex_addr2mr(priv, &qp->mr_ctrl, op->mbuf); klm.address = rte_pktmbuf_mtod(op->mbuf, uintptr_t); - __prep_one(priv, sq, op, job, sq->pi, &klm); - sq->db_pi = sq->pi; - sq->pi = (sq->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; + __prep_one(priv, qp_obj, op, job, qp_obj->pi, &klm); + qp_obj->db_pi = qp_obj->pi; + qp_obj->pi = (qp_obj->pi + 1) & MLX5_REGEX_MAX_WQE_INDEX; } static inline void -send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) +send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp_obj) { struct mlx5dv_devx_uar *uar = priv->uar; - size_t wqe_offset = (sq->db_pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (qp_obj->db_pi & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB << (priv->has_umr ? 2 : 0)) + (priv->has_umr ? MLX5_REGEX_UMR_WQE_SIZE : 0); - uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes + wqe_offset; + uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes + wqe_offset; /* Or the fm_ce_se instead of set, avoid the fence be cleared. */ ((struct mlx5_wqe_ctrl_seg *)wqe)->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; uint64_t *doorbell_addr = (uint64_t *)((uint8_t *)uar->base_addr + 0x800); rte_io_wmb(); - sq->sq_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? 
- (sq->db_pi * 4 + 3) : sq->db_pi) & + qp_obj->qp_obj.db_rec[MLX5_SND_DBR] = rte_cpu_to_be_32((priv->has_umr ? + (qp_obj->db_pi * 4 + 3) : qp_obj->db_pi) & MLX5_REGEX_MAX_WQE_INDEX); rte_wmb(); *doorbell_addr = *(volatile uint64_t *)wqe; @@ -223,15 +223,15 @@ send_doorbell(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq) } static inline int -get_free(struct mlx5_regex_sq *sq, uint8_t has_umr) { - return (sq_size_get(sq) - ((sq->pi - sq->ci) & +get_free(struct mlx5_regex_hw_qp *qp, uint8_t has_umr) { + return (qp_size_get(qp) - ((qp->pi - qp->ci) & (has_umr ? (MLX5_REGEX_MAX_WQE_INDEX >> 2) : MLX5_REGEX_MAX_WQE_INDEX))); } static inline uint32_t -job_id_get(uint32_t qid, size_t sq_size, size_t index) { - return qid * sq_size + (index & (sq_size - 1)); +job_id_get(uint32_t qid, size_t qp_size, size_t index) { + return qid * qp_size + (index & (qp_size - 1)); } #ifdef HAVE_MLX5_UMR_IMKEY @@ -242,14 +242,14 @@ mkey_klm_available(struct mlx5_klm *klm, uint32_t pos, uint32_t new) } static inline void -complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, +complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_hw_qp *qp_obj, struct mlx5_regex_job *mkey_job, size_t umr_index, uint32_t klm_size, uint32_t total_len) { - size_t wqe_offset = (umr_index & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (umr_index & (qp_size_get(qp_obj) - 1)) * (MLX5_SEND_WQE_BB * 4); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp_obj->qp_obj.wqes + wqe_offset); struct mlx5_wqe_umr_ctrl_seg *ucseg = (struct mlx5_wqe_umr_ctrl_seg *)(wqe + 1); struct mlx5_wqe_mkey_context_seg *mkc = @@ -260,7 +260,7 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* Set WQE control seg. Non-inline KLM UMR WQE size must be 9 WQE_DS. 
*/ set_wqe_ctrl_seg(wqe, (umr_index * 4), MLX5_OPCODE_UMR, - 0, sq->sq_obj.sq->id, 0, 9, 0, + 0, qp_obj->qp_obj.qp->id, 0, 9, 0, rte_cpu_to_be_32(mkey_job->imkey->id)); /* Set UMR WQE control seg. */ ucseg->mkey_mask |= rte_cpu_to_be_64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | @@ -287,36 +287,35 @@ complete_umr_wqe(struct mlx5_regex_qp *qp, struct mlx5_regex_sq *sq, } static inline void -prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_sq *sq, +prep_nop_regex_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_hw_qp *qp, struct rte_regex_ops *op, struct mlx5_regex_job *job, size_t pi, struct mlx5_klm *klm) { - size_t wqe_offset = (pi & (sq_size_get(sq) - 1)) * + size_t wqe_offset = (pi & (qp_size_get(qp) - 1)) * (MLX5_SEND_WQE_BB << 2); struct mlx5_wqe_ctrl_seg *wqe = (struct mlx5_wqe_ctrl_seg *)((uint8_t *) - (uintptr_t)sq->sq_obj.wqes + wqe_offset); + (uintptr_t)qp->qp_obj.wqes + wqe_offset); /* Clear the WQE memory used as UMR WQE previously. */ if ((rte_be_to_cpu_32(wqe->opmod_idx_opcode) & 0xff) != MLX5_OPCODE_NOP) memset(wqe, 0, MLX5_REGEX_UMR_WQE_SIZE); /* UMR WQE size is 9 DS, align nop WQE to 3 WQEBBS(12 DS). 
*/ - set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, sq->sq_obj.sq->id, + set_wqe_ctrl_seg(wqe, pi * 4, MLX5_OPCODE_NOP, 0, qp->qp_obj.qp->id, 0, 12, 0, 0); - __prep_one(priv, sq, op, job, pi, klm); + __prep_one(priv, qp, op, job, pi, klm); } static inline void prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, - struct mlx5_regex_sq *sq, struct rte_regex_ops **op, size_t nb_ops) + struct mlx5_regex_hw_qp *qp_obj, struct rte_regex_ops **op, size_t nb_ops) { struct mlx5_regex_job *job = NULL; - size_t sqid = sq->sqn, mkey_job_id = 0; + size_t hw_qpid = qp_obj->qpn, mkey_job_id = 0; size_t left_ops = nb_ops; uint32_t klm_num = 0, len; struct mlx5_klm *mkey_klm = NULL; struct mlx5_klm klm; - sqid = sq->sqn; while (left_ops--) rte_prefetch0(op[left_ops]); left_ops = nb_ops; @@ -328,7 +327,7 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, */ while (left_ops--) { struct rte_mbuf *mbuf = op[left_ops]->mbuf; - size_t pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, left_ops); + size_t pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, left_ops); if (mbuf->nb_segs > 1) { size_t scatter_size = 0; @@ -340,16 +339,16 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, * WQE in the next WQE set. */ if (mkey_klm) - complete_umr_wqe(qp, sq, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], - MLX5_REGEX_UMR_SQ_PI_IDX(pi, 1), + MLX5_REGEX_UMR_QP_PI_IDX(pi, 1), klm_num, len); /* * Get the indircet mkey and KLM array index * from the last WQE set. 
*/ - mkey_job_id = job_id_get(sqid, - sq_size_get(sq), pi); + mkey_job_id = job_id_get(hw_qpid, + qp_size_get(qp_obj), pi); mkey_klm = qp->jobs[mkey_job_id].imkey_array; klm_num = 0; len = 0; @@ -383,22 +382,22 @@ prep_regex_umr_wqe_set(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp, klm.address = rte_pktmbuf_mtod(mbuf, uintptr_t); klm.byte_count = rte_pktmbuf_data_len(mbuf); } - job = &qp->jobs[job_id_get(sqid, sq_size_get(sq), pi)]; + job = &qp->jobs[job_id_get(hw_qpid, qp_size_get(qp_obj), pi)]; /* * Build the nop + RegEx WQE set by default. The fist nop WQE * will be updated later as UMR WQE if scattered mubf exist. */ - prep_nop_regex_wqe_set(priv, sq, op[left_ops], job, pi, &klm); + prep_nop_regex_wqe_set(priv, qp_obj, op[left_ops], job, pi, &klm); } /* * Scattered mbuf have been added to the KLM array. Complete the build * of UMR WQE, update the first nop WQE as UMR WQE. */ if (mkey_klm) - complete_umr_wqe(qp, sq, &qp->jobs[mkey_job_id], sq->pi, + complete_umr_wqe(qp, qp_obj, &qp->jobs[mkey_job_id], qp_obj->pi, klm_num, len); - sq->db_pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops - 1); - sq->pi = MLX5_REGEX_UMR_SQ_PI_IDX(sq->pi, nb_ops); + qp_obj->db_pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops - 1); + qp_obj->pi = MLX5_REGEX_UMR_QP_PI_IDX(qp_obj->pi, nb_ops); } uint16_t @@ -407,21 +406,21 @@ mlx5_regexdev_enqueue_gga(struct rte_regexdev *dev, uint16_t qp_id, { struct mlx5_regex_priv *priv = dev->data->dev_private; struct mlx5_regex_qp *queue = &priv->qps[qp_id]; - struct mlx5_regex_sq *sq; - size_t sqid, nb_left = nb_ops, nb_desc; + struct mlx5_regex_hw_qp *qp_obj; + size_t hw_qpid, nb_left = nb_ops, nb_desc; - while ((sqid = ffs(queue->free_sqs))) { - sqid--; /* ffs returns 1 for bit 0 */ - sq = &queue->sqs[sqid]; - nb_desc = get_free(sq, priv->has_umr); + while ((hw_qpid = ffs(queue->free_qps))) { + hw_qpid--; /* ffs returns 1 for bit 0 */ + qp_obj = &queue->qps[hw_qpid]; + nb_desc = get_free(qp_obj, priv->has_umr); if (nb_desc) { /* The ops be 
handled can't exceed nb_ops. */
 			if (nb_desc > nb_left)
 				nb_desc = nb_left;
 			else
-				queue->free_sqs &= ~(1 << sqid);
-			prep_regex_umr_wqe_set(priv, queue, sq, ops, nb_desc);
-			send_doorbell(priv, sq);
+				queue->free_qps &= ~(1 << hw_qpid);
+			prep_regex_umr_wqe_set(priv, queue, qp_obj, ops, nb_desc);
+			send_doorbell(priv, qp_obj);
 			nb_left -= nb_desc;
 		}
 		if (!nb_left)
@@ -440,23 +439,23 @@ mlx5_regexdev_enqueue(struct rte_regexdev *dev, uint16_t qp_id,
 {
 	struct mlx5_regex_priv *priv = dev->data->dev_private;
 	struct mlx5_regex_qp *queue = &priv->qps[qp_id];
-	struct mlx5_regex_sq *sq;
-	size_t sqid, job_id, i = 0;
-
-	while ((sqid = ffs(queue->free_sqs))) {
-		sqid--; /* ffs returns 1 for bit 0 */
-		sq = &queue->sqs[sqid];
-		while (get_free(sq, priv->has_umr)) {
-			job_id = job_id_get(sqid, sq_size_get(sq), sq->pi);
-			prep_one(priv, queue, sq, ops[i], &queue->jobs[job_id]);
+	struct mlx5_regex_hw_qp *qp_obj;
+	size_t hw_qpid, job_id, i = 0;
+
+	while ((hw_qpid = ffs(queue->free_qps))) {
+		hw_qpid--; /* ffs returns 1 for bit 0 */
+		qp_obj = &queue->qps[hw_qpid];
+		while (get_free(qp_obj, priv->has_umr)) {
+			job_id = job_id_get(hw_qpid, qp_size_get(qp_obj), qp_obj->pi);
+			prep_one(priv, queue, qp_obj, ops[i], &queue->jobs[job_id]);
 			i++;
 			if (unlikely(i == nb_ops)) {
-				send_doorbell(priv, sq);
+				send_doorbell(priv, qp_obj);
 				goto out;
 			}
 		}
-		queue->free_sqs &= ~(1 << sqid);
-		send_doorbell(priv, sq);
+		queue->free_qps &= ~(1 << hw_qpid);
+		send_doorbell(priv, qp_obj);
 	}
 
 out:
@@ -566,21 +565,21 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id,
 		uint16_t wq_counter
 			= (rte_be_to_cpu_16(cqe->wqe_counter) + 1) &
 			  MLX5_REGEX_MAX_WQE_INDEX;
-		size_t sqid = cqe->rsvd3[2];
-		struct mlx5_regex_sq *sq = &queue->sqs[sqid];
+		size_t hw_qpid = cqe->rsvd3[2];
+		struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid];
 
 		/* UMR mode WQE counter move as WQE set(4 WQEBBS).*/
 		if (priv->has_umr)
 			wq_counter >>= 2;
-		while (sq->ci != wq_counter) {
+		while (qp_obj->ci != wq_counter) {
 			if (unlikely(i == nb_ops)) {
 				/* Return without updating cq->ci */
 				goto out;
 			}
-			uint32_t job_id = job_id_get(sqid, sq_size_get(sq),
-						     sq->ci);
+			uint32_t job_id = job_id_get(hw_qpid, qp_size_get(qp_obj),
+						     qp_obj->ci);
 			extract_result(ops[i], &queue->jobs[job_id]);
-			sq->ci = (sq->ci + 1) & (priv->has_umr ?
+			qp_obj->ci = (qp_obj->ci + 1) & (priv->has_umr ?
 				 (MLX5_REGEX_MAX_WQE_INDEX >> 2) :
 				  MLX5_REGEX_MAX_WQE_INDEX);
 			i++;
@@ -588,7 +587,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id,
 		cq->ci = (cq->ci + 1) & 0xffffff;
 		rte_wmb();
 		cq->cq_obj.db_rec[0] = rte_cpu_to_be_32(cq->ci);
-		queue->free_sqs |= (1 << sqid);
+		queue->free_qps |= (1 << hw_qpid);
 	}
 
 out:
@@ -597,15 +596,15 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id,
 }
 
 static void
-setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue)
+setup_qps(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue)
 {
-	size_t sqid, entry;
+	size_t hw_qpid, entry;
 	uint32_t job_id;
-	for (sqid = 0; sqid < queue->nb_obj; sqid++) {
-		struct mlx5_regex_sq *sq = &queue->sqs[sqid];
-		uint8_t *wqe = (uint8_t *)(uintptr_t)sq->sq_obj.wqes;
-		for (entry = 0 ; entry < sq_size_get(sq); entry++) {
-			job_id = sqid * sq_size_get(sq) + entry;
+	for (hw_qpid = 0; hw_qpid < queue->nb_obj; hw_qpid++) {
+		struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid];
+		uint8_t *wqe = (uint8_t *)(uintptr_t)qp_obj->qp_obj.wqes;
+		for (entry = 0 ; entry < qp_size_get(qp_obj); entry++) {
+			job_id = hw_qpid * qp_size_get(qp_obj) + entry;
 			struct mlx5_regex_job *job = &queue->jobs[job_id];
 
 			/* Fill UMR WQE with NOP in advanced. */
@@ -613,7 +612,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue)
 				set_wqe_ctrl_seg
 					((struct mlx5_wqe_ctrl_seg *)wqe,
 					 entry * 2, MLX5_OPCODE_NOP, 0,
-					 sq->sq_obj.sq->id, 0, 12, 0, 0);
+					 qp_obj->qp_obj.qp->id, 0, 12, 0, 0);
 				wqe += MLX5_REGEX_UMR_WQE_SIZE;
 			}
 			set_metadata_seg((struct mlx5_wqe_metadata_seg *)
@@ -627,7 +626,7 @@ setup_sqs(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *queue)
 					 (uintptr_t)job->output);
 			wqe += 64;
 		}
-		queue->free_sqs |= 1 << sqid;
+		queue->free_qps |= 1 << hw_qpid;
 	}
 }
 
@@ -737,7 +736,7 @@ mlx5_regexdev_setup_fastpath(struct mlx5_regex_priv *priv, uint32_t qp_id)
 		return err;
 	}
 
-	setup_sqs(priv, qp);
+	setup_qps(priv, qp);
 
 	if (priv->has_umr) {
 #ifdef HAVE_IBV_FLOW_DV_SUPPORT
-- 
2.17.1

^ permalink raw reply	[flat|nested] 38+ messages in thread
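Reader's note on the free_qps bookkeeping touched by this patch: the enqueue path picks a hardware QP with ffs() on a bitmap, clears the bit once that QP has no room left, and the dequeue path sets the bit again after completions are consumed. The standalone sketch below models only that bitmap logic. All demo_* names are hypothetical and are not part of the driver; the real code also tracks pi/ci and rings doorbells, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

/* Hypothetical stand-in for the per-queue state; only the bitmap is modeled. */
struct demo_queue {
	uint32_t free_qps; /* bit n set => HW QP n can accept more jobs */
};

/* Return the lowest-numbered free HW QP index, or -1 when none is free. */
static int
demo_pick_free_qp(const struct demo_queue *q)
{
	/* ffs() returns 1 for bit 0 and 0 when no bit is set. */
	return ffs((int)q->free_qps) - 1;
}

/* Enqueue side: the HW QP has no room left, stop selecting it. */
static void
demo_mark_full(struct demo_queue *q, unsigned int hw_qpid)
{
	q->free_qps &= ~(UINT32_C(1) << hw_qpid);
}

/* Dequeue/setup side: the HW QP has room again. */
static void
demo_mark_free(struct demo_queue *q, unsigned int hw_qpid)
{
	q->free_qps |= UINT32_C(1) << hw_qpid;
}
```

With QPs 0 and 1 marked free, demo_pick_free_qp() returns 0; after demo_mark_full(&q, 0) it returns 1, and once both are full it returns -1, which corresponds to the point where the driver's while loop over ffs(queue->free_qps) terminates.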
end of thread, other threads:[~2021-10-05 16:18 UTC | newest]

Thread overview: 38+ messages
2021-09-03 14:21 [dpdk-dev] [PATCH 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-09-03 14:21 ` [dpdk-dev] [PATCH 1/5] common/mlx5: share DevX QP operations Raja Zidane
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 1/5] common/mlx5: share DevX QP operations Raja Zidane
2021-09-15 0:04 ` [dpdk-dev] [PATCH V3 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 1/5] common/mlx5: share DevX QP operations Raja Zidane
2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-09-22 19:48 ` Thomas Monjalon
2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 4/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-09-15 0:05 ` [dpdk-dev] [PATCH V3 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 1/5] common/mlx5: share DevX QP operations Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 4/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-09-28 12:16 ` [dpdk-dev] [PATCH V4 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 1/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 2/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 3/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 4/5] regex/mlx5: refactor HW queue objects Raja Zidane
2021-09-30 5:44 ` [dpdk-dev] [PATCH V5 5/5] compress/mlx5: allow partial transformations support Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 1/5] common/mlx5: share DevX QP operations Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 4/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-10-05 12:27 ` [dpdk-dev] [PATCH V6 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
2021-10-05 16:18 ` [dpdk-dev] [PATCH V6 0/5] mlx5: replaced hardware queue object Thomas Monjalon
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 4/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-09-12 16:36 ` [dpdk-dev] [PATCH V2 5/5] regex/mlx5: refactor HW queue objects Raja Zidane
2021-09-03 14:21 ` [dpdk-dev] [PATCH 2/5] common/mlx5: update new MMO HCA capabilities Raja Zidane
2021-09-03 14:21 ` [dpdk-dev] [PATCH 3/5] common/mlx5: add MMO configuration for the DevX QP Raja Zidane
2021-09-03 14:21 ` [dpdk-dev] [PATCH 4/5] compress/mlx5: refactor queue HW object Raja Zidane
2021-09-03 14:21 ` [dpdk-dev] [PATCH 5/5] regex/mlx5: refactor HW queue objects Raja Zidane