* [PATCH 0/3] Error report improvement and fix
From: Gavin Li @ 2024-09-06 10:21 UTC
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

This patch set is to improve error handling in pmd and under layer.

Gavin Li (3):
  net/mlx5: set rte errno if malloc failed
  net/mlx5/hws: add log for failing to create rule in HWS
  net/mlx5/hws: print CQE error syndrome and more information

 drivers/net/mlx5/hws/mlx5dr_rule.c |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_send.c |  9 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c    | 31 +++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.34.1

* [PATCH 1/3] net/mlx5: set rte errno if malloc failed
From: Gavin Li @ 2024-09-06 10:21 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou, Alexander Kozyrev
  Cc: dev, rasland, Minggang Li (Gavin), stable

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

rte_errno should be set if anything wrong happened in under layer so
that user can figure out what's going on.

There were some cases that did not set it when ipool allcation failed.

To fix the issue, set rte_errno to ENOMEM if mlx5_ipool_malloc failed
to allocate ID.

Fixes: c40c061a02 ("net/mlx5: add basic flow queue operation")
Fixes: 48fbb0e93d ("net/mlx5: support flow meter mark indirect action with HWS")
cc: stable@dpdk.org

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 50888944a5..509de2a6a4 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -1897,7 +1897,7 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	const struct rte_flow_action_meter_mark *meter_mark = action->conf;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_flow_meter_info *fm;
-	uint32_t mtr_id;
+	uint32_t mtr_id = 0;
 	uintptr_t handle = (uintptr_t)MLX5_INDIRECT_ACTION_TYPE_METER_MARK <<
 					MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 
@@ -1909,8 +1909,15 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	if (meter_mark->profile == NULL)
 		return NULL;
 	aso_mtr = mlx5_ipool_malloc(pool->idx_pool, &mtr_id);
-	if (!aso_mtr)
+	if (!aso_mtr) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL,
+				   "failed to allocate aso meter entry");
+		if (mtr_id)
+			mlx5_ipool_free(pool->idx_pool, mtr_id);
 		return NULL;
+	}
 	/* Fill the flow meter parameters. */
 	aso_mtr->type = ASO_METER_INDIRECT;
 	fm = &aso_mtr->fm;
@@ -3918,8 +3925,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3930,8 +3939,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4062,8 +4073,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -4074,8 +4087,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4210,8 +4225,10 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->idx = of->idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		nf->res_idx = res_idx;
 	} else {
 		nf->res_idx = of->res_idx;
-- 
2.34.1

* [PATCH 2/3] net/mlx5/hws: add log for failing to create rule in HWS
From: Gavin Li @ 2024-09-06 10:21 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Minggang Li (Gavin), Alex Vesker

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_rule.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr_rule.c b/drivers/net/mlx5/hws/mlx5dr_rule.c
index 1edb7eac74..5d66d81ea5 100644
--- a/drivers/net/mlx5/hws/mlx5dr_rule.c
+++ b/drivers/net/mlx5/hws/mlx5dr_rule.c
@@ -638,6 +638,7 @@ static int mlx5dr_rule_destroy_hws(struct mlx5dr_rule *rule,
 
 	/* Rule is not completed yet */
 	if (rule->status == MLX5DR_RULE_STATUS_CREATING) {
+		DR_LOG(NOTICE, "Cannot destroy, rule creation still in progress");
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -806,12 +807,14 @@ static int mlx5dr_rule_enqueue_precheck(struct mlx5dr_rule *rule,
 	struct mlx5dr_context *ctx = rule->matcher->tbl->ctx;
 
 	if (unlikely(!attr->user_data)) {
+		DR_LOG(DEBUG, "User data must be provided for rule operations");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
 
 	/* Check if there is room in queue */
 	if (unlikely(mlx5dr_send_engine_full(&ctx->send_queue[attr->queue_id]))) {
+		DR_LOG(NOTICE, "No room in queue[%d]", attr->queue_id);
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -823,6 +826,7 @@ static int mlx5dr_rule_enqueue_precheck_move(struct mlx5dr_rule *rule,
 					     struct mlx5dr_rule_attr *attr)
 {
 	if (unlikely(rule->status != MLX5DR_RULE_STATUS_CREATED)) {
+		DR_LOG(DEBUG, "Cannot move, rule status is invalid");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
@@ -835,6 +839,7 @@ static int mlx5dr_rule_enqueue_precheck_create(struct mlx5dr_rule *rule,
 {
 	if (unlikely(mlx5dr_matcher_is_in_resize(rule->matcher))) {
 		/* Matcher in resize - new rules are not allowed */
+		DR_LOG(NOTICE, "Resizing in progress, cannot create rule");
 		rte_errno = EAGAIN;
 		return rte_errno;
 	}
@@ -1068,6 +1073,7 @@ int mlx5dr_rule_hash_calculate(struct mlx5dr_matcher *matcher,
 	    mlx5dr_table_is_root(matcher->tbl) ||
 	    matcher->tbl->ctx->caps->access_index_mode == MLX5DR_MATCHER_INSERT_BY_HASH ||
 	    matcher->tbl->ctx->caps->flow_table_hash_type != MLX5_FLOW_TABLE_HASH_TYPE_CRC32) {
+		DR_LOG(DEBUG, "Matcher is not supported");
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.34.1

* [PATCH 3/3] net/mlx5/hws: print CQE error syndrome and more information
From: Gavin Li @ 2024-09-06 10:21 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Minggang Li (Gavin), Alex Vesker

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_send.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_send.c b/drivers/net/mlx5/hws/mlx5dr_send.c
index 3022c50260..c931896a79 100644
--- a/drivers/net/mlx5/hws/mlx5dr_send.c
+++ b/drivers/net/mlx5/hws/mlx5dr_send.c
@@ -598,8 +598,15 @@ static void mlx5dr_send_engine_poll_cq(struct mlx5dr_send_engine *queue,
 	    cqe_owner != sw_own)
 		return;
 
-	if (unlikely(cqe_opcode != MLX5_CQE_REQ))
+	if (unlikely(cqe_opcode != MLX5_CQE_REQ)) {
+		struct mlx5_err_cqe *err_cqe = (struct mlx5_err_cqe *)cqe;
+
+		DR_LOG(ERR, "CQE ERR:0x%x, Vender_ERR:0x%x, OP:0x%x, QPN:0x%x, WQE_CNT:0x%x",
+			err_cqe->syndrome, err_cqe->vendor_err_synd, cqe_opcode,
+			(rte_be_to_cpu_32(err_cqe->s_wqe_opcode_qpn) & 0xffffff),
+			rte_be_to_cpu_16(err_cqe->wqe_counter));
 		queue->err = true;
+	}
 
 	rte_io_rmb();
 
-- 
2.34.1

* [PATCH V1 0/3] Error report improvement and fix
From: Gavin Li @ 2024-09-10 7:58 UTC
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

This patch set is to improve error handling in pmd and under layer.

Gavin Li (3):
  net/mlx5: set rte errno if malloc failed
  ---
  changelog:
  v0->v1
  - Fix typo in commit message
  ---
  net/mlx5/hws: add log for failing to create rule in HWS
  net/mlx5/hws: print CQE error syndrome and more information

 drivers/net/mlx5/hws/mlx5dr_rule.c |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_send.c |  9 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c    | 31 +++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.34.1

* [PATCH V1 1/3] net/mlx5: set rte errno if malloc failed
From: Gavin Li @ 2024-09-10 7:58 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou, Alexander Kozyrev
  Cc: dev, rasland, Minggang Li (Gavin), stable

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

rte_errno should be set if anything wrong happened in under layer so
that user can figure out what's going on.

There were some cases that did not set it when ipool allocation failed.

To fix the issue, set rte_errno to ENOMEM if mlx5_ipool_malloc failed
to allocate ID.

Fixes: c40c061a02 ("net/mlx5: add basic flow queue operation")
Fixes: 48fbb0e93d ("net/mlx5: support flow meter mark indirect action with HWS")
cc: stable@dpdk.org

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 50888944a5..509de2a6a4 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -1897,7 +1897,7 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	const struct rte_flow_action_meter_mark *meter_mark = action->conf;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_flow_meter_info *fm;
-	uint32_t mtr_id;
+	uint32_t mtr_id = 0;
 	uintptr_t handle = (uintptr_t)MLX5_INDIRECT_ACTION_TYPE_METER_MARK <<
 					MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 
@@ -1909,8 +1909,15 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	if (meter_mark->profile == NULL)
 		return NULL;
 	aso_mtr = mlx5_ipool_malloc(pool->idx_pool, &mtr_id);
-	if (!aso_mtr)
+	if (!aso_mtr) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL,
+				   "failed to allocate aso meter entry");
+		if (mtr_id)
+			mlx5_ipool_free(pool->idx_pool, mtr_id);
 		return NULL;
+	}
 	/* Fill the flow meter parameters. */
 	aso_mtr->type = ASO_METER_INDIRECT;
 	fm = &aso_mtr->fm;
@@ -3918,8 +3925,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3930,8 +3939,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4062,8 +4073,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -4074,8 +4087,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4210,8 +4225,10 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->idx = of->idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		nf->res_idx = res_idx;
 	} else {
 		nf->res_idx = of->res_idx;
-- 
2.34.1

* [PATCH V1 2/3] net/mlx5/hws: add log for failing to create rule in HWS
From: Gavin Li @ 2024-09-10 7:58 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Minggang Li (Gavin), Alex Vesker

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_rule.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr_rule.c b/drivers/net/mlx5/hws/mlx5dr_rule.c
index 1edb7eac74..5d66d81ea5 100644
--- a/drivers/net/mlx5/hws/mlx5dr_rule.c
+++ b/drivers/net/mlx5/hws/mlx5dr_rule.c
@@ -638,6 +638,7 @@ static int mlx5dr_rule_destroy_hws(struct mlx5dr_rule *rule,
 
 	/* Rule is not completed yet */
 	if (rule->status == MLX5DR_RULE_STATUS_CREATING) {
+		DR_LOG(NOTICE, "Cannot destroy, rule creation still in progress");
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -806,12 +807,14 @@ static int mlx5dr_rule_enqueue_precheck(struct mlx5dr_rule *rule,
 	struct mlx5dr_context *ctx = rule->matcher->tbl->ctx;
 
 	if (unlikely(!attr->user_data)) {
+		DR_LOG(DEBUG, "User data must be provided for rule operations");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
 
 	/* Check if there is room in queue */
 	if (unlikely(mlx5dr_send_engine_full(&ctx->send_queue[attr->queue_id]))) {
+		DR_LOG(NOTICE, "No room in queue[%d]", attr->queue_id);
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -823,6 +826,7 @@ static int mlx5dr_rule_enqueue_precheck_move(struct mlx5dr_rule *rule,
 					     struct mlx5dr_rule_attr *attr)
 {
 	if (unlikely(rule->status != MLX5DR_RULE_STATUS_CREATED)) {
+		DR_LOG(DEBUG, "Cannot move, rule status is invalid");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
@@ -835,6 +839,7 @@ static int mlx5dr_rule_enqueue_precheck_create(struct mlx5dr_rule *rule,
 {
 	if (unlikely(mlx5dr_matcher_is_in_resize(rule->matcher))) {
 		/* Matcher in resize - new rules are not allowed */
+		DR_LOG(NOTICE, "Resizing in progress, cannot create rule");
 		rte_errno = EAGAIN;
 		return rte_errno;
 	}
@@ -1068,6 +1073,7 @@ int mlx5dr_rule_hash_calculate(struct mlx5dr_matcher *matcher,
 	    mlx5dr_table_is_root(matcher->tbl) ||
 	    matcher->tbl->ctx->caps->access_index_mode == MLX5DR_MATCHER_INSERT_BY_HASH ||
 	    matcher->tbl->ctx->caps->flow_table_hash_type != MLX5_FLOW_TABLE_HASH_TYPE_CRC32) {
+		DR_LOG(DEBUG, "Matcher is not supported");
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.34.1

* [PATCH V1 3/3] net/mlx5/hws: print CQE error syndrome and more information
From: Gavin Li @ 2024-09-10 7:58 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Minggang Li (Gavin), Alex Vesker

From: "Minggang Li (Gavin)" <gavinl@nvidia.com>

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_send.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_send.c b/drivers/net/mlx5/hws/mlx5dr_send.c
index 3022c50260..c931896a79 100644
--- a/drivers/net/mlx5/hws/mlx5dr_send.c
+++ b/drivers/net/mlx5/hws/mlx5dr_send.c
@@ -598,8 +598,15 @@ static void mlx5dr_send_engine_poll_cq(struct mlx5dr_send_engine *queue,
 	    cqe_owner != sw_own)
 		return;
 
-	if (unlikely(cqe_opcode != MLX5_CQE_REQ))
+	if (unlikely(cqe_opcode != MLX5_CQE_REQ)) {
+		struct mlx5_err_cqe *err_cqe = (struct mlx5_err_cqe *)cqe;
+
+		DR_LOG(ERR, "CQE ERR:0x%x, Vender_ERR:0x%x, OP:0x%x, QPN:0x%x, WQE_CNT:0x%x",
+			err_cqe->syndrome, err_cqe->vendor_err_synd, cqe_opcode,
+			(rte_be_to_cpu_32(err_cqe->s_wqe_opcode_qpn) & 0xffffff),
+			rte_be_to_cpu_16(err_cqe->wqe_counter));
 		queue->err = true;
+	}
 
 	rte_io_rmb();
 
-- 
2.34.1

* [PATCH V2 0/3] Error report improvement and fix
From: Minggang Li(Gavin) @ 2024-09-24 5:59 UTC
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

This patch set is to improve error handling in pmd and under layer.

Gavin Li (3):
  net/mlx5: set rte errno if malloc failed
  ---
  changelog:
  v0->v1
  - Fix typo in commit message
  v1->v2
  - Fix signoff warning
  ---
  net/mlx5/hws: add log for failing to create rule in HWS
  ---
  changelog:
  v1->v2
  - Fix signoff warning
  ---
  net/mlx5/hws: print CQE error syndrome and more information
  ---
  changelog:
  v1->v2
  - Fix typo in log message
  - Fix signoff warning
  ---

 drivers/net/mlx5/hws/mlx5dr_rule.c |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_send.c |  9 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c    | 31 +++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.34.1

* [PATCH V2 1/3] net/mlx5: set rte errno if malloc failed
From: Minggang Li(Gavin) @ 2024-09-24 5:59 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou, Alexander Kozyrev
  Cc: dev, rasland, stable

rte_errno should be set if anything wrong happened in under layer so
that user can figure out what's going on.

There were some cases that did not set it when ipool allocation failed.

To fix the issue, set rte_errno to ENOMEM if mlx5_ipool_malloc failed
to allocate ID.

Fixes: c40c061a02 ("net/mlx5: add basic flow queue operation")
Fixes: 48fbb0e93d ("net/mlx5: support flow meter mark indirect action with HWS")
cc: stable@dpdk.org

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index a275154d4b..f34670b3ec 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -1905,7 +1905,7 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	const struct rte_flow_action_meter_mark *meter_mark = action->conf;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_flow_meter_info *fm;
-	uint32_t mtr_id;
+	uint32_t mtr_id = 0;
 	uintptr_t handle = (uintptr_t)MLX5_INDIRECT_ACTION_TYPE_METER_MARK <<
 					MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 
@@ -1917,8 +1917,15 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	if (meter_mark->profile == NULL)
 		return NULL;
 	aso_mtr = mlx5_ipool_malloc(pool->idx_pool, &mtr_id);
-	if (!aso_mtr)
+	if (!aso_mtr) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL,
+				   "failed to allocate aso meter entry");
+		if (mtr_id)
+			mlx5_ipool_free(pool->idx_pool, mtr_id);
 		return NULL;
+	}
 	/* Fill the flow meter parameters. */
 	aso_mtr->type = ASO_METER_INDIRECT;
 	fm = &aso_mtr->fm;
@@ -3926,8 +3933,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3938,8 +3947,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4070,8 +4081,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -4082,8 +4095,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4218,8 +4233,10 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->idx = of->idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		nf->res_idx = res_idx;
 	} else {
 		nf->res_idx = of->res_idx;
-- 
2.34.1

* [PATCH V2 2/3] net/mlx5/hws: add log for failing to create rule in HWS
From: Minggang Li(Gavin) @ 2024-09-24 5:59 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Alex Vesker

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_rule.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr_rule.c b/drivers/net/mlx5/hws/mlx5dr_rule.c
index 1edb7eac74..5d66d81ea5 100644
--- a/drivers/net/mlx5/hws/mlx5dr_rule.c
+++ b/drivers/net/mlx5/hws/mlx5dr_rule.c
@@ -638,6 +638,7 @@ static int mlx5dr_rule_destroy_hws(struct mlx5dr_rule *rule,
 
 	/* Rule is not completed yet */
 	if (rule->status == MLX5DR_RULE_STATUS_CREATING) {
+		DR_LOG(NOTICE, "Cannot destroy, rule creation still in progress");
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -806,12 +807,14 @@ static int mlx5dr_rule_enqueue_precheck(struct mlx5dr_rule *rule,
 	struct mlx5dr_context *ctx = rule->matcher->tbl->ctx;
 
 	if (unlikely(!attr->user_data)) {
+		DR_LOG(DEBUG, "User data must be provided for rule operations");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
 
 	/* Check if there is room in queue */
 	if (unlikely(mlx5dr_send_engine_full(&ctx->send_queue[attr->queue_id]))) {
+		DR_LOG(NOTICE, "No room in queue[%d]", attr->queue_id);
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -823,6 +826,7 @@ static int mlx5dr_rule_enqueue_precheck_move(struct mlx5dr_rule *rule,
 					     struct mlx5dr_rule_attr *attr)
 {
 	if (unlikely(rule->status != MLX5DR_RULE_STATUS_CREATED)) {
+		DR_LOG(DEBUG, "Cannot move, rule status is invalid");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
@@ -835,6 +839,7 @@ static int mlx5dr_rule_enqueue_precheck_create(struct mlx5dr_rule *rule,
 {
 	if (unlikely(mlx5dr_matcher_is_in_resize(rule->matcher))) {
 		/* Matcher in resize - new rules are not allowed */
+		DR_LOG(NOTICE, "Resizing in progress, cannot create rule");
 		rte_errno = EAGAIN;
 		return rte_errno;
 	}
@@ -1068,6 +1073,7 @@ int mlx5dr_rule_hash_calculate(struct mlx5dr_matcher *matcher,
 	    mlx5dr_table_is_root(matcher->tbl) ||
 	    matcher->tbl->ctx->caps->access_index_mode == MLX5DR_MATCHER_INSERT_BY_HASH ||
 	    matcher->tbl->ctx->caps->flow_table_hash_type != MLX5_FLOW_TABLE_HASH_TYPE_CRC32) {
+		DR_LOG(DEBUG, "Matcher is not supported");
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.34.1

* [PATCH V2 3/3] net/mlx5/hws: print CQE error syndrome and more information
From: Minggang Li(Gavin) @ 2024-09-24 5:59 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao,
	Suanming Mou
  Cc: dev, rasland, Alex Vesker

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_send.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_send.c b/drivers/net/mlx5/hws/mlx5dr_send.c
index 3022c50260..e9abf3dddb 100644
--- a/drivers/net/mlx5/hws/mlx5dr_send.c
+++ b/drivers/net/mlx5/hws/mlx5dr_send.c
@@ -598,8 +598,15 @@ static void mlx5dr_send_engine_poll_cq(struct mlx5dr_send_engine *queue,
 	    cqe_owner != sw_own)
 		return;
 
-	if (unlikely(cqe_opcode != MLX5_CQE_REQ))
+	if (unlikely(cqe_opcode != MLX5_CQE_REQ)) {
+		struct mlx5_err_cqe *err_cqe = (struct mlx5_err_cqe *)cqe;
+
+		DR_LOG(ERR, "CQE ERR:0x%x, Vendor_ERR:0x%x, OP:0x%x, QPN:0x%x, WQE_CNT:0x%x",
+			err_cqe->syndrome, err_cqe->vendor_err_synd, cqe_opcode,
+			(rte_be_to_cpu_32(err_cqe->s_wqe_opcode_qpn) & 0xffffff),
+			rte_be_to_cpu_16(err_cqe->wqe_counter));
 		queue->err = true;
+	}
 
 	rte_io_rmb();
 
-- 
2.34.1

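Editorial note on the 3/3 log line: the QPN and WQE counter are converted from big-endian before printing, and the QPN is taken from the low 24 bits of s_wqe_opcode_qpn (the 0xffffff mask in the patch). The sketch below illustrates only that decoding step. It is not the driver code: struct toy_err_cqe, load_be32(), load_be16() and the toy_cqe_*() helpers are hypothetical names, the struct layout is simplified and does not match the real struct mlx5_err_cqe, and plain byte loads stand in for rte_be_to_cpu_32() and rte_be_to_cpu_16().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified error CQE; NOT the real struct mlx5_err_cqe layout. */
struct toy_err_cqe {
	uint8_t vendor_err_synd;
	uint8_t syndrome;
	uint8_t s_wqe_opcode_qpn[4];	/* big-endian; QPN in the low 24 bits */
	uint8_t wqe_counter[2];		/* big-endian */
};

/* Read a big-endian 32-bit value byte by byte (host-endianness independent). */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 16-bit value byte by byte. */
static uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)(((uint32_t)p[0] << 8) | (uint32_t)p[1]);
}

/* Keep only the low 24 bits, mirroring the 0xffffff mask in the patch. */
static uint32_t toy_cqe_qpn(const struct toy_err_cqe *cqe)
{
	assert(cqe != NULL);
	return load_be32(cqe->s_wqe_opcode_qpn) & 0xffffff;
}

/* Convert the 16-bit WQE counter to host order. */
static uint16_t toy_cqe_wqe_counter(const struct toy_err_cqe *cqe)
{
	assert(cqe != NULL);
	return load_be16(cqe->wqe_counter);
}
```

Without the mask, whatever the device places in the top byte of that field would be printed as part of the QPN.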
* [PATCH V3 0/3] Error report improvement and fix
From: Minggang Li(Gavin) @ 2024-09-24 10:52 UTC
  To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland

This patch set is to improve error handling in pmd and under layer.

Gavin Li (3):
  net/mlx5: set rte errno if malloc failed
  ---
  changelog:
  v0->v1
  - Fix typo in commit message
  v1->v2
  - Fix checkpatch warning
  ---
  net/mlx5/hws: add log for failing to create rule in HWS
  ---
  changelog:
  v1->v2
  - Fix checkpatch warning
  v2->v3
  - Fix checkpatch warning
  ---
  net/mlx5/hws: print CQE error syndrome and more information
  ---
  changelog:
  v1->v2
  - Fix checkpatch warning
  v2->v3
  - Fix checkpatch warning
  ---

 drivers/net/mlx5/hws/mlx5dr_rule.c |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_send.c |  9 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c    | 31 +++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.34.1

* [PATCH V3 1/3] net/mlx5: set rte errno if malloc failed
  2024-09-24 10:52 ` [PATCH V3 0/3] Error report improvement and fix Minggang Li(Gavin)
@ 2024-09-24 10:52   ` Minggang Li(Gavin)
  2024-09-24 10:52   ` [PATCH V3 2/3] net/mlx5/hws: add log for failing to create rule in HWS Minggang Li(Gavin)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Minggang Li(Gavin) @ 2024-09-24 10:52 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao, Suanming Mou, Alexander Kozyrev
  Cc: dev, rasland, stable

rte_errno should be set whenever something goes wrong in the underlying
layer so that the user can find out what happened. Some paths did not set
it when ipool allocation failed.

To fix this, set rte_errno to ENOMEM when mlx5_ipool_malloc fails to
allocate an ID.

Fixes: c40c061a022e ("net/mlx5: add basic flow queue operation")
Fixes: 48fbb0e93d06 ("net/mlx5: support flow meter mark indirect action with HWS")
cc: stable@dpdk.org

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/mlx5_flow_hw.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index a275154d4b..f34670b3ec 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -1905,7 +1905,7 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	const struct rte_flow_action_meter_mark *meter_mark = action->conf;
 	struct mlx5_aso_mtr *aso_mtr;
 	struct mlx5_flow_meter_info *fm;
-	uint32_t mtr_id;
+	uint32_t mtr_id = 0;
 	uintptr_t handle = (uintptr_t)MLX5_INDIRECT_ACTION_TYPE_METER_MARK <<
 					MLX5_INDIRECT_ACTION_TYPE_OFFSET;
 
@@ -1917,8 +1917,15 @@ flow_hw_meter_mark_alloc(struct rte_eth_dev *dev, uint32_t queue,
 	if (meter_mark->profile == NULL)
 		return NULL;
 	aso_mtr = mlx5_ipool_malloc(pool->idx_pool, &mtr_id);
-	if (!aso_mtr)
+	if (!aso_mtr) {
+		rte_flow_error_set(error, ENOMEM,
+				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+				   NULL,
+				   "failed to allocate aso meter entry");
+		if (mtr_id)
+			mlx5_ipool_free(pool->idx_pool, mtr_id);
 		return NULL;
+	}
 	/* Fill the flow meter parameters. */
 	aso_mtr->type = ASO_METER_INDIRECT;
 	fm = &aso_mtr->fm;
@@ -3926,8 +3933,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -3938,8 +3947,10 @@ flow_hw_async_flow_create(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4070,8 +4081,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 		return NULL;
 	}
 	flow = mlx5_ipool_malloc(table->flow, &flow_idx);
-	if (!flow)
+	if (!flow) {
+		rte_errno = ENOMEM;
 		goto error;
+	}
 	rule_acts = flow_hw_get_dr_action_buffer(priv, table, action_template_index, queue);
 	/*
 	 * Set the table here in order to know the destination table
@@ -4082,8 +4095,10 @@ flow_hw_async_flow_create_by_index(struct rte_eth_dev *dev,
 	flow->idx = flow_idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		flow->res_idx = res_idx;
 	} else {
 		flow->res_idx = flow_idx;
@@ -4218,8 +4233,10 @@ flow_hw_async_flow_update(struct rte_eth_dev *dev,
 	nf->idx = of->idx;
 	if (table->resource) {
 		mlx5_ipool_malloc(table->resource, &res_idx);
-		if (!res_idx)
+		if (!res_idx) {
+			rte_errno = ENOMEM;
 			goto error;
+		}
 		nf->res_idx = res_idx;
 	} else {
 		nf->res_idx = of->res_idx;
-- 
2.34.1
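The pattern applied by this patch can be illustrated outside the driver. What follows is a minimal, self-contained sketch and not the mlx5 code: toy_pool, toy_pool_malloc and toy_flow_create are hypothetical stand-ins, and plain errno stands in for rte_errno. It only shows a caller recording ENOMEM before taking its error path when the pool hands back no entry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size index pool; NOT the mlx5 ipool. */
struct toy_pool {
	uint32_t next_idx;	/* next 1-based index to hand out */
	uint32_t capacity;	/* usable entries, at most 4 */
	int entries[4];		/* backing storage */
};

/*
 * Return an entry and store its 1-based index in *idx.
 * When the pool is exhausted, return NULL and store 0 in *idx.
 */
static int *
toy_pool_malloc(struct toy_pool *pool, uint32_t *idx)
{
	assert(pool != NULL && idx != NULL);
	if (pool->next_idx > pool->capacity) {
		*idx = 0;
		return NULL;
	}
	*idx = pool->next_idx++;
	return &pool->entries[*idx - 1];
}

/* Caller-side pattern: record the failure reason before bailing out. */
static int *
toy_flow_create(struct toy_pool *pool, uint32_t *idx)
{
	int *flow = toy_pool_malloc(pool, idx);

	if (flow == NULL) {
		errno = ENOMEM;	/* the patch assigns rte_errno at this point */
		return NULL;
	}
	return flow;
}
```

With a pool whose next_idx is 1 and capacity is 1, the first toy_flow_create() call succeeds and leaves errno untouched; the second returns NULL with errno set to ENOMEM, which is the information the caller was missing before the fix.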
* [PATCH V3 2/3] net/mlx5/hws: add log for failing to create rule in HWS
  2024-09-24 10:52 ` [PATCH V3 0/3] Error report improvement and fix Minggang Li(Gavin)
  2024-09-24 10:52   ` [PATCH V3 1/3] net/mlx5: set rte errno if malloc failed Minggang Li(Gavin)
@ 2024-09-24 10:52   ` Minggang Li(Gavin)
  2024-09-24 10:52   ` [PATCH V3 3/3] net/mlx5/hws: print CQE error syndrome and more information Minggang Li(Gavin)
  2024-09-25  8:23   ` [PATCH V3 0/3] Error report improvement and fix Raslan Darawsheh
  3 siblings, 0 replies; 17+ messages in thread
From: Minggang Li(Gavin) @ 2024-09-24 10:52 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao, Suanming Mou
  Cc: dev, rasland, Alex Vesker

Add log messages giving the reason why a flow rule failed to be created.

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_rule.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/mlx5/hws/mlx5dr_rule.c b/drivers/net/mlx5/hws/mlx5dr_rule.c
index 1edb7eac74..5d66d81ea5 100644
--- a/drivers/net/mlx5/hws/mlx5dr_rule.c
+++ b/drivers/net/mlx5/hws/mlx5dr_rule.c
@@ -638,6 +638,7 @@ static int mlx5dr_rule_destroy_hws(struct mlx5dr_rule *rule,
 
 	/* Rule is not completed yet */
 	if (rule->status == MLX5DR_RULE_STATUS_CREATING) {
+		DR_LOG(NOTICE, "Cannot destroy, rule creation still in progress");
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -806,12 +807,14 @@ static int mlx5dr_rule_enqueue_precheck(struct mlx5dr_rule *rule,
 	struct mlx5dr_context *ctx = rule->matcher->tbl->ctx;
 
 	if (unlikely(!attr->user_data)) {
+		DR_LOG(DEBUG, "User data must be provided for rule operations");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
 
 	/* Check if there is room in queue */
 	if (unlikely(mlx5dr_send_engine_full(&ctx->send_queue[attr->queue_id]))) {
+		DR_LOG(NOTICE, "No room in queue[%d]", attr->queue_id);
 		rte_errno = EBUSY;
 		return rte_errno;
 	}
@@ -823,6 +826,7 @@ static int mlx5dr_rule_enqueue_precheck_move(struct mlx5dr_rule *rule,
 					     struct mlx5dr_rule_attr *attr)
 {
 	if (unlikely(rule->status != MLX5DR_RULE_STATUS_CREATED)) {
+		DR_LOG(DEBUG, "Cannot move, rule status is invalid");
 		rte_errno = EINVAL;
 		return rte_errno;
 	}
@@ -835,6 +839,7 @@ static int mlx5dr_rule_enqueue_precheck_create(struct mlx5dr_rule *rule,
 {
 	if (unlikely(mlx5dr_matcher_is_in_resize(rule->matcher))) {
 		/* Matcher in resize - new rules are not allowed */
+		DR_LOG(NOTICE, "Resizing in progress, cannot create rule");
 		rte_errno = EAGAIN;
 		return rte_errno;
 	}
@@ -1068,6 +1073,7 @@ int mlx5dr_rule_hash_calculate(struct mlx5dr_matcher *matcher,
 	    mlx5dr_table_is_root(matcher->tbl) ||
 	    matcher->tbl->ctx->caps->access_index_mode == MLX5DR_MATCHER_INSERT_BY_HASH ||
 	    matcher->tbl->ctx->caps->flow_table_hash_type != MLX5_FLOW_TABLE_HASH_TYPE_CRC32) {
+		DR_LOG(DEBUG, "Matcher is not supported");
 		rte_errno = ENOTSUP;
 		return -rte_errno;
 	}
-- 
2.34.1
* [PATCH V3 3/3] net/mlx5/hws: print CQE error syndrome and more information
  2024-09-24 10:52 ` [PATCH V3 0/3] Error report improvement and fix Minggang Li(Gavin)
  2024-09-24 10:52   ` [PATCH V3 1/3] net/mlx5: set rte errno if malloc failed Minggang Li(Gavin)
  2024-09-24 10:52   ` [PATCH V3 2/3] net/mlx5/hws: add log for failing to create rule in HWS Minggang Li(Gavin)
@ 2024-09-24 10:52   ` Minggang Li(Gavin)
  2024-09-25  8:23   ` [PATCH V3 0/3] Error report improvement and fix Raslan Darawsheh
  3 siblings, 0 replies; 17+ messages in thread
From: Minggang Li(Gavin) @ 2024-09-24 10:52 UTC
  To: matan, viacheslavo, orika, thomas, Dariusz Sosnowski, Bing Zhao, Suanming Mou
  Cc: dev, rasland, Alex Vesker

Print the CQE error syndrome and additional information when a queue
error occurs.

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
---
 drivers/net/mlx5/hws/mlx5dr_send.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_send.c b/drivers/net/mlx5/hws/mlx5dr_send.c
index 3022c50260..e9abf3dddb 100644
--- a/drivers/net/mlx5/hws/mlx5dr_send.c
+++ b/drivers/net/mlx5/hws/mlx5dr_send.c
@@ -598,8 +598,15 @@ static void mlx5dr_send_engine_poll_cq(struct mlx5dr_send_engine *queue,
 	    cqe_owner != sw_own)
 		return;
 
-	if (unlikely(cqe_opcode != MLX5_CQE_REQ))
+	if (unlikely(cqe_opcode != MLX5_CQE_REQ)) {
+		struct mlx5_err_cqe *err_cqe = (struct mlx5_err_cqe *)cqe;
+
+		DR_LOG(ERR, "CQE ERR:0x%x, Vendor_ERR:0x%x, OP:0x%x, QPN:0x%x, WQE_CNT:0x%x",
+			err_cqe->syndrome, err_cqe->vendor_err_synd, cqe_opcode,
+			(rte_be_to_cpu_32(err_cqe->s_wqe_opcode_qpn) & 0xffffff),
+			rte_be_to_cpu_16(err_cqe->wqe_counter));
 		queue->err = true;
+	}
 
 	rte_io_rmb();
 
-- 
2.34.1
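The field extraction done in the new log statement can be shown in isolation. The sketch below is an illustration under stated assumptions, not the driver code: toy_err_cqe is a hypothetical structure whose member names mirror those used in the patch but whose layout is invented for the example, and the load_be32/load_be16 helpers stand in for rte_be_to_cpu_32/rte_be_to_cpu_16. It assumes, as the 0xffffff mask in the patch implies, that the QPN occupies the low 24 bits of the 32-bit big-endian word; treating the remaining high byte as an opcode is a guess based on the field name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error CQE subset; layout is illustrative only. */
struct toy_err_cqe {
	uint8_t syndrome;
	uint8_t vendor_err_synd;
	uint8_t s_wqe_opcode_qpn[4];	/* big-endian; low 24 bits hold the QPN */
	uint8_t wqe_counter[2];		/* big-endian */
};

/* Assemble a 32-bit value from big-endian bytes, independent of host order. */
static uint32_t
load_be32(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Assemble a 16-bit value from big-endian bytes. */
static uint16_t
load_be16(const uint8_t b[2])
{
	return (uint16_t)(((uint32_t)b[0] << 8) | (uint32_t)b[1]);
}

/* Keep only the low 24 bits, the same mask the patch applies. */
static uint32_t
toy_cqe_qpn(const struct toy_err_cqe *cqe)
{
	assert(cqe != NULL);
	return load_be32(cqe->s_wqe_opcode_qpn) & 0xffffff;
}

/* Convert the WQE counter to host order. */
static uint16_t
toy_cqe_wqe_counter(const struct toy_err_cqe *cqe)
{
	assert(cqe != NULL);
	return load_be16(cqe->wqe_counter);
}
```

For a CQE whose s_wqe_opcode_qpn bytes are 0a 12 34 56 and whose wqe_counter bytes are 01 02, toy_cqe_qpn() yields 0x123456 and toy_cqe_wqe_counter() yields 0x0102. Working on byte arrays keeps the sketch independent of host endianness, whereas the driver converts a typed big-endian field.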
* Re: [PATCH V3 0/3] Error report improvement and fix
  2024-09-24 10:52 ` [PATCH V3 0/3] Error report improvement and fix Minggang Li(Gavin)
                     ` (2 preceding siblings ...)
  2024-09-24 10:52   ` [PATCH V3 3/3] net/mlx5/hws: print CQE error syndrome and more information Minggang Li(Gavin)
@ 2024-09-25  8:23   ` Raslan Darawsheh
  3 siblings, 0 replies; 17+ messages in thread
From: Raslan Darawsheh @ 2024-09-25 8:23 UTC
  To: Minggang(Gavin) Li, Matan Azrad, Slava Ovsiienko, Ori Kam, NBU-Contact-Thomas Monjalon (EXTERNAL)
  Cc: dev

Hi,

From: Minggang(Gavin) Li <gavinl@nvidia.com>
Sent: Tuesday, September 24, 2024 1:52 PM
To: Matan Azrad; Slava Ovsiienko; Ori Kam; NBU-Contact-Thomas Monjalon (EXTERNAL)
Cc: dev@dpdk.org; Raslan Darawsheh
Subject: [PATCH V3 0/3] Error report improvement and fix

This patch set improves error handling in the PMD and the underlying layer.

Gavin Li (3):
  net/mlx5: set rte errno if malloc failed
  ---
  changelog:
  v0->v1
  - Fix typo in commit message
  v1->v2
  - Fix checkpatch warning
  ---
  net/mlx5/hws: add log for failing to create rule in HWS
  ---
  changelog:
  v1->v2
  - Fix checkpatch warning
  v2->v3
  - Fix checkpatch warning
  ---
  net/mlx5/hws: print CQE error syndrome and more information
  ---
  changelog:
  v1->v2
  - Fix checkpatch warning
  v2->v3
  - Fix checkpatch warning
  ---

 drivers/net/mlx5/hws/mlx5dr_rule.c |  6 ++++++
 drivers/net/mlx5/hws/mlx5dr_send.c |  9 ++++++++-
 drivers/net/mlx5/mlx5_flow_hw.c    | 31 +++++++++++++++++++++++-------
 3 files changed, 38 insertions(+), 8 deletions(-)

-- 
2.34.1

Series applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
end of thread, other threads:[~2024-09-25 8:24 UTC | newest]

Thread overview: 17+ messages
2024-09-06 10:21 [PATCH 0/3] Error report improvement and fix Gavin Li
2024-09-06 10:21 ` [PATCH 1/3] net/mlx5: set rte errno if malloc failed Gavin Li
2024-09-06 10:21 ` [PATCH 2/3] net/mlx5/hws: add log for failing to create rule in HWS Gavin Li
2024-09-06 10:21 ` [PATCH 3/3] net/mlx5/hws: print CQE error syndrome and more information Gavin Li
2024-09-10  7:58 ` [PATCH V1 0/3] Error report improvement and fix Gavin Li
2024-09-10  7:58 ` [PATCH V1 1/3] net/mlx5: set rte errno if malloc failed Gavin Li
2024-09-10  7:58 ` [PATCH V1 2/3] net/mlx5/hws: add log for failing to create rule in HWS Gavin Li
2024-09-10  7:58 ` [PATCH V1 3/3] net/mlx5/hws: print CQE error syndrome and more information Gavin Li
2024-09-24  5:59 ` [PATCH V2 0/3] Error report improvement and fix Minggang Li(Gavin)
2024-09-24  5:59 ` [PATCH V2 1/3] net/mlx5: set rte errno if malloc failed Minggang Li(Gavin)
2024-09-24  5:59 ` [PATCH V2 2/3] net/mlx5/hws: add log for failing to create rule in HWS Minggang Li(Gavin)
2024-09-24  5:59 ` [PATCH V2 3/3] net/mlx5/hws: print CQE error syndrome and more information Minggang Li(Gavin)
2024-09-24 10:52 ` [PATCH V3 0/3] Error report improvement and fix Minggang Li(Gavin)
2024-09-24 10:52 ` [PATCH V3 1/3] net/mlx5: set rte errno if malloc failed Minggang Li(Gavin)
2024-09-24 10:52 ` [PATCH V3 2/3] net/mlx5/hws: add log for failing to create rule in HWS Minggang Li(Gavin)
2024-09-24 10:52 ` [PATCH V3 3/3] net/mlx5/hws: print CQE error syndrome and more information Minggang Li(Gavin)
2024-09-25  8:23 ` [PATCH V3 0/3] Error report improvement and fix Raslan Darawsheh