patches for DPDK stable branches
* [PATCH] common/mlx5: fix QP ack timeout configuration
From: Yajun Wu @ 2022-02-14  6:03 UTC
  To: orika, viacheslavo, matan; +Cc: dev, thomas, rasland, roniba, stable

The vDPA driver creates two QPs (a queue pair consists of one send
queue and one receive queue) per virtio queue to deliver traffic
events from the NIC to SW.
The two QPs (called FW QP and SW QP) are created as loopback QPs,
and the FW QP's SQ is connected to the SW QP's RQ internally.

When a packet is received or sent, HW sends a WQE through the FW QP's
SQ, and SW then gets a CQE from the CQ of the SW QP.

At large scale and under heavy traffic, the SQ's request may fail
to get an ACK from the RQ HW because the HW is busy.
The SQ retries the request up to qpc.retry_count times, each time
waiting 4.096 us * 2^(ack_timeout) for the response. If it still gets
no response from the RQ HW, the SQ goes to an error state.

16 is an empirically chosen value. It should be neither too high nor
too low. Too high makes the QP wait too long when a packet is really
dropped. Too low makes the QP go to an error state (retry-exceeded)
easily.
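
To put numbers on the formula above, here is a minimal sketch that
evaluates 4.096 us * 2^(ack_timeout) for a given setting. The helper
names are hypothetical and not part of DPDK. Counting exactly
retry_count waits follows the wording of this message; whether the
initial attempt adds one more wait is an assumption left open here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): per-attempt ACK wait in
 * microseconds, computed as 4.096 us * 2^ack_timeout.
 * ack_timeout is assumed to be below 64 so the shift is well defined.
 */
static double
ack_wait_us(unsigned int ack_timeout)
{
	return 4.096 * (double)(UINT64_C(1) << ack_timeout);
}

/*
 * Hypothetical helper: total time spent waiting across retry_count
 * retries before the SQ gives up. Assumes every retry waits for the
 * full timeout.
 */
static double
total_retry_wait_us(unsigned int ack_timeout, unsigned int retry_count)
{
	return ack_wait_us(ack_timeout) * (double)retry_count;
}
```

With ack_timeout = 14 each attempt waits about 67.1 ms, and with
ack_timeout = 16 about 268.4 ms. With retry_count = 7, the total wait
before the error state grows from roughly 0.47 s to roughly 1.88 s.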

Fixes: 15c3807e86a ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 2e807a0829..7732613c69 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -2279,7 +2279,7 @@ mlx5_devx_cmd_modify_qp_state(struct mlx5_devx_obj *qp, uint32_t qp_st_mod_op,
 	case MLX5_CMD_OP_RTR2RTS_QP:
 		qpc = MLX5_ADDR_OF(rtr2rts_qp_in, &in, qpc);
 		MLX5_SET(rtr2rts_qp_in, &in, qpn, qp->id);
-		MLX5_SET(qpc, qpc, primary_address_path.ack_timeout, 14);
+		MLX5_SET(qpc, qpc, primary_address_path.ack_timeout, 16);
 		MLX5_SET(qpc, qpc, log_ack_req_freq, 0);
 		MLX5_SET(qpc, qpc, retry_count, 7);
 		MLX5_SET(qpc, qpc, rnr_retry, 7);
-- 
2.27.0


* RE: [PATCH] common/mlx5: fix QP ack timeout configuration
From: Raslan Darawsheh @ 2022-02-22 14:44 UTC
  To: Yajun Wu, Ori Kam, Slava Ovsiienko, Matan Azrad
  Cc: dev, NBU-Contact-Thomas Monjalon (EXTERNAL), Roni Bar Yanai, stable

Hi,

Patch applied to next-net-mlx.

Kindest regards,
Raslan Darawsheh
