From: Samar Yadav
Date: Fri, 5 Apr 2024 22:31:35 +0530
Subject: DPDK Secondary process not able to xmit packets with MLX5 VF
To: dev@dpdk.org, users@dpdk.org
Cc: matan@nvidia.com, viacheslavo@nvidia.com, orika@nvidia.com, suanmingm@nvidia.com, Mukul Sinha, Tathagat Priyadarshi, Srinivasa Srikanth Srikanth Podila, Vipin PR

Hi all,

We are using two Mellanox VFs with DPDK v22.11 and are seeing an issue when the DPDK rte_proc_secondary process tries to transmit packets. The DPDK rte_proc_primary process transmits packets successfully. The issue seems to be in check_cqe, which always returns MLX5_CQE_STATUS_HW_OWN.

admin@10-50-54-244:~$ lspci | grep "Mellanox"
00:07.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
00:08.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

In our application:
proc0 -> the DPDK rte_proc_primary process, which initializes the necessary shared memory data structures.
proc1 -> the DPDK rte_proc_secondary process, which attaches to the pre-initialized shared memory.

proc0 (rte_proc_primary) uses port0 (00:07.0) to transmit packets; this works as expected.
proc1 (rte_proc_secondary) uses port1 (00:08.0) to transmit packets; this does not work, and the packets are never seen on the wire.

Code snippets for the gdb output below:

mlx5_tx.c

180  */
181 void
182 mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq,
183               unsigned int olx __rte_unused)
184 {
185     unsigned int count = MLX5_TX_COMP_MAX_CQE;
186     volatile struct mlx5_cqe *last_cqe = NULL;
187     bool ring_doorbell = false;
188     int ret;
189
190     do {
191         volatile struct mlx5_cqe *cqe;
192
193         cqe = &txq->cqes[txq->cq_ci & txq->cqe_m];
194         ret = check_cqe(cqe, txq->cqe_s, txq->cq_ci);
195         if (unlikely(ret != MLX5_CQE_STATUS_SW_OWN)) {
196             if (likely(ret != MLX5_CQE_STATUS_ERR)) {
197                 /* No new CQEs in completion queue. */
198                 MLX5_ASSERT(ret == MLX5_CQE_STATUS_HW_OWN);
199                 break;
200             }

mlx5_common.h

195 static __rte_always_inline enum mlx5_cqe_status
196 check_cqe(volatile struct mlx5_cqe *cqe, const uint16_t cqes_n,
197       const uint16_t ci)
198 {
199     const uint16_t idx = ci & cqes_n;
200     const uint8_t op_own = cqe->op_own;
201     const uint8_t op_owner = MLX5_CQE_OWNER(op_own);
202     const uint8_t op_code = MLX5_CQE_OPCODE(op_own);
203
204     if (unlikely((op_owner != (!!(idx))) || (op_code == MLX5_CQE_INVALID)))
205         return MLX5_CQE_STATUS_HW_OWN;
206     rte_io_rmb();
207     if (unlikely(op_code == MLX5_CQE_RESP_ERR ||
208              op_code == MLX5_CQE_REQ_ERR))
209         return MLX5_CQE_STATUS_ERR;
210     return MLX5_CQE_STATUS_SW_OWN;
211 }

proc1 (non-working process): we have noticed that cq_ci remains 0 and never increases.

Thread 1 "se_dp" hit Breakpoint 1, mlx5_tx_handle_completion (txq=0x6000496c72c0, olx=127)
    at ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c:184
184 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
185 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
186 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
187 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
193 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
194 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
195 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) info locals
cqe = 0x60004962b000
count = 2
last_cqe = 0x0
ring_doorbell = false
ret = -2
(gdb) p *txq
$1 = {elts_head = 35, elts_tail = 0, elts_comp = 32, elts_s = 1024, elts_m = 1023, wqe_ci = 35,
  wqe_pi = 0, wqe_s = 4096, wqe_m = 4095, wqe_comp = 32, wqe_thres = 512, cq_ci = 0, cq_pi = 1,
  cqe_s = 64, cqe_m = 63, elts_n = 10, cqe_n = 6, wqe_n = 12, tso_en = 1, tunnel_en = 0, swp_en = 0,
  vlan_en = 0, db_nc = 0, db_heu = 0, rt_timestamp = 0, wait_on_time = 0, fast_free = 0,
  inlen_send = 18, inlen_empw = 0, inlen_mode = 18, qp_num_8s = 340992, offloads = 32815, mr_ctrl = {
    dev_gen_ptr = 0x60004c2d62b4, cur_gen = 0, mru = 0, head = 0, cache = {{start = 0, end = 0,
        lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0,
        end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {
        start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}}, cache_bh = {len = 1,
      size = 256, table = 0x6000496c5d40}}, wqes = 0x60004c255000, wqes_end = 0x60004c295000,
  fcqs = 0x60004c295dc0, cqes = 0x60004962b000, qp_db = 0x60004c295004, cq_db = 0x60004962c000,
  port_id = 1, idx = 0, rt_timemask = 0, ts_mask = 0, ts_offset = -1, sh = 0x60004b865880, stats = {
    opackets = 35, obytes = 2228, oerrors = 0}, stats_reset = {opackets = 0, obytes = 0, oerrors = 0},
  uar_data = {db = 0x0}, elts = 0x6000496c7448}

Stepping into check_cqe shows that it always returns MLX5_CQE_STATUS_HW_OWN:

(gdb)
194 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) s
check_cqe (ci=0, cqes_n=64, cqe=0x60004962b000) at ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h:199
199 ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h: No such file or directory.
(gdb) n
200 in ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h
(gdb)
201 in ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h
(gdb)
202 in ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h
(gdb)
204 in ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h
(gdb) n
205 in ../../../../../../service_engine/dpdk-2211/drivers/common/mlx5/mlx5_common.h
(gdb) info locals
idx = 0
op_own = 241 '\361'
op_owner = 1 '\001'
op_code = 15 '\017'

Because check_cqe returns MLX5_CQE_STATUS_HW_OWN, we break at line 199 in mlx5_tx_handle_completion and ring_doorbell remains false forever.

Below are the logs from mlx5_txq_devx_obj_new, which is called by proc0 (rte_proc_primary) for port 1:

ppriv: 0x60004b8316c0 ,ppriv->uar_table: 0x60004b8316c8, txq_ctrl->uar_mmap_offset:0, ppriv->uar_table[txq_data->idx]:0x7f6b2d211800, txq_data->idx: 0, txq_data->db_nc:0

And the logs from txq_uar_init_secondary, which is called by proc1 (rte_proc_secondary) for port 1:

priv: 0x60004b8352c0, priv->sh: 0x60004b865880, priv->sh->pppriv: 0x60004b8316c0
txq_ctrl:0x6000496c71c0 priv:0x60004b8352c0
primary_ppriv->uar_table: 0x60004b8316c8 ,uar_va:7f6b2d211800 offset:800 addr:0x7f6b3fe47800
ppriv:0x60004962a180 ppriv->uar_table[txq->idx]:0x7f6b3fe47800, txq->idx:0
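
As a sanity check on the addresses in the secondary log above, here is a minimal sketch of the arithmetic we believe relates uar_va, offset and addr. It assumes a 4 KiB page size, and the helper names page_offset and rebase are ours for illustration only; they are not driver functions. It only checks that the logged numbers are self-consistent and says nothing about whether the remapped page is the right UAR.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. The real driver queries the page size at run time. */
#define ASSUMED_PAGE_SIZE UINT64_C(4096)

/* Offset of an address inside its (assumed 4 KiB) page. */
static uint64_t page_offset(uint64_t va)
{
	return va & (ASSUMED_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: carry the in-page offset of a primary-process
 * address over to a page that the secondary process mapped elsewhere.
 */
static uint64_t rebase(uint64_t primary_va, uint64_t secondary_page_base)
{
	return secondary_page_base + page_offset(primary_va);
}
```

With the logged values, page_offset(0x7f6b2d211800) is 0x800, and rebasing onto the page at 0x7f6b3fe47000 gives 0x7f6b3fe47800, which matches the offset and addr fields printed by our logging.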

proc0 (rte_proc_primary, working case): cq_ci, cq_pi and the other counters increment as expected.

Thread 1 "se_dp" hit Breakpoint 1, mlx5_tx_handle_completion (txq=0x60004b898940, olx=127) at ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c:184
184 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) n
185 in ../../../../../../service_engine/dpdk-2211/drivers/net/mlx5/mlx5_tx.c
(gdb) p *txq
$2 = {elts_head = 960, elts_tail = 931, elts_comp = 931, elts_s = 1024, elts_m = 1023, wqe_ci = 960, wqe_pi = 930, wqe_s = 4096, wqe_m = 4095, wqe_comp = 931, wqe_thres = 512, cq_ci = 28, cq_pi = 28, cqe_s = 64,
  cqe_m = 63, elts_n = 10, cqe_n = 6, wqe_n = 12, tso_en = 1, tunnel_en = 0, swp_en = 0, vlan_en = 0, db_nc = 0, db_heu = 0, rt_timestamp = 0, wait_on_time = 0, fast_free = 0, inlen_send = 18, inlen_empw = 0,
  inlen_mode = 18, qp_num_8s = 865280, offloads = 32815, mr_ctrl = {dev_gen_ptr = 0x600049a000f4, cur_gen = 0, mru = 0, head = 0, cache = {{start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {
        start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}, {start = 0, end = 0, lkey = 0}}, cache_bh = {
      len = 1, size = 256, table = 0x60004b8973c0}}, wqes = 0x600049655000, wqes_end = 0x600049695000, fcqs = 0x600049697100, cqes = 0x600049696000, qp_db = 0x600049695004, cq_db = 0x600049697000, port_id = 0,
  idx = 0, rt_timemask = 0, ts_mask = 0, ts_offset = -1, sh = 0x60004be00c40, stats = {opackets = 960, obytes = 73222, oerrors = 0}, stats_reset = {opackets = 0, obytes = 0, oerrors = 0}, uar_data = {db = 0x0},
  elts = 0x60004b898ac8}
(gdb)

A few questions:
1. Why isn't the cq_ci counter increasing in proc1 (rte_proc_secondary)? Does it mean the Mellanox backend hardware is not consuming the packets?
2. Why is check_cqe stuck at MLX5_CQE_STATUS_HW_OWN in proc1 (rte_proc_secondary)?
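
To make the second question concrete, here is a small standalone model of the check_cqe logic quoted earlier, fed with the op_own value from the proc1 gdb session. The enum values and opcode constants are our assumptions, inferred from the gdb output (ret = -2, op_code = 15) and from our reading of the headers; all names are local stand-ins rather than the driver's own definitions, and the read barrier is left out because nothing here touches device memory.

```c
#include <stdint.h>

/* Local stand-ins; the numeric values are assumptions, not copied from DPDK. */
enum cqe_status { CQE_SW_OWN = -1, CQE_HW_OWN = -2, CQE_ERR = -3 };

#define CQE_OWNER(op_own)  ((op_own) & 0x1)           /* assumed: bit 0 */
#define CQE_OPCODE(op_own) (((op_own) & 0xf0) >> 4)   /* assumed: high nibble */
#define CQE_REQ_ERR  0xd
#define CQE_RESP_ERR 0xe
#define CQE_INVALID  0xf

/* Same decision order as the quoted check_cqe, minus rte_io_rmb(). */
static enum cqe_status
model_check_cqe(uint8_t op_own, uint16_t cqes_n, uint16_t ci)
{
	const uint16_t idx = ci & cqes_n;
	const uint8_t owner = CQE_OWNER(op_own);
	const uint8_t opcode = CQE_OPCODE(op_own);

	/* Owner bit must match the current pass over the ring. */
	if (owner != !!idx || opcode == CQE_INVALID)
		return CQE_HW_OWN;
	if (opcode == CQE_RESP_ERR || opcode == CQE_REQ_ERR)
		return CQE_ERR;
	return CQE_SW_OWN;
}
```

With op_own = 241 (0xF1), cqes_n = 64 and ci = 0, this model returns the hardware-owned status for two independent reasons: the owner bit is 1 while the expected value on the first pass is 0, and the opcode is 0xF (invalid). That would be consistent with a CQE that has not been written since the queue was set up, although we have not confirmed that this is what is happening.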

Thanks,
Samar