From: Hui Liu
Date: Wed, 25 Jul 2018 19:31:46 -0700
To: Amarnath Nallapothula
Cc: "users@dpdk.org"
Subject: Re: [dpdk-users] occasionally traffic stalls due to rx and tx descriptor not available

Hi Amar,

I finally reproduced my problem in my own testbed and, not surprisingly, saw
almost the same problem as you, except that it is on an Intel 82599 (ixgbe)
port:

75      static __rte_always_inline int
76      ixgbe_tx_free_bufs(struct ixgbe_tx_queue *txq)
77      {
78              struct ixgbe_tx_entry_v *txep;
79              uint32_t status;
80              uint32_t n;
81              uint32_t i;
82              int nb_free = 0;
83              struct rte_mbuf *m, *free[RTE_IXGBE_TX_MAX_FREE_BUF_SZ];
84
85              /* check DD bit on threshold descriptor */
86              status = txq->tx_ring[txq->tx_next_dd].wb.status;
87              if (!(status & IXGBE_ADVTXD_STAT_DD))
88                      return 0;
89
(gdb) n
87      in /home/admin/hui/monitor_platform_new/Common/dpdk-stable-18.02.2/drivers/net/ixgbe/ixgbe_rxtx_vec_common.h
(gdb) p/x status
$10 = 0x138000

and:

#define IXGBE_ADVTXD_STAT_DD IXGBE_TXD_STAT_DD
#define IXGBE_TXD_STAT_DD 0x00000001

When everything goes fine:

(gdb) p/x status
$5 = 0x1038001

I assume we are hitting the same problem. There might be some code path
that consumes these tx descriptors and never puts them back to "DONE" status,
so that they are never freed again, but I need some time to figure it out.
Have you had any luck or any updates on your case?

Regards,
Hui

On Fri, Jul 6, 2018 at 4:49 AM, Amarnath Nallapothula <Amarnath.Nallapothula@riverbed.com> wrote:

> I debugged further by attaching gdb to my process and, as I suspected,
> transmission is failing because no free descriptor is available, and the
> code is unable to free any either, due to the following condition in the
> fm10k driver.
>
> static inline int __attribute__((always_inline))
> fm10k_tx_free_bufs(struct fm10k_tx_queue *txq)
> {
>         struct rte_mbuf **txep;
>         uint8_t flags;
>         uint32_t n;
>         uint32_t i;
>         int nb_free = 0;
>         struct rte_mbuf *m, *free[RTE_FM10K_TX_MAX_FREE_BUF_SZ];
>
>         /* check DD bit on threshold descriptor */
>         flags = txq->hw_ring[txq->next_dd].flags;
>         if (!(flags & FM10K_TXD_FLAG_DONE))
>                 return 0;    <-- returns from here.
>
> Breakpoint 5, fm10k_xmit_pkts_vec (tx_queue=0x7fde3e430040, tx_pkts=0x7fde913eda40, nb_pkts=32)
>     at /src/dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c:826
> 826     in /src/dpdk/drivers/net/fm10k/fm10k_rxtx_vec.c
> (gdb) p *(struct fm10k_tx_queue *)tx_queue
> $19 = {sw_ring = 0x7fde3e42f000, hw_ring = 0x7fde3e3aef80, hw_ring_phys_addr = 14490988416,
>   rs_tracker = {list = 0x7fde3e3aef00, head = 0x0, tail = 0x0, endp = 0x0},
>   ops = 0x8adec8 , last_free = 0, next_free = 191, nb_free = 0, nb_used = 0,
>   free_thresh = 32, rs_thresh = 32, next_rs = 191, next_dd = 223, tail_ptr = 0x7fde52020014,
>   txq_flags = 3841, nb_desc = 512, port_id = 0 '\000', tx_deferred_start = 0 '\000', queue_id = 0,
>   tx_ftag_en = 0}
> (gdb) p /x ((struct fm10k_tx_queue *)tx_queue)->hw_ring[223].flags
> $21 = 0x60
> (gdb) p 0x80 & ((struct fm10k_tx_queue *)tx_queue)->hw_ring[223].flags
> $22 = 0
> (gdb)
>
> It looks like the driver/NIC is unable to transmit the packet, and hence
> the flags field is still not set to FM10K_TXD_FLAG_DONE. But I am still
> not sure where the problem is.
>
> Regards,
> Amar
>
> *From: *Hui Liu
> *Date: *Friday, 6 July 2018 at 9:06 AM
> *To: *Amarnath Nallapothula
> *Cc: *"users@dpdk.org"
> *Subject: *Re: [dpdk-users] occasionally traffic stalls due to rx and tx descriptor not available
>
> Hi Amar,
>
> I'm a DPDK newbie and I saw a similar problem recently on one 82599 port.
> My app does the following:
>
> 1. The TX thread calls rte_pktmbuf_alloc() to allocate buffers from
> mbuf_pool, fills each one in as an ICMP packet and sends it out, at a rate
> of around 400,000 packets/sec (1.6 Gbps);
>
> 2. The RX thread receives the ICMP responses and worker threads process
> them.
>
> The app runs fine for some time, anywhere from 8 hours to 5 days, and
> then goes into a bad state in which the TX thread can no longer send
> packets out via rte_eth_tx_buffer() or rte_eth_tx_buffer_flush(), and
> rte_eth_tx_buffer_count_callback() is called for all packets on flush.
> I strongly suspect that the descriptors are exhausted, but I have not
> been able to confirm it yet.
>
> In my app, I set the max packet burst to 256, rx descriptors to 2048 and
> tx descriptors to 4096, with a single rx/tx queue for one port, to get
> good performance. I am not sure this is the best combination. Just FYI.
> As for the descriptor problem, I'm still investigating what kind of
> behavior/condition takes descriptors and never releases them, just as in
> your Query 2. If applicable, would you please let me know whether there
> is a way to get the number of available tx/rx descriptors of a port, so
> that I could see over time when descriptors are taken without being
> released?
>
> Due to limits of my system environment, I'm not able to attach gdb
> directly to debug. While I'm investigating this problem, would you please
> update me when you have any clue on your issue, so that I might get some
> inspiration from it?
>
> Thank you very much!
>
> Regards,
>
> Hui
>
> On Thu, Jul 5, 2018 at 4:34 AM, Amarnath Nallapothula <Amarnath.Nallapothula@riverbed.com> wrote:
>
> Hi Experts,
>
> I am testing the performance of my DPDK-based application, which forwards
> packets from port 1 to port 2 of a 40G NIC card and vice versa.
> Occasionally we see that packet rx and tx stop on one of the ports. I
> looked through the code of DPDK's fm10k driver and found that this could
> happen if rx/tx descriptors are not available.
>
> To improve performance, I am using the RSS functionality and have created
> five rx and tx queues. Dedicated lcores are assigned to forward packets
> from port 1 queue 0 to port 2 queue 0 and vice versa.
>
> During port initialization, each rx_queue is initialized with a ring of
> 128 Rx descriptors and each tx_queue with a ring of 512 Tx descriptors.
> Threshold values are left at their defaults.
>
> I have a few queries here:
>
> 1. Are the above rx and tx descriptor counts good for each queue of a
> given port?
> 2. Under what conditions do rx and tx descriptors get exhausted?
> 3. Is there any suggestion or information you can provide to debug this
> issue?
>
> Regards,
> Amar
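
The check that both quoted driver functions perform on the threshold descriptor can be reproduced in isolation. What follows is a minimal C sketch and not driver code: desc_done(), struct toy_txq, toy_tx_free_bufs(), TOY_RING_SIZE and the two mask names are hypothetical names made up for illustration. Only the mask values (0x00000001 from the ixgbe defines, 0x80 as used in the fm10k gdb session) and the status/flags values in the note below are taken from the thread. The wrap-around of next_dd is simplified to a modulo, and no mbufs are actually freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values as they appear in the thread; the names are illustrative. */
#define DD_MASK_IXGBE   0x00000001u /* IXGBE_TXD_STAT_DD as quoted above */
#define DONE_MASK_FM10K 0x80u       /* mask used in the fm10k gdb session */

#define TOY_RING_SIZE 512u          /* assumption: matches nb_desc = 512 */

/* True when the "done" bit has been written back into the descriptor. */
static bool desc_done(uint32_t status, uint32_t done_mask)
{
    return (status & done_mask) != 0;
}

/* Hypothetical, heavily simplified model of a TX queue. */
struct toy_txq {
    uint32_t status[TOY_RING_SIZE]; /* per-descriptor status/flags word */
    uint16_t next_dd;               /* index of the threshold descriptor */
    uint16_t rs_thresh;             /* descriptors released per batch */
    uint16_t nb_free;               /* descriptors available to transmit */
};

/*
 * Same shape as the early return in the quoted *_tx_free_bufs() functions:
 * nothing is released until the threshold descriptor reports "done".
 * Returns the number of descriptors released (0 or rs_thresh).
 */
static int toy_tx_free_bufs(struct toy_txq *q, uint32_t done_mask)
{
    assert(q->next_dd < TOY_RING_SIZE);

    /* check the done bit on the threshold descriptor */
    if (!desc_done(q->status[q->next_dd], done_mask))
        return 0;

    q->nb_free = (uint16_t)(q->nb_free + q->rs_thresh);
    /* simplified wrap-around; the real drivers reset the index explicitly */
    q->next_dd = (uint16_t)((q->next_dd + q->rs_thresh) % TOY_RING_SIZE);
    return q->rs_thresh;
}
```

With the values from the thread, desc_done(0x138000, DD_MASK_IXGBE) is false and desc_done(0x1038001, DD_MASK_IXGBE) is true, while desc_done(0x60, DONE_MASK_FM10K) is false. In this toy model, a queue with next_dd = 223, nb_free = 0 and status[223] = 0x60 returns 0 from toy_tx_free_bufs() on every call and nb_free stays at 0, which is consistent with the stalled state seen in both gdb sessions. The sketch says nothing about why the hardware never sets the bit.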