From: Hrvoje Habjanić <hrvoje.habjanic@zg.ht.hr>
To: users@dpdk.org
Date: Mon, 8 Apr 2019 11:52:46 +0200
Subject: Re: [dpdk-users] DPDK TX problems
In-Reply-To: <16b6d36f-ea75-4f20-5d96-ef4053787dba@zg.ht.hr>

On 29/03/2019 08:24, Hrvoje Habjanić wrote:
>> Hi.
>>
>> I wrote an application using DPDK 17.11 (I also tried 18.11), and
>> while doing some performance testing I noticed very odd behaviour.
>> To verify that this is not caused by my app, I ran the same test
>> with the l2fwd example app, and I am still confused by the results.
>>
>> In short, I am trying to push a lot of L2 packets through the DPDK
>> engine - packet processing is minimal. When testing, I start with a
>> small packets-per-second rate and then gradually increase it to see
>> where the limit is. At some point I reach that limit - packets
>> start to get dropped. And this is where things become weird.
>>
>> When I reach the peak packet rate (at which packets start to get
>> dropped), I would expect that reducing the packet rate would stop
>> the drops. But this is not the case. For example, assume the peak
>> packet rate is 3.5 Mpps. At that point everything works fine.
>> Increasing the rate to 4.0 Mpps produces a lot of dropped packets.
>> When the rate is reduced back to 3.5 Mpps, the app is still broken
>> - packets are still dropped.
>>
>> At this point I have to reduce the rate drastically (to 1.4 Mpps)
>> to make the drops go away. The app is then unable to forward
>> anything beyond this 1.4 Mpps, despite the fact that at the
>> beginning it was forwarding 3.5 Mpps! The only way to recover is to
>> restart the app.
>>
>> Also, sometimes the app just stops forwarding any packets - packets
>> are received (as seen by the counters), but the app is unable to
>> send anything back.
>>
>> As I mentioned, I see the same behaviour with the l2fwd example
>> app. I tested DPDK 17.11 and also DPDK 18.11 - the results are the
>> same.
>>
>> My test environment is an HP DL380 G8 with 82599ES 10 Gig (ixgbe)
>> cards, connected to a Cisco Nexus 9300 switch. On the other side is
>> an Ixia test appliance. The application runs in a virtual machine
>> (VM) under KVM (OpenStack, with SR-IOV enabled and NUMA
>> restrictions). I checked that the VM uses only CPUs from the NUMA
>> node to which the network card is connected, so there is no
>> cross-NUMA traffic. OpenStack is Queens, the host runs Ubuntu
>> Bionic, and the virtual machine also uses Ubuntu Bionic as its OS.
>>
>> I do not know how to debug this. Does someone else have the same
>> observations?
>>
>> Regards,
>>
>> H.
> There are additional findings. It seems that when I reach the peak
> pps rate, the application is not fast enough, and I can see rx
> missed errors in the card statistics on the host. At the same time,
> the TX side starts to show problems (tx burst reports that it did
> not send all packets). Shortly after that, TX falls apart completely
> and the top pps rate drops.
>
> Since I did not disable pause frames, I can see the "RX pause" frame
> counter increasing on the switch. On the other hand, if I disable
> pause frames (on the NIC of the server), the host driver (ixgbe)
> reports "TX unit hang" in dmesg and resets the card. Of course,
> after the reset none of the DPDK apps in the VMs on this host work
> any more.
>
> Is it possible that at the time of congestion DPDK does not release
> mbufs back to the pool, and the TX ring becomes "filled" with zombie
> packets (not sent by the card, but still marked as in use via their
> reference counts)?
>
> Is there a way to check the mempool or the TX ring for "leftovers"?
> Is it possible to somehow "flush" the TX ring and/or the mempool?
>
> H.

After a few more tests, things become even weirder - if I do not free
the mbufs that were not sent, but resend them instead, I can
"survive" the over-the-peak event! But then the peak rate starts to
drop gradually ...
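To make it clear what was changed, here is a rough sketch of the two
TX strategies (this is not the actual l2fwd code - the port/queue
arguments and the retry bound are just placeholders):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Variant A: usual handling - mbufs the PMD did not accept are
 * dropped, i.e. freed straight back to the mempool. */
static void
tx_drop_unsent(uint16_t port, uint16_t queue,
               struct rte_mbuf **bufs, uint16_t nb)
{
    uint16_t sent = rte_eth_tx_burst(port, queue, bufs, nb);

    while (sent < nb)
        rte_pktmbuf_free(bufs[sent++]);
}

/* Variant B: what is described above - offer the unsent mbufs to the
 * PMD again instead of freeing them (bounded, so the loop cannot
 * spin forever if the TX queue never drains). */
static void
tx_retry_unsent(uint16_t port, uint16_t queue,
                struct rte_mbuf **bufs, uint16_t nb)
{
    uint16_t sent = 0;
    int retries = 100; /* arbitrary bound for the sketch */

    while (sent < nb && retries-- > 0)
        sent += rte_eth_tx_burst(port, queue, bufs + sent, nb - sent);

    while (sent < nb) /* give up on what is still unsent */
        rte_pktmbuf_free(bufs[sent++]);
}

With variant A the unsent mbufs go straight back to the pool; with
variant B the same mbufs are handed to the PMD again, and that is the
change that seems to survive the over-the-peak event here.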
I would like to ask if someone can try this on their platform and
report back. I would really like to know whether this is a problem
with my deployment, or whether there is something wrong in DPDK.

The test should be simple - use l2fwd or l3fwd and determine the
maximum pps. Then drive the rate 30% over that maximum, bring it back
down again, and confirm whether you can still reach the original
maximum pps.

Thanks in advance.

H.
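PS: regarding the question in the quoted part about checking the
mempool or the TX ring for leftovers, this is roughly what can be
queried from inside the application (again only a sketch - it assumes
the application has direct access to its mbuf pool pointer, and
whether rte_eth_tx_done_cleanup() is actually supported depends on
the PMD):

#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_mempool.h>

/* Print how many mbufs are in use vs. still available in the pool,
 * then ask the PMD to free mbufs of already-transmitted packets on
 * the given TX queue. rte_eth_tx_done_cleanup() returns the number
 * of packets freed, or a negative value (e.g. -ENOTSUP) if the
 * driver does not support it. */
static void
dump_pool_and_txq(struct rte_mempool *mp, uint16_t port, uint16_t queue)
{
    printf("mempool %s: %u in use, %u available\n",
           mp->name,
           rte_mempool_in_use_count(mp),
           rte_mempool_avail_count(mp));

    int ret = rte_eth_tx_done_cleanup(port, queue, 0);
    if (ret < 0)
        printf("tx_done_cleanup on %d/%d failed: %d\n", port, queue, ret);
    else
        printf("tx_done_cleanup on %d/%d freed %d mbufs\n", port, queue, ret);
}

If the "in use" count stays close to the pool size even after the
offered rate is reduced, that would support the theory that mbufs are
stuck in the TX ring.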