From: vuonglv@viettel.com.vn
Date: Fri, 21 Jul 2017 17:49:30 +0700 (ICT)
To: Cristian Dumitrescu
Cc: users@dpdk.org, dev@dpdk.org
Message-ID: <699679625.3550957.1500634115756.JavaMail.zimbra@viettel.com.vn>
In-Reply-To: <3EB4FA525960D640B5BDFFD6A3D891267BA8294B@IRSMSX108.ger.corp.intel.com>
Subject: Re: [dpdk-dev] Rx can't receive any more packets after receiving
 1.5 billion packets

----- Original Message -----
From: "Cristian Dumitrescu"
To: vuonglv@viettel.com.vn
Cc: users@dpdk.org, dev@dpdk.org
Sent: Thursday, July 20, 2017 1:43:37 AM
Subject: RE: [dpdk-dev] Rx can't receive any more packets after receiving
1.5 billion packets

> -----Original Message-----
> From: vuonglv@viettel.com.vn [mailto:vuonglv@viettel.com.vn]
> Sent: Tuesday, July 18, 2017 2:37 AM
> To: Dumitrescu, Cristian
> Cc: users@dpdk.org; dev@dpdk.org
> Subject: Re: [dpdk-dev] Rx can't receive any more packets after
> receiving 1.5 billion packets
>
> On 07/17/2017 05:31 PM, cristian.dumitrescu@intel.com wrote:
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of
> >> vuonglv@viettel.com.vn
> >> Sent: Monday, July 17, 2017 3:04 AM
> >> Cc: users@dpdk.org; dev@dpdk.org
> >> Subject: [dpdk-dev] Rx can't receive any more packets after
> >> receiving 1.5 billion packets
> >>
> >> Hi DPDK team,
> >> Sorry for sending this email to both the users and dev groups, but I
> >> have a big problem: the Rx core in my application can no longer
> >> receive packets after a stress test (~1 day, during which the Rx core
> >> received ~1.5 billion packets). The Rx core is still alive, but it
> >> receives no packets and generates no logs. Below is my system
> >> configuration:
> >> - OS: CentOS 7
> >> - Kernel: 3.10.0-514.16.1.el7.x86_64
> >> - Huge pages: 32 GB (16384 pages of 2 MB)
> >> - NIC: Intel 82599
> >> - DPDK version: 16.11
> >> - Architecture: Rx (lcore 1) receives packets and enqueues them to
> >> the ring ----- Worker (lcore 2) dequeues packets from the ring and
> >> frees them (using rte_pktmbuf_free()).
> >> - Mempool creation:
> >>     rte_pktmbuf_pool_create(
> >>         "rx_pool",                 /* name */
> >>         8192,                      /* number of elements in the mbuf pool */
> >>         256,                       /* size of the per-core object cache */
> >>         0,                         /* size of the application private area
> >>                                       between the rte_mbuf struct and the
> >>                                       data buffer */
> >>         RTE_MBUF_DEFAULT_BUF_SIZE, /* size of the data buffer in each
> >>                                       mbuf (2048 + 128) */
> >>         0                          /* socket id */
> >>     );
> >> If I change the number of elements in the mbuf pool from 8192 to 512,
> >> Rx hits the same problem after a shorter time (~30 s).
> >>
> >> Please tell me if you need more information. I am looking forward to
> >> hearing from you.
> >>
> >> Many thanks,
> >> Vuong Le
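For reference, a minimal sketch of the pipeline described above, written
against the DPDK 16.11-era API. The function names, the port/queue ids,
and the enqueue-failure handling are illustrative assumptions, not the
poster's actual code:

    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_ring.h>

    #define BURST 32

    /* Rx core (lcore 1): receive a burst and hand each mbuf to the
     * worker through the ring. */
    static void
    rx_loop(uint8_t port, struct rte_ring *ring)
    {
            struct rte_mbuf *pkts[BURST];
            uint16_t i, n;

            for (;;) {
                    n = rte_eth_rx_burst(port, 0, pkts, BURST);
                    for (i = 0; i < n; i++) {
                            /* An mbuf that cannot be enqueued must still
                             * be freed; dropping this branch is exactly
                             * the kind of slow leak discussed below. */
                            if (rte_ring_enqueue(ring, pkts[i]) != 0)
                                    rte_pktmbuf_free(pkts[i]);
                    }
            }
    }

    /* Worker core (lcore 2): dequeue each mbuf and return it to the
     * mempool. */
    static void
    worker_loop(struct rte_ring *ring)
    {
            void *m;

            for (;;) {
                    if (rte_ring_dequeue(ring, &m) == 0)
                            rte_pktmbuf_free((struct rte_mbuf *)m);
            }
    }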
> > Hi Vuong,
> >
> > This is likely to be a buffer leakage problem. You might have a path
> > in your code where you are not freeing a buffer, so that buffer gets
> > "lost": the application can no longer use it, since it is never
> > returned to the pool, and the pool of free buffers shrinks over time
> > until it eventually becomes empty and no more packets can be received.
> >
> > You might want to periodically monitor the number of free buffers in
> > your pool. If this is the root cause, you should see that number
> > constantly decreasing until it becomes flat zero; otherwise you should
> > see it oscillating around an equilibrium point.
> >
> > Since it takes a relatively big number of packets to reach this issue,
> > the code path with the problem is probably not executed very
> > frequently: it might be a control-plane packet that is not freed, an
> > ARP request/reply packet, etc.
> >
> > Regards,
> > Cristian
>
> Hi Cristian,
> Thanks for your response; I am trying your idea. But let me show you
> another case I tested before, in which I changed the architecture of my
> application as follows:
> - Architecture: Rx (lcore 1) receives packets and enqueues them to the
> ring ----- after that: Rx (lcore 1) itself dequeues the packets from
> the ring and frees them immediately.
> (The old architecture is as described above.)
> With the new architecture, Rx was still receiving packets after 2 days
> and everything looked good. Unfortunately, my application must run with
> the old architecture.
>
> Any ideas for me?
>
> Many thanks,
> Vuong Le

I am not sure I understand the old architecture and the new architecture
you are referring to; can you please clarify them?

Regards,
Cristian

Hi Cristian,
I have found my problem. It was caused by creating the mempool on socket
1 while creating the ring on socket 0 (I had not set up huge pages for
socket 0). This was my stupid mistake. But I don't understand why the
system could still create the ring on socket 0 when I had not set up huge
pages there.

Thanks for your support.

Many thanks,
Vuong Le
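For completeness, a hedged sketch of the fix Vuong describes, together
with the periodic pool check Cristian suggests. The object names, ring
size, and lcore-to-socket mapping are assumptions;
rte_mempool_avail_count() is available from DPDK 16.07 onwards:

    #include <stdio.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>
    #include <rte_ring.h>

    #define RX_LCORE 1 /* assumption: Rx runs on lcore 1, as in the thread */

    static struct rte_mempool *rx_pool;
    static struct rte_ring *rx_ring;

    static int
    setup_rx_objects(void)
    {
            /* Create both objects on the NUMA socket of the core that
             * uses them, so neither can end up on a socket that has no
             * huge pages configured. */
            unsigned int socket = rte_lcore_to_socket_id(RX_LCORE);

            rx_pool = rte_pktmbuf_pool_create("rx_pool", 8192, 256, 0,
                            RTE_MBUF_DEFAULT_BUF_SIZE, socket);
            if (rx_pool == NULL)
                    return -1;

            rx_ring = rte_ring_create("rx_ring", 1024, socket,
                            RING_F_SP_ENQ | RING_F_SC_DEQ);
            if (rx_ring == NULL)
                    return -1;

            return 0;
    }

    /* Cristian's suggested health check: a count that only ever shrinks
     * points to a leak; a count oscillating around an equilibrium point
     * is healthy. */
    static void
    report_pool(void)
    {
            printf("free mbufs in rx_pool: %u\n",
                   rte_mempool_avail_count(rx_pool));
    }

Passing an explicit socket id also makes a misplaced allocation fail
fast: if the requested socket has no huge-page memory, the create call
returns NULL instead of silently placing the object elsewhere.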