From: Ferruh Yigit
To: gowrishankar muthukrishnan
Cc: dev@dpdk.org
Date: Wed, 31 May 2017 17:21:19 +0100
Subject: Re: [dpdk-dev] [PATCH v2] kni: add new mbuf in alloc_q only based on its empty slots
Message-ID: <82c75bc8-644a-1aa9-4a6b-60061633108d@intel.com>
References: <1494502172-16950-1-git-send-email-gowrishankar.m@linux.vnet.ibm.com>
 <1494503486-20876-1-git-send-email-gowrishankar.m@linux.vnet.ibm.com>
 <6466d914-f47f-1ecc-6fec-656893457663@intel.com>

Hi Gowrishankar,

Sorry for the late response.

On 5/18/2017 6:45 PM, gowrishankar muthukrishnan wrote:
> On Tuesday 16 May 2017 10:45 PM, Ferruh Yigit wrote:
>> On 5/11/2017 12:51 PM, Gowrishankar wrote:
>>> From: Gowrishankar Muthukrishnan
>>>
>>> In kni_allocate_mbufs(), we attempt to add max_burst (32) count of mbufs
>>> into alloc_q every time, which excessively leads to too many rte_pktmbuf_free()
>>> calls when alloc_q is contended at a high packet rate (e.g. 10Gig data).
>>> In a situation when the alloc_q fifo can only accommodate very few (or zero)
>>> mbufs, create only what is needed and add it to the fifo.
>>
>> I remember I have tried something similar, and also tried allocating the
>> amount of nb_packets read from the kernel; both produced worse performance.
>> Can you please share your before/after performance numbers?
>
> Sure Ferruh, please find below a comparison of call counts I set at two places,
> along with an additional stat on kni egress for more than one packet in the txq
> burst read, as in the pseudo code below:
>
> @@ -589,8 +592,12 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
>         unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);
>
>         /* If buffers removed, allocate mbufs and then put them into alloc_q */
>         if (ret) {
>                 ++alloc_call;
>                 if (ret > 1)
>                         alloc_call_mt1tx += ret;
>                 kni_allocate_mbufs(kni);
>         }
>
>         return ret;
> }
> @@ -659,6 +666,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>         if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
>                 int j;
>
>                 freembuf_call += (i - ret);
>                 for (j = ret; j < i; j++)
>                         rte_pktmbuf_free(pkts[j]);
>
>> kni_allocate_mbufs() is called within rte_kni_rx_burst() if any packet is
>> received from the kernel. If there is heavy traffic, the kernel will always
>> consume the alloc_q before this function is called, and this function will
>> fill it back. So there shouldn't be many cases where the alloc_q fifo is
>> already full.
>> Perhaps this can happen if the application burst Rx from the kernel in a
>> number less than 32 while the fifo is filled with a fixed 32 mbufs; is this
>> your case?
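
(As a standalone illustration of the accounting discussed above: the model below
is not DPDK code; the fifo capacity, per-call drain rate and call count are
assumptions chosen only for illustration. It shows how an unconditional 32-mbuf
refill turns into a rte_pktmbuf_free() call for most allocated mbufs when the
kernel drains only a couple of mbufs per burst.)

/*
 * Standalone model of the refill accounting -- not DPDK code.
 * FIFO_SIZE, DRAIN_PER_CALL and CALLS are assumed values.
 */
#include <stdio.h>

#define MAX_MBUF_BURST_NUM 32    /* fixed refill burst in kni_allocate_mbufs() */
#define FIFO_SIZE          64    /* assumed alloc_q capacity                   */
#define DRAIN_PER_CALL      2    /* assumed mbufs the kernel drains per burst  */
#define CALLS           51042    /* alloc_call count from the 100MB test below */

int main(void)
{
	long allocated = 0, freed = 0, in_fifo = FIFO_SIZE;

	for (long c = 0; c < CALLS; c++) {
		in_fifo -= DRAIN_PER_CALL;         /* kernel consumed mbufs   */

		allocated += MAX_MBUF_BURST_NUM;   /* always allocate a burst */
		long room = FIFO_SIZE - in_fifo;
		long put = room < MAX_MBUF_BURST_NUM ? room : MAX_MBUF_BURST_NUM;
		in_fifo += put;                    /* what fits goes in       */
		freed += MAX_MBUF_BURST_NUM - put; /* the rest is freed again */
	}
	printf("allocated=%ld freed=%ld kept=%ld\n",
	       allocated, freed, allocated - freed);
	return 0;
}

(With these assumed numbers, roughly 30 of every 32 allocated mbufs are freed
straight away, which matches the shape of the freembuf_call figures reported
in the stats below.)
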
>
> I think there is some resemblance to this case based on the stats below. W/o
> the patch, the application would spend most of its processing in freeing
> mbufs, right?
>
>> Can you measure the number of times rte_pktmbuf_free() is called because
>> alloc_q is full?
>
> I have sampled the data below on x86_64 for KNI on the ixgbe pmd. The iperf
> server runs on a remote interface connected to the PMD and the iperf client
> runs on the KNI interface, so as to create more egress from KNI into DPDK
> (w/o and with this patch) for 1MB and 100MB of data. rx and tx stats are from
> the kni app (USR1).
>
> 100MB w/o patch 1.28Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 3933  72464  51042       42472             1560540

Some math:

alloc called 51042 times, allocating 32 mbufs each time:
51042 * 32 = 1633344
freed mbufs: 1560540
used mbufs: 1633344 - 1560540 = 72804
72804 =~ 72464, so looks correct.

Which means rte_kni_rx_burst() was called 51042 times and 72464 buffers were
received. As you already mentioned, for each call the kernel is able to put
only 1-2 packets into the fifo. This number is close to 3 for my test with the
KNI PMD.

And for this case, I agree your patch looks reasonable.

But what if kni has more egress traffic, enough to put >= 32 packets into the
fifo between each rte_kni_rx_burst() call? For that case this patch introduces
the extra cost of computing the allocq_free count.

Overall I do not disagree with the patch, but I am concerned it could cause a
performance loss in some cases while improving this one. It would help a lot
if KNI users test and comment.

For me, applying the patch didn't make any difference in the final performance
numbers, but if there is no objection, I am OK to take this patch.

> 1MB w/o patch 204Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 84  734  566         330               17378
>
> 100MB w/ patch 1.23Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 4258  72466  72466       0                 0
>
> 1MB w/ patch 203Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 76  734  733         2                 0
>
> With the patch, KNI egress on the txq seems to be almost only one packet at a
> time (and in the 1MB test, a rare instance of more than 2 packets is seen even
> though it is a burst read). Also, as one mbuf is consumed by the module and
> one added by the lib at a time, rte_pktmbuf_free is not called at all, due to
> the right amount (1 or 2) of mbufs being enqueued in alloc_q.
>
> This controlled enqueue on alloc_q avoids the network stall for i40e on
> ppc64le. Could you please check whether i40e is able to handle data on the
> order of 10GiB on your arch, as I see that the network stalls at some random
> point w/o this patch.
>
> Thanks,
> Gowrishankar
>
>>> With this patch, we could stop the random network stall in KNI at higher
>>> packet rates (eg 1G or 10G data between vEth0 and the PMD) that sufficiently
>>> exhaust alloc_q under the above condition. I tested the i40e PMD for this
>>> purpose on ppc64le.
>>
>> If the stall happens from NIC to kernel, this is the kernel receive path,
>> and alloc_q is in the kernel transmit path.
>>
>>> Changes:
>>> v2 - alloc_q free count calculation corrected.
>>>      line wrap fixed for commit message.
>>>
>>> Signed-off-by: Gowrishankar Muthukrishnan
>>> ---
>>>  lib/librte_kni/rte_kni.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
>>> index c3f9208..9c5d485 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -624,6 +624,7 @@ struct rte_kni *
>>>  	int i, ret;
>>>  	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
>>>  	void *phys[MAX_MBUF_BURST_NUM];
>>> +	int allocq_free;
>>>
>>>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
>>>  			 offsetof(struct rte_kni_mbuf, pool));
>>> @@ -646,7 +647,9 @@ struct rte_kni *
>>>  		return;
>>>  	}
>>>
>>> -	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +	allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
>>> +			& (MAX_MBUF_BURST_NUM - 1);
>>> +	for (i = 0; i < allocq_free; i++) {
>>>  		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>>  		if (unlikely(pkts[i] == NULL)) {
>>>  			/* Out of memory */
>
>
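
(For readers following the allocq_free computation in the hunk above: for a
power-of-two ring, the free slots available to the producer can be derived from
the read and write indices as sketched below. This is a standalone sketch with
an assumed ring size, not the KNI fifo code itself; in the patch the mask is
MAX_MBUF_BURST_NUM - 1, which also caps the refill at one burst.)

/*
 * Sketch of the free-slot math for a power-of-two ring buffer.
 * RING_SIZE is an assumed value for illustration; it must be a power of two.
 */
#include <stdio.h>

#define RING_SIZE 32	/* assumed; must be a power of two */

static unsigned ring_free_slots(unsigned read, unsigned write)
{
	/* one slot is kept empty to distinguish a full ring from an empty one */
	return (read - write - 1) & (RING_SIZE - 1);
}

int main(void)
{
	/* empty ring: read == write -> RING_SIZE - 1 free slots */
	printf("%u\n", ring_free_slots(0, 0));   /* prints 31 */
	/* nearly full ring: write one behind read -> 0 free slots */
	printf("%u\n", ring_free_slots(5, 4));   /* prints 0 */
	return 0;
}

(The "- 1" keeps one slot unused, so a producer never advances write onto read;
that is why an empty ring reports RING_SIZE - 1 free slots rather than RING_SIZE.)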