From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ferruh.yigit@intel.com>
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 8633758CB
 for <dev@dpdk.org>; Wed, 31 May 2017 18:21:22 +0200 (CEST)
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 31 May 2017 09:21:21 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.39,275,1493708400"; d="scan'208";a="93289966"
Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.237.220.81])
 ([10.237.220.81])
 by orsmga004.jf.intel.com with ESMTP; 31 May 2017 09:21:20 -0700
To: gowrishankar muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Cc: dev@dpdk.org
References: <1494502172-16950-1-git-send-email-gowrishankar.m@linux.vnet.ibm.com>
 <1494503486-20876-1-git-send-email-gowrishankar.m@linux.vnet.ibm.com>
 <6466d914-f47f-1ecc-6fec-656893457663@intel.com>
 <c1609a88-79e7-3ed9-424d-22469ab58f28@linux.vnet.ibm.com>
From: Ferruh Yigit <ferruh.yigit@intel.com>
Message-ID: <82c75bc8-644a-1aa9-4a6b-60061633108d@intel.com>
Date: Wed, 31 May 2017 17:21:19 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <c1609a88-79e7-3ed9-424d-22469ab58f28@linux.vnet.ibm.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [dpdk-dev] [PATCH v2] kni: add new mbuf in alloc_q only based
 on its empty slots
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 31 May 2017 16:21:23 -0000

Hi Gowrishankar,

Sorry for the late response.

On 5/18/2017 6:45 PM, gowrishankar muthukrishnan wrote:
> On Tuesday 16 May 2017 10:45 PM, Ferruh Yigit wrote:
>> On 5/11/2017 12:51 PM, Gowrishankar wrote:
>>> From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>>>
>>> In kni_allocate_mbufs(), we attempt to add max_burst (32) count of mbuf
>>> always into alloc_q, which is excessively leading too many rte_pktmbuf_
>>> free() when alloc_q is contending at high packet rate (for eg 10Gig data).
>>> In a situation when alloc_q fifo can only accommodate very few (or zero)
>>> mbuf, create only what needed and add in fifo.
>> I remember I have tried similar, also tried allocating amount of
>> nb_packets read from kernel, both produced worse performance.
>> Can you please share your before/after performance numbers?
> Sure Ferruh, please find below comparison of call counts I set at two places
> along with additional stat on kni egress for more than one packet in txq
> burst read, as in pseudo code below:
> 
>     @@ -589,8 +592,12 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
>             unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);
>      
>             /* If buffers removed, allocate mbufs and then put them into alloc_q */
>            if (ret) {
>                    ++alloc_call;
>                    if (ret > 1)
>                            alloc_call_mt1tx += ret;
>                     kni_allocate_mbufs(kni);
>            }
>      
>             return ret;
>      }
>     @@ -659,6 +666,7 @@ kni_allocate_mbufs(struct rte_kni *kni)
>             if (ret >= 0 && ret < i && ret < MAX_MBUF_BURST_NUM) {
>                     int j;
>      
>                    freembuf_call += (i-ret);
>                    for (j = ret; j < i; j++)
>                        rte_pktmbuf_free(pkts[j]);
> 
> 
> 
>> kni_allocate_mbufs() called within rte_kni_rx_burst() if any packet
>> received from kernel. If there is a heavy traffic, kernel will always
>> consume the alloc_q before this function called and this function will
>> fill it back. So there shouldn't be much cases that alloc_q fifo already
>> full.
>> Perhaps this can happen if application burst Rx from kernel in a number
>> less than 32, but fifo filled with fixed 32mbufs, is this your case?
> 
> I think some resemblance to this case based on below stats. W/o patch,
> application would spend its most of processing in freeing mbufs right ?.
> 
>>
>> Can you measure number of times rte_pktmbuf_free() called because of
>> alloc_q is full?
> 
> I have sampled below data in x86_64 for KNI on ixgbe pmd. iperf server
> runs on remote interface connecting PMD and iperf client runs on KNI
> interface, so as to create more egress from KNI into DPDK (w/o and with
> this patch) for 1MB and 100MB data. rx and tx stats are from kni app (USR1).
> 
> 100MB w/o patch 1.28Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 3933  72464  51042       42472             1560540

Some math:

alloc was called 51042 times, allocating 32 mbufs each time:
51042 * 32 = 1633344

freed mbufs: 1560540

used mbufs: 1633344 - 1560540 = 72804

72804 =~ 72464, so the numbers look consistent.

This means rte_kni_rx_burst() got packets on 51042 calls and received
72464 buffers in total.
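
The arithmetic above can be double-checked with a small C sketch. The
helper names are hypothetical and only the counter values come from the
100MB run without the patch; BURST stands in for MAX_MBUF_BURST_NUM.

```c
#include <assert.h>

#define BURST 32L /* stands in for MAX_MBUF_BURST_NUM: mbufs allocated per call */

/* Total mbufs allocated over `calls` invocations of the fixed-burst allocator. */
static long mbufs_allocated(long calls)
{
	return calls * BURST;
}

/* mbufs that actually ended up in alloc_q: allocated minus freed. */
static long mbufs_enqueued(long calls, long freed)
{
	long allocated = mbufs_allocated(calls);

	assert(freed <= allocated); /* cannot free more than was allocated */
	return allocated - freed;
}

/* Share of allocated mbufs that were freed right away, in whole percent. */
static long wasted_percent(long calls, long freed)
{
	return freed * 100 / mbufs_allocated(calls);
}
```

With alloc_call = 51042 and freembuf_call = 1560540, mbufs_enqueued()
gives 72804, and wasted_percent() shows that roughly 95% of the
allocated mbufs were freed without ever reaching alloc_q.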

As you already mentioned, on each call the kernel is able to put only 1-2
packets into the fifo. This number is close to 3 in my test with the KNI PMD.

And for this case, I agree your patch looks reasonable.

But what if KNI has more egress traffic, enough to put >= 32 packets into
the fifo between each rte_kni_rx_burst() call?
For that case this patch introduces the extra cost of getting the
allocq_free count.
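
To make that extra cost concrete, here is a minimal sketch of a free-slot
calculation on a power-of-two ring. The struct and function names are
hypothetical stand-ins, not the library's definitions, and the sketch
assumes the ring keeps one slot empty to tell full from empty. Unlike the
quoted diff, which masks with MAX_MBUF_BURST_NUM - 1, it masks with the
ring length and then clamps to the burst size; that is my choice for
illustration only.

```c
#include <assert.h>

/* Hypothetical stand-in for the fifo header; only the indices matter here. */
struct sketch_fifo {
	unsigned write; /* next slot the producer fills */
	unsigned read;  /* next slot the consumer drains */
	unsigned len;   /* ring size, assumed to be a power of two */
};

/* Free slots in the ring; one slot always stays empty by convention. */
static unsigned sketch_fifo_free_count(const struct sketch_fifo *f)
{
	assert(f->len != 0 && (f->len & (f->len - 1)) == 0);
	return (f->read - f->write - 1) & (f->len - 1);
}

/* How many mbufs are worth allocating: free slots, capped at the burst. */
static unsigned sketch_alloc_count(const struct sketch_fifo *f, unsigned burst)
{
	unsigned free_slots = sketch_fifo_free_count(f);

	return free_slots < burst ? free_slots : burst;
}
```

The cost per call is two index loads, a subtraction, a mask and a
compare, paid even when the ring has room for a full burst.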

Overall I do not disagree with the patch, but I am concerned that it may
cause a performance loss in some cases while improving this one. It would
help a lot if KNI users could test and comment.

For me, applying the patch didn't make any difference in the final
performance numbers, but if there is no objection, I am OK with taking
this patch.


> 
> 1MB w/o patch 204Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 84  734  566         330               17378
> 
> 100MB w/ patch 1.23Gbps
> rx    tx     alloc_call  alloc_call_mt1tx  freembuf_call
> 4258  72466  72466       0                 0
> 
> 1MB w/ patch 203Mbps
> rx  tx   alloc_call  alloc_call_mt1tx  freembuf_call
> 76  734  733         2                 0
> 
> With patch, KNI egress on txq seems to be almost only one packet at a time
> (and in 1MB test, a rare instance of more than 2 packets seen even though
> it is burst read). Also, as it is one mbuf consumed by module and added by
> lib at a time, rte_pktmbuf_free is not called at all, due to right amount
> (1 or 2) of mbufs enqueued in alloc_q.
> 
> This controlled enqueue on alloc_q avoids nw stall for i40e in ppc64le.
> Could you please check if i40e is able to handle data at order of 10GiB in
> your arch, as I see that, network stalls at some random point w/o this patch.
> 
> Thanks,
> Gowrishankar
> 
>>> With this patch, we could stop random network stall in KNI at higher packet
>>> rate (eg 1G or 10G data between vEth0 and PMD) sufficiently exhausting
>>> alloc_q on above condition. I tested i40e PMD for this purpose in ppc64le.
>> If stall happens from NIC to kernel, this is kernel receive path, and
>> alloc_q is in kernel transmit path.
>>
>>> Changes:
>>>  v2 - alloc_q free count calculation corrected.
>>>       line wrap fixed for commit message.
>>>
>>> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>>> ---
>>>  lib/librte_kni/rte_kni.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
>>> index c3f9208..9c5d485 100644
>>> --- a/lib/librte_kni/rte_kni.c
>>> +++ b/lib/librte_kni/rte_kni.c
>>> @@ -624,6 +624,7 @@ struct rte_kni *
>>>  	int i, ret;
>>>  	struct rte_mbuf *pkts[MAX_MBUF_BURST_NUM];
>>>  	void *phys[MAX_MBUF_BURST_NUM];
>>> +	int allocq_free;
>>>  
>>>  	RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, pool) !=
>>>  			 offsetof(struct rte_kni_mbuf, pool));
>>> @@ -646,7 +647,9 @@ struct rte_kni *
>>>  		return;
>>>  	}
>>>  
>>> -	for (i = 0; i < MAX_MBUF_BURST_NUM; i++) {
>>> +	allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
>>> +			& (MAX_MBUF_BURST_NUM - 1);
>>> +	for (i = 0; i < allocq_free; i++) {
>>>  		pkts[i] = rte_pktmbuf_alloc(kni->pktmbuf_pool);
>>>  		if (unlikely(pkts[i] == NULL)) {
>>>  			/* Out of memory */
>>>
> 
>