From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rolette@infiniteio.com>
Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46])
 by dpdk.org (Postfix) with ESMTP id 0E9A19A8D
 for <dev@dpdk.org>; Wed, 25 Feb 2015 14:30:00 +0100 (CET)
Received: by wgha1 with SMTP id a1so3592500wgh.12
 for <dev@dpdk.org>; Wed, 25 Feb 2015 05:30:00 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=ART6i0JfzyWI2ehTzXgkL+fGT6VEoPA0FdcSw1w5rWg=;
 b=FJSQT3EINrcNQ6fOp0nX9p0qqGqDB2Y9bsus28tTcUJnO11FBeTHPO6sViJrJWZ03N
 qaGSDStu1fDS6I4x5rUo1BkGqFqKNsAvGPSP5CReJX9FwZFYdLKK6JzH95YCtdAYOvmi
 tEHQ28jrpmV5yXrI0LjzPPSwsovBG4bVkEA3xIrkGTbck59o3b/lePU8AptZaQic/w7C
 qR9o6AeW2G1eytwwo2bPwkik6zB7E2V9XqwwL5L1R5uOCXPe4D2/hkt92KjXW/bbvdys
 aT6sohHqSpGLn1mwffVnOeZyFtU2LldZQMJgi5IDyP7Her19sK5opMA3FtVl3CvsrG2N
 qH3A==
X-Gm-Message-State: ALoCoQlKlSpo3a5SpV2584RpuM1UQXR1+aCWCRkv+XDsFwZPzKbTPcS3lnvg2DBVlhsA4gFPFIoo
MIME-Version: 1.0
X-Received: by 10.180.82.129 with SMTP id i1mr6267779wiy.77.1424870999899;
 Wed, 25 Feb 2015 05:29:59 -0800 (PST)
Received: by 10.194.184.168 with HTTP; Wed, 25 Feb 2015 05:29:59 -0800 (PST)
In-Reply-To: <54EDC23A.2080302@bisdn.de>
References: <14248648813214-git-send-email-Hemant@freescale.com>
 <54EDBC76.2050507@druidsoftware.com>
 <BY2PR0301MB069369AA0C7436E5F9A63AE9C2170@BY2PR0301MB0693.namprd03.prod.outlook.com>
 <54EDC23A.2080302@bisdn.de>
Date: Wed, 25 Feb 2015 07:29:59 -0600
Message-ID: <CADNuJVqw4hBD1kk1Y4A2+8-5yi0A0K9_5zBQaO3zOnnwXC0nog@mail.gmail.com>
From: Jay Rolette <rolette@infiniteio.com>
To: Marc Sune <marc.sune@bisdn.de>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Cc: DPDK <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 25 Feb 2015 13:30:00 -0000

On Wed, Feb 25, 2015 at 6:38 AM, Marc Sune <marc.sune@bisdn.de> wrote:

>
> On 25/02/15 13:24, Hemant@freescale.com wrote:
>
>> Hi Olivier
>>          Comments inline.
>> Regards,
>> Hemant
>>
>>  -----Original Message-----
>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Deme
>>> Sent: 25/Feb/2015 5:44 PM
>>> To: dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] kni:optimization of rte_kni_rx_burst
>>>
>>> Thank you Hemant, I think there might be one issue left with the patch
>>> though.
>>> The alloc_q must initially be filled with mbufs before getting mbufs
>>> back on the tx_q.
>>>
>>> So the patch should allow rte_kni_rx_burst to check if alloc_q is empty.
>>> If so, it should invoke kni_allocate_mbufs(kni, 0) (to fill the alloc_q
>>> with MAX_MBUF_BURST_NUM mbufs).
>>>
>>> The patch for rte_kni_rx_burst would then look like:
>>>
>>> @@ -575,7 +575,7 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
>>>
>>>        /* If buffers removed, allocate mbufs and then put them into alloc_q */
>>>        if (ret)
>>> -        kni_allocate_mbufs(kni);
>>> +      kni_allocate_mbufs(kni, ret);
>>> +  else if (unlikely(kni->alloc_q->write == kni->alloc_q->read))
>>> +      kni_allocate_mbufs(kni, 0);
>>>
>>>  [hemant]  This will introduce a run-time check.
>>
>> I forgot to include the other change in the patch.
>>   I am doing it in kni_alloc, i.e. initializing the alloc_q with the
>> default burst size.
>>         kni_allocate_mbufs(ctx, 0);
>>
>> In a way, we are now suggesting reducing the size of alloc_q to only
>> the default burst size.
>>
>
> As an aside, I think we should allow tuning of the userspace <-> kernel
> queue sizes (rx_q, tx_q, free_q and alloc_q). Whether this should be a
> build configuration option or a parameter to rte_kni_init() is not
> completely clear to me, but I guess rte_kni_init() is the better option.
>

rte_kni_init() is definitely a better option. It allows things to be tuned
based on individual system config rather than requiring different builds.
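
To make that concrete, here is a rough sketch of what per-queue sizing passed
in at init time could look like. Everything in it is hypothetical: the struct,
the default value and both helpers are made up for illustration and are not
part of the current rte_kni API. It also assumes the queue depths need to be
powers of two, as is common for index-masked FIFOs.

```c
#include <stdbool.h>

/* Hypothetical per-queue depths, in mbuf slots. Not part of the DPDK API. */
struct kni_queue_sizes {
    unsigned rx_q;
    unsigned tx_q;
    unsigned alloc_q;
    unsigned free_q;
};

/* Assumed default depth, used for any field left at 0. */
#define KNI_QUEUE_SIZE_DEFAULT 1024u

static bool is_power_of_two(unsigned n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Replace unset (zero) fields with the assumed default. */
void kni_queue_sizes_apply_defaults(struct kni_queue_sizes *s)
{
    if (s->rx_q == 0)
        s->rx_q = KNI_QUEUE_SIZE_DEFAULT;
    if (s->tx_q == 0)
        s->tx_q = KNI_QUEUE_SIZE_DEFAULT;
    if (s->alloc_q == 0)
        s->alloc_q = KNI_QUEUE_SIZE_DEFAULT;
    if (s->free_q == 0)
        s->free_q = KNI_QUEUE_SIZE_DEFAULT;
}

/* Return 0 if every depth is a power of two, -1 otherwise. */
int kni_queue_sizes_validate(const struct kni_queue_sizes *s)
{
    if (!is_power_of_two(s->rx_q) || !is_power_of_two(s->tx_q) ||
        !is_power_of_two(s->alloc_q) || !is_power_of_two(s->free_q))
        return -1;
    return 0;
}
```

An application would fill in only the fields it cares about, apply the
defaults, validate, and hand the result to whatever init call ends up taking
it.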


> Having said that, the original mail from Hemant described KNI reporting
> an out-of-memory error. To me this indicates that the pool is incorrectly
> dimensioned. Even if KNI does not pre-allocate in the alloc_q, or only
> partially, under high load you will get this same "out of memory" error.
>
> We can reduce the usage of buffers by the KNI subsystem in kernel space
> and in userspace, but the kernel will always need a small cache of
> pre-allocated buffers (coming from user-space), since the KNI kernel module
> does not know where to grab the packets from (which pool). So my guess is
> that the dimensioning problem experienced by Hemant would be the same, even
> with the proposed changes.
>
>
>> Can we reach a situation where the kernel is adding packets to tx_q
>> faster than the application is able to dequeue them?
>>
>
> I think so. We cannot control much how the kernel will schedule the KNI
> thread(s), especially if the number of threads relative to the cores is
> incorrect (not enough), hence we need at least a reasonable amount of
> buffering to prevent early drops caused by those "internal" burst side
> effects.
>
> Marc


Strongly agree with Marc here. We *really* don't want just a single burst's
worth of mbufs available to the kernel in alloc_q. That's just asking for
congestion when there's no need for it.
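
A toy model shows why. This is not the KNI implementation, just a sketch under
simple assumptions: the kernel side wants a fixed number of mbufs from alloc_q
on every tick, the application only gets to refill alloc_q every few ticks,
and any packet that finds alloc_q empty is dropped. The function and parameter
names are made up for illustration.

```c
/*
 * Toy model of alloc_q starvation. Returns the number of packets the
 * "kernel" had to drop because alloc_q was empty.
 *
 * Assumes app_poll_interval > 0 and that the application refills alloc_q
 * completely each time it runs.
 */
unsigned long simulate_drops(unsigned alloc_q_depth,
                             unsigned kernel_pkts_per_tick,
                             unsigned app_poll_interval,
                             unsigned ticks)
{
    unsigned avail = alloc_q_depth;   /* alloc_q starts out full */
    unsigned long drops = 0;

    for (unsigned t = 1; t <= ticks; t++) {
        /* Kernel takes as many mbufs as it can get, up to what it wants. */
        unsigned take = kernel_pkts_per_tick < avail ?
                        kernel_pkts_per_tick : avail;
        avail -= take;
        drops += kernel_pkts_per_tick - take;

        /* The application only gets scheduled every app_poll_interval ticks. */
        if (t % app_poll_interval == 0)
            avail = alloc_q_depth;
    }
    return drops;
}
```

With a depth of one 32-packet burst and the application polling every 4 ticks,
the kernel side starves on 3 ticks out of every 4. With a depth of 1024 under
the same load, nothing is dropped.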

The original problem reported by Olivier is more of a resource tuning
problem than anything else. The number of mbufs you need in the system has
to take into account internal queue depths.
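
As a back-of-the-envelope illustration, the pool has to cover every place an
mbuf can be parked at the same time. The sketch below adds up an upper bound
under assumed numbers. The struct and function are hypothetical (nothing here
comes from the DPDK headers), and it assumes all four KNI queues share one
depth and that each lcore can hold one mempool cache plus one burst in flight.

```c
/* Hypothetical description of where mbufs can sit at any instant. */
struct mbuf_budget {
    unsigned nb_ports;         /* physical ports */
    unsigned queues_per_port;  /* RX/TX queue pairs per port */
    unsigned rx_ring_size;     /* descriptors per NIC RX queue */
    unsigned tx_ring_size;     /* descriptors per NIC TX queue */
    unsigned nb_kni;           /* KNI interfaces */
    unsigned kni_fifo_depth;   /* assumed depth of rx_q, tx_q, alloc_q, free_q */
    unsigned nb_lcores;        /* lcores touching the pool */
    unsigned mempool_cache;    /* per-lcore mempool cache size */
    unsigned burst_size;       /* mbufs in flight per lcore */
};

/* Upper bound on mbufs that can be held outside the pool at once. */
unsigned long mbufs_needed(const struct mbuf_budget *b)
{
    unsigned long nic = (unsigned long)b->nb_ports * b->queues_per_port *
                        ((unsigned long)b->rx_ring_size + b->tx_ring_size);
    /* Four queues per KNI interface: rx_q, tx_q, alloc_q and free_q. */
    unsigned long kni = (unsigned long)b->nb_kni * 4UL * b->kni_fifo_depth;
    unsigned long lcores = (unsigned long)b->nb_lcores *
                           ((unsigned long)b->mempool_cache + b->burst_size);

    return nic + kni + lcores;
}
```

For example, 2 ports with one 512/512 queue pair each, 2 KNI interfaces with
1024-deep queues, and 4 lcores with a 256-entry cache and 32-packet bursts
come out to 2048 + 8192 + 1152 = 11392 mbufs, and the KNI queues are by far
the largest term.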

Jay