From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas@monjalon.net>
To: Tudor Cornea <tudor.cornea@gmail.com>
Cc: ferruh.yigit@intel.com, dev@dpdk.org, padraig.j.connolly@intel.com,
 stephen@networkplumber.org, helin.zhang@intel.com,
 Padraig Connolly <Padraig.J.Connolly@intel.com>
Subject: Re: [PATCH v6] kni: allow configuring the kni thread granularity
Date: Wed, 02 Feb 2022 20:30:19 +0100
Message-ID: <4468595.CvnuH1ECHv@thomas>
In-Reply-To: <20220120124134.4123542-1-tudor.cornea@gmail.com>
References: <1642173499-59396-1-git-send-email-tudor.cornea@gmail.com>
 <20220120124134.4123542-1-tudor.cornea@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"

20/01/2022 13:41, Tudor Cornea:
> The Kni kthreads seem to be re-scheduled at a granularity of roughly
> 1 millisecond right now, which seems to be insufficient for performing
> tests involving a lot of control plane traffic.
> 
> Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
> seems that the existing code cannot reschedule at the desired granularity,
> due to precision constraints of schedule_timeout_interruptible().
> 
> In our use case, we leverage the Linux Kernel for control plane, and
> it is not uncommon to have 60K - 100K pps for some signaling protocols.
> 
> Since we are not in atomic context, the usleep_range() function seems to be
> more appropriate for being able to introduce smaller controlled delays,
> in the range of 5-10 microseconds. Upon reading the existing code, it would
> seem that this was the original intent. Adding sub-millisecond delays
> seems infeasible with a call to schedule_timeout_interruptible().
> 
> #define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
> schedule_timeout_interruptible(
>         usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
> 
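A rough userspace model of why a 5 us request ends up near 1 ms may help here.
This is only a sketch: a tick rate of 1000 Hz is an assumption (it is a kernel
build option, and the ~1 ms RTTs reported below are merely consistent with it),
and model_usecs_to_jiffies() is a hypothetical stand-in that mirrors the
round-up performed by usecs_to_jiffies(), not the kernel function itself.

```c
/* Assumption: tick rate of 1000 Hz; the real value is a kernel build option. */
#define MODEL_HZ 1000UL
#define MODEL_USEC_PER_SEC 1000000UL

/*
 * Hypothetical userspace model of usecs_to_jiffies(): round the request
 * up to a whole number of ticks. Valid when MODEL_HZ divides 1000000.
 */
static unsigned long model_usecs_to_jiffies(unsigned long us)
{
	unsigned long us_per_jiffy = MODEL_USEC_PER_SEC / MODEL_HZ;

	return (us + us_per_jiffy - 1) / us_per_jiffy;
}

/* Shortest timeout, in microseconds, that a jiffies-based sleep can express. */
static unsigned long model_effective_usecs(unsigned long us)
{
	return model_usecs_to_jiffies(us) * (MODEL_USEC_PER_SEC / MODEL_HZ);
}
```

Under these assumptions a 5 us request becomes 1 jiffy, so
model_effective_usecs(5) and model_effective_usecs(1000) both give 1000:
the timeout cannot be shorter than one tick, whatever the constant says.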
> Below, we attempted a brief comparison between the existing implementation,
> which uses schedule_timeout_interruptible() and usleep_range().
> 
> We attempt to measure the CPU usage, and RTT between two Kni interfaces,
> which are created on top of vmxnet3 adapters, connected by a vSwitch.
> 
> insmod rte_kni.ko kthread_mode=single carrier=on
> 
> schedule_timeout_interruptible(usecs_to_jiffies(5))
> kni_single CPU Usage: 2-4 %
> [root@localhost ~]# ping 1.1.1.2 -I eth1
> PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
> 
> usleep_range(5, 10)
> kni_single CPU usage: 50%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
> 
> usleep_range(20, 50)
> kni_single CPU usage: 24%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
> 
> usleep_range(50, 100)
> kni_single CPU usage: 13%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
> 
> usleep_range(100, 200)
> kni_single CPU usage: 7%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
> 
> usleep_range(1000, 1100)
> kni_single CPU usage: 2%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
> 
> Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
> latency and CPU usage to the variant with schedule_timeout_interruptible().
> usleep_range(100, 200) seems to give a decent tradeoff between
> latency and CPU usage, and users can still tweak the limits for
> improved precision if they have such use cases.
> 
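To put numbers on that comparison, here is a small sketch that averages the RTT
samples quoted above. The helper name mean_rtt_ms() and the array names are
made up for illustration; only the sample values come from the ping output.
With five samples per setting, the result is indicative at best.

```c
#include <stddef.h>

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean_rtt_ms(const double *samples, size_t n)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / (double)n;
}

/* RTT samples in ms, copied from the ping output quoted above. */
static const double rtt_sched_timeout[] = { 2.70, 1.00, 1.99, 0.985, 1.00 };
static const double rtt_usleep_100_200[] = { 0.716, 0.167, 0.459, 0.455, 0.252 };
static const double rtt_usleep_1000_1100[] = { 2.22, 1.17, 1.17, 1.17, 1.15 };
```

Calling mean_rtt_ms() on each array with n = 5 gives roughly 1.5 ms for
schedule_timeout_interruptible(), 1.4 ms for usleep_range(1000, 1100) and
0.4 ms for usleep_range(100, 200), which matches the reading above.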
> Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly, seems to lead to a
> softlockup on my kernel.
> 
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 0 PID: 1226 Comm: kni_single Tainted: G        W  O 3.10 #1
>  <IRQ>  [<ffffffff814f84de>] dump_stack+0x19/0x1b
>  [<ffffffff814f7891>] panic+0xcd/0x1e0
>  [<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
>  [<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
>  [<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
>  [<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
>  [<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80
> 
> This patch also attempts to remove this option.
> 
> References:
> [1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt
> 
> Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
> Acked-by: Padraig Connolly <Padraig.J.Connolly@intel.com>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
> v6:
> * Removed tabs and newline in the description of the
>   min_scheduling_interval and max_scheduling_interval
>   parameters. They seem to be non-standard.

The doc had to be updated a bit as well.

Fixed Kni -> KNI and applied, thanks.