From: Thomas Monjalon
To: Tudor Cornea
Cc: ferruh.yigit@intel.com, dev@dpdk.org, padraig.j.connolly@intel.com,
 stephen@networkplumber.org, helin.zhang@intel.com, Padraig Connolly
Subject: Re: [PATCH v6] kni: allow configuring the kni thread granularity
Date: Wed, 02 Feb 2022 20:30:19 +0100
Message-ID: <4468595.CvnuH1ECHv@thomas>
In-Reply-To: <20220120124134.4123542-1-tudor.cornea@gmail.com>
References: <1642173499-59396-1-git-send-email-tudor.cornea@gmail.com>
 <20220120124134.4123542-1-tudor.cornea@gmail.com>
List-Id: DPDK patches and discussions

20/01/2022 13:41, Tudor Cornea:
> The Kni kthreads seem to be re-scheduled at a granularity of roughly
> 1 millisecond right now, which seems to be insufficient for performing
> tests involving a lot of control
> plane traffic.
>
> Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
> seems that the existing code cannot reschedule at the desired granularity,
> due to precision constraints of schedule_timeout_interruptible().
>
> In our use case, we leverage the Linux Kernel for control plane, and
> it is not uncommon to have 60K - 100K pps for some signaling protocols.
>
> Since we are not in atomic context, the usleep_range() function seems to be
> more appropriate for being able to introduce smaller controlled delays,
> in the range of 5-10 microseconds. Upon reading the existing code, it would
> seem that this was the original intent. Adding sub-millisecond delays
> seems unfeasible with a call to schedule_timeout_interruptible().
>
> KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
> schedule_timeout_interruptible(
>         usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
>
> Below, we attempted a brief comparison between the existing implementation,
> which uses schedule_timeout_interruptible() and usleep_range().
>
> We attempt to measure the CPU usage, and RTT between two Kni interfaces,
> which are created on top of vmxnet3 adapters, connected by a vSwitch.
>
> insmod rte_kni.ko kthread_mode=single carrier=on
>
> schedule_timeout_interruptible(usecs_to_jiffies(5))
> kni_single CPU Usage: 2-4 %
> [root@localhost ~]# ping 1.1.1.2 -I eth1
> PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
>
> usleep_range(5, 10)
> kni_single CPU usage: 50%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
>
> usleep_range(20, 50)
> kni_single CPU usage: 24%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
>
> usleep_range(50, 100)
> kni_single CPU usage: 13%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
>
> usleep_range(100, 200)
> kni_single CPU usage: 7%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
>
> usleep_range(1000, 1100)
> kni_single CPU usage: 2%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
>
> Upon testing,
> usleep_range(1000, 1100) seems roughly equivalent in
> latency and cpu usage to the variant with schedule_timeout_interruptible(),
> while usleep_range(100, 200) seems to give a decent tradeoff between
> latency and cpu usage, while allowing users to tweak the limits for
> improved precision if they have such use cases.
>
> Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly seems to lead to a
> softlockup on my kernel.
>
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 0 PID: 1226 Comm: kni_single Tainted: G W O 3.10 #1
> [] dump_stack+0x19/0x1b
> [] panic+0xcd/0x1e0
> [] watchdog_timer_fn+0x160/0x160
> [] __run_hrtimer.isra.4+0x42/0xd0
> [] hrtimer_interrupt+0xe7/0x1f0
> [] smp_apic_timer_interrupt+0x67/0xa0
> [] apic_timer_interrupt+0x6d/0x80
>
> This patch also attempts to remove this option.
>
> References:
> [1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt
>
> Signed-off-by: Tudor Cornea
> Acked-by: Padraig Connolly
> Reviewed-by: Ferruh Yigit
> ---
> v6:
> * Removed tabs and newline in the description of the
>   min_scheduling_interval and max_scheduling_interval
>   parameters. They seem to be non-standard.

The doc had to be updated a bit as well.
Fixed Kni -> KNI and applied, thanks.