From: Thomas Monjalon <thomas@monjalon.net>
To: Tudor Cornea <tudor.cornea@gmail.com>
Cc: ferruh.yigit@intel.com, dev@dpdk.org,
padraig.j.connolly@intel.com, stephen@networkplumber.org,
helin.zhang@intel.com,
Padraig Connolly <Padraig.J.Connolly@intel.com>
Subject: Re: [PATCH v6] kni: allow configuring the kni thread granularity
Date: Wed, 02 Feb 2022 20:30:19 +0100 [thread overview]
Message-ID: <4468595.CvnuH1ECHv@thomas> (raw)
In-Reply-To: <20220120124134.4123542-1-tudor.cornea@gmail.com>
20/01/2022 13:41, Tudor Cornea:
> The Kni kthreads seem to be re-scheduled at a granularity of roughly
> 1 millisecond right now, which seems insufficient for tests involving
> a lot of control plane traffic.
>
> Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
> seems that the existing code cannot reschedule at the desired granularity,
> due to precision constraints of schedule_timeout_interruptible().
>
> In our use case, we leverage the Linux Kernel for control plane, and
> it is not uncommon to have 60K - 100K pps for some signaling protocols.
>
> Since we are not in atomic context, the usleep_range() function seems
> more appropriate for introducing smaller controlled delays, in the range
> of 5-10 microseconds. Upon reading the existing code, it would seem that
> this was the original intent. Adding sub-millisecond delays seems
> unfeasible with a call to schedule_timeout_interruptible().
>
> #define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
> schedule_timeout_interruptible(
>         usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
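>
> For reference, here is a minimal sketch of how the kthread loop could look
> with usleep_range() and the two new module parameters (parameter names are
> taken from the v6 changelog below; the module skeleton and the empty poll
> step are illustrative placeholders, not the actual rte_kni code):
>
> #include <linux/module.h>
> #include <linux/kthread.h>
> #include <linux/delay.h>   /* usleep_range() */
> #include <linux/err.h>
>
> /* defaults mirror the 100-200 us trade-off discussed below */
> static unsigned int min_scheduling_interval = 100; /* us */
> static unsigned int max_scheduling_interval = 200; /* us */
> module_param(min_scheduling_interval, uint, 0644);
> module_param(max_scheduling_interval, uint, 0644);
>
> static struct task_struct *kni_task;
>
> static int kni_thread_single(void *data)
> {
>         while (!kthread_should_stop()) {
>                 /* ... poll all registered KNI interfaces for RX and
>                  * pending requests here (elided in this sketch) ... */
>
>                 /* hrtimer-backed sleep with sub-millisecond granularity,
>                  * unlike schedule_timeout_interruptible(usecs_to_jiffies(5)),
>                  * which rounds up to at least one jiffy (~1 ms at HZ=1000) */
>                 usleep_range(min_scheduling_interval,
>                              max_scheduling_interval);
>         }
>         return 0;
> }
>
> static int __init kni_sketch_init(void)
> {
>         kni_task = kthread_run(kni_thread_single, NULL, "kni_single");
>         if (IS_ERR(kni_task))
>                 return PTR_ERR(kni_task);
>         return 0;
> }
>
> static void __exit kni_sketch_exit(void)
> {
>         kthread_stop(kni_task);
> }
>
> module_init(kni_sketch_init);
> module_exit(kni_sketch_exit);
> MODULE_LICENSE("GPL");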
>
> Below, we attempt a brief comparison between the existing implementation,
> which uses schedule_timeout_interruptible(), and one based on usleep_range().
>
> We measure the CPU usage and the RTT between two Kni interfaces, which
> are created on top of vmxnet3 adapters connected by a vSwitch.
>
> insmod rte_kni.ko kthread_mode=single carrier=on
>
> schedule_timeout_interruptible(usecs_to_jiffies(5))
> kni_single CPU usage: 2-4%
> [root@localhost ~]# ping 1.1.1.2 -I eth1
> PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
>
> usleep_range(5, 10)
> kni_single CPU usage: 50%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
>
> usleep_range(20, 50)
> kni_single CPU usage: 24%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
>
> usleep_range(50, 100)
> kni_single CPU usage: 13%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
>
> usleep_range(100, 200)
> kni_single CPU usage: 7%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
>
> usleep_range(1000, 1100)
> kni_single CPU usage: 2%
> 64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
> 64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
> 64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
>
> Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
> latency and CPU usage to the variant with schedule_timeout_interruptible(),
> while usleep_range(100, 200) gives a decent trade-off between latency and
> CPU usage, and still allows users to tweak the limits for improved
> precision if they have such use cases.
>
> Interestingly, disabling RTE_KNI_PREEMPT_DEFAULT seems to lead to a
> soft lockup on my kernel, presumably because the kthread then never
> sleeps and starves the soft-lockup watchdog.
>
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 0 PID: 1226 Comm: kni_single Tainted: G W O 3.10 #1
> <IRQ> [<ffffffff814f84de>] dump_stack+0x19/0x1b
> [<ffffffff814f7891>] panic+0xcd/0x1e0
> [<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
> [<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
> [<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
> [<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
> [<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80
>
> This patch therefore also removes that option.
>
> References:
> [1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt
>
> Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
> Acked-by: Padraig Connolly <Padraig.J.Connolly@intel.com>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---
> v6:
> * Removed tabs and newline in the description of the
>   min_scheduling_interval and max_scheduling_interval parameters.
>   They seem to be non-standard.
The doc had to be updated a bit as well.
Fixed Kni -> KNI and applied, thanks.