From: Thomas Monjalon
To: Morten Brørup
Cc: Bruce Richardson, dev@dpdk.org, David Marchand, stephen@networkplumber.org
Subject: Re: [PATCH v3 0/2] allow creating thread with real-time priority
Date: Thu, 26 Oct 2023 21:51:07 +0200
Message-ID: <3061820.CbtlEUcBR6@thomas>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9EF9E@smartserver.smartshare.dk>
References: <20231024125416.798897-1-thomas@monjalon.net> <2651241.BddDVKsqQX@thomas> <98CBD80474FA8B44BF855DF32C47DC35E9EF9E@smartserver.smartshare.dk>

26/10/2023 18:50, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Thursday, 26 October 2023 18.07
> >
> > 26/10/2023 17:54, Bruce Richardson:
> > > On Thu, Oct 26, 2023 at 04:59:51PM +0200, Morten Brørup wrote:
> > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > Sent: Thursday, 26 October 2023 16.50
> > > > >
> > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > Sent: Thursday, 26 October 2023 16.31
> > > > > >
> > > > > > 26/10/2023 16:08, Morten Brørup:
> > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > > Sent: Thursday, 26 October 2023 16.05
> > > > > > > >
> > > > > > > > 26/10/2023 15:57, Morten Brørup:
> > > > > > > > > > From: Morten Brørup [mailto:mb@smartsharesystems.com]
> > > > > > > > > > Sent: Thursday, 26 October 2023 15.45
> > > > > > > > > >
> > > > > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > > > > > Sent: Thursday, 26 October 2023 15.37
> > > > > > > > > > >
> > > > > > > > > > > 25/10/2023 18:31, Thomas Monjalon:
> > > > > > > > > > > > Real-time thread priority was forbidden on Unix
> > > > > > > > > > > > because of the problems it can cause.
> > > > > > > > > > > > Warnings and helpers are added to avoid deadlocks,
> > > > > > > > > > > > so real-time can be allowed on all systems.
> > > > > > > > > > >
> > > > > > > > > > > Unit test is failing:
> > > > > > > > > > > DPDK:fast-tests / threads_autotest TIMEOUT 600.01 s
> > > > > > > > > > >
> > > > > > > > > > > It is seen in only 1 target (maybe the failure occurrence is random):
> > > > > > > > > > > Debian 11 (Buster) (ARM) | PASS
> > > > > > > > > > > Fedora 37 (ARM)          | PASS
> > > > > > > > > > > CentOS Stream 9 (ARM)    | FAIL
> > > > > > > > > > > Fedora 38 (ARM)          | PASS
> > > > > > > > > > > Fedora 38 (ARM Clang)    | PASS
> > > > > > > > > > > Ubuntu 20.04 (ARM)       | PASS
> > > > > > > > > > >
> > > > > > > > > > > I need to send a v4 with a new implementation and better comments.
> > > > > > > > > > > The Unix sleep will be upgraded from 1 ns to 1 us in case it makes
> > > > > > > > > > > a difference.
> > > > > > > > > >
> > > > > > > > > > It will not make a difference. The kernel will go through the
> > > > > > > > > > sleeping steps, then wake up again and see the real-time thread
> > > > > > > > > > is ready to run, and then immediately schedule it.
> > > > > > > > > >
> > > > > > > > > > For testing purposes, consider sleeping 10 milliseconds or
> > > > > > > > > > something significant like that.
> > > > > > > > >
> > > > > > > > > A few more details...
> > > > > > > > >
> > > > > > > > > In our recent tests, nanosleep() itself took around 50 us. So you
> > > > > > > > > need to sleep longer than that for your thread not to be runnable
> > > > > > > > > when the nanosleep() wakes up again, because 50 us has already
> > > > > > > > > passed in "nanosleep overhead".
> > > > > > > > > 10 milliseconds provides plenty of margin, and corresponds to
> > > > > > > > > 10 jiffies on a 1000 Hz kernel. (I don't know if it makes any
> > > > > > > > > difference for the kernel scheduler if the timer crosses a jiffy
> > > > > > > > > border or not.)
> > > > > > > >
> > > > > > > > 10 ms looks like an eternity.
> > > > > > >
> > > > > > > Agree. It is only for functional testing, not for production!
> > > > > >
> > > > > > Realtime thread won't make any sense if we have to insert a long sleep.
> > > > >
> > > > > It seems David came to our rescue here!
> > > > >
> > > > > I have just tried running our test again with prctl(PR_SET_TIMERSLACK)
> > > > > of 1 ns, and the nanosleep(1 ns) delay dropped from ca. 50 us to
> > > > > ca. 2.5 us.
> > > > >
> > > > > The timeout parameter to epoll_wait() is in milliseconds, which is
> > > > > useless for low-latency.
> > > > > Perhaps real-time threads can be used with epoll() combined with
> > > > > timerfd for nanosecond resolution timeout.
> > > >
> > > > Or epoll_pwait2(), which has nanosecond resolution timeout.
> > > >
> > > > Unfortunately, rte_epoll_wait() is not an experimental API anymore,
> > > > so we cannot change its timeout parameter from milliseconds to
> > > > micro- or nanoseconds. We would have to introduce a new API for this.
> > > >
> > >
> > > Just an idea - can we change the timeout parameter to float rather
> > > than int, and then use function versioning for backward compatibility
> > > for any binaries passing int?
> > > That way the actual meaning of the parameter doesn't change, but it
> > > still allows sub-millisecond values (albeit with some loss of accuracy
> > > due to float).
>
> Too exotic for my taste. I would rather introduce rte_epoll_wait_ns()
> with timeout in nanoseconds than pass a float.
>
> >
> > Sorry I'm not following why you want to use rte_epoll_wait()?
>
> I don't have experience with it yet, but it seems to be the official
> DPDK API for blocking I/O system calls.
>
> >
> > If the realtime thread has some blocking system calls,
> > no sleep is needed I think.
>
> Correct.
>
> > For an average realtime thread, I suggest the API
> > rte_thread_yield_realtime()
> > which could wait for 1 ms or less by default.
>
> If we introduce a yield() function, it should yield according to the
> O/S scheduling policy, e.g. the rest of the time slice allocated to the
> thread by the O/S scheduler (although that might not be available for
> real-time prioritized threads in Linux). I don't think we can make it
> O/S agnostic.
>
> I don't think it should wait a fixed amount of time - we already have
> rte_delay_us_sleep() for that.
>
> In my experiments with power saving, I ended up with a varying sleep
> duration, depending on traffic load.
> The l3fwd-power example also uses a varying sleep duration depending
> on traffic load.

I feel you have lost sight of the goal here: letting other threads
(especially kernel threads) get scheduled.
So we don't care about the sleep duration at all,
except that we want it as short as possible while still allowing scheduling.

> > For smaller sleep, we can use PR_SET_TIMERSLACK and
> > rte_delay_us_sleep().
>
> Agree.
>
> > If we provide an API for PR_SET_TIMERSLACK, we could adapt the duration
> > of rte_thread_yield_realtime() dynamically after calling prctl().
> >
>
> I'm not sure exposing an API for PR_SET_TIMERSLACK is the right solution.
>
> I would rather have the EAL set the timer slack to minimum (1 ns) at
> EAL initialization. An EAL command line parameter could be added to
> change the default from 1 ns.
>
> Also, something similar needs to be done for Windows.

Windows should be fine with Sleep(0).
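
For reference, the timer slack experiment discussed above could be reproduced with something like the sketch below. It is Linux-only and uses plain prctl(), nanosleep() and clock_gettime(); the helper names (set_timer_slack_ns, get_timer_slack_ns, measure_nanosleep_ns) are made up for illustration and are not DPDK APIs.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/prctl.h>
#include <time.h>

/* Hypothetical helper: set the calling thread's timer slack in nanoseconds.
 * Returns 0 on success, -1 on failure (errno set by prctl()).
 * Note: passing 0 resets the slack to the thread's default value. */
static int set_timer_slack_ns(unsigned long slack_ns)
{
	return prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0);
}

/* Hypothetical helper: read back the calling thread's timer slack (ns). */
static int get_timer_slack_ns(void)
{
	return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}

/* Hypothetical helper: elapsed monotonic time, in nanoseconds,
 * spent in a single nanosleep(request_ns) call. */
static int64_t measure_nanosleep_ns(long request_ns)
{
	struct timespec req = { .tv_sec = 0, .tv_nsec = request_ns };
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	nanosleep(&req, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
		+ (t1.tv_nsec - t0.tv_nsec);
}
```

Calling measure_nanosleep_ns(1) before and after set_timer_slack_ns(1) in the same thread should show whether the slack change matters on a given system; the numbers will vary with kernel and hardware. Timer slack is a per-thread setting, and prctl(2) states that it is not applied to threads running under a real-time scheduling policy, so the measurement is best taken in the thread and policy of interest.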