From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id BBCAF4320A; Thu, 26 Oct 2023 18:50:26 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 37AAF40A76; Thu, 26 Oct 2023 18:50:26 +0200 (CEST) Received: from dkmailrelay1.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 0551E40A6D for ; Thu, 26 Oct 2023 18:50:25 +0200 (CEST) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesys.local [192.168.4.10]) by dkmailrelay1.smartsharesystems.com (Postfix) with ESMTP id D7F8E20899; Thu, 26 Oct 2023 18:50:24 +0200 (CEST) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [PATCH v3 0/2] allow creating thread with real-time priority X-MimeOLE: Produced By Microsoft Exchange V6.5 Date: Thu, 26 Oct 2023 18:50:23 +0200 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9EF9E@smartserver.smartshare.dk> In-Reply-To: <2651241.BddDVKsqQX@thomas> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PATCH v3 0/2] allow creating thread with real-time priority Thread-Index: AdoIJnytCdylAY0uQ9mZdjeaQmv+0AAA38Vw References: <20231024125416.798897-1-thomas@monjalon.net> <98CBD80474FA8B44BF855DF32C47DC35E9EF9D@smartserver.smartshare.dk> <2651241.BddDVKsqQX@thomas> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Thomas Monjalon" , "Bruce Richardson" Cc: "David Marchand" , , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Thomas Monjalon [mailto:thomas@monjalon.net] > Sent: Thursday, 26 October 2023 18.07 >=20 > 26/10/2023 17:54, Bruce Richardson: > > On Thu, Oct 26, 2023 at 04:59:51PM +0200, Morten Br=F8rup wrote: > > > > From: Morten Br=F8rup [mailto:mb@smartsharesystems.com] > > > > Sent: Thursday, 26 October 2023 16.50 > > > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > Sent: Thursday, 26 October 2023 16.31 > > > > > > > > > > 26/10/2023 16:08, Morten Br=F8rup: > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > > > Sent: Thursday, 26 October 2023 16.05 > > > > > > > > > > > > > > 26/10/2023 15:57, Morten Br=F8rup: > > > > > > > > > From: Morten Br=F8rup = [mailto:mb@smartsharesystems.com] > > > > > > > > > Sent: Thursday, 26 October 2023 15.45 > > > > > > > > > > > > > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > > > > > > > Sent: Thursday, 26 October 2023 15.37 > > > > > > > > > > > > > > > > > > > > 25/10/2023 18:31, Thomas Monjalon: > > > > > > > > > > > Real-time thread priority was been forbidden on = Unix > > > > > > > > > > > because of problems they can cause. > > > > > > > > > > > Warnings and helpers are added to avoid deadlocks, > > > > > > > > > > > so real-time can be allowed on all systems. > > > > > > > > > > > > > > > > > > > > Unit test is failing: > > > > > > > > > > DPDK:fast-tests / threads_autotest TIMEOUT = 600.01 > s > > > > > > > > > > > > > > > > > > > > It is seen in only 1 target (maybe the failure > occurence is > > > > random): > > > > > > > > > > Debian 11 (Buster) (ARM) | PASS > > > > > > > > > > Fedora 37 (ARM) | PASS > > > > > > > > > > CentOS Stream 9 (ARM) | FAIL > > > > > > > > > > Fedora 38 (ARM) | PASS > > > > > > > > > > Fedora 38 (ARM Clang) | PASS > > > > > > > > > > Ubuntu 20.04 (ARM) | PASS > > > > > > > > > > > > > > > > > > > > I need to send a v4 with new implementation and = better > comments. > > > > > > > > > > The Unix sleep will be upgraded from 1 ns to 1 us in > case it makes > > > > a > > > > > > > > > > difference. > > > > > > > > > > > > > > > > > > It will not make a difference. The kernel will go > through the > > > > sleeping > > > > > > > steps, > > > > > > > > > then wake up again and see the real-time thread is = ready > to run, and > > > > > then > > > > > > > > > immediately schedule it. > > > > > > > > > > > > > > > > > > For testing purposes, consider sleeping 10 = milliseconds > or something > > > > > > > > > significant like that. > > > > > > > > > > > > > > > > A bit more details... > > > > > > > > > > > > > > > > In our recent tests, nanosleep() itself took around 50 = us. > So you need > > > > > to > > > > > > > sleep longer than that for your thread not to be runnable > when the > > > > > nanosleep() > > > > > > > wakes up again, because 50 us has already passed in > "nanosleep > > > > overhead". > > > > > > > > 10 milliseconds provides plenty of margin, and = corresponds > to 10 > > > > jiffies > > > > > on > > > > > > > a 1000 Hz kernel. (I don't know if it makes any difference > for the > > > > kernel > > > > > > > scheduler if the timer crosses a jiffy border or not.) > > > > > > > > > > > > > > 10 ms looks like an eternity. > > > > > > > > > > > > Agree. It is only for functional testing, not for = production! > > > > > > > > > > Realtime thread won't make any sense if we have to insert a = long > sleep. > > > > > > > > It seems David came to our rescue here! > > > > > > > > I have just tried running our test again with > prctl(PR_SET_TIMERSLACK) of 1 > > > > ns, and the nanosleep(1 ns) delay dropped from ca. 50 us to ca. > 2.5 us. > > > > > > > > The timeout parameter to epoll_wait() is in milliseconds, which = is > useless for > > > > low-latency. > > > > Perhaps real-time threads can be used with epoll() combined with > timerfd for > > > > nanosecond resolution timeout. > > > > > > Or epoll_pwait2(), which has nanosecond resolution timeout. > > > > > > Unfortunately, rte_epoll_wait() is not an experimental API = anymore, > so we cannot change its timeout parameter from milliseconds to micro- = or > nanoseconds. We would have to introduce a new API for this. > > > > > > > Just an idea - can we change the timeout parameter to float rather > than int, > > and then use function versioning for backward compatibility for any > > binaries passing int? > > That way the actual meaning of the parameter doesn't change, but it > still > > allows sub-millisecond values (all-be-it with some loss of accuracy > due to > > float). Too exotic for my taste. I would rather introduce rte_epoll_wait_ns() = with timeout in nanoseconds than pass a float. >=20 > Sorry I'm not following why you want to use rte_epoll_wait()? I don't have experience with it yet, but it seems to be the official = DPDK API for blocking I/O system call. >=20 > If the realtime thread has some blocking system calls, > no sleep is needed I think. Correct. > For average realtime thread, I suggest the API > rte_thread_yield_realtime() > which could wait for 1 ms or less by default. If we introduce a yield() function, it should yield according to the O/S = scheduling policy, e.g. the rest of the time slice allocated to the = thread by the O/S scheduler (although that might not be available for = real-time prioritized threads in Linux). I don't think we can make it = O/S agnostic. I don't think it should wait a fixed amount of time - we already have = rte_delay_us_sleep() for that. In my experiments with power saving, I ended up with a varying sleep = duration, depending on traffic load. The l3fwd-power example also uses a = varying sleep duration depending on traffic load. > For smaller sleep, we can use PR_SET_TIMERSLACK and > rte_delay_us_sleep(). Agree. > If we provide an API for PR_SET_TIMERSLACK, we could adapt the = duration > of rte_thread_yield_realtime() dynamically after calling prctl(). >=20 I'm not sure exposing an API for PR_SET_TIMERSLACK is the right = solution. I would rather have the EAL set the timer slack to minimum (1 ns) at EAL = initialization. An EAL command line parameter could be added to change = the default from 1 ns. Also, something similar needs to be done for Windows.