DPDK usage discussions
From: Carsten Andrich <carsten.andrich@tu-ilmenau.de>
To: users@dpdk.org
Subject: Re: Large interruptions for EAL thread running on isol core
Date: Thu, 23 Jun 2022 21:03:49 +0200
Message-ID: <9a95a560-c7c6-d4fa-1041-836fa400f497@tu-ilmenau.de>
In-Reply-To: <CAO8pfF=yo37Ng5QAcopL=6LgKMHr4+=pPcONPATyS+OD0GLgfg@mail.gmail.com>


I've been working on this issue on and off for more than two years now.
Unfortunately, it is not easy to solve, and substantiated information on
it is hard to come by.

My current, **empirically founded** solution to get interfering 
interrupts down as much as possible is the following (last tested about 
a year ago with kernel ~5.10):

 1. Don't run anything on CPU 0. It's the boot CPU and may run some
    Linux voodoo that cannot be transferred to other CPUs. I read this
    somewhere, but haven't been able to substantiate it. Also, disable
    SMT or keep the SMT siblings isolated, too, to avoid contention on
    cache and memory. Both may be superstition, but I don't see any
    potential downside to either.
 2. Use real-time priority (SCHED_FIFO w/ priority 99) for the DPDK
    threads and run
    echo -1 > /proc/sys/kernel/sched_rt_runtime_us
    to disable the RT runtime limit. By default the kernel throttles
    real-time tasks to 950 ms of every 1 s period, so with the limit
    in place a busy-polling SCHED_FIFO thread will perform
    significantly worse than under SCHED_OTHER.
 3. Use the following kernel command line (similar to yours; isolates
    all but 2 cores on a 12 core machine):
    isolcpus=nohz,domain,2-11,14-23 nohz_full=2-11,14-23
    rcu_nocbs=2-11,14-23 rcu_nocb_poll irqaffinity=0-1,12-13
    Note that this only works properly with a NO_HZ kernel. The kernel
    boot log should contain an error message if the above parameters
    are used on a non-NO_HZ kernel.
    I compile my kernel with the following config options enabled:
    NO_HZ_FULL, RCU_NOCB_CPU, CONTEXT_TRACKING_FORCE
 4. Force remaining interrupts off the isolated CPUs by first taking
    all isolated CPUs offline and then bringing them back online (some
    interrupts may remain on the isolated CPUs regardless of
    irqaffinity=; mileage may vary). CPU_LIST is a space-separated
    list of the isolated CPU numbers:
    for CPU in $CPU_LIST; do
        echo 0 > /sys/devices/system/cpu/cpu$CPU/online
    done
    for CPU in $CPU_LIST; do
        echo 1 > /sys/devices/system/cpu/cpu$CPU/online
    done
 5. Check via /proc/interrupts which interrupts still occur on the
    isolated CPUs while your DPDK program is running. I've had issues
    with some hardware drivers' interrupts (e.g., RAID controllers)
    refusing to be kicked off the isolated CPUs despite all of the
    above. If that happens, try moving your sensitive threads to
    different CPUs.
 6. Despair. Even if you succeed in getting all hardware interrupts
    disabled, the kernel will still occasionally interrupt your program,
    e.g., for some accounting business.
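
The check in step 5 can be scripted. Below is a minimal sketch, assuming bash and awk on Linux and the usual /proc/interrupts layout (a CPUn header row, then one row per interrupt source with one counter per CPU). The function names expand_cpu_list and irqs_on_cpus are made up for illustration and are not part of DPDK or the kernel.

```shell
#!/usr/bin/env bash
# Sketch: list interrupt sources with non-zero counters on isolated CPUs.
# Function names are illustrative only (not from the original post).

# Expand a kernel-style CPU list such as "2-4,7" into "2 3 4 7".
expand_cpu_list() {
    local part i out=""
    local IFS=','
    for part in $1; do
        if [[ $part == *-* ]]; then
            for ((i = ${part%-*}; i <= ${part#*-}; i++)); do
                out+=" $i"
            done
        else
            out+=" $part"
        fi
    done
    echo "${out# }"
}

# Print "<irq> <cpu> <count>" for every non-zero counter on the given CPUs.
# $1: CPU list, $2: interrupts file (defaults to /proc/interrupts).
irqs_on_cpus() {
    local cpus
    cpus=$(expand_cpu_list "$1")
    awk -v cpus="$cpus" '
        BEGIN { n = split(cpus, want, " ") }
        NR == 1 {
            # Header row: CPU0 CPU1 ... ; data rows carry one extra leading field.
            for (i = 1; i <= NF; i++) { id = $i; sub(/^CPU/, "", id); col[id] = i + 1 }
            next
        }
        {
            irq = $1; sub(/:$/, "", irq)
            for (k = 1; k <= n; k++) {
                c = col[want[k]]
                if (c != "" && c <= NF && $c ~ /^[0-9]+$/ && $c + 0 > 0)
                    print irq, want[k], $c
            }
        }
    ' "${2:-/proc/interrupts}"
}

# Example call (counters are cumulative since boot, so compare two runs):
#   irqs_on_cpus "2-11,14-23"
```

Since the counters only ever grow, it is the difference between two snapshots taken while the DPDK program runs that shows which sources are still firing on the isolated CPUs.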

The last point is where intervention from the kernel side is required. 
Work on that has been underway for several years [1,2,3], but nothing 
has been mainlined yet.

Hope this helps. We've been able to get worst-case packet forwarding 
jitter down to less than 10 µs, with anything above 3 µs being very rare 
(see attached histogram; your mileage may vary; measured by comparing 
TSC values between DPDK rx and tx calls).
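
For post-processing such measurements, here is a rough sketch that buckets TSC deltas into 1 µs bins. It assumes the application has logged one delta in cycles per line and that the TSC frequency in Hz is known (e.g., as reported by rte_get_tsc_hz()). The function name tsc_histogram and the input format are hypothetical; this is not the tooling that produced the attached histogram.

```shell
#!/usr/bin/env bash
# Sketch: turn TSC deltas (cycles, one per line) into a 1 µs histogram.
# $1: TSC frequency in Hz (assumed known), $2: input file (defaults to stdin).
# Output: "<bin start in µs> <sample count>", empty bins omitted.
tsc_histogram() {
    awk -v hz="$1" '
        /^[0-9]+$/ {
            us = int($1 * 1e6 / hz)     # whole microseconds, rounded down
            bins[us]++
            if (us > max) max = us
        }
        END {
            for (b = 0; b <= max; b++)
                if (bins[b] > 0) printf "%d %d\n", b, bins[b]
        }
    ' "${2:-/dev/stdin}"
}

# Example call with a hypothetical 3 GHz TSC and a hypothetical log file:
#   tsc_histogram 3000000000 deltas.txt
```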

Best regards,
Carsten

[1] https://lwn.net/Articles/659490/
[2] https://lwn.net/Articles/816298/
[3] https://lore.kernel.org/lkml/20220315153132.717153751@fedora.localdomain/

On 23.06.22 20:03, Antonio Di Bacco wrote:
> I'm running a DPDK thread on an isolated core. I also set some flags
> that could help keep the core at rest on Linux, like: nosoftlockup
> nohz_full rcu_nocbs irqaffinity.
>
> Unfortunately the thread gets some interruptions that stop the thread
> for about 20-30 microseconds. This seems small but my application
> suffers a lot.
>
> I also tried to use rte_thread_set_priority, which indeed has a strong
> effect but unfortunately creates problems for Linux (like network not
> working).
>
> Is there any other knob that could help running the DPDK thread with
> minimum or no interruptions at all?

[-- Attachment #2: DPDK_isol_rx_tx_delay.png --]
[-- Type: image/png, Size: 24571 bytes --]
