From: Stephen Hemminger <stephen@networkplumber.org>
To: Willy Tarreau <w@1wt.eu>
Cc: Andy Lutomirski <luto@kernel.org>,
dev@dpdk.org, Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [dpdk-dev] Please stop using iopl() in DPDK
Date: Mon, 28 Oct 2019 09:42:53 -0700
Message-ID: <20191028094253.054fbf9c@hermes.lan>
In-Reply-To: <20191025064225.GA22917@1wt.eu>
On Fri, 25 Oct 2019 08:42:25 +0200
Willy Tarreau <w@1wt.eu> wrote:
> Hi Andy,
>
> On Thu, Oct 24, 2019 at 09:45:56PM -0700, Andy Lutomirski wrote:
> > Hi all-
> >
> > Supporting iopl() in the Linux kernel is becoming a maintainability
> > problem. As far as I know, DPDK is the only major modern user of
> > iopl().
> >
> > After doing some research, DPDK uses direct io port access for only a
> > single purpose: accessing legacy virtio configuration structures.
> > These structures are mapped in IO space in BAR 0 on legacy virtio
> > devices.
> >
> > There are at least three ways you could avoid using iopl(). Here they
> > are in rough order of quality in my opinion:
> (...)
>
> I'm just wondering, why wouldn't we introduce a sys_ioport() syscall
> to perform I/Os in the kernel without having to play at all with iopl()/
> ioperm() ? That would alleviate the need for these large port maps.
> Applications that use outb/inb() usually don't need extreme speeds.
> Each time I had to use them, it was to access a watchdog, a sensor, a
> fan, control a front panel LED, or read/write to NVRAM. Some userland
> drivers possibly don't need much more, and very likely run with
> privileges turned on all the time, so replacing their inb()/outb() calls
> would mostly be a matter of redefining them using a macro to use the
> syscall instead.
>
> I'd see an API more or less like this :
>
> int ioport(int op, u16 port, long val, long *ret);
>
> <op> would take values such as INB,INW,INL to fill *<ret>, OUTB,OUTW,OUTL
> to read from <val>, possibly ORB,ORW,ORL to read, or with <val>, write
> back and return previous value to <ret>, ANDB/W/L, XORB/W/L to do the
> same with and/xor, and maybe a TEST operation to just validate support
> at start time and replace ioperm/iopl so that subsequent calls do not
> need to check for errors. Applications could then replace :
>
> ioperm() with ioport(TEST,port,0,0)
> iopl() with ioport(TEST,0,0,0)
> outb() with ioport(OUTB,port,val,0)
> inb() with ({ long val;ioport(INB,port,0,&val);(char)val;})
>
> ... and so on.
>
> And then ioperm/iopl can easily be dropped.
>
> Maybe I'm overlooking something ?
> Willy
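
For illustration, the interface proposed above could be mocked in user space roughly as follows. This is only a sketch under stated assumptions: `ioport_mock()`, the `IOPORT_*` opcode names and values, the `fake_ports` array, and the `mock_inb()` wrapper are all hypothetical, no such syscall exists, and the array stands in for real hardware so the code needs no privileges. Only a subset of the suggested operations is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical opcodes: names follow the mail, values are made up. */
enum ioport_op {
    IOPORT_TEST,
    IOPORT_INB, IOPORT_INW, IOPORT_INL,
    IOPORT_OUTB, IOPORT_OUTW, IOPORT_OUTL,
    IOPORT_ORB, IOPORT_ANDB, IOPORT_XORB,
};

/* Simulated 64 KiB I/O port space standing in for real hardware. */
static uint8_t fake_ports[65536];

/* Access width in bytes for an opcode, 0 for TEST, -1 if unknown. */
static int op_width(int op)
{
    switch (op) {
    case IOPORT_TEST:                        return 0;
    case IOPORT_INB: case IOPORT_OUTB:
    case IOPORT_ORB: case IOPORT_ANDB:
    case IOPORT_XORB:                        return 1;
    case IOPORT_INW: case IOPORT_OUTW:       return 2;
    case IOPORT_INL: case IOPORT_OUTL:       return 4;
    default:                                 return -1;
    }
}

/* Mock of: int ioport(int op, u16 port, long val, long *ret); */
int ioport_mock(int op, uint16_t port, long val, long *ret)
{
    int width = op_width(op);
    uint32_t u = (uint32_t)val;
    uint32_t v = 0;
    uint8_t old;
    int i;

    if (width < 0)
        return -EINVAL;                      /* unknown operation */
    if ((uint32_t)port + (uint32_t)width > sizeof(fake_ports))
        return -EINVAL;                      /* access runs past port space */

    switch (op) {
    case IOPORT_TEST:                        /* only validates support */
        return 0;
    case IOPORT_INB: case IOPORT_INW: case IOPORT_INL:
        if (!ret)
            return -EFAULT;
        for (i = 0; i < width; i++)          /* little-endian, as on x86 */
            v |= (uint32_t)fake_ports[port + i] << (8 * i);
        *ret = (long)v;
        return 0;
    case IOPORT_OUTB: case IOPORT_OUTW: case IOPORT_OUTL:
        for (i = 0; i < width; i++)
            fake_ports[port + i] = (uint8_t)(u >> (8 * i));
        return 0;
    default:                                 /* ORB, ANDB, XORB */
        old = fake_ports[port];
        if (op == IOPORT_ORB)
            fake_ports[port] = old | (uint8_t)u;
        else if (op == IOPORT_ANDB)
            fake_ports[port] = old & (uint8_t)u;
        else
            fake_ports[port] = old ^ (uint8_t)u;
        if (ret)
            *ret = old;                      /* previous value */
        return 0;
    }
}

/* Drop-in style replacement for inb(), using a long to match *ret. */
static inline uint8_t mock_inb(uint16_t port)
{
    long v = 0;
    int rc = ioport_mock(IOPORT_INB, port, 0, &v);

    assert(rc == 0);                         /* validated earlier via TEST */
    return (uint8_t)v;
}
```

A caller would issue `ioport_mock(IOPORT_TEST, port, 0, 0)` once at startup in place of ioperm(), then use wrappers such as `mock_inb()` without per-call error handling. Every access still costs one kernel entry in the real design, which is the point the reply below objects to.
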
DPDK does not want to make system calls. They kill performance.
With pure user mode access it can reach > 10 million packets/sec;
with a system call per packet that drops to 1 million packets/sec.
Also, adding new system calls might help in the long term,
but users are often on kernels that are at least 5 years behind
upstream.
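
To put the two rates quoted above in per-packet terms, the arithmetic can be sketched as below. The helper names are hypothetical, and the model assumes the whole drop is due to the per-packet system call, which is a simplification.

```c
/* Time available per packet, in nanoseconds, at a given packet rate. */
static double ns_per_packet(double packets_per_sec)
{
    return 1e9 / packets_per_sec;
}

/*
 * Extra per-packet cost implied by a drop from fast_pps to slow_pps,
 * assuming the entire slowdown comes from one added operation per packet.
 */
static double implied_overhead_ns(double fast_pps, double slow_pps)
{
    return ns_per_packet(slow_pps) - ns_per_packet(fast_pps);
}
```

With the figures from this mail, `ns_per_packet(10e6)` is 100 ns and `ns_per_packet(1e6)` is 1000 ns, so `implied_overhead_ns(10e6, 1e6)` comes to roughly 900 ns of added cost per packet.
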