DPDK patches and discussions
From: Andy Lutomirski <luto@kernel.org>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: Willy Tarreau <w@1wt.eu>, Andy Lutomirski <luto@kernel.org>,
	dev@dpdk.org,  Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [dpdk-dev] Please stop using iopl() in DPDK
Date: Mon, 28 Oct 2019 11:00:47 -0700
Message-ID: <CALCETrWvkTyHWy4yWEwWjV+apUZC6kruBgpOG5d-J4QHa0-uAw@mail.gmail.com>
In-Reply-To: <20191028094253.054fbf9c@hermes.lan>

> On Oct 28, 2019, at 10:43 AM, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> On Fri, 25 Oct 2019 08:42:25 +0200
> Willy Tarreau <w@1wt.eu> wrote:
>
>> Hi Andy,
>>
>>> On Thu, Oct 24, 2019 at 09:45:56PM -0700, Andy Lutomirski wrote:
>>> Hi all-
>>>
>>> Supporting iopl() in the Linux kernel is becoming a maintainability
>>> problem.  As far as I know, DPDK is the only major modern user of
>>> iopl().
>>>
>>> After doing some research, DPDK uses direct io port access for only a
>>> single purpose: accessing legacy virtio configuration structures.
>>> These structures are mapped in IO space in BAR 0 on legacy virtio
>>> devices.
>>>
>>> There are at least three ways you could avoid using iopl().  Here they
>>> are in rough order of quality in my opinion:
>> (...)
>>
>> I'm just wondering, why wouldn't we introduce a sys_ioport() syscall
>> to perform I/Os in the kernel without having to play at all with iopl()/
>> ioperm() ? That would alleviate the need for these large port maps.
>> Applications that use outb/inb() usually don't need extreme speeds.
>> Each time I had to use them, it was to access a watchdog, a sensor, a
>> fan, control a front panel LED, or read/write to NVRAM. Some userland
>> drivers possibly don't need much more, and very likely run with
>> privileges turned on all the time, so replacing their inb()/outb() calls
>> would mostly be a matter of redefining them using a macro to use the
>> syscall instead.
>>
>> I'd see an API more or less like this :
>>
>>  int ioport(int op, u16 port, long val, long *ret);
>>
>> <op> would take values such as INB,INW,INL to fill *<ret>, OUTB,OUTW,OUTL
>> to write the value in <val>, possibly ORB,ORW,ORL to read, OR with <val>, write
>> back and return previous value to <ret>, ANDB/W/L, XORB/W/L to do the
>> same with and/xor, and maybe a TEST operation to just validate support
>> at start time and replace ioperm/iopl so that subsequent calls do not
>> need to check for errors. Applications could then replace :
>>
>>    ioperm() with ioport(TEST,port,0,0)
>>    iopl() with ioport(TEST,0,0,0)
>>    outb() with ioport(OUTB,port,val,0)
>>    inb() with ({ long val; ioport(INB,port,0,&val); (char)val; })
>>
>> ... and so on.
>>
>> And then ioperm/iopl can easily be dropped.
>>
>> Maybe I'm overlooking something ?
>> Willy
>
> DPDK does not want to make system calls. That kills performance.
> With pure user-mode access it can reach > 10 million packets/sec;
> with a system call per packet, that drops to 1 million packets/sec.

If you are getting 10 MPPS with an OUT per packet, I’ll buy you a
whole case of beer.

I’m suggesting that, on virtio-legacy, you benchmark the performance
hit of using a syscall to ring the doorbell.  Right now, you're doing
an OUT instruction that traps to the hypervisor, probably gets
emulated, and goes out to whatever host-side driver is running.  The
cost of doing that is going to be quite high, especially on older
machines.  I'm guessing that adding a syscall to the mix won't make
much difference.

--Andy

