DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: <dev@dpdk.org>
Subject: Re: RFC - Tap io_uring PMD
Date: Tue, 5 Nov 2024 10:58:39 -0800
Message-ID: <20241105105839.36c85e8f@hermes.local>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F86A@smartserver.smartshare.dk>

On Sat, 2 Nov 2024 23:28:49 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> > > > Probably the hardest part of using io_uring is figuring out how to
> > > > collect completions. The simplest way would be to handle all
> > > > completions rx and tx in the rx_burst function.
> > >
> > > Please don't mix RX and TX, unless explicitly requested by the
> > > application through the recently introduced "mbuf recycle" feature.
> >
> > The issue is Rx and Tx share a single fd and ioring for completion is
> > per fd. The implementation for ioring came from the storage side so
> > initially it was for fixing the broken Linux AIO support.
> >
> > Some other devices only have single interrupt or ring shared with
> > rx/tx so not unique. Virtio, netvsc, and some NIC's.
> >
> > The problem is that if Tx completes descriptors then there needs to be
> > locking to prevent Rx thread and Tx thread overlapping. And a spin lock
> > is a performance buzz kill.
> 
> Brainstorming a bit here...
> What if the new TAP io_uring PMD is designed to use two io_urings per port, one for RX and another one for TX on the same TAP interface?
> This requires that a TAP interface can be referenced via two file descriptors (one fd for the RX io_uring and another fd for the TX io_uring), e.g. by using dup() to create the additional file descriptor. I don't know if this is possible, and if it works with io_uring.

There are a few problems with multiple fds:
  - Multiple fds pointing to the same internal tap queue are not going to get completed separately.
  - When multi-process is supported, the limit of 253 fds per message in Unix domain IPC comes into play.
  - Tap does not support a Tx-only fd for queues. If an fd is a queue of the tap device, receive fan-out will go to it.
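
To make the first two points concrete, here is a minimal sketch in C. It is an
illustration only, not code from any PMD. It uses a temporary regular file in
place of a tap fd (an assumption, since opening a tap device needs privileges),
and the names dup_shares_offset(), max_queues_per_message() and
SKETCH_SCM_MAX_FD are made up for this example. The value 253 is the Linux
per-message limit for SCM_RIGHTS.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed limit: Linux caps one SCM_RIGHTS message at 253 descriptors.
 * The macro name is local to this sketch. */
#define SKETCH_SCM_MAX_FD 253

/*
 * dup() returns a second descriptor for the SAME open file, not an
 * independent one. A regular file stands in for the tap fd here:
 * a write through fd moves the offset seen through fd2.
 * Returns 1 if the offset is shared, 0 if not, -1 on setup failure.
 */
int dup_shares_offset(void)
{
    char path[] = "/tmp/dup_sketch_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* file disappears once both descriptors are closed */

    int fd2 = dup(fd);
    if (fd2 < 0) {
        close(fd);
        return -1;
    }

    int shared = -1;
    if (write(fd, "abcd", 4) == 4) {
        off_t pos = lseek(fd2, 0, SEEK_CUR); /* query offset via the copy */
        shared = (pos == 4);
    }

    close(fd2);
    close(fd);
    return shared;
}

/* How many queues fit in one fd-passing message for a given number of
 * descriptors per queue. */
unsigned max_queues_per_message(unsigned fds_per_queue)
{
    return fds_per_queue ? SKETCH_SCM_MAX_FD / fds_per_queue : 0;
}
```

With one fd per queue, max_queues_per_message(1) gives 253. With a second fd
per queue, max_queues_per_message(2) gives 126, so the two-fd design halves the
number of queues that can be shared in a single message.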

If DPDK were more flexible, harvesting of completions could be done in another thread, but that is not general enough
to work transparently with all applications. The existing TAP device uses SIGIO, but signals are slower.
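
For reference, harvesting both directions from one thread could look roughly
like the sketch below. It assumes each submission stores the buffer pointer in
user_data and uses the low bit to mark Tx, which only works if buffers are at
least 2-byte aligned. struct mock_cqe is a hypothetical stand-in for the
io_uring completion entry so the example builds without liburing; harvest() and
the tag helpers are also made up here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an io_uring completion entry; only the
 * fields this sketch needs. */
struct mock_cqe {
    uint64_t user_data; /* buffer pointer, low bit set for Tx */
    int32_t res;        /* bytes transferred, or negative errno */
};

#define DIR_TX_FLAG 1ULL

struct harvest_counts {
    unsigned rx;
    unsigned tx;
};

/* Tag helpers: assume buf is at least 2-byte aligned so bit 0 is free. */
static inline uint64_t tag_rx(void *buf)
{
    return (uint64_t)(uintptr_t)buf;
}

static inline uint64_t tag_tx(void *buf)
{
    return (uint64_t)(uintptr_t)buf | DIR_TX_FLAG;
}

/*
 * Walk n completions once and split them by direction. Because a single
 * caller handles both kinds, no lock is needed between Rx and Tx.
 * Failed Rx completions are skipped here; a real driver would have to
 * free or resubmit those buffers.
 */
struct harvest_counts harvest(const struct mock_cqe *cqes, size_t n,
                              void **rx_out, void **tx_done)
{
    struct harvest_counts c = {0, 0};

    for (size_t i = 0; i < n; i++) {
        uint64_t ud = cqes[i].user_data;
        void *buf = (void *)(uintptr_t)(ud & ~DIR_TX_FLAG);

        if (ud & DIR_TX_FLAG)
            tx_done[c.tx++] = buf;  /* transmit finished, buffer reusable */
        else if (cqes[i].res >= 0)
            rx_out[c.rx++] = buf;   /* packet received into this buffer */
    }
    return c;
}
```

Calling something like harvest() only from rx_burst keeps it lock-free, at the
cost of Tx buffers being released only when the application polls Rx.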

