* [dpdk-users] Interrupt mode, queues, own event loop
From: Budiský Jakub @ 2020-09-04 10:24 UTC
To: users
Hi,
I'm working on a project that involves receiving packet bursts; other
than that it's mostly idle. DPDK was incorporated later on (when I
found out that Linux AF_XDP won't do the job) and I use my own C++
implementation of an epoll-based event loop along with eventfd and
timerfd for communication / timeouts.
So I'm trying to use per-queue interrupts in my own event loop with
DPDK. Per-queue is quite important since I'm using the flow director for
load balancing and I'm relying on it. In DPDK 18.11 (I believe) a
new function `rte_eth_dev_rx_intr_ctl_q_get_fd` was introduced just for
this purpose.
I'm currently using the `uio_pci_generic` driver with Intel's 82599ES NIC
for debugging. For production I will switch to `vfio` because the
application runs in userspace.
I've encountered two problems. The first is that I expected
DPDK to pass me eventfd file descriptors. While debugging I found out
that these are, in fact, /dev/uio0 files (I guess these are special
files created by the driver). I don't mind them "being different", but
this raises a few other issues: Is it safe to read them, i.e. does the
`ixgbe_pmd` driver rely on them in any way? Is there a way of
discriminating between the different types of file descriptor I may obtain
other than looking at `/proc/self/fd/<fd_number>`? From the implementation
of `eal_intr_proc_rxtx_intr` it looks like the file descriptors will
differ for the `vfio` driver and I need to read a different amount of
data from them (4 bytes for UIO vs. 8 bytes for VFIO; other sizes may
raise EINVAL).
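For illustration, the `/proc/self/fd` check and the read sizes mentioned above could be combined roughly as follows. This is only a sketch: `intr_read_size` is a hypothetical helper, not a DPDK function, and it assumes that eventfd descriptors show up as `anon_inode:[eventfd]` links and UIO devices as `/dev/uioN` paths. The 4-byte and 8-byte sizes are the ones taken from `eal_intr_proc_rxtx_intr` as described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>   /* eventfd(), handy for trying the helper out */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): look at the symlink target of
 * /proc/self/fd/<fd> and return how many bytes a single read() should
 * request, or 0 if the descriptor type is not recognised.
 */
size_t intr_read_size(int fd)
{
    char path[64];
    char target[256];
    ssize_t n;

    assert(fd >= 0);
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, target, sizeof(target) - 1);
    if (n < 0)
        return 0;
    target[n] = '\0';           /* readlink() does not terminate the string */

    if (strstr(target, "eventfd") != NULL)      /* assumed: "anon_inode:[eventfd]" */
        return 8;                               /* 64-bit counter (VFIO case) */
    if (strncmp(target, "/dev/uio", 8) == 0)    /* assumed: "/dev/uioN" */
        return 4;                               /* 32-bit counter (UIO case) */
    return 0;
}
```

Calling `intr_read_size(fd)` once, when the descriptor is registered with epoll, avoids repeating the `/proc` lookup on every wake-up.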
The second problem is that I've got the same file descriptor for all the
queues, which means it may not be captured by epoll in all relevant
threads. Is this behaviour intended? I recall seeing some limits
regarding the number of interrupt file descriptors but I believe it was
15 for my NIC. I don't mind but I need to change the program's logic to
account for this. Can I read the file descriptor and find out which
queues need to process incoming packets, or do I just wake them all
up? Does this differ (and if so, how) between the `vfio` and
`uio_pci_generic` drivers?
I feel like I may have missed something. Reading
`linux/eal_interrupts.c`, it indeed looks like some eventfd descriptors
are set up, but maybe this matters only if you use the DPDK-encapsulated
event loop. Please let me know if I should call anything besides
`rte_eth_dev_rx_intr_ctl_q_get_fd` and the usual device configuration
functions.
Thanks for any help!
Best regards,
Jakub Budisky
* Re: [dpdk-users] Interrupt mode, queues, own event loop
From: Stephen Hemminger @ 2020-09-04 16:18 UTC
To: Budiský Jakub; +Cc: users
On Fri, 04 Sep 2020 12:24:06 +0200
Budiský Jakub <ibudisky@fit.vutbr.cz> wrote:
> Hi,
>
> I'm working on a project that involves receiving packet bursts; other
> than that it's mostly idle. DPDK was incorporated later on (when I
> found out that Linux AF_XDP won't do the job) and I use my own C++
> implementation of an epoll-based event loop along with eventfd and
> timerfd for communication / timeouts.
>
> So I'm trying to use per-queue interrupts in my own event loop with
> DPDK. Per-queue is quite important since I'm using the flow director for
> load balancing and I'm relying on it. In DPDK 18.11 (I believe) a
> new function `rte_eth_dev_rx_intr_ctl_q_get_fd` was introduced just for
> this purpose.
>
> I'm currently using the `uio_pci_generic` driver with Intel's 82599ES NIC
> for debugging. For production I will switch to `vfio` because the
> application runs in userspace.
>
> I've encountered two problems. The first is that I expected
> DPDK to pass me eventfd file descriptors. While debugging I found out
> that these are, in fact, /dev/uio0 files (I guess these are special
> files created by the driver). I don't mind them "being different", but
> this raises a few other issues: Is it safe to read them, i.e. does the
> `ixgbe_pmd` driver rely on them in any way? Is there a way of
> discriminating between the different types of file descriptor I may obtain
> other than looking at `/proc/self/fd/<fd_number>`? From the implementation
> of `eal_intr_proc_rxtx_intr` it looks like the file descriptors will
> differ for the `vfio` driver and I need to read a different amount of
> data from them (4 bytes for UIO vs. 8 bytes for VFIO; other sizes may
> raise EINVAL).
>
> The second problem is that I've got the same file descriptor for all the
> queues, which means it may not be captured by epoll in all relevant
> threads. Is this behaviour intended? I recall seeing some limits
> regarding the number of interrupt file descriptors but I believe it was
> 15 for my NIC. I don't mind but I need to change the program's logic to
> account for this. Can I read the file descriptor and find out which
> queues need to process incoming packets, or do I just wake them all
> up? Does this differ (and if so, how) between the `vfio` and
> `uio_pci_generic` drivers?
>
> I feel like I may have missed something. Reading
> `linux/eal_interrupts.c`, it indeed looks like some eventfd descriptors
> are set up, but maybe this matters only if you use the DPDK-encapsulated
> event loop. Please let me know if I should call anything besides
> `rte_eth_dev_rx_intr_ctl_q_get_fd` and the usual device configuration
> functions.
The per-queue interrupt functionality for PCI devices is built
on top of MSI-X interrupts. The uio_pci_generic driver you are using
does not support MSI-X.
The UIO driver uses the legacy INTx functionality: when an interrupt
occurs, the device driver in the kernel is called.
For the uio_pci_generic driver this is mapped to the device file descriptor.
For VFIO, you can have one interrupt per queue, and it uses eventfds
to create a per-queue channel.
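A minimal sketch of how such per-queue eventfds could be consumed from an application's own epoll loop. It assumes the descriptors were obtained beforehand (with DPDK, presumably one call to `rte_eth_dev_rx_intr_ctl_q_get_fd` per queue); plain eventfds can stand in for them when trying this out. `watch_queues` and `wait_queue` are hypothetical helper names, not DPDK APIs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Register one descriptor per queue, tagging each with its queue index. */
int watch_queues(int epfd, const int *fds, uint32_t nb_queues)
{
    assert(fds != NULL);
    for (uint32_t q = 0; q < nb_queues; q++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.u32 = q };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[q], &ev) < 0)
            return -1;
    }
    return 0;
}

/*
 * Wait for a single event. Returns the index of the queue whose eventfd
 * fired, or -1 on timeout or error. The 8-byte counter is read so that
 * the descriptor stops being reported as readable.
 */
int wait_queue(int epfd, const int *fds, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t count;
    int q;

    if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
        return -1;
    q = (int)ev.data.u32;
    if (read(fds[q], &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return q;
}
```

Storing the queue index in `data.u32` means the handler never has to look the descriptor up again. A multi-threaded design could instead give each worker its own epoll instance holding only that worker's queue descriptor.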
* Re: [dpdk-users] Interrupt mode, queues, own event loop
From: Budiský Jakub @ 2020-09-05 21:21 UTC
To: Stephen Hemminger; +Cc: users
On 2020-09-04 18:18, Stephen Hemminger wrote:
> The per-queue interrupt functionality for PCI devices is built
> on top of MSI-X interrupts. The uio_pci_generic driver you are using
> does not support MSI-X.
>
> The UIO driver uses the legacy INTx functionality: when an interrupt
> occurs, the device driver in the kernel is called.
> For the uio_pci_generic driver this is mapped to the device file
> descriptor.
>
> For VFIO, you can have one interrupt per queue, and it uses eventfds
> to create a per-queue channel.
Hi,
thanks for the valuable info!
I've now switched to the `vfio` module even for testing and I can
confirm I get a set of separate eventfd file descriptors. I've
encountered a new issue, though, that looks like a bug to me.
Either one of the file descriptors (`--vfio-intr msix`, always the one
associated with the last queue, regardless of the initialization order)
or all of them (`--vfio-intr msi`) are available for reading just once
per application run. I cannot get any follow-up interrupts, although
polling confirms that new packets have arrived.
With `--vfio-intr legacy` I get the same file descriptor for all my
workers, but it is also only triggered once.
As far as I understand, there is no flag to clear for MSI(-X) interrupts,
so I'm not sure what else to try. There is nothing of interest in
the application output (not even with `--log-level lib.eal:debug`).
Thanks again.
Best regards,
Jakub Budisky
* Re: [dpdk-users] Interrupt mode, queues, own event loop
From: Budiský Jakub @ 2020-09-08 15:09 UTC
To: users
On 2020-09-05 23:21, Budiský Jakub wrote:
> Hi,
>
> thanks for the valuable info!
>
> I've now switched to the `vfio` module even for testing and I can
> confirm I get a set of separate eventfd file descriptors. I've
> encountered a new issue, though, that looks like a bug to me.
>
> Either one of the file descriptors (`--vfio-intr msix`, always the one
> associated with the last queue, regardless of the initialization
> order) or all of them (`--vfio-intr msi`) are available for reading
> just once per application run. I cannot get any follow-up interrupts,
> although polling confirms that new packets have arrived.
>
> With `--vfio-intr legacy` I get the same file descriptor for all my
> workers, but it is also only triggered once.
>
> As far as I understand, there is no flag to clear for MSI(-X)
> interrupts, so I'm not sure what else to try. There is nothing of
> interest in the application output (not even with `--log-level
> lib.eal:debug`).
>
> Thanks again.
>
> Best regards,
> Jakub Budisky
Hi,
Just letting you know that I've resolved the aforementioned issues and
it seems to be working with the Intel NIC. In the debugging process I've
also tried a 100 Gb NIC from Mellanox with no luck so far (I can't get
an epoll event from their "infinibandevent" (?) file descriptor, but I
didn't invest too much time into investigating it further). On the
bright side, trying a different NIC pointed me towards some of the
issues so I'm glad for that.
For the record, here are the resolutions (it may help somebody in the
future):
– Only the last queue was interrupted because my Flow Director setup was
flawed and didn't match the incoming packets properly. So that one was
unrelated. Needless to say, the (un)reported errors from the
`rte_flow` API were not too helpful.
– MSI interrupts cannot distinguish between queues, much like the
legacy interrupt mechanism; I confirmed this in the DPDK source code. That
explains the difference in behaviour. So if you want per-queue
interrupts you are pretty much stuck with `vfio` + MSI-X.
– Most importantly, the DPDK RX interrupts were apparently designed
so that an "explicit switch back to the polling mode" is necessary
(at least this seems to be the case for the Intel NIC / ixgbe PMD I'm
using). You won't get another interrupt unless you re-enable them. More
specifically, in my current code I now call
`rte_eth_dev_rx_intr_disable` immediately after receiving the event from
`epoll`, followed by a read from the file descriptor, and
`rte_eth_dev_rx_intr_enable` after I'm done retrieving the packet
bursts. I hope I won't miss an interrupt doing it this way due to a
race condition (a packet arriving between the processing and the
re-enabling of the interrupt).
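One common way to guard against that race is to poll once more after re-enabling the interrupt, and to go back to sleep only if that extra poll comes back empty. The sketch below illustrates the ordering only; it is not taken from DPDK. The `rxq_ops` callbacks are hypothetical stand-ins that would wrap `rte_eth_dev_rx_intr_disable`, `rte_eth_rx_burst` and `rte_eth_dev_rx_intr_enable` for one port/queue pair, and the `sim_*` functions simulate a queue so the logic can be exercised without a NIC. Whether the extra poll is strictly required depends on how the PMD handles causes raised while the interrupt is masked, which is not verified here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical per-queue callbacks (assumed wrappers around the DPDK calls). */
struct rxq_ops {
    void (*intr_disable)(void *ctx);
    void (*intr_enable)(void *ctx);
    unsigned (*rx_burst)(void *ctx);   /* returns packets handled, 0 if empty */
};

/* Poll until the queue reports no more packets. */
static unsigned drain(const struct rxq_ops *ops, void *ctx)
{
    unsigned total = 0, n;

    while ((n = ops->rx_burst(ctx)) > 0)
        total += n;
    return total;
}

/* Call after epoll reported the queue's descriptor as readable. */
unsigned on_rx_event(int fd, const struct rxq_ops *ops, void *ctx)
{
    uint64_t count;
    unsigned total, late;

    assert(ops != NULL);
    ops->intr_disable(ctx);
    if (read(fd, &count, sizeof(count)) < 0) {
        /* EAGAIN on a non-blocking fd only means there was nothing to clear. */
    }
    total = drain(ops, ctx);
    for (;;) {
        ops->intr_enable(ctx);
        late = ops->rx_burst(ctx);     /* anything that slipped in before re-arming? */
        if (late == 0)
            break;                     /* queue empty with interrupt armed: safe to sleep */
        ops->intr_disable(ctx);
        total += late + drain(ops, ctx);
    }
    return total;
}

/* Simulated queue, for demonstration only. */
struct sim_queue { unsigned pending; unsigned late; int intr_on; };

static void sim_disable(void *c) { ((struct sim_queue *)c)->intr_on = 0; }

static void sim_enable(void *c)
{
    struct sim_queue *q = c;

    q->intr_on = 1;
    q->pending += q->late;             /* packets arriving just before re-arming */
    q->late = 0;
}

static unsigned sim_burst(void *c)
{
    struct sim_queue *q = c;
    unsigned n = q->pending < 4 ? q->pending : 4;   /* burst size of 4 */

    q->pending -= n;
    return n;
}

const struct rxq_ops sim_ops = { sim_disable, sim_enable, sim_burst };
```

With the simulated queue, packets that "arrive" during the re-enable step are still picked up by the extra poll, and the function only returns once the queue is empty with the interrupt armed.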
Thanks again for the help I've got.
Best regards,
Jakub Budisky