DPDK usage discussions
From: Fabio Fernandes <boicotinho@proton.me>
To: Ivan Malov <ivan.malov@arknetworks.am>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: Net/igc's rte_eth_rx_burst() never returns packets
Date: Fri, 02 Aug 2024 12:57:56 +0000
Message-ID: <m9UXKZ6j4fA67WCmA1K5dTewKxRoNN_G__CFo3XN5OHu6nxKkxdnCWCap_-1ZHyEOdkcrPNv6eXNV9xGDlIETDUmjFI0z94Hws6ZeulCzPk=@proton.me>
In-Reply-To: <b3ef1113-14aa-3425-75bc-872daee4be2c@arknetworks.am>

Hi Ivan,

I'm using igb_uio because it's the one recommended for net/ena, the PMD for my target network card.

I've tried both vfio-pci and uio_pci_generic, but they fail for different reasons.

With vfio-pci, EAL tells me:

```
EAL: PCI device 0000:09:00.0 on NUMA socket -1
EAL:   probe driver: 8086:15f3 net_igc
EAL: 0000:09:00.0 VFIO group is not viable! Not all devices in IOMMU group bound to VFIO or unbound
EAL: Requested device 0000:09:00.0 cannot be used
```

I tried adding kernel boot parameter `iommu=on` with no luck.
I also tried unbinding my other cards:

```
Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=vfio-pci unused=igc,uio_pci_generic

Other Network devices
=====================
0000:08:00.0 'MT7922 802.11ax PCI Express Wireless Network Adapter 0616' unused=mt7921e,vfio-pci,uio_pci_generic
0000:0a:00.0 'AQtion AQC113CS NBase-T/IEEE 802.3an Ethernet Controller [Antigua 10G] 94c0' unused=atlantic,vfio-pci,uio_pci_generic
```

The result is rte_eth_dev_count_total() == 0, so nothing starts.
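
For reference, one way to see which devices share an IOMMU group with 0000:09:00.0 is to read sysfs. The following is only a minimal sketch, assuming the kernel exposes the group under /sys/bus/pci/devices/<BDF>/iommu_group/devices. The helper names iommu_group_path() and list_iommu_group() are hypothetical and are not part of DPDK. The group may contain devices (for example a PCIe bridge) that do not show up in the network-device listing above, which could explain a "not viable" group even with the other NICs unbound.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path that lists every device sharing an IOMMU group
 * with the PCI device `bdf` (e.g. "0000:09:00.0").
 * Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper for illustration; not a DPDK API.
 */
int
iommu_group_path(const char *bdf, char *buf, size_t len)
{
    int n;

    assert(bdf != NULL && buf != NULL);
    n = snprintf(buf, len,
                 "/sys/bus/pci/devices/%s/iommu_group/devices", bdf);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Print the PCI address of each device in the group.
 * Returns the number of devices, or -1 if the directory cannot be
 * opened (IOMMU disabled, or no such device).
 */
int
list_iommu_group(const char *bdf)
{
    char path[256];
    struct dirent *entry;
    DIR *dir;
    int count = 0;

    if (iommu_group_path(bdf, path, sizeof(path)) != 0)
        return -1;

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue; /* skip "." and ".." */
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_iommu_group("0000:09:00.0") would print one PCI address per line. Any address other than 0000:09:00.0 is a candidate for the device that keeps the group from being viable.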


Finally, I also tried `uio_pci_generic`:

```
Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=uio_pci_generic unused=igc,vfio-pci
```

This time DPDK accepts the device; however, the same dmesg error appears again:

```
[ 1449.570184] uio_pci_generic 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x13397ff80 flags=0x0000]
```
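
To check whether such a fault address lands inside the RX descriptor ring, as it did in the earlier igb_uio run quoted below, something like the following could be used. This is a rough sketch: the function names parse_fault_address() and fault_in_ring() are hypothetical, and the 16-byte descriptor size is an assumption about the igc advanced RX descriptor layout, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed size in bytes of one RX descriptor (assumption for this sketch). */
#define RX_DESC_SIZE 16u

/*
 * Extract the hexadecimal value following "address=" in an AMD-Vi
 * IO_PAGE_FAULT log line. Returns 0 on success, -1 if not found.
 */
int
parse_fault_address(const char *line, uint64_t *addr)
{
    const char *p;
    char *end;
    unsigned long long value;

    assert(line != NULL && addr != NULL);
    p = strstr(line, "address=");
    if (p == NULL)
        return -1;
    p += strlen("address=");

    /* Base 16 accepts an optional "0x" prefix. */
    value = strtoull(p, &end, 16);
    if (end == p)
        return -1;
    *addr = (uint64_t)value;
    return 0;
}

/*
 * Return 1 if `fault` lies within the descriptor ring that starts at
 * `ring_base` and holds `nb_desc` descriptors, 0 otherwise.
 */
int
fault_in_ring(uint64_t fault, uint64_t ring_base, uint32_t nb_desc)
{
    uint64_t ring_len = (uint64_t)nb_desc * RX_DESC_SIZE;

    return fault >= ring_base && fault - ring_base < ring_len;
}
```

With the values from the igb_uio run (fault address 0x116141a00, rxq->rx_ring_phys_addr 0x116141a00, rxq->nb_rx_desc 1024), fault_in_ring() returns 1. A match like that would be consistent with the IOMMU rejecting the device's own descriptor fetch, though that reading is not confirmed here.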

If you do have any further suggestions, please let me know.
In any case, thank you for your feedback so far!

Regards,
Fabio



On Thursday, August 1st, 2024 at 10:16 PM, Ivan Malov <ivan.malov@arknetworks.am> wrote:

> Hi Fabio,
> 
> With regard to endianness conversion, I'd rather expect that line
> to be something like rte_le_to_cpu_32 as the source value is
> declared __le32. But, as I noted before, this is likely a
> don't care as your machine is probably little-endian, and
> rte_cpu_to_le_32 thus might simply do nothing.
> 
> Whereas your observation of the error in dmesg is indeed a
> valuable clue. Since it comes from igb_uio, my question is:
> why at all use igb_uio? People say it's an outdated driver.
> Have you considered using vfio-pci or uio_pci_generic
> instead? I suggest you try binding to vfio-pci and
> re-check with unmodified PMD source first.
> 
> Thank you.
> 
> On Thu, 1 Aug 2024, Fabio Fernandes wrote:
> 
> > Hi Ivan,
> > 
> > Thank you for your response.
> > 
> > I've run it with the flags you suggested and attached the produced log.
> > 
> > { sudo ./dpdk-testpmd --log-level=pmd.net.igc,debug 2>&1; } > testpmd_with_debug_and_rx_print.log;
> > 
> > testpmd_with_debug_and_rx_print.log.zip
> > 
> > However, the driver never reaches point [1] (nor [2]), and the debug lines there are never logged. I placed breakpoints and confirmed that the loop always exits just before [1], at this check:
> > `if (!(staterr & IGC_RXD_STAT_DD)) break;`
> > 
> > I've also instrumented testpmd.h as below, to confirm in the log file that RX is called many times and never returns anything but zeros:
> > ```
> > static inline uint16_t
> > common_fwd_stream_receive(struct fwd_stream *fs, struct rte_mbuf **burst,
> >         unsigned int nb_pkts)
> > {
> >     uint16_t nb_rx;
> > 
> >     nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, burst, nb_pkts);
> > 
> >     // Instrumentation Begin
> >     {
> >         static uint64_t g_call_count = 0;
> >         static uint64_t g_rx_sum = 0;
> >         g_rx_sum += nb_rx;
> >         ++g_call_count;
> >         if (nb_rx)
> >             fprintf(stderr, "rte_eth_rx_burst: %u\n", nb_rx);
> >         if ((g_call_count % 100000000UL) == 0)
> >             fprintf(stderr, "g_rx_sum: %lu, g_call_count: %lu\n",
> >                 g_rx_sum, g_call_count);
> >     }
> >     // Instrumentation End
> > 
> >     if (record_burst_stats)
> >         fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
> >     fs->rx_packets += nb_rx;
> >     return nb_rx;
> > }
> > 
> > ```
> > 
> > With regard to [3], I changed that to use rte_cpu_to_be_32() instead and rebuilt DPDK, but the results are the same and the loop still always exits there.
> > 
> > I did, however, notice something strange, and this is probably a clue:
> > 
> > Every time I step over this line of `igc_rx_init()` in the debugger:
> > https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L1204
> > 
> > `IGC_WRITE_REG(hw, IGC_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);`
> > 
> > I get this in the kernel log (`dmesg`), coming from the igb_uio kernel module I've bound to the device I'm testing:
> > 
> > `[26185.005945] igb_uio 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x116141a00 flags=0x0000]`
> > 
> > The address matches this in the debugger:
> > 
> > ```
> > rxq->rx_ring_phys_addr
> > $1 = 0x116141a00
> > 
> > rxq->reg_idx
> > 0
> > 
> > rxq->nb_rx_desc
> > 1024
> > ```
> > 
> > What do you think?
> > 
> > For more info, I'm on this exact DPDK commit:
> > commit eeb0605f118dae66e80faa44f7b3e88748032353 (HEAD -> v23.11, tag: v23.11
> > 
> > Thanks,
> > Fabio
> > 
> > 
> > On Thursday, August 1st, 2024 at 3:24 PM, Ivan Malov <ivan.malov@arknetworks.am> wrote:
> > 
> > > Hi Fabio,
> > > 
> > > Have you tried to specify EAL option --log-level="pmd.net.igc,debug"
> > > or --log-level='.*',8 when running the application? Perhaps doing
> > > so can trigger printouts [1], [2]. See if you can't observe those.
> > > 
> > > Perhaps consider posting a brief excerpt of your code where
> > > rte_eth_rx_burst() is invoked and return value is verified.
> > > 
> > > Also, albeit unrelated, it's rather peculiar that the code
> > > does CPU-to-LE conversion [3] of descriptor status, but
> > > the field itself is declared as __le32 already: [4].
> > > 
> > > [1] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L296
> > > [2] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L455
> > > [3] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L264
> > > [4] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/base/igc_base.h#L109
> > > 
> > > Thank you.
> > > 
> > > On Thu, 1 Aug 2024, Fabio Fernandes wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I have an issue with rte_eth_rx_burst() for IGC poll mode driver never returning any packets and need some advice.
> > > > I have this network port:
> > > > 09:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)
> > > > 
> > > > Bound to igb_uio:
> > > > Network devices using DPDK-compatible driver
> > > > ============================================
> > > > 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=igb_uio unused=igc
> > > > 
> > > > I'm testing this both with testpmd and my own app, which works fine with other drivers such as net/ena and net/i40e. I'm using single RX/TX queue pair with default configs
> > > > with rte_eth_promiscuous_enable() and rte_eth_allmulticast_enable().
> > > > 
> > > > The device seems to start fine with rte_eth_dev_start(), and rte_eth_stats_get() seems to detect inbound packets. Below is the output from testpmd:
> > > > 
> > > > Press enter to exiteth_igc_interrupt_action(): Port 0: Link Up - speed 1000 Mbps - full-duplex
> > > > 
> > > > Port 0: link state change event
> > > > ^CTelling cores to stop...
> > > > Waiting for lcores to finish...
> > > > 
> > > > ---------------------- Forward statistics for port 0 ----------------------
> > > > RX-packets: 129 RX-dropped: 800 RX-total: 929
> > > > TX-packets: 0 TX-dropped: 0 TX-total: 0
> > > > ----------------------------------------------------------------------------
> > > > 
> > > > +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
> > > > RX-packets: 129 RX-dropped: 800 RX-total: 929
> > > > TX-packets: 0 TX-dropped: 0 TX-total: 0
> > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > 
> > > > Done.
> > > > 
> > > > However, rte_eth_rx_burst() never returns anything, neither in testpmd nor in my own app.
> > > > 
> > > > In my own app, I log both rte_eth_stats_get() and non-zero xstats from rte_eth_xstats_get_by_id():
> > > > 
> > > > 07:02:13.406873186 INF stats.rx : 0
> > > > 07:02:13.406892616 INF dev_stats.ipackets : 78
> > > > 07:02:13.406903636 INF dev_stats.opackets : 0
> > > > 07:02:13.406914166 INF dev_stats.imissed : 0
> > > > 07:02:13.406924536 INF dev_stats.ierrors : 0
> > > > 07:02:13.406934116 INF dev_stats.oerrors : 0
> > > > 07:02:13.406943956 INF dev_stats.rx_nombuf : 0
> > > > 07:02:13.407247777 INF xstats rx_good_packets : 78
> > > > 07:02:13.407257147 INF xstats rx_good_bytes : 17205
> > > > 07:02:13.407265267 INF xstats rx_size_64_packets : 6
> > > > 07:02:13.407274627 INF xstats rx_size_65_to_127_packets : 31
> > > > 07:02:13.407285757 INF xstats rx_size_128_to_255_packets : 22
> > > > 07:02:13.407297537 INF xstats rx_size_256_to_511_packets : 16
> > > > 07:02:13.407309127 INF xstats rx_size_512_to_1023_packets : 3
> > > > 07:02:13.407321327 INF xstats rx_broadcast_packets : 8
> > > > 07:02:13.407331597 INF xstats rx_multicast_packets : 64
> > > > 07:02:13.407346357 INF xstats rx_total_packets : 78
> > > > 07:02:13.407355547 INF xstats rx_total_bytes : 17205
> > > > 07:02:13.407364127 INF xstats rx_sent_to_host_packets : 78
> > > > 07:02:13.407375347 INF xstats interrupt_assert_count : 1
> > > > 
> > > > Still, rte_eth_rx_burst() never returns anything.
> > > > 
> > > > It's worthwhile to note that rte_eth_rx_burst() works fine when I, instead of net/igc, use net/ena (with ENA card) or net/i40e (Intel x710 card).
> > > > 
> > > > The debug log from EAL and net/igc is attached, in case that helps.
> > > > There's a warning "igc_rx_init(): forcing scatter mode". I've already tried changing my mbuf sizes so that the warning goes away, but that didn't help either.
> > > > 
> > > > Any advice?
> > > > 
> > > > Thanks,
> > > > Fabio

Thread overview: 6+ messages
2024-08-01 10:26 Fabio Fernandes
2024-08-01 12:24 ` Ivan Malov
2024-08-01 17:40   ` Fabio Fernandes
2024-08-01 19:16     ` Ivan Malov
2024-08-02 12:57       ` Fabio Fernandes [this message]
2024-08-02 15:14         ` Ivan Malov
