DPDK usage discussions
From: Ivan Malov <ivan.malov@arknetworks.am>
To: Fabio Fernandes <boicotinho@proton.me>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: Net/igc's rte_eth_rx_burst() never returns packets
Date: Fri, 2 Aug 2024 19:14:14 +0400 (+04)
Message-ID: <085c03b5-8a25-1305-c3cf-cc6798fe1b42@arknetworks.am>
In-Reply-To: <m9UXKZ6j4fA67WCmA1K5dTewKxRoNN_G__CFo3XN5OHu6nxKkxdnCWCap_-1ZHyEOdkcrPNv6eXNV9xGDlIETDUmjFI0z94Hws6ZeulCzPk=@proton.me>

Hi Fabio,

How about running
find /sys/kernel/iommu_groups/ -type l
to identify devices that are in the same IOMMU group as 0000:09:00.0 ?
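
A narrower variant of that command, as a rough sketch only: it prints just the devices that share an IOMMU group with one PCI address. It assumes the usual sysfs layout, where /sys/bus/pci/devices/<BDF>/iommu_group/devices/ holds one entry per group member. The function name iommu_group_peers and its optional second argument (an alternate sysfs root) are made up for illustration and are not part of DPDK or any standard tool.

```shell
#!/bin/sh
# List every PCI device that shares an IOMMU group with the given device.
# Usage: iommu_group_peers <BDF> [sysfs_root]
# The function name and the sysfs_root argument are illustrative assumptions.
iommu_group_peers() {
    bdf=$1
    sysfs_root=${2:-/sys}
    group_dir="$sysfs_root/bus/pci/devices/$bdf/iommu_group/devices"

    # No iommu_group entry usually means the IOMMU is off or the BDF is wrong.
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group found for $bdf" >&2
        return 1
    fi

    # Each entry in devices/ is named after the PCI address of a group member.
    for dev in "$group_dir"/*; do
        [ -e "$dev" ] || continue
        basename "$dev"
    done
}
```

Running iommu_group_peers 0000:09:00.0 after sourcing this would list the port itself plus any other devices in its group. Any device listed besides 0000:09:00.0 would be a candidate for the "not all devices in IOMMU group bound to VFIO or unbound" complaint.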

Thank you.

On Fri, 2 Aug 2024, Fabio Fernandes wrote:

> Hi Ivan,
>
> I'm using igb_uio because it's the one recommended for my target network card net/ena.
>
> I've tried both vfio-pci and uio_pci_generic, but they fail for different reasons.
>
> With vfio-pci, EAL tells me:
>
> ```
> EAL: PCI device 0000:09:00.0 on NUMA socket -1
> EAL:   probe driver: 8086:15f3 net_igc
> EAL: 0000:09:00.0 VFIO group is not viable! Not all devices in IOMMU group bound to VFIO or unbound
> EAL: Requested device 0000:09:00.0 cannot be used
> ```
>
> I tried adding kernel boot parameter `iommu=on` with no luck.
> I also tried unbinding my other cards:
>
> ```
> Network devices using DPDK-compatible driver
> ============================================
> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=vfio-pci unused=igc,uio_pci_generic
>
> Other Network devices
> =====================
> 0000:08:00.0 'MT7922 802.11ax PCI Express Wireless Network Adapter 0616' unused=mt7921e,vfio-pci,uio_pci_generic
> 0000:0a:00.0 'AQtion AQC113CS NBase-T/IEEE 802.3an Ethernet Controller [Antigua 10G] 94c0' unused=atlantic,vfio-pci,uio_pci_generic
> ```
>
> The result is rte_eth_dev_count_total() == 0, so nothing starts.
>
>
> Finally, I also tried `uio_pci_generic`:
>
> ```
> Network devices using DPDK-compatible driver
> ============================================
> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=uio_pci_generic unused=igc,vfio-pci
> ```
>
> This time DPDK accepts the device; however, I see the same old dmesg error appearing again:
>
> ```
> [ 1449.570184] uio_pci_generic 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x13397ff80 flags=0x0000]
> ```
>
> If you do have any further suggestions, please let me know.
> In any case, thank you for your feedback so far!
>
> Regards,
> Fabio
>
> On Thursday, August 1st, 2024 at 10:16 PM, Ivan Malov <ivan.malov@arknetworks.am> wrote:
>
>> Hi Fabio,
>>
>> With regard to endianness conversion, I'd rather expect that line
>> to be something like rte_le_to_cpu_32 as the source value is
>> declared __le32. But, as I noted before, this is likely a
>> don't care as your machine is probably little-endian, and
>> rte_cpu_to_le_32 thus might simply do nothing.
>>
>> Your observation of the error in dmesg, however, is indeed a
>> valuable clue. Since it comes from igb_uio, my question is:
>> why use igb_uio at all? People say it's an outdated driver.
>> Have you considered using vfio-pci or uio_pci_generic
>> instead? I suggest you try binding to vfio-pci and
>> re-check with unmodified PMD source first.
>>
>> Thank you.
>>
>> On Thu, 1 Aug 2024, Fabio Fernandes wrote:
>>
>>> Hi Ivan,
>>>
>>> Thank you for your response.
>>>
>>> I've run it with the flags you suggested and attached the produced log.
>>>
>>> { sudo ./dpdk-testpmd --log-level=pmd.net.igc,debug 2>&1; } > testpmd_with_debug_and_rx_print.log;
>>>
>>> testpmd_with_debug_and_rx_print.log.zip
>>>
>>> However, the driver never reaches point [1] (nor [2]), and this debug line never got logged. I've placed breakpoints to confirm that the loop always exits just before [1], at this check:
>>> `if (!(staterr & IGC_RXD_STAT_DD)) break;`
>>>
>>> I've also instrumented testpmd.h as below, to confirm in the log file that RX is called many times and never returns anything but zeros:
>>> ```
>>> static inline uint16_t
>>> common_fwd_stream_receive(struct fwd_stream *fs, struct rte_mbuf **burst,
>>> 			  unsigned int nb_pkts)
>>> {
>>> 	uint16_t nb_rx;
>>>
>>> 	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, burst, nb_pkts);
>>>
>>> 	// Instrumentation Begin
>>> 	{
>>> 		static uint64_t g_call_count = 0;
>>> 		static uint64_t g_rx_sum = 0;
>>> 		g_rx_sum += nb_rx;
>>> 		++g_call_count;
>>> 		if (nb_rx)
>>> 			fprintf(stderr, "rte_eth_rx_burst: %u\n", nb_rx);
>>> 		if ((g_call_count % 100000000UL) == 0)
>>> 			fprintf(stderr, "g_rx_sum: %lu, g_call_count: %lu\n",
>>> 				g_rx_sum, g_call_count);
>>> 	}
>>> 	// Instrumentation End
>>>
>>> 	if (record_burst_stats)
>>> 		fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
>>> 	fs->rx_packets += nb_rx;
>>> 	return nb_rx;
>>> }
>>>
>>> ```
>>>
>>> With regard to [3], I've changed that to use rte_cpu_to_be_32() instead and rebuilt DPDK, but with the same results: the loop still always exits there.
>>>
>>> I did, however, notice something strange, and this is probably a clue:
>>>
>>> Every time I step over this line of `igc_rx_init()` in the debugger:
>>> https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L1204
>>>
>>> `IGC_WRITE_REG(hw, IGC_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);`
>>>
>>> I get this in the kernel `dmesg` log, coming from the igb_uio kernel module I've bound to the device I'm testing:
>>>
>>> `[26185.005945] igb_uio 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x116141a00 flags=0x0000]`
>>>
>>> The address matches this in the debugger:
>>>
>>> ```
>>> rxq->rx_ring_phys_addr
>>> $1 = 0x116141a00
>>>
>>> rxq->reg_idx
>>> 0
>>>
>>> rxq->nb_rx_desc
>>> 1024
>>> ```
>>>
>>> What do you think?
>>>
>>> For more info, I'm on this exact DPDK commit:
>>> commit eeb0605f118dae66e80faa44f7b3e88748032353 (HEAD -> v23.11, tag: v23.11
>>>
>>> Thanks,
>>> Fabio
>>>
>>> On Thursday, August 1st, 2024 at 3:24 PM, Ivan Malov <ivan.malov@arknetworks.am> wrote:
>>>
>>>> Hi Fabio,
>>>>
>>>> Have you tried to specify EAL option --log-level="pmd.net.igc,debug"
>>>> or --log-level='.*',8 when running the application? Perhaps doing
>>>> so can trigger printouts [1], [2]. See if you can't observe those.
>>>>
>>>> Perhaps consider posting a brief excerpt of your code where
>>>> rte_eth_rx_burst() is invoked and return value is verified.
>>>>
>>>> Also, albeit unrelated, it's rather peculiar that the code
>>>> does CPU-to-LE conversion [3] of descriptor status, but
>>>> the field itself is declared as __le32 already: [4].
>>>>
>>>> [1] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L296
>>>> [2] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L455
>>>> [3] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L264
>>>> [4] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/base/igc_base.h#L109
>>>>
>>>> Thank you.
>>>>
>>>> On Thu, 1 Aug 2024, Fabio Fernandes wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have an issue with rte_eth_rx_burst() for IGC poll mode driver never returning any packets and need some advice.
>>>>> I have this network port:
>>>>> 09:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)
>>>>>
>>>>> Bound to igb_uio:
>>>>> Network devices using DPDK-compatible driver
>>>>> ============================================
>>>>> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=igb_uio unused=igc
>>>>>
>>>>> I'm testing this both with testpmd and my own app, which works fine with other drivers such as net/ena and net/i40e. I'm using a single RX/TX queue pair with default configs,
>>>>> plus rte_eth_promiscuous_enable() and rte_eth_allmulticast_enable().
>>>>>
>>>>> The device seems to rte_eth_dev_start() fine, and rte_eth_stats_get() seems to be detecting inbound packets. Below is the output from testpmd:
>>>>>
>>>>> Press enter to exiteth_igc_interrupt_action(): Port 0: Link Up - speed 1000 Mbps - full-duplex
>>>>>
>>>>> Port 0: link state change event
>>>>> ^CTelling cores to stop...
>>>>> Waiting for lcores to finish...
>>>>>
>>>>> ---------------------- Forward statistics for port 0 ----------------------
>>>>> RX-packets: 129 RX-dropped: 800 RX-total: 929
>>>>> TX-packets: 0 TX-dropped: 0 TX-total: 0
>>>>> ----------------------------------------------------------------------------
>>>>>
>>>>> +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
>>>>> RX-packets: 129 RX-dropped: 800 RX-total: 929
>>>>> TX-packets: 0 TX-dropped: 0 TX-total: 0
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>
>>>>> Done.
>>>>>
>>>>> However, rte_eth_rx_burst() never returns anything, neither in testpmd nor in my own app.
>>>>>
>>>>> In my own app, I log both rte_eth_stats_get() and non-zero xstats from rte_eth_xstats_get_by_id():
>>>>>
>>>>> 07:02:13.406873186 INF stats.rx : 0
>>>>> 07:02:13.406892616 INF dev_stats.ipackets : 78
>>>>> 07:02:13.406903636 INF dev_stats.opackets : 0
>>>>> 07:02:13.406914166 INF dev_stats.imissed : 0
>>>>> 07:02:13.406924536 INF dev_stats.ierrors : 0
>>>>> 07:02:13.406934116 INF dev_stats.oerrors : 0
>>>>> 07:02:13.406943956 INF dev_stats.rx_nombuf : 0
>>>>> 07:02:13.407247777 INF xstats rx_good_packets : 78
>>>>> 07:02:13.407257147 INF xstats rx_good_bytes : 17205
>>>>> 07:02:13.407265267 INF xstats rx_size_64_packets : 6
>>>>> 07:02:13.407274627 INF xstats rx_size_65_to_127_packets : 31
>>>>> 07:02:13.407285757 INF xstats rx_size_128_to_255_packets : 22
>>>>> 07:02:13.407297537 INF xstats rx_size_256_to_511_packets : 16
>>>>> 07:02:13.407309127 INF xstats rx_size_512_to_1023_packets : 3
>>>>> 07:02:13.407321327 INF xstats rx_broadcast_packets : 8
>>>>> 07:02:13.407331597 INF xstats rx_multicast_packets : 64
>>>>> 07:02:13.407346357 INF xstats rx_total_packets : 78
>>>>> 07:02:13.407355547 INF xstats rx_total_bytes : 17205
>>>>> 07:02:13.407364127 INF xstats rx_sent_to_host_packets : 78
>>>>> 07:02:13.407375347 INF xstats interrupt_assert_count : 1
>>>>>
>>>>> Still, rte_eth_rx_burst() never returns anything.
>>>>>
>>>>> It's worthwhile to note that rte_eth_rx_burst() works fine when I, instead of net/igc, use net/ena (with ENA card) or net/i40e (Intel x710 card).
>>>>>
>>>>> The debug log from EAL and net/igc is attached, in case that helps.
>>>>> There's a warning "igc_rx_init(): forcing scatter mode". I've already tried changing my mbuf sizes so that the warning goes away, but that didn't help either.
>>>>>
>>>>> Any advice?
>>>>>
>>>>> Thanks,
>>>>> Fabio
>

Thread overview: 6+ messages
2024-08-01 10:26 Fabio Fernandes
2024-08-01 12:24 ` Ivan Malov
2024-08-01 17:40   ` Fabio Fernandes
2024-08-01 19:16     ` Ivan Malov
2024-08-02 12:57       ` Fabio Fernandes
2024-08-02 15:14         ` Ivan Malov [this message]
