From: Alan Beadle <ab.beadle@gmail.com>
To: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Cc: users@dpdk.org
Subject: Re: Multiprocess App Problems with tx_burst
Date: Sun, 5 Jan 2025 11:01:28 -0500
Message-ID: <CANTAOdwoqhLmbkWc6LTEzgP59J09tf1L5LLMFW9k4nCXqZatbA@mail.gmail.com>
In-Reply-To: <20250105010148.1ef26333@sovereign>
> So, "daemon" and "server" may try using the same queue sometimes, correct?
> Synchronizing all access to the single queue should work in this case.
That is correct.
> BTW, rte_eth_tx_burst() returning >0 does not mean the packets have been sent.
> It only means they have been enqueued for sending.
> At some point the NIC will complete sending,
> only then the PMD can free the mbuf (or decrement its reference count).
> For most PMDs, this happens on a subsequent call to rte_eth_tx_burst().
> Which PMD and HW is it?
Here is the output of 'dpdk-devbind.py --status':
Network devices using DPDK-compatible driver
============================================
0000:65:00.1 'Ethernet Controller 10G X550T 1563' drv=vfio-pci unused=uio_pci_generic
> Have you tried to print as many stats as possible when rte_eth_tx_burst()
> can't consume all packets (rte_eth_stats_get(), rte_eth_xstats_get())?
In setting this up, I discovered that this error only occurs when the
primary process on the other host exits (due to an error) or is not
initially running (the NIC is "down" in this case?). It happens
consistently when I only launch the processes on one of the two
machines. ***But*** counterintuitively, it looks like packets are
successfully "sent" by the daemon until the other process begins to
run. In case it is useful, I summarize the stats for this case below.
Note that I am also seeing another error. Sometimes, rather than tx
failing, my app detects incorrect/corrupted mbuf contents and exits
immediately. It appears that mbufs are being re-allocated when they
should not be. I thought I had finally solved this (see my earlier
threads) but with multi-core concurrency this problem has returned. It
is very possible that this error is somewhere in my own library code,
as it looks like the accompanying non-DPDK structures are also being
corrupted (probably first).
For background, I maintain a hash table of header structs to track
individual mbufs. The sequence numbers in the headers should match
those contained in the mbuf's payload. This check is failing after a
few hundred successful data messages have been exchanged between the
hosts. The sequence number in the mbuf shows that it is in the wrong
hash bucket, and the sequence number in the header is a large
corrupted value which is out of range for my sequence numbers (and
also not matching the bucket).
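
The check described above can be sketched roughly as follows. This is a simplified illustration, not my library code: N_BUCKETS, MAX_SEQ, struct msg_header, bucket_of() and check_entry() are hypothetical, the bucket is assumed to be the sequence number modulo the table size, and the payload is assumed to begin with the 32-bit sequence number in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_BUCKETS 64u       /* hypothetical table size */
#define MAX_SEQ   1000000u  /* hypothetical upper bound on valid sequence numbers */

/* Hypothetical header struct kept in the hash table for each mbuf. */
struct msg_header {
    uint32_t seq;
};

enum check_result {
    CHECK_OK,
    HDR_OUT_OF_RANGE,   /* header sequence number is not a plausible value */
    HDR_WRONG_BUCKET,   /* header does not hash to the bucket it was found in */
    PAYLOAD_MISMATCH    /* payload sequence number differs from the header */
};

/* Assumed hash: sequence number modulo the number of buckets. */
static uint32_t bucket_of(uint32_t seq)
{
    return seq % N_BUCKETS;
}

/* Validate one table entry against the payload of its mbuf. The header
 * is checked first, so a corrupted header is reported as such even if
 * the payload has also been overwritten. */
static enum check_result check_entry(uint32_t bucket,
                                     const struct msg_header *hdr,
                                     const void *payload)
{
    uint32_t payload_seq;

    assert(hdr != NULL && payload != NULL);
    memcpy(&payload_seq, payload, sizeof(payload_seq));

    if (hdr->seq > MAX_SEQ)
        return HDR_OUT_OF_RANGE;
    if (bucket_of(hdr->seq) != bucket)
        return HDR_WRONG_BUCKET;
    if (payload_seq != hdr->seq)
        return PAYLOAD_MISMATCH;
    return CHECK_OK;
}
```

Reporting which of the three conditions fails first can help tell whether the header struct or the mbuf payload was overwritten.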
Back to the issue of failed tx bursts: Here are the stats I observed
after a packet failed to send from the daemon, with the
primary+secondary processes launched on only one of the machines. The
daemon had already "sent" hundreds of handshake packets (to nowhere,
presumably?), and the failure occurred as soon as the second process
had finished initialization:
ipackets:0, opackets:0, ibytes:0, obytes:0, ierrors:0, oerrors:0
Got 146 xstats
Port:0, tx_q0_packets:1138
Port:0, tx_q0_bytes:125180
Port:0, mac_local_errors:2
Port:0, out_pkts_untagged:5
(All other stats had a value of 0 and are omitted).
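
For reference, filtering the extended stats down to the non-zero counters can be sketched like this. The structs below are stand-ins shaped like DPDK's struct rte_eth_xstat_name and struct rte_eth_xstat (as filled in by rte_eth_xstats_get_names() and rte_eth_xstats_get()), so that the example compiles without DPDK; print_nonzero_xstats() is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins mirroring the layout of DPDK's xstats structures. */
struct xstat_name {
    char name[64];
};

struct xstat {
    uint64_t id;     /* index into the names array */
    uint64_t value;  /* current counter value */
};

/* Print every counter whose value is non-zero, one per line, in the
 * "Port:<port>, <name>:<value>" form. Returns the number of lines
 * printed. Entries whose id is out of range are skipped. */
static unsigned print_nonzero_xstats(FILE *out, uint16_t port,
                                     const struct xstat_name *names,
                                     const struct xstat *vals, unsigned n)
{
    unsigned shown = 0;

    assert(out != NULL && names != NULL && vals != NULL);
    for (unsigned i = 0; i < n; i++) {
        if (vals[i].value == 0 || vals[i].id >= n)
            continue;
        fprintf(out, "Port:%u, %s:%" PRIu64 "\n", (unsigned)port,
                names[vals[i].id].name, vals[i].value);
        shown++;
    }
    return shown;
}
```

With real DPDK structures, the two arrays would first be sized by calling the xstats functions with a NULL array to obtain the counter count.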
I will continue investigating the corruption bug in the (likely) case
that it is in my library code. In the meantime please let me know if I
am using DPDK incorrectly. Thank you again!
-Alan