DPDK patches and discussions
From: Matt Laswell <laswell@infiniteio.com>
To: "dev@dpdk.org" <dev@dpdk.org>
Subject: [dpdk-dev] How to approach packet TX lockups
Date: Mon, 16 Nov 2015 17:48:35 -0600
Message-ID: <CA+GnqAoT=jQKrnJwoygoZVeN_j1gfMfQPXWst01ayMeEFjRnFg@mail.gmail.com>

Hey Folks,

I sent this to the users email list, but I'm not sure how many people are
actively reading that list at this point.  I'm dealing with a situation in
which my application loses the ability to transmit packets out of a port
during times of moderate stress.  I'd love to hear suggestions for how to
approach this problem, as I'm a bit at a loss at the moment.

Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04 LTS on Haswell
processors.  I'm using the 82599 controller, configured to spread packets
across multiple queues.  Each queue is accessed by a different lcore in my
application; there is therefore concurrent access to the controller, but
not to any of the queues.  We're binding the ports to the igb_uio driver.
The symptoms I see are these:


   - All transmit out of a particular port stops
   - rte_eth_tx_burst() indicates that it is sending all of the packets
   that I give to it
   - rte_eth_stats_get() gives me stats indicating that no packets are
   being sent on the affected port.  Also, no tx errors, and no pause frames
   sent or received (opackets = 0, obytes = 0, oerrors = 0, etc.)
   - All other ports continue to work normally
   - The affected port continues to receive packets without problems; only
   TX is affected
   - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
   restores things and packets can flow again
   - The problem is replicable on multiple devices, and doesn't follow one
   particular port

I've tried calling rte_mbuf_sanity_check() on all packets before sending
them.  I've also instrumented my code to look for packets that have already
been sent or freed, as well as cycles in chained packets being sent.  I
also put a lock around all rte_eth* calls to serialize access
to the NIC.  Given some recent discussion here, I also tried changing the
TX RS threshold from 0 to 32, 16, and 1.  None of these strategies proved
effective.
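
For anyone wanting to reproduce the check for cycles in chained packets,
one common way to do it is Floyd's tortoise-and-hare walk over the
next-segment pointers.  This is a sketch of that general technique, not the
instrumentation actually used here; struct seg is a hypothetical stand-in
for an mbuf segment that keeps only the link field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf segment; only the link matters here. */
struct seg {
    struct seg *next;
};

/*
 * Floyd's cycle detection: advance one pointer by one segment and another
 * by two.  If the chain loops back on itself the two pointers eventually
 * meet; if it is properly NULL-terminated the fast pointer runs off the end.
 */
static bool chain_has_cycle(const struct seg *head)
{
    const struct seg *slow = head;
    const struct seg *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return true;
    }
    return false;
}
```

The walk needs no extra memory and visits each segment a bounded number of
times, so it is cheap enough to run on every chain just before it is handed
to the transmit call.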

Like I said at the top, I'm a little at a loss at this point.  If you were
dealing with this set of symptoms, how would you proceed?

Thanks in advance.

--
Matt Laswell
infinite io, inc.
laswell@infiniteio.com
