From: Matt Laswell <laswell@infiniteio.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] How to approach packet TX lockups
Date: Tue, 17 Nov 2015 08:23:45 -0600
Message-ID: <CA+GnqAotg9sfoYZHKe4+7XBO5LEBp5hiZ1Vxwt5ZOoag4rsWXA@mail.gmail.com>
In-Reply-To: <20151116173129.2a429930@samsung9>
Yes, we're on 1.6r2. That said, I've tried a number of different values
for the thresholds without a lot of luck. Setting wthresh/hthresh/pthresh
to 0/0/32 or 0/0/0 doesn't appear to fix things. And, as Matthew
suggested, I'm pretty sure using 0 for the thresholds leads to auto-config
by the driver. I also tried 1/1/32, which required that I also change the
tx_rs_thresh value from 0 to 1 to avoid a panic in PMD initialization
("TX WTHRESH must be set to 0 if tx_rs_thresh is greater than 1").
Any other suggestions?
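
For reference, the rule behind that panic can be written down as a small check. This is only a sketch: struct tx_thresh_cfg and tx_thresh_check() are hypothetical names, not DPDK APIs, and ASSUMED_DEFAULT_RS_THRESH is a placeholder for whatever default the driver substitutes when tx_rs_thresh is 0. The only thing taken from the driver is the rule quoted in the panic message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the TX threshold fields discussed in this
 * thread; it is not the DPDK struct rte_eth_txconf. */
struct tx_thresh_cfg {
    uint8_t  pthresh;      /* prefetch threshold */
    uint8_t  hthresh;      /* host threshold */
    uint8_t  wthresh;      /* write-back threshold */
    uint16_t tx_rs_thresh; /* 0 = let the driver pick its default */
};

/* Assumption: a tx_rs_thresh of 0 is replaced by a driver default that
 * is greater than 1. The value 32 is a placeholder for illustration. */
#define ASSUMED_DEFAULT_RS_THRESH 32

/* Returns 0 if the combination satisfies "TX WTHRESH must be set to 0
 * if tx_rs_thresh is greater than 1", or -1 if it violates it. */
static int tx_thresh_check(const struct tx_thresh_cfg *c)
{
    uint16_t rs = c->tx_rs_thresh ? c->tx_rs_thresh
                                  : ASSUMED_DEFAULT_RS_THRESH;

    if (rs > 1 && c->wthresh != 0)
        return -1;
    return 0;
}
```

Under that assumption, 0/0/32 with tx_rs_thresh left at 0 passes, 1/1/32 with tx_rs_thresh at 0 fails, and 1/1/32 with tx_rs_thresh set to 1 passes, which matches what I saw.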
On Mon, Nov 16, 2015 at 7:31 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Mon, 16 Nov 2015 18:49:15 -0600
> Matt Laswell <laswell@infiniteio.com> wrote:
>
> > Hey Stephen,
> >
> > Thanks a lot; that's really useful information. Unfortunately, I'm at a
> > stage in our release cycle where upgrading to a new version of DPDK isn't
> > feasible. Any chance you (or others reading this) have a pointer to the
> > relevant changes? While I can't afford to upgrade DPDK entirely,
> > backporting targeted fixes is more doable.
> >
> > Again, thanks.
> >
> > - Matt
> >
> >
> > On Mon, Nov 16, 2015 at 6:12 PM, Stephen Hemminger <
> > stephen@networkplumber.org> wrote:
> >
> > > On Mon, 16 Nov 2015 17:48:35 -0600
> > > Matt Laswell <laswell@infiniteio.com> wrote:
> > >
> > > > Hey Folks,
> > > >
> > > > I sent this to the users email list, but I'm not sure how many
> > > > people are actively reading that list at this point. I'm dealing
> > > > with a situation in which my application loses the ability to
> > > > transmit packets out of a port during times of moderate stress.
> > > > I'd love to hear suggestions for how to approach this problem, as
> > > > I'm a bit at a loss at the moment.
> > > >
> > > > Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04LTS on
> > > > Haswell processors. I'm using the 82599 controller, configured to
> > > > spread packets across multiple queues. Each queue is accessed by a
> > > > different lcore in my application; there is therefore concurrent
> > > > access to the controller, but not to any of the queues. We're
> > > > binding the ports to the igb_uio driver. The symptoms I see are
> > > > these:
> > > >
> > > >
> > > > - All transmit out of a particular port stops
> > > > - rte_eth_tx_burst() indicates that it is sending all of the
> > > >   packets that I give to it
> > > > - rte_eth_stats_get() gives me stats indicating that no packets
> > > >   are being sent on the affected port. Also, no tx errors, and no
> > > >   pause frames sent or received (opackets = 0, obytes = 0,
> > > >   oerrors = 0, etc.)
> > > > - All other ports continue to work normally
> > > > - The affected port continues to receive packets without problems;
> > > >   only TX is affected
> > > > - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
> > > >   restores things and packets can flow again
> > > > - The problem is replicable on multiple devices, and doesn't
> > > >   follow one particular port
> > > >
> > > > I've tried calling rte_mbuf_sanity_check() on all packets before
> > > > sending them. I've also instrumented my code to look for packets
> > > > that have already been sent or freed, as well as cycles in chained
> > > > packets being sent. I also put a lock around all accesses to
> > > > rte_eth* calls to synchronize access to the NIC. Given some recent
> > > > discussion here, I also tried changing the TX RS threshold from 0
> > > > to 32, 16, and 1. None of these strategies proved effective.
> > > >
> > > > Like I said at the top, I'm a little at a loss at this point. If
> > > > you were dealing with this set of symptoms, how would you proceed?
> > > >
> > >
> > > I remember some issues with old DPDK 1.6 with some of the prefetch
> > > thresholds on 82599. You would be better off going to a later DPDK
> > > version.
> > >
>
> I hope you are on 1.6.0r2 at least??
>
> With older DPDK there was no way to get driver to tell you what the
> preferred settings were for pthresh/hthresh/wthresh. And the values
> in Intel sample applications were broken on some hardware.
>
> I remember reverse engineering the safe values from reading the Linux
> driver.
>
> The Linux driver is much better tested than the DPDK one...
> In the Linux driver, the Transmit Descriptor Control register (TXDCTL)
> is fixed, for transmit, at:
> wthresh = 1
> hthresh = 1
> pthresh = 32
>
> The DPDK 2.2 driver uses:
> wthresh = 0
> hthresh = 0
> pthresh = 32
>
>
>
>
>
>
>
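
The symptoms quoted above (rte_eth_tx_burst() reports success, opackets stops advancing, and a stop/start of the port restores transmit) suggest a simple watchdog as a stopgap while the root cause is tracked down. The following is a minimal sketch of the detection logic only. struct tx_stall_detector, tx_stall_note_burst() and tx_stall_poll() are hypothetical names, and the sketch assumes the caller supplies the count returned by rte_eth_tx_burst() and the opackets value read via rte_eth_stats_get(). It does not fix the lockup; it only notices it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port watchdog state; not part of DPDK. */
struct tx_stall_detector {
    uint64_t last_opackets; /* opackets seen at the previous poll */
    uint64_t submitted;     /* packets accepted for TX since that poll */
    unsigned stalled_polls; /* consecutive polls with work but no progress */
    unsigned limit;         /* polls without progress before reporting */
};

/* Call after each TX burst with the number of packets it accepted. */
static void tx_stall_note_burst(struct tx_stall_detector *d, uint16_t sent)
{
    d->submitted += sent;
}

/* Call periodically with the port's current opackets counter.
 * Returns true when packets were submitted in `limit` consecutive
 * polling intervals while opackets did not move at all. */
static bool tx_stall_poll(struct tx_stall_detector *d, uint64_t opackets)
{
    bool progressed = (opackets != d->last_opackets);
    bool had_work = (d->submitted != 0);

    d->last_opackets = opackets;
    d->submitted = 0;

    if (progressed || !had_work) {
        d->stalled_polls = 0;
        return false;
    }
    if (++d->stalled_polls < d->limit)
        return false;

    d->stalled_polls = 0; /* re-arm after reporting */
    return true;
}
```

When tx_stall_poll() returns true, the caller could reset the port with rte_eth_dev_stop() and rte_eth_dev_start(), which the original report says restores transmit. The limit should be chosen so that a brief lag in the counters is not mistaken for a lockup.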