DPDK patches and discussions
From: Stefan Baranoff <sbaranoff@gmail.com>
To: dev@dpdk.org
Subject: [dpdk-dev] Random mbuf corruption
Date: Fri, 20 Jun 2014 07:20:52 -0400
Message-ID: <CAHzKxpZUOVKbCYTb66D8cQbm0ceSt7rfYo6VU3f2qhi2ZBvytQ@mail.gmail.com>
In-Reply-To: <CAHzKxpaNvZkH9h0kqYJd8VoYEXqBUfhSX9V_zUro2oX_-ioAAw@mail.gmail.com>


We are seeing 'random' memory corruption in mbufs coming from the ixgbe UIO
driver and I am looking for some pointers on debugging it. Our software was
running flawlessly for weeks at a time on our old Westmere systems (CentOS
6.4) but since moving to a new Sandy Bridge v2 server (also CentOS 6.4) it
runs for 1-2 minutes and then at least one mbuf is overwritten with
arbitrary data (pointers/lengths/RSS value/num segs/etc. are all
ridiculous). Both servers are using the 82599EB chipset (X520) and the DPDK
version (1.6.0r2) is identical. We recently also tested on a third server
running RHEL 6.4 with the same hardware as the failing Sandy Bridge based
system and it is fine (days of runtime with no failures).

I ran all of this in GDB with 'record' enabled, set a watchpoint on the
address that contains the corrupted data, and executed 'reverse-continue',
but the watchpoint is never hit [GDB newbie here -- assuming
'watch *(uint64_t*)0x7FB.....' should work]. My first thought was faulty
memory, but the BIOS memcheck on the ECC RAM shows no issues.

Also, looking at mbuf->pkt.data as an example, the corrupt value was the
same in 6 of 12 trials, but I could not find that value anywhere else in
the process's memory. This doesn't seem "random" and points to a software
bug, but I cannot for the life of me get GDB to tell me where the program
is when that memory is written to. Incidentally, trying this with the PCAP
driver and --no-huge to run valgrind shows no memory access
errors/uninitialized values.
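
One way to narrow down when the corruption happens is to run a cheap
consistency check on every mbuf right after it is received and log the
first one that fails. The block below is only a sketch of that idea in C.
The struct mbuf_view and the function mbuf_view_is_sane() are hypothetical
stand-ins invented for illustration; they are not the DPDK struct rte_mbuf
or any DPDK API, and the field names are assumptions chosen to mirror the
fields mentioned above (data pointer, lengths, number of segments).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of interest.
 * This is NOT the real struct rte_mbuf layout. */
struct mbuf_view {
    const uint8_t *buf_addr; /* start of the segment's data buffer */
    uint16_t buf_len;        /* size of that buffer in bytes */
    const uint8_t *data;     /* start of packet data (pkt.data above) */
    uint16_t data_len;       /* bytes of packet data in this segment */
    uint8_t nb_segs;         /* number of segments in the chain */
};

/* Return true when the fields are mutually consistent: the data pointer
 * lies inside the buffer, the data does not run past the end of the
 * buffer, and the chain has at least one segment. */
static bool mbuf_view_is_sane(const struct mbuf_view *m)
{
    if (m == NULL || m->buf_addr == NULL || m->data == NULL)
        return false;
    if (m->nb_segs == 0)
        return false;

    uintptr_t start = (uintptr_t)m->buf_addr;
    uintptr_t data = (uintptr_t)m->data;

    if (data < start)
        return false;

    size_t offset = (size_t)(data - start);
    if (offset > m->buf_len)
        return false;

    /* Remaining room after the data offset must hold data_len bytes. */
    if (m->data_len > (size_t)m->buf_len - offset)
        return false;

    return true;
}
```

In a real application the view would be filled from each received mbuf
before the packet is processed, so that a failing check gives a point in
time (and a core) at which the mbuf was already corrupt.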

Thoughts? Pointers? Ways to rule hardware in or out other than removing
each of the 24 DIMMs one by one?

Thanks so much in advance!

Thread overview: 7+ messages
     [not found] <CAHzKxpaxCbt9d+njdBBpwSy069zLfsOvQ5Dx0CzXLNVMKQ9AaQ@mail.gmail.com>
     [not found] ` <CAHzKxpaNvZkH9h0kqYJd8VoYEXqBUfhSX9V_zUro2oX_-ioAAw@mail.gmail.com>
2014-06-20 11:20   ` Stefan Baranoff [this message]
2014-06-20 13:59     ` Paul Barrette
2014-06-23 21:43       ` Stefan Baranoff
2014-06-24  8:05         ` Gray, Mark D
2014-06-24 10:48         ` Neil Horman
2014-06-24 11:01           ` Olivier MATZ
2014-06-25  1:31             ` Stefan Baranoff
