DPDK patches and discussions
From: "Gray, Mark D" <mark.d.gray@intel.com>
To: Stefan Baranoff <sbaranoff@gmail.com>,
	"Barrette, Paul (Wind River)" <paul.barrette@windriver.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Random mbuf corruption
Date: Tue, 24 Jun 2014 08:05:58 +0000
Message-ID: <738D45BC1F695740A983F43CFE1B7EA92D6E2FFF@IRSMSX102.ger.corp.intel.com>
In-Reply-To: <CAHzKxpYaUhR5ti2EDZfj7jeu8pWxhnmWM+e2D20k01NHa_u85w@mail.gmail.com>

> 
> Paul,
> 
> Thanks for the advice; we ran memtest as well as the Dell complete system
> diagnostic and neither found an issue. The plot thickens, though!
> 
> Our admins messed up our kickstart labels and what I *thought* was CentOS
> 6.4 was actually RHEL 6.4 and the problem seems to be following the CentOS
> 6.4 installations -- the current configuration of success/failure is:
>   1 server - Westmere - RHEL 6.4 -- works
>   1 server - Sandy Bridge - RHEL 6.4 -- works
>   2 servers - Sandy Bridge - CentOS 6.4 -- fails
> 
> Given that the hardware seems otherwise stable/checks out I'm trying to
> figure out how to determine if this is:
>   a) our software has a bug
>   b) a kernel/hugetlbfs bug
>   c) a DPDK 1.6.0r2 bug
> 
> I have seen similar issues where calling rte_eal_init too late in a process also
> causes similar issues (things like calling 'free' on memory that was allocated
> with 'malloc' before 'rte_eal_init' is called fails/results in segfault in libc)
> which seems odd to me but in this case we are calling rte_eal_init as the first
> thing we do in main().

I have seen the following issues cause mbuf corruption of this type:

1. Calling rte_pktmbuf_free() on an mbuf and then continuing to use a
reference to that mbuf.
2. Calling rte_pktmbuf_free() and rte_pktmbuf_alloc() from a plain pthread
(i.e. not a "dpdk" thread). This corrupted the per-lcore mbuf cache.
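
To make the first case concrete, here is a minimal sketch in plain C. It
does not use the DPDK API: toy_buf, toy_pool, toy_alloc() and toy_free()
are hypothetical stand-ins for an mbuf, a mempool and the alloc/free calls,
and the LIFO free list is only an assumption about how a pool cache tends to
hand the most recently freed object back first. It shows how a write through
a stale pointer silently lands in a buffer that now belongs to someone else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4
#define BUF_LEN   32

/* Hypothetical stand-in for an mbuf; NOT the DPDK struct rte_mbuf. */
struct toy_buf {
    char data[BUF_LEN];
};

/* Toy pool with a LIFO free list (assumed behaviour, for illustration). */
struct toy_pool {
    struct toy_buf bufs[POOL_SIZE];
    struct toy_buf *free_list[POOL_SIZE];
    size_t n_free;
};

static void toy_pool_init(struct toy_pool *p)
{
    p->n_free = 0;
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_list[p->n_free++] = &p->bufs[i];
}

/* Pop the most recently freed buffer, or NULL if the pool is empty. */
static struct toy_buf *toy_alloc(struct toy_pool *p)
{
    if (p->n_free == 0)
        return NULL;
    return p->free_list[--p->n_free];
}

/* Push the buffer back; the caller must not touch it afterwards. */
static void toy_free(struct toy_pool *p, struct toy_buf *b)
{
    assert(p->n_free < POOL_SIZE);
    p->free_list[p->n_free++] = b;
}

/* Returns 1 if a write through a stale pointer corrupted the new owner. */
int demo_stale_write(void)
{
    struct toy_pool pool;
    toy_pool_init(&pool);

    struct toy_buf *a = toy_alloc(&pool);
    if (a == NULL)
        return -1;
    strcpy(a->data, "packet A");
    toy_free(&pool, a);              /* 'a' is now a stale reference */

    struct toy_buf *b = toy_alloc(&pool);   /* LIFO: same buffer as 'a' */
    if (b == NULL)
        return -1;
    strcpy(b->data, "packet B");

    strcpy(a->data, "stale write");  /* the bug: use after free */

    /* 'b' never wrote this, yet its contents changed. */
    return strcmp(b->data, "packet B") != 0;
}
```

Calling demo_stale_write() from a small main() reports the corruption. In a
real application the stale write and the new owner are usually far apart in
the code, which is why this kind of bug looks random.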

Not pleasant to debug, especially if you are sharing the mempool between
primary and secondary processes. I have no debugging tips other than a
careful code review of every place an mbuf is freed or allocated.

Mark
