From: Zhongming Qu <zhongming@luminatewireless.com>
To: users@dpdk.org
Subject: [dpdk-users] running multiple independent dpdk applications randomly locks up machines
Date: Fri, 19 Aug 2016 13:32:06 -0700
Message-ID: <CADc8Jciy0u8xZsyzfw1r74jTKcC5oynokJRgd8vDp=PP1U_jZQ@mail.gmail.com>

Hi,


As stated in the subject, running multiple dpdk applications (only one
process per application) randomly locks up machines. Thanks in advance for
any help.

It is difficult to know exactly which information is useful for
debugging, so I am listing as much as possible in the hope that it rings
a bell somewhere.

System Configuration:
- Motherboard: Supermicro X10SRi-F (BIOS upgraded to the latest version as
of July 2016)
- Intel Xeon E5-2667 v3 (Haswell), no NUMA
- 64GB DRAM
- Ubuntu 14.04 kernel 3.13.0-49-generic
- DPDK 16.04
- 1024 x 2M hugepages are reserved
- 82599ES NIC (2 x 10G) at pci_addr 02:00.0 and 02:00.1. Both ports use the
ixgbe_uio kernel driver and the ixgbe PMD.
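
The hugepage reservation listed above can be checked between runs by
reading the hugepage counters from /proc/meminfo. This is only a minimal
sketch: the function names are hypothetical, and it assumes the standard
Linux field names (HugePages_Total, HugePages_Free, Hugepagesize, ...).

```python
def parse_hugepage_info(meminfo_text):
    """Extract the hugepage counters from the text of /proc/meminfo."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        # Keep only HugePages_* counters and the page size (reported in kB).
        if key.startswith("HugePages_") or key == "Hugepagesize":
            info[key] = int(rest.split()[0])
    return info


def read_hugepage_info(path="/proc/meminfo"):
    """Read and parse the hugepage counters of the running system."""
    with open(path) as meminfo:
        return parse_hugepage_info(meminfo.read())
```

With the configuration above, HugePages_Total would be expected to read
1024 and Hugepagesize 2048 (kB). Comparing HugePages_Free before and
after each start/kill cycle may show whether pages are being leaked.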


Use Scenario of DPDK Application:
- Two single-process dpdk applications, A and B, need to run simultaneously.
- A and B have been verified to be free of race conditions and memory
issues in their own code, that is, in everything apart from dpdk.
- Each application uses 512 x 2M hugepages (half of the total reserved
amount).
- Each application binds to one port via `--pci-whitelist <pci_addr>`.
- Use `-m 1024` and `--file-prefix <some_unique_id_per_pci_addr>`, as
instructed by 19.2.3 in the Programmer's Guide (
http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html).
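
To make the setup above concrete, here is a minimal sketch of how the EAL
options for A and B could be assembled. The helper names and the
"dpdk_<pci_addr>" prefix scheme are assumptions for illustration only;
the flags themselves (`-m`, `--pci-whitelist`, `--file-prefix`) are the
ones listed above.

```python
HUGEPAGE_MB = 2        # 2M hugepages, as reserved on this system
PAGES_PER_APP = 512    # each application uses half of the 1024 pages


def file_prefix_for(pci_addr):
    """Derive a unique --file-prefix from a PCI address (hypothetical scheme)."""
    return "dpdk_" + pci_addr.replace(":", "_").replace(".", "_")


def eal_args(pci_addr, pages=PAGES_PER_APP):
    """Build the EAL options described above for one application."""
    mem_mb = pages * HUGEPAGE_MB  # 512 x 2 MB = 1024 MB, hence `-m 1024`
    return [
        "-m", str(mem_mb),
        "--pci-whitelist", pci_addr,
        "--file-prefix", file_prefix_for(pci_addr),
    ]


# One argument list per port: application A on 02:00.0, B on 02:00.1.
for addr in ("02:00.0", "02:00.1"):
    print(" ".join(eal_args(addr)))
```

The point of the sketch is that the two applications share nothing on
the command line: each gets its own port, its own 1024 MB, and its own
file prefix.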


Description of Problem:
- Starting and killing A and B repeatedly every 30 seconds has a
chance of locking up the machine.
- Nothing persistent (no /var/log/syslog entries, no dmesg output) is
available for debugging after a reboot of the frozen machine.
- Looks like a kernel panic as it dumps some panic info to the serial
console (not useful...) and the CapsLock and NumLock keys on a physically
connected keyboard do not respond.
- No particular sequence of operations of starting and killing A and B, so
far, has been found to reliably lead to a lockup. The best effort of
reproducing the lockup is a keep-trying-until-lockup approach.
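
The keep-trying-until-lockup approach can be sketched roughly as below.
This is not the actual test harness: the function names are
hypothetical, killing with SIGKILL is an assumption, and the log is
fsync'ed before every step only in the hope that the last line survives
a hard freeze.

```python
import os
import subprocess
import time


def log_step(log_file, message):
    """Write one timestamped line and force it to disk before continuing."""
    log_file.write(f"{time.time():.3f} {message}\n")
    log_file.flush()
    os.fsync(log_file.fileno())


def stress_cycle(commands, run_seconds, cycles, log_path):
    """Start every command, let them run, kill them all, and repeat."""
    completed = 0
    with open(log_path, "a") as log_file:
        for cycle in range(cycles):
            log_step(log_file, f"cycle {cycle}: starting")
            procs = [subprocess.Popen(cmd) for cmd in commands]
            time.sleep(run_seconds)
            log_step(log_file, f"cycle {cycle}: killing")
            for proc in procs:
                proc.kill()  # SIGKILL on POSIX; an assumption, see above
            for proc in procs:
                proc.wait()
            log_step(log_file, f"cycle {cycle}: done")
            completed += 1
    return completed
```

For the scenario described here, `commands` would hold the two command
lines for A and B and `run_seconds` would be 30. After a lockup, the
last line of the log shows which step was in progress.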


A Few Things Tried:
- Logging to stderr and to files shows that the lockup can happen either
during rte_eal_hugepage_init() or later, after the program has been
killed.
- rte_config.mem_config->memseg has been verified to be properly
initialized. That is, the total amount of memory reserved in the memseg is
512 x 2M hugepages.
- Zeroing all hugepages when the hugefile is created and mapped, or
immediately after the memsegs are initialized (at the second call of
map_all_hugepages() in rte_eal_hugepage_init()), does not fix the problem.
- By default, hugefiles in /mnt/huge are not cleaned up when the
applications are killed. Cleaning them up did not solve the problem
either.
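
For reference, the hugefile cleanup can be sketched per file prefix as
below. The helper names are hypothetical, and the sketch assumes the
hugefiles are named `<file-prefix>map_<N>` inside the hugepage mount
point, which is how the files appear under /mnt/huge with DPDK 16.04 as
far as I can tell.

```python
import os
import re


def hugefiles_for_prefix(hugedir, prefix):
    """List hugefiles in hugedir that belong to the given --file-prefix."""
    # Assumed naming: <prefix>map_<N>, e.g. "rtemap_0" for the default prefix.
    pattern = re.compile(re.escape(prefix) + r"map_\d+")
    return sorted(
        os.path.join(hugedir, name)
        for name in os.listdir(hugedir)
        if pattern.fullmatch(name)
    )


def remove_hugefiles(hugedir, prefix):
    """Unlink every hugefile of one application; return the removed paths."""
    removed = []
    for path in hugefiles_for_prefix(hugedir, prefix):
        os.unlink(path)
        removed.append(path)
    return removed
```

Matching on the prefix keeps the cleanup of one application from
touching the hugefiles of the other, which may still be running.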



Thanks very much for any input!


Zhongming
