DPDK usage discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: Zhongming Qu <zhongming@luminatewireless.com>
Cc: users@dpdk.org
Subject: Re: [dpdk-users] running multiple independent dpdk applications randomly locks up machines
Date: Fri, 19 Aug 2016 14:03:50 -0700
Message-ID: <20160819140350.70fbc49e@xeon-e3>
In-Reply-To: <CADc8Jciy0u8xZsyzfw1r74jTKcC5oynokJRgd8vDp=PP1U_jZQ@mail.gmail.com>

On Fri, 19 Aug 2016 13:32:06 -0700
Zhongming Qu <zhongming@luminatewireless.com> wrote:

> Hi,
> 
> 
> As stated in the subject, running multiple dpdk applications (only one
> process per application) randomly locks up machines. Thanks in advance for
> any help.
> 
> It is difficult to provide the exact set of information useful for
> debugging. I am listing as much info as possible in the hope that it rings
> a bell somewhere.
> 
> System Configuration:
> - Motherboard: Supermicro X10SRi-F (BIOS upgraded to the latest version as
> of July 2016)
> - Intel Xeon E5-2667 v3 (Haswell), no NUMA
> - 64GB DRAM
> - Ubuntu 14.04 kernel 3.13.0-49-generic
> - DPDK 16.04
> - 1024 x 2M hugepages are reserved
> - 82599ES NIC (2 x 10G) at pci_addr 02:00.0 and 02:00.1. Both ports use the
> ixgbe_uio kernel driver and the ixgbe PMD.
> 
> 
> Use Scenario of DPDK Application:
> - Two single-process dpdk applications, A and B, need to run simultaneously.
> - A and B have been checked to be free of race conditions and memory
> issues of their own, that is, apart from anything inside dpdk.
> - Each application uses 512 x 2M hugepages (half of the total reserved
> amount).
> - Each application binds to one port via `--pci-whitelist <pci_addr>`.
> - Use `-m 1024` and `--file-prefix <some_unique_id_per_pci_addr>`, as
> instructed by 19.2.3 in the Programmer's Guide (
> http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html).
> 
> 
> Description of Problem:
> - Starting and killing A and B repeatedly every 30 seconds has a
> chance of locking up the machine.
> - Nothing persistent (no kernel messages in /var/log/syslog, no dmesg
> output) is available for debugging after a reboot of the frozen machine.
> - Looks like a kernel panic as it dumps some panic info to the serial
> console (not useful...) and the CapsLock and NumLock keys on a physically
> connected keyboard do not respond.
> - No particular sequence of operations of starting and killing A and B, so
> far, has been found to reliably lead to a lockup. The best effort of
> reproducing the lockup is a keep-trying-until-lockup approach.
> 
> 
> A Few Things Tried:
> - Via dumping logging to stderr and files, it is found that the lock up can
> happen during rte_eal_hugepage_init(), or after it, after the program is
> killed.
> - It is made sure that rte_config.mem_config->memseg is properly
> initialized. That is, the total amount of memory reserved in the memseg is
> 512 x 2M hugepages.
> - Zeroing all hugepages when the hugefile is created and mapped, or
> immediately after memsegs are initialized (as the second call of
> map_all_hugepages() in rte_eal_hugepage_init()) does not fix the problem.
> - By default, hugefiles in /mnt/huge are not cleaned up when the
> applications are killed. Though, cleaning them up did not solve the problem
> either.
> 
> 
> 
> Thanks very much for any input!
> 
> 
> Zhongming

Obviously, two applications can't share the same queue.
Also, you need to give each application a different core mask, at least if you are using
poll mode like the DPDK examples.
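
As a rough illustration of the core mask point, here is a minimal Python
sketch that builds EAL argument lists for two independent applications and
checks that their core sets do not overlap. The core numbers (0-1 and 2-3),
the prefixes "app_a" and "app_b", and the helper names are assumptions made
up for this example; only the flags themselves (-c, -m, --file-prefix,
--pci-whitelist) and the PCI addresses come from the setup described above.

```python
def core_mask(cores):
    """Return the hex coremask string (as passed to -c) for a list of core ids."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return hex(mask)


def hugepage_mb(pages, page_size_mb=2):
    """Memory in MB covered by a number of hugepages (for the -m option)."""
    return pages * page_size_mb


def eal_args(cores, pci_addr, prefix, mem_mb):
    """Build an EAL argument list for one single-process application."""
    return [
        "-c", core_mask(cores),
        "-m", str(mem_mb),
        "--file-prefix", prefix,
        "--pci-whitelist", pci_addr,
    ]


def cores_overlap(cores_a, cores_b):
    """True if the two applications would poll on at least one common core."""
    return bool(set(cores_a) & set(cores_b))


# Hypothetical core assignment: A on cores 0-1, B on cores 2-3.
CORES_A = [0, 1]
CORES_B = [2, 3]
assert not cores_overlap(CORES_A, CORES_B)

# 512 x 2M hugepages per application, as in the original report.
MEM_MB = hugepage_mb(512)

ARGS_A = eal_args(CORES_A, "02:00.0", "app_a", MEM_MB)
ARGS_B = eal_args(CORES_B, "02:00.1", "app_b", MEM_MB)

if __name__ == "__main__":
    print(" ".join(ARGS_A))
    print(" ".join(ARGS_B))
```

If both applications were started with the same mask, two busy-polling
threads would end up competing for the same cores, which is the situation
the check above is meant to catch before launch.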

You might be better off having one primary DPDK process and two secondary processes.
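
A hedged sketch of what the primary/secondary layout could look like on the
command line, again in Python. It assumes the EAL --proc-type option and a
single shared --file-prefix so that the secondaries attach to the primary's
memory configuration; the prefix "shared" and the core numbers are
hypothetical, and how ports and queues are divided between the secondaries
is left to the application.

```python
def proc_args(proc_type, cores, prefix="shared"):
    """Build EAL arguments for one process in a primary/secondary setup."""
    if proc_type not in ("primary", "secondary"):
        raise ValueError("proc_type must be 'primary' or 'secondary'")
    mask = 0
    for core in cores:
        mask |= 1 << core
    # All processes use the same file prefix so they share one memory config.
    return ["-c", hex(mask), "--proc-type=" + proc_type, "--file-prefix", prefix]


# Hypothetical layout: primary on core 0, one secondary each on cores 1 and 2.
PRIMARY = proc_args("primary", [0])
SECONDARY_A = proc_args("secondary", [1])
SECONDARY_B = proc_args("secondary", [2])

if __name__ == "__main__":
    for args in (PRIMARY, SECONDARY_A, SECONDARY_B):
        print(" ".join(args))
```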
