DPDK usage discussions
From: Zhongming Qu <zhongming@luminatewireless.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: users@dpdk.org
Subject: Re: [dpdk-users] running multiple independent dpdk applications randomly locks up machines
Date: Fri, 26 Aug 2016 10:55:30 -0700
Message-ID: <CADc8JchghLmRKEfpZw+bNjfUT+1sgHjsd-LN-t80-wtehTzxXg@mail.gmail.com>
In-Reply-To: <20160819183029.6deb8ebd@xeon-e3>

Hi,


Just an update.

Thanks for all the input. I feel obliged to post the latest findings
here so that this thread may be useful to other people.

As it turned out, the rx/tx queue conflict was not the actual cause. Here
is why: our usage model is to run two different *primary* dpdk processes,
each of which binds to a different port. Both ports are on the same
82599ES NIC, but they are separate ports with independent rx/tx queues (in
the sense of BARs and the BAR0-based registers).

The actual problem was that our application never called
rte_eth_dev_stop() to properly shut down the device. Simply making sure
that rte_eth_dev_stop() is called solved our problem.
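
For anyone hitting the same thing, here is a minimal sketch of the kind of
change that fixed it for us: catch SIGINT/SIGTERM, leave the polling loop,
and call rte_eth_dev_stop() before exiting. Only rte_eth_dev_stop() is the
real DPDK call; the helper names (install_quit_handlers, run_until_quit,
demo_poll_once) and the USE_DPDK macro are made up for illustration, and
the stand-in stub exists only so the sketch compiles without DPDK. The
port_id type and the return type of rte_eth_dev_stop() differ between DPDK
releases, so treat those details as assumptions.

```c
#include <signal.h>
#include <stdint.h>

#ifdef USE_DPDK                  /* hypothetical build switch */
#include <rte_ethdev.h>          /* real rte_eth_dev_stop() */
#else
/* Stand-in so the sketch compiles without DPDK; it only counts calls. */
static int stub_stop_calls = 0;
static void rte_eth_dev_stop(uint16_t port_id)
{
    (void)port_id;
    stub_stop_calls++;
}
#endif

/* Set from the signal handler; checked by the polling loop. */
static volatile sig_atomic_t quit_requested = 0;

static void on_quit_signal(int signum)
{
    (void)signum;
    quit_requested = 1;
}

/* Hypothetical helper: route SIGINT/SIGTERM to the flag above. */
static void install_quit_handlers(void)
{
    signal(SIGINT, on_quit_signal);
    signal(SIGTERM, on_quit_signal);
}

/* Hypothetical helper: poll until asked to quit, then stop the port. */
static int run_until_quit(uint16_t port_id, void (*poll_once)(uint16_t))
{
    int iterations = 0;

    while (!quit_requested) {
        poll_once(port_id);
        iterations++;
    }
    /* The cast keeps this valid whether the call returns void or int. */
    (void)rte_eth_dev_stop(port_id);
    return iterations;
}

/* Demo callback standing in for the real rx/tx burst processing. */
static int demo_polls = 0;
static void demo_poll_once(uint16_t port_id)
{
    (void)port_id;
    if (++demo_polls == 3)
        raise(SIGINT);           /* simulate Ctrl-C on the third poll */
}
```

In a real application, install_quit_handlers() would be called once after
EAL and port initialization, and the callback passed to run_until_quit()
would do the actual packet processing for the port.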

From the standpoint of a user of the dpdk library, the problem is solved.
BUT it is not yet understood how exactly failing to call
rte_eth_dev_stop() could have caused machine lockups. Could someone shed
light on this question by
  a) simply confirming that I am not the only person seeing this problem,
  b) explaining how, at a very low level, race conditions, memory
corruption, or anything else could happen that causes a kernel panic, or
  c) providing pointers to potentially relevant information?



Thanks a lot!
Zhongming

On Fri, Aug 19, 2016 at 6:30 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Fri, 19 Aug 2016 18:19:21 -0700
> Zhongming Qu <zhongming@luminatewireless.com> wrote:
>
> > Thanks!
> >
> > I did use a hard coded queue_id of 0 when initializing the rx/tx queues,
> > i.e., rte_eth_rx/tx_queue_setup(). So that is a problem to solve. Will
> > fix that and try again.
> >
> > When A and B run at the same time, this lockup problem can be explained
> > by the conflicting queue usage. But the lockup happens even in the use
> > case where only one dpdk process is running. That is, A and B take turns
> > to run but do not run at the same time.
> >
> > Thanks for pointing out an alternative approach. That sounds really
> > promising. A concern came up when that idea was talked over: What would
> > happen if the primary process dies? Would all the secondary processes
> > eventually go awry at some point? Would `--proc-type auto` solve this
> > problem?
> >
>
> I haven't actually used primary/secondary model, but the recommendation
> is that the primary process does nothing (or is a watchdog) so it would
> be pretty much impossible to crash unless killed by malicious entity.
>
> All the packet logic would be in the secondary.
>

