From: "Hrvoje Habjanić" <hrvoje.habjanic@zg.ht.hr>
To: users@dpdk.org
Subject: Re: [dpdk-users] DPDK TX problems
Date: Mon, 8 Apr 2019 11:52:46 +0200
Message-ID: <f333c4a5-8d16-2947-24ce-d06b4abf60c0@zg.ht.hr>
In-Reply-To: <16b6d36f-ea75-4f20-5d96-ef4053787dba@zg.ht.hr>
On 29/03/2019 08:24, Hrvoje Habjanić wrote:
>> Hi.
>>
>> I wrote an application using DPDK 17.11 (I also tried 18.11), and
>> while doing some performance testing I'm seeing very odd behavior.
>> To verify that this is not caused by my app, I ran the same test with
>> the l2fwd example app, and I'm still puzzled by the results.
>>
>> In short, I'm trying to push a lot of L2 packets through the DPDK
>> engine - packet processing is minimal. I start the test at a small
>> packets-per-second rate and gradually increase it to find the limit.
>> At some point I reach that limit and packets start to get dropped,
>> and that is when things become weird.
>>
>> Once I hit the peak packet rate (the rate at which packets start to
>> be dropped), I would expect that reducing the rate would make the
>> drops go away. That is not the case. For example, assume the peak
>> rate is 3.5 Mpps; at that point everything works fine. Increasing the
>> rate to 4.0 Mpps produces a lot of dropped packets. But when I reduce
>> the rate back to 3.5 Mpps, the app stays broken - packets are still
>> being dropped.
>>
>> At that point I have to reduce the rate drastically (to about
>> 1.4 Mpps) before the drops stop, and the app is then unable to
>> forward anything beyond that 1.4 Mpps, even though it was forwarding
>> 3.5 Mpps at the beginning! The only way to recover is to restart the
>> app.
>>
>> Also, sometimes the app simply stops forwarding packets altogether -
>> packets are received (as seen by the counters), but the app is unable
>> to send anything back.
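
(Side note for anyone trying to reproduce this: one way to see from
inside the app whether packets really are received but never sent back
is to dump the basic port counters. A minimal, untested sketch -
port_id is just a placeholder:)

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

/* Print the basic port counters; port_id is a placeholder. */
static void
dump_port_stats(uint16_t port_id)
{
        struct rte_eth_stats st;

        if (rte_eth_stats_get(port_id, &st) != 0)
                return;

        printf("rx=%" PRIu64 " tx=%" PRIu64 " imissed=%" PRIu64
               " oerrors=%" PRIu64 " rx_nombuf=%" PRIu64 "\n",
               st.ipackets, st.opackets, st.imissed, st.oerrors,
               st.rx_nombuf);
}
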
>>
>> As I mentioned, I see the same behavior with the l2fwd example app.
>> I tested both DPDK 17.11 and DPDK 18.11 - the results are the same.
>>
>> My test environment is an HP DL380 G8 with 82599ES 10 Gig (ixgbe)
>> cards, connected to a Cisco Nexus 9300 switch. On the other side is
>> an Ixia test appliance. The application runs in a virtual machine
>> (VM) under KVM (OpenStack with SR-IOV enabled and NUMA restrictions).
>> I verified that the VM uses only CPUs from the NUMA node the network
>> card is attached to, so there is no cross-NUMA traffic. OpenStack is
>> Queens, the host runs Ubuntu Bionic, and the VM also runs Ubuntu
>> Bionic.
>>
>> I don't know how to debug this. Has anyone else made the same
>> observations?
>>
>> Regards,
>>
>> H.
> There are additional findings. It seems that when I reach the peak
> pps rate the application is not fast enough, and I can see rx missed
> errors in the card statistics on the host. At the same time the TX
> side starts to show problems (rte_eth_tx_burst() reports that it did
> not send all packets). Shortly after that, TX falls apart completely
> and the top pps rate drops.
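
For reference, the kind of TX handling I mean looks roughly like this
(untested sketch - port/queue ids and the burst size are placeholders;
it simply frees whatever rte_eth_tx_burst() did not accept, similar to
what l2fwd ends up doing through its tx buffer drop callback):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Send a burst and drop whatever the PMD did not accept.
 * port_id, queue_id and nb_pkts are placeholders for this sketch. */
static void
send_burst(uint16_t port_id, uint16_t queue_id,
           struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_pkts);

        /* Unsent mbufs stay owned by the application and must be freed,
         * otherwise the mempool slowly drains. */
        while (sent < nb_pkts)
                rte_pktmbuf_free(pkts[sent++]);
}
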
>
> Since I did not disable pause frames, I can see on the switch that
> the "RX pause" frame counter is increasing. If, on the other hand, I
> disable pause frames (on the server's NIC), the host driver (ixgbe)
> reports "TX unit hang" in dmesg and resets the card. Of course, after
> the reset none of the DPDK apps in the VMs on this host work anymore.
>
> Is it possible that at the time of congestion DPDK does not release
> mbufs back to the pool, and the TX ring becomes "filled" with zombie
> packets (not sent by the card, but with a refcount saying they are
> still in use)?
>
> Is there a way to check the mempool or the TX ring for "left-overs"?
> Is it possible to somehow "flush" the TX ring and/or the mempool?
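
For the record, this is the kind of check I have in mind (untested
sketch - tx_pool and the port/queue ids are placeholders, and
rte_eth_tx_done_cleanup() only helps if the PMD implements it):

#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_mempool.h>

/* Report how many mbufs are still outside the pool and ask the PMD to
 * release already-transmitted mbufs from the TX ring.
 * tx_pool, port_id and queue_id are placeholders for this sketch. */
static void
check_leftovers(struct rte_mempool *tx_pool,
                uint16_t port_id, uint16_t queue_id)
{
        printf("mempool: %u in use, %u available\n",
               rte_mempool_in_use_count(tx_pool),
               rte_mempool_avail_count(tx_pool));

        /* Returns the number of cleaned mbufs, or a negative value if
         * the PMD does not support the operation. */
        int cleaned = rte_eth_tx_done_cleanup(port_id, queue_id, 0);
        printf("tx_done_cleanup: %d\n", cleaned);
}
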
>
> H.

After a few more tests, things got even weirder - if I do not free the
mbufs that were not sent, but resend them instead, I can "survive" the
over-the-peak event! But then the peak rate starts to drop gradually ...
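
The retry variant I'm describing is roughly this (untested sketch of
the idea, not actual application code - port/queue ids are
placeholders, and note that it spins if the queue never drains):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Instead of freeing mbufs that rte_eth_tx_burst() did not accept,
 * keep offering them again until the PMD takes them.
 * port_id and queue_id are placeholders for this sketch. */
static void
send_burst_retry(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        uint16_t sent = 0;

        while (sent < nb_pkts)
                sent += rte_eth_tx_burst(port_id, queue_id,
                                         pkts + sent, nb_pkts - sent);
}
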
Could someone try this on their platform and report back? I would
really like to know whether this is a problem with my deployment or
something wrong in DPDK.

The test should be simple - use l2fwd or l3fwd and determine the max
pps. Then drive traffic at 30% over that max, drop back to the original
rate, and confirm that you can still reach the original max pps.

Thanks in advance.
H.