DPDK usage discussions
From: "Wiles, Keith" <keith.wiles@intel.com>
To: Harsh Patel <thadodaharsh10@gmail.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	Kyle Larose <eomereadig@gmail.com>,
	"users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Query on handling packets
Date: Thu, 3 Jan 2019 22:43:36 +0000
Message-ID: <5F05CD7D-2EAB-476A-99B6-031CF835BA37@intel.com>
In-Reply-To: <CAA0iYrH+Q4JL=RGj14OxhgMf6Gqrj++d5fkFJpt6NRfEGAY9vQ@mail.gmail.com>



> On Jan 3, 2019, at 12:12 PM, Harsh Patel <thadodaharsh10@gmail.com> wrote:
> 
> Hi
> 
> We applied your suggestion of removing the `IsLinkUp()` call, but the performance is even worse. We could only get around 340 kbit/s.
> 
> The Top Hotspots are:
> 
> Function                          Module                              CPU Time
> eth_em_recv_pkts                  librte_pmd_e1000.so                 15.106s
> rte_delay_us_block                librte_eal.so.6.1                   7.372s
> ns3::DpdkNetDevice::Read          libns3.28.1-fd-net-device-debug.so  5.080s
> rte_eth_rx_burst                  libns3.28.1-fd-net-device-debug.so  3.558s
> ns3::DpdkNetDeviceReader::DoRead  libns3.28.1-fd-net-device-debug.so  3.364s
> [Others]                                                              4.760s

Performance dropped after removing that link status check? That is weird.
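
If the link state is still needed somewhere, one common pattern is to keep the expensive query out of the per-burst path and refresh a cached value only every N polls. The following is a minimal sketch of that idea, not the ns3 or DPDK implementation: `struct link_cache`, `link_cache_init()`, `link_cache_is_up()` and the callback type are hypothetical names, and `counting_query()` is only a stand-in for a real query (in DPDK, `rte_eth_link_get_nowait()` would be the natural candidate).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns true when the link is up. */
typedef bool (*link_query_fn)(void *ctx);

/* Cached link state that is refreshed only every `interval` polls. */
struct link_cache {
    link_query_fn query;   /* slow call, e.g. one that reads PHY registers */
    void *ctx;             /* opaque argument passed to `query` */
    uint32_t interval;     /* number of polls served per real query */
    uint32_t countdown;    /* polls left before the next real query */
    bool up;               /* last known link state */
};

static void link_cache_init(struct link_cache *lc, link_query_fn query,
                            void *ctx, uint32_t interval)
{
    assert(lc != NULL && query != NULL && interval > 0);
    lc->query = query;
    lc->ctx = ctx;
    lc->interval = interval;
    lc->countdown = 0;     /* forces a real query on the first poll */
    lc->up = false;
}

/* Cheap check for the receive loop: runs the real query only when the
 * countdown has expired, otherwise returns the cached state. */
static bool link_cache_is_up(struct link_cache *lc)
{
    if (lc->countdown == 0) {
        lc->up = lc->query(lc->ctx);
        lc->countdown = lc->interval;
    }
    lc->countdown--;
    return lc->up;
}

/* Stand-in query for illustration: counts how often it is called. */
static bool counting_query(void *ctx)
{
    unsigned *calls = ctx;
    (*calls)++;
    return true;
}
```

The receive loop would call `link_cache_is_up()` before each burst. The interval trades how quickly a link drop is noticed against how often the slow path runs, so its value has to be chosen for the application.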
> 
> Upon checking the callers of `rte_delay_us_block`, we found that most of the time spent in this function (92%) is during initialization.
> This does not waste our processing time during communication. So, it's a good start to our optimization.
> 
> Callers                                 CPU Time: Total  CPU Time: Self
> rte_delay_us_block                      100.0%           7.372s
>   e1000_enable_ulp_lpt_lp               92.3%            6.804s
>   e1000_write_phy_reg_mdic              1.8%             0.136s
>   e1000_reset_hw_ich8lan                1.7%             0.128s
>   e1000_read_phy_reg_mdic               1.4%             0.104s
>   eth_em_link_update                    1.4%             0.100s
>   e1000_get_cfg_done_generic            0.7%             0.052s
>   e1000_post_phy_reset_ich8lan.part.18  0.7%             0.048s

I guess you are having VTune start your application, and that is why you have init-time items in your log. I normally start my application first and then attach VTune to it. One of the options in the VTune project configuration is to attach to a running application. Maybe that would help here.
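
To see how much of `rte_delay_us_block` is left once init-time work is set aside, the figures from the Callers table above can be summed with and without the init-only caller. This is a rough sketch: it assumes, based on the 92% figure quoted above, that `e1000_enable_ulp_lpt_lp` is the caller that only runs during initialization, and `struct caller_time` and `sum_excluding()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the Callers table: caller name and self CPU time in seconds. */
struct caller_time {
    const char *name;
    double seconds;
};

/* Figures copied from the Callers table quoted above. */
static const struct caller_time delay_callers[] = {
    { "e1000_enable_ulp_lpt_lp",              6.804 },
    { "e1000_write_phy_reg_mdic",             0.136 },
    { "e1000_reset_hw_ich8lan",               0.128 },
    { "e1000_read_phy_reg_mdic",              0.104 },
    { "eth_em_link_update",                   0.100 },
    { "e1000_get_cfg_done_generic",           0.052 },
    { "e1000_post_phy_reset_ich8lan.part.18", 0.048 },
};

static const size_t n_delay_callers =
    sizeof(delay_callers) / sizeof(delay_callers[0]);

/* Sum the CPU time of all callers, skipping the one named `skip`.
 * Pass NULL as `skip` to sum every row. */
static double sum_excluding(const struct caller_time *c, size_t n,
                            const char *skip)
{
    double total = 0.0;

    assert(c != NULL);
    for (size_t i = 0; i < n; i++) {
        if (skip != NULL && strcmp(c[i].name, skip) == 0)
            continue;
        total += c[i].seconds;
    }
    return total;
}
```

Calling `sum_excluding(delay_callers, n_delay_callers, NULL)` gives the 7.372s total from the table, while passing `"e1000_enable_ulp_lpt_lp"` as `skip` leaves about 0.568s. Attaching the profiler after startup should make this kind of manual subtraction unnecessary.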

The data you provided looked OK. The problem is that VTune would not load the source files, as I did not have the same build or executable. I tried to build the code, but the build failed and I did not go further. I guess I would need the full source tree and the executable you used to really look at the problem. I have limited time, but I can try if you like.
> 
> 
> Effective CPU Utilization:    21.4% (0.856 out of 4)
> 
> Here is the link to the VTune profiling results: https://drive.google.com/open?id=1M6g2iRZq2JGPoDVPwZCxWBo7qzUhvWi5
> 
> Thank you
> 
> Regards
> 
> On Sun, Dec 30, 2018, 06:00 Wiles, Keith <keith.wiles@intel.com> wrote:
> 
> 
> > On Dec 29, 2018, at 4:03 PM, Harsh Patel <thadodaharsh10@gmail.com> wrote:
> > 
> > Hello,
> > As suggested, we tried profiling the application using Intel VTune Amplifier. We aren't sure how to use these results, so we are attaching them to this email.
> > 
> > The parts we understood were 'Top Hotspots' and 'Effective CPU utilization'. Here is what we made of them:
> > 
> > Top Hotspots
> > 
> > Function                          Module                              CPU Time
> > rte_delay_us_block                librte_eal.so.6.1                   15.042s
> > eth_em_recv_pkts                  librte_pmd_e1000.so                 9.544s
> > ns3::DpdkNetDevice::Read          libns3.28.1-fd-net-device-debug.so  3.522s
> > ns3::DpdkNetDeviceReader::DoRead  libns3.28.1-fd-net-device-debug.so  2.470s
> > rte_eth_rx_burst                  libns3.28.1-fd-net-device-debug.so  2.456s
> > [Others]                                                              6.656s
> > 
> > We recognized all of these methods except `rte_delay_us_block`, so we investigated its callers:
> > 
> > Callers                               Effective Time  Spin Time  Overhead Time  Effective Time  Spin Time  Overhead Time  Wait Time: Total  Wait Time: Self
> > e1000_enable_ulp_lpt_lp               45.6%           0.0%       0.0%           6.860s          0usec      0usec
> > e1000_write_phy_reg_mdic              32.7%           0.0%       0.0%           4.916s          0usec      0usec
> > e1000_read_phy_reg_mdic               19.4%           0.0%       0.0%           2.922s          0usec      0usec
> > e1000_reset_hw_ich8lan                1.0%            0.0%       0.0%           0.143s          0usec      0usec
> > eth_em_link_update                    0.7%            0.0%       0.0%           0.100s          0usec      0usec
> > e1000_post_phy_reset_ich8lan.part.18  0.4%            0.0%       0.0%           0.064s          0usec      0usec
> > e1000_get_cfg_done_generic            0.2%            0.0%       0.0%           0.037s          0usec      0usec
> > 
> > We lack sufficient knowledge to investigate more than this.
> > 
> > Effective CPU utilization
> > 
> > Interestingly, the effective CPU utilization was 20.8% (0.832 out of 4 logical CPUs). We thought this was low, so we compared it with the raw-socket version of the code. That version's utilization was even lower, 8.0% (0.318 out of 4 logical CPUs), and yet it performs far better.
> > 
> > It would be helpful if you give us insights on how to use these results or point us to some resources to do so. 
> > 
> > Thank you 
> > 
> 
> BTW, I was able to build ns3 with DPDK 18.11. It required a couple of changes in the DPDK init code in ns3, plus one hack in the rte_mbuf.h file.
> 
> I did have a problem including the rte_mbuf.h file in your code. It appears the g++ compiler did not like the reference to struct rte_mbuf_sched inside the rte_mbuf structure, where rte_mbuf_sched was defined inside the big union. As a hack, I moved the struct outside of the rte_mbuf structure and replaced the struct in the union with 'struct rte_mbuf_sched sched;'. I am guessing you are missing some compiler options in your build system, as DPDK builds just fine without that hack.
> 
> The next problem areas were rxmode and txq_flags. The rxmode structure has changed, so I commented out its inits in ns3, and then commented out the txq_flags init code, as these are now the defaults.
> 
> Regards,
> Keith
> 
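
The rte_mbuf.h hack quoted above can be sketched in isolation. One plausible cause of the g++ complaint (an assumption, not something confirmed in this thread) is that C and C++ scope nested struct tags differently: in C, a struct tag declared inside another struct is visible at file scope, while in C++ it is scoped to the enclosing type. Hoisting the definition makes the same name resolve in both languages. All `demo_` names and their fields below are hypothetical stand-ins; the real layout lives in rte_mbuf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler fields, defined at file scope
 * so that `struct demo_mbuf_sched` names the same type in C and C++. */
struct demo_mbuf_sched {
    uint32_t queue_id;
    uint8_t  traffic_class;
    uint8_t  color;
    uint16_t reserved;
};

/* Hypothetical, trimmed-down mbuf holding only the union in question. */
struct demo_mbuf {
    union {
        uint32_t rss;
        struct demo_mbuf_sched sched;  /* refers to the hoisted type by name */
        uint32_t usr;
    } hash;
};

/* Example accessor that compiles unchanged as C or as C++. */
static uint32_t demo_sched_queue(const struct demo_mbuf *m)
{
    assert(m != NULL);
    return m->hash.sched.queue_id;
}
```

Because the type is now declared outside the union, code that mentions `struct demo_mbuf_sched` on its own builds the same way with gcc and g++.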

Regards,
Keith

