From: Harsh Patel
Date: Thu, 3 Jan 2019 23:42:16 +0530
To: "Wiles, Keith"
Cc: Stephen Hemminger, Kyle Larose, "users@dpdk.org"
Subject: Re: [dpdk-users] Query on handling packets
In-Reply-To: <72A7DD4D-35FD-4247-805D-E9A736B1C9B6@intel.com>

Hi,

We applied your suggestion of removing the `IsLinkUp()` call, but the
performance is even worse: we could only get around 340 kbits/s.

The Top Hotspots are:

Function                          Module                              CPU Time
eth_em_recv_pkts                  librte_pmd_e1000.so                 15.106s
rte_delay_us_block                librte_eal.so.6.1                   7.372s
ns3::DpdkNetDevice::Read          libns3.28.1-fd-net-device-debug.so  5.080s
rte_eth_rx_burst                  libns3.28.1-fd-net-device-debug.so  3.558s
ns3::DpdkNetDeviceReader::DoRead  libns3.28.1-fd-net-device-debug.so  3.364s
[Others]                                                              4.760s

Upon checking the callers of `rte_delay_us_block`, we found that most of
the time spent in this function (92%) is during initialization, so it
does not waste our processing time during communication. So, it's a good
start to our optimization.

Callers                                 CPU Time: Total  CPU Time: Self
rte_delay_us_block                      100.0%           7.372s
  e1000_enable_ulp_lpt_lp               92.3%            6.804s
  e1000_write_phy_reg_mdic              1.8%             0.136s
  e1000_reset_hw_ich8lan                1.7%             0.128s
  e1000_read_phy_reg_mdic               1.4%             0.104s
  eth_em_link_update                    1.4%             0.100s
  e1000_get_cfg_done_generic            0.7%             0.052s
  e1000_post_phy_reset_ich8lan.part.18  0.7%             0.048s

Effective CPU Utilization: 21.4% (0.856 out of 4)

Here is the link to the VTune profiling results:
https://drive.google.com/open?id=1M6g2iRZq2JGPoDVPwZCxWBo7qzUhvWi5

Thank you
Regards

On Sun, Dec 30, 2018, 06:00 Wiles, Keith wrote:

>
> > On Dec 29, 2018, at 4:03 PM, Harsh Patel wrote:
> >
> > Hello,
> > As suggested, we tried profiling the application using Intel VTune
> > Amplifier. We aren't sure how to use these results, so we are
> > attaching them to this email.
> >
> > The things we understood were 'Top Hotspots' and 'Effective CPU
> > utilization'. Following are some of our understandings:
> >
> > Top Hotspots
> >
> > Function                          Module                              CPU Time
> > rte_delay_us_block                librte_eal.so.6.1                   15.042s
> > eth_em_recv_pkts                  librte_pmd_e1000.so                 9.544s
> > ns3::DpdkNetDevice::Read          libns3.28.1-fd-net-device-debug.so  3.522s
> > ns3::DpdkNetDeviceReader::DoRead  libns3.28.1-fd-net-device-debug.so  2.470s
> > rte_eth_rx_burst                  libns3.28.1-fd-net-device-debug.so  2.456s
> > [Others]                                                              6.656s
> >
> > We knew about other methods except `rte_delay_us_block`.
> > So we investigated the callers of this method:
> >
> > Callers Effective Time Spin Time Overhead Time Effective Time Spin Time Overhead Time Wait Time: Total Wait Time: Self
> > e1000_enable_ulp_lpt_lp               45.6%  0.0%  0.0%  6.860s  0usec  0usec
> > e1000_write_phy_reg_mdic              32.7%  0.0%  0.0%  4.916s  0usec  0usec
> > e1000_read_phy_reg_mdic               19.4%  0.0%  0.0%  2.922s  0usec  0usec
> > e1000_reset_hw_ich8lan                1.0%   0.0%  0.0%  0.143s  0usec  0usec
> > eth_em_link_update                    0.7%   0.0%  0.0%  0.100s  0usec  0usec
> > e1000_post_phy_reset_ich8lan.part.18  0.4%   0.0%  0.0%  0.064s  0usec  0usec
> > e1000_get_cfg_done_generic            0.2%   0.0%  0.0%  0.037s  0usec  0usec
> >
> > We lack sufficient knowledge to investigate more than this.
> >
> > Effective CPU utilization
> >
> > Interestingly, the effective CPU utilization was 20.8% (0.832 out of
> > 4 logical CPUs). We thought this was low, so we compared it with the
> > raw-socket version of the code, which was even lower at 8.0% (0.318
> > out of 4 logical CPUs), and yet it performs far better.
> >
> > It would be helpful if you could give us insights on how to use
> > these results or point us to some resources to do so.
> >
> > Thank you
> >
>
> BTW, I was able to build ns3 with DPDK 18.11; it required a couple of
> changes in the DPDK init code in ns3 plus one hack in the rte_mbuf.h
> file.
>
> I did have a problem including the rte_mbuf.h file into your code. It
> appears the g++ compiler did not like referencing the struct
> rte_mbuf_sched inside the rte_mbuf structure. The rte_mbuf_sched was
> inside the big union; as a hack I moved the struct outside of the
> rte_mbuf structure and replaced the struct in the union with
> 'struct rte_mbuf_sched sched;', but I am guessing you are missing some
> compiler options in your build system, as DPDK builds just fine
> without that hack.
>
> The next place was the rxmode and the txq_flags.
> The rxmode structure has changed and I commented out the inits in ns3
> and then commented out the txq_flags init code as these are now the
> defaults.
>
> Regards,
> Keith
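
The caller breakdown for `rte_delay_us_block` in the newer profile can be
checked with a few lines of arithmetic. The following is a minimal Python
sketch and is not part of the original exchange: the per-caller times and
the 0.856-of-4 utilization figure are copied from the message, while the
variable names and the choice to treat only `e1000_enable_ulp_lpt_lp` as
initialization are assumptions made for illustration.

```python
# Self CPU time (seconds) per caller of rte_delay_us_block, copied from the
# newer VTune run reported in this thread.
callers = {
    "e1000_enable_ulp_lpt_lp": 6.804,
    "e1000_write_phy_reg_mdic": 0.136,
    "e1000_reset_hw_ich8lan": 0.128,
    "e1000_read_phy_reg_mdic": 0.104,
    "eth_em_link_update": 0.100,
    "e1000_get_cfg_done_generic": 0.052,
    "e1000_post_phy_reset_ich8lan.part.18": 0.048,
}

# Total time spent in rte_delay_us_block across all listed callers.
total = sum(callers.values())


def share(name):
    """Percentage of rte_delay_us_block time attributed to one caller."""
    return 100.0 * callers[name] / total


# Assumption: only this caller is counted as one-time initialization, as the
# message does; the thread does not classify the other callers.
init_share = share("e1000_enable_ulp_lpt_lp")
remaining = total - callers["e1000_enable_ulp_lpt_lp"]

# Effective CPU utilization: busy logical CPUs out of 4.
utilization = 100.0 * 0.856 / 4

print(f"total delay time: {total:.3f}s")
print(f"init share: {init_share:.1f}%")
print(f"delay time from other callers: {remaining:.3f}s")
print(f"effective CPU utilization: {utilization:.1f}%")
```

Run as a script, this reproduces the 92.3% share and the 21.4%
utilization quoted in the message, and shows that about 0.57s of the
delay time comes from the remaining callers.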