From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wiles, Keith"
To: Harsh Patel
CC: Stephen Hemminger, Kyle Larose, "users@dpdk.org"
Date: Sun, 30 Dec 2018 00:19:51 +0000
Message-ID: <8AEE7D24-1870-4CA1-9390-EC349F5284C9@intel.com>
Subject: Re: [dpdk-users] Query on handling packets
List-Id: DPDK usage discussions

> On Dec 29, 2018, at 4:03 PM, Harsh Patel wrote:
>
> Hello,
> As suggested, we tried profiling the application using Intel VTune Amplifier. We aren't sure how to use these results, so we are attaching them to this email.
>
> The things we understood were 'Top Hotspots' and 'Effective CPU utilization'. Following are some of our understandings:
>
> Top Hotspots
>
> Function                          Module                              CPU Time
> rte_delay_us_block                librte_eal.so.6.1                   15.042s
> eth_em_recv_pkts                  librte_pmd_e1000.so                  9.544s
> ns3::DpdkNetDevice::Read          libns3.28.1-fd-net-device-debug.so   3.522s
> ns3::DpdkNetDeviceReader::DoRead  libns3.28.1-fd-net-device-debug.so   2.470s
> rte_eth_rx_burst                  libns3.28.1-fd-net-device-debug.so   2.456s
> [Others]                                                               6.656s
>
> We knew about other methods except `rte_delay_us_block`. So we investigated the callers of this method:
>
> Callers  Effective Time  Spin Time  Overhead Time  Effective Time  Spin Time  Overhead Time  Wait Time: Total  Wait Time: Self
> e1000_enable_ulp_lpt_lp               45.6%  0.0%  0.0%  6.860s  0usec  0usec
> e1000_write_phy_reg_mdic              32.7%  0.0%  0.0%  4.916s  0usec  0usec
> e1000_read_phy_reg_mdic               19.4%  0.0%  0.0%  2.922s  0usec  0usec
> e1000_reset_hw_ich8lan                 1.0%  0.0%  0.0%  0.143s  0usec  0usec
> eth_em_link_update                     0.7%  0.0%  0.0%  0.100s  0usec  0usec
> e1000_post_phy_reset_ich8lan.part.18   0.4%  0.0%  0.0%  0.064s  0usec  0usec
> e1000_get_cfg_done_generic             0.2%  0.0%  0.0%  0.037s  0usec  0usec
>
> We lack sufficient knowledge to investigate more than this.
>
> Effective CPU utilization
>
> Interestingly, the effective CPU utilization was 20.8% (0.832 out of 4 logical CPUs).
> We thought this was low, so we compared it with the raw-socket version of the code, whose utilization was even lower, 8.0% (0.318 out of 4 logical CPUs), and even then it performs way better.
>
> It would be helpful if you could give us insights on how to use these results or point us to some resources to do so.

I tracked down the rte_delay_us_block time to the SendFrom() function calling the IsLinkUp() function. It appears to call that routine on every SendFrom() call, which for the e1000 must be a very expensive call. So rework your code to not call IsLinkUp() except every so often. I believe you can enable the link status interrupt in DPDK to take an interrupt on link status change, which would be better than calling this routine. How you do that I am not sure, but it should be in the docs someplace.

For now I would remove the IsLinkUp() call and just assume the link is up after you check it the first time in the Setup call function.

>
> Thank you
>
> Regards
> Harsh & Hrishikesh
>

Regards,
Keith
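
P.S. A minimal sketch of the "check once, then only every so often" idea, in case it helps. CachedLinkStatus and its members are hypothetical names, not part of ns-3 or DPDK, and the query callback stands in for whatever expensive call sits behind IsLinkUp(). The Set() hook is where a link status change notification could update the cached value; DPDK exposes an RTE_ETH_EVENT_INTR_LSC event through rte_eth_dev_callback_register(), but whether that works with the e1000 PMD in this setup is an assumption I have not verified.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (not part of ns-3 or DPDK): caches the link state and
// re-runs the expensive query only once every `interval` calls.
class CachedLinkStatus {
public:
    // `query` stands in for the expensive driver call behind IsLinkUp().
    CachedLinkStatus(std::function<bool()> query, std::uint32_t interval)
        : query_(std::move(query)), interval_(interval) {
        assert(interval_ > 0);
        up_ = query_();  // check once up front, as in the Setup step
    }

    // Cheap check intended for the hot path (e.g. every SendFrom() call).
    bool IsUp() {
        if (++calls_ >= interval_) {
            calls_ = 0;
            up_ = query_();  // occasional refresh of the cached state
        }
        return up_;
    }

    // Hook for a link-status-change notification, if one is available.
    void Set(bool up) {
        up_ = up;
        calls_ = 0;
    }

private:
    std::function<bool()> query_;
    std::uint32_t interval_;
    std::uint32_t calls_ = 0;
    bool up_ = false;
};
```

With an interval of 1000, the expensive query runs once at construction and then once per 1000 sends instead of on every send. The right interval depends on how quickly the application needs to notice a link drop.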