From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 22 Oct 2020 14:57:38 +0100
From: Bruce Richardson
To: Ali Alnubani
Cc: "dev@dpdk.org", NBU-Contact-Thomas Monjalon, Asaf Penso
Message-ID: <20201022135738.GB90@bricha3-MOBL.ger.corp.intel.com>
References: <20201015170804.GG554@bricha3-MOBL.ger.corp.intel.com>
 <20201016095910.GD1008@bricha3-MOBL.ger.corp.intel.com>
 <20201019130130.GA663@bricha3-MOBL.ger.corp.intel.com>
Subject: Re: [dpdk-dev] performance degradation with fpic
List-Id: DPDK patches and discussions

On Thu, Oct 22, 2020 at 01:17:16PM +0000, Ali Alnubani wrote:
> Hi Bruce,
> Sorry for the delayed response.
>
> > -----Original Message-----
> > From: Bruce Richardson
> > Sent: Monday, October 19, 2020 4:02 PM
> > To: Ali Alnubani
> > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > Subject: Re: [dpdk-dev] performance degradation with fpic
> >
> > On Mon, Oct 19, 2020 at 11:47:48AM +0000, Ali Alnubani wrote:
> > > Hi Bruce,
> > >
> > > > -----Original Message-----
> > > > From: Bruce Richardson
> > > > Sent: Friday, October 16, 2020 12:59 PM
> > > > To: Ali Alnubani
> > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > > >
> > > > On Thu, Oct 15, 2020 at 06:08:04PM +0100, Bruce Richardson wrote:
> > > > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > > > > > Hi Bruce,
> > > > > >
> > > > > > We have been seeing in some cases that the DPDK forwarding
> > > > > > performance is up to 9% lower when DPDK is built as static
> > > > > > with meson compared to a build with makefiles.
> > > > > >
> > > > > > The same degradation can be reproduced with makefiles on
> > > > > > older DPDK releases when building with EXTRA_CFLAGS set to
> > > > > > “-fPIC”. It can also be resolved in meson by passing
> > > > > > “pic: false” to meson’s static_library call (more tweaking
> > > > > > needs to be done to prevent building shared libraries,
> > > > > > because this change breaks them).
> > > > > >
> > > > > > I can reproduce this drop with the following cases:
> > > > > > * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R)
> > > > > >   Xeon(R) Gold 6154.
> > > > > >   Testpmd command:
> > > > > >
> > > > > >   testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0
> > > > > >   --socket-mem=2048,2048 -- --port-numa-config=0,1,1,1
> > > > > >   --socket-num=1 --burst=64 --txd=512 --rxd=512
> > > > > >   --mbcache=512 --rxq=2 --txq=2 --nb-cores=1
> > > > > >   --no-lsc-interrupt -i -a --rss-udp
> > > > > > * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC:
> > > > > >   ConnectX-5 / Host’s CPU: Intel(R) Xeon(R) Gold 6154.
> > > > > >   Testpmd command:
> > > > > >   testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
> > > > > >   00:05.0,mprq_en=1,mprq_log_stride_num=6
> > > > > >   --socket-mem=2048,0 -- --port-numa-config=0,0
> > > > > >   --socket-num=0 --burst=64 --txd=1024 --rxd=1024
> > > > > >   --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
> > > > > >   --port-topology=chained --forward-mode=macswap
> > > > > >   --no-lsc-interrupt -i -a --rss-udp
> > > > > > * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU:
> > > > > >   Intel(R) Xeon(R) CPU E5-2697A v4. Testpmd command:
> > > > > >   testpmd -n 4 -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1 -w
> > > > > >   0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80 --
> > > > > >   --burst=64 --mbcache=512 -i --nb-cores=8 --rxq=8 --txq=8
> > > > > >   --txd=1024 --rxd=1024 --rss-udp --auto-start
> > > > > >
> > > > > > The packets being received and forwarded by testpmd are of
> > > > > > IPv4/UDP type and 64B size.
> > > > > >
> > > > > > Should we disable PIC in static builds?
> > > > > >
> > > > >
> > > > > Hi Ali,
> > > > >
> > > > > thanks for reporting, though it's strange that you see such a
> > > > > big impact. In my previous tests with i40e driver I never
> > > > > noticed a difference between make and meson builds, and I and
> > > > > some others here have been using meson builds for any
> > > > > performance work for over a year now. That being said let me
> > > > > reverify what I see on my end.
> > > > >
> > > > > In terms of solutions, disabling the -fPIC flag globally
> > > > > implies that we can no longer build static and shared libs
> > > > > from the same sources, so we would need to revert to doing
> > > > > either a static or a shared library build but not both. If the
> > > > > issue is limited to only some drivers or some cases, we can
> > > > > perhaps add in a build option to have no-fpic-static builds,
> > > > > to be used in cases where it is problematic.
> > > > >
> > > > > However, at this point, I think we need a little more
> > > > > investigation. Is there any testing you can do to see if it's
> > > > > just in your driver, or in perhaps a mempool driver/lib that
> > > > > the issue appears, or if it's just a global slowdown? Do you
> > > > > see the impact with both clang and gcc?
> > > > > I'll retest things a bit tomorrow on my end to see what I see.
> > > > >
> > > > Hi again,
> > > >
> > > > I've done a quick retest with the i40e driver on my system,
> > > > using the 20.08 version so as to have make vs meson direct
> > > > comparison. [For reference command used was: "sudo -c F00000
> > > > -w af:00.0 -w b1:00.0 -w da:00.0 -- --rxq=2 --txq=2 --rxd=2048
> > > > --txd=512" using 3x40G ports to a single core running @3GHz.]
> > > > No major performance differences were seen, but if anything the
> > > > meson build was very slightly faster, as reported to Jerin,
> > > > maybe 2%, though it's within the margin of error.
> > > >
> > >
> > > Thanks for taking the time to investigate this.
> > >
> > > Disabling PIC for net/mlx5 driver alone in drivers/meson.build
> > > resolves the issue for me.
> > > I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0).
> > > But I see now that disabling PIC with an old clang version
> > > (clang 3.4.2, RHEL7.4) causes a drop in performance, not an
> > > improvement like with gcc.
> > >
> > That's interesting.
> >
> > When you just build with and without -fpic with newer clang, do you
> > see the same perf drop as with gcc? With the older clang, is the
> > shared lib build faster than the static one?
>
> With the older clang on RHEL7.4, the shared lib is about 2% slower
> compared to the static build.
> With clang 11 compiled from source on Ubuntu 18.04, I'm getting good
> performance with static meson build, same performance as with
> makefiles with gcc, and ~6% better than the static meson gcc build.
> Disabling PIC on clang 11 degrades performance by ~4%.
> With clang 6.0.0 however, disabling PIC causes a very small drop
> (~0.1%).
>
> This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command:
> "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0
> --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0
> --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8
> --nb-cores=4 --port-topology=chained --forward-mode=macswap
> --no-lsc-interrupt -i -a --rss-udp".
>

So, am I right in saying that the clang builds all appear fine here,
i.e. performance is pretty much as expected in every case with the
default setting of PIC enabled? If so, it appears that the issue is
limited to gcc builds at this point?

/Bruce
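
For anyone who wants to look at the gcc side in isolation, here is a
minimal standalone sketch. It is not DPDK code: the file name and the
names rx_bytes, clamp_len and account_burst are made up for
illustration. It contains the two things for which -fPIC changes code
generation: an externally visible global and a small externally visible
helper called from a hot loop. As far as I know, gcc with -fPIC assumes
by default that such symbols may be interposed at load time, so the
global is typically reached through the GOT and the helper is usually
not inlined, whereas clang by default still inlines it. Whether this is
what affects net/mlx5 is an untested assumption.

```c
/* pic_demo.c: hypothetical example, not taken from DPDK.
 * Compare the generated assembly from:
 *   gcc -O2 -S pic_demo.c -o nopic.s
 *   gcc -O2 -fPIC -S pic_demo.c -o pic.s
 */
#include <stddef.h>
#include <stdint.h>

/* Externally visible global: with -fPIC it is typically accessed
 * through the GOT rather than directly. */
uint64_t rx_bytes;

/* Externally visible helper: with -fPIC, gcc treats it as interposable
 * by default and usually does not inline it into its callers. */
uint32_t clamp_len(uint32_t len)
{
    return len > 1518 ? 1518 : len;
}

/* Hot loop standing in for a per-burst datapath routine. */
uint64_t account_burst(const uint32_t *lens, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += clamp_len(lens[i]);

    rx_bytes += total; /* update of the global counter */
    return total;
}
```

Comparing the two assembly files, and a third one built with -fPIC plus
-fno-semantic-interposition (available in gcc 5 and later), shows
whether the call to clamp_len is still inlined into the loop. Marking
the helper static is another way to check the same thing.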