Message-ID: <47af14f7df334da1a7405eb7e9a5d6a0513045bd.camel@debian.org>
From: Luca Boccassi
To: Ali Alnubani, Bruce Richardson
Cc: "dev@dpdk.org", NBU-Contact-Thomas Monjalon, Asaf Penso
Date: Mon, 02 Nov 2020 11:01:58 +0000
References: <20201015170804.GG554@bricha3-MOBL.ger.corp.intel.com>
 <20201016095910.GD1008@bricha3-MOBL.ger.corp.intel.com>
 <20201019130130.GA663@bricha3-MOBL.ger.corp.intel.com>
 <20201022135738.GB90@bricha3-MOBL.ger.corp.intel.com>
Subject: Re: [dpdk-dev] performance degradation with fpic
List-Id: DPDK patches and discussions

On Mon, 2020-11-02 at 10:40 +0000, Ali Alnubani wrote:
> Hi Bruce,
>
> I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing
> -fPIC from its ninja recipe in build.ninja resolves the issue (had to
> prevent creating shared libs in this case).
> What do you suggest I do? Can we have per-pmd customized compilation
> flags?
>
> Regards,
> Ali

It's great to pin-point it down to that level - but would it be possible
now to find out _why_ that one source file is affected by -fPIC in this
way on GCC - and not on Clang?

I think it would be much better to fix the issue itself, rather than
applying the "sledgehammer" approach. Not using fpic has severe
consequences for distributability as mentioned earlier.

> > -----Original Message-----
> > From: Ali Alnubani
> > Sent: Thursday, October 22, 2020 5:17 PM
> > To: Bruce Richardson
> > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > Subject: RE: [dpdk-dev] performance degradation with fpic
> >
> > > -----Original Message-----
> > > From: Bruce Richardson
> > > Sent: Thursday, October 22, 2020 4:58 PM
> > > To: Ali Alnubani
> > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > >
> > > On Thu, Oct 22, 2020 at 01:17:16PM +0000, Ali Alnubani wrote:
> > > > Hi Bruce,
> > > > Sorry for the delayed response.
> > > >
> > > > > -----Original Message-----
> > > > > From: Bruce Richardson
> > > > > Sent: Monday, October 19, 2020 4:02 PM
> > > > > To: Ali Alnubani
> > > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > > > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > > > >
> > > > > On Mon, Oct 19, 2020 at 11:47:48AM +0000, Ali Alnubani wrote:
> > > > > > Hi Bruce,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Bruce Richardson
> > > > > > > Sent: Friday, October 16, 2020 12:59 PM
> > > > > > > To: Ali Alnubani
> > > > > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon; Asaf Penso
> > > > > > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > > > > > >
> > > > > > > On Thu, Oct 15, 2020 at 06:08:04PM +0100, Bruce Richardson wrote:
> > > > > > > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > > > > > > > > Hi Bruce,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We have been seeing in some cases that the DPDK forwarding
> > > > > > > > > performance is up to 9% lower when DPDK is built as static
> > > > > > > > > with meson compared to a build with makefiles.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > The same degradation can be reproduced with makefiles on
> > > > > > > > > older DPDK releases when building with EXTRA_CFLAGS set to
> > > > > > > > > “-fPIC”, it can also be resolved in meson when passing
> > > > > > > > > “pic: false” to meson’s static_library call (more tweaking
> > > > > > > > > needs to be done to prevent building shared libraries
> > > > > > > > > because this change breaks them).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I can reproduce this drop with the following cases:
> > > > > > > > >   * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU:
> > > > > > > > >     Intel(R) Xeon(R) Gold 6154. Testpmd command:
> > > > > > > > >
> > > > > > > > >     testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0
> > > > > > > > >     --socket-mem=2048,2048 -- --port-numa-config=0,1,1,1
> > > > > > > > >     --socket-num=1 --burst=64 --txd=512 --rxd=512
> > > > > > > > >     --mbcache=512 --rxq=2 --txq=2 --nb-cores=1
> > > > > > > > >     --no-lsc-interrupt -i -a --rss-udp
> > > > > > > > >   * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC:
> > > > > > > > >     ConnectX-5 / Host’s CPU: Intel(R) Xeon(R) Gold 6154.
> > > > > > > > >     Testpmd command:
> > > > > > > > >     testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
> > > > > > > > >     00:05.0,mprq_en=1,mprq_log_stride_num=6
> > > > > > > > >     --socket-mem=2048,0 -- --port-numa-config=0,0
> > > > > > > > >     --socket-num=0 --burst=64 --txd=1024 --rxd=1024
> > > > > > > > >     --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
> > > > > > > > >     --port-topology=chained --forward-mode=macswap
> > > > > > > > >     --no-lsc-interrupt -i -a --rss-udp
> > > > > > > > >   * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU:
> > > > > > > > >     Intel(R) Xeon(R) CPU E5-2697A v4. Testpmd command:
> > > > > > > > >     testpmd -n 4 -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1
> > > > > > > > >     -w 0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80
> > > > > > > > >     -- --burst=64 --mbcache=512 -i --nb-cores=8 --rxq=8
> > > > > > > > >     --txq=8 --txd=1024 --rxd=1024 --rss-udp --auto-start
> > > > > > > > >
> > > > > > > > > The packets being received and forwarded by testpmd are of
> > > > > > > > > IPv4/UDP type and 64B size.
> > > > > > > > >
> > > > > > > > > Should we disable PIC in static builds?
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hi Ali,
> > > > > > > >
> > > > > > > > thanks for reporting, though it's strange that you see such
> > > > > > > > a big impact.
> > > > > > > > In my previous tests with i40e driver I never noticed a
> > > > > > > > difference between make and meson builds, and I and some
> > > > > > > > others here have been using meson builds for any performance
> > > > > > > > work for over a year now. That being said let me reverify
> > > > > > > > what I see on my end.
> > > > > > > > In terms of solutions, disabling the -fPIC flag globally
> > > > > > > > implies that we can no longer build static and shared libs
> > > > > > > > from the same sources, so we would need to revert to doing
> > > > > > > > either a static or a shared library build but not both. If
> > > > > > > > the issue is limited to only some drivers or some cases, we
> > > > > > > > can perhaps add in a build option to have no-fpic-static
> > > > > > > > builds, to be used in cases where it is problematic.
> > > > > > > > However, at this point, I think we need a little more
> > > > > > > > investigation.
> > > > > > > > Is there any testing you can do to see if it's just in your
> > > > > > > > driver, or in perhaps a mempool driver/lib that the issue
> > > > > > > > appears, or if it's just a global slowdown? Do you see the
> > > > > > > > impact with both clang and gcc?
> > > > > > > > I'll retest things a bit tomorrow on my end to see what I see.
> > > > > > > >
> > > > > > > Hi again,
> > > > > > >
> > > > > > > I've done a quick retest with the i40e driver on my system,
> > > > > > > using the 20.08 version so as to have make vs meson direct
> > > > > > > comparison.
> > > > > > > [For reference command used was: "sudo -c F00000 -w af:00.0
> > > > > > > -w b1:00.0 -w da:00.0 -- --rxq=2 --txq=2 --rxd=2048 --txd=512"
> > > > > > > using 3x40G ports to a single core running @3GHz.] No major
> > > > > > > performance differences were seen, but if anything the meson
> > > > > > > build was very slightly faster, as reported to Jerin, maybe
> > > > > > > 2%, though it's within the margin of error.
> > > > > >
> > > > > > Thanks for taking the time to investigate this.
> > > > > >
> > > > > > Disabling PIC for net/mlx5 driver alone in drivers/meson.build
> > > > > > resolves the issue for me.
> > > > > > I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0).
> > > > > > But I see now that disabling PIC with an old clang version
> > > > > > (clang 3.4.2, RHEL7.4) causes a drop in performance, not an
> > > > > > improvement like with gcc.
> > > > > That's interesting.
> > > > >
> > > > > When you just build with and without -fpic with newer clang, do
> > > > > you see the same perf drop as with gcc? With the older clang, is
> > > > > the shared lib build faster than the static one?
> > > >
> > > > With the older clang on RHEL7.4, the shared lib is about ~2% slower
> > > > compared to the static build.
> > > > With clang 11 compiled from source on ubuntu 18.04, I'm getting good
> > > > performance with static meson build, same performance as with
> > > > makefiles with gcc, and ~6% better than the static meson gcc build.
> > > > Disabling PIC on clang 11 degrades performance by ~4%.
> > > > With clang 6.0.0 however, disabling PIC causes a very small drop
> > > > (~0.1%).
> > > >
> > > > This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command:
> > > > "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0
> > > > --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0
> > > > --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8
> > > > --nb-cores=4 --port-topology=chained --forward-mode=macswap
> > > > --no-lsc-interrupt -i -a --rss-udp".
> > >
> > > So, am I right in saying that it appears the clang builds are all fine
> > > here, that performance is pretty much as expected in all cases with
> > > the default setting of PIC enabled? Therefore it appears that the
> > > issue is limited to gcc builds at this point?
> > >
> > Yes it appears that way.
> >
> > Regards,
> > Ali
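
For reference, the "pic: false" workaround mentioned in the thread could
look roughly like the fragment below. This is only a sketch under stated
assumptions, not DPDK's actual drivers/meson.build: the project name, the
static_pic toggle, the target names and example.c are all hypothetical.
The only parts taken as given are Meson's static_library() with its pic
keyword argument and extract_all_objects() on the resulting target.

```meson
# Hypothetical sketch only; this is not DPDK's drivers/meson.build.
# 'pic_sketch', 'static_pic', 'example.c' and the target names are made up.
project('pic_sketch', 'c')

# Assumed toggle; a real build would more likely expose this as a
# meson option than hard-code it.
static_pic = false

sources = files('example.c')

# 'pic' is a keyword argument of static_library(); setting it to false
# builds this one target without position-independent code.
static_lib = static_library('example_static', sources,
        pic: static_pic)

# Objects built without PIC generally cannot be linked into a shared
# object, so only reuse them for a shared library when PIC was kept.
if static_pic
    shared_lib = shared_library('example_shared',
            objects: static_lib.extract_all_objects(recursive: false))
endif
```

The conditional at the end illustrates the trade-off raised in the
thread: once the static objects are built without PIC, the same objects
can no longer feed the shared library, which is why a global switch
forces a choice between static and shared builds.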