DPDK patches and discussions
From: Ali Alnubani <alialnu@nvidia.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	NBU-Contact-Thomas Monjalon <thomas@monjalon.net>,
	Asaf Penso <asafp@nvidia.com>
Subject: Re: [dpdk-dev] performance degradation with fpic
Date: Mon, 2 Nov 2020 10:40:54 +0000
Message-ID: <DM6PR12MB4618B5D9D93473E1488D11F8DA100@DM6PR12MB4618.namprd12.prod.outlook.com>
In-Reply-To: <DM6PR12MB4618128D8DB2C60A9EDEC972DA1D0@DM6PR12MB4618.namprd12.prod.outlook.com>

Hi Bruce,

I was able to pin this down to drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (I had to prevent the shared libs from being built in this case).
What do you suggest I do? Can we have per-PMD customized compilation flags?

Regards,
Ali
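
The experiment described above (dropping -fPIC from the recipe of a single object in build.ninja) could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes the generated build.ninja puts the compiler flags in an indented "ARGS = ..." binding under each "build ..." statement, and the function name strip_flag_for_source is hypothetical. The edit is lost whenever meson regenerates build.ninja, so it is useful for measurement only, not as a fix.

```python
def strip_flag_for_source(ninja_text: str, source_suffix: str, flag: str = "-fPIC") -> str:
    """Return ninja_text with `flag` removed from the ARGS binding of every
    build statement that mentions `source_suffix`.

    Assumption: flags live in an indented `ARGS = ...` line directly under
    the `build ...:` statement, as in meson-generated ninja files.
    """
    out = []
    in_target = False  # True while inside a build block for the wanted source
    for line in ninja_text.splitlines(keepends=True):
        if line.startswith("build "):
            # A new build statement starts; check whether it is the one we want.
            in_target = source_suffix in line
        elif not line.startswith((" ", "\t")):
            # Any other unindented line (rule, comment, blank) ends the block.
            in_target = False
        elif in_target and line.lstrip().startswith("ARGS"):
            key, sep, value = line.partition("=")
            kept = [tok for tok in value.split() if tok != flag]
            line = f"{key}{sep} {' '.join(kept)}\n"
        out.append(line)
    return "".join(out)
```

A possible use would be to read build.ninja from the build directory, call strip_flag_for_source(text, "drivers/net/mlx5/mlx5_rxtx.c"), write the result back, and rerun ninja. Objects built without -fPIC cannot be linked into the shared libraries, which matches the note above about having to prevent the shared libs from being created.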

> -----Original Message-----
> From: Ali Alnubani
> Sent: Thursday, October 22, 2020 5:17 PM
> To: Bruce Richardson <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>
> Subject: RE: [dpdk-dev] performance degradation with fpic
> 
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Thursday, October 22, 2020 4:58 PM
> > To: Ali Alnubani <alialnu@nvidia.com>
> > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>;
> > Asaf Penso <asafp@nvidia.com>
> > Subject: Re: [dpdk-dev] performance degradation with fpic
> >
> > On Thu, Oct 22, 2020 at 01:17:16PM +0000, Ali Alnubani wrote:
> > > Hi Bruce,
> > > Sorry for the delayed response.
> > >
> > > > -----Original Message-----
> > > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > > Sent: Monday, October 19, 2020 4:02 PM
> > > > To: Ali Alnubani <alialnu@nvidia.com>
> > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>;
> > > > Asaf Penso <asafp@nvidia.com>
> > > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > > >
> > > > On Mon, Oct 19, 2020 at 11:47:48AM +0000, Ali Alnubani wrote:
> > > > > Hi Bruce,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > > > > Sent: Friday, October 16, 2020 12:59 PM
> > > > > > To: Ali Alnubani <alialnu@nvidia.com>
> > > > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> > > > <thomas@monjalon.net>;
> > > > > > Asaf Penso <asafp@nvidia.com>
> > > > > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > > > > >
> > > > > > On Thu, Oct 15, 2020 at 06:08:04PM +0100, Bruce Richardson wrote:
> > > > > > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > > > > > > >    Hi Bruce,
> > > > > > > >
> > > > > > > >
> > > > > > > > >    We have been seeing in some cases that the DPDK forwarding
> > > > > > > > >    performance is up to 9% lower when DPDK is built as static with
> > > > > > > > >    meson compared to a build with makefiles.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >    The same degradation can be reproduced with makefiles on older
> > > > > > > > >    DPDK releases when building with EXTRA_CFLAGS set to “-fPIC”. It
> > > > > > > > >    can also be resolved in meson by passing “pic: false” to meson’s
> > > > > > > > >    static_library call (more tweaking needs to be done to prevent
> > > > > > > > >    building shared libraries because this change breaks them).
> > > > > > > >
> > > > > > > >
> > > > > > > > >    I can reproduce this drop with the following cases:
> > > > > > > > >      * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R)
> > > > > > > > >        Xeon(R) Gold 6154. Testpmd command:
> > > > > > > > >        testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0
> > > > > > > > >        --socket-mem=2048,2048 -- --port-numa-config=0,1,1,1
> > > > > > > > >        --socket-num=1 --burst=64 --txd=512 --rxd=512 --mbcache=512
> > > > > > > > >        --rxq=2 --txq=2 --nb-cores=1 --no-lsc-interrupt -i -a --rss-udp
> > > > > > > > >      * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC: ConnectX-5 /
> > > > > > > > >        Host’s CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
> > > > > > > > >        testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
> > > > > > > > >        00:05.0,mprq_en=1,mprq_log_stride_num=6 --socket-mem=2048,0 --
> > > > > > > > >        --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024
> > > > > > > > >        --rxd=1024 --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
> > > > > > > > >        --port-topology=chained --forward-mode=macswap
> > > > > > > > >        --no-lsc-interrupt -i -a --rss-udp
> > > > > > > > >      * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU: Intel(R)
> > > > > > > > >        Xeon(R) CPU E5-2697A v4. Testpmd command:
> > > > > > > > >        testpmd -n 4 -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1 -w
> > > > > > > > >        0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80 -- --burst=64
> > > > > > > > >        --mbcache=512 -i --nb-cores=8 --rxq=8 --txq=8 --txd=1024
> > > > > > > > >        --rxd=1024 --rss-udp --auto-start
> > > > > > > > >
> > > > > > > > >    The packets being received and forwarded by testpmd are of
> > > > > > > > >    IPv4/UDP type and 64B size.
> > > > > > > >
> > > > > > > >    Should we disable PIC in static builds?
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > Hi Ali,
> > > > > > >
> > > > > > > > thanks for reporting, though it's strange that you see such a big
> > > > > > > > impact. In my previous tests with the i40e driver I never noticed a
> > > > > > > > difference between make and meson builds, and I and some others here
> > > > > > > > have been using meson builds for any performance work for over a
> > > > > > > > year now. That being said, let me reverify what I see on my end.
> > > > > > > >
> > > > > > > > In terms of solutions, disabling the -fPIC flag globally implies that
> > > > > > > > we can no longer build static and shared libs from the same sources,
> > > > > > > > so we would need to revert to doing either a static or a shared
> > > > > > > > library build but not both. If the issue is limited to only some
> > > > > > > > drivers or some cases, we can perhaps add a build option to have
> > > > > > > > no-fpic-static builds, to be used in cases where it is problematic.
> > > > > > > >
> > > > > > > > However, at this point, I think we need a little more investigation.
> > > > > > > > Is there any testing you can do to see if it's just in your driver,
> > > > > > > > or perhaps in a mempool driver/lib that the issue appears, or if it's
> > > > > > > > just a global slowdown? Do you see the impact with both clang and gcc?
> > > > > > > > I'll retest things a bit tomorrow on my end to see what I see.
> > > > > > >
> > > > > > Hi again,
> > > > > >
> > > > > > > I've done a quick retest with the i40e driver on my system, using the
> > > > > > > 20.08 version so as to have a direct make vs meson comparison.
> > > > > > > [For reference, the command used was: "sudo </path/to/testpmd> -c F00000
> > > > > > > -w af:00.0 -w b1:00.0 -w da:00.0 -- --rxq=2 --txq=2 --rxd=2048
> > > > > > > --txd=512" using 3x40G ports to a single core running @3GHz.] No major
> > > > > > > performance differences were seen, but if anything the meson build was
> > > > > > > very slightly faster, as reported to Jerin, maybe 2%, though it's
> > > > > > > within the margin of error.
> > > > > >
> > > > >
> > > > > Thanks for taking the time to investigate this.
> > > > >
> > > > > > Disabling PIC for the net/mlx5 driver alone in drivers/meson.build
> > > > > > resolves the issue for me.
> > > > > > I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0). But I
> > > > > > see now that disabling PIC with an old clang version (clang 3.4.2,
> > > > > > RHEL7.4) causes a drop in performance, not an improvement like with gcc.
> > > > >
> > > > That's interesting.
> > > >
> > > > When you just build with and without -fpic with newer clang, do
> > > > you see the same perf drop as with gcc? With the older clang, is
> > > > the shared lib build faster than the static one?
> > >
> > > With the older clang on RHEL7.4, the shared lib is about 2% slower
> > > compared to the static build.
> > > With clang 11 compiled from source on Ubuntu 18.04, I'm getting good
> > > performance with the static meson build: the same performance as with
> > > makefiles with gcc, and ~6% better than the static meson gcc build.
> > > Disabling PIC on clang 11 degrades performance by ~4%.
> > > With clang 6.0.0 however, disabling PIC causes a very small drop (~0.1%).
> > >
> > > This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command:
> > > "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0
> > > --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0 --burst=64
> > > --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8 --nb-cores=4
> > > --port-topology=chained --forward-mode=macswap --no-lsc-interrupt -i -a
> > > --rss-udp".
> > >
> >
> > So, am I right in saying that it appears the clang builds are all fine
> > here, that performance is pretty much as expected in all cases with
> > the default setting of PIC enabled? Therefore it appears that the
> > issue is limited to gcc builds at this point?
> >
> Yes it appears that way.
> 
> Regards,
> Ali

Thread overview: 20+ messages
2020-10-15 16:00 Ali Alnubani
2020-10-15 17:08 ` Bruce Richardson
2020-10-15 17:14   ` Thomas Monjalon
2020-10-15 17:28     ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
2020-10-16  8:29       ` Bruce Richardson
2020-10-16  8:39         ` Jerin Jacob
2020-10-15 21:44     ` [dpdk-dev] " Stephen Hemminger
2020-10-16  8:35       ` Bruce Richardson
2020-10-16  9:59   ` Bruce Richardson
2020-10-19 11:47     ` Ali Alnubani
2020-10-19 13:01       ` Bruce Richardson
2020-10-22 13:17         ` Ali Alnubani
2020-10-22 13:57           ` Bruce Richardson
2020-10-22 14:16             ` Ali Alnubani
2020-11-02 10:40               ` Ali Alnubani [this message]
2020-11-02 11:01                 ` Luca Boccassi
2020-11-02 15:00                 ` Bruce Richardson
2020-11-03 10:18                   ` Thomas Monjalon
2020-11-03 10:45                     ` Luca Boccassi
2020-11-03 11:23                     ` Bruce Richardson
