DPDK patches and discussions
* [dpdk-dev] performance degradation with fpic
@ 2020-10-15 16:00 Ali Alnubani
  2020-10-15 17:08 ` Bruce Richardson
  0 siblings, 1 reply; 20+ messages in thread
From: Ali Alnubani @ 2020-10-15 16:00 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

Hi Bruce,

In some cases we see DPDK forwarding performance that is up to 9% lower when DPDK is built statically with meson than when it is built with makefiles.

The same degradation can be reproduced with makefiles on older DPDK releases when building with EXTRA_CFLAGS set to "-fPIC". It can also be resolved in meson by passing "pic: false" to meson's static_library call (more tweaking is needed to prevent building shared libraries, because this change breaks them).

I can reproduce this drop with the following cases:

  *   Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0 --socket-mem=2048,2048 -- --port-numa-config=0,1,1,1 --socket-num=1 --burst=64 --txd=512 --rxd=512 --mbcache=512 --rxq=2 --txq=2 --nb-cores=1 --no-lsc-interrupt -i -a --rss-udp

  *   KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC: ConnectX-5 / Host's CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0,mprq_en=1,mprq_log_stride_num=6 --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=16 --txq=16 --nb-cores=8 --port-topology=chained --forward-mode=macswap --no-lsc-interrupt -i -a --rss-udp

  *   Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU: Intel(R) Xeon(R) CPU E5-2697A v4. Testpmd command:
testpmd -n 4  -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1  -w 0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80  -- --burst=64 --mbcache=512 -i  --nb-cores=8  --rxq=8 --txq=8 --txd=1024 --rxd=1024 --rss-udp --auto-start

The packets being received and forwarded by testpmd are of IPv4/UDP type and 64B size.

Should we disable PIC in static builds?

Regards,
Ali

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-15 16:00 [dpdk-dev] performance degradation with fpic Ali Alnubani
@ 2020-10-15 17:08 ` Bruce Richardson
  2020-10-15 17:14   ` Thomas Monjalon
  2020-10-16  9:59   ` Bruce Richardson
  0 siblings, 2 replies; 20+ messages in thread
From: Bruce Richardson @ 2020-10-15 17:08 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
>    Hi Bruce,
> 
> 
>    We have been seeing in some cases that the DPDK forwarding performance
>    is up to 9% lower when DPDK is built as static with meson compared to a
>    build with makefiles.
> 
> 
>    The same degradation can be reproduced with makefiles on older DPDK
>    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
>    resolved in meson when passing “pic: false” to meson’s static_library
>    call (more tweaking needs to be done to prevent building shared
>    libraries because this change breaks them).
> 
> 
>    I can reproduce this drop with the following cases:
>      * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R)
>        Xeon(R) Gold 6154. Testpmd command:
> 
>    testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0 --socket-mem=2048,2048
>    -- --port-numa-config=0,1,1,1 --socket-num=1 --burst=64 --txd=512
>    --rxd=512 --mbcache=512 --rxq=2 --txq=2 --nb-cores=1 --no-lsc-interrupt
>    -i -a --rss-udp
>      * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC: ConnectX-5 /
>        Host’s CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
>        testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
>        00:05.0,mprq_en=1,mprq_log_stride_num=6 --socket-mem=2048,0 --
>        --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024
>        --rxd=1024 --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
>        --port-topology=chained --forward-mode=macswap --no-lsc-interrupt
>        -i -a --rss-udp
>      * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU: Intel(R)
>        Xeon(R) CPU E5-2697A v4. Testpmd command:
>        testpmd -n 4  -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1  -w
>        0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80  -- --burst=64
>        --mbcache=512 -i  --nb-cores=8  --rxq=8 --txq=8 --txd=1024
>        --rxd=1024 --rss-udp --auto-start
> 
>    The packets being received and forwarded by testpmd are of IPv4/UDP
>    type and 64B size.
> 
>    Should we disable PIC in static builds?
> 
> 

Hi Ali,

Thanks for reporting, though it's strange that you see such a big impact.
In my previous tests with the i40e driver I never noticed a difference between
make and meson builds, and I and some others here have been using meson
builds for any performance work for over a year now. That being said, let me
re-verify what I see on my end.

In terms of solutions, disabling the -fPIC flag globally implies that we
can no longer build static and shared libs from the same sources, so we
would need to revert to doing either a static or a shared library build
but not both. If the issue is limited to only some drivers or some cases,
we can perhaps add a build option for no-fpic-static builds, to be
used in cases where it is problematic.

However, at this point, I think we need a little more investigation. Is
there any testing you can do to see whether the issue appears just in your
driver, or perhaps in a mempool driver/lib, or whether it's a global
slowdown? Do you see the impact with both clang and gcc? I'll retest
things a bit tomorrow on my end to see what I see.

Regards,
/Bruce
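
[Illustrative note] When comparing builds and compilers as asked above, it can
help to have the binary itself report how a given translation unit was
compiled. The sketch below is not DPDK code; pic_level(), pic_level_name(),
compiler_name() and print_build_info() are names invented for this note. It
relies only on the predefined macros __PIC__/__pic__, __clang__ and __GNUC__
that GCC and clang provide.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helpers, not part of DPDK.
 * GCC and clang define __PIC__/__pic__ as 1 for -fpic and 2 for -fPIC.
 * The macros are also set for -fpie/-fPIE, which many distributions
 * enable by default, so a non-zero value alone does not prove that
 * -fPIC was passed explicitly.
 */
int pic_level(void)
{
#if defined(__PIC__)
    return __PIC__;
#elif defined(__pic__)
    return __pic__;
#else
    return 0;
#endif
}

/* Human-readable label for a PIC level. */
const char *pic_level_name(int level)
{
    switch (level) {
    case 0:  return "non-PIC";
    case 1:  return "small PIC (-fpic/-fpie)";
    case 2:  return "large PIC (-fPIC/-fPIE)";
    default: return "unknown";
    }
}

/* clang also defines __GNUC__, so it has to be tested first. */
const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "unknown";
#endif
}

/* Print one line describing how this translation unit was built. */
void print_build_info(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "compiler=%s pic=%s\n",
            compiler_name(), pic_level_name(pic_level()));
}
```

Calling print_build_info(stderr) from a file compiled with the same flags as
the library under test gives a quick sanity check that the "PIC" and "non-PIC"
builds being compared really differ in the way expected.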


* Re: [dpdk-dev] performance degradation with fpic
  2020-10-15 17:08 ` Bruce Richardson
@ 2020-10-15 17:14   ` Thomas Monjalon
  2020-10-15 17:28     ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
  2020-10-15 21:44     ` [dpdk-dev] " Stephen Hemminger
  2020-10-16  9:59   ` Bruce Richardson
  1 sibling, 2 replies; 20+ messages in thread
From: Thomas Monjalon @ 2020-10-15 17:14 UTC (permalink / raw)
  To: Ali Alnubani, Bruce Richardson
  Cc: dev, Asaf Penso, david.marchand, arybchenko, ferruh.yigit,
	honnappa.nagarahalli, jerinj

15/10/2020 19:08, Bruce Richardson:
> On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> >    We have been seeing in some cases that the DPDK forwarding performance
> >    is up to 9% lower when DPDK is built as static with meson compared to a
> >    build with makefiles.
> > 
> >    The same degradation can be reproduced with makefiles on older DPDK
> >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> >    resolved in meson when passing “pic: false” to meson’s static_library
> >    call (more tweaking needs to be done to prevent building shared
> >    libraries because this change breaks them).
[...]
> >    Should we disable PIC in static builds?
> 
> thanks for reporting, though it's strange that you see such a big impact.
> In my previous tests with i40e driver I never noticed a difference between
> make and meson builds, and I and some others here have been using meson
> builds for any performance work for over a year now. That being said let me
> reverify what I see on my end.
> 
> In terms of solutions, disabling the -fPIC flag globally implies that we
> can no longer build static and shared libs from the same sources, so we
> would need to revert to doing either a static or a shared library build
> but not both. If the issue is limited to only some drivers or some cases,
> we can perhaps add in a build option to have no-fpic-static builds, to be
> used in a cases where it is problematic.

I assume only some Rx/Tx functions are impacted.
We probably need such a disabling option per file.

> However, at this point, I think we need a little more investigation. Is
> there any testing you can do to see if it's just in your driver, or in
> perhaps a mempool driver/lib that the issue appears, or if it's just a
> global slowdown? Do you see the impact with both clang and gcc?  I'll
> retest things a bit tomorrow on my end to see what I see.

Yes, we need to know which libs or files are impacted by -fPIC.




* Re: [dpdk-dev] [EXT] Re: performance degradation with fpic
  2020-10-15 17:14   ` Thomas Monjalon
@ 2020-10-15 17:28     ` Jerin Jacob Kollanukkaran
  2020-10-16  8:29       ` Bruce Richardson
  2020-10-15 21:44     ` [dpdk-dev] " Stephen Hemminger
  1 sibling, 1 reply; 20+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2020-10-15 17:28 UTC (permalink / raw)
  To: Thomas Monjalon, Ali Alnubani, Bruce Richardson
  Cc: dev, Asaf Penso, david.marchand, arybchenko, ferruh.yigit,
	honnappa.nagarahalli

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, October 15, 2020 10:45 PM
> To: Ali Alnubani <alialnu@nvidia.com>; Bruce Richardson
> <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; Asaf Penso <asafp@nvidia.com>;
> david.marchand@redhat.com; arybchenko@solarflare.com;
> ferruh.yigit@intel.com; honnappa.nagarahalli@arm.com; Jerin Jacob
> Kollanukkaran <jerinj@marvell.com>
> Subject: [EXT] Re: performance degradation with fpic
> 
> External Email
> 
> ----------------------------------------------------------------------
> 15/10/2020 19:08, Bruce Richardson:
> > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > >    We have been seeing in some cases that the DPDK forwarding performance
> > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > >    build with makefiles.
> > >
> > >    The same degradation can be reproduced with makefiles on older DPDK
> > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > >    resolved in meson when passing “pic: false” to meson’s static_library
> > >    call (more tweaking needs to be done to prevent building shared
> > >    libraries because this change breaks them).
> [...]
> > >    Should we disable PIC in static builds?
> >
> > thanks for reporting, though it's strange that you see such a big impact.
> > In my previous tests with i40e driver I never noticed a difference
> > between make and meson builds, and I and some others here have been
> > using meson builds for any performance work for over a year now. That
> > being said let me reverify what I see on my end.
> >
> > In terms of solutions, disabling the -fPIC flag globally implies that
> > we can no longer build static and shared libs from the same sources,
> > so we would need to revert to doing either a static or a shared
> > library build but not both. If the issue is limited to only some
> > drivers or some cases, we can perhaps add in a build option to have
> > no-fpic-static builds, to be used in a cases where it is problematic.

We have seen this issue earlier. In our case, the meson build got better performance
than the make build system (based on a different changeset). Initially we suspected
that fPIC was playing a role. Based on our understanding, it is not an fPIC issue per se;
it is more that the change in text section code alignment was creating the issue.
Typically it happens with very fine-grained prefetches in Rx and Tx routines: all the
timing gets changed by the radical change that fPIC makes to the text section.
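
[Illustrative note] One way to probe the alignment theory is to pin the
alignment of a suspect hot function and see whether the gap between builds
shrinks. The sketch below is a toy and not DPDK code: rx_burst_sum(),
HOT_ALIGN and PREFETCH_AHEAD are hypothetical names and values chosen for this
note. It uses the GCC/clang function attribute aligned and the builtin
__builtin_prefetch; the 64-byte figure is an assumed cache-line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOT_ALIGN      64  /* assumed cache-line size in bytes */
#define PREFETCH_AHEAD 4   /* hypothetical prefetch distance, in elements */

/*
 * Toy stand-in for an Rx loop with a fine-grained prefetch.
 * The aligned attribute requests a minimum alignment for the function's
 * first instruction, so growth or shrinkage of unrelated code earlier in
 * the text section is less likely to change where this loop starts
 * relative to a cache-line boundary.
 */
#if defined(__GNUC__)
__attribute__((aligned(HOT_ALIGN)))
#endif
uint64_t rx_burst_sum(const uint32_t *len, size_t n)
{
    uint64_t total = 0;

    assert(len != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* Prefetch a later element while the current one is processed. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&len[i + PREFETCH_AHEAD]);
#endif
        total += len[i];
    }
    return total;
}
```

If pinning the alignment of the real Rx/Tx routines makes the PIC and non-PIC
numbers converge, that would point to layout rather than to PIC code
generation itself. The thread does not establish which of the two applies.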

> 
> I assume only some Rx/Tx functions are impacted.
> We probably need such disabling option per-file.
> 
> > However, at this point, I think we need a little more investigation.
> > Is there any testing you can do to see if it's just in your driver, or
> > in perhaps a mempool driver/lib that the issue appears, or if it's
> > just a global slowdown? Do you see the impact with both clang and gcc?
> > I'll retest things a bit tomorrow on my end to see what I see.
> 
> Yes we need to know which libs or files are impacted by -fPIC.
> 



* Re: [dpdk-dev] performance degradation with fpic
  2020-10-15 17:14   ` Thomas Monjalon
  2020-10-15 17:28     ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
@ 2020-10-15 21:44     ` Stephen Hemminger
  2020-10-16  8:35       ` Bruce Richardson
  1 sibling, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2020-10-15 21:44 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ali Alnubani, Bruce Richardson, dev, Asaf Penso, david.marchand,
	arybchenko, ferruh.yigit, honnappa.nagarahalli, jerinj

On Thu, 15 Oct 2020 19:14:48 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> 15/10/2020 19:08, Bruce Richardson:
> > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:  
> > >    We have been seeing in some cases that the DPDK forwarding performance
> > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > >    build with makefiles.
> > > 
> > >    The same degradation can be reproduced with makefiles on older DPDK
> > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > >    resolved in meson when passing “pic: false” to meson’s static_library
> > >    call (more tweaking needs to be done to prevent building shared
> > >    libraries because this change breaks them).  
> [...]
> > >    Should we disable PIC in static builds?  
> > 
> > thanks for reporting, though it's strange that you see such a big impact.
> > In my previous tests with i40e driver I never noticed a difference between
> > make and meson builds, and I and some others here have been using meson
> > builds for any performance work for over a year now. That being said let me
> > reverify what I see on my end.
> > 
> > In terms of solutions, disabling the -fPIC flag globally implies that we
> > can no longer build static and shared libs from the same sources, so we
> > would need to revert to doing either a static or a shared library build
> > but not both. If the issue is limited to only some drivers or some cases,
> > we can perhaps add in a build option to have no-fpic-static builds, to be
> > used in a cases where it is problematic.  
> 
> I assume only some Rx/Tx functions are impacted.
> We probably need such disabling option per-file.
> 
> > However, at this point, I think we need a little more investigation. Is
> > there any testing you can do to see if it's just in your driver, or in
> > perhaps a mempool driver/lib that the issue appears, or if it's just a
> > global slowdown? Do you see the impact with both clang and gcc?  I'll
> > retest things a bit tomorrow on my end to see what I see.  
> 
> Yes we need to know which libs or files are impacted by -fPIC.

The issue is that all shared libraries need to be built with PIC.
So it is a question of static vs shared library build.


* Re: [dpdk-dev] [EXT] Re: performance degradation with fpic
  2020-10-15 17:28     ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
@ 2020-10-16  8:29       ` Bruce Richardson
  2020-10-16  8:39         ` Jerin Jacob
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2020-10-16  8:29 UTC (permalink / raw)
  To: Jerin Jacob Kollanukkaran
  Cc: Thomas Monjalon, Ali Alnubani, dev, Asaf Penso, david.marchand,
	arybchenko, ferruh.yigit, honnappa.nagarahalli

On Thu, Oct 15, 2020 at 05:28:10PM +0000, Jerin Jacob Kollanukkaran wrote:
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Thursday, October 15, 2020 10:45 PM
> > To: Ali Alnubani <alialnu@nvidia.com>; Bruce Richardson
> > <bruce.richardson@intel.com>
> > Cc: dev@dpdk.org; Asaf Penso <asafp@nvidia.com>;
> > david.marchand@redhat.com; arybchenko@solarflare.com;
> > ferruh.yigit@intel.com; honnappa.nagarahalli@arm.com; Jerin Jacob
> > Kollanukkaran <jerinj@marvell.com>
> > Subject: [EXT] Re: performance degradation with fpic
> > 
> > External Email
> > 
> > ----------------------------------------------------------------------
> > 15/10/2020 19:08, Bruce Richardson:
> > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > > >    We have been seeing in some cases that the DPDK forwarding performance
> > > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > > >    build with makefiles.
> > > >
> > > >    The same degradation can be reproduced with makefiles on older DPDK
> > > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > > >    resolved in meson when passing “pic: false” to meson’s static_library
> > > >    call (more tweaking needs to be done to prevent building shared
> > > >    libraries because this change breaks them).
> > [...]
> > > >    Should we disable PIC in static builds?
> > >
> > > thanks for reporting, though it's strange that you see such a big impact.
> > > In my previous tests with i40e driver I never noticed a difference
> > > between make and meson builds, and I and some others here have been
> > > using meson builds for any performance work for over a year now. That
> > > being said let me reverify what I see on my end.
> > >
> > > In terms of solutions, disabling the -fPIC flag globally implies that
> > > we can no longer build static and shared libs from the same sources,
> > > so we would need to revert to doing either a static or a shared
> > > library build but not both. If the issue is limited to only some
> > > drivers or some cases, we can perhaps add in a build option to have
> > > no-fpic-static builds, to be used in a cases where it is problematic.
> 
> We have seen this issue earlier. In our case, the meson build got better performance
> than the make build system (based on a different changeset). Initially we suspected
> that fPIC was playing a role. Based on our understanding, it is not an fPIC issue per se;
> it is more that the change in text section code alignment was creating the issue.
> Typically it happens with very fine-grained prefetches in Rx and Tx routines: all the
> timing gets changed by the radical change that fPIC makes to the text section.
> 

Out of interest, what range of performance difference did you see, because
the 9% reported is fairly massive, well beyond anything I would expect from
such a change?


* Re: [dpdk-dev] performance degradation with fpic
  2020-10-15 21:44     ` [dpdk-dev] " Stephen Hemminger
@ 2020-10-16  8:35       ` Bruce Richardson
  0 siblings, 0 replies; 20+ messages in thread
From: Bruce Richardson @ 2020-10-16  8:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Thomas Monjalon, Ali Alnubani, dev, Asaf Penso, david.marchand,
	arybchenko, ferruh.yigit, honnappa.nagarahalli, jerinj

On Thu, Oct 15, 2020 at 02:44:49PM -0700, Stephen Hemminger wrote:
> On Thu, 15 Oct 2020 19:14:48 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 15/10/2020 19:08, Bruce Richardson:
> > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:  
> > > >    We have been seeing in some cases that the DPDK forwarding performance
> > > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > > >    build with makefiles.
> > > > 
> > > >    The same degradation can be reproduced with makefiles on older DPDK
> > > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > > >    resolved in meson when passing “pic: false” to meson’s static_library
> > > >    call (more tweaking needs to be done to prevent building shared
> > > >    libraries because this change breaks them).  
> > [...]
> > > >    Should we disable PIC in static builds?  
> > > 
> > > thanks for reporting, though it's strange that you see such a big impact.
> > > In my previous tests with i40e driver I never noticed a difference between
> > > make and meson builds, and I and some others here have been using meson
> > > builds for any performance work for over a year now. That being said let me
> > > reverify what I see on my end.
> > > 
> > > In terms of solutions, disabling the -fPIC flag globally implies that we
> > > can no longer build static and shared libs from the same sources, so we
> > > would need to revert to doing either a static or a shared library build
> > > but not both. If the issue is limited to only some drivers or some cases,
> > > we can perhaps add in a build option to have no-fpic-static builds, to be
> > > used in a cases where it is problematic.  
> > 
> > I assume only some Rx/Tx functions are impacted.
> > We probably need such disabling option per-file.
> > 
> > > However, at this point, I think we need a little more investigation. Is
> > > there any testing you can do to see if it's just in your driver, or in
> > > perhaps a mempool driver/lib that the issue appears, or if it's just a
> > > global slowdown? Do you see the impact with both clang and gcc?  I'll
> > > retest things a bit tomorrow on my end to see what I see.  
> > 
> > Yes we need to know which libs or files are impacted by -fPIC.
> 
> The issue is that all shared libraries need to be built with PIC.
> So it is a question of static vs shared library build.

Well, partially yes, but really using fPIC should only have a very small
difference in drivers. Therefore I'd like to know what's causing this
massive drop because, while disabling fPIC in the static builds (perhaps
per-component to avoid doubling the build time) will improve perf in the
static case, it will still leave a perf drop when a user switches to shared
libs. Since we want to move to a model where people are using shared
libraries and can update seamlessly due to constant ABI, I therefore think
we need to root cause this so we can fix the shared lib builds too - since
disabling fPIC is not an option there.

/Bruce


* Re: [dpdk-dev] [EXT] Re: performance degradation with fpic
  2020-10-16  8:29       ` Bruce Richardson
@ 2020-10-16  8:39         ` Jerin Jacob
  0 siblings, 0 replies; 20+ messages in thread
From: Jerin Jacob @ 2020-10-16  8:39 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Jerin Jacob Kollanukkaran, Thomas Monjalon, Ali Alnubani, dev,
	Asaf Penso, david.marchand, arybchenko, ferruh.yigit,
	honnappa.nagarahalli

On Fri, Oct 16, 2020 at 2:00 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Thu, Oct 15, 2020 at 05:28:10PM +0000, Jerin Jacob Kollanukkaran wrote:
> > > -----Original Message-----
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Thursday, October 15, 2020 10:45 PM
> > > To: Ali Alnubani <alialnu@nvidia.com>; Bruce Richardson
> > > <bruce.richardson@intel.com>
> > > Cc: dev@dpdk.org; Asaf Penso <asafp@nvidia.com>;
> > > david.marchand@redhat.com; arybchenko@solarflare.com;
> > > ferruh.yigit@intel.com; honnappa.nagarahalli@arm.com; Jerin Jacob
> > > Kollanukkaran <jerinj@marvell.com>
> > > Subject: [EXT] Re: performance degradation with fpic
> > >
> > > External Email
> > >
> > > ----------------------------------------------------------------------
> > > 15/10/2020 19:08, Bruce Richardson:
> > > > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > > > >    We have been seeing in some cases that the DPDK forwarding performance
> > > > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > > > >    build with makefiles.
> > > > >
> > > > >    The same degradation can be reproduced with makefiles on older DPDK
> > > > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > > > >    resolved in meson when passing “pic: false” to meson’s static_library
> > > > >    call (more tweaking needs to be done to prevent building shared
> > > > >    libraries because this change breaks them).
> > > [...]
> > > > >    Should we disable PIC in static builds?
> > > >
> > > > thanks for reporting, though it's strange that you see such a big impact.
> > > > In my previous tests with i40e driver I never noticed a difference
> > > > between make and meson builds, and I and some others here have been
> > > > using meson builds for any performance work for over a year now. That
> > > > being said let me reverify what I see on my end.
> > > >
> > > > In terms of solutions, disabling the -fPIC flag globally implies that
> > > > we can no longer build static and shared libs from the same sources,
> > > > so we would need to revert to doing either a static or a shared
> > > > library build but not both. If the issue is limited to only some
> > > > drivers or some cases, we can perhaps add in a build option to have
> > > > no-fpic-static builds, to be used in a cases where it is problematic.
> >
> > We have seen this issue earlier. In our case, the meson build got better performance
> > than the make build system (based on a different changeset). Initially we suspected
> > that fPIC was playing a role. Based on our understanding, it is not an fPIC issue per se;
> > it is more that the change in text section code alignment was creating the issue.
> > Typically it happens with very fine-grained prefetches in Rx and Tx routines: all the
> > timing gets changed by the radical change that fPIC makes to the text section.
> >
>
> Out of interest, what range of performance difference did you see, because
> the 9% reported is fairly massive, well beyond anything I would expect from
> such a change?

We have seen up to a 4% difference in per-core Mpps.
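
[Illustrative note] To put such percentages in perspective, a throughput drop
can be converted into extra CPU cycles spent per packet. The helpers below are
a sketch written for this note, not DPDK code; the function names, the 3 GHz
clock and the 30 Mpps baseline used in the usage remark are assumptions, not
measurements from this thread.

```c
#include <assert.h>

/* Hypothetical helper: average core cycles available per packet. */
double cycles_per_packet(double core_hz, double pps)
{
    assert(core_hz > 0.0 && pps > 0.0);
    return core_hz / pps;
}

/*
 * Hypothetical helper: extra cycles per packet implied by a throughput
 * drop of drop_fraction (0.04 means 4% fewer packets per second).
 */
double extra_cycles_for_drop(double core_hz, double baseline_pps,
                             double drop_fraction)
{
    assert(drop_fraction >= 0.0 && drop_fraction < 1.0);
    double slower_pps = baseline_pps * (1.0 - drop_fraction);
    return cycles_per_packet(core_hz, slower_pps) -
           cycles_per_packet(core_hz, baseline_pps);
}
```

With an assumed 3 GHz core forwarding an assumed 30 Mpps, the budget is 100
cycles per packet, so a 4% drop corresponds to roughly 4 extra cycles per
packet and a 9% drop to roughly 10. Differences that small are in the range
where code layout and prefetch timing can plausibly matter.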


* Re: [dpdk-dev] performance degradation with fpic
  2020-10-15 17:08 ` Bruce Richardson
  2020-10-15 17:14   ` Thomas Monjalon
@ 2020-10-16  9:59   ` Bruce Richardson
  2020-10-19 11:47     ` Ali Alnubani
  1 sibling, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2020-10-16  9:59 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Thu, Oct 15, 2020 at 06:08:04PM +0100, Bruce Richardson wrote:
> On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> >    Hi Bruce,
> > 
> > 
> >    We have been seeing in some cases that the DPDK forwarding performance
> >    is up to 9% lower when DPDK is built as static with meson compared to a
> >    build with makefiles.
> > 
> > 
> >    The same degradation can be reproduced with makefiles on older DPDK
> >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> >    resolved in meson when passing “pic: false” to meson’s static_library
> >    call (more tweaking needs to be done to prevent building shared
> >    libraries because this change breaks them).
> > 
> > 
> >    I can reproduce this drop with the following cases:
> >      * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R)
> >        Xeon(R) Gold 6154. Testpmd command:
> > 
> >    testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0 --socket-mem=2048,2048
> >    -- --port-numa-config=0,1,1,1 --socket-num=1 --burst=64 --txd=512
> >    --rxd=512 --mbcache=512 --rxq=2 --txq=2 --nb-cores=1 --no-lsc-interrupt
> >    -i -a --rss-udp
> >      * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC: ConnectX-5 /
> >        Host’s CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
> >        testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
> >        00:05.0,mprq_en=1,mprq_log_stride_num=6 --socket-mem=2048,0 --
> >        --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024
> >        --rxd=1024 --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
> >        --port-topology=chained --forward-mode=macswap --no-lsc-interrupt
> >        -i -a --rss-udp
> >      * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU: Intel(R)
> >        Xeon(R) CPU E5-2697A v4. Testpmd command:
> >        testpmd -n 4  -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1  -w
> >        0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80  -- --burst=64
> >        --mbcache=512 -i  --nb-cores=8  --rxq=8 --txq=8 --txd=1024
> >        --rxd=1024 --rss-udp --auto-start
> > 
> >    The packets being received and forwarded by testpmd are of IPv4/UDP
> >    type and 64B size.
> > 
> >    Should we disable PIC in static builds?
> > 
> > 
> 
> Hi Ali,
> 
> thanks for reporting, though it's strange that you see such a big impact.
> In my previous tests with i40e driver I never noticed a difference between
> make and meson builds, and I and some others here have been using meson
> builds for any performance work for over a year now. That being said let me
> reverify what I see on my end.
> 
> In terms of solutions, disabling the -fPIC flag globally implies that we
> can no longer build static and shared libs from the same sources, so we
> would need to revert to doing either a static or a shared library build
> but not both. If the issue is limited to only some drivers or some cases,
> we can perhaps add in a build option to have no-fpic-static builds, to be
> used in a cases where it is problematic.
> 
> However, at this point, I think we need a little more investigation. Is
> there any testing you can do to see if it's just in your driver, or in
> perhaps a mempool driver/lib that the issue appears, or if it's just a
> global slowdown? Do you see the impact with both clang and gcc?  I'll
> retest things a bit tomorrow on my end to see what I see.
> 
Hi again,

I've done a quick retest with the i40e driver on my system, using the 20.08
version so as to have a direct make vs meson comparison. [For reference, the
command used was: "sudo </path/to/testpmd>  -c F00000 -w af:00.0 -w b1:00.0
-w da:00.0 -- --rxq=2 --txq=2 --rxd=2048 --txd=512" using 3x40G ports to a
single core running @3GHz.] No major performance differences were seen;
if anything, the meson build was very slightly faster, as Jerin reported,
by maybe 2%, though that's within the margin of error.

Can you try adding '-fno-semantic-interposition' to your build? From
reading online, it appears that -fPIC causes GCC to be very conservative
about optimizing, so that flag may help. Clang may be less conservative,
so testing with clang would be good too if you can manage it.

Regards,
/Bruce
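
[Illustrative note] The effect that '-fno-semantic-interposition' targets can
be seen in a toy example. In a -fPIC build, GCC may assume that a function
with default visibility could be replaced at load time by another definition,
and so may decline to inline it or optimize across it, even for callers in the
same file. Hidden visibility (or the flag above) removes that assumption. The
code below is a sketch for this note, not DPDK code: all function names and
the HDR_BYTES constant are hypothetical, and whether this mechanism explains
the reported drop is not established in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_BYTES 42u /* Ethernet 14 + IPv4 20 + UDP 8, for illustration */

/*
 * Default visibility. Under -fPIC without -fno-semantic-interposition,
 * GCC may treat this as interposable and keep calls to it as real calls.
 */
uint32_t frame_len_exported(uint32_t payload)
{
    return HDR_BYTES + payload;
}

/*
 * Hidden visibility: the symbol cannot be replaced from outside the
 * final binary or shared object, so the compiler is free to inline it.
 */
#if defined(__GNUC__)
__attribute__((visibility("hidden")))
#endif
uint32_t frame_len_hidden(uint32_t payload)
{
    return HDR_BYTES + payload;
}

/* Hot loop calling the default-visibility helper. */
uint64_t burst_bytes_exported(const uint32_t *payload, size_t n)
{
    uint64_t total = 0;

    assert(payload != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += frame_len_exported(payload[i]);
    return total;
}

/* Same loop calling the hidden helper; the results are identical. */
uint64_t burst_bytes_hidden(const uint32_t *payload, size_t n)
{
    uint64_t total = 0;

    assert(payload != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += frame_len_hidden(payload[i]);
    return total;
}
```

Both loops return the same value; only the generated code may differ.
Compiling this file with and without -fPIC and comparing the disassembly of
the two burst functions shows whether the compiler in use keeps the call in
the exported variant.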


* Re: [dpdk-dev] performance degradation with fpic
  2020-10-16  9:59   ` Bruce Richardson
@ 2020-10-19 11:47     ` Ali Alnubani
  2020-10-19 13:01       ` Bruce Richardson
  0 siblings, 1 reply; 20+ messages in thread
From: Ali Alnubani @ 2020-10-19 11:47 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

Hi Bruce,

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Friday, October 16, 2020 12:59 PM
> To: Ali Alnubani <alialnu@nvidia.com>
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>
> Subject: Re: [dpdk-dev] performance degradation with fpic
> 
> On Thu, Oct 15, 2020 at 06:08:04PM +0100, Bruce Richardson wrote:
> > On Thu, Oct 15, 2020 at 04:00:44PM +0000, Ali Alnubani wrote:
> > >    Hi Bruce,
> > >
> > >
> > >    We have been seeing in some cases that the DPDK forwarding performance
> > >    is up to 9% lower when DPDK is built as static with meson compared to a
> > >    build with makefiles.
> > >
> > >
> > >    The same degradation can be reproduced with makefiles on older DPDK
> > >    releases when building with EXTRA_CFLAGS set to “-fPIC”, it can also be
> > >    resolved in meson when passing “pic: false” to meson’s static_library
> > >    call (more tweaking needs to be done to prevent building shared
> > >    libraries because this change breaks them).
> > >
> > >
> > >    I can reproduce this drop with the following cases:
> > >      * Baremetal / NIC: ConnectX-4 Lx / OS: RHEL7.4 / CPU: Intel(R)
> > >        Xeon(R) Gold 6154. Testpmd command:
> > >
> > >    testpmd -c 0x7ffc0000 -n 4 -w d8:00.1 -w d8:00.0 --socket-mem=2048,2048
> > >    -- --port-numa-config=0,1,1,1 --socket-num=1 --burst=64 --txd=512
> > >    --rxd=512 --mbcache=512 --rxq=2 --txq=2 --nb-cores=1 --no-lsc-interrupt
> > >    -i -a --rss-udp
> > >      * KVM guest with SR-IOV passthrough / OS: RHEL7.4 / NIC: ConnectX-5 /
> > >        Host’s CPU: Intel(R) Xeon(R) Gold 6154. Testpmd command:
> > >        testpmd --master-lcore=0 -c 0x1ffff -n 4 -w
> > >        00:05.0,mprq_en=1,mprq_log_stride_num=6 --socket-mem=2048,0 --
> > >        --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024
> > >        --rxd=1024 --mbcache=512 --rxq=16 --txq=16 --nb-cores=8
> > >        --port-topology=chained --forward-mode=macswap --no-lsc-interrupt
> > >        -i -a --rss-udp
> > >      * Baremetal / OS: Ubuntu 18.04 / NIC: ConnectX-5 / CPU: Intel(R)
> > >        Xeon(R) CPU E5-2697A v4. Testpmd command:
> > >        testpmd -n 4  -w 0000:82:00.0,rxqs_min_mprq=8,mprq_en=1  -w
> > >        0000:82:00.1,rxqs_min_mprq=8,mprq_en=1 -c 0xff80  -- --burst=64
> > >        --mbcache=512 -i  --nb-cores=8  --rxq=8 --txq=8 --txd=1024
> > >        --rxd=1024 --rss-udp --auto-start
> > >
> > >    The packets being received and forwarded by testpmd are of IPv4/UDP
> > >    type and 64B size.
> > >
> > >    Should we disable PIC in static builds?
> > >
> > >
> >
> > Hi Ali,
> >
> > thanks for reporting, though it's strange that you see such a big impact.
> > In my previous tests with i40e driver I never noticed a difference
> > between make and meson builds, and I and some others here have been
> > using meson builds for any performance work for over a year now. That
> > being said let me reverify what I see on my end.
> >
> > In terms of solutions, disabling the -fPIC flag globally implies that
> > we can no longer build static and shared libs from the same sources,
> > so we would need to revert to doing either a static or a shared
> > library build but not both. If the issue is limited to only some
> > drivers or some cases, we can perhaps add in a build option to have
> > no-fpic-static builds, to be used in a cases where it is problematic.
> >
> > However, at this point, I think we need a little more investigation.
> > Is there any testing you can do to see if it's just in your driver, or
> > in perhaps a mempool driver/lib that the issue appears, or if it's
> > just a global slowdown? Do you see the impact with both clang and gcc?
> > I'll retest things a bit tomorrow on my end to see what I see.
> >
> Hi again,
> 
> I've done a quick retest with the i40e driver on my system, using the 20.08
> version so as to have make vs meson direct comparison. [For reference
> command used was: "sudo </path/to/testpmd>  -c F00000 -w af:00.0 -w
> b1:00.0 -w da:00.0 -- --rxq=2 --txq=2 --rxd=2048 --txd=512" using 3x40G ports
> to a single core running @3GHz.] No major performance differences were
> seen, but if anything the meson build was very slightly faster, as reported to
> Jerin, maybe 2%, though it's within the margin of error.
> 

Thanks for taking the time to investigate this.

Disabling PIC for the net/mlx5 driver alone in drivers/meson.build resolves the issue for me.
I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0). However, I now see that disabling PIC with an old clang version (clang 3.4.2 on RHEL7.4) causes a drop in performance, not an improvement as with gcc.

> Can you try adding '-fno-semantic-interposition' to your build, since reading
> on the internet it appears that fPIC causes GCC to be very conservative about
> optimizing things, and that may help. Clang may be less conservative so
> testing with clang would be good too if you can manage it.
> 

I don't see a noticeable change with '-fno-semantic-interposition'. Tested with both gcc and clang.
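
For anyone repeating these comparisons, the build variants discussed so far (compiler, PIC on or off for static libraries, optional extra C flags) can be enumerated with a small script. This is only a sketch under assumptions: it relies on meson's generic `default_library`, `b_staticpic` and `c_args` options, the build directory names are made up, and the function name is hypothetical. Whether DPDK's own build files accept `b_staticpic=false` without further changes is not established in this thread; the original report notes that disabling PIC breaks the shared libraries.

```python
import itertools
import shlex


def meson_setup_command(build_dir, compiler, static_pic, extra_cflags=()):
    """Return a `meson setup` command line for one build configuration.

    build_dir and extra_cflags are illustrative; CC is passed through `env`.
    """
    args = [
        "env", f"CC={compiler}",
        "meson", "setup", build_dir,
        "-Ddefault_library=static",
        f"-Db_staticpic={'true' if static_pic else 'false'}",
    ]
    if extra_cflags:
        # meson takes extra C compiler flags through the c_args option.
        args.append("-Dc_args=" + " ".join(extra_cflags))
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # One command per (compiler, PIC setting) pair.
    for compiler, static_pic in itertools.product(("gcc", "clang"), (True, False)):
        build_dir = f"build-{compiler}-{'pic' if static_pic else 'nopic'}"
        print(meson_setup_command(build_dir, compiler, static_pic))
```

Passing `["-fno-semantic-interposition"]` as `extra_cflags` yields the variant tested above.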

Thanks,
Ali

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-19 11:47     ` Ali Alnubani
@ 2020-10-19 13:01       ` Bruce Richardson
  2020-10-22 13:17         ` Ali Alnubani
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2020-10-19 13:01 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Mon, Oct 19, 2020 at 11:47:48AM +0000, Ali Alnubani wrote:
> Hi Bruce,
> 
> Thanks for taking the time to investigate this.
> 
> Disabling PIC for the net/mlx5 driver alone in drivers/meson.build resolves the issue for me.
> I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0). However, I now see that disabling PIC with an old clang version (clang 3.4.2 on RHEL7.4) causes a drop in performance, not an improvement as with gcc.
> 
That's interesting.

When you just build with and without -fpic with newer clang, do you see the
same perf drop as with gcc? With the older clang, is the shared lib build
faster than the static one?

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-19 13:01       ` Bruce Richardson
@ 2020-10-22 13:17         ` Ali Alnubani
  2020-10-22 13:57           ` Bruce Richardson
  0 siblings, 1 reply; 20+ messages in thread
From: Ali Alnubani @ 2020-10-22 13:17 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

Hi Bruce,
Sorry for the delayed response.

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, October 19, 2020 4:02 PM
> To: Ali Alnubani <alialnu@nvidia.com>
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>
> Subject: Re: [dpdk-dev] performance degradation with fpic
> 
> On Mon, Oct 19, 2020 at 11:47:48AM +0000, Ali Alnubani wrote:
> > Hi Bruce,
> >
> > Thanks for taking the time to investigate this.
> >
> > Disabling PIC for the net/mlx5 driver alone in drivers/meson.build resolves
> > the issue for me.
> > I saw this issue with gcc (tested with 4.8.5, 9.3.0, and 7.5.0). However, I
> > now see that disabling PIC with an old clang version (clang 3.4.2 on RHEL7.4)
> > causes a drop in performance, not an improvement as with gcc.
> >
> That's interesting.
> 
> When you just build with and without -fpic with newer clang, do you see the
> same perf drop as with gcc? With the older clang, is the shared lib build faster
> than the static one?

With the older clang on RHEL7.4, the shared lib build is about 2% slower than the static build.
With clang 11 compiled from source on Ubuntu 18.04, I'm getting good performance with the static meson build: the same performance as with makefiles with gcc, and ~6% better than the static meson gcc build. Disabling PIC on clang 11 degrades performance by ~4%.
With clang 6.0.0, however, disabling PIC causes only a very small drop (~0.1%).

This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command: "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0 --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8 --nb-cores=4 --port-topology=chained --forward-mode=macswap --no-lsc-interrupt -i -a --rss-udp".
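
The percentages quoted here are relative throughput differences between two builds. The arithmetic can be sketched as below. The numbers in the example are placeholders, not measurements from this thread, and the function names and the 1% noise threshold are assumptions made for illustration.

```python
def relative_change_percent(baseline, candidate):
    """Percentage change of candidate throughput relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


def describe(baseline, candidate, noise_percent=1.0):
    """Format the change, flagging results inside an assumed noise band."""
    change = relative_change_percent(baseline, candidate)
    if abs(change) <= noise_percent:
        return f"{change:+.1f}% (within noise)"
    return f"{change:+.1f}%"


if __name__ == "__main__":
    # Placeholder throughput values in arbitrary units, NOT real results.
    pic_enabled = 100.0
    pic_disabled = 96.0
    print(describe(pic_enabled, pic_disabled))
```

A negative result means the candidate build forwards fewer packets than the baseline; small differences should be compared against the run-to-run variation of the test setup before drawing conclusions.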

Regards,
Ali

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-22 13:17         ` Ali Alnubani
@ 2020-10-22 13:57           ` Bruce Richardson
  2020-10-22 14:16             ` Ali Alnubani
  0 siblings, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2020-10-22 13:57 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Thu, Oct 22, 2020 at 01:17:16PM +0000, Ali Alnubani wrote:
> Hi Bruce,
> Sorry for the delayed response.
> 
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Monday, October 19, 2020 4:02 PM
> > To: Ali Alnubani <alialnu@nvidia.com>
> > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>
> > Subject: Re: [dpdk-dev] performance degradation with fpic
> > 
> > That's interesting.
> > 
> > When you just build with and without -fpic with newer clang, do you see the
> > same perf drop as with gcc? With the older clang, is the shared lib build faster
> > than the static one?
> 
> With the older clang on RHEL7.4, the shared lib build is about 2% slower than the static build.
> With clang 11 compiled from source on Ubuntu 18.04, I'm getting good performance with the static meson build: the same performance as with makefiles with gcc, and ~6% better than the static meson gcc build. Disabling PIC on clang 11 degrades performance by ~4%.
> With clang 6.0.0, however, disabling PIC causes only a very small drop (~0.1%).
> 
> This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command: "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0 --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8 --nb-cores=4 --port-topology=chained --forward-mode=macswap --no-lsc-interrupt -i -a --rss-udp".
> 

So, am I right in saying that it appears the clang builds are all fine
here, that performance is pretty much as expected in all cases with the
default setting of PIC enabled? Therefore it appears that the issue is
limited to gcc builds at this point?

/Bruce

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-22 13:57           ` Bruce Richardson
@ 2020-10-22 14:16             ` Ali Alnubani
  2020-11-02 10:40               ` Ali Alnubani
  0 siblings, 1 reply; 20+ messages in thread
From: Ali Alnubani @ 2020-10-22 14:16 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Thursday, October 22, 2020 4:58 PM
> To: Ali Alnubani <alialnu@nvidia.com>
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>
> Subject: Re: [dpdk-dev] performance degradation with fpic
> 
> On Thu, Oct 22, 2020 at 01:17:16PM +0000, Ali Alnubani wrote:
> > Hi Bruce,
> > Sorry for the delayed response.
> >
> > > -----Original Message-----
> > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > Sent: Monday, October 19, 2020 4:02 PM
> > > To: Ali Alnubani <alialnu@nvidia.com>
> > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>;
> > > Asaf Penso <asafp@nvidia.com>
> > > Subject: Re: [dpdk-dev] performance degradation with fpic
> > >
> > > That's interesting.
> > >
> > > When you just build with and without -fpic with newer clang, do you
> > > see the same perf drop as with gcc? With the older clang, is the
> > > shared lib build faster than the static one?
> >
> > With the older clang on RHEL7.4, the shared lib build is about 2% slower
> > than the static build.
> > With clang 11 compiled from source on Ubuntu 18.04, I'm getting good
> > performance with the static meson build: the same performance as with
> > makefiles with gcc, and ~6% better than the static meson gcc build.
> > Disabling PIC on clang 11 degrades performance by ~4%.
> > With clang 6.0.0, however, disabling PIC causes only a very small drop (~0.1%).
> >
> > This is on v20.08 with KVM ConnectX-5 SR-IOV passthrough. Command:
> > "dpdk-testpmd --master-lcore=0 -c 0x1ffff -n 4 -w 00:05.0 --socket-mem=2048,0 -- --port-numa-config=0,0 --socket-num=0 --burst=64 --txd=1024 --rxd=1024 --mbcache=512 --rxq=8 --txq=8 --nb-cores=4 --port-topology=chained --forward-mode=macswap --no-lsc-interrupt -i -a --rss-udp".
> >
> 
> So, am I right in saying that it appears the clang builds are all fine here, that
> performance is pretty much as expected in all cases with the default setting
> of PIC enabled? Therefore it appears that the issue is limited to gcc builds at
> this point?
> 
Yes it appears that way.

Regards,
Ali

* Re: [dpdk-dev] performance degradation with fpic
  2020-10-22 14:16             ` Ali Alnubani
@ 2020-11-02 10:40               ` Ali Alnubani
  2020-11-02 11:01                 ` Luca Boccassi
  2020-11-02 15:00                 ` Bruce Richardson
  0 siblings, 2 replies; 20+ messages in thread
From: Ali Alnubani @ 2020-11-02 10:40 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

Hi Bruce,

I was able to pin this down to drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (I had to prevent creating shared libs in this case).
What do you suggest I do? Can we have per-PMD customized compilation flags?
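
For anyone wanting to repeat this experiment, here is a minimal sketch of the build.ninja edit described above. It assumes the usual layout of a meson-generated build.ninja (a `build ...: c_COMPILER <source>` line followed by an indented `ARGS = ...` line); the function name, the sample object paths and the sample flags are made up for illustration and are not taken from a real DPDK build tree. The edit is lost whenever meson regenerates build.ninja, so this is only useful for a one-off measurement.

```python
def strip_flag_for_source(ninja_text: str, source_suffix: str,
                          flag: str = "-fPIC") -> str:
    """Return ninja_text with `flag` removed from the ARGS line of every
    build statement that mentions a path ending in `source_suffix`."""
    out = []
    in_target = False
    for line in ninja_text.splitlines():
        if line.startswith("build "):
            # A new build statement begins: does it compile the file we want?
            in_target = any(tok.endswith(source_suffix) for tok in line.split())
        elif in_target and line.lstrip().startswith("ARGS ="):
            key, _, value = line.partition("=")
            kept = [tok for tok in value.split() if tok != flag]
            line = key + "= " + " ".join(kept)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical excerpt, shaped like meson's ninja backend output.
SAMPLE = """\
build drivers/a.p/net_mlx5_mlx5_rxtx.c.o: c_COMPILER ../drivers/net/mlx5/mlx5_rxtx.c
 DEPFILE = drivers/a.p/net_mlx5_mlx5_rxtx.c.o.d
 ARGS = -Idrivers -O3 -fPIC -march=native

build drivers/a.p/net_mlx5_mlx5.c.o: c_COMPILER ../drivers/net/mlx5/mlx5.c
 ARGS = -Idrivers -O3 -fPIC -march=native
"""

if __name__ == "__main__":
    print(strip_flag_for_source(SAMPLE, "mlx5/mlx5_rxtx.c"))
```

Only the recipe for mlx5_rxtx.c loses the flag; every other object keeps -fPIC, which is why shared library creation has to be disabled separately.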

Regards,
Ali

* Re: [dpdk-dev] performance degradation with fpic
  2020-11-02 10:40               ` Ali Alnubani
@ 2020-11-02 11:01                 ` Luca Boccassi
  2020-11-02 15:00                 ` Bruce Richardson
  1 sibling, 0 replies; 20+ messages in thread
From: Luca Boccassi @ 2020-11-02 11:01 UTC (permalink / raw)
  To: Ali Alnubani, Bruce Richardson
  Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Mon, 2020-11-02 at 10:40 +0000, Ali Alnubani wrote:
> Hi Bruce,
> 
> I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (had to prevent creating shared libs in this case).
> What do you suggest I do? Can we have per-pmd customized compilation flags?
> 
> Regards,
> Ali

It's great to pinpoint it to that level - but would it be
possible now to find out _why_ that one source file is affected by
-fPIC in this way on GCC - and not on Clang? I think it would be much
better to fix the issue itself, rather than applying the "sledgehammer"
approach. Not using -fPIC has severe consequences for distributability,
as mentioned earlier.
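
One possible starting point for the "why" question, offered only as a sketch: compile mlx5_rxtx.c to assembly with and without -fPIC (for example with `gcc -S`) and compare how often each version goes through the GOT or PLT. The thread has not established that this is the cause; it is just one hypothesis that is cheap to check. The function name and the sample assembly below are invented for illustration, the pattern assumes x86-64 AT&T syntax, and the exact output varies with compiler version and distro defaults (some toolchains emit `@PLT` calls even without -fPIC).

```python
import re
from collections import Counter

# In x86-64 AT&T assembly, PIC code typically reaches preemptible symbols
# through the GOT (sym@GOTPCREL(%rip)) and calls them through sym@PLT.
_RELOC = re.compile(r"([A-Za-z_.$][\w.$]*)@(GOTPCREL|PLT)\b")


def count_pic_references(asm_text: str) -> Counter:
    """Count GOT-relative loads and PLT calls in an assembly listing."""
    counts = Counter()
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0]  # ignore trailing assembler comments
        for _symbol, kind in _RELOC.findall(code):
            counts[kind] += 1
    return counts


# Invented snippets showing the two shapes of code being compared.
PIC_ASM = """\
    movq    counter@GOTPCREL(%rip), %rax
    addl    $1, (%rax)
    call    helper@PLT
"""
NONPIC_ASM = """\
    addl    $1, counter(%rip)
    call    helper
"""

if __name__ == "__main__":
    print("with -fPIC:   ", dict(count_pic_references(PIC_ASM)))
    print("without -fPIC:", dict(count_pic_references(NONPIC_ASM)))
```

A large difference concentrated in the Rx/Tx burst functions would point towards symbol visibility or inlining differences rather than a general PIC penalty.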

* Re: [dpdk-dev] performance degradation with fpic
  2020-11-02 10:40               ` Ali Alnubani
  2020-11-02 11:01                 ` Luca Boccassi
@ 2020-11-02 15:00                 ` Bruce Richardson
  2020-11-03 10:18                   ` Thomas Monjalon
  1 sibling, 1 reply; 20+ messages in thread
From: Bruce Richardson @ 2020-11-02 15:00 UTC (permalink / raw)
  To: Ali Alnubani; +Cc: dev, NBU-Contact-Thomas Monjalon, Asaf Penso

On Mon, Nov 02, 2020 at 10:40:54AM +0000, Ali Alnubani wrote:
> Hi Bruce,
> 
> I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (had to prevent creating shared libs in this case).
> What do you suggest I do? Can we have per-pmd customized compilation flags?
> 
> Regards,
> Ali
> 
There are multiple possible ways to achieve this, but below are some ideas:

1. Take the changes for supporting function versioning and duplicate them
from lib/meson.build to drivers/meson.build. Since function versioning
support already requires everything to be built twice, we could set it to
not use -fPIC for the static libs in that case. Then mark mlx5 as using
function versioning. This is a bit hackish though, so alternatively:

2. The "objs" parameter from each sub-directory is not widely used, so we
could split this easily enough into objs-shared and objs-static, and allow
the subdirectory build file, in this case mlx5/meson.build, to build any C
files manually and pass them back. This is more flexible, and also means
that you can limit the files which are to be built twice to only the single
file, rather than marking the whole driver as needing a rebuild.

I'm sure there are other approaches too. However, I agree with Luca's
comment that the first step should probably be to see if you can track down
exactly why this one file is having problems. Could any of the slowdown be
due to the fact that you use a common lib from your driver? Are there
cross-driver calls in the fast-path that are suffering a penalty?

/Bruce

* Re: [dpdk-dev] performance degradation with fpic
  2020-11-02 15:00                 ` Bruce Richardson
@ 2020-11-03 10:18                   ` Thomas Monjalon
  2020-11-03 10:45                     ` Luca Boccassi
  2020-11-03 11:23                     ` Bruce Richardson
  0 siblings, 2 replies; 20+ messages in thread
From: Thomas Monjalon @ 2020-11-03 10:18 UTC (permalink / raw)
  To: Bruce Richardson, bluca
  Cc: Ali Alnubani, dev, Asaf Penso, ferruh.yigit, jerinj, akhil.goyal,
	andrew.rybchenko, ajit.khaparde, konstantin.ananyev, viacheslavo

02/11/2020 16:00, Bruce Richardson:
> On Mon, Nov 02, 2020 at 10:40:54AM +0000, Ali Alnubani wrote:
> > Hi Bruce,
> > 
> > I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (had to prevent creating shared libs in this case).
> > What do you suggest I do? Can we have per-pmd customized compilation flags?
> > 
> > Regards,
> > Ali
> > 
> There are multiple possible ways to achieve this, but below are some ideas:
> 
> 1. Take the changes for supporting function versioning and duplicate them
> from lib/meson.build to drivers/meson.build. Since function versioning
> support already requires everything to be built twice, we could set it to
> not use -fpic for the static libs in that case. Then mark mlx5 as using
> function versioning. This is a bit hackish though, so
> 
> 2. The "objs" parameter from each sub-directory is not widely used, so we
> could split this easily enough into objs-shared and objs-static, and allow
> the subdirectory build file, in this case mlx5/meson.build, to build any C
> files manually to pass them back. This is more flexible, and also means
> that you can limit the files which are to be built twice to only the single
> file, rather than marking the whole driver as needing rebuild.

Can it be done only in the driver?
No general meson change for this option?

> I'm sure there are other approaches too. However, I agree with Luca's
> comment that first approach should probably be to see if you can track down
> exactly why this one file is having problems. Could any of the slowdown be
> due to the fact that you use a common lib from your driver? Are there
> cross-driver calls in the fast-path that are suffering a penalty?

Of course the performance will be analyzed in the long run.
However, such analysis is more convenient if meson is flexible enough
to allow customization of the build.
And in general, I think it is good to have meson flexible enough
to allow any kind of driver build customization.



* Re: [dpdk-dev] performance degradation with fpic
  2020-11-03 10:18                   ` Thomas Monjalon
@ 2020-11-03 10:45                     ` Luca Boccassi
  2020-11-03 11:23                     ` Bruce Richardson
  1 sibling, 0 replies; 20+ messages in thread
From: Luca Boccassi @ 2020-11-03 10:45 UTC (permalink / raw)
  To: Thomas Monjalon, Bruce Richardson
  Cc: Ali Alnubani, dev, Asaf Penso, ferruh.yigit, jerinj, akhil.goyal,
	andrew.rybchenko, ajit.khaparde, konstantin.ananyev, viacheslavo

On Tue, 2020-11-03 at 11:18 +0100, Thomas Monjalon wrote:
> 02/11/2020 16:00, Bruce Richardson:
> > On Mon, Nov 02, 2020 at 10:40:54AM +0000, Ali Alnubani wrote:
> > > Hi Bruce,
> > > 
> > > I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (had to prevent creating shared libs in this case).
> > > What do you suggest I do? Can we have per-pmd customized compilation flags?
> > > 
> > > Regards,
> > > Ali
> > > 
> > There are multiple possible ways to achieve this, but below are some ideas:
> > 
> > 1. Take the changes for supporting function versioning and duplicate them
> > from lib/meson.build to drivers/meson.build. Since function versioning
> > support already requires everything to be built twice, we could set it to
> > not use -fpic for the static libs in that case. Then mark mlx5 as using
> > function versioning. This is a bit hackish though, so
> > 
> > 2. The "objs" parameter from each sub-directory is not widely used, so we
> > could split this easily enough into objs-shared and objs-static, and allow
> > the subdirectory build file, in this case mlx5/meson.build, to build any C
> > files manually to pass them back. This is more flexible, and also means
> > that you can limit the files which are to be built twice to only the single
> > file, rather than marking the whole driver as needing rebuild.
> 
> Can it be done only in the driver?
> No general meson change for this option?
> 
> > I'm sure there are other approaches too. However, I agree with Luca's
> > comment that first approach should probably be to see if you can track down
> > exactly why this one file is having problems. Could any of the slowdown be
> > due to the fact that you use a common lib from your driver? Are there
> > cross-driver calls in the fast-path that are suffering a penalty?
> 
> Of course the performance will be analyzed in the long run.
> However, such analysis is more convenient if meson is flexible enough
> to allow customization of the build.
> And in general, I think it is good to have meson flexible
> to allow any kind of driver build customization.

The problem is with the specific case, not with general customizations.
IIRC all libraries must be built with -fPIC to produce a relocatable
(position-independent) executable - you cannot mix and match. Missing this
feature means no address space layout randomization (ASLR), which is really
bad, especially for a network application.
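
To check what a given build actually produced, the ELF header can be inspected directly. The sketch below is only an illustration with made-up function names: it reads the `e_type` field, which is `ET_DYN` (3) for position-independent executables and shared objects, and `ET_EXEC` (2) for executables linked at a fixed address. `ET_DYN` alone does not distinguish a PIE from a shared library, so treat it as a quick sanity check rather than a full hardening audit.

```python
import struct

ET_EXEC = 2  # executable linked at a fixed address
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    fmt = "<H" if header[5] == 1 else ">H"
    # e_type is the 16-bit field immediately after the 16-byte e_ident.
    return struct.unpack_from(fmt, header, 16)[0]


def is_position_independent(path: str) -> bool:
    """True if the ELF file at `path` has type ET_DYN."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN


if __name__ == "__main__":
    import sys
    for binary in sys.argv[1:]:
        kind = "ET_DYN (relocatable)" if is_position_independent(binary) \
            else "not ET_DYN (fixed address)"
        print(f"{binary}: {kind}")
```

Run against a testpmd binary built with and without -fPIC objects, this shows whether the resulting executable can still be loaded at a randomized base address.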

-- 
Kind regards,
Luca Boccassi

* Re: [dpdk-dev] performance degradation with fpic
  2020-11-03 10:18                   ` Thomas Monjalon
  2020-11-03 10:45                     ` Luca Boccassi
@ 2020-11-03 11:23                     ` Bruce Richardson
  1 sibling, 0 replies; 20+ messages in thread
From: Bruce Richardson @ 2020-11-03 11:23 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: bluca, Ali Alnubani, dev, Asaf Penso, ferruh.yigit, jerinj,
	akhil.goyal, andrew.rybchenko, ajit.khaparde, konstantin.ananyev,
	viacheslavo

On Tue, Nov 03, 2020 at 11:18:57AM +0100, Thomas Monjalon wrote:
> 02/11/2020 16:00, Bruce Richardson:
> > On Mon, Nov 02, 2020 at 10:40:54AM +0000, Ali Alnubani wrote:
> > > Hi Bruce,
> > > 
> > > I was able to pin this down on drivers/net/mlx5/mlx5_rxtx.c. Removing -fPIC from its ninja recipe in build.ninja resolves the issue (had to prevent creating shared libs in this case).
> > > What do you suggest I do? Can we have per-pmd customized compilation flags?
> > > 
> > > Regards,
> > > Ali
> > > 
> > There are multiple possible ways to achieve this, but below are some ideas:
> > 
> > 1. Take the changes for supporting function versioning and duplicate them
> > from lib/meson.build to drivers/meson.build. Since function versioning
> > support already requires everything to be built twice, we could set it to
> > not use -fpic for the static libs in that case. Then mark mlx5 as using
> > function versioning. This is a bit hackish though, so
> > 
> > 2. The "objs" parameter from each sub-directory is not widely used, so we
> > could split this easily enough into objs-shared and objs-static, and allow
> > the subdirectory build file, in this case mlx5/meson.build, to build any C
> > files manually to pass them back. This is more flexible, and also means
> > that you can limit the files which are to be built twice to only the single
> > file, rather than marking the whole driver as needing rebuild.
> 
> Can it be done only in the driver?
> No general meson change for this option?
> 

Well, apart from splitting the objs variable into two, I don't see any
other general meson changes being needed in this case. So yes, it keeps any
changes specific to the driver.

That said, I have not tried to implement such a change, so the "in
practice" may be different from the "in theory"!

> > I'm sure there are other approaches too. However, I agree with Luca's
> > comment that first approach should probably be to see if you can track down
> > exactly why this one file is having problems. Could any of the slowdown be
> > due to the fact that you use a common lib from your driver? Are there
> > cross-driver calls in the fast-path that are suffering a penalty?
> 
> Of course the performance will be analyzed in the long run.
> However, such analysis is more convenient if meson is flexible enough
> to allow customization of the build.
> And in general, I think it is good to have meson flexible
> to allow any kind of driver build customization.
> 

I'm partially agreeing and partially disagreeing here. While flexibility is
something that people generally want, based off my experience with DPDK
builds over the last few years, I think that there is an awful lot to be
said for consistency! While we need to support special cases that we can't
work around, there are many advantages to having everything built in the
same way using common flags etc. I would really hate to see the flexibility
translate into drivers all choosing to do their own special build
customization.

/Bruce

end of thread, other threads:[~2020-11-03 11:23 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-15 16:00 [dpdk-dev] performance degradation with fpic Ali Alnubani
2020-10-15 17:08 ` Bruce Richardson
2020-10-15 17:14   ` Thomas Monjalon
2020-10-15 17:28     ` [dpdk-dev] [EXT] " Jerin Jacob Kollanukkaran
2020-10-16  8:29       ` Bruce Richardson
2020-10-16  8:39         ` Jerin Jacob
2020-10-15 21:44     ` [dpdk-dev] " Stephen Hemminger
2020-10-16  8:35       ` Bruce Richardson
2020-10-16  9:59   ` Bruce Richardson
2020-10-19 11:47     ` Ali Alnubani
2020-10-19 13:01       ` Bruce Richardson
2020-10-22 13:17         ` Ali Alnubani
2020-10-22 13:57           ` Bruce Richardson
2020-10-22 14:16             ` Ali Alnubani
2020-11-02 10:40               ` Ali Alnubani
2020-11-02 11:01                 ` Luca Boccassi
2020-11-02 15:00                 ` Bruce Richardson
2020-11-03 10:18                   ` Thomas Monjalon
2020-11-03 10:45                     ` Luca Boccassi
2020-11-03 11:23                     ` Bruce Richardson
