DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Varghese, Vipin" <Vipin.Varghese@amd.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Song, Keesang" <Keesang.Song@amd.com>
Subject: Re: [PATCH v4] build: reduce use of AVX compiler flags
Date: Tue, 10 Jun 2025 15:27:41 +0100	[thread overview]
Message-ID: <aEhA3YgjW9SsMfYE@bricha3-mobl1.ger.corp.intel.com> (raw)
In-Reply-To: <PH7PR12MB8596ADB9B5BC7947CF6AC43C826AA@PH7PR12MB8596.namprd12.prod.outlook.com>

On Tue, Jun 10, 2025 at 01:02:12PM +0000, Varghese, Vipin wrote:
> [Public]
> 
> Hi Bruce,
> 
> Snipped
> 
> > > > > >
> > > > > > Hi Bruce, we have reviewed this internally and tested the same.
> > > > > > We would like
> > > > > your thought for the following.
> > > > > >
> > > > > > - Before patch: we were directly setting AVX512 falgs for F, BW,
> > > > > > DQ, VL
> > > > > > - new patch: we are setting the flags for `skylake-server` as bare minimal.
> > > > > > - AMD supports AVX512 from `znver4 and higher`.
> > > > > >
> > > > > > As per GCC
> > > > > > `https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html`, the extra
> > > > > > ISA
> > > > > supported between skylake-server (super set) and znver4 and znver5
> > > > > are `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`.
> > > > > > Currently for DPDK microbenchmarks and examples runs safe as it
> > > > > > is not using
> > > > > the `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`
> > > > > instructions.
> > > > > >
> > > > > > Question: should we check if target is `AMD EPYC` then apply
> > > > > > bare minimum as
> > > > > `-march=znver4`, thus avoid possible unsupported instruction
> > > > > generation when non `c_args for march` is passed?
> > > > > >
> > > > >
> > > > > Can you clarify why you mean by the "target" here? Is there a
> > > > > specific value you are thinking of for the "cpu_instruction_set" option?
> > > >
> > > > `Target` is target CPU, when generated without any arguments we get code for
> > `native build`.
> > > >
> > > > On AMD target cpu zen4 or zen5; Before patch as per the code ` AVX512 flags
> > for F, BW, DQ` are used in ` cc_avx512_flags`.
> > > >
> > > > With the patch, the cc_avx512_flags is set to `-march=skylake-avx512` (where
> > compiler optimizations `can add HLE, PREFETCHW, SGX`).
> > > >
> > >
> > > With this patch for zen4 or zen5 the AVX512 code paths should be
> > > compiled with no additional flags, since the -march=zen* flag should
> > > include everything necessary. Can you confirm that you see the extra
> > > -march flag in those cases?
> > >
> >
> > Ran a quick test myself, this is what I see, doing a zen4 build:
> >
> >    $ meson setup -Dcpu_instruction_set=znver4 build-zen4
> >    The Meson build system
> >    Version: 1.7.0
> >    Source dir: /home/bruce/dpdk-github
> >    Build dir: /home/bruce/dpdk-github/build-zen4
> >    ...
> >    Fetching value of define "__AVX512F__" : 1
> >    Message: Extra C flags needed for AVX2 output: []
> >    Message: Extra C flags needed for AVX512 output: []
> >    ...
> >
> > Checking the build.ninja file in the build-zen4 directory, there is no use of
> > march=skylake. Here is the compilation recipe for i40e AVX-512 code, for
> > example:
> >
> > build
> > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o:
> > c_COMPILER ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
> >  DEPFILE =
> > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
> >  DEPFILE_UNQUOTED =
> > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
> >  ARGS = -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers -
> > Idrivers/net/intel/i40e -I../drivers/net/intel/i40e -Idrivers/net/intel/i40e/base -
> > I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common -
> > I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -
> > Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include
> > -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -
> > Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -
> > I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -
> > Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -
> > Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu -I../lib/rcu -
> > I/usr/include/x86_64-linux-gnu -fdiagnostics-color=always -
> > D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include
> > rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -
> > Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs
> > -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
> > -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers -
> > D_GNU_SOURCE -fPIC -march=znver4 -mrtm -DALLOW_EXPERIMENTAL_API
> > -DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed-
> > member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -
> > DCC_AVX512_SUPPORT
> 
> I have retried the patch and tested the same on 2 scenarios
> 
> 1. Scenario-1: znver4 (avx512) with gcc 13.2
> 2. Scenario-2: znver3 (no avx512) with gcc 13.2
> 
> For scenario-2 where CPU is not supporting avx512 but compiler supports avx512 I get the logs as
> 
> ```
> Fetching value of define "__AVX512F__" :
> Message: found AVX512
> Message: ['-march=skylake-avx512']
> Compiler for C supports arguments -Wno-overriding-option: NO
> Message: Extra C flags needed for AVX2 output: []
> Message: Extra C flags needed for AVX512 output: ['-march=skylake-avx512']
> 
> 
> Message: AVX 512 flag in drivers
> Message: ['-march=skylake-avx512']
> Message: ['-march=native', '-mrtm', '-DALLOW_EXPERIMENTAL_API', '-DALLOW_INTERNAL_API', '-Wno-format-truncation', '-Wno-address-of-packed-member', '-DDLB2_BYPASS_FENCE_ON_PP=0', '-DDLB_HW_CREDITS_CHECKS=1', '-DDLB_SW_CREDITS_CHECKS=1', '-DDLB_TYPE_CHECK=1', '-DRTE_LOG_DEFAULT_LOGTYPE=pmd.event.dlb2', '-DCC_AVX512_SUPPORT']
> ```
> 
> The above is correct as per my understanding as the test is to compile for avx512 ISA.
> But When using `ninja -C build -v` and building for native build, I get
> 
> ```
> [1741/3539] cc -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers -Idrivers/net/intel/i40e -I../drivers/net/intel/i40e -Idrivers/net/intel/i40e/base
> -I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include
> -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal
> -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net
> -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Idrivers/bus/pci -I../drivers/bus/pci
> -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu -I../lib/rcu -fdiagnostics-color=always
> -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security
> -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings
> -Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
> -Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512 -MD
> -MQ drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -MF drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
> -o drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -c ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
> ```
> 
> In above log I get `2 instances of march`; logs `-march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
> -Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512`.
> 
> Question-1: I think this is not expected right? The `-march=native` is populated from `cflags` and `-march= skylake-avx512` is populated from ` cc_avx512_flags`.

The above command is correct. So long as the compiler supports AVX-512 we
will always compile the AVX-512 code paths for runtime selection. In
practice, all supported compilers have AVX-512 support, so in reality we
have the two scenarios you tested:

* The target architecture e.g. znver3 in your case, doesn't support avx512,
  so the meson.build file adds on the necessary flags to add this support,
  i.e. that file is compiled with -march=skylake=avx512, which is the
  minimum ISA that gives you the necessary support.
* The target architecture, e.g. znver4, does support AVX-512, then no
  additional flags are added and the files are compiled "as normal"

In both these cases, whether the target architecture is specified as
"native" or explicitly makes no difference.

> Question-2: if the target is meet minimal ISA why not we use `-march=x86-64-v4`?
> 

Good point, that would indeed be better. I'm just not sure whether it is
supported widely enough on our compilers. Do you know what gcc and clang
versions support that target?

> Note: I am yet to check for cross build. Will update on cross build how this comes out.
> 
> 
> 
> >
> > Regards,
> > /Bruce

  reply	other threads:[~2025-06-10 14:28 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-25 17:22 [RFC PATCH] " Bruce Richardson
2025-03-26 16:21 ` Bruce Richardson
2025-03-26 18:06   ` Morten Brørup
2025-03-26 19:20     ` Stephen Hemminger
2025-03-27  7:55       ` DPDK compilers and RHEL 7 support Morten Brørup
2025-03-27 11:11         ` Kevin Traynor
2025-04-09  9:53       ` [RFC PATCH] build: reduce use of AVX compiler flags Varghese, Vipin
2025-04-09 11:31         ` Bruce Richardson
2025-04-10  3:49           ` Varghese, Vipin
2025-05-27 16:24 ` [PATCH v2] " Bruce Richardson
2025-05-29 10:01   ` Bruce Richardson
2025-05-29 10:33   ` Bruce Richardson
2025-05-29 10:31 ` [PATCH v3] " Bruce Richardson
2025-05-29 15:42 ` [PATCH v4] " Bruce Richardson
2025-05-30  9:30   ` Bruce Richardson
2025-06-09  6:02   ` Varghese, Vipin
2025-06-09  7:57     ` Bruce Richardson
2025-06-09 11:59       ` Varghese, Vipin
2025-06-09 12:23         ` Bruce Richardson
2025-06-09 12:32           ` Bruce Richardson
2025-06-10 13:02             ` Varghese, Vipin
2025-06-10 14:27               ` Bruce Richardson [this message]
2025-06-10 14:52                 ` Varghese, Vipin
2025-06-10 15:06                   ` Bruce Richardson
2025-06-11  1:24                     ` Varghese, Vipin
2025-06-11 10:53 ` [PATCH v5] " Bruce Richardson
2025-06-12  9:55   ` Varghese, Vipin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aEhA3YgjW9SsMfYE@bricha3-mobl1.ger.corp.intel.com \
    --to=bruce.richardson@intel.com \
    --cc=Keesang.Song@amd.com \
    --cc=Vipin.Varghese@amd.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).