From: "Varghese, Vipin" <Vipin.Varghese@amd.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Song, Keesang" <Keesang.Song@amd.com>
Subject: RE: [PATCH v4] build: reduce use of AVX compiler flags
Date: Tue, 10 Jun 2025 13:02:12 +0000 [thread overview]
Message-ID: <PH7PR12MB8596ADB9B5BC7947CF6AC43C826AA@PH7PR12MB8596.namprd12.prod.outlook.com> (raw)
In-Reply-To: <aEbUQNFZHhKFKGeB@bricha3-mobl1.ger.corp.intel.com>
[Public]
Hi Bruce,
Snipped
> > > > >
> > > > > Hi Bruce, we have reviewed this internally and tested the same.
> > > > > We would like
> > > > your thought for the following.
> > > > >
> > > > > - Before patch: we were directly setting AVX512 falgs for F, BW,
> > > > > DQ, VL
> > > > > - new patch: we are setting the flags for `skylake-server` as bare minimal.
> > > > > - AMD supports AVX512 from `znver4 and higher`.
> > > > >
> > > > > As per GCC
> > > > > `https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html`, the extra
> > > > > ISA
> > > > supported between skylake-server (super set) and znver4 and znver5
> > > > are `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`.
> > > > > Currently for DPDK microbenchmarks and examples runs safe as it
> > > > > is not using
> > > > the `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`
> > > > instructions.
> > > > >
> > > > > Question: should we check if target is `AMD EPYC` then apply
> > > > > bare minimum as
> > > > `-march=znver4`, thus avoid possible unsupported instruction
> > > > generation when non `c_args for march` is passed?
> > > > >
> > > >
> > > > Can you clarify why you mean by the "target" here? Is there a
> > > > specific value you are thinking of for the "cpu_instruction_set" option?
> > >
> > > `Target` is target CPU, when generated without any arguments we get code for
> `native build`.
> > >
> > > On AMD target cpu zen4 or zen5; Before patch as per the code ` AVX512 flags
> for F, BW, DQ` are used in ` cc_avx512_flags`.
> > >
> > > With the patch, the cc_avx512_flags is set to `-march=skylake-avx512` (where
> compiler optimizations `can add HLE, PREFETCHW, SGX`).
> > >
> >
> > With this patch for zen4 or zen5 the AVX512 code paths should be
> > compiled with no additional flags, since the -march=zen* flag should
> > include everything necessary. Can you confirm that you see the extra
> > -march flag in those cases?
> >
>
> Ran a quick test myself, this is what I see, doing a zen4 build:
>
> $ meson setup -Dcpu_instruction_set=znver4 build-zen4
> The Meson build system
> Version: 1.7.0
> Source dir: /home/bruce/dpdk-github
> Build dir: /home/bruce/dpdk-github/build-zen4
> ...
> Fetching value of define "__AVX512F__" : 1
> Message: Extra C flags needed for AVX2 output: []
> Message: Extra C flags needed for AVX512 output: []
> ...
>
> Checking the build.ninja file in the build-zen4 directory, there is no use of
> march=skylake. Here is the compilation recipe for i40e AVX-512 code, for
> example:
>
> build
> drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o:
> c_COMPILER ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
> DEPFILE =
> drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
> DEPFILE_UNQUOTED =
> drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
> ARGS = -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers -
> Idrivers/net/intel/i40e -I../drivers/net/intel/i40e -Idrivers/net/intel/i40e/base -
> I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common -
> I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -
> Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include
> -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -
> Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -
> I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -
> Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -
> Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu -I../lib/rcu -
> I/usr/include/x86_64-linux-gnu -fdiagnostics-color=always -
> D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include
> rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -
> Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs
> -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
> -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers -
> D_GNU_SOURCE -fPIC -march=znver4 -mrtm -DALLOW_EXPERIMENTAL_API
> -DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed-
> member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -
> DCC_AVX512_SUPPORT
I have retried the patch and tested the same on 2 scenarios
1. Scenario-1: znver4 (avx512) with gcc 13.2
2. Scenario-2: znver3 (no avx512) with gcc 13.2
For scenario-2 where CPU is not supporting avx512 but compiler supports avx512 I get the logs as
```
Fetching value of define "__AVX512F__" :
Message: found AVX512
Message: ['-march=skylake-avx512']
Compiler for C supports arguments -Wno-overriding-option: NO
Message: Extra C flags needed for AVX2 output: []
Message: Extra C flags needed for AVX512 output: ['-march=skylake-avx512']
Message: AVX 512 flag in drivers
Message: ['-march=skylake-avx512']
Message: ['-march=native', '-mrtm', '-DALLOW_EXPERIMENTAL_API', '-DALLOW_INTERNAL_API', '-Wno-format-truncation', '-Wno-address-of-packed-member', '-DDLB2_BYPASS_FENCE_ON_PP=0', '-DDLB_HW_CREDITS_CHECKS=1', '-DDLB_SW_CREDITS_CHECKS=1', '-DDLB_TYPE_CHECK=1', '-DRTE_LOG_DEFAULT_LOGTYPE=pmd.event.dlb2', '-DCC_AVX512_SUPPORT']
```
The above is correct as per my understanding as the test is to compile for avx512 ISA.
But When using `ninja -C build -v` and building for native build, I get
```
[1741/3539] cc -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers -Idrivers/net/intel/i40e -I../drivers/net/intel/i40e -Idrivers/net/intel/i40e/base
-I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include
-I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal
-I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net
-I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Idrivers/bus/pci -I../drivers/bus/pci
-I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu -I../lib/rcu -fdiagnostics-color=always
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security
-Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings
-Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
-Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512 -MD
-MQ drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -MF drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
-o drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -c ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
```
In above log I get `2 instances of march`; logs `-march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
-Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512`.
Question-1: I think this is not expected right? The `-march=native` is populated from `cflags` and `-march= skylake-avx512` is populated from ` cc_avx512_flags`.
Question-2: if the target is meet minimal ISA why not we use `-march=x86-64-v4`?
Note: I am yet to check for cross build. Will update on cross build how this comes out.
>
> Regards,
> /Bruce
next prev parent reply other threads:[~2025-06-10 13:02 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-25 17:22 [RFC PATCH] " Bruce Richardson
2025-03-26 16:21 ` Bruce Richardson
2025-03-26 18:06 ` Morten Brørup
2025-03-26 19:20 ` Stephen Hemminger
2025-03-27 7:55 ` DPDK compilers and RHEL 7 support Morten Brørup
2025-03-27 11:11 ` Kevin Traynor
2025-04-09 9:53 ` [RFC PATCH] build: reduce use of AVX compiler flags Varghese, Vipin
2025-04-09 11:31 ` Bruce Richardson
2025-04-10 3:49 ` Varghese, Vipin
2025-05-27 16:24 ` [PATCH v2] " Bruce Richardson
2025-05-29 10:01 ` Bruce Richardson
2025-05-29 10:33 ` Bruce Richardson
2025-05-29 10:31 ` [PATCH v3] " Bruce Richardson
2025-05-29 15:42 ` [PATCH v4] " Bruce Richardson
2025-05-30 9:30 ` Bruce Richardson
2025-06-09 6:02 ` Varghese, Vipin
2025-06-09 7:57 ` Bruce Richardson
2025-06-09 11:59 ` Varghese, Vipin
2025-06-09 12:23 ` Bruce Richardson
2025-06-09 12:32 ` Bruce Richardson
2025-06-10 13:02 ` Varghese, Vipin [this message]
2025-06-10 14:27 ` Bruce Richardson
2025-06-10 14:52 ` Varghese, Vipin
2025-06-10 15:06 ` Bruce Richardson
2025-06-11 1:24 ` Varghese, Vipin
2025-06-11 10:53 ` [PATCH v5] " Bruce Richardson
2025-06-12 9:55 ` Varghese, Vipin
2025-06-27 10:11 ` Thomas Monjalon
2025-06-24 14:13 ` Bruce Richardson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=PH7PR12MB8596ADB9B5BC7947CF6AC43C826AA@PH7PR12MB8596.namprd12.prod.outlook.com \
--to=vipin.varghese@amd.com \
--cc=Keesang.Song@amd.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).