From: Bruce Richardson <bruce.richardson@intel.com>
To: Andre Muezerie <andremue@linux.microsoft.com>
Cc: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>, <dev@dpdk.org>
Subject: Re: [PATCH 3/6] config: allow faster instruction sets to be used with MSVC
Date: Wed, 26 Feb 2025 09:44:31 +0000 [thread overview]
Message-ID: <Z77if3SkegDq84XN@bricha3-mobl1.ger.corp.intel.com> (raw)
In-Reply-To: <20250226020138.GA16574@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
On Tue, Feb 25, 2025 at 06:01:38PM -0800, Andre Muezerie wrote:
> On Tue, Feb 25, 2025 at 02:28:02PM +0000, Bruce Richardson wrote:
> > On Mon, Feb 24, 2025 at 01:01:16PM -0800, Andre Muezerie wrote:
> > > Up to now MSVC has being used with the default mode, which uses SSE2
> > > instructions for scalar floating-point and vector calculations.
> > > https://learn.microsoft.com/en-us/cpp/build/reference/arch-x64?view=msvc-170
> > >
> > > This patch allows users to specify the CPU for which the generated
> > > code should be optimized for in the same way it's done for GCC: by
> > > passing the CPU name.
> > > When no explicit CPU name is passed, 'native' is assumed (like it
> > > happens with GCC) and the code will be optimized for the same CPU
> > > type used to compile the code.
> > >
> > > MSVC does not provide this functionality natively, so logic was
> > > added to meson.build to handle these differences, detecting which
> > > instruction sets are supported by the CPU(s), passing the best
> > > options to MSVC and setting the correct macros (like __AVX512F__)
> > > so that the DPDK code can rely on them like it is done with GCC.
> > >
> > > Signed-off-by: Andre Muezerie <andremue@linux.microsoft.com>
> > > ---
> >
> > Hi Andre,
> >
> > couple of initial thoughts inline below.
> >
> > /Bruce
> >
> > > config/x86/meson.build | 364 ++++++++++++++++++++++++++++++++++++-----
> > > 1 file changed, 325 insertions(+), 39 deletions(-)
> > >
> >
> > There is quite a lot of new code to be added here. Might it be worthwhile
> > creating a "config/x86/msvc/" subdirectory with its own meson.build file to
> > handle all the complexities of using it. We can have the common material at
> > the top of the x86/meson.build file, and then do
> >
> > if is_ms_compiler
> > subdir(msvc)
> > subdir_done()
> > endif
> >
> > leaving the rest of the file for the gcc/clang/icx code.
>
> I think that makes sense, as there's not much common code there that is common to gcc and msvc.
>
<snip>
> >
> > I really don't want to have tables like this to maintain in our code if at
> > all possible. We used to have something a bit similar in DPDK IIRC, but we
> > found it a maintenance nightmare and just switched to using the compiler to
> > do all the work. In our existing builds, we just pass the
> > cpu_instruction_set parameter straight to the -march flag of the compiler.
> > For MSVC support, I believe we should just do the exact same.
> >
> > Maintaining lists like this will be a problem as new platforms need to be
> > constantly added.
>
> It's great that when using gcc users can just pass the CPU type to it and
> that it will set all the macros corresponding to that CPU for them. I
> would love to be able to rely on MSVC for that as well, unfortunately MSVC
> does not provide that level of granularity, that's why I came up with the
> idea of having this table.
>
> Initially I was also very concerned about the amount of data to be stored
> there, and the work required to maintain it. Then I decided to throw away
> all the CPU types that do not have SSE4_2. That reduced the table in half.
> Adding the entries manually was still a lot of work and error prone. So,
> I decided to write some code that uses gcc to build that table for me.
> I'll polish that code and add it to the patch. With that it will be almost
> zero effort to maintain that table. All that will be required is running
> that code on a setup with the latest gcc, providing a file with the CPU
> names. Assuming gcc knows about the latest CPUs a new table will be
> generated and can be pasted in the meson.build file.
>
> I assume this does not need to be done too often. In the worst case, if it
> happens that DPDK was not updated with the latest CPUs, people can still
> pick an earlier CPU with similar characteristics and have similar (if not
> same) performance.
>
Thanks. Having it automated certainly helps, but I'm still not fully
convinced that it's a good idea. If everyone else is happy, though, I'm ok
to ignore those concerns :-). Let's see what others think.
>
> > Do we also look to backport them, because if equivalence
> > with the linux build is necessary then that will have to be done - as on
> > Linux when a new version of GCC comes out, we can then use the new
> > instruction set targets on the old releases of DPDK.
>
> That would be nice, and I'm willing to help with that. It makes it a
> better user experience if we can minimize the perceived differences
> between the toolsets. I'm not sure if it's a requirement though. If
> you're concerned that this would add too much overhead, it could be
> decided that no such backport should happen.
>
> >
<snip>
> > > + if 'RTE_CPUFLAG_AVX512F' in compile_time_cpuflags
> > > + machine_args += ['/arch:AVX512']
> > > + elif 'RTE_CPUFLAG_AVX2' in compile_time_cpuflags
> > > + machine_args += ['/arch:AVX2']
> > > + elif 'RTE_CPUFLAG_AVX' in compile_time_cpuflags
> > > + machine_args += ['/arch:AVX']
> > > + else
> > > + # SSE4.2 is expected to always be available
> > > + machine_args += ['/arch:SSE4.2']
> > > endif
> > > -endforeach
> > >
> >
> > Since these appear to be the only /arch flags supported by the compiler for
> > code generation, I would suggest that these would be the only instruction
> > set flags that we support on MSVC builds, and that we then build up the
> > actual CPU flags based on the minimum flags to be expected when each of
> > these instruction sets is present.
> >
> > Similarly with 'native', rather than supporting all the different CPU types,
> > it would be a lot easier to just determine if it's an SSE4 machine, an AVX2
> > machine or AVX512, and run with that.
> >
> > My thinking is that getting this as a first step should get us a lot of the
> > benefits without such a massive maintenance headache.
>
> Providing only /arch option at first sight might look simpler, but I still
> like the table approach much better. As I said earlier, maintaining that
> table is not much work with that extra devtool.
>
> Some drawbacks from the simpler approach:
>
> 1) User experience is not as good as it will differ from other toolsets.
> Users will have to learn about more parameter/values (how/when to use them).
This is already the case, in that different versions of compilers have
different march flags available. In practice, I think "native" and
"default" are the two most important to have from usability. For any other
values, I'd consider it advanced tuning and that the user probably already
knows what they need or want and what their compiler can provide.
> 2) DPDK code will not benefit from other instruction sets which might be
> present (__RDSEED__, __RDRND__, etc.) because these are not set by /arch.
>
RDRND looks to be available everywhere AVX2 is, and RDSEED whereever AVX512
is, so in a simplified model, I'd bundle those in that way.
/Bruce
next prev parent reply other threads:[~2025-02-26 9:44 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-24 21:01 [PATCH 0/6] " Andre Muezerie
2025-02-24 21:01 ` [PATCH 1/6] eal: make compatible with instruction set updates for MSVC Andre Muezerie
2025-02-24 21:01 ` [PATCH 2/6] eal: only use numbers as align parameters " Andre Muezerie
2025-02-24 21:01 ` [PATCH 3/6] config: allow faster instruction sets to be used with MSVC Andre Muezerie
2025-02-25 14:28 ` Bruce Richardson
2025-02-26 2:01 ` Andre Muezerie
2025-02-26 9:44 ` Bruce Richardson [this message]
2025-02-24 21:01 ` [PATCH 4/6] drivers/net: make compatible with instruction set updates for MSVC Andre Muezerie
2025-02-25 9:06 ` Bruce Richardson
2025-02-25 16:44 ` Andre Muezerie
2025-02-24 21:01 ` [PATCH 5/6] acl: " Andre Muezerie
2025-02-25 9:03 ` Bruce Richardson
2025-02-25 16:37 ` Andre Muezerie
2025-02-25 17:21 ` Bruce Richardson
2025-02-25 17:23 ` Andre Muezerie
2025-02-24 21:01 ` [PATCH 6/6] member: " Andre Muezerie
2025-02-26 1:06 ` [PATCH v2 0/5] allow faster instruction sets to be used with MSVC Andre Muezerie
2025-02-26 1:06 ` [PATCH v2 1/5] eal: make compatible with instruction set updates for MSVC Andre Muezerie
2025-02-26 9:50 ` Bruce Richardson
2025-02-26 1:06 ` [PATCH v2 2/5] eal: only use numbers as align parameters " Andre Muezerie
2025-02-26 9:51 ` Bruce Richardson
2025-02-26 10:15 ` Konstantin Ananyev
2025-02-26 1:06 ` [PATCH v2 3/5] config: create top level variable cc_avx2_flags Andre Muezerie
2025-02-26 9:52 ` Bruce Richardson
2025-02-26 1:06 ` [PATCH v2 4/5] drivers/net: make compatible with instruction set updates for MSVC Andre Muezerie
2025-02-26 9:53 ` Bruce Richardson
2025-02-26 1:06 ` [PATCH v2 5/5] acl: " Andre Muezerie
2025-02-26 9:54 ` Bruce Richardson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z77if3SkegDq84XN@bricha3-mobl1.ger.corp.intel.com \
--to=bruce.richardson@intel.com \
--cc=andremue@linux.microsoft.com \
--cc=dev@dpdk.org \
--cc=konstantin.v.ananyev@yandex.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).