From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id F0FC3A04DD; Wed, 21 Oct 2020 16:28:02 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D5672AC65; Wed, 21 Oct 2020 16:28:01 +0200 (CEST) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 30FF9AC63 for ; Wed, 21 Oct 2020 16:27:59 +0200 (CEST) IronPort-SDR: Ez+53N92Spk0aC7+XGchdY6PDG1DL2+2062QmxGiF0XBDqF7Iu4Wi51pFAl3eoPTz6s2a0l/xP cAaOrD0kXbdg== X-IronPort-AV: E=McAfee;i="6000,8403,9780"; a="146666937" X-IronPort-AV: E=Sophos;i="5.77,401,1596524400"; d="scan'208";a="146666937" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Oct 2020 07:27:58 -0700 IronPort-SDR: hQsHbvMpP9YXOYmQb8e97ZQpaba5jUQ0Re3ib0ggmAztx3D0NDgFbqui5QPM+W8+S+JSvxO4c9 zz4Ma2/XAvUQ== X-IronPort-AV: E=Sophos;i="5.77,401,1596524400"; d="scan'208";a="348337622" Received: from bricha3-mobl.ger.corp.intel.com ([10.213.249.97]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA; 21 Oct 2020 07:27:56 -0700 Date: Wed, 21 Oct 2020 15:27:53 +0100 From: Bruce Richardson To: Juraj =?utf-8?Q?Linke=C5=A1?= Cc: "Ruifeng.Wang@arm.com" , "Honnappa.Nagarahalli@arm.com" , "Phil.Yang@arm.com" , "vcchunga@amazon.com" , "Dharmik.Thakkar@arm.com" , "jerinjacobk@gmail.com" , "hemant.agrawal@nxp.com" , "dev@dpdk.org" Message-ID: <20201021142753.GE592@bricha3-MOBL.ger.corp.intel.com> References: <1602600882-695-1-git-send-email-juraj.linkes@pantheon.tech> <1603280261-20206-1-git-send-email-juraj.linkes@pantheon.tech> <1603280261-20206-4-git-send-email-juraj.linkes@pantheon.tech> <20201021120220.GC592@bricha3-MOBL.ger.corp.intel.com> <54e20244e1e841148669eb93b6b8876c@pantheon.tech> <20201021141319.GD592@bricha3-MOBL.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20201021141319.GD592@bricha3-MOBL.ger.corp.intel.com> Subject: Re: [dpdk-dev] [RFC PATCH v3 3/6] build: automatic NUMA and cpu counts detection X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, Oct 21, 2020 at 03:13:19PM +0100, Bruce Richardson wrote: > On Wed, Oct 21, 2020 at 01:01:41PM +0000, Juraj Linkeš wrote: > > > > > > > -----Original Message----- > > > From: Bruce Richardson > > > Sent: Wednesday, October 21, 2020 2:02 PM > > > To: Juraj Linkeš > > > Cc: Ruifeng.Wang@arm.com; Honnappa.Nagarahalli@arm.com; > > > Phil.Yang@arm.com; vcchunga@amazon.com; Dharmik.Thakkar@arm.com; > > > jerinjacobk@gmail.com; hemant.agrawal@nxp.com; dev@dpdk.org > > > Subject: Re: [RFC PATCH v3 3/6] build: automatic NUMA and cpu counts > > > detection > > > > > > On Wed, Oct 21, 2020 at 01:37:38PM +0200, Juraj Linkeš wrote: > > > > The build machine's number of cpus and numa nodes vary, resulting in > > > > mismatched counts of RTE_MAX_LCORE and RTE_MAX_NUMA_NODES for > > > many > > > > builds. Automatically discover the host's numa and cpu counts to > > > > remove this mismatch for native builds. Use current defaults for default builds. > > > > Force the users to specify the counts for cross build in cross files > > > > or on the command line. > > > > Give users the option to override the discovery or values from cross > > > > files by specifying them on the command line with -Dmax_lcores and > > > > -Dmax_numa_nodes. > > > > > > > > Signed-off-by: Juraj Linkeš > > > > --- > > > > buildtools/get_cpu_count.py | 7 ++++++ > > > > buildtools/get_numa_count.py | 22 +++++++++++++++++++ > > > > buildtools/meson.build | 2 ++ > > > > config/meson.build | 42 ++++++++++++++++++++++++++++++++++-- > > > > meson_options.txt | 8 +++---- > > > > 5 files changed, 75 insertions(+), 6 deletions(-) create mode 100644 > > > > buildtools/get_cpu_count.py create mode 100644 > > > > buildtools/get_numa_count.py > > > > > > > > diff --git a/buildtools/get_cpu_count.py b/buildtools/get_cpu_count.py > > > > new file mode 100644 index 000000000..386f85f8b > > > > --- /dev/null > > > > +++ b/buildtools/get_cpu_count.py > > > > @@ -0,0 +1,7 @@ > > > > +#!/usr/bin/python3 > > > > +# SPDX-License-Identifier: BSD-3-Clause # Copyright (c) 2020 > > > > +PANTHEON.tech s.r.o. > > > > + > > > > +import os > > > > + > > > > +print(os.cpu_count()) > > > > diff --git a/buildtools/get_numa_count.py > > > > b/buildtools/get_numa_count.py new file mode 100644 index > > > > 000000000..f0c49973a > > > > --- /dev/null > > > > +++ b/buildtools/get_numa_count.py > > > > @@ -0,0 +1,22 @@ > > > > +#!/usr/bin/python3 > > > > +# SPDX-License-Identifier: BSD-3-Clause # Copyright (c) 2020 > > > > +PANTHEON.tech s.r.o. > > > > + > > > > +import ctypes > > > > +import glob > > > > +import os > > > > +import subprocess > > > > + > > > > +if os.name == 'posix': > > > > + if os.path.isdir('/sys/devices/system/node'): > > > > + print(len(glob.glob('/sys/devices/system/node/node*'))) > > > > + else: > > > > + print(subprocess.run(['sysctl', 'vm.ndomains'], > > > > +capture_output=True).stdout) > > > > + > > > > +elif os.name == 'nt': > > > > + libkernel32 = ctypes.windll.kernel32 > > > > + > > > > + count = ctypes.c_ulong() > > > > + > > > > + libkernel32.GetNumaHighestNodeNumber(ctypes.pointer(count)) > > > > + print(count.value + 1) > > > > diff --git a/buildtools/meson.build b/buildtools/meson.build index > > > > 04808dabc..925e733b1 100644 > > > > --- a/buildtools/meson.build > > > > +++ b/buildtools/meson.build > > > > @@ -17,3 +17,5 @@ else > > > > endif > > > > map_to_win_cmd = py3 + files('map_to_win.py') sphinx_wrapper = py3 + > > > > files('call-sphinx-build.py') > > > > +get_cpu_count_cmd = py3 + files('get_cpu_count.py') > > > > +get_numa_count_cmd = py3 + files('get_numa_count.py') > > > > diff --git a/config/meson.build b/config/meson.build index > > > > a57c8ae9e..c4477f977 100644 > > > > --- a/config/meson.build > > > > +++ b/config/meson.build > > > > @@ -74,7 +74,11 @@ endif > > > > # still being able to support the CPU features required for DPDK. > > > > # This can be bumped up by the DPDK project, but it can never be an > > > > # invariant like 'native' > > > > +max_lcores = get_option('max_lcores') max_numa_nodes = > > > > +get_option('max_numa_nodes') > > > > if machine == 'default' > > > > + max_numa_nodes = 4 > > > > + max_lcores = 128 > > > > > > This doesn't seem right, since you are overriding the user-specified values with > > > hard-coded ones. > > > > > > > I understand we're using the default build/generic to build portalbe dpdk distro packages, meaning the settings for those packages should always be the same, no? If not, what should the default/generic build be? And when would someone do a default/generic build with their values? It wouldn't be a default/generic anymore, right? > > > > > > if host_machine.cpu_family().startswith('x86') > > > > # matches the old pre-meson build systems default > > > > machine = 'corei7' > > > > @@ -83,6 +87,22 @@ if machine == 'default' > > > > elif host_machine.cpu_family().startswith('ppc') > > > > machine = 'power8' > > > > endif > > > > +elif not meson.is_cross_build() > > > > + # find host core count and numa node count for native builds > > > > + if max_lcores == 0 > > > > + max_lcores = > > > run_command(get_cpu_count_cmd).stdout().to_int() > > > > + min_lcores = 2 > > > > + if max_lcores < min_lcores > > > > + message('Found less than @0@ cores, building for > > > @0@ cores'.format(min_lcores)) > > > > + max_lcores = min_lcores > > > > + else > > > > + message('Found @0@ cores'.format(max_lcores)) > > > > + endif > > > > + endif > > > > + if max_numa_nodes == 0 > > > > + max_numa_nodes = > > > run_command(get_numa_count_cmd).stdout().to_int() > > > > + message('Found @0@ numa nodes'.format(max_numa_nodes)) > > > > + endif > > > > endif > > > > > > > > dpdk_conf.set('RTE_MACHINE', machine) @@ -227,8 +247,10 @@ foreach > > > > arg: warning_flags endforeach > > > > > > > > # set other values pulled from the build options > > > > -dpdk_conf.set('RTE_MAX_LCORE', get_option('max_lcores')) > > > > -dpdk_conf.set('RTE_MAX_NUMA_NODES', get_option('max_numa_nodes')) > > > > +if not meson.is_cross_build() > > > > + dpdk_conf.set('RTE_MAX_LCORE', max_lcores) > > > > + dpdk_conf.set('RTE_MAX_NUMA_NODES', max_numa_nodes) endif > > > > > > Rather than conditionally setting the value here, you should move the checks > > > below up above this to simplify things. > > > > > > > Do you mean the cross build checks? Those have to be after subdir(arch_subdir) so that we can override the values from cross files (as the commit message says). > > > > > > dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports')) > > > > dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet')) > > > > dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp')) @@ > > > > -247,6 +269,22 @@ compile_time_cpuflags = [] > > > > subdir(arch_subdir) > > > > dpdk_conf.set('RTE_COMPILE_TIME_CPUFLAGS', > > > > ','.join(compile_time_cpuflags)) > > > > > > > > +# check that cpu and numa count is set in cross builds if > > > > +meson.is_cross_build() > > > > + if max_lcores > 0 > > > > + # specified on the cmdline > > > > + dpdk_conf.set('RTE_MAX_LCORE', max_lcores) > > > > + elif not dpdk_conf.has('RTE_MAX_LCORE') > > > > + error('Number of cores for cross build not specified in @0@ > > > subdir (e.g. in a cross-file) nor on the cmdline'.format(arch_subdir)) > > > > + endif > > > > + if max_numa_nodes > 0 > > > > + # specified on the cmdline > > > > + dpdk_conf.set('RTE_MAX_NUMA_NODES', max_numa_nodes) > > > > + elif not dpdk_conf.has('RTE_MAX_NUMA_NODES') > > > > + error('Number of numa nodes for cross build not specified in > > > @0@ subdir (e.g. in a cross-file) nor on the cmdline'.format(arch_subdir)) > > > > + endif > > > > +endif > > > > + > > > > # set the install path for the drivers > > > > dpdk_conf.set_quoted('RTE_EAL_PMD_PATH', eal_pmd_path) > > > > > > > > diff --git a/meson_options.txt b/meson_options.txt index > > > > 9bf18ab6b..01b0c45c3 100644 > > > > --- a/meson_options.txt > > > > +++ b/meson_options.txt > > > > @@ -26,10 +26,10 @@ option('machine', type: 'string', value: 'native', > > > > description: 'set the target machine type') option('max_ethports', > > > > type: 'integer', value: 32, > > > > description: 'maximum number of Ethernet devices') > > > > -option('max_lcores', type: 'integer', value: 128, > > > > - description: 'maximum number of cores/threads supported by EAL') > > > > -option('max_numa_nodes', type: 'integer', value: 4, > > > > - description: 'maximum number of NUMA nodes supported by EAL') > > > > +option('max_lcores', type: 'integer', value: 0, > > > > + description: 'maximum number of cores/threads supported by EAL. > > > > +Value 0 means the number of cpus on the host will be used. For cross build, > > > set to non-zero to overwrite the cross-file value.') option('max_numa_nodes', > > > type: 'integer', value: 0, > > > > + description: 'maximum number of NUMA nodes supported by EAL. Value > > > 0 > > > > +means the number of numa nodes on the host will be used. For cross > > > > +build, set to non-zero to overwrite the cross-file value.') > > > > > > I don't like this change, because it very much assumes for non-cross-compiles > > > that people will be running DPDK on the system they build it on. That's a very, > > > very big assumption! > > > > I'll be using definitions from https://mesonbuild.com/Cross-compilation.html. > > I understand cross compilation to be building for a diffent host machine than the build machine (which is aligned with pretty much every definition I found). I understand this to be true not only for builds between architectures, but also within an architecture (e.g. x86_64 build machine building for x86_64 host machine). > > So yes, when someone does a native build, it stands to reason they want to use it on the build machine. If they wanted to use it elsewhere, they would cross compile. > > Another thing is the current build philosophy is to detect as much as possible (not having statically defined configuration, as you mentioned in the past). Detecting the number of cores and numa nodes fits this perfectly. > > And yet another thing is that the assumption seems to be already present in the build system - it already detects a lot things, some of which may not be satisfied on machines other than the build machine. I may be wrong about this. > > > > > I'm ok with having zero as a "detect" option, and having > > > the values overridden from cross-files, but not with detection as the default out- > > > of-the-box option! Lots of users may pull builds from a CI based on VMs with > > > just a few cores, for instance. > > > > If not having the automatic detection is a concern because of users using CI builds, then we (if it's from our CI) can change what we're building in CI - the default/generic build seems like a good fit because it's supposed to work on a variety of systems. Expecting that native build from random VMs would work anywhere doesn't seen very realistic - it's been build for that VM environment (because it's a native build). > > > > Here's my understanding on which the current version is based: > > 1. Since we want to get away from having statically defined configuration, numa and core count discovery is exactly what we should have in the build system. Since discorery is currently the default for lib/drivers, it stands to reason it should be default for everything else, if possible. > > 2. Native build should produce binaries matching the build machine as well as possible. > > 3. Default/generic build should produce binaries executable on a range of systems (ideally all systems of a given architecture). > > 4. Other builds, that is non-native builds, are cross-compilation, since we're building for host machine other that the build machine. > > > > What I haven't taken into account is users using CI builds. That could be remedied by modifying the CI a bit while being consistent with what native/default/generic/cross builds are (or should be). And in any case, if we're not interested in testing the exact CI environment (which we aren't, since we don't want to use 2 cores with 1 numa), we really shouldn't be doing native builds there. > > > > I'm interested in hearing where my thinking deviates from yours. > > > > There are a number of points in which we differ, I think. > > Firstly, the use of "native" and "default/generic" for the "machine" > parameter refers only to the instruction-set level from the compiler, and > should not affect any other settings, since all settings are independent. > Therefore, setting "machine" to "native" does not mean that we should > detect cores and numa nodes, and similarly setting it to "default" does not > mean that we should ignore the settings for these values and pick our own > chosen default values. > > Secondly, the use of cross-compilation only applies when you are compiling > for a different architecture or environment (e.g. OS) to what you are > building on. Building on a 4-core x86 machine to run on a dual-socket, > 32-core x86 machine is not cross-compiling, but still needs to work by > default. Something like building a 32-bit binary on a 64-bit OS is in most > cases not done by cross-compilation either, but is rather outside the scope > of the discussion, except as a reference point to show the scope of > differences which can be accomodated as "native builds". > > In terms of dynamic configuration for things like cores and numa nodes, the > ideal end state here is not to have them detected at build time on the host > system, but instead to have them detected at runtime and sized dynamically. > In the absense of that, these values should be set to reasonable defaults > so that when a user compiles up a binary without settings these explicitly > it should run on 95%+ of systems of that type. > > This is my understanding of the issues, anyway. :-) > What could possibly work is to set the defaults for these to "0" as done in your patch, but thereafter have the resulting defaults set per-architecture, rather than globally. That would allow x86 to tune things more for native-style builds, while in all cases allowing user to override. /Bruce