From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id A4383A04E7; Mon, 2 Nov 2020 14:55:33 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 4C590C8A6; Mon, 2 Nov 2020 14:55:32 +0100 (CET) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id C5F1DC87E for ; Mon, 2 Nov 2020 14:55:29 +0100 (CET) IronPort-SDR: OM7aJJEnx49RkHBGW225qvXjAjrXUcNQ2ieWO4vwZ4X2lkZ9gL4BNFh88GL+kiAUsqTVRwZGP0 1knu9zeLIkrQ== X-IronPort-AV: E=McAfee;i="6000,8403,9792"; a="186718429" X-IronPort-AV: E=Sophos;i="5.77,445,1596524400"; d="scan'208";a="186718429" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Nov 2020 05:55:27 -0800 IronPort-SDR: GhrwKIPL2+h7yWFdZ6BG7xzMDJWBkv07hqjEGs20zqvUk6tecd41j/bPiRPS0tmWeA6muC0hC8 cS3L4/neNbVQ== X-IronPort-AV: E=Sophos;i="5.77,445,1596524400"; d="scan'208";a="470385217" Received: from bricha3-mobl.ger.corp.intel.com ([10.252.49.238]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA; 02 Nov 2020 05:55:24 -0800 Date: Mon, 2 Nov 2020 13:55:20 +0000 From: Bruce Richardson To: Honnappa Nagarahalli Cc: Juraj =?utf-8?Q?Linke=C5=A1?= , Ruifeng Wang , Phil Yang , "vcchunga@amazon.com" , Dharmik Thakkar , "jerinjacobk@gmail.com" , "hemant.agrawal@nxp.com" , "dev@dpdk.org" , nd Message-ID: <20201102135520.GA1454@bricha3-MOBL.ger.corp.intel.com> References: <1602600882-695-1-git-send-email-juraj.linkes@pantheon.tech> <1603280261-20206-1-git-send-email-juraj.linkes@pantheon.tech> <1603280261-20206-4-git-send-email-juraj.linkes@pantheon.tech> <20201021120220.GC592@bricha3-MOBL.ger.corp.intel.com> <54e20244e1e841148669eb93b6b8876c@pantheon.tech> <20201021141319.GD592@bricha3-MOBL.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Subject: Re: [dpdk-dev] [RFC PATCH v3 3/6] build: automatic NUMA and cpu counts detection X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Thu, Oct 29, 2020 at 04:31:33AM +0000, Honnappa Nagarahalli wrote: > > > > > > > > > > On Wed, Oct 21, 2020 at 01:37:38PM +0200, Juraj Linkeš wrote: > > > > > The build machine's number of cpus and numa nodes vary, resulting > > > > > in mismatched counts of RTE_MAX_LCORE and > > RTE_MAX_NUMA_NODES for > > > > many > > > > > builds. Automatically discover the host's numa and cpu counts to > > > > > remove this mismatch for native builds. Use current defaults for default > > builds. > > > > > Force the users to specify the counts for cross build in cross > > > > > files or on the command line. > > > > > Give users the option to override the discovery or values from > > > > > cross files by specifying them on the command line with > > > > > -Dmax_lcores and -Dmax_numa_nodes. > > > > > > > > > > Signed-off-by: Juraj Linkeš > > > > > --- > > > > > buildtools/get_cpu_count.py | 7 ++++++ > > > > > buildtools/get_numa_count.py | 22 +++++++++++++++++++ > > > > > buildtools/meson.build | 2 ++ > > > > > config/meson.build | 42 > > ++++++++++++++++++++++++++++++++++-- > > > > > meson_options.txt | 8 +++---- > > > > > 5 files changed, 75 insertions(+), 6 deletions(-) create mode > > > > > 100644 buildtools/get_cpu_count.py create mode 100644 > > > > > buildtools/get_numa_count.py > > > > > > > > > > diff --git a/buildtools/get_cpu_count.py > > > > > b/buildtools/get_cpu_count.py new file mode 100644 index > > > > > 000000000..386f85f8b > > > > > --- /dev/null > > > > > +++ b/buildtools/get_cpu_count.py > > > > > @@ -0,0 +1,7 @@ > > > > > +#!/usr/bin/python3 > > > > > +# SPDX-License-Identifier: BSD-3-Clause # Copyright (c) 2020 > > > > > +PANTHEON.tech s.r.o. > > > > > + > > > > > +import os > > > > > + > > > > > +print(os.cpu_count()) > > > > > diff --git a/buildtools/get_numa_count.py > > > > > b/buildtools/get_numa_count.py new file mode 100644 index > > > > > 000000000..f0c49973a > > > > > --- /dev/null > > > > > +++ b/buildtools/get_numa_count.py > > > > > @@ -0,0 +1,22 @@ > > > > > +#!/usr/bin/python3 > > > > > +# SPDX-License-Identifier: BSD-3-Clause # Copyright (c) 2020 > > > > > +PANTHEON.tech s.r.o. > > > > > + > > > > > +import ctypes > > > > > +import glob > > > > > +import os > > > > > +import subprocess > > > > > + > > > > > +if os.name == 'posix': > > > > > + if os.path.isdir('/sys/devices/system/node'): > > > > > + print(len(glob.glob('/sys/devices/system/node/node*'))) > > > > > + else: > > > > > + print(subprocess.run(['sysctl', 'vm.ndomains'], > > > > > +capture_output=True).stdout) > > > > > + > > > > > +elif os.name == 'nt': > > > > > + libkernel32 = ctypes.windll.kernel32 > > > > > + > > > > > + count = ctypes.c_ulong() > > > > > + > > > > > + libkernel32.GetNumaHighestNodeNumber(ctypes.pointer(count)) > > > > > + print(count.value + 1) > > > > > diff --git a/buildtools/meson.build b/buildtools/meson.build index > > > > > 04808dabc..925e733b1 100644 > > > > > --- a/buildtools/meson.build > > > > > +++ b/buildtools/meson.build > > > > > @@ -17,3 +17,5 @@ else > > > > > endif > > > > > map_to_win_cmd = py3 + files('map_to_win.py') sphinx_wrapper = > > > > > py3 + > > > > > files('call-sphinx-build.py') > > > > > +get_cpu_count_cmd = py3 + files('get_cpu_count.py') > > > > > +get_numa_count_cmd = py3 + files('get_numa_count.py') > > > > > diff --git a/config/meson.build b/config/meson.build index > > > > > a57c8ae9e..c4477f977 100644 > > > > > --- a/config/meson.build > > > > > +++ b/config/meson.build > > > > > @@ -74,7 +74,11 @@ endif > > > > > # still being able to support the CPU features required for DPDK. > > > > > # This can be bumped up by the DPDK project, but it can never be > > > > > an # invariant like 'native' > > > > > +max_lcores = get_option('max_lcores') max_numa_nodes = > > > > > +get_option('max_numa_nodes') > > > > > if machine == 'default' > > > > > + max_numa_nodes = 4 > > > > > + max_lcores = 128 > > > > > > > > This doesn't seem right, since you are overriding the user-specified > > > > values with hard-coded ones. > > > > > > > > > > I understand we're using the default build/generic to build portalbe dpdk > > distro packages, meaning the settings for those packages should always be > > the same, no? If not, what should the default/generic build be? And when > > would someone do a default/generic build with their values? It wouldn't be a > > default/generic anymore, right? > > > > > > > > if host_machine.cpu_family().startswith('x86') > > > > > # matches the old pre-meson build systems default > > > > > machine = 'corei7' > > > > > @@ -83,6 +87,22 @@ if machine == 'default' > > > > > elif host_machine.cpu_family().startswith('ppc') > > > > > machine = 'power8' > > > > > endif > > > > > +elif not meson.is_cross_build() > > > > > + # find host core count and numa node count for native builds > > > > > + if max_lcores == 0 > > > > > + max_lcores = > > > > run_command(get_cpu_count_cmd).stdout().to_int() > > > > > + min_lcores = 2 > > > > > + if max_lcores < min_lcores > > > > > + message('Found less than @0@ cores, building for > > > > @0@ cores'.format(min_lcores)) > > > > > + max_lcores = min_lcores > > > > > + else > > > > > + message('Found @0@ cores'.format(max_lcores)) > > > > > + endif > > > > > + endif > > > > > + if max_numa_nodes == 0 > > > > > + max_numa_nodes = > > > > run_command(get_numa_count_cmd).stdout().to_int() > > > > > + message('Found @0@ numa > > nodes'.format(max_numa_nodes)) > > > > > + endif > > > > > endif > > > > > > > > > > dpdk_conf.set('RTE_MACHINE', machine) @@ -227,8 +247,10 @@ > > > > > foreach > > > > > arg: warning_flags endforeach > > > > > > > > > > # set other values pulled from the build options > > > > > -dpdk_conf.set('RTE_MAX_LCORE', get_option('max_lcores')) > > > > > -dpdk_conf.set('RTE_MAX_NUMA_NODES', > > get_option('max_numa_nodes')) > > > > > +if not meson.is_cross_build() > > > > > + dpdk_conf.set('RTE_MAX_LCORE', max_lcores) > > > > > + dpdk_conf.set('RTE_MAX_NUMA_NODES', max_numa_nodes) > > endif > > > > > > > > Rather than conditionally setting the value here, you should move > > > > the checks below up above this to simplify things. > > > > > > > > > > Do you mean the cross build checks? Those have to be after > > subdir(arch_subdir) so that we can override the values from cross files (as > > the commit message says). > > > > > > > > dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports')) > > > > > dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet')) > > > > > dpdk_conf.set('RTE_ENABLE_TRACE_FP', > > > > > get_option('enable_trace_fp')) @@ > > > > > -247,6 +269,22 @@ compile_time_cpuflags = [] > > > > > subdir(arch_subdir) > > > > > dpdk_conf.set('RTE_COMPILE_TIME_CPUFLAGS', > > > > > ','.join(compile_time_cpuflags)) > > > > > > > > > > +# check that cpu and numa count is set in cross builds if > > > > > +meson.is_cross_build() > > > > > + if max_lcores > 0 > > > > > + # specified on the cmdline > > > > > + dpdk_conf.set('RTE_MAX_LCORE', max_lcores) > > > > > + elif not dpdk_conf.has('RTE_MAX_LCORE') > > > > > + error('Number of cores for cross build not specified in @0@ > > > > subdir (e.g. in a cross-file) nor on the > > > > cmdline'.format(arch_subdir)) > > > > > + endif > > > > > + if max_numa_nodes > 0 > > > > > + # specified on the cmdline > > > > > + dpdk_conf.set('RTE_MAX_NUMA_NODES', > > max_numa_nodes) > > > > > + elif not dpdk_conf.has('RTE_MAX_NUMA_NODES') > > > > > + error('Number of numa nodes for cross build not specified in > > > > @0@ subdir (e.g. in a cross-file) nor on the > > > > cmdline'.format(arch_subdir)) > > > > > + endif > > > > > +endif > > > > > + > > > > > # set the install path for the drivers > > > > > dpdk_conf.set_quoted('RTE_EAL_PMD_PATH', eal_pmd_path) > > > > > > > > > > diff --git a/meson_options.txt b/meson_options.txt index > > > > > 9bf18ab6b..01b0c45c3 100644 > > > > > --- a/meson_options.txt > > > > > +++ b/meson_options.txt > > > > > @@ -26,10 +26,10 @@ option('machine', type: 'string', value: 'native', > > > > > description: 'set the target machine type') > > > > > option('max_ethports', > > > > > type: 'integer', value: 32, > > > > > description: 'maximum number of Ethernet devices') > > > > > -option('max_lcores', type: 'integer', value: 128, > > > > > - description: 'maximum number of cores/threads supported by EAL') > > > > > -option('max_numa_nodes', type: 'integer', value: 4, > > > > > - description: 'maximum number of NUMA nodes supported by EAL') > > > > > +option('max_lcores', type: 'integer', value: 0, > > > > > + description: 'maximum number of cores/threads supported by EAL. > > > > > +Value 0 means the number of cpus on the host will be used. For > > > > > +cross build, > > > > set to non-zero to overwrite the cross-file value.') > > > > option('max_numa_nodes', > > > > type: 'integer', value: 0, > > > > > + description: 'maximum number of NUMA nodes supported by EAL. > > > > > +Value > > > > 0 > > > > > +means the number of numa nodes on the host will be used. For > > > > > +cross build, set to non-zero to overwrite the cross-file value.') > > > > > > > > I don't like this change, because it very much assumes for > > > > non-cross-compiles that people will be running DPDK on the system > > > > they build it on. That's a very, very big assumption! > > > > > > I'll be using definitions from https://mesonbuild.com/Cross- > > compilation.html. > > > I understand cross compilation to be building for a diffent host machine > > than the build machine (which is aligned with pretty much every definition I > > found). I understand this to be true not only for builds between > > architectures, but also within an architecture (e.g. x86_64 build machine > > building for x86_64 host machine). > > > So yes, when someone does a native build, it stands to reason they want to > > use it on the build machine. If they wanted to use it elsewhere, they would > > cross compile. > > > Another thing is the current build philosophy is to detect as much as > > possible (not having statically defined configuration, as you mentioned in the > > past). Detecting the number of cores and numa nodes fits this perfectly. > > > And yet another thing is that the assumption seems to be already present > > in the build system - it already detects a lot things, some of which may not be > > satisfied on machines other than the build machine. I may be wrong about > > this. > > > > > > > I'm ok with having zero as a "detect" option, and having the values > > > > overridden from cross-files, but not with detection as the default > > > > out- of-the-box option! Lots of users may pull builds from a CI > > > > based on VMs with just a few cores, for instance. > > > > > > If not having the automatic detection is a concern because of users using CI > > builds, then we (if it's from our CI) can change what we're building in CI - the > > default/generic build seems like a good fit because it's supposed to work on > > a variety of systems. Expecting that native build from random VMs would > > work anywhere doesn't seen very realistic - it's been build for that VM > > environment (because it's a native build). > > > > > > Here's my understanding on which the current version is based: > > > 1. Since we want to get away from having statically defined configuration, > > numa and core count discovery is exactly what we should have in the build > > system. Since discorery is currently the default for lib/drivers, it stands to > > reason it should be default for everything else, if possible. > > > 2. Native build should produce binaries matching the build machine as well > > as possible. > > > 3. Default/generic build should produce binaries executable on a range of > > systems (ideally all systems of a given architecture). > > > 4. Other builds, that is non-native builds, are cross-compilation, since we're > > building for host machine other that the build machine. > > > > > > What I haven't taken into account is users using CI builds. That could be > > remedied by modifying the CI a bit while being consistent with what > > native/default/generic/cross builds are (or should be). And in any case, if > > we're not interested in testing the exact CI environment (which we aren't, > > since we don't want to use 2 cores with 1 numa), we really shouldn't be doing > > native builds there. > > > > > > I'm interested in hearing where my thinking deviates from yours. > > > > > > > There are a number of points in which we differ, I think. > > > > Firstly, the use of "native" and "default/generic" for the "machine" > > parameter refers only to the instruction-set level from the compiler, and > > should not affect any other settings, since all settings are independent. > > Therefore, setting "machine" to "native" does not mean that we should > > detect cores and numa nodes, and similarly setting it to "default" does not > > mean that we should ignore the settings for these values and pick our own > > chosen default values. > Apologies to go to an older discussion. > I am trying to understand the definitions/expectations for 'native' and 'generic' builds. > As you say, instruction-set level is definitely one parameter. Part of the confusion arises from the fact that originally that was the only parameter set by this - and on x86 it still is. Perhaps this parameter needs to be renamed to "isa-level" or "architecture-flag" or similar to reflect its meaning. This would then allow a new "machine" setting, which can be considered separately. The question then is how much that helps with the main issue under discussion, that of cores and numa node values. > But, I think other DPDK specific parameters should also be considered. > For ex: RTE_MAX_LCORE should have a particular value for 'generic' build in all the supported architectures. The value could be different for each architecture, but it is fixed for the 'generic' build for a given architecture. Otherwise, the 'generic' build might not run on all the machines of that architecture. > > Similarly, for 'native' build, is there any reason not to include other DPDK parameters as part of the definition? IMO, 'native' should refer to the entire build machine, not just the ISA. i.e. build on the target machine. > While I understand the idea here, it is somewhat complicated by the fact that the meson options given in "meson_options.txt" cannot be set by meson code, which means that when we change the machine flag to "native" we can only use or ignore the user-provided lcores and numa nodes setting - we have no way to change them and reflect those changes back to the user. :-( This leads to the situation in the discussion in this thread, where we start needing all sorts of magic values to indicate use of machine-type defaults or detected defaults. Regards, /Bruce