From: "Wiles, Keith"
To: Neil Horman
CC: Neil Horman, Thomas Monjalon, "dev@dpdk.org", "Mcnamara, John"
Date: Sun, 24 Jul 2016 18:08:00 +0000
Subject: Re: [dpdk-dev] [PATCH] validate_abi: build faster by augmenting make with job count
Message-ID: <0D4B04BD-C9E5-43FA-9652-D3636B6766D0@intel.com>
In-Reply-To: <20160721183401.GE10032@hmsreliant.think-freely.org>
List-Id: patches and discussions about DPDK

Sent from my iPhone

> On Jul 21, 2016, at 1:34 PM, Neil Horman wrote:
>
>> On Thu, Jul 21, 2016 at 03:22:45PM +0000, Wiles, Keith wrote:
>>
>>> On Jul 21, 2016, at 10:06 AM, Neil Horman wrote:
>>>
>>> On Thu, Jul 21, 2016 at 02:09:19PM +0000, Wiles, Keith wrote:
>>>>
>>>>> On Jul 21, 2016, at 8:54 AM, Neil Horman wrote:
>>>>>
>>>>> On Wed, Jul 20, 2016 at 10:32:28PM +0000, Wiles, Keith wrote:
>>>>>>
>>>>>>> On Jul 20, 2016, at 3:16 PM, Neil Horman wrote:
>>>>>>>
>>>>>>> On Wed, Jul 20, 2016 at 07:47:32PM +0000, Wiles, Keith wrote:
>>>>>>>>
>>>>>>>>> On Jul 20, 2016, at 12:48 PM, Neil Horman wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 20, 2016 at 07:40:49PM +0200, Thomas Monjalon wrote:
>>>>>>>>>> 2016-07-20 13:09, Neil Horman:
>>>>>>>>>>> From: Neil Horman
>>>>>>>>>>>
>>>>>>>>>>> John Mcnamara and I were discussing enhancing the validate_abi script to build
>>>>>>>>>>> the dpdk tree faster with multiple jobs. There's no reason not to do it, so this
>>>>>>>>>>> implements that requirement. It uses a MAKE_JOBS variable that can be set by
>>>>>>>>>>> the user to limit the job count. By default the job count is set to the number
>>>>>>>>>>> of online cpus.
>>>>>>>>>>
>>>>>>>>>> Please could you use the variable name DPDK_MAKE_JOBS?
>>>>>>>>>> This name is already used in scripts/test-build.sh.
>>>>>>>>> Sure
>>>>>>>>>
>>>>>>>>>>> +if [ -z "$MAKE_JOBS" ]
>>>>>>>>>>> +then
>>>>>>>>>>> +	# This counts the number of cpus on the system
>>>>>>>>>>> +	MAKE_JOBS=`lscpu -p=cpu | grep -v "#" | wc -l`
>>>>>>>>>>> +fi
>>>>>>>>>>
>>>>>>>>>> Is lscpu common enough?
>>>>>>>>> I'm not sure how to answer that. lscpu is part of the util-linux package, which
>>>>>>>>> is part of any base install. There's a variant for BSD, but I'm not sure how
>>>>>>>>> common it is there.
>>>>>>>>> Neil
>>>>>>>>>
>>>>>>>>>> Another acceptable default would be just "-j" without any number.
>>>>>>>>>> It would make the number of jobs unlimited.
>>>>>>>>
>>>>>>>> I think the best is to just use -j, as it tries to use the correct number of jobs based on the number of cores, right?
>>>>>>> -j with no argument (or -j 0), is sort of, maybe what you want. With either of
>>>>>>> those options, make will just issue jobs as fast as it processes dependencies.
>>>>>>> Depending on how parallel the build is, that can lead to tons of waiting processes
>>>>>>> (i.e. more than your number of online cpus), which can actually hurt your build
>>>>>>> time.
>>>>>>
>>>>>> I read the manual and looked at the code, which supports your statement. (I think I read some statement on Stack Overflow, and that is the last time I believe anything on the internet :-) I have not seen a lot of difference in compile times with -j on my system. Mostly I suspect it depends on the number of paths in the dependency graph, and the cores and memory on the system.
>>>>>>
>>>>>> I have 72 lcores on 2 sockets, 18 cores per socket. Xeon 2.3 GHz cores.
>>>>>>
>>>>>> $ export RTE_TARGET=x86_64-native-linuxapp-gcc
>>>>>>
>>>>>> $ time make install T=${RTE_TARGET}
>>>>>> real 0m59.445s user 0m27.344s sys 0m7.040s
>>>>>>
>>>>>> $ time make install T=${RTE_TARGET} -j
>>>>>> real 0m26.584s user 0m14.380s sys 0m5.120s
>>>>>>
>>>>>> # Remove the x86_64-native-linuxapp-gcc
>>>>>>
>>>>>> $ time make install T=${RTE_TARGET} -j 72
>>>>>> real 0m23.454s user 0m10.832s sys 0m4.664s
>>>>>>
>>>>>> $ time make install T=${RTE_TARGET} -j 8
>>>>>> real 0m23.812s user 0m10.672s sys 0m4.276s
>>>>>>
>>>>>> cd x86_64-native-linuxapp-gcc
>>>>>> $ make clean
>>>>>> $ time make
>>>>>> real 0m28.539s user 0m9.820s sys 0m3.620s
>>>>>>
>>>>>> # Do a make clean between each build.
>>>>>>
>>>>>> $ time make -j
>>>>>> real 0m7.217s user 0m6.532s sys 0m2.332s
>>>>>>
>>>>>> $ time make -j 8
>>>>>> real 0m8.256s user 0m6.472s sys 0m2.456s
>>>>>>
>>>>>> $ time make -j 72
>>>>>> real 0m6.866s user 0m6.184s sys 0m2.216s
>>>>>>
>>>>>> Just the real time numbers in the following table.
>>>>>>
>>>>>> processes   real time   depdirs
>>>>>> no -j       59.4s       Yes
>>>>>> -j 8        23.8s       Yes
>>>>>> -j 72       23.5s       Yes
>>>>>> -j          26.5s       Yes
>>>>>>
>>>>>> no -j       28.5s       No
>>>>>> -j 8         8.2s       No
>>>>>> -j 72        6.8s       No
>>>>>> -j           7.2s       No
>>>>>>
>>>>>> Looks like the depdirs build time on my system:
>>>>>> $ make clean -j
>>>>>> $ rm .depdirs
>>>>>> $ time make -j
>>>>>> real 0m23.734s user 0m11.228s sys 0m4.844s
>>>>>>
>>>>>> About 16 seconds, which is not a lot of savings. Now the difference from no -j to -j is a lot, but the difference between -j and -j is not a huge saving. This leads me back to over-engineering the problem when '-j' would work just as well here.
>>>>>>
>>>>>> Even on my MacBook Pro i7 system the difference is not that much: 1m8s without the depdirs build for -j, in a VirtualBox VM with all 4 cores and 8G RAM, compared to 1m13s with the -j 4 option.
>>>>>>
>>>>>> I just wonder if it makes a lot of sense to use cpuinfo in this given case if it turns out that -j works under the 80% rule?
>>>>> It may, but that seems to me to be a reason to just set DPDK_MAKE_JOBS=0, and
>>>>> you'll get that behavior
>>>>
>>>> Just to be sure, 'make -j 0' is not a valid argument to the -j option. It looks like you have to do '-j' or '-j N' or no option, where N != 0
>>>>
>>>> I think we just use -j, which gets us the 80% rule and the best performance without counting cores.
>>> That's odd, specifying 0 works for me. If it doesn't for you, specifying $MAX_INT
>>> or some other huge number would be comparable
>>
>> rkwiles@supermicro (master):~/.../dpdk/x86_64-native-linuxapp-gcc$ make --version
>> GNU Make 4.1
>> Built for x86_64-pc-linux-gnu
>> Copyright (C) 1988-2014 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>>
>> rkwiles@supermicro (master):~/.../dpdk/x86_64-native-linuxapp-gcc$ make -j 0
>> make: the '-j' option requires a positive integer argument
>>
>> rkwiles@supermicro (master):~/.../dpdk/x86_64-native-linuxapp-gcc$ lsb_release -a
>> No LSB modules are available.
>> Distributor ID: Ubuntu
>> Description: Ubuntu 16.04.1 LTS
>> Release: 16.04
>> Codename: xenial
> I'm not saying your variant doesn't work, only that my copy of make does, but
> it's possible that I have some alternately patched version (I used to fix make
> bugs way back when, so I may have an impure copy). Regardless, my comment is
> still valid, if you want to have unlimited jobs, you can just export
> DPDK_MAKE_JOBS=

Neil,

Your modified copy of make has no bearing on the topic we are talking about: customers using dpdk in standard distros, right?

It seems odd to me to send this out with 0 or lscpu, as it may fail where lscpu is not installed, and -j 0 will fail on all Ubuntu systems.

If we ship with 1, then why even bother adding the code? And if I have to edit the file, or use some other method, to get better compile performance, then why bother as well?

Setting the value to some large number does not make any sense to me, and having to edit the file every time or maintain a patch just seems silly.

It just seems easier to set it to -j and not use lscpu at all. This way we all win, as I am not following your logic at all.

Keith

>
> Neil
>
>>>
>>> Neil
>>>
>>>>>
>>>>> Neil
>>>>>
>>>>>> On some other project with a lot more files, like the FreeBSD or Linux distro, yes, it would make a fair amount of real time difference.
>>>>>>
>>>>>> Keith
>>>>>>
>>>>>>>
>>>>>>> While it's fine in lots of cases, it's not always fine, and with this
>>>>>>> implementation you can still opt in to that behavior by setting DPDK_MAKE_JOBS=0
>>>>>>>
>>>>>>> Neil
>>
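
For illustration only, a minimal sketch of one way the defaulting argued over in this thread could be written. It is not the code from the patch. It honors DPDK_MAKE_JOBS when set, otherwise tries a few ways of counting online CPUs, and falls back to a bare -j when no count is available. The function name make_jobs_flag is hypothetical. Mapping DPDK_MAKE_JOBS=0 to a bare -j is an assumption made here, chosen because the GNU Make 4.1 output quoted in this message rejects '-j 0'.

```shell
#!/bin/sh
# Hypothetical helper, not part of validate_abi: print the -j flag for make.
make_jobs_flag() {
    n="${DPDK_MAKE_JOBS:-}"
    if [ -z "$n" ]; then
        # Try several CPU-count sources; any of them may be missing.
        n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
            n=$(nproc 2>/dev/null) ||
            n=$(lscpu -p=cpu 2>/dev/null | grep -vc '^#') ||
            n=""
    fi
    case "$n" in
        # Empty, zero or non-numeric: assumption, treat as "unlimited jobs".
        ''|0|*[!0-9]*) echo "-j" ;;
        # A positive count: pass it through to make.
        *) echo "-j$n" ;;
    esac
}
```

It could then be used as: make $(make_jobs_flag) install T=${RTE_TARGET}. Whether a bare -j or a CPU count is the better default is exactly the open question in this thread; the sketch only shows that neither choice has to fail when lscpu is absent.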