From: "Burakov, Anatoly"
To: Stephen Hemminger, dev@dpdk.org
Date: Fri, 4 Sep 2020 15:58:03 +0100
Subject: Re: [dpdk-dev] [PATCH v2] usertools: add huge page setup script
In-Reply-To: <20200903224831.5932-1-stephen@networkplumber.org>
References: <20200901165643.15668-1-stephen@networkplumber.org> <20200903224831.5932-1-stephen@networkplumber.org>

On 03-Sep-20 11:48 PM, Stephen Hemminger wrote:
> This is an improved version of the setup of huge pages
> bases on earlier DPDK setup. Differences are:
>    * it autodetects NUMA vs non NUMA
>    * it allows setting different page sizes
>      recent kernels support multiple sizes.
>    * it accepts a parameter in bytes (not pages).
>
> If necessary the steps of clearing old settings and mounting/umounting
> can be done individually.
>
>
> Signed-off-by: Stephen Hemminger
> ---
> v2 -- rewrite in python
>       The script is python3 only because supporting older versions
>       no longer makes any sense.
>
>  usertools/hugepage-setup.py | 317 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 317 insertions(+)
>  create mode 100644 usertools/hugepage-setup.py
>
> diff --git a/usertools/hugepage-setup.py b/usertools/hugepage-setup.py
> new file mode 100644
> index 000000000000..8e7642428d9e
> --- /dev/null
> +++ b/usertools/hugepage-setup.py
> @@ -0,0 +1,317 @@
> +# Copyright (c) 2020 Microsoft Corporation
> +#
> +# Script to query and setup huge pages for DPDK applications.
> +
> +import sys
> +import os
> +import re
> +import getopt
> +import glob
> +from os.path import exists, basename
> +
> +# convention for where to mount huge pages
> +hugedir = '/dev/hugepages'

This isn't a "convention", this is a default systemd mountpoint.

> +
> +# command-line flags
> +show_flag = None
> +reserve_kb = None
> +clear_flag = None
> +hugepagesize_kb = None
> +mount_flag = None
> +unmount_flag = None
> +
> +
> +def usage():
> +    '''Print usage information for the program'''
> +    global hugedir
> +    mnt = hugedir
> +    argv0 = basename(sys.argv[0])
> +    print("""
> +Usage:
> +------
> +    %(argv0)s [options]
> +
> +Options:
> +    --help, --usage:
> +        Display usage information and quit
> +
> +    -s, --show:
> +        Print the current huge page configuration.
> +
> +    --setup:
> +        Simplified version of clear, umount, reserve, mount operations
> +
> +    -c, --clear:
> +        Remove all huge pages
> +
> +    -r, --reserve:
> +        Reserve huge pages. The size specified is in bytes, with
> +        optional K, M or G suffix. The size must be a multiple
> +        of the page size.
> +
> +    -p, --pagesize
> +        Choose page size to use. If not specified, the default
> +        system page size will be used.
> +
> +    -m, --mount
> +        Mount the system huge page directory %(mnt)s
> +
> +    -u, --umount
> +        Unmount the system huge page directory %(mnt)s
> +
> +
> +Examples:
> +---------
> +
> +To display current huge page settings:
> +    %(argv0)s -s
> +
> +To a complete setup of with 2 Gigabyte of 1G huge pages:
> +    %(argv0)s -p 1G --setup 2G
> +
> +Equivalent to:
> +    %(argv0)s -p 1G -c -u -r 2G -m
> +
> +To clear existing huge page settings and umount %(mnt)s
> +    %(argv0)s -c -u
> +
> +    """ % locals())
> +
> +
> +def fmt_memsize(sz):
> +    '''Format memory size in conventional format'''
> +    sz_kb = int(sz)
> +    if sz_kb >= 1024 * 1024:
> +        return '{}Gb'.format(sz_kb / (1024 * 1024))
> +    elif sz_kb >= 1024:
> +        return '{}Mb'.format(sz_kb / 1024)
> +    else:
> +        return '{}Kb'.format(sz_kb)

I've lost count how many times i've had to reimplement this code, but
there is an easier way :) Off the top of my head,

idx = log2(sz)  # every 10th power of 2
return '{}{}b'.format(sz, ' kMG'[int(idx) / 10])

or something close to that.

> +
> +
> +def get_memsize(arg):
> +    '''Convert memory size with suffix to kB'''
> +    m = re.match('(\d+)([GMKgmk]?)$', arg)
> +    if m is None:
> +        sys.exit('{} is not a valid page size'.format(arg))
> +
> +    num = float(m.group(1))
> +    suf = m.group(2)
> +    if suf == "G" or suf == "g":
> +        return int(num * 1024 * 1024)
> +    elif suf == "M" or suf == "m":
> +        return int(num * 1024)
> +    elif suf == "K" or suf == "k":
> +        return int(num)
> +    else:
> +        return int(num / 1024.)

Same here, can simply index an array and do powers of 2.
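To make the two suggestions above concrete, here is a rough sketch of both helpers written around a single suffix table. It is an illustration under assumptions, not a drop-in replacement: the name KB_SUFFIXES is made up for this example, sizes are assumed to be held in kB as in the patch, and the formatter truncates to a whole number, which is only exact for power-of-two sizes such as huge page sizes.

```python
import re
import sys

# Hypothetical helper table (not in the patch): suffix i means kB * 1024**i.
KB_SUFFIXES = 'KMG'


def fmt_memsize(kb):
    '''Format a size given in kB, picking the suffix from the bit length'''
    kb = int(kb)
    # bit_length() - 1 is floor(log2(kb)); every 10 bits is one suffix step.
    idx = max(0, min((kb.bit_length() - 1) // 10, len(KB_SUFFIXES) - 1))
    # Shifting truncates, so non-power-of-two sizes lose their fraction.
    return '{}{}b'.format(kb >> (10 * idx), KB_SUFFIXES[idx])


def get_memsize(arg):
    '''Convert a size in bytes with optional K, M or G suffix to kB'''
    m = re.match(r'(\d+)([KMG]?)$', arg.upper())
    if m is None:
        sys.exit('{} is not a valid size'.format(arg))
    num = int(m.group(1))
    suffix = m.group(2)
    if not suffix:
        return num // 1024  # no suffix: the value is in plain bytes
    return num << (10 * KB_SUFFIXES.index(suffix))
```

For instance, get_memsize('2G') gives 2097152 (kB) and fmt_memsize(2048) gives '2Mb'. Staying in integers also avoids the float division in the original, which would print sizes like '2.0Mb'.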
> +
> +
> +def is_numa():
> +    '''Test if NUMA is necessary on this system'''
> +    return exists('/sys/devices/system/node')
> +
> +
> +def get_hugepages(path):
> +    '''Read number of reserved pages'''
> +    with open(path + '/nr_hugepages') as f:

Here and in other places... os.path.join()?

> +        return int(f.read())
> +    return 0
> +
> +
> +def show_numa_pages():
> +    print('Node Pages Size')
> +    for n in glob.glob('/sys/devices/system/node/node*'):
> +        path = n + '/hugepages'
> +        node = n[29:]  # slice after /sys/devices/system/node/node

I mean, it works but it's not terribly Pythonic and looks more like
C-style string manipulation. Soooo, os.path.join(), os.path.basename(),
regex match? I would gladly trade any misguided pursuit of performance
here for readability and idiomatic-ness of this code.

It'd also make it easier to understand what's going on if you didn't mix
logic with presentation, and just returned an array or a dict of values
and print everything out in the caller, as opposed to printing
everything inline.
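As an illustration of both points (path helpers instead of slicing, and data collection kept apart from printing), something along these lines would do. This is only a sketch: get_numa_pages and its node_root parameter are names invented for the example, and the parameter exists mainly so the function can be pointed at a test directory.

```python
import glob
import os
import re


def get_numa_pages(node_root='/sys/devices/system/node'):
    '''Return a list of (node, size_kb, nr_pages) tuples, one per page size'''
    result = []
    for node_dir in sorted(glob.glob(os.path.join(node_root, 'node*'))):
        # 'node3' -> '3', without relying on the length of the parent path
        node = os.path.basename(node_dir)[len('node'):]
        pattern = os.path.join(node_dir, 'hugepages', 'hugepages-*kB')
        for size_dir in sorted(glob.glob(pattern)):
            m = re.match(r'hugepages-(\d+)kB$', os.path.basename(size_dir))
            if m is None:
                continue
            with open(os.path.join(size_dir, 'nr_hugepages')) as f:
                nr_pages = int(f.read())
            result.append((node, int(m.group(1)), nr_pages))
    return result


def show_numa_pages():
    '''Presentation only: print whatever get_numa_pages() collected'''
    print('Node Pages Size')
    for node, size_kb, nr_pages in get_numa_pages():
        if nr_pages > 0:
            print('{:<4} {:<5} {}kB'.format(node, nr_pages, size_kb))
```

With the data returned as plain tuples, the same function could also back the clear and reserve paths, and it can be exercised against a fake sysfs tree without touching the real system.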
> +        for d in os.listdir(path):
> +            sz = d[10:-2]  # slice out of hugepages-NNNkB
> +            nr_pages = get_hugepages(path + '/' + d)
> +            if nr_pages > 0:
> +                pg_sz = fmt_memsize(sz)
> +                print('{:<4} {:<5} {}'.format(node, nr_pages, pg_sz))
> +
> +
> +def show_non_numa_pages():
> +    print('Pages Size')
> +    path = '/sys/kernel/mm/hugepages'
> +    for d in os.listdir(path):
> +        sz = d[10:-2]
> +        nr_pages = get_hugepages(path + '/' + d)
> +        if nr_pages > 0:
> +            pg_sz = fmt_memsize(sz)
> +            print('{:<5} {}'.format(nr_pages, pg_sz))
> +
> +
> +def show_pages():
> +    '''Show existing huge page settings'''
> +    if is_numa():
> +        show_numa_pages()
> +    else:
> +        show_non_numa_pages()
> +
> +
> +def clear_numa_pages():
> +    for path in glob.glob(
> +            '/sys/devices/system/node/node*/hugepages/hugepages-*'):
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('0\n')
> +
> +
> +def clear_non_numa_pages():
> +    for path in glob.glob('/sys/kernel/mm/hugepages/hugepages-*'):
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('0\n')
> +
> +
> +def clear_pages():
> +    '''Clear all existing huge page mappings'''
> +    if is_numa():
> +        clear_numa_pages()
> +    else:
> +        clear_non_numa_pages()
> +
> +
> +def default_size():
> +    '''Get default huge page size from /proc/meminfo'''
> +    with open('/proc/meminfo') as f:
> +        for line in f:
> +            if line.startswith('Hugepagesize:'):
> +                return int(line.split()[1])
> +    return None
> +
> +
> +def set_numa_pages(nr_pages, hugepgsz):
> +    for n in glob.glob('/sys/devices/system/node/node*/hugepages'):
> +        path = '{}/hugepages-{}kB'.format(n, hugepgsz)
> +        if not exists(path):
> +            sys.exit(
> +                '{}Kb is not a valid system huge page size'.format(hugepgsz))
> +
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('{}\n'.format(nr_pages))
> +
> +
> +def set_non_numa_pages(nr_pages, hugepgsz):
> +    path = '/sys/kernel/mm/hugepages/hugepages-{}kB'.format(hugepgsz)
> +    if not exists(path):
> +        sys.exit('{}Kb is not a valid system huge page size'.format(hugepgsz))
> +
> +    with open(path + '/nr_hugepages', 'w') as f:
> +        f.write('{}\n'.format(nr_pages))
> +
> +
> +def set_pages(pages, hugepgsz):
> +    '''Sets the number of huge pages to be reserved'''
> +    if is_numa():
> +        set_numa_pages(pages, hugepgsz)
> +    else:
> +        set_non_numa_pages(pages, hugepgsz)
> +
> +
> +def mount_huge(pagesize):
> +    global hugedir
> +    cmd = "mount -t hugetlbfs"
> +    if pagesize:
> +        cmd += ' -o pagesize={}'.format(pagesize)
> +    cmd += ' nodev {}'.format(hugedir)
> +    os.system(cmd)
> +
> +
> +def show_mount():
> +    mounted = None
> +    with open('/proc/mounts') as f:
> +        for line in f:
> +            fields = line.split()
> +            if fields[2] != 'hugetlbfs':
> +                continue
> +            if not mounted:
> +                print("Hugepages mounted on:", end=" ")
> +                mounted = True
> +            print(fields[1], end=" ")
> +    if mounted:
> +        print()
> +    else:
> +        print("Hugepages not mounted")
> +
> +
> +def parse_args():
> +    '''Parses the command-line arguments given by the user and takes the
> +    appropriate action for each'''
> +    global clear_flag
> +    global show_flag
> +    global reserve_kb
> +    global hugepagesize_kb
> +    global args
> +
> +    if len(sys.argv) <= 1:
> +        usage()
> +        sys.exit(0)
> +
> +    try:
> +        opts, args = getopt.getopt(sys.argv[1:], "r:p:csmu", [
> +            "help", "usage", "show", "clear", "setup=", "reserve=", "pagesize=",
> +            "mount", "unmount"
> +        ])
> +    except getopt.GetoptError as error:
> +        print(str(error))
> +        print("Run '%s --usage' for further information" % sys.argv[0])
> +        sys.exit(1)
> +
> +    for opt, arg in opts:
> +        if opt == "--help" or opt == "--usage":
> +            usage()
> +            sys.exit(0)
> +        if opt == "--setup":
> +            clear_flag = True
> +            unmount_flag = True
> +            reserve_kb = get_memsize(arg)
> +            mount_flag = True
> +        if opt == "--show" or opt == "-s":
> +            show_flag = True
> +        if opt == "--clear" or opt == "-c":
> +            clear_flag = True
> +        if opt == "--reserve" or opt == "-r":
> +            reserve_kb = get_memsize(arg)
> +        if opt == "--pagesize" or opt == "-p":
> +            hugepagesize_kb = get_memsize(arg)
> +        if opt == "--unmount" or opt == "-u":
> +            unmount_flag = True
> +        if opt == "--mount" or opt == "-m":
> +            mount_flag = True
> +
> +
> +def do_arg_actions():
> +    '''do the actual action requested by the user'''
> +    global clear_flag
> +    global show_flag
> +    global hugepagesize_kb
> +    global reserve_kb
> +
> +    if clear_flag:
> +        clear_pages()
> +    if unmount_flag:
> +        os.system("umount " + hugedir)
> +    if reserve_kb:
> +        if hugepagesize_kb is None:
> +            hugepagesize_kb = default_size()
> +        if reserve_kb % hugepagesize_kb != 0:
> +            sys.exit('{} is not a multiple of page size {}'.format(
> +                reserve_kb, hugepagesize_kb))
> +        nr_pages = int(reserve_kb / hugepagesize_kb)
> +        set_pages(nr_pages, hugepagesize_kb)
> +    if mount_flag:
> +        mount_huge(hugepagesize_kb * 1024)
> +    if show_flag:
> +        show_pages()
> +        print()
> +        show_mount()
> +
> +
> +def main():
> +    parse_args()
> +    do_arg_actions()
> +
> +
> +if __name__ == "__main__":

This is a sysadmin script and you're not attempting to catch exceptions
anywhere - perhaps check uid before proceeding?

> +    main()
>

-- 
Thanks,
Anatoly
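
On the uid check suggested above, a minimal sketch of what it could look like. The helper name require_root is hypothetical and not part of the patch; the assumption is that it would be called before any operation that writes to sysfs or runs mount/umount, while --show could stay usable without it.

```python
import os
import sys


def require_root():
    '''Exit with a readable message unless running with root privileges'''
    # Hypothetical helper: the effective uid is what decides whether writes
    # to nr_hugepages and calls to mount/umount are permitted.
    if os.geteuid() != 0:
        sys.exit('{}: this operation must be run as root'.format(
            os.path.basename(sys.argv[0])))
```

Failing early this way replaces a PermissionError traceback from open() with a one-line message.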