From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Date: Fri, 4 Sep 2020 20:07:21 -0700
Message-Id: <20200905030720.20764-1-stephen@networkplumber.org>
In-Reply-To: <20200901165643.15668-1-stephen@networkplumber.org>
References: <20200901165643.15668-1-stephen@networkplumber.org>
List-Id: DPDK patches and discussions
Subject: [dpdk-dev] [PATCH v4] usertools: add huge page setup script

This is an improved version of the huge page setup, based on the
earlier DPDK setup. Differences are:
  * it autodetects NUMA vs non-NUMA
  * it allows setting different page sizes, since recent kernels
    support multiple sizes
  * it accepts a parameter in bytes (not pages)

If necessary, the steps of clearing old settings and
mounting/unmounting can be done individually.

Signed-off-by: Stephen Hemminger
---
v4 -- more review feedback
   use argparse rather than getopt (thanks Bruce)
   silently handle already mounted type errors
   handle exceptions for permission and file not found
   fix numa bugs
   code now passes pylint

v3 -- incorporate review feedback
   add missing SPDX and env header
   overengineer the memory prefix string code
   add numa node argument
   fix some pylint warnings

v2 -- convert to python3

 usertools/hugepage_setup.py | 272 ++++++++++++++++++++++++++++++++++++
 1 file changed, 272 insertions(+)
 create mode 100755 usertools/hugepage_setup.py

diff --git a/usertools/hugepage_setup.py b/usertools/hugepage_setup.py
new file mode 100755
index 000000000000..3091dbe5d4c2
--- /dev/null
+++ b/usertools/hugepage_setup.py
@@ -0,0 +1,272 @@
+#! /usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2020 Microsoft Corporation
+"""Script to query and setup huge pages for DPDK applications."""
+
+import argparse
+import glob
+import os
+import re
+import sys
+from math import log2
+
+# Standard binary prefix
+BINARY_PREFIX = "KMG"
+
+# systemd mount point for huge pages
+HUGE_MOUNT = "/dev/hugepages"
+
+
+def fmt_memsize(sz_k):
+    '''Format memory size in kB into conventional format'''
+    if isinstance(sz_k, str):
+        sz_k = int(sz_k)
+    logk = int(log2(sz_k) / 10)
+    return '{}{}b'.format(int(sz_k / (2**(logk * 10))), BINARY_PREFIX[logk])
+
+
+def get_memsize(arg):
+    '''Convert memory size with suffix to kB'''
+    match = re.match(r'(\d+)([' + BINARY_PREFIX + r']?)$', arg.upper())
+    if match is None:
+        sys.exit('{} is not a valid page size'.format(arg))
+    num = float(match.group(1))
+    suffix = match.group(2)
+    if suffix == "":
+        return int(num / 1024)
+    idx = BINARY_PREFIX.find(suffix)
+    return int(num * (2**(idx * 10)))
+
+
+def is_numa():
+    '''Test if NUMA is necessary on this system'''
+    return os.path.exists('/sys/devices/system/node')
+
+
+def get_hugepages(path):
+    '''Read number of reserved pages'''
+    with open(path + '/nr_hugepages') as nr_hugepages:
+        return int(nr_hugepages.read())
+    return 0
+
+
+def set_hugepages(path, pages):
+    '''Write the number of reserved huge pages'''
+    filename = path + '/nr_hugepages'
+    try:
+        with open(filename, 'w') as nr_hugepages:
+            nr_hugepages.write('{}\n'.format(pages))
+    except PermissionError:
+        sys.exit('Permission denied: need to be root!')
+    except FileNotFoundError:
+        filename = os.path.basename(path)
+        size = filename[10:]
+        sys.exit('{} is not a valid system huge page size'.format(size))
+
+
+def show_numa_pages():
+    '''Show huge page reservations on Numa system'''
+    print('Node Pages Size')
+    for numa_path in glob.glob('/sys/devices/system/node/node*'):
+        node = numa_path[29:]  # slice after /sys/devices/system/node/node
+        path = numa_path + '/hugepages'
+        for hdir in os.listdir(path):
+            pages = get_hugepages(path + '/' + hdir)
+            if pages > 0:
+                pg_sz = fmt_memsize(
+                    hdir[10:-2])  # slice out of hugepages-NNNkB
+                print('{:<4} {:<5} {}'.format(node, pages, pg_sz))
+
+
+def show_non_numa_pages():
+    '''Show huge page reservations on non Numa system'''
+    print('Pages Size')
+    path = '/sys/kernel/mm/hugepages'
+    for hdir in os.listdir(path):
+        pages = get_hugepages(path + '/' + hdir)
+        if pages > 0:
+            pg_sz = fmt_memsize(int(hdir[10:-2]))
+            print('{:<5} {}'.format(pages, pg_sz))
+
+
+def show_pages():
+    '''Show existing huge page settings'''
+    if is_numa():
+        show_numa_pages()
+    else:
+        show_non_numa_pages()
+
+
+def clear_pages():
+    '''Clear all existing huge page mappings'''
+    if is_numa():
+        dirs = glob.glob(
+            '/sys/devices/system/node/node*/hugepages/hugepages-*')
+    else:
+        dirs = glob.glob('/sys/kernel/mm/hugepages/hugepages-*')
+
+    for path in dirs:
+        set_hugepages(path, 0)
+
+
+def default_pagesize():
+    '''Get default huge page size from /proc/meminfo'''
+    with open('/proc/meminfo') as meminfo:
+        for line in meminfo:
+            if line.startswith('Hugepagesize:'):
+                return int(line.split()[1])
+    return None
+
+
+def set_numa_pages(pages, hugepgsz, node=None):
+    '''Set huge page reservation on Numa system'''
+    if node:
+        nodes = ['/sys/devices/system/node/node{}/hugepages'.format(node)]
+    else:
+        nodes = glob.glob('/sys/devices/system/node/node*/hugepages')
+
+    for node_path in nodes:
+        huge_path = '{}/hugepages-{}kB'.format(node_path, hugepgsz)
+        set_hugepages(huge_path, pages)
+
+
+def set_non_numa_pages(pages, hugepgsz):
+    '''Set huge page reservation on non Numa system'''
+    path = '/sys/kernel/mm/hugepages/hugepages-{}kB'.format(hugepgsz)
+    set_hugepages(path, pages)
+
+
+def reserve_pages(pages, hugepgsz, node=None):
+    '''Sets the number of huge pages to be reserved'''
+    if node or is_numa():
+        set_numa_pages(pages, hugepgsz, node=node)
+    else:
+        set_non_numa_pages(pages, hugepgsz)
+
+
+def get_mountpoints():
+    '''get list of where hugepage filesystem is mounted'''
+    mounted = []
+    with open('/proc/mounts') as mounts:
+        for line in mounts:
+            fields = line.split()
+            if fields[2] != 'hugetlbfs':
+                continue
+            mounted.append(fields[1])
+    return mounted
+
+
+def mount_huge(pagesize, mountpoint):
+    '''mount the huge tlb file system'''
+    if mountpoint in get_mountpoints():
+        print(mountpoint, "already mounted")
+        return
+    cmd = "mount -t hugetlbfs"
+    if pagesize:
+        cmd += ' -o pagesize={}'.format(pagesize * 1024)
+    cmd += ' nodev ' + mountpoint
+    os.system(cmd)
+
+
+def umount_huge(mountpoint):
+    '''unmount the huge tlb file system (if mounted)'''
+    if mountpoint in get_mountpoints():
+        os.system("umount " + mountpoint)
+
+
+def show_mount():
+    '''Show where huge page filesystem is mounted'''
+    mounted = get_mountpoints()
+    if mounted:
+        print("Hugepages mounted on", *mounted)
+    else:
+        print("Hugepages not mounted")
+
+
+def main():
+    '''Process the command line arguments and setup huge pages'''
+    argv0 = os.path.basename(sys.argv[0])
+    parser = argparse.ArgumentParser(
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        description="Setup huge pages",
+        epilog="""
+Examples:
+
+To display current huge page settings:
+    {argv0} -s
+
+To do a complete setup with 2 Gigabytes of 1G huge pages:
+    {argv0} -p 1G --setup 2G
+""".format(argv0=argv0))
+    parser.add_argument(
+        '--show',
+        '-s',
+        action='store_true',
+        help="print the current huge page configuration")
+    parser.add_argument(
+        '--clear', '-c', action='store_true', help="clear existing huge pages")
+    parser.add_argument(
+        '--mount',
+        '-m',
+        action='store_true',
+        help='mount the huge page filesystem')
+    parser.add_argument(
+        '--unmount',
+        '-u',
+        action='store_true',
+        help='unmount the system huge page directory')
+    parser.add_argument(
+        '--node',
+        '-n',
+        action='store',
+        help='select numa node to reserve pages on')
+    parser.add_argument(
+        '--pagesize',
+        '-p',
+        action='store',
+        help='choose huge page size to use')
+    parser.add_argument(
+        '--reserve',
+        '-r',
+        action='store',
+        help='reserve huge pages. Size is in bytes with K, M, or G suffix')
+    parser.add_argument(
+        '--setup',
+        action='store',
+        help='setup huge pages by doing clear, unmount, reserve and mount')
+    args = parser.parse_args()
+
+    if args.setup:
+        args.clear = True
+        args.unmount = True
+        args.reserve = args.setup
+        args.mount = True
+
+    if args.pagesize:
+        pagesize_kb = get_memsize(args.pagesize)
+    else:
+        pagesize_kb = default_pagesize()
+
+    if args.clear:
+        clear_pages()
+    if args.unmount:
+        umount_huge(HUGE_MOUNT)
+
+    if args.reserve:
+        reserve_kb = get_memsize(args.reserve)
+        if reserve_kb % pagesize_kb != 0:
+            sys.exit(
+                'Huge reservation {}kB is not a multiple of page size {}kB'.
+                format(reserve_kb, pagesize_kb))
+        reserve_pages(
+            int(reserve_kb / pagesize_kb), pagesize_kb, node=args.node)
+    if args.mount:
+        mount_huge(pagesize_kb, HUGE_MOUNT)
+    if args.show:
+        show_pages()
+        print()
+        show_mount()
+
+
+if __name__ == "__main__":
+    main()
-- 
2.27.0
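
For reference, the arithmetic behind "-p 1G --setup 2G" can be sketched
on its own, without touching sysfs. This is only an illustration of the
size handling described above, not code from the patch: the helper
names size_to_kb, pages_needed and sysfs_path are hypothetical, and the
sketch assumes, as the script does, that a bare number means bytes and
that K, M and G are binary multiples.

```python
import re

# Binary prefixes in ascending order; index i scales a kB value by 1024**i.
BINARY_PREFIX = "KMG"


def size_to_kb(arg):
    """Convert '2G', '512M', '64K' or a plain byte count to kB (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", arg.upper())
    if match is None:
        raise ValueError("{} is not a valid size".format(arg))
    num = int(match.group(1))
    suffix = match.group(2)
    if suffix == "":
        return num // 1024  # assumption: a bare number is a size in bytes
    return num * 1024 ** BINARY_PREFIX.index(suffix)


def pages_needed(reserve, pagesize):
    """Number of huge pages of size `pagesize` that make up `reserve` exactly."""
    reserve_kb = size_to_kb(reserve)
    pagesize_kb = size_to_kb(pagesize)
    if pagesize_kb == 0 or reserve_kb % pagesize_kb != 0:
        raise ValueError("reservation is not a multiple of the page size")
    return reserve_kb // pagesize_kb


def sysfs_path(pagesize, node=None):
    """Directory holding nr_hugepages for this page size, per node if given."""
    name = "hugepages-{}kB".format(size_to_kb(pagesize))
    if node is None:
        return "/sys/kernel/mm/hugepages/" + name
    return "/sys/devices/system/node/node{}/hugepages/{}".format(node, name)


if __name__ == "__main__":
    print(pages_needed("2G", "1G"))
    print(pages_needed("1G", "2M"))
    print(sysfs_path("2M", node=0))
```

The sketch only computes values; writing the page count into
nr_hugepages under the printed directory still requires root, as in
the script.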