DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>, dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] usertools: add huge page setup script
Date: Fri, 4 Sep 2020 15:58:03 +0100
Message-ID: <a1652530-40fb-c026-3cd2-d2ca8197958d@intel.com>
In-Reply-To: <20200903224831.5932-1-stephen@networkplumber.org>

On 03-Sep-20 11:48 PM, Stephen Hemminger wrote:
> This is an improved version of the setup of huge pages
> based on earlier DPDK setup. Differences are:
>     * it autodetects NUMA vs non NUMA
>     * it allows setting different page sizes
>       recent kernels support multiple sizes.
>     * it accepts a parameter in bytes (not pages).
> 
> If necessary the steps of clearing old settings and mounting/umounting
> can be done individually.
> 
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> v2 -- rewrite in python
> 	The script is python3 only because supporting older versions
> 	no longer makes any sense.
> 
>   usertools/hugepage-setup.py | 317 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 317 insertions(+)
>   create mode 100644 usertools/hugepage-setup.py
> 
> diff --git a/usertools/hugepage-setup.py b/usertools/hugepage-setup.py
> new file mode 100644
> index 000000000000..8e7642428d9e
> --- /dev/null
> +++ b/usertools/hugepage-setup.py
> @@ -0,0 +1,317 @@
> +# Copyright (c) 2020 Microsoft Corporation
> +#
> +# Script to query and setup huge pages for DPDK applications.
> +
> +import sys
> +import os
> +import re
> +import getopt
> +import glob
> +from os.path import exists, basename
> +
> +# convention for where to mount huge pages
> +hugedir = '/dev/hugepages'

This isn't a "convention"; it is the default systemd mountpoint.

> +
> +# command-line flags
> +show_flag = None
> +reserve_kb = None
> +clear_flag = None
> +hugepagesize_kb = None
> +mount_flag = None
> +unmount_flag = None
> +
> +
> +def usage():
> +    '''Print usage information for the program'''
> +    global hugedir
> +    mnt = hugedir
> +    argv0 = basename(sys.argv[0])
> +    print("""
> +Usage:
> +------
> +    %(argv0)s [options]
> +
> +Options:
> +    --help, --usage:
> +        Display usage information and quit
> +
> +    -s, --show:
> +        Print the current huge page configuration.
> +
> +    --setup:
> +        Simplified version of clear, umount, reserve, mount operations
> +
> +    -c, --clear:
> +        Remove all huge pages
> +
> +    -r, --reserve:
> +        Reserve huge pages. The size specified is in bytes, with
> +        optional K, M or G suffix. The size must be a multiple
> +        of the page size.
> +
> +    -p, --pagesize
> +        Choose page size to use. If not specified, the default
> +        system page size will be used.
> +
> +    -m, --mount
> +        Mount the system huge page directory %(mnt)s
> +
> +    -u, --unmount
> +        Unmount the system huge page directory %(mnt)s
> +
> +
> +Examples:
> +---------
> +
> +To display current huge page settings:
> +    %(argv0)s -s
> +
> +To do a complete setup with 2 Gigabytes of 1G huge pages:
> +    %(argv0)s -p 1G --setup 2G
> +
> +Equivalent to:
> +    %(argv0)s -p 1G -c -u -r 2G -m
> +
> +To clear existing huge page settings and umount %(mnt)s
> +    %(argv0)s -c -u
> +
> +    """ % locals())
> +
> +
> +def fmt_memsize(sz):
> +    '''Format memory size in conventional format'''
> +    sz_kb = int(sz)
> +    if sz_kb >= 1024 * 1024:
> +        return '{}Gb'.format(sz_kb / (1024 * 1024))
> +    elif sz_kb >= 1024:
> +        return '{}Mb'.format(sz_kb / 1024)
> +    else:
> +        return '{}Kb'.format(sz_kb)

I've lost count of how many times I've had to reimplement this code, but 
there is an easier way :) Off the top of my head,

idx = int(log2(sz)) // 10
# every 10th power of 2
return '{}{}b'.format(sz >> (10 * idx), ' kMG'[idx])

or something close to that.
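
A minimal sketch of that idea, adapted to the quoted fmt_memsize(), which 
receives its size in kB rather than bytes (so the unit string starts at 
K). The clamp to the largest unit and the zero check are assumptions 
added for safety, not part of the patch. The shift truncates, which is 
harmless for huge page sizes since they are powers of two.

```python
from math import log2


def fmt_memsize(kb):
    '''Format a size given in kB, e.g. 2048 -> "2Mb" (hypothetical rewrite)'''
    kb = int(kb)
    if kb <= 0:
        return '0Kb'
    units = 'KMG'
    # every 10th power of 2 moves to the next unit; clamp to the largest one
    idx = min(int(log2(kb)) // 10, len(units) - 1)
    # integer shift: truncates anything that is not a whole number of units
    return '{}{}b'.format(kb >> (10 * idx), units[idx])
```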

> +
> +
> +def get_memsize(arg):
> +    '''Convert memory size with suffix to kB'''
> +    m = re.match(r'(\d+)([GMKgmk]?)$', arg)
> +    if m is None:
> +        sys.exit('{} is not a valid page size'.format(arg))
> +
> +    num = float(m.group(1))
> +    suf = m.group(2)
> +    if suf == "G" or suf == "g":
> +        return int(num * 1024 * 1024)
> +    elif suf == "M" or suf == "m":
> +        return int(num * 1024)
> +    elif suf == "K" or suf == "k":
> +        return int(num)
> +    else:
> +        return int(num / 1024.)

Same here, can simply index an array and do powers of 2.
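
One way this could look, as a sketch only: the position of the suffix in 
a string gives the power of 1024, and the result is shifted down to kB to 
match what the quoted get_memsize() returns. The error message and the 
upper-casing of the argument are illustrative choices, not taken from the 
patch.

```python
import re
import sys


def get_memsize(arg):
    '''Convert "2G", "512M", "64k" or plain bytes to kB (hypothetical rewrite)'''
    m = re.match(r'(\d+)([KMG]?)$', arg.upper())
    if m is None:
        sys.exit('{} is not a valid size'.format(arg))
    # index in ' KMG' is the power of 1024: no suffix -> 0, K -> 1, M -> 2, G -> 3
    shift = 10 * ' KMG'.index(m.group(2) or ' ')
    # scale to bytes, then drop 10 bits to get kB
    return (int(m.group(1)) << shift) >> 10
```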

> +
> +
> +def is_numa():
> +    '''Test if NUMA is necessary on this system'''
> +    return exists('/sys/devices/system/node')
> +
> +
> +def get_hugepages(path):
> +    '''Read number of reserved pages'''
> +    with open(path + '/nr_hugepages') as f:

Here and in other places... os.path.join()?
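
For illustration, the quoted helper with os.path.join() instead of string 
concatenation might read as follows (a sketch, behaviour otherwise 
unchanged):

```python
import os


def get_hugepages(path):
    '''Read the number of reserved pages from <path>/nr_hugepages'''
    with open(os.path.join(path, 'nr_hugepages')) as f:
        return int(f.read())
```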

> +        return int(f.read())
> +    return 0
> +
> +
> +def show_numa_pages():
> +    print('Node Pages Size')
> +    for n in glob.glob('/sys/devices/system/node/node*'):
> +        path = n + '/hugepages'
> +        node = n[29:]  # slice after /sys/devices/system/node/node

I mean, it works, but it's not terribly Pythonic and looks more like 
C-style string manipulation. Soooo, os.path.join(), os.path.basename(), 
regex match? I would gladly trade any misguided pursuit of performance 
here for readability and idiomatic code.

It'd also make it easier to understand what's going on if you didn't mix 
logic with presentation, and just returned an array or a dict of values 
and printed everything out in the caller, as opposed to printing 
everything inline.
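
A rough sketch of what that split could look like. The function name 
numa_hugepages() and its root parameter are hypothetical (the parameter 
exists only so the code can be pointed at a test directory); the sysfs 
layout is the one the quoted patch already walks.

```python
import glob
import os
import re


def numa_hugepages(root='/sys/devices/system/node'):
    '''Return a list of (node, size_kb, nr_pages) tuples; prints nothing'''
    result = []
    for node_dir in sorted(glob.glob(os.path.join(root, 'node*'))):
        node = re.match(r'node(\d+)$', os.path.basename(node_dir))
        hp_root = os.path.join(node_dir, 'hugepages')
        if node is None or not os.path.isdir(hp_root):
            continue
        for d in sorted(os.listdir(hp_root)):
            size = re.match(r'hugepages-(\d+)kB$', d)
            if size is None:
                continue
            with open(os.path.join(hp_root, d, 'nr_hugepages')) as f:
                nr_pages = int(f.read())
            result.append((int(node.group(1)), int(size.group(1)), nr_pages))
    return result


def show_numa_pages():
    '''Presentation only: print whatever numa_hugepages() collected'''
    print('Node Pages Size')
    for node, size_kb, nr_pages in numa_hugepages():
        if nr_pages > 0:
            print('{:<4} {:<5} {}kB'.format(node, nr_pages, size_kb))
```

With the data returned as plain tuples, the non-NUMA path could share the 
same printing code, and the collection step can be exercised without 
capturing stdout.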

> +        for d in os.listdir(path):
> +            sz = d[10:-2]  # slice out of hugepages-NNNkB
> +            nr_pages = get_hugepages(path + '/' + d)
> +            if nr_pages > 0:
> +                pg_sz = fmt_memsize(sz)
> +                print('{:<4} {:<5} {}'.format(node, nr_pages, pg_sz))
> +
> +
> +def show_non_numa_pages():
> +    print('Pages Size')
> +    path = '/sys/kernel/mm/hugepages'
> +    for d in os.listdir(path):
> +        sz = d[10:-2]
> +        nr_pages = get_hugepages(path + '/' + d)
> +        if nr_pages > 0:
> +            pg_sz = fmt_memsize(sz)
> +            print('{:<5} {}'.format(nr_pages, pg_sz))
> +
> +
> +def show_pages():
> +    '''Show existing huge page settings'''
> +    if is_numa():
> +        show_numa_pages()
> +    else:
> +        show_non_numa_pages()
> +
> +
> +def clear_numa_pages():
> +    for path in glob.glob(
> +            '/sys/devices/system/node/node*/hugepages/hugepages-*'):
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('0\n')
> +
> +
> +def clear_non_numa_pages():
> +    for path in glob.glob('/sys/kernel/mm/hugepages/hugepages-*'):
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('0\n')
> +
> +
> +def clear_pages():
> +    '''Clear all existing huge page mappings'''
> +    if is_numa():
> +        clear_numa_pages()
> +    else:
> +        clear_non_numa_pages()
> +
> +
> +def default_size():
> +    '''Get default huge page size from /proc/meminfo'''
> +    with open('/proc/meminfo') as f:
> +        for line in f:
> +            if line.startswith('Hugepagesize:'):
> +                return int(line.split()[1])
> +    return None
> +
> +
> +def set_numa_pages(nr_pages, hugepgsz):
> +    for n in glob.glob('/sys/devices/system/node/node*/hugepages'):
> +        path = '{}/hugepages-{}kB'.format(n, hugepgsz)
> +        if not exists(path):
> +            sys.exit(
> +                '{}Kb is not a valid system huge page size'.format(hugepgsz))
> +
> +        with open(path + '/nr_hugepages', 'w') as f:
> +            f.write('{}\n'.format(nr_pages))
> +
> +
> +def set_non_numa_pages(nr_pages, hugepgsz):
> +    path = '/sys/kernel/mm/hugepages/hugepages-{}kB'.format(hugepgsz)
> +    if not exists(path):
> +        sys.exit('{}Kb is not a valid system huge page size'.format(hugepgsz))
> +
> +    with open(path + '/nr_hugepages', 'w') as f:
> +        f.write('{}\n'.format(nr_pages))
> +
> +
> +def set_pages(pages, hugepgsz):
> +    '''Sets the number of huge pages to be reserved'''
> +    if is_numa():
> +        set_numa_pages(pages, hugepgsz)
> +    else:
> +        set_non_numa_pages(pages, hugepgsz)
> +
> +
> +def mount_huge(pagesize):
> +    global hugedir
> +    cmd = "mount -t hugetlbfs"
> +    if pagesize:
> +        cmd += ' -o pagesize={}'.format(pagesize)
> +    cmd += ' nodev {}'.format(hugedir)
> +    os.system(cmd)
> +
> +
> +def show_mount():
> +    mounted = None
> +    with open('/proc/mounts') as f:
> +        for line in f:
> +            fields = line.split()
> +            if fields[2] != 'hugetlbfs':
> +                continue
> +            if not mounted:
> +                print("Hugepages mounted on:", end=" ")
> +                mounted = True
> +            print(fields[1], end=" ")
> +    if mounted:
> +        print()
> +    else:
> +        print("Hugepages not mounted")
> +
> +
> +def parse_args():
> +    '''Parses the command-line arguments given by the user and takes the
> +    appropriate action for each'''
> +    global clear_flag
> +    global show_flag
> +    global reserve_kb
> +    global hugepagesize_kb
> +    global mount_flag, unmount_flag, args
> +
> +    if len(sys.argv) <= 1:
> +        usage()
> +        sys.exit(0)
> +
> +    try:
> +        opts, args = getopt.getopt(sys.argv[1:], "r:p:csmu", [
> +            "help", "usage", "show", "clear", "setup=", "reserve=", "pagesize=",
> +            "mount", "unmount"
> +        ])
> +    except getopt.GetoptError as error:
> +        print(str(error))
> +        print("Run '%s --usage' for further information" % sys.argv[0])
> +        sys.exit(1)
> +
> +    for opt, arg in opts:
> +        if opt == "--help" or opt == "--usage":
> +            usage()
> +            sys.exit(0)
> +        if opt == "--setup":
> +            clear_flag = True
> +            unmount_flag = True
> +            reserve_kb = get_memsize(arg)
> +            mount_flag = True
> +        if opt == "--show" or opt == "-s":
> +            show_flag = True
> +        if opt == "--clear" or opt == "-c":
> +            clear_flag = True
> +        if opt == "--reserve" or opt == "-r":
> +            reserve_kb = get_memsize(arg)
> +        if opt == "--pagesize" or opt == "-p":
> +            hugepagesize_kb = get_memsize(arg)
> +        if opt == "--unmount" or opt == "-u":
> +            unmount_flag = True
> +        if opt == "--mount" or opt == "-m":
> +            mount_flag = True
> +
> +
> +def do_arg_actions():
> +    '''do the actual action requested by the user'''
> +    global clear_flag
> +    global show_flag
> +    global hugepagesize_kb
> +    global reserve_kb
> +
> +    if clear_flag:
> +        clear_pages()
> +    if unmount_flag:
> +        os.system("umount " + hugedir)
> +    if reserve_kb:
> +        if hugepagesize_kb is None:
> +            hugepagesize_kb = default_size()
> +        if reserve_kb % hugepagesize_kb != 0:
> +            sys.exit('{} is not a multiple of page size {}'.format(
> +                reserve_kb, hugepagesize_kb))
> +        nr_pages = int(reserve_kb / hugepagesize_kb)
> +        set_pages(nr_pages, hugepagesize_kb)
> +    if mount_flag:
> +        mount_huge(hugepagesize_kb * 1024)
> +    if show_flag:
> +        show_pages()
> +        print()
> +        show_mount()
> +
> +
> +def main():
> +    parse_args()
> +    do_arg_actions()
> +
> +
> +if __name__ == "__main__":

This is a sysadmin script and you're not attempting to catch exceptions 
anywhere - perhaps check uid before proceeding?
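
Such a check could be as small as the sketch below. The helper name 
require_root() and the message wording are made up for illustration; 
os.geteuid() is the standard call on Linux. Since --show only reads sysfs 
and /proc, it probably makes sense to call this only before the actions 
that write or mount.

```python
import os
import sys


def require_root():
    '''Exit with a clear message unless running with root privileges'''
    if os.geteuid() != 0:
        sys.exit('{}: must be run as root'.format(
            os.path.basename(sys.argv[0])))
```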

> +    main()
> 


-- 
Thanks,
Anatoly

